Hello I'm

Ravi Kumar

Senior DevOps Engineer

About Me

Hello, I’m Ravi, a DevOps engineer based in New Delhi, India, with deep experience in cloud solutions, automation, and scripting. My core strengths include:

  • Kubernetes
  • Jenkins
  • AWS
  • GCP
  • Azure DevOps
  • Terraform
  • Postgres
Download CV

What I do

Solution Architect

I specialize in designing secure and scalable cloud-based solutions. I leverage my expertise in cloud technologies to architect robust and efficient systems that meet the unique requirements of businesses, ensuring optimal performance, reliability, and cost-effectiveness.

CI/CD

By automating the build, test, and deployment processes, I enable rapid and reliable software delivery. I design and configure CI/CD pipelines, integrating various tools and technologies to streamline development workflows and ensure efficient collaboration among team members.

Application Optimization

I analyze application components to identify performance bottlenecks and improve response times, scalability, and resource utilization. Through careful monitoring and analysis, I continually optimize applications to deliver an exceptional user experience.

Infrastructure Automation

Leveraging infrastructure-as-code (IaC) principles and tools like Terraform and Ansible, I design and implement automated infrastructure pipelines. This enables rapid and consistent deployment of infrastructure resources, reduces manual effort, and improves scalability, security, and reliability.

Application Modernization

By assessing existing systems, I develop a modernization strategy that aligns with business objectives. I migrate applications to cloud platforms, refactor code, and adopt microservices architecture, ensuring increased agility, scalability, and maintainability while minimizing disruption to ongoing operations.

DevOps Training

I guide colleagues and junior DevOps professionals, empowering them with the knowledge and skills to implement DevOps practices. I also coach students, providing valuable insights and practical experience for successful careers in DevOps.

Technical Skills

Cloud Providers (AWS, GCP, Azure)
65%
CI/CD Tools (Jenkins, TeamCity, GitHub Actions, Azure DevOps)
78%
Infrastructure Provisioning (Terraform, CloudFormation, ARM)
55%
Infra Monitoring (Prometheus, Grafana, DataDog, CloudWatch)
80%
Application Monitoring (AppDynamics, DataDog)
47%
VCS (Git, GitHub, BitBucket, GitLab)
73%
Web Servers (Nginx, Tomcat, Apache, IIS)
64%
Containers (Docker)
89%
Container Orchestration (AWS EKS/ECS/Fargate, GKE, Kubernetes)
75%
Service Mesh (Istio, Traefik, Envoy)
30%
GitOps (ArgoCD)
36%
Configuration Management (Ansible)
53%
Logs Management (Elastic Stack, CloudWatch, Log Explorer)
65%
Artifact (Nexus)
20%
Secret Management (Vault, AWS Secrets Manager)
22%
Serverless (AWS Lambda, GCP Functions)
68%
Operating Systems (Linux, Ubuntu, Windows Server)
87%
Scripting (Shell/Bash, PowerShell)
78%
Programming Languages (Python, Java, HTML, CSS, JS/Node.js)
38%

Professional Skills

  • Project Management
  • Teamwork
  • Quick Learner
  • Creativity
  • Decision Making
  • Problem Solving
  • Good Listener
Check out Ravi's MindMap to learn more about his technical stack.
Click on the MindMap to view it.

Education

Master of Computer Applications From MDU (Distance)

2018-2020

Motivated professional pursuing Master of Computer Applications through distance learning while managing job commitments.

Bachelor of Computer Applications From KUK

2015-2018

Enthusiastic BCA graduate with a keen interest and proficiency in programming languages such as C and Java. Solid foundation in computer science concepts and a passion for software development.

High School From CBSE

2014-2015

I passed the 12th standard (high school) with a 7.8 CGPA in Commerce.

Work Experience

Senior DevOps Engineer at EPAM Systems (LSEG)

Aug'2023 - Present
Responsibilities:
  • At LSEG, I initially joined the LSE project, where I led the end-to-end migration of CI/CD pipelines from Jenkins to GitLab CI, improving pipeline consistency and reliability. Integrated multiple SAST tools, including Semgrep, SonarQube, and Black Duck, to enforce security and code quality standards.
  • Designed and implemented custom GitLab CI/CD pipeline templates to standardize build, test, security, and deployment stages across 240 microservice repositories.
  • Established scalable pipeline governance with branching rules, reusable jobs, and environment-specific deployment strategies for staging, UAT, and production.
  • Deployed and managed GitLab private runners on Amazon EKS to enable scalable, containerized CI/CD workloads with Kubernetes-native resource control and auto-scaling.
  • Successfully transitioned ownership of CI/CD pipeline management to another team with proper documentation, ensuring smooth continuity, and moved on to a new internal initiative: Research Bulk Documents project.
  • Developed and executed migration scripts to transition source code repositories from Bitbucket and legacy GitLab instances to a modern GitLab setup, ensuring version consistency and minimal downtime.
  • Managed CI/CD pipelines for over 700 repositories across three application IDs, covering all environments (Dev, QA, Release, and Production), and implemented branching strategies and protection rules to streamline safe and compliant production deployments.
  • Set up Nginx Ingress Controller with Let's Encrypt Certificate Manager for secure traffic routing of various microservices in the Kubernetes cluster. This configuration ensured efficient load balancing and automated SSL certificate management, enhancing security and reliability.
  • Architected GitLab pipelines to clearly separate CI and CD workflows into different GitLab groups. CI pipelines handle code compilation and artifact/image generation, while CD pipelines handle versioned deployments using CI-generated artifacts.
  • Standardized and simplified GitLab CI/CD templates, enabling developers to quickly onboard new services or repositories by referencing reusable templates with minimal environment configuration.
  • Provided end-to-end support to development and QA teams, including infrastructure troubleshooting, CI/CD debugging, application-level log analysis, and alarm/alert resolution based on severity levels.
  • Created and managed infrastructure repositories, authored Terraform (IaC) to onboard new services efficiently and maintain cloud resource consistency.
  • Administered and automated TLS/SSL certificate management (including yearly renewals) across all production and internal HTTPS endpoints, and developed pipelines to handle certificate provisioning and renewal across all application IDs.
  • Built pipelines to create golden AMIs by installing the latest packages from the Cloud SRE team, and regularly refreshed EC2 instances to align with updated security baselines.
  • Improved AWS resource tracking by enforcing tagging and naming conventions, tagging resources with app IDs and repository references for improved visibility and cost management.
  • Reduced infrastructure costs by automating scale-down and shutdown of non-production environments on weekends or during inactivity windows (see the sketch after this list).
  • Wrote and executed production change management documentation, including risk assessments, implementation and rollback plans, customer communication plans, and coordinated L2 deployments through incident and change tickets.
  • Regularly engaged with the Cloud Custodian team to ensure infrastructure compliance with internal security policies and enforced remediation actions where required.
  • Created SonarQube Portfolios for each application ID, enabling a consolidated view of code quality, security, and maintainability across all critical customer-facing applications.
  • Led DNS migration efforts, transitioning services from legacy domains to new branded domains with minimal service disruption.
  • Conducted regular KT sessions for developers and QA teams covering new CI/CD practices, branching strategies, and the SonarQube upgrade process.
  • Set up private SFTP servers with IP whitelisting to securely provide access endpoints to external contributors involved in the new Research Bulk Documents initiative.
  • Conducted technical interviews to evaluate and onboard new DevOps talent as part of the recruitment process at EPAM.
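
As a rough illustration of the weekend scale-down automation mentioned above, here is a minimal boto3 sketch; the Environment tag key, its values, and the region are assumptions for illustration, not the actual LSEG configuration:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is illustrative

def stop_non_prod_instances():
    """Stop all running EC2 instances tagged as non-production."""
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": "tag:Environment", "Values": ["dev", "qa", "staging"]},  # assumed tag
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    instance_ids = [
        i["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for i in reservation["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
        print(f"Stopped {len(instance_ids)} non-production instances")

if __name__ == "__main__":
    stop_non_prod_instances()
```

In practice a schedule (e.g., an EventBridge cron rule) would invoke this on Friday evenings, with a matching start script on Monday mornings.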

Senior DevOps Engineer at Healofy

July'2022 - July'2023
Responsibilities:
  • Achieved a 35% infra cost reduction through effective optimizations, right-sizing, and savings plans.
  • Overseeing and managing a multitude of servers and databases across GCP and AWS environments.
  • Enabled seamless data access for business teams by implementing Metabase as a self-service data exploration and visualization tool. Developed and maintained dashboards and reports that provide meaningful insights to support data-driven decision-making.
  • Created CI/CD pipelines in Jenkins for ECS and backend services, reducing manual deployment time by 50%.
  • Established a comprehensive billing dashboard in Data Studio to analyze spend per app and per infra resource in GCP, leveraging resource labels for accurate cost allocation.
  • Developed detailed technical architecture diagrams for both infrastructure and applications prior to the AWS migration from GCP.
  • Managed and maintained all infrastructure operations and services using AWS services including CodeBuild, CodeDeploy, S3, Elastic Beanstalk, CloudFormation, WAF, Shield, EC2, VPC, Route53, CloudFront, Systems Manager, RDS, Backup, Lambda, CloudWatch, Inspector, GuardDuty, ElastiCache, EKS, ECS, Fargate, IAM, EBS, EFS, and more.
  • Utilized Control Tower and OUs to establish new AWS accounts. Enabled SSO with Google Workspace as an external identity provider using SAML authentication for secure team access. Enforced least-privilege permission sets to safeguard AWS resources and sensitive data.
  • Automated the setup of Elasticsearch, PostgreSQL, Memcached, Jetty, bastion host, Jenkins, and Metabase servers on AWS using CFTs for all environments (dev, test, and prod).
  • Planned, coordinated, and executed the PostgreSQL database migration from GCP to AWS, ensuring minimal downtime, data integrity, and a smooth transition of all database components.
  • Improved API response latency by implementing response-time monitoring with GCP services such as Cloud Log Explorer, BigQuery, and Data Studio. Post-migration, set up the same process on AWS using Redshift, Lambda, Java, Python, EventBridge, and Step Functions.
  • Post-migration, implemented backup-and-restore DR in AWS, which involves ready-to-use CFTs for provisioning infrastructure quickly in the recovery region, CI/CD pipelines to deploy the latest application version, regular backup and AMI updates to achieve minimum RPO, and S3 CRR to asynchronously and continuously copy objects to an S3 bucket in the DR region (see the replication sketch after this list).
  • Collaborated with other teams to verify disaster recovery through failover testing; we achieved a 40-minute RTO after provisioning resources with CFTs and a near-real-time RPO, since the Postgres database servers are kept in sync.
  • Mitigated security risks with Cloud Armor on the GLB and, after the transition to AWS, by configuring AWS WAF with custom rules on the ALB.
  • Maintain comprehensive and up-to-date documentation in Confluence, covering processes, procedures, and project-related information.
  • Implemented Prometheus for detailed monitoring of PostgreSQL databases, ensuring real-time visibility into replication status and resource utilization. Configured Alertmanager rules to trigger email notifications in case of replication failures or excessive resource usage.
  • Implemented a DB monitoring process utilizing PostgreSQL's built-in pg_stat statistics to identify and analyze slow queries, bottlenecks, and areas for optimization.
  • Optimized GCP instance-group scaling by leveraging additional metrics such as HTTP request count per backend service in the GLB. Tuned Auto Scaling groups in AWS for faster scaling based on custom CloudWatch alarms, configuring HTTP requests as a secondary metric for effective scale-up operations.
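
The S3 CRR piece of the DR setup above can be enabled with a single boto3 call. A minimal sketch, assuming both buckets already exist with versioning enabled; the bucket names and IAM role ARN are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Bucket names and the IAM role ARN below are placeholders; both buckets
# must have versioning enabled before replication can be configured.
s3.put_bucket_replication(
    Bucket="primary-app-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-crr-role",
        "Rules": [
            {
                "ID": "dr-replication",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},  # empty filter: replicate every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": "arn:aws:s3:::dr-app-bucket"},
            }
        ],
    },
)
```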

DevOps Lead at Vetifly (Freelance)

Apr'2021 - June'2022
Responsibilities:
  • Architected secure, scalable, highly available applications and infrastructure for several projects using a microservices architecture, handling a user base of close to 500k users.
  • Collaborated with product managers, business stakeholders, and fellow engineers to design and incorporate infrastructure solutions used by thousands of customers.
  • Worked closely with project managers to define branching strategy, permissions, and access management for 5+ engineering teams. Created a high-level architecture diagram before implementing the branching strategy within the organization.
  • Set up a multi-AZ EKS cluster using CloudFormation templates and converted Node.js and React.js APIs into microservices, incorporating security, logging, and tracing tools such as X-Ray, Fluentd, CloudWatch, SES, WAF, Shield, and ACM.
  • Managed and maintained all infrastructure operations and services using AWS services including CodeBuild, CodeDeploy, S3, Elastic Beanstalk, CloudFormation, WAF, Shield, EC2, VPC, Route53, CloudFront, Systems Manager, RDS, Backup, Lambda, CloudWatch, Inspector, GuardDuty, ElastiCache, EKS, ECS, Fargate, IAM, EBS, EFS, and more.
  • Designed, implemented, and maintained active/active DR on AWS with near-real-time RTO/RPO. Regularly assessed and tested DR resources in the recovery region to ensure their effectiveness.
  • Reduced costs each quarter by eliminating unnecessary servers and consolidating databases.
  • Worked with AppDynamics to track slow transactions and API call response times, and occasionally for anomaly detection and root-cause diagnostics.
  • Administered the Slack workspace; integrated AWS bot, VictorOps, Loggly, Jenkins, and Jira for notifications and alerts.
  • Participated in the audits conducted by Cybersecurity Malaysia
  • Involved in VAPT to scan all applications and find vulnerabilities.
  • Implemented OpenVPN to provide secure access to organization resources for internal team members.
  • Migrated .NET TeamCity pipelines to Azure Pipelines.
  • Set up Tally and its backup mechanism on AWS and provided user access through the Tally client by creating .NET-type users.
  • Performed builds and releases across all application cycles: test, production, updates, patches, and maintenance.
  • Ensured 100% of project Confluence documentation was created and kept up to date, including design, development, and deployment documentation.
  • Automated the Change Request process using Power Automate and O365.

DevOps Engineer at Team Computers

Dec'2018 - Mar'2021
Responsibilities:
  • Managed 50+ servers in a distributed, highly available, critical infrastructure.
  • Automated mobile application builds and deployments using Bitrise to reduce human error and speed up production processes.
  • Worked with GitHub Enterprise to manage source code repositories, performing branching, merging, and tagging as required.
  • Built and deployed Docker containers to break up monolithic app into micro-services, improving developer workflow, increasing scalability, and optimizing speed through Kubernetes.
  • Created and maintained fully automated CI/CD pipelines for projects in different languages (.NET, Java, Scala, Angular, Android, Python Django, etc.) across cloud and on-prem platforms.
  • Created and maintained highly scalable, fault-tolerant, multi-tier GCP, AWS, and Azure environments spanning multiple availability zones using Terraform.
  • Wrote Ansible playbooks to automatically install Hadoop system components, saving 80% of the time previously spent by the delivery team.
  • Interacted with clients for requirements gathering and prepared functional specifications and low-level design documents.
  • Wrote, updated, and maintained technical program documentation, design documents, end-user documentation, and operational procedures for different application systems.
  • Worked on the constant improvement of existing network operations to maximize efficiency and security.
  • Developed and implemented new deployment and scaling processes
  • Performing comprehensive unit and integration testing of all software produced and contributing to overall quality processes.
  • Installed and configured Prometheus and Grafana to constantly monitor network bandwidth, memory usage, and hard drive status.
  • Wrote shell scripts to automate regular tasks such as updating Jenkins, deleting older AMIs and snapshots in AWS, and backing up databases to an S3 bucket (see the sketch after this list).
  • Responsible for compiling source code with Maven and packaging it in its distributable format, such as a WAR file.
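
A minimal sketch of the AMI/snapshot cleanup mentioned above, written with boto3 rather than shell; the 30-day retention window is an assumption for illustration:

```python
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
RETENTION_DAYS = 30  # assumed retention window

def cleanup_old_amis():
    """Deregister self-owned AMIs older than the retention window, then delete their snapshots."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
    for image in ec2.describe_images(Owners=["self"])["Images"]:
        created = datetime.fromisoformat(image["CreationDate"].replace("Z", "+00:00"))
        if created >= cutoff:
            continue
        ec2.deregister_image(ImageId=image["ImageId"])
        for mapping in image.get("BlockDeviceMappings", []):
            snapshot_id = mapping.get("Ebs", {}).get("SnapshotId")
            if snapshot_id:
                ec2.delete_snapshot(SnapshotId=snapshot_id)

if __name__ == "__main__":
    cleanup_old_amis()
```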

Android Developer at DeetyaSoft (IVRGURU)

Apr'2018 - Nov'2018
Responsibilities:
  • Results-driven Android developer proficient in Android Studio and Java
  • Developed a business application utilizing IVR (Interactive Voice Response) technology.
  • Implemented efficient CI/CD processes for the Android application using Code Magic.

Recent Projects


Automating Hadoop Installation and Configuration with Ansible

Problem Statement

The manual installation and configuration of Hadoop across multiple nodes is a time-consuming and error-prone process. It involves numerous steps, dependencies, and configurations that can lead to inconsistencies and potential security vulnerabilities.

Additionally, managing the installation across various environments and clusters poses a significant challenge for the data engineering team.

Results

> Improved Efficiency
> Scalability
> Consistency and Standardization
> Enhanced Security
> Faster Troubleshooting and Maintenance

Tech
  • Ansible
  • Zookeeper
  • HDFS
  • Hive
  • MapReduce
  • Apache Pig
  • Yarn
  • Spark
  • Java
  • Linux/Unix
Solution

To address the challenges of Hadoop installation and configuration, I propose an automated approach using Ansible. Ansible provides a powerful and flexible automation framework that enables us to define Hadoop's infrastructure as code. Through Ansible playbooks and roles, we can create a repeatable and consistent process for deploying and configuring Hadoop across the entire cluster.

The Ansible playbook will handle the installation of dependencies, such as Java, and set up the necessary configurations for Hadoop's various components, including HDFS, YARN, and MapReduce. With Ansible's declarative syntax and idempotent nature, we can ensure that the installation process is reliable and repeatable.

POC Code

Provision AWS infrastructure using Terraform

Problem Statement

A client was facing challenges with manually provisioning and managing AWS cloud resources. The process was time-consuming, error-prone, and lacked consistency. It led to delays in project deployments, difficulty in maintaining infrastructure, and increased operational costs.

Results

With Terraform, we achieved faster and more consistent infrastructure provisioning. It significantly reduced the risk of human errors and increased the efficiency of our development and deployment processes.

The ability to manage infrastructure as code has improved collaboration between teams and enhanced our disaster recovery capabilities. Additionally, the cost of managing infrastructure decreased as we optimized resource allocation based on actual usage patterns. The project has proven to be a successful step towards a scalable and reliable cloud environment on AWS.

Tech
  • EC2
  • Route53
  • Elasticsearch
  • ALB
  • IAM
  • RDS
  • Security Group
  • ASG
  • Apache
  • S3
  • Jenkins
  • Terraform
Solution

To address these issues, we implemented Terraform, an infrastructure-as-code tool, to automate the provisioning of AWS cloud resources. Terraform allows us to define our infrastructure in code using declarative configuration files. This enables us to version control and maintain the infrastructure as part of our codebase.

Implementations:
> Wrote custom module .tf files for ASG and OEM software
> Dynamic blocks and data sources are used to reduce code complexity
> Workspaces are used to work across multiple environments
> A DynamoDB table is used for the state-locking mechanism (see the sketch below)
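
The state-locking item above refers to Terraform's S3 backend, which expects a DynamoDB table whose partition key is named LockID. A minimal boto3 sketch to bootstrap such a table (the table name is illustrative):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Table name is a placeholder; "LockID" is the key name Terraform's S3 backend expects.
dynamodb.create_table(
    TableName="terraform-state-locks",
    AttributeDefinitions=[{"AttributeName": "LockID", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "LockID", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",  # on-demand capacity suits infrequent locking
)
```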

POC Code

The repository may contain test feature files.

Provision AWS infrastructure using CloudFormation

Problem Statement

During the migration of resources from GCP to AWS, there was a need for a faster, more efficient setup process. Manually provisioning AWS resources can be time-consuming and error-prone, leading to delays and potential issues during the migration.

Results

The adoption of CloudFormation templates streamlined the provisioning of AWS resources for our cloud migration project. It significantly accelerated the setup process, saving valuable time and resources.

With a consistent and reliable infrastructure deployment, we achieved a smoother and more efficient migration from GCP to AWS, enabling us to meet project deadlines and reduce operational costs.

POC Code
Solution

To achieve a faster setup and seamless migration, I implemented AWS CloudFormation templates, which enable the infrastructure to be defined and provisioned as code. CFTs were created to describe the AWS resources required for the application, including EC2 instances, VPCs, security groups, and ElastiCache, as well as tools such as Jetty, Metabase, Elasticsearch, Jenkins, Postgres, and Nginx. These templates enabled automated provisioning, reducing manual effort and ensuring consistency across environments.

Considerations:
> A custom CFT was created to install and configure older versions of Elasticsearch, Jetty, and Postgres
> Helper scripts were utilized to configure applications after successful installation
> Proper cfn-signal calls are used to handle misconfigurations gracefully (see the sketch after this list)
> CFTs were written so that the same templates can be used for lower and upper environments
> Nested stacks were adopted to avoid redundant code and enhance template reusability
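
On the instances themselves, the cfn-signal helper script normally sends these success/failure signals; the equivalent boto3 call is sketched below with placeholder stack and resource names:

```python
import boto3

cfn = boto3.client("cloudformation")

# Stack name and logical resource ID are placeholders for the ASG/instance being signaled.
cfn.signal_resource(
    StackName="app-prod-stack",
    LogicalResourceId="WebServerGroup",
    UniqueId="i-0123456789abcdef0",  # typically the instance ID
    Status="SUCCESS",  # or "FAILURE" to trigger rollback
)
```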

Tech
  • OU
  • SSO
  • EC2
  • Route53
  • Elasticsearch
  • ALB
  • IAM
  • RDS
  • Security Group
  • ASG
  • Nginx
  • S3
  • Metabase
  • Postgres
  • CloudWatch
  • Lambda
  • Redshift
  • StepFunctions
  • EventBridge
  • Jenkins
  • Jetty
  • CloudFormation

Uploading Soon...

  • TeamCity
  • Azure DevOps
  • Elastic Beanstalk
  • CodeDeploy
  • .NET
  • Grunt
  • MSBuild
  • SQL Compare
  • PowerShell

Uploading Soon...

  • Bitrise
  • TFS
  • XCode
  • App Store

Response Time Monitoring - GCP & AWS

Problem Statement

The problem statement in response time monitoring is the need to track and analyze the response times of backend API calls. It is essential to identify and address any performance issues or bottlenecks that may impact the user experience and overall system efficiency. The challenge lies in collecting and processing the relevant data, generating meaningful insights, and promptly alerting the dev team to take necessary actions for optimization and improvement.

The client didn't want to share any data outside the organization because of healthcare industry constraints.

Solution

My solution for response time monitoring is designed to provide real-time visibility into the performance of backend API calls. By logging and analyzing API request and response data, we accurately measure and track response times. Through data enrichment and structured processing, I identify patterns and outliers to detect performance bottlenecks. I generate reports and visualizations that highlight response time trends and notify stakeholders of any deviations, enabling proactive optimization and ensuring a seamless user experience. My solution empowers organizations to continuously monitor and optimize API performance for enhanced reliability and customer satisfaction.

Tech
  • BigQuery
  • SQL
  • Log Explorer
  • Data Studio
  • Java
  • CloudWatch
  • EventBridge
  • Redshift
  • Step Functions
  • Lambda
  • ECS Fargate
  • Python
  • S3
Before Migration (GCP Solution)

The API call response-time monitoring and optimization solution involves logging API calls into Google Cloud's Log Explorer and transferring them to BigQuery for data cleaning and transformation. A Java program on Cloud Run generates daily reports, calculating response times and comparing them to identify performance trends. Developers receive email alerts, enabling them to address any slow-response issues promptly.

After Migration (AWS Solution)

By leveraging AWS services like CloudWatch, S3, Step Functions, Lambda, and Redshift, the project provides an end-to-end solution for monitoring and analyzing response times of Java APIs, enabling timely insights and alerts for performance optimization.
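
As a hedged sketch of the AWS side, here is the shape of a Lambda handler that publishes parsed response times as a custom CloudWatch metric; the namespace, metric name, and event shape are illustrative, not the project's actual values:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    """Publish API response times as a custom CloudWatch metric.
    `event` is assumed to carry parsed log records: {"records": [{"api": ..., "ms": ...}]}."""
    records = event.get("records", [])
    for record in records:
        cloudwatch.put_metric_data(
            Namespace="Custom/ApiLatency",  # illustrative namespace
            MetricData=[
                {
                    "MetricName": "ResponseTime",
                    "Dimensions": [{"Name": "Api", "Value": record["api"]}],
                    "Value": record["ms"],
                    "Unit": "Milliseconds",
                }
            ],
        )
    return {"published": len(records)}
```

A CloudWatch alarm on this metric can then notify the dev team (e.g., via SNS email) when response times drift above a threshold.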

Database Monitoring & Optimization

Problem Statement

Slow database queries in PostgreSQL can significantly impact application performance and user experience. Identifying and optimizing these queries manually is time-consuming and requires deep knowledge of database internals. Without efficient monitoring, organizations face challenges in detecting and resolving performance bottlenecks, leading to reduced application responsiveness and potential downtime.

The client didn't want to share any data outside the organization because of healthcare industry constraints.

Solution

My solution for database monitoring leverages PostgreSQL's built-in pg_stat statistics to provide comprehensive insights into query performance. By monitoring query execution times, resource usage, and query plans, developers can identify slow-running queries and their underlying causes. Through detailed analysis and optimization techniques, the DBA can propose query optimizations, such as index creation, query rewriting, and database tuning, to improve overall query performance. This solution enables my client to proactively identify and optimize slow database queries, ensuring optimal application performance and an improved user experience.
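
A minimal sketch of the kind of slow-query report this enables, assuming the pg_stat_statements extension is installed and using psycopg2 with placeholder connection details:

```python
import psycopg2

# Connection details are placeholders.
conn = psycopg2.connect(
    host="db.internal", dbname="app", user="monitor", password="example-password"
)

# On PostgreSQL < 13 the columns are mean_time / total_time instead.
QUERY = """
    SELECT query, calls, mean_exec_time, total_exec_time
    FROM pg_stat_statements
    ORDER BY mean_exec_time DESC
    LIMIT 10;
"""

with conn, conn.cursor() as cur:
    cur.execute(QUERY)
    for query, calls, mean_ms, total_ms in cur.fetchall():
        print(f"{mean_ms:8.1f} ms avg | {calls:6d} calls | {query[:80]}")
```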

Weekly Encrypted DB Monitoring Report For Developers and DBA
Sensitive information has been masked to ensure data privacy and security

Tech
  • PostgreSQL
  • pg_stat
  • Java
  • EventBridge
  • Redshift
  • ECS
  • S3
  • Metabase

Google Cloud Billboard - GCB

Problem Statement

Without a centralized billing dashboard, it becomes difficult to track and understand resource utilization, identify areas of potential cost savings, and allocate expenses accurately. This leads to inefficient resource usage, budget overruns, and difficulty in aligning costs with business needs. A solution is needed to provide clear insights into cloud spending, enabling better cost management and optimization strategies.

Solution

Implementation of a Google Cloud Billboard provides a centralized and comprehensive view of GCP expenses, enabling effective cost management and optimization.

Results

By leveraging GCB features and insights, team leads can track resource utilization, monitor costs, identify cost-saving opportunities, and align spending with business priorities. It also helped us make informed decisions, optimize resource allocation, and ensure efficient budget utilization, resulting in improved financial transparency and control over GCP expenditures.
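
Under the hood, dashboards like this typically query GCP's standard billing export in BigQuery. A minimal sketch with a placeholder export table name:

```python
from google.cloud import bigquery

client = bigquery.Client()

# The table name is a placeholder for the standard billing export dataset;
# resource labels can be added to the GROUP BY for per-app cost allocation.
QUERY = """
    SELECT service.description AS service, SUM(cost) AS total_cost
    FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`
    WHERE usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
    GROUP BY service
    ORDER BY total_cost DESC
"""

for row in client.query(QUERY).result():
    print(f"{row.service:40s} ${row.total_cost:>12,.2f}")
```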

Tech
  • BigQuery
  • Looker Studio
  • Logging
  • Python

Infra & Application Monitoring with Datadog

Problem Statement

The lack of comprehensive monitoring and observability in our AWS cloud environment hampers our ability to identify performance bottlenecks and security issues and to ensure the overall health of our applications and infrastructure. The absence of real-time insights makes it challenging to proactively address potential incidents and optimize resource utilization.

Solution

To address these challenges, I implemented Datadog as our AWS cloud monitoring solution. Datadog offered real-time insights into various AWS services, including EC2 instances, RDS databases, Lambda functions, and more. I set up custom dashboards and alerts to proactively monitor critical metrics, detect anomalies, and receive timely notifications of any irregularities.
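
Alerts like these can also be managed as code through Datadog's API. A minimal sketch using the datadog Python package; the query, threshold, keys, and notification handle are illustrative:

```python
from datadog import initialize, api

# API/app keys are placeholders; load real keys from a secret store.
initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

# Create a metric monitor that pages when prod EC2 CPU stays high.
api.Monitor.create(
    type="metric alert",
    query="avg(last_5m):avg:aws.ec2.cpuutilization{env:prod} > 90",
    name="High EC2 CPU (prod)",
    message="CPU above 90% for 5 minutes. @slack-devops-alerts",
    tags=["team:devops", "env:prod"],
)
```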

Results

With Datadog's monitoring in place, we achieved enhanced visibility into our AWS resources and application performance. The proactive alerts enabled us to respond swiftly to potential issues, minimizing downtime and improving the overall reliability of our cloud infrastructure. Additionally, we could optimize resource utilization and plan for future scalability effectively. Datadog played a crucial role in maintaining the stability and efficiency of our AWS environment, ultimately leading to improved customer satisfaction and operational excellence.

Check out sample Datadog panels
Tech
  • Datadog
  • EC2
  • Lambda
  • ECS
  • Redis
  • Elasticsearch
  • RDS
  • DynamoDB
  • .NET
  • JS
  • Windows Servers
  • APM
  • OpenVPN
  • Error Tracking
  • Traces
  • Service Map
  • Logging

GCP to AWS Migration

Problem Statement

The migration from GCP to AWS was necessitated by the vendor's agreement clauses that imposed a minimum billing amount for discount benefits. However, after implementing DevOps practices and optimizing our infrastructure, the billing amount significantly decreased, leading to the vendor being unable to provide discounts on the lower billing. In order to increase flexibility and maximize usage discount opportunities, the decision was made to transition to AWS.

Performed Tasks During Rehost Migration
  • Defined the migration strategy and implementation phases, and identified all dependencies that might surface during the migration.
  • Created WBS, flowcharts, phases, cost analysis, and architecture diagrams before starting the migration.
  • Set up organizations in the AWS management account using Control Tower within the Organizational Unit (OU) and established SCP policies and permission sets for user groups in Identity Center.
  • Wrote CFTs to set up VPC networking and application servers such as Elasticsearch, PostgreSQL, Memcached, Jetty, bastion host, Jenkins, and Metabase for all environments (dev, test, and prod).
  • Readied small workloads using CFTs, tested them with the QA team, and created the final migration plan.
  • Set up each environment one by one, from dev to prod, using CFTs.
  • Planned, coordinated, and executed the PostgreSQL database migration at night during low-traffic windows, including database backups and restores, ensuring minimal downtime, data integrity, and a smooth transition of all database components.
  • Created CI/CD pipelines in Jenkins for ECS and backend services to reduce manual deployment time.
  • Set up centralized logging in CloudWatch for backend applications and other supporting services such as Postgres, database monitoring, the response-time system, and WAF-blocked requests.
  • Created custom dashboards in CloudWatch to monitor different components of AWS services such as ElastiCache, backend servers, ALB, WAF, the Redshift cluster, and ASGs.
  • Mitigated security risks by setting up AWS WAF consisting of managed and custom rules.
  • Tuned ASGs in AWS for faster scaling based on custom CloudWatch alarms and configured HTTP requests as a secondary metric for effective scale-up operations.
  • Improved API response latency by implementing response-time monitoring with AWS services such as Redshift, Lambda, EventBridge, and Step Functions, with processing code in Java and Python.
  • Implemented a DB monitoring process utilizing PostgreSQL's built-in pg_stat statistics to identify and analyze slow queries, bottlenecks, and areas for optimization.
  • Implemented Prometheus for detailed monitoring of PostgreSQL databases, ensuring real-time visibility into replication status and resource utilization. Configured Alertmanager rules to trigger email notifications in case of replication failures or excessive resource usage.
  • Created and implemented various cost-reduction savings plans for different kinds of workloads, for example: Spot Instances for web servers, Reserved Instances for static workloads, and Compute Savings Plans for ElastiCache.
  • Maintain comprehensive and up-to-date documentation in Confluence, covering processes, procedures, and project-related information
Technical Architecture Diagrams
Sensitive information has been masked to ensure data privacy and security

GCP Skillset
  • Compute Engine
  • GKE
  • GLB
  • Caching
  • BigQuery
  • Billing
  • LookerStudio
  • Logging
  • Monitoring
  • VPC
  • Firewall
  • NAT
  • Stackdriver
  • IAM
  • Cloud Run
  • Cloud Storage
  • Cloud Armor
  • Cloud Tasks
  • Cloud Scheduler
  • Cloud Shell
AWS Skillset
  • OU
  • Control Tower
  • Identity Center
  • CloudFormation
  • Redshift
  • EKS
  • VPC
  • EC2
  • ECS
  • Elasticache
  • CloudWatch
  • Billing
  • IAM
  • Lambda
  • WAF
  • AWS Shield
  • AWS CLI
  • Cost Explorer
  • Certificate Manager
  • EventBridge
  • Step Function
  • CloudFront
  • R53
Other Skillset
  • Python
  • Java
  • SQL
  • Docker
  • Gradle
  • PostgreSQL
  • MongoDB
  • Elasticsearch
  • Metabase
  • Jenkins
  • Kafka
  • Zookeeper
  • Jetty
  • Shell Scripting
  • Ubuntu 22
  • Jira
  • Confluence
  • Bitbucket

Deployment on Jetty

Problem Statement

The project aims to streamline the deployment process, ensure consistency, and enhance efficiency in deploying 7 Java backend applications on 7 different Jetty servers.

The client didn't want separate pipelines for each backend service.

Solution

My solution is a single Jenkins pipeline that requires runtime parameters from the user before triggering: the user enters a branch name and chooses an application name to initiate deployment.

There are two CI/CD pipelines: the first deploys a single backend service; the second deploys all 7 Java services, in which case the user doesn't need to choose an application name and only passes a branch name to initiate deployment.
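
For illustration, such a parameterized Jenkins job can also be triggered remotely through Jenkins' standard buildWithParameters endpoint; the URL, job name, parameter names, and credentials below are placeholders:

```python
import requests

# URL, job name, parameter names, and credentials are all placeholders.
JENKINS_URL = "https://jenkins.example.internal"

resp = requests.post(
    f"{JENKINS_URL}/job/deploy-java-backend/buildWithParameters",
    params={"BRANCH_NAME": "release/1.4", "APP_NAME": "orders-service"},
    auth=("ci-user", "jenkins-api-token"),
)
resp.raise_for_status()  # Jenkins replies 201 with a Location header for the queued build
print("Queued:", resp.headers.get("Location"))
```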

Tech
  • Jenkins
  • Shell Script
  • EC2
  • BitBucket
  • Netcat
  • Gradle
  • Jetty
  • Slack
  • Java
  • CloudWatch
  • Linux
  • Networking
Pipeline for lower environments

In the lower-environment CI/CD pipelines, the last 3 artifacts are retained in case of rollback. A deployment lock is required: if another team member triggers the all-services pipeline, it waits as long as the current deployment lock exists.

Pipeline for production environment

For production deployment, we create EC2 AMIs from an instance on which Java, Jetty, and the CloudWatch agent are already installed. An Auto Scaling instance refresh replaces servers in batches: new servers are created first, and only then are the old servers deregistered and terminated (see the sketch below).
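
A minimal sketch of kicking off such a batch replacement, assuming the ASG's launch template has already been updated to point at the new golden AMI; the ASG name and preferences are illustrative:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# ASG name and preferences are illustrative.
response = autoscaling.start_instance_refresh(
    AutoScalingGroupName="jetty-prod-asg",
    Preferences={
        "MinHealthyPercentage": 90,  # keep most capacity in service during the refresh
        "InstanceWarmup": 300,       # seconds before a new instance counts as healthy
    },
)
print("Instance refresh started:", response["InstanceRefreshId"])
```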

Bitbucket CICD Pipelines

Android App CICD

Objective:
The pipeline will automate the process of building and deploying the Android application to various environments, ensuring faster and more reliable delivery of software updates.

Requirements:
- Bitbucket Pipelines must be configured to automatically trigger CI/CD workflows upon code changes in the repository.
- Automated build tasks to compile the Android application.
- Gradle build scripts should be utilized to manage dependencies and build configurations.
- Proper code signing mechanisms must be implemented for secure app distribution.
- Pipeline should automatically deploy the Android application to different platforms for the QA team (e.g., Diawi and BrowserStack).
- Pipeline should store versioned artifacts in an S3 bucket to keep track of application releases.
- Deployment to Google Play Store should be supported.
- Email notifications or chat integration (e.g., Slack, MS Teams) should be configured to notify the team about build and deployment status.

Node Backend ECS CICD

Objective:
Automate the process of building Docker images, maintaining image versions in ECR, and deploying new images on AWS ECS.

Key Features:
- The Node.js application is containerized using Docker, allowing consistent deployment and scalability in the ECS environment
- The pipeline automates the deployment of the Dockerized Node.js application to Amazon ECS. IAM credentials and permissions are securely managed and integrated into the pipeline.
- Configuration variables and environment-specific settings are managed through AWS Parameter Store.
- Email notifications or chat integration (e.g., Slack, MS Teams) should be configured to notify the team about build and deployment status.

ReactJS Deployment on S3

Objective: Leverage Bitbucket Pipelines to automate the build and deployment of the React.js app to an Amazon S3 bucket and trigger CloudFront invalidation, ensuring seamless and efficient content delivery to end users.

Key Features:
- Upon successful S3 bucket deployment, CloudFront invalidation is triggered to ensure immediate propagation of changes to the content delivery network, reducing latency for end users (see the sketch after this list).
- Configuration variables and environment-specific settings are managed through AWS Parameter Store.
- Email notifications or chat integration (e.g., Slack, MS Teams) should be configured to notify the team about build and deployment status.
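
The invalidation step referenced above is a single API call. A minimal boto3 sketch with a placeholder distribution ID:

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

# Distribution ID is a placeholder; "/*" invalidates all cached paths.
cloudfront.create_invalidation(
    DistributionId="E1234EXAMPLE",
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/*"]},
        "CallerReference": str(time.time()),  # must be unique per request
    },
)
```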

Pipelines for lower environments
Tech
  • BitBucket
  • Gradle
  • Java
  • NPM
  • ECS
  • S3
  • ECR
  • Lambda
  • CloudFront
  • Secrets Manager
  • Diawi
  • Docker
  • Browserstack
  • Teams
  • Slack

NodeJS Deployment on AWS Lambda

The Bitbucket CI/CD pipeline for the Node.js project is designed to automate the build and deployment process of serverless applications running on AWS Lambda.

Key Features:
- The pipeline includes a mechanism to update Lambda layers whenever a new version is pushed to the repository, ensuring that all Lambda functions using the layers can access the latest version of shared code and resources (see the sketch after this list).
- Configuration variables and environment-specific settings are managed through AWS Parameter Store.
- Email notifications or chat integration (e.g., Slack, MS Teams) should be configured to notify the team about build and deployment status.
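
A minimal boto3 sketch of the layer-update mechanism described above; the layer name, artifact location, function name, and runtime are placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

# Layer name, bucket/key, runtime, and function name are placeholders.
layer = lambda_client.publish_layer_version(
    LayerName="shared-node-utils",
    Content={"S3Bucket": "build-artifacts-bucket", "S3Key": "layers/shared-node-utils.zip"},
    CompatibleRuntimes=["nodejs18.x"],
)

# Point the function at the newly published layer version.
lambda_client.update_function_configuration(
    FunctionName="orders-api",
    Layers=[layer["LayerVersionArn"]],
)
```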

Containers Deployment

Problem Statement

The project aims to streamline the deployment workflow, enable continuous integration and delivery, and provide a resilient infrastructure for running containerized applications in AWS ECS & EKS.

It utilizes AWS EKS for container orchestration, ensuring scalability, high availability, and efficient resource management.

Solution

Use Jenkins to deploy the containerized application to the AWS EKS cluster. This involves creating Kubernetes deployment manifests and applying them to the cluster.

Set up automated triggers to deploy the application whenever changes are committed to the repository. Perform rolling updates or blue-green deployments to minimize downtime during updates.
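
A hedged sketch of the rolling-update step using the official Kubernetes Python client rather than raw manifests; the deployment name, namespace, and image are placeholders. Patching the pod template's image triggers Kubernetes' default rolling update:

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside the cluster
apps = client.AppsV1Api()

# Deployment name, namespace, and image tag are illustrative.
apps.patch_namespaced_deployment(
    name="orders-service",
    namespace="prod",
    body={
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "orders-service",
                            "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/orders:v2",
                        }
                    ]
                }
            }
        }
    },
)
```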

Tech
  • Jenkins
  • Shell Script
  • EKS
  • ECS
  • BitBucket
  • Docker
  • Gradle
  • Java
  • CloudWatch

Automate Database Deployments

Problem Statement

The current database process lacks automation and efficiency, leading to deployment delays and difficulty scaling quickly. This manual approach poses a risk of errors and security vulnerabilities. To address these challenges, a robust and streamlined DevOps solution is needed to automate database change deployments, ensuring smooth operations and enhancing data security.

Solution

The solution for database DevOps involves leveraging Redgate SQL Compare tool to automate and streamline the database deployment process. By utilizing SQL Compare, developers can compare and synchronize database schema changes between different environments, ensuring consistency and minimizing errors. This tool facilitates automated script generation and deployment, allowing for smooth and efficient database updates while maintaining data integrity. Additionally, it enhances collaboration between development and operations teams, enabling faster releases and reducing the risk of deployment failures.

Result

Achieved significant improvements in our deployment process. The automation and synchronization capabilities of SQL Compare have led to faster and more reliable database updates, reducing the risk of errors and downtime. The tool's script generation and deployment features have streamlined the release cycle, enhancing collaboration between teams and enabling efficient delivery of new features. Overall, the adoption of SQL Compare has resulted in a more agile and efficient database management process, contributing to improved application performance and user experience.

Technical Flow Chart

Tech
  • MSSQL
  • SQL Compare
  • TeamCity
  • PowerShell
  • Azure Repo
  • TFS

.NET Framework Deployments

Description

For the .NET IIS web server project, I implemented a robust CI/CD pipeline to streamline the development and deployment process. The pipeline is integrated with the TFS Azure repo, enabling automatic triggers upon code commits. It starts with a build phase where the code is compiled and the MVC and Web Deploy packages are bundled into artifacts. These artifacts are then automatically deployed to various test environments for integration and acceptance testing. Once the tests pass, the pipeline deploys the application to the production server. To ensure continuous monitoring, we integrated the logging and monitoring tool Datadog. This CI/CD pipeline significantly reduces manual intervention, accelerates deployment cycles, and enhances overall development efficiency while maintaining high-quality releases.

Tech
  • .NET Framework
  • MSBuild
  • NPM
  • Grunt
  • Bower
  • TeamCity
  • PowerShell
  • Azure Repo
  • TFS
  • ElasticBeanstalk
  • System Manager
Pipeline for Preprod environment
Pipeline for production environment

Creatio Project Deployments

Description

For the Creatio project, we set up a streamlined CI/CD pipeline using TeamCity. It starts with code commits triggering automated builds and tests in lower environments. The pipeline deploys the application to staging environments for further validation and, after successful testing, promotes it to production. We've integrated TFS and CodeDeploy to ensure continuous integration and delivery on all Creatio servers.

Tech
  • Creatio
  • TeamCity
  • PowerShell
  • Azure Repo
  • TFS
  • CodeDeploy
Pipeline for staging environment
Pipeline for production environment

Interested to Work?

We look forward to discussing your project in more detail and finding the best solutions to meet your needs.

Contact

Pricing Table

Full-time work

I am available for full-time work

$3999
  • DevOps
  • Cloud
  • Application Modernization
  • Technical Architecture
Hire Me

Technical Consultation

I am available for consulting

$500
  • DevOps
  • Cloud
  • Application Modernization
  • Technical Architecture
Hire Me

Hourly work

I am available for hourly projects

$45
  • DevOps
  • Cloud
  • Application Modernization
  • Technical Architecture
Hire Me