Madhosh Yagnik | DevOps Engineer
Production infrastructure across AWS and Azure - IaC migrations, CI/CD pipelines, and automation that replaces slow manual work. I care about systems that are reliable, cost-efficient, and easy for the next person to understand.
01 AWS Disaster Recovery & Infrastructure Automation [AWS]
Designed and delivered multiple DR and infrastructure automation solutions for a production banking environment. Every decision was weighed against cost.
- Migrated Bastion servers from CentOS 7 (EOL) to Ubuntu 24.04 LTS with no service disruption. Rewrote the setup guide to cover both Debian-based and YUM-based distros.
- Built a cost-effective DR system: EventBridge + Lambda for automated RDS snapshots, cross-region copy, and health-monitored auto-recovery if the primary fails.
- Developed a parallel EC2 DR solution with AMI rotation, cross-region copy, auto-launch on health check failure, and SNS alerting.
- Evaluated AWS Read Replicas and DMS; excluded both on cost grounds after a full POC and documented the decision with supporting analysis.
- Resolved urgent Bitbucket pipeline failures and stabilised CI/CD long-term.
- Recovered access to a Windows EC2 instance after the private key was lost.
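The snapshot, cross-region copy, and health-check steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: it assumes boto3-style RDS clients for the primary and DR regions are passed in, and the snapshot naming scheme and the failure threshold of three checks are hypothetical.

```python
from datetime import datetime, timezone
from typing import Sequence


def snapshot_id_for(instance_id: str, now: datetime) -> str:
    """Build a timestamped snapshot identifier (hypothetical naming scheme)."""
    return f"{instance_id}-dr-{now.strftime('%Y%m%d-%H%M%S')}"


def snapshot_and_copy(primary_rds, dr_rds, instance_id: str,
                      source_region: str, now: datetime) -> str:
    """Snapshot the primary instance, wait for it, then copy it to the DR region.

    primary_rds and dr_rds are assumed to behave like boto3 RDS clients
    created for the primary and DR regions respectively.
    """
    snapshot_id = snapshot_id_for(instance_id, now)
    created = primary_rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=instance_id,
    )
    # A snapshot has to reach the "available" state before it can be copied.
    primary_rds.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier=snapshot_id
    )
    # The copy is requested from the destination region and refers to the
    # source snapshot by its ARN.
    dr_rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=created["DBSnapshot"]["DBSnapshotArn"],
        TargetDBSnapshotIdentifier=snapshot_id,
        SourceRegion=source_region,
    )
    return snapshot_id


def should_fail_over(recent_checks: Sequence[bool], threshold: int = 3) -> bool:
    """Return True only when the last `threshold` health checks all failed."""
    if len(recent_checks) < threshold:
        return False
    return not any(recent_checks[-threshold:])
```

Inside a Lambda the two clients would typically come from `boto3.client("rds", region_name=...)`. Requiring several consecutive failures before recovery is one way to avoid launching the DR copy on a single transient blip.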
02 DevOps Backlog - Docker, CMake, Makefile, Git, Linux [Linux]
Cleared a backlog of 72 DevOps tasks across Docker, Linux, Git, Makefile, and CMake - 64 accepted on first review.
- Picked up CMake and Makefile tasks while learning both tools in parallel - no blocked work or delays.
- Reported status at daily stand-ups, with clear progress tracking throughout.
03 Full Azure Migration & CI/CD Pipeline [Azure]
Took a project running entirely on local setups and brought it to a production-ready Azure deployment in one month.
- Migrated the full stack to Azure Cloud, resolving routing and network configuration issues along the way.
- Iterated CI/CD across three approaches - GitHub Actions, Azure DevOps via SSH, and finally an Azure agent-based pipeline with client-approved security controls.
- Optimised the frontend Dockerfile to serve static files - load time dropped from seconds to milliseconds.
- Dockerized all services; created Docker Compose stacks for consistent local and cloud environments.
- Added systemd services and cron jobs for self-starting apps at VM boot.
04 Terraform IaC Migration & Cost Optimisation [AWS]
Inherited a production AWS environment with no IaC, no state management, and known security gaps. Left it fully Terraform-managed, secured, and cheaper to run.
- Migrated all AWS resources to Terraform with Terraform Cloud for remote state and environment isolation. Negligible downtime during migration.
- Partnered with the security team to audit past incidents and implement preventive IAM controls.
- Reduced monthly AWS spend by ~$95-100 via right-sizing and cleanup.
- Moved the on-prem chatbot server from the office to the server room - eliminated recurring accidental disconnections.
- Delivered a complete handover; incoming engineer appreciated the thoroughness.
05 Lightweight RDS Backup Automation [AWS]
Client was paying for daily automated RDS backups on a staging environment that did not need them. Replaced the default behaviour with a purpose-built, native solution.
- Disabled built-in backups; implemented a monthly snapshot Lambda and a quarterly cleanup Lambda that retains the latest snapshot.
- Scheduled via EventBridge with SNS alerts. Wrote complete manual recovery documentation.
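The cleanup rule above (delete older snapshots, keep the latest) could look something like the sketch below. It is an illustration under assumptions: the input mirrors the `DBSnapshots` entries that boto3's `describe_db_snapshots` returns, and the two schedule expressions are hypothetical examples of EventBridge cron syntax rather than the schedules actually deployed.

```python
from datetime import datetime, timezone

# Hypothetical EventBridge schedules (fields: min hour day-of-month month day-of-week year).
MONTHLY_SNAPSHOT_SCHEDULE = "cron(0 3 1 * ? *)"           # 03:00 UTC on the 1st of each month
QUARTERLY_CLEANUP_SCHEDULE = "cron(0 4 1 1,4,7,10 ? *)"   # 04:00 UTC on the 1st of each quarter


def snapshots_to_delete(snapshots):
    """Return identifiers of every available snapshot except the newest one.

    Each item is a dict with "DBSnapshotIdentifier", "SnapshotCreateTime"
    and "Status", as in the "DBSnapshots" list from describe_db_snapshots.
    """
    # Ignore snapshots that are still being created or have failed.
    available = [s for s in snapshots if s.get("Status") == "available"]
    if len(available) <= 1:
        return []  # never delete the only usable snapshot
    newest = max(available, key=lambda s: s["SnapshotCreateTime"])
    return sorted(
        s["DBSnapshotIdentifier"]
        for s in available
        if s["DBSnapshotIdentifier"] != newest["DBSnapshotIdentifier"]
    )
```

Keeping the selection logic separate from the AWS calls makes it easy to check the retention rule without touching a real account; the cleanup Lambda would then call `delete_db_snapshot` for each returned identifier.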
06 Production Server Management & Deployment Security [Linux]
Ongoing management of production and staging for two separate products. Minimal setup, stable operations.
- Manage deployments and NGINX configurations; coordinate with hosting provider for system-level updates.
- Resolved CORS and React routing issues from misconfigured NGINX paths.
- Replaced Git token-based deployment with SSH deploy keys, documented in a one-page team guide and adopted on a separate Azure project for consistency.
07 SSL, DNS Recovery & Chatbot Stabilisation [AWS + DNS]
Picked up a production chatbot platform mid-incident - expired SSL, broken auto-renewal, and a domain blocked by a major social platform.
- Diagnosed and fixed the failed auto-renewal mechanism; renewed certificates.
- Resolved a domain blockage that had disrupted chatbot operations.
- Set up a temporary subdomain for business continuity; decommissioned cleanly after the main domain stabilised.
- Managed full domain transition to a new subdomain, updating backend and dependent service configurations.
- Upcoming: scoping migration of chatbot logic from AWS to client's own GCP VM.
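An expired certificate behind a silently broken auto-renewal is the kind of failure a small expiry check can surface early. The sketch below is illustrative only and is not the fix described above: it uses Python's standard `ssl` module, and the 30-day renewal threshold is a hypothetical value.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'.

    With the default context an already expired certificate fails
    verification and raises ssl.SSLCertVerificationError, which is itself
    a signal worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Convert a notAfter string into days remaining relative to `now` (UTC-aware)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, threshold_days: float = 30) -> bool:
    """True when fewer than `threshold_days` remain (threshold is hypothetical)."""
    return days_until_expiry(not_after, now) < threshold_days
```

Run from cron, a check like `needs_renewal(fetch_not_after("example.com"), datetime.now(timezone.utc))` would flag a stalled renewal weeks before the certificate actually lapses; `example.com` is a placeholder host.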
08 OpenShift CI/CD & Lab Automation [Confidential]
Working across two internal projects - a completed OpenShift plugin platform and an ongoing lab provisioning system used by around 1,000 engineers.
- Contributed to a monorepo-based solution enabling consistent developer deployments across internal teams.
- Implemented GitLab CI pipelines for linting, SonarQube scanning, and container image build/release automation. Supported plugin releases from v0.0.2 to v0.0.24 - stable with minimal maintenance since.
- Actively migrating RHEL 7 Lab Controllers to RHEL 9 as part of a vulnerability remediation effort.
- Automated VPN connection setup - removed manual OTP and credential steps, now single-click.
- Built a bash script that uses xargs to recover machines marked 'Broken' in parallel: it fetches a live list from the UI, confirms with the user, then processes the machines concurrently. Reduced lookup time from ~1 hour to a few minutes. This operation had been manual for over a decade.
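The confirm-then-process-in-parallel flow of that recovery script can be sketched as below. The original is a bash script built on xargs; this is a Python analogue using a thread pool, not a port of it. The `recover_one` callable, the prompt wording, and the worker count are hypothetical, and fetching the list from the internal UI is left out.

```python
from concurrent.futures import ThreadPoolExecutor


def recover_all(machines, recover_one, confirm=input, max_workers=8):
    """Ask for confirmation, then run recover_one on every machine in parallel.

    Returns a dict mapping machine name to "ok" or "failed: <reason>".
    An empty dict means there was nothing to do or the user declined.
    """
    machines = list(machines)
    if not machines:
        return {}
    answer = confirm(f"Recover {len(machines)} machine(s)? [y/N] ")
    if answer.strip().lower() != "y":
        return {}

    def attempt(name):
        # One failing machine should not stop the rest of the batch.
        try:
            recover_one(name)
            return name, "ok"
        except Exception as exc:
            return name, f"failed: {exc}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(attempt, machines))
```

Threads suit this kind of job because each recovery mostly waits on the network, much as `xargs -P` runs several waiting processes at once. Collecting a per-machine result makes it clear afterwards which machines still need attention.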