Engineering
systems that
never sleep.
I'm Nishant Kamal. I build resilient, automated cloud platforms that scale without breaking.

Reliability by the Numbers
Years of hands-on experience across cloud infrastructure, platform engineering, and site reliability.
About Me.
Building Robust Systems & Scalable Infrastructure
Site Reliability Engineer
Nishant Kamal
Site Reliability Engineer working extensively with cloud-native and Kubernetes-based distributed systems at scale — handling production reliability, mitigating outages during high-traffic events, and building resilient platform infrastructure.
I was never the smartest in the room — I just never left until I was.
Always eager to collaborate on DevOps, SRE, and platform engineering initiatives — whether it's infrastructure design, automation strategies, or building resilient, scalable systems.
Technical Stack
& Ecosystem.
Work that
moves the needle.
Deployed scalable, highly available infrastructure on AWS EKS with integrated monitoring via Prometheus, New Relic, and Grafana for proactive issue resolution. Streamlined CI/CD pipelines by integrating FluxCD and GitHub Actions, increasing automation and improving deployment cycles across teams. Used Helm charts to automate application deployments, manage Kubernetes resources, and simplify infrastructure-as-code processes.
Implemented Karpenter for dynamic scaling, replacing the Auto Scaling Group (ASG) to optimize resource utilization and improve cost efficiency across the Kubernetes environment. Deployed multiple microservices with Karpenter-managed autoscaling while ensuring seamless traffic management. Implemented cost-saving measures across all cloud-based infrastructure environments by tuning resource utilization.
Designed and executed a proof of concept for Oracle Cloud Infrastructure (OCI) and successfully deployed the first production environment. Deployed Karpenter on OCI to automate node provisioning and manage compute costs efficiently through intelligent bin-packing. Deployed Istio with VirtualService and Gateway resources to manage traffic routing and microservice communication within Kubernetes clusters, streamlining service mesh configuration. Resolved all QA-reported issues to achieve a stable, live deployment.
Led the decommissioning of the old Azure Container Registry (ACR) and migrated all assets to a new ACR instance. Executed a full audit and cleanup of stale and unused images, reducing storage bloat and eliminating legacy registry dependencies across all pipelines. Troubleshot production issues with development teams and implemented fixes that minimized downtime throughout the migration.
Built a Helm-based centralized alerting system that improved monitoring consistency across 50+ services. Reduced MTTR by establishing unified runbooks, alert routing rules, and escalation policies. Conducted root-cause analyses after major incidents to identify process and technical improvements. Proactively identified performance bottlenecks and used monitoring and automation to keep improving system resilience and reliability.
Implemented Crossplane across Oracle Cloud (OCI) and Azure to standardise infrastructure definitions and eliminate manual provisioning. Created reusable Compositions and XRDs enabling developer self-service for on-demand cloud environments. Reduced provisioning time by 40–60% and improved reproducibility across both cloud providers.
Building on Engineering Excellence.
Recognition &
Awards.
7 awards across 4 years
For being the Customer Happiness Champion for the OND 2024 quarter, reflecting outstanding dedication to client success and service quality.
Recognised for emerging leadership and innovation in SRE practices, demonstrating exceptional growth and technical initiative at FarEye.
Recognised for consistently going above and beyond designated responsibilities to support customers, often extending beyond regular shift hours, reflecting strong ownership, team-first mindset, and a customer-centric approach in high-pressure situations.
For demonstrating the superpower of passion for customers for OND 2022.
For demonstrating the superpower of complex problem-solving for JAS 2022.
For demonstrating the superpower of complex problem-solving for JFM 2022.
For demonstrating the superpower of passion for customers for JAS 2020.
Let's Connect.
Interested in building reliable, scalable infrastructure or discussing cloud-native architectures? Drop a message.