Cloud Infrastructure · Kubernetes · Platform Engineering

Helping teams ship safer,
more resilient infrastructure

14 years running production infrastructure, with Kubernetes since 2015. From monolith migrations to supply chain incident response, I architect platforms that scale without breaking.

14+ Years in production
250+ Environments managed
50+ Microservices operated
2 Cloud platforms, simultaneously

What I do

01
Kubernetes & Cloud Architecture

Designing and operating production-grade EKS and GKE clusters. Multi-account, multi-cloud, with Istio service mesh, RBAC hardening, and network policies that actually enforce boundaries.

EKS GKE Istio Cilium RBAC
02
Infrastructure as Code & GitOps

Multi-environment Terraform and Terragrunt setups with versioned module pipelines. ArgoCD App-of-Apps patterns, sync waves, and GitOps workflows where Git is the only way things change.

Terraform Terragrunt ArgoCD Helm SOPS
03
DevSecOps & Runtime Security

Supply chain hardening, OIDC-based CI/CD, runtime threat detection with Falco and Tetragon, WAF consolidation, and secrets management. Security baked in, not bolted on.

Falco Tetragon OIDC WAF eBPF

Deep dives

2025 Security · Incident response
Supply chain attack: from compromised runner to hardened pipeline

A malicious package was injected via a compromised GitHub Actions runner, exposing secrets across AWS and GCP. Led full incident response: forensic analysis of 58 findings, credential rotation, OIDC migration replacing static service account keys, runtime threat detection deployment, and a reusable playbook shared org-wide.

Zero confirmed breach · Pipeline fully hardened · No downtime
2024 Architecture · Multi-cloud
WAF & API gateway consolidation across 250+ environments

Fragmented WAF policies, inconsistent DNS, and per-service certificates made security audits painful. Designed a unified architecture behind Imperva WAF with DNS delegation via Terraform — enabling instant backend switching between AWS and GCP without DNS changes, eliminating certificate sprawl across all environments.

Single security perimeter · Instant multi-cloud failover · 100% managed as code
2022 Platform · GitOps
GitOps at scale: ArgoCD App-of-Apps across 250+ environments

Ad-hoc deployments caused config drift, inconsistent rollouts, and no audit trail for production changes. Designed a GitOps platform with ArgoCD App-of-Apps and sync waves for ordered deployment orchestration. Git became the single source of truth — no manual kubectl, rollbacks are a git revert.

Zero config drift · Full audit trail · Dramatically faster time-to-production
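
A rough sketch of the ordering idea behind sync waves, for illustration only. It groups manifests by the `argocd.argoproj.io/sync-wave` annotation (missing annotation treated as wave 0) and lists them lowest wave first. The manifest names are hypothetical, and ArgoCD's real ordering also considers phases, kinds, and health checks, which this sketch ignores.

```python
from itertools import groupby

SYNC_WAVE = "argocd.argoproj.io/sync-wave"


def wave_of(manifest: dict) -> int:
    """Read the sync-wave annotation; resources without one land in wave 0."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return int(annotations.get(SYNC_WAVE, "0"))


def plan_waves(manifests: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (wave, resource names) pairs, lowest wave first."""
    ordered = sorted(manifests, key=wave_of)  # stable: keeps input order within a wave
    return [
        (wave, [m["metadata"]["name"] for m in group])
        for wave, group in groupby(ordered, key=wave_of)
    ]


# Hypothetical manifests: CRDs must exist before the operator, the operator before apps.
manifests = [
    {"metadata": {"name": "app", "annotations": {SYNC_WAVE: "2"}}},
    {"metadata": {"name": "crds", "annotations": {SYNC_WAVE: "-1"}}},
    {"metadata": {"name": "operator"}},
    {"metadata": {"name": "config", "annotations": {SYNC_WAVE: "2"}}},
]

for wave, names in plan_waves(manifests):
    print(wave, names)
```

Negative waves are a handy slot for CRDs and namespaces, since everything unannotated defaults to wave 0 and follows them.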
2019 Platform engineering · Kubernetes
Developer-first deployment platform with custom Kubernetes operator

Every new microservice required DevOps coordination for manifests, databases, buckets, and secrets — a constant bottleneck. Built a simple YAML config per service declaring its full context boundary. A custom Kubernetes operator reconciled everything automatically: deployment, database, storage, secrets, networking. Developers gained full autonomy. DevOps kept guardrails.

Self-service deployments · Consistent resource ownership · Zero DevOps bottleneck
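
To make the reconcile idea concrete, here is a minimal sketch, not the operator itself. It assumes a hypothetical per-service config with `name`, `database`, `buckets`, and `secrets` fields, derives the set of resources that config implies, and diffs it against what already exists. The field names and resource naming are assumptions made for the example.

```python
def desired_resources(service: dict) -> set[tuple[str, str]]:
    """Expand one service config into the (kind, name) pairs it should own."""
    name = service["name"]
    resources = {("deployment", name)}
    if service.get("database"):
        resources.add(("database", name))
    for bucket in service.get("buckets", []):
        resources.add(("bucket", f"{name}-{bucket}"))
    for secret in service.get("secrets", []):
        resources.add(("secret", f"{name}-{secret}"))
    return resources


def reconcile(service: dict, actual: set[tuple[str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Compare desired and actual state; return the actions needed to converge."""
    desired = desired_resources(service)
    return {
        "create": sorted(desired - actual),
        "delete": sorted(actual - desired),
    }


# Hypothetical service config and cluster state.
service = {
    "name": "billing",
    "database": True,
    "buckets": ["invoices"],
    "secrets": ["api-key"],
}
actual = {("deployment", "billing"), ("bucket", "billing-old")}

print(reconcile(service, actual))
```

Once the actions are applied, running the same reconcile again yields no work, which is the property that lets an operator loop forever without side effects.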
2017 Open source · SRE
5–10TB/day: building HA logging at scale and contributing to FluentBit

The EFK stack on Kubernetes was handling terabytes of daily VPN traffic, and existing tooling couldn't throttle log ingestion without losing data. Designed a custom HA logging and monitoring setup and contributed the "Throttle" filter plugin to FluentBit, which became part of the upstream project. Still used in production systems today.

Open source contribution · System stability at multi-terabyte daily scale
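
For readers unfamiliar with rate-based throttling, a simplified sliding-window sketch follows. It illustrates the general technique only and is not the plugin's actual code: a record is admitted while the average count per interval across the window stays at or below a target rate. The class and method names are made up for the example, and what happens to records that are not admitted (buffering, retrying, dropping) is left to the caller.

```python
from collections import deque


class SlidingWindowThrottle:
    """Admit records while the average per-interval count over the window stays <= rate."""

    def __init__(self, rate: float, window: int):
        self.rate = rate      # target average records per interval
        self.window = window  # number of intervals the average is taken over
        self.counts = deque([0] * window, maxlen=window)

    def tick(self) -> None:
        """Start a new interval; the oldest interval falls out of the window."""
        self.counts.append(0)

    def admit(self) -> bool:
        """Return True and count the record if it fits within the windowed rate."""
        if (sum(self.counts) + 1) / self.window > self.rate:
            return False
        self.counts[-1] += 1
        return True


# Average of 2 records per interval over 3 intervals: a burst of 6 fits, then it closes.
throttle = SlidingWindowThrottle(rate=2, window=3)
burst = [throttle.admit() for _ in range(8)]
print(burst.count(True), "admitted,", burst.count(False), "held back")
```

Averaging over a window instead of capping each interval lets short bursts through while still bounding sustained throughput.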

Stack & tooling

Cloud & Orchestration
AWS (EKS, EC2, S3, RDS, IAM); GCP (GKE, GCS, IAM, Pub/Sub); Kubernetes (since 2015); Docker (since 2013); Istio service mesh; Cilium · eBPF
IaC & GitOps
Terraform · Terragrunt; ArgoCD · Flux; Helm · Kustomize; GitHub Actions · OIDC; SOPS · KMS; External Secrets Operator
Security & Observability
Falco · Tetragon; KubeArmor; Imperva WAF; AWS GuardDuty · Security Hub; Prometheus · Grafana; Loki · FluentBit
Messaging & Data
Kafka (Strimzi · KRaft); PostgreSQL · Redis; Kong API Gateway; Bash · Python · Go; NetBird · WireGuard; k3s · home lab

Writing & experiments

Coming soon
eBPF tracing in production: what Tetragon actually sees

A practical look at deploying Tetragon for runtime security — what syscall-level visibility looks like in a real Kubernetes cluster, and the false positive problem nobody talks about.

// runtime-security · ebpf
Coming soon
OIDC for CI/CD: why static keys are a disaster waiting to happen

Migrating GCP service account keys to Workload Identity Federation after a supply chain incident. The before/after, the gotchas, and a Terraform setup you can actually use.

// security · oidc · gcp
Coming soon
ArgoCD sync waves: ordering chaos in 250+ environments

How sync waves solve the "CRD not ready" problem at scale. Practical patterns for App-of-Apps, what breaks first, and the monitoring setup that saves you at 2am.

// gitops · argocd · kubernetes
Coming soon
Home lab: 5-node k3s cluster with NetBird and PXE boot

Building a proper home lab for testing security tooling — QNAP NAS as PXE server, UniFi networking, NetBird for remote access, and why the GL.iNet GL-RM10 is worth it.

// homelab · k3s · networking
Coming soon
Kafka KRaft migration on Strimzi: Kafka 3.7 → 4.1 without downtime

The full runbook for migrating a production Kafka cluster through ZooKeeper deprecation to KRaft mode. Metadata version transitions, the rollback plan, and what the monitoring looks like mid-migration.

// kafka · strimzi · kubernetes
Coming soon
Multi-cloud naming strategy: deterministic hashes at 250+ envs

When you have 250+ environments across AWS and GCP, resource naming becomes a serious operational problem. A hash-based convention that makes debugging production logs take seconds instead of minutes.

// platform · multi-cloud · iac
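
The convention itself is not spelled out above, so the following is one plausible shape, offered as a sketch under assumptions. It hashes the identifying inputs (cloud, environment, service, resource kind) into a short suffix, so the same inputs always yield the same name on either cloud. The input fields, the name layout, and the 8-character suffix are all assumptions made for illustration.

```python
import hashlib


def resource_name(cloud: str, env: str, service: str, kind: str, hash_len: int = 8) -> str:
    """Build a deterministic resource name with a short hash of its identifying inputs."""
    key = "/".join([cloud, env, service, kind]).lower()
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:hash_len]
    return f"{service}-{kind}-{digest}".lower()


def build_index(resources: list[tuple[str, str, str, str]]) -> dict[str, tuple[str, str, str, str]]:
    """Map each generated name back to the inputs that produced it."""
    return {resource_name(*r): r for r in resources}


# Hypothetical inventory: the same service and kind in two clouds and two environments.
inventory = [
    ("aws", "prod-eu", "billing", "bucket"),
    ("gcp", "prod-eu", "billing", "bucket"),
    ("aws", "staging", "billing", "bucket"),
]
index = build_index(inventory)

for name, origin in index.items():
    print(name, "->", origin)
```

With an index like this, a name seen in a log line resolves to its cloud and environment with a single lookup. The same function can run inside IaC tooling and inside debugging scripts without a shared database, since the name is a pure function of its inputs.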