Writing
Notes from the infra stack.
Deep dives on GPU orchestration, CI/CD, Kubernetes, observability, and the operational practices behind production AI infrastructure.
- 7 min read
Dynamic PR environments for services on EKS
Per-PR Kubernetes environments on a shared EKS cluster: ArgoCD git-directory generators, a three-layer Helm value cascade, multi-source Applications, and a namespace-reaper controller that handles the cleanup ArgoCD pruning misses.
- gitops
- eks
- argocd
- kubernetes
- helm
- 1 min read
vLLM vs SGLang vs TGI — operational benchmarks (coming soon)
Side-by-side production benchmarks of three open-source inference servers: throughput, latency, KV-cache behavior, and the operational gotchas you'll hit at scale.
- inference
- vllm
- sglang
- tgi