About Me
I am a Machine Learning Systems Engineer with deep expertise in building production-grade AI infrastructure. My work ranges from distributed training systems to edge deployment, with a focus on scalability, efficiency, and reliability.
Distributed Training
Large-scale model training across GPU clusters using optimized data-parallel and model-parallel strategies.
ML Infrastructure
Building robust ML platforms with Kubernetes, Kubeflow, and custom orchestration for production workloads.
Data Pipelines
High-throughput data processing pipelines with Apache Spark, Ray, and modern data lake architectures.
Cloud-Native ML
Multi-cloud deployments on AWS, GCP, and Azure with a focus on cost optimization and scalability.
Model Optimization
Quantization, pruning, and distillation techniques for deploying efficient models at the edge and in the cloud.
MLOps & Security
End-to-end ML lifecycle management with CI/CD, monitoring, and enterprise-grade security practices.