Sudipta Pathak
Lead AI Infrastructure Software Engineer
Professional Summary
Lead Machine Learning Infrastructure Engineer with 8+ years of experience building and scaling production AI platforms and distributed systems. Deep expertise in LLM-powered systems, ML lifecycle management, and cloud-native infrastructure on AWS. Proven ability to design low-latency, highly available inference services, optimize performance and cost at scale, and collaborate across engineering, research, and product teams to deliver responsible, enterprise-grade AI systems.
Technical Skills
AI & ML
ML Platforms
Cloud
Distributed Systems
Languages
CI/CD
Work Experience
Lead ML Infrastructure Engineer
JPMorgan Chase, Machine Learning Center of Excellence
Nov 2023 - Present
Jersey City, New Jersey, USA
- Spearheaded the design and implementation of multi-region infrastructure for LLM Suite, ensuring high availability and scalable enterprise adoption
- Led development of an agentic LLM framework for summarization and multi-step reasoning using LangChain and LangGraph
- Designed and operated production LLM inference pipelines with observability, rate limiting, and failure isolation to support enterprise-scale usage
- Engineered Kubernetes-based inference services for chat-based AI assistants, optimizing resource utilization and system performance
- Partnered with platform, product, and research teams to deliver secure, reliable AI capabilities across the organization
Software Development Engineer (Backend Infrastructure)
Amazon Web Services, Glue
Sept 2022 - Aug 2023
New York City, New York, USA
- Led a team of 6 engineers to deliver AWS Glue support for large instance types, enabling high-memory, high-throughput ETL workloads
- Designed and implemented backend solutions to mitigate hot partition issues, significantly improving service scalability
- Eliminated recurring customer issues by introducing automated cleanup for leaked Elastic Network Interfaces (ENIs)
- Independently architected and drove features to reduce Glue job startup latency, improving customer time-to-insight
- Participated in on-call rotations, triaging and resolving high-severity customer-facing production incidents
Senior Software Engineer
Bloomberg LP
July 2020 - Sept 2022
Princeton, NJ, USA
- Led migration of critical financial data services from legacy C++ systems to event-driven, containerized Python microservices
- Drove adoption of Kubernetes-based deployments, improving scalability, reliability, and operational consistency across production services
- Designed distributed ingestion pipelines for high-volume financial and news data supporting downstream analytics and products
Machine Learning Engineer
Siemens Corporation
Oct 2017 - July 2020
Princeton, NJ, USA
- Served as Principal Investigator and Technical Lead for a DARPA-funded project delivering scalable platforms for information extraction and document understanding
- Architected and implemented end-to-end machine learning systems for complex data modalities, including point cloud datasets
Machine Learning Engineering Intern
Bentley Systems Inc.
Feb 2015 - May 2015
Watertown, CT, USA
- Developed and evaluated machine learning models for smart water networks to predict water usage and detect abnormal events in real time
- Improved prediction accuracy by 2.2% over a baseline ANN framework, reducing water leakage and false alarms on production datasets
- Scaled machine learning pipelines using AWS and GPU acceleration, improving training and inference throughput
Education
PhD, Computer Science and Engineering
University of Connecticut
Aug. 2011 - Jun. 2017
Storrs, CT, USA
BS, Computer Science and Engineering
West Bengal University of Technology
Aug. 2011 - Jun. 2017
Kolkata, WB, India