// about
Inference Engineer based in New York.
I'm Prem — an inference engineer focused on low-latency LLM serving, multi-model routing, and cost-efficient production AI infrastructure. I've deployed scalable AI systems on GCP and AWS with a hard bias toward latency, reliability, and the cost-latency-accuracy tradeoff — not whichever framework is trending this quarter.
Right now I'm finishing my M.S. in Artificial Intelligence at the Rochester Institute of Technology and shipping Emotion Engine, a research project showing that emotion-like internal states (fear, grief, suspicion) emerge from a 72-feature predictive LSTM with no hardcoded rules (p = 3.3e-113 across 205K agent-step records). Before RIT, I built a production LLM routing layer at Concentrix + Webhelp: routing across 3+ foundation models at 1.5 s end-to-end latency, with an 18% cost reduction and 42% fewer incidents.
The throughline of everything on this site: “Simple systems scale better than clever ones.”
Education
- M.S. in Artificial Intelligence — Rochester Institute of Technology, Aug 2024 – May 2026
- B.Tech in Computer Science and Engineering — National Institute of Technology, Silchar, Aug 2020 – May 2024
Experience
- Generative AI Engineer, Concentrix + Webhelp — Feb 2024 – Jul 2024
- Data Science Intern, AlphaBits Technologies — Aug 2023 – Jan 2024
- ML Engineer Intern, iNeuron.ai — Jun 2023 – Aug 2023
Publications
- Lightweight Channel Attention for Efficient CNNs — Designed and evaluated a lightweight channel attention module (LCA) achieving competitive accuracy with negligible parameter and latency overhead on ResNet-18 and MobileNetV2.
Certifications
- Machine Learning Specialization — Stanford Online · Coursera
// stack
- Backend & Programming
- Python · SQL · FastAPI · REST APIs · PostgreSQL · MongoDB · PySpark
- Distributed Systems & Data
- Kafka · Streaming pipelines · Feature stores · Data pipelines · Vector DBs · System design
- Cloud & Infrastructure
- GCP · AWS (EC2, S3, Bedrock, SageMaker) · Docker · Kubernetes · Terraform · CI/CD · MLflow
- Machine Learning
- PyTorch · TensorFlow · Scikit-learn · LLMs · RAG · NLP · Computer Vision · MCP