About Me
I am an Applied Deep Learning Research Scientist at NVIDIA, where I work on Megatron-Inference. I completed my Ph.D. in Computer Science at the University of Maryland (UMD), College Park, advised by Prof. Abhinav Bhatele in the Parallel Software and Systems Group. Prior to UMD, I completed my Bachelor's and Master's degrees in Computer Science and Engineering at the Indian Institute of Technology, Kharagpur.
My research focuses on High-Performance Computing (HPC) for AI, with an emphasis on scalable solutions for efficient deep learning inference and training on large-scale GPU clusters. During my Ph.D., I received the Outstanding Graduate Research Assistant Award at the University of Maryland (AY 2023-24) for my contributions to HPC and AI training.
At NVIDIA, I work on Megatron-LM, building low-latency multi-GPU inference for Mixture-of-Experts (MoE) models. My work spans MXFP8 quantized MoE inference, NVLS-based communication kernels, and scaling to Blackwell NVL72 systems.
