- Built a PySpark pipeline to extract clean, structured features from 500GB+ of unstructured diagnostic logs, powering ML models that forecasted component failures and reduced undetected errors by 20%
- Reduced customer escalations by 15% by using A/B testing to identify at-risk customer satisfaction scores, working cross-functionally with product, service, and engineering stakeholders
- Embedded automated data validation and unit testing into CI/CD pipelines, elevating test coverage to 95% and reducing production data anomalies

I build scalable data infrastructure and deploy AI applications that drive strategic business outcomes. With 4+ years of experience developing large-scale applications at global 100 firms such as Mercedes-Benz R&D, I specialize in transforming complex data challenges into production-grade ML solutions.
Professional Experience
- Built a computer vision pipeline in PyTorch to process and analyze 400K+ highly imbalanced medical images, achieving 94% sensitivity
- Trained a hybrid CNN-Vision Transformer architecture using ResNet-50 with transfer learning, achieving 0.92 AUC-ROC for melanoma detection
- Conducted EDA on 50+ features across 100K+ customer records and applied SHAP for model interpretability
- Built ETL pipelines with data validation to consolidate multi-source customer data, feeding Power BI dashboards and reducing manual prep by 4 hrs/week
Technical Expertise
Languages & Core Tools
MLOps & Cloud
Machine Learning & GenAI
Data Processing & Viz
Featured Projects
AskNEU - RAG-Based Conversational AI
Built a RAG-based chatbot using LangChain and Hugging Face Transformers to answer student questions by indexing 1,000+ web pages into Pinecone Vector DB. Automated data ingestion with Airflow and deployed on GCP Cloud Run via Docker with GitHub Actions, achieving 92% retrieval accuracy.
Personalized Financial Recommendation System
Built a recommendation engine by clustering users into financial personas using K-Means and matching profiles to 26K+ financial products via cosine similarity. Trained XGBoost and Random Forest classifiers, achieving 0.91 macro F1 and 0.93 precision, and tracked experiments with MLflow.
Skin Cancer Detection Pipeline
Developed a deep learning pipeline in PyTorch to process 400K+ highly imbalanced medical images. Implemented a hybrid CNN-Vision Transformer architecture with ResNet-50 transfer learning, achieving 0.92 AUC-ROC for melanoma detection.
Education
Northeastern University
Master of Science in Data Science
Boston, MA
Visvesvaraya Technological University
Bachelor of Engineering in Computer Science
Karnataka, India
Let's Connect
I'm actively exploring opportunities in data engineering, machine learning engineering, and data science. Whether you have a role that might be a fit or just want to discuss data and ML, I'd love to hear from you.