Yogita Bisht
Data Scientist & Engineer
Boston, MA

I build scalable data infrastructure and deploy AI applications that drive strategic business outcomes. With 4+ years of experience developing large-scale applications at Global 100 firms such as Mercedes-Benz R&D, I specialize in transforming complex data challenges into production-grade ML solutions.

4+ Years Experience
500GB+ Data Processed
20% Error Reduction
Open to Opportunities | Data Engineering | Machine Learning | Data Science

Professional Experience

Northeastern University

Data Science Research Assistant | May 2025 - Aug 2025 | Boston, MA
  • Built a computer vision pipeline in PyTorch to process and analyze 400K+ highly imbalanced medical images, achieving 94% sensitivity
  • Trained hybrid CNN-Vision Transformer architecture utilizing ResNet-50 with transfer learning, achieving 0.92 AUC-ROC for melanoma detection
PyTorch, Computer Vision, Deep Learning, Transfer Learning

Capgemini Engineering

Data Science Intern | Mar 2021 - Sep 2021 | Bangalore, India
  • Conducted EDA on 50+ features across 100K+ customer records and applied SHAP for model interpretability
  • Built ETL pipelines with data validation to consolidate multi-source customer data, feeding Power BI dashboards and reducing manual prep by 4 hrs/week
ETL, Power BI, SHAP, EDA

Technical Expertise

Languages & Core Tools

Java, Python, SQL, PostgreSQL, MongoDB, Git, Bash/Linux

MLOps & Cloud

Docker, FastAPI, MLflow, Snowflake, Databricks, dbt, Airflow, AWS, GCP

Machine Learning & GenAI

Scikit-learn, PyTorch, NLP, LangChain, LangSmith, Hugging Face, Pinecone

Data Processing & Viz

NumPy, Pandas, PySpark, Power BI, Streamlit

Featured Projects

RAG / LLM

AskNEU - RAG-Based Conversational AI

Built a RAG-based chatbot using LangChain and Hugging Face Transformers to answer student questions by indexing 1,000+ web pages into Pinecone Vector DB. Automated data ingestion with Airflow and deployed on GCP Cloud Run via Docker with GitHub Actions, achieving 92% retrieval accuracy.

LangChain, Pinecone, Airflow, GCP, Docker
ML / Recommendation

Personalized Financial Recommendation System

Built a recommendation engine by clustering users into financial personas using K-Means and matching profiles to 26K+ financial products via cosine similarity. Trained XGBoost and Random Forest classifiers, achieving 0.91 macro F1 and 0.93 precision, and tracked experiments with MLflow.

XGBoost, K-Means, FastAPI, Streamlit, MLflow
Computer Vision

Skin Cancer Detection Pipeline

Developed a deep learning pipeline in PyTorch processing 400K+ medical images with advanced techniques for handling class imbalance. Implemented hybrid CNN-Vision Transformer architecture with ResNet-50 transfer learning, achieving 0.92 AUC-ROC for melanoma detection.

PyTorch, Vision Transformer, ResNet-50, Transfer Learning

Education

Northeastern University

Master of Science in Data Science

Sep 2024 - May 2026 | GPA: 3.6/4.0

Boston, MA

Visvesvaraya Technological University

Bachelor of Engineering in Computer Science

Aug 2017 - Aug 2021 | CGPA: 8.5/10

Karnataka, India

Let's Connect

I'm actively exploring opportunities in data engineering, machine learning engineering, and data science. Whether you have a role that might be a fit or just want to discuss data and ML, I'd love to hear from you.

bisht.yo@northeastern.edu
Boston, MA
LinkedIn Profile | GitHub Profile