Yachika Anand

Data Science Portfolio

Skills & Expertise

Every skill below is backed by a project, certification, or measurable outcome — not just listed.

AI & LLMs

RAG Systems
MedVault-RAG — 95% API cost reduction via two-tier semantic caching
LangGraph
Stateful agentic loops with self-correction across MedVault-RAG & Research Hub
LangChain
Collaborative Research Intelligence Platform — multi-tenant RAG pipeline
Semantic Search
Hybrid dense + sparse vector search with cross-encoder reranking in MedVault-RAG
Prompt Engineering
Guardrails, faithfulness scoring, and query reformulation in production RAG
Vector Databases (Qdrant)
Multi-tenant hybrid search with BM25 + semantic vectors in MedVault-RAG

MLOps & DevOps

DVC
Emotion Detection MLOps Pipeline — full experiment reproducibility with one command
Docker
Docker Foundations Professional Certificate — Docker, Inc. (2025)
FastAPI
Production backend for MedVault-RAG and Book Recommender System
CI/CD
Automated Docker Compose production stack with Nginx, Gunicorn & Certbot
MLflow
Experiment tracking and model versioning in Emotion Detection MLOps Pipeline
Kubernetes
Container orchestration in Collaborative Research Intelligence Platform

Machine Learning

Python
Primary language across all 4 portfolio projects — from data to deployment
Scikit-learn
Book Recommender System — collaborative filtering engine in production
NLP
Emotion Detection MLOps Pipeline — text classification with NLTK pipeline
Deep Learning
Seneca Polytechnic: Cloud, Big Data, IT & AI — PyTorch & TensorFlow projects
Model Evaluation
Faithfulness scoring and A/B testing across RAG and ML projects

Data Engineering

PostgreSQL
Multi-tenant user auth, trace logs, and analytics in MedVault-RAG
Redis
Sub-millisecond exact-match cache reducing LLM calls in MedVault-RAG
SQL
Query optimization that cut processing time from 8 hours to 45 minutes in a production system
ETL/ELT Pipelines
Real-time data retrieval pipelines at No Worker Left Behind
Data Validation & Governance
Strict data lineage and experiment versioning in all ML pipelines

Analytics & BI

Power BI
Automated yearly reporting dashboards at No Worker Left Behind
A/B Testing
25% conversion lift through data-driven optimization at IEC Hamilton
Streamlit
Real-time SSE streaming frontend for MedVault-RAG and Book Recommender
D3.js
Trained 330+ students at Conestoga College on interactive data visualization

Cloud & Infrastructure

Microsoft Azure
Microsoft Certified: Azure Fundamentals (2026)
Nginx
SSL termination, rate limiting, and reverse proxy in MedVault-RAG production
AWS
S3, EC2, SageMaker — cloud-native ML deployment and storage
LangSmith
Full agent trace observability in MedVault-RAG production pipeline