Yachika Anand

Data Science Portfolio

My Projects

Explore my portfolio of data science and AI projects showcasing production-ready systems, reproducible ML pipelines, and scalable data solutions.

Machine Learning

Built a production-grade book recommendation engine that bridges the gap between exploratory data science and real-world application deployment. The project solves the problem of 'information overload' in digital libraries by providing users with two distinct recommendation strategies: a popularity-based discovery engine for trending titles and a personalized collaborative filtering engine for niche discoveries.

Python FastAPI Pandas NumPy Scikit-learn Streamlit

2025-10
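
The two recommendation strategies described for this project can be illustrated with a minimal sketch. It is not the project's code: the toy `RATINGS` data, the function names `popular_books` and `similar_books`, the `min_ratings` threshold, and the choice of item-based cosine similarity are all assumptions made for illustration, and it uses only the standard library rather than Pandas or Scikit-learn.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy ratings as (user, book, rating); not data from the project.
RATINGS = [
    ("u1", "Dune", 5), ("u1", "Emma", 3),
    ("u2", "Dune", 4), ("u2", "Emma", 2), ("u2", "Ulysses", 5),
    ("u3", "Dune", 5), ("u3", "Ulysses", 4),
    ("u4", "Emma", 4),
]


def popular_books(ratings, min_ratings=2, top_n=5):
    """Popularity-based: rank books by mean rating, skipping rarely rated books."""
    by_book = defaultdict(list)
    for _, book, score in ratings:
        by_book[book].append(score)
    means = {
        book: sum(scores) / len(scores)
        for book, scores in by_book.items()
        if len(scores) >= min_ratings
    }
    # Highest mean first; ties broken alphabetically for a stable order.
    return sorted(means, key=lambda b: (-means[b], b))[:top_n]


def similar_books(ratings, target, top_n=5):
    """Item-based collaborative filtering via cosine similarity of rating vectors."""
    vectors = defaultdict(dict)
    for user, book, score in ratings:
        vectors[book][user] = score
    target_vec = vectors.get(target, {})

    def cosine(a, b):
        # Users missing from either vector contribute zero to the dot product.
        dot = sum(a[u] * b[u] for u in set(a) & set(b))
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    scores = {
        book: cosine(target_vec, vec)
        for book, vec in vectors.items()
        if book != target
    }
    return sorted(scores, key=lambda b: (-scores[b], b))[:top_n]


trending = popular_books(RATINGS)
neighbours = similar_books(RATINGS, "Dune")
```

The popularity list needs no user history, so it suits new visitors, while the similarity list depends on overlapping ratings between books and tends to surface less obvious titles.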

Data Engineering

AI Applications

Built a full-stack, production-ready AI research platform for healthcare teams that transforms a static PDF library into a conversational knowledge base. Features a multi-stage agentic RAG pipeline with hybrid search, cross-encoder reranking, a LangGraph agent with self-correction, real-time SSE streaming, multi-tenant data isolation, and a Smart Research mode that auto-ingests PubMed papers and answers clinical questions in under 60 seconds.

Python FastAPI LangGraph Qdrant Redis PostgreSQL Docling OpenAI Streamlit Docker Nginx LangSmith RAG

2026-05
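
Hybrid search, as mentioned in the description above, merges results from a keyword retriever and a vector retriever into one ranked list. The description does not say how this platform fuses them, so the sketch below assumes reciprocal rank fusion (RRF), a common choice; the function name, the document IDs, and the constant `k=60` are illustrative assumptions, not details of the project.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) per list it appears in (rank starts
    at 1), and the scores are summed. k=60 is a commonly used default and is
    an assumption here.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["doc_b", "doc_a", "doc_d"]
vector_hits = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

In a pipeline like the one described, a fused list of this kind would typically be passed to the cross-encoder reranker, which rescores only the top candidates.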

Developed a collaborative Research Intelligence Platform to streamline and enhance the research workflow for data scientists. Built a robust, containerized architecture that supports team-based document ingestion, intelligent semantic search, and AI-driven question-answering capabilities.

FastAPI Streamlit LangGraph LLMs Vector DB Docker Kubernetes PostgreSQL Redis RAG JWT AI LangChain Qdrant

2026-04

Other Projects

Built a robust, reproducible machine learning pipeline for emotion detection using DVC, ensuring strict version control of datasets, models, and code.

Python DVC Scikit-learn NLTK Pandas

2026-04