Work
Projects
Shipping AI/ML applications. Each project links to its GitHub repository.
Upload any PDF or text document and ask questions conversationally. Built on Amazon Bedrock (Titan Embed v2 + Claude 3.5 Haiku), ChromaDB for vector storage, and FastAPI. Streaming responses with inline [Source N] citations and multi-turn conversation memory per session.
Enterprise-grade "Ask Our Docs" system with hybrid retrieval (BM25 + vector search), cross-encoder reranking, citation enforcement, and a CI-gated evaluation pipeline. Production-ready, with Ragas metrics (faithfulness, answer relevance, context precision) integrated into GitHub Actions.
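One common way to merge a BM25 ranking with a vector-search ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: the description above does not say how the project combines the two result lists, and the function name, the document IDs, and the constant `k=60` are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a value
    taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and the fused list could then be handed to a cross-encoder reranker.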
Privacy-first tool to run, benchmark, and compare local models entirely offline. Compares three quantized models (Mistral 7B, Llama 2, Phi) on quality (accuracy/BLEU) vs. latency (p50/p95) vs. memory usage. FastAPI wrapper with streaming support and an interactive dashboard.
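As a rough illustration of the latency side of that comparison, p50 and p95 can be read off a list of per-request timings with the nearest-rank method. This is a sketch under assumptions: the sample timings are made up, and the project may compute percentiles differently (for example with interpolation).

```python
def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of the
    data at or below it. pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical per-request latencies in milliseconds.
latencies_ms = [100, 30, 70, 20, 90, 10, 60, 40, 80, 50]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"p50={p50} ms, p95={p95} ms")
```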
Fine-tune Llama 2 7B with LoRA + DPO to extract structured data from unstructured text (contracts/invoices). Phase 1: SFT on the extraction task. Phase 2: DPO to prefer correctly formatted JSON. Benchmarked against GPT-3.5 with before/after metrics (exact match %, F1 score, latency).
Deep-dive into AWS security services and patterns — IAM least-privilege design, Security Hub findings, GuardDuty threat detection, and KMS encryption strategies. Practical reference for securing cloud workloads at the architecture level.
The RAG pipeline applied to security documentation — query threat reports, CVE databases, and compliance frameworks conversationally. Forms the foundation for agentic STRIDE and ATLAS threat modeling workflows using Claude.
Python microservices platform demonstrating secure-by-design patterns: IAM-controlled SQS/Kinesis access, encrypted PostgreSQL connections, API Gateway authorization, and secrets management via AWS Secrets Manager.
Practice implementations across sorting, searching, graphs, trees, and dynamic programming — collected from LeetCode, HackerRank, GeeksForGeeks, and Codility. Written in Java with complexity annotations on each solution.
Focused collection of dynamic programming solutions — knapsack, LCS, edit distance, coin change, matrix chain multiplication, and more. Each solution includes the recurrence relation, memoization and tabulation variants, and complexity analysis.
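To show what the memoization and tabulation variants look like side by side, here is a minimal sketch for edit distance. It is written in Python for brevity and is not taken from the repository; the function names are hypothetical. The recurrence is d(i, j) = d(i-1, j-1) when the characters match, otherwise 1 + min(d(i-1, j), d(i, j-1), d(i-1, j-1)), with d(i, 0) = i and d(0, j) = j.

```python
from functools import lru_cache


def edit_distance_memo(a: str, b: str) -> int:
    """Top-down variant: recursion plus a cache. O(len(a) * len(b)) time and space."""

    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        if i == 0:
            return j  # insert the remaining j characters of b
        if j == 0:
            return i  # delete the remaining i characters of a
        if a[i - 1] == b[j - 1]:
            return d(i - 1, j - 1)
        return 1 + min(d(i - 1, j), d(i, j - 1), d(i - 1, j - 1))

    return d(len(a), len(b))


def edit_distance_tab(a: str, b: str) -> int:
    """Bottom-up variant: fill the table row by row. Same complexity, no recursion."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[m][n]


print(edit_distance_memo("kitten", "sitting"), edit_distance_tab("kitten", "sitting"))
```

The top-down version is easier to derive from the recurrence, while the bottom-up version avoids recursion-depth limits on long inputs.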
LeetCode problem solutions organized by pattern: sliding window, two pointers, binary search, BFS/DFS, backtracking, and union-find. Consistent structure makes it easy to recognize patterns in new problems.
Efficient solutions to Codility algorithm challenges with a focus on time and space complexity. Covers prefix sums, caterpillar method, leader algorithms, and prime/composite sieve patterns.
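The prefix-sum pattern mentioned above can be sketched in a few lines: precompute running totals once, then answer any range-sum query in constant time. The helper names and sample data are illustrative and do not come from the repository.

```python
from itertools import accumulate


def build_prefix_sums(values):
    """prefix[i] holds the sum of values[:i], so prefix[0] is 0. O(n) to build."""
    return [0, *accumulate(values)]


def range_sum(prefix, left, right):
    """Sum of values[left..right], both ends inclusive, in O(1)."""
    return prefix[right + 1] - prefix[left]


values = [3, 1, 4, 1, 5]
prefix = build_prefix_sums(values)
print(range_sum(prefix, 1, 3))  # sums 1 + 4 + 1
```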
317+ solutions to HackerRank algorithm and data structure challenges. Covers arrays, strings, trees, graphs, greedy algorithms, and SQL. Organized by domain and difficulty.
Python microservices platform built on AWS SQS, Kinesis, API Gateway, and PostgreSQL. Models a real e-commerce order processing domain — order intake, fulfillment, inventory, and notifications — as independent services communicating via events.
End-to-end AI pipeline on Amazon Bedrock: Titan Embed v2 for vector embeddings, Claude 3.5 Haiku for generation, and ChromaDB for local vector storage. Includes a CDK deployment stack for ECS Fargate + EFS + ALB.
Production monitoring for RAG systems. Track latency (p50/p95/p99), cost-per-request, quality metrics (Ragas), and token efficiency. CI regression gating fails the build if quality drops by more than 5% or latency exceeds 2 s. Dashboards in Langfuse + Grafana with a custom eval framework.
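A regression gate of this kind can be sketched as a small check that a CI job runs after the evaluation step. The thresholds come from the description above; everything else is an assumption, including the function name, reading "5%" as a relative drop against a stored baseline, and applying the 2 s limit to p95 latency.

```python
def regression_gate(baseline_quality, current_quality, p95_latency_s,
                    max_quality_drop=0.05, max_latency_s=2.0):
    """Return a list of failure messages; an empty list means the gate passes.

    Assumes quality scores are on a 0..1 scale and that the drop is
    measured relative to the baseline (an interpretation, not a fact
    from the project).
    """
    failures = []
    if baseline_quality > 0:
        drop = (baseline_quality - current_quality) / baseline_quality
        if drop > max_quality_drop:
            failures.append(f"quality dropped {drop:.1%} (limit {max_quality_drop:.0%})")
    if p95_latency_s > max_latency_s:
        failures.append(f"p95 latency {p95_latency_s:.2f}s exceeds {max_latency_s:.2f}s")
    return failures


# Hypothetical numbers from one evaluation run.
problems = regression_gate(baseline_quality=0.80, current_quality=0.70, p95_latency_s=1.5)
for problem in problems:
    print("FAIL:", problem)
```

In a CI job, a non-empty result would typically translate into a non-zero exit code so the pipeline stops.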
Voice-enabled chatbot (speak → transcribe → respond → speak) with <1s end-to-end latency. Deepgram for ASR, Claude API for generation, ElevenLabs for TTS. Latency budget breakdown, graceful degradation, and timeout handling. Built with the Pipecat framework and FastAPI.