Upendra Pant | AI Engineer & RAG Specialist

PORTFOLIO

Upendra P.

Architecting Scalable RAG Systems & Autonomous Agents

I take AI from prototype to production, with a focus on systems that are reliable, observable, and built to last.


Available for Work

10+ AI Systems Shipped

40% Avg. Cost Reduction

10M+ Documents Processed

1000+ hours Manual Time Saved

Industries Served

Enterprise SaaS, Content Automation, Customer Support, EdTech Platform

02 / Work

SELECTED WORK


Print-on-Demand CLI Tool 01

POD Automator

End-to-end workflow automated in a single command

CLI tool that automates the full print-on-demand pipeline, from AI image generation to SEO-optimized listings and scheduled Redbubble uploads. Built for freelancers who want to ship designs without the manual overhead.

Python, Selenium, OpenAI, Click, Pillow

Adobe CC Caption Add-on 02

CaptionForge

Production-ready add-on with Document Sandbox integration

A full-featured Adobe CC add-on for advanced video captioning, built with React and TypeScript inside Adobe's Document Sandbox Runtime. Supports customizable caption styles, local storage, and a responsive UI across Adobe applications.

React, TypeScript, Adobe SDK, Webpack, CSS

NLP-Powered Spotify Playlist Generator 03

MoodQueue

Natural language to playlist in under 2 seconds

Full-stack app that lets users describe a mood in plain English and get a matched Spotify playlist back. A React frontend talks to a Python backend that maps natural language to Spotify's audio features API, because keyword search doesn't understand feelings.

React, Python, Spotify API, NLP, FastAPI

Document RAG System 04

DocMind

Instant answers from your own documents

Streamlit-based RAG app that lets users upload documents and query them in natural language. Uses ChromaDB for vector storage and Gemini as the reasoning layer — making document search feel like talking to someone who actually read everything.

Streamlit, ChromaDB, Gemini, Python, LangChain

03 / Stack

TECHNICAL TOOLKIT

Languages

Python
TypeScript
SQL

LLM Frameworks

LangChain
LlamaIndex
LangGraph
Haystack

Models

OpenAI
Claude
Gemini
Llama
Whisper

Vector DBs

Pinecone
ChromaDB

Infrastructure

AWS
Docker
Vercel

Backend

FastAPI
PostgreSQL
Supabase

Engineering Philosophy

Why I Prioritize Modular Agent Design

Every system I build follows a composable, evaluation-first approach. Agents should be testable in isolation, observable in production, and replaceable without rewriting the orchestration layer. I optimize for debuggability over cleverness, and measure everything.
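To make the composable approach concrete, here is a minimal sketch. It is not taken from any of the projects listed; the names `Agent`, `Orchestrator`, `UppercaseAgent`, and `ExclaimAgent` are hypothetical, and the agents are trivial string transforms standing in for real LLM-backed steps.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Agent(Protocol):
    """Hypothetical interface: anything with a name and a run method."""

    name: str

    def run(self, text: str) -> str: ...


@dataclass
class UppercaseAgent:
    name: str = "uppercase"

    def run(self, text: str) -> str:
        return text.upper()


@dataclass
class ExclaimAgent:
    name: str = "exclaim"

    def run(self, text: str) -> str:
        return text + "!"


@dataclass
class Orchestrator:
    """Runs agents in order and records (agent, input, output) for each step."""

    agents: list[Agent]
    trace: list[tuple[str, str, str]] = field(default_factory=list)

    def run(self, text: str) -> str:
        for agent in self.agents:
            result = agent.run(text)
            self.trace.append((agent.name, text, result))
            text = result
        return text
```

Because the orchestrator depends only on the interface, each agent can be unit-tested on its own, swapped for another implementation without touching the loop, and inspected after a run through `trace`.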

04 / Writing & Code

HOW I THINK

Technical Writing

Engineering

Optimizing Vector Search Latency at Scale

A deep dive into HNSW parameter tuning, index sharding, and pre-filtering strategies that brought p99 latency under 200ms for 10M+ documents.

RAG Systems

Handling Context Window Limits in Production RAG

Practical strategies for chunking, summarization chains, and dynamic context assembly when your retrieval returns more than the model can handle.
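One way to picture dynamic context assembly is a greedy selection under a token budget. The sketch below is illustrative only and is not the article's method: `assemble_context` and `count_tokens` are hypothetical names, and counting whitespace-separated words is a rough stand-in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Assumption: approximate token count by whitespace-separated words."""
    return len(text.split())


def assemble_context(chunks: list[tuple[float, str]], budget: int) -> list[str]:
    """Keep the highest-scoring chunks that still fit within the budget.

    Each chunk is a (relevance_score, text) pair. Chunks that would
    exceed the remaining budget are skipped, so a smaller, lower-ranked
    chunk can still be included.
    """
    selected: list[str] = []
    used = 0
    for _score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget:
            selected.append(text)
            used += cost
    return selected
```

In practice the budget would be the model's context limit minus the space reserved for the prompt and the answer, and oversized chunks could be summarized rather than dropped.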

ML Ops

Evaluating LLM Outputs Without Human Labels

Building automated evaluation pipelines using LLM-as-judge, reference-free metrics, and regression testing for continuous deployment.
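As a rough sketch of the regression-testing idea, assuming a judge is any function that scores a (question, answer) pair between 0 and 1: the names `Judge`, `non_empty_judge`, `mean_score`, and `passes_regression` are hypothetical, and the placeholder judge only checks for a non-empty answer where a real pipeline would call an LLM.

```python
from typing import Callable

# Assumption: a judge maps (question, answer) to a score in [0, 1].
Judge = Callable[[str, str], float]


def non_empty_judge(question: str, answer: str) -> float:
    """Placeholder judge; a real one would prompt an LLM to grade the answer."""
    return 1.0 if answer.strip() else 0.0


def mean_score(cases: list[tuple[str, str]], judge: Judge) -> float:
    """Average judge score over a fixed evaluation set."""
    return sum(judge(q, a) for q, a in cases) / len(cases)


def passes_regression(
    cases: list[tuple[str, str]],
    judge: Judge,
    baseline: float,
    tolerance: float = 0.05,
) -> bool:
    """Pass if the new score is no worse than the baseline minus the tolerance."""
    return mean_score(cases, judge) >= baseline - tolerance
```

A check like this can gate deployment: the baseline is the score of the last released version on the same evaluation set.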

Pinned Repositories


ai-video-transcriber

Automatic video caption generation add-on

TypeScript 1

05 / Contact

LET'S BUILD YOUR AI INFRASTRUCTURE

Have a project in mind? I work best when the problem is hard and the stakes are real.

Email LinkedIn Upwork