Available for full-time · Early-stage startups · Remote

Parbhat Kapila · Full-Stack Engineer

Building Production AI Systems

Ship to production, then keep it running.

Multiple systems live in production, built and maintained independently. Cut AI processing costs by 50-80% and achieved sub-250ms latency at scale. I build internal AI tools, RAG infrastructure, and data-heavy SaaS, then keep them running: fast, cost-efficient, and reliable under load.

10K+ · Emails indexed · VectorMail
50-80% · AI cost saved · Visura
<250ms · Risk scoring · Sentinel
Full-Stack Engineer · Featured Work

Production AI Systems

Systems I ship and operate, with measurable outcomes in production.

Pipeline Intelligence

Sentinel

Detects stalling deals before the problem is visible in a CRM. It models time decay, stage velocity, and engagement signals from live pipeline data. Fast, explainable, and designed for real integration load.
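As a rough illustration of how time decay, stage velocity, and engagement could combine into an explainable score, here is a minimal sketch. It is not Sentinel's actual model: the `Deal` fields, the weights, the 14-day half-life, and the "four touches in two weeks" threshold are all assumptions chosen for the example.

```typescript
// Hypothetical deal shape; field names are illustrative, not Sentinel's schema.
interface Deal {
  daysSinceLastActivity: number; // days since any logged touchpoint
  daysInStage: number;           // days spent in the current pipeline stage
  typicalDaysInStage: number;    // assumed baseline duration for that stage
  touchesLast14Days: number;     // emails, calls, meetings in the last two weeks
}

interface RiskFactor { name: string; contribution: number; }
interface RiskScore { score: number; factors: RiskFactor[]; }

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Assumed weights (summing to 1) and half-life, for illustration only.
const WEIGHTS = { decay: 0.4, velocity: 0.35, engagement: 0.25 };
const HALF_LIFE_DAYS = 14;
const HEALTHY_TOUCHES = 4;

function scoreDeal(deal: Deal): RiskScore {
  // Time decay: "freshness" halves every HALF_LIFE_DAYS without activity.
  const freshness = Math.pow(0.5, deal.daysSinceLastActivity / HALF_LIFE_DAYS);
  const decayRisk = 1 - freshness;

  // Stage velocity: risk starts once the deal overstays the stage baseline
  // and saturates at twice the baseline.
  const baseline = Math.max(1, deal.typicalDaysInStage);
  const velocityRisk = clamp01(deal.daysInStage / baseline - 1);

  // Engagement: fewer recent touches than HEALTHY_TOUCHES adds risk.
  const engagementRisk = clamp01(1 - deal.touchesLast14Days / HEALTHY_TOUCHES);

  // Keep per-factor contributions so the score can be explained, largest first.
  const factors: RiskFactor[] = [
    { name: "time decay", contribution: WEIGHTS.decay * decayRisk },
    { name: "stage velocity", contribution: WEIGHTS.velocity * velocityRisk },
    { name: "engagement", contribution: WEIGHTS.engagement * engagementRisk },
  ].sort((a, b) => b.contribution - a.contribution);

  const score = factors.reduce((sum, f) => sum + f.contribution, 0);
  return { score, factors };
}
```

Returning the sorted `factors` alongside the total is what makes the score explainable in this sketch: the UI can show which signal contributes most to a deal's risk rather than a bare number.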

Before

Deals die silently. CRM shows green until the opp is gone. You find out when it's too late.

After

Explainable risk scoring. Most AI is a black box. This shows why a deal is at risk. See which deals are rotting before they slip. Sub-250ms.

<250ms · Query latency
Live sync · CRM & calendar sync
Predictive · Explainable risk scoring
99.9% · Continuous uptime
Next.js · TypeScript · PostgreSQL · Prisma · Redis · OpenRouter · Webhooks
Semantic Search · Email

VectorMail

Email client with vector search and LLM composition. Connects Gmail via Aurinko, syncs threads, searches by meaning (pgvector), and composes replies with context. Single database for inbox and embeddings - no separate vector store.
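To make "one database for inbox and embeddings" concrete, here is a sketch of a parameterized pgvector query that ranks emails by cosine distance to a query embedding. The `emails` table and its `id`, `subject`, `account_id`, and `embedding` columns are hypothetical, not VectorMail's real schema; `<=>` is pgvector's cosine-distance operator.

```typescript
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// pgvector accepts vectors as text literals such as "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding must be a non-empty array of finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

// Builds a query for a Postgres client that supports $n placeholders.
// Table and column names are assumptions made for this example.
function buildSemanticSearchQuery(
  queryEmbedding: number[],
  accountId: string,
  limit = 10,
): SqlQuery {
  const text = [
    "SELECT id, subject, 1 - (embedding <=> $1::vector) AS similarity",
    "FROM emails",
    "WHERE account_id = $2",
    "ORDER BY embedding <=> $1::vector", // smallest cosine distance first
    "LIMIT $3",
  ].join("\n");
  return { text, values: [toVectorLiteral(queryEmbedding), accountId, limit] };
}
```

Because the embeddings live in the same table as the messages, the account filter and the similarity ranking run in a single SQL statement, with no second round trip to a separate vector store.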

Before

Gmail: keywords only. 'Find that email about the pricing conversation.' Good luck.

After

One DB for inbox and vectors. Semantic search across 10k+ threads. 'Emails about pricing.' Instant. Inbox and AI in one place.

Sub-second · 10k+ emails
One DB · inbox + vectors
By meaning · not keywords
AI compose · thread context
Next.js 15 · TypeScript · tRPC · Prisma · PostgreSQL · pgvector · Clerk · Aurinko · OpenRouter · Gemini
RAG Pipeline · PDF Infrastructure

Visura

PDF processing infrastructure with cost guardrails. Hash-based chunk reuse cuts reprocessing costs 50-80%. Self-healing pipelines, full observability, sub-2.5s P50.
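The idea behind hash-based chunk reuse can be sketched as follows: hash each chunk's text, and only send chunks whose hash is not already cached to the embedding model. This is an assumption-laden illustration, not Visura's implementation; `chunkKey`, `planEmbeddingWork`, the SHA-256 choice, the whitespace trimming, and the in-memory `Map` cache are all hypothetical.

```typescript
import { createHash } from "node:crypto";

// Content-addressed key: identical chunk text always maps to the same key.
// Trimming whitespace before hashing is an assumed normalization step.
function chunkKey(text: string): string {
  return createHash("sha256").update(text.trim(), "utf8").digest("hex");
}

interface EmbeddingPlan {
  reused: string[];                          // keys already in the cache
  toEmbed: { key: string; text: string }[];  // chunks that still need a model call
}

// Splits a document's chunks into cached and uncached work.
// `cache` stands in for whatever store holds previously computed embeddings.
function planEmbeddingWork(
  chunks: string[],
  cache: Map<string, number[]>,
): EmbeddingPlan {
  const plan: EmbeddingPlan = { reused: [], toEmbed: [] };
  const seen = new Set<string>();
  for (const text of chunks) {
    const key = chunkKey(text);
    if (seen.has(key)) continue; // duplicate chunk within the same document
    seen.add(key);
    if (cache.has(key)) {
      plan.reused.push(key);
    } else {
      plan.toEmbed.push({ key, text });
    }
  }
  return plan;
}
```

When a PDF is re-uploaded with a few edited pages, most chunk hashes are unchanged, so only the modified chunks land in `toEmbed`; the share that lands in `reused` is where the reprocessing savings would come from.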

Before

Upload a PDF, wait, pay full price every time. Update it? Pay again. No visibility into costs, no recovery if something fails mid-process.

After

50-80% cost savings on re-processed docs. Automatic crash recovery. P50 under 2.5s. Full observability: Sentry, OpenTelemetry, business metrics. Vector search with 85%+ cache hit rate.

50-80% · AI cost savings
<2.5s · P50 processing
85%+ · Embedding cache hit
Self-heal · Auto recovery
Next.js 15 · TypeScript · PostgreSQL · pgvector · OpenRouter · Gemini · Redis · Sentry · Clerk

Tech Stack (Production)

Frontend (Product UI)

TypeScript, React, Next.js (App Router), Tailwind CSS

Backend & APIs

Node.js, Python, FastAPI, Express.js, REST APIs, WebSockets

AI Systems (Production)

OpenAI / GPT-4, LangChain, RAG pipelines, pgvector, embedding search, LLM orchestration

Data & Infrastructure

PostgreSQL, Redis, Object Storage (S3), queues / async processing

Cloud & Deployment

AWS (EC2, S3, RDS), Docker, Vercel, CI/CD (GitHub Actions)

Architecture & Practices

Multi-tenant SaaS, distributed systems, event-driven design, system design, performance optimization

About Parbhat Kapila

Full-Stack & AI: Expertise & Impact

System Architecture

Designing scalable, production-grade systems from the ground up. Built multi-tenant SaaS architectures with isolated data models, auto-scaling infrastructure for real production workloads, and cost-optimized deployments driven by architectural tradeoffs, reducing infrastructure spend by 95% without sacrificing reliability.

AI/ML Production

Productionized LLM and RAG systems backed by vector databases at scale. Built retrieval pipelines processing 10,000+ documents with 94%+ accuracy, optimized pgvector queries for sub-200ms latency, and implemented multi-provider LLM orchestration with GPT-4 and resilient fallback strategies.

Performance & Optimization

Driving measurable business impact through technical optimization. Reduced per-document processing costs from $5.00 to $0.05 through architectural changes and chunk reuse, achieved sub-200ms semantic search latency under load, and maintained 99.9% uptime across live production systems.

Full-Stack Ownership

Building and operating live production systems. Independently responsible for technical decisions, feature delivery, deployments, monitoring, and post-launch reliability across TypeScript, Next.js, Python, PostgreSQL, Redis, AWS, and Vercel. Owning systems from first commit through live operation.

I'm an AI-focused full-stack engineer with multiple systems live in production. Over the past three years, I've shipped and operated products handling large data volumes, reduced operational costs by 95%, and maintained the infrastructure myself.

I specialize in turning complex AI pipelines into reliable software: retrieval, vector storage, and model orchestration optimized for low latency (sub-200ms), high accuracy (94%+), and real production constraints.

I'm seeking a full-time role at an early-stage startup where execution matters and engineers are trusted to ship systems that deliver measurable business value.

Professional Journey

Full-Stack & AI Engineer Experience

May 2022 - Present

Full-Stack Engineer · AI Product Builder

Full ownership of system design, feature delivery, reliability, and iteration. Shipped Sentinel, VectorMail, and Visura - all live in production and maintained independently.

Owned backend services, data stores, AI pipelines, and deployment infrastructure, including authentication, payments, and third-party integrations. Debugged production incidents, performance bottlenecks, and scaling limits while shipping improvements continuously without breaking live systems.

Next.js · TypeScript · Python · PostgreSQL · Redis · OpenAI · pgvector · Docker · AWS
Contact

Let's Work Together

Open to full-time remote engineering roles at early-stage startups building production AI systems. Best fit for teams that value ownership, speed, and engineers who ship and maintain what they build. Flexible with overlapping time zones globally - I adapt my schedule to your team's rhythm, wherever you're based.
