Why I shipped VORA before writing a single line of backend code
The story of building an AI meeting assistant that runs entirely in the browser — and why going backend-free from day one was the best decision I made.
Series: VORA B.LOG
- 1. Why I shipped VORA before writing a single line of backend code ← you are here
- 2. From Python Server to Pure Browser: The Architecture Pivot That Changed Everything
- 3. The Whisper WASM Experiment: Why Browser AI Is Harder Than It Looks
- 4. Why We Killed Speaker Identification (And What We Learned from Two Weeks of Failure)
- 5. Building an N-Best Reranking Layer for Better Korean STT (Without Extra API Calls)
- 6. Building the Priority Queue: How We Stopped Gemini API Chaos — and Why the First Two Designs Both Failed
- 7. Groq Dual-AI Integration: Why We Added a Second AI and What It Actually Fixed
- 8. The Meeting Summary Timer Bug: Why setInterval Isn't Enough for Reliable Scheduling
- 9. Building a Real Meeting Export: From Raw Transcript to a Usable Report
- 10. The Dark Theme Redesign: Building a UI That Looks Like a Professional Tool (After It Looked Like a Hobbyist Project)
- 11. The Branding Journey: From a Functional Name to VORA
- 12. How We Made VORA Bilingual Without a Heavy Localization Stack
- 13. Deploying to Cloudflare Pages: Static Hosting, CORS Headers, and the Sitemap/Robots Incident
- 14. How I Fixed AI Over-correction
The constraint that became a feature
When I started building VORA, I had a simple rule: no backend. Not because I couldn't build one, but because I wanted to see how far browser APIs could take me.
The Web Speech API for transcription. The Gemini API called directly from the client. Local storage for persistence. No server, no database, no infrastructure to maintain.
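That whole stack fits in a few lines. This is a minimal sketch, not VORA's actual code — the `vora-transcript` storage key and the helper names are mine — but it shows the shape: the browser does the recognition, and localStorage is the only persistence layer.

```javascript
// Pure helper: append a finalized segment to the transcript.
function appendSegment(segments, text, timestamp) {
  return [...segments, { text: text.trim(), timestamp }];
}

// Browser-only wiring (assumed names; 'vora-transcript' is a hypothetical key).
function startTranscription(onUpdate) {
  // webkitSpeechRecognition is the prefixed name Chrome still exposes.
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const rec = new Recognition();
  rec.continuous = true;      // keep listening across pauses
  rec.interimResults = true;  // stream partial hypotheses for live display

  let segments = JSON.parse(localStorage.getItem('vora-transcript') || '[]');

  rec.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) {
        segments = appendSegment(segments, result[0].transcript, Date.now());
        // Persistence is localStorage and nothing else — no server round-trip.
        localStorage.setItem('vora-transcript', JSON.stringify(segments));
      }
    }
    onUpdate(segments);
  };

  rec.start();
  return rec;
}
```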

Why this matters
Most meeting assistants require:
- Account creation
- Monthly subscriptions
- Sending your audio to someone else's servers
- Trusting a company with your meeting content
VORA requires none of that. Your audio never leaves your browser. There's nothing to sign up for. It just... works.
The technical reality
Going fully client-side isn't all roses:
- Web Speech API is inconsistent across browsers — Chrome is great, Firefox is spotty, Safari is... Safari
- No server-side storage means if you clear your browser data, your transcripts are gone
- API keys in the client required careful thought about security boundaries
But every trade-off was worth it for the simplicity of the user experience.
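The browser inconsistency in particular means you can't just call `new SpeechRecognition()` and hope. A feature-detection guard like this sketch (the function name is mine) lets the UI degrade gracefully instead of crashing on unsupported browsers:

```javascript
// Detect which flavor of the Web Speech API, if any, this browser exposes.
function speechRecognitionSupport() {
  if (typeof window === 'undefined') return 'none';       // not in a browser at all
  if ('SpeechRecognition' in window) return 'standard';   // the unprefixed spec name
  if ('webkitSpeechRecognition' in window) return 'prefixed'; // Chrome, Safari
  return 'none'; // e.g. Firefox does not ship a SpeechRecognition implementation
}
```

With the result in hand, the app can show a "use Chrome for transcription" notice instead of a blank recorder.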
Real-time context correction
The feature I'm most proud of is the real-time context correction. The Web Speech API gets things wrong constantly — especially with technical terms, names, and acronyms.
VORA sends chunks of transcript to Gemini with the prompt: "Correct this transcript for context. The meeting is about [topic]. Fix names, technical terms, and obvious misrecognitions."
The result is surprisingly good. Not perfect — but good enough that the final transcript is actually useful.
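The correction call itself is a single client-side request. A sketch under assumptions: the helper names are mine, and the model name and endpoint shape follow Gemini's public REST API (`generateContent`), not necessarily the exact model VORA uses.

```javascript
// Build the correction prompt described above.
function buildPrompt(topic, chunk) {
  return (
    `Correct this transcript for context. The meeting is about ${topic}. ` +
    `Fix names, technical terms, and obvious misrecognitions.\n\n${chunk}`
  );
}

// Send one transcript chunk to Gemini directly from the browser.
// Model name 'gemini-1.5-flash' is an assumption for illustration.
async function correctChunk(apiKey, topic, chunk) {
  const url =
    'https://generativelanguage.googleapis.com/v1beta/models/' +
    `gemini-1.5-flash:generateContent?key=${apiKey}`;
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      contents: [{ parts: [{ text: buildPrompt(topic, chunk) }] }],
    }),
  });
  const data = await res.json();
  // Fall back to the raw chunk if the model returns nothing usable.
  return data?.candidates?.[0]?.content?.parts?.[0]?.text ?? chunk;
}
```

The fallback matters: a correction layer that can silently return the original text can never make the transcript worse than what the Web Speech API produced.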
The lesson
Ship the simplest version that solves the core problem. Everything else is optimization.
VORA doesn't have user accounts, doesn't have a mobile app, doesn't integrate with calendars. But it transcribes meetings in real-time with AI-powered correction and generates useful reports. That's the core. Everything else can come later.
VORA B.LOG — Follow along as each experiment unfolds.