Honest takes on building software, shipping products, and the realities of the tech industry.
Passkeys in Production: What I Wish I Knew Before Replacing Passwords
Passkeys look simple in the WebAuthn demo. They get strange the moment you handle a user with two laptops, a stolen phone, a Bitwarden subscription, and a corporate device that blocks iCloud Keychain. Here is what shipping passkeys to real users actually looks like in 2026.
TypeScript at Scale: Why Your tsc Takes 90 Seconds and How to Fix It
Your build is slow. Your editor lags when you hover a type. CI spends more time type-checking than running tests. None of this is unavoidable. Most of the cost is a small number of patterns that are easy to write and expensive to compile. Here is how to find them and what to do.
Anthropic and SpaceX: What the Colossus Deal Actually Means for Developers
Claude Code rate limits doubled overnight. The reason is a 220,000-GPU data center in Memphis that SpaceX built and Anthropic just rented, from the same Elon Musk who was calling Anthropic evil three months ago. Here is what this deal actually means for developers building with Claude in 2026.
JavaScript Async Lifetimes: The Leak You Have and Probably Do Not Know About
Promise.all does not cancel sibling tasks when one fails. Your async code is likely leaking database connections, keeping fetches alive after unmount, and holding ports open through process exits. ES2026 finally gives you the primitives to fix this without a library. Here is how.
Embedding Models And Reranking In Production 2026: Picking The Pair That Actually Lifts Retrieval Quality
The embedding model decides what your retriever can find. The reranker decides what makes it to the LLM. By 2026 the production patterns for picking and pairing these two have stabilized, and most teams are still leaving real recall on the table because they treat embeddings as a commodity and skip reranking entirely. Here is what actually works, and what to stop doing.
RAG Chunking Strategies In Production 2026: What Actually Survives Real Documents And Real Queries
Most RAG systems do not fail at the LLM. They fail at the chunker. By 2026 the patterns for splitting documents into retrievable units have matured into a small set of choices that consistently outperform the default 512-token slicer everybody starts with. Here is what those choices are, where each one breaks, and how to pick the right one without rebuilding the index every Friday.
AI Guardrails And Output Validation In Production 2026: What Actually Catches Bad Outputs Before Users Do
Most teams discover their guardrails are missing the moment a screenshot of their AI saying something stupid hits the timeline. By 2026 the patterns for catching bad LLM outputs before they ship to users have settled into something concrete: layered validators, fast cheap checks first, expensive ones only when needed, and a clear policy for what to do when validation fails. Here is what that looks like in real systems.
Small Language Models In Production 2026: Where SLMs Beat Frontier Models, And Where They Quietly Fail
The 8B-parameter model that runs on a single GPU is good enough for more of your pipeline than you think, and worse than you think for the parts you keep wanting to give it. By 2026 the production patterns for using small language models alongside frontier ones have settled into a clear shape: route by task, not by vibe, and stop paying for capabilities you are not using. Here is how that actually plays out.
Designing Tools For AI Agents In 2026: Schemas, Descriptions, And The Pitfalls That Make LLMs Fail Silently
The bug in most agents is not the model. It is the tools you handed it. Vague descriptions, overlapping responsibilities, and schemas that look fine on paper produce agents that confidently call the wrong function with the wrong arguments. Here is how to design tools the model can actually use, drawn from the production patterns that have stabilized by 2026.
Multi-Modal AI Agents In Production: Vision, Audio, And The Glue That Actually Works In 2026
Shipping a multi-modal agent is not the same as adding an image input to your chat. The teams running real vision and audio agents in production by 2026 have discovered the same set of sharp edges: tokenization surprises, latency that explodes on the second modality, evaluation that needs new shapes, and cost curves that look nothing like text. Here is what that actually looks like once it is in front of users.
AI Agent Reliability Engineering in 2026: SLOs, Error Budgets, And Failure Modes That Actually Matter
Treating an AI agent like a normal service is how you get a 95 percent uptime number that hides a 60 percent task success rate. The teams running real agent products in 2026 measure reliability differently, set SLOs on outcomes instead of HTTP codes, and have rehearsed every failure mode the agent introduces. Here is what that looks like.
Pricing AI Features in 2026: How To Charge For LLM-Backed Products Without Bleeding Margins
Flat subscriptions on AI features are how indie products go bankrupt in 2026. The teams shipping profitable AI products price for variance, charge close to the unit of value, and pass usage volatility through to the customer in a way that does not feel hostile. Here is how to actually do that.
The LLM Router Pattern in 2026: Model Routing, Fallbacks, and Cost Control That Actually Works
Picking one model for your whole app is the bug. The teams shipping the best AI products in 2026 route every request to the cheapest model that can handle it, fail over when providers blink, and treat model selection as part of the app, not part of the prompt. Here is how to do it without making a mess.
Sandboxing AI-Generated Code: E2B vs Vercel Sandbox vs Modal vs Daytona in 2026
Letting an LLM write code is the easy part. Letting it run that code on a machine that touches your data is the part that should keep you up at night. Here is how the production sandboxes compare in 2026, and what actually matters when you pick one.
Generative UI in 2026: What Actually Works for Developers
Chat is a terrible interface for most things AI agents do. Generative UI is finally good enough to ship, and the patterns that work are not the ones the demos show. Here is what I have learned shipping AI features that render real components instead of walls of text.
AI Voice Agents in Production: What Actually Works in 2026
Voice agents went from "cute demo" to "real product surface" this year. Most of them still feel terrible. Here is what separates the voice AI experiences people actually use from the ones they hang up on, written from the trenches.
Securing AI Agents in Production: What Nobody Tells You Before Something Breaks
A Cursor AI agent deleted a production database in nine seconds. Not because the AI was malicious, but because nobody thought carefully about what it was allowed to touch. Here is a practical security framework for running AI agents in production without handing them the keys to everything.