Notes from the workshop.
Product updates, engineering notes, and thinking on privacy-first AI tooling.

Your million-token context window is lying to you
Every frontier model now advertises a million tokens. The number you actually get (the size at which the model still answers correctly) is much smaller. Here's the gap, the benchmarks, the bill, and a playbook that doesn't pretend.

Personal compute is back: AI is moving off rented GPUs
Open weights caught up. Unified memory hit 128 GB. Quantization stopped lying. The honest case for running AI on your own machine in 2026, with the cost-crossover math, the hardware floor, and where it still hurts.

Prompt caching is the biggest discount in your AI bill
Three vendors, three cache mechanics, and a 50–90% discount sitting on the table. Here's how prompt caching actually works in 2026, and how to design prompts that hit it.

Why AI is moving back to the desktop
Every major AI lab shipped a native desktop app in the last two years. The browser-first era of AI is quietly over: here are the four constraints that ended it.

BYOK vs. SaaS AI: what you actually pay, what you actually own
What a power user actually pays, what a court actually preserved, and what dies when your favorite AI tool gets sold.

Text, image, audio, video: when to reach for which model (and how to chain them)
A 2026 field guide: what each modality is good at, what it costs, and three ways to chain them together.

Run GPT-4-class models on your laptop without sending a single byte to the cloud
Open weights now match GPT-4 quality. Here's how CSuite runs them on your machine: no proxy, no logging, no tokens billed.

Welcome to CSuite
What CSuite is, why now, and how we got from cloud-only AI to a desktop app where your data stays on your machine.