Opinion · Desktop · Industry · May 7, 2026 · 9 min read

Why AI is moving back to the desktop

Every major AI lab shipped a native desktop app in the last two years. The browser-first era of AI is quietly over — here are the four constraints that ended it.

By Atul
What desktop unlocks for AI
  • Direct filesystem reach (browser: per-folder grants, Chrome only)
  • Whole-process GPU budget (browser: per-tab cap below useful)
  • Background work, even asleep (browser: throttled after 30 seconds)
  • Global shortcuts and capture (browser: sandboxed away from the OS)

9 AI desktop apps, 2024–26 · 40 TOPS Copilot+ PC NPU floor · 546 GB/s M4 Max memory bandwidth

For most of the last decade the assumption was that any new productivity category would ship as a web app first, a mobile app second, and a desktop app maybe — if at all. Software lived in a tab. The browser was the OS. Anything that needed to be installed felt vaguely vintage.

Then, almost without anyone narrating it, AI started moving the other way. ChatGPT shipped a native macOS app in May 2024 and a Windows version later that year. Claude Desktop landed in October 2024. Perplexity, Granola, Cursor, Raycast, Warp, Witsy, and a long tail of smaller players followed. Microsoft and Apple both reorganized their flagship operating systems around on-device inference. The browser-first era of AI is, quietly, over. This post is about why.

The desktop comeback

It’s easy to miss how unusual the last two years have been. Native desktop launches by serious companies were vanishingly rare in 2018–2023; the Electron-shaped exceptions (Slack, Discord, Notion, Linear) only existed because the underlying experience genuinely needed OS-level affordances that browsers refused to give them — system notifications, global shortcuts, background sync. Everyone else stayed in a tab.

Native desktop apps · the AI cluster (2023 onward) vs everything before
The “Electron exceptions” era, then the AI desktop wave:
  • 2014: Slack desktop
  • 2015: Discord desktop
  • 2018: Notion desktop
  • 2020: Linear desktop
  • Mar 2023: Cursor
  • Feb 2024: Raycast AI
  • May 2024: ChatGPT for macOS
  • Jun 2024: Granola
  • Oct 2024: Claude Desktop
  • Nov 2024: ChatGPT for Windows
  • Nov 2024: Perplexity for macOS
  • Mar 2025: Warp Agent Mode
  • Sep 2025: Witsy 2.0
  • 2026: CSuite

Look at the dense cluster from 2023 onward. Almost every major consumer-facing AI product now has a native desktop app, and the ones that don’t (Gemini, Mistral’s Le Chat) are visibly playing catch-up. This isn’t a fashion cycle. It’s a response to four constraints that compounded around the same time.

[Photo: a clean, modern home-office desk with two monitors side by side.] The dock is where AI is being built now — native processes, real files, real keyboard shortcuts. Photo by Jakub Żerdzicki on Unsplash.

Why the browser hit a ceiling

The browser is a beautiful sandbox — that’s the whole point. It’s also a sandbox, which is the problem. The minute AI stopped being “a chat box you talk to” and started being “a thing that wants to read your files, watch your screen, and run on your GPU,” the sandbox became a wall. Concretely:

  • Filesystem access. The File System Access API is Chromium-only, requires a per-folder permission grant per session, and silently revokes when the tab closes. For a tool that wants to read a 10,000-file research folder and stream notes back into it, that model is unusable. Every web AI product ends up either uploading the files (privacy disaster) or asking the user to drag-and-drop one thing at a time (UX disaster).
  • GPU access. WebGPU finally shipped in Chrome 113 (May 2023), but the per-tab memory budget is enforced at the browser level, and Chrome currently caps per-process GPU memory below what a useful inference workload needs. The fundamental architecture — the browser is an untrusted host running untrusted scripts — means it has to cap it.
  • Background work. Service workers go to sleep. Tab throttling kicks in after 30 seconds of inactivity. A long agent-style job that takes 90 seconds to chain three tools together will quietly stall in a background tab and resume only when you click back in.
  • System integration. Global shortcuts, menubar presence, accessibility-API screen reading, drag-out from the dock, system tray icons, “always on top” floating windows — none of this is reachable from a web page. For a tool that wants to feel like part of your OS instead of a website you visit, this list isn’t optional.
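The background-work constraint is the easiest one to see from the native side. A desktop process can watch the filesystem indefinitely, with no throttling and no tab to fall asleep. A minimal sketch of the idea using only the Python standard library (the polling approach and filenames here are illustrative, not any particular app's implementation):

```python
import os
import tempfile


def snapshot(root: str) -> dict[str, float]:
    """Map every file under root to its last-modified time."""
    out = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            out[path] = os.path.getmtime(path)
    return out


def changed_files(before: dict[str, float], after: dict[str, float]) -> list[str]:
    """Files that are new or modified between two snapshots."""
    return [p for p, mtime in after.items() if before.get(p) != mtime]


# Demo: the kind of long-lived watcher loop a background tab
# would throttle after ~30 seconds of inactivity.
root = tempfile.mkdtemp()
before = snapshot(root)
with open(os.path.join(root, "notes.md"), "w") as f:
    f.write("# meeting notes\n")
after = snapshot(root)
print(changed_files(before, after))  # the new notes.md path
```

In a real app you would wrap the snapshot/diff step in a sleep loop (or use the OS's native file-watching APIs), but the point stands: the loop keeps running with the window minimized, on battery, offline.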

These are not things that get fixed by a future Chromium release. They are constraints the browser deliberately enforces — and rightly, because the browser’s job is to safely run whatever sketchy JavaScript happens to load. That mandate is incompatible with what an AI productivity tool needs to be.

Files as the substrate

The single biggest reason AI is moving back to the desktop is that AI outputs are artifacts — documents, spreadsheets, audio, video, code, slide decks — and artifacts want to live in directories. The web abstraction for “your stuff” is a list of rows in someone else’s database. The desktop abstraction is a folder. Folders compose with everything: backups, sync (iCloud, Dropbox, git), search (Spotlight, Everything), other apps, version control, the shell.

This sounds obvious until you notice that the ChatGPT, Claude, and Gemini web apps all give you a chat history but no files. Want a copy of the document you and the model just wrote together? Copy-paste it. Want to diff two outputs? Open both tabs and squint. Want to grep across six months of your conversations? You can’t. The chat lives in the cloud, the artifacts don’t exist as files at all, and the only thing you can do with the output is read it inside the same chat window.

Rows in someone’s database vs files in your folder

| Capability | Web app (rows) | Desktop app (files) |
| --- | --- | --- |
| Backup | Provider’s responsibility | Time Machine, rsync, git |
| Search across history | Whatever the UI exposes | grep, ripgrep, Spotlight |
| Diff two outputs | Open two tabs | diff, FileMerge, git diff |
| Use in another app | Copy/paste | Open with…, drag and drop |
| Version control | None | git, every other VCS |
| Survives the vendor | No | Yes |

The desktop wave inverts that. Cursor saves its rules as .cursor/rules/*.mdc files in your repo. Granola writes meeting notes to local Markdown. Claude Desktop’s MCP servers reach into your filesystem with explicit per-server scope. Apple Intelligence routes through the Files app. Each one of these treats “your stuff lives in folders” as the assumption rather than the workaround.
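Once outputs are files, the “folders compose with everything” claim is almost trivial to demonstrate. A minimal sketch (the helper names and file layout are hypothetical, not how any of the apps above actually store notes) that saves each AI artifact as dated Markdown and then searches the whole history with nothing but the standard library:

```python
import os
import tempfile
from datetime import date


def save_artifact(notes_dir: str, title: str, body: str) -> str:
    """Write one AI output as a dated Markdown file; return its path."""
    os.makedirs(notes_dir, exist_ok=True)
    slug = title.lower().replace(" ", "-")
    path = os.path.join(notes_dir, f"{date.today():%Y-%m-%d}-{slug}.md")
    with open(path, "w") as f:
        f.write(f"# {title}\n\n{body}\n")
    return path


def grep_history(notes_dir: str, needle: str) -> list[str]:
    """Case-insensitive search across every saved artifact."""
    hits = []
    for name in sorted(os.listdir(notes_dir)):
        with open(os.path.join(notes_dir, name)) as f:
            if needle.lower() in f.read().lower():
                hits.append(name)
    return hits


notes = tempfile.mkdtemp()
save_artifact(notes, "Quarterly plan", "Ship the desktop build in Q3.")
save_artifact(notes, "Retro", "The web prototype stalled in background tabs.")
print(grep_history(notes, "desktop"))  # the quarterly-plan file only
```

Because these are ordinary files, the same folder also gets Time Machine backups, Spotlight indexing, and `git diff` for free — no product work required.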

Local runtimes caught up

The other half of the story is that the “send everything to a data center because the models are too big” argument has rotted out from underneath. As of mid-2026:

  • Apple’s M4 Max ships up to 128 GB of unified memory at 546 GB/s — enough headroom to run a quantized 70B-parameter model entirely on GPU without paging to disk. The M4 Ultra pushes that to 512 GB and 800 GB/s.
  • Microsoft’s Copilot+ PC spec requires a 40 TOPS NPU at minimum — an order of magnitude more on-device AI throughput than what shipped in 2022 laptops.
  • Open-weight models hit frontier-class quality. Llama 3.3 70B, Qwen 3, DeepSeek V3, and Mistral Large 2.5 all sit within striking distance of GPT-4 quality, and they all run on the hardware described above. An earlier CSuite post covers the specifics.
  • The runtimes themselves grew up. llama.cpp, Ollama, MLX, and the ONNX Runtime now ship with one-line installs and saturate the hardware they sit on. The era of “you need to be a PhD to run a local model” ended sometime in 2024.
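The arithmetic behind “a quantized 70B model fits on a 128 GB machine” is worth making explicit. A back-of-the-envelope estimate — the 20% overhead factor for KV cache and activations is an assumption for illustration, not a measured figure:

```python
def model_memory_gb(params_billions: float, bits_per_weight: float,
                    overhead: float = 1.2) -> float:
    """Rough resident-memory estimate for an LLM at a given quantization.

    Weights dominate: params * bits / 8 bytes, plus an assumed ~20%
    for KV cache, activations, and runtime bookkeeping.
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


# 70B parameters at 4-bit quantization: 35 GB of weights, ~42 GB resident.
print(round(model_memory_gb(70, 4), 1))   # 42.0
# Comfortably inside the M4 Max's 128 GB of unified memory.
print(model_memory_gb(70, 4) < 128)       # True
# The same model at full 16-bit precision would need ~168 GB — hence
# quantization, not bigger laptops, is what made this practical.
```

Run the same function at `bits_per_weight=16` and the fit disappears, which is why the 4- and 8-bit quantizations shipped by llama.cpp and Ollama matter as much as the hardware itself.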

The hardware exists, the models exist, the runtimes exist. The only thing missing was a place to put them, and that place is, by construction, a desktop process — because nothing else can hold a 7 GB model in memory across an interactive session without the browser swapping it out from under you.

The integration surface

The third reason is that the most useful AI features are the ones that sit between apps, not inside them. That space is the operating system, and it’s only reachable from native code.

  • Global shortcuts. Raycast and Granola both bind a single hotkey that summons AI from any app, against the current selection or window. That single keystroke is the difference between “a tool I open” and “a tool I reach for.”
  • Screen and audio capture. Granola transcribes meetings without joining them as a bot. Witsy reads whatever’s on screen. Apple Intelligence summarizes notifications across every app you have. None of these are doable from a tab.
  • Cross-app pipelines. Cursor edits files; Raycast runs scripts; Warp re-runs commands. The interesting workflows increasingly chain across apps — “take what’s in this Figma frame, generate the React for it in Cursor, ship it via Warp” — and the only place those chains can live is outside the apps themselves.
  • Background and offline. A desktop process can run an embedding job, pull mail, or watch a directory while the laptop is sleeping or off the network. Web apps cannot.

What this means

The browser-first era happened because the browser was the only deployment target that didn’t require begging users to install something. That tradeoff made sense when the product was a collaboration tool whose value scaled with how many people were already on it. It doesn’t make sense for AI, where the value is private to one user and the constraints — files, GPU, background work, OS access — are exactly the ones the browser refuses to relax.

CSuite is built squarely on this thesis. It’s a desktop app because that’s where the work actually happens: where your files already live, where your GPU sits idle, where the keyboard shortcuts are, where the runtime can hold a 30 GB model in memory across a two-hour session without a tab freeing it. We didn’t set out to be contrarian by skipping the web app; we set out to build the tool we wanted to use, and the answer kept being “a real native process.”

If you spent the last decade assuming the browser had won, it’s worth checking the assumption. The next category of productivity software — AI — is being built somewhere else. The dock is having a moment.
