Opinion · Local AI · Industry · May 17, 2026 · 12 min read

AI products are mortal. Your workflow shouldn't be.

Eight obituaries in 24 months, nine Claude models retired, a four-hour ChatGPT outage. The case for keeping your daily AI workflow on a machine you own.

By Atul

Three numbers your workflow depends on

966 venture-backed startups shut down in 2024: a 25.6% jump on 2023, with the AI-heavy cohort leading (Carta / TechCrunch, Jan 2025)

9 Claude models retired or scheduled to retire in the 12 months ending June 2026 (Anthropic model deprecation page)

7+ established OpenAI models (incl. gpt-4-turbo, gpt-4o-2024-05-13, o1) shut down on 23 Oct 2026 (OpenAI Deprecations notice)

Make a quick list of the AI tools you used in 2023. The chatbot you trained to write in your voice. The image app that became your whiteboard. The wearable on your lapel. The browser extension that lived in every tab. Now mark which ones are still here in 2026: still owned by the same company, still on the same pricing, still running on the model you tuned your prompts for, still holding your conversation history. Most lists come back with more strikethroughs than entries.

We talk about local-first AI as a privacy story, and it is one. But the more boring, more urgent reason to keep your AI workflow on your own machine isn’t privacy. It’s continuity. Cloud AI is a single point of failure you don’t own: bound to a vendor’s burn rate, a model retirement schedule, a board pivot, and an outage page you can refresh but not fix. Photoshop CS6 still opens your 2012 .psd. Your six-month-old ChatGPT thread doesn’t survive a closed account. This post is about the gap.

[Image: A red and white 'sorry we're closed' sign hanging on the glass door of a small storefront.]
Eight named obituaries in 24 months, and that’s the short list. Photo by Tim Mossholder on Unsplash.

The 2024 graveyard you forgot you used

Start with the obituaries. None of these are obscure: each was, at some point, a product people were paying for or planning around. Each is now, in some form, dead, wound down, or stripped for parts.

Eight obituaries · 24 months

| Date | Company · product | What happened |
| --- | --- | --- |
| Mar 2024 | Inflection AI · Pi | Founders + most of 70-person team hired by Microsoft; $620M licensing fee; consumer Pi caps within 5 months. |
| Mar–Jun 2024 | Stability AI | CEO Emad Mostaque resigns; 10% of staff laid off; Sean Parker–led group recapitalizes with new CEO Prem Akkaraju. |
| Jun 2024 | Adept AI | Amazon hires CEO + top execs and licenses tech; investors get ~$25M back on a $414M raise; product survives smaller. |
| Aug 2024 | Character.AI | Google licenses tech and rehires Shazeer & De Freitas in a $2.7B deal; Character quits training its own LLMs. |
| Feb 2025 | Humane · AI Pin | HP buys assets for $116M (vs. $230M raised, $1B asking price); Pin services switched off 28 Feb; cloud data deleted. |
| 2024–2026 | Rabbit · R1 | 100,000 buyers, ~5,000 daily users; staff strikes over unpaid wages; company teases next-gen hardware. |
| 2024 | Google · Bard / Podcasts / Dropcam | Bard rebranded into Gemini; Podcasts shut April 2024; Dropcam services ended after a decade; eight entries added to the Google Graveyard. |
| 23 Oct 2026 | OpenAI · gpt-3.5-turbo-0125 et al. | Mass shutdown: gpt-4-turbo, gpt-4o-2024-05-13, o1, o3-mini and 3+ more models go dark. Every prompt tuned for them breaks silently. |

Two patterns. First, the modal exit for an ambitious AI startup in 2024 was not an IPO or a clean acquisition; it was a reverse acquihire. The platform vendor hires the founders and the senior team, licenses the weights, hands the investors back roughly what they put in, and the product, once the “crown jewel,” becomes a wind-down project. Inflection, Adept, and Character.AI all followed the same script in a single year, with the FTC writing the details down. Second, when the product survives the deal, it survives with caps, with a smaller team, with a different mission, and often with the original founders gone. Pi added usage limits within five months of the Microsoft deal. Character.AI quit building its own models. Stability lost its CEO and laid off 10% of its staff inside six weeks.

The pattern that matters for your workflow isn’t the headline deal. It’s the small print: when the founders leave, the roadmap leaves. When the model is licensed away, the tuning leaves. When the company “shifts focus,” your use case is the one being shifted away from.

AI startups die faster than the rest of software

Software startups have always died at high rates. The new thing is speed. Carta’s 2024 data shows 966 venture-backed startups shut down in a single year: a 25.6% jump on 2023, attributed to “the insane number of companies that were funded in the crazy days of 2020 and 2021,” which is the AI funding cohort. Silicon Valley Bank’s enterprise data adds the burn-rate side of the picture: the 2022 cohort of AI startups spends $100M in roughly three years, versus close to six years for comparable software peers a decade earlier. The capital is burning twice as fast; the exits aren’t arriving twice as quickly.
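The "twice as fast" claim is simple arithmetic on the two figures quoted above. A minimal sketch, assuming spending is spread evenly across the years (a simplification; the SVB data may not be that smooth):

```python
# Figures quoted above: $100M spent in roughly 3 years (2022 AI cohort)
# versus close to 6 years for comparable software peers a decade earlier.
TOTAL_SPEND_MUSD = 100.0


def annual_burn(total_musd: float, years: float) -> float:
    """Average spend per year in millions of dollars, assuming even spending."""
    return total_musd / years


ai_burn = annual_burn(TOTAL_SPEND_MUSD, 3)
peer_burn = annual_burn(TOTAL_SPEND_MUSD, 6)
ratio = ai_burn / peer_burn

print(f"AI cohort:  ${ai_burn:.1f}M per year")
print(f"Peers:      ${peer_burn:.1f}M per year")
print(f"Burn ratio: {ratio:.1f}x")
```

The same runway math applies to any product you depend on: halve the years a fixed raise lasts and you double the pressure on the next round.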

The mechanism is mechanical, not moral. Training and serving a frontier-class model costs hundreds of millions of dollars a year in compute alone. Stability AI’s leaked 2023 numbers showed it on track to spend $99M renting GPUs against $11M of expected sales. Rabbit reportedly stopped paying parts of its 26-person staff for months while its R1 device went from 100,000 buyers to roughly 5,000 daily users within a year. The unit economics of running someone else’s frontier API while subsidizing your own free tier are merciless. When the next round doesn’t close, the product’s next release doesn’t either.

None of this is a prediction about a specific company. It’s a base-rate argument. Whatever AI app you reach for ten times a day, the category’s 2024 mortality rate says you should expect a non-zero chance, every quarter, that the company behind it gets bought, recapitalized, repositioned, or quietly wound down. The workflow you built on top of it is a hostage to that base rate.
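To see how a small per-quarter chance compounds, here is a rough sketch. The quarterly risk levels are hypothetical placeholders, not measured rates, and the model assumes each quarter is independent with a constant risk, which real companies won't obey exactly:

```python
def chance_of_disruption(quarterly_risk: float, quarters: int) -> float:
    """Probability of at least one disruptive event over `quarters`,
    assuming an independent, constant risk in every quarter."""
    if not 0.0 <= quarterly_risk <= 1.0:
        raise ValueError("quarterly_risk must be between 0 and 1")
    return 1.0 - (1.0 - quarterly_risk) ** quarters


# HYPOTHETICAL risk levels, for illustration only.
for risk in (0.02, 0.05, 0.10):
    three_years = chance_of_disruption(risk, 12)
    print(f"{risk:.0%} per quarter -> {three_years:.0%} over 3 years")
```

Even a modest quarterly figure adds up over the three-year horizon most workflows are expected to last, which is the point of treating this as a base rate instead of a bet on any one company.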

The survivors retire your prompts on a schedule

Even if your favorite AI product survives, the model behind it probably won’t. The big three model labs now publish formal deprecation calendars, and the cadence has tightened. A prompt you finely tuned for one version of Claude or GPT in 2024 lives on a countdown clock.

Model retirement schedule, 2024 to 2026

| Lab | Model | Retired | Note |
| --- | --- | --- | --- |
| OpenAI | text-davinci-003 / davinci | 4 Jan 2024 | Stable 'davinci' alias auto-upgraded to davinci-002; legacy prompts had to migrate to gpt-3.5-turbo-instruct. |
| OpenAI | gpt-3.5-turbo-0613 / -16k-0613 | 17 Jun 2024 | First widely-deployed chat model versions to go dark. |
| Anthropic | claude-1.x · claude-instant-1.x | 6 Nov 2024 | 60-day notice; recommended replacement was Haiku 3. |
| Anthropic | claude-2.0 · claude-2.1 · claude-3-sonnet | 21 Jul 2025 | All three retired together; full migration to the Claude 4 family. |
| Anthropic | claude-3-5-sonnet-20240620 / -20241022 | 28 Oct 2025 | The model many production prompts were tuned against in 2024. |
| Anthropic | claude-3-opus-20240229 | 5 Jan 2026 | Anthropic's stated commitment: preserve weights internally, retire the endpoint. |
| Anthropic | claude-3-7-sonnet · claude-3-5-haiku | 19 Feb 2026 | Two retirements on the same day. Recommended replacement: Sonnet 4.6. |
| Anthropic | claude-sonnet-4 · claude-opus-4 | 15 Jun 2026 (planned) | Deprecated 14 Apr 2026; the 60-day clock is ticking as you read this. |
| OpenAI | gpt-3.5-turbo-0125 · gpt-4-turbo · gpt-4o-2024-05-13 · o1 · o3-mini | 23 Oct 2026 (planned) | A single-day cull of seven+ established models. The largest deprecation list to date. |

Sources: Anthropic model deprecations, OpenAI 2026 deprecation notice, OpenAI Completions API retirements.

Read that as a workflow risk, not as a developer’s problem. Prompts have personalities. A system prompt that consistently produces the tone you want on claude-3-5-sonnet-20241022 is not the same prompt on claude-sonnet-4-6; it’s a new piece of writing you’ll spend an evening rewriting. The retirement of gpt-3.5-turbo-0125 in October 2026 will silently break every script that was hard-coded against it. Anthropic has published commitments to preserve the weights of retired models internally; that’s a meaningful gesture, but the API endpoint is what your tool calls, and the API endpoint goes dark on the date in the table.
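One way to avoid being surprised by a retirement date is to keep the model IDs your scripts pin in one place and check them against the calendar. A minimal sketch: the dates are the planned ones from the schedule above, while `PINNED`, `retiring_soon`, and the 90-day warning window are illustrative assumptions, not any vendor's API:

```python
from datetime import date

# Planned retirement dates copied from the schedule table above.
RETIREMENTS = {
    "claude-sonnet-4": date(2026, 6, 15),
    "claude-opus-4": date(2026, 6, 15),
    "gpt-3.5-turbo-0125": date(2026, 10, 23),
    "gpt-4-turbo": date(2026, 10, 23),
    "gpt-4o-2024-05-13": date(2026, 10, 23),
}


def retiring_soon(models_in_use, today, warn_days=90):
    """Return (model, days_left) pairs for pinned models that retire within
    warn_days of today, soonest first. Already-retired models show up with a
    negative days_left; models with no known date are skipped."""
    hits = []
    for model in models_in_use:
        retire_on = RETIREMENTS.get(model)
        if retire_on is None:
            continue  # no known retirement date, e.g. a local model file
        days_left = (retire_on - today).days
        if days_left <= warn_days:
            hits.append((model, days_left))
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical list of model IDs hard-coded across your scripts.
PINNED = ["claude-sonnet-4", "gpt-4-turbo", "my-local-model"]

for model, days_left in retiring_soon(PINNED, today=date(2026, 5, 17)):
    print(f"{model} retires in {days_left} days")
```

Run from a scheduled job, a check like this turns a silent break into a dated to-do item. The table still has to be updated by hand from the labs' deprecation pages.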

[Image: A close-up of dim, blue-lit server racks in a data center, equipment racks receding into the dark.]
Somebody else’s computer, on somebody else’s schedule. Photo by Kier in Sight Archives on Unsplash.

One cloud goes down. Every wrapper goes with it.

Vendor mortality plays out over months. The other failure mode plays out over minutes. On 11 December 2024, a new telemetry service deployment overwhelmed OpenAI’s Kubernetes control plane and took ChatGPT, the API, and Sora down for roughly four hours. Every third-party app built on top of those APIs (coding assistants, meeting summarizers, support bots, drafting tools) was offline at the same time. That is the architectural cost of building a workflow on a single provider: one bad config change in San Francisco is a four-hour gap in your day.

Outages aren’t the only way history leaves. Early in 2025 a ChatGPT incident erased months of conversation history for many users; some of it was never recovered. Anthropic’s September 2025 outage took Claude, the developer Console, and the API down together: one company’s bad day, thousands of downstream products quietly serving an error page. Cloud-only workflows inherit the operator’s availability number whether you signed up for that SLA or not.

What local-first actually preserves

Set aside the privacy argument for a moment and look at what a local-first setup actually saves from the mortality table above. It helps to think in terms of files on disk: the boring, durable unit that’s carried serious software through forty years of company collapses.

What survives when the vendor doesn’t

| Workflow asset | Cloud AI | Local-first |
| --- | --- | --- |
| The model itself | Endpoint goes dark on the lab's retirement date. Weights are not yours to keep. | GGUF / safetensors file on disk. Open formats, multiple runtimes (llama.cpp, MLX, vLLM). |
| Conversation history | Lives in the vendor's database; can be erased by an incident or a closed account. | Plain text or SQLite in a folder you own. Greppable, syncable, archivable. |
| Tuned prompts & system instructions | Tuned to a specific model version that may be retired in 12–18 months. | Files on disk, paired with a model you control. The pairing doesn't expire. |
| Custom RAG / embeddings index | Hosted in the vendor's vector DB. Migration is a project when the vendor pivots. | Local index (LanceDB, sqlite-vss, faiss) next to your documents. |
| Availability | Inherits the vendor's incident page. Dec 2024 outage took every wrapper down for ~4 h. | Works on a plane, after a layoff cuts your SSO, while the cloud is on fire. |
| Pricing | Subject to mid-contract changes as the underlying token economics shift. | Fixed cost of a laptop you already own. No per-action billing. |

None of these are exotic claims. They’re the same properties every other piece of serious software on your computer already has. Photoshop is a folder of files on disk; the .psd opens whether Adobe is having a good quarter or not. Excel is a folder of files on disk; the .xlsx your accountant sent in 2009 opens today. A Lightroom catalog is a folder of files; you can move it between machines, back it up, and read it after the subscription lapses. Local-first AI is the same property applied to the model: a checkpoint on disk, a history of conversations as plain text, a config you can grep, a tool that doesn’t need a billing system’s permission to run. We’ve made the broader case that AI is moving back to the desktop and that BYOK is the better deal; the continuity argument here is the third leg of that stool.
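To make "conversation history as SQLite in a folder you own" concrete, here is a minimal sketch using only Python's standard-library `sqlite3` module. The table layout, function names, and sample messages are assumptions for illustration, not the schema of any particular app:

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) a local history database. Pass a file path inside a
    folder you back up; ':memory:' is used here only to keep the demo clean."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY,
               created_at TEXT NOT NULL,
               model TEXT NOT NULL,
               role TEXT NOT NULL,
               content TEXT NOT NULL
           )"""
    )
    return conn


def log_message(conn, created_at, model, role, content):
    """Append one message; the 'with' block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO messages (created_at, model, role, content) "
            "VALUES (?, ?, ?, ?)",
            (created_at, model, role, content),
        )


def search(conn, term):
    """Substring search over past messages, oldest first."""
    rows = conn.execute(
        "SELECT created_at, model, role, content FROM messages "
        "WHERE content LIKE ? ORDER BY created_at",
        (f"%{term}%",),
    )
    return rows.fetchall()


conn = open_history()
log_message(conn, "2026-05-17T09:00:00", "local-model", "user",
            "Draft a refund policy email")
log_message(conn, "2026-05-17T09:00:05", "local-model", "assistant",
            "Here is a draft of the refund email.")
log_message(conn, "2026-05-17T09:05:00", "local-model", "user",
            "Summarize this meeting")

for row in search(conn, "refund"):
    print(row)
```

Because the history is an ordinary file, it syncs, archives, and restores with whatever backup tooling you already use, and it records which model produced each answer.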

[Image: A wall of vintage wooden filing-cabinet drawers with small brass label holders, each drawer indexed by hand.]
File formats that outlived their first three employers. Photo by Jéan Béller on Unsplash.

The data-control bonus follows from the continuity bonus, not the other way around. Your prompts aren’t in someone else’s training set because they’re in a file on your machine. The next “ChatGPT history leak” headline can’t implicate a conversation that never crossed a network. Legal, medical, and financial work can stay inside a clean-room compliance posture without an enterprise contract. It’s a meaningful bonus, but it isn’t the argument; continuity already made the case in the sections above.

What local doesn’t fix

Local-first is a hedge, not an absolution. Four honest caveats worth keeping in mind:

1. Frontier cloud is still ahead on the hardest tasks. The realistic stance is hybrid: a strong open-weights model on disk for the daily 80%, a cloud frontier model on tap for the heavy lift. The continuity argument is about which one is your default, not which one exists.
2. Your local model file is also a single point of failure. Back up your weights the way you’d back up a Lightroom catalog. A quantized 70B-parameter GGUF or safetensors file is 40–80 GB of data you do not want to re-download under deadline. A second copy on an external SSD costs $40 once.
3. The open-weights ecosystem can fragment too. The mitigation is open formats: GGUF for quantized inference and safetensors for raw weights, both of which have multiple independent runtimes (llama.cpp, MLX, vLLM, transformers). The risk collapses to “the format,” which is the same kind of risk a .docx file faces, and a .docx file has aged remarkably well.
4. Some products are genuinely irreducible to local: live web search, voice cloning at frontier quality, fresh news-aware reasoning, image generation at the very top end. Some of these still need cloud-scale compute or up-to-the-minute data. The local-first posture is about your daily driver, not everything.
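The hybrid stance in the first caveat can be sketched as an ordered list of backends, where the next one is tried whenever the previous one fails. The two backend functions below are hypothetical stand-ins, not real client libraries; in practice each would wrap whatever local runtime or cloud SDK you use:

```python
from typing import Callable, Sequence, Tuple

Backend = Callable[[str], str]


def ask_with_fallback(
    prompt: str, backends: Sequence[Tuple[str, Backend]]
) -> Tuple[str, str]:
    """Try each (name, backend) in order and return (name, answer) from the
    first one that succeeds. Raise RuntimeError if every backend fails."""
    errors = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except Exception as exc:  # an outage, a retired endpoint, a revoked login
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Hypothetical stand-ins for illustration only.
def cloud_model(prompt: str) -> str:
    raise ConnectionError("provider outage")


def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


name, answer = ask_with_fallback(
    "Summarize my notes",
    [("cloud", cloud_model), ("local", local_model)],
)
print(name, "->", answer)
```

Swapping the order of the list is the whole design decision: put the local backend first for the daily 80%, and route only the heavy tasks to a cloud-first list.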

What to do this week

Two concrete things, neither of which requires a manifesto. First, pick one AI workflow you’d be quietly upset to lose (the chat history you grep for past answers, the system prompt you tuned for your tone, the custom GPT your team built) and ask, out loud: where does this live, and what survives the company that hosts it? If the answer is “nowhere I control,” copy the prompts into a markdown file in a folder you back up. Export the conversation history. Note which model produced your favorite answers, while the model is still around to ask. Five minutes of work; survives any one company.
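The five-minute version of that first step can be as small as the sketch below, which writes a prompt to a markdown file alongside the model it was tuned for. The folder, file name, and prompt text are placeholders; the demo writes to a temporary directory, but you would point it at a folder you back up:

```python
import tempfile
from pathlib import Path


def save_prompt(folder: Path, name: str, model: str, prompt: str) -> Path:
    """Write one prompt to <folder>/<name>.md, recording the model it was
    tuned against so the pairing is not lost when the endpoint retires."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.md"
    path.write_text(
        f"# {name}\n\nModel: {model}\n\n{prompt.strip()}\n",
        encoding="utf-8",
    )
    return path


# Placeholder values for illustration only.
demo_dir = Path(tempfile.mkdtemp()) / "prompts"
saved = save_prompt(
    demo_dir,
    name="tone-guide",
    model="claude-3-5-sonnet-20241022",
    prompt="Write in short, direct sentences. Avoid buzzwords.",
)
print(saved.read_text(encoding="utf-8"))
```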

Second, install one local model on your laptop and run your most common task through it. Not your hardest task, your most common one. The bar isn’t “beats GPT-5 on math olympiads.” The bar is “handles the 80% of things you ask an AI for, with no login, no rate limit, no outage, no model retirement.” If you need help picking one, we’ve written a flowchart for that.

The point of this post isn’t that the cloud is bad or that every workload should run on a laptop. It’s that the AI tools you depend on most live in an industry with an unusually short mean-time-to-failure, and the people who built the rest of your durable software stack figured out the answer to that problem decades ago: ship the file to the user. The companies behind your AI tools are mortal. The workflow doesn’t have to be.
