You don't need every AI model. You need the right five.
2.88 million models on Hugging Face. 359 on the leaderboard. You can do almost any job with five. The case for curation over catalogs.
Quick. Name the best image model for product photography. Name the best video model for an eight-second cinematic B-roll. Name the best text model for refactoring a TypeScript file. If you had to Google any of them, the dropdown lost.
The dominant pitch from AI tools in 2026 is “we give you access to everything.” Every provider, every version, every fine-tune. That sounds like generosity; it’s actually a bill. You pay it in decision fatigue, in stale “best model for X” trivia that rots in three weeks, in time spent A/B-ing two options for a task where either would have been fine. A deliberately small, hand-picked list of “use this for this right now” beats a 400-row dropdown every single time. This post is about why.

The catalog is genuinely absurd now
Start with the headline number. Hugging Face hosts 2,883,687 models as of May 2026 — up from one million in late 2024, and two million in mid-2025. Most are forks, quantizations, and fine-tunes nobody will run. But the “serious” sub-list isn’t small either: the Artificial Analysis leaderboard currently tracks 359 large language models, 224 of them with open weights. And that’s before you even touch image, video, audio, or music.
You don’t need to read the table line by line. The point is the feeling. Eight or twelve “serious” options in every modality you might reach for, each with a partisan blog post arguing it’s the best, each with weekly updates that nudge the ranking around the top three. Stand in front of that wall on a Tuesday afternoon with a thing you need to ship by 4pm, and the catalog becomes the problem.
“Access to everything” is a tax
The pitch is intuitive: more choice equals more power. The behavioral research has been pushing back on that intuition for twenty-five years. In a study published in 2000, Sheena Iyengar and Mark Lepper set up two tasting displays in a Menlo Park grocery store: one with 24 jams, one with 6. The 24-jar table drew bigger crowds. The 6-jar table converted them at ten times the rate (30% bought a jar versus 3%). Bigger menu, more attention, fewer decisions made. Barry Schwartz built a book on that finding; UX designers pair it with Hick’s Law, a result from the 1950s: the time it takes to decide grows, roughly logarithmically, with the number of options you can see.
That’s the textbook version. The lived version is worse, because AI picker dropdowns have three properties that compound the problem:
- The right answer changes constantly. The Artificial Analysis leaderboard reshuffled its top ten three times in Q1 2026 alone; even a routine vote-pipeline change at LMSYS in January shifted Elo scores by 30+ points on models that hadn’t changed at all. A list you compiled six weeks ago is already wrong.
- Marginal differences dominate the picker. For probably 80% of real tasks you’d open the dropdown for, the top three models in a modality produce output you couldn’t tell apart in a blind test. The picker pretends they’re different. Mostly they aren’t.
- Default paralysis hits new users hardest. A power user can ignore 90% of the dropdown. A first-time user freezes. The “we have 47 providers” pitch is the worst onboarding experience in software.
Every decision you make at the model picker is a decision you didn’t make about your actual work. That’s the tax. It looks free because nobody invoices you for it.
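To put a rough number on that tax, here is a minimal sketch of Hick’s Law in its common form, T = b · log2(n + 1). The constant `b` is empirical and varies by person and task, so the function below reports time in units of `b` rather than seconds. Treat it as an illustration under that assumption, not a measurement: the law was derived for simple choices among equally likely options, and a model picker is messier than that.

```python
import math


def hick_decision_time(n_options: int, b: float = 1.0) -> float:
    """Estimated decision time under Hick's Law, in units of b.

    T = b * log2(n + 1). The "+ 1" accounts for the implicit
    option of not choosing at all.
    """
    if n_options < 0:
        raise ValueError("n_options must be non-negative")
    return b * math.log2(n_options + 1)


if __name__ == "__main__":
    # Compare a 6-item shortlist, a 24-item shelf, and a 400-row dropdown.
    for n in (6, 24, 400):
        print(f"{n:>3} options -> {hick_decision_time(n):.2f} x b")
```

The growth is logarithmic, so the 400-row dropdown is not 66 times slower than the six-item list. It is still about three times slower per decision, and that is before the options start changing underneath you.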
Curation is the actual product
The categories that figured out this problem long ago all share a shape. They aggressively exclude.
The Criterion Collection has shipped roughly 1,500 titles since 1984. Netflix has more films in any given month. Criterion is more valued. Wirecutter publishes one pick per category, not a comparison sheet. Pitchfork’s Best New Music is a column, not a directory. The Hacker News front page is thirty items. In each case, the product isn’t the catalog — the product is the curation applied to the catalog. The taste is the value-add. The aggressive exclusion is the feature.

Apply the same lens to AI. A tool that says “here are 47 image models, good luck” is the Netflix back catalog. A tool that says “for product shots use this; for stylized covers use that; for poster typography use the third one” is the Criterion shelf. The first feels generous and is exhausting. The second feels narrow and is liberating. Restaurant menus that exceed forty items usually do it because the kitchen wants to seem competent at everything; every chef who has worked one tells you it’s the worst dish on the menu that defines your reputation, not the best.
The frontier moves. Maintenance is the work.
Anyone can publish a list. The hard part is rotating it. Flux 2 Pro leapfrogged Midjourney v6 on photorealism in late 2025; six months earlier the answer was inverted. Veo 3.1 took the cinematic video crown from Sora; OpenAI shipped GPT-5.5 in March 2026 and reclaimed the Intelligence Index top spot from Claude. Claude Opus 4.7 then took back the SWE-bench Pro lead at 64.3%. Anyone running a serious coding workflow on the “best” model from January 2026 is already wrong.
A list maintained by people whose actual job is to watch the frontier is wildly more valuable than the same list scraped quarterly. This is why the model-picker pattern fails as a UX choice: the dropdown is a snapshot, but the underlying truth is a moving target. Curation has to be a job, not a one-time table in a launch blog post.
What a curated list looks like in May 2026
Concrete is better than abstract. Here’s the small list, by job, as of the day this post went up. It will be wrong by the time you read it — and that’s the point. The list rotates. What doesn’t rotate is the shape of the list: one pick per job, six jobs, justified in a sentence.
That’s it. Six jobs, six picks. No model picker. No decision-fatigue tax. The trade-off is real — you don’t get to A/B Claude against Gemini for your daily writing — and for almost everyone, almost all the time, the trade-off is a giveaway. The cost of standing in front of the wall every Tuesday vastly outweighs the cost of using a model that is 4% behind the current leader on a benchmark you don’t actually run. The right question isn’t “which model is best?” The right question is “what am I trying to do?” A good tool answers the second.

Where curation is the wrong call
Curation is for the 80%, not everyone. Three honest cases where the narrow list breaks down:
- You have a genuinely specialist need. A medical fine-tune. A Japanese-first text model. A regulated provider on a specific compliance list. A music tool with cleared commercial rights for YouTube monetization: ElevenLabs Music ships with the Merlin and Kobalt deals that Suno doesn’t. The curated default fails these jobs; you need the broader catalog.
- Editorial bias is real and worth naming. Any curator’s picks reflect criteria: output quality, latency, cost, availability, safety profile, license terms. Publish the criteria, not just the picks. Readers should be able to disagree with the rubric, not just the choices it produces. We’ve laid ours out in the modality field guide.
- Curation can calcify. The mitigation is transparency about when the list was last reviewed and what changed. A curated list with no date stamp ages worse than a dropdown.
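The date-stamp idea in that last point can be sketched as a small data structure: one pick per job, a one-sentence reason, and a review date that the tool checks before it routes you. Everything named here is hypothetical. The job keys, the placeholder model names, the review dates, and the 21-day threshold are assumptions for illustration, not the actual roster or how any particular product implements it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Pick:
    job: str             # what the user is trying to do
    model: str           # placeholder name, not a real recommendation
    reason: str          # the one-sentence justification
    last_reviewed: date  # when a human last confirmed this pick


# Assumption: a pick older than three weeks is overdue for review.
MAX_AGE = timedelta(days=21)

# Hypothetical roster: one pick per job.
ROSTER = {
    p.job: p
    for p in (
        Pick("product-photo", "image-model-a", "Best photorealism.", date(2026, 5, 1)),
        Pick("cinematic-broll", "video-model-b", "Best short clips.", date(2026, 5, 1)),
        Pick("code-refactor", "text-model-c", "Best at refactoring.", date(2026, 4, 1)),
    )
}


def pick_for(job: str, today: date) -> tuple[Pick, bool]:
    """Return the pick for a job and whether it is overdue for review.

    Raises KeyError for jobs outside the curated list, which is the
    signal to fall back to the broader catalog.
    """
    pick = ROSTER[job]
    is_stale = today - pick.last_reviewed > MAX_AGE
    return pick, is_stale


if __name__ == "__main__":
    pick, is_stale = pick_for("code-refactor", date(2026, 5, 10))
    flag = " (overdue for review)" if is_stale else ""
    print(f"{pick.job}: use {pick.model}{flag}. {pick.reason}")
```

The user only ever supplies a job. The review date travels with the pick, so a stale entry is visible instead of silently wrong.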
Pick a job. Not a model.
This is the bet we made with CSuite. Instead of bolting on every provider’s catalog, we ship a deliberately small, hand-picked roster across text, image, video, and audio. The picks rotate when something genuinely better arrives. You don’t tune dials; you pick a job and the tool routes you to the model that’s currently best for it. It’s the same posture that moving AI local is starting to put on the rest of the stack: fewer moving parts, owned by you, designed to disappear.
Spotify won by having every song, the argument goes; surely AI should copy the move. Spotify won by having every song and Discover Weekly, an editorial algorithm whose entire purpose is to do the picking for you. The model picker is the part Spotify quietly replaced. The catalog isn’t the product. The taste is.
Next Tuesday at 4pm, you’ll have a thing to ship. The tool that helps you ship it isn’t the one with the longest dropdown. It’s the one where you don’t see a dropdown at all.


