More of a meta comment, but I really wish Anthropic would say something about their plans for Fable. We're all just kind of left here floating and aimless, with no idea of what to expect.
Agreed, though it sounds like they could add KYC stuff and restore access for US citizens. I utterly hate that we're at that point and I think it's ludicrous for privacy and just common sense, but it would be nice to know if that's their plan or not for example. Or if their plan is to just wait for the government to decide on something, or if they're planning to sue, or whatever.
> though it sounds like they could add KYC stuff and restore access for US citizens
I suspect so, I just got an email yesterday from Anthropic on their privacy policy update and they added:
> As part of our measures to keep our services safe and secure we may ask you to verify your age or identity, and we've described what we collect and how.
So yeah, ID verification is coming unfortunately. I wonder how that's going to work for team/business/enterprise plans? Is every individual corporate user going to have to submit their ID? How do they handle staying in compliance with the law with credential sharing that can happen within the company from the same IP address?
yep, and while Anthropic is now spending their time dealing with regulation, the Chinese models have some time to catch up.
I do wonder though, when those models are "mythos class" (whatever that means), will China do the same thing restricting it from export? If they get a model that's better than US companies, I fully expect them to stop open sourcing them for the world (but I hope I'm wrong about that).
Well I would say you just chose ANOTHER sensible path with tradeoffs just like frontier API/subscriptions. Yes you get 100% control, cheaper inference, not having to stare at status.claude.com for like 2 hours per week, complete privacy (assuming local hosting or self hosting on servers).
But, you can’t get Fable level performance. OSS has reliably trailed the frontier by like 4-7 months for years now
Exactly, and insert meme "why not both?" I run local models for plenty of things, but for work there's a lot of complexity and Fable handled it so much better than any open model has so far. It's not really a choice for some domains currently.
The US Government has demanded a solution to the Halting Problem squared and by George (Washington) these tinpot fascists are going to get what they demand.
Hard to imagine where things go from here. GLM-5.3 will be released some day, with Fable class capabilities, and the (MAGA) US government will still be faffing around in their alt-reality cinematic bullshitiverse.
That's very impressive. What's the best way to run these kernels natively on a Mac? I saw that there's a way to plug Claude into Apple's Foundation Models framework, and there's a CLI tool that can access models via that framework. It might be useful to have something so fast and good available via a small CLI tool for various purposes, especially when connected with a small suite of tools I have for things like file editing, showing, simple agentic purposes etc.
> It climbed to 84 tok/s, then hit a wall, insisting further optimization was impossible.
> Hours later, Anthropic rolled back invisible LLM development safeguards, and it hit 255 tok/s.
Wow. Limiting access to models for any reason other than physically not being able to provide it should be a crime against humanity or the planet or something. So much immediate efficiency left on the table for stupid reasons.
Apologies for a dumb question: is this someone running fable5 on their own machine, and it got pushed to 255 tok/s? How is that possible (how did a person acquire the model)?
For comparison, the current agent swarm challenge on HF is at 508 tok/s on an A10G GPU:
https://huggingface.co/spaces/gemma-challenge/gemma-dashboar...