Anthropic's Safety Superpower

botw44 · 2026-06-15T11:21:02 1781522462

The whole thesis falls apart though. You can't be on your way to "power over everything" and get distilled into free Chinese models within months. Pick one.

The bottleneck is compute and data, not the model. That's why they could only gate it for a bit. The ITAR thing proves it: no nationality controls in place, so the only option was killing the whole thing. Not exactly what an all-powerful gatekeeper does.

olmo23 · 2026-06-15T11:27:18 1781522838

> no nationality controls in place

Not for now, but how long before we have KYC regulations concerning LLMs?

thefounder · 2026-06-15T11:34:54 1781523294

That’s really what Dario wants. Let’s hope he doesn’t get it

baq · 2026-06-15T11:50:50 1781524250

what Dario wants is to retain any influence whatsoever on how the research progresses before the inevitable nationalization of the frontier. he gets to keep the N-2 tech and maybe influence the N-1 tech, but the only influence on the frontier he has is today; whatever he imprints in the pipeline the government takes over.

IOW I don't think he thinks in the same categories as most folks here.

dofm · 2026-06-15T11:47:51 1781524071

Regulatory capture is the OpenAI and Anthropic end goal, for certain.

But I also think they exist in a sort of un-designed corporate narcissism, which is a common trait in bubble economies — I am not judging them particularly severely.

Netscape under Clark and Andreessen and Sun under McNealy both fell into corporate narcissism: the belief that only they really mattered, that they were chosen, and that the world needed to rearrange itself to just let them shine. They arguably let themselves get played by Oracle (a corporate psychopath) and others as a result.

OpenAI's position is profoundly corporate-narcissistic: all we need is all the money in the economy and not to have to do anything upsetting like think about turning a profit for the next four years. Like rich kids. It would be nice if you believed we were so important that we should get an enormous stipend for just being us.

Anthropic's position is: we think we're so unique and ominous that government needs to make us both essential and terrifying. We have to exist otherwise worse people will.

Both narcissistic positions.

baq · 2026-06-15T11:52:25 1781524345

> Regulatory capture is the OpenAI and Anthropic end goal, for certain.

it has to be, because the other way around - the government taking over parts or the whole thing - is inevitable if the trend holds.

blitzar · 2026-06-15T12:02:13 1781524933

the inevitable trend is that numbers will be free and nobody will control the whole thing

ai-celebrities are just clinging to relevance like all the other celebrities out there

dofm · 2026-06-15T11:53:38 1781524418

Why not both?

baq · 2026-06-15T11:56:30 1781524590

my point is that this is exactly the play

throw1234567891 · 2026-06-15T11:53:15 1781524395

Yeah yeah, but after the IPO!

swalsh · 2026-06-15T12:02:15 1781524935

The distilled versions miss the spark of the model. It's like they land in the uncanny valley of models.

kordlessagain · 2026-06-15T11:02:23 1781521343

> To that end, I can certainly buy the case that Fable/Mythos is in fact more capable when it comes to identifying and exploiting security issues

This has been covered before: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... (https://news.ycombinator.com/item?id=47732020)

> Anthropic’s cautious roll-out was justified. The problem with publicly releasing models, however, is that guardrails can be jailbroken, and apparently that is exactly what happened shortly after the release

The future is unevenly distributed. Anthropic, and Amodei in particular, seem to be of the mind they can control a bit of the unknown using words. They are likely being guided by the very product they built. *AI CAN MAKE MISTAKES

That Project Glasswing bullshit reeks of it. Corporations have taken control of our attention, our Internet, and now our thinking.

I say it's high time to take it back.

mofeien · 2026-06-15T11:49:28 1781524168

The top comment in the very discussion you linked on that AISLE blog has a strong rebuttal to that blog post...

chasil · 2026-06-15T11:07:20 1781521640

(reposted)

As I understand it, ITAR regulations for export controls have just been applied to any form of Mythos. These are overseen by U.S. Departments of State and Commerce, and forbid foreign nationals from access to any form of Mythos, either within or outside the U.S.

Only U.S. citizens and immigrants that are holders of a "green card" may now access Mythos.

It appears that Anthropic does not have internal controls to implement these restrictions in any form, so the only option was to shut Mythos down.

Penalties for ITAR violation can reach ten years in prison and a million dollars per violation. (I can post a link to those details if there is any interest.)

As long as Anthropic is a U.S. company, there is no escaping this.

https://fortune.com/2026/06/14/how-a-warning-from-amazon-led...

khalic · 2026-06-15T11:21:26 1781522486

This is how the US gov does business now, capricious and vengeful.

Textbook retaliation for not letting them use an abliterated version of Claude in weapons systems.

This effectively renders any US closed model useless for any foreign company. Could happen to OpenAI, Google, etc. Too much of a risk to implement something that can be yanked out because the company didn’t behave the way they want.

Looks like it’s time for Kimi, Z, Deepseek to take the front row. They’ll catch up in a few months anyway. Kimi code 2.6 is crazy good

CuriouslyC · 2026-06-15T12:05:39 1781525139

This is a suicide shot for the American economy. The numbers only lined up for AI to rescue the USA from its debt if it captured a significant portion of the world's AI spend, and while it was a longshot before, there's basically zero percent chance the world trusts American AI when the government is pulling strings.

chasil · 2026-06-15T11:42:50 1781523770

Consider this quote from the main article...

"When you further combine this realization with the company’s pronouncements about AI’s ability to conduct all economic activity, you realize that Anthropic’s leadership effectively wants to have power over everything and everyone."

This is fearful stuff on all sides, and none of the people involved might realistically be able to navigate the danger.

baq · 2026-06-15T12:02:12 1781524932

the whole thing playing out as expected. if you think about it, the only question is the timeline.

the next model that is as far ahead of mythos as mythos is of opus will be controlled technology from the get-go. the one after it may be top secret.

khalic · 2026-06-15T12:07:16 1781525236

Open models will catch up eventually, TOTL models will get distilled into smaller, more efficient versions, it’s not something you can moat indefinitely

khalic · 2026-06-15T11:53:59 1781524439

That part just sounds like hyperbole at best, conspiracy at worst.

By that logic, anybody who values safety has a god complex? It’s absurd…

chasil · 2026-06-15T11:59:59 1781524799

I am just quoting the parent article.

"What this degradation represented was both the capability and willingness of Anthropic to silently alter its models to achieve its policy preferences. In other words, Anthropic willfully validated some of its critics’ worst fears in terms of being a supply chain risk."

khalic · 2026-06-15T12:03:10 1781524990

Again, hyperbole and assumption of evil intent because… they take precautions? Nice prose doesn’t excuse you from forming a sound hypothesis

eloisant · 2026-06-15T11:32:18 1781523138

I never really understood this "US person" restriction. There are 350M people in the US, mostly citizens and green card holders; surely some of them could be working for a foreign power.

vidarh · 2026-06-15T11:45:21 1781523921

They don't even need to know they are. You can assume that if the model becomes available again, a lot of people will find themselves working for companies distilling these models that just happen to ultimately do work for foreign entities, whether or not the people accessing the models know it.

WithinReason · 2026-06-15T11:31:56 1781523116

Could Anthropic relocate to a different country?

chasil · 2026-06-15T11:44:01 1781523841

Individuals can leave, but the company cannot transfer restricted intellectual property.

Europe has extradition treaties, so the U.S. can force anyone in Europe back to the U.S. for criminal indictment who demonstrates inappropriate possession of this technology.

khalic · 2026-06-15T11:57:02 1781524622

Well, force is a strong word… it’s still just accords, that the US doesn’t seem to be valuing lately… so if they say no, what’s the US going to do? Start a war over a company?

swalsh · 2026-06-15T11:37:42 1781523462

"they by extension think that only they should have final say over AI generally. When you further combine this realization with the company’s pronouncements about AI’s ability to conduct all economic activity, you realize that Anthropic’s leadership effectively wants to have power over everything and everyone."

That might be one of the most important points in the post. Very troubling.

smackeyacky · 2026-06-15T11:11:30 1781521890

Perhaps they should consider leaving the US. Pretty clearly the descent into a corrupt autocracy is having real consequences.

Zealotux · 2026-06-15T11:17:55 1781522275

Does any other place have the infrastructure Anthropic requires to train their models and run inference?

ramon156 · 2026-06-15T11:23:36 1781522616

No. If we cannot even have an EU CloudFlare, then we definitely do not have the infra for this kind of computing.

The EU options are not even close to what CF can do

eric8bits · 2026-06-15T11:27:53 1781522873

There are fortunately some initiatives and interesting developments in the European market. Take bunny.net for example. We have to start somewhere in Europe, right? Better late than never.

mrits · 2026-06-15T12:02:27 1781524947

This infrastructure may not even be needed in the next decade. Europe should have done this 20 years ago.

s_dev · 2026-06-15T11:33:17 1781523197

>EU CloudFlare

What limitations does bunny.net have?

re-thc · 2026-06-15T11:37:40 1781523460

> What limitations does bunny.net have?

A huge free tier (technically, none)

re-thc · 2026-06-15T11:39:12 1781523552

> Does any other place have the infrastructure

That's not the problem.

The US government can export ban GPUs like they do now to more countries if needed. Even if the infrastructure exists, the GPUs won't.

mcmcmc · 2026-06-15T11:29:30 1781522970

China

Levitz · 2026-06-15T12:03:33 1781525013

Why settle for play pretend autocracy when you can go all-in amirite?

CuriouslyC · 2026-06-15T12:07:28 1781525248

Better to be firmly under the boot of a sane autocrat than have the illusion of freedom under a madman.

thedreammachine · 2026-06-15T11:11:40 1781521900

The interesting part here is not whether Anthropic is right on safety, but that safety gives them a moral vocab for bold policy changes and platform power.

cube2222 · 2026-06-15T11:07:28 1781521648

Relatedly, I think it's worth noting that Anthropic models have consistently been top-scoring in BullshitBench[0], in a league of their own, really.

Not affiliated with the bench in any way, but I think it surfaces important differences between the behavior of the models from different labs.

TLDR: The benchmark is measuring pushback in response to nonsensical requests and questions, as opposed to going with it and hallucinating a nonsensical answer.

[0]: https://petergpt.github.io/bullshit-benchmark/viewer/index.v...

mcintyre1994 · 2026-06-15T11:38:12 1781523492

TBH this is the main thing that made me start trusting Claude enough to actually find it useful, and I'm surprised other models haven't caught up. I assumed they had and I just wasn't aware because I'm not using them in the same way.

LoganDark · 2026-06-15T11:49:22 1781524162

> The entire Anthropic origin story is rooted in the founders’ belief that OpenAI wasn’t taking safety seriously enough; the company believes that only they can control AI, and that because they uniquely care about safety, they are justified in trying to control everyone else, up to and including the U.S. government.

Anthropic believes they have the responsibility to guard their tools from mis-use. That is all. They are not trying to "control" anything or anyone. They do however decide what they think is mis-use.

keybored · 2026-06-15T11:36:03 1781523363

> Here’s the thing about these safety justifications: I think they work because, to Anthropic, they aren’t justifications. The company really believes that they are the only ones who believe in super intelligence, and thus are the only ones who are sufficiently concerned about the dangers. That excuses decision after decision, policy after policy, and confrontation after confrontation that, to people on the outside, look like a bizarre combination of cynicism and naiveté.

I really dislike this belief (that has at least been expressed here) by some that X is okay because they-really-believe-it. This has a real Road to Hell stank on it.

It is incredibly convenient when your predictions or supposed beliefs go south. Well, we really believed that we were doing it for the betterment of humankind. And we really believed that X was an existential threat that was inevitable, in which case we had to step up and do it because we were the only good guy ideologues. So sorry but not sorry.

I also don’t care if commenters know rank-and-file on the inside that “really believe it” as well. Not for one second.

Peterz_shu · 2026-06-15T11:13:35 1781522015

This is the part where the USA and allied countries can gain a headstart from using such an overpowered model.

This only just shows how strong Mythos/Fable will be, once released to the public.

I'm guessing about 0.5 year till public.

ben_w · 2026-06-15T11:35:28 1781523328

> USA and allied countries

Doesn't this *exclude* allied countries?

blitzar · 2026-06-15T11:59:16 1781524756

They are probably thinking of the "Board of Peace"