What I don't understand is if they were going to translate Zig to unsafe Rust, why not just build a translation tool for it? You could do a one-to-one mapping of language constructs, hardcoding patterns in your codebase, and as one friend put it "Tbh they could've just hooked up zig translate-c to c2rust". They would get deterministic translation, would probably have not been a heavy investment to build, and the output would have the same assurances as the input.
In this case, I would trust the output even less than the input. The input was memory-unsafe but hand-written. The output is memory-unsafe but also vibe-coded and has had no eyeballs on it. What is the point of abusing agentic AI for this use-case?
> "Tbh they could've just hooked up zig translate-c to c2rust".
Have you ever seen what comes out of c2rust? It's awful. It relies on a library of functions which emulate unsafe C pointer semantics with unsafe Rust.
A few years ago, when I was struggling with bugs in OpenJPEG (a JPEG 2000 decoder), someone tried running it through c2rust. The converted unsafe rust segfaulted at the same place the C code did. It's compatible, but not safe.
Main insight: don't do string manipulation in C or unsafe Rust. It's totally the wrong tool for the job.
> Have you ever seen what comes out of c2rust? It's awful. It relies on a library of functions which emulate unsafe C pointer semantics with unsafe Rust.
which is somewhat close to what their port produced...
like their goal was from the get-go to have something mostly exactly the same as the zig "just in rust", which implies mostly unsafe rust and all the soundness/memory issues zig has (plus probably some more due to an AI based port instead of a tool like c2rust)
the thing is if you don't keep things mostly 1:1 with all the problems that has there is absolutely no way to review that PR or catch the AI going rogue with hallucinations etc. With a mostly 1:1 port you can at least check if things seem mostly the same.
but it also means this is just step 1 of very many, with the other being incrementally fixing soundness, removing unsafe and (hopefully) making the code more idiomatic...
(to get to the actual question of why?, I think the answer is that doing this port using AI is likely way easier/faster than first writing a tool which needs in-depth understanding of the languages, especially given that some features in zig do not map 1:1 to rust, and fuzzy mapping is what LLMs are good at and hand-written tools tend to be very bad at).
> The converted unsafe rust segfaulted at the same place the C code did. It's compatible, but not safe
That is indeed the point of c2rust. It gives you a baseline that is semantically identical to the original codebase, and with that passing the full test suite, bug-for-bug, you can then start gradually adopting rusty idioms to improve the memory safety of the codebase.
What comes out of c2rust is not intended for human consumption. It's more verbose than the original and harder to work on, but no safer. You lose the C idioms that people understand, while not gaining Rust idioms. It's like working on compiler-generated assembly code by hand.
2022 discussion on HN.[1]
There's a DARPA funded effort called TRACTOR, Translate All C To Rust, which has funded some efforts to develop a usable translator.[2] It's about 10 months after award, with no reported progress. I've been checking the personal sites of the academics involved, and they barely mention the project, although $5 million has been allocated to it.[3] The approach comes from U.C. Berkeley - let the LLM generate slop, check it using formal methods.[4] Not expecting near-term results.
This is awful. They have some internal string format borrowed from a Zig library where the address of the item is in the low end of a pointer and the length is at the high end. Why are they doing that in 2026? It lets you save a few bytes at best. It doesn't enforce the Rust rule that strings must be strict UTF-8. It's totally alien to the safe way Rust handles strings.
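For anyone who hasn't seen this kind of layout, here is a minimal sketch of the idea, not Bun's actual code. The 48/16 bit split and the names `PackedStr`, `ADDR_BITS` and `sizes` are assumptions made up for illustration; the only point is that address and length share one 64-bit word, which is half the size of a normal `&str` on a 64-bit target.

```rust
use std::mem::size_of;

// Hypothetical split: the low 48 bits hold the address, the high 16 bits hold the length.
const ADDR_BITS: u32 = 48;
const ADDR_MASK: u64 = (1u64 << ADDR_BITS) - 1;
const MAX_LEN: u64 = (1u64 << (64 - ADDR_BITS)) - 1;

/// Address and length packed into one word. Plain integers only: nothing here
/// is dereferenced, and nothing says the bytes are valid UTF-8.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedStr(u64);

impl PackedStr {
    /// Returns `None` when either field does not fit its bit budget.
    fn pack(addr: u64, len: u64) -> Option<PackedStr> {
        if addr > ADDR_MASK || len > MAX_LEN {
            return None;
        }
        Some(PackedStr((len << ADDR_BITS) | addr))
    }

    /// Recovers the address from the low bits.
    fn addr(self) -> u64 {
        self.0 & ADDR_MASK
    }

    /// Recovers the length from the high bits.
    fn len(self) -> u64 {
        self.0 >> ADDR_BITS
    }
}

/// Size of the packed form versus a regular `&str` (pointer plus length).
fn sizes() -> (usize, usize) {
    (size_of::<PackedStr>(), size_of::<&str>())
}
```

Turning `addr()` back into a pointer that can be dereferenced is where the trouble starts, since an integer carries neither a lifetime nor pointer provenance.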
For the same reason the V8 team bothered to set up a 32-bit addressing scheme for the GC heap even on 64-bit platforms, I imagine? The bytes add up when there’s enough of them.
Sure, but the point remains. They could've used Claude to build a Zig to Rust converter, ended up with something that was both deterministic _and_ beneficial to the wider community.
I mean, LLMs have been really good at translating code for a while now, which is why I'm more surprised that others are surprised this happened. They claim it's a marketing trick despite the fact that they have to manage and maintain a fork of Zig if they don't switch languages.
“Tbh they could've just hooked up zig translate-c to c2rust”
This doesn’t work like you think it does. These things are full of errors and make the code very verbose and hard to reason about. It works with small apps, not entire rewrites.
That would have been the proper way to port a codebase to another language, by parsing the syntax tree and applying deterministic and verified transformations.
The issue isn't the existence of undefined behavior that miri would catch. The issue is exposing an API that allows undefined behavior from safe code - which miri only catches if you go write the test that proves it.
This isn't an altogether unreasonable thing to happen during an initial port of code from an unsafe language. You can, and the bun team seems to be, go around later and make sure that the functions that wrap unsafe code do so correctly. Incorrectly marking some unsafe functions as safe temporarily, during a porting stage, isn't a real issue. It's a bit strange to merge it into the main repo in this state, but not a wholly unreasonable thing to do if the team has decided that they're definitely doing this. The only real issue would be if they made an actual release with the code in this state.
It's also a bit unfortunate that they didn't immediately set up their tests to run in miri if only because LLMs respond so well to good tests - I know they didn't do this not because of this github issue (which doesn't demonstrate that) but because there's another test [1] that absolutely does invoke undefined behavior that miri would catch. Though the code it's testing doesn't actually appear to be used anywhere so it's not much of a real issue. That said it's obviously early in the porting process... maybe they'll get around to it (or just get rid of all this unsafe code that they don't actually need).
[1] https://github.com/oven-sh/bun/blob/4d443e54022ceeadc79adf54... - the pointers derived from the first mutable references are invalidated by creating a new mutable reference to the same object. In C terms think of "mutable reference" as "restrict reference which a trivial mutation is made through". It's easy to do this properly, derive all the pointers from the same mutable reference, it just wasn't done properly.
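To make the footnote concrete, here is a small sketch of the "derive everything from one mutable reference" shape. It is not the Bun test in question; `bump_both` is a made-up function, and the broken variant is kept as a comment so the block itself stays free of UB.

```rust
/// Adds to both elements of a pair through raw pointers.
fn bump_both(pair: &mut [u32; 2]) {
    // One mutable borrow, one base pointer; every other pointer is derived from it.
    let base: *mut u32 = pair.as_mut_ptr();
    // SAFETY: `base` points at two initialized `u32`s, and offsets 0 and 1 are in bounds.
    unsafe {
        let first = base;
        let second = base.add(1);
        *first += 1;
        *second += 10;
        // `first` is still usable: no new `&mut` to the array was created in between.
        *first += 1;
    }
}

// The problematic shape, left as a comment on purpose:
//
//     let a: *mut [u32; 2] = &mut *pair;
//     let b: *mut [u32; 2] = &mut *pair; // second `&mut` to the same object
//     unsafe { (*a)[0] += 1; }           // `a` was invalidated when `b` was created
```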
PS. Spamming github just makes people less likely to work in the open. Please don't. We can all judge this work just fine on third party sites.
PPS. And we might want to withhold judgement until it's in a published state. Judging intermediate working states doesn't seem terribly fair or interesting to me.
This doesn't seem surprising, given the straight translation that they prompted.
Couldn't a case be made that it's better to get Bun to the language with the stronger type system first and, once there, use that stronger type system as leverage for these kinds of improvements as a follow-on effort? It seems preferable to requiring perfection on the very first step.
> Couldn't a case be made that it's better to get Bun to the language with the stronger type system first and, once there, use that stronger type system as leverage for these kinds of improvements as a follow-on effort? It seems preferable to requiring perfection on the very first step.
This is what they are doing.
They are working through the issues as they come in.
It's not surprising that a mostly straightforward translation to (partly unsafe) Rust exhibits UB.
What is a bit disappointing is that the Rust code apparently has APIs that aren't marked unsafe but may cause UB anyway. When doing this kind of translation, I'd always err on the side of caution and start by marking all/most things unsafe. Or prompt the slopbots to do the same I guess.
Then you can go in and verify the safety of individual bits step by step.
The point is that at a minimum you're supposed to bubble the `unsafe` up if the API does not guarantee safety is maintained for all cases (and documents the invariants that have to be kept by the caller), otherwise the system breaks down.
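A minimal sketch of what bubbling `unsafe` up looks like in practice. The names `view_bytes` and `prefix` are invented for illustration; the only real library call is `std::slice::from_raw_parts`. The first function cannot prove its own safety, so it stays `unsafe` and documents what the caller owes it; the second can check everything itself, so it is allowed to be safe.

```rust
/// Reinterprets a raw pointer and length as a byte slice.
///
/// # Safety
/// `ptr` must point to `len` initialized bytes that remain valid and unmodified
/// for the whole of `'a`. The caller picks `'a`, so the caller owns that proof.
pub unsafe fn view_bytes<'a>(ptr: *const u8, len: usize) -> &'a [u8] {
    // SAFETY: forwarded to the caller by this function's documented contract.
    unsafe { std::slice::from_raw_parts(ptr, len) }
}

/// Safe wrapper: every invariant of `view_bytes` is established right here,
/// so no caller of `prefix` can trigger UB through it.
pub fn prefix(data: &[u8], n: usize) -> Option<&[u8]> {
    if n > data.len() {
        return None;
    }
    // SAFETY: `data` is a live borrow covering at least `n` bytes, and the
    // returned slice is tied to the same lifetime as `data`.
    Some(unsafe { view_bytes(data.as_ptr(), n) })
}
```

If `prefix` skipped the bounds check and stayed safe, that would be exactly the kind of unsound API being complained about.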
There's a book that changed a lot of the way I think about attention and media [0]. The book isn't very good, but it flags something relevant here. There is a huge asymmetry between the reach of a big, flashy announcement (here: bun was re-written in memory-safe rust in a couple weeks), and the relatively small reach of a correction (often just a footnote on an old article, here a GH issue).
This asymmetry is well understood by marketing and PR professionals, and actively exploited.
Hmmm, given the general mood in this case, I feel like there's a lot of people keen to find any criticism of the code they can and amplify it as much as possible. Most of it strikes me as relatively shallow at the moment, though (that is, apart from the fact that merging such a large LLM assisted port is certainly a, uh _bold_ move (to put it lightly), there's not much that people are pointing out about the actual result that feels like it's worse than any other port in progress, but there is definitely a lot of hay being made about any issue that is found).
> Most of it strikes me as relatively shallow at the moment
It is. We’re what, a week into this exercise? Absolutely everyone criticizing it, with no exceptions, is behaving like a micromanaging middle manager who couldn’t even dream of doing the work themselves.
I half want to start a list of “people to ignore”, but such people tend to expose themselves in every other comment anyway.
Idk the pr author did merge it into main and talked about writing a blog post. To me that sounds like the author felt it was ready for public critique and feedback, especially for software with a fair bit of users
I did not say that HN is turning into reddit nor indicate any view on any way in which HN is trending. Remarking that Reddit and HN are similar is not rule breaking, they are both vote systems, etc.
> a big, flashy announcement (here: bun was re-written in memory-safe rust in a couple weeks)
Did they even claim it was "memory-safe"? Every discussion of this topic has had dozens of comments noting that their vibed codebase is bursting at the seams with unaudited unsafe blocks, lightly reviewed by people who seem not only to not understand Rust, but who seem incensed at the idea of needing to understand any programming language in the first place.
No, and there's been a lot of confusion about that on this website.
They did cite Rust's safety as a motivating factor for the port. That doesn't imply trying to achieve that simultaneously with the language change — which is good, because that would be insane. (Or, if you prefer, even more insane.)
You cannot faithfully port a codebase to a new language while also radically re-architecting it. You have to choose.
They want the safety benefits of Rust going forward; i.e., after it's finished, when they then write new code in Rust.
Yeah, exactly. The typical approach is to do a mechanical translation, such as with c2rust, that is full of unsafe, and then gradually refactor safety in.
And the first post is about the team working on the project, with about two and a half sentences on c2rust, and making it very clear they just started.
The newer posts go into detail about the rearchitecting that follows.
actually the port is trying to be mostly 1:1 and in turn is mostly unsafe rust, which means no benefits initially
but also doing the 1:1 port to mostly unsafe rust is also only the first step of a full port, you then incrementally go through it fixing issues and remove "unsafe" usage. (And long term likely also doing some refactoring to using more idiomatic rust, but that has less priority).
The problem is there was no blog post describing the whole thing to someone without contextual knowledge. Instead there were just linked PRs, which in this case is somewhat close to an "as if nearly all people only read the HN headline" case :/
Like a more context giving version of the first HN post would have something along the lines of `Show HN: Bun is porting to safe rust (PR link), starting with an AI based automated port to mostly unsafe rust which, once it behaves mostly the same as Bun in the test suite, will likely be merged. But it must be followed up with incremental PRs to remove unsafeness, and likely also a lot of unsoundness related to the way it's ported (some explanation about why this port will have unsoundness).`
They didn't need to - much of the popular hype around rust is on the back of uninformed spectators confusing Rust's tools for enabling memory-safety (good, warranting hype) with Rust itself guaranteeing automatic memory safety (fantasy).
A bug-for-bug port to Rust is the first step to fixing that. Assuming the port is actually 1:1 without any behavioral changes, these bugs already exist in the Zig code. The difference is now it's known where effort can be dedicated in order to one day have a memory-safe release of Bun. People have absolutely lost their mind over this and completely forgotten the benefits Rust gives you. I feel like I've gone back 10 years reading threads about the Rust port of Bun; these are the exact same arguments we see from people advocating continued use of C++.
> Assuming the port is actually 1:1 without any behavioral changes, these bugs already exist in the Zig code
The "1:1" assumption is a massive unjustified assumption. Rust and Zig have different memory models, so it's possible to do a "1:1" translation of Zig code to Rust and end up with undefined behavior in Rust.
For example, Zig code might make assumptions about lifetimes based on implicit knowledge of which allocator was used for some memory. That could cause problems in Rust if you erase the lifetime https://github.com/oven-sh/bun/blob/main/src/bun_core/string...
> Assuming the port is actually 1:1 without any behavioral changes
It's not, that's clear from this kind of bug popping up. Functionally this bug exists because `PathString` was converted into a "safe" Rust API but still works the same internally as the original Zig code did (via using `unsafe`), that introduces UB that wasn't there in the Zig code.
If it was attempting to be a 1:1 with no behavior changes (like c2rust attempts to do) then this would not have been turned into a "safe" Rust API like this.
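For contrast, a rough sketch of what a sound version of a pointer-plus-length view can look like when the lifetime is kept instead of erased. This is not Bun's `PathString`; `BorrowedPath` and its fields are hypothetical, and the sketch only covers the lifetime side of the problem.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a pointer-plus-length string view.
#[derive(Clone, Copy)]
pub struct BorrowedPath<'a> {
    ptr: *const u8,
    len: usize,
    // Ties the raw pointer to the borrow it came from.
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> BorrowedPath<'a> {
    /// Safe constructor: the input borrow proves the bytes live for `'a`.
    pub fn new(bytes: &'a [u8]) -> Self {
        BorrowedPath {
            ptr: bytes.as_ptr(),
            len: bytes.len(),
            _borrow: PhantomData,
        }
    }

    /// Hands the bytes back with the original lifetime, never a longer one.
    pub fn as_bytes(&self) -> &'a [u8] {
        // SAFETY: `ptr` and `len` were taken from a `&'a [u8]` in `new` and
        // are never modified afterwards.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

// With the lifetime in place, the use-after-free shape is rejected at compile time:
//
//     let path = {
//         let buf = String::from("/tmp/x");
//         BorrowedPath::new(buf.as_bytes())
//     };                  // error[E0597]: `buf` does not live long enough
//     path.as_bytes();
```

Drop the `'a` and hand out `&'static [u8]` (or an unbounded lifetime) from a safe method, and the same struct becomes a way to reach UB from safe code.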
What exactly people mean by "safe(r)" makes all the difference.
It's simply not possible to include all the nuance of safety of a language and all software written in it a single word. This leads to all kinds of miscommunication and strawmanning.
Rust's official line is specific memory safety guarantees, with caveats that it must not be broken by unsafe code, the OS, compiler bugs may happen, etc. Rust also has a bunch of best-effort features that steer users towards more robust code, but can't guarantee it.
This gets twisted in both directions:
- people ignore the caveats and limitations, pretending that Rust promised zero bugs ever, and use any bug in any Rust program as a proof by contradiction that Rust's claims are false.
- or focus solely on the caveats, ignoring all the advancements and incremental improvements, and take a "then why even bother?" stance. There are classes of bugs Rust can't stop. Nothing is foolproof for a sufficiently advanced fool, and an infallible programmer could write bug-free code in any language, which creates a false equivalence between languages.
You could view it as a specific application of the quote.
In your quote, there is no time-dependency between the lie and the truth. Whereas here, it's an attractive lie (easily parsed, great narrative), followed up by truths (that need more than surface-level analysis).
I thought you were going to call out the problem in the other direction: There has not been a "big, flashy announcement" because the port is a work in progress. It's not done or released. The only big flashy announcements I see are these drive-by dunk attempts on the work in progress code combined with attempts to imply that they said it was done or perfect.
The rewrite was a code translation meant to be a starting point.
> a big, flashy announcement (here: bun was re-written in memory-safe rust in a couple weeks), and the relatively small reach of a correction (often just a footnote on an old article, here a GH issue).
The Bun team never made a big announcement that the code is now memory safe. They've been clear that this is the starting point.
Anyone expecting it to be perfect immediately and to have solved all of the memory problems in the original Zig code is arguing with an announcement they imagined, not what the Bun team has said.
Did anyone try to map this code back to the original codebase to see if this memory problem exists in the original codebase?
> Did anyone try to map this code back to the original codebase to see if this memory problem exists in the original codebase?
FWIW what is being discussed is not memory problems, it's breaking rust invariants (the unsafe code has to follow specific rules, e.g. annotate lifetimes properly).
Not just marketing and PR, the mainstream media knows that pushing out BS and then retracting it later can have lasting effects because people will remember the original article / headline, and never see the correction.
I don’t think the media care about having lasting effects. They just want to catch the wave of interest and not wait around and let someone else get the scoop while they fact check or add nuance.
only the mainstream media knows about this? Quite odd to qualify media this way here, when most of all media uses this mechanism. We also forgot politicians who are experts in this field.
Ctrl + F "only" is only in your messaging not mine. I never said they were the only ones doing this? It's not just politicians, celebrities know about this and will use it to their advantage. Whoever makes the headlines first might have a stronger sway over their adversaries. I'm not even poking at any side in particular, this is reality across the board unfortunately. People will just blindly take and believe the primary headlines.
I was a little shocked that they could get it fully working in a week to be honest. My side project is a very similar ambition (https://tsz.dev) but I am in no way claiming success. I keep adding more and more tests to ensure things work. Even after all of TypeScript's own tests pass I am finding bugs, which I was totally expecting.
The bar for matching tsc's behavior is really _really_ high. see:
I'm stunned that it went from 'this is an experiment' to merging a ~million lines of (likely) unreviewed code in a week. I have nothing against using agents but to rush something like this and leave the community blindsided seems extremely amateurish. Like something you'd expect a bright eyed graduate engineer to do.
tsz for me is an experiment to see how this kind of work can be done better. With the slight difference that tsz is not a direct port and has a different architecture. I'm also not claiming to have answers but I've learned a ton. A few things that work:
- Test before code. Bun had lots of tests so that's good, but maybe they could start by asking Mythos to write like 20k additional tests that pass on Zig Bun first.
- Deterministic anti-slop features. LLMs love to solve the problem in the wrong abstraction layer or place. There are many ways to catch this with deterministic tests. I do this in tsz a lot
- A roadmap that is constantly evolved by humans.
- Taking a pause and looking how the progress is going and undoing slop
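The "test before code" item is basically differential testing: pin down what the old implementation does, then require the new one to agree on the same inputs. A toy sketch of that loop, with both functions invented as stand-ins (nothing here comes from tsz or Bun):

```rust
/// Stand-in for the behavior of the original implementation (hypothetical).
/// Strips trailing slashes but keeps a lone root slash.
fn reference_trim(path: &str) -> &str {
    let mut end = path.len();
    while end > 1 && path.as_bytes()[end - 1] == b'/' {
        end -= 1;
    }
    &path[..end]
}

/// Stand-in for the ported implementation under test (hypothetical).
fn ported_trim(path: &str) -> &str {
    let trimmed = path.trim_end_matches('/');
    if trimmed.is_empty() && !path.is_empty() {
        "/"
    } else {
        trimmed
    }
}

/// Returns every input on which the two implementations disagree.
fn mismatches<'a>(inputs: &[&'a str]) -> Vec<&'a str> {
    inputs
        .iter()
        .copied()
        .filter(|p| reference_trim(p) != ported_trim(p))
        .collect()
}
```

The check is deterministic, so it can gate every agent-produced change; the hard part is generating inputs that cover the edge cases (here the empty string and all-slash paths).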
I suspect they've been planning this and experimenting for many months. Along with the large existing test suite, they have lots of tooling for parallelizing agents and an unlimited token budget. So don't feel too bad..
That kind of error was expected. I don't see it as an issue against the rewrite. They kept the stable versions on Zig in case ppl needs stability. Eventually, the errors will get fixed.
That kind of error was entirely avoidable. There are well-known tools in the Rust ecosystem that detect this kind of error and while the tools do not detect all instances of UB caused by mistakes in unsafe blocks, it's still considered good practice to run them.
Indeed. My point is that just using the standard tools in the Rust ecosystem - like miri - would have trivially uncovered this error before it made it to the mainline.
In any practical application there'll be a known set of errors and I'm generally fine merging code that has known deficiencies. But personally, I'd not condone merging anything that causes UB. It undermines such a fundamental guarantee of the language that it should be detected and eliminated. And bun certainly rises to the level of software where I'd expect that the project runs all available tooling to detect such cases. Especially if you LLM-code it. "Do not cause UB" should be part of the test harness.
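As a rough idea of what that harness looks like: keep `unsafe` in small functions, cover them with ordinary tests, and run the same tests under Miri (`cargo +nightly miri test`). The function names below are made up; `cfg!(miri)` is the real flag Miri sets, and it is commonly used to shrink workloads because Miri interprets the program and is slow.

```rust
/// Sums a slice through a raw pointer; a deliberately small piece of `unsafe`.
pub fn sum_raw(data: &[u32]) -> u32 {
    let base = data.as_ptr();
    let mut total = 0u32;
    for i in 0..data.len() {
        // SAFETY: `i < data.len()`, so `base.add(i)` stays inside the slice.
        total = total.wrapping_add(unsafe { *base.add(i) });
    }
    total
}

/// Picks a smaller workload when the code is running under Miri.
pub fn workload_len() -> usize {
    if cfg!(miri) {
        64
    } else {
        100_000
    }
}

/// The check a test would make: the unsafe path must agree with safe iteration.
pub fn sum_matches_safe_iteration() -> bool {
    let data: Vec<u32> = (0..workload_len() as u32).collect();
    sum_raw(&data) == data.iter().copied().fold(0u32, u32::wrapping_add)
}
```

Miri only reports UB on paths the tests actually execute, so this complements review of the unsafe blocks rather than replacing it.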
I wonder if the publicity around this AI-driven rewrite will function as a (unintentional, or perhaps intentional) far-reaching nerd snipe that results in Rust developers flocking to the project to identify and fix issues.
Why do you guys assume that a codebase ported from an unsafe language, containing bindings to another unsafe language, would appear perfectly implemented right away?
Not a single person on the Bun team nor Anthropic has yet done anything egregious to market this as anything but a swap to a more memory-safe language with better compiler guarantees.
Thus far most of the buzz and marketing has been entirely negative from people who are against AI.
My take is that most of the buzz is also tied to recent negative opinions of Anthropic themselves due to some of their recent decisions.
The best kind of marketing is when you don’t need to say it aloud by yourself. Yet, this is constantly in HN front page. Maybe engineered or not but marketing regardless.
I'm curious, but unable to ascertain, does the same problem exist in the original Zig code? Is this an issue introduced by the translation to Rust? Because if it is a problem that can be replicated in both code bases, it seems a point in Rust's favor, that the issue is easily identifiable with tools that exist in its ecosystem.
I'm also curious about that. One thing to keep in mind: the invariants you have to uphold in unsafe blocks are quite stringent. I expect that in some instances the Rust code has new UB due to this.
Sure. I'm completely unaffiliated and think Zig's AI stance is ridiculous & politically-motivated and a port is absolutely justified if they will not budge. Apparently I am deeply in the minority.
The no-AI policy of the Zig compiler project is for the compiler, other projects can do whatever they want.
Bun's fork of Zig was just an unsound hack that at best would have produced a strictly inferior speedup compared to our current work with incremental compilation, which is already plenty usable:
> The no-AI policy of the Zig compiler project is for the compiler, other projects can do whatever they want.
Well, presumably they want to contribute to the compiler. I know that you did not like those contributions, and that view seems entirely valid, but obviously "no AI" rules out their development model (by design, and you likely think that's good, and maybe it is!).
Not intending to defend the bun move, but obviously a project using Zig and also using AI might feel motivated to avoid Zig since they're ruled out as contributors.
> An example of this is the changes to type resolution which happened in the 0.16.0 release cycle—these didn’t affect users too much, but had big implications for the compiler implementation. Before those changes, the compiler’s behavior was often highly dependent on the order in which types and declarations were semantically analyzed by the compiler. Some orders might result in successful compilation, while others give compile errors. Single-threaded semantic analysis prevented these bugs from causing user-facing non-determinism. The rewritten type resolution semantics were designed to avoid these issues, but Bun’s Zig fork does not incorporate the changes (and has not otherwise solved the design problems), which means their parallelized semantic analysis implementation will exhibit non-deterministic behavior. That’s pretty much a non-starter for most serious developers: you don’t want your compilation to randomly fail with a nonsense error 30% of the time.
There is a reason why: Zig is upholding quality, and they hate it.
Zig rejected Bun's proposed contribution because it was a bad contribution, which they explained at length. Zig should not be made to "budge" on bad contributions. It seems you think Zig is unreasonable for rejecting bad code that happens to also be AI-generated, but believe it's reasonable for a project to be forced to accept bad code because it is AI?
No, I think Zig should reject bad AI contributions and accept good AI contributions. That is not Zig policy, they reject all AI-authored contributions.
Not sure why you're inventing a stance for me to be arguing against, when the Zig compiler stance is publicly articulated as exactly what I'm describing.
The problem as they've mentioned is AI contributors don't learn. They cannot have a working relationship with an AI contributor. The context about ongoing efforts, planned design changes to the language, etc is lost every time Claude is run. There is no way to work with that. People will submit infinite PRs while the core devs are flooded and forced to repeat themselves an infinite number of ways to an infinite number of stochastic prompts and responses.
The zig team is not that big. They don't have 200 core contributors to filter through the noise and mine PRs for "gems".
You specifically mentioned "a port is justified if they won't budge", which comes across to me as defending Bun's situation specifically, in other words expecting Zig to budge on Bun's bad contribution specifically and because they won't this slop Rust port is justified.
I think an outright rejection of AI contributions makes sense, regardless, and has nothing to do with politics. A Zig developer was forced into writing a long-form post to justify rejecting Bun's awful contribution (lest their PR be sullied, and then it was anyways), and the act of writing that post probably took 10 or 20x more human time and effort than Bun's contribution. Now multiply that by 100 for every random fucking moron with an LLM submitting a contribution. That is not sustainable. Open source maintainers of popular projects would have to make rejecting AI PRs their full time job and stop developing the project itself altogether, if they took them seriously and reviewed at length to conclusively identify whether a PR is good or bad. Given that 99.99% of AI PRs are bad, it's simply not worth it. You cannot possibly expect humans to spend more time reviewing code than drive-by contributors spent generating it, especially when many of them are unpaid volunteers. It's an absolutely ridiculous expectation.
When my lead developer refactored my small but crucial python service into golang with Gemini and Claude, I was hesitant to merge the code into master. Yet, my service had, like 20k daily active users.
I think they shat over the community who trusted them by trying to advertise their owner company
Man that issue got way too many comments from non-contributors. I agree that this shouldn't have been merged in its current state, but that doesn't mean posting about it on GitHub is a worthwhile way to fix the problem.
After this was merged, my company made the decision to migrate everything away from bun and back to node. I don't say this lightly... Jarred is a guy that I held such immense respect for, and it's sad to see the course he's charted for a project I spent a lot of time proselytizing internally. It's frankly a betrayal of trust.
Certainly disagree with "AIs are not good at writing Rust". We can discuss the pros and cons of AI coding in general but in my experience they do just as well with Rust as any other language. If anything I'm impressed with how seamlessly the models can work with Rust's ownership model.
It is really sad and unfortunate that coding has started falling under the omnicause. Low-denominator discourse is invading every space I find interesting and it is difficult to avoid.
I agree. I'm as skeptical as many commenters but I also think the degree of polarization in HN around this technology and the degree to which people are calling those with different views shills or naysayers is pretty sad.
I suspect "Rust is fast/low memory utilization" is the more common value proposition, with memory safety as the bonus that can push it over other fast languages.
Maybe they want a quick switchover and the UB is replicating existing problems so it is net neutral for the codebase (but positive future coz developers can do future work on rust without synchronizing two codebase? ).
exactly. If they wanted to iterate on their port they would add lifetime annotations here, which are the tool Rust uses to ensure safety. They're just kicking the unsafety block down the road. This accomplishes nothing and is not how you get Rust to deliver its safety promise.
Lifetimes would prevent the particular use-after-free example here, but the UB that miri currently flags would still exist, as it's related to pointer provenance, not lifetimes.
So this is a clear case where the LLM generated Rust port introduced a bug:
> The Zig original is a packed struct with the same shape; it "worked" only because Zig has no reference aliasing or provenance rules to violate. The Rust port inherited the shape without rethinking the API surface.
Sorry wasn't there a post literally like a week ago about this being a long term experimental branch and how we needed to not kick the hatchling while it's an egg?
So a "robobun" clanker responds to the issue and writes a fix (probably just papering over it). This is what Anthropic wants: Let the users do the work, train the fricking bot and claim the credit.
If you find a bug, just go straight to blog posts and CVEs to denounce this idiocy. It ranks higher on Google.
This case is wild and seems to perfectly encapsulate all the problems people complain about with vibecoded projects.
The "rewrite it in rust" commit is +1M lines of code. Humans haven't looked at that in depth. In about a week, they saw the tests passed and pushed it to main. Now people have started to look through it and are pointing out glaring issues. And the solution is just going to be "feed it to another AI and ask it to fix it".
The entire codebase is slop now. Nobody knows what it does. It manages to pass some tests, but it's largely a black box, simply because humans haven't read it yet. The code isn't guaranteed to be anything close to 1:1 with the old codebase. It's probably vaguely shaped like the old codebase, but new bugs could be there, old bugs could be there, nobody knows anything yet.
It's going to be interesting to see how recoverable this is. They are almost certainly going to just hand every file to an AI, say "look for soundness issues and fix them" and then what? If AI is making huge, sweeping changes to the code so frequently that humans can't keep up, is that really maintainable? The only solution appears to be "even more AI" while anybody that looks closely gets scared away by the too-large-to-comprehend-and-entirely-slop codebase.
This kind of thing has been happening with many smaller projects already, but now its a larger project and happening in a much more public way, with the intent to replace human-written, mostly-understood code with slop. I suspect the same thing, with the same problems, is happening inside all the largest companies, just not quite as obviously.
I am not against AI code, it can be perfectly fine.
The principal issue in my mind is the rate of change.
Once you rewrite a code base like this (in a week no less) the only way to work on it in the future is using AI tools because no single person has any knowledge about any specific piece of code base any more.
AI generated code that is run through a classic PR process would potentially be fine, but then you sorta lose the entire point of using AI.
That happened to my project as well. The main issue hasn't been that AI couldn't solve the problem, but that it became so slow, and needed more and more verification layers and CI/CD, that at some point you wish you had a simpler codebase back, with reasonable tests, with storylines in the code and so on.
That's the idea, to transform businesses to be wholly dependent on "AI" service to develop software. What better way than to re/write entire codebases until no human being understands it.
The Zig project knows this, and its so-called "anti-AI" policy is actually pro-community and about cultivating human understanding. It's not about the tool or technology, per se, it's about people, knowledge, and sustainability.
In contrast, the Bun project is demonstrating how it doesn't care about any of that, YOLO-ing its way to losing the trust of its users, contributors, and maintainers. Oh well, AI will maintain the project now, since no one else can.
I think the only way to interpret a one million line LLM-generated diff with no proper reviews as an employee of Anthropic is that my company no longer has an interest in understanding, or even looking at, its own code.
I'd be concerned that by jumping onboard with this sort of development process I'd lose touch with how to engineer software in a detail-oriented or remotely rigorous way.
It also makes me question what sort of value the entire Bun project ever had if a drop-in replacement can just be thrown in here like it's nothing. Why do we need all these JS runtimes again?
The AI bubble is so large that we've also forgotten how useless and dumb a lot of software engineering labor was even before LLMs came along. We were already in a bubble.
All that is to say, I think it's useful to reframe some conversations about AI as, "if AI can accomplish this task, was it ever actually valuable?" I think for some specific things, the answer will be yes, but the tech industry has been huffing its own farts for so long I really don't think anyone has sight anymore of what's economically valuable in a ground truth sense. Much like LLMs themselves, this confusion pollutes the entire well of discourse about their economic utility.
This was 100% a predictable outcome after Bun was acquired. Of course they were going to do something like this.
What would have been significantly better is just rewriting Claude in a language that's actually well suited to what it's doing in the first place (which could well be Rust, Codex is written in it as prior art). It's funny how the vibe coding promoters are keen on things like this, rewriting other codebases as fast as possible with little quality checking, but they are still defensive of their own code.
When it comes down to it, all the vitriol and animosity towards this port is really because of the implication of what its success would mean. If LLMs are capable of completely porting core software modules many people rely on (not just a CRUD app) of 1M lines in a week's time, it is a case-closed moment that LLMs are currently much more capable than most people's eng, and can do it much faster. And that's at current capabilities, never mind where we're headed in 1-3 years.
Jarred is an exceptional 1% engineer, and it's likely he can succeed at this port, to the detriment of naysayers who don't believe there's any chance it's possible.
My grandfather was a tailor, and one day a client came in asking for the status of his suit getting clean. My dad as a kid grabbed the unfinished suit and showed the man, who was frustrated at the lack of progress.
My grandpa told my dad never to show a client a work in progress - You told them when you'd get the work done, and they can see the finished result when it's ready.
It's just a story so don't wrap yourself around the axle with counter-examples. I think it's fair to say that an open-source project going through a language translation is going to have transitional periods as they shake things out, and criticizing every snapshot as some proof that they're incompetent is useless.
Dumbest point ever.
There is no value for this issue.
I don't agree with the way they did the rewrite, but they did the rewrite, and this post contributes nothing, besides making the author seem childish.
If it had any real contribution I would have waved it off, but it really doesn't.
This tribalism and "I'm better than you"-ism are the same reason everybody hated the Stack Overflow community, and the Rust community as well.
said it in another comment [0] - that the whole rewrite thing is just a marketing exercise by LLM merchants to try to sell you plebs that their wares "work"
Step 1: Vibe-code a buggy, poorly-performing, 500k+ LoC desktop-installed monstrosity in TypeScript to implement a trivial TUI. Proudly note that you’re meeting a 16ms frame budget … for a trivial chat UI.
Step 2: Purchase an entire company for a product that, if you squint, might help paper over the entirely predictable problems that arise from using the wrong tools to implement the wrong architecture, because surely the solution isn’t reevaluating your original engineering choices.
Step 3: Perform a buggy, vibe-code rewrite of the tool you just bought. A tool you only need because — for whatever internal political reasons — sunk cost means you can only keep digging.
"unsafe" is a promise to the compiler that you're going to ensure invariants that the compiler can't check. Rust only promises to eliminate UB if the invariants are held. You can still get UB by violating that promise, as this bug demonstrates.
That is not Rust's guarantee. The guarantee is that safe Rust cannot in itself introduce UB - UB can only ever be introduced in unsafe blocks, but it can then materialize in safe code.
> I thought the unsafety couldn't "spread" like that in Rust.
The goal of a library is to provide the encapsulation such that the unsafety doesn't spread.
If undefined behavior occurs, the fault lies with whoever wrote `unsafe { ... }` in the body of a function. If I write "unsafe" in order to call an unsafe library function, and I don't meet the library function's pre-requisites, then it's my fault. If the library internally writes "unsafe" while providing a safe wrapper, and I never actually wrote `unsafe { ... }`, then it is the library's fault. If neither I nor the library wrote `unsafe { ... }`, then it is the fault of the compiler.
Using "in safe Rust" means that `unsafe` occurs neither in the user code nor in the library. In this context, since we've heard how many uses of `unsafe { ... }` exist in the Bun rewrite, I'd read "in safe Rust" to mean "without calling any functions marked as unsafe".
I can't tell if you're trolling but `unsafe { crash() }` is safe from the compiler's perspective. Otherwise you wouldn't be able to achieve anything in 'safe' rust, even print to stdout.
I think it's a good question, just because the whole UB thing is such an ideological shibboleth.
Maybe it's better to think about this in the reverse, where C and C++ have 'defined behavior', but unsafe Rust intentionally does not; it's just whatever the compiler and platform let you get away with. Ultimately it's still just a computer which stores values in memory and jumps to subroutines.
Every language has defined behavior. It's what you expect to happen through a program's execution. Sometimes there will be multiple possibilities, but you can still define them regardless. Laying this out explicitly is the purpose of a standard.
Undefined behavior is everything else. C and C++ are relatively unique in that their standards explicitly say "combining these constructs in this way is undefined", and we call those cases explicit UB. There's also a larger universe of implicit UB that standards omit. Most (all?) languages have implicit UB, even if they lack the explicit stuff. What happens when you get ENOMEM is a common one.
Rust does something similar to C/C++ and lists a bunch of UB that's only possible with incorrect code in unsafe blocks. Correct code placed in an unsafe block remains defined, as does code without unsafe (up to compiler/language bugs).
It's more straightforward to write safe Rust when Rust owns everything. In the real world you are often interfacing with underlying libs or systems etc., which you need to treat as invariants but also handle yourself manually to make guarantees to the compiler. unsafe exists in tons of codebases; you just have to make sure you encapsulate it properly, and failing to do that is what this bug is.
Unsafe code can break certain invariants of Rust, as `unsafe` is just a compiler "hold my beer" flag, which is why you're meant to do safety checks in your safe interface around unsafe code. If the unsafe code is wrapped in a way that does no guarding (or does something stupid in general), it is technically marked safe (because you said "rustc, hold my beer" as `unsafe` is also a contract) despite actually being unsafe.
Rust has lots of undefined behavior, in general a broadly similar set to that which exists in C. What Rust does that is different is that to trigger undefined behavior, you need to execute unsafe code. (This isn't the same as saying that you have to be in unsafe code--you can violate a precondition in unsafe code and have the UB itself trigger in safe code).
I'm sure there have been attempts at defining a language that has no UB, but afaik all meaningful languages have UB in some dark corner or enumerated explicitly. For example, Java thread execution order is UB.
In this context "UB" means something different than how you're using it. The UB being mentioned here is the "nasal demons" form, i.e., programs which contain undefined behavior have no defined meaning according to the language semantics.
What you're talking about is probably better described in this context as "unspecified behavior", which is behavior that the language standard does not mandate but does not render programs meaningless. For example, IIRC in C++ the order in which g(), h(), and i() are evaluated in f(g(), h(), i()) is unspecified - an implementation can pick any order, and the order doesn't have to be consistent, but no matter the order the program is valid (approximately speaking).
So this "unspecified behavior" might turn into the more nasal demon type when g(), h() and i() share mutable state and assume some particular sequential order of execution. No?
Not necessarily. Unspecified behavior and undefined behavior are independent concepts; a language can have one but not the other. As a result, you can have languages where incorrect reliance on unspecified behavior can lead to undefined behavior (e.g., C and C++) and languages where incorrect reliance on unspecified behavior can lead to bugs, but not nasal demons (e.g., Java).
It is only allowed in unsafe blocks. As long as the unsafe blocks are few and well understood then Rust programmers can contain this to a small well defined portion of a program.
Unsafe Rust allows you to tell the compiler “hold my beer”. It’s a concession to the reality that the normal restrictions of Rust disallow some semantically valid programs that you might otherwise want to write. The safeguards work great in most cases, but in some they’re overly restrictive.
In practice, the overwhelming majority of code is able to be written in safe Rust and the compiler can have your back. The majority of the rest is for performance reasons, interacting with external functions like C libraries over FFI, or expressing semantics that safe Rust struggles with (e.g., circular references).
OK but the title says "in safe Rust". Am I misunderstanding something? All the replies here are saying how it's allowed in unsafe Rust, which is not what the title says.
then `safe_function` can be called from safe code, and still trigger UB. This wouldn't be a soundness issue in the rust compiler, but instead a bug in safe_function.
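A minimal sketch of the shape being described, under stated assumptions: only the name `safe_function` comes from the comment above; `Erased`, `erase`, and `Borrowed` are hypothetical names made up for illustration, and none of this is Bun's actual code.

```rust
use std::marker::PhantomData;

/// Hypothetical handle: remembers where a string lives, but not for how long.
pub struct Erased {
    ptr: *const u8,
    len: usize,
}

/// Safe to call, and harmless on its own: it only records an address.
pub fn erase(s: &str) -> Erased {
    Erased { ptr: s.as_ptr(), len: s.len() }
}

/// Safe signature, unsound body: nothing ties `e` to the lifetime of the
/// string it was created from, so a caller who never writes `unsafe` can
/// drop that string first and then read freed memory through this call.
pub fn safe_function(e: &Erased) -> &str {
    unsafe { std::str::from_utf8_unchecked(std::slice::from_raw_parts(e.ptr, e.len)) }
}

/// One sound alternative: carry the borrow in the type, so the compiler
/// rejects any use after the source string is gone.
pub struct Borrowed<'a> {
    ptr: *const u8,
    len: usize,
    _source: PhantomData<&'a str>,
}

impl<'a> Borrowed<'a> {
    pub fn new(s: &'a str) -> Self {
        Borrowed { ptr: s.as_ptr(), len: s.len(), _source: PhantomData }
    }

    pub fn as_str(&self) -> &'a str {
        // SAFETY: `ptr` and `len` came from a `&'a str`, which is valid
        // UTF-8 and stays borrowed (alive and unmodified) for all of `'a`.
        unsafe { std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr, self.len)) }
    }
}
```

Both versions behave identically while the source string is alive. The difference is that `safe_function` still compiles when the source has already been dropped, whereas the `Borrowed` version turns that mistake into a borrow-checker error.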
There are many reasons you might want to do that. In particular, it's very common in Rust to have a library define some data structure that uses unsafe under the hood, but checks whatever invariants it needs to, and provides solely safe methods to external callers. Rust's `String` type is like this: it's (roughly) a `Vec<u8>`, i.e. heap-allocated bytes. It has the additional invariant that these bytes correspond to valid UTF-8 though. See for example `push_str_slice`, which (roughly) concatenates 2 strings in three steps:
1. reserves enough space for the concatenated string within the source string
2. does some pointer arithmetic and a call to Rust's equivalent to `memcpy` (unsafe)
3. re-casts this pointer to a string object without checking that it's valid UTF-8 (unsafe).
While these individual calls are unsafe, `push_str_slice` checks that in this particular situation they are safe, so the stdlib authors do not mark `push_str_slice` as unsafe. It has no invariants that must be maintained by external callers.
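The three steps above can be sketched with a hypothetical, simplified `MyString` type. This is an illustration of the pattern, not the standard library's actual implementation, and all names in it are made up.

```rust
use std::ptr;

/// Hypothetical simplified string: a byte buffer with the invariant that
/// `bytes` always holds valid UTF-8.
pub struct MyString {
    bytes: Vec<u8>,
}

impl MyString {
    pub fn new() -> Self {
        MyString { bytes: Vec::new() }
    }

    /// Safe to call: every precondition of the unsafe operations below is
    /// established inside this function, not by the caller.
    pub fn push_str(&mut self, s: &str) {
        let old_len = self.bytes.len();
        // Step 1: reserve enough space for the appended bytes.
        self.bytes.reserve(s.len());
        unsafe {
            // Step 2: raw copy, the equivalent of `memcpy`.
            // SAFETY: `reserve` guarantees capacity >= old_len + s.len(),
            // and `s` cannot overlap the buffer because we hold `&mut self`.
            ptr::copy_nonoverlapping(s.as_ptr(), self.bytes.as_mut_ptr().add(old_len), s.len());
            // SAFETY: the first old_len + s.len() bytes are now initialized.
            self.bytes.set_len(old_len + s.len());
        }
    }

    pub fn as_str(&self) -> &str {
        // Step 3: reinterpret the bytes as a string without re-validating.
        // SAFETY: `bytes` starts empty and only ever grows by appending
        // whole `&str` values, so it is always valid UTF-8.
        unsafe { std::str::from_utf8_unchecked(&self.bytes) }
    }
}
```

Because the `bytes` field is private, no code outside this type can break the UTF-8 invariant, which is what lets `push_str` and `as_str` keep safe signatures.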
If code in an unsafe block triggers undefined behavior, then the assumptions the compiler makes regarding safety will no longer be true, and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe. This is what's happening in the example the person on Github wrote in the issue.
Exactly and "[...]and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe" hits the nail on the head.
I take issue with the phrasing of OP's title: "allows for UB in safe rust". AFAIK there are compiler bugs that allow UB in safe Rust, but this is not what is happening here. We have UB in an unsafe block (which is to be expected) which enables an issue outside in safe code. What is your opinion? Is calling this "UB in safe Rust" justified?
it is, but it's a little confusing here because the library/consumer of the library are the same person.
This is a bug in the library, namely in Bun's PathString implementation. The bug is a soundness issue, precisely because usage of Bun's PathString implementation allows for UB in safe Rust. Now this buggy library isn't that big of a concern for the community, because Bun is the only consumer. It's also not an indication of a compiler bug, because Bun's library is implemented using unsafe Rust. But the fundamental issue is that usage of Bun's PathString implementation allows for UB in safe Rust, and is therefore (clearly) unsound.
Unsafe blocks are you telling the compiler 'trust me bro, I know this is safe'. But often that relies on some property of the code being true in order for it to actually be safe. Generally speaking, the expectation in Rust is that you either encapsulate the code that enforces whatever property you are relying on behind a safe interface, so that it's not possible for other code to use it unsafely, or that you mark the interface itself unsafe so that it's obvious that the code using that interface needs to maintain that property itself. Rust code that doesn't do this will generally be considered buggy by most Rust programmers (e.g. if you find a use of safe interfaces in the stdlib that causes a memory safety violation, then you should file a ticket with the Rust team), but this is essentially only a social convention of where the blame lies for a bug, not something that the compiler itself can enforce (and, for example, you can violate memory safety in Rust with only safe std interfaces by abusing OS interfaces like /proc/self/mem, but this is something that most people don't think can be reasonably fixed). The main reason that Rust as a language is better in this regard is that it gives much better tools for being able to express that safe interface without giving up performance, and that it has the means to mark and encapsulate this safe/unsafe distinction.
Here's some links on this topic which have some examples:
....Thirteen thousand two hundred and fifty five lines without comments with the word "unsafe" in them in Rust code files across this rewrite.
This is so gross.
I'm a founder of an early-stage startup. I built a precision-editing tool system (called HIC Mouse). It provides coordinate-based addressing, staged batching with atomic rollback, embedded agent guidance, and more. It works well, it's available on VS Code Marketplace, and I've worked for a year and am still grinding every day, working so hard, just to get people to think about trying it, and to get attention paid to it. I did rigorous, careful benchmark research to make sure I wasn't just fooling myself. I incorporated, built a sales pipeline, changed my life by taking a chance and launching a business, and I pound the pavement and toil in obscurity every day and night, trying so hard to get interest in my product. I check every diff painstakingly before committing. I may make tools for AI agents but I am unbelievably careful about reviewing and thoroughly testing their code, and usually rather ruthlessly editing quite a bit further beyond any initial version drafted, long before deciding it is good enough to ship. I take enormous pains to get things right and worry constantly about whether I'm doing enough to make HIC Mouse secure and performant for my users. All I want is to make my users happier and to give them a genuine way to get "surgical, precise edits" that "don't touch the other lines", like we all ask of our AI agents over and over all day if we're using AI.
Or maybe not. Here we have Bun. Who cares about 90K GitHub Stars and massive community engagement -- just go crap all over them, all at once, with this AI tripe that you obviously neither tested in any meaningful manner, nor documented, nor read, I am assuming, before merging the whole bloated mess to production. What a disgraceful way to treat your users! I would be so grateful if I had a tiny fraction of the interest in my project that the Bun team has. I could never imagine shipping this garbage in a million years.
I'm sorry to vent but this just isn't defensible. It's the very worst of AI. I'm not going to wish ill on Bun, but it just makes me sad that I spend so much effort, work so hard to do things right, and painstakingly review everything because it's not just me any more and I do have folks who depend on my code being reliable and secure. And meanwhile, Bun just gives a huge middle finger to 90k+ starred supporters not to mention the millions of users who didn't click on the star but rely on the library, by acting this disrespectfully and disgracefully towards their own users. How they didn't take one look at this and promptly revert and apologize is simply beyond me. Again, sorry to vent, but this made me irrationally mad.
So many people are fundamentally misunderstanding everything about this rewrite.
In fact using the word "rewrite" itself is pretty inaccurate.
As has been mentioned the goal was a port so they "could" eventually rewrite most of it to be idiomatic rust. The main benefit of this now is the compiler and being able to use these tools to fix issues that were already being hidden when it was in zig.
If you go into this codebase expecting to see idiomatic rust and get angry when it's not there, you are going in with the entirely incorrect attitude.
It's understandable how people see it as AI slop or whatever given the division among developers at the moment. But please see it for what it is instead of just jumping to conclusions.
> As has been mentioned the goal was a port so they "could" eventually rewrite most of it to be idiomatic rust.
They may have said that, but quite clearly the value they actually get out of it is getting the headline "AI reimplements complex, broadly used software in 2 weeks, but makes it way better because it's rust now" in front of a million people's eyes, only 1% of whom will ever find out it was mostly fluff
> quite clearly the value they actually get out of it is getting the headline
This is entirely disingenuous. Jarred has already made it clear what value they get out of moving off of Zig. Yes they used AI heavily to attempt this goal but I don't see what the big issue is. They haven't even released it yet and Anthropic themselves have said 0 about this.
The "headlines" thus far are really just people completely uninvolved with Bun and with all to gain by perpetuating "AI BAD".
My honest take: the big issue isn't "what if it goes wrong", it's the fear that a migration of this size works out of the box while being done almost entirely by AI.
Many. Let me give you a very boring example: I use mmap in some of my programs, because it's the easiest/best way to solve the problem I was working on. Mmap is unsafe in rust, because if there are modifications to the backing file it can violate some of rust's assumptions about the behavior of memory not changing unless it was changed by the rust code.
In my application I'm able to guarantee that there is no modification to the backing file by making them read-only and ensuring nobody messes with them, but that guarantee exists outside of rust. So -- unsafe with a big SAFETY comment explaining the requirements if you use it.
Much rust code will never use unsafe. Systems code is likely to use a bit but also to know what it's doing.
Things like this port of bun are unusual and presumably transitory on the way to an implementation with minimal use of unsafe.
181 comments:
What I don't understand is if they were going to translate Zig to unsafe Rust, why not just build a translation tool for it? You could do a one-to-one mapping of language constructs, hardcoding patterns in your codebase, and as one friend put it "Tbh they could've just hooked up zig translate-c to c2rust". They would get deterministic translation, would probably have not been a heavy investment to build, and the output would have the same assurances as the input.
In this case, I would trust the output even less than the input. The input was memory-unsafe but hand-written. The output is memory-unsafe but also vibe-coded and has had no eyeballs on it. What is the point of abusing agentic AI for this use-case?
> "Tbh they could've just hooked up zig translate-c to c2rust".
Have you ever seen what comes out of c2rust? It's awful. It relies on a library of functions which emulate unsafe C pointer semantics with unsafe Rust.
A few years ago, when I was struggling with bugs in OpenJPEG (a JPEG 2000 decoder), someone tried running it through c2rust. The converted unsafe rust segfaulted at the same place the C code did. It's compatible, but not safe.
Main insight: don't do string manipulation in C or unsafe Rust. It's totally the wrong tool for the job.
> Have you ever seen what comes out of c2rust? It's awful. It relies on a library of functions which emulate unsafe C pointer semantics with unsafe Rust.
which is somewhat close to what their port produced...
like their goal from the get-go was to have something mostly exactly the same as the Zig "just in Rust", which implies mostly unsafe Rust and all the soundness/memory issues Zig has (plus probably some more due to an AI-based port instead of a tool like c2rust)
the thing is if you don't keep things mostly 1:1 with all the problems that has there is absolutely no way to review that PR or catch the AI going rogue with hallucinations etc. With a mostly 1:1 port you can at least check if things seem mostly the same.
but it also means this is just step 1 of very many, with the other being incrementally fixing soundness, removing unsafe and (hopefully) making the code more idiomatic...
(to get to the actual question of why: I think the answer is that doing this port using AI is likely way easier/faster than first writing a tool which needs an in-depth understanding of the languages, especially given that some features in Zig do not map 1:1 to Rust, and fuzzy mapping is what LLMs are good at and hand-written tools tend to be very bad at).
> The converted unsafe rust segfaulted at the same place the C code did. It's compatible, but not safe
That is indeed the point of c2rust. It gives you a baseline that is semantically identical to the original codebase, and with that passing the full test suite, bug-for-bug, you can then start gradually adopting rusty idioms to improve the memory safety of the codebase.
What comes out of c2rust is not intended for human consumption. It's more verbose than the original and harder to work on, but no safer. You lose the C idioms that people understand, while not gaining Rust idioms. It's like working on compiler-generated assembly code by hand.
2022 discussion on HN.[1]
There's a DARPA funded effort called TRACTOR, Translate All C To Rust, which has funded some efforts to develop a usable translator.[2] It's about 10 months after award, with no reported progress. I've been checking the personal sites of the academics involved, and they barely mention the project, although $5 million has been allocated to it.[3] The approach comes from U.C. Berkeley - let the LLM generate slop, check it using formal methods.[4] Not expecting near-term results.
[1] https://news.ycombinator.com/item?id=30169263
[2] https://csl.illinois.edu/news-and-media/translating-legacy-c...
[3] https://chandrasekaran-group.github.io/
[4] https://metalift.pages.dev/
> let the LLM generate slop, check it using formal methods
I'm much more bullish on the opposite approach. Perform the naive translation, let the LLM loose on cleaning it up...
The module with the code mentioned is at [1]
This is awful. They have some internal string format borrowed from a Zig library where the address of the item is in the low end of a pointer and the length is at the high end. Why are they doing that in 2026? It lets you save a few bytes at best. It doesn't enforce the Rust rule that strings must be strict UTF-8. It's totally alien to the safe way Rust handles strings.
[1] https://github.com/oven-sh/bun/blob/main/src/bun_core/string...
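For readers unfamiliar with the trick, here is a rough sketch of packing an address and a length into one 64-bit word. The 48/16 bit split and every name here are assumptions chosen for illustration; the real layout in the linked module may differ.

```rust
/// Hypothetical bit split: low 48 bits for the address, high 16 for the
/// length. The actual split used by the module above is not assumed here.
const ADDR_BITS: u32 = 48;
const ADDR_MASK: u64 = (1u64 << ADDR_BITS) - 1;

/// A (address, length) pair squeezed into a single u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PackedSlice(u64);

impl PackedSlice {
    /// Returns `None` if the address does not fit in the low 48 bits.
    pub fn pack(addr: u64, len: u16) -> Option<Self> {
        if addr > ADDR_MASK {
            return None;
        }
        Some(PackedSlice(addr | ((len as u64) << ADDR_BITS)))
    }

    /// Address stored in the low bits.
    pub fn addr(self) -> u64 {
        self.0 & ADDR_MASK
    }

    /// Length stored in the high bits.
    pub fn len(self) -> u16 {
        (self.0 >> ADDR_BITS) as u16
    }
}
```

The packed value is 8 bytes instead of the 16 a `&[u8]` takes on a 64-bit target, which is where the saving comes from. The cost is that the integer carries no lifetime and no borrow, so turning it back into a slice needs `unsafe` plus an outside guarantee that the memory is still alive and really is what the length claims.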
For the same reason the V8 team bothered to set up a 32-bit addressing scheme for the GC heap even on 64-bit platforms, I imagine? The bytes add up when there’s enough of them.
What they’ve done here isn’t safe either, and doesn’t have the consistent translation of c2rust.
Sure, but the point remains. They could've used Claude to build a Zig to Rust converter, ended up with something that was both deterministic _and_ beneficial to the wider community.
> why not just build a translation tool for it?
They did ;) a highly dynamic one...
I mean, LLMs have been really good at translating code for a while now, which is why I'm more surprised that others are surprised this happened. They claim it's a marketing trick despite the fact that they have to manage and maintain a fork of Zig if they don't switch languages.
“Tbh they could've just hooked up zig translate-c to c2rust”
This doesn’t work like you think it does. These things are full of errors and make the code very verbose and hard to reason about. It works with small apps, not entire rewrites.
That would have been the proper way to port a codebase to another language, by parsing the syntax tree and applying deterministic and verified transformations.
Because they aren't trying to raise billions of dollars to build a translation tool.
This issue is misleading.
The issue isn't the existence of undefined behavior that miri would catch. The issue is exposing an API that allows undefined behavior from safe code - which miri only catches if you go write the test that proves it.
This isn't an altogether unreasonable thing to happen during an initial port of code from an unsafe language. You can, as the bun team seems to be doing, go around later and make sure that the functions where you wrap unsafe code do so correctly. Temporarily marking some unsafe functions as safe during a porting stage isn't a real issue. It's a bit strange to merge it into the main repo in this state, but not a wholly unreasonable thing to do if the team has decided that they're definitely doing this. The only real issue would be if they made an actual release with the code in this state.
It's also a bit unfortunate that they didn't immediately set up their tests to run in miri if only because LLMs respond so well to good tests - I know they didn't do this not because of this github issue (which doesn't demonstrate that) but because there's another test [1] that absolutely does invoke undefined behavior that miri would catch. Though the code it's testing doesn't actually appear to be used anywhere so it's not much of a real issue. That said it's obviously early in the porting process... maybe they'll get around to it (or just get rid of all this unsafe code that they don't actually need).
[1] https://github.com/oven-sh/bun/blob/4d443e54022ceeadc79adf54... - the pointers derived from the first mutable references are invalidated by creating a new mutable reference to the same object. In C terms think of "mutable reference" as "restrict reference which a trivial mutation is made through". It's easy to do this properly, derive all the pointers from the same mutable reference, it just wasn't done properly.
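To make "derive all the pointers from the same mutable reference" concrete, a small sketch with a hypothetical `bump_both` function (not the code from the linked test). The broken pattern is only described in a comment, since writing it out would itself be UB under Miri's default aliasing model.

```rust
/// Increments both elements through raw pointers and returns their sum.
pub fn bump_both(pair: &mut [u32; 2]) -> u32 {
    // Broken pattern (not compiled here): take `&mut *pair` once, turn it
    // into raw pointer `a`, take `&mut *pair` a second time for raw pointer
    // `b`, then write through `a`. Creating the second mutable reference
    // invalidates `a`, which is the kind of error Miri reports.

    // Correct pattern: one mutable borrow, one base pointer, and every
    // other pointer derived from that base.
    let base: *mut u32 = pair.as_mut_ptr();
    // SAFETY: offsets 0 and 1 are in bounds of a `[u32; 2]`, and no new
    // reference to `pair` is created while `first` and `second` are in use.
    unsafe {
        let first = base;
        let second = base.add(1);
        *first += 1;
        *second += 1;
        *first + *second
    }
}
```

With a nightly toolchain and the Miri component installed, running the test suite under `cargo miri test` is the usual way to check code like this.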
PS. Spamming github just makes people less likely to work in the open. Please don't. We can all judge this work just fine on third party sites.
PPS. And we might want to withhold judgement until it's in a published state. Judging intermediate working states doesn't seem terribly fair or interesting to me.
This doesn't seem surprising, given the straight translation that they prompted.
Couldn't a case be made that it's better to get Bun to the language with the stronger type system first and, once there, use that stronger type system as leverage for these kinds of improvements as a follow-on effort? It seems preferable to requiring perfection on the very first step.
> Couldn't a case be made that it's better to get Bun to the language with the stronger type system first and, once there, use that stronger type system as leverage for these kinds of improvements as a follow-on effort? It seems preferable to requiring perfection on the very first step.
This is what they are doing.
They are working through the issues as they come in.
They'll just need to update the prompt with "make sure there's no UB", and it should be good.
Yes, and seems pretty clear you can now backpressure the rewrite with tools like miri to have Claude Code automatically improve it.
It's not surprising that a mostly straightforward translation to (partly unsafe) Rust exhibits UB.
What is a bit disappointing is that the Rust code apparently has APIs that aren't marked unsafe but may cause UB anyway. When doing this kind of translation, I'd always err on the side of caution and start by marking all/most things unsafe. Or prompt the slopbots to do the same I guess.
Then you can go in and verify the safety of individual bits step by step.
From what I read from the PR comments, the case is that the unsafe blocks behave in a way that allows for UB.
This is expected, because unsafe Rust can leave your program in an unhealthy state, since the language doesn't hold your hand anymore.
The point is that at a minimum you're supposed to bubble the `unsafe` up if the API does not guarantee safety is maintained for all cases (and documents the invariants that have to be kept by the caller), otherwise the system breaks down.
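As a rough sketch of what "bubbling the `unsafe` up" looks like, using a hypothetical `Buffer` type (all names here are made up): the function that cannot uphold the invariant on its own is itself marked `unsafe` and documents the contract, while the safe wrapper checks the contract before calling it.

```rust
/// Hypothetical byte buffer used only for illustration.
pub struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    pub fn new(data: Vec<u8>) -> Self {
        Buffer { data }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// Reads a byte without a bounds check.
    ///
    /// # Safety
    /// `index` must be less than `self.len()`.
    pub unsafe fn get_unchecked(&self, index: usize) -> u8 {
        // SAFETY: the caller promises `index < self.len()`.
        unsafe { *self.data.get_unchecked(index) }
    }

    /// Safe wrapper: establishes the invariant itself, so callers cannot
    /// cause UB no matter what they pass in.
    pub fn get(&self, index: usize) -> Option<u8> {
        if index < self.data.len() {
            // SAFETY: `index` was bounds-checked on the line above.
            Some(unsafe { self.get_unchecked(index) })
        } else {
            None
        }
    }
}
```

During a port, defaulting to the `unsafe fn` form and only promoting a function to safe once its checks are in place keeps every unverified assumption visible at the call site.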
There's a book that changed a lot of the way I think about attention and media [0]. The book isn't very good, but it flags something relevant here. There is a huge asymmetry between the reach of a big, flashy announcement (here: bun was re-written in memory-safe rust in a couple weeks), and the relatively small reach of a correction (often just a footnote on an old article, here a GH issue).
This asymmetry is well understood by marketing and PR professionals, and actively exploited.
[0] https://en.wikipedia.org/wiki/Trust_Me,_I%27m_Lying
Hmmm, given the general mood in this case, I feel like there's a lot of people keen to find any criticism of the code they can and amplify it as much as possible. Most of it strikes me as relatively shallow at the moment, though (that is, apart from the fact that merging such a large LLM-assisted port is certainly a, uh _bold_ move (to put it lightly), there's not much that people are pointing out about the actual result that feels like it's worse than any other port in progress, but there is definitely a lot of hay being made about any issue that is found).
> Most of it strikes me as relatively shallow at the moment
It is. We’re what, a week into this exercise? Absolutely everyone criticizing it, with no exceptions, is behaving like a micromanaging middle manager who couldn’t even dream of doing the work themselves.
I half want to start a list of “people to ignore”, but such people tend to expose themselves in every other comment anyway.
Idk the pr author did merge it into main and talked about writing a blog post. To me that sounds like the author felt it was ready for public critique and feedback, especially for software with a fair bit of users
It may just be different expectations about what the `main` branch means. In my organization we don't merge half-finished work into `main`.
Pretty typical of reddit-resembling sites like HN. People here are very politically, uh, involved.
https://news.ycombinator.com/newsguidelines.html
> Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.
I did not say that HN is turning into reddit nor indicate any view on any way in which HN is trending. Remarking that Reddit and HN are similar is not rule breaking, they are both vote systems, etc.
> a big, flashy announcement (here: bun was re-written in memory-safe rust in a couple weeks)
Did they even claim it was "memory-safe"? Every discussion of this topic has had dozens of comments noting that their vibed codebase is bursting at the seams with unaudited unsafe blocks, lightly reviewed by people who not only seem to not understand Rust, but who seem incensed at the idea of needing to understand any programming language in the first place.
No, and there's been a lot of confusion about that on this website.
They did cite Rust's safety as a motivating factor for the port. That doesn't imply trying to achieve that simultaneously with the language change — which is good, because that would be insane. (Or, if you prefer, even more insane.)
You cannot faithfully port a codebase to a new language while also radically re-architecting it. You have to choose.
They want the safety benefits of Rust going forward; i.e., after it's finished, when they then write new code in Rust.
Yeah, exactly. The typical approach is to do a mechanical translation, such as with c2rust, that is full of unsafe, and then gradually refactor safety in.
But nobody makes announcements and blog posts about running that.
There's several blog posts here. https://www.memorysafety.org/initiative/av1/
And the first post is about the team working on the project, with about two and a half sentences on c2rust, and making it very clear they just started.
The newer posts go into detail about the rearchitecting that follows.
And indeed, the bun team has not done that
Did they not make the announcement? And they definitely promised a blog post even if it's not out yet.
> Did they even claim it was "memory-safe"?
they didn't,
actually the port is trying to be mostly 1:1 and in turn is mostly unsafe rust, which means no benefits initially
but also doing the 1:1 port to mostly unsafe rust is also only the first step of a full port, you then incrementally go through it fixing issues and remove "unsafe" usage. (And long term likely also doing some refactoring to using more idiomatic rust, but that has less priority).
The problem is there was no blog post describing the whole thing to someone without contextual knowledge. Instead there were just linked PRs, which in this case is somewhat close to an "as if nearly all people only read the HN headline" case :/
Like a more context giving version of the first HN post would have something on the line of `Show HN: Bun is porting to safe rust (PR link), starting with an AI based automatized port to mostly unsafe rust which once it behaves mostly the same as Bun in the test suite will likely be merged. But must be followed up with incremental PRs to remove unsafeness, and likely also a lot of unsoundness related to the way it's ported (some explanation about why this port will have unsoundness).`
They didn't need to - much of the popular hype around rust is on the back of uninformed spectators confusing Rust's tools for enabling memory-safety (good, warranting hype) with Rust itself guaranteeing automatic memory safety (fantasy).
The author kept bragging about classes of bugs that would not happen with Rust.
A bug-for-bug port to Rust is the first step to fixing that. Assuming the port is actually 1:1 without any behavioral changes, these bugs already exist in the Zig code. The difference is now it's known where effort can be dedicated in order to one day have a memory-safe release of Bun. People have absolutely lost their minds over this and completely forgotten the benefits Rust gives you. I feel like I've gone back 10 years reading threads about the Rust port of Bun; these are the exact same arguments we see from people advocating continued use of C++.
> Assuming the port is actually 1:1 without any behavioral changes, these bugs already exist in the Zig code
The "1:1" assumption is a massive unjustified assumption. Rust and Zig have different memory models, so it's possible to do a "1:1" translation of Zig code to Rust and end up with undefined behavior in Rust.
For example, Zig code might make assumptions about lifetimes based on implicit knowledge of which allocator was used for some memory. That could cause problems in Rust if you erase the lifetime https://github.com/oven-sh/bun/blob/main/src/bun_core/string...
> Assuming the port is actually 1:1 without any behavioral changes
It's not, that's clear from this kind of bug popping up. Functionally this bug exists because `PathString` was converted into a "safe" Rust API but still works the same internally as the original Zig code did (via using `unsafe`), that introduces UB that wasn't there in the Zig code.
If it was attempting to be a 1:1 port with no behavior changes (like c2rust attempts to do) then this would not have been turned into a "safe" Rust API like this.
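For contrast, a sketch of a pointer-plus-length type whose safe API is actually sound because it keeps the lifetime instead of erasing it. `BorrowedPath` and `path_len` are hypothetical stand-ins, not Bun's actual `PathString`, and this assumes the path always borrows from caller-owned memory.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a pointer+length path type.
pub struct BorrowedPath<'a> {
    ptr: *const u8,
    len: usize,
    // Ties this value to the borrow it was created from.
    _owner: PhantomData<&'a [u8]>,
}

impl<'a> BorrowedPath<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        BorrowedPath { ptr: bytes.as_ptr(), len: bytes.len(), _owner: PhantomData }
    }

    /// Safe accessor: the returned slice cannot outlive the original borrow.
    pub fn as_bytes(&self) -> &'a [u8] {
        // SAFETY: `ptr` and `len` come from a `&'a [u8]`, and `'a` keeps that
        // borrow alive for as long as the returned slice can be used.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

pub fn path_len(buf: &[u8]) -> usize {
    BorrowedPath::new(buf).as_bytes().len()
}
```

With this shape, dropping the owning buffer while a `BorrowedPath` is still in use is a compile error rather than a use-after-free. The cost is that the lifetime parameter spreads to every struct that stores such a path, which is presumably why a mechanical port would be tempted to erase it.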
It's almost like AI is rotting our brains?
They didn't have to. There's a widely held assumption that Rust == safe, or safer than anything else.
What exactly people mean by "safe(r)" makes all the difference.
It's simply not possible to include all the nuance of safety of a language and all software written in it in a single word. This leads to all kinds of miscommunication and strawmanning.
Rust's official line is specific memory safety guarantees, with caveats that it must not be broken by unsafe code, the OS, compiler bugs may happen, etc. Rust also has a bunch of best-effort features that steer users towards more robust code, but can't guarantee it.
This gets twisted in both directions:
- people ignore the caveats and limitations, pretending that Rust promised zero bugs ever, and use any bug in any Rust program as a proof by contradiction that Rust's claims are false.
- or focus solely on the caveats, ignoring all the advancements and incremental improvements, and take a "then why even bother?" stance. There are classes of bugs Rust can't stop. Nothing is foolproof for a sufficiently advanced fool, and an infallible programmer could write bug-free code in any language, which creates a false equivalence between languages.
Is this the concept that's referred to in the quote "a lie can travel halfway around the world before the truth puts on its shoes"?
You could view it as a specific application of the quote.
In your quote, there is no time-dependency between the lie and the truth. Whereas here, it's an attractive lie (easily parsed, great narrative), followed up by truths (that need more than surface-level analysis).
I thought you were going to call out the problem in the other direction: There has not been a "big, flashy announcement" because the port is a work in progress. It's not done or released. The only big flashy announcements I see are these drive-by dunk attempts on the work in progress code combined with attempts to imply that they said it was done or perfect.
The rewrite was a code translation meant to be a starting point.
> a big, flashy announcement (here: bun was re-written in memory-safe rust in a couple weeks), and the relatively small reach of a correction (often just a footnote on an old article, here a GH issue).
The Bun team never made a big announcement that the code is now memory safe. They've been clear that this is the starting point.
Anyone expecting it to be perfect immediately and to have solved all of the memory problems in the original Zig code is arguing with an announcement they imagined, not what the Bun team has said.
Did anyone try to map this code back to the original codebase to see if this memory problem exists in the original codebase?
> Did anyone try to map this code back to the original codebase to see if this memory problem exists in the original codebase?
FWIW what is being discussed is not memory problems, it's breaking rust invariants (the unsafe code has to follow specific rules, e.g. annotate lifetimes properly).
strongly agreed, all of the flashiness is coming from the detractors. the port itself has been quite lowkey
Not just marketing and PR, the mainstream media knows that pushing out BS and then retracting it later can have lasting effects because people will remember the original article / headline, and never see the correction.
I don’t think the media care about having lasting effects. They just want to catch the wave of interest and not wait around and let someone else get the scoop while they fact check or add nuance.
only the mainstream media knows about this? Quite odd to qualify media this way here, when most of all media uses this mechanism. We also forgot politicians who are experts in this field.
Ctrl + F "only" is only in your messaging not mine. I never said they were the only ones doing this? It's not just politicians, celebrities know about this and will use it to their advantage. Whoever makes the headlines first might have a stronger sway over their adversaries. I'm not even poking at any side in particular, this is reality across the board unfortunately. People will just blindly take and believe the primary headlines.
The effort required to refute bullshit is an order of magnitude more than to create it.
I was a little shocked that they could get it fully working in a week, to be honest. My side project is a very similar ambition (https://tsz.dev) but I am in no way claiming success. I keep adding more and more tests to ensure things work. Even after all of TypeScript's own tests pass I am finding bugs, which I was totally expecting.
The bar for matching tsc's behavior is really _really_ high. see:
https://github.com/type-challenges/type-challenges
I'm not against using LLMs to write a lot of code. But verification should be 100x more robust now that we can output code at this rate.
I'm stunned that it went from 'this is an experiment' to merging about a million lines of (likely) unreviewed code in a week. I have nothing against using agents, but to rush something like this and leave the community blindsided seems extremely amateurish. Like something you'd expect a bright-eyed graduate engineer to do.
Blindsided? Has there even been a release yet?
Yes, they released a canary version.
tsz for me is an experiment to see how this kind of work can be done better. With a slight difference: tsz is not a direct port and it's a different architecture. I'm also not claiming to have answers, but I've learned a ton. A few things that work:
- Test before code. Bun had lots of tests, so that's good, but maybe they could start by asking Mythos to write like 20k additional tests that pass on Zig Bun first.
- Deterministic anti-slop features. LLMs love to solve the problem in the wrong abstraction layer or place. There are many ways to catch this with deterministic tests. I do this in tsz a lot
- A roadmap that is constantly evolved by humans.
- Taking a pause, looking at how the progress is going, and undoing slop
- Fuzztest(https://github.com/google/fuzztest) style "trying to break things" with the powers of LLM
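The "test before code" and "trying to break things" items above can be combined into a differential test: feed the same generated inputs to the old and new implementations and report the first disagreement. This is a minimal sketch using only the standard library; `reference_basename`, `ported_basename`, and the tiny xorshift generator are all hypothetical illustrations, not code from tsz or Bun.

```rust
/// Reference behaviour, standing in for the original implementation.
pub fn reference_basename(path: &str) -> &str {
    match path.rfind('/') {
        Some(i) => &path[i + 1..],
        None => path,
    }
}

/// The "ported" implementation under test.
pub fn ported_basename(path: &str) -> &str {
    path.rsplit('/').next().unwrap_or(path)
}

/// Tiny deterministic xorshift generator, so failures are reproducible
/// without external crates. The seed must be non-zero.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Builds a short pseudo-random path from a small alphabet that includes '/'.
pub fn random_path(rng: &mut XorShift) -> String {
    const ALPHABET: &[u8] = b"ab/.";
    let len = (rng.next_u64() % 8) as usize;
    (0..len)
        .map(|_| ALPHABET[(rng.next_u64() % ALPHABET.len() as u64) as usize] as char)
        .collect()
}

/// Returns the first input on which the two implementations disagree, if any.
pub fn find_mismatch(seed: u64, iterations: usize) -> Option<String> {
    let mut rng = XorShift(seed);
    (0..iterations)
        .map(|_| random_path(&mut rng))
        .find(|p| reference_basename(p) != ported_basename(p))
}
```

In a real port the reference side would call into the old binary or a recorded corpus of its outputs rather than a Rust function, and a coverage-guided fuzzer would replace the hand-rolled generator.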
I suspect they've been planning this and experimenting for many months. Along with the large existing test suite, they have lots of tooling for parallelizing agents and an unlimited token budget. So don't feel too bad..
Is there any evidence that the process was done in a week?
The comments on that thread are Facebook level cringe. What a dump GitHub has become!
That kind of error was expected. I don't see it as an issue against the rewrite. They kept the stable versions on Zig in case people need stability. Eventually, the errors will get fixed.
That kind of error was entirely avoidable. There are well-known tools in the Rust ecosystem that detect this kind of error and while the tools do not detect all instances of UB caused by mistakes in unsafe blocks, it's still considered good practice to run them.
>There are well-known tools in the Rust ecosystem that detect this kind of error
Yes, tools like Miri, which this very post is about.
Indeed. My point is that just using the standard tools in the Rust ecosystem - like miri - would have trivially uncovered this error before it made it to the mainline.
This is an engineering choice: do you merge first and then fix the remaining issues or do you get everything perfectly clean first and then merge?
I've seen large rewrites and migrations take both approaches -- in my experience, the former usually works out better.
In any practical application there'll be a known set of errors and I'm generally fine merging code that has known deficiencies. But personally, I'd not condone merging anything that causes UB. It undermines such a fundamental guarantee of the language that it should be detected and eliminated. And bun certainly rises to the level of software where I'd expect that the project runs all available tooling to detect such cases. Especially if you LLM - code it. "Do not cause UB" should be part of the test harness.
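As a rough sketch of what "do not cause UB" in the test harness can look like: Miri runs an ordinary test suite under an interpreter that checks each memory access. The function `sum_raw` below is a hypothetical example, not Bun code; the two commands in the comment are the usual way to install and invoke Miri on a nightly toolchain.

```rust
// To run the tests under Miri (requires a nightly toolchain):
//   rustup +nightly component add miri
//   cargo +nightly miri test

/// Sums a slice by walking raw pointers, the kind of code Miri is meant to check.
pub fn sum_raw(values: &[u32]) -> u32 {
    let mut total = 0u32;
    let base = values.as_ptr();
    for i in 0..values.len() {
        // SAFETY: `i < values.len()`, so `base.add(i)` stays inside the slice.
        total = total.wrapping_add(unsafe { *base.add(i) });
    }
    total
}

#[cfg(test)]
mod tests {
    use super::sum_raw;

    #[test]
    fn sums_in_bounds() {
        assert_eq!(sum_raw(&[1, 2, 3]), 6);
    }
}
```

If the loop bound were changed to `0..=values.len()`, the extra read would be out of bounds; a plain `cargo test` may or may not notice, whereas Miri reports it as undefined behavior. Miri only sees the code paths the tests actually execute, so it complements review rather than replacing it.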
I think the point is you should probably run that tool and fix all the issues _before_ merging into the master branch.
Indeed this was caught by a well-known tool, Miri, that detected this error.
I wonder if the publicity around this AI-driven rewrite will function as a (unintentional, or perhaps intentional) far-reaching nerd snipe that results in Rust developers flocking to the project to identify and fix issues.
Why do you guys assume that a codebase from an unsafe language containing bindings to another unsafe language would appear perfectly implemented right away?
This Bun rewrite feels like a potential Mythos marketing stunt.
Not a single person on the Bun team nor Anthropic has yet done anything egregious to market this as anything but a swap to a more memory-safe language with better compiler guarantees.
Thus far most of the buzz and marketing has been entirely negative from people who are against AI.
My take is that most of the buzz is also tied to recent negative opinions of Anthropic themselves due to some of their recent decisions.
The best kind of marketing is when you don’t need to say it aloud by yourself. Yet, this is constantly in HN front page. Maybe engineered or not but marketing regardless.
engineering is worthless unless you can sell it
I think it’s just a corporate rug pull. Anthropic’s needs are the sole priority, all others be damned.
This is my biggest issue with all this.
It's not that they're using AI, it's the massive rug pull on bun users.
* Spend God knows how many dollars in unlimited tokens to do the rewrite
* Make a huge deal out of it how “Claude Code enabled Bun team to rewrite 1+ mil of Zig lines to Rust” and write a blogpost, VCs are salivating
* Basic checks fail
* Let Mythos rip the codebase to shreds, spend God knows how much more
* Write a separate blogpost
* Charlatans and smooth brains clap and defend against “delusional anti-AI mob”
* VCs orgasm even harder
Clap, clap, clap. That’s how you make money, folks.
And btw, we need to get rid of software engineers now.
They didn't make a huge deal about it though. I distinctly remember Jarred coming on here just last week to say stop making a big deal about it.
And how is the Bun team supposed to work on a million lines of slop rust code?
just as overhyped and disappointing?
Finally someone gets the point of this.
I'm curious, but unable to ascertain, does the same problem exist in the original Zig code? Is this an issue introduced by the translation to Rust? Because if it is a problem that can be replicated in both code bases, it seems a point in Rust's favor, that the issue is easily identifiable with tools that exist in its ecosystem.
I'm also curious about that. One thing to keep in mind: the invariants you have to uphold in unsafe blocks are quite stringent. I expect that in some instances the Rust code has new UB due to this.
So Bun saga has been
"Zig, let me Ai you"
"no"
*Ai's Zig fork, suffers from memory bugs*
"Well I'm moving!"
*Ai's code into Rust, suffers from memory bugs*
Sure. I'm completely unaffiliated and think Zig's AI stance is ridiculous & politically-motivated and a port is absolutely justified if they will not budge. Apparently I am deeply in the minority.
The no-AI policy of the Zig compiler project is for the compiler, other projects can do whatever they want.
Bun's fork of Zig was just an unsound hack that at best would have produced a strictly inferior speedup compared to our current work with incremental compilation, which is already plenty usable:
- June 2025 core team starts using it with the zig compiler itself https://ziglang.org/devlog/2025/#2025-06-14
- April 2026 https://ziglang.org/devlog/2026/#2026-04-08
> Zig's AI stance is ridiculous & politically-motivated
It's literally an issue with our business model to mess with our contributor pipeline, can't get more concrete than this.
https://kristoff.it/blog/contributor-poker-and-ai/
> The no-AI policy of the Zig compiler project is for the compiler, other projects can do whatever they want.
Well, presumably they want to contribute to the compiler. I know that you did not like those contributions, and that view seems entirely valid, but obviously "no AI" rules out their development model (by design, and you likely think that's good, and maybe it is!).
Not intending to defend the bun move, but obviously a project using Zig and also using AI might feel motivated to avoid Zig since they're ruled out as contributors.
https://ziggit.dev/t/bun-s-zig-fork-got-4x-faster-compilatio...
> An example of this is the changes to type resolution which happened in the 0.16.0 release cycle—these didn’t affect users too much, but had big implications for the compiler implementation. Before those changes, the compiler’s behavior was often highly dependent on the order in which types and declarations were semantically analyzed by the compiler. Some orders might result in successful compilation, while others give compile errors. Single-threaded semantic analysis prevented these bugs from causing user-facing non-determinism. The rewritten type resolution semantics were designed to avoid these issues, but Bun’s Zig fork does not incorporate the changes (and has not otherwise solved the design problems), which means their parallelized semantic analysis implementation will exhibit non-deterministic behavior. That’s pretty much a non-starter for most serious developers: you don’t want your compilation to randomly fail with a nonsense error 30% of the time.
There is a reason why, zig is upholding the quality and they hate it.
Zig rejected Bun's proposed contribution because it was a bad contribution, which they explained at length. Zig should not be made to "budge" on bad contributions. It seems you think Zig is unreasonable for rejecting bad code that happens to also be AI-generated, but believe it's reasonable for a project to be forced to accept bad code because it is AI?
No, I think Zig should reject bad AI contributions and accept good AI contributions. That is not Zig policy, they reject all AI-authored contributions.
Not sure why you're inventing a stance for me to be arguing against, when the Zig compiler stance is publicly articulated as exactly what I'm describing.
The problem as they've mentioned is AI contributors don't learn. They cannot have a working relationship with an AI contributor. The context about ongoing efforts, planned design changes to the language, etc is lost every time Claude is run. There is no way to work with that. People will submit infinite PRs while the core devs are flooded and forced to repeat themselves an infinite number of ways to an infinite number of stochastic prompts and responses.
The zig team is not that big. They don't have 200 core contributors to filter through the noise and mine PRs for "gems".
You specifically mentioned "a port is justified if they won't budge", which comes across to me as defending Bun's situation specifically, in other words expecting Zig to budge on Bun's bad contribution specifically and because they won't this slop Rust port is justified.
I think an outright rejection of AI contributions makes sense, regardless, and has nothing to do with politics. A Zig developer was forced into writing a long-form post to justify rejecting Bun's awful contribution (lest their PR be sullied, and then it was anyways), and the act of writing that post probably took 10 or 20x more human time and effort than Bun's contribution. Now multiply that by 100 for every random fucking moron with an LLM submitting a contribution. That is not sustainable. Open source maintainers of popular projects would have to make rejecting AI PRs their full time job and stop developing the project itself altogether, if they took them seriously and reviewed at length to conclusively identify whether a PR is good or bad. Given that 99.99% of AI PRs are bad, it's simply not worth it. You cannot possibly expect humans to spend more time reviewing code than drive-by contributors spent generating it, especially when many of them are unpaid volunteers. It's an absolutely ridiculous expectation.
Philosophically motivated, sure. In what way is the Zig foundation's AI stance political?
I think that we only see these bans because AI has become such a massive political issue in the last year.
Define “political” when it comes to Zig and AI.
You have to do better than that if you're going to accuse the Zig team specifically of this.
When my lead developer refactored my small but crucial python service into golang with Gemini and Claude, I was hesitant to merge the code into master. Yet, my service had, like 20k daily active users.
I think they shat over the community who trusted them by trying to advertise their owner company
Man, that issue got way too many comments from non-contributors. I agree that this shouldn't have been merged in its current state, but that doesn't mean posting about it on GitHub is a worthwhile way to fix the problem.
Does all that UB exist in the Zig version as well? Was it introduced during the port?
After this was merged, my company made the decision to migrate everything away from bun and back to node. I don't say this lightly... Jarred is a guy that I held such immense respect for, and it's sad to see the course he's charted for a project I spent a lot of time proselytizing internally. It's frankly a betrayal of trust.
I thought this was just like a fun proof of concept.. I didn’t realize they actually merged it to main. That seems pretty crazy to me.
There’s no way they had time to review the code. This just seems so wildly irresponsible for such an important and high profile project.
I thought the Zig code is still in main? Both versions?
Rust good zig bad!
Certainly disagree with "AIs are not good at writing Rust". We can discuss the pros and cons of AI coding in general but in my experience they do just as well with Rust as any other language. If anything I'm impressed with how seamlessly the models can work with Rust's ownership model.
Any prediction market bets on what will they rewrite it into next week? Era of just-in-vibe software is here.
It is really sad and unfortunate that coding has started falling under the omnicause. Low-denominator discourse is invading every space I find interesting and it is difficult to avoid.
I agree. I'm as skeptical as many commenters but I also think the degree of polarization in HN around this technology and the degree to which people are calling those with different views shills or naysayers is pretty sad.
> Please consider not vibe coding rust as AIs are not good at writing Rust and also hire a real rust dev
Isn't the whole point of AI companies using Rust that it's explicit, safe, and AIs are fairly good at writing it?
I suspect "Rust is fast/low memory utilization" is the more common value proposition, with memory safety as the bonus that can push it over other fast languages.
Related: If AI writes your code, why use Python? (which notes why Rust has taken off for LLMs) https://news.ycombinator.com/item?id=48100433
Maybe they want a quick switchover, and the UB is replicating existing problems, so it is net neutral for the codebase (but positive for the future, because developers can do future work in Rust without synchronizing two codebases?).
If that was true, then I would expect followups to reduce UB and unsafe in general, or at least requiring a lifetime for caller-owned memory.
But I think their true strategy is to have AI produce "fixes" like these which will end up infecting the entire codebase: https://github.com/oven-sh/bun/pull/30728
exactly. If they wanted to iterate on their port they would add lifetime annotations here, which are the tool Rust uses to ensure safety. They're just kicking the unsafety block down the road. This accomplishes nothing and is not how you get Rust to deliver its safety promise.
Unsafe code still mostly needs lifetimes. `unsafe` extends what you are allowed to do; it does not remove them. I wonder how many nasty things they have done, then.
Lifetimes would prevent the particular use-after-free example here, but the UB that miri currently flags would still exist, as it's related to pointer provenance, not lifetimes.
https://www.reddit.com/r/rust/comments/1hxjdvp/eli5_what_is_...
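A small illustration of how a provenance problem differs from a lifetime problem, as a sketch only: `set_second` is a hypothetical function, and the commented-out variant describes what Miri's aliasing model would flag rather than anything taken from the Bun issue.

```rust
/// Writes `value` to `values[1]` through a raw pointer whose provenance
/// covers the whole slice.
pub fn set_second(values: &mut [i32], value: i32) {
    assert!(values.len() >= 2);
    // `as_mut_ptr` derives the pointer from the entire slice, so offsetting
    // within the slice is permitted.
    let base = values.as_mut_ptr();
    // SAFETY: the length check above guarantees index 1 is in bounds.
    unsafe { *base.add(1) = value };

    // By contrast, a pointer derived from a reference to a single element:
    //
    //     let first = &mut values[0] as *mut i32;
    //     unsafe { *first.add(1) = value };
    //
    // only carries permission for that one element. The address computed is
    // the same and no lifetime is violated, yet Miri reports the write as UB
    // under its aliasing model.
}
```

Both variants produce the same machine address, which is why this class of bug has no direct equivalent to "map back" to in a language without these rules.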
> If that was true, then I would expect followups to reduce UB and unsafe in general, or at least requiring a lifetime for caller-owned memory.
It's been like a day since the merge, presumably such followups are coming.
So this is a clear case where the LLM generated Rust port introduced a bug:
> The Zig original is a packed struct with the same shape; it "worked" only because Zig has no reference aliasing or provenance rules to violate. The Rust port inherited the shape without rethinking the API surface.
Neat - didn't know about the miri tool.
Will definitely use that going forward
Didn't find anything on my existing vibecoded rust projects but can't hurt
Sorry wasn't there a post literally like a week ago about this being a long term experimental branch and how we needed to not kick the hatchling while it's an egg?
1 week turnaround I guess is what they meant.
Probably planned during the Anthropic acquisition. No way this would happen without their blessing. This is one way to reduce the community noise.
So a "robobun" clanker responds to the issue and writes a fix (probably just papering over it). This is what Anthropic wants: Let the users do the work, train the fricking bot and claim the credit.
If you find a bug, just go straight to blog posts and CVEs to denounce this idiocy. It ranks higher on Google.
I speculate the real goal is to have that fixed over time, and then use it as precious training data for Rust capabilities
This case is wild and seems to perfectly encapsulate all the problems people complain about with vibecoded projects.
The "rewrite it in rust" commit is +1M lines of code. Humans haven't looked at that in depth. In about a week, they saw the tests passed and pushed it to main. Now people have started to look through it and are pointing out glaring issues. And the solution is just going to be "feed it to another AI and ask it to fix it".
The entire codebase is slop now. Nobody knows what it does. It manages to pass some tests, but it's largely a black box, simply because humans haven't read it yet. The code isn't guaranteed to be anything close to 1:1 with the old codebase. It's probably vaguely shaped like the old codebase, but new bugs could be there, old bugs could be there, nobody knows anything yet.
It's going to be interesting to see how recoverable this is. They are almost certainly going to just hand every file to an AI, say "look for soundness issues and fix them" and then what? If AI is making huge, sweeping changes to the code so frequently that humans can't keep up, is that really maintainable? The only solution appears to be "even more AI" while anybody that looks closely gets scared away by the too-large-to-comprehend-and-entirely-slop codebase.
This kind of thing has been happening with many smaller projects already, but now it's a larger project and happening in a much more public way, with the intent to replace human-written, mostly-understood code with slop. I suspect the same thing, with the same problems, is happening inside all the largest companies, just not quite as obviously.
This is more or less my take on it.
I am not against AI code, it can be perfectly fine.
The principal issue in my mind is the rate of change.
Once you rewrite a codebase like this (in a week, no less), the only way to work on it in the future is using AI tools, because no single person has any knowledge about any specific piece of the codebase any more.
AI generated code that is run through a classic PR process would potentially be fine, but then you sorta lose the entire point of using AI.
That happened to my project as well. The main issue hasn’t been that AI couldn’t solve the problem, but that it became so slow, and you need more and more verification layers and CI/CD, that at some point you wish for a simpler codebase back, with reasonable tests, with storylines in the code and so on.
> only solution appears to be "even more AI"
That's the idea, to transform businesses to be wholly dependent on "AI" service to develop software. What better way than to re/write entire codebases until no human being understands it.
The Zig project knows this, and its so-called "anti-AI" policy is actually pro-community and cultivating human understanding. It's not about the tool or technology, per se, it's about people, knowledge, and sustainability.
In contrast, the Bun project is demonstrating that it doesn't care about any of that, YOLO-ing its way to losing the trust of its users, contributors, and maintainers. Oh well, AI will maintain the project now, since no one else can.
UB = undefined behaviour, for anyone else who was puzzled.
I think the only way to interpret a one million line LLM-generated diff with no proper reviews as an employee of Anthropic is that my company no longer has an interest in understanding, or even looking at, its own code.
I'd be concerned that by jumping onboard with this sort of development process I'd lose touch with how to engineer software in a detail-oriented or remotely rigorous way.
It also makes me question what sort of value the entire Bun project ever had if a drop-in replacement can just be thrown in here like it's nothing. Why do we need all these JS runtimes again?
The AI bubble is so large that we've also forgotten how useless and dumb a lot of software engineering labor was even before LLMs came along. We were already in a bubble.
All that is to say, I think it's useful to reframe some conversations about AI as, "if AI can accomplish this task, was it ever actually valuable?" I think for some specific things, the answer will be yes, but the tech industry has been huffing its own farts for so long I really don't think anyone has sight anymore of what's economically valuable in a ground truth sense. Much like LLMs themselves, this confusion pollutes the entire well of discourse about their economic utility.
This was 100% a predictable outcome after Bun was acquired. Of course they were going to do something like this.
What would have been significantly better is just rewriting Claude in a language that's actually well suited to what it's doing in the first place (which could well be Rust, Codex is written in it as prior art). It's funny how the vibe coding promoters are keen on things like this, rewriting other codebases as fast as possible with little quality checking, but they are still defensive of their own code.
When it comes down to it, all the vitriol and animosity towards this port is really because of the implication of what its success would mean. If LLMs are capable of completely porting core software modules many people rely on (not just a CRUD app) of 1m lines in a week's time, it is a case closed moment that LLMs are currently much more capable than most people's eng, and can do it much faster. And that's at current capabilities, never mind where we're headed in 1-3 years.
Jarred is an exceptional 1% engineer, and it's likely he can succeed at this port, to the detriment of naysayers who don't believe there's any chance it's possible.
This had to happen, for many reasons:
- It's a throw-things-at-the-wall-and-see-what-sticks situation
- LLMs will improve*
- Using LLMs in an agentic way will improve (git worktrees, sliced PRs, spec driven steps)
So what happened here is a mess, but you gotta break a few eggs to make a souffle.
It's a learning step and I am glad it happened, there will be so many things to debrief from this.
I don't use Bun or Rust but fair play to them having a punt.
<Shameless plug> I have been working with Claude code to spec out and bring back to life a Spring Boot starter library for Apache Solr search
https://github.com/tomaytotomato/spring-data-solr-lazarus
There were a few points I had to steer it but the result has been a good implementation.
A souffle has not been made
Indeed, more of a frambled egg. Let's see what happens in two years' time.
My grandfather was a tailor, and one day a client came in asking for the status of his suit getting clean. My dad as a kid grabbed the unfinished suit and showed the man, who was frustrated at the lack of progress.
My grandpa told my dad never to show a client a work in progress - You told them when you'd get the work done, and they can see the finished result when it's ready.
It's just a story so don't wrap yourself around the axle with counter-examples. I think it's fair to say that an open-source project going through a language translation is going to have transitional periods as they shake things out, and criticizing every snapshot as some proof that they're incompetent is useless.
Dumbest point ever. There is no value in this issue. I don't agree with the way they did the rewrite, but they did the rewrite, and this post contributes nothing, besides making the author seem childish. If it had any real contribution I would have waved it off, but it really doesn't. This tribalism and "I'm better than you"-ism are the same reason everybody hated the Stack Overflow community, and the Rust community as well.
The issue author is most likely quite literally a child.
Not a good advertisement for either Anthropic or Bun.
said it in another comment [0] - that the whole rewrite thing is just a marketing exercise by LLM merchants to try to sell you plebs that their wares "work"
[0]: https://news.ycombinator.com/item?id=48078224
Step 1: Vibe-code a buggy, poorly-performing, 500k+ LoC desktop-installed monstrosity in TypeScript to implement a trivial TUI. Proudly note that you’re meeting a 16ms frame budget … for a trivial chat UI.
Step 2: Purchase an entire company for a product that, if you squint, might help paper over the entirely predictable problems that arise from using the wrong tools to implement the wrong architecture, because surely the solution isn’t reevaluating your original engineering choices.
Step 3: Perform a buggy, vibe-code rewrite of the tool you just bought. A tool you only need because — for whatever internal political reasons — sunk cost means you can only keep digging.
Step 4: ???
I thought Rust treated undefined behaviour as a compiler bug? Does anyone know what's actually happening here?
"unsafe" is a promise to the compiler that you're going to ensure invariants that the compiler can't check. Rust only promises to eliminate UB if the invariants are held. You can still get UB by violating that promise, as this bug demonstrates.
But the title here says "in safe Rust", no? Is the unsafe code causing UB in safe code? I thought the unsafety couldn't "spread" like that in Rust.
That is not Rust's guarantee. The guarantee is that safe Rust cannot in itself introduce UB - UB can only ever be introduced in unsafe blocks, but it can then materialize in safe code.
Ah OK, that makes sense, thanks.
It can spread into safe code when you build an incorrect "safe" abstraction around unsafe code. Which the Bun Rust port apparently has.
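To make that concrete, here is a minimal sketch of an incorrect "safe" abstraction next to a correct one. The function names `nth_unsound` and `nth_sound` are made up for illustration and have nothing to do with Bun's actual code; only `slice::get_unchecked` is a real std API.

```rust
/// UNSOUND (hypothetical example): the signature is safe, but nothing checks
/// that `i` is in bounds, so a caller who never writes `unsafe` can still
/// trigger UB by passing a bad index.
pub fn nth_unsound(data: &[u32], i: usize) -> u32 {
    unsafe { *data.get_unchecked(i) }
}

/// Sound (hypothetical example): the wrapper itself enforces the precondition
/// that `get_unchecked` requires, so no safe caller can misuse it.
pub fn nth_sound(data: &[u32], i: usize) -> Option<u32> {
    if i < data.len() {
        // SAFETY: `i < data.len()` was checked on the line above.
        Some(unsafe { *data.get_unchecked(i) })
    } else {
        None
    }
}

fn main() {
    let v = [10u32, 20, 30];
    assert_eq!(nth_sound(&v, 1), Some(20));
    assert_eq!(nth_sound(&v, 3), None);
    // In-bounds use of the unsound wrapper happens to be fine...
    assert_eq!(nth_unsound(&v, 1), 20);
    // ...but `nth_unsound(&v, 3)` would also compile with no `unsafe` at the
    // call site, and would be undefined behavior.
}
```

The bug lives in the body of `nth_unsound`, but the call that goes wrong is ordinary safe code, which is roughly the situation people mean by "UB in safe Rust".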
> I thought the unsafety couldn't "spread" like that in Rust.
The goal of a library is to provide the encapsulation such that the unsafety doesn't spread.
If undefined behavior occurs, the fault lies with whoever wrote `unsafe { ... }` in the body of a function. If I write "unsafe" in order to call an unsafe library function, and I don't meet the library function's prerequisites, then it's my fault. If the library internally writes "unsafe" while providing a safe wrapper, and I never actually wrote `unsafe { ... }`, then it's the library's fault. If neither I nor the library wrote `unsafe { ... }`, then it is the fault of the compiler.
Using "in safe Rust" means that `unsafe` occurs neither in the user code nor in the library. In this context, since we've heard how many uses of `unsafe { ... }` exist in the Bun rewrite, I'd read "in safe Rust" to mean "without calling any functions marked as unsafe".
I can't tell if you're trolling but `unsafe { crash() }` is safe from the compiler's perspective. Otherwise you wouldn't be able to achieve anything in 'safe' rust, even print to stdout.
I think it's a good question, just because the whole UB thing is such an ideological shibboleth.
Maybe it's better to think about this in the reverse, where C and C++ have 'defined behavior', but unsafe Rust intentionally does not; it's just whatever the compiler and platform let you get away with. Ultimately it's still just a computer which stores values in memory and jumps to subroutines.
Every language has defined behavior. It's what you expect to happen through a program's execution. Sometimes there will be multiple possibilities, but you can still define them regardless. Laying this out explicitly is the purpose of a standard.
Undefined behavior is everything else. C and C++ are relatively unique in that their standards explicitly say "combining these constructs in this way is undefined", and we call those cases explicit UB. There's also a larger universe of implicit UB that standards omit. Most (all?) languages have implicit UB, even if they lack the explicit stuff. What happens when you get ENOMEM is a common one.
Rust does something similar to C/C++ and lists a bunch of UB that's only possible with incorrect code in unsafe blocks. Correct code placed in an unsafe block remains defined, as does code without unsafe (up to compiler/language bugs).
If you use unsafe improperly, it is possible to encounter UB in "safe" code which relies on the unsafe code being correct.
It's more straightforward to write safe Rust when Rust owns everything. In the real world you are often interfacing with underlying libs or systems etc., which you need to treat as invariants but also handle yourself manually to make guarantees to the compiler. unsafe exists in tons of codebases; you just have to make sure you encapsulate it properly, which is what this bug is about.
Unsafe code can break certain invariants of Rust, as `unsafe` is just a compiler "hold my beer" flag, which is why you're meant to do safety checks in your safe interface around unsafe code. If the unsafe code is wrapped in a way that does no guarding (or does something stupid in general), it is technically marked safe (because you said "rustc, hold my beer" as `unsafe` is also a contract) despite actually being unsafe
UB != unsafe
Rust has lots of undefined behavior, in general a broadly similar set to that which exists in C. What Rust does that is different is that to trigger undefined behavior, you need to execute unsafe code. (This isn't the same as saying that you have to be in unsafe code--you can violate a precondition in unsafe code and have the UB itself trigger in safe code).
That's a good explanation, thank you.
I'm sure there have been attempts at defining a language that has no UB, but afaik all meaningful languages have UB in some dark corner or enumerated explicitly. For example, Java thread execution order is UB.
> For example, Java thread execution order is UB.
In this context "UB" means something different than how you're using it. The UB being mentioned here is the "nasal demons" form, i.e., programs which contain undefined behavior have no defined meaning according to the language semantics.
What you're talking about is probably better described in this context as "unspecified behavior", which is behavior that the language standard does not mandate but does not render programs meaningless. For example, IIRC in C++ the order in which g(), h(), and i() are evaluated in f(g(), h(), i()) is unspecified - an implementation can pick any order, and the order doesn't have to be consistent, but no matter the order the program is valid (approximately speaking).
Great example.
So this "unspecified behavior" might turn into the more nasal demon type when g(), h() and i() share mutable state and assume some particular sequential order of execution. No?
Not necessarily. Unspecified behavior and undefined behavior are independent concepts; a language can have one but not the other. As a result, you can have languages where incorrect reliance on unspecified behavior can lead to undefined behavior (e.g., C and C++) and languages where incorrect reliance on unspecified behavior can lead to bugs, but not nasal demons (e.g., Java)
https://doc.rust-lang.org/reference/behavior-considered-unde...
It is only allowed in unsafe blocks. As long as the unsafe blocks are few and well understood then Rust programmers can contain this to a small well defined portion of a program.
Safe Rust does.
Unsafe Rust allows you to tell the compiler “hold my beer”. It’s a concession to the reality that the normal restrictions of Rust disallow some semantically valid programs that you might otherwise want to write. The safeguards work great in most cases, but in some they’re overly restrictive.
In practice, the overwhelming majority of code is able to be written in safe Rust and the compiler can have your back. The majority of the rest is for performance reasons, interacting with external functions like C libraries over FFI, or expressing semantics that safe Rust struggles with (e.g., circular references).
OK but the title says "in safe Rust". Am I misunderstanding something? All the replies here are saying how it's allowed in unsafe Rust, which is not what the title says.
`unsafe` isn't viral. I can write

    fn safe_function(...) -> (...) {
        unsafe { ... }
    }

then `safe_function` can be called from safe code, and still trigger UB. This wouldn't be a soundness issue in the rust compiler, but instead a bug in safe_function.
There are many reasons you might want to do that. In particular, it's very common in rust to have a library define some data structure that uses unsafe under-the-hood, but checks whatever invariants it needs to, and provides solely safe methods to external callers. Rust's `String` type is like this: it's (roughly) a `Vec<u8>`, i.e. heap-allocated bytes. It has the additional invariant that these bytes correspond to valid UTF-8 though. See for example `push_str_slice`, which (roughly) concatenates 2 strings.
https://doc.rust-lang.org/src/alloc/string.rs.html#1107
It does the following thing
1. reserve enough space for the concatenated string within the source string
2. does some pointer arithmetic and a call to Rust's equivalent to `memcpy` (unsafe)
3. re-casts this pointer to a string object without checking that it's valid utf8 (unsafe).
While these individual calls are unsafe, `push_str_slice` checks that in this particular situation they are safe, so the stdlib authors do not mark `push_str_slice` as unsafe. It has no invariants that must be maintained by external callers.
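The same pattern, shrunk down to a toy. This is only a sketch: `AsciiString` and `NonAscii` are hypothetical types invented for this example, not anything from the stdlib, and the invariant here is "all bytes are ASCII" rather than "valid UTF-8". The shape is the same though: a private field, an invariant that every safe method preserves, and an unchecked conversion that relies on it.

```rust
/// Error returned when the input contains a non-ASCII byte (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct NonAscii;

/// Toy string type (hypothetical). Invariant: every byte in `bytes` is ASCII.
/// The field is private, so only the methods below can touch it.
pub struct AsciiString {
    bytes: Vec<u8>,
}

impl AsciiString {
    pub fn new() -> Self {
        AsciiString { bytes: Vec::new() }
    }

    /// Safe to call with anything: non-ASCII input is rejected up front,
    /// so the invariant cannot be broken from outside this module.
    pub fn push_str(&mut self, s: &str) -> Result<(), NonAscii> {
        if !s.is_ascii() {
            return Err(NonAscii);
        }
        self.bytes.extend_from_slice(s.as_bytes());
        Ok(())
    }

    /// Safe wrapper around an unchecked conversion.
    pub fn as_str(&self) -> &str {
        // SAFETY: `push_str` is the only way to add bytes and it only admits
        // ASCII, and every ASCII byte sequence is valid UTF-8.
        unsafe { std::str::from_utf8_unchecked(&self.bytes) }
    }
}

fn main() {
    let mut s = AsciiString::new();
    assert_eq!(s.push_str("bun"), Ok(()));
    assert_eq!(s.push_str("é"), Err(NonAscii));
    assert_eq!(s.as_str(), "bun");
}
```

If `push_str` forgot the `is_ascii` check, `as_str` would still have a safe signature, and the resulting UB would show up in whatever safe code used the returned `&str`.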
If code in an unsafe block triggers undefined behavior, then the assumptions the compiler makes regarding safety will no longer be true, and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe. This is what's happening in the example the person on Github wrote in the issue.
Exactly and "[...]and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe" hits the nail on the head.
I take issue with the phrasing of OP's title: "allows for UB in safe rust". AFAIK there are compiler bugs that allow UB in safe Rust, but this is not what is happening here. We have UB in an unsafe block (which is to be expected) which enables an issue outside in safe code. What is your opinion? Is calling this "UB in safe Rust" justified?
it is, but it's a little confusing here because the library/consumer of the library are the same person.
This is a bug in the library, namely in Bun's PathString implementation. The bug is a soundness issue, precisely because usage of Bun's PathString implementation allows for UB in safe rust. Now this buggy library isn't that big of a concern for the community, because Bun is the only consumer. It's also not an indication of a compiler bug, because Bun's library is implemented using unsafe rust. But the fundamental issue is that usage of Bun's PathString implementation allows for UB in safe rust, and is therefore (clearly) unsound.
Unsafe blocks are you saying to the compiler 'trust me bro, I know this is safe'. But often that relies on some property of the code being true in order for it to actually be safe. Generally speaking, the expectation in rust is that you either encapsulate the code that enforces whatever property you are relying on behind a safe interface, so that it's not possible for other code to use it unsafely, or that you mark the interface itself unsafe so that it's obvious that the code using that interface needs to maintain that property itself. Rust code that doesn't do this will generally be considered buggy by most rust programmers (e.g. if you find a use of safe interfaces in the stdlib that causes a memory safety violation, then you should file a ticket with the rust team), but this is essentially only a social convention of where the blame lies for a bug, not something that the compiler itself can enforce (and, for example, you can violate memory safety in rust with only safe std interfaces by abusing OS interfaces like /proc/self/mem but this is something that most people don't think can be reasonably fixed). The main reason that rust as a language is better in this regard is that it gives much better tools for being able to express that safe interface without giving up performance and that it has the means to mark and encapsulate this safe/unsafe distinction.
Here's some links on this topic which have some examples:
https://doc.rust-lang.org/nomicon/working-with-unsafe.html
https://www.ralfj.de/blog/2016/01/09/the-scope-of-unsafe.htm...
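A rough sketch of the two options described above (mark the interface `unsafe`, or encapsulate the property behind a safe interface), using a pointer-plus-length string. `RawStr` and `BorrowedStr` are hypothetical types written for this comment and are not Bun's actual PathString; I'm assuming only that the real thing is similar in spirit.

```rust
use std::marker::PhantomData;

/// Option 1 (hypothetical type): the lifetime is erased, so the accessor is
/// marked `unsafe` and the caller has to uphold the documented contract.
#[derive(Clone, Copy)]
pub struct RawStr {
    ptr: *const u8,
    len: usize,
}

impl RawStr {
    pub fn new(s: &str) -> Self {
        RawStr { ptr: s.as_ptr(), len: s.len() }
    }

    /// # Safety
    /// The string passed to `new` must still be alive and unmodified
    /// for as long as the returned reference is used.
    pub unsafe fn as_str<'a>(&self) -> &'a str {
        // SAFETY: guaranteed by the caller, per the contract above.
        unsafe {
            std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr, self.len))
        }
    }
}

/// Option 2 (hypothetical type): keep the lifetime, so the borrow checker
/// enforces the property and the accessor can be safe.
#[derive(Clone, Copy)]
pub struct BorrowedStr<'a> {
    ptr: *const u8,
    len: usize,
    _owner: PhantomData<&'a str>,
}

impl<'a> BorrowedStr<'a> {
    pub fn new(s: &'a str) -> Self {
        BorrowedStr { ptr: s.as_ptr(), len: s.len(), _owner: PhantomData }
    }

    pub fn as_str(&self) -> &'a str {
        // SAFETY: `ptr` and `len` came from a valid `&'a str`, and `'a`
        // keeps the source borrowed for as long as this value exists.
        unsafe {
            std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr, self.len))
        }
    }
}

fn main() {
    let owner = String::from("/tmp/bun");

    let raw = RawStr::new(&owner);
    // SAFETY: `owner` is still alive and unmodified here.
    assert_eq!(unsafe { raw.as_str() }, "/tmp/bun");

    let borrowed = BorrowedStr::new(&owner);
    assert_eq!(borrowed.as_str(), "/tmp/bun");
}
```

The unsound third variant is `RawStr` with a safe `as_str`: it compiles just as happily, and a safe caller can then read through a dangling pointer after the owner is dropped.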
They are using unsafe since large portions of Bun are interfacing with other unsafe codebases, together with a "1:1" rewrite from Zig to Rust.
And it's not like Bun when written in Zig has been a beacon of stability either. It has been segfaults all over the place.
"Port of large memory unsafe codebase has a memory safety bug, news at 11."
I don't see what the big deal is here.
Holy cow:
architector4@AGOGUS:/tmp$ git clone --depth=1 'https://github.com/oven-sh/bun'
Cloning into 'bun'...
…
architector4@AGOGUS:/tmp$ cd bun/
architector4@AGOGUS:/tmp/bun$ find -type f -name '*.rs' -exec grep unsafe {} \; | grep -v '//' | wc -l
13255
....Thirteen thousand two hundred and fifty five lines without comments with the word "unsafe" in them in Rust code files across this rewrite.
This is so gross.
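For anyone wondering what that pipeline actually counts, here is a small sketch of the same filter applied to one file's contents. The function name `count_unsafe_lines` is made up for illustration. It mirrors `grep unsafe | grep -v '//' | wc -l`: lines containing the substring "unsafe" that do not contain `//` anywhere.

```rust
/// Mirrors `grep unsafe | grep -v '//' | wc -l` for a single source string
/// (hypothetical helper): count lines that contain "unsafe" and no "//".
pub fn count_unsafe_lines(source: &str) -> usize {
    source
        .lines()
        .filter(|line| line.contains("unsafe") && !line.contains("//"))
        .count()
}

fn main() {
    let src = "\
unsafe { a() }
// unsafe mentioned in a comment
let x = 1;
unsafe { b() } // SAFETY: justified here
fn unsafe_helper() {}
";
    // Counted: line 1 and line 5 (an identifier that merely contains "unsafe").
    // Not counted: line 4, a real unsafe block with a trailing comment.
    assert_eq!(count_unsafe_lines(src), 2);
}
```

So the number is an approximation in both directions: identifiers and strings containing "unsafe" are counted, while unsafe blocks that carry a trailing `//` comment are dropped.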
I'm a founder of an early-stage startup. I built a precision-editing tool system (called HIC Mouse). It provides coordinate-based addressing, staged batching with atomic rollback, embedded agent guidance, and more. It works well, it's available on VS Code Marketplace, and I've worked for a year and am still grinding every day, working so hard, just to get people to think about trying it, and to get attention paid to it. I did rigorous, careful benchmark research to make sure I wasn't just fooling myself. I incorporated, built a sales pipeline, changed my life by taking a chance and launching a business, and I pound the pavement and toil in obscurity every day and night, trying so hard to get interest in my product. I check every diff painstakingly before committing. I may make tools for AI agents but I am unbelievably careful about reviewing and thoroughly testing their code, and usually rather ruthlessly editing quite a bit further beyond any initial version drafted, long before deciding it is good enough to ship. I take enormous pains to get things right and worry constantly about whether I'm doing enough to make HIC Mouse secure and performant for my users. All I want is to make my users happier and to give them a genuine way to get "surgical, precise edits" that "don't touch the other lines", like we all ask of our AI agents over and over all day if we're using AI.
Or maybe not. Here we have Bun. Who cares about 90K GitHub Stars and massive community engagement -- just go crap all over them, all at once, with this AI tripe that you obviously neither tested in any meaningful manner, nor documented, nor read, I am assuming, before merging the whole bloated mess to production. What a disgraceful way to treat your users! I would be so grateful if I had a tiny fraction of the interest in my project that the Bun team has. I could never imagine shipping this garbage in a million years.
I'm sorry to vent but this just isn't defensible. It's the very worst of AI. I'm not going to wish ill on Bun, but it just makes me sad that I spend so much effort, work so hard to do things right, and painstakingly review everything because it's not just me any more and I do have folks who depend on my code being reliable and secure. And meanwhile, Bun just gives a huge middle finger to 90k+ starred supporters not to mention the millions of users who didn't click on the star but rely on the library, by acting this disrespectfully and disgracefully towards their own users. How they didn't take one look at this and promptly revert and apologize is simply beyond me. Again, sorry to vent, but this made me irrationally mad.
So many people are fundamentally misunderstanding everything about this rewrite.
In fact using the word "rewrite" itself is pretty inaccurate.
As has been mentioned the goal was a port so they "could" eventually rewrite most of it to be idiomatic rust. The main benefit of this now is the compiler and being able to use these tools to fix issues that were already being hidden when it was in zig.
If you go into this codebase expecting to see idiomatic rust and get angry when it's not there, you are going in with the entirely incorrect attitude.
It's understandable how people see it as AI slop or whatever given the division among developers at the moment. But please see it for what it is instead of just jumping to conclusions.
> As has been mentioned the goal was a port so they "could" eventually rewrite most of it to be idiomatic rust.
They may have said that, but quite clearly the value they actually get out of it is getting the headline "AI reimplements complex, broadly used software in 2 weeks, but makes it way better because it's rust now" in front of a million people's eyes, only 1% of whom will ever find out it was mostly fluff
> quite clearly the value they actually get out of it is getting the headline
This is entirely disingenuous. Jarred has already made it clear what value they get out of moving off of Zig. Yes they used AI heavily to attempt this goal but I don't see what the big issue is. They haven't even released it yet and Anthropic themselves have said 0 about this.
The "headlines" thus far are really just people completely uninvolved with Bun and with all to gain by perpetrating "AI BAD".
My honest take: the big issue isn't "what if it goes wrong", it's the fear that a migration of this size works out of the box while being done almost entirely by AI.
Things get pretty hilarious when your super safe language contains the keyword "unsafe" :D
I wonder what are the real legitimate use-cases for "unsafe" in the first place, it is there for a reason?
Many. Let me give you a very boring example: I use mmap in some of my programs, because it's the easiest/best way to solve the problem I was working on. Mmap is unsafe in rust, because if there are modifications to the backing file it can violate some of rust's assumptions about the behavior of memory not changing unless it was changed by the rust code.
In my application I'm able to guarantee that there is no modification to the backing file by making them read-only and ensuring nobody messes with them, but that guarantee exists outside of rust. So -- unsafe with a big SAFETY comment explaining the requirements if you use it.
Much rust code will never use unsafe. Systems code is likely to use a bit but also to know what it's doing.
Things like this port of bun are unusual and presumably transitory on the way to an implementation with minimal use of unsafe.