>software that used to follow the "release-then-go-back-to-cave" model will have to change to start dealing with maintenance for real, or to just stop being proposed to the world as the ultimate-tool-for-this-and-that because every piece of software becomes a target.
Actually, some software is running the water-heater/heat-pump system in my basement. There is a small blue-lit screen; it keeps logs of consumed electricity/produced heat and can make small histograms. Of course there is a smart option to make it internet connected. It's the kind of functionality I’m glad is disabled by default and not required for it to operate. If possible, I’ll never upgrade it. Release-then-go-back-to-the-cave definitely has its place in many actual physical products in the world.
I’ll deal with enough WTF software security in my daily job during my career. Not having to worry about some appliance being turned into a brick because the company that produced it, or some script-kiddie-on-AI-steroids, decided it was desirable to do so means more time to do whatever else the cosmos allows us to explore.
There's an anecdote I remember reading somewhere: When an 'embedded systems' engineer was to present a web-based product they were tasked to build, the managers/reviewers were puzzled they couldn't find any bugs. Asked about this, the engineer replied: "I didn't know that was an option".
Definitely a different mindset/toolset is required when it comes to building systems that have to be working autonomously without "quick fixes" from the web.
Yes but I would push back a little on the idea that you simply put yourself in a "mindset of writing bug-free code."
Simpler code has fewer bugs. Embedded code tends to be simpler and more targeted in its role. Of course, putting yourself in the mindset of writing simpler code is great too - if you have the time to do so, and the problem you are solving is itself sufficiently simple.
Embedded code is also simpler because it has to be. When you are confined to a microcontroller, there isn't room for bloated app frameworks, hundreds of NPM packages, etc.
My gut feeling and expectation is that people will be turning their internet off at night, and at all times. At least for a while until this whole new security situation somehow settles with newly invented automation.
May sound weird, but as author of previous comment noted - a lot of appliances need not be connected ever and still benefit humanity.
> people will finally understand that security bugs are bugs, and that the only sane way to stay safe is to periodically update, without focusing on "CVE-xxx"
Linux devs keep making that point, but I really don't understand why they expect the world to embrace that thinking. You don't need to care about the vast majority of software defects in Linux, save for the once-in-a-decade filesystem corruption bug. In fact, there is an incentive not to upgrade when things are working, because it takes effort to familiarize yourself with new features, decide what should be enabled and what should be disabled, etc. And while the Linux kernel takes compatibility seriously, most distros do not and introduce compatibility-breaking changes with regularity. Binary compatibility is non-existent. Source compatibility is a crapshoot.
In contrast, you absolutely need to care about security bugs that allow people to run code on your system. So of course people want to treat security bugs differently from everything else and prioritize them.
I think part of it is that, especially at the kernel level, it can be hard to really categorise bugs into security or not-security (it has happened in the past that an exploit has used a bug that was not thought to be a security problem). There's good reason to want to avoid updates which add new features and such (because such changes can introduce more bugs), but linux has LTS releases which contain only bug fixes (regardless of security impact) for that situation, and in that case you can just stay up to date with very minimal risk of disruption.
And this is the best-case scenario. Because once updates become opt-out it simply becomes an attack vector of another type.
If the updated code is not open source, you are blindly trusting that some different kind of remote code execution didn't just happen without your knowing it.
As blind as my belief that Asia exists, because I haven't personally navigated there. Hell, I've used electricity (using it right now), but I couldn't do the experiments you need to do to get myself to an 1850s level of understanding of how it works, much less our current level.
I trust that Linux has a process. I do not believe it is perfect. But it gives me a better assurance than downloading random packages from PyPi (though I believe that the most recent release of any random package on PyPi is still more likely safe than not--it's just a numbers game).
I get what you are saying but as you said, if you are already under attack you can't trust your own computer, you just hope that you aren't downloading another exploit/bogus update. Real software I imagine is not so easy to pwn so completely but I don't know.
Details are important, but my mental model has settled as: Security bugs are being used in a manner similar to how politicians use "think of the children". It's used as an auto-win button. There are things that, to me, compete with them in priority (performance, functionality, friction, convenience, compatibility, etc.); security is one thing to weigh. In some cases, I am asking: "Why is this program or functionality an attack surface? Why can someone on the internet write to this system?"
Many times, there will be a system whose core purpose is to perform some numerical operations, display things in a UI, accept user input via buttons etc, and I'm thinking "This has a [mandatory? automatic? People are telling me I have to do this or my life will be negatively affected in some important way?] security update? There's a vulnerability?" I think: Someone really screwed up at a foundational requirements level!
>people will finally understand that security bugs are bugs, and that the only sane way to stay safe is to periodically update, without focusing on "CVE-xxx"
The problem is that the very same tools, I expect, are behind the supply chain attacks that seem to be particularly notorious recently. No matter where you turn, there's an edge to cut you on that one.
The last paragraph is interesting: "Overall I think we're going to see a much higher quality of software, ironically around the same level than before 2000 when the net became usable by everyone to download fixes. When the software had to be pressed to CDs or written to millions of floppies, it had to survive an amazing quantity of tests that are mostly neglected nowadays since updates are easy to distribute."
Was software made before 2000 better? And, if so, was it because of better testing or lower complexity?
I was a developer at Microsoft in the 90s (Visual Studio (Boston) and Windows teams). I won't claim that software back then was "better," but what is definitely true is that we had to think about everything at a much lower level.
For example, you had to know which Win32 functions caused ring-3 -> ring-0 transitions because those transitions could be incredibly costly. You couldn't just "find the right function" and move on. You had to find the right function that wouldn't bring your app (and entire system) to its knees.
I specifically remember hating my life whenever we ran into a KiUserExceptionDispatcher [0] issue, because even something as simple as an exception could kill your app's performance.
Additionally, we didn't get to just patch flaws as they arose. We either had to send out patches on floppy disks, post them to BBSs, or even send them to PC Magazine.
From the user perspective, Windows and Office certainly crashed more frequently back then. I don't mean that as a criticism of the Microsoft developers at the time: they did some great work within severe constraints. But overall the product quality is far better now.
I wouldn't take that as criticism; you are 100% correct. But that instability was a direct result of the issues I mentioned above: the ring transition protection/implementation was absolutely horrible; 3rd-party developers would discover a useful function in NTDLL and start using it in unintended ways, etc.
Do you remember the CSRSS Backspace Bug? [0]
A simple: printf("hung up\t\t\b\b\b\b\b\b"); from ring-3 would result in a BSOD. That was a pretty major embarrassment.
After retiring, I started volunteering my time to mentor CS students at two local universities. I work with juniors and seniors who have no idea what "heap memory" is because, for the most part, they don't need to know. For many developers, the web browser is the "operating system".
I absolutely love using Python because I don't have to worry about the details that were major issues back in the 90s. But, at the same time, when I run into an issue, I fully understand what the operating system is doing and can still debug it down to assembly if need be.
I can't imagine how much of a breath of fresh air Python / Java must have been if you were used to writing typical business CRUD apps (and server software) in C/C++ (with no sanitizers / modern tooling to speak of).
It wasn’t. Java was very different from its current state before roughly Java 5. It felt like a downgrade from C++ to me at the time. C++ had templates and RAII and smart pointers, all of which Java lacked (and in some respects still lacks today). Not having something like the C preprocessor was quite annoying. Java performance wasn’t great. Tooling was better in some ways, worse in others. Linters did exist in C/C++, as did debug versions of libraries. You could load a crash dump into a debugger and could often get a pretty good picture of what went wrong. While Java certainly became preferable for business code, it wasn’t a sudden breath of fresh air, it was trade-offs that gradually became more favorable to it over the years.
I used to joke that using something like Python or C# felt like "programming with oven mitts". I never felt like I had any control. But that eventually morphed into "Well, I don't need that control and can focus on other things."
I spent the last few months building a toy LLM from scratch. I can't believe that within my lifetime I've gone from using punch cards to arguing with Claude when it does something ridiculous.
It was the best of times, it was the worst of times.
Best/better because yes, QA actually existed and was important for many companies - QA could "stop ship" before the final master was pressed if they found something (hehe as it was usually games) "game breaking". If you search around on folklore or other historical sites you can find examples of this - and programmers working all night with the shipping manager hovering over them ready to grab the disk/disc and run to the warehouse.
HOWEVER, updates did exist - both because of bugs and features, and because programmers weren't perfect (or weren't spending space-shuttle levels of effort making "perfect code" - and even voyager can get updates iirc). Look at DooM for an example - released on BBS and there are various versions even then, and that's 1994 or so?
But it was the "worst" in that the frameworks and code were simply not as advanced as today - you had to know quite a bit about how everything worked, even as a simple CRUD developer. Lots of protections we take for granted (even in "lower level" languages like C) simply didn't exist. Security issues abounded, but people didn't care much because everything was local (who cares if you can r00t your own box) - and 2000 was where the Internet was really starting to take off and everything was beginning to be "online" and so issues were being found left and right.
This was the big thing. There were tons of bugs. Not really bugs but vulnerabilities. Nothing a normal user doing normal things would encounter, but subtle ways the program could be broken. But it didn't matter nearly as much, because every computer was an island, and most people didn't try to break their own computer. If something caused a crash, you just learned "don't do that."
Even so, we did have viruses that were spread by sharing floppy disks.
That's a really big part of it - bugs were ways that the program wouldn't do what the user wanted - and often workarounds existed (don't do that, it'll crash).
Nowadays those bugs still exist but a vast majority of bugs are security issues - things you have to fix because others will exploit them if you don't.
There are some rose-colored glasses when people say this.
Programs didn’t auto save and regularly crashed. It was extremely common to hear someone talk about losing hours of work. Computers regularly blue screened at random. Device drivers weren’t isolated from the kernel so you could easily buy a dongle or something that single-handedly destabilized your system. Viruses regularly brought the white-collar economy to its knees. Computer games that were just starting to come online and be collaborative didn’t do any validation of what the client sent it (this is true sometimes now, but it was the rule back then).
> Viruses regularly brought the white-collar economy to its knees.
Now, it's anti-virus (Crowdstrike) that does that. I don't think many or any virus or ransomware has ever had as big an impact at one time as Crowdstrike did. Maybe the ILOVEYOU worm.
Certainly depended on the software. But disks were slow back then, and a save would commonly block the entire UI. If your software produced big files you could wait for an inconvenient amount of time.
It's amazing that the world has largely forgotten the terror of losing entire documents forever. It happened to me. It happened to everyone. And this is the only comment I've seen so far here to even mention this.
Indeed, but it was pretty easy to develop the habit of hitting whatever function key was bound to "Save" fairly frequently. I certainly did.
Also auto-save is a mixed bag. With manual save, I was free to start editing a document and then realize I want to save it as something else, or just throw away my changes and start over. With auto-save, I've already modified my original. It took me quite a while to adjust to that.
Fun fact: I was on the Google Docs team from 2010-2015. Save didn't do anything but we still hooked up an impression to the keystroke to measure how often people tried to save. It was one of the top things people did in the app at first; it was comparable to how often people would bold and unbold text. And then as people gained confidence it went down over time.
AI tools have caused me to trip up a few times too when I fail to notice how many changes haven’t been checked into git, and then the tool obliterates some of its work and a struggle ensues to partially revert (there are ways, both in git and in AI temporary files etc). It’s user error but it is also a new kind of occasional mistake I have to adapt to avoid. As with when auto-save started to become universal.
At the time of release, yes. They had to ensure the software worked before printing CDs and floppies. Nowadays they release buggy versions that users essentially test for them.
Also in terms of security, there was generally a much smaller potential attack surface and those surfaces were harder to reach because we were much less constantly connected.
I wouldn't go that far. As soon as you went online all bets were off.
In the 90s we had java applets, then flash, browsers would open local html files and read/write from c:, people were used to exchanging .exe files all the time and they'd open them without scrutiny (or warnings) and so on. It was not a good time for security.
Then dial-up was so finicky that you could literally disconnect someone by sending them a ping packet. Then came winXP, and blaster and its variants and all hell broke loose. Pre SP2 you could install a fresh version of XP and have it pwned inside 10 minutes if it was connected to a network.
Servers weren't any better, ssh exploits were all over the place (even The Matrix featured a real ssh exploit) and so on...
The only difference was that "the scene" was more about the thrill, the boasting, and learning and less about making a buck out of it. You'd see "x was here" or "owned by xxx" in page "defaces", instead of encrypting everything and asking for a reward.
Software has gotten drastically more secure than it was in 2000. It's hard to comprehend how bad the security picture was in 2000. This very much, extremely includes Linux.
But there was much less awareness of buffer overflows and none of the countermeasures that are widespread today. It was almost defining of the Win95 era that applications (eg. Word) frequently crashed because of improper and unsafe memory management.
I remember opening a webpage and being hacked seemed more likely. Adobe Flash and Java had more vulnerabilities and weaker (if any) sandboxes than JavaScript.
Except that when you did connect Windows to anything it was hacked in less than 30 seconds (the user ignored the "apply these updates first, and then connect ..." advice, they wanted some keyboard driver. Hacked, whoops, gotta waste time doing a wipe and reinstall. This was back when many places had no firewalls). IRIX would fall over and die if you pointed a somewhat aggressive nmap at it, some buggy daemon listening by default on TCP/0, iirc. There was code in ISC DHCPD "windows is buggy, but we work around it with this here kluge..." and etc etc etc etc etc
Not just dhcpd. Besides the entire existence of Wine and Samba, Qemu has a workaround for win2k. Mkudffs has a workaround for MS-Windows not being able to read the filesystem without an mbr. Libc can work with local system time for those who dual-boot. Git can work around the difference in line endings. There are probably more of these kludges than you can shake a stick at.
The quantity of tests, known as penetration attempts, that most critical software survives today in a networked environment is magnitudes more daring than what the easily-cracked software printed on CDs ever faced. I really don't understand how this argument about software made 26 years ago stands on any reasonable ground.
It feels like rose-tinted glasses. While lots of low-hanging fruit had to be plucked to be shippable, there was still plenty of software which mandated specific hardware/software combinations or (worse) had major bugs which weren't patched but had workarounds documented in the manual, and if you weren't actively reading the manual, your newly-purchased software just wouldn't work (and if it was something low-level, that may mean you have to reinstall the OS).
Then there was stuff like rwall, which could be used to scrawl a message across basically every terminal connected to a networked Unix box in the world by accident [0][1], and it was far from the only insecure-by-design Unix software in widespread use.
It's interesting to watch youtubers like clabretro [2], NCommander [3], and Old Computers Sucked [4] who have documented the slog that was setting up and patching networking equipment, obscure Microsoft products, Netware, Unixes and Unix hardware, old Linux distros, etc. We take so much for granted these days. We don't even have to think about C/C++ standards compliance outside the occasional compiler bug, much less the myriad of mutually-incompatible POSIX implementations that helped Microsoft win the Unix wars.
The fact that you can just build a PC with no prior experience or IT knowledge after watching an hour-long youtube video rather than having to spend weeks researching hardware compatibility or futzing about with IRQ levels, recompiling kernels, and messing with autoexec.bat/config.sys is a testament to how far we have come. You don't even have to think about drivers anymore unless you have specialized equipment.
It is hard to say which of the 2 is the reason, more likely both, i.e. lower complexity enabled more exhaustive testing.
In any case some of the software from before 2000 was definitely better than today, i.e. it behaved like being absolutely foolproof, i.e. nothing that you could do could cause any crash or corrupted data or any other kind of unpredictable behavior.
However, the computers to which most people had access at that time had only single-threaded CPUs. Even if you used a preemptive multitasking operating system and a heavily multi-threaded application, executing it on a single-threaded CPU was unlikely to expose subtle bugs due to race conditions, that might have been exposed on a multi-core CPU.
While nowadays there exists no standard operating system that I fully trust to never fail in any circumstance, unlike before 2003, I wonder whether this is caused by a better quality of the older programs or by the fact that it is much harder to implement software concurrency correctly on systems with hardware parallelism.
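To make the race-condition point concrete, here is a minimal sketch (not taken from any program discussed here) that simulates two threads doing a non-atomic increment. The `increment` and `run` helpers are hypothetical names; `run` replays a chosen order of steps so the outcome is deterministic. On a single-threaded CPU the switch between the read and the write only happens if preemption lands exactly there, which is rare; with real hardware parallelism both reads can happen at the same moment.

```python
def increment(shared):
    """One simulated thread doing a non-atomic increment."""
    local = shared["counter"]        # step 1: read the shared value
    yield                            # a switch to the other thread may happen here
    shared["counter"] = local + 1    # step 2: write back the (possibly stale) value + 1


def run(schedule):
    """Replay `schedule`, a list of thread ids, one step per entry."""
    shared = {"counter": 0}
    threads = [increment(shared), increment(shared)]
    for tid in schedule:
        next(threads[tid], None)     # advance that thread by one step
    return shared["counter"]


serial = run([0, 0, 1, 1])       # each thread finishes before the other starts
interleaved = run([0, 1, 0, 1])  # both threads read before either writes
print(serial, interleaved)       # prints: 2 1
```

The second schedule loses an update even though each thread is individually correct, which is the kind of bug that can sit unnoticed for years when the unlucky interleaving almost never occurs.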
Not all software is made with the same quality, whatever the era.
It was possible to work with Ada as early as 1980 wherever a high guarantee of reliability was taken seriously, for example.
And not everyone is Knuth with a personal human secretary at a well-funded, world-class institution.
In 2000, Microsoft, which was already sitting on an insanely high mountain of resources, released Windows Millennium Edition. Ask your greybeard neighbour if you are too young to remember. Released commercially in 2000, it was the last MS-DOS-based Windows version and so represents the pinnacle of what Windows 9x was, before the big switch to the NT lineage.
As always, the largest advantage of the good old times is selective memory. After all, people who can remember know they survived the era, while the present and future never provide much certainty on that point.
I’ve been considering that this might be an outcome of AI-written software and it’s the one aspect of all this that I’m actually unequivocally happy about.
Most software written at companies is shit. It’s whatever garbage someone slapped together and barely got working, and then they had to move onto the next thing. We end up squashing a never ending list of bugs because in a time-limited world, new features come first.
But that only really applies when the cost of good software dwarfs that of barely-functioning software. And when the marginal cost of polishing something is barely longer than it took to write it in the first place? There’s no reason not to take a few passes, get all the bugs out, and polish things up. Right now, AI can (and will) write an absolutely exhaustive set of test cases that handles far more than a human would ever have the motivation to write. And it will get better.
If a company can ship quality software in essentially the same time as it can ship garbage, the incentives will change rapidly. At least I hope so.
It appeared better, because there were fewer features and more time to develop and test. But it's also a lot of nostalgia, because everything moved slower, the world was smaller, there was a lower standard; people will usually remember the later versions of a software, or never even encountered the earlier versions. Without the internet and every one bitching about every little detail, the general awareness was also different, not as toxic as today.
Just think of 8 and 16 bit video console games. Those cartridges were expensive so just how sure they had to be they were bug free before making millions of them?
Literally the moment everyone got on the internet, pretty much every computer program and operating system in the world was besieged by viruses and security flaws, so no.
Before 2000 fixing a bug the user would notice was expensive - you had to mail them a new disk/cd. As such there was a lot more effort put into testing software to ensure there were no bugs users would notice.
However before 2000 (really 1995) the internet was not a thing for most people. There were a few viruses around, but they had a really hard time propagating (they still managed, but compared to today it was much harder). Nobody worried about someone entering something too long in various fields - it did happen, but if you made your buffers "large" (say 100 bytes) most forms didn't have to worry about checking for overflow because nobody would type that much anyway. Note the assumption that a human was typing things on a keyboard into fields to create the buffer overflow. Thus a large portion of modern attacks weren't an issue - we are much better at checking buffer sizes now than then - they knew back then they should, but often got away with being lazy and not doing it. If a vulnerability exists but is never exploited, do you care? So whether today is better is debatable.
In the 1990s the US had encryption export laws; if you wanted to protect data, it was often impossible. Modern AES didn't even exist until 2001; instead we had DES (and, when you cared, triple DES, which was pretty good even by today's standards) - but you were not allowed to use it in a lot of places. I remember the company I worked for at the time developed their own encryption algorithm for export, with the marketing(!) saying something like "We think it is good, but it hasn't been examined nearly as well as DES so you should only use it if you legally can't use DES"
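As a rough illustration of the check that often got skipped, here is a minimal sketch. Python is memory-safe, so this only models what C code of the time had to do by hand; `FIELD_SIZE` and `store_field` are hypothetical names, and the 100-byte size is just the figure mentioned above.

```python
FIELD_SIZE = 100  # the "large" buffer from the comment; the value is illustrative


def store_field(buffer: bytearray, data: bytes) -> int:
    """Copy `data` into the fixed-size `buffer`, refusing oversized input."""
    if len(data) > len(buffer):
        # The step that was often omitted: never assume a human typed the input.
        raise ValueError(f"input is {len(data)} bytes, field holds {len(buffer)}")
    buffer[: len(data)] = data   # same-length slice, so the buffer never grows
    return len(data)


name = bytearray(FIELD_SIZE)
store_field(name, b"typed by a human")
try:
    store_field(name, b"A" * 5000)   # what a script, not a keyboard, would send
except ValueError as err:
    print("rejected:", err)
```

In C without the length check, the second call would write past the end of the field; here it is simply refused.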
As an end user though, software was generally better. They rarely had bugs anyone would notice. This came at the expense of a lot more testing, and features took longer to develop. Even back then it was a known trade off, and some software was known to be better than others because of the effort the company put into making it work before release. High risk software (medical) is still developed with a lot of extra testing and effort today.
As for the second part - software back then was plenty complex. Sure today things are more complex, but I don't think that is the issue. In fact in some ways things were more complex because extra effort was put into optimization (200 MHz CPUs were the top end expensive servers, most people only had around 90 MHz, and more than one core was something only nerds knew was possible and most of them didn't have it). As such a lot of effort was put into complex algorithms that were faster at the expense of being hard to maintain. Today we have better optimizers and faster CPUs so we don't write as much complex code trying to get performance.
Yeah I don't think that is true at all. Plenty of software today is very well tested, and plenty of software back then was pushed out with insufficient testing due to short deadlines (some probably caused by the fact that they had to press CDs).
It was a simpler time. Not better. Not worse. Programs still had bugs, but they weren't sloppy UI bugs, they were logic bugs and memory leaks. If software was better back then, we'd still be using it!
Yes. The incentives for writing reliable, robust code were much higher. The internet existed so you could, in theory, get a patch out for people to download - but a sizeable part of any user base might have limited access, so would require something physical shipped to them (a floppy or CD). Making sure that your code worked and worked well at time of shipping was important. Large corporate customers were not going to appreciate having to distribute an update across their tens of thousands of machines.
No. The world wasn't as connected as it is today, which meant that the attack surface to reasonably consider was much smaller. A lot of the issues that we had back then were due to designs and implementations that assumed a closed system overall - but often allowed very open interoperability between components (programs or machines) within the system. For example, Outlook was automatable, so that it could be part of larger systems and send mail in an automated way. This makes sense within an individual organisation's "system", but isn't wise at a global level. Email worms ran rampant until Microsoft was forced to reduce that functionality via patches, which were costly for their customers to apply. It damaged their reputation considerably.
An extreme version of this openness was SQL Slammer - a worm which attacked SQL Servers and development machines. Imagine that - enough organisations had their SQL Servers or developer machines directly accessible that an actual worm could thrive on a relational database system. Which is mindboggling to think about these days, but it really happened - see https://en.wikipedia.org/wiki/SQL_Slammer for details.
I wouldn't say that the evidence points to software being better in the way that we would think of "better" today. I'd say that the environment it had to exist in was simpler, and that the costs of shipping & updating were higher - so it made more sense to spend time creating robust software. Also nobody was thinking about the possible misuse or abuse of their software except in very limited ways. These days we have to protect against much more ingenious use & abuse of programs.
Furthermore today patching is quick and easy (by historical comparison), and a company might even be offering its own hosted solution, which makes the cost of patching very low for them. In such an environment it can seem more reasonable to focus on shipping features quickly over shipping robust code slowly. I'd argue that's a mistake, but a lot of software development managers disagree with me, and their pay packet often depends on that view, so they're not going to change their minds any time soon.
In a way this is best viewed as the third age of computing. The first was the mainframe age - centralised computer usage, with controlled access and oversight, so mistakes were costly but could be quickly recovered from. The second was the desktop PC age - distributed computer usage, with less access control, so mistakes were often less costly but recovering from them was potentially very expensive. The third is the cloud & device age, with a mix of centralised and distributed computer use, a mix of access control, and potentially much lower costs of recovery. In this third age if you make the wrong decisions on what to prioritise (robustness vs speed of shipping), it can be the worst of both the previous ages. But it doesn't have to be.
I hope that makes sense, and is a useful perspective for you.
Delta, JetBlue, American Airlines and Alaska Airlines have free Internet as long as you are enrolled (for free) in their loyalty programs.
JetBlue and Delta use ViaSat. I only fly Delta for the most part and ViaSat was available on all domestic routes I’ve flown except for the smaller A900 that I take from ATL to Southwest GA (50 minute flight). Then I use my free unlimited 1 hour access through T-Mobile with GoGo ground based service.
“Reversing was already mostly a speed-bump even for entry-level teams, who lift binaries into IR or decompile them all the way back to source. Agents can do this too, but they can also reason directly from assembly. If you want a problem better suited to LLMs than bug hunting, program translation is a good place to start.”
Huh. Direct debugging, in assembly. At that point, why not jump down to machine code?
For the purposes of debugging, assembly is machine code, just with some nice constructs to make it easier to read. Transpiling between assembly and machine code is mostly a find-and-replace exercise, not like the advanced reasoning involved in proper compilation.
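A minimal sketch of that "find-and-replace" view, using a handful of single-byte x86 instructions. The table is deliberately tiny, and `assemble` and `disassemble` are hypothetical helper names, not any real tool's API.

```python
# A few single-byte x86 instructions and their opcodes.
OPCODES = {"nop": 0x90, "ret": 0xC3, "int3": 0xCC, "hlt": 0xF4}
MNEMONICS = {byte: name for name, byte in OPCODES.items()}


def assemble(lines):
    """Turn mnemonics into machine code by plain table lookup."""
    return bytes(OPCODES[line.strip()] for line in lines)


def disassemble(code):
    """Turn machine code back into mnemonics with the inverse table."""
    return [MNEMONICS[byte] for byte in code]


machine_code = assemble(["nop", "nop", "ret"])
print(machine_code.hex())          # prints: 9090c3
print(disassemble(machine_code))   # prints: ['nop', 'nop', 'ret']
```

Real assemblers also handle operands, labels and multi-byte encodings, but the mapping stays mechanical in a way that compiling a high-level language is not.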
On x86/x64/variable instruction length architectures this isn't always the case. You can jump in middle of an instruction to get a different instruction. It can be used to obfuscate code.
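A small sketch of that effect: the same five bytes decode to different instructions depending on where decoding starts. The `decode` function is a hypothetical toy that only understands three opcodes (`B8` = `mov eax, imm32`, `90` = `nop`, `C3` = `ret`); a real disassembler covers the whole instruction set.

```python
def decode(code: bytes, start: int = 0):
    """Linearly decode `code` from offset `start` (toy, three opcodes only)."""
    out = []
    i = start
    while i < len(code):
        op = code[i]
        if op == 0x90:
            out.append("nop")
            i += 1
        elif op == 0xC3:
            out.append("ret")
            i += 1
        elif op == 0xB8 and i + 5 <= len(code):
            # mov eax, imm32: one opcode byte plus a 4-byte little-endian immediate
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov eax, {imm:#010x}")
            i += 5
        else:
            out.append(f"db {op:#04x}")  # unknown byte, emitted as raw data
            i += 1
    return out


CODE = bytes([0xB8, 0x90, 0x90, 0x90, 0xC3])
print(decode(CODE, 0))  # prints: ['mov eax, 0xc3909090']
print(decode(CODE, 1))  # prints: ['nop', 'nop', 'nop', 'ret']
```

Jumping to offset 1 lands inside the immediate of the `mov`, so the bytes that looked like data run as instructions, which is why a one-to-one listing can mislead on variable-length instruction sets.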
Decompiled assembly is basically machine code; without recreating the macros that make assembly "high level" you're as close to machine code as you're going to get unless you're trying to exploit the CPU itself.
There's no way the AI is a priori understanding codebases with millions of LoC now. We've tried that already, it failed. What it is doing now is setting up its own extremely powerful test harnesses and getting the information and testing it efficiently.
Sure, its semantic search is already strong, but the real lesson that we've learned from 2025 is that tooling is way more powerful.
That's cool! I've always wanted to learn how kernel devs properly test stuff reliably but it seemed hard. As someone who's dabbled in kernel dev for his job. Like real variable hardware, and not just manual testing shit.
Honestly, AI has only helped me become a better SWE because no one else has the time or patience to teach me.
Yeah, maybe you are right. But is doing math and reasoning about Turing machines a priori? If so, then it seems plausible to me that reasoning about a codebase (without running it) is also ‘a priori’.
> I don't know how long this pace will last. I suspect that bugs are reported faster than they are written, so we could in fact be purging a long backlog
Hopefully these same tools will also help catch security bugs at the point they're written. Maybe one day we'll reach a point where the discovery of new, live vulnerabilities is extremely rare?
Around 70% of security vulnerabilities are about memory safety and only exist because software is written in C and C++. Because most vulnerabilities are in newly written code, Google has found that simply starting writing new code in Rust (rather than trying to rewrite existing codebases) quickly brings the number of found vulnerabilities down drastically.
It comes from all his reporters being teenagers in developing countries with older models, and people using SOTA models who know how to qualify a potential vulnerability having much bigger fish to fry than curl. curl is a meaningful target, but it's in nobody's top tier.
You can't just write Rust in a part of the codebase that's all C/C++. Tools for checking the newly written C/C++ code for issues will still be valuable for a very long time.
You actually can? A Rust-written function that exports a C ABI and calls C ABI functions interops just fine with C. Of course that's all unsafe (unless you're doing pure value-based programming and not calling any foreign code), so you don't get much of a safety gain at the single-function level.
No, this is false. For Rust codebases that aren't doing high-performance data structures, C interop, or bare-metal stuff, it's typical to write no unsafe code at all. I'm not sure who told you otherwise, but they have no idea what they're talking about.
It's the classic "misunderstanding" that UB or buggy unsafe code could in theory corrupt any part of your running application (which is technically true), and interpreting this to mean that any codebase with at least one instance of UB / buggy unsafe code (which is ~100% of codebases) is safety-wise equivalent to a codebase with zero safety check - as all the safety checks are obviously complete lies and therefore pointless time-wasters.
Which obviously isn't how it works in practice, just like how C doesn't delete all the files on your computer when your program contains any form of signed integer overflow, even though it technically could as that is totally allowed according to the language spec.
If you're talking about Rust codebases, I'm pretty sure that writing sound unsafe code is at least feasible. It's not easy, and it should be avoided if at all possible, but saying that 100% of those codebases are unsound is pessimistic.
One feasible approach is to use "storytelling" as described here: https://www.ralfj.de/blog/2026/03/13/inline-asm.html That's talking about inline assembly, but in principle any other unsafe feature could be similarly modeled.
It's not impossible, it is just highly unlikely that you'll never write a single safety-related bug - especially in nontrivial applications and in mixed C-plus-Rust codebases. For every single bug-free codebase there will be thousands containing undiscovered subtle-but-usually-harmless bugs.
After all, if humans were able to routinely write bug-free code, why even worry about unsoundness and UB in C? Surely having developers write safe C code would be easier than trying to get a massive ecosystem to adopt a completely new and not exactly trivial programming language?
Rust is not really "completely new" for a good C/C++ coder, it just cleans up the syntax a bit (for easier machine-parsing) and focuses on enforcing the guidelines you need to write safe code. This actually explains much of its success. The fact that this also makes it a nice enough high-level language for the Python/Ruby/JavaScript etc. crowd is a bit of a happy accident, not something that's inherent to it.
Good developers only write unsafe rust when there is good reason to. There are a lot of bad developers that add unsafe anytime they don't understand a Rust error, and then don't take it out when that doesn't fix the problem (hopefully just a minority, but I've seen it).
This is "the bomber will always get through" mentality for the modern era. You will invent air defences. You will write fewer bugs. You will leave code that doesn't have bugs alone, so it gains no more bugs. You will build software that finds bugs as easily as you think "enemies" find bugs, and you'll run it before you release your code.
What's the saying? Given many eyes, all bugs are shallow? Well, here are some more eyes.
I'd be very curious to know what class of vulnerability these tend to be (buffer overrun, use after free, misset execute permissions?), and if, armed with that knowledge, a deterministic tool could reliably find or prevent all such vulnerabilities. Can linters find these? Perhaps fuzzing? If code was written in a more modern language, is it still likely that these bugs would have happened?
That's what syzbot / syzkaller does, as mentioned in the article, with somewhat similar results to the AI-fuzzing that they've been experiencing recently.
The issue that Linux maintainers have in general is that there are so many of these "strict correctness and safety" bugs in the Linux codebase that they can't fix them all at once, and they have no good mechanism to triage "which of these bugs is accessible to create an exploit."
This is also the argument by which most of their bugs become CVEs; in lieu of the capability to determine whether a correctness bug is reachable by an attacker, any bug could be an exploit, and their stance is that it's too much work to decide which is which.
Academically, syzkaller is just a very well orchestrated fuzzer, producing random pathological inputs to system calls, detecting crashes, and then producing reproductions. Syzkaller doesn't "know" what it's found, and a substantial fraction of what it finds are "just" crashers that won't ever be weaponizable.
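For anyone who hasn't seen a fuzzer loop, here is a rough Python sketch of the "random inputs, detect crashes, keep a reproduction" cycle described above. It has nothing to do with syzkaller's actual implementation: `toy_syscall` is a hypothetical target with a planted bug, and `fuzz` is an assumed helper name. Note that the loop only records that something crashed; it has no idea whether the crash is exploitable.

```python
import random


def toy_syscall(data: bytes) -> int:
    """Hypothetical target with a planted bug: it trusts a length byte."""
    if len(data) < 2:
        return 0
    length = data[0]
    payload = data[1:]
    # Bug: indexes payload with an attacker-controlled length, unchecked.
    return payload[length - 1] if length else 0


def fuzz(target, iterations: int, seed: int = 0) -> dict[str, bytes]:
    """Throw random inputs at `target`; keep the shortest crasher per signature."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    crashes: dict[str, bytes] = {}
    for _ in range(iterations):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except Exception as exc:
            # The "signature" here is just the exception type; a real fuzzer
            # would use something richer, such as a stack trace.
            signature = type(exc).__name__
            best = crashes.get(signature)
            if best is None or len(data) < len(best):
                crashes[signature] = data
    return crashes


if __name__ == "__main__":
    for signature, reproducer in fuzz(toy_syscall, 500).items():
        print(signature, reproducer.hex())
```

The stored reproducers are the interesting by-product: they are exactly the kind of corpus that a later analysis step (human or agent) can pick through.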
An LLM agent finding vulnerabilities is an implicit search process over a corpus of inferred vulnerability patterns and inferred program structure. It's stochastic static program analysis (until you have the agent start testing). It's generating (and potentially verifying) hypotheses about actual vulnerabilities in the code.
That distinction is mostly academic. The bigger deal is: syzkaller crashes are part of the corpora of inputs agents will use to verify hypotheses about how to exploit Linux. It's an open secret that there are significant vulnerabilities encoded in the (mostly public!) corpus of syzbot crash reproductions; nobody has time to fish them out. But agents do, and have the added advantage of being able to quickly place a crash reproduction in the inferred context of kernel internals.
Yes, once we reach the broader conversation (I actually didn't initially grasp that the OP post was a sub-article under another one on LWN which then linked out to yet another article called "Vulnerability Research is Cooked"), I completely agree.
Modern LLMs are _exceptionally_ good at developing X-marks-the-spot vulnerabilities into working software; I fed an old RSA validation mistake in an ECU to someone in a GitHub comment the other day and they had Claude build them a working firmware reflashing tool within a matter of hours.
I think that the market for "using LLMs to triage bug-report inputs by asking it to produce working PoCs" is incredibly under-leveraged so far and if I were more entrepreneurial-minded at this juncture I would even consider a company in this space. I'm a little surprised that both this article and most of the discussion under it haven't gone in that direction yet.
Interesting that it's been higher than forecast since 2023. Personally I'd expect that trend to continue given that LLMs both increase bugs written as well as bugs discovered.
Why don't we just pagerank github contributors? Merged PRs approved by other quality contributors improves rank. New PRs tagged by a bot with the rank of the submitter. Add more scoring features (account age? employer?) as desired.
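A minimal sketch of what that could look like, assuming the only signal is "approver approved a merged PR by author". The contributor names and the `pagerank` helper are made up for illustration, and this is plain power-iteration PageRank with the usual 0.85 damping, not anything GitHub provides.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (approver, author) endorsement edges."""
    nodes = sorted({name for edge in edges for name in edge})
    outgoing = {name: [] for name in nodes}
    for approver, author in edges:
        outgoing[approver].append(author)

    rank = {name: 1.0 / len(nodes) for name in nodes}
    for _ in range(iterations):
        # Everyone gets the baseline "teleport" share first.
        new_rank = {name: (1.0 - damping) / len(nodes) for name in nodes}
        for name in nodes:
            # A contributor who approved nothing spreads rank evenly.
            targets = outgoing[name] or nodes
            share = damping * rank[name] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical approval history: (approver, author of the merged PR).
approvals = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("dave", "bob"),
]

if __name__ == "__main__":
    for name, score in sorted(pagerank(approvals).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.3f}")
```

In this toy graph carol ends up on top because two contributors approved her work, and dave ends up at the bottom because nobody has approved his. Extra features like account age would have to be layered on separately.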
It's interesting to hear from people directly in the thick of it that these bug reports are apparently gaining value and are no longer just slop. Maybe there is hope for a world where AI helps create bug free software and doesn't just overload maintainers.
I wish they wouldn’t call it “AI slop” before acknowledging that most of the bugs are correct.
Let’s bring a bit of nuance between mindless drivel (e.g. LinkedIn influencing posts, spammed issues that are LLMs making mistakes) vs using LLMs to find/build useful things.
I think they are saying what you want them to say. In the past they got a bunch of AI slop and now they are getting a lot of legit bug reports. The implication being that the AI got better at finding (and writing reports of) real bugs.
It can be correct and slop at the same time. The reporter could have reported it in a way that makes it clear a human reviewed and cared about the report.
Slop is a function of how the information is presented and how the tools are used. People don't care if you use LLMs if they can't tell you used them; they care when you send them a bunch of bullshit with 5% of value buried inside it.
If you're reading something and you can tell an LLM wrote it, you should be upset. It means the author doesn't give a fuck.
No it can't. These aren't "Show HN" posts about new programs people have conjured with Claude. They're either vulnerabilities or they're not. There's no such thing as a "slop vulnerability". The people who exploit those vulnerabilities do not care how much earlier reporters "gave a fuck" about their report.
This is in the linked story: they're seeing increased numbers of duplicate findings, meaning, whatever valid bugs showboating LLM-enabled Good Samaritans are finding, quiet LLM-enabled attackers are also finding.
People doing software security are going to need to get over the LLM agent snootiness real quick. Everyone else can keep being snooty! But not here.
Everyone is free to be as snooty as they like. If a report is harder to read/understand/validate because the author just yolo'ed it with an LLM, that's on the report author, not on the maintainers.
It's not okay to foist work onto other people because you don't think LLM slop is a problem. It is absolutely a problem, and no amount of apologizing and pontificating is going to change that.
Grow up and own your work. Stop making excuses for other people. Help make the world better, not worse. It's obvious that LLMs can be useful for this purpose, so people should use them well and make the reports useful. Period.
Try to make this sentiment coherent. "It's not OK to foist work onto other people". Ok, sure, I won't. The vulnerability still exists. The maintainers just don't get to know about it. I do, I guess. But not them: telling them would "make the world worse".
Those aren't vulnerabilities. You're missing the point.
Nobody is saying there's no such thing as a slop report. Not only are there, but slop vulnerability reports as a time-consuming annoying phenomenon predate LLM chatbots by almost a decade. There's a whole cottage industry that deals with them.
If I read the sentence correctly they're saying that past reports were AI slop, but the state of the art has advanced and that current reports are valid. This matches trends I've seen on the projects I work on.
An AI enthusiast having a breathless and predictive position on the future of the technology? No way! It's almost like Wall Street is about to sour on the whole stack and there is a concerted effort to artificially push these views into the conversation to get people on board.
Then again, I'm a known crank and aggressive cynic, but you never really see any gathered data backing these points up.
Anyone who says anything good about AI must be an AI shill from the start, not someone who is genuinely observing reality or had their mind changed, don't you know?
Sort of a tautology to just assert that someone saying good things about AI is an AI enthusiast and therefore their opinion should be dismissed. He also happens to have been a kernel maintainer, his experience as he's describing it should count for something.
"On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us."
Is there a reason you’ve copy pasted the first paragraph from the link? It doesn’t add anything to the discussion, and also doesn’t help as a tl;dr because it’s literally the first paragraph. Genuine question!
The actual title is pretty unclear ("Significant Raise of Reports" of what?), so I considered replacing it by some of this excerpt, but HN rules say not to editorialize titles. Hence I put it into the `text` field, which I thought would be the body, but actually just gets posted as a comment.
Reports being written faster than bugs being created? Better quality software than before the 2000s?
Oh my sweet summer child.
This is some seriously delusional cope from someone who drank the entire jug of kool-aid.
I’d love to be proven wrong but the current trajectory is pretty plain as day from current outcomes. Everything is getting worse, and everyone is getting overwhelmed and we are under attack even more and the attacks are getting substantially more sophisticated and the blast radius is much bigger.
126 comments:
>software that used to follow the "release-then-go-back-to-cave" model will have to change to start dealing with maintenance for real, or to just stop being proposed to the world as the ultimate-tool-for-this-and-that because every piece of software becomes a target.
Actually, some software are running the water-heater/heat-pump system in my basement. There is a small blue light screen, it keeps logs of consumed electricity/produced heat and can make small histograms. Of course there is a smart option to make it internet connected. The kind of functionality I’m glad it’s disabled by default and not enforced to be able to operate. If possible, I’ll never upgrade it. Release then go back to the cave has definitely its place in many actual physical product in the world.
I’ll deal with enough WTF software security in my daily job during my career. Sparing some cognitive load of whatever appliance being turned into a brick because the company that produced it or some script-kiddy-on-ai-steroid decided it was desirable to do so, that’s more time to do whatever other thing cosmos allows to explore.
There's an anecdote I remember reading somewhere: When an 'embedded systems' engineer was to present a web-based product they were tasked to build, the managers/reviewers were puzzled they couldn't find any bugs. Asked about this, the engineer replied: "I didn't know that was an option".
Definitely a different mindset/toolset is required when it comes to building systems that have to be working autonomously without "quick fixes" from the web.
Web interfaces in embedded systems are very common remote exploit mechanisms, so this anecdote for sure isn't the typical experience.
Yes but I would push back a little on the idea that you simply put yourself in a "mindset of writing bug-free code."
Simpler code has fewer bugs. Embedded code tends to be simpler and more targeted in its role. Of course, putting yourself in the mindset of writing simpler code is great too - if you have the time to do so, and the problem you are solving is itself sufficiently simple.
Embedded code is also simpler because it has to be. When you are confined to a microcontroller, there isn't room for bloated app frameworks, hundreds of NPM packages, etc.
My gut feeling and expectation is that people will be turning their internet off at night, and at all times. At least for a while until this whole new security situation somehow settles with newly invented automation.
May sound weird, but as author of previous comment noted - a lot of appliances need not be connected ever and still benefit humanity.
> people will finally understand that security bugs are bugs, and that the only sane way to stay safe is to periodically update, without focusing on "CVE-xxx"
Linux devs keep making that point, but I really don't understand why they expect the world to embrace that thinking. You don't need to care about the vast majority of software defects in Linux, save for the once-in-a-decade filesystem corruption bug. In fact, there is an incentive not to upgrade when things are working, because it takes effort to familiarize yourself with new features, decide what should be enabled and what should be disabled, etc. And while the Linux kernel takes compatibility seriously, most distros do not and introduce compatibility-breaking changes with regularity. Binary compatibility is non-existent. Source compatibility is a crapshoot.
In contrast, you absolutely need to care about security bugs that allow people to run code on your system. So of course people want to treat security bugs differently from everything else and prioritize them.
I think part of it is that, especially at the kernel level, it can be hard to really categorise bugs into security or not-security (it has happened in the past that an exploit has used a bug that was not thought to be a security problem). There's good reason to want to avoid updates which add new features and such (because such changes can introduce more bugs), but linux has LTS releases which contain only bug fixes (regardless of security impact) for that situation, and in that case you can just stay up to date with very minimal risk of disruption.
And this is the best-case scenario. Because once updates become opt-out it simply becomes an attack vector of another type.
If the updated code is not open source, you are blindly trusting that some other kind of remote code execution didn't just happen without you knowing it.
If you don't personally review every line then you are already trusting blindly.
As blind as my belief that Asia exists, because I haven't personally navigated there. Hell, I've used electricity (using it right now), but I couldn't do the experiments you need to do to get myself to an 1850s level of understanding of how it works, much less our current level.
I trust that Linux has a process. I do not believe it is perfect. But it gives me a better assurance than downloading random packages from PyPi (though I believe that the most recent release of any random package on PyPi is still more likely safe than not--it's just a numbers game).
I get what you are saying but as you said, if you are already under attack you can't trust your own computer, you just hope that you aren't downloading another exploit/bogus update. Real software I imagine is not so easy to pwn so completely but I don't know.
And if you're the kind of person who cares about that, you pay a vendor that gives you 10 years on the same distro version.
Or just use an off-brand RHEL I guess.
Details are important, but my mental model has settled as: Security bugs are being used in a manner similar to how politicians use "think of the children". It's used as an auto-win button. There are things that, to me, compete with them in priority (performance, functionality, friction, convenience, compatibility, etc.); it's one thing to weigh. In some cases, I am asking: "Why is this program or functionality an attack surface? Why can someone on the internet write to this system?"
Many times, there will be a system whose core purpose is to perform some numerical operations, display things in a UI, accept user input via buttons etc, and I'm thinking "This has a [mandatory? automatic? People are telling me I have to do this or my life will be negatively affected in some important way?] security update? There's a vulnerability?" I think: Someone really screwed up at a foundational requirements level!
Yeah that attitude really makes no sense, and I don't see why AI finding security bugs would make people "finally understand".
I suspect it's just an excuse for Linux's generally poor security track record.
Everything has a poor security track record. That's the point.
Well, except OpenBSD. They’ve only had two vulns in forever.
Only two remote code execution vulnerabilities in the default configuration. But that's not the only type of security bug.
They're trolling me. :)
You mean "in the default install, in a heck of a long time". :)
>people will finally understand that security bugs are bugs, and that the only sane way to stay safe is to periodically update, without focusing on "CVE-xxx"
The problem is that the very same tools, I expect, are behind the supply chain attacks that seem to be particularly notorious recently. No matter where you turn, there's an edge to cut you on that one.
The last paragraph is interesting: "Overall I think we're going to see a much higher quality of software, ironically around the same level than before 2000 when the net became usable by everyone to download fixes. When the software had to be pressed to CDs or written to millions of floppies, it had to survive an amazing quantity of tests that are mostly neglected nowadays since updates are easy to distribute."
Was software made before 2000 better? And, if so, was it because of better testing or lower complexity?
I was a developer at Microsoft in the 90s (Visual Studio (Boston) and Windows teams). I won't claim that software back then was "better," but what is definitely true is that we had to think about everything at a much lower level.
For example, you had to know which Win32 functions caused ring-3 -> ring-0 transitions because those transitions could be incredibly costly. You couldn't just "find the right function" and move on. You had to find the right function that wouldn't bring your app (and entire system) to its knees.
I specifically remember hating my life whenever we ran into a KiUserExceptionDispatcher [0] issue, because even something as simple as an exception could kill your app's performance.
Additionally, we didn't get to just patch flaws as they arose. We either had to send out patches on floppy disks, post them to BBSs, or even send them to PC Magazine.
[0]: https://doar-e.github.io/blog/2013/10/12/having-a-look-at-th...
From the user perspective, Windows and Office certainly crashed more frequently back then. I don't mean that as a criticism of the Microsoft developers at the time: they did some great work within severe constraints. But overall the product quality is far better now.
I wouldn't take that as criticism; you are 100% correct. But that instability was a direct result of the issues I mentioned above: the ring transition protection/implementation was absolutely horrible; 3rd-party developers would discover a useful function in NTDLL and start using it in unintended ways, etc.
Do you remember the CSRSS Backspace Bug? [0]
A simple: printf("hung up\t\t\b\b\b\b\b\b"); from ring-3 would result in a BSOD. That was a pretty major embarrassment.
After retiring, I started volunteering my time to mentor CS students at two local universities. I work with juniors and seniors who have no idea what "heap memory" is because, for the most part, they don't need to know. For many developers, the web browser is the "operating system".
I absolutely love using Python because I don't have to worry about the details that were major issues back in the 90s. But, at the same time, when I run into an issue, I fully understand what the operating system is doing and can still debug it down to assembly if need be.
[0]: https://jdebp.uk/FGA/csrss-backspace-bug.html
I can't imagine how much of a breath of fresh air Python / Java must have been if you were used to writing typical business CRUD apps (and server software) in C/C++ (with no sanitizers / modern tooling to speak of).
It wasn’t. Java was very different from its current state before roughly Java 5. It felt like a downgrade from C++ to me at the time. C++ had templates and RAII and smart pointers, all of which Java lacked (and in some respects still lacks today). Not having something like the C preprocessor was quite annoying. Java performance wasn’t great. Tooling was better in some ways, worse in others. Linters did exist in C/C++, as did debug versions of libraries. You could load a crash dump into a debugger and could often get a pretty good picture of what went wrong. While Java certainly became preferable for business code, it wasn’t a sudden breath of fresh air, it was trade-offs that gradually became more favorable to it over the years.
I used to joke that using something like Python or C# felt like "programming with oven mitts". I never felt like I had any control. But that eventually morphed into "Well, I don't need that control and can focus on other things."
I spent the last few months building a toy LLM from scratch. I can't believe that within my lifetime I've gone from using punch cards to arguing with Claude when it does something ridiculous.
It was the best of times, it was the worst of times.
Best/better because yes, QA actually existed and was important for many companies - QA could "stop ship" before the final master was pressed if they found something (hehe as it was usually games) "game breaking". If you search around on folklore or other historical sites you can find examples of this - and programmers working all night with the shipping manager hovering over them ready to grab the disk/disc and run to the warehouse.
HOWEVER, updates did exist - both because of bugs and features, and because programmers weren't perfect (or weren't spending space-shuttle levels of effort making "perfect code" - and even voyager can get updates iirc). Look at DooM for an example - released on BBS and there are various versions even then, and that's 1994 or so?
But it was the "worst" in that the frameworks and code were simply not as advanced as today - you had to know quite a bit about how everything worked, even as a simple CRUD developer. Lots of protections we take for granted (even in "lower level" languages like C) simply didn't exist. Security issues abounded, but people didn't care much because everything was local (who cares if you can r00t your own box) - and 2000 was where the Internet was really starting to take off and everything was beginning to be "online" and so issues were being found left and right.
"everything was local"
This was the big thing. There were tons of bugs. Not really bugs but vulnerabilities. Nothing a normal user doing normal things would encounter, but subtle ways the program could be broken. But it didn't matter nearly as much, because every computer was an island, and most people didn't try to break their own computer. If something caused a crash, you just learned "don't do that."
Even so, we did have viruses that were spread by sharing floppy disks.
That's a really big part of it - bugs were ways that the program wouldn't do what the user wanted - and often workarounds existed (don't do that, it'll crash).
Nowadays those bugs still exist but a vast majority of bugs are security issues - things you have to fix because others will exploit them if you don't.
There are some rose-colored glasses when people say this.
Programs didn’t auto save and regularly crashed. It was extremely common to hear someone talk about losing hours of work. Computers regularly blue screened at random. Device drivers weren’t isolated from the kernel so you could easily buy a dongle or something that single-handedly destabilized your system. Viruses regularly brought the white-collar economy to its knees. Computer games that were just starting to come online and be collaborative didn’t do any validation of what the client sent it (this is true sometimes now, but it was the rule back then).
> Viruses regularly brought the white-collar economy to its knees.
Now, it's anti-virus (Crowdstrike) that does that. I don't think many or any virus or ransomware has ever had as big an impact at one time as Crowdstrike did. Maybe the ILOVEYOU worm.
Saving also often took a long time, so people didn't do it very often.
Certainly depended on the software. But disks were slow back then, and a save would commonly block the entire UI. If your software produced big files you could wait for an inconvenient amount of time.
It's amazing that the world has largely forgotten the terror of losing entire documents forever. It happened to me. It happened to everyone. And this is the only comment I've seen so far here to even mention this.
Bad old days indeed!
Indeed, but it was pretty easy to develop the habit of hitting whatever function key was bound to "Save" fairly frequently. I certainly did.
Also auto-save is a mixed bag. With manual save, I was free to start editing a document and then realize I want to save it as something else, or just throw away my changes and start over. With auto-save, I've already modified my original. It took me quite a while to adjust to that.
If your program's auto-save works like that, it's broken.
Almost none do, though. Auto-save almost always writes to a temporary file, that is erased when you save manually.
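A rough sketch of that temp-file pattern in Python, for illustration only. The `Document` class and the `.autosave` suffix are assumptions, not how any particular editor does it: auto-save writes to a side file and never touches the original, and a manual save replaces the original and then discards the side file.

```python
import os
import tempfile


class Document:
    """Hypothetical editor buffer with side-file auto-save."""

    def __init__(self, path: str):
        self.path = path
        self.autosave_path = path + ".autosave"  # assumed naming scheme
        self.text = ""

    def autosave(self) -> None:
        # Recovery copy only: the user's original file is left untouched.
        with open(self.autosave_path, "w", encoding="utf-8") as f:
            f.write(self.text)

    def save(self) -> None:
        # Write to a temp file in the same directory, then swap it in with
        # os.replace so readers never see a half-written document.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(self.text)
        os.replace(tmp_path, self.path)
        # The recovery copy is no longer needed after an explicit save.
        if os.path.exists(self.autosave_path):
            os.remove(self.autosave_path)
```

With this layout, "throw away my changes" is just deleting the `.autosave` file, which is the behaviour the manual-save crowd is asking for.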
Google Docs and VS Code are the first two that come to mind for autosave, and they don't use a temp file.
Fun fact: I was on the Google Docs team from 2010-2015. Save didn't do anything but we still hooked up an impression to the keystroke to measure how often people tried to save. It was one of the top things people did in the app at first; it was comparable to how often people would bold and unbold text. And then as people gained confidence it went down over time.
I still occasionally make that auto-save mistake.
AI tools have caused me to trip up a few times too when I fail to notice how many changes haven’t been checked into git, and then the tool obliterates some of its work and a struggle ensues to partially revert (there are ways, both in git and in AI temporary files etc). It’s user error but it is also a new kind of occasional mistake I have to adapt to avoid. As with when auto-save started to become universal.
> Was software made before 2000 better?
At the time of release, yes. They had to ensure the software worked before printing CDs and floppies. Nowadays they release buggy versions that users essentially test for them.
Also in terms of security, there was generally a much smaller potential attack surface and those surfaces were harder to reach because we were much less constantly connected.
> in terms of security
I wouldn't go that far. As soon as you went online all bets were off.
In the 90s we had java applets, then flash, browsers would open local html files and read/write from c:, people were used to exchanging .exe files all the time and they'd open them without scrutiny (or warnings) and so on. It was not a good time for security.
Then dial-up was so finicky that you could literally disconnect someone by sending them a ping packet. Then came winXP, and blaster and its variants and all hell broke loose. Pre SP2 you could install a fresh version of XP and have it pwned inside 10 minutes if it was connected to a network.
Servers weren't any better, ssh exploits were all over the place (even The Matrix featured a real ssh exploit) and so on...
The only difference was that "the scene" was more about the thrill, the boasting, and learning and less about making a buck out of it. You'd see "x was here" or "owned by xxx" in page "defaces", instead of encrypting everything and asking for a reward.
Software has gotten drastically more secure than it was in 2000. It's hard to comprehend how bad the security picture was in 2000. This very much, extremely includes Linux.
But there was much less awareness of buffer overflows and none of the countermeasures that are widespread today. It was almost defining of the Win95 era that applications (eg. Word) frequently crashed because of improper and unsafe memory management.
I remember opening a webpage and being hacked seemed more likely. Adobe Flash and Java had more vulnerabilities and weaker (if any) sandboxes than JavaScript.
Except that when you did connect Windows to anything it was hacked in less than 30 seconds (the user ignored the "apply these updates first, and then connect ..." advice, they wanted some keyboard driver. Hacked, whoops, gotta waste time doing a wipe and reinstall. This was back when many places had no firewalls). IRIX would fall over and die if you pointed a somewhat aggressive nmap at it, some buggy daemon listening by default on TCP/0, iirc. There was code in ISC DHCPD "windows is buggy, but we work around it with this here kluge..." and etc etc etc etc etc
Not just dhcpd. Besides the entire existence of Wine and Samba, Qemu has a workaround for win2k. Mkudffs has a workaround for MS-Windows not being able to read the filesystem without an mbr. Libc can work with local system time for those who dual-boot. Git can work around the difference in line endings. There are probably more of these kludges than you can shake a stick at.
The quantity of tests, known as penetration attempts, that most critical software survives today in a networked environment, is magnitudes more daring than the easily-cracked software printed on CDs. I really don't understand how this argument about software made 26 years ago really stands any reasonable ground.
It feels like rose-tinted glasses. While lots of low-hanging fruit had to be plucked to be shippable, there was still plenty of software which mandated specific hardware/software combinations or (worse) had major bugs which weren't patched but had workarounds documented in the manual, and if you weren't actively reading the manual, your newly-purchased software just wouldn't work (and if it was something low-level, that may mean you have to reinstall the OS).
Then there was stuff like rwall, which could be used to scrawl a message across basically every terminal connected to a networked Unix box in the world by accident [0][1], and it was far from the only insecure-by-design Unix software in widespread use.
It's interesting to watch youtubers like clabretro [2], NCommander [3], and Old Computers Sucked [4] who have documented the slog that was setting up and patching networking equipment, obscure Microsoft products, Netware, Unixes and Unix hardware, old Linux distros, etc. We take so much for granted these days. We don't even have to think about C/C++ standards compliance outside the occasional compiler bug, much less the myriad of mutually-incompatible POSIX implementations that helped Microsoft win the Unix wars.
The fact that you can just build a PC with no prior experience or IT knowledge after watching an hour-long youtube video rather than having to spend weeks researching hardware compatibility or futzing about with IRQ levels, recompiling kernels, and messing with autoexec.bat/config.sys is a testament to how far we have come. You don't even have to think about drivers anymore unless you have specialized equipment.
[0]: https://news.ycombinator.com/item?id=31822138
[1]: https://news.ycombinator.com/item?id=35759965
[2]: https://www.youtube.com/@clabretro
[3]: https://www.youtube.com/@NCommander
[4]: https://www.youtube.com/@old-computers-sucked
It is hard to say which of the 2 is the reason, more likely both, i.e. lower complexity enabled more exhaustive testing.
In any case some of the software from before 2000 was definitely better than today, i.e. it behaved like being absolutely foolproof, i.e. nothing that you could do could cause any crash or corrupted data or any other kind of unpredictable behavior.
However, the computers to which most people had access at that time had only single-threaded CPUs. Even if you used a preemptive multitasking operating system and a heavily multi-threaded application, executing it on a single-threaded CPU was unlikely to expose subtle bugs due to race conditions, that might have been exposed on a multi-core CPU.
While nowadays there exists no standard operating system that I fully trust to never fail in any circumstance, unlike before 2003, I wonder whether this is caused by a better quality of the older programs or by the fact that it is much harder to implement software concurrency correctly on systems with hardware parallelism.
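To make the race-condition point concrete, here is a minimal sketch (not real threading, just a deterministic model with a made-up `run` helper) of two "threads" that each increment a shared counter as a separate load and store. If no switch lands between a load and its store, which is the common case when one core is time-slicing, the result is correct. An interleaving that splits the increment, which true hardware parallelism makes far more likely, loses an update.

```python
# Deterministic model of a lost-update race. Each thread performs
# counter = counter + 1 as two steps: load, then add-and-store.
def run(schedule):
    """Execute steps in the order given by `schedule` (a string of thread names)."""
    counter = 0
    registers = {}  # per-thread private copy of the value it loaded
    for thread in schedule:
        if thread not in registers:
            registers[thread] = counter          # step 1: load
        else:
            counter = registers.pop(thread) + 1  # step 2: add and store
    return counter

# No switch in the middle of an increment: both updates land.
print(run("AABB"))  # 2
# A switch between load and store: one update is lost.
print(run("ABAB"))  # 1
```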
Not all software is made with the same quality, whatever the era.
It was possible to work with Ada as early as 1980 wherever a high guarantee of reliability was taken seriously, for example.
And not everyone is Knuth with a personal human secretary in a well-funded, world-leading institution.
In 2000, Microsoft, which was already sitting on an insanely high mountain of resources, released Windows Millennium Edition. Ask your greybeard neighbour if you are too young to remember. Although it went on sale in 2000, it was the last MS-DOS-based Windows version and so represents the pinnacle of what Windows 9x was, before the big switch to the NT lineage.
As always, the largest advantage of the good old times is selective memory. After all, people that can remember know they survived the era, while present and future never provided much certainty on that point.
I’ve been considering that this might be an outcome of AI-written software and it’s the one aspect of all this that I’m actually unequivocally happy about.
Most software written at companies is shit. It’s whatever garbage someone slapped together and barely got working, and then they had to move onto the next thing. We end up squashing a never ending list of bugs because in a time-limited world, new features come first.
But that only really applies when the cost of good software dwarfs that of barely-functioning software. And when the marginal cost of polishing something is barely longer than it took to write it in the first place? There’s no reason not to take a few passes, get all the bugs out, and polish things up. Right now, AI can (and will) write an absolutely exhaustive set of test cases that handles far more than a human would ever have the motivation to write. And it will get better.
If a company can ship quality software in essentially the same time as it can ship garbage, the incentives will change rapidly. At least I hope so.
It appeared better, because there were fewer features and more time to develop and test. But it's also a lot of nostalgia, because everything moved slower, the world was smaller, there was a lower standard; people will usually remember the later versions of a software, or never even encountered the earlier versions. Without the internet and every one bitching about every little detail, the general awareness was also different, not as toxic as today.
Just think of 8 and 16 bit video console games. Those cartridges were expensive so just how sure they had to be they were bug free before making millions of them?
> Was software made before 2000 better?
Literally the moment everyone got on the internet, pretty much every computer program and operating system in the world was besieged by viruses and security flaws, so no.
Define better.
Before 2000 fixing a bug the user would notice was expensive - you had to mail them a new disk/cd. As such there was a lot more effort put into testing software to ensure there were no bugs users would notice.
However before 2000 (really 1995) the internet was not a thing for most people. There were a few viruses around, but they had a hard time propagating (they still managed, but compared to today it was much harder). Nobody worried about someone entering something too long in various fields - it did happen, but if you made your buffers "large" (say 100 bytes) most forms didn't have to worry about checking for overflow because nobody would type that much anyway. Note the assumption that a human was typing things on a keyboard into fields to create the buffer overflow. Thus a large portion of modern attacks weren't an issue - we are much better at checking buffer sizes now than we were then. People knew back then that they should, but often got away with being lazy and not doing it. If a vulnerability exists but is never exploited, do you care? Whether today is better is therefore debatable.
In the 1990s the US had encryption export laws, if you wanted to protect data often it was impossible. Modern AES didn't even exist until 2001, instead we had DES (when you cared triple DES which was pretty good even by today's standards) - but you were not allowed to use it in a lot of places. I remember the company I worked for at the time developed their own encryption algorithm for export, with the marketing(!) saying something like "We think it is good, but it hasn't been examined near as well as DES so you should only use it if you legally can't use DES"
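A rough sketch of that "large buffer, no check" habit, modelled in Python rather than C so it is safe to run. The memory layout, the 4-byte neighbouring state and the function names are all made up for illustration. The point is only that the unchecked copy is fine while a human is typing and breaks once a program supplies the input.

```python
FIELD_SIZE = 100  # the "large" buffer size from the comment

def make_memory():
    # A 100-byte input field followed directly by 4 bytes of unrelated state.
    return bytearray(FIELD_SIZE) + bytearray(b"SAFE")

def unchecked_copy(memory, data):
    # Mimics a strcpy-style copy: trusts that the input fits the field.
    for i, byte in enumerate(data):
        memory[i] = byte

def checked_copy(memory, data):
    # Refuses oversized input instead of trusting whoever is sending it.
    if len(data) > FIELD_SIZE:
        raise ValueError("input longer than field")
    unchecked_copy(memory, data)

payload = b"A" * FIELD_SIZE + b"PWND"  # 104 bytes, no human typing required

mem = make_memory()
unchecked_copy(mem, payload)
print(bytes(mem[FIELD_SIZE:]))  # b'PWND': the neighbouring state was overwritten

mem = make_memory()
try:
    checked_copy(mem, payload)
except ValueError as err:
    print("rejected:", err)
print(bytes(mem[FIELD_SIZE:]))  # b'SAFE': the neighbouring state is intact
```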
As an end user though, software was generally better. They rarely had bugs anyone would notice. This came at the expense of a lot more testing, and features took longer to develop. Even back then it was a known trade off, and some software was known to be better than others because of the effort the company put into making it work before release. High risk software (medical) is still developed with a lot of extra testing and effort today.
As for the second part - software back then was plenty complex. Sure today things are more complex, but I don't think that is the issue. In fact in some ways things were more complex because extra effort was put into optimization (200 MHz CPUs were the top end expensive servers, most people only had around 90 MHz, and more than one core was something only nerds knew was possible and most of them didn't have it). As such a lot of effort was put into complex algorithms that were faster at the expense of being hard to maintain. Today we have better optimizers and faster CPUs so we don't write as much complex code trying to get performance.
Yeah I don't think that is true at all. Plenty of software today is very well tested, and plenty of software back then was pushed out with insufficient testing due to short deadlines (some probably caused by the fact that they had to press CDs).
It was a simpler time. Not better. Not worse. Programs still had bugs, but they weren't sloppy UI bugs, they were logic bugs and memory leaks. If software was better back then, we'd still be using it!
Depends what you mean by better. It crashed more and there was a lot of data loss, but it wasn't explicitly evil so maybe on measure it was better.
Yes and no.
Yes. The incentives for writing reliable, robust code were much higher. The internet existed so you could, in theory, get a patch out for people to download - but a sizeable part of any user base might have limited access, so would require something physical shipped to them (a floppy or CD). Making sure that your code worked and worked well at time of shipping was important. Large corporate customers were not going to appreciate having to distribute an update across their tens of thousands of machines.
No. The world wasn't as connected as it is today, which meant that the attack surface to reasonably consider was much smaller. A lot of the issues that we had back then were due to designs and implementations that assumed a closed system overall - but often allowed very open interoperability between components (programs or machines) within the system. For example, Outlook was automatable, so that it could be part of larger systems and send mail in an automated way. This makes sense within an individual organisation's "system", but isn't wise at a global level. Email worms ran rampant until Microsoft was forced to reduce that functionality via patches, which were costly for their customers to apply. It damaged their reputation considerably.
An extreme version of this openness was SQL Slammer - a worm which attacked SQL Servers and development machines. Imagine that - enough organisations had their SQL Servers or developer machines directly accessible that an actual worm could thrive on a relational database system. Which is mindboggling to think about these days, but it really happened - see https://en.wikipedia.org/wiki/SQL_Slammer for details.
I wouldn't say that the evidence points to software being better in the way that we would think of "better" today. I'd say that the environment it had to exist in was simpler, and that the costs of shipping & updating were higher - so it made more sense to spend time creating robust software. Also nobody was thinking about the possible misuse or abuse of their software except in very limited ways. These days we have to protect against much more ingenious use & abuse of programs.
Furthermore today patching is quick and easy (by historical comparison), and a company might even be offering its own hosted solution, which makes the cost of patching very low for them. In such an environment it can seem more reasonable to focus on shipping features quickly over shipping robust code slowly. I'd argue that's a mistake, but a lot of software development managers disagree with me, and their pay packet often depends on that view, so they're not going to change their minds any time soon.
In a way this is best viewed as the third age of computing. The first was the mainframe age - centralised computer usage, with controlled access and oversight, so mistakes were costly but could be quickly recovered from. The second was the desktop PC age - distributed computer usage, with less access control, so mistakes were often less costly but recovering from them was potentially very expensive. The third is the cloud & device age, with a mix of centralised and distributed computer use, a mix of access control, and potentially much lower costs of recovery. In this third age if you make the wrong decisions on what to prioritise (robustness vs speed of shipping), it can be the worst of both the previous ages. But it doesn't have to be.
I hope that makes sense, and is a useful perspective for you.
I think there was a period where things got better but I don’t think it was pre-internet.
There was a point in time where both windows wasn’t constantly bsoding and Microsoft’s primary objectives weren't telemetry and slop coding.
Delta, JetBlue, American Airlines and Alaska Airlines have free Internet as long as you are enrolled (for free) in their loyalty programs.
JetBlue and Delta use ViaSat. I only fly Delta for the most part and ViaSat was available on all domestic routes I’ve flown except for the smaller A900 that I take from ATL to Southwest GA (50 minute flight). Then I use my free unlimited 1 hour access through T-Mobile with GoGo ground based service.
Important to note that this is a comment on this article: https://lwn.net/Articles/1065586/.
“Reversing was already mostly a speed-bump even for entry-level teams, who lift binaries into IR or decompile them all the way back to source. Agents can do this too, but they can also reason directly from assembly. If you want a problem better suited to LLMs than bug hunting, program translation is a good place to start.”
Huh. Direct debugging, in assembly. At that point, why not jump down to machine code?
For the purposes of debugging, assembly is machine code, just with some nice constructs to make it easier to read. Transpiling between assembly and machine code is mostly a find-and-replace exercise, not like the advanced reasoning involved in proper compilation.
On x86/x64 and other variable-instruction-length architectures this isn't always the case. You can jump into the middle of an instruction to get a different instruction. It can be used to obfuscate code.
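A toy illustration of that, as a sketch rather than a real disassembler: the `decode` function below is made up for this example and only knows four x86 encodings (`B8` mov eax, imm32; `31 C0` xor eax, eax; `90` nop; `C3` ret), assuming 32-bit mode. The same five bytes decode to one instruction from offset 0 and to three different instructions from offset 1.

```python
def decode(code, offset):
    """Linearly decode a tiny subset of x86 starting at `offset`."""
    out = []
    i = offset
    while i < len(code):
        op = code[i]
        if op == 0xB8 and i + 5 <= len(code):
            # mov eax, imm32: opcode followed by a 4-byte little-endian immediate
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov eax, {imm:#x}")
            i += 5
        elif op == 0x31 and i + 1 < len(code) and code[i + 1] == 0xC0:
            out.append("xor eax, eax")
            i += 2
        elif op == 0x90:
            out.append("nop")
            i += 1
        elif op == 0xC3:
            out.append("ret")
            i += 1
        else:
            out.append(f"db {op:#x}")  # byte this toy decoder does not understand
            i += 1
    return out

code = bytes([0xB8, 0x31, 0xC0, 0x90, 0xC3])
print(decode(code, 0))  # ['mov eax, 0xc390c031']
print(decode(code, 1))  # ['xor eax, eax', 'nop', 'ret']
```

A disassembler that only follows the offset-0 stream never shows the second sequence, which is why this trick works as obfuscation.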
Decompiled assembly is basically machine code; without recreating the macros that make assembly "high level" you're as close to machine code as you're going to get unless you're trying to exploit the CPU itself.
I'm actually curious about AI progress:
There's no way the AI is a priori understanding codebases with millions of LoC now. We've tried that already, it failed. What it is doing now is setting up its own extremely powerful test harnesses and getting the information and testing it efficiently.
Sure, its semantic search is already strong, but the real lesson that we've learned from 2025 is that tooling is way more powerful.
That's cool! I've always wanted to learn how kernel devs properly test stuff reliably but it seemed hard. As someone who's dabbled in kernel dev for his job. Like real variable hardware, and not just manual testing shit.
Honestly, AI has only helped me become a better SWE because no one else has the time or patience to teach me.
What do you mean "a priori understanding codebases"? Quantify it and let's test specifically what you mean. Linux is huge.
> What do you mean "a priori understanding codebases"?
I took him to be distinguishing between (1) just reading the code/docs and reasoning about it, and (2) that + crafting and running tests.
I don't think that's it; both reading the code and running tests are a posteriori capabilities.
Yeah, maybe you are right. But is doing math and reasoning about Turing machines a priori? If so, then it seems plausible to me that reasoning about a codebase (without running it) is also ‘a priori’.
> I don't know how long this pace will last. I suspect that bugs are reported faster than they are written, so we could in fact be purging a long backlog
Hopefully these same tools will also help catch security bugs at the point they're written. Maybe one day we'll reach a point where the discovery of new, live vulnerabilities is extremely rare?
Around 70% of security vulnerabilities are about memory safety and only exist because software is written in C and C++. Because most vulnerabilities are in newly written code, Google has found that simply starting writing new code in Rust (rather than trying to rewrite existing codebases) quickly brings the number of found vulnerabilities down drastically.
I find this interesting.
Curl's Daniel Stenberg claimed during his NDC talk that vulnerabilities in this project are 8 years old on average.
I wonder where the disconnect comes from.
It comes from all his reporters being teenagers in developing countries with older models, and people using SOTA models who know how to qualify a potential vulnerability having much bigger fish to fry than curl. curl is a meaningful target, but it's in nobody's top tier.
You can't just write Rust in a part of the codebase that's all C/C++. Tools for checking the newly written C/C++ code for issues will still be valuable for a very long time.
You actually can? A Rust-written function that exports a C ABI and calls C ABI functions interops just fine with C. Of course that's all unsafe (unless you're doing pure value-based programming and not calling any foreign code), so you don't get much of a safety gain at the single-function level.
And to a good approximation all real world Rust uses unsafe everywhere.
So we now have a new code base in an undefined language which still has memory bugs.
This is progress.
No, this is false. For Rust codebases that aren't doing high-performance data structures, C interop, or bare-metal stuff, it's typical to write no unsafe code at all. I'm not sure who told you otherwise, but they have no idea what they're talking about.
It's the classic "misunderstanding" that UB or buggy unsafe code could in theory corrupt any part of your running application (which is technically true), and interpreting this to mean that any codebase with at least one instance of UB / buggy unsafe code (which is ~100% of codebases) is safety-wise equivalent to a codebase with zero safety checks - as all the safety checks are obviously complete lies and therefore pointless time-wasters.
Which obviously isn't how it works in practice, just like how C doesn't delete all the files on your computer when your program contains any form of signed integer overflow, even though it technically could as that is totally allowed according to the language spec.
If you're talking about Rust codebases, I'm pretty sure that writing sound unsafe code is at least feasible. It's not easy, and it should be avoided if at all possible, but saying that 100% of those codebases are unsound is pessimistic.
One feasible approach is to use "storytelling" as described here: https://www.ralfj.de/blog/2026/03/13/inline-asm.html That's talking about inline assembly, but in principle any other unsafe feature could be similarly modeled.
It's not impossible, it is just highly unlikely that you'll never write a single safety-related bug - especially in nontrivial applications and in mixed C-plus-Rust codebases. For every single bug-free codebase there will be thousands containing undiscovered subtle-but-usually-harmless bugs.
After all, if humans were able to routinely write bug-free code, why even worry about unsoundness and UB in C? Surely having developers write safe C code would be easier than trying to get a massive ecosystem to adopt a completely new and not exactly trivial programming language?
Rust is not really "completely new" for a good C/C++ coder, it just cleans up the syntax a bit (for easier machine-parsing) and focuses on enforcing the guidelines you need to write safe code. This actually explains much of its success. The fact that this also makes it a nice enough high-level language for the Python/Ruby/JavaScript etc. crowd is a bit of a happy accident, not something that's inherent to it.
Our experiences are different.
Good developers only write unsafe rust when there is good reason to. There are a lot of bad developers that add unsafe anytime they don't understand a Rust error, and then don't take it out when that doesn't fix the problem (hopefully just a minority, but I've seen it).
The parent comment references real world data from Google: https://security.googleblog.com/2024/09/eliminating-memory-s...
This is "the bomber will always get through" mentality for the modern era. You will invent air defences. You will write fewer bugs. You will leave code that doesn't have bugs alone, so it gains no more bugs. You will build software that finds bugs as easily as you think "enemies" find bugs, and you'll run it before you release your code.
What's the saying? Given many eyes, all bugs are shallow? Well, here are some more eyes.
I'd be very curious to know what class of vulnerability these tend to be (buffer overrun, use after free, misset execute permissions?), and if, armed with that knowledge, a deterministic tool could reliably find or prevent all such vulnerabilities. Can linters find these? Perhaps fuzzing? If code was written in a more modern language, is it still likely that these bugs would have happened?
> Can linters find these? Perhaps fuzzing?
That's what syzbot / syzkaller does, as mentioned in the article, with somewhat similar results to the AI-fuzzing that they've been experiencing recently.
The issue that Linux maintainers have in general is that there are so many of these "strict correctness and safety" bugs in the Linux codebase that they can't fix them all at once, and they have no good mechanism to triage "which of these bugs is accessible to create an exploit."
This is also the argument by which most of their bugs become CVEs; in lieu of the capability to determine whether a correctness bug is reachable by an attacker, any bug could be an exploit, and their stance is that it's too much work to decide which is which.
It's a bigger deal than that.
Academically, syzkaller is just a very well orchestrated fuzzer, producing random pathological inputs to system calls, detecting crashes, and then producing reproductions. Syzkaller doesn't "know" what it's found, and a substantial fraction of what it finds are "just" crashers that won't ever be weaponizable.
An LLM agent finding vulnerabilities is an implicit search process over a corpus of inferred vulnerability patterns and inferred program structure. It's stochastic static program analysis (until you have the agent start testing). It's generating (and potentially verifying) hypotheses about actual vulnerabilities in the code.
That distinction is mostly academic. The bigger deal is: syzkaller crashes are part of the corpora of inputs agents will use to verify hypotheses about how to exploit Linux. It's an open secret that there are significant vulnerabilities encoded in the (mostly public!) corpus of syzbot crash reproductions; nobody has time to fish them out. But agents do, and have the added advantage of being able to quickly place a crash reproduction in the inferred context of kernel internals.
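For anyone who hasn't seen the generate, crash, reproduce loop described above, here is a deliberately tiny sketch of it. This is not syzkaller's actual design. The `target` function is a hypothetical stand-in for a system call with a planted bug, and the byte-deletion minimiser is just one simple way to shrink a reproducer. Note that the loop only ever learns "this input crashes", never why, which is the gap the agent is said to fill.

```python
import random

def target(data):
    # Hypothetical stand-in for a system call handler, with a planted bug.
    if len(data) >= 2 and data[0] == 0x7F and data[1] >= 0x80:
        raise RuntimeError("crash")
    return len(data)

def crashes(data):
    try:
        target(data)
    except RuntimeError:
        return True
    return False

def fuzz(rng, attempts):
    # Throw random inputs of 0 to 8 bytes at the target until one crashes.
    for _ in range(attempts):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(9)))
        if crashes(data):
            return data
    return None

def minimise(data):
    # Greedily delete single bytes for as long as the crash still reproduces.
    shrunk = True
    while shrunk:
        shrunk = False
        for i in range(len(data)):
            candidate = data[:i] + data[i + 1:]
            if crashes(candidate):
                data = candidate
                shrunk = True
                break
    return data

found = fuzz(random.Random(0), 100_000)
if found is not None:
    print("crasher:", found.hex(), "reproducer:", minimise(found).hex())
```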
Yes, once we reach the broader conversation (I actually didn't initially grasp that the OP post was a sub-article under another one on LWN which then linked out to yet another article called "Vulnerability Research is Cooked"), I completely agree.
Modern LLMs are _exceptionally_ good at developing X-marks-the-spot vulnerabilities into working software; I fed an old RSA validation mistake in an ECU to someone in a GitHub comment the other day and they had Claude build them a working firmware reflashing tool within a matter of hours.
I think that the market for "using LLMs to triage bug-report inputs by asking it to produce working PoCs" is incredibly under-leveraged so far and if I were more entrepreneurial-minded at this juncture I would even consider a company in this space. I'm a little surprised that both this article and most of the discussion under it haven't gone in that direction yet.
(I wrote the "Cooked" article, I'm not entirely sure why people are commenting on it on LWN.)
Probably related to this (genuinely interesting) talk given by an entropic researcher https://youtu.be/1sd26pWhfmg?si=j2AWyCfbNbOxU4MF
To clarify, the talk is by an Anthropic researcher, though given the subject of LLMs, "entropic researcher" also makes some kind of sense.
This really comforts me :) I'm looking forward to a more secure and private IT future.
Anecdotally, I've been seeing a higher rate of CVEs tracked by a few dependabot projects.
Seems supported by this as well: https://www.first.org/blog/20260211-vulnerability-forecast-2...
Interesting that it's been higher than forecast since 2023. Personally I'd expect that trend to continue given that LLMs both increase bugs written as well as bugs discovered.
Why don't we just pagerank github contributors? Merged PRs approved by other quality contributors improves rank. New PRs tagged by a bot with the rank of the submitter. Add more scoring features (account age? employer?) as desired.
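As a sketch of what that could look like, here is a plain PageRank power iteration over a made-up approval graph, where each (reviewer, author) pair stands for a merged PR by `author` that `reviewer` approved. The names, the 0.85 damping factor and the input format are assumptions for illustration and nothing here touches the GitHub API.

```python
def contributor_rank(approvals, damping=0.85, iterations=50):
    """approvals: list of (reviewer, author) pairs, one per approved merged PR."""
    people = sorted({person for pair in approvals for person in pair})
    votes = {person: [] for person in people}
    for reviewer, author in approvals:
        votes[reviewer].append(author)  # an approval is a vote for the author

    rank = {person: 1.0 / len(people) for person in people}
    for _ in range(iterations):
        new_rank = {person: (1.0 - damping) / len(people) for person in people}
        for person in people:
            # Someone who approved nothing spreads their weight over everyone.
            targets = votes[person] or people
            share = damping * rank[person] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

approvals = [("alice", "bob"), ("carol", "bob"), ("bob", "alice"), ("dave", "carol")]
for name, score in sorted(contributor_rank(approvals).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

An approval from a highly ranked reviewer moves more weight than one from an unknown account, which is the property the comment is after. It is also the lever anyone gaming the score would pull.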
It will be gamed, just as pagerank was.
Of course, but killing the 80% of low hanging fruit is already valuable. The rest is an arms race like always.
Not by everyone, so that would be better than nothing.
Excited to have to do SEM for my GitHub profile so that people will read my pull requests
It's interesting to hear from people directly in the thick of it that these bug reports are apparently gaining value and are no longer just slop. Maybe there is hope for a world where AI helps create bug free software and doesn't just overload maintainers.
this is what i'm seeing on a micro scale. i pointed a code-davinci-002 model at my own repo and it found a subtle off-by-
Or we can stop putting everything on the internet as a vector for enforced enshittification.
I wish they wouldn’t call it “AI slop” before acknowledging that most of the bugs are correct.
Let’s bring a bit of nuance between mindless drivel (e.g. LinkedIn influencing posts, spammed issues that are LLMs making mistakes) vs using LLMs to find/build useful things.
I think they are saying what you want them to say. In the past they got a bunch of AI slop and now they are getting a lot of legit bug reports. The implication being that the AI got better at finding (and writing reports of) real bugs.
It can be correct and slop at the same time. The reporter could have reported it in a way that makes it clear a human reviewed and cared about the report.
Slop is a function of how the information is presented and how the tools are used. People don't care if you use LLMs if they can't tell you used them; they care when you send them a bunch of bullshit with 5% of value buried inside it.
If you're reading something and you can tell an LLM wrote it, you should be upset. It means the author doesn't give a fuck.
No it can't. These aren't "Show HN" posts about new programs people have conjured with Claude. They're either vulnerabilities or they're not. There's no such thing as a "slop vulnerability". The people who exploit those vulnerabilities do not care how much earlier reporters "gave a fuck" about their report.
This is in the linked story: they're seeing increased numbers of duplicate findings, meaning, whatever valid bugs showboating LLM-enabled Good Samaritans are finding, quiet LLM-enabled attackers are also finding.
People doing software security are going to need to get over the LLM agent snootiness real quick. Everyone else can keep being snooty! But not here.
Everyone is free to be as snooty as they like. If a report is harder to read/understand/validate because the author just yolo'ed it with an LLM, that's on the report author, not on the maintainers.
It's not okay to foist work onto other people because you don't think LLM slop is a problem. It is absolutely a problem, and no amount of apologizing and pontificating is going to change that.
Grow up and own your work. Stop making excuses for other people. Help make the world better, not worse. It's obvious that LLMs can be useful for this purpose, so people should use them well and make the reports useful. Period.
Try to make this sentiment coherent. "It's not OK to foist work onto other people". Ok, sure, I won't. The vulnerability still exists. The maintainers just don't get to know about it. I do, I guess. But not them: telling them would "make the world worse".
> There's no such thing as a "slop vulnerability"
https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-s...
See the list at the bottom of the post for examples.
Those aren't vulnerabilities. You're missing the point.
Nobody is saying there's no such thing as a slop report. Not only are there, but slop vulnerability reports as a time-consuming annoying phenomenon predate LLM chatbots by almost a decade. There's a whole cottage industry that deals with them.
Or did. Obsolete now.
If I read the sentence correctly they're saying that past reports were AI slop, but the state of the art has advanced and that current reports are valid. This matches trends I've seen on the projects I work on.
An AI enthusiast having a breathless and predictive position on the future of the technology? No way! It's almost like Wall Street is about to sour on the whole stack and there is a concerted effort to artificially push these views into the conversation to get people on board.
Then again, I'm a known crank and aggressive cynic, but you never really see any gathered data backing these points up.
Could you back up your assertion that Willy Tarreau — who used to maintain the Linux kernel — is “an AI enthusiast”? I can’t find anything about it.
Also one of the initial creators of haproxy, a well known reverse proxy. To dismiss somebody like that as a simple "AI shill" is just ignorant.
Anyone who says anything good about AI must be an AI shill from the start, not someone who is genuinely observing reality or had their mind changed, don't you know?
> but you never really see any gathered data backing these points up.
https://www.anthropic.com/news/mozilla-firefox-security
?
Sort of a tautology to just assert that someone saying good things about AI is an AI enthusiast and therefore their opinion should be dismissed. He also happens to have been a kernel maintainer; his experience as he's describing it should count for something.
> He also happens to have been a kernel maintainer
And a primary author of one of the most stable and used load balancers in the history of networking.
"On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us."
Is there a reason you’ve copy-pasted the first paragraph from the link? It doesn’t add anything to the discussion, and it doesn’t help as a tl;dr either, because it’s literally the first paragraph. Genuine question!
The actual title is pretty unclear ("Significant Raise of Reports" of what?), so I considered replacing it with part of this excerpt, but HN rules say not to editorialize titles. Hence I put it into the `text` field, which I thought would be the body, but which actually just gets posted as a comment.
Reports being written faster than bugs being created? Better quality software than before the 2000s?
Oh my sweet summer child.
This is some seriously delusional cope from someone who drank the entire jug of kool-aid.
I’d love to be proven wrong, but the current trajectory is pretty plain as day from current outcomes. Everything is getting worse and everyone is getting overwhelmed. We are under attack even more, the attacks are getting substantially more sophisticated, and the blast radius is much bigger.