Hi, I am one of the maintainers of GNU Coreutils. Thanks for the article, it covers some interesting topics. In the little Rust that I have used, I have felt that it is far too easy to write TOCTOU races using std::fs. I hope the standard library gets an API similar to openat eventually.
I just want to mention that I disagree with the section titled "Rule: Resolve Paths Before Comparing Them". Generally, it is better to call fstat and compare st_dev and st_ino, which the article does mention. A side effect that seems to be considered less often is the performance impact. Here is an example in practice:
    $ mkdir -p $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')
    $ while cd $(yes a/ | head -n 1024 | tr -d '\n'); do :; done 2>/dev/null
    $ echo a > file
    $ time cp file copy
    real    0m0.010s
    user    0m0.002s
    sys     0m0.003s
    $ time uu_cp file copy
    real    0m12.857s
    user    0m0.064s
    sys     0m12.702s
I know people are very unlikely to do something like that in real life. However, GNU software tends to work very hard to avoid arbitrary limits [1].
Also, the larger point still stands, but the article says "The Rust rewrite has shipped zero of these [memory safety bugs], over a comparable window of activity." However, this is not true [2]. :)
[1] https://www.gnu.org/prep/standards/standards.html#Semantics
[2] https://github.com/advisories/GHSA-w9vv-q986-vj7x
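For the record, a minimal sketch of the fstat-based check I have in mind (Unix-only, via std::os::unix::fs::MetadataExt; file names are placeholders):

    use std::fs::File;
    use std::os::unix::fs::MetadataExt;

    // fstat both already-open files and compare (st_dev, st_ino): no path
    // re-resolution, so no TOCTOU window and no O(depth) traversal cost.
    fn same_file(a: &File, b: &File) -> std::io::Result<bool> {
        let (ma, mb) = (a.metadata()?, b.metadata()?);
        Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
    }

    fn main() -> std::io::Result<()> {
        let src = File::open("file")?;
        let dst = File::open("copy")?;
        println!("same object: {}", same_file(&src, &dst)?);
        Ok(())
    }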
First of all, thank you for presenting a succinct take on this viewpoint from the other side of the fence from where I am at.
So how can I learn from this? (Asking very aggressively, especially for Internet writing, to make the contrast unmistakable. And contrast helps with perceiving differences and mistakes.) (You also don’t owe me any of your time or mental bandwidth, whatsoever.)
So here goes:
Question 1:
How come "speed", "performance", race conditions and st_ino keep getting brought up?
Speed (latency), physically writing things out to storage (sequentially, atomically (ACID), all of HDD NVME SSD ODD FDD tape, "haskell monad", event horizons, finite speed of light and information, whatever) as well as race conditions all seem to boil down to the same thing. For reliable systems like accounting the path seems to be ACID or the highway. And "unreliable" systems forget fast enough that computers don’t seem to really make a difference there.
Question 2:
Does throughput really matter more than latency in everyday application?
Question 3 (explanation first, this time):
The focus on inode numbers is at least understandable with regards to the history of C and unix-like operating systems and GNU coreutils.
What about this basic example? Just make a USB thumb drive "work" for storing files (ignoring nand flash decay and USB). Without getting tripped up in libc IO buffering, fflush, kernel buffering (Hurd if you prefer it over Linux or FreeBSD), more than one application running on a multi-core and/or time-sliced system (to really weed out single-core CPUs running only a single user-land binary with blocking IO).
> Does throughput really matter more than latency in everyday application?
In my experience latency and throughput are intrinsically linked unless you have the buffer-space to handle the throughput you want. Which you can't guarantee on all the systems where GNU Coreutils run.
EDIT: got it. -bash: cd: a/a/a/....../a/a/: File name too long
No need to apologize at all. Doing it in one cd invocation would fail since the file name is longer than PATH_MAX. In that case passing it to a system call would fail with errno set to ENAMETOOLONG.
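A quick sketch of that failure mode from Rust (assumes Linux and the libc crate for the errno constant):

    fn main() {
        // Build a path far beyond PATH_MAX (4096 bytes on Linux).
        let deep = "a/".repeat(32 * 1024);
        let err = std::fs::metadata(&deep).unwrap_err();
        // The kernel rejects the whole path up front.
        assert_eq!(err.raw_os_error(), Some(libc::ENAMETOOLONG));
    }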
You could probably make the loop more efficient, but it works well enough. Also, some shells don't allow you to enter directories that deep at all; it doesn't work on mksh, for example.
> However, GNU software tends to work very hard to avoid arbitrary limits [1].
Yes? The quote says "tends to", and you can still cd into that directory, albeit not in a single invocation. Windows has similar limitations [0], it's just that their MAX_PATH is just 260, so it's somewhat more noticeable... and IIRC the hard limit of 32 K for paths is non-negotiable.
[0] https://learn.microsoft.com/en-us/windows/win32/fileio/maxim...
To be even fairer, it wasn't actually memory unsafety, it was "just" unsoundness: there was a type where, IF you gave it a weird io reader implementation, that implementation could see uninit data or expose uninit data elsewhere. But the only readers actually used were well-behaved readers.
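For readers who haven't seen the shape of that bug, a minimal sketch of the pattern (not the actual uutils code):

    use std::io::Read;

    // UNSOUND sketch: set_len exposes uninitialized memory as &mut [u8].
    // A well-behaved reader only writes into `buf`, but nothing stops a
    // weird Read impl from *reading* the uninitialized bytes first.
    fn read_into_fresh_buf<R: Read>(mut r: R) -> std::io::Result<Vec<u8>> {
        let mut buf = Vec::with_capacity(4096);
        unsafe { buf.set_len(4096) } // contents are uninitialized here
        let n = r.read(&mut buf)?;   // a malicious impl can observe them
        buf.truncate(n);
        Ok(buf)
    }

    fn main() {
        let data = read_into_fresh_buf([1u8, 2, 3].as_slice()).unwrap();
        assert_eq!(data, [1, 2, 3]);
    }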
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls. Most of those mistakes are exceedingly amateur from the perspective of long-time GNU coreutils (or BSD or Solaris base) developers, issues that were identified and largely hashed out decades ago, notwithstanding the continued long tail of fixes--mostly just a trickle these days--to the old codebases.
More than that: it seems that Rust stdlib nudges the developer towards using neat APIs at an incorrect level of abstraction, like path-based instead of handle-based file operations. I hope I'm wrong.
Nearly every available filesystem API in Rust's stdlib maps one-to-one with a Unix syscall (see Rust's std::fs module [0] for reference -- for example, the `File` struct is just a wrapper around a file descriptor, and its associated methods are essentially just the syscalls you can perform on file descriptors). The only exceptions are a few helper functions like `read_to_string` or `create_dir_all` that perform slightly higher-level operations.
And, yeah, the Unix syscalls are very prone to mistakes like this. For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle; and so Rust has a `rename` function that takes two paths rather than an associated function on a `File`. Rust exposes path-based APIs where Unix exposes path-based APIs, and file-handle-based APIs where Unix exposes file-handle-based APIs.
So I agree that Rust's stdlib is somewhat mistake-prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls, beyond ensuring that you didn't write a buffer overflow.
[0]: https://doc.rust-lang.org/std/fs/index.html
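To make that mapping concrete, a small sketch (file names are placeholders):

    use std::fs::{self, File};

    fn main() -> std::io::Result<()> {
        // Path-based where the syscall is path-based: rename(2) takes two
        // paths, so std::fs::rename takes two paths.
        fs::rename("old.txt", "new.txt")?;

        // Handle-based where the syscalls are handle-based: File wraps a
        // file descriptor, and its methods act on that fd.
        let f = File::open("new.txt")?;
        let meta = f.metadata()?; // fstat(2)
        f.sync_all()?;            // fsync(2)
        println!("len = {}", meta.len());
        Ok(())
    }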
> For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle
And then there’s renameat(2), which takes two dirfds… and two paths from there, which mostly has all the same issues rename(2) does (and does not even take flags, so even O_NOFOLLOW is not available).
I’m not sure what you’d need to make a safe renameat(), maybe a triplet of (dirfd, filefd, name[1]) from the source, (dirfd, name) from the target, and some sort of flag to indicate whether it is allowed to create, overwrite, or both.
As the recent https://blog.sebastianwick.net/posts/how-hard-is-it-to-open-... post talks about (just for opening a file, but it applies to everything), secure file system interaction is absolutely heinous.
[1]: not path
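For reference, renameat(2) is reachable from Rust through the libc crate; a minimal sketch (directory and file names are placeholders):

    use std::ffi::CString;
    use std::fs::File;
    use std::os::fd::AsRawFd;

    fn main() -> std::io::Result<()> {
        // Opening a directory read-only yields a usable dirfd on Unix.
        let dir = File::open("/tmp/somedir")?;
        let old = CString::new("a").unwrap();
        let new = CString::new("b").unwrap();
        // Both names resolve relative to the dirfds rather than the cwd,
        // but the final components are still plain paths, hence the
        // issues discussed above.
        let rc = unsafe {
            libc::renameat(dir.as_raw_fd(), old.as_ptr(),
                           dir.as_raw_fd(), new.as_ptr())
        };
        if rc != 0 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    }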
How about: the fd of the file you want to rename, the dirfd of the directory you want it to end up in, and the name of the new file? You could then represent a "rename within the same directory" as: dfd = opendir(...); fd = openat(dfd, "a"); rename2(fd, dfd, "b");
I can't think of a case this API doesn't cover, but maybe there is one.
The file may have been renamed or deleted since the fd was opened, and it might have been legitimate and on purpose, but there’s no way to tell what trying to resolve the fd back to a path will give you.
And you need to do that because nothing precludes having multiple entries to the same inode in the same directory, so you need to know specifically what the source direntry is, and a direntry is just a name in the directory file.
After reading this article, I'm inclined to think that the right thing for this project to do is write their own library that wraps the Rust stdlib with a file-handle-based API along with one method to get a file handle from a Path; rewrite the code to use that library rather than rust stdlib methods, and then add a lint check that guards against any use of the Rust standard library file methods anywhere outside of that wrapper.
If that's the right approach, then it would be useful to make that library public as a crate, because writing such hardened code is generally useful. Possibly as a step before inclusion in the rust stdlib itself.
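A sketch of what the seed of such a wrapper might look like (all names hypothetical):

    use std::fs::File;
    use std::io::Read;
    use std::path::Path;

    // Hypothetical wrapper: exactly one path -> handle entry point, and
    // every other operation takes the handle, so no later call can
    // silently re-resolve a path.
    pub struct Handle(File);

    impl Handle {
        pub fn open(p: &Path) -> std::io::Result<Handle> {
            Ok(Handle(File::open(p)?))
        }
        pub fn read_all(&mut self) -> std::io::Result<Vec<u8>> {
            let mut v = Vec::new();
            self.0.read_to_end(&mut v)?;
            Ok(v)
        }
    }

    fn main() -> std::io::Result<()> {
        let mut h = Handle::open(Path::new("Cargo.toml"))?;
        println!("{} bytes", h.read_all()?.len());
        Ok(())
    }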
If anything, I find the Rust standard library defaults to Unix too much for a generic programming language. You need to think very Unixy if you want to program Rust on Windows, unless you're directly importing the Windows crate and foregoing the Rust standard library. If you're writing COBOL-style mainframe programs, things become even more forced, though I suspect the overlap between Rust programmers and mainframe programmers that don't use a Unix-like is vanishingly small.
This can also be a pain on microcontrollers sometimes, but there you're free to pretend you're on Unix if you want to.
Someone once coined a related term, "disassembler rage". It's the idea that every mistake looks amateur when examined closely enough. It comes from people sitting in a disassembler and raging at the high-level programmers who had the gall to e.g. use conditionals instead of a switch statement inside a function call a hundred frames deep.
We're looking solely at the few things they got wrong, and not the thousands of correct lines around them.
When I read the article I came away with the impression that shipping bugs this severe in a rewrite of utils used by hundreds of millions of people daily (hourly?) isn’t ok. I don’t think brushing the bad parts off with “most of the code was really good!” is a fair way to look at this.
Cloudflare crashed a chunk of the internet with a rust app a month or so ago, deploying a bad config file iirc.
Rust isn’t a panacea, it’s a programming language. It’s ok that it’s flawed, all languages are.
I think that legitimate real-world issues in rust code should be talked about more often. Right now the language enjoys a reputation that is essentially misleading marketing. It isn't possible to create a programming language that doesn't allow bugs to happen (even with formal verification you can still prove correctness based on a wrong set of assumptions). This weird, kind of religious belief that rust leads to magically completely bug-free programs needs to be countered and brought in touch with reality IMO.
Nobody believes Rust programs are bug free, though. Rust never promised that. It doesn't even promise memory safety; it only promises memory safety if you restrict yourself to safe APIs, which simply isn't always possible.
Less than that actually, considering Rust has its own definition of what "safe" means.
https://dwarffortresswiki.org/DF2014:Fun&redirect=no
Is it possible you’ve misunderstood what Rust promises?
> It isn't possible to create a programming language that doesn't allow bugs to happen
Yes, that’s true. No one doubts this. Except you seem to think that Rust promises no bugs at all? I don’t know where you got this impression from, but it is incorrect.
Rust promises that certain kinds of bugs, like use-after-free, are much, much less likely. It eliminates some kinds of bugs, not all bugs altogether. It’s possible that you’ve read the claim about kinds of bugs and misinterpreted it as all bugs.
I’ve had this conversation before, and it usually ends like https://www.smbc-comics.com/comic/aaaah
Because the bugs were caused by programmer error, not anything inherent to rust. It was more notable due to cloudflare being a critical dependency for half the internet, but that particular issue could've happened in any language.
Exactly what is the controversial take here?
> I don’t think brushing the bad parts off with “most of the code was really good!” is a fair way to look at this.
Nope, this is fine.
> Cloudflare crashed a chunk of the internet with a rust app a month or so ago, deploying a bad config file iirc.
Maybe this?
> Rust isn’t a panacea, it’s a programming language. It’s ok that it’s flawed, all languages are.
Nope, this is fine too.
This kind of melodramatic reaction to rust code is fatiguing, honestly. Rust does not bill itself as some programming panacea or as a bug free language, and neither do any of the people I know using it. That's a strawman that just won't go away.
Rust applies constraints regarding memory use and that nearly eliminates a class of bugs, provided safe usage. And that's compelling to enough people that it warrants migration from other languages that don't focus on memory safety. Bugs introduced during a rewrite aren't notable. It happens, they get fixed, life moves on.
I didn't downvote, but I feel the last two points show a lack of nuance. It's saying "Rust doesn't prevent 100% of the bugs, like all other programming languages", while failing to acknowledge that if a programming language prevents entire classes of bugs, it's a very significant improvement.
Seems pretty impressive they rewrote the coreutils in a new language, with so little Unix experience, and managed to do such a good job with very little bugs or vulns. I would have expected an order of magnitude more at least.
Shows how good Rust is, that even inexperienced Unix devs can write stuff like this and make almost no mistakes.
Yes, it's the lack of Unix experience that's terrifying. So many of the mistakes listed are rookie mistakes, like not propagating the most severe errors, or the `kill -1` thing. Why were people who apparently did not have much experience using coreutils assigned to rewrite coreutils?
> Why were people who apparently did not have much experience using coreutils assigned to rewrite coreutils?
From what I understand, "assigned" probably isn't the best way to put it. uutils started off back in 2013 as a way to learn Rust [0], way before the present kerfuffle.
[0]: https://github.com/uutils/coreutils/tree/9653ed81a2fbf393f42...
Yeah, perhaps learning UNIX APIs and Rust at the same time doesn't lead to a drop-in replacement ready to be shipped in major distributions. Who would have thunk it.
Why is it even possible to represent a negative PID, let alone treat the integer -1 as a PID meaning "all effective processes"? This seems like a mistake (if not a rookie mistake) in the Linux kernel API itself.
-1 is a special case, a way to represent a PID with all bits set in a platform-independent way. It's not very clean, and it comes from ancient times when writing some extra code and storing an extra few bytes was way more expensive.
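A sketch of the guard this implies when a PID comes from user input (semantics per kill(2), via the libc crate):

    use std::io::{Error, ErrorKind};

    // kill(2): pid > 0 signals one process; pid == 0 the caller's process
    // group; pid == -1 every process the caller may signal; pid < -1 the
    // process group -pid. A tool that means "one process" must reject
    // anything <= 0 when the value comes from parsed user input.
    fn signal_one_process(pid: i32, sig: i32) -> std::io::Result<()> {
        if pid <= 0 {
            return Err(Error::new(ErrorKind::InvalidInput, "refusing special pid"));
        }
        if unsafe { libc::kill(pid, sig) } != 0 {
            return Err(Error::last_os_error());
        }
        Ok(())
    }

    fn main() {
        // Signal 0 performs the permission checks without sending anything.
        let me = std::process::id() as i32;
        assert!(signal_one_process(me, 0).is_ok());
    }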
Pretty much all the rough edges being discussed here are design mistakes in Linux or Unix, and/or a consequence of using an unsafe language with limited abstractions and a weak type system. But because of ubiquity, this is everyone’s problem now.
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
So does this mean that the original utils didn't have any test harness, and that the process of rewriting them didn't start by creating one either?
Sure there are many edge cases, but surely the OS and FS can just be abstracted away and you can verify that "rm .//" actually ends up doing what is expected (such as not deleting the current directory)?
This doesn't seem like sloppy coding, nor a critique of the language, it's just the same old "Oh, this is systems programming, we don't do tests"?
Alternatively: if the original utils _did_ have tests, and there were this many holes in the tests, then maybe there is a massive lack in the original utils test suite?
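For illustration, the kind of black-box test being suggested; the binary name uu_rm is an assumption, and POSIX requires rm to refuse "." as a final path component:

    use std::process::Command;

    #[test]
    fn rm_refuses_dot_slash_slash() {
        let tmp = std::env::temp_dir().join("rm_dot_test");
        std::fs::create_dir_all(&tmp).unwrap();
        // ".//" names the current directory, so rm must fail
        // and the directory must survive.
        let status = Command::new("uu_rm") // assumed binary name
            .args(["-r", ".//"])
            .current_dir(&tmp)
            .status()
            .expect("failed to spawn uu_rm");
        assert!(!status.success());
        assert!(tmp.exists());
    }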
My understanding is the uutils development process involved extensive testing against the behaviour of the original utilities, including preserving bugs.
One thing that's hard about rewriting code is that the original code was transformed incrementally over time in response to real world issues only found in production.
The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.
TFA is a good list of this exact sort of thing.
Before you call people amateur for it, also consider it's one of the most softwarey things about writing software. It was bound to happen unless coreutils had really good technical docs and included tests for these cases that they ignored.
uutils would be so much better imo if it was GPL and took direct inspiration from the coreutils source code.
It does not matter if it's in the GPL explicitly or not, since we're talking about uutils and their stance on it, and they've written that:
https://github.com/uutils/coreutils/blob/6b8a5a15b4f077f8609...
> we cannot accept any changes based on the GNU source code [..]. It is however possible to look at other implementations under a BSD or MIT license like Apple's implementation or OpenBSD.
The wording of that clearly implies that you should not look at GNU source code in order to contribute to uutils.
> The pattern is always the same. You do one syscall to check something about a path, then another syscall to act on the same path. Between those two calls, an attacker with write access to a parent directory can swap the path component for a symbolic link. The kernel re-resolves the path from scratch on the second call, and the privileged action lands on the attacker’s chosen target.
It's actually somewhat worse than that, because an attacker with write access to a parent directory can mess with hard links as well... sure, that only affects the regular files themselves, but there are basically no mitigations. See e.g. [0] and other posts on that site.
[0] https://michael.orlitzky.com/articles/posix_hardlink_heartac...
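To make the quoted pattern concrete, a minimal sketch of the racy shape next to the handle-based alternative (paths are placeholders):

    use std::fs::File;
    use std::os::unix::fs::MetadataExt;

    // RACY shape from the quote: check on a path, then act on the same
    // path. Between the two syscalls a component can become a symlink.
    fn racy_remove(path: &str) -> std::io::Result<()> {
        let meta = std::fs::symlink_metadata(path)?; // syscall 1: lstat
        if meta.is_file() {
            std::fs::remove_file(path)?; // syscall 2 re-resolves the path
        }
        Ok(())
    }

    // Handle-based shape: open once, then interrogate the handle; what
    // the fd refers to can't be swapped out between check and use.
    fn pin_then_inspect(path: &str) -> std::io::Result<(u64, u64)> {
        let f = File::open(path)?;
        let m = f.metadata()?; // fstat(2) on the fd we actually hold
        Ok((m.dev(), m.ino()))
    }

    fn main() -> std::io::Result<()> {
        let _ = racy_remove("work/target");
        println!("{:?}", pin_then_inspect("work/other")?);
        Ok(())
    }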
To the extent that locking exists in POSIX, it is various degrees of useless and broken. And as far as I know, while the BSDs have extensions that make some use cases workable, Linux is completely hopeless.
That's kind of horrifying. Is there a reliable list somewhere of all the functions that do that? Is that list considered stable?
Nope! But basically, expect anything that resolves usernames or host names to be done in userspace by NSS.
> Sun engineers Thomas Maslen and Sanjay Dani were the first to design and implement the Name Service Switch. They fulfilled Solaris requirements with the nsswitch.conf file specification and the implementation choice to load database access modules as dynamically loaded libraries, which Sun was also the first to introduce.
> Sun engineers' original design of the configuration file and runtime loading of name service back-end libraries has withstood the test of time as operating systems have evolved and new name services are introduced. Over the years, programmers ported the NSS configuration file with nearly identical implementations to many other operating systems including FreeBSD, NetBSD, Linux, HP-UX, IRIX and AIX. More than two decades after the NSS was invented, GNU libc implements it almost identically.
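For context, a typical /etc/nsswitch.conf fragment looks like this (back-ends vary by distribution):

    # /etc/nsswitch.conf -- for each database, back-ends are tried in
    # order; each name maps to a libnss_<name> shared object that glibc
    # dlopen()s at runtime.
    passwd:  compat
    group:   compat
    hosts:   files dns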
The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
E.g.
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username. An attacker who can plant a file in the chroot gets to run code as uid 0.
To me such a get_user_by_name function is like a booby trap, an accident that is waiting to happen. You need to have user data, you have this get_user_by_name function, and then it goes and starts loading shared libraries.
This smells like mixing of concerns to me. I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.
> The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
"Seems" and "smells" are weasel words. The root cause is not thinking: why is root chrooting into a directory they do not control?
Whatever you chroot into is under control of whoever made that chroot, and if you cannot understand this you have no business using chroot()
> To me such a get_user_by_name function is like a booby trap
> I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.
You'd probably still be in the trap: there's usually very little difference between writing to newroot/etc/passwd and newroot/usr/lib/x86_64-linux-gnu/libnss_compat.so or newroot/bin/sh or anything else.
There's no reason for /usr/sbin/chroot to look up the user id in the first place (toybox chroot doesn't!), so I think the bug was doing anything at all.
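A sketch of the ordering point via the libc crate (jail path and user name are placeholders): resolve the user while NSS still loads libraries from the old root, then chroot, then drop privileges:

    use std::ffi::CString;
    use std::io::{Error, ErrorKind};

    fn main() -> std::io::Result<()> {
        let user = CString::new("daemon").unwrap();
        let jail = CString::new("/srv/jail").unwrap();
        unsafe {
            // 1. Resolve the user FIRST: getpwnam goes through NSS, which
            //    dlopen()s libnss_* modules from the *current* root.
            let pw = libc::getpwnam(user.as_ptr());
            if pw.is_null() {
                return Err(Error::new(ErrorKind::NotFound, "no such user"));
            }
            let (uid, gid) = ((*pw).pw_uid, (*pw).pw_gid);

            // 2. Only then enter the possibly attacker-writable tree.
            if libc::chroot(jail.as_ptr()) != 0
                || libc::chdir(b"/\0".as_ptr().cast()) != 0
            {
                return Err(Error::last_os_error());
            }
            // 3. Drop privileges, group before user.
            if libc::setgid(gid) != 0 || libc::setuid(uid) != 0 {
                return Err(Error::last_os_error());
            }
        }
        Ok(())
    }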
Rust won't catch it, but now the agents will.
Thanks for the list. I like these lists, so I can put them into a .md file, then launch "one agent per file" on my codebase and see if they can find anything similar to the mentioned CVEs.
Edit: https://gist.github.com/fschutt/cc585703d52a9e1da8a06f9ef93c... for anyone who needs a copy of this.
For core system functionality, maybe. But for most applications, Rust's slow compile-iteration speed becomes a bottleneck when the likes of TypeScript (with Bun) and Go have sub-second iteration times.
Plus AI is also good at catching, in other languages, errors that Rust tooling enforces. Like race conditions, use after free, buffer overflows, lifetimes, etc.
So maybe AI will become the ultimate "rust checker" for any language.
The title of this article should be "Rust can't stop you from not giving a fuck" or "Rust can't give a fuck for you."
---
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
...
[List of bugs a diligent person would be mindful of, unix expert or not]
---
The only conclusion I can draw is that, unfortunately, the people writing these tools are not good software developers, certainly not sufficiently good for this line of work.
For comparison, I am neither a unix neckbeard nor a rust expert, but with the magic of LLMs I am using rust to write a music player. The amount of tokens I've sunk into watching for undesirable panics or dropped errors is pretty substantial. Why? Because I don't want my music player to suck! Simple as that. If you don't think about panics or errors, your software is going to be erratic, unpredictable and confusing.
Now, coreutils isn't my hobby music player, it's fundamental Internet infrastructure! I hate sounding like a Breitbart commenter but it is quite shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure. Wow, honestly pathetic. Sorry to be so negative and for this word choice, but "shock" and "disappointment" are mild terms here for me.
Anyway, thanks to the author of this post! This is a red flag that should be distributed far and wide.
I love Rust, but I wonder if this is an example of the idea that its excellent type system can lull some people into a false sense of security. Particularly when interfacing to low-level code like kernel APIs, which are basically minefields inadvertently designed to trick the unwary, the Rust guarantees are undermined. The extent of this may not be immediately obvious to everyone.
This seems to be the case, yes. Before reading this post I was a lot more open minded about the "rewrite it in Rust" scene but now I'm just kind of in a horrorpit wondering whether I'll be stuck on macOS forever :(.
> Pretty shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure
uutils did not start off as "let's make critical infrastructure in Rust", it started off as "coreutils are small and have tests, so we're rewriting them in Rust for fun". As a result, a bunch of cleanup work has been needed.
Okay, thanks for the context, but aren't distributions eager to adopt these? Are current GNU coreutils a common vulnerability vector?
> For fun
My idea of fun is reviewing my code and making sure I'm handling errors correctly so that my software doesn't suck. Maybe the people who are doing this, for fun, should be more aligned with that mentality?
> uutils now runs the upstream GNU coreutils test suite against itself in CI. That’s the right scale of defense for this class of bug.
That's the minimum, it is absurd that they did not start from that!
I recall the last time there was a massive bug in the uutils project, it was because the coreutils tests didn't cover some crucial aspect people relied on. Running these tests is useful for compatibility and all, but it won't necessarily catch security issues.
This is what happens when people hype a technology that solves a specific class of vulnerabilities but is not designed to prevent others, such as logic errors arising from human / AI error.
Granted, the uutils authors are well experienced in Rust, but it is not enough for a large-scale rewrite like this and you can't assume that it's "secure" because of memory safety.
In this case, this post tells us that Unix itself has thousands of gotchas, and re-implementing the coreutils in Rust is not a silver bullet. Even the bugs Unix (and even the POSIX standard) has are part of the specification, and can later be revealed as vulnerabilities in reality.
I'm not sure that they were all that experienced in Rust when most of this code was written. uutils has been a bit of a "good first rust issue" playground for a lot of its existence.
Which makes it pretty unsurprising that the authors also weren't all that well versed in the details of the low-level POSIX APIs.
I know nobody's perfect and I'm not asking for perfection, but these bugs are pretty alarming? It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they are trying to replace. Or at least didn't have any curiosity about why the GNU tools work the way they do. Otherwise they might've wondered about why things operate on bytes and file descriptors instead of strings and paths.
I hate to armchair general, but I clicked on this article expecting subtle race conditions or tricky ambiguous corners of the POSIX standard, and instead found that it seems to be amateur hour in uutils.
1. uutils as a project started back in 2013 as a way to learn Rust, by no means by knowledgeable developers or in a mature language
2. uutils didn't even have a consideration to become a replacement of GNU Coreutils until.... roughly 2021, I think? 2021 is when they started running compliance/compatibility tests, anyway
3. The choice of licensing (made in 2013) effectively forbids them from looking at the original source
> It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they were supposed to be replacing.
They're a group of people who want to replace pro-user software (GPL) with pro-business software (MIT).
I don't really want them to achieve their goal.
They are deliberately not looking at coreutils code because the Rust versions are released as MIT and they don't want the project contaminated by GPL. I am not fond of this, personally.
I find it interesting how people will criticise Rust for not preventing all bugs, when the alternative languages don't prevent those same bugs nor the bugs rust does catch. If you're comparing Rust to a perfect language that doesn't exist, you should probably also compare your alternative to that perfect language as well right?
I'd be interested in a comparison with the amount of bugs and CVE's in GNU coreutils at the start of its lifetime, and compare it with this rewrite. Same with the number of memory bugs that are impossible in (safe) Rust.
What's the point of a "rewrite in Rust" when it introduces bugs that either never existed in the original or were fixed already?
Don't just downvote me, tell me how I'm wrong.
> I'd be interested in a comparison with the amount of bugs and CVE's in GNU coreutils at the start of its lifetime
The point is, those bugs had been discovered and fixed decades ago. Do you want to wait decades for coreutils_rs to reach the same robustness? Why do a rewrite when the alternative is to help improve the original which is starting from a much more solid base?
And even when a complete rewrite would make sense, why not do a careful line-by-line porting of the original code instead of doing a clean-room implementation to at least carry over the bugfixes from the original? And why even use the Rust stdlib at all when it contains footguns that are not acceptable for security-critical code?
Idk, you should ask the maintainers these questions, or the Ubuntu maintainers. I'm not particularly arguing in favour of this rewrite, but the title and contents of the post are talking about Rust in general and the type of bugs it can/can't prevent.
Perhaps one good reason is that once the initial bugs are fixed, over time the number of security issues will be lower than the original? If it could reach the same level of stability and robustness in months or a small number of years, the downsides aren't totally obvious. We will have to wait to judge I suppose. Maybe it's not worth it and that's fine, but it doesn't speak to Rust as a language.
You’re right, but it’s gonna be hard to stop them from raging. In many ways people want to be justified in a "see, I told you so, Rust is useless" belief, and they’re willing to take one or two questionable logical steps to get there.
Google people were specifically talking about memory safety. Logic bugs happen. All of the issues in TFA are effectively logic bugs or POSIX stuff, which is a different category entirely that Rust never claimed to solve