Compiling C to Safe Rust, Formalized

(arxiv.org)

289 points | by love2read 9 days ago

18 comments

  • wffurr 9 days ago
    Note that this is done for “existing formally verified C codebases” which is a lot different from typical systems C code which is not formally verified.
    • safercplusplus 9 days ago
      And even then, not completely reliably it seems (from Section 2.2):

      > The coercions introduced by conversion rules can however lead to subtle semantic differences

      The example they give is this C code:

          1 uint8_t x[1] = { 0 };
          2 uint8_t *y = x;
          3 *y = 1;
          4 assert(*x == 1); /* SUCCESS */
      
      getting translated to this (safe) Rust code:

          1 let x: [u8; 1] = [0; 1];
          2 let mut y: Box<[u8]> = Box::new(x);
          3 y[0] = 1;
          4 assert!(x[0] == 1) /* failure */
      
      So the pointer (iterator) targeting an existing (stack-allocated) array declared on line 2 gets translated to an owning pointer (Box) targeting a (heap-allocated) new copy of the array. So if the original code was somehow counting on the fact that the pointer/iterator was actually targeting the array it was assigned to, the translated code may (quietly) not behave correctly.

      For comparison, the scpptool (my project) auto-translation (to a memory safe subset of C++) feature would translate it to something like:

          1 mse::lh::TNativeArrayReplacement<uint8_t, 1> x = { 0 };
          2 mse::lh::TNativeArrayReplacement<uint8_t, 1>::iterator y = x; // implicit conversion from array to iterator
          3 *y = 1;
          4 assert(*x == 1); /* SUCCESS */ // dereferencing of array supported for compatibility
      
      or if y is subsequently retargeted at another type of array, then line 2 may end up as something like:

          2 mse::TAnyRandomAccessIterator<uint8_t> y = x; // implicit conversion from array to iterator
      
      So the OP project may only be converting C code that is already amenable to being converted to safe Rust. But given the challenge of the problem, I can respect the accomplishment and see some potential utility in it.

      edit: added translation for line 2 in an alternate hypothetical situation.

      • lambda 8 days ago

          the translated code may (quietly) not behave correctly.
        
        The whole point of them showing that example is that they say they catch this case, and bring it to the attention of the programmer:

          If the original C program further relies on x, our translation will error out, and will ask the
          programmer to fix their source code. This is another area where we adopt a “semi-active” approach
          to verification, and declare that some patterns are poor enough, even for C, that they ought to be
          touched up before the translation takes place.
        • safercplusplus 8 days ago
          Thanks for clarifying. The issue is what code would be rejected for auto-translation, not the correctness of an "accepted" translation (as my comment may have implied).

          The point of noting that the example translation quietly does the wrong thing is that this is why it would have to be ("unconditionally") rejected.

          While the paper does suggest that their example translation would be rejected:

          > If the original C program further relies on x, our translation will error out

          note that precisely determining whether or not the program "further relies on x" statically (at compile/translation-time) is, in general, a Halting Problem (i.e., it cannot be reliably done with finite compute resources). So they would presumably have to be conservative and reject any cases where they cannot prove that the program does not "further rely on x". So it's notable that they choose to use a (provisional) translation that has to be rejected in a significant set of false-positive cases.

          And at least on initial consideration, it seems to me that an alternative translation could have, for example, used RefCell<>s or whatever and avoided the possibility of "quietly doing the wrong thing". (And thus, depending on your/their requirements, avoided the need for unconditional rejection.) Now, one might be in a situation where they'd want to avoid the run-time overhead and/or potential unreliability of RefCell<>s, but even then it seems to me that their translation choice does not technically avoid either of those things. Their solution allocates on the heap, which has at least some theoretical run-time overhead, and could theoretically fail/panic.
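          For concreteness, a minimal sketch (my own, not from the paper) of what such a RefCell-based translation of their example might look like:

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Hypothetical alternative translation: the array lives behind a
    // shared, dynamically checked cell, and the "pointer" y is just
    // another handle to the same storage.
    let x: Rc<RefCell<[u8; 1]>> = Rc::new(RefCell::new([0; 1]));
    let y: Rc<RefCell<[u8; 1]>> = Rc::clone(&x); // uint8_t *y = x;
    y.borrow_mut()[0] = 1;                       // *y = 1;
    assert!(x.borrow()[0] == 1);                 // SUCCESS, matching the C
}
```

          This preserves the original aliasing behavior, at the cost of the reference-count and borrow-flag bookkeeping (and possible borrow panics) mentioned above.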

          Now I'm not concluding here that their choice is not the right one for their undertaking. I'm just suggesting that choosing a (provisional) translation that has to be rejected with significant false positives (because it might quietly do the wrong thing) is at least initially notable. And that there are other solutions out there that demonstrate translation of C to a (high-performance, deterministic) memory-safe language/dialect that don't have the same limitations.

    • light_hue_1 9 days ago
      Oh, this has so so many more caveats! It's borderline false advertising.

      First of all they never translated any C! At all. Zero lines.

      They took code written in F* and modified its C compiler to emit Rust. They never had to deal with any actual C code of any complexity, aside from the most trivial code that might theoretically be emitted by a toy compiler (but wasn't even that!). They just pretended they were dealing with C (or a heavily restricted Mini-C).

      Even if we accept this, there are plenty of arbitrary restrictions on the C it can even support in principle:

      > If the original C program further relies on x, our translation will error out, and will ask the programmer to fix their source code.

      That's saying you want C to already be written in a style that makes the Rust borrow checker happy! They're avoiding the actual hard parts of the problem.

      > These rules present simplified versions of what we support. Our implementation features several peephole optimizations ...

      If I may translate from the language of academia. "We present beautiful rules in figure 4. But in reality, our implementation relies on a large number of hacks."

      But it gets worse!

      > For overlap cases that can be distinguished statically (as above), we emit a compile-time error; otherwise, the Rust code will panic at runtime.

      The resulting code is not guaranteed to be correct! For example, aliasing in C can cause crashes in the resulting Rust code. See the comment at the top of page 10. We're going from a formally verified C program to an "it may crash now" Rust program!? That's nuts!
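      As a hedged illustration (mine, not the paper's generated code), this is the flavor of runtime failure that dynamic aliasing checks produce in Rust: the program compiles, then panics when the overlap actually occurs:

```rust
use std::cell::RefCell;
use std::panic;

fn main() {
    let cell = RefCell::new(0u8);
    // Two simultaneous mutable borrows of the same cell pass the
    // compiler but are only caught at runtime, with a panic.
    let result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
        let _first = cell.borrow_mut();
        let _second = cell.borrow_mut(); // already mutably borrowed: panics
    }));
    assert!(result.is_err());
}
```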

      > We apply our methodology to existing formally verified C codebases, namely, the HACL* cryptographic library ...

      This is such a blatant lie that I had to return to it. It's designed to catch an unwary reviewer. We often talk about HACL* being in verified C because it gets compiled to that. But it's not a C library; it's written in a totally different language. You cannot confuse the two.

      I'm not a reviewer for their paper. But if I was, I would strongly fight for rejection.

      The fact that they only handle formally verified C is so astronomically far away from being their main problem.

      An honest title would be "Compiling a subset of F* to partially Safe Rust, Partially Formalized"

      • lou1306 9 days ago
        > An honest title would be "Compiling a subset of F* to partially Safe Rust, Partially Formalized"

        Sadly that is the name of the game in conference publishing. Make a big claim in the title and hope the reviewer does not read the fine print.

        > If I may translate from the language of academia. "We present beautiful rules in figure 4. But in reality, our implementation relies on a large number of hacks."

        Er, this I can understand. Every conference paper only presents you with a simplified view of the contributions. There is no way to describe every corner case within the page limits of a conference paper (specifically, I bet this was submitted to POPL 2025).

        • light_hue_1 9 days ago
          > Er, this I can understand. Every conference paper only presents you with a simplified view of the contributions.

          In my group I don't allow hacks. Most other good papers I can think of don't do that. It takes new students a while to get used to not maximizing performance at all costs. And if a hack must exist, then it's explained, not as a hack, but as part of the method.

          Don't you hate it when you implement some beautiful idea from a paper only to discover that it hardly bears any relationship to the actual method that works?

          > There is no way to describe every corner case within the page limits of a conference paper (specifically, I bet this was submitted to POPL 2025).

          You're probably right. But POPL allows for unlimited pages in an attached appendix. They can and should describe the full method. Particularly if they're going to play the game of saying that they formalized this.

          > Sadly that is the name of the game in conference publishing. Make a big claim in the title and hope the reviewer does not read the fine print.

          It's a form of namespace squatting. Do the worst job possible to claim a title that you couldn't execute on, so that when people figure out that problem, they will be forced to cite your non-working solution. I loathe this approach to publishing.

          We should punish people for doing this. Reject for lack of honesty, and your paper can't be published in related conferences for X years.

          • lou1306 8 days ago
            Well in this case (and if it was POPL, but it's a pretty safe bet considering the format and the timing) it looks like reviewers have indeed rejected it. And I completely agree, it is namespace squatting. Sadly every once in a while it does work (and very effectively), so there is little incentive for the community to punish it.

            Sorry if my previous comment came off as dismissive, it's just that I'm getting increasingly disillusioned with the state of things in this space.

        • jeltz 8 days ago
          As someone who recently reviewed hundreds of abstracts for a mid-sized open source conference: our committee would have declined that talk due to the misleading title. Does academia really have such low standards?
          • lou1306 8 days ago
            Not to be dismissive, but you reviewed hundreds of abstracts. A top conference in this area receives hundreds to thousands of papers. So there is a non-negligible chance your submission trickles down to some PhD student who is still learning the ropes of peer review. And then, there is a non-zero chance they won't fully understand the fine print and give you a thumbs up.

            Again, this does not seem to be the case here (this one was likely uploaded to arXiv after being rejected) and I think the opposite (good papers being rejected) is way more common in academia, so I wouldn't frame it as "low standards". Just an unfortunate outcome of "publish or perish" in a conferences-first community that highly values a small subset of all venues.

    • akkad33 9 days ago
      Is Rust formally verified? Not that I know of
      • gpm 9 days ago
        No, but small pieces of it are, and there's active work to extend it

        The concept of the borrow checker has been formalized on a simplified version of Rust https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf - work has continued in this area steadily (e.g. see Tree Borrows)

        There's a variety of tools that take rust code and translate it to something a proof system understands, and then checks that it matches a specification. AWS is leading a project to use these to verify the standard library: https://aws.amazon.com/blogs/opensource/verify-the-safety-of...

      • PartiallyTyped 9 days ago
        You can always run model checkers like Kani, though even that is limited.
        • medo-bear 9 days ago
          So no?
          • johnisgood 9 days ago
            The answer is that it is not.

            It frustrates me more than it should, I admit, that people always mention Rust when they talk about safety, but never Ada / SPARK. You want formal verification? Use Ada / SPARK. It has been battle-tested. It has been used for critical systems for a really long time now.

            (And a compiler being formally verified vs. being able to write formally verified code means two different things.)

            • lambda 8 days ago
              You also need to consider that formal verification is not the be-all/end-all of correctness.

              Formal verification is verifying a mathematical model of the program correct, according to certain mathematical correctness properties. That model may not actually represent the real world, or the correctness properties may not actually be the properties that you want to guarantee. Famously, there was the formally verified Java LinkedList implementation that was buggy due to the formal model not taking into account integer overflow.
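              The classic shape of that failure can be sketched (in Rust here, not the Java code in question): a proof over unbounded integers shows (low + high) / 2 is a valid midpoint, but fixed-width arithmetic wraps:

```rust
// The idealized model says (low + high) / 2 is always in [low, high];
// fixed-width integers disagree once low + high overflows.
fn mid_naive(low: u32, high: u32) -> u32 {
    low.wrapping_add(high) / 2 // silently wraps, like Java's int addition
}

fn mid_safe(low: u32, high: u32) -> u32 {
    low + (high - low) / 2 // cannot overflow when low <= high
}

fn main() {
    let (low, high) = (2_000_000_000u32, 4_000_000_000u32);
    assert_eq!(mid_safe(low, high), 3_000_000_000);
    assert!(mid_naive(low, high) < low); // the "midpoint" left the range
}
```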

              There are parts of the Rust language and standard library that have been formally verified. That helps provide confidence that the basic model, of having safety properties upheld by unsafe core functions, to be sound.

              But the big advantage of Rust, especially for everyday open source code that the world widely depends on, is that it strikes a good balance between a language that people actually want to use, and which can provide good practical safety guarantees.

              Rust is far, far safer than C and C++. Most other memory-safe languages also are, but trade that off for lower performance, while Rust provides a similar level of performance as C and C++. And additionally, it's a language that people actually want to use for writing a lot of the core infrastructure.

              As far as I know, I have never used any open source software written in Ada/SPARK. People just don't do it. The only reason they generally do is if they need to for functional safety reasons, in a high-criticality system, and they find it easier to get their software certified by writing it that way.

              And even in safety critical systems, most people consider the gold standard of verification to be using a subset of C with restrictive rules, static checkers to enforce them, code review, and extensive testing done by independent teams on real hardware, with requirements-based tests that achieve 100% coverage at the MC/DC level.

              Formal verification can not possibly provide some of the safety guarantees that testing on real hardware based on actual system level requirements can provide.

              Formal verification is one tool in a toolbox. It is useful for providing certain guarantees. But it's less important than many other factors for the actual improvements in safety that Rust can provide; by actually being used, in critical software that is exposed to potential network based exploitation, a memory-safe language like Rust makes a much better foundation than C or C++, and more people will actually use it in such a context than Ada/SPARK.

              • medo-bear 6 days ago
                The article talks about compiling formalized C code to Rust. While interesting from a tinkering perspective, I fail to see any significant utility in safety, security, readability, popularity and ease of language. Can you please educate me as to what I am missing?
                • lambda 2 days ago
                  They have a library that allows you to write and verify formalized C code more easily. And they have actually written parsers for some common file formats in this, which have been used in some major projects; and that's good, because efficient parsers for complex file formats is one of the places where you really want to use C for its efficiency, but it's also really easy to make a mistake that leads to exploitable memory safety vulnerabilities.

                  Now that Rust is becoming more popular, it would be nice to be able to re-use these formally verified parsers in Rust, where your entire language is memory safe. The formally verified parsers can still be helpful, because the formal verification can ensure that you also won't crash (in safe Rust, you can still crash, you just won't be subject to arbitrary memory corruption).

                  But just using the C libraries from Rust is unsatisfactory, now you need to go through an unsafe interface which introduces a potential place to introduce new bugs. And there are existing C to Rust translators, but they generate unsafe Rust.

                  So this demonstrates a way to translate from C to safe Rust, though with constraints on the existing C code. It's both useful in that it means that you can translate some of these already formally verified libraries to safe Rust, and this research could be used as part of later work for a more general tool, that could potentially translate more C code to safe Rust while falling back to unsafe in cases that it can't reason about.

                  Anyhow, not all academic work like this ends up being used practically in the real world, but some of it can be, or some of it can be put into practice by other tools later on. Rust came about that way; much of its reason for existence is trying to put decades of academic research into a practical, real-world usable language, since there was lots of academic work that had never really been used in industry as much as it should be.

            • medo-bear 9 days ago
              I think a disclaimer like this should be written with every Rust application, like health warnings on cigarette packets
              • johnisgood 9 days ago
                At this point, I think that would be better, yes, just because people think Rust is "fully" safe, which is just incorrect. I think the problem was the Rust hype and repeated statements of it being very safe, so we have some undoing to do.

                For example if someone on GitHub sees that the project is written in Rust, they are automatically going to assume it is safe, incorrectly so. I do not blame them though.

                • keybored 9 days ago
                  You presumably extend this to every virtual machine or interpreter for every language which is implemented in an unsafe language. When that language claims to be safe (like all such languages claim to be).

                  That seems excessive and tedious.

                  • medo-bear 9 days ago
                    The point, I think, was that "safety" presumptions about Rust are often exaggerated or poorly understood due to hype. That could certainly lead to problems
                    • woodruffw 9 days ago
                      I don’t think Rust’s actual safety properties aren’t overhyped, although they may be subject to misunderstanding about their exact extent.

                      Concretely: spatial and temporal memory safety are good things, and Rust achieves both. It’s not unique in this regard, nor is it unique in not having a formal definition.

                      • medo-bear 8 days ago
                        > although they may be subject to misunderstanding about their exact extent

                        Isn't that what overhype means? Also no one is saying that Rust is unique in being overhyped. It is true of almost any language worth writing in, including C, Lisp, Python, Haskell, TypeScript, etc.

                        • woodruffw 8 days ago
                          I don't see much overhype coming from Rust practitioners. I see a lot of people who care about spatial and temporal memory safety, for which the hype with Rust appears largely appropriate. I've yet to see people (incorrectly) extend this to a claim that Rust solves all security problems, which would meet the definition of overhype.

                          (If there's nothing unique here, it doesn't make sense to single any particular language out. But each language does have unique properties: Python is a great rapid development language, Rust offers zero-cost abstractions for memory safety, etc.)

                          • medo-bear 7 days ago
                            > I don't see much overhype coming from Rust practitioners

                            That is like saying I don't see much overhype about AI from machine learning engineers. I am an ML engineer, and like myself the great majority of ML engineers will tell you that there is certainly overhype about the field and do not engage in the overhype. Which is not to say that the field isn't producing some really cool results

                            > I've yet to see people (incorrectly) extend this to a claim that Rust solves all security problems, which would meet the definition of overhype.

                            I've seen plenty of questionable Rust rewrites justified by solving security problems

                            > Python is a great rapid development language

                            Saying to a Lisper that Python is a great rapid development language is like selling Rust safety to an Ada person :)

                      • keybored 8 days ago
                        Double negation makes things ambiguous.
                        • woodruffw 8 days ago
                          The second negation is a typo, but it’s too late to fix it.
                    • keybored 8 days ago
                      > The point, I think, was that "safety" presumptions about Rust are often exaggerated or poorly understood due to hype. That could certainly lead to problems

                      Then the point is hypocritical.

                      Runtimes for safe programming languages have been implemented in unsafe languages since the dawn of safe programming languages, basically.

                      EDIT: I see now that you are the cigarette warning guy. In that case I don’t understand what this coy “I think” speculation is about when you made such a bizarre proclamation on this topic.

                      • medo-bear 8 days ago
                        Your tendency to answer jokes so seriously is a symptom of the hype
                        • keybored 8 days ago
                          Me being humorless (read: not just rolling with your cop-out) is a symptom of the Rust hype.

                          Doesn’t even make sense.

                          • medo-bear 8 days ago
                            You feeling the need to defend it at every jibe. Chill. Rust is not a cult
                            • keybored 8 days ago
                              A jibe is some sort of playful banter based on a literal or exaggerated truth. Since your comment does not lampoon any real deficiency except when being hypocritical, there is nothing to defend.

                              Maybe you should wear a sign on your torso: will make irrational points and then complain about persecution when that is pointed out. I don’t know. It’s just one more idea for when we decide to put all the anti-Rust people in camps.

                              • medo-bear 8 days ago
                                It's Sunday, shouldn't you be in the church of the holy borrow checker?
                            • mardifoufs 8 days ago
                              It's weird to imagine for cniles but discussing a language doesn't inherently involve defending it, especially when it's just someone correcting someone else. I know, that's a very weird concept :^)
                              • medo-bear 7 days ago
                                Correcting what, though?
                        • jeltz 8 days ago
                          Your irrational hate for a programming language adds nothing to the discussion.
                          • medo-bear 8 days ago
                            so many powerful words lol
                • chillingeffect 9 days ago
                  Rust is to safe as Tesla is to autopilot.
                  • mardifoufs 8 days ago
                    Judging by the facts, are you saying that Tesla has a safe auto pilot? Rare to see on HN :^)
                    • chillingeffect 7 days ago
                      Heh no... I was going for "the average person thinks Rust is automatically safe." And "the average person thinks Tesla is automatically safe."
                  • medo-bear 9 days ago
                    As Tesla is to Tesla
      • antiquark 8 days ago
        Rust doesn't have a specification or standard yet, which would make it difficult to formally verify.

        https://stackoverflow.com/questions/75743030/is-there-a-spec...

        • gpm 8 days ago
          It does have a specification: https://github.com/ferrocene/specification

          It also strikes me as extraordinarily unlikely that any formal verification effort will use the existing specification, and not build their own (using their own formal language) as they go.

    • jokoon 9 days ago
      What is formally verified C? Why isn't there more of it?
      • PhilipRoman 9 days ago
        Because it takes a lot of effort. Google Frama-C. On the flip side, it can express not just memory safety constraints, but also correctness proofs.

        In this case it's not about Frama-C or similar tools though, see your sibling comments about the caveats.

        • zozbot234 9 days ago
          Arguably, generating verified C from a low-level focused subset of F* (Low*, used in the HACL project) is close enough to count as a "similar tool".
    • medo-bear 9 days ago
      What is the benefit of compiling formally correct code to Rust? It seems that all the possible benefits are already there (if not more)
      • gpm 9 days ago
        I suppose hypothetically putting it in an easier language to make changes in. Though it's hard to imagine it being easier to make changes to transpiled code than the original.

        Alternatively this might be seen as a stepping stone to translating non-formally-verified C to rust, which I understand the US government has expressed a fair bit of interest in.

        • medo-bear 9 days ago
          Good luck with both of those. Otherwise people whose day job is rewriting everything in Rust will soon be out of a job
    • tonetegeatinst 9 days ago
      What is the main difference? Can compiler flags force compliance?
      • nickpsecurity 9 days ago
        Formal verification often requires simplified code in a restrictive style. You might not even be able to use C features or structures that have the performance you want. Theorem provers and brains also work differently enough that making something easy for one often makes it harder for the other.

        You can also see this effect in the article on the history of Coverity’s analyzer. Real-world code was horrible to deal with vs the academic examples they started with.

        https://cacm.acm.org/research/a-few-billion-lines-of-code-la...

      • LPisGood 9 days ago
        My understanding is that formal verification is a tough goal to achieve and that it usually requires designing the program or the language to be a specific way.

        The problem with transpiling C to Rust is that unsafe and unverified behavior can be a key property of the resulting program's behavior, so there isn't an obvious way to spit out a sort of rustified (CRustified?) binary that matches the behavior of the C program.

        • thunkingdeep 9 days ago
          Typical term is “Oxidized”. I think they feel clever when they do the RiiR thing and say that.
      • immibis 9 days ago
        "Formally verified" means someone has written a correct mathematical proof that the code has no bugs (the proof is checked by a computer program to make sure it is correct). This is a very high bar.

        I'm not sure what it has to do with translating the code to Rust.

        • swiftcoder 9 days ago
          > I'm not sure what it has to do with translating the code to Rust.

          Formally verified C programs typically have had all their undefined behaviour already stamped out during the verification process. That makes mapping it over to Rust a lot simpler.

          • tjoff 9 days ago
            Translating undefined behavior is the easy part.
            • swiftcoder 9 days ago
              Maybe, but only if you know the specifics of the environment in which it is executing (i.e. which compiler/architecture-specific behaviours the code actually relies on)
              • Filligree 9 days ago
                Correctly written C programs don’t have undefined behaviour, so a translator can operate assuming there is none.
        • DonaldPShimoda 9 days ago
          In this case specifically, two separate aspects are being referred to with regard to "formal verification".

          The first is that the translation function (from C to Rust) has itself been formally verified. In other words: if you give it a C program that obeys certain necessary pre-conditions, it will give you a safe Rust program. I believe the title of the paper is primarily using "Formalized" with regard to this characteristic.

          The second is that the set of programs the authors evaluate their tool on are C programs that have themselves been formally verified. I only barely skimmed the introduction and didn't see it addressed directly there, but I would assume that this is because the pre-conditions necessary for their translation to work are most easily (only?) met by formally verified C programs, where of course the verification has been rendered with respect to properties that would be relevant here (e.g., memory use).

        • SAI_Peregrinus 8 days ago
          > "Formally verified" means someone has written a correct mathematical proof that the code has no bugs

          No, only that it complies with a formal specification. The code will have bugs if and only if the specification has bugs.

          The difficulty of writing a bug-free formal specification is just as great as writing bug-free code. In some domains it's easier to write the specification, but not in most. Fully specifying what a program should do is often the hardest part of programming, after all.

    • cushychicken 9 days ago
      Woof. Major unspoken caveat in the title!
  • amenghra 9 days ago
    In 2002, a group of researchers presented a paper on Cyclone, a safe dialect of C [1]. While (manually) porting code from C to Cyclone, they found safety bugs in the C code.

    These kinds of manual or automated conversions from C to <safer language> therefore have potential not only for increasing adoption of safer languages but also for uncovering existing bugs.

    [1] https://www.researchgate.net/profile/James-Cheney-2/publicat...

    • als0 3 days ago
      > Cyclone is no longer supported; the core research project has finished and the developers have moved on to other things. (Several of Cyclone's ideas have made their way into Rust.) Cyclone's code can be made to work with some effort, but it will not build out of the box on modern (64 bit) platforms.

      http://cyclone.thelanguage.org

  • pizza234 8 days ago
    I've ported some projects to Rust (including C, where I've used C2Rust as first step), and I've drawn some conclusions.

    1. Converting a C program to Rust, even if it includes unsafe code, often uncovers bugs quickly thanks to Rust’s stringent constraints (bounds checking, strict signatures, etc.).

    2. automated C to Rust conversion is IMO something that will never be solved entirely, because the design of C program is fundamentally different from Rust; such conversions require a significant redesign to be made safe (of course, not all C programs are the same).

    3. In some cases, it’s plain impossible to port a program from C to Rust while preserving the exact semantics, because unsafety can be inherent in the design.

    That said, tooling is essential to porting, and as tools continue to evolve, the process will become more streamlined.

    • int_19h 8 days ago
      > automated C to Rust conversion is IMO something that will never be solved entirely

      Automated conversion of C to safe fast Rust is hard. Automated conversion of C to safe Rust in general is much easier - you just need to represent memory as an array, and treat pointers as indices into said array. Now you can do everything C can do - unchecked pointer arithmetic, unions etc - without having to fight the borrow checker. Semantics fully preserved. Similar techniques have been used for C-to-Java for a long time now.

      Of course, the value of such a conversion is kinda dubious. You basically end up with something like C compiled to wasm, but even slower, and while the resulting code is technically "safe", it is still susceptible to issues buffer overflows producing invalid state, dangling pointers allowing access to data in contexts where it shouldn't be allowed etc.
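
      A minimal sketch of that flat-memory encoding (all names here are illustrative, not from any real translator):

```rust
// Hypothetical sketch: model all of C memory as one flat Vec<u8>, and
// treat "pointers" as plain indices into it. Every access is
// bounds-checked against the whole array, so the Rust is safe, but a
// stale index (a dangling "pointer") still reads whatever lives there.
pub struct Memory {
    bytes: Vec<u8>,
}

impl Memory {
    pub fn new(size: usize) -> Self {
        Memory { bytes: vec![0u8; size] }
    }
    // "Dereference": panics on out-of-bounds instead of invoking UB.
    pub fn load(&self, ptr: usize) -> u8 {
        self.bytes[ptr]
    }
    pub fn store(&mut self, ptr: usize, val: u8) {
        self.bytes[ptr] = val;
    }
}
```

      Pointer arithmetic becomes ordinary index arithmetic, which is why unchecked idioms and unions translate trivially, and also why overflows into a neighboring "allocation" go undetected.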

      • zozbot234 8 days ago
        You can do a lot better than that. You can treat memory ranges coming from separate allocations as distinct segments, and pointers as tuples of a segment ID and a linear offset within the segment. This is essentially what systems like CHERI are built on, and how C and C++ are implemented on segmented architectures like the 8086 and 80286. The C standard includes a somewhat limited notion of "objects" that's intended to support this exact case.
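
        A rough sketch of that segmented model (hypothetical names), where each allocation is its own segment and a pointer is a (segment, offset) pair:

```rust
// Hypothetical sketch: one buffer per allocation ("segment"); a pointer
// is a segment ID plus a linear offset. An overflow inside one segment
// panics instead of silently reaching a neighboring allocation.
pub struct Heap {
    segments: Vec<Vec<u8>>,
}

#[derive(Clone, Copy)]
pub struct Ptr {
    segment: usize,
    offset: usize,
}

impl Ptr {
    // Pointer arithmetic stays within the segment, as in the C object model.
    pub fn add(self, n: usize) -> Ptr {
        Ptr { segment: self.segment, offset: self.offset + n }
    }
}

impl Heap {
    pub fn new() -> Self {
        Heap { segments: Vec::new() }
    }
    pub fn malloc(&mut self, size: usize) -> Ptr {
        self.segments.push(vec![0u8; size]);
        Ptr { segment: self.segments.len() - 1, offset: 0 }
    }
    pub fn load(&self, p: Ptr) -> u8 {
        self.segments[p.segment][p.offset]
    }
    pub fn store(&mut self, p: Ptr, val: u8) {
        self.segments[p.segment][p.offset] = val;
    }
}
```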
      • pizza234 8 days ago
        There is no point in doing such type of conversions.
    • LPisGood 8 days ago
      >because unsafety can be inherent in the design

      I agree in principle, and I have strong feelings based on my experience that this is the case, but I think it would be illustrative to have some hard examples in mind. Does anyone know any simple cases to ground this discussion in?

      • nuancebydefault 8 days ago
        Suppose it is a dll that has exported functions returning or accepting unsafe strings. No way to make it safe without changing the API.
        • tatref 8 days ago
          In Rust, there is no unsafe String, only blocks of code can be unsafe, no?
          • whytevuhuni 8 days ago
            They likely mean a char* pointer to a null-terminated string, or a char* pointer and a length, as is usual for C.

            If Rust was forced to expose such an API (to be on par with C's old API), it would have to use `*const u8` in its signature. Converting that to something that can be used in Rust is unsafe.

            Even once converted to &[u8], it now has to deal with non-UTF8 inputs throughout its whole codebase, which is a lot more inconvenient. A lot of methods, like .split_ascii_whitespace, are missing on &[u8]. A lot of libraries won't take anything but a &str.

            Or they might be tempted to convert such an input to a String, in which case the semantics will differ (it will now panic on non-UTF8 inputs).
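
            For illustration (hypothetical function, not any real library's API), the boundary being described looks roughly like this:

```rust
use std::slice;

// Safe-Rust core: operates on bytes, so it must cope with non-UTF-8 input.
pub fn count_spaces(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| b == b' ').count()
}

// C-compatible wrapper with the (pointer, length) contract. The compiler
// cannot verify that `ptr` really points to `len` valid bytes, so the
// conversion to a slice is necessarily unsafe.
pub unsafe extern "C" fn count_spaces_c(ptr: *const u8, len: usize) -> usize {
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    count_spaces(bytes)
}
```

            Converting further to &str would add a fallible UTF-8 validation step on top of this, which is where the panic-or-error choice comes in.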

      • colejohnson66 8 days ago
        Maybe a JIT? Especially one that can poke back into the runtime (like dotnet).
        • LPisGood 8 days ago
          I know the Unity game engine uses a transpiler called IL2CPP that manages to preserve some of the safety features of C# but still gets the speed of C++, so one direction is definitely possible.
          • neonsunset 8 days ago
            Oh, it's mainly for platform compatibility. IL2CPP performance is really problematic since it still carries many issues of Mono, even if transpiled to C++: https://meetemq.com/2023/09/18/is-net-8-performant-enough/ (don't look at just the starting graph - make sure to scroll down to the notes, where RyuJIT's code competes with other fast entries or even outperforms them).

            Perhaps what you were looking for is NativeAOT? Either way C ports really well to C# since it supports a large subset of it "as is" and then some with generics and other features originating from C# itself.

  • protz 7 days ago
    Author here, I thought it'd be helpful to address a few of the points brought up in the various comment threads.

    1. This is an academic paper that we posted on arxiv, not a release announcement for a new product where we claim we have solved C to Rust. We submitted to a PL conference, not an open-source meeting like e.g. FOSDEM -- this is not the same audience at all, and the expectations are very different.

    2. Our story is simple. We start from the constraint of translating C to /safe/ Rust, and see what this entails: a small well-behaved subset of C, inference of slice splitting, a translation that may error out, and a program that may abort (plus a few other things described in the paper). We evaluate our ideas on what we have (C embedded in F*), and show that it scales decently with those constraints in mind, on a large-scale C library that is used in Firefox, Python, and many other pieces of mainstream software. We don't claim we can rewrite e.g. Firefox in Rust automatically.

    3. This is how research works. We think we have an interesting point in the design space; we don't claim we solve every issue, but think this is an interesting idea that may unlock further progress in the space of C to Rust translation, and we think it's worth putting out there for others to take inspiration from. Who knows, maybe some existing tool will use this approach for parts that fit in the subset, and fall back to unsafe Rust for other parts that don't fit! This is a very active area: if we can contribute something that other tools / researchers can use, great.

    4. This is not the final story, and again this is how research works. We are working on an actual C frontend via libclang, and are exploring how to e.g. guarantee that the generated Rust does not perform out-of-bounds accesses, perhaps by emitting verification conditions to Z3 (speculating on future work here). If the reviewers think more work is needed, that's fine, and we'll resubmit with enhancements. If the reviewers think this is an active area and others could benefit from our ideas, and take the paper, even better.

  • nickpsecurity 9 days ago
    The thing I wonder about is why we would do this. The technology to really convert industrial-grade apps from C to Rust could probably bulletproof the C apps more easily. They’d just have to do some analyses that fed into existing tooling, like static analyzers and test generators.

    Similarly, they might generate safe wrappers that let teams write new code in Rust side by side with the field-proven C. New code has the full benefits, old code is proven safe, and the interfaces are safer.

    A full on translator might be an ideal option. We’d want one language for the codebase in the future. Push-button safety with low false positives for existing C and C++ is still the greatest need, though. Maybe auto-correcting bad structure right in the C, too, like Google’s compiler tool and ForAllSecure’s Mayhem do.

    • DonaldPShimoda 9 days ago
      > The technology to really convert industrial-grade apps from C to Rust could probably bullet proof the C apps more easily.

      No, some C programs cannot be made safe. This can be due to dependency on undefined or unspecified behaviors, or it can be because introducing proper safety checks would limit the domain of possible inputs too much to be useful, or other things.

      Translating to a safe language can maintain the expressive capabilities of the inputs while statically guaranteeing correct operation at run-time. It is objectively better in these cases.

      > field-proven C

      I don't think this exists, as the numerous critical vulnerabilities over the years have shown. All we have is C that seems to work pretty well often enough to be useful.

      > old code is proven safe

      Old code is assumed to be safe due to luck, actually. "Prove" has a specific meaning (especially on a post for a paper about proving things), and the overwhelming majority of C code is not proven to any rigorous mathematical standard. In contrast, the Rust type system has been mathematically proven to be correct.

      > A full on translator might be an ideal option.

      It depends on what you're willing to give up. If you don't mind losing performance, limiting your domain of inputs or range of outputs, giving up code legibility, and so on, then sure, this can probably be done to some extent. But when you start wanting your translator to be both sound and complete over all of these concerns, you run into problems.

      • uecker 9 days ago
        > No, some C programs cannot be made safe. This can be due to dependency on undefined or unspecified behaviors, or it can be because introducing proper safety checks would limit the domain of possible inputs too much to be useful, or other things.

        You can certainly replace code using undefined behavior in C code by using defined constructs.

        > I don't think this exists, as the numerous critical vulnerabilities over the years have shown. All we have is C that seems to work pretty well often enough to be useful.

        I think this is highly misleading. Some of the most reliable programs I know are written in C, and Rust projects will also have critical vulnerabilities. Most vulnerabilities are not actually related to memory safety, and the use of unsafe Rust will also lead to memory safety issues in Rust code. So I see some advantage to Rust, but to me it is obviously overhyped.

      • _flux 9 days ago
        > In contrast, the Rust type system has been mathematically proven to be correct.

        Is this the case? E.g. the issue "Prove the Rust type system sound" https://github.com/rust-lang/rust/issues/9883 is closed with comment "This will be an open issue forever. Closing." in 2016: https://github.com/rust-lang/rust/issues/9883#issuecomment-2... .

        At least nowadays (since 2022) we do have a language specification for Rust: https://ferrous-systems.com/blog/the-ferrocene-language-spec...

      • nickpsecurity 9 days ago
        There are tools that prove the safety of C code. They include RV-Match, Astrée Analyzer, and Frama-C. CompCert certifies the compilation. There are tons of tools to find bugs by model checking, test generation, etc. Rust was nowhere near C last I checked in terms of tooling available.

        So, back to my original comment, the investment in making a Rust to C transpiler would make a C to safer C transpiler that fed that code into the verification tooling. A small transpiler would be easier to write with huge, immediate gains.

        On field-proven, there are two angles: spec and code. Many C/C++ apps didn’t become usable at all until years of iteration based on feedback from field use. An alternative in Rust might have incorrect specs. From there, a number have had much code review, static analysis, and testing. They have few to no known bugs at any point. So, there are definitely C/C++ applications out there that are field-proven with a decent level of safety.

        (Note: Any transpiler might need to be bug for bug and weird behavior for weird behavior compatible to match the implicit spec in the C/C++ code.)

        You’re right about the average Rust vs C code, like type system vs correctness. I’d prefer new code be written in safer languages, which I added to my proposal.

        If apples to apples on correctness, you’d have to compare C written for verification with Rust. Quite a few C projects have been proven to be correct. Since you’re using a compiler, I’ll add that tools like Softbound+CETS make C code safe automatically.

        I do think Rust brings the cost and effort down quite a lot, though. Its tooling is usually more mature, or at least free, compared to most tooling for verifying C.

  • alkonaut 9 days ago
    If you used a naïve translation to Rust, wouldn’t you get parts that are safe and parts that are unsafe? So your manual job would need to be only verifying safety in the unsafe regions (same as when writing rust to begin with)?

    Seems it would be a win even if the unsafe portion is quite large. Obviously not if it’s 90% of the end result.

    • Animats 8 days ago
      Indeed, yes. Someone tried converting C OpenJPEG to low-level unsafe Rust using c2rust. OpenJPEG was known to segfault on a test case. I tried that test case on the Rust version. Segfaulted in the equivalent place in the Rust code.

      At least it's compatible. But that approach is a dead end. To make any progress, translation must recognize the common idioms of the language and upgrade those to the idiomatic forms of the target language. Compiling into Rust generates awful Rust, full of calls to functions that do unsafe C-type pointer manipulation.

      The big upgrading problems mostly involve pointers. The most promising result in this paper is that they figured out how to convert C pointer arithmetic into Rust slices. Slices can do most of the things C pointer arithmetic can do, and now someone automated the translation. Pointer arithmetic that can't be translated has to be looked at with deep suspicion.

      A useful way to think about this is that raw pointers in C which point to arrays implicitly have a length associated with them. That length is not visible in C source code, but exists somewhere, as a function of the program state. It might be a constant. It might be the size requested back at a "malloc" call. It might be a parameter to a function. It's usually not too hard for maintenance programmers to find array lengths.

      That could be an LLM kind of problem. Ask an LLM, "Examine this code. What is the length of array foo?" Then use that to guide translation to Rust by a non-LLM translator. If the LLM is wrong, the resulting Rust will get subscript errors or have an oversize array, but will not be unsafe. Array size info idioms are stylized enough in C that it should be possible to get it right most of the time. Especially since LLMs can read comments.
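
      As a sketch of the two translation styles (illustrative code, not actual c2rust output):

```rust
// C2Rust-style translation keeps the C shape: the length is a separate
// parameter and the pointer arithmetic is unchecked.
pub unsafe fn sum_raw(p: *const u8, n: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..n {
        total += unsafe { *p.add(i) } as u32;
    }
    total
}

// Slice translation: the implicit length now travels with the pointer,
// and every access is bounds-checked.
pub fn sum_slice(xs: &[u8]) -> u32 {
    xs.iter().map(|&b| b as u32).sum()
}
```

      If the deduced length is wrong, the slice version panics or wastes space rather than corrupting memory, which is the failure mode described above.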

    • CodesInChaos 9 days ago
      A naïve translation would produce rust code which is almost entirely unsafe (using raw pointers instead of references everywhere). Translating to references is difficult, since C code isn't written with the restrictions of the Rust alias model / borrow-checker in mind.
  • pizlonator 9 days ago
    Compiling a tiny subset of C, that is. It might be so tiny as to be useless in practice.

    I have low hopes for this kind of approach; it’s sure to hit the limits of what’s possible with static analysis of C code. Also, choosing Rust as the target makes the problem unnecessarily hard because Rust’s ownership model is so foreign to how real C programs work.

    • pornel 9 days ago
      Rust's ownership model is close enough for translating C. It's just more explicit and strongly typed, so the translation needs to figure out what a more free-form C code is trying to do, and map that to Rust's idioms.

      For example, C's buffers obviously have lengths, but in C the length isn't explicitly tied to a pointer, so the translator has to deduce how the C program tracks the length to convert that into a slice. It's non-trivial even if the length is an explicit variable, and even trickier if it's calculated or changes representations (e.g. sometimes used in the form of one-past-the-end pointer).

      Other C patterns like `bool should_free_this_pointer` can be translated to Rust's enum of `Owned`/`Borrowed`, but again it requires deducing which allocation is tied to which boolean, and what's the true safe scope of the borrowed variant.
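
      The `should_free` pattern maps almost directly onto std's Cow, whose variants are literally Borrowed and Owned (hypothetical example function):

```rust
use std::borrow::Cow;

// In C this might return (char *buf, bool should_free); in Rust the
// enum carries that bit, and drop glue replaces the manual free().
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_")) // allocated: "should_free = true"
    } else {
        Cow::Borrowed(input)                // no allocation: "should_free = false"
    }
}
```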

      • smolder 9 days ago
        It's not that simple. In fact it's impossible in some cases if you don't sprinkle unsafe everywhere and defeat the purpose. Rusts restrictions are so that it can be statically analyzed to guarantee safety. The superset of all allowable C program behaviors includes lots of things that are impossible to guarantee the safety of through static analysis.

        Formally verified C involves sticking to a strict subset of the capabilities of C that is verifiable, much like Rust enforces, so it makes sense that programs meeting that standard could be translated.

      • pizlonator 9 days ago
        Rust’s ownership model forbids things like doubly linked lists, which C programs use a lot.

        That’s just one example of how C code is nowhere near meeting Rust’s requirements. There are lots of others.

        • orf 9 days ago
          > Rust’s ownership model forbids things like doubly linked lists, which C programs use a lot.

          It’s literally in the standard library

          https://doc.rust-lang.org/std/collections/struct.LinkedList....

          • quuxplusone 9 days ago
            But it's not in C's standard library. So the exercise isn't merely to auto-translate one language's standard library to another language's standard library (say, replacing C++ std::list with Rust LinkedList) — which would already be very hard. The exercise here is to auto-identify-and-refactor idioms open-coded in one language, into idioms suited for the other language's already-written standard library.

            Imagine refactoring your average C program to use GLib for all (all!) of its data structures. Now imagine doing that, but also translating it into Rust at the same time.

            • Animats 9 days ago
              > The exercise here is to auto-identify-and-refactor idioms open-coded in one language, into idioms suited for the other language's already-written standard library.

              That's what LLMs are for - idiom translation. You can't trust them to do it right, though.

              [Pan et al. 2024] find that while GPT-4 generates code that is more idiomatic than C2Rust, only 61% of it is correct (i.e., compiles and produces the expected result), compared to 95% for C2Rust.

              This problem needs both AI-type methods to help with the idioms and formal methods to ensure that the guessed idioms correctly capture the semantics.

              A big advance in this project is that they can usually translate C pointer arithmetic into Rust slices. That's progress on one of the hardest parts of the problem. C2Rust did not do that. That system just generates unsafe raw pointer arithmetic, yielding ugly Rust code that replicates C pointer semantics using function calls.

              DARPA is funding research in this area under the TRACTOR program. Program awards in April 2025, so this is just getting started. It's encouraging to see so much progress already. This looks do-able.

              • fuhsnn 9 days ago
                >That's what LLMs are for - idiom translation. You can't trust them to do it right, though.

                Optimizing C compilers also happen to be good at idiom recognition, and we can probably trust them a little more. The OP paper does mention a future plan to use clang as well: >We have plans for a libclang-based frontend that consume actual C syntax.

                If such a transformation can be done at the IR level, it might be more efficient to go C-IR > idiom transform to Rust-IR > run safety checks in Rust-IR > continue compilation in C-IR or Rust-IR, or combine both for better optimization properties.

                • swiftcoder 9 days ago
                  I'm definitely bullish on this angle of compiling C down to LLVM assembly, and then "decompiling" it back to Rust (with some reference to the original C to reconstruct high-level idioms like for loops)
              • immibis 9 days ago
                Actually, LLMs are for generating humorous nonsense. Putting them in charge of the world economy was not intended, but we did it anyway.
                • dhosek 9 days ago
                  Given that in my (small, employer-mandated) explorations with Copilot autocompletions it’s offered incorrect suggestions about a third of the time and seems to like to also suggest deprecated APIs, I’m skeptical about the current generation’s ability to be useful at even this small task.
                  • LeFantome 8 days ago
                    Have you seen O3?

                    If your experience with something less than half as good as state-of-the-art is that it worked 66% of the time, I am not sure why you would be so dismissive about the future potential.

                  • glouwbug 9 days ago
                    Sure but it takes two copilots to fly a plane
              • saghm 9 days ago
                Oh god, I can't even imagine trying to have formally-verified LLM-generated code. It's not surprising that even incremental progress for that would require quite a lot of ingenuity.
              • CodesInChaos 9 days ago
                Why does C2Rust produce so much incorrect code? Getting 5% wrong sounds terrible, for a 1:1 translation to unsafe Rust. What does it mis-translate?

                https://dl.acm.org/doi/pdf/10.1145/3597503.3639226

                > As for C2Rust, the 5% unsuccessful translations were due to compilation errors, the majority of them caused by unused imports.

                I'm rather confused by what that's supposed to mean, since unused imports cause warnings, not errors in Rust.

          • singron 9 days ago
            This implementation uses unsafe. You can write a linked list in safe rust (e.g. using Rc), but it probably wouldn't resemble the one you write in C.

            In practice, a little unsafe is usually fine. I only bring it up since the article is about translating to safe rust.

            • orf 9 days ago
              Safe rust isn’t “rust code with absolutely 0 unsafe blocks in any possible code path, ever”. Rc uses unsafe code every time you construct one, for example.

              Unsafe blocks are an escape hatch where you promise that some invariants the compiler cannot verify are in fact true. If the translated code were to use that collection, via its safe interfaces, it would still be “safe rust”.

              More generally: it’s incorrect to say that the rust ownership model forbids X when it ships with an implementation of X, regardless of if and how it uses “unsafe” - especially if “unsafe” is a feature of the ownership model that helps implement it.

              • andrewflnr 9 days ago
                No one here is confused about what unsafe means. The point is, they're not implemented by following Rust's ownership model, because Rust's ownership model does in fact forbid that kind of thing.

                You can nitpick the meaning of "forbids", but as far as the current context is concerned, if you translate code that implements a doubly linked list (as opposed to using one from a library) into Rust, it's not going to work without unsafe. Or an index-based graph or something.

                • oneshtein 9 days ago
                  It's easy to implement doubly linked lists in safe Rust. Just ensure that every element has one OWNER, to avoid «use after free» bugs, or use a garbage collector, like a reference counter.

                  Unlike C++ or Rust, C has no references, only pointers, so the developer must release memory manually at some arbitrary point. This is the problem and the source of bugs.
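
                  A minimal sketch of that single-owner approach: `next` owns the node via Rc, `prev` is a non-owning Weak back-edge (illustrative helper, not std's LinkedList):

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,   // owning forward edge
    prev: Option<Weak<RefCell<Node>>>, // non-owning back edge: no cycle leak
}

// Build a two-node list and return its head (hypothetical helper).
fn push_pair(a: i32, b: i32) -> Rc<RefCell<Node>> {
    let first = Rc::new(RefCell::new(Node { value: a, next: None, prev: None }));
    let second = Rc::new(RefCell::new(Node { value: b, next: None, prev: None }));
    second.borrow_mut().prev = Some(Rc::downgrade(&first));
    first.borrow_mut().next = Some(second);
    first
}
```

                  This is safe Rust, but it pays with runtime checks (Rc counts, RefCell borrow flags), which is one reason std's LinkedList uses unsafe internally instead.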

                  • saghm 9 days ago
                    While I might agree that it's easy if you use a reference counter, this is not going to be as performant as the typical linked list written in C, which is why the standard library uses unsafe for its implementation of stuff like this. If it were "easy" to just write correct `unsafe`, then it would be easy to do it in C as well.

                    Note that the converse to this isn't necessarily true! People I trust way more to write unsafe Rust code than me than me have argued that unsafe Rust can be harder than writing C in some ways due to having to uphold certain invariants that don't come up in C. While there are a number of blog posts on the topic that anyone interested can probably find fairly easily by googling "unsafe Rust harder than C", I'll break my usual rule of strongly preferring articles to video content to link a talk from youtube because the speaker is one of those people I mention who I'd trust more than me to write unsafe code and I remember seeing him give this talk at the meetup: https://www.youtube.com/watch?v=QAz-maaH0KM

                    • bonzini 9 days ago
                      > unsafe Rust can be harder than writing C in some ways due to having to uphold certain invariants that don't come up in C.

                      Yes, this is absolutely correct and on top of this you sometimes have to employ tricks to make the compiler infer the right lifetime or type for the abstraction you're providing. On the other hand, again thanks to the abstraction power of Rust compared to C, you can test the resulting code way more easily using for example Miri.

            • imtringued 9 days ago
              I don't really see it as a big "owning" of Rust that a complex pointer heavy structure with runtime defined ownership cannot be checked statically. Almost every language that people use doubly linked lists in has a GC, making the discussion kind of meaningless.

              So C and C++ are the exceptions to the rule, but how do they make it easy to write doubly linked lists? Obviously, the key assumption is that the developer makes sure that node->next->prev = node->prev->next = node (ignoring nullptr).

              With this restriction, you can safely write a doubly linked list even without reference counting.

              However, this isn't true on the pointer level. The prev pointers could be pointing at the elements in a completely random order. For example tail->prev = head, head->prev = second_last and so on. So that going backwards from the tail is actually going forwards again!

              Then there is also the problem of having a pointer from the outside of the linked list pointing directly at a node. You would need a weak pointer, because another pointer could have requested deletion from the linked list, while you're still holding a reference.

              If you wanted to support this generic data structure, rather than the doubly linked list you have in your head, then you would need reference counting in C/C++ as well!

              What this tells you, is that Rust isn't restrictive enough to enforce these memory safe contracts. Anyone with access to the individual nodes could break the contract and make the code unsafe.

            • oconnor663 9 days ago
              More important than whether you use a little unsafe or a lot, is whether you can find a clean boundary above which everything can be safe. Something like a hash function or a block cipher can be piles and piles of assembly under the covers, but since the API is bytes-in-bytes-out, the safety concerns are minimal. On the other hand, memory-mapping a file is just one FFI function call, but the uncontrollable mutability of the whole thing tends to poison everything above it with unsafety.
          • pizlonator 9 days ago
            Good luck inferring how to use that from some C programmer’s deranged custom linked list.

            C programmers don’t do linked lists by using libraries, they hand roll them, and often they are more complex than “just” a linked list. Lots of complex stuff out there.

        • Rusky 8 days ago
          Rust's ownership model has two aspects:

          - A dynamic part specifies what is actually allowed, and totally supports doubly linked lists and other sorts of cyclic mutable data structures.

          - A static part conservatively approximates the dynamic part, but is still flexible enough to express the interfaces and usage of these data structures even if it can't check their implementations.

          This is the important difference over traditional static analysis of C. It enables `unsafe` library code to bridge the dynamic and static rules in a modular way, so that extensions to the static rules can be composed safely, downstream of their implementation.

          Rust's strategy was never for the built-in types like `&mut T`/`&T` to be a complete final answer to safety. It actually started with a lot more built-in tools to support these patterns, and slowly moved them out of the compiler and runtime and into library code, as it turned out their APIs could still be expressed safely with a smaller core.

          Something like Fil-C would be totally complementary to this approach- not only could the dynamic checking give you stronger guarantees about the extensions to the static rules, but the static checks could give the compiler more leverage to elide the dynamic checks.

        • oneshtein 9 days ago
          Rust's ownership model doesn't forbid doubly linked lists. It forbids doubly owned lists, or, in other words, «use after free» bugs.
      • jerf 9 days ago
        That's a classic example of an argument that looks really good from the 30,000 foot view, but when you're actually on the ground... no, basically none of that beautiful idea can actually be manifested into reality.
      • bloppe 9 days ago
        Is this sarcastic? There's a reason why the lifetime checker is so annoying to people with a lot of C experience. You absolutely cannot just use your familiar C coding styles in Rust.
        • orf 9 days ago
          You’ve misread the comment.

          The ownership model is close enough, but the way that model is expressed by the developer is completely arbitrary (and thus completely nuts).

    • whatisyourwork 9 days ago
      It can be good as an interface language. Good for bindings.
    • titzer 9 days ago
      Meh, you know people are just going to throw LLMs at it and they'll be fine with it hallucinating correctish code by the ton-load. But I agree that they are going to have tough time making idiomatic Rust from random C. Like I said, correct-ish.
      • pizlonator 9 days ago
        Great way to introduce novel security vulnerabilities!

        If that’s the Rust way, then I’m all for it. Will make it easier for Fil-C to have some epic kill shots.

    • fc417fc802 9 days ago
      [dead]
  • zoom6628 9 days ago
    I wonder how this compares to the zig-to-C translate function.

    Zig seems to be awesome at creating mixed environs of zig for new code and C for old, and translating or interop, plus being a C compiler.

    There must be some very good reasons why Linux kernel maintainers aren't looking to zig as a C replacement rather than Rust.

    I don't know enough to even speculate so would appreciate those with more knowledge and experiencing weighing in.

    • ChristianJacobs 9 days ago
      > looking to zig as a C replacement rather than Rust

      Rust isn't a "replacement for C", but an addition to it. It's a tool that Torvalds et al. have recognised the value of, and thus it's been allowed in the kernel. The majority of the kernel code will still be written in C.

      I'm no kernel maintainer, but I can speculate that two of the main reasons for Rust over Zig are the compile time guarantees that the language provides being better as well as the rate of adoption. There is a lot of work done by many leading companies in the industry to provide Rust native code or maintained Rust bindings for their APIs. Windows devs are re-writing parts of _their_ kernel in Rust. There's a "movement" going on that has been going on for a while. I only hope it doesn't stop.

      Maybe the maintainers feel like Zig doesn't give them enough over C to be worth the change? Many of them are still opposed to Rust as well.

      • DonaldPShimoda 9 days ago
        > Rust isn't a "replacement for C"

        Hmm I think to clarify I would say that Rust _is_ intended as a replacement for C in general, but that this isn't how the Linux kernel developers are choosing to use it.

        As for why the kernel developers would choose Rust, I would think another one of the primary benefits is that the type system guarantees the absence of a wide class of memory-related errors that are prevalent in C, and this type system (as well as those of its predecessors) has been subjected to significant scrutiny by the academic community over the last couple of decades to help iron out problems. I suspect this is also part of why Rust has a relatively large and passionate community compared to other C alternatives.

        • burakemir 9 days ago
Agreed. The large and passionate community may owe itself to multiple factors, but "things actually work" is probably one of them.

          It is hard to get a full picture of how academic research influenced Rust and vice versa. Two examples:

          - The use of linearity for tracking ownership in types has been known to academics but had never found its way into a mainstream language.

          - Researchers in programming language semantics picked Rust as a target of formalization, which was only possible because of design choices around its type system. They were able to apply techniques that resulted from decades of trying to get a certified C. They have formalized parts of the standard library, including unsafe Rust, and found and fixed bugs.

          So it seems fair to say that academic research on safety for C has contributed much to what makes Rust work today, and in ways that are not possible for C and C++, because those languages do not offer static guarantees where types transport information about exclusive access to some part of memory.
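
          As a minimal sketch of that last point (my own illustration, not from the paper): the borrow checker statically rejects mutation while a shared borrow is live, which is exactly the kind of type-level guarantee about exclusive access that C cannot express.

          ```rust
          fn main() {
              let mut v = vec![1, 2, 3];
              let first = &v[0]; // shared (read-only) borrow of v
              // v.push(4);      // rejected at compile time if uncommented:
              //                 // cannot borrow `v` as mutable while `first` is live
              println!("{}", first);
              v.push(4);         // fine here: the shared borrow has already ended
              println!("{}", v.len());
          }
          ```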

      • josefx 9 days ago
        > It's a tool that Torvalds et. al. has recognised the value of and thus it's been allowed in the kernel.

        Has there actually been a successful contribution to the mainline kernel? The last two big projects I heard of (ext2 / Apple drivers) seemed to have issues getting their code accepted.

        • shakna 9 days ago
          rnull is in the kernel. And I believe one of the generic Realtek drivers is Rust as well.
    • devjab 9 days ago
      As I understand it most kernel maintainers aren’t looking to replace C with anything.

      Zig has much better interoperability with C than Rust, but it's not memory safe or stable. I think we'll see quite a lot of Zig adoption in the C world, but I don't think it's in direct competition with Rust as such. In my region of the world nobody is adopting Rust; the C++ people are staying in C++. There was some interest in Rust originally, but it never really caught on in any company I know of. Likely for the same reason Go has become huge in younger companies but won't really make its way into companies that are traditionally Java/C#: even if it made sense technically (and it probably doesn't), it's a huge change-management task. Zig is seeing traction for programs without the need for dynamic memory allocation, but not much beyond that.

    • spiffyk 9 days ago
      Zig is nowhere near mature enough to be considered for the kernel yet. There are breaking changes to it regularly still - which is a good thing for Zig now, but not so good for huge, long-lived codebases like Linux. Also compiler bugs happen.

      Saying this as someone who generally likes Zig's direction.

    • 3836293648 9 days ago
      Zig isn't 1.0 and has zero backcompat guarantees. It's also barely used anywhere and hasn't proven its value, even if parts of it may seem promising
    • capitol_ 9 days ago
      Maybe because zig isn't memory safe.
    • Ar-Curunir 9 days ago
      Rust is stable and used by a number of big players, while Zig is not stable, and as a result hasn’t seen widespread adoption yet
  • mbana 9 days ago
    Can something like `C2Rust` then use this to generate formally correct code?

    Also, was much of what the authors did manual, or was it run through something to produce the Rust code? If so, where is the code that generates Rust? I do not see any links to any source repos.

    • zozbot234 9 days ago
      > If so, where is the code that generates Rust, I do not see any links to any source repos.

      The paper states that these developments will be released under open source licenses after the review process is completed, i.e. most likely, after the paper is formally published.

  • p0w3n3d 9 days ago
    I wonder: if a C library is working (i.e. not formally proven to be free of problems, but works in most cases), why shouldn't we translate it using unsafe Rust? I would say there is value in it, as Rust generally lacks libraries. And this would be no different from using a dll/so that was written in C and can be unsafe in some circumstances, after all.
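
    For what it's worth, that's roughly what the FFI route already looks like: keep the C library as-is and confine the unsafety to a thin wrapper. A minimal sketch (using libc's `strlen`, which is linked by default, purely for illustration):

    ```rust
    use std::ffi::{CStr, CString};
    use std::os::raw::c_char;

    extern "C" {
        // declaration of the C function we link against (libc's strlen)
        fn strlen(s: *const c_char) -> usize;
    }

    // the unsafety is confined here; callers get a safe Rust API
    fn safe_strlen(s: &CStr) -> usize {
        unsafe { strlen(s.as_ptr()) }
    }

    fn main() {
        let s = CString::new("hello").unwrap();
        println!("{}", safe_strlen(&s));
    }
    ```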
  • jtrueb 9 days ago
    Interesting how higher optimization levels didn’t really help speed up rust in the O level comparison
    • vlovich123 9 days ago
      As they say it’s likely that the code they’re outputting is pessimizing rustc’s ability. Namely it sounds like they’re inlining code as part of the conversion
      • jtrueb 9 days ago
        Yes, I’m just saying how it kicks in basically immediately (O1).
  • dmezzetti 9 days ago
    Interesting concept. But for a working system in C, why do we need to "convert" it to Rust? Seems like an effort where the juice isn't worth the squeeze. It will probably create more problems than it fixes.
  • ljlolel 9 days ago
    I wonder how well O3 can do just compiling C to rust in one shot
    • saagarjha 9 days ago
      Probably pretty bad.
    • ojosilva 9 days ago
      Funny, I came here to say just the opposite, that I'm glad algorithmic computing is still a thing in research and that not everything is AI.

      Ironically, AI is able to produce research-grade algorithms and will probably become an authority on the subject, helping take more traditional CS to the next level.

      • Garlef 9 days ago
        I think it would make sense to evaluate whether the 'surgical' rewrites mentioned in the article can be carried out by, or assisted by, an LLM-based process.
      • sunshowers 9 days ago
        There's a lot of code in the world where correctness is a requirement. :)

        I agree with the sibling -- I think LLMs may be able to help automate some parts of it, but humans are still 95% of it. At least for now.

  • Alifatisk 9 days ago
    c2rust.com, but it uses things like libc::c_int
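
    For context, the output of such tools is mechanical and unidiomatic: raw C types and `unsafe` everywhere. A hand-written sketch of the style (not actual c2rust output; using the stdlib's `c_int` so it compiles standalone, where c2rust would use `libc::c_int`):

    ```rust
    use std::os::raw::c_int;

    // C: int add(int a, int b) { return a + b; }
    // a c2rust-style translation preserves C's types and ABI verbatim
    #[no_mangle]
    pub unsafe extern "C" fn add(a: c_int, b: c_int) -> c_int {
        a + b
    }

    fn main() {
        // even trivially safe translated functions must be called in an unsafe block
        println!("{}", unsafe { add(2, 3) });
    }
    ```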
    • love2read 9 days ago
      C2Rust is mentioned in the second paragraph of the related work section.
      • rat87 9 days ago
        How is c2rust doing these days? For practical codebases?
  • kelnos 9 days ago
    Ugh. They didn't compile any C to Rust. They modified the F*-to-C compiler to emit Rust instead. So they compiled F* to safe Rust. And they couldn't even do that 100% reliably; some valid F* constructs couldn't be translated into Rust properly. They could either translate them into Rust code that wouldn't compile, or into similar-looking Rust code that would compile but produce incorrect results.

    Flagged, this is just a lie of a title.

  • dboreham 9 days ago
    [flagged]
    • dang 9 days ago
      Maybe so, but please don't post unsubstantive comments to Hacker News.

      Edit: it looks like you've been doing this a lot lately. Can you please not? We're trying for something more interesting here.

  • rustacular 9 days ago
    [flagged]
  • ActorNightly 9 days ago
    [flagged]
    • zxvkhkxvdvbdxz 9 days ago
      With Rust having recently entered the Linux kernel, Windows 11, and QEMU, among other places where Haskell never took hold, I really fail to see where you feel the wind is blowing.

      The thing is, Rust is used today in more and more places because it's reliable. We're not going to switch out the ground we're standing on every time something shiny comes along, and that's why this is such an interesting development.

      • ActorNightly 6 days ago
        Just because you have tiny pieces being written in it doesn't mean it's getting adopted. It's way too cumbersome to write an entire OS in it; all the example projects like AWS Firecracker end up with tons of unsafes. And its runtime components for dynamic memory stuff eat into efficiency.

        Basically, a more expansive and possibly ML-powered Valgrind can easily offer the same memory safety as Rust does, without the penalty of slower development and efficiency hits. So Rust adoption is "unstable", i.e. it's just a nice improvement but does not give a hard advantage in any form or way.

        • zxvkhkxvdvbdxz 1 day ago
          > Its way too cumbersome to write entire OS in it

          Not at all. See https://www.redox-os.org/

          and many more: https://github.com/flosse/rust-os-comparison

          > all the example projects like AWS Firecracker end up with tons of unsafes.

          That's a good thing. With rust you are explicit about unsafe.

          > Basically a more expansive and possibly ML powered valgrind can easily offer the same memory safety as Rust does, without the penalty of slower development and efficiency hits

          It is apparent that you have no experience with Rust, and I wonder if you have used Valgrind much either, as you seem unaware of the huge overhead of running your app through Valgrind.

          Rust's error messages are precise, while valgrind is a crapshow to work with, as you should know since you mentioned it?

          > so Rust adoption is "unstable" i.e its just a nice improvement but not does not give a hard advantage in any form and way.

          It's used in production in several high profile projects, some that I already mentioned. Apparently smart people do see advantages.

      • glouwbug 9 days ago
        Even steel plated armour succumbs to rust given the environment and time
        • ActorNightly 6 days ago
          I clearly remember seeing replies similar to yours from the Haskell fans :)
    • CoastalCoder 9 days ago
      Would you mind expanding on this?

      It sounds interesting, but I'm not tuned into either community enough to know what parallels you see.

      • ActorNightly 6 days ago
        Haskell's philosophy basically assumed that programmers were really bad at parallelism and thread safety, and the language was made to address this concern. But because that overarching idea is ideological in nature (i.e. not true as a statement about programmers as a whole), the community was steeped in ideology, using a lot of lambda-calculus notation to set itself apart and making coding in it cumbersome. As a result, nobody bothered to really do anything with it.

        Rust has largely the same problem. You see bits of use of it here and there, but it's also based in the ideology that programmers are bad at managing memory.

        From a technical viewpoint, Rust doesn't offer anything that doesn't already exist, while providing a more cumbersome way to code. Which is why its days are numbered.

        • tome 6 days ago
          Haskell was most certainly not made to address a concern about programmers being really bad at parallelism and thread safety.
          • ActorNightly 6 days ago
            It absolutely was. The functional programming paradigm is designed specifically to describe a program without side effects, so you can be sure that execution is deterministic (and thus scalable and parallelizable). There are no other advantages to functional programming.
    • Ar-Curunir 9 days ago
      What steps are you talking about? lambda calculus is one particular way to formalize program semantics, which is appropriate when talking about... program formalization
      • ActorNightly 6 days ago
        Lambda calculus is a thing that stays in colleges; nobody uses it in the real world. Computers are imperative by design. Any community that tries to apply some abstract theory to tools used in the real world is doomed to fail from the start.
    • jedisct1 9 days ago
      [flagged]
    • keybored 9 days ago
      [flagged]