Rustls Server-Side Performance

(memorysafety.org)

171 points | by jaas 178 days ago

10 comments

hardwaresofton 175 days ago
At the risk of sounding like a crustacean cult member, really hope the skeptics read this post. No hype, no drama, just slow, steady high perf incremental improvement in a crucially important area without any feet blown off.
I feel bad for other/new system languages, you get so much for the steeper learning curve with Rust (cult membership optional). And I think it’s genuinely difficult to reproduce Rust’s feature set.
- landl0rd 174 days ago
  I stole these graphs for a branch of that thread ffmpeg started on twitter. The one where they were flaming rav1d vs dav1d performance to attack Rust generally.
  I don't like the RiiR cult. I do like smart use of a safer language and think long-term it can get better than C++ with the right work.
  - jaas 174 days ago
    I'm the person who is running the rav1d bounty, also involved with the Rustls project.
    In many (most?) situations I think Rust is effectively as fast as C, but it's not a given. They're close enough that depending on the situation, one can be faster than the other.
    If you told me I had to make a larger and more complex piece of code fast though, I'd pick Rust. Because of the rules that the Rust compiler enforces, it's easier to have confidence in the correctness of your code and that really frees you up when you're dealing with multi-threaded or algorithmic complexity. For example, you can be more confident about how little locking you can get away with, what the minimum amount of time is that you need to keep some memory around, or what exact state is possible and thus needs to be handled at any given time.
    There are some things that make rav1d performance particularly challenging. For example - unlike Rustls, which was written from the start in Rust, rav1d is largely the result of C to Rust translation. This means the code is mostly not idiomatic Rust, and the Rust compiler is generally more optimized for idiomatic Rust. How much this particular issue contributes to the performance gap I don't know, but it's a suspect and to the extent that it's worth pursuing, one would probably want to figure out where being idiomatic matters most instead of blanket rewriting everything.
  - LoganDark 174 days ago
    > I don't like the RiiR cult.
    For certain types of people, Rust has a way of just feeling better to use. After learning Rust I just can't imagine choosing to use C or C++ for a future project ever again. Rust is just too good.
    - WJW 174 days ago
      Choosing Rust for new projects is very different than trying to rewrite an existing codebase with thousands of hours poured into it. Or worse, demanding that someone else do that for you for free.
      - josephg 174 days ago
        People keep saying this. But in my experience, directly porting code from one language to another is much easier than people think. I can do maybe about 500 lines per day depending on the language similarity. ChatGPT is great for doing a first pass - though it will add small, subtle bugs in the process.
        I’m not arguing that we should rewrite everything in rust. C and C++ are fine languages. But sometimes it really is better to just have your code in a different language rather than deal with FFI. For example, I have some collaborative text editing code in rust and recently I just ported the whole thing to typescript, because it’s just straight out easier to use in a browser that way, compared to dealing with a wasm bundle.
        I think the big mistake people make when rewriting into a different language is doing a refactor at the same time. This is the wrong way to go about it. First port directly the code you have. Then port your tests and get them passing in the new language. Then refactor. Obviously there’s always some language differences - but ideally you can confine differences within modules, and keep most of the module boundaries intact through the rewrite. You can also refactor before translating your code. If I were porting something to rust that wouldn’t pass the borrow checker, this is probably what I’d do. First refactor everything to make the borrow checker happy - so for example, make sure your structs / classes are in a strict tree. Then get tests passing. Translate between languages and cleanup.
        If you approach it like that, rewriting code is a largely mechanical process. It really takes a lot less time than people think to translate code, since you don’t actually have to understand every single line to do it correctly. So the time taken scales based on the number of lines of code. Not the number of hours it took to write! And then, if you want to refactor your new program at the end, go for it.
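        A minimal sketch of what "structs in a strict tree" can look like on the Rust side, assuming a hypothetical `Node` type that is not from any project mentioned here. Each node owns its children and nothing points back up, so the borrow checker has no cycles to object to; anything a parent pointer would have provided is recomputed by walking down from the root.

```rust
/// A node that owns its children outright. There are no parent pointers
/// and no shared references, so ownership forms a strict tree.
struct Node {
    name: String,
    children: Vec<Node>,
}

impl Node {
    /// Convenience constructor for a node without children.
    fn leaf(name: &str) -> Node {
        Node { name: name.to_string(), children: Vec::new() }
    }

    /// Total number of nodes in this subtree, including `self`.
    fn count(&self) -> usize {
        1 + self.children.iter().map(Node::count).sum::<usize>()
    }

    /// Names on the path from this node down to `target`, if it exists.
    /// This stands in for the "walk up via parent pointer" idiom that a
    /// C or TypeScript original might have used.
    fn path_to(&self, target: &str) -> Option<Vec<&str>> {
        if self.name == target {
            return Some(vec![self.name.as_str()]);
        }
        for child in &self.children {
            if let Some(mut path) = child.path_to(target) {
                path.insert(0, self.name.as_str());
                return Some(path);
            }
        }
        None
    }
}
```

        If the original code really needs cross-links, storing indices into a `Vec` instead of references is a common alternative; which one fits depends on the code being ported.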
        chrismorgan 173 days ago
        500 lines a day feels very much on the low side to me, for most projects. That’s around one line per minute. Yeah, there are definitely parts where you have to go a lot slower, but also there are plenty of lines that take 0–10 seconds.
        Going from Rust to TypeScript will normally be pretty easy—though if things like numeric and bytewise manipulations are involved, it can be tough. Going from TypeScript to Rust will often be easy, but also often be fiendishly difficult to do without refactoring a lot, due to ownership model differences.
        Occasionally I’ve chosen to do a refactoring in the source language first, and then port that. That can work decently, though it depends so much on exactly what the changes are and why they are, which is often to do with which two languages are involved.
        josephg 173 days ago
        500 lines a day feels about right taking the whole project into consideration. I ported a physics engine from C to Javascript a decade or so ago. Once I was in the swing of it, I did a lot more than 500 lines a day too. But I also spent a few days reading the original code. And I tweaked how I wanted to represent everything in javascript - which slowed me down. I should have just ported it directly, then refactored afterwards. And I spent a few days at the end tracking down some bugs that snuck in. In retrospect, typescript would have been a better choice. But I hadn't learned it yet.
        The best days I ported maybe 1500 lines. On the "worst" days I did 0 lines. In all, 500 lines a day feels like the right ballpark for this sort of work.
        neilv 174 days ago
        > ChatGPT is great for doing a first pass - though it will add small, subtle bugs in the process.
        Kill it with fire.
        neilv 174 days ago
        Downvoters: The whole point of Rust is avoiding defects due to programmer mistakes. Why would you use Rust, but then outsource the programming to a known-bad programmer/plagiarizer.
        setr 174 days ago
        Isn’t that exactly why?
        If rust avoids defects due to programmer mistakes, then throwing a shitty programmer at it is functionally safe (or safer than otherwise) because their shitty code won’t compile. So worst case is they don’t do any harm, and best case is free cheap labor
        neilv 174 days ago
        Rust only prevents/discourages some classes of programmer mistakes.
        But presumably you want to reduce all of the mistakes.
        Rust won't prevent/discourage a lot of the other classes of mistake that a bad programmer is creating.
        setr 174 days ago
        Sure but generally speaking, the safer the language/model/api, the more likely you can get away with AI generated code.
        Rust having more powerful modeling tools than the usual language makes it more viable to use crappier programmers, not less.
        Obviously if your goal is perfection, then only hire programmers capable of writing perfection. If you’re making the trade-off of cheaper/junior resources and putting more effort into testing/api design/type design/code review to help defend against the inevitable errors, then your genai code fits neatly into the same equation
      - sshine 174 days ago
        Yes, those are different experiences.
        But assuming you're experienced with Rust, porting something is actually both easy and enjoyable.
        Porting something as your first Rust project is going to end up like a dirty hybrid.
        It's an excellent learning opportunity, but it will not leave a good showcase of Rust.
      - IshKebab 174 days ago
        Yeah it depends on the project I think. There have been quite a lot of successful rewrites of large projects into Rust. For example Fish and svgr.
      - laerus 174 days ago
        Rewriting could also make sense if there is a chance to improve the architecture based on the experience with the existing codebase. Something that would otherwise be impossible to even consider.
    - landl0rd 174 days ago
      I feel similarly for most stuff. I just see a difference between “I like rust and enjoy using it for stuff bc of its advantages” and “rewrite the entire systems programming world in rust, also I will be bitching anytime someone releases something not written in rust, opening gh issues asking why it wasn’t, etc.”
- rastignack 174 days ago
  Isn't rustls a mix of C++, assembly and Rust?
  I think it’s not a good indication of the success of the language.
  - jaas 174 days ago
    In Rustls, TLS is implemented entirely in Rust. It uses aws-lc-rs [1] for cryptography, and aws-lc-rs uses assembly for core cryptographic routines, which are wrapped in some C code, which then exposes a Rust API which Rustls uses.
    It's not practical right now to write high performance cryptographic code in a secure way (e.g. without side channels) in anything other than assembly.
    [1] https://github.com/aws/aws-lc-rs
    - robmor 174 days ago
      Is that right?
      From the AWS-LC README: https://github.com/aws/aws-lc
      > A portable C implementation of all algorithms is included and optimized assembly implementations of select algorithms is included for some x86 and Arm CPUs.
      It also states that it kind of forked BoringSSL and OpenSSL.
      You’re right though that most of the memory safety attack surface has been replaced with Rust.
      - jaas 174 days ago
        I think what you're quoting says what I was saying - assembly with some C around it, wrapped in a Rust API. At least for the "select" (read: most important) algorithms. The details of the C/asm boundary in aws-lc are hard to summarize.
        Ideally the C would eventually move to Rust, but I think aws-lc needs to work in many contexts where a Rust toolchain is not available so it might be a while.
        Graviola is an interesting option under development, in part because it gets rid of the C:
        https://github.com/ctz/graviola
      - toast0 174 days ago
        TLS in X, cryptography from OpenSSL (in C and assembly) is a common, useful pattern for integrating TLS in other languages.
        TLS is the protocol generation and parsing, hopefully (but not always) including certificate parsing.
        Crypto tends to have clear, fixed buffer sizes, and OpenSSL tends to have good implementations of it with reasonable interfaces.
        Protocol parsing and certificate parsing and validation are where many more problems happen that memory safety can reduce. High-profile crypto problems are generally information leaks from non-constant-time algorithms, although information leaks also happen in protocol code.
    - rastignack 174 days ago
      In rust with some C code, ok. How is the DER format parsed for example ?
      Regarding crypto operations, I know as of now for rust projects assembly is a must to have constant time guarantees.
      Maybe there could be a way with intrinsics and a constant-time marker, similar to unsafe, to use pure rust.
      In the meantime I think there still is too much C code.
      It’s a great step in the right direction by the way.
      - jaas 174 days ago
        In Rustls, DER, and all certificate parsing and validation in general, is done in Rust.
        https://github.com/rustls/webpki
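        To make concrete why this kind of parsing benefits from a memory-safe language, here is a rough sketch of reading one DER tag-length-value header using only bounds-checked slice operations. It is not webpki's code: `read_tlv` is a hypothetical name, and the sketch assumes single-byte tags and lengths of at most four bytes.

```rust
/// Parse one DER TLV from the front of `input`.
/// Returns (tag, value bytes, remaining input), or None if malformed.
fn read_tlv(input: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let (&tag, rest) = input.split_first()?;
    // Multi-byte tags (low five bits all set) are out of scope for this sketch.
    if tag & 0x1f == 0x1f {
        return None;
    }
    let (&first, rest) = rest.split_first()?;
    let (len, rest) = if first < 0x80 {
        // Short form: the byte is the length itself.
        (first as usize, rest)
    } else {
        // Long form: the low seven bits say how many length bytes follow.
        let n = (first & 0x7f) as usize;
        // n == 0 would be the indefinite form, which DER forbids.
        if n == 0 || n > 4 || rest.len() < n {
            return None;
        }
        let (len_bytes, rest) = rest.split_at(n);
        // DER requires the shortest possible length encoding.
        if len_bytes[0] == 0 {
            return None;
        }
        let len = len_bytes.iter().fold(0usize, |acc, &b| (acc << 8) | b as usize);
        if len < 0x80 {
            return None;
        }
        (len, rest)
    };
    // A declared length past the end of the buffer is rejected, not read.
    if rest.len() < len {
        return None;
    }
    let (value, rest) = rest.split_at(len);
    Some((tag, value, rest))
}
```

        The point of the sketch is the failure mode: a lying length field produces `None` rather than an out-of-bounds read.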
    - PoignardAzur 174 days ago
      I wonder if it would be possible to implement a safe_asm macro in Rust?
      Even if unrestricted asm is inherently unsafe, there's got to be a subset of instructions and operand types you can guarantee is safe if called a certain way.
- amelius 174 days ago
  Rust was made for this kind of thing.
  Too bad there are those who think they should use Rust to write GUIs and end-user applications, which is where Rust ergonomics breaks down fast.
  - hardwaresofton 174 days ago
    > Too bad there are those who think they should use Rust to write GUIs and end-user applications, which is where Rust ergonomics breaks down fast.
    Disagree here, happy that those people are experimenting, and Rust is being used all the way up and down the stack. I may not prefer using Rust for web pages, but I'm super glad that some people want to -- projects like Dioxus and Tauri are really fantastic to use, and reflect well on the ecosystem as a whole.
    I think Rust gained from this enthusiasm, because it's one of the languages that I think goes almost everywhere, if you pay the high upfront cost of learning. There really aren't that many domains where you absolutely couldn't write Rust. The fact that Swift is also chasing this quality suggests it's valuable.
    - amelius 174 days ago
      Of course, experimentation is allowed always, in any language, even assembly.
      But that doesn't say anything about ergonomics for big production systems.
  - the__alchemist 174 days ago
    I, with passion, do not agree with this. I think after embedded and other systems-programming/bare-metal applications, PC applications with GUI is rust's best use case.
    With this in mind, I'm curious: What do you feel are good use cases for Rust?
    - amelius 174 days ago
      Best use-cases are what it was meant for: systems tools, and kernels.
      Worst use-cases: GUIs and end-user applications.
      - the__alchemist 173 days ago
        What language would you recommend instead? We can limit our choices to C, C++, Rust, Ada and Zig, as full-performance ones. Chop off Zig and Ada due to limited GUI and rendering compatibility.
toast0 175 days ago
I wish they included details on how they ran these benchmarks, like they did last year [1].
I'd like to take a look and try to understand why there's such a big difference in handshake performance. I wouldn't expect single threaded handshake performance to vary so much between stacks... it should be mostly limited by crypto operations. Last time, they did say something about having a cpu optimization for handshaking that the other stack might not have, but this is on a different platform and they didn't mention that.
I'd also be interested in seeing what it looks like with OpenSSL 1.1.1, given the recent article from HAProxy about difficulties with OpenSSL 3 [2]
[1] https://www.memorysafety.org/blog/rustls-performance-outperf...
[2] https://www.haproxy.com/blog/state-of-ssl-stacks
- jaas 174 days ago
  This report contains more details about the results discussed in the blog post:
  https://rustls.dev/perf/2024-11-28-threading/
  - toast0 170 days ago
    Thanks for that ... I got a little bit nerd sniped. I haven't gotten to the bottom of this one, but I dug a bit.
    On my machine (dual-socket Intel(R) Xeon(R) CPU L5640 running FreeBSD 14.2-RELEASE-p3), I found similar differences in speed with the system openssl (3.0.16) and rustls from head (795ae1f5d0435dbc80dac04ec147e85d4970563c).
```
   Openssl 3.0.16 (FreeBSD base 14.2-RELEASE-p3)
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1275.38 handshakes/s (512 / 0.401448)

   Rustls: (795ae1f5d0435dbc80dac04ec147e85d4970563c)
   handshakes      TLSv1_3 EcdsaP256       TLS13_AES_256_GCM_SHA384        server  server-auth     no-resume       1998.39 handshakes/s
```
    I looked at a lot of stuff, but no real smoking guns. There's a difference in behavior between the two handshakes, but it's not that different. openssl-bench generates 4 application packet wrappers for the 'first flight', whereas rustls generates one which contains the 4 messages of encrypted extensions, server cert, server cert verify, server handshake finished; this seems like it could be significant, but I couldn't easily undo it to test. Also, openssl-bench generates 2 more application packets after receiving the client handshake finished; I'm pretty sure those are tickets, but turning off ticket generation was ~ 1% improvement, so whatever.
    However, one of my friends suggested aws-lc might just be super fast, so I ran openssl-bench linked against that and saw a big improvement. So I went ahead and tried with all the options from FreeBSD pkg. Here's my list of results:
```
   aws-lc-1.48.4 (freebsd pkg)
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  2478.93 handshakes/s (512 / 0.206541)

   openssl111-1.1.1w_2 (freebsd pkg)
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1773.9  handshakes/s (512 / 0.28863)


   openssl-3.0.16,1
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1333.5  handshakes/s (512 / 0.383951)

   openssl31-3.1.8
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1387.69 handshakes/s (512 / 0.368958)

   openssl32-3.2.4
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1353.54 handshakes/s (512 / 0.378267)

   openssl33-3.3.3
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1406.62 handshakes/s (512 / 0.363994)

   openssl34-3.4.1
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1393.34 handshakes/s (512 / 0.367463)

   openssl35-3.5.0.b1
   handshakes      server  TLSv1.3 TLS_AES_256_GCM_SHA384  1155    handshakes/s (512 / 0.443289)

   boringssl-0.0.0.0.2025.03.27.01_1
   did not manage to get a matching cipher

   libressl-4.0.0_1
   (does not compile, don't care to fix)
```
    So.... in my testing, on my machine, rustls is faster than openssl-bench linked against openssl and openssl 1.1.1 is faster than openssl 3.x, but openssl-bench linked against aws-lc is faster than rustls.
    I'll try to get ahold of the authors tomorrow and suggest they add openssl-bench linked against aws-lc to their test.
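    The figures in the listings above can be cross-checked with a little arithmetic: each openssl-bench line reports 512 handshakes divided by the elapsed seconds, and the comparisons are ratios of those rates. The sketch below uses hypothetical helper names (`rate`, `speedup`, `report`) and takes the rustls figure as reported, since no raw timing is shown for it. On these numbers rustls comes out at roughly 1.57x the base-system OpenSSL 3.0.16, and aws-lc at roughly 1.24x rustls.

```rust
/// Handshakes per second from a handshake count and elapsed seconds.
fn rate(handshakes: u32, seconds: f64) -> f64 {
    f64::from(handshakes) / seconds
}

/// How many times faster rate `a` is than rate `b`.
fn speedup(a: f64, b: f64) -> f64 {
    a / b
}

/// Print the comparisons drawn from the listings above.
fn report() {
    let openssl_30 = rate(512, 0.401448); // FreeBSD base OpenSSL 3.0.16
    let openssl_111 = rate(512, 0.28863); // openssl111 package
    let aws_lc = rate(512, 0.206541); // aws-lc package
    let rustls = 1998.39; // reported directly by the rustls benchmark

    println!("rustls vs OpenSSL 3.0.16: {:.2}x", speedup(rustls, openssl_30));
    println!("OpenSSL 1.1.1 vs 3.0.16:  {:.2}x", speedup(openssl_111, openssl_30));
    println!("aws-lc vs rustls:         {:.2}x", speedup(aws_lc, rustls));
}
```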
bastawhiz 175 days ago
I'm not a Rust guy and I probably won't be any time soon, but Rustls is such an exciting project in my eyes. Projects like BoringSSL are cool and noble in their intentions, but having something that's not just a hygienic codebase but an implicitly safer one feels deeply satisfying. I'm eagerly looking forward to this finding its way into production use cases.
pzmarzly 175 days ago
Also in related news: "The State of TLS Stacks" by HAProxy devs https://www.haproxy.com/blog/state-of-ssl-stacks https://news.ycombinator.com/item?id=43912164
TLDR OpenSSL days seem to be coming to an end, but Rustls C bindings are not production ready yet.
- jaas 174 days ago
  Rustls has two C APIs.
  The first is C bindings for the native Rustls API. This should work great for anyone who wants to use Rustls from C, but it means writing to the Rustls API.
  The second is C bindings that provide OpenSSL compatibility. This only supports a subset of the OpenSSL API (enough for Nginx but not yet HAProxy, for example), so not everything that uses OpenSSL will work with the Rustls OpenSSL compatibility layer yet. We are actively improving the amount of OpenSSL API surface that we support.
  - tialaramex 174 days ago
    I should probably help with that whole OpenSSL compat. thing. Is there guidance somewhere or should I just find something interesting to implement and send over a PR ?
    - jaas 174 days ago
      The repository is here:
      https://github.com/rustls/rustls-openssl-compat
      We just work with normal issues/PRs, and there is a Rustls discord channel if you want to chat. We'd love your help!
- Twirrim 174 days ago
  Would love to see compliance and accreditation coming through for native Rustls, like FIPS. That'll unlock a large potential market, which can in turn unlock other markets.
  You can get FIPS by using some of the third party back-end integration via aws-lc-rs.
  - jaas 174 days ago
    The default cryptographic back-end for Rustls, aws-lc-rs, is FIPS compliant and integrated in a FIPS-compliant way so it's easy to get FIPS compliance with Rustls.
nyanpasu64 175 days ago
I wonder if replacing the encryption key every 6 hours would be a good use case for a crossbeam-epoch, though this may be premature optimization, and that library requires writing unsafe code as far as I can tell.
- toast0 174 days ago
  I think it is worth optimizing: there's a noticeable, but small, dip in handshakes per second going from 1 to 2 threads.
  If I were to optimize it, and the cycling rate is fixed and long, I would have the global storage be behind a simple Mutex, and be something like (Expiration, oldval, newval). On use, check a thread-local copy and use it if it's not expired; otherwise lock the global and, if the global is not expired, copy it to the thread-local. If the global is expired, generate a new one, saving the old value so that the previous generation's tickets are still valid.
  You can use a simple Mutex, because contention is limited to the expiration window. You could generate a new ticket secret outside the lock to reduce the time spent while locked, at the expense of generating a ticket secret that's immediately discarded for each thread except the winning thread. Not a huge difference either way, unless you cycle tickets very frequently, or run a very large number of threads.
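  The scheme described above could be sketched roughly as follows. This is not how Rustls implements ticket rotation: the names `Secret`, `Generation`, `GLOBAL`, `LOCAL` and `current` are hypothetical, the "secret" is a counter standing in for real key generation, and the clock is passed in as a parameter so the behaviour is easy to follow.

```rust
use std::cell::RefCell;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Placeholder for a ticket secret; a real one would hold random key bytes.
#[derive(Clone, Debug, PartialEq)]
struct Secret(u64);

/// The (Expiration, oldval, newval) tuple from the description.
#[derive(Clone, Debug)]
struct Generation {
    expires: Instant,
    old: Option<Secret>,
    new: Secret,
}

// Shared state, locked only when a thread's cached copy has expired.
static GLOBAL: Mutex<Option<Generation>> = Mutex::new(None);

thread_local! {
    // Per-thread copy, read without taking any lock.
    static LOCAL: RefCell<Option<Generation>> = RefCell::new(None);
}

/// Return the generation valid at `now`, rotating it if it has expired.
fn current(now: Instant, lifetime: Duration) -> Generation {
    // Fast path: the thread-local copy is still valid.
    let cached = LOCAL.with(|l| l.borrow().clone());
    if let Some(g) = cached {
        if now < g.expires {
            return g;
        }
    }
    // Slow path: contention is limited to the expiration window.
    let mut global = GLOBAL.lock().unwrap();
    let fresh = match &*global {
        // Another thread already rotated; just copy its result.
        Some(g) if now < g.expires => g.clone(),
        // Missing or expired: make a new generation and keep the previous
        // secret so that tickets issued under it still validate.
        prev => {
            let old = prev.as_ref().map(|g| g.new.clone());
            // Stand-in for real key generation: bump a counter.
            let new = Secret(old.as_ref().map_or(1, |s| s.0 + 1));
            Generation { expires: now + lifetime, old, new }
        }
    };
    *global = Some(fresh.clone());
    drop(global);
    LOCAL.with(|l| *l.borrow_mut() = Some(fresh.clone()));
    fresh
}
```

  A caller would invoke something like `current(Instant::now(), Duration::from_secs(6 * 3600))` on each handshake. This version generates the new secret while holding the lock, which is the simpler of the two variants discussed.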
- yencabulator 174 days ago
  You might like https://docs.rs/arc-swap/latest/arc_swap/
  - dochtman 174 days ago
    We tried this when we were benchmarking and it was not significantly better than the Arc<RwLock<_>> that we're using now.
  - nyanpasu64 174 days ago
    AIUI epoch GC doesn't require Arc's atomic increment/decrement operations which can be slower than naive loads (https://codeberg.org/nyanpasu64/cachebash), but at this point we're getting into nano-optimization territory.
dlgeek 173 days ago
Between this and https://www.haproxy.com/blog/state-of-ssl-stacks, I think we need to start accepting the idea that OpenSSL is not the right way forward for anything performance sensitive.
Given how aws-lc powers both of these articles, I'm curious how Rustls compares to s2n-tls - AWS's TLS library to go along with aws-lc.
thevivekshukla 174 days ago
Wow this is fast.
However, I tried rustls with redis for my axum application and for some reason it was not working, even though my self-signed CA certificate was updated in my system's local CA store.
After a lot of tries I gave up, then thought about trying native tls, and it worked on the first go.
- dochtman 174 days ago
  Did you file an issue or ask in the rustls Discord channel? We're happy to help.
  - encom 174 days ago
    >Discord
- whizzter 174 days ago
  The irony is that due to CA stores (and how verification is handled) it's usually a tad more ficklish to replace TLS clients than TLS servers.
  Was there no way to provide a custom CA store (that only included your self signed one)?
- aberoham 174 days ago
  Did you try rustls-tls-native-roots? rustls-tls defaulting to only use the webpki bundle caught me off guard on a system with a bespoke CA
PoignardAzur 174 days ago
That name is confusing. Reading the headline, I first thought it was about the deprecated language server and was very confused.
- jjice 174 days ago
  That was rls, but I can see what you mean from the name of the package once I looked at it again.
  https://github.com/rust-lang/rls
koakuma-chan 175 days ago
It's blazingly fast.
lifeinthevoid 174 days ago
Out of curiosity, rustls uses aws-lc-rs which in turn uses aws-lc, which is in turn "based on code from the Google BoringSSL project and the OpenSSL project."
You're trying to get rid of OpenSSL, but you're actually relying on OpenSSL code. Sounds a bit iffy imo. Can somebody provide a bit more depth here?
Or is it just the OpenSSL TLS API that is hopelessly confusing and bug inducing? I can imagine that the crypto primitives in OpenSSL are very solid.
- tialaramex 174 days ago
  Yes the core thing everybody actually wants is the constant time cryptographic primitives, and it's not at all practical to attempt those in a high level programming language like C or Rust, they're always raw machine code, or - which is less awful to work with - assembler. So yeah it's roughly the same code in all of the projects you mentioned - the correct machine code for a high quality ChaCha20 implementation on x86-64 is the same if the rest of your TLS implementation is C (OpenSSL) or Rust (rustls) or like, hand written Perl (please tell me nobody did this)
  Although of course the Rust compiler has no way to inspect this ChaCha20 primitive and check it is memory safe, we can "vouch" for it, and these primitives have been eyeballed by a huge number of people since they're so widely used so it feels as reasonable as the claim that ChaCha20 itself works, which has been considered by plenty of cryptanalysis experts from government and industry.
  Pretty much everything else is Rust, so the bit-twiddling inside a DER implementation to parse certificates is Rust, the TLS handshake implementation is Rust, and so on.
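  As an illustration of what "constant time" means at the source level, here is a sketch of a byte comparison with no early exit, under the assumption that the slice lengths are public. The name `ct_eq` is hypothetical. The catch is the one made above: nothing in the language guarantees the compiler keeps this branch-free, and `std::hint::black_box` is only a best-effort optimization barrier, which is why production primitives end up in assembly.

```rust
/// Compare two byte slices without stopping at the first mismatch.
/// The lengths are treated as public information.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Accumulate differences instead of branching on each byte, so the
    // loop does the same amount of work wherever the mismatch is.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    // Best-effort hint to discourage the optimizer from reintroducing an
    // early exit; this is not a guarantee of constant-time behaviour.
    std::hint::black_box(diff) == 0
}
```

  By contrast, a plain `a == b` on slices is free to return at the first differing byte, so its running time can depend on how much of a secret value matched.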
  - toast0 174 days ago
    > hand written Perl (please tell me nobody did this)
    I've not had a need, but I wrote just enough protocol code for TLS 1.3 in Erlang as a prototype, and it wouldn't be awful in Perl with pack/unpack; binary matching is a lot nicer in Erlang though. :P
    Used that prototype to inform development of a Java implementation of TLS 1.3 protocol only (crypto and certificate parsing and verification through system libraries) to get consistent TLS 1.3 features on a very popular Android app. I think Google has a thing you can do to get a TLS 1.3 capable stack now, but not then.
    With the TLS illustrated series [1] it would be easier than when I did it. The test vectors in the TLS 1.3 rfcs and drafts are very nice to have too.
    [1] https://tls13.xargs.org/
- jaas 174 days ago
  Rustls uses aws-lc-rs for cryptography, which, roughly speaking, is based on the cryptography from BoringSSL, which is a heavily modified fork of OpenSSL from a long time ago. I'm not sure how similar OpenSSL and aws-lc-rs cryptography implementations are today (maybe someone else knows?), but it's probably not accurate in a useful way to say that aws-lc-rs just uses cryptography from OpenSSL.
  In any case, OpenSSL does a whole bunch of things, and one of those is providing low-level cryptographic routines. When people talk about issues with OpenSSL, they're usually not (in my experience) talking about issues with its low-level cryptographic routines. They're talking about things like the TLS implementation and API.
  Rustls has its own Rust code for the TLS protocol and certificate parsing/validation, which doesn't come, directly or by lineage, from OpenSSL or any OpenSSL derivatives.
- koakuma-chan 174 days ago
  rustls doesn't necessarily use aws-lc-rs, you can use a different provider like ring or rustcrypto