Restartable Sequences

(justine.lol)

65 points | by grappler 2 hours ago

5 comments

GlenTheMachine 2 hours ago
If you had no idea what a restorable sequence is the takeaway is about halfway down the OP:
“This is why Linux now provides rseq() which is a much more enlightened solution. With restartable sequences, you actually can get rid of both the mutex and atomics, while the OS continues to fully abstract scheduling. The way it works is you advise the kernel whenever your program enters a critical section of code that you don't want interrupted. It's probably going to be maybe 10 assembly instructions tops. The first assembly opcode should be a move instruction that sets the rseq_cs field. The last instruction needs to be the thing that makes the modification to your global data structure. Think of it sort of like a really tiny database transaction. What makes it go fast, is that the bidirectional communication with the kernel happens via shared memory.”
[-]
- rwmj 1 hour ago
  LWN has a good article: https://lwn.net/Articles/1033955/
- manoDev 1 hour ago
  That’s clever — am I right to think it’s the intermediate solution between locks and full STM, implemented at the kernel level, and with zero abstraction cost?
  [-]
  - khuey 1 hour ago
    It's in some sense a light form of STM. The key insight behind rseq(2) is that if the data is local to a given CPU the only way to get a race is if the kernel deschedules your program from that CPU at an inopportune time. If your operation can be aborted and restarted and the kernel has a mechanism to notify you when that needs to happen you can dispense with the overhead of "real" synchronization and just use a couple mov instructions to enter and exit the critical section.
dan_sbl 11 minutes ago
Love how the 512 GB of RAM documented in one of his setups as costing $2,776 now sells for $18,299. https://justine.lol/rseq/threadripper.html
khuey 2 hours ago
Maybe I'm just getting old but the "if you don't spend $20,000 on a workstation you're going to be left behind like a dinosaur" at the top of this article is a huge turn off to reading any further. And I say that as someone who owns a workstation with more cores than the author's.
[-]
- toast0 1 hour ago
  If we're being overly generous, they're saying you need at least a raspberry pi? You can see a 3x improvement there, which shows the pattern works, and that's good enough for a dinosaur (this interpretation is easier to justify if you just skim the article... Which I did the first time)
  But agreeing with you, I've done big optimization stuff for multicore servers (not as many cores, but same kind of work) and my workstation was something small with not even the same os. I don't need the big machine on my desk to understand the concepts. I just need the big machine to check my work. For me, that's always been a production machine, sometimes a production machine taken out of rotation for pre-validation before running on production load. I guess I should mention, I work on applications specifically, and libraries and kernels as it relates to making whatever my application is work better. I also don't have a problem with pinning threads to cpus... but my applications are usually one big program that fills the system. Someone writing a general purpose library has a harder time.
  Of course, if you want to do this kind of work and you don't have your own production load, you're going to have to borrow, rent, or buy a big machine. It doesn't need to be your workstation though. I hate working with cloud nonsense, but if your tests are short, and you do the upfront work to make your images start fast, you can probably save a lot of money by renting spot instances when testing ... I don't know if you can do spot instances of bare metal though, so you're probably stuck with vm overhead.
  [-]
  - khuey 1 hour ago
    Yeah, you can rent an equivalent workstation from AWS for under $10/hour (and that's the on demand price) so I don't think cost is a huge barrier to doing this sort of work. The language and listing the prices of the workstations down to the penny just strikes me as a rather unprofessional way to communicate.
- lpapez 32 minutes ago
  Except that is not what the article says and you clearly missed the sarcasm.
senderista 28 minutes ago
I'm surprised there was no reference to the librseq library, maintained by the rseq implementer:
https://github.com/compudj/librseq
This has helpers for common use cases like counters and linked lists. You shouldn't need to write assembly at all to use rseq in most applications.
Veserv 1 hour ago
Restartable windows, or more generically introspection windows, are a really useful technique you can apply in any situation where you understand or control the sources of preemption. The earliest uses of this technique in operating systems that I am aware of are ~25 years old.
The key insight is that the preempter can introspect the program counter of the code being preempted (which is now stable since it was preempted) and act accordingly. The simplest mechanism is to reset their program counter if in a critical section. The more generic mechanism is to jump them to a supplied address. This allows you to do things like hard abort and more.
You can further remove the need for the preempter to understand the preempted code by having the preempted code create a self-introspection code snippet and supplying that with the program counter at preemption. So the preempter just vectors them to their own code which knows how to interpret its own state at any preemption point.
[-]
- senderista 24 minutes ago
  There is a paper from Sun that anticipated tcmalloc's development of rseq by over a decade:
  https://dl.acm.org/doi/abs/10.1145/512429.512451