I Made Zig Compute 33M Satellite Positions in 3 Seconds. No GPU Required

(atempleton.bearblog.dev)

70 points | by signa11 7 hours ago

6 comments

tylermw 2 hours ago
Nice results! SIMD can be a pain, good to know Zig makes it easy.
However, note that the plot under "Native SIMD Throughput Comparison" is extremely misleading: for an accurate proportional comparison between bar charts, you should start the y-axis at zero. The way the data are presented makes it look like a 10-100x gain, rather than the actual 2x improvement.
- voidUpdate 1 hour ago
  I was going to comment the same. I saw the huge difference and went "wow", then read that it was a 2x improvement and had to check the axes properly, thinking "slightly less wow". It reminds me of that bar chart of women's average heights in different countries that starts at 5 feet https://preview.redd.it/dohqa8l94kb41.png?auto=webp&s=865180...
exitb 23 minutes ago
Is that solving the right problem? The algorithm can give reasonably accurate positions at arbitrary points in the future, but you don’t need to run it over and over if you need positions every second. You can generate keyframes and interpolate the positions in between, as the short-term orbital movements are rather trivial.
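A minimal sketch of the keyframe idea, under loud assumptions: `propagate` is a hypothetical circular-orbit stand-in for the real propagator (it is not SGP4), the radius, period and 60-second keyframe spacing are made up for illustration, and the cubic Hermite interpolation is one possible choice that relies on each keyframe storing both position and velocity.

```python
import math

# Hypothetical circular orbit, used only as a cheap stand-in for a real propagator.
RADIUS_KM = 7000.0
PERIOD_S = 5800.0
OMEGA = 2.0 * math.pi / PERIOD_S


def propagate(t):
    """Return ((x, y), (vx, vy)) at time t in seconds. Stand-in for the expensive call."""
    a = OMEGA * t
    pos = (RADIUS_KM * math.cos(a), RADIUS_KM * math.sin(a))
    vel = (-RADIUS_KM * OMEGA * math.sin(a), RADIUS_KM * OMEGA * math.cos(a))
    return pos, vel


def build_keyframes(t_end, step):
    """Run the propagator once per keyframe, with one extra so t_end is bracketed."""
    count = int(t_end // step) + 2
    return [propagate(i * step) for i in range(count)]


def interpolate(keyframes, step, t):
    """Cubic Hermite interpolation between the two keyframes around t (t >= 0)."""
    i = min(int(t // step), len(keyframes) - 2)
    (p0, v0), (p1, v1) = keyframes[i], keyframes[i + 1]
    s = (t - i * step) / step  # normalized position inside the interval, 0..1
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return tuple(
        h00 * a + h10 * step * b + h01 * c + h11 * step * d
        for a, b, c, d in zip(p0, v0, p1, v1)
    )


def error_km(keyframes, step, t):
    """Distance between the interpolated and the directly propagated position."""
    exact, _ = propagate(t)
    approx = interpolate(keyframes, step, t)
    return math.hypot(exact[0] - approx[0], exact[1] - approx[1])


if __name__ == "__main__":
    frames = build_keyframes(600.0, 60.0)
    for t in (30.0, 95.5, 599.0):
        print(t, error_km(frames, 60.0, t))
```

Whether the interpolation error is acceptable depends on the keyframe spacing and on how well the real propagator's output behaves between keyframes, so the numbers from this toy orbit should not be read as a claim about SGP4.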
philipallstar 2 hours ago
I've never seen SIMD code before, and this is quite a nice little intro to that and Zig.
dfajgljsldkjag 1 hour ago
It is funny how we often assume we need a graphics card for these kinds of calculations when a standard processor is actually plenty fast. The specific changes to the memory layout seemed to make the biggest difference here by allowing the hardware to actually use its vector capabilities.
- JohnLeitch 1 minute ago
  At risk of being called out for my ignorance (I am still new to GPU development and have only limited experience with CUDA), it seems to come down to how appropriate the execution model is to the work, e.g. SIMT vs. SIMD here.
- Agingcoder 13 minutes ago
  These days a single machine with lots of RAM and cores will handle almost everything you throw at it, barring specific compute-intensive or memory-bound scenarios (current AI, gaming, etc.).
ginko 17 minutes ago
One of the examples picks the result of either a simple or a complex calculation depending on eccentricity, and mentions that it's faster to always calculate both and pick with a mask.
If you calculate both, wouldn't it be even faster to just always do the complex calculation? (presumably that's more precise?)
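For readers unfamiliar with the pattern being discussed, here is a rough scalar emulation of "compute both, then select with a mask". Everything specific in it is an assumption for illustration: the two paths (a first-order estimate and a Newton solve of Kepler's equation), the `ECC_THRESHOLD` cutoff, and the function names are hypothetical and not taken from the article. Python lists stand in for SIMD lanes, so this shows the data flow only, not the speedup.

```python
import math

ECC_THRESHOLD = 0.1  # hypothetical cutoff, not the article's value


def simple_path(m, e):
    """Cheap first-order estimate of the eccentric anomaly."""
    return m + e * math.sin(m)


def complex_path(m, e, iterations=8):
    """Newton iteration on Kepler's equation E - e*sin(E) = M."""
    ecc_anomaly = m
    for _ in range(iterations):
        residual = ecc_anomaly - e * math.sin(ecc_anomaly) - m
        ecc_anomaly -= residual / (1.0 - e * math.cos(ecc_anomaly))
    return ecc_anomaly


def select(mask, if_true, if_false):
    """Lane-wise pick, playing the role of a SIMD select/blend instruction."""
    return [t if keep else f for keep, t, f in zip(mask, if_true, if_false)]


def solve_batch(mean_anomalies, eccentricities):
    """Evaluate both paths for every lane, then choose per lane with a mask."""
    lanes = list(zip(mean_anomalies, eccentricities))
    simple = [simple_path(m, e) for m, e in lanes]
    complex_ = [complex_path(m, e) for m, e in lanes]
    mask = [e < ECC_THRESHOLD for e in eccentricities]
    return select(mask, simple, complex_)


if __name__ == "__main__":
    print(solve_batch([1.0, 1.0], [0.01, 0.5]))
```

In real vector code the appeal is that every lane executes the same instruction stream with no per-element branch. Whether it would be better to run only the complex path, as asked above, depends on what the two paths actually are in the article, which this sketch does not know.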
androiddrew 55 minutes ago
You tell that language what to do!