Nvidia GB10's Memory Subsystem, from the CPU Side

(chipsandcheese.com)

64 points | by ingve 12 hours ago

2 comments

  • freeqaz 1 hour ago
    I assume that the author here is testing against one of these boxes, right? https://marketplace.nvidia.com/en-us/enterprise/personal-ai-...

    Are these considered a good deal at $3-4k? What's the software support like on them? I've got 2x 3090s and I'm curious how this compares.

  • Neywiny 8 hours ago
    I don't understand one of the later graphs: the core-to-core latency plot for Strix Halo goes out to 32 cores, but he says it only has 16 cores?
    • wtallis 8 hours ago
      AMD's cores have SMT, allowing them to run two threads at a time and appear to the OS and its scheduler as two logical cores despite being implemented as a single physical core.
      • Neywiny 7 hours ago
        What pattern in the data shows that's what's being measured? I would expect to see basically 0 latency between adjacent "cores" then, since the L1 is shared between the two threads of a core?
        • monocasa 6 hours ago
          Co-resident threads might not get any speedup here, since coherency instructions are functionally operations on the L2 cache.
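
For anyone who wants to check the SMT explanation in this thread on their own Linux machine, here is a minimal sketch that groups logical CPUs into physical cores. It assumes the kernel exposes `thread_siblings_list` files under `/sys/devices/system/cpu/`. The helper names (`parse_cpu_list`, `group_by_physical_core`, `read_sibling_lists`) are made up for illustration and are not from the article.

```python
import glob
import re


def parse_cpu_list(text):
    """Expand a Linux cpulist string such as "0,16" or "0-3,8" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_physical_core(sibling_lists):
    """Turn {logical_cpu: cpulist string} into a sorted list of unique sibling tuples.

    Each tuple is one physical core; its members are the logical CPUs (SMT
    threads) that the OS scheduler sees for that core.
    """
    cores = {tuple(parse_cpu_list(s)) for s in sibling_lists.values()}
    return sorted(cores)


def read_sibling_lists(
    pattern="/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list",
):
    """Read the sibling list for every logical CPU that exposes one (Linux only)."""
    result = {}
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)/topology", path).group(1))
        with open(path) as f:
            result[cpu] = f.read()
    return result


if __name__ == "__main__":
    cores = group_by_physical_core(read_sibling_lists())
    logical = sum(len(c) for c in cores)
    print(f"{logical} logical CPUs on {len(cores)} physical cores")
    for siblings in cores:
        print(siblings)
```

On a 16-core part with SMT enabled this should report 32 logical CPUs on 16 physical cores. Which logical CPU numbers end up paired depends on how the kernel enumerates them, so whether SMT siblings sit next to each other on a latency plot's axes is something to check rather than assume.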