Nvidia GB10's Memory Subsystem, from the CPU Side

(chipsandcheese.com)

64 points | by ingve 12 hours ago

2 comments

freeqaz 1 hour ago
I assume that the author here is testing against one of these boxes, right? https://marketplace.nvidia.com/en-us/enterprise/personal-ai-...
Are these considered a good deal at $3-4k? What's the software support like on them? I've got 2x 3090s and I'm curious how this compares.
Neywiny 8 hours ago
I don't understand: on one of the later graphs, the core-to-core latency for Strix Halo goes out to 32 cores, but he says it only has 16 cores?
- wtallis 8 hours ago
  AMD's cores have SMT, allowing them to run two threads at a time and appear to the OS and its scheduler as two logical cores despite being implemented as a single physical core.
  - Neywiny 7 hours ago
    What pattern in the data shows that's what's being measured? I would expect to see basically zero latency between adjacent "cores" then, since the L1 is shared between the two threads of a core?
    - monocasa 6 hours ago
      Co-resident threads might not get any speedup here, since coherency instructions are functionally operations on the L2 cache.
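
For anyone who wants to check the SMT point from the thread on their own machine (logical CPUs outnumbering physical cores), here is a minimal Python sketch. It assumes Linux and the sysfs `thread_siblings_list` files; the helper names `parse_cpu_list`, `physical_cores`, and `read_sibling_lists` are made up for illustration. Whether SMT siblings are numbered adjacently (0,1) or offset (0,16) depends on the platform and firmware, so the pairing in any given graph is not something this sketch can tell you.

```python
import glob
import os
import re


def parse_cpu_list(text):
    """Parse a Linux cpulist string such as "0,16" or "0-1" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def physical_cores(sibling_lists):
    """Group logical CPUs into physical cores.

    sibling_lists maps a logical CPU id to its thread_siblings_list string.
    Returns a sorted list of tuples, one tuple of logical CPU ids per core.
    """
    return sorted({tuple(parse_cpu_list(s)) for s in sibling_lists.values()})


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Read thread_siblings_list for every CPU under sysfs (assumes Linux)."""
    result = {}
    pattern = os.path.join(root, "cpu[0-9]*", "topology", "thread_siblings_list")
    for path in glob.glob(pattern):
        cpu_dir = path.split(os.sep)[-3]          # e.g. "cpu7"
        cpu_id = int(re.sub(r"\D", "", cpu_dir))  # keep only the digits
        with open(path) as f:
            result[cpu_id] = f.read()
    return result


if __name__ == "__main__":
    cores = physical_cores(read_sibling_lists())
    logical = sum(len(c) for c in cores)
    print(f"{logical} logical CPUs on {len(cores)} physical cores")
    for siblings in cores:
        print(siblings)
```

On a machine with SMT enabled, each printed tuple should hold two logical CPU ids. Those pairs are the ones to look at in a core-to-core latency matrix when asking whether sibling threads behave differently from threads on separate cores.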