The README is, in my opinion (author here), the most interesting part. I wrote it to help others build a useful mental model, so that you can recreate the project yourself without even needing to read my code.
I am not super familiar with C and CUDA, so I read only the README and enjoyed it supremely. The blend of cheerful walking through instructive examples and your philosophical takes on how to approach the exercise to get the most out of it put me in a great mood. You captured that special upbeat attitude that comes about when you're doing something as well as you can just because it's so legitimately interesting to you.
The lesson-style README is a great approach. Breaking down LLM inference into digestible steps makes the codebase approachable even for people who haven't touched CUDA before.
Well, the whole purpose is to be independent of invisible backdoor injectors...^W I mean compilers,
or, to be more accurate, those compilers which deal with computer languages of absurd and
grotesque syntactic complexity.
>>Physically, LLM is a file which contains a lot of float numbers.
aka atoms of the LLM.
Anybody?