the format combines hands-on projects with a speaker series. we've confirmed some solid speakers (Jensen Huang from NVIDIA, Matthew Prince from Cloudflare etc), but i'm also keen to bring in perspectives from folks who don't fit the standard mold. tbh, many of the best systems eng/devs/infra ppl i've worked with are pretty weird - they think differently, take unconventional paths, and often learn by obsessively building and breaking things rather than following traditional routes. i think it would be cool for the students to realize it's a feature, not a bug, to be weirdly obsessive.
if you're interested in this kind of stuff, i'd value your thoughts on:
1/ who are the fascinating/unsung heroes in infra/systems eng that students should learn from? especially interested in people who've solved hard scaling problems through unconventional thinking or unique approaches
2/ what kind of projects do you think would be fun and meaningfully demonstrate real-world infrastructure challenges while still being achievable in an academic quarter?
prerequisites are CS106/CS111 level programming. draft syllabus here: https://explorecourses.stanford.edu/search?view=catalog&filt...
email: anjney at alumni dot stanford edu if you prefer to share thoughts privately. thank you in advance for any and all help
Julia Evans has a wonderful approach as well, and has amazing talent for teaching: https://jvns.ca/
Kellan Elliott-McCrea (https://laughingmeme.org/) has given the world some of the better advice on the hardest parts of software scaling, which is of course scaling the human organizations. New grads are virtually always underestimating that part of the work; eventually you realize the hard problems are usually social and not technical.
re: human org scaling - true and this was the most surprising thing for me when i was running the platform org at discord. companies ship their org charts whether they like it or not. and refactoring org charts correctly, at scale, is essentially untested in the modern era
Build a multi-cloud architecture. And by this, I mean connect two clouds' networks without traversing the public internet, so that an application running in one cloud can talk to an application running in the other. And then, put that into IaC. It sounds like not much, but the issues you uncover are pretty illuminating and it is a fantastic interview question to give to senior-ish infra guys to see how they approach it and the challenges they expect.
And you're right, we're all weird.
1) CI and IAC that deploy a web app running in a container
2) Add horizontal scaling and load balancer
3) Add long running tasks / scheduled task support
4) Deploys will likely break long running tasks. Implement blue/green or rolling deploys or some other sort of advanced deployment scheme
5) Implement rollbacks
7) Alarms
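For steps 4 and 5, here is a rough sketch of what a rolling deploy with rollback could look like as a toy simulation. Everything in it (`Instance`, `rolling_deploy`, the `is_healthy` callback, `batch_size`) is a made-up name for illustration, not any real deploy tool's API, and it assumes a health check can be answered right after an instance is upgraded.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    # Hypothetical stand-in for one running copy of the web app.
    name: str
    version: str


def rolling_deploy(
    fleet: List[Instance],
    new_version: str,
    is_healthy: Callable[[Instance], bool],
    batch_size: int = 1,
) -> bool:
    """Upgrade the fleet batch by batch; roll back on the first failed health check."""
    # Remember what every instance was running so a rollback is possible.
    previous = {inst.name: inst.version for inst in fleet}
    touched: List[Instance] = []

    for start in range(0, len(fleet), batch_size):
        batch = fleet[start:start + batch_size]
        for inst in batch:
            inst.version = new_version
            touched.append(inst)

        # Only continue to the next batch if this one came up healthy.
        if not all(is_healthy(inst) for inst in batch):
            for inst in touched:
                inst.version = previous[inst.name]
            return False

    return True
```

Students could then swap the in-memory `Instance` for real containers and see which parts stop being easy: draining long running tasks before an instance is replaced is the obvious one.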
Brendan Gregg has a lot of good stuff about monitoring and performance analysis https://brendangregg.com/ https://github.com/brendangregg
Also Jess Frazelle (lots of good stuff, esp around containerization): https://blog.jessfraz.com/ https://github.com/jessfraz
2) I think multiplayer games could be interesting! Lots of meat while still having a lot of space to calibrate the scope.
Cheers!
deploy something like cassandra and make a system that can update the kernel on the servers running the databases without downtime or losing data
or come up with some distributed blob store thing/cdn for worldwide users
my whole career has been automating updates for software or operating systems lol
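A minimal sketch of the control loop for the kernel update idea, under some loud assumptions: data is replicated with a fixed replication factor, reads and writes need a majority of replicas, and the `reboot` callback (hypothetical) drains a node, installs the kernel, reboots it, and reports whether it rejoined. None of these names come from Cassandra itself.

```python
from typing import Callable, Dict, List


def rolling_kernel_update(
    nodes: List[str],
    replication_factor: int,
    reboot: Callable[[str], bool],
) -> Dict[str, str]:
    """Reboot nodes one at a time so a majority of replicas always stays up."""
    quorum = replication_factor // 2 + 1
    # With one node down, each piece of data has at most one replica offline.
    if replication_factor - 1 < quorum:
        raise ValueError("taking one replica down would break quorum")

    status = {node: "pending" for node in nodes}
    for node in nodes:
        # Assumption: reboot() blocks until the node is back or has clearly failed.
        if reboot(node):
            status[node] = "updated"
        else:
            status[node] = "failed"
            break  # stop here so no second node goes down while one is unhealthy
    return status
```

The interesting part for a class project is everything hidden inside `reboot`: how you decide a node is really back, and what you do with the one that is not.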