Codemaps: Understand Code, Before You Vibe It

(cognition.ai)

202 points | by janpio 9 hours ago

16 comments

apstls 5 hours ago
Sounds very cool.
I wanted to try this out, so I opened Windsurf for the first time in ages and clicked the "Upgrade Available" button, which sent me to: https://windsurf.com/editor/update-linux
```
  Did you install using apt or apt-get? If so...
  
  1. Update package lists
  
  sudo apt-get update
  
  2. Upgrade Windsurf
  
  sudo apt-get upgrade windsurf
```
Whle `apt-get upgrade windsurf` will technically upgrade Windsurf, instructing users to run a command that will attempt to upgrade all packages on their system is nuts when the command is provided in a context that strongly implies it will only upgrade Windsurf and has no warnings or footnotes to the contrary. Good thing I didn't ask Windsurf's agent to ugprade itself for me, I guess.
EDIT - I don't want to detract from the topic at hand, however - after upgrading (with `sudo apt-get install --only-upgrade windsurf` :)) and playing around a bit, the Codemaps feature indeed seems very nifty and worth checking out. Good job!
[-]
- kps 4 hours ago
  So `apt-get upgrade $PACKAGE` has ridiculous semantics that no one would expect, and the actual syntax for upgrading a package is in neither the man page nor the command help.
  [-]
  - normie3000 4 hours ago
    > So `apt-get upgrade $PACKAGE` has ridiculous semantics that no one would expect
    Especially not an LLM!
  - pxc 22 minutes ago
    Sure it is¹ (kinda):
```
  --no-upgrade
      Do not upgrade packages; when used in conjunction with install, no-upgrade will prevent packages on the command line from being upgraded if they are already installed. Configuration Item: APT::Get::Upgrade.
```
    The canonical way to do the thing you want via apt-get is `apt-get install`. And if you read the man page from start to finish, it'd be clear to you... but it is tucked away there in the most obtuse, indirect, ungreppable way. :'D
    That would be a great addendum to an EXAMPLES section! In the meantime, this is documented well and clearly in the tldr page for apt-get².
    Fwiw, apt-get not only sucks, but has been known to suck for many, many years (more than a decade at least). Its interface sticks around because it's basically plumbing at this point. But you, as a user, should never use it (or `apt-cache` or `apt-*`, if you can avoid it.
    Aptitude is preferable for a whole host of reasons, not least of which being that its upgrade commands have the semantics you'd intuitively expect³. They take packages as an optional list of positional args, and upgrade everything only if you don't pass any. (Aptitude also has a ton of other nice features and I highly recommend it.)
    There's also an official new porcelain in APT itself, aptly called "apt". It preserves⁴ the semantics of apt-get's `upgrade` command, but its usage message actually matches that syntactically— hopefully it'll barf if you tell it `apt upgrade windsurf` or whatever.
    But automation needs to rely on the ugly, old, disparate APT commands that have been around forever and can't really change. That probably goes, too, for things guides want you to copy and paste, or instructions handed over to LLMs.
    (This is one reason that if you only learn to use APT from guides/tutorials whose primary concern is something other than documenting or teaching how to use Debian-based systems, you'll probably never learn to use the correct tools (the nicer, newer ones).)
    --
    1: https://manpages.debian.org/trixie/apt/apt-get.8.en.html
    2: https://tldr.inbrowser.app/pages/linux/apt-get
    3: https://manpages.debian.org/trixie/aptitude/aptitude.8.en.ht...
    4: https://manpages.debian.org/trixie/apt/apt.8.en.html
- swyx 3 hours ago
  hiya! team noticed your comment and agreed - and it is fixed.
```
    - const CodeSnippetTwo = `sudo apt-get upgrade windsurf`;
    + const CodeSnippetTwo = `sudo apt-get install windsurf`;
```
  [-]
  - blks 3 hours ago
    Did you also generate this with “AI”?
    [-]
    - swyx 3 hours ago
      https://tenor.com/view/westworld-if-you-cant-tell-does-it-ma...
      [-]
      - tbillington 2 hours ago
        If you couldn't tell your food had been cut with sawdust would it matter to you if you found out?
gnarlouse 5 hours ago
A few things to point out after reading and thinking about this:
- Another AI firm building products focused on Fortune 500 scale problems. If you're not at a F500, this tool isn't necessarily a good fit for you, so YMMV.
- static analysis tools that produce flowcharts and diagrams like this have existed since antiquity, and I'm not seeing any new real innovation other than "letting the LLM produce it".
They say it's ZDR, so maybe I don't fully understand what problem they're trying to solve, but in general I don't see the value add for a system like this. Also onboarding isn't necessarily just presenting flow charts and diagrams: one of the biggest things you can do to onboard somebody is level-set and provide them with problem context. You COULD go into a 30 minute diatribe about how "this is the X service, which talks to the Y service, and ..." and cover a whiteboard in a sprawling design diagram, or you could just explain to them "this is the problem we're working on", using simple, compact analogies where/when applicable. If the codebase is primarily boilerplate patterns, like CRUD, MVC, or Router/Controller/Service/DB, why talk about them? Focus on the deviant patterns your team uses. Focus on the constraints your team faces, and how you take the unbeaten path to navigate those constraints.
[-]
- _jayhack_ 5 hours ago
  > static analysis tools that produce flowcharts and diagrams like this have existed since antiquity, and I'm not seeing any new real innovation other than "letting the LLM produce it".
  Inherent limitation of static analysis-only visualization tools is lack of flexibility/judgement on what should and should not be surfaced in the final visualization.
  The produced visualizations look like machine code themselves. Advantage of having LLMs produce code visualizations is the judgement/common sense on the resolution things should be presented at, so they are intuitive and useful.
  [-]
  - gnarlouse 4 hours ago
    Although I haven't personally experienced the feeling of "produced visualizations looking like machine code", I can appreciate the argument you're making wrt judgment-based resolution scaling.
- bigiain 2 hours ago
  > static analysis tools that produce flowcharts and diagrams like this have existed since antiquity
  Today I am apparently on of xkcd's "Lucky 10,000".
  Does anyone have any recommendations for such tools? Ideally open source, but that's not a hard requirement. (Although "Enterprise - if you have to ask the price you can't afford it" options will not work for me.)
  I'm particularity interested tools that work with Python, Java, and Javascript (Angular flavoured Javascript, it if matters)?
  [-]
  - gnarlouse 1 hour ago
    https://www.ensoftcorp.com/products/atlas is the Java/c oriented flavor I'm most familiar with. I've used them for Javascript before previously, although I'd have to do some digging to find the particular package I used. I am confident that you could find one with an npm/pypi search.
bluelightning2k 8 hours ago
I really think more people should give Windsurf a go. It's really good. I'm a senior engineer and do a mix of agentic and regular coding and I really think people are looking past Windsurf.
As the conversation shifted towards Cursor vs Claude code vs Codex people seem to have stopped mentioning it which is a shame.
Source: user for 12 months - not a shill.
Codemaps was a very pleasant surprise when it showed up.
[-]
- gslepak 6 hours ago
  I prefer IDEs like Zed that don't lock me in to their ecosystem and force me to "log in" to use them.
  [-]
  - gnarlouse 5 hours ago
    `codeium` which is now `windsurf` started out as a vscode fork IIRC
    [-]
    - froober 5 hours ago
      Not to be confused with `vscodium` which is an open source build of vscode
      [-]
      - gnarlouse 3 hours ago
        Yeah definitely confused it with vscodium, thx
    - bpavuk 3 hours ago
      you missed the point. Zed develops and pioneers ACP (Agent Client Protocol), which I can also use in other editors and with other agents. at the moment, only Neovim is available as an alternative editor, but nothing stops, say, JetBrains from implementing it. I can plug Codex, Gemini, Claude Code, and Goose directly into my editor of choice.
- CSMastermind 2 hours ago
  I as a big Windsurf advocate, miles ahead of Cursor IMO but I've fully switched to Codex these days. The cloud environments are just such a nice feature.
  Still like Windsurf though their pricing is what drove me to not roll it out across my company.
  [-]
  - swyx 2 hours ago
    cloud envs are good :)
    1) did you compare codex cloud with devin?
    2) how about the new claude code teleport feature from web to cli?
    just wanted to pry for more opinions on what matters to you
- mpalmer 3 hours ago
  I co-sign this as a similarly-credentialed person. I use windsurf at work and recently started enjoying Claude Code, but the UX of Windsurf is actually a legit value add. Codemaps especially - been using them for weeks and they're excellent. Ask me again in a year maybe; churn in code could make maintaining codemaps annoying, but even that seems solvable.
  [-]
  - swyx 2 hours ago
    appreciate the feedback! just a reminder codemaps are based on snapshots of your code when you run them; technically there's nothing to maintain, because you just rerun them if you need to.
    [-]
    - mpalmer 37 minutes ago
      Yep it's low friction, but is it easy to discover that I need to? I guess it's the "needing to know that I need to rerun" that I'm less enthused about.
- corefinder 3 hours ago
  I'm surprised that people still aren't discussing Github Copilot baked into VSCode. I pair agent mode + Sonnet 4 + Sequential Thinking + Tavily MCP servers and it works wonders. I recently prototyped the first version of our SaaS with this setup in a minimal amount of time. Also worth nothing, the pricing is extremely reasonable. Free credits + pay per use. I frequently max out the free tier and have never spent more than $40 per month.
- gnarlouse 5 hours ago
  Agreed. I'm back and forth about whether I want to spend the time with an agentic coding editor yet, because it's sitting right on the cusp of distraction/enhancement.
  I've also tried the 3 C's, and it still feels like Windsurf has the net best user experience.
- all2 7 hours ago
  This is enough for me to give this a go. I've tried a few different tools; abacus.ai (and their IDE), claude CLI, crush-cli. My workflows are still mostly on the command line, and a little in VS Code. I haven't found a flow that works "right", yet.
  [-]
  - swyx 6 hours ago
    first mention i've heard of abacus.ai and IDE. what do you think stands out about them?
    you might struggle with Windsurf since you're so command line heavy. but pro tip - ask for command line work to be done inside of Windsurf's Cascade agent. they were first to the terminal-inside-aichat pattern and i really like how it's much better at command line stuff than i am (or can do the legwork to specify command line commands based on a few english descriptions)
    [-]
    - all2 5 hours ago
      > first mention i've heard of abacus.ai and IDE. what do you think stands out about them?
      Their reasoning agent is better than anything else I've used, tbh. The inability to use it in a CLI environment is why I stopped using it. They have a router that they hook into that "intelligently" chooses models for you in a normal "chat" setting. The power comes with their DeepThink (or whatever) mode that has a VM hooked up to it, as well as many, many well designed agents and internal prompts that handle all sorts of interesting things, from planning to front-end dev, to reasoning about requirements and requirements fulfillment.
      [-]
      - swyx 2 hours ago
        ah yeah i have heard about their router. i wonder if GPT5 doing a router hurt it a bit.
- sama004 8 hours ago
  I just tried out windsurf yesterday, The only thing I hate for now is that when there are changes and I accept one of them, then trying to accept the others gives an error saying the file was changed
  [-]
  - lord_sudo 8 hours ago
    I’m sorry to hear about that. What version are you on? Looking to fix / repro this asap
    [-]
    - swyx 7 hours ago
      (also you can hit "share conversation" or "view response statistics" and then "copy request id" and send to support!)
- dingnuts 6 hours ago
  I've used it, and I thought it was absolute trash. Goes crazy doing shit I don't want. I spend more time deleting crap I didn't want and reviewing and changing its code than I do just writing it myself.
  I know what you're going to say: I need to learn to use this groundbreaking technology that is so easy to use that my product manager will soon be doing my job but also is too hard for me a senior engineer, to find value in.
  Kindly: no, I trust my judgement, and the data backs me up.
  Have you taken measurements of how many features and bugs you've shipped over the last twelve months or are you just like the engineers in the METR study who self reported an improvement but when measured, had been impaired? What evidence do you have that your attitude is not simply informed by the sunk cost of your subscription?
  Please share your data below
  [-]
  - ghurtado 6 hours ago
    Sir, this is a Wendy's.
    [-]
    - chrisweekly 4 hours ago
      https://knowyourmeme.com/memes/sir-this-is-a-wendys
    - gnarlouse 5 hours ago
      Stop, I bruised a rib laughing at this
asdev 6 hours ago
A feature like this isn't useful because knowing what connects to what, dependencies, etc. means nothing without business context. AI will never know the why behind the architecture, it will only take it at face value. I think technical design docs which have some context and reading the code is more than enough. This sits in the middle ground where it lacks the context of a doc and is less detailed than the code.
[-]
- philippta 6 hours ago
  To add to that, a lot of business context is stuck in people‘s heads. To reach the level of a human engineer, the coding agent would have to autonomously reach out and ask them directed questions.
- CharlesW 6 hours ago
  > AI will never know the why behind the architecture…
  That's true only if you don't provide that context. The answer is: Do provide that context. My experience is that LLM output will be influenced and improved by the why's you provide.
  [-]
  - asdev 6 hours ago
    if you know that context, you don't need a codemap
    [-]
    - CharlesW 2 hours ago
      As you just said, codemaps don't include the "why" behind the architecture. That's context you need to add.
    - baq 6 hours ago
      this is possible if you have a couple two-pizza teams. beyond that, good luck.
  - dingnuts 6 hours ago
    it takes longer to explain the context to the model than it does to just write the code based on the context I already understand, especially since code is more terse than natural language
    [-]
    - fizx 6 hours ago
      Definitely, iff you have to provide the context with every task. If agent memory worked better and across your whole team, then providing context might be much easier
    - Jaxan 6 hours ago
      But wouldn’t the context also be useful, in written form, to colleagues?
      [-]
      - CharlesW 2 hours ago
        It absolutely is, yes.
- swyx 6 hours ago
  you might be surprised how much business context leaks into a codebase and that's plenty to work on :)
  https://deepwiki.com/search/vimfnfnname-lets-you-call-neov_e...
  but also how much you kinda dont need it when you're just debugging code
  https://windsurf.com/codemaps/87532afd-092d-401d-aa3f-0121c7...
  [-]
  - asdev 6 hours ago
    agree that AI can kinda infer business context sometimes. in my experience, it doesn't work that well.
    a lot of the time, debugging isn't a logic issue, but more of a state exploration issue. hence you need to add logging to see the inputs of what's going on, just seeing a control flow isn't super useful. maybe codemaps could simulate some inputs in a flow which would be super cool, but probably quite hard to do.
  - thedelanyo 5 hours ago
    [dead]
ChrisbyMe 7 hours ago
this is the right way to try and tackle this problem imo. too much focus in AI dev tooling has been on building "products" that only half work.
making codebases understandable to humans, and LLMs etc, is a better approach
self documenting, interpretable systems would actually solve a lot of dev churn in big companies
plus it's not like artifacts have to be limited to code once that's figured out
[-]
- esafak 7 hours ago
  I don't think it's a choice; I use both. Code understanding is especially useful in new code bases, but once that's over you need to get work done.
ashirviskas 3 hours ago
So it is the same thing when I ask Claude to build me mermaid charts of code flows? So no point in this tool?
[-]
- ravila4 2 hours ago
  Btw, claude code is a lot better at graphviz than mermaid! I have been using it a lot for architecture designs.
- bigwheels 3 hours ago
  This was my conclusion, too! Over time as agentic coders get better at handling higher-complexity tasks, this kind of bracing will become less and less necessary.
kingjimmy 3 hours ago
Out of nowhere Cognition with a banging product. Probably not 100% yet but the idea is so good I'll be surprised if within 6 months all the other IDEs aren't copying.
[-]
- swyx 2 hours ago
  hey we been shipping!!
  https://cognition.ai/blog/swe-1-5
  https://cognition.ai/blog/swe-grep
  https://cognition.ai/blog/devin-agent-preview-sonnet-4-5
swyx 8 hours ago
(coauthor) happy to take any questions! see 1 min demo video here https://x.com/cognition/status/1985755284527010167
this is brainchild of cognition cto steven who doesn't like the spotlight but he deserves it for this one https://x.com/stevenkplus1/status/1985767277376241827
if you leave qtns here he'll see it
[-]
- bluelightning2k 7 hours ago
  Less a question and more a strong suggestion: codemaps should be viewable in the main pain. The sidebar is FAR too small. Either default or a button or something to open it like an editor tab.
  [-]
  - swyx 7 hours ago
    acked. you can also open it in a browser window for now https://windsurf.com/codemaps/9e2791c4-0b14-4757-b4be-a71488...
stevenkplus 5 hours ago
for people too lazy to download windsurf to try it, codemaps is also in deepwiki
example: https://deepwiki.com/search/how-do-react-hooks-work-under_7a...
this does a pretty good job of going in the weeds of how the useState hook works in react
jrochkind1 5 hours ago
Figuring out new codebases is definitely one of the most challenging and time-intensive things I have had to do in my jobs.
[-]
- swyx 2 hours ago
  we obviously agree but one of the problems i had with onboarding as a key proposition is that onboarding "seems" like a onetime problem. i lacked the datapoints or anecdotes to convincingly pitch "onboarding = context switching", a more recurring problem as the size of your team and size of your codebase and size of your tenure grows, even if its technically the same codebase, you're "always" onboarding to wahtever it is you're working on or maintaining or putting out fires.
  [-]
  - jrochkind1 51 minutes ago
    Depends on how big the codebase is and how many people work on it and how often you need to switch to an unfamiliar context, but yes agreed, it's context switching, and is a regular thing. It is part of the job of software engineering, learning piles of code you were previously unfamiliar with.
dennisy 7 hours ago
Looks an interesting enough feature to give Windsurf a try!
alansaber 5 hours ago
I really like this kind of applied statistical data infrastructure approach, feels much more natural than just raw text + immediate HIL
[-]
- swyx 5 hours ago
  huh? whats statistical about it? i hope i didnt give the wrong impression in the post.
  but yes, keeping the human in the loop, in charge, on top of the code, is the way to prevent ai slop code
yunyu 6 hours ago
Great idea. I always end up having to tag the relevant files/abstractions anyways to avoid having the LLM produce duplicated slop, and something like this makes collecting this info much easier.
rmonvfer 6 hours ago
This looks awesome. I’m a very heavy Claude Code user (and Codex) in both the CLI and VS Code (and now in the web too!) and it’s quite infuriating when the agent just gets lost after context compaction and I have to point it to read CLAUDE/AGENTS.md (and update it if a lot of changes have been made)
I tried Windsurf a while back but I’ll definitely come back ASAP just to play with this and see how it does in a somewhat complex project I’m working on.
Kudos to the team!
fHr 2 hours ago
this is amazing!