While there's not a lot of meat on the bone for this post, one section of it reflects the overall problem with the idea of Claude-as-everything:
> I spent weeks casually trying to replicate what took years to build. My inability to assess the complexity of the source material was matched by the inability of the models to understand what it was generating.
When the trough of disillusionment hits, I anticipate this will become collective wisdom, and we'll tailor LLMs to the subset of uses where they can be more helpful than hurtful. Until then, we'll try to use AI to replace in weeks what took us years to build.
It's amusing to think that claude might be better at generating ascii diagrams than generating code to generate diagrams, despite it being nominally better at generating code.
I'm generating a lot of PDFs* in claude, so it does ascii diagrams for those, and it's generally very good at it, but it likely has a lot of such diagrams in its training set. What it then doesn't do very well is aligning them under modification. It can one-shot the diagram, it can't update it very well.
The euphoric breakthrough into frustration of so-called vibe-coding is well recognised at this point. Sometimes you just have to step back and break the task down smaller. Sometimes you just have to wait a few months for an even better model which can now do what the previous one struggled at.
Funny to see this show up today since coincidentally I've had Claude code running for the past ~15 hours attempting to port MicroQuickJS to pure dependency-free Python, mainly as an experiment in how far a porting project can go but also because a sandboxed (memory constrained, to us time limits) JavaScript interpreter that runs in Python is something I really want to exist.
I'm currently torn on whether to actually release it - it's in a private GitHub repository at the moment. It's super-interesting and I think complies just fine with the MIT licenses on MicroQuickJS so I'm leaning towards yes.
> I spent weeks casually trying to replicate what took years to build. My inability to assess the complexity of the source material was matched by the inability of the models to understand what it was generating.
When the trough of disillusionment hits, I anticipate this will become collective wisdom, and we'll tailor LLMs to the subset of uses where they can be more helpful than hurtful. Until then, we'll try to use AI to replace in weeks what took us years to build.
I'm generating a lot of PDFs* in claude, so it does ascii diagrams for those, and it's generally very good at it, but it likely has a lot of such diagrams in its training set. What it then doesn't do very well is aligning them under modification. It can one-shot the diagram, it can't update it very well.
The euphoric breakthrough into frustration of so-called vibe-coding is well recognised at this point. Sometimes you just have to step back and break the task down smaller. Sometimes you just have to wait a few months for an even better model which can now do what the previous one struggled at.
* Well, generating Typst mark-up, anyway.
I'm currently torn on whether to actually release it - it's in a private GitHub repository at the moment. It's super-interesting and I think complies just fine with the MIT licenses on MicroQuickJS so I'm leaning towards yes.
Its got to 402 tests with 2 failing - the big unlock was the test suite from MicroQuickJS: https://github.com/bellard/mquickjs/tree/main/tests
Its been spitting out lines like this as it works: