> While I’m certain that this technology is producing some productivity improvements, I’m still genuinely (and frustratingly) unsure just how much of an improvement it is actually creating.
I often wonder how much more productive I'd be if just a fraction of the effort and money poured into LLMs was spent on better API documentation and conventional coding tools. A lot of the time, I'm resorting to using an AI because I can't get information on how the current API of something works into my brain fast enough, because the docs are nonexistent, outdated, or scattered and hard to collate.
As someone who works across a broad range of activities, it supercharges a lot of things. A critical eye is required, though. I estimate 40%-60% improvements on basic coding tasks.
Yeah I get this impression too. AI feels like it's papering over overwrought and badly designed frameworks, tech stacks with far too many things in them, and also the decline of people creating or advocating for really expressive languages.
Pragmatic sure, but we're building a tower of chairs here rather than building ladders like a real engineering field.
> To what degree did I expand scope because I knew I could do more using the AI?
Someone at work recently termed this “Claude Creep”. It’s so easy to generate things that it pushes you towards going further, but the reality is that you’re setting yourself up for more and more work to get them over the line.
the flip side of claude creep is that the easy parts are now genuinely free, which means all your time goes to the 30% that was already hard. ai doesn't save you time on the hard bits, it just eliminates the excuse to not have done the easy bits first.
what's helped: think in postconditions, not tasks. instead of 'add feature X', define 'the tests pass and the user can do Y'. the agent figures out what X means. without that anchor there's nothing to mark as done, so scope drifts indefinitely.
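A rough sketch of the postcondition idea in Python, in case it helps. Everything here is made up for illustration (the `Postcondition` class, the `unmet` helper, the `state` dict and both example checks); it is not any agent framework's API, just one way to write "done" down as checkable facts.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Postcondition:
    """One observable fact that must hold before the work counts as done."""
    description: str
    check: Callable[[], bool]


def unmet(postconditions: List[Postcondition]) -> List[str]:
    """Return the descriptions of the postconditions that do not hold yet."""
    return [p.description for p in postconditions if not p.check()]


# Hypothetical project state. In practice these checks would run the test
# suite or exercise the feature instead of reading a dict.
state = {"tests_pass": True, "user_can_export_csv": False}

done_when = [
    Postcondition("the tests pass", lambda: state["tests_pass"]),
    Postcondition("the user can export a CSV", lambda: state["user_can_export_csv"]),
]

print(unmet(done_when))  # ['the user can export a CSV']
```

When `unmet(done_when)` comes back empty, the work is done and anything beyond that is new scope you have to choose on purpose.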
100%
Over the years I've amassed hundreds of code boilerplate snippets/templates that I would copy and paste and then modify, and now they're all just sitting in Obsidian gathering dust.
Why would I waste my time copying and pasting when I can just have Claude generate me basic Ansible playbooks on the spot in 30 seconds?
Some of the expanded scope that I’ve done almost for free is usually around UX polish and accessibility. I even completely redid the --help for a few CLI tools I have when I would never have invested over an hour on each before agents.
I agree that the efficiency and quality are very hard to measure. I’m extremely confident that when used well, agents are a huge gain for both though. When used poorly, it is just slop you can make really fast.
Dude. I’ve been thinking about this a lot! I think it’s because the traditional way we internalize the costs of what we are building just got taken for a ride. We don’t really (or I don’t anyway) fully know what “too much scope” feels like with one of these Claude thingies. So it’s easy to both completely overestimate complexity and underestimate it too. Sometimes the LLM makes a seemingly daunting refactor super simple and sometimes something seemingly not complex can take it forever… and there really isn’t, for me, a good “gut sense” of how something will go.
So lately I’ve just decided that I’ll time box things instead of setting defined endpoints. And by “endpoint” I really mean “I’m done for the day” and honestly maybe thinking about it… “I’m done with this project”.
I don’t know. But the term “Claude Creep” is absolutely something I can identify with. That thing will take you down a rathole that started with just pulling in some document and ends with you completely repartitioning your file system. lol.
> I’ve had the idea that from a social perspective it’d be regarded like plastic surgery, in that it only looks weird when its over-done, or done badly.
An important aspect of comparison is that nobody is going to tell you that your surgery is noticeable or looks bad.
Your friends, family, partners, coworkers, aren't going to say anything, neither are people you meet casually, certainly not service workers, strangers aren't going to pull you aside to tell you the truth about your nose job, etc.
I hope the same social taboo doesn't transfer over to AI content. We should honestly critique AI generated content, used either in-whole or in-part with human creations. If the inclusion of AI content botched your article, saying so should be socially acceptable.
We saw some of this here on HN. It used to be that when AI content would be submitted here, it was a social faux pas to even mention it was LLM generated, same thing with LLM generated comments, no matter how obvious it was. Mentioning a comment was AI was socially verboten and you'd be finger-wagged at.
Eventually, AI fatigue caused the community to discount Show HN entries, submissions and comments, and the signal to noise ratio could no longer be ignored.
Now, turn on showdead. Those same comments, that users were expected to interact with as if they were made in good faith by real people, litter every submission's comment section. These comments objectively hurt discussion and it's a good thing they're shadowbanned.
Culturally, I hope we can reach a point where critique of AI content, including code, doesn't brand critics as haters, Luddites, or worse, and stifle conversation about what our communities really value and want.
It's funny you mention that. The only difference is sometimes you need a functionality without doing the plumbing. At the end of the day if you're getting the output you need, the process doesn't matter. It's an interesting analogy but only works if the inspector is another expert dev.
I would agree with the utility of Claude and Claude Code. Claude feels like your own executive assistant, sales team and IT department. Combine that with Claude Code and you can build some incredible things. As an example, I used Claude to advise me on starting a business and building an MVP. After a few weeks of refinement I was able to create something I never could have done without Claude. It is a game changer for sure.
> (The) Output was coherent but its ‘style’ was very boring and overtly inoffensive, which was (and still is) a clear limitation of the technology.
The style isn’t a limit of the technology, it’s a limit of the lobotomized models from OpenAI and Anthropic. The open source community has lots of models that are great at creative writing.
The Gartner hype cycle has 5 phases: the technology trigger (6 months - 2 years), the peak of inflated expectations (6 months - 2 years), the trough of disillusionment, the slope of enlightenment (2 - 5 years), and the plateau of productivity (5+ years). After that comes the decline into obsolescence, which no one talks about. If we are in fact at the 40th month then we are either approaching the peak of inflated expectations, the slope of enlightenment, or the plateau of productivity. I would say we are probably approaching the peak of inflated expectations. We are constantly hearing the symptoms of the 'This Time is Different' Syndrome from people saying the old rules don’t apply, which is the classic sign the peak is approaching. The average financial bubble bursts after 3 years, however the dot-com bubble burst 5 years after peak and the housing bubble took 3-4 years. We are probably in the “bubble mania” phase right now because of all the irrational exuberance. Ride the Lightning!
The section about being "glazed" into action resonates. Hidden within this concept I think is something profound about human motivation, innuendo and all.
> AI generated prose is at best boring, and at worst genuinely unappealing. I’m continually tempted, because in theory it should work well. The AI has perfect spelling and grammar, has more than enough context to produce article-length content, and can do in seconds what takes me hours.
I have a thesis in mind...that there is something fundamental to the human spirit that relishes a sort of friction that LLMs cannot observe or reproduce on their own.
Do you regularly find text content that you know is AI written (but is not marked as such)? Because honestly I don't, and it must exist in decent quantity by now. Or perhaps it's still sparse?
Have a look here [1] and here [2] - I think they are good resources, but fallible in the long run. I think yes, I do, often confirmed by communication with people I know (i.e. I suspect they have used AI to make something -> I ask). This falls victim to confirmation bias, though. I suspect a nontrivial amount of writing I read is AI generated without me realising, and I'm wary also of falsely flagging AI-generated content that is actually from humans.
I think the second resource that you linked to is valuable. The first is useless unless you're a Wikipedia editor, the significance of verifying citations notwithstanding.
The gap between LLM-generated writing and the composite style of the average Wikipedia page is more narrow than most people may believe.
You will start to recognize it over time. The major AI models each have their own voice and patterns that they overuse.
The more you see those patterns the more you start recognizing them. By now I can recognize quickly if a blog post or README.md was generated by Claude or ChatGPT because the signs are so obvious.
Even Hacker News comments that are AI written are easy to spot if they weren't edited. I know I'm not alone because when I recognize an AI comment I check their comment history and find other people calling out their AI-generated submissions, too.
Learning how to recognize the output of the popular AI models is becoming a critical business skill, too. You need to be able to separate out the content from someone who was doing real work that you should take seriously as opposed to the output of someone who is having ChatGPT produce volumes of text that they don't review. The people who do that will waste your time.
This is a temporary problem. Look at how fast things are progressing. Things will improve until none of this matters because the output is indistinguishable.
Yes, often, and often here on HN or Substack if I point it out, it doesn't lead to anything good. Many don't recognize it, many do, the author gets defensive etc.
This article doesn't have the tells, it looks human written.
I found that many people don't have a radar for this. They may know about delve, emdashes, tapestry, multifaceted or "not just X but Y" and if these are not there they don't see it.
Bro, but... you now have a business that was planned by a paid chatbot. They can shut it down at any time or make it more expensive. It is also impossible to get something new out of it: you are copying from somewhere else, and maybe what Claude is copying is copyrighted, like leaked code. Your brain will also slowly stop thinking about 'business', so you will rely heavily on Claude in the future :)
My friend is trying to do the same, the Docker stack he made for his SaaS is really amazing, it is following the standards from the ancient age.
I suspect you'll (a small-medium business) be able to buy a Claude 4.6-class rack mount device for $6000 by 2030 that does 100 t/s with 1 million token context, which honestly, is probably adequate for an office (front office, back office, executive tier etc) of 10-300 unless you've got more than 4 engineers on staff. That kind of offline device is going to push everyone to provide that kind of cloud-enabled baseline service at very low cost. The Qwen 3.5 series is already showing you can almost (but not quite) squeeze that kind of performance out of consumer hardware. 256/512gb consumer video cards will get us there, eventually, if capacity ever catches up with demand.
I don't bring huge codebases to it.
[1] https://en.wikipedia.org/wiki/Wikipedia%3AAI_or_not_quiz
[2] https://en.wikipedia.org/wiki/Wikipedia%3ASigns_of_AI_writin...
AI generated. Some of the clues include:
- Most obviously, a failed ISBN checksum
- Other source-to-text integrity issues; for example, the WWF source says very little about Malaysia specifically, only mentions Sunda tigers (Panthera tigris sondaica), and does not mention tapirs at all
- Very short yet consistent paragraph length
- Generic "see also" links, one of which is redlinked
This is not the sort of thing that I pay attention to unless I'm doing detailed research. And even then I'd probably have a bot check these for me, ironically, since it's such a mechanical job. At the very least detecting AI like this requires conscious effort.
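For what it's worth, the ISBN clue really is mechanical. Here is a minimal Python sketch of the standard ISBN-10 (weighted sum mod 11) and ISBN-13 (alternating 1/3 weights mod 10) check-digit rules. The function names are my own, and it only checks the arithmetic, not whether the ISBN was ever assigned to a real book.

```python
def _clean(isbn: str) -> str:
    """Drop hyphens and spaces and normalise a trailing 'x' to 'X'."""
    return isbn.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn: str) -> bool:
    s = _clean(isbn)
    if len(s) != 10 or not s.isascii():
        return False
    if not s[:9].isdigit() or not (s[9].isdigit() or s[9] == "X"):
        return False
    # 'X' is only allowed as the check digit and stands for 10.
    digits = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    # Weights run 10, 9, ..., 1; the weighted sum must be divisible by 11.
    total = sum(w * d for w, d in zip(range(10, 0, -1), digits))
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    s = _clean(isbn)
    if len(s) != 13 or not s.isascii() or not s.isdigit():
        return False
    # Weights alternate 1, 3, 1, 3, ...; the sum must be divisible by 10.
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s))
    return total % 10 == 0


def isbn_checksum_ok(isbn: str) -> bool:
    """Dispatch on length; anything that is not 10 or 13 characters fails."""
    n = len(_clean(isbn))
    if n == 10:
        return is_valid_isbn10(isbn)
    if n == 13:
        return is_valid_isbn13(isbn)
    return False


print(isbn_checksum_ok("978-0-306-40615-7"))  # True
print(isbn_checksum_ok("978-0-306-40615-8"))  # False
```

A failed checksum only tells you the citation was mistyped or made up; a passing one says nothing about whether the source supports the text, so the other clues still need a human or a second pass.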
I can easily tell AI writing. I'm sure plenty goes under the radar, but I can still catch a lot.
HN and YouTube are the worst offenders for me.
https://www.seriouseats.com/eggplant-grilling-tips-11759622
Local models are about 25 months behind the current SOTA. If that holds, businesses won't need the paid models for many things.
Not counting from 1971's DARPA? Sorry, I'm allergic to LLMs being called AI as if nothing existed before them.