A lot of people are down on AI in this thread, but I'm watching the industry slip over the line of trust with these latest frontier models. GPT 5.5 is the first model good enough for me to just let rip.
Every Jira ticket I see now has acceptance criteria, reproduction steps, and detailed information about why the ticket exists.
Every commit message now matches the repo style, and has detailed information about what's contained in the commit.
Every MR now has detailed information about what's being merged.
Every code base in the teams around me now has 70 to 90%+ code coverage.
Every line of code now comes with best practices baked in, helpful comments, and optimized hot paths.
I regularly ship four features at a time now across multiple projects.
The MCP has now automated away all of the drudgery of programming, from summarizing emails, to generating Confluence documentation, to generating slide decks.
People keep screaming that tech debt is going to pile up, but I think it's going to be exactly the opposite. Software is going to pile up because developing it is now cheap.
Most code before LLMs sucked. Most projects I onboarded to were a massive ball of undocumented spaghetti, written by humans. The floor for what bad code can even look like has been raised significantly, and fixing issues is now basically free if your company is willing to shell out for tokens.
What you are describing is the role of a manager, not a software engineer. Software engineering has very little to do with writing code; it is much more about architecting, at a higher level, what needs to be done. The code is just the executional part. LLMs can code? OK, good. Without a clear architectural pathway / direction, that code is just useless. It's not tech debt. It's just a bunch of random strings. You can argue that Claude Code and others do create a plan of attack - but still, that's not at the architectural level, but rather the executional level.
To me, architecture starts all the way from the top - even before you write a single line of code, you do the DDD (Domain-Driven Design), then create a set of rulesets (e.g. use the domain name as a table prefix) and contexts, and then define the functionality w.r.t. that architecture. LLMs can do all this - but only if you ask them to explicitly. So they are pretty useful to brainstorm with, but they can't autonomously design reliably, push to production with your eyes closed, and support a 100,000-user base. It's a far cry from that.
But sure, you can upsell management on vanity metrics like lines of code and get that promotion with an LLM. It's still not software engineering, though.
> Software is going to pile up because developing it is now cheap.
Software to do what, though?!
Coding, maybe 10% of a developer's job (Brooks' "No Silver Bullet" estimates 1/6), was never the bottleneck, and even if you automated it away entirely you'd only reduce development time by 10% (assuming you are not doing human code review, etc.).
I would also argue that software development as a whole (not just the coding part) was typically never the bottleneck to companies shipping product faster, and maybe not to automating their business faster (internal IT systems) either: the rest of the company is not moving that fast, business needs are not changing that fast, and the external factors that might drive change are not moving that fast either.
I think that when the dust settles we'll find that LLM-assisted coding has had far less impact than those trying to sell it to us are forecasting. There will be exceptions of course, especially in terms of what a lone developer can do, or how fast a software startup can get going, but in terms of impact to larger established companies I expect not so much.
For people who like to tick boxes, which is essentially most of the above, AI is welcome. That includes managers.
It still has nothing to do with software engineering. All good code was written by humans. AI takes it, plagiarizes it, launders it, and repackages it in a bloated form.
Whenever I look deeply at an AI-plagiarized mess, it looks like it is 90% there, but in reality it is only 50%. Fixing the mess takes longer than writing it oneself.
The hard part of software engineering is turning a vague problem description into a set of box-ticking exercises. If ticking boxes has become genuinely easier, the software engineering part is now a lot more valuable.
> Your linter should identify all issues - including architectural and stylistic choices - and the AI agents will immediately repair them.
> It's about 1000x faster than a human coder at repairing its own mess.
If a linter could deterministically identify bad architecture, you wouldn't need an LLM, your linters could just write your code for you. The vibe coding takes are just getting more and more empty-headed...
a) That's not what a linter is built for; it's a tool with a very specific role.
b) You must've never seen an LLM expose secrets in plain text, or use the most convoluted scenarios you can think of.
> I regularly ship four features at a time now across multiple projects.
Many people are missing the fact that LLMs allow ICs to start operating like managers.
You can manage 4 streams now. Within a couple years, you may be able to manage 10 streams like a typical manager does today.
IME, LLMs don't speed you up that much if 1) you're already an expert at what you're doing (inherently not scalable), 2) you're only working on one thing (which doesn't make sense when you can manage multiple streams), or 3) you're doing something LLMs are particularly bad at (not many remaining coding tasks, but definitely still some).
Can that happen without you? I would assume this is the next step. I don't find it either good or bad, but I'm genuinely curious where this all goes.
A manager doesn't have to look at the code that's being shipped. An IC will still need to do that, and this will eventually take up much of their work. It can be addressed by moving up the stack to higher level and more strictly checked languages, where there's overall less stuff to review manually.
Just like a manager, you don't need to look at the code. You need to set up quality systems that provide evidence the code does what it is supposed to do.
I agree with most of this; I’ve just sort of turned a blind eye to what the code actually looks like. Reviews are rapid, and I’ll admit I do feel like I’m betraying my inner programmer by just optimizing directly against the claims of the token bot. But the way I see it, as long as the numbers don’t lie, I’m okay with the process.
Everyone talks about productivity as if that is the only metric that matters in the business.
> The MCP has now automated away all of the drudgery of programming, from summarizing emails, to generating Confluence documentation, to generating slide decks.
I wonder about the hallucination. Reading someone's writing doesn't take all that long.
Is programming supposed to suck all the time? Am I doing it wrong? I mean yeah, sure, it sucks sometimes, but overcoming that "suck" is where I feel progress and growth. If we decide to optimise that away...What the fuck am I doing here? No offence to managers, but if everybody is a manager, is anybody?
> GPT 5.5 is the first model good enough for me to just let rip.
You know this is the exact same thing that was said when Opus 4.6 came out, right?
That makes it hard to believe, because it's the same "last week's model was so far behind you can't even comprehend" meme that's been going on throughout the last year.
More info dumped into tickets and projects is great for understanding, for both people and LLMs. But hopefully it's not LLM-generated.
I think numerically this is the exception - and it's a fantastic exception! But in practice what I've seen is things getting worse because people still just aren't very good at thinking, so the great-looking Jira ticket actually turns out to be nonsensical in some subtle way, whereas before it was just lacking in some obvious way that could immediately be called out and had an obvious solution.
I.e. it's making good output better, but it's making mediocre output (which is most output) worse by adding volume and the appearance of quality, creating a new layer of FUD, stress, tedium, and unhappiness on top of the previously more-manageable problems that come with mediocre output.
I'm still seeing this even with the newest models, because the problem is the user, not the model - the model just empowers them to be even worse, in a new and different way.
If writing code was the only part of the job, and it was easy, these jobs wouldn't pay so well.
https://somehowmanage.com/2020/10/17/code-is-a-liability-not...

Every American learns how to live with debt :)
Engineering is hard. It's always going to be hard. I'm glad that AI makes some parts of it easier, and we (software engineers) can focus on engineering, that's nice.
Code is NEVER cheap. Just because, at current completely unrealistic AI pricing, using agents is cheaper than hiring juniors, does not make code cheap. It makes producing code cheap, which has always been low-cost. Every line of code is a cost, is a maintenance burden, is complexity. An AI, even with somehow infinite context window, will cost more money the more code you have.
Could you replace a whole team of engineers with AI? Probably, yeah. Could you simply fire everyone at your company and close it down, without much of a problem? Also probably yes, for most companies.
AIs can help with debugging, can help with writing code, with drafting designs, they can help with almost every step. The second you let OpenAI, or Anthropic, take full code ownership over your products, and you fire the last engineer, is the time when the AI pricing can go up to match what engineers make today. You've just reinvented the highly paid consultant.
Or you could take the middle-ground and hire good engineers, make sure they maintain an understanding of the codebase, and let them use whatever tools they use to get the job done, and done well. This is the way that I've seen competent companies handle it.
"We" should not do anything. The LLM industry should go and find solutions for the problems they created, themselves. Not offload it to others through sneaky influencer posts. And we should hold them responsible, should they not be able to address the problems they are creating.
Certain types of code are cheap. Proof of concept is cheap. Adding small features that fit within the existing architecture is cheap. Otherwise, I'm not so sure. Coding agents are fantastic at minutiae, but have no taste. They'll turn a code base into a ball of mud very quickly, given the opportunity.
While I agree with you that agentic coding still has quite a way to go and is not always producing the quality that I would want from it, I can say quite confidently that its baseline is way above some of the production code in many applications many people use today. It really isn’t that code before agents was primarily written with taste and beautiful structure in mind. Your average code base is a messy hell full of quick fixes that turned into all kinds of debt over the years.
I took the previous post, with its mention of the ball of mud, to be about complexity.
“Taste” is used in many cases, I suspect, to give a name to the collection of practices and strategies developers use to keep their code and projects at a manageable level of complexity.
LLMs don’t seem to manage complexity. They’ll just blow right past manageable and keep on going. That’s a problem. The human has to stay in the loop because LLMs only build what we tell them to build (so far).
BTW, the essay that introduced the big ball of mud pattern to me didn’t hold it up as something entirely bad to be avoided. It pointed out how many projects — successful or at least ongoing projects — use it, and how its passive flexibility might actually be an advantage. Big ball of mud might just be the steady state where progress can be made while leaving complexity manageable.
I think there are at least two factors behind ye olde ball of mud that LLMs should be able to help with:
1. Lack of knowledge of existing conventions, usually caused by churn of developers working on a project. LLMs read very quickly.
2. Cost of refactoring existing code to meet current best practices / current conception of architecture. LLMs are ideal for this kind of mostly mechanical refactoring.
Currently, though, they don't seem to be much help. I'm not sure if this is a limitation in their ability to use their context window, or simply that they've been trained to reproduce code as seen in the wild, with all its flaws.
Keeping complexity down is always a conscious act, because you need to go past the scope of the current problem and start to think about how it affects the whole project. It’s not a matter of convention, nor of refactoring. It’s mostly prescience (due to experience) that a solution, even if correct and easy to implement, will be harmful in the long term.
Architecture practices are how you avoid such harmful consequences. But they’re costly, and often harmful themselves. So you need to know which to pick and when to start applying them. An LLM won’t help you there.
I agree. I do wonder if what I'm seeing is a limitation of the reasoning power of LLMs or if it's just replicating the patterns (or lack thereof) in the training data.
Preproduction code was always cheap or even free. Sales people have been selling software that didn't do what was on the tin since the dawn of time. Those features cost 0 dollars to write!
Production code is expensive - especially production code with bugs. It can cost you customers; you can even get negative money for it in the form of lawsuits.
Coding agents are great for preproduction and one-offs. For production, I really wouldn't chance it at any scale above normal human output.
Except here's the thing: that's the sort of code that was extremely expensive before, in large part because of our day jobs (which still to this day require mindfulness and can't just be vibe-coded).
However, an extra script here or there to make your life easier, extra UI features based on some datapoint in your internal dashboard, etc. - these were things that could have taken a few days you didn't have to get exactly right, and now they can be done with only a few minutes of attention.
I came here exactly to point out what I'm glad to see is #10. "Free as in puppies" is a wonderful way to put it.
Every time I open LinkedIn I'm scared of how many big heads have taken the wrong lesson that almost-free coding == free engineering. So many bait posts asking engineers why they would need to pay them any longer, or being glad they're generating millions of lines a month... this is going to end badly.
> 10. Code is cheap, but maintenance, support, and security aren’t.
I also keep circling around this point. So many software repositories in the AI space seem to follow a publish-and-forget pattern. If you can simply show that you have the patience to maintain a project, ideally with manual intervention instead of a fully autonomous AI, then you already have an outstanding project.
I had a business owner tell me that they don't need to hire juniors anymore because Claude can do all of that work for them. This was not a software shop, so it's not even about writing code, but I thought that's something that will bite them in the near future. A business that is not investing in juniors is a business that is not investing in the future.
The role of AI in non-software shops is going to be interesting. To a great extent it's not competing with devs, it's competing with Excel. However bad a system your AI can produce, it can't compare to the workflows that a group of non-techies armed only with Office can produce.
On the other hand, like giving a supercar to a teenager, this just enables them to get into trouble faster.
(the "my vibe coded app deleted prod!" stories are funny schadenfreude when they happen to SV startups, whose whole business is pretending to know better. When this happens to a small business who've suddenly lost all their finanacials and now maybe will lose their house, it's a tragedy. And this can happen on a much larger, not AI-related scale, like Jaguar Land Rover: https://www.bbc.co.uk/news/articles/cy9pdld4y81o )
> The role of AI in non-software shops is going to be interesting
I have a friend in West Texas who does industrial electrical gear sales (like those giant spools of cable you see on tractor trailers). He’s 110% good old boy Texan but has adopted and loves AI. He says it’s been a huge help pulling quotes together, among other tasks. Coincidentally, he lives in Abilene, where one of the Stargate campuses is going. BTW, the scale of what’s being built in Abilene is like nothing I’ve ever seen.
Agreed, but a worrying number of managers and leaders spend time there for reasons I never fully understood, so it offers a glimpse into their worldview.
The issue is that when you gaze long into an abyss, the abyss also gazes into you.
Guy works for the Overture Map Foundation, with Amazon, Microsoft etc. being sponsors. He has been boosting AI all over the Internet. I'm sure Microslop and Amazon are very happy with these efforts.
I'm glad that "10 ways to do X" submissions are allowed as long as they boost AI.
Are you suggesting that Microsoft and Amazon's sponsorship of Overture comes with an understanding that people who work on Overture will spend their time writing articles that "boost AI"?
Does "boosting AI" include opening an article with "Frontier models are really good at coding these days, much better than they are at other tasks"?
Can't speak for the former, but the latter question: yes.
"Product is really good at X, much better than at Y" does not imply that it's bad at Y, and even if it did, if you're targeting an audience that only cares about X, who gives a shit about Y? Might as well throw Y under the bus to boost the perceived effectiveness of product at X even more in comparison.
I am in India; junior developer hiring is way down. AI has reduced offshoring to India and eliminated the need for janitor work (often offloaded to juniors).
Many people are finding it difficult to even land internships.
The most affected areas are sysadmin, devops, and frontend, where you'll have a very hard time getting any offer.
Companies like BrowserStack are withdrawing campus placement offers.
Meanwhile, I am writing apps for my own use and have already reached 10,000+ monthly active users. I am making zero money from all this, but it's fun.
Looking at the entire market in Europe, it is also down, but that is not due to "AI"; it's because they are the easiest to fire with the least consequences. There is a global recession looming, despite Wall Street saying otherwise.
#10 needs more emphasis than it receives. Cheaper code doesn't automatically lead to good product decisions.
Instead of focusing on whether you can build it, the scarcer resource becomes whether you should build it. And most teams lack a clear process for addressing this latter question. Requirements are collected in all sorts of places without ever being prioritized in an organized fashion. This is exacerbated by cheaper code. With cheaper code, you can release five times what you used to be able to release in a given period of time, but only if you knew which five products you needed.
This is such a weird argument (besides the obvious #10, which will bite back with a vengeance), because... code can't be cheaper than free!
Since at least the early 80s a LOT of very important code wasn't cheap, it was free. Both free of cost (you could "just" download it and run it) but also free as freedom-respecting software.
I just don't get the argument that cheap is new. Cheap is MORE expensive than free!
Free but you're responsible for maintaining it means it's not free. It's the same issue as maintaining your own fork. It's just an ongoing cost.
(Though as AI becomes autonomous enough to be the maintainer, that cost kind of goes away. Then it's just the cost of managing the "dev".)
It’ll be priced slightly higher than the cost to actually run. But it’s still not clear what the real cost of the big models is. They seem very subsidised, but by how much?
It remains an unproven hypothesis. The revenue of the top 2-3 labs is still growing nearly exponentially, which is the ultimate piece of data that settles the question empirically for now. Benchmark scores aren't really proof. Benchmaxxing is possible, for example. Only revenue numbers (and gross margins) count.
The ultimate piece is not revenue but profit. At some point these enormous investments will have to be earned back. Good luck with that when open weight models are also continuously improving, have cheap providers and for many are already very usable.
You can easily develop with models like GLM 5.1 and Kimi k2.6 at a fraction of the cost of GPT 5.5 or Opus 4.7. Requests often cost just a few cents.
If anything, I would bet that next year you could get today’s flagship performance for significantly cheaper via an open-weights model.
Open-source models have caught up tremendously recently. Those who can’t or don’t want to invest a lot of money can already develop with Kimi and GLM without any problems. We don’t have to wait another year for that.
Tried DeepSeek 4 w/ CC yesterday, and watched my usage tick up by only 0.01 at a time while doing plenty of high-token-count tasks. I understand it's currently at a discount, but even after that expires, the same-quality output will be available at a fraction of the cost of the expensive models.
From experience, the same level of usage would have left me stranded on my CC 5 hr limit within an hour.
There were some difficulties with tool calls, in particular with replacing tab-indented strings - but taking no steps to mitigate that (which meant the model had to figure it out every time I cleared context) only cost relatively few extra tokens -- and it still came in well under 4.6, nevermind 4.7. And of course, I can add instructions to prevent churning on those issues.
I have no reason to go back to Anthropic models with these results.
"No moat" indeed.
Sure, but there will always be some monstrosities like Mythos that'll pwn all software written by local models in 0.01 seconds, thus forcing people/companies to use the most advanced paid models to keep up and stay unpwned for 1 second longer.
I expect tomorrow’s models will be so much more capable that we will happily pay more.
But if not, we will still likely get today’s capabilities or more for cheap.
I don’t see a realistic scenario in which the AI genie is going back into the bottle because of affordability. It seems like wishful thinking by people who dislike the new paradigm in software engineering.
(Timeframes are hyperbolic.)
You cut off a generation of juniors from employment and learning; the seniors are gone, and it's all harnesses and AI systems.
I'm not all gloom and doom, but the treatment of junior engineers is something I think we will either regret or rejoice over. Either we'll have a spur of creative people doing their own independent thing, or we'll have lost a generation of great engineers.
We’ve been coasting along on a single generation who have ruled with iron fists.
If you fire all your SWEs they won't sit around twiddling their thumbs waiting for an AI collapse, they'll career shift. Maybe to an unemployment line and/or homelessness, maybe to something else productive, but either way they'll lose SWE skills.
If you close down all the SWE junior positions you'll strongly discourage young people training in the field. They'll do something else.
Then if you want to go back, who will you hire for it?
They are large language models. Not automated development machines. They hallucinate.
The goal post has not shifted since 2023 or so. Make an LLM that doesn't blatantly disregard knowledge it has and instructions it has been given, over and over, and you win. If trillions of USD of investment can't do it, I'd be curious to see what can.
There are definitely automated dev systems, of which an LLM is a part. The remaining part may be called a 'harness' or whatever. The quality of the generated software is another matter.
If the AI is not good enough, then don't fire the devs. If/when the devs are no longer needed, I don't see why the need would return later, that was my point.
The problem of "instant legacy" systems: something that's vibe coded and reached unmaintainable by either the AI or humans, but is also now indispensable because users are relying on it.
Some of that is already there .. but the users generally have nowhere else to go and ineffective pushback. "Enterprise software" has been awful for decades, things like Lotus Notes and SAP. Everyone hates Windows; everyone continues to use Windows.
Users don't currently trust software. Look at what we've done to them - can you blame them?
The consumer space is about extracting every ounce of personal data possible.
The b2b space is about "maximizing customer value" - that is, not maximizing the value of your product to the customer, but maximizing the value of the customer to your business. Lock them in and lock them down, make your product "sticky" so they can't leave without immense cost.
Company brain drain: knowledge leaves with your seniors if you decide to get rid of them, or they just leave due to the conditions AI creates.
I don't know if the above comes to fruition; there are a lot of questions that only time will answer. But those are my first thoughts.
Make usable software. Cheap code means that you can create a lot more prototypes to then perform usability tests by finding a user and sitting next to them. I mostly worked on internal apps lately, so perhaps it's much easier for me to do than it is for some others.
I think you can boil down most of the list to: Understand what you want to do.
I’m not convinced about rebuilding repeatedly as a learning tool though. As relatively quick as it is, it over emphasizes the front line problems you face early. Those tend to be simpler, more straightforward issues that can be more quickly solved by a few minutes of thought (and more cheaply too).
Once upon a time, highly bureaucratic organizations tried to make a distinction between "analyst", "programmer" and "coder": https://cacm.acm.org/opinion/the-myth-of-the-coder/
The pure "coder" role, per that paper, died out almost immediately. Nowadays it's done by compilers (a deterministic automation). The distinction between analyst and programmer held out a bit longer - ten years ago I was working somewhere that had "business analysts", essentially requirements-wranglers. It's possible that the "programmer" job of converting a well-defined specification into a program is also going to start disappearing.
.. but that still leaves the specification as the difficult bit! It remains like the old stories with genies: the genie can give you what you ask for. But you need to be very sure what you want, very clear about it, and aware that it may come with unasked-for downsides if you're not.
Code might be cheaper but it's still a liability. In that regard anything that's not been properly designed and documented is going to be an even bigger issue.
Stick to patterns which were painful before. For example, I recently refactored a project written in TS to use better-result instead of throwing errors. Without Claude writing out all of that boilerplate, I could not have imagined transitioning to this. Right now the cost of "doing it right" has decreased so much that there is no reason to ship slop / poorly thought out code.
People should do what has always been needed: rather than focus on how hard (or easy) it is to build something, figure out what is needed, what is right, what is good, what quality actually solves problems - and do those things.
I've found the get-shit-done tool[1] to be quite useful for forcing me to properly plan the implementation and ensuring the context remains small and relevant at all times.
It is slower than when I was just using Claude directly though.
[1] https://github.com/gsd-build/get-shit-done
I've tried this, it's honestly not worth the amount of time (and additional context) for the results. I've had more success prompting Claude with manageable and testable iterations.
Planning is good but get-shit-done just added too much planning in my opinion.
[1] https://github.com/gsd-build/gsd-2

Buy in bulk and resell. /s
"No moat" indeed.
I expect tomorrow’s models will be so much more capable that we will happily pay more.
But if not, we will still likely get today’s capabilities or more for cheap.
I don’t see a realistic scenario in which the AI genie is going back into the bottle because of affordability.
It seems like wishful thinking by people who dislike the new paradigm in software engineering.
(Timeframes are hyperbolical).
The pure "coder" role, per that paper, died out almost immediately. Nowadays it's done by compilers (a deterministic automation). The distinction between analyst and programmer held out a bit longer - ten years ago I was working somewhere that had "business analysts", essentially requirements-wranglers. It's possible that the "programmer" job of converting a well-defined specification into a program is also going to start disappearing.
.. but that still leaves the specification as the difficult bit! It remains like the old stories with genies: the genie can give you what you ask for. But you need to be very sure what you want, very clear about it, and aware that it may come with unasked-for downsides if you're not.
It is slower than when I was just using Claude directly though.