There are few principle of software engineering that I hate more than this one, though SOLID is close.
It is important to understand that it is from a 1974 paper, computing was very different back then, and so was the idea of optimization. Back then, optimizing meant writing assembly code and counting cycles. It is still done today in very specific applications, but today, performance is mostly about architectural choices, and it has to be given consideration right from the start. In 1974, these architectural choices weren't choices, the hardware didn't let you do it differently.
Focusing on the "critical 3%" (which imply profiling) is still good advice, but it will mostly help you fix "performance bugs", like an accidentally quadratic algorithms, stuff that is done in loop but doesn't need to be, etc... But once you have dealt with this problem, that's when you notice that you spend 90% of the time in abstractions and it is too late to change it now, so you add caching, parallelism, etc... making your code more complicated and still slower than if you thought about performance at the start.
Today, late optimization is just as bad as premature optimization, if not more so.
The most misunderstood statement in all of programming by a wide margin.
I really encourage people to read the Donald Knuth essay that features this sentiment. Pro tip: You can skip to the very end of the article to get to this sentiment without losing context.
Basically, don't spend unnecessary effort increasing performance in an unmeasured way before its necessary, except for those 10% of situations where you know in advance that crucial performance is absolutely necessary. That is the sentiment. I have seen people take this to some bizarre alternate insanity of their own creation as a law to never measure anything, typically because the given developer cannot measure things.
> I have seen people take this to some bizarre alternate insanity of their own creation as a law to never measure anything, typically because the given developer cannot measure things.
Similar to the "code should be self documenting - ergo: We don't write any comments, ever"
It is to me incredible, how many „developers“, even “10 years senior developers” have no idea how to use a dubugger and or profiler. I’ve even met some that asked “what is a profiler?”
I hope I’m not insulting anybody, but to me is like going to an “experienced mechanic” and they don’t know what a screwdriver is.
The last time I interviewed (around 10 years ago) I was surprised when 9 of the 10 senior developers didn't know how many bits were in basic elemetary types.
(Then, shortly afterward I also tried to find a new job, realized the entire industry had changed, and was fortunate enough to decide it wasn't worth the trouble.)
> 9 of the 10 senior developers didn't know how many bits were in basic elemetary types
That's likely thanks to C which goes to great pains to not specify the size of the basic types. For example, for 64 bit architectures, "long" is 32 bits on the Mac and 64 bits everywhere else.
The net result of that is I never use C "long", instead using "int" and "long long".
This mess is why D has 32 bit ints and 64 bit longs, whether it's a 32 bit machine or a 64 bit machine. The result was we haven't had porting problems with integer sizes.
It's substantially worse on the JVM. One's intuition from C just fails when you have to think about references vs primitives, and the overhead of those (with or without compressed OOPs).
I've met very few folks who understand the overheads involved, and how extreme the benefits can be from avoiding those.
That's a reasonable answer. But, I meant they seemed to have little understanding or interest. I don't interview much, and I'm probably a poor interviewer. But, I guess I was expecting some discussion.
I mean, as a senior developer, the number of bits in an "int" is "who the hell knows, because it has changed a bunch of times during my career, and that's what stdint.h is for." And let's not even talk about machines 32-bit "char" types, which I actually had to program for once.
If the number of bits isn't actually included right in the type name, then be very sure you know what you're doing.
The senior engineer answer to "How many bits are there in an int?" is "No, stop, put that down before you put your eye out!" Which, to be fair, is the senior engineer answer to a lot of things.
> Similar to the "code should be self documenting - ergo: We don't write any comments, ever"
My counterpoint: Code can be self-documenting, reality isn't. You can have a perfectly clear method that does something nobody will ever understand unless you have plenty of documentation about why that specific thing needs to be done, and why it can't be simpler. Like having special-casing for DST in Arizona, which no other state seems to need:
This is crucial detail that almost everyone misses when they are skimming the topic on surface. The implication is that this statement/law is referenced more often to shut down the architecture designs/discussions
Even moreso . I like the Rob Pike restatement of this principle, it really makes it crystal clear:
"You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is."
Moreso, in my personal experience, I've seen a few speed hacks cause incorrect behavior on more than one occasion.
In particular I've seen way too many people use this term as an excuse to write obviously poor performing code. That's not what Knuth said. He never said it's ok to write obviously bad code.
I'm still salty about that time a colleague suggested adding a 500 kb general purpose js library to a webapp that was already taking 12 seconds on initial load, in order to fix a tiny corner case, when we could have written our own micro utility in 20 lines. I had to spend so much time advocating to management for my choice to spend time writing that utility myself, because of that kind of garbage opinion that is way too acceptable in our industry today. The insufferable bastard kept saying I had to do measurements in order to make sure I wasn't prematurely optimizing. Guy adding 500 kb of js when you need 1 kb of it is obviously a horrible idea, especially when you're already way over the performance budget. Asshat. I'm still salty he got so much airtime for that shitty opinion of his and that I had to spend so much energy defending myself.
Reminds me of a codebase that was littered with SQL injection opportunities because doing it right would have been "premature optimization" since it was "just" a research spike and not customer facing. Guess what happened when it got promoted to a customer facing product?
With modern tools it should be pretty easy to build scalable solutions. I take premature optimization as going out of your way to optimize something that's already reasonable. Not that you should write really bad code as a starting point.
The problem is that that this term gets misused to say the opposite of what it was intended for.
It's particularly the kind of people who like to say "hur hur don't prematurely optimize" that don't bother writing decent software to begin with and use the term as an excuse to write poor performing code.
Instead of optimizing their code, these people end up making excuses so they can pessimize it instead.
In my career Ive seen about 1000 instances of somebody trying to optimize something prematurely.
Usually those people also have a good old whinge about the premature optimization quote being wrong or misinterpreted and general attitudes to software efficiency.
Not once have I ever seen somebody try to derail a process of "ascertain speed is an issue that should be tackled" -> "profile" -> fix the hot path.
Yeah, I interpret "premature optimization" as taking a request that takes 500ms and focusing on saving a couple ms by refactoring logic to avoid a SQL JOIN or something.
Your users are not going to notice. Sure, it's faster but it's not focused on the problem.
"today, performance is mostly about architectural choices, and it has to be given consideration right from the start"
This doesn't make sense. Why is performance (via architectural choices) more important today than then?
You can build a snappy app today by using boring technology and following some sensible best practices. You have to work pretty hard to need PREMATURE OPTIMIZATION on a project -- note the premature there
> You can build a snappy app today by using boring technology and following some sensible best practices.
If you are building something with similar practical constraints for the Nth time this is definitely true.
You are inheriting “architecture” from your own memory and/or tools/dependencies that are already well fit to the problem area. The architectural performance/model problem already got a lot of thought.
Lots of problems are like that.
But if you are solving a problem where existing tools do a poor job, you better be thinking about performance with any new architecture.
The big thing that changed is that almost all software performance today is bandwidth-bound at the limit. Not computation-bound. This transition was first noticed in supercomputing around 25 years ago.
Optimization of bandwidth-bound code is almost purely architectural in nature. Most of our software best practices date from a time when everything was computation-bound such that architecture could be ignored with few bad effects.
I agree. But I have to say, when defining the architecture, there are things known that will be terrible bottlenecks later. They should be avoided. Just as the previous comment, about defining proper indices in a database. Optimization means making something that is already “good” and correct better.
There is no excuse to make a half ass, bug ridden shitty software, under the excuse “optimization is for later” that is technical debt generation: 2 very different things.
In the 1970s computer systems spanned fewer orders of magnitude. Operations generally took somewhere between maybe 1 and 10^8 CPU cycles. Today, the range is closer to 10^-1 to 10^13.
Thinking about the overall design, how its likely to be used, and what the performane and other requirements are before aggregating the frameworks of the day is mature optimization.
Then you build things in a reasonable way and see if you need to do more for performance. It's fun to do more, but most of the time, building things with a thought about performance gets you where you need to be.
The I don't need to think about performance at all camp, has a real hard time making things better later. For most things, cycle counting upfront isn't useful, but thinking about how data will be accessed and such can easily make a huge difference. Things like bulk load or one at a time load are enormous if you're loading lots of things, but if you'll never load lots of things, either works.
Thinking about concurrency, parallelism, and distributed systems stuff before you build is also pretty mature. It's hard to change some of that after you've started.
Spent 6 months last year ripping out an abstraction layer that made every request 40ms slower. We profiled, found the hot path, couldn't fix it without a rewrite. The "optimize later" school never tells you later sometimes means never
SOLID tend to encourage premature abstraction, which is a root of evil that is more relevant today than optimization.
SOLID isn't bad, but like the idea of premature optimization, it can easily lead you into the wrong direction. You know how people make fun of enterprise code all the time, that's what you get when you take SOLID too far.
In practice, it tends to lead to a proliferation of interfaces, which is not only bad for performance but also result in code that is hard to follow. When you see a call through an interface, you don't know what code will be run unless you know how the object is initialized.
In a way, SOLID is premature optimization. You are optimizing abstractions before knowing how the code is used in practice. Lots of code will be written and never changed again, but a minority will see changes quite a bit. Concentrate there. Like you don't need to optimize things that aren't in hot code (usually, omg experience will tell you that all rules have exceptions, including the exceptions).
> Lots of code will be written and never changed again, but a minority will see changes quite a bit. Concentrate there
I think the most important principle above all is knowing when not to stick to them.
For example if I know a piece of code is just some "dead end" in the application that almost nothing depends on then there is little point optimizing it (in an architectural and performance sense). But if I'm writing a core part of an application that will have lots of ties to the rest, it totally does make sense keeping an eye on SOLID for example.
I think the real error is taking these at face value and not factoring in the rest of your problem domain. It's way too simple to think SOLID = good, else bad.
They start by indicating people don't understand, “A module should have only one reason to change.”. Reading more of that article, it's clear the author doesn't understand much about software engineering and sounds more like a researcher who just graduated from putting together 2+2.
The great thing bout the net is also it's biggest problem. Anyone can write a blog and if it looks nice, sounds polished they could sway a large group. I roll my eyes so strong at folks that reject SOLID principles and design patterns.
I have seen way too often, advocates of SOLID and patterns to have religious arguments: I don’t like it. That being said, I think there is nothing bad in SOLID, as long as treated as principles and not religious dogmas.
About patterns, I cannot really say as much positive. They are not bad per-se. But I’ve seen they have made lots of harm. In the gang of 4 book, in the preface, I think, says something like “this list is neither exhaustive, nor complete, and often inadequate” the problem is every single person I know who was exposed to the book, try to hammer every problem into one pattern (in the sense of [1]). Also insist in using the name everywhere like “facade_blabla” IMHO the pattern may be Façade, but putting that through the names of all classes and methods, is not good design.
> That being said, I think there is nothing bad in SOLID, as long as treated as principles and not religious dogmas
This should be the header of the website. I think the core of all these arguments is people thinking they ARE laws that must be followed no matter what. And in that case, yeah that won't work.
Something, something, wrong abstractions are worse than no abstractions.
SOLID approaches aren't free... beyond that keeping code closer together by task/area is another approach. I'm not a fan of premature abstraction, and definitely prefer that code that relates to a feature live closer together as opposed to by the type of class or functional domain space.
For that matter, I think it's perfectly fine for a web endpoint handler to make and return a simple database query directly without 8 layers of interfaces/classes in between.
Beyond that, there are other approaches to software development that go beyond typical OOP practices. Something, something, everything looks like a nail.
The issues that I have with SOLID/CLEAN/ONION is that they tend to lead to inscrutable code bases that take an exponentially long amount of time for anyone to come close to learning and understanding... Let alone the decades of cruft and dead code paths that nobody bothered to clean up along the way.
The longest lived applications I've ever experienced tend to be either the simplest, easiest to replace or the most byzantine complex monstrosities... and I know which I'd rather work on and support. After three decades I tend to prioritize KISS/YAGNI over anything else... not that there aren't times where certain patterns are needed, so much as that there are more times where they aren't.
I've worked on one, singular, one application in three decades where the abstractions that tend to proliferate in SOLID/CLEAN/ONION actually made sense... it was a commercial application deployed to various govt agencies that had to support MS-SQL, Oracle and DB2 backends. Every, other, time I've seen an excess of database and interface abstractions have been instances that would have been better solved in other, less performance impacting ways. If you only have a single concrete implementation of an interface, you probably don't need that interface... You can inherit/override the class directly for testing.
And don't get me started on keeping unit tests in a completely separate project... .Net actually makes it painful to put your tests with your implementation code. It's one of my few actual critiques about the framework itself, not just how it's used/abused.
This doesn't seem to be a critique of the principles so much as a critique of their phrasing.
Even his "critique" of Demeter is, essentially, that it focuses on an inconsequential aspect of dysfunction—method chaining—which I consider to be just one sme that leads to the larger principle which—and we, apparently, both agree on this—is interface design.
If you follow SOLID, you'll write OOP only, with always present inheritance chains, factories for everything, and no clear relation between parameters and the procedures that use them.
This is similar to my view. All these "laws" should alwaye be used as guidance not as actual laws. Same with O. I think its good advice to design software so adding features that are orthogonal to other features don't require modifying much code.
That's how I view it. You should design your application such that extension involves little modifying of existing code as long as it's not necessary from a behavior or architectural standpoint.
It's a very interesting topic. Even when designing a system, how to modularize, it's healthy to wait until the whole context is in sight. It's a bit of a black art, too early or too late you pay some price.
Unfortunately people do keep repeating it to excuse the fact that they don't know how to optimize in the first place.
Anyone who has done optimization even a little knows that it isn't very difficult, but you do need to plan and architect for it so you don't have to restructure you whole program to get it to run well.
Mostly it's just rationalization, people don't know the skill so they pretend it's not worth doing and their users suffer for it.
If software and website were even reasonably optimized people could just use a computer as powerful as a rasberry pi 5 (except for high res video) for most of what they do day to day.
And as I point out, what Knuth was talking about in terms of optimization was things like loop unrolling and function inlining. Not picking the right datastructure or algorithm for the problem.
I mean, FFS, his entire book was about exploring and picking the right datastructures and algorithms for problems.
Decades in, this is the worst of all of them. Misused by laziness or malice, and nowhere near specific enough.
The graveyard of companies boxed in by past poor decisions is sprawling. And the people that made those early poor decisions bounce around field talking about their "successful track record" of globally poor and locally good architectural decisions that others have had to clean up.
It touches on a real problem, though, but it should be stricken form the record and replaced with a much better principle. "Design to the problem you have today and the problems you have in 6 months if you succeed. Don't design to the problems you'll have have next year if it means you won't succeed in 6 months" doesn't roll off the tongue.
On your last bit, I definitely agree... personally I've leaned more and more into KISS above all else... simple things that are easy to replace are easily replaced only when you need to. Similarly, I also tend to push for initial implementations of many/most things in a scripted language first, mostly for flexibility/simplicity to get a process "right" before worrying about a lot of other things.
One thing that came out of the no-sql/new-sql trends in the past decade and a half is that joins are the enemy of performance at scale. It really helps to know and compromise on db normalization in ways such as leaning on JSON/XML for non-critical column data as opposed to 1:1/children/joins a lot of the time. For that matter, pure performance and vertical scale have shifted a lot of options back from the brink of micro service death by a million paper cuts processes.
>> Today, late optimization is just as bad as premature optimization, if not more so.
You are right about the origin of and the circumstances surrounding the quote, but I disagree with the conclusion you've drawn.
I've seen engineers waste days, even weeks, reaching for microservices before product-market fit is even found, adding caching layers without measuring and validating bottlenecks, adding sharding pre-emptively, adding materialized views when regular tables suffice, paying for edge-rendering for a dashboard used almost entirely by users in a single state, standing up Kubernetes for an internal application used by just two departments, or building custom in-house rate limiters and job queues when Sidekiq or similar solutions would cover the next two years.
One company I consulted for designed and optimized for an order of magnitude more users than were in the total addressable market for their industry! Of that, they ultimately managed to hit only 3.5%.
All of this was driven by imagined scale rather than real measurements. And every one of those choices carried a long tail: cache invalidation bugs, distributed transactions, deployment orchestration, hydration mismatches, dependency array footguns, and a codebase that became permanently harder to change. Meanwhile the actual bottlenecks were things like N+1 queries or missing indexes that nobody looked at because attention went elsewhere.
> All of this was driven by imagined scale rather than real measurements
Yes. When I was a young engineer, I was asked to design something for a scale we didn’t even get close to achieving. Eventual consistency this, event driven conflict resolution that… The service never even went live because by the time we designed it, everyone realized it was a waste of time.
I learned it makes no sense to waste time designing for zillions of users that might never come. It’s more important to have an architecture that can evolve as needs change rather than one that can see years into the future (that may never come).
Thank your for posting this. I disagreed with OP but couldn't _quite_ find the words to describe why. But your post covers what i was trying to say.
I was quite literally asked to implement an in-memory cache to avoid a "full table scan" caused by a join to a small DB table recently. Our architect saw "full table scans" in our database stats and assumed that must mean a performance problem. I feel like he thought he was making a data-driven profiling decision, but seemed to misunderstand that a full-table scan is faster for a small table than a lookup. That whole table is in RAM in the DB already.
So now we have a complex Redis PubSub cache invalidation strategy to save maybe a ms or two.
I would believe that we have performance problems in this chunk of code, and it's possible an in-memory cache may "fix" the issue, but if it does, then the root of the problem was more likely an N+1 query (that an in-memory cache bandaids over). But by focusing on this cache, suddenly we have a much more complex chunk of code that needs to be maintained than if we had just tracked down the N+1 query and fixed _that_
I would venture that this statement is not true for library authors. Performance is a large factor in competitive advantage, especially in domains like image analysis or processing large corpuses of text etc.
In these domains, algorithm selection, and fine tuning hot spots pays off significantly. You must hit minimum speeds to make your application viable.
“A variable should mean one thing, and one thing only. It should not mean one thing in one circumstance, and carry a different value from a different domain some other time. It should not mean two things at once. It must not be both a floor polish and a dessert topping. It should mean One Thing, and should mean it all of the time.”
> It must not be both a floor polish and a dessert topping.
I worked as a janitor for four years near a restaurant, so I know a little bit about floor polishing and dessert toppings. This law might be a little less universal than you think. There are plenty of people who would happily try out floor polish as a dessert topping if they're told it'll get them high.
Borax is an example of a substance that is simultaneously used for skin care, household cleaning, as soldiering flux, and ant killer. But I guess it is a constant with variable effects. Hard to be found in local shops anymore.
I worked for awhile as a janitor in a college dorm. Not an easy job but it definitely revealed a side of humanity I might not have otherwise seen. Especially the clean out after students left for the year.
Something as expensive as dessert toppings would only be used as floor polish by the people who truly were high... and only if they could do it without the boss knowing what they were doing.
Oh! I didn’t have a name for this one, but it’s a lesson I’ve learned. E.g. if variable x is non-zero, then variable y will be set to zero. Don’t check variable y to find out whether x is zero. And definitely don’t repurpose y for some other function in cases where it’s otherwise unused.
For modern day web developers there are very, very, very few things that fall into the "if you do this you should probably be escorted out of the building on the first offense" category, but "reusing a variable because it's 'not being used'" might be on that list. I can maybe see the argument in very low memory embedded systems or similar systems where I'm not even qualified to come up with examples but not in anything that regularly shows up on HN for example.
Remember that these "laws" contain so many internal contradictions that when they're all listed out like this, you can just pick one that justifies what you want to justify. The hard part is knowing which law break when, and why
Postel's Law vs. Hyrum's Law is the canonical example. Postel says be liberal in what you accept — but Hyrum's Law says every observable behavior of your API will eventually be depended on by someone. So if you're lenient about accepting malformed input and silently correcting it, you create a user base that depends on that lenient behavior. Tightening it later is a breaking change even if it was never documented. Being liberal is how you get the Hyrum surface area.
The resolution I've landed on: be strict in what you accept at boundaries you control (internal APIs, config parsing) and liberal only at external boundaries where you can't enforce client upgrades. But that heuristic requires knowing which category you're in, which is often the hard part.
I’m one of those that have thrown out Postel’s law entirely. Maybe the issue is that it never defines “strict”, “liberal”, and “accept”. But at least for public APIs, it never made sense to me.
If I accidentally accept bad input and later want to fix that, I could break long-time API users and cause a lot of human suffering. If my input parsing is too strict, someone who wants more liberal parsing will complain, and I can choose to add it before that interaction becomes load-bearing (or update my docs and convince them they are wrong).
The stark asymmetry says it all.
Of course, old clients that can’t be upgraded have veto power over any changes that could break them. But that’s just backwards compatibility, not Postel’s Law.
Source: I’m on a team that maintains a public API used by thousands of people for nearly 10 years. Small potatoes in internet land but big enough that if you cause your users pain, you feel it.
I probably use a different interpretation of Postel's law. I try not "break" for anything I might receive, where break means "crash, silently corrupt data, so on". But that just means that I return an error to the sender usually. Is this what Postel meant? I have no idea.
I used to see far more references to Postel’s law in the 00s and early 10s. In the last decade, that has noticeably shifted towards hyrum’s law. I think it’s a shift in zeitgeist.
This reminds me of a comment I read here a long time ago; it was about XML and how DTDs were supposed to permit one to be strict. However, in reality, the person said, if the the other end who is sending you broken XML is a big corp who refuses to fix it, then you have no choice but accept it.
Bottom line: it's all a matter of balance of powers. If you're the smaller guy in the equation, you'll be "Postel'ed" anyway.
Yet Postel's law is still in the "the road to hell is paved with good intentions" category, for the reason you explain very well (AKA XKCD #1172 "Workflow"). Wikipedia even lists a couple of major critics about it [1].
I look at Postel’s law more as advice on how to parse input. At some point you’re going to have to upgrade a client or a server to add a new field. If you’ve been strict, then you’ve created a big coordination problem, because the new field is a breaking change. But if you’re liberal, then your systems ignore components of the input that they don’t recognize. And that lets you avoid a fully coordinated update.
I've seen CompSci guys especially (I'm EEE background, we have our own problems but this ain't one of them) launch conceptual complexity into the stratosphere just so that they could avoid writing two separate functions that do similar things.
The purpose of the blade is to cut down your opponent.
The purpose of software is to provide value to the customer.
It's the only thing that matters.
You can also philosophize why people with blades needed to cut down their opponents along with why we have to provide value to the customer but thats beyond the scope of this comment
Why baseball? You don't use the number 3 in any other context?
If you write a lot of code, the odds of something repeating in another place just by coincidence are quite large. But the odds of the specific code that repeated once repeating again are almost none.
That's a basic rule from probability that appears in all kinds of contexts.
Anyway, both DRY and WET assume the developers are some kind ignorant automaton that can't ever know the goal of their code. You should know if things are repeating by coincidence or not.
If you get lucky doing that you might regret it. Especially with non-technical management.
Making software is a back-of-house function, in restaurant terms. Nobody out there sees it happen, nobody knows what good looks like, but when a kitchen goes badly wrong, the restaurant eventually closes.
These projects quickly reach a point where evolving it further is too costly and risky. To the point that the org owning it will choose to stop development to do a re-implementation which, despite being a very costly and risky endeavor, ends up being a the better choice.
It's easy to say that organizations should do it right the first time, in terms of applying proper engineering practices. But they often didn't have the time, capital, and skillset to do that. Not ideal, but that's often how things work in the real world and it will never change.
Organizations should do it not catastrophically wrongly, especially once a core design / concept is mostly solidified. Putting a little time into reliability and guardrails prevents a huge amount of downside.
I've been at organizations that don't think engineers should write tests because it takes too much time and slows them down...
Plenty of businesses or products within businesses stagnate and fail because their software got too expensive to maintain and extend. Not infrequently, this happens before it even sees a public release. Any business that can't draw startup-type levels of investment to throw effectively infinite amounts of Other People's Money at those kinds of problems, risks that end if they allow their software to get too messed-up.
The "who gives a shit, we'll just rewrite it at 100x the cost" approach to quality is very particular to the software startup business model, and doesn't work elsewhere.
I worked for a company that also had hardware engineers writing RTL. Our software architect spent years helping that team reuse/automate/modularize their code. At a mininum, it's still just text files with syntax, despite rather different semantics.
I've heard that story a few times (ironically enough) but can't say I've seen a good example. When was over-architecture motivated by an attempt to reduce duplication? Why was it effective in that goal, let alone necessary?
I think there is often tension between DRY and "thing should do only one thing." E.g., I've found myself guilty of DRYing up a function, but the use is slightly different in a couple places, so... I know, I'll just add a flag/additional function argument. And you keep doing that and soon you have a messed up function with lots of conditional logic.
The key is to avoid the temptation to DRY when things are only slightly different and find a balance between reuse and "one function/class should only do one thing."
For sure. I feel I need all of my experience to discern the difference between “slightly different, and should be combined” and “slightly different, and you’ll regret it if you combine them.”
One of my favorite things as a software engineer is when you see the third example of a thing, it shows you the problem from a different angle, and you can finally see the perfect abstraction that was hiding there the whole time.
Buy me a beer and I can tell you some very poignant stories. The best ones are where there is a legitimate abstraction that could be great, assuming A) everyone who had to interact with the abstraction had the expertise to use it, B) the details of the product requirements conformed to the high level technical vision, now and forever, and C) migrating from the current state to the new system could be done in a bounded amount of time.
My view is over-engineering comes from the innate desire of engineers to understand and master complexity. But all software is a liability, every decision a tradeoff that prunes future possibilities. So really you want to make things as simple as possible to solve the problem at hand as that will give you more optionality on how to evolve later.
I’ll give a simplified example of something I have at work right now. The program moves data from the old system to the new system. It started out moving a couple of simple data types that were basically the same thing by different names. It was a great candidate for reusing a method. Then a third type was introduced that required a little extra processing in the middle. We updated the method with a flag to do that extra processing. One at a time, we added 20 more data types that each had slightly different needs. Now the formerly simple method is a beast with several arguments that change the flow enough that there are a probably just a few lines that get run for all the types. If we didn’t happen to start with two similar types we probably wouldn’t have built this spaghetti monster.
I saw a fancy HTML table generator that had so many parameters and flags and bells and whistles that it took IIRC hundreds of lines of code to save writing a similar amount of HTML in a handful of different places.
Yes the initial HTML looked similar in these few places, and the resultant usage of the abstraction did not look similar.
But it took a very long time reading each place a table existed and quite a bit longer working out how to get it to generate the small amount of HTML you wanted to generate for a new case.
Definitely would have opted for repetition in this particular scenario.
DRY is misunderstood. It's definitely a fundamental aspect of code quality it's just one of about 4 and maximizing it to the exclusion of the others is where things go wrong. Usually it comes at the expense of loose coupling (which is equally fundamental).
The goal ought to be to aim for a local minima of all of these qualities.
Some people just want to toss DRY away entirely though or be uselessly vague about when to apply it ("use it when it makes sense") and thats not really much better than being a DRY fundamentalist.
DRY is misnamed. I prefer stating it as SPOT — Single Point Of Truth. Another way to state it is this: If, when one instance changes in the future, the other instance should change identically, then make it a single instance. That’s really the only DRY criterion.
I like this a lot more, because it captures whether two things are necessarily the same or just happen to be currently the same.
A common "failure" of DRY is coupling together two things that only happened to bear similarity while they were both new, and then being unable to pick them apart properly later.
I said this elsewhere in the comments, but I think there's sort of a fundamental tension that shows up sometimes between DRY and "a function/class should only do one thing." E.g., there might be two places in your code that do almost identical things, so there's a temptation to say "I know! I'll make a common function, I'll just need to add a flag/extra argument..." and if you keep doing that you end up with messy "DRY" functions with tons of conditional logic that tries to do too much.
Yeah there are ways to avoid this and you need to strike balances, but sometimes you have to be careful and resist the temptation to DRY everything up 'cuz you might just make it brittler (pun intended).
Yes, and there are many different kinds of truth, so when two arise together—we can call this connascence—we can categorize how these instances overlap: Connascence of Name, Connascence of Algorithm, etc.
That’s how I understand it as well. It’s not about an abstract ideal of duplication but about making your life easier and your software less buggy. If you have to manually change something in 5 different places, there’s a good chance you’ll forget one of those places at some point and introduce a bug.
That's how I understood it. If you add a new thing (constant, route, feature flag, property, DB table) and it immediately needs to be added in 4 different places (4 seems to be the standard in my current project) before you can use it, that's not DRY.
> If you add a new thing (constant, route, feature flag, property, DB table) and it immediately needs to be added in 4 different places (4 seems to be the standard in my current project) before you can use it, that's not DRY.
The tricky part is that sometimes "a new thing" is really "four new things" disguised as one. A database table is a great example because it's a failure mode I've seen many times. A developer has to do it once and they have to add what they perceive as the same thing four times: the database table itself, the internal DB->code translation e.g. ORM mapping, the API definition, and maybe a CRUD UI widget. The developer thinks, "oh, this isn't DRY" and looks to tools like Alembic and PostGREST or Postgraphile to handle this end-to-end; now you only need to write to one place when adding a database table, great!
It works great at first, then more complex requirements come down: the database gets some virtual generated columns which shouldn't be exposed in code, the API shouldn't return certain fields, the UI needs to work off denormalized views. Suddenly what appeared to be the same thing four times is now four different things, except there's a framework in place which treats these four things as one, and the challenge is now decoupling them.
Thankfully most good modern frameworks have escape valves for when your requirements get more complicated, but a lot of older ones[0] really locked you in and it became a nightmare to deal with.
[0] really old versions of Entity Framework being the best/worst example.
I believe that was the point of Ruby on Rails: that you really had to just create the class, and the framework would create the table and handle the ORM. Or maybe you still had to write the migration; it's been as while. That was pretty spectacular in its dedication to DRY, but also pretty extreme.
But the code I'm talking about is really adding the same thing in 4 different places: the constant itself, adding it to a type, adding it to a list, and there was something else. It made it very easy to forget one step.
Renaming it doesnt change the nature of the problem.
There should often be two points of truth because having one would increase the coupling cost more than the benefits that would be derived from deduplication.
This was also true of Amazon's Leadership Principles. They are pretty reasonable guidelines, but in a debate, it really came down to which one you could most reasonably weaponize in favor of your argument, even to the detriment of several others.
It's because they are heurists intended to be applied by knowledgeable and experienced humans.
It can be quite hard to explain when a student asks why you did something a particular way. The truthful answer is that it felt like the right way to go about it.
With some thought you can explain it partly - really justify the decision subconsciously made.
If they're asking about a conscious decision that's rarely much more helpful that you having to say that's what the regulations, or guidelines say.
Where they really learn is seeing those edge cases and gray areas
As a very senior SWE with a decent amount of eng decision making responsibility these days I still find I get so much mileage out of KISS and YAGNI that I never really think about any other laws.
So much SWE is overengineering. Just like this website to be honest. You don't get away with all that bullshit in other eng professions where your BoM and labour costs are material.
I'll propose this as the only unbreakable law: "everything in moderation", which I feel implies any law is breakable, which now this is sounding like the barber's paradox. What else does anyone propose as unbreakable?
Most of them felt contradictory and kind of antiquated.
Reading through the list mostly made me feel sad. You can't help but interpret these through the modern lens of AI assisted coding. Then you wonder if learning and following (some) of these for the last 20 years is going to make you a janitor for a bunch of AI slop, or force you into a coding style where these rules are meaningless, or make you entirely irrelevant.
This is doubly true in Machine Learning Engineering. Knowing what methods to avoid is just as important to know what might work well and why. Importantly a bunch of Data Science techniques — and I use data science in the sense of making critical team/org decisions — is also as important for which you should understand a bit of statistics not only data driven ML.
You know, I mention this stuff all the time in various meetings and discussions. I read a lot of stuff on Hacker News and just have years of accumulated knowledge from the various colleagues I've worked with. Its nice to have a little reference sheet.
- Every website will be vibecoded using Claude Opus
This will result in the following:
- The background color will be a shade of cream, to properly represent Anthropic
- There will be excessive use of different fonts and weights on the same page, as if a freshman design student who just learned about typography
- There will be an excess of cards in different styles, a noteworthy amount of which has a colored, round border either on hover or by default on exactly one side of the card
What do you call the law that you violate when you vibe code an entire website for "List of 'laws' of software engineering" instead of just creating a Wikipedia page for it
I know it's not software-engineering-only, but Chesterton's Fence is often the first 'law' I teach interns and new hires: https://fs.blog/chestertons-fence/
That’s such a good one! I wish more people understood this. It seems management and business types always want some upfront plan. And I get it, to an extent. But we’ve learned this isn’t a very effective way to build software. You can’t think of all possible problems ahead of time, especially the first time around. Refactoring to solve problems with a flexible architecture it better than designing yourself into a rigid architecture that can’t adapt as you learn the problem space.
Functionalities aren’t necessarily orthogonal to each other; features tend to interact with one another. “Avoid coupling between unrelated functionalities” would be more realistic.
For example using nested .gitignore files vs using one root .gitignore file. I guess this principle is related to this one:
Imagine the code as a graph with nodes and edges. The nodes should be grouped in a way that when you display the graph with grouped nodes, you see few edges between groups. Removing a group means that you need to cut maybe 3 edges, not 30. I.e. you don't want something where every component has a line to every other component.
Also when working on a feature - modifying / adding / removing, ideally you want to only look at an isolated group, with minimal links to the rest of the code.
What's the smallest unit of functionality to which your principle applies?
For example, each comment on HN has a line on top that contains buttons like "parent", "prev", "next", "flag", "favorite", etc. depending on context. Suppose I might one day want to remove the "flag" functionality. Should each button be its own file? What about the "comment header" template file that references each of those button files?
I think that if you continue along the logical progression of the parent poster, then maybe the smaller units of functionality would be represented by simple ranges of lines of text. Given that, deleting a single button would ideally mean a single contiguous deletion from a file, versus deleting many disparate lines.
Now, I tend towards the C idiom of having few files and not a deep structure and away from the one class, one file of Java. Less files to rename when refactoring and less files to open trying to understand an implementation.
One advantage of more smaller files is that merge conflicts become less common. I would guess that at least half of the trivial merge conflicts I see are two unrelated commits which both add some header or other definition at the top of the file, but because both are inserted at line 17, git looks at it and says, “I give up.”
This in itself might not be enough to justify this, but the fewer files will lead to more challenges in a collaborative environment (I’d also note that more small files will speed up incremental compilations since unchanged code is less likely to get recompiled which is one reason why when I do JVM dev, I never really think about compilation time—my IDE can recompile everything quickly in the background without my noticing).
for the first point, such merge commits are so trivial to fix that it’s barely takes time.
You got a point for incremental compilation. But fewer files (done well) is not really a challenge as everything is self contained. It makes it easier to discern orthogonal features as the dependency graph is clearer. With multiple files you find often that similar things are assumed to be identical and used as such. Then it’s a big refactor when trying to split them, especially if they are foundational.
Hot take - I hate YAGNI. My personal pet peeve is when someone says YAGNI to a structure in the code they perceive as "more complex than they would have done it".
Sure, don't add hooks for things you don't immediately need. But if you are reasonably sure a feature is going to be required at some point, it doesn't hurt to organize and structure your code in a way that makes those hooks easy to add later on.
Worst case scenario, you are wrong and have to refactor significantly to accommodate some other feature you didn't envision. But odds are you have to do that anyway if you abide by YAGNI as dogma.
The amount of times I've heard YAGNI as reasoning to not modularize code is insane. There needs to be a law that well-intentioned developers will constantly misuse and misunderstand the ideas behind these heuristics in surprising ways.
Nice to have these all collected nicely and sharable. For the amusement of HN let me add one I've become known for at my current work, for saying to juniors who are overly worried about DRY:
> Fen's law: copy-paste is free; abstractions are expensive.
edit: I should add, this is aimed at situations like when you need a new function that's very similar to one you already have, and juniors often assume it's bad to copy-paste so they add a parameter to the existing function so it abstracts both cases. And my point is: wait, consider the cost of the abstraction, are the two use cases likely to diverge later, do they have the same business owner, etc.
- Shirky Principle: Institutions will try to preserve the problem to which they are the solution
- Chesterton's Fence: Changes should not be made until the reasoning behind the current state of affairs is understood
- Rule of Three: Refactoring given only two instances of similar code risks selecting a poor abstraction that becomes harder to maintain than the initial duplication
> "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it"
I'm seeing some hate for SOLID in these comments and I am a little surprised. While I don't think it should ever be used religiously, I would much rather work on a team that understood the principles than one that didn't.
The few on this page today who object to SOLID seem likely to me to be functional programmers who have never understood software engineering principles in the first place.
That's interesting, what makes you think that? Not long ago, I was working on my degree in Computer Science (Software Engineering), and we were heavily drilled on this principle. Even then, I found it amusing how all the professors were huge fanboys of SOLID. It was very dogmatic.
Calling them 'laws' is always a bit of a stretch. They are more like useful heuristics. The real engineering part is knowing exactly when to break them.
For anyone reading this. Learn software engineering from people that do software engineering. Just read textbooks which are written by people that actually do things
Any recommendations? I read designing data intensive applications(DDIA) which was really good. But it is by Martin Klepmann who as I understand is an academic. Reading PEPs is also nice as it allows one to understand the motivations and "Why should I care" about feature X.
the python cookbook is good. and fluent python is more from principles rather than application (obvs both python specific). I also like philosophy of software design. tiny little book that uses simple example (class that makes a text editor) to talk about complexity, not actually about making a text editor at all.
When it comes to frameworks (any framework) any jargon not explicitly pointing to numbers always eventually reduces down to some highly personalized interpretation of easy.
It is more impactful than it sounds because it implicitly points to the distinction of ultimate goal: the selfish developer or the product they are developing. It is also important to point out that before software frameworks were a thing the term framework just identifies a defined set of overlapping abstract business principles to achieve a desired state. Software frameworks, on the other hand, provide a library to determine a design convention rather than the desired operating state.
I recently had success with a problem I was having by basically doing the following:
- Write a correct, pretty implementation
- Beat Claude Code with a stick for 20 minutes until it generated a fragile, unmaintainable mess that still happened to produce the same result but in 300ms rather than 2500ms. (In this step, explicitly prompting it to test rather than just philosophising gets you really far)
- Pull across the concepts and timesaves from Claude's mess into the pretty code.
Seriously, these new models are actually really good at reasoning about performance and knowing alternative solutions or libraries that you might have only just discovered yourself.
However, a correct, pretty and fast solution may exist that neither of you have found yet.
But yes, the scope and breadth of their knowledge goes far beyond what a human brain can handle. How many relevant facts can you hold in your mind when solving a problem? 5? 12? An LLM can take thousands of relevant facts into account at the same time, and that's their superhuman ability.
Don't see a really important one in my opinion:
Refactor legacy code, don't rewrite it. All that cruft you see are bug fixes.
Because rewriting old complex code is way more time consuming that you think it'll be. You have to add not only in the same features, but all the corner cases that your system ran into in the past.
Have seen this myself. A large team spent an entire year of wasted effort on a clean rewrite of an key system (shopping cart at a high-volume website) that never worked...
...although, in the age of AI, wonder if a rewrite would be easier than in the past. Still, guessing even then, it'd be better if the AI refactored it first as a basis for reworking the code, as opposed to the AI doing a clean rewrite of code from the start.
One that is missing is Ousterhout’s rule for decomposing complexity:
complexity(system) =
sum(complexity(component) * time_spent_working_in(component)
for component in system).
The rule suggests that encapsulating complexity (e.g., in stable libraries that you never have to revisit) is equivalent to eliminating that complexity.
That’s not some kind of law, though. And I’m also not sure whether it even makes sense, complexity is not a function of time spent working on something.
First, few of the laws on that site are actual laws in the physics or mathematics sense. They are more guiding principles.
> complexity is not a function of time spent working on something.
But the complexity you observe is a function of your exposure to that complexity.
The notion of complexity exists to quantify the degree of struggle required to achieve some end. Ousterhout’s observation is that if you can move complexity into components far away from where you must do your work to achieve your ends, you no longer need to struggle with that complexity, and thus it effectively is not there anymore.
And in addition, the time you spend making a component work properly is absolutely a function of its complexity. Once you get it right, package it up neatly with a clean interface and a nice box, and leave it alone. Where "getting it right" means getting it to a state where you can "leave it alone".
I think the intent is that if you can cleanly encapsulate some complexity so that people working on stuff that uses it don't have to understand anything beyond a simple interface, that complexity "doesn't exist" for all intents and purposes. Obviously this isn't universal, but a fair percentage of programmers these days don't understand the hardware they're programming against due to the layers of abstractions over them, so it's not crazy either.
It's showing that all the complexity in the components are someone else's problem. Your only complexity is your own top layer and your interface with the components.
Only when the problem has been resolved well enough for your use cases. Like using an http client instead of dealing with parsing http messages or using a GUI toolkit instead of manipulating raw pixels.
That’s pretty much what good design is about. Your solve a foundational problems and now no one else needs to think about it (including you when working on some other parts).
There is one missing that i am using as primary for the last 5 years.
The UX pyramid but applied to DX.
It basically states that you should not focus in making something significant enjoyable or convenient if you don't have something that is usable, reliable or remotely functional.
Today, I was presented with Claude's decision to include numerous goto statements in a new implementation. I thought deeply about their manual removal; years of software laws went against what I saw. But then, I realized it wouldn't matter anymore.
Then I committed the code and let the second AI review it. It too had no problem with goto's.
Claude's Law:
The code that is written by the agent is the most correct way to write it.
Love the details sub pages. Over 20 years I collected a little list of specific laws or really observations (https://metamagic.substack.com/p/software-laws) and thought about turning each into specific detailed blog posts, but it has been more fun chatting with other engineers, showing the page and watch as they scan the list and inevitably tell me a great story. For example I could do a full writeup on the math behind this one, but it is way more fun hearing the stories about the trying and failing to get second re-writes for code.
9. Most software will get at most one major rewrite in its lifetime.
A couple are well-described/covered in books, e.g., Tesler's Law (Conservation of Complexity) is at the core of _A Philosophy of Software Design_ by John Ousterhout
(and of course Brook's Law is from _The Mythical Man Month_)
Curious if folks have recommendations for books which are not as well-known which cover these, other than the _Laws of Software Engineering_ book which the site is an advertisement for.....
I'd like to propose a corollary to Gall's Law. Actually it's a self-proving tautology already contained with the term "lifecycle." Any system that lasts longer than a single lifecycle oscillates between (reducing to) simplicity and (adding) complexity.
My bet is on the long arc of the universe trending toward complexity... but in spite of all this, I don't think all this complexity arises from a simple set of rules, and I don't think Gall's law holds true. The further we look at the rule-set for the universe, the less it appears to be reducible to three or four predictable mechanics.
Another commenter WillAdams has mentioned A Philosophy of Software Design (which should really be called A Set of Heuristics for Software Design) and one of the key concepts there are small (general) interfaces and deep implementations.
A similar heuristic also comes up in Elements of Clojure (Zachary Tellman) as well, where he talks about "principled components and adaptive systems".
The general idea: You should greatly care about the interfaces, where your stuff connects together and is used by others. The leverage of a component is inversely proportional to the size of that interface and proportional to the size of its implementation.
I think the way that connects to testing is that architecturally granular tests (down the stack) is a bit like pouring molasses into the implementation, rather than focusing on what actually matters, which is what users care about: the interface.
Now of course we as developers are the users of our own code, and we produce building blocks that we then use to compose entire programs. Having example tests for those building blocks is convenient and necessary to some degree.
However, what I want to push back on is the implied idea of having to hack apart or keep apart pieces so we can test them with small tests (per method, function etc.) instead of taking the time to figure out what the surface areas should be and then testing those.
If you need hyper granular tests while you're assembling pieces, then write them (or better: use a REPL if you can), but you don't need to keep them around once your code comes together and you start to design contracts and surface areas that can be used by you or others.
I think the general wisdom in that scenario is to keep them around until they get in the way. Let them provide a bit of value until they start being a cost.
Remember, just because people repeated it so many times it made it to this list, does not mean its true. There may be some truth in most of these, but none of these are "Laws". They are aphorisms: punchy one liners with the intent to distill something so complex as human interaction and software design.
Two of my main CAP theorem pet peeves happen on this page:
- Not realizing it's a very concrete theorem applicable in a very narrow theoretical situation, and that its value lies not in the statement itself but in the way of thinking that goes into the proof.
- Stating it as "pick any two". You cannot pick CA. Under the conditions of the CAP theorem it is immediately obvious that CA implies you have exactly one node. And guess what, then you have P too, because there's no way to partition a single node.
A much more usable statement (which is not a theorem but a rule of thumb) is: there is often a tradeoff between consistency and availability.
It sounds like you are unfamiliar with the idea that software engineering efforts can be underestimated at the outset. The humorous observation here is that the total is 180 percent, which mean that it took longer than expected, which is very common.
> Another example is prematurely choosing a complex data structure for theoretical efficiency (say, a custom tree for log(N) lookups) when the simpler approach (like a linear search) would have been acceptable for the data sizes involved.
This example is the exact example I'd choose where people wrongly and almost obstinately apply the "premature optimization" principles.
I'm not saying that you should write a custom hash table whenever you need to search. However, I am saying that there's a 99% chance your language has an inbuilt and standard datastructure in it's standard library for doing hash table lookups.
The code to use that datastructure vs using an array is nearly identical and not the least bit hard to read or understand.
And the reason you should just do the optimization is because when I've had to fix performance problems, it's almost always been because people put in nested linear searches turning what could have been O(n) into O(n^3).
But further, when Knuth was talking about actual premature optimization, he was not talking about algorithmic complexity. In fact, that would have been exactly the sort of thing he wrapped into "good design".
When knuth wrote about not doing premature optimizations, he was living in an era where compilers were incredibly dumb. A premature optimization would be, for example, hand unrolling a loop to avoid a branch instruction. Or hand inlining functions to avoid method call overhead. That does make code more nasty and harder to deal with. That is to say, the specific optimizations knuth was talking about are the optimizations compilers today do by default.
I really hate that people have taken this to mean "Never consider algorithmic complexity". It's a big reason so much software is so slow and kludgy.
> Another example is prematurely choosing a complex data structure for theoretical efficiency (say, a custom tree for log(N) lookups) when the simpler approach (like a linear search) would have been acceptable for the data sizes involved.
To be fair, a linear search through an array is, most of the time, faster than a hash table for sufficiently small data sizes.
Some of these laws are like Gravity, inevitable things you can fight but will always exist e.g. increasing complexity. Some of them are laws that if you break people will yell at you or at least respect you less, e.g. leave it cleaner than when you found it.
"No matter how adept and talented you are at your craft with respect to both technical and business matters, people involved in finance will think they know better."
I feel that Postel's law probably holds up the worst out of these. While being liberal with the data you accept can seem good for the functioning of your own application, the broader social effect is negative. It promotes misconceptions about the standard into informal standards of their own to which new apps may be forced to conform. Ultimately being strict with the input data allowed can turn out better in the long run, not to mention be more secure.
The list is great but the explanation are clearly AI slop.
"Before SpaceX, launching rockets was costly because industry practice used expensive materials and discarded rockets after one use. Elon Musk applied first-principles thinking: What is a rocket made of? Mainly aluminum, titanium, copper, and carbon fiber. Raw material costs were a fraction of finished rocket prices. From that insight, SpaceX decided to build rockets from scratch and make them reusable."
Everything including humans are made of cheap materials but that doesn't convey the value. The AI got close to the answer with it's first sentence (re-usability) but it clearly missed the mark.
>The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise- and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.
Knuth thought an easy 12% was worth it, but most people who quote him would scoff at such efforts.
Moreover:
>Knuth’s Optimization Principle captures a fundamental trade-off in software engineering: performance improvements often increase complexity. Applying that trade-off before understanding where performance actually matters leads to unreadable systems.
I suppose there is a fundamental tradeoff somewhere, but that doesn't mean you're actually at the Pareto frontier, or anywhere close to it. In many cases, simpler code is faster, and fast code makes for simpler systems.
For example, you might write a slow program, so you buy a bunch more machines and scale horizontally. Now you have distributed systems problems, cache problems, lots more orchestration complexity. If you'd written it to be fast to begin with, you could have done it all on one box and had a much simpler architecture.
Most times I hear people say the "premature optimization" quote, it's just a thought-terminating cliche.
I absolutely cannot stand people who recite this quote but has no knowledge of the sentences that come before or after it: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."
> In many cases, simpler code is faster, and fast code makes for simpler systems. (...)
I wholeheartedly agree with you here. You mentioned a few architectural/backend issues that emerge from bad performance and introduce unnecessary complexity.
But this also happens in UI: Optimistic updates, client side caching, bundling/transpiling, codesplitting etc.
This is what happens when people always answer performance problems with adding stuff than removing stuff.
> I suppose there is a fundamental tradeoff somewhere, but that doesn't mean you're actually at the Pareto frontier, or anywhere close to it. In many cases, simpler code is faster, and fast code makes for simpler systems.
Just a little historic context will tell you what Knuth was talking about.
Compilers in the era of Knuth were extremely dumb. You didn't get things like automatic method inlining or loop unrolling, you had to do that stuff by hand. And yes, it would give you faster code, but it also made that code uglier.
The modern equivalent would be seeing code working with floating points and jumping to SIMD intrinsics or inline assembly because the compiler did a bad job (or you presume it did) with the floating point math.
That is such a rare case that I find the premature optimization quote to always be wrong when deployed. It's always seems to be an excuse to deploy linear searches and to avoid using (or learning?) language datastructures which solve problems very cleanly in less code and much less time (and sometimes with less memory).
This looks like a static website that could be served for free from Cloudflare Pages or Vercel, with a nearly unlimited quota. And still... It's been hugged to death, which is ironic, considering it's a software engineering website :).
Hell, something like this probably doesn't even need that. Throw it on a debian box running nginx or apache and you'll probably be set (though, with how hard bots have been scraping recently it might be harder than that)
So somebody who doesn’t know how to properly host a static website wants to teach me about software engineering. Cool.
99% sure it’s a vibecoded container for AI slop anyway.
Software engineering is voodoo masquerading as science. Most of these "laws" are just things some guys said and people thought "sounds sensible". When will we have "laws" that have been extensively tested experimentally in controlled conditions, or "laws" that will have you in jail for violating them? Like "you WILL be held responsible for compromised user data"?
Look, I understand the intent you have, and I also understand the frustration at the lack of care with which many companies have acted with regards to personal data. I get it, I'm also frustrated.
But (it's a big but)...
Your suggestion is that we hold people legally responsible and culpable for losing a confrontation against another motivated, capable, and malicious party.
That's... a seriously, seriously, different standard than holding someone responsible for something like not following best practices, or good policy.
It's the equivalent of killing your general when he loses a battle.
And the problem is that sometimes even good generals lose battles, not because they weren't making an honest effort to win, or being careless, but because they were simply outmatched.
So to be really, really blunt - your proposal basically says that any software company should be legally responsible for not being able to match the resources of a nation-state that might want to compromise their data. That's not good policy, period.
Incidents happen in the meat world too. Engineers follow established standards to prevent them to the best of their ability. If they don't, they are prosecuted. Nobody has ever suggested putting people in jail for Russia using magic to get access to your emails. However, in the real world, there is no magic. The other party "outmatches" you by exploiting typical flaws in software and hardware, or, far more often, in company employees. Software engineering needs to grow up, have real certification and standards bodies and start being rigorously regulated, unless you want to rely on blind hope that your "general" has been putting an "honest effort" and showing basic competence.
We already have similar legal measures in software for following standards. These match very directly to engineering standards in things like construction and architecture. These are clearly understood, ex SOC 2, PCI DSS, GDPR, CCPA, NIST standards, ISO 27001, FISMA... etc... Delve is an example (LITERALLY RIGHT NOW!) of these laws being applied.
What we don't do in engineering is hold the engineer responsible when Russia bombs the bridge.
What you're suggesting is that we hold the software engineer responsible when Russia bombs their software stack (or more realistically, just plants an engineer on the team and leaks security info, like NK has been doing).
Basically - I'm saying you're both wrong about lacking standards, and also suggesting a policy that punishes without regard for circumstance. I'm not saying you're wrong to be mad about general disregard for user data, but I'm saying your "simple and clear" solution is bad.
... something something... for every complex problem there is an answer that is clear, simple, and wrong.
France killed their generals for losing. It was terrible policy then and it's terrible policy now.
Sure, and in cases of negligence this is fine. The law even explicitly scales the punishment based on perceived negligence and almost always is only prosecuted in cases where the standards expectations aren't followed.
Ex - MMG for 2026 was prosecuted because:
- They failed to notify in response to a breach.
- They failed to complete proper risk analysis as required by HIPAA
They paid 10k in fines.
It wasn't just "They had a data breach" (ops proposal...) it was "They failed to follow standards which led to a data breach where they then acted negligently"
In the same way that we don't punish an architect if their building falls over. We punish them if the building falls over because they failed to follow expected standards.
Buildings don't just fall over, and security breaches don't just happen. These things happen when people fuck up. In the architecture world we hold individuals responsible for not fucking up--not the architect, but instead the licensed engineer who signs and seals the structural aspects of a plan. In the software world we do not.
> any software company should be legally responsible for not being able to match the resources of a nation-state that might want to compromise their data
No. Not the company, holding companies responsible doesn't do much. The engineer who signed off on the system needs to be held personally liable for its safety. If you're a licensed civil engineer and you sign off on a bridge that collapses, you're liable. That's how the real world works, it should be the same for software.
Obviously if someone dies or is injured a safety violation has occurred. But other examples include things like data protection failures--if for example your system violates GDPR or similar constraints it is unsafe. If your system accidentally breaks tenancy constraints (sends one user's data to another user) it is unsafe. If your system allows a user to escalate privileges it is unsafe.
These kinds of failures are not inevitable. We can build sociotechnical systems and practices that prevent them, but until we're held liable--until there's sufficient selection pressure to erode the "move fast and break shit" culture--we'll continue to act negligently.
None of those are what OP proposed. Frankly, we also cover many of these practices just fine. What do you think SOC 2 type 2 and ISO 27001 are?
It seems like your issue is that we don't hold all companies to those standards. But I'm personally ok with that. In the same way I don't think residential homes should be following commercial construction standards.
> What do you think SOC 2 type 2 and ISO 27001 are?
They're compliance frameworks that have little to no consequences when they're violated, except for some nebulous "loss of trust" or maybe in extreme cases some financial penalties. The problem is the expectation value of the violation penalty isn't sufficient to change behavior. Companies still ship code which violates these things all the time.
> It seems like your issue is that we don't hold all companies to those standards.
Yes, and my issue is that we don't hold engineers personally liable for negligent work.
> I don't think residential homes should be following commercial construction standards.
Sure, there are different gradations of safety standards, but often residential construction plans require sign-off by a professional engineer. In the case when an engineer negligently signs off on an unsafe plan, that engineer is liable. Should be exactly the same situation in software.
There are few principle of software engineering that I hate more than this one, though SOLID is close.
It is important to understand that it is from a 1974 paper, computing was very different back then, and so was the idea of optimization. Back then, optimizing meant writing assembly code and counting cycles. It is still done today in very specific applications, but today, performance is mostly about architectural choices, and it has to be given consideration right from the start. In 1974, these architectural choices weren't choices, the hardware didn't let you do it differently.
Focusing on the "critical 3%" (which imply profiling) is still good advice, but it will mostly help you fix "performance bugs", like an accidentally quadratic algorithms, stuff that is done in loop but doesn't need to be, etc... But once you have dealt with this problem, that's when you notice that you spend 90% of the time in abstractions and it is too late to change it now, so you add caching, parallelism, etc... making your code more complicated and still slower than if you thought about performance at the start.
Today, late optimization is just as bad as premature optimization, if not more so.
I really encourage people to read the Donald Knuth essay that features this sentiment. Pro tip: You can skip to the very end of the article to get to this sentiment without losing context.
Here ya go: https://dl.acm.org/doi/10.1145/356635.356640
Basically, don't spend unnecessary effort increasing performance in an unmeasured way before its necessary, except for those 10% of situations where you know in advance that crucial performance is absolutely necessary. That is the sentiment. I have seen people take this to some bizarre alternate insanity of their own creation as a law to never measure anything, typically because the given developer cannot measure things.
Similar to the "code should be self documenting - ergo: We don't write any comments, ever"
(Then, shortly afterward I also tried to find a new job, realized the entire industry had changed, and was fortunate enough to decide it wasn't worth the trouble.)
That's likely thanks to C which goes to great pains to not specify the size of the basic types. For example, for 64 bit architectures, "long" is 32 bits on the Mac and 64 bits everywhere else.
The net result of that is I never use C "long", instead using "int" and "long long".
This mess is why D has 32 bit ints and 64 bit longs, whether it's a 32 bit machine or a 64 bit machine. The result was we haven't had porting problems with integer sizes.
I've met very few folks who understand the overheads involved, and how extreme the benefits can be from avoiding those.
To be fair, though, I come up short on a lot of things comp sci graduates know.
It's why Andrei Alexandrescu and I made a good team. I was the engineer, and he the scientist. The yin and the yang, so to speak.
If the number of bits isn't actually included right in the type name, then be very sure you know what you're doing.
The senior engineer answer to "How many bits are there in an int?" is "No, stop, put that down before you put your eye out!" Which, to be fair, is the senior engineer answer to a lot of things.
My counterpoint: Code can be self-documenting, reality isn't. You can have a perfectly clear method that does something nobody will ever understand unless you have plenty of documentation about why that specific thing needs to be done, and why it can't be simpler. Like having special-casing for DST in Arizona, which no other state seems to need:
https://en.wikipedia.org/wiki/Time_in_the_United_States
"You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is."
Moreso, in my personal experience, I've seen a few speed hacks cause incorrect behavior on more than one occasion.
I'm still salty about that time a colleague suggested adding a 500 kb general purpose js library to a webapp that was already taking 12 seconds on initial load, in order to fix a tiny corner case, when we could have written our own micro utility in 20 lines. I had to spend so much time advocating to management for my choice to spend time writing that utility myself, because of that kind of garbage opinion that is way too acceptable in our industry today. The insufferable bastard kept saying I had to do measurements in order to make sure I wasn't prematurely optimizing. Guy adding 500 kb of js when you need 1 kb of it is obviously a horrible idea, especially when you're already way over the performance budget. Asshat. I'm still salty he got so much airtime for that shitty opinion of his and that I had to spend so much energy defending myself.
It's particularly the kind of people who like to say "hur hur don't prematurely optimize" that don't bother writing decent software to begin with and use the term as an excuse to write poor performing code.
Instead of optimizing their code, these people end up making excuses so they can pessimize it instead.
Usually those people also have a good old whinge about the premature optimization quote being wrong or misinterpreted and general attitudes to software efficiency.
Not once have I ever seen somebody try to derail a process of "ascertain speed is an issue that should be tackled" -> "profile" -> fix the hot path.
Your users are not going to notice. Sure, it's faster but it's not focused on the problem.
This doesn't make sense. Why is performance (via architectural choices) more important today than then?
You can build a snappy app today by using boring technology and following some sensible best practices. You have to work pretty hard to need PREMATURE OPTIMIZATION on a project -- note the premature there
If you are building something with similar practical constraints for the Nth time this is definitely true.
You are inheriting “architecture” from your own memory and/or tools/dependencies that are already well fit to the problem area. The architectural performance/model problem already got a lot of thought.
Lots of problems are like that.
But if you are solving a problem where existing tools do a poor job, you better be thinking about performance with any new architecture.
Optimization of bandwidth-bound code is almost purely architectural in nature. Most of our software best practices date from a time when everything was computation-bound such that architecture could be ignored with few bad effects.
There were fewer available layers of abstraction.
Whether you wrote in ASM, C, or Pascal, there was a lot less variance than writing in Rust, JavaScript, Python.
Thinking about the overall design, how its likely to be used, and what the performane and other requirements are before aggregating the frameworks of the day is mature optimization.
Then you build things in a reasonable way and see if you need to do more for performance. It's fun to do more, but most of the time, building things with a thought about performance gets you where you need to be.
The I don't need to think about performance at all camp, has a real hard time making things better later. For most things, cycle counting upfront isn't useful, but thinking about how data will be accessed and such can easily make a huge difference. Things like bulk load or one at a time load are enormous if you're loading lots of things, but if you'll never load lots of things, either works.
Thinking about concurrency, parallelism, and distributed systems stuff before you build is also pretty mature. It's hard to change some of that after you've started.
SOLID isn't bad, but like the idea of premature optimization, it can easily lead you into the wrong direction. You know how people make fun of enterprise code all the time, that's what you get when you take SOLID too far.
In practice, it tends to lead to a proliferation of interfaces, which is not only bad for performance but also result in code that is hard to follow. When you see a call through an interface, you don't know what code will be run unless you know how the object is initialized.
I think the most important principle above all is knowing when not to stick to them.
For example if I know a piece of code is just some "dead end" in the application that almost nothing depends on then there is little point optimizing it (in an architectural and performance sense). But if I'm writing a core part of an application that will have lots of ties to the rest, it totally does make sense keeping an eye on SOLID for example.
I think the real error is taking these at face value and not factoring in the rest of your problem domain. It's way too simple to think SOLID = good, else bad.
https://www.tedinski.com/2019/04/02/solid-critique.html
[1] https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...
This should be the header of the website. I think the core of all these arguments is people thinking they ARE laws that must be followed no matter what. And in that case, yeah that won't work.
SOLID approaches aren't free... beyond that keeping code closer together by task/area is another approach. I'm not a fan of premature abstraction, and definitely prefer that code that relates to a feature live closer together as opposed to by the type of class or functional domain space.
For that matter, I think it's perfectly fine for a web endpoint handler to make and return a simple database query directly without 8 layers of interfaces/classes in between.
Beyond that, there are other approaches to software development that go beyond typical OOP practices. Something, something, everything looks like a nail.
The issues that I have with SOLID/CLEAN/ONION is that they tend to lead to inscrutable code bases that take an exponentially long amount of time for anyone to come close to learning and understanding... Let alone the decades of cruft and dead code paths that nobody bothered to clean up along the way.
The longest lived applications I've ever experienced tend to be either the simplest, easiest to replace or the most byzantine complex monstrosities... and I know which I'd rather work on and support. After three decades I tend to prioritize KISS/YAGNI over anything else... not that there aren't times where certain patterns are needed, so much as that there are more times where they aren't.
I've worked on one, singular, one application in three decades where the abstractions that tend to proliferate in SOLID/CLEAN/ONION actually made sense... it was a commercial application deployed to various govt agencies that had to support MS-SQL, Oracle and DB2 backends. Every, other, time I've seen an excess of database and interface abstractions have been instances that would have been better solved in other, less performance impacting ways. If you only have a single concrete implementation of an interface, you probably don't need that interface... You can inherit/override the class directly for testing.
And don't get me started on keeping unit tests in a completely separate project... .Net actually makes it painful to put your tests with your implementation code. It's one of my few actual critiques about the framework itself, not just how it's used/abused.
Even his "critique" of Demeter is, essentially, that it focuses on an inconsequential aspect of dysfunction—method chaining—which I consider to be just one sme that leads to the larger principle which—and we, apparently, both agree on this—is interface design.
The only part of SOLID that is perhaps OO-only is Liskov Substitution.
L is still a good idea, but without object-inheritance, there's less chance of shooting yourself in the foot.
If you follow SOLID, you'll write OOP only, with always present inheritance chains, factories for everything, and no clear relation between parameters and the procedures that use them.
L and I are both pretty reasonable.
But S and D can easily be taken to excess.
And O seems to suggest OO-style polymorphism instead of ADTs.
That's how I view it. You should design your application such that extension involves little modifying of existing code as long as it's not necessary from a behavior or architectural standpoint.
Bunch of stuff is done for us. Using postgres having indexes correct - is not premature optimization, just basic stuff to be covered.
Having double loop is quadratic though. Parallelism is super fun because it actually might make everything slower instead of faster.
Not if your optimization for performance is some Rube Goldberg assemblage of microservices and an laundry list of AWS services.
Anyone who has done optimization even a little knows that it isn't very difficult, but you do need to plan and architect for it so you don't have to restructure you whole program to get it to run well.
Mostly it's just rationalization, people don't know the skill so they pretend it's not worth doing and their users suffer for it.
If software and website were even reasonably optimized people could just use a computer as powerful as a rasberry pi 5 (except for high res video) for most of what they do day to day.
And as I point out, what Knuth was talking about in terms of optimization was things like loop unrolling and function inlining. Not picking the right datastructure or algorithm for the problem.
I mean, FFS, his entire book was about exploring and picking the right datastructures and algorithms for problems.
[1] https://news.ycombinator.com/item?id=47849194
Decades in, this is the worst of all of them. Misused by laziness or malice, and nowhere near specific enough.
The graveyard of companies boxed in by past poor decisions is sprawling. And the people that made those early poor decisions bounce around field talking about their "successful track record" of globally poor and locally good architectural decisions that others have had to clean up.
It touches on a real problem, though, but it should be stricken form the record and replaced with a much better principle. "Design to the problem you have today and the problems you have in 6 months if you succeed. Don't design to the problems you'll have have next year if it means you won't succeed in 6 months" doesn't roll off the tongue.
One thing that came out of the no-sql/new-sql trends in the past decade and a half is that joins are the enemy of performance at scale. It really helps to know and compromise on db normalization in ways such as leaning on JSON/XML for non-critical column data as opposed to 1:1/children/joins a lot of the time. For that matter, pure performance and vertical scale have shifted a lot of options back from the brink of micro service death by a million paper cuts processes.
You are right about the origin of and the circumstances surrounding the quote, but I disagree with the conclusion you've drawn.
I've seen engineers waste days, even weeks, reaching for microservices before product-market fit is even found, adding caching layers without measuring and validating bottlenecks, adding sharding pre-emptively, adding materialized views when regular tables suffice, paying for edge-rendering for a dashboard used almost entirely by users in a single state, standing up Kubernetes for an internal application used by just two departments, or building custom in-house rate limiters and job queues when Sidekiq or similar solutions would cover the next two years.
One company I consulted for designed and optimized for an order of magnitude more users than were in the total addressable market for their industry! Of that, they ultimately managed to hit only 3.5%.
All of this was driven by imagined scale rather than real measurements. And every one of those choices carried a long tail: cache invalidation bugs, distributed transactions, deployment orchestration, hydration mismatches, dependency array footguns, and a codebase that became permanently harder to change. Meanwhile the actual bottlenecks were things like N+1 queries or missing indexes that nobody looked at because attention went elsewhere.
Yes. When I was a young engineer, I was asked to design something for a scale we didn’t even get close to achieving. Eventual consistency this, event driven conflict resolution that… The service never even went live because by the time we designed it, everyone realized it was a waste of time.
I learned it makes no sense to waste time designing for zillions of users that might never come. It’s more important to have an architecture that can evolve as needs change rather than one that can see years into the future (that may never come).
I was quite literally asked to implement an in-memory cache to avoid a "full table scan" caused by a join to a small DB table recently. Our architect saw "full table scans" in our database stats and assumed that must mean a performance problem. I feel like he thought he was making a data-driven profiling decision, but seemed to misunderstand that a full-table scan is faster for a small table than a lookup. That whole table is in RAM in the DB already.
So now we have a complex Redis PubSub cache invalidation strategy to save maybe a ms or two.
I would believe that we have performance problems in this chunk of code, and it's possible an in-memory cache may "fix" the issue, but if it does, then the root of the problem was more likely an N+1 query (that an in-memory cache bandaids over). But by focusing on this cache, suddenly we have a much more complex chunk of code that needs to be maintained than if we had just tracked down the N+1 query and fixed _that_
In these domains, algorithm selection, and fine tuning hot spots pays off significantly. You must hit minimum speeds to make your application viable.
“A variable should mean one thing, and one thing only. It should not mean one thing in one circumstance, and carry a different value from a different domain some other time. It should not mean two things at once. It must not be both a floor polish and a dessert topping. It should mean One Thing, and should mean it all of the time.”
I worked as a janitor for four years near a restaurant, so I know a little bit about floor polishing and dessert toppings. This law might be a little less universal than you think. There are plenty of people who would happily try out floor polish as a dessert topping if they're told it'll get them high.
It probably won’t be up very long but it’s a classic.
I’m still waiting for the moment in the ice cream shop when I can ask them, “sugar or plain?” https://mediaburn.org/videos/sugar-or-plain/
It definitely revealed a lot of falsehoods and stereotypes.
I think that would be called a drug, not a desert topping.
The resolution I've landed on: be strict in what you accept at boundaries you control (internal APIs, config parsing) and liberal only at external boundaries where you can't enforce client upgrades. But that heuristic requires knowing which category you're in, which is often the hard part.
If I accidentally accept bad input and later want to fix that, I could break long-time API users and cause a lot of human suffering. If my input parsing is too strict, someone who wants more liberal parsing will complain, and I can choose to add it before that interaction becomes load-bearing (or update my docs and convince them they are wrong).
The stark asymmetry says it all.
Of course, old clients that can’t be upgraded have veto power over any changes that could break them. But that’s just backwards compatibility, not Postel’s Law.
Source: I’m on a team that maintains a public API used by thousands of people for nearly 10 years. Small potatoes in internet land but big enough that if you cause your users pain, you feel it.
Bottom line: it's all a matter of balance of powers. If you're the smaller guy in the equation, you'll be "Postel'ed" anyway.
Yet Postel's law is still in the "the road to hell is paved with good intentions" category, for the reason you explain very well (AKA XKCD #1172 "Workflow"). Wikipedia even lists a couple of major critics about it [1].
[1] https://en.wikipedia.org/wiki/Robustness_principle
I've seen CompSci guys especially (I'm EEE background, we have our own problems but this ain't one of them) launch conceptual complexity into the stratosphere just so that they could avoid writing two separate functions that do similar things.
Take the 5 Rings approach.
The purpose of the blade is to cut down your opponent.
The purpose of software is to provide value to the customer.
It's the only thing that matters.
You can also philosophize why people with blades needed to cut down their opponents along with why we have to provide value to the customer but thats beyond the scope of this comment
If you write a lot of code, the odds of something repeating in another place just by coincidence are quite large. But the odds of the specific code that repeated once repeating again are almost none.
That's a basic rule from probability that appears in all kinds of contexts.
Anyway, both DRY and WET assume the developers are some kind ignorant automaton that can't ever know the goal of their code. You should know if things are repeating by coincidence or not.
Partially correct. The purpose of your software to its owners is also to provide future value to customers competitively.
What we have learnt is that software needs to be engineered: designed and structured.
Making software is a back-of-house function, in restaurant terms. Nobody out there sees it happen, nobody knows what good looks like, but when a kitchen goes badly wrong, the restaurant eventually closes.
This is a very costly way of developing software.
I've been at organizations that don't think engineers should write tests because it takes too much time and slows them down...
The "who gives a shit, we'll just rewrite it at 100x the cost" approach to quality is very particular to the software startup business model, and doesn't work elsewhere.
The key is to avoid the temptation to DRY when things are only slightly different and find a balance between reuse and "one function/class should only do one thing."
One of my favorite things as a software engineer is when you see the third example of a thing, it shows you the problem from a different angle, and you can finally see the perfect abstraction that was hiding there the whole time.
My view is over-engineering comes from the innate desire of engineers to understand and master complexity. But all software is a liability, every decision a tradeoff that prunes future possibilities. So really you want to make things as simple as possible to solve the problem at hand as that will give you more optionality on how to evolve later.
Yes the initial HTML looked similar in these few places, and the resultant usage of the abstraction did not look similar.
But it took a very long time reading each place a table existed and quite a bit longer working out how to get it to generate the small amount of HTML you wanted to generate for a new case.
Definitely would have opted for repetition in this particular scenario.
The spectrum is [YAGNI ---- DRY]
A little less abstract: designing a UX comes to mind. It's one thing to make something workable for you, but to make it for others is way harder.
The goal ought to be to aim for a local minima of all of these qualities.
Some people just want to toss DRY away entirely though or be uselessly vague about when to apply it ("use it when it makes sense") and thats not really much better than being a DRY fundamentalist.
A common "failure" of DRY is coupling together two things that only happened to bear similarity while they were both new, and then being unable to pick them apart properly later.
Which is often caused by the "midlayer mistake" https://lwn.net/Articles/336262/
Yeah there are ways to avoid this and you need to strike balances, but sometimes you have to be careful and resist the temptation to DRY everything up 'cuz you might just make it brittler (pun intended).
The tricky part is that sometimes "a new thing" is really "four new things" disguised as one. A database table is a great example because it's a failure mode I've seen many times. A developer has to do it once and they have to add what they perceive as the same thing four times: the database table itself, the internal DB->code translation e.g. ORM mapping, the API definition, and maybe a CRUD UI widget. The developer thinks, "oh, this isn't DRY" and looks to tools like Alembic and PostGREST or Postgraphile to handle this end-to-end; now you only need to write to one place when adding a database table, great!
It works great at first, then more complex requirements come down: the database gets some virtual generated columns which shouldn't be exposed in code, the API shouldn't return certain fields, the UI needs to work off denormalized views. Suddenly what appeared to be the same thing four times is now four different things, except there's a framework in place which treats these four things as one, and the challenge is now decoupling them.
Thankfully most good modern frameworks have escape valves for when your requirements get more complicated, but a lot of older ones[0] really locked you in and it became a nightmare to deal with.
[0] really old versions of Entity Framework being the best/worst example.
But the code I'm talking about is really adding the same thing in 4 different places: the constant itself, adding it to a type, adding it to a list, and there was something else. It made it very easy to forget one step.
There should often be two points of truth because having one would increase the coupling cost more than the benefits that would be derived from deduplication.
Which maybe is also fine, I dunno :)
It can be quite hard to explain when a student asks why you did something a particular way. The truthful answer is that it felt like the right way to go about it.
With some thought you can explain it partly - really justify the decision subconsciously made.
If they're asking about a conscious decision that's rarely much more helpful that you having to say that's what the regulations, or guidelines say.
Where they really learn is seeing those edge cases and gray areas
So much SWE is overengineering. Just like this website to be honest. You don't get away with all that bullshit in other eng professions where your BoM and labour costs are material.
[0] https://cupid.dev/
Reading through the list mostly made me feel sad. You can't help but interpret these through the modern lens of AI assisted coding. Then you wonder if learning and following (some) of these for the last 20 years is going to make you a janitor for a bunch of AI slop, or force you into a coding style where these rules are meaningless, or make you entirely irrelevant.
Sort of like a real code of law.
- Every website will be vibecoded using Claude Opus
This will result in the following:
- The background color will be a shade of cream, to properly represent Anthropic
- There will be excessive use of different fonts and weights on the same page, as if a freshman design student who just learned about typography
- There will be an excess of cards in different styles, a noteworthy amount of which has a colored, round border either on hover or by default on exactly one side of the card
I always liked the fence story better though.
"In analyzing complexity, fast iteration almost always produces better results than in-depth analysis."
Boyd invented the OODA loop.
[0]https://blog.codinghorror.com/boyds-law-of-iteration/
Asking "who wrote this stupid code?" will retroactively travel back in time and cause it to have been you.
- Jeremy Clarkson (Top Gear, series 14 episode 5)
Structure code so that in an ideal case, removing a functionality should be as simple as deleting a directory or file.
Imagine the code as a graph with nodes and edges. The nodes should be grouped in a way that when you display the graph with grouped nodes, you see few edges between groups. Removing a group means that you need to cut maybe 3 edges, not 30. I.e. you don't want something where every component has a line to every other component.
Also when working on a feature - modifying / adding / removing, ideally you want to only look at an isolated group, with minimal links to the rest of the code.
For example, each comment on HN has a line on top that contains buttons like "parent", "prev", "next", "flag", "favorite", etc. depending on context. Suppose I might one day want to remove the "flag" functionality. Should each button be its own file? What about the "comment header" template file that references each of those button files?
This in itself might not be enough to justify this, but the fewer files will lead to more challenges in a collaborative environment (I’d also note that more small files will speed up incremental compilations since unchanged code is less likely to get recompiled which is one reason why when I do JVM dev, I never really think about compilation time—my IDE can recompile everything quickly in the background without my noticing).
You got a point for incremental compilation. But fewer files (done well) is not really a challenge as everything is self contained. It makes it easier to discern orthogonal features as the dependency graph is clearer. With multiple files you find often that similar things are assumed to be identical and used as such. Then it’s a big refactor when trying to split them, especially if they are foundational.
Sure, don't add hooks for things you don't immediately need. But if you are reasonably sure a feature is going to be required at some point, it doesn't hurt to organize and structure your code in a way that makes those hooks easy to add later on.
Worst case scenario, you are wrong and have to refactor significantly to accommodate some other feature you didn't envision. But odds are you have to do that anyway if you abide by YAGNI as dogma.
The amount of times I've heard YAGNI as reasoning to not modularize code is insane. There needs to be a law that well-intentioned developers will constantly misuse and misunderstand the ideas behind these heuristics in surprising ways.
> Fen's law: copy-paste is free; abstractions are expensive.
edit: I should add, this is aimed at situations like when you need a new function that's very similar to one you already have, and juniors often assume it's bad to copy-paste so they add a parameter to the existing function so it abstracts both cases. And my point is: wait, consider the cost of the abstraction, are the two use cases likely to diverge later, do they have the same business owner, etc.
> 11. Abstractions don’t remove complexity. They move it to the day you’re on call.
Source: https://addyosmani.com/blog/21-lessons/
- Shirky Principle: Institutions will try to preserve the problem to which they are the solution
- Chesterton's Fence: Changes should not be made until the reasoning behind the current state of affairs is understood
- Rule of Three: Refactoring given only two instances of similar code risks selecting a poor abstraction that becomes harder to maintain than the initial duplication
> "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it"
https://casual-effects.blogspot.com/2014/05/a-computer-scien...
When it comes to frameworks (any framework) any jargon not explicitly pointing to numbers always eventually reduces down to some highly personalized interpretation of easy.
It is more impactful than it sounds because it implicitly points to the distinction of ultimate goal: the selfish developer or the product they are developing. It is also important to point out that before software frameworks were a thing the term framework just identifies a defined set of overlapping abstract business principles to achieve a desired state. Software frameworks, on the other hand, provide a library to determine a design convention rather than the desired operating state.
I had a hard time learning the whole mvc concept
Or develop a skill to make it correct, fast and pretty in one or two approaches.
- Write a correct, pretty implementation
- Beat Claude Code with a stick for 20 minutes until it generated a fragile, unmaintainable mess that still happened to produce the same result but in 300ms rather than 2500ms. (In this step, explicitly prompting it to test rather than just philosophising gets you really far)
- Pull across the concepts and timesaves from Claude's mess into the pretty code.
Seriously, these new models are actually really good at reasoning about performance and knowing alternative solutions or libraries that you might have only just discovered yourself.
But yes, the scope and breadth of their knowledge goes far beyond what a human brain can handle. How many relevant facts can you hold in your mind when solving a problem? 5? 12? An LLM can take thousands of relevant facts into account at the same time, and that's their superhuman ability.
Because rewriting old complex code is way more time consuming that you think it'll be. You have to add not only in the same features, but all the corner cases that your system ran into in the past.
Have seen this myself. A large team spent an entire year of wasted effort on a clean rewrite of an key system (shopping cart at a high-volume website) that never worked... ...although, in the age of AI, wonder if a rewrite would be easier than in the past. Still, guessing even then, it'd be better if the AI refactored it first as a basis for reworking the code, as opposed to the AI doing a clean rewrite of code from the start.
Where's Chesterton's Fence?
https://en.wiktionary.org/wiki/Chesterton%27s_fence
[EDIT: Ninja'd a couple of times. +1 for Shirky's principle]
> complexity is not a function of time spent working on something.
But the complexity you observe is a function of your exposure to that complexity.
The notion of complexity exists to quantify the degree of struggle required to achieve some end. Ousterhout’s observation is that if you can move complexity into components far away from where you must do your work to achieve your ends, you no longer need to struggle with that complexity, and thus it effectively is not there anymore.
That’s pretty much what good design is about. Your solve a foundational problems and now no one else needs to think about it (including you when working on some other parts).
The UX pyramid but applied to DX.
It basically states that you should not focus in making something significant enjoyable or convenient if you don't have something that is usable, reliable or remotely functional.
https://www.google.com/search?q=ux+pyramid
Then I committed the code and let the second AI review it. It too had no problem with goto's.
Claude's Law: The code that is written by the agent is the most correct way to write it.
9. Most software will get at most one major rewrite in its lifetime.
Applies to opensource. But it also means that code reviews are a good thing. Seniors can guide juniors to coax them to write better code.
* https://martynassubonis.substack.com/p/5-empirical-laws-of-s...
* https://newsletter.manager.dev/p/the-unwritten-laws-of-softw..., which linked to:
* https://newsletter.manager.dev/p/the-13-software-engineering...
A couple are well-described/covered in books, e.g., Tesler's Law (Conservation of Complexity) is at the core of _A Philosophy of Software Design_ by John Ousterhout
https://www.goodreads.com/en/book/show/39996759-a-philosophy...
(and of course Brook's Law is from _The Mythical Man Month_)
Curious if folks have recommendations for books which are not as well-known which cover these, other than the _Laws of Software Engineering_ book which the site is an advertisement for.....
I wish AWS/Azure had this functionality.
My bet is on the long arc of the universe trending toward complexity... but in spite of all this, I don't think all this complexity arises from a simple set of rules, and I don't think Gall's law holds true. The further we look at the rule-set for the universe, the less it appears to be reducible to three or four predictable mechanics.
Especially things like “every system grows more complex over time” — you can see it in almost any project after a few iterations.
I think the real challenge isn’t knowing these laws, but designing systems that remain usable despite them.
While browsing it, I of course found one that I disagree with:
Testing Pyramid: https://lawsofsoftwareengineering.com/laws/testing-pyramid/
I think this is backwards.
Another commenter WillAdams has mentioned A Philosophy of Software Design (which should really be called A Set of Heuristics for Software Design) and one of the key concepts there are small (general) interfaces and deep implementations.
A similar heuristic also comes up in Elements of Clojure (Zachary Tellman) as well, where he talks about "principled components and adaptive systems".
The general idea: You should greatly care about the interfaces, where your stuff connects together and is used by others. The leverage of a component is inversely proportional to the size of that interface and proportional to the size of its implementation.
I think the way that connects to testing is that architecturally granular tests (down the stack) is a bit like pouring molasses into the implementation, rather than focusing on what actually matters, which is what users care about: the interface.
Now of course we as developers are the users of our own code, and we produce building blocks that we then use to compose entire programs. Having example tests for those building blocks is convenient and necessary to some degree.
However, what I want to push back on is the implied idea of having to hack apart or keep apart pieces so we can test them with small tests (per method, function etc.) instead of taking the time to figure out what the surface areas should be and then testing those.
If you need hyper granular tests while you're assembling pieces, then write them (or better: use a REPL if you can), but you don't need to keep them around once your code comes together and you start to design contracts and surface areas that can be used by you or others.
- Not realizing it's a very concrete theorem applicable in a very narrow theoretical situation, and that its value lies not in the statement itself but in the way of thinking that goes into the proof.
- Stating it as "pick any two". You cannot pick CA. Under the conditions of the CAP theorem it is immediately obvious that CA implies you have exactly one node. And guess what, then you have P too, because there's no way to partition a single node.
A much more usable statement (which is not a theorem but a rule of thumb) is: there is often a tradeoff between consistency and availability.
Relax. You will make all the mistakes because the laws don't make sense until you trip over them :)
Comment your code? Yep. Helped me ten years later working on the same codebase.
You can't read a book about best practises and then apply them as if wisdom is something you can be told :)
It is like telling kids, "If you do this you will hurt yourself" YMMV but it won't :)
ha, someone needs to email Netlify...
https://web.archive.org/web/20260421113202/https://lawsofsof...
> The first 90% of the code accounts for the first 90% of development time; the remaining 10% accounts for the other 90%.
It should be 90% code - 10% time / 10% code - 90% time
This one belongs to history books, not to the list of contemporary best practices.
> Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule
Anyway, the list seems like something AI scraped and has a strong bias towards "gotcha" comments from the likes of reddit.
> Premature Optimization (Knuth's Optimization Principle)
> Another example is prematurely choosing a complex data structure for theoretical efficiency (say, a custom tree for log(N) lookups) when the simpler approach (like a linear search) would have been acceptable for the data sizes involved.
This example is the exact example I'd choose where people wrongly and almost obstinately apply the "premature optimization" principles.
I'm not saying that you should write a custom hash table whenever you need to search. However, I am saying that there's a 99% chance your language has an inbuilt and standard datastructure in it's standard library for doing hash table lookups.
The code to use that datastructure vs using an array is nearly identical and not the least bit hard to read or understand.
And the reason you should just do the optimization is because when I've had to fix performance problems, it's almost always been because people put in nested linear searches turning what could have been O(n) into O(n^3).
But further, when Knuth was talking about actual premature optimization, he was not talking about algorithmic complexity. In fact, that would have been exactly the sort of thing he wrapped into "good design".
When knuth wrote about not doing premature optimizations, he was living in an era where compilers were incredibly dumb. A premature optimization would be, for example, hand unrolling a loop to avoid a branch instruction. Or hand inlining functions to avoid method call overhead. That does make code more nasty and harder to deal with. That is to say, the specific optimizations knuth was talking about are the optimizations compilers today do by default.
I really hate that people have taken this to mean "Never consider algorithmic complexity". It's a big reason so much software is so slow and kludgy.
To be fair, a linear search through an array is, most of the time, faster than a hash table for sufficiently small data sizes.
Here's another law: the law of Vibe Engineering. Whatever you feel like, as long as you vibe with it, is software engineering.
It’s not a great list. The good old c2.com has many more, better ones.
That one's free.
Site not available This site was paused as it reached its usage limits. Please contact the site owner for more information.
"Before SpaceX, launching rockets was costly because industry practice used expensive materials and discarded rockets after one use. Elon Musk applied first-principles thinking: What is a rocket made of? Mainly aluminum, titanium, copper, and carbon fiber. Raw material costs were a fraction of finished rocket prices. From that insight, SpaceX decided to build rockets from scratch and make them reusable."
Everything including humans are made of cheap materials but that doesn't convey the value. The AI got close to the answer with it's first sentence (re-usability) but it clearly missed the mark.
https://lawsofsoftwareengineering.com/laws/premature-optimiz...
It leaves out this part from Knuth:
>The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise- and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.
Knuth thought an easy 12% was worth it, but most people who quote him would scoff at such efforts.
Moreover:
>Knuth’s Optimization Principle captures a fundamental trade-off in software engineering: performance improvements often increase complexity. Applying that trade-off before understanding where performance actually matters leads to unreadable systems.
I suppose there is a fundamental tradeoff somewhere, but that doesn't mean you're actually at the Pareto frontier, or anywhere close to it. In many cases, simpler code is faster, and fast code makes for simpler systems.
For example, you might write a slow program, so you buy a bunch more machines and scale horizontally. Now you have distributed systems problems, cache problems, lots more orchestration complexity. If you'd written it to be fast to begin with, you could have done it all on one box and had a much simpler architecture.
Most times I hear people say the "premature optimization" quote, it's just a thought-terminating cliche.
I wholeheartedly agree with you here. You mentioned a few architectural/backend issues that emerge from bad performance and introduce unnecessary complexity.
But this also happens in UI: Optimistic updates, client side caching, bundling/transpiling, codesplitting etc.
This is what happens when people always answer performance problems with adding stuff than removing stuff.
Just a little historic context will tell you what Knuth was talking about.
Compilers in the era of Knuth were extremely dumb. You didn't get things like automatic method inlining or loop unrolling, you had to do that stuff by hand. And yes, it would give you faster code, but it also made that code uglier.
The modern equivalent would be seeing code working with floating points and jumping to SIMD intrinsics or inline assembly because the compiler did a bad job (or you presume it did) with the floating point math.
That is such a rare case that I find the premature optimization quote to always be wrong when deployed. It's always seems to be an excuse to deploy linear searches and to avoid using (or learning?) language datastructures which solve problems very cleanly in less code and much less time (and sometimes with less memory).
Law 0: Fix infra.
If you are saying you _can_ fix 90-99% of performance bottlenecks eventually with caching, that may be true, but doesn't sound as nice
Posterior probability of a prompt-created website: 99%.
Look, I understand the intent you have, and I also understand the frustration at the lack of care with which many companies have acted with regards to personal data. I get it, I'm also frustrated.
But (it's a big but)...
Your suggestion is that we hold people legally responsible and culpable for losing a confrontation against another motivated, capable, and malicious party.
That's... a seriously, seriously, different standard than holding someone responsible for something like not following best practices, or good policy.
It's the equivalent of killing your general when he loses a battle.
And the problem is that sometimes even good generals lose battles, not because they weren't making an honest effort to win, or being careless, but because they were simply outmatched.
So to be really, really blunt - your proposal basically says that any software company should be legally responsible for not being able to match the resources of a nation-state that might want to compromise their data. That's not good policy, period.
What we don't do in engineering is hold the engineer responsible when Russia bombs the bridge.
What you're suggesting is that we hold the software engineer responsible when Russia bombs their software stack (or more realistically, just plants an engineer on the team and leaks security info, like NK has been doing).
Basically - I'm saying you're both wrong about lacking standards, and also suggesting a policy that punishes without regard for circumstance. I'm not saying you're wrong to be mad about general disregard for user data, but I'm saying your "simple and clear" solution is bad.
... something something... for every complex problem there is an answer that is clear, simple, and wrong.
France killed their generals for losing. It was terrible policy then and it's terrible policy now.
Ex - MMG for 2026 was prosecuted because:
- They failed to notify in response to a breach.
- They failed to complete proper risk analysis as required by HIPAA
They paid 10k in fines.
It wasn't just "They had a data breach" (ops proposal...) it was "They failed to follow standards which led to a data breach where they then acted negligently"
In the same way that we don't punish an architect if their building falls over. We punish them if the building falls over because they failed to follow expected standards.
No. Not the company, holding companies responsible doesn't do much. The engineer who signed off on the system needs to be held personally liable for its safety. If you're a licensed civil engineer and you sign off on a bridge that collapses, you're liable. That's how the real world works, it should be the same for software.
These kinds of failures are not inevitable. We can build sociotechnical systems and practices that prevent them, but until we're held liable--until there's sufficient selection pressure to erode the "move fast and break shit" culture--we'll continue to act negligently.
It seems like your issue is that we don't hold all companies to those standards. But I'm personally ok with that. In the same way I don't think residential homes should be following commercial construction standards.
That doesn't worry me overly much.
> What do you think SOC 2 type 2 and ISO 27001 are?
They're compliance frameworks that have little to no consequences when they're violated, except for some nebulous "loss of trust" or maybe in extreme cases some financial penalties. The problem is the expectation value of the violation penalty isn't sufficient to change behavior. Companies still ship code which violates these things all the time.
> It seems like your issue is that we don't hold all companies to those standards.
Yes, and my issue is that we don't hold engineers personally liable for negligent work.
> I don't think residential homes should be following commercial construction standards.
Sure, there are different gradations of safety standards, but often residential construction plans require sign-off by a professional engineer. In the case when an engineer negligently signs off on an unsafe plan, that engineer is liable. Should be exactly the same situation in software.