> A common pattern would be to separate pure business logic from data fetching/writing. So instead of intertwining database calls with computation, you split into three separate phases: fetch, compute, store (a tiny ETL). First you fetch all the data you need from the database, then you pass it to a (pure) function that produces some output, and finally you pass the output of the pure function to a store procedure.
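A minimal Rust sketch of that three-phase split (the types and names here are invented for illustration, not taken from anyone's codebase):

```rust
// Hypothetical types standing in for real rows/records.
struct Customer { id: u64, usage_units: u64, rate_cents: u64 }
struct Invoice { customer_id: u64, total_cents: u64 }

// Phase 1: imperative shell - fetch everything the computation needs.
fn fetch_customer(id: u64) -> Customer {
    // ...a database call would go here; hard-coded for the sketch...
    Customer { id, usage_units: 120, rate_cents: 5 }
}

// Phase 2: functional core - pure, trivially unit-testable.
fn compute_invoice(c: &Customer) -> Invoice {
    Invoice { customer_id: c.id, total_cents: c.usage_units * c.rate_cents }
}

// Phase 3: imperative shell - persist the result.
fn store_invoice(inv: &Invoice) {
    println!("would store invoice: customer {} owes {} cents", inv.customer_id, inv.total_cents);
}

fn main() {
    let customer = fetch_customer(42);
    let invoice = compute_invoice(&customer);
    store_invoice(&invoice);
}
```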
Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design? I've heard a lot about it, contrived examples make it seem like something I'd want, but I often find it's much more difficult in real-world cases.
Random example from my codebase: I have a function that periodically sends out reminders for usage-based billing customers. It pulls customer metadata, checks the customer type, and then based on that it computes their latest usage charges, and then based on that it may trigger automatic balance top-ups or subscription overage emails (again, depending on the customer type). The code feels very messy and procedural, with business logic mixed with side effects, but I'm not sure where a natural separation point would be -- there's no way to "fetch all the data" up front.
> there's no way to "fetch all the data" up front.
this is incorrect
I assume there's more nuance and complexity as to why it feels like there's no way, probably involving larger design decisions that feel difficult to unwind. But data collection, decisions, and actions can all be separated without much difficulty with some intent to do so.

I would suggest caution before implementing this directly, but imagine a subroutine whose only job is to lock some database table, read the current list of pending top-up charges required, issue the charge, update the row, and unlock the table. An entirely different subroutine wouldn't need to concern itself with anything other than data collection and calculating deltas; it has no idea whether a customer will be charged, all it does is calculate a reasonable amount. Something smart wouldn't run for deactivated/expiring accounts, but why does this need to be smart? It's not going to charge anything, it's just updating the price, which hypothetically might be used later based on data/logic that's irrelevant to the price calculation.

Once any complexity gets involved, this is closer to how I would want to implement it, because it also gives you a clear transcript of which actions happened and why. I would want to be able to inspect the metadata around each decision to make a charge.
This stuff is quite new to me as I’ve been learning F#, so take this with a pinch of salt. Some of the things you’d want are:
- a function to produce a list of customers
- a function or two to retrieve the data, which would be passed into the customer list function. This allows the customer list function to be independent of the data retrieval. This is essentially functional dependency injection
- a function to take a list of customers and return a list of effects: things that should happen
- this is where I wave my hands as I’m not sure of the plumbing. But the final part is something that takes the list of effects and does something with them
With the above you have a core that is ignorant of where its inputs come from and how its effects are achieved - it’s very much a pure domain model, with the messy interfaces with the outside world kept at the edges
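To make the last two bullets concrete, here is one possible shape, sketched in Rust rather than F# and with made-up `Effect` variants: the pure core returns descriptions of effects as data, and a thin interpreter at the edge carries them out.

```rust
// The pure core returns descriptions of side effects instead of performing them.
enum Effect {
    SendOverageEmail { customer_id: u64, amount_cents: u64 },
    TopUpBalance { customer_id: u64, amount_cents: u64 },
}

struct Customer { id: u64, balance_cents: i64, auto_top_up: bool }

// Pure: easy to test by asserting on the returned Vec.
fn decide(customers: &[Customer]) -> Vec<Effect> {
    let mut effects = Vec::new();
    for c in customers {
        if c.balance_cents < 0 {
            let owed = (-c.balance_cents) as u64;
            if c.auto_top_up {
                effects.push(Effect::TopUpBalance { customer_id: c.id, amount_cents: owed });
            } else {
                effects.push(Effect::SendOverageEmail { customer_id: c.id, amount_cents: owed });
            }
        }
    }
    effects
}

// Imperative shell: the only place that actually talks to the outside world.
fn run(effects: Vec<Effect>) {
    for e in effects {
        match e {
            Effect::SendOverageEmail { customer_id, amount_cents } =>
                println!("email customer {customer_id} about {amount_cents} cents"),
            Effect::TopUpBalance { customer_id, amount_cents } =>
                println!("charge customer {customer_id} a {amount_cents} cent top-up"),
        }
    }
}

fn main() {
    let customers = vec![Customer { id: 1, balance_cents: -300, auto_top_up: true }];
    run(decide(&customers));
}
```

The returned `Vec<Effect>` also doubles as the "transcript of which actions happened and why" mentioned further up the thread, and `decide` can be unit-tested without any database or email service.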
> How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception? Did you ever find yourself surprised when eventually it did happen? I know I did, since then I at least add some logs even if I think I'm sure that it really cannot happen.
I'm not sure what the author expects the program to do when there's an internal logic error that has no known cause and no definite recovery path. Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.
At some level, the simplest thing to do is to give up and crash if things are no longer sane. After all, there's no guarantee that 'unreachable' recovery paths won't introduce further bugs or vulnerabilities. Logging can typically be done just fine within a top-level exception handler or panic handler in many languages.
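For instance, in Rust that logging can live in one panic hook instead of in every "unreachable" branch. A minimal sketch; a real program would log somewhere more durable than stderr:

```rust
use std::panic;

fn main() {
    // One place to record the context of any "this cannot happen" crash.
    panic::set_hook(Box::new(|info| {
        eprintln!("internal error, please report this: {info}");
    }));

    let value: Option<u32> = None;
    // Crashing here is simpler and safer than inventing a recovery path.
    let _ = value.expect("value must have been initialized by now");
}
```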
Ideally, if you can convince yourself something cannot happen, you can also convince the compiler, and get rid of the branch entirely by expressing the predicate as part of the type (or a function on the type, etc.)
Language support for that varies. Rust is great, but not perfect. Typescript is surprisingly good in many cases. Enums and algebraic type systems are your friend. It'll never be 100% but it sure helps fill a lot of holes in the swiss cheese.
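As a small illustration of the parent's point, with hypothetical names: rather than a stringly-typed field plus a branch that "cannot happen", an enum makes the invalid case unrepresentable and lets the compiler check exhaustiveness.

```rust
// Before: a stringly-typed "payment method" field invites an unreachable `_ =>` branch.
// After: the enum can only hold the valid states.
enum PaymentMethod {
    Card { last4: String },
    Invoice,
}

fn describe(m: &PaymentMethod) -> String {
    // The compiler checks exhaustiveness; there is no "unknown method" branch to lie about.
    match m {
        PaymentMethod::Card { last4 } => format!("card ending in {last4}"),
        PaymentMethod::Invoice => "invoiced monthly".to_string(),
    }
}

fn main() {
    println!("{}", describe(&PaymentMethod::Card { last4: "4242".into() }));
}
```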
Because there's no such thing as a purely internal error in a well-constructed program. Every "logic error" has to bottom out in data from outside the code eventually-- otherwise it could be refactored to be static. Client input is wrong? Error the request! Config doesn't parse? Better specify defaults! Network call fails? Yeah, you should have a plan for that.
Not every piece of logic lends itself to being expressed in the type system.
Let's say you're implementing a sorting algorithm. After step X you can be certain that the values at locations A, B, and C are sorted such that A <= B <= C. You can be certain of that because you read the algorithm in a prestigious journal, or better, you read it in Knuth and you know someone else would have caught the bug if it was there. You're a diligent reader and you've convinced yourself of its correctness, working through it with pencil and paper. Still, even Knuth has bugs and perhaps you made a mistake in your implementation. It's nice to add an assertion that at the very least reminds readers of the invariant.
Perhaps some Haskeller will pipe up and tell me that any type system worth using can comfortably describe this PartiallySortedList<A, B, C>. But most people have to use systems where encoding that in the type system would, at best, make the code significantly less expressive.
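In practice the compromise can be as simple as a debug assertion. A sketch, where `partial_sort_step` is a made-up stand-in for "step X":

```rust
fn partial_sort_step(v: &mut [i32], a: usize, b: usize, c: usize) {
    // ...the actual algorithm from the paper/Knuth would go here...
    v.sort(); // stand-in so the invariant holds in this sketch

    // Not provable in this type system, but cheap to state and check in debug builds.
    debug_assert!(
        v[a] <= v[b] && v[b] <= v[c],
        "invariant from step X violated: expected v[{a}] <= v[{b}] <= v[{c}]"
    );
}

fn main() {
    let mut v = vec![3, 1, 2];
    partial_sort_step(&mut v, 0, 1, 2);
    println!("{v:?}");
}
```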
Yes, this has been my experience too! Another tool in the toolbox is property / fuzz testing. Especially for data structures, and anything that looks like a state machine. My typical setup is this:
1. Make a list of invariants. (Eg if Foo is set, bar + zot must be less than 10)
2. Make a check() function which validates all the invariants you can think of. It’s ok if this function is slow.
3. Make a function which takes in a random seed. It initializes your object and then, in a loop, calls random mutation functions (using a seeded RNG) and then calls check(). 100 iterations is usually a good number.
4. Call this in an outer loop, trying lots of seeds.
5. If anything fails, print out the failing seed number and crash. This provides a reproducible test so you can go in and figure out what went wrong.
If I had a penny for every bug I’ve found doing this, I’d be a rich man. It’s a wildly effective technique.
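A bare-bones version of that loop in Rust, with a toy data structure and a hand-rolled seeded RNG so the sketch has no dependencies (real code would more likely reach for the `rand` crate or a property-testing library such as proptest):

```rust
// Toy object under test: a counter that must never go negative (the invariant).
struct Counter { value: i64 }

impl Counter {
    fn new() -> Self { Counter { value: 0 } }
    fn increment(&mut self) { self.value += 1; }
    fn decrement(&mut self) { if self.value > 0 { self.value -= 1; } }
    // Step 2: one slow function that checks every invariant we can think of.
    fn check(&self) { assert!(self.value >= 0, "invariant violated: {}", self.value); }
}

// Minimal linear congruential generator so each run is reproducible from a seed.
fn next(state: &mut u64) -> u64 {
    *state = state.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
    *state >> 33
}

// Step 3: apply 100 random mutations, checking the invariants after each one.
fn run_one_seed(seed: u64) {
    let mut rng = seed;
    let mut counter = Counter::new();
    for _ in 0..100 {
        match next(&mut rng) % 2 {
            0 => counter.increment(),
            _ => counter.decrement(),
        }
        counter.check();
    }
}

fn main() {
    // Steps 4-5: outer loop over seeds; a failure reports the seed for replay.
    for seed in 0..1000u64 {
        if std::panic::catch_unwind(|| run_one_seed(seed)).is_err() {
            eprintln!("invariant failed for seed {seed}");
            std::process::exit(1);
        }
    }
    println!("all seeds passed");
}
```

Libraries like proptest or quickcheck add shrinking on top of this, so the failing case gets smaller as well as reproducible.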
> At some level, the simplest thing to do is to give up and crash if things are no longer sane.
The problem with this attitude (that many of my co-workers espouse) is that it can have serious consequences for both the user and your business.
- The user may have unsaved data
- Your software may gain a reputation of being crash-prone
If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.
A comment "this CANNOT happen" has no value by itself. Unless you've formally verified the code (including its dependencies) and have the proof linked, such comments may as well be wishes and prayers.

Yes, sometimes the compiler or the hardware has bugs that violate the premises you're operating on, but that's rare. But most non-pure algorithms (side effects and external systems) have documented failure cases.
> A comment "this CANNOT happen" has no value on itself.
I think it does have some value: it makes clear an assumption the programmer made. I always appreciate it when I encounter comments that clarify assumptions made.
`assert(false)` is pronounced "this can never happen." It's reasonable to add a comment with /why/ this can never happen, but if that's all the comment would have said, a message adds no value.
Oh I agree, literally `assert(false, "This cannot happen")` is useless, but ensuring a message is always there encourages something more like `assert(false, "This implies the Foo is Barred, but we have the Qux to make sure it never is")`.
Ensuring a message encourages people to state the assumptions that are violated, rather than just asserting that their assumptions (which?) don't hold.
In Rust you have a few options:

- assert!() (always checked)
- debug_assert!() (only run in debug builds)
- unreachable!() (panics)
- unsafe unreachable_unchecked() (tells the compiler it can optimise assuming this is actually unreachable)
- if cfg!(debug_assertions) { … } (turns into if(0){…} in release mode; there's also a macro variant if you need debug code to be compiled out)

This way you can decide on a case by case basis which of your asserts are worth keeping in release mode.

And it's worth noting, sometimes a well placed assert before the start of a loop can improve performance thanks to LLVM.

debug_assert!() (and its equivalent in other languages, like C's assert with NDEBUG) is cursed. It states that you believe something to be true, but will take no automatic action if it is false; so you must implement the fallback behavior for when your assumption is false manually (even if that fallback is just fallthrough). But you can't /test/ that fallback behavior in debug builds, which means you now need to run your test suite(s) against both debug and release builds. While this is arguably a good habit anyway (although not as good a habit as just not having separate debug and release builds), deliberately diverging behavior between the two, and having tests that only work on one or the other, is pretty awful.
Importantly, specifying reasoning can have communicative value while falling very far short of formal verification. Personally, I also try to include a cross reference to the things that could allow "this" to happen were they to change.
Do you not make such a tacit assumption every time you index into an array (which in almost all languages throws an exception on bounds failure)? You always have to make assumptions that things stay consistent from one statement to the next, at least locally. Unless you use formal verification, but hardly anyone has the time and resources for that.
If such an error happens, that would be a compiler bug. Why? Because I usually do checks against the length of the array or have it done as part of the standard functions like `map`. I don't write such assumptions unless I'm really sure about the statements, and even then I don't.
> or have it done as part of the standard functions like `map`.
Which are all well and good when they are applicable, which is not always 100% of the time.
> Because I usually do checks against the length of the array
And what do you have your code do if such "checks" fail? Throw an assertion error? Which is my whole point, I'm advocating in favor of sanity-check exceptions.
Or does calling them "checks" instead of "assumptions" magically make them less brittle from surrounding code changes?

Keep two copies or three, like RAID?

Edit: ECC RAM helps for sure, but what else?

`if array.Len > 2 { X = Y[1] }`

For every CRUD to that array? That seems... not ideal.
A comment has no semantic value to the code. Having code that checks for stuff is different from writing comments: the checks are executed by the machine, not just read by other humans.
False, it has value. It's actually even better to log it or throw an exception: print("this cannot happen.")

If you see it, you immediately know the class of error: it's purely a logic error, the programmer made a programming mistake. Logging it makes it explicit that your program has a logic bug.
What if you didn’t log it? Then at runtime you will have to deduce the error from symptoms. The log tells you explicitly what the error is.
Worse: you may have created the proof. You may have linked to the proof. But if anyone has touched any of the code involved since then, it still has no value unless someone has re-done the proof and linked that. (Worse, it has negative value, because it can mislead.)
Git blame will show the commit and the date for each line. It's easy to verify whether the snippet has changed since the comment. I use Emacs and its built-in vc package, which color-codes each block.
You should prefer to write unreachable!("because ...") to explain to some future maintenance engineer (maybe yourself) why you believed this would never be reached. Since they know it was reached they can compare what you believed against their observed facts and likely make better decisions.
But at least telling people that the programmer believed this could never happen short-circuits their investigation considerably.
Heh, recently I had to fix a bug in some code that had one of these comments. Feels like a sign of bad code or laziness. Why make a path that should not happen? I can get it when it's on some while loop that should find something to return, but on an if/else sequence it feels really wrong.
Strong disagree about laziness. If the dev is lazy they will not make a path for it. When they are not lazy they actually make a path and write a comment explaining why they think this is unreachable. Taking the time to write a comment is not a sign of laziness. It’s the complete opposite. You can debate whether the comment is detailed enough to convey why the dev thinks it’s unreachable, but it’s infinitely better than no comment and leaving the unreachability in their head.

My code is peppered with `assert(0)` for cases that should never happen. When they trip, then I figure out why it happened and fix it.

This is basic programming technique.
I really like modern Swift. It makes a lot of what this author is complaining about impossible.
The worst file I ever inherited to work on was the ObjC class for Instagram’s User Profile page. It looked like it’d been written by a JavaScript fan. There were no types in the whole file, everything was an ‘id’ (aka void*) and there were ‘isKindOfClass’ and null checks all over the place. I wanted to quit when I saw it. (I soon did).
Modern Swift makes this technically possible, but so cluttered that it's effectively impossible, especially compared with TypeScript.
Swift distinguishes between inclusive and exclusive/exhaustive unions with enums vs. protocols and provides no easy or simple way to bridge between the two. If you want to define something that TypeScript provides as easily as the vertical bar, you have to write an enum definition, a protocol bridge with a type identifier, a necessarily unchecked cast back (even if you can logically prove that the type enum has a 1:1 mapping), and loads of unnecessary forwarding code. You can try to elide some of it with (IIRC, it's been a couple of years) @dynamicMemberLookup, but the compiler often chokes on this, it kills autocomplete, and it explodes compile times because Swift's type checker degrades to exponential far more frequently than other languages', especially when used in practice, such as in SwiftUI.
I think you’re conflating 'conciseness' with 'correctness.' The 'clutter' you're describing in Swift, like having to explicitly define an enum instead of using a vertical bar |, is exactly what makes it more robust than TS for large-scale systems.
In TypeScript a union like string | number is structural and convenient, but it lacks semantic meaning. In Swift, by defining an enum, you give those states a name and a purpose. This forces you to handle cases exhaustively and intentionally. When you're dealing with a massive codebase, 'easy' type bridging is often how you end up back in 'id' or 'any' hell. Swift’s compiler yelling at you is usually it trying to tell you that your logic is too ambiguous to be safely compiled, which, in a safety-first language, is the compiler doing its job.
This is just cope. Swift's compiler doesn't choke because your logic is too ambiguous; it chokes because it's simply too slow at inferring types in certain cases, including the very common case where you write a normal SwiftUI view. There is nothing ambiguous about that.
Secondly, I'm not sure why you think the mountains of boilerplate needed to replace | lead to semantic meaning. The type identifier itself is sufficient.
When I tried to learn some Swift to put together a little app, every search result for my questions was a quick blog post seemingly aimed at iOS devs who didn’t want to learn and just wanted to copy-paste the answer - usually in the form of an extension method.
Typing is great, presuming that the developer did a thorough job of defining their type system. If they get the model wrong, or it is incomplete, then you aren't really gaining much out of a strictly typed language. Every change is a fight. You are likely to hack the model to make the code compile. There is a reason that Rust is most successful at low-level code: that is where the models are concrete and simple to create. As you move up the stack, complexity increases and the ability to create a coherent model goes beyond human abilities. That's why coding isn't math or religion. Different languages and approaches for different domains.
Great read. C# has the concept of nullable reference types[1], which requires you to be explicit about whether a variable can be null, and the compiler is aware of this. I would love to see a similar feature in languages like TypeScript and Go.

[1]: https://learn.microsoft.com/en-us/dotnet/csharp/nullable-ref...
> Rust makes it possible to safely manage memory without using a garbage collector, probably one of the biggest pain points of using low-level languages like C and C++. It boils down to the fact that many of the common memory issues that we can experience, things like dangling pointers, double freeing memory, and data races, all stem from the same thing: uncontrolled sharing of mutable state.
Minor nit: this should be mutable state and lifetimes. I worked with Rust for two years before recently working with Zig, and I have to say opt-in explicit lifetimes without XOR mutability requirements would be a nice combo.
The "lies" described here are essentially the definition of weakly typed programming, even in statically typed languages.
Functional languages like ML, Haskell, and Lisp dialects have had no such lies built in for decades, and it's good to see mainstream programming (Java, TS, C++, etc.) catching up as well.
There are also cute benefits to having strong schemas for your API -- for example, that endpoint becomes an MCP for LLMs automatically.

Zig is one. For that matter, standard C has no exceptions.
The whole article gives a generated vibe, but I did want to point out this particular snippet
> The compiler is always angry. It's always yelling at us for no good reason. It's only happy when we surrender to it and do what it tells us to do. Why do we agree to such an abusive relationship?
Programming languages are a formal notation for the execution steps of a computing machine. A formal system is always built around rules, and not following the rules is an error, in this case a malformed statement/expression. It's like writing: afjdla lkwcn oqbcn. Yes, they are characters, but they're not English words.
Apart from the syntax, which is a formal system on its own, the compiler may have additional rules (like a type system). And you can add even more rules with a static analysis tool (linter). Even though there may be false positives, failing one of those usually means that what you wrote is meaningless in some way. It may run, but it can have unexpected behavior.
Natural language has a lot of tolerance for ambiguous statements (which people may not be aware of if they share the same metaphor set). But a computer has none. You either follow the rules or you do not, and you have an error.

The guard rails aren't abusing you, they're helping you. They aren't "angry", they're just constraints.
Right, and I suspect that was the author's intent - to evoke a sympathetic frustration that newer programmers might feel, and then to point out how the frustration is ill-aimed.
> Rust makes it possible to safely manage memory without using a garbage collector, probably one of the biggest pain points of using low-level languages like C and C++.
In C++, memory management has not been a pain point for many years, and you basically don't need to do it at all if you don't want to. The standard library takes care of it well enough - with owning containers and smart pointers.
> And Rust is famous for its optimizations in the style of "zero cost abstractions".
No, it isn't that famous for those. The safety and no-UB constraints prevent a lot of that.
By the way, C++, which is more famous for them, still struggles in some cases. For example, ABI restrictions prevent passing unique_ptr's via single registers, see: https://stackoverflow.com/q/58339165/1593077
For another perspective on "lying to the compiler," I enjoyed the section on Loopholes in Niklaus Wirth's "Good Ideas, Through the Looking Glass"[1]. An excerpt:
> Experience showed that normal users will not shy away from using the loophole, but rather enthusiastically grab on to it as a wonderful feature that they use wherever possible. This is particularly so if manuals caution against its use.

> [...]

> The presence of a loophole facility usually points to a deficiency in the language proper, revealing that certain things could not be expressed.
Wirth's use of loophole most closely aligns with the unchecked casts that the article uses. I don't think exceptions amount to lying to the compiler. They amount more to assuming for sake of contradiction, which is not quite lying (e.g., AFSOC is a valid proof technique, but proofs can be wrong). Null as a form of lying is not the fault of the programmer, that's more the fault of the language, so again doesn't feel like lying.

[1] https://people.inf.ethz.ch/wirth/Articles/GoodIdeas.pdf
I'm not a fan of the recent trend in software development, started by the OOP craze but in the modern day largely driven by Rust advocates, of noun-based programming, where type hierarchies are the primary interface between the programmer and the code, rather than the data or the instructions. It's just so... dogmatic. Inexpressive. It ultimately feels to me like a barrier between intention and reality, another abstraction. The type system is the program, rather than the program being the program. But speaking of dogma, the author's insistence that not abiding by this noun-based programming model is a form of 'lying' is quite the accusatory stretch of language... but I digress at the notion that I might just be a hit dog hollering.
The kind of noun-based programming you don’t like is great for large teams and large code bases where there is an inherent communication barrier based on the number of people involved. (N choose 2 = N*(N-1)/2 so it grows quadratically.) Type hierarchies need to be the primary interface between the programmers and the code because it communicates invariants on the data more precisely than words. It is dogmatic, because that’s the only way it could work for large teams.
When you are the only programmer, this matters way less. Just do whatever based on your personal taste.
That sounds eerily similar to the "OOP is for large teams" defence which is simply not true.
On the contrary, this noun-based programming explodes with complexity on large teams. Yes, interfaces are obviously important, but when every single thing is its own type and you try to solve problems with the type system leading to a combinatoric explosion of types and their interactions, what do you think happens when you scale the team up?
Agreed. It's often accompanied by the dogma "make invalid states unrepresentable", which sounds good until you start trying to encode into the type system that foo.bar is 1-42 unless foo.baz is above 10, in which case foo.bar can be -42 to -1 instead, but if foo.omfg is prefixed with "wtf" then foo.baz needs to be above 20 for its modifiers to kick in.
Yeah, good luck doing that in the type system in a way that is maintainable, open to modification, and scales with complexity.
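For what it's worth, the usual fallback when a rule resists type-level encoding is a validating constructor: check the invariant once at the boundary and let the rest of the code trust the value. A hedged Rust sketch, using one plausible reading of the made-up foo.bar/foo.baz/foo.omfg rules above:

```rust
// The conditional rules live in one place, checked at construction time,
// instead of being encoded in the type system.
#[derive(Debug)]
struct Foo { bar: i32, baz: i32, omfg: String }

impl Foo {
    fn new(bar: i32, baz: i32, omfg: String) -> Result<Foo, String> {
        // bar is 1..=42 normally, but -42..=-1 once baz goes above 10.
        let bar_range = if baz > 10 { -42..=-1 } else { 1..=42 };
        if !bar_range.contains(&bar) {
            return Err(format!("bar={bar} out of range for baz={baz}"));
        }
        // One reading of the "wtf" rule: it only makes sense when baz is above 20.
        if omfg.starts_with("wtf") && baz <= 20 {
            return Err("wtf-prefixed omfg requires baz above 20".to_string());
        }
        Ok(Foo { bar, baz, omfg })
    }
}

fn main() {
    println!("{:?}", Foo::new(7, 5, "ok".into()));      // Ok(...)
    println!("{:?}", Foo::new(7, 15, "ok".into()));     // Err: bar must be negative here
    println!("{:?}", Foo::new(-3, 15, "wtf?!".into())); // Err: baz too low for "wtf"
}
```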
This was a great breakdown and very well written. I think you made one of the better arguments for Rust I've read on the internet, but you also made sure to acknowledge that large code bases are just a different beast altogether. Personally, I will say that AI has made writing code proofs or "formal verification" more accessible. Actually writing a proof for your code or code verification is very hard to do for most programmers, which is why it is not done by most programmers, but AI is making it accessible, and with formal verification of code you prevent so many problems. It will be interesting to see where programming and compilers go when "formal verification" becomes normal.
Hello Baader-Meinhof, my old friend - while I’m familiar with the convention, I was just formally introduced to the phrase “functional core, imperative shell” the other day, and now here it is again.
“Learn to stop worrying and love the bomb” was definitely a process I had to go through moving from JavaScript to TypeScript, but I do mostly agree with the author here wrt convention. Some things, like using type names as additional levels of context - UserUUID and ItemUUID each alias UUID, which in turn is just an alias for String - have occurred to me naturally, even.
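The closest Rust analogue, with hypothetical names, is the newtype pattern, which goes a step further than aliases: the IDs are still strings at runtime, but mixing them up becomes a compile error.

```rust
// Newtype wrappers: the same String underneath, but the compiler refuses to mix them up.
struct UserUuid(String);
struct ItemUuid(String);

fn charge(user: &UserUuid, item: &ItemUuid) {
    println!("charging user {} for item {}", user.0, item.0);
}

fn main() {
    let user = UserUuid("u-123".to_string());
    let item = ItemUuid("i-456".to_string());
    charge(&user, &item);
    // charge(&item, &user); // does not compile: mismatched types
}
```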
So his view is from a programmer / developer. That's fine.
I had an issue on my local computer system yesterday; manjaro would not boot with a new kernel I compiled from source. It would freeze at the boot menu, which I never had before. Anyway. I installed linuxmint today and went on to actually compile a multitude of things from source. I finally finished compiling mesa, xorg-server, ffmpeg, mpv, gtk3 + gtk4 - and the prior dependencies (llvm etc...). So I am almost finished finally.
I had to invest quite a lot of time hunting for dependencies. The most recent one was glad2 for libplacebo. Turns out "pip install glad2" suffices here. But getting that wasn't so trivial. The project page on the pip website was virtually useless; so I installed "pip install glad" instead, which was too old. It also took me perhaps one full minute or more to realise it.
I am tapping into the LFS and BLFS webpages (Linux From Scratch), which help a lot, but they are not perfect. So much information is not described and people have to know what they are doing. You can say this is fair, as this is more for advanced users. Ok. The problem is... so many of the things that compilers do are not well-described, or at the least you cannot easily find high quality documentation. Google search is almost useless now; AI just hallucinates and flat out lies to you often, or tells you things that are trivia and you already know. We kind of lose quality here. It's as if everything got dumbed down.
Meanwhile more and more software is required to build other software. Take mesa. Now I need not only LLVM but also the whole SPIR-V stack. And shaderc. And lots more. And also Rust - why is Rust suddenly such a huge dependency? Why is there such a proliferation of programming languages? Ok, perhaps C and C++ are no longer the best languages, but WHY is the whole stack constantly expanding?
We worship complexity. The compilers also become bigger and bigger.
About two days ago I cloned gcc from https://github.com/gcc-mirror/gcc. The .tar.xz sits at 3.8 GB. Granted, regular tarball releases are much smaller, e.g. the 15.1.0 tar.xz at 97 MB (at https://ftp.gnu.org/gnu/gcc/?C=M;O=D). But still. These things become bigger and bigger. gcc-7.2.0.tar.xz from 9 years ago had a size of 59 MB - almost twice the size now in less than 10 years. And that's really just like all the other software too. We ended up worshipping more and more bloat. Nobody cares about size. Now one can say "this is just static code", but this is expanded and it just keeps on getting bigger. Look at LLVM. How to compile this beast: https://www.linuxfromscratch.org/blfs/view/svn/general/llvm.... - and this will only get bigger and bigger and bigger.
So, back to the question: are compilers your best friend? I am not sure. We seem to have the problem of more and more complexity getting in at the same time. And everyone seems to think this is no issue. I believe there are issues. Take Slackware; basically it was maintained by one person. This may not be the primary reason, but Slackware slowed down a lot in the last few years. Perhaps maintaining all of that requires a team of people. Older engineers cared about size due to constraints. Now that the constraints are less important, bloat became the default.