Writing your own C++ standard library from scratch

(nibblestew.blogspot.com)

136 points | by JNRowe 14 hours ago

16 comments

bregma 11 hours ago
He's not writing the C++ standard library from scratch. He's writing his own library, in a different namespace, with some similar functionality. It's easy to write a non-standard library that only satisfies your limited subset of needs. People do it all the time. It's not special.
The ABI stability boast is based on having no legacy to support. It will work fine as long as everything is shipped only as code and the whole world needs to be rebuilt from scratch every time and even then, you never change any code ever even for a major bugfix. That's hardly practical in the real world where one tiny misstep on the ABI front can result in billion-dollar multinationals threatening suit (ask me how I know). It's a facile claim.
The C++ standard library hasn't been known as "the STL" for almost 30 years, ever since part of the STL was modified and adopted into the C++ standard library. Most of the features he's providing implementations for were never a part of the STL (file I/O, strings, hash maps, UTF-8).
I maintain an implementation of the C++ standard library for a living. It's a full-time job. It's a huge library (note to the committee: please stop) and it's really easy to mess something up. But if you want to write your own library that doesn't do what the standard library does or meet any of its requirements and implementation constraints or serve its real-world purpose, go right ahead. Just don't claim you're writing your own C++ standard library. You're not.
[-]
- munificent 2 hours ago
  This is an unnecessarily combative comment.
  The author doesn't claim to be implementing the C++ standard library. They clearly say they are implementing a C++ standard library.
  It's obvious from the context that they mean a hobby-scale set of basic datatype and algorithm libraries. It would take an uncharitable reading to not realize that they mean a lowercase "standard library", not "conforming implementation of the ISO C++ Standard Library".
  The article literally says "It's my time, and I'll waste it if I want to!" and uses "pystd" as the namespace.
- qznc 9 hours ago
  How do you know about billion-dollar multinationals threatening suit?
- bsoles 1 hour ago
  > I maintain an implementation of the C++ standard library for a living.
  I have always wondered: does the standard library have a huge test suite? For that matter, how is the language implementation itself tested against regressions, etc. If anybody has some knowledge about this...
- serial_dev 5 hours ago
  I mean, I think you have unreasonable expectations for a 1K word blog post. Or maybe I have very low standards and no matter how disappointing a post is, I accept it... could be. However, if I read a blog post on "Rewrite React from Scratch", I'm expecting to see some reactivity and that's it.
  As a reader, once I saw that I barely needed to scroll to reach the end of the article (and the repo isn't very big either), I immediately understood that they aren't rewriting the C++ standard library from scratch. Of course they can't give all the answers on how to maintain backwards compatibility with decades of legacy stuff, probably billions of devices, and exotic use cases.
- jb1991 11 hours ago
  I agree, and every time I see someone refer to the standard library as the STL, I know I’m interacting with someone who doesn’t actually know the language very well.
  [-]
  - dataflow 7 hours ago
    > I agree, and every time I see someone refer to the standard library as the STL, I know I’m interacting with someone who doesn’t actually know the language very well.
    I would put away the judgment. I and many others still call it the STL despite knowing the language and history of the term. Because that ship sailed and it's what many people call it nowadays. Because writing out "the standard library" every damn time gets really tiring really fast.
    It works in the reverse direction too, btw. The people who constantly nitpick on this are usually the ones who are more focused on being pedantic than being helpful. So that's the signal you send when you do that. Ask me how I know.
  - badsectoracula 9 hours ago
    Pretty much every C++ developer i've ever interacted with over the last 25 years -informally- refers to the standard library as 'STL'.
    I do not think anyone with even a small amount of C++ experience will be confused when 'STL' is referenced in the context of C++.
    Of course it might be that every C++ developer i've interacted with doesn't really know the language very well. But considering the popular axiom that says something along the lines of 'people who claim they know C++ do not really know C++ while people who know C++ do not claim they know C++' what you wrote might actually be true :-P.
    [-]
    - pton_xd 6 hours ago
      Yeah but, it's one thing to informally refer to it as the STL; people will know what you're talking about. It's another to write an article about the standard library and say "The C++ standard library (also known as the STL)," which is a false statement and implies the author doesn't know what they're talking about. That's what the parent is referring to, I think. Personally no one I know has even informally referred to it as the STL since at least C++11, so it's a bit jarring to read.
      [-]
      - tom_ 5 hours ago
        If it's a thing people call it (and they do) then surely there's no better qualification for that being something it's also known as.
        [-]
        jb1991 4 hours ago
        It’s not a thing that anyone I know who works professionally in the field calls it. Not for at least 10 years now. Some hobbyists do, though.
        [-]
        Maxatar 3 hours ago
        One of the main contributors to the C++ standard library refers to it that way. I can also confirm that many members of the C++ Committee refer to it that way as well.
        https://old.reddit.com/r/cpp/comments/c90sxa/whats_the_diffe...
        [-]
        jb1991 1 hour ago
        That was half a decade ago. I’d be curious if they still feel that way.
        tom_ 2 hours ago
        Well there you go. Just as you say, it is also known as the STL.
    - ryandrake 7 hours ago
      Well, a lot of people in general use words incorrectly when speaking informally, that doesn't make that usage correct. Irregardless, for all intensive purposes, I could care less how people say STL.
      [-]
      - messe 7 hours ago
        > that doesn't make that usage correct
        Eventually it does. That's how language evolves. Prescriptivism is pointless pedantry.
        [-]
        ryandrake 6 hours ago
        Yes, language evolves over long periods of time, but not all incorrect spelling and usage is language evolving in front of our eyes. Sometimes, it's just poor English knowledge.
        EnergyAmy 6 hours ago
        I think you missed that the OP was demonstrating that statement false in the latter portion of their comment.
      - wholinator2 6 hours ago
        Also, pedantry warning, the phrase is "intents and purposes", not "intensive purposes". Also "could care less" implies you do care, "couldn't care less" implies you don't.
        [-]
        pton_xd 5 hours ago
        Well that's his point. But in the context of this discussion, if he's writing a post about English idioms and expressions and writes like that, no one is going to take his opinion seriously.
      - billforsternz 2 hours ago
        I see what you did there, well played sir.
  - Longhanks 10 hours ago
    Yet Microsoft's own implementation was open sourced in 2019 in the repo "microsoft/STL", and the second line of the readme says the C++ standard library is also known as the STL. The readme then continues to use the term STL to refer to the C++ standard library.
    (https://github.com/microsoft/STL, https://github.com/microsoft/STL/commit/219514876ea86491de19..., https://github.com/microsoft/STL/blame/main/README.md#L3)
    [-]
    - electroly 6 hours ago
      From https://learn.microsoft.com/en-us/cpp/standard-library/cpp-s... --
      "Microsoft's implementation of the C++ Standard Library is often referred to as the STL or Standard Template Library. Although C++ Standard Library is the official name of the library as defined in ISO 14882, due to the popular use of "STL" and "Standard Template Library" in search engines, we occasionally use those names to make it easier to find our documentation. From a historical perspective, "STL" originally referred to the Standard Template Library written by Alexander Stepanov. Parts of that library were standardized in the C++ Standard Library, along with the ISO C runtime library, parts of the Boost library, and other functionality. Sometimes "STL" is also used to refer to the containers and algorithms parts of the C++ Standard Library adapted from Stepanov's STL. In this documentation, Standard Template Library (STL) refers to the C++ Standard Library as a whole."
    - zabzonk 10 hours ago
      > the readme continues to use the term STL to refer to the C++ standard library.
      Would not be the first time Microsoft were wrong about a standard.
    - germandiago 10 hours ago
      STL is actually a subset of the full C++ standard library, which includes, for example, C headers.
      The STL is the algorithms + data structures + utilities, which are templates.
      [-]
      - Longhanks 10 hours ago
        From a dogmatic point of view, this might've been correct at some point in time. But, as the links from above clearly point out, most people use "STL" and "C++ standard library" interchangeably (including the very maintainers of one of the most populous C++ standard library implementations), without excluding certain parts of one or the other.
    - jb1991 4 hours ago
      I certainly don’t think Microsoft can be used as a barometer for what is considered accurate or standardized with C++.
    - caspper69 8 hours ago
      Interesting that their implementation is Apache 2.0 licensed, yet includes exceptions for LLVM and for GPLv2 licensed code/projects wrt patents.
      Does anyone know if the library's quality is on par with the GNU or Clang libraries? Google has their own too, if memory serves. Is there an implementation deemed "the best"?
      [-]
      - Longhanks 6 hours ago
        The license was explicitly chosen to enable code sharing with LLVM's libc++ (https://devblogs.microsoft.com/cppblog/open-sourcing-msvcs-s...).
        The MSVC STL's quality is good enough for thousands of pieces of Windows software (including Windows itself & Microsoft's software such as Office) to depend and rely on. It delivers excellent performance for a broad range of use cases. It is actively developed in the open, delivering cutting-edge (C++23 & C++26) features, accepting Pull Requests and wonderfully documented on GitHub. It can be consumed using MSVC and LLVM clang-cl (which the MSVC STL maintainers test with CI infrastructure). The maintainers are actively working on "hardening" features to enable more secure C++ (https://github.com/microsoft/STL/wiki/STL-Hardening).
        Unless you specify what "best" or "a library's quality" means to you, MSVC STL is excellent and because of that, the default choice on & for Windows.
        Google chooses to only support libc++ for Chrome/Chromium (https://chromium.googlesource.com/chromium/src/+/main/docs/t...). libc++ is not a Google-owned project.
  - pjmlp 9 hours ago
    I have used C++ since 1993 and still refer to the standard library as STL, as do many WG21 members in many of their conference talks. Do they not know C++ very well?
    It is a matter of habit and none of us are going to change, only because some folks think otherwise on the Internet.
  - whobre 10 hours ago
    Meh - many simply use STL for the STandard Library. Few people even remember Stepanov’s Standard Template Library.
    [-]
    - werdnapk 8 hours ago
      So I haven't been a C++ programmer for almost 20 years now (I would have considered myself very experienced with it at the time) and STL was the Standard Template Library for me back then. Is the Standard Template Library no longer in (widespread) use? Would any modern C++ programmer use it anymore?
      [-]
      - stonemetal12 6 hours ago
        The STL was standardized and made part of the standard library in 1998. So no it hasn't been a thing in more than 20 years.
        It was around 2003 when I noticed people start to take issue with calling the standard library templates the STL. IDK why. Used, yes; referred to as the STL, not unless you like hostility from the language lawyers on Stack Overflow.
      - injidup 7 hours ago
        A modern c++ programmer uses Rust so no.
    - pjmlp 9 hours ago
      The SGI documentation for it was quite nice though.
leni536 12 hours ago
The section about the "perfect ABI stability" is rather naive. If you have a 3rd party library that exposes a class like this in a header:
```
  class SomePublicClass {
    pystd::HashMap<pystd::U8String, size_t> member;
    /*...*/
  };
```
and distribute that 3rd party library as headers plus a binary compiled against a particular pystd version, then that build is tied to one particular "epoch" or version of pystd. You can't safely link that library against a program that uses a different "epoch" of pystd.
It's not a new idea either. libc++ puts everything inside an inline namespace `std::__1`. There is a reason that they never bumped that.
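For readers who haven't seen the trick, here is a minimal sketch of inline-namespace versioning. `mylib`, `v1`, `v2`, `Widget` and `abi_tag` are made-up names for illustration, not libc++ internals. The only point is that user code writes `mylib::Widget` while the entity (and its mangled name) actually lives in `mylib::v1`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace mylib {

// Current ABI version: inline, so mylib::Widget names mylib::v1::Widget.
inline namespace v1 {
struct Widget { int id; };
inline const char* abi_tag() { return "v1"; }
}  // namespace v1

// Hypothetical next ABI version: a distinct type with distinct symbol names.
namespace v2 {
struct Widget { int id; int flags; };
inline const char* abi_tag() { return "v2"; }
}  // namespace v2

}  // namespace mylib

// Code written against mylib::Widget is really using the v1 type...
static_assert(std::is_same_v<mylib::Widget, mylib::v1::Widget>);
// ...and a v2 Widget is a different type, so the two cannot be mixed silently.
static_assert(!std::is_same_v<mylib::v1::Widget, mylib::v2::Widget>);
```

Moving the `inline` keyword from `v1` to `v2` would change what every user of `mylib::Widget` refers to, so binaries built before the switch would no longer agree with the new headers.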
[-]
- yig 4 hours ago
  I think you may have misunderstood the proposal. Your 3rd party library example would have to write `pystd2025::HashMap<pystd2025::U8String, size_t> member;`. Isn't that stable?
  From the post:
  > The sample code above used the pystd namespace. It does not actually exist. Instead it is defined like this in the cpp file:
```
    #include <pystd2025.hpp>
    namespace pystd = pystd2025;
```
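  A rough sketch of how the alias plays out, in case it helps. Only the `pystd2025` naming scheme and the alias line come from the post; the struct contents and the `pystd2026` epoch are invented here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-ins for two pystd epochs (contents are hypothetical).
namespace pystd2025 {
struct U8String { const char* bytes; };
}
namespace pystd2026 {
struct U8String { const char* bytes; std::size_t length; };
}

// What each .cpp file does to pick its epoch.
namespace pystd = pystd2025;

// A third-party header exposing a pystd type in a public class.
struct SomePublicClass {
    pystd::U8String member;
};

// The alias is resolved at compile time: the member really is the 2025 type.
static_assert(std::is_same_v<decltype(SomePublicClass::member), pystd2025::U8String>);
// A different epoch is a different, unrelated type.
static_assert(!std::is_same_v<pystd2025::U8String, pystd2026::U8String>);
```

  So both readings seem to hold: the compiled library stays valid because `pystd2025` never changes, but a program built on another epoch sees a different type and has to convert at the boundary.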
- SpaceManNabs 41 minutes ago
  The best way to know something well is to confidently do a project like this incorrectly so that commentators correct you :)
- londons_explore 12 hours ago
  Just like there is a dynamic linker that can relocate code and fix up addresses, there should be a "dynamic class-sizer" which can recognise that something is being linked against a different version of some library, but that the fields used are all still present even if the structures have changed size, and dynamically adjust all pointers into the class to match the new size.
  [-]
  - leni536 11 hours ago
    Stable ABI implies stable size, but not the other way around. If I have a vector class that is a pointer and two sizes and change that to a vector class that is three pointers then I didn't change the size, but I broke ABI.
    Any change to the value representation of a class is an ABI break. A change that also changes size is just an obvious one. And value representation is an abstraction which is determined by the semantics of member functions, not something a linker can easily have access to.
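    A small sketch of the vector example, with made-up names (`VecA`, `VecB`, `size_of`). On common platforms where `std::size_t` and pointers have the same width, both structs have the same size, yet the second and third words mean different things.

```cpp
#include <cassert>
#include <cstddef>

// Layout A: pointer plus two counts.
struct VecA { int* data; std::size_t size; std::size_t capacity; };
// Layout B: three pointers.
struct VecB { int* first; int* last; int* end_of_storage; };

// Each layout needs its own member-function semantics to answer "how many?".
inline std::size_t size_of(const VecA& v) { return v.size; }
inline std::size_t size_of(const VecB& v) {
    return static_cast<std::size_t>(v.last - v.first);
}

inline VecA make_a(int* buf, std::size_t n, std::size_t cap) {
    return {buf, n, cap};
}
inline VecB make_b(int* buf, std::size_t n, std::size_t cap) {
    return {buf, buf + n, buf + cap};
}
```

    Code compiled against layout A that is handed a layout B object would read an address where it expects a count, and nothing in the object's size gives a linker a hint that this happened.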
    [-]
    - quietbritishjim 9 hours ago
      I think you're probably right, but maybe a bit too dismissive of the thought experiment.
      > value representation is an abstraction which is determined by the semantics of member functions, not something a linker can easily have access to
      This is the real problem. The GP's hypothetical extended linker could work even with semantic changes to the meaning of member variables, like in your sizes to pointers example, so long as all member functions are dynamically obtained from the shared library for that class (and no member variables are publicly exposed for use by application code). That means disabling inlining, which is a problem for templated code. Where does the machine code for std::vector<MyClass>::begin() go when MyClass is by definition unknown at the point when we're compiling the standard library? Even an exhaustive set of implementations for those known at that time isn't feasible (e.g. should the library contain code for vector<vector<int>>::begin()? What about 3 or more levels of nesting?)
      One option might be if template class implementations were tailored to this situation by ensuring that any template class is just a thin inlined wrapper around a non-templated class (with non-inlined methods). Early template libraries actually were often a bit like this to avoid "code bloat", or still are to some extent. But to do it fully, the inner class would need to hold the size of the data type at runtime, and need callbacks for copy constructors etc. This is where the concept really starts to break down.
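      Here is a hedged sketch of that "thin wrapper over a non-templated core" idea. `RawVector` and `Vector` are hypothetical names; the sketch ignores exception safety, over-aligned types and move semantics, and assumes `malloc` alignment is sufficient for `T`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Non-templated core. In the thought experiment this is the part whose
// machine code would live in the shared library.
class RawVector {
public:
    using CopyFn = void (*)(void* dst, const void* src);
    using DestroyFn = void (*)(void* obj);

    RawVector(std::size_t elem_size, CopyFn copy, DestroyFn destroy)
        : elem_size_(elem_size), copy_(copy), destroy_(destroy) {}
    RawVector(const RawVector&) = delete;
    RawVector& operator=(const RawVector&) = delete;
    ~RawVector() {
        for (std::size_t i = 0; i < size_; ++i) destroy_(slot(i));
        std::free(data_);
    }

    void push_back(const void* elem) {
        if (size_ == capacity_) {
            const std::size_t new_cap = capacity_ ? capacity_ * 2 : 4;
            char* fresh = static_cast<char*>(std::malloc(new_cap * elem_size_));
            if (!fresh) throw std::bad_alloc();
            // Copy the new element first: elem may point into the old buffer.
            copy_(fresh + size_ * elem_size_, elem);
            for (std::size_t i = 0; i < size_; ++i) {
                copy_(fresh + i * elem_size_, slot(i));
                destroy_(slot(i));
            }
            std::free(data_);
            data_ = fresh;
            capacity_ = new_cap;
        } else {
            copy_(slot(size_), elem);
        }
        ++size_;
    }

    void* at(std::size_t i) { return slot(i); }
    std::size_t size() const { return size_; }

private:
    char* slot(std::size_t i) { return data_ + i * elem_size_; }

    std::size_t elem_size_;  // element size is only known at run time here
    CopyFn copy_;            // callback standing in for T's copy constructor
    DestroyFn destroy_;      // callback standing in for T's destructor
    char* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
};

// Thin templated wrapper: everything here is small inlinable glue.
template <class T>
class Vector {
public:
    Vector() : raw_(sizeof(T), &copy_one, &destroy_one) {}
    void push_back(const T& value) { raw_.push_back(&value); }
    T& operator[](std::size_t i) { return *static_cast<T*>(raw_.at(i)); }
    std::size_t size() const { return raw_.size(); }

private:
    static void copy_one(void* dst, const void* src) {
        new (dst) T(*static_cast<const T*>(src));
    }
    static void destroy_one(void* obj) { static_cast<T*>(obj)->~T(); }

    RawVector raw_;
};
```

      Every element copy now goes through a function pointer, which is roughly the cost the parent comment alludes to when it says the concept starts to break down.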
  - jeffreygoesto 11 hours ago
    Coming with C++26 as far as I see...
    https://isocpp.org/files/papers/P2996R7.html#getting-class-l...
    [-]
    - account42 11 hours ago
      Reflection won't allow a linker to magically translate a class from one library version to another. It may not even be possible to do that translation at all.
  - amelius 11 hours ago
    That might be possible only if we compiled to some intermediate bytecode first, and shipped that with the code.
fefe23 13 hours ago
The title is confusing. He is not reimplementing the STL. He is writing some C++ classes providing functionality that is also already implemented in STL.
[-]
- damnitbuilds 13 hours ago
  Yes, and he has so far reimplemented only a tiny fraction of that STL functionality.
  Still interesting, despite the misleading title.
grandempire 8 hours ago
There is a science to designing reusable containers and algorithms, and it's based on research like The Art of Computer Programming. You can learn more by reading primary sources about the design of the STL.
STL can absolutely be improved, but posts like this indicate most programmers are clueless about how it works, and not in a position to learn from its mistakes and make something better.
If we are serious about code reuse we need to study these ideas and learn how to actually write libraries. The alternative is the npm/crates model - where you throw together 100 different open source concoctions and hope it works.
BonusPlay 13 hours ago
A problem I encountered while writing a custom stdlib is that certain language features expect the stdlib to be there.
For example, the <=> operator assumes that std::partial_ordering exists. Kinda lame. In the newer C++ standards, more and more features are unusable without the stdlib (or at least the std namespace).
[-]
- account42 11 hours ago
  Sometimes standard library types are defined in terms of compiler builtins, like `typedef decltype(nullptr) nullptr_t`, but that doesn't always make sense. E.g. for operator<=> the only alternative would be for the compiler to define std::partial_ordering internally, but what is gained by that?
  [-]
  - quuxplusone 6 hours ago
    Well, just the idea that you can use the entire core language without `#include`'ing any headers or depending on any standard-library stuff, is seen as a benefit by some people (in which I include myself). C++ inherited from C a pretty strong distinction between "language" and "library". This distinction is relatively alien to, say, Python or JavaScript, but it's pretty fundamental to C that the compiler knows how to do a bunch of stuff and then the library is built _on top of_ the core language, rather than alongside it holding its hand the whole way.
    Your example with partial_ordering is actually one of my longstanding pet issues. It would have been possible (I wrote in https://quuxplusone.github.io/blog/2018/04/15/built-in-libra... ) to define
```
    using strong_ordering = decltype(1 <=> 2);
    using partial_ordering = decltype(1. <=> 2.);
```
    But it remains impossible, AFAIK, to define `weak_ordering` from within the core language. Maybe this is where someone will prove me wrong!
    As of C++14 it's even possible to define the type `initializer_list` using only core-language constructs:
```
    template<class T> T dv();
    template<class T> auto ilist() { auto il = { dv<T>(), dv<T>() }; return il; }
    template<class T> using initializer_list = decltype(ilist<T>());
```
    (But you aren't allowed to do these things without including <compare> resp. <initializer_list> first, because the Standard says so.)
    [-]
    - account42 6 hours ago
      Note that even for C the dependency from compiler to standard library exists in practice because optimizing compilers will treat some standard library functions like memcpy specially by default and either convert calls to them into optimized inlined code, generate calls to them from core language constructs, or otherwise make assumptions about them matching the standard library specification. And beyond that you need compiler support libraries for things like operations missing from the target architecture or stack probes required on some platforms and various other language and/or compiler features.
      But for all of these (including the result types of operator<=>) you can define your own version so it's a rather weak dependency.
- logicchains 12 hours ago
  At least you have the chance to implement your own std::partial_ordering if necessary; in most languages those kind of features would be built into the compiler.
  [-]
  - eru 11 hours ago
    Haskell solves this nicely: your own operators just shadow the built-in operators. (And you can opt to not import the built-in operators, and only use your own. Just like you can opt not to import printf in C.)
    [-]
    - dataangel 8 hours ago
      that's not the issue, the problem is the operator is required by the language to return a type from the stdlib, so you have to pull in the stdlib to get that type
unwind 13 hours ago
Very nice! I like the tone and flippant energy of the post, of course, and also the way to get a nice scope by having a concrete case of a program to implement.
I also appreciated the comparisons against STL, very informative. It's ... interesting that if including `vector` in STL brings in 27,000 lines, and the author's implementation of the functionality for the example program was only 1,000 lines, that the compilation time difference is only 4X. Not sure I understand that, really. But benchmarking is hard, of course.
If I could come up with a single suggestion it would be to include the sample program's source as text, not as a picture of text. If that means losing the pretty syntax highlighting, that's fine (by me). :)
[-]
- dwattttt 12 hours ago
  > interesting that if including `vector` in STL brings in 27,000 lines, and the author's implementation of the functionality for the example program was only 1,000 lines, that the compilation time difference is only 4X
  I imagine the time taken varies much more based on what's on the lines, rather than how many there are.
  I'm not aware of specific pathological cases, but I'm sure you could make maybe 10 lines take 20 times longer than both of those vectors put together.
  [-]
  - account42 11 hours ago
    It also depends on how much of the lines end up being actually used - sure the compiler will have to parse all 27000 lines and probably do some more processing on that code but it won't have to do optimizations, register allocation and code generation on member functions or specializations you don't use.
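    A tiny illustration of the "only what you use gets instantiated" part. `Box` is a made-up type: its `length()` member would not compile for `int`, but that never matters unless somebody calls it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

template <class T>
struct Box {
    T value;

    T get() const { return value; }

    // Only valid for a T that has .size(). For Box<int> this body is parsed
    // but never instantiated unless somebody actually calls length().
    std::size_t length() const { return value.size(); }
};
```

    `Box<int>` is usable as long as only `get()` is called, so the compiler pays the parsing cost for `length()` but none of the instantiation, optimization or code generation cost.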
dvh 12 hours ago
I unexpectedly did some cpp a few days ago and I was surprised that the cpp standard library doesn't have a string trim function! Everybody is rolling their own. What is the reason behind that?
[-]
- account42 11 hours ago
  What do you want to trim off? ASCII 0x20? Any ASCII white-space? Any Unicode white-space? Well the latter requires defined string encodings and depends on the Unicode version and you can't just use the latest without introducing subtle compatibility issues.
  [-]
  - criddell 11 hours ago
    ASCII white space (in any encoding) by default with an optional user defined set of trim characters (like Python) would probably solve the needs of 90% of people rolling their own.
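    Something along these lines, as a sketch of the Python-like shape described above. `trim` and `kAsciiWhitespace` are hypothetical names, not a proposed standard API, and the function works on bytes only.

```cpp
#include <cassert>
#include <string_view>

// The six ASCII white-space characters recognized by std::isspace in the "C" locale.
constexpr std::string_view kAsciiWhitespace = " \t\n\r\f\v";

// Returns a view into s with leading and trailing characters from `chars`
// removed. No copy is made, so the result must not outlive the original.
inline std::string_view trim(std::string_view s,
                             std::string_view chars = kAsciiWhitespace) {
    const auto first = s.find_first_not_of(chars);
    if (first == std::string_view::npos) return {};  // nothing but trim chars
    const auto last = s.find_last_not_of(chars);
    return s.substr(first, last - first + 1);
}
```

    Usage would look like `trim("  hi \n")` or `trim("xxhixx", "x")`. Returning a view rather than a new string is one of the design choices a standard version would have to settle.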
    [-]
    - usrnm 10 hours ago
      Not all unicode whitespace characters take up exactly one byte when encoded in utf8. Not even talking about other possible encodings, just good old utf8. Let that sink in a bit, and you'll realize what a can of worms it is in a language where strings are just byte sequences.
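      To make the byte-count point concrete, here is a sketch that trims whole UTF-8 sequences instead of single bytes. `trim_unicode_ends` is a hypothetical helper, it assumes valid UTF-8 input, and the list covers only two non-ASCII spaces (U+00A0 and U+3000) as examples, not the full Unicode white-space set.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

inline std::string_view trim_unicode_ends(std::string_view s) {
    // Each entry is one white-space code point encoded as UTF-8.
    static constexpr std::string_view seqs[] = {
        " ", "\t", "\n", "\r",
        "\xC2\xA0",      // U+00A0 NO-BREAK SPACE: two bytes
        "\xE3\x80\x80",  // U+3000 IDEOGRAPHIC SPACE: three bytes
    };
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::string_view seq : seqs) {
            if (s.substr(0, seq.size()) == seq) {
                s.remove_prefix(seq.size());
                changed = true;
            }
            if (s.size() >= seq.size() &&
                s.substr(s.size() - seq.size()) == seq) {
                s.remove_suffix(seq.size());
                changed = true;
            }
        }
    }
    return s;
}
```

      A trim that takes a set of single `char`s cannot express this: listing `\xC2` and `\xA0` as separate trim characters would also chop bytes out of unrelated multi-byte characters.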
      [-]
      - criddell 9 hours ago
        Because it's tricky is exactly why it should be in the standard library.
        The C++ standard library should just incorporate ICU by reference IMHO.
        [-]
        account42 8 hours ago
        ICU is an unreasonably large dependency for something that many users won't need. Its behavior also changes with new Unicode versions, which makes it incompatible with something that cares as much about backward compatibility as the C++ standard library.
        [-]
        criddell 6 hours ago
        That’s the nature of Unicode: it’s complicated and a moving target.
        As far as it being a large dependency, the beauty of C++ is that if you don’t use it, it won’t affect your build.
        If ICU is too large, complex, and unstable for the C++ committee, then regular users don’t stand a chance.
        [-]
        account42 6 hours ago
        > As far as it being a large dependency, the beauty of C++ is that if you don’t use it, it won’t affect your build.
        That's the theory. In practice, you have things like iostreams pulling in tons of locale machinery (which is really significant for static builds) even if you never use a locale other than "C". That locale machinery will include gigantic functions for formatting monetary amounts even if you never do any formatting.
        > If ICU is too large, complex, and unstable for the C++ committee, then regular users don’t stand a chance.
        Regular users have more specific requirements and can handle binary compatibility breaks better if those aren't coupled with other unrelated functionality.
  - tialaramex 10 hours ago
    I mean, you're a big grown-up language with generic programming, why can't you:
    https://doc.rust-lang.org/std/primitive.str.html#method.trim...
    C++ can't manage to do this because it doesn't give its primitive types methods, it doesn't have a sensible way to talk about methods on types, and it always coerces to function pointers...
```
    assert_eq!("123foo1bar123".trim_end_matches(char::is_numeric), "123foo1bar");
```
    But it's pretty easy to at least do this:
```
    assert_eq!("11foo1bar11".trim_end_matches('1'), "11foo1bar");
```
    (Yes Rust does provide one that always trims off trailing whitespace, but that requires that you know, as Rust does, what the encoding is)
- eru 11 hours ago
  The problem here isn't so much that it's not in the standard library (not everything needs to be in the standard library), but that everyone is rolling their own instead of using third party libraries.
  [-]
  - tialaramex 10 hours ago
    "Everybody" is the problem, there are two proper use cases for a standard library†
    1. Vocabulary. Things Everybody will want to talk about. It's easier to communicate if we all agree this is a List<Goose> than if we first have to negotiate do we mean MyLinkedList or ArrayList or HybridStorage::List, and if we can't agree do we need an adaptor layer. Vocab is a reason the stdlib should provide a string type (if the language itself does not), a growable array, a hash table, etc. With generic programming you likely want some algorithms in here too, all & any, sum, that sort of thing.
    2. Shared features Everybody will find they need and might otherwise screw up. Trimming trailing whitespace, turning numbers into strings and vice versa, sorting, basic arithmetic, familiar constants.
    This should be in category (2). "Everybody" will need this once in a while.
    † In C++ instead the standard library functions as a way to not bother with package management, this does have amusing effects like how FreeBSD will end up with a linear algebra library required to build the OS.
- madduci 11 hours ago
  Exactly, same as for base64 encoding, sha256/512 hashes and many more.
- jeroenhd 10 hours ago
  In a similar vein, I found out that Go doesn't have a string reverse function either. Everyone online pretends reversing strings is easy (just iterate through the array backwards! The world is US ASCII only, right?).
  Trimming strings isn't hard in most real world applications, on the other hand, and not putting it in the standard library means people won't confuse the way the trim method works (i.e. the user must make a choice between copying memory or reusing memory and risking memory lifetime/consistency issues). And that doesn't even include problems like "what if the string isn't utf8".
  I'm more disappointed in Go, which makes a ton of questionable assumptions in the standard library to pretend difficult problems are easy. C++ wants to be correct and knowing what is or isn't whitespace is hard when you don't know the length of a single grapheme.
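  To show why "just iterate backwards" falls short, here is a sketch (in C++ rather than Go) that reverses by UTF-8 code point instead of by byte. `reverse_code_points` is a made-up name and the function assumes valid UTF-8. It still isn't grapheme-aware, so a base letter followed by a combining mark would come out in the wrong order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

inline std::string reverse_code_points(std::string_view s) {
    std::string out;
    out.reserve(s.size());
    std::size_t end = s.size();
    while (end > 0) {
        std::size_t start = end - 1;
        // Step back over continuation bytes (10xxxxxx) to reach the lead byte.
        while (start > 0 &&
               (static_cast<unsigned char>(s[start]) & 0xC0) == 0x80) {
            --start;
        }
        out.append(s.substr(start, end - start));  // copy one whole code point
        end = start;
    }
    return out;
}
```

  A plain byte-wise reversal of the same input would emit the continuation byte before its lead byte and produce invalid UTF-8.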
  [-]
  - zabzonk 9 hours ago
    Slightly OT:
    Interview question(s): "Write a function to reverse a string/linked list"
    Me, as interviewee: "You spend a lot of time reversing things, do you?"
    I don't understand why people are so obsessed with this kind of thing. In my entire career, I don't think I ever felt the need to reverse anything - iterate backwards, perhaps.
    [-]
    - bluGill 9 hours ago
      That is the point - nobody does this in the real world so you don't have the solution memorized. However doing it is "easy" enough that you can actually do it in an interview. More than once I've worked with someone who had a great resume with a lot of experience, but we quickly figured out once they were on the job that they couldn't write code (I was sometimes involved in the hiring decision, but I never did the hiring alone).
      What the world is looking for in a question like that is enough to figure out if you can program. Most people looking for a job have a lot of experience but they can't show you any code.
      Any sane company in the US will only confirm the dates someone worked there and that they "left on good terms" - they will not tell you if the person was any good. If they must fire someone, HR will often offer to let the person write a resignation letter on the spot, meaning that the person leaves on good terms - it is to your advantage overall to accept this offer - you can't sue for wrongful termination, which protects them, but in turn they will say you left on good terms instead of giving a bad reference.
      As such there is often no indication someone is bad and so they can jump from job to job despite being incompetent. Questions like this exist because you can solve them in an interview (at least a simplified, ASCII-only version; if you need to work with an unknown character set it gets hard).
      [-]
      - zabzonk 8 hours ago
        It's easy to come up with questions they can't prepare for - example, for a C++/SQL database job:
        1) Present them with your database schema, give them time to read and (at least partially) understand it. Allow questions. Give them a workstation.
        2) Get them to write a SELECT statement to pull stuff out of two or three tables.
        3) Get them to integrate the query into a small C++ program. Have the program write data out to a text file.
        You can do this fairly realistic stuff for any technologies. Or, for C++, you could use my favourite interview question: "Tell me about the copy constructor".
        [-]
        Maxatar 7 hours ago
        I'd really rather not tell you our database schema.
        Instead of expecting businesses to tell you domain specific things and then answer questions about them, please just understand some basic principles behind a large class of algorithms.
        Almost all algorithm questions boil down to a simple principle, can you take a problem and break it down into its simplest form; the simplest linked list to reverse is the empty linked list or a linked list with 1 node.
        Can you then build upon the simplest case to solve the next simplest case; reverse a multi-node linked list by reversing the tail and then appending the head to the result.
        It really is unfortunate how many people, instead of trying to understand concepts, want to just memorize a bunch of hardcoded facts or trivia about programming languages or libraries. If you understand the basic principles, you can easily pick up minutia about C++ copy constructors or move constructors... but someone who has memorized a great deal of minutiae about C++ may never be able to understand some of the basic principles that broadly cover a multitude of data structures and algorithms.
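The decomposition described above (base case of an empty or one-node list, then "reverse the tail and append the head") can be sketched in C++ as follows. This is a minimal illustration, not anyone's production code: `Node`, `List`, `push_front`, `append`, `reverse` and `to_vector` are hypothetical names, and the literal "append" step makes this version quadratic in the list length.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical singly linked node, for illustration only.
struct Node {
    int value;
    std::unique_ptr<Node> next;
};

using List = std::unique_ptr<Node>;

// Put a new value in front of an existing (possibly empty) list.
List push_front(List list, int value) {
    return std::make_unique<Node>(Node{value, std::move(list)});
}

// Append a single detached node to the end of a list by walking to its end.
List append(List list, List node) {
    if (!list) return node;
    list->next = append(std::move(list->next), std::move(node));
    return list;
}

// Reverse by solving the simplest cases first, then building on them.
List reverse(List list) {
    // Simplest cases: an empty list or a one-node list is its own reverse.
    if (!list || !list->next) return list;
    // Next simplest: detach the head, reverse the tail, append the head.
    List tail = std::move(list->next);  // list->next is now null
    return append(reverse(std::move(tail)), std::move(list));
}

// Helper to inspect a list's contents in order.
std::vector<int> to_vector(const List& list) {
    std::vector<int> out;
    for (const Node* n = list.get(); n != nullptr; n = n->next.get()) {
        out.push_back(n->value);
    }
    return out;
}
```

An iterative version that relinks nodes in a single pass would be linear; the recursive form is shown only because it mirrors the reasoning in the comment.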
        [-]
        zabzonk 7 hours ago
        > you can easily pick up minutia about C++ copy constructors
        Hollow laughter. And if it were true (which it isn't) how well can you explain those "minutia"?
        bluGill 7 hours ago
        That means you tell someone who you might not hire what your database schema is. Probably not something you want them to know. You also assume they know SQL - many C++ jobs only need minimal SQL knowledge and so you are fine with hiring someone who can write a select only with the help of google - but someone at that level wouldn't be able to solve your problem. I've spent a lot of time working in a language that was custom to the one company I worked for at the time - I can learn your language quickly (even C++ is not that hard - the dark corners means it takes years to become great but to be productive doesn't take very long), as such I don't want to force any particular language on the interview, I want something that proves they can write code.
        [-]
        zabzonk 7 hours ago
        > Probably not something you want them to know
        Why not? But if your schema is so secret, come up with a simple one for use in interviews.
        > You also assume they know SQL
        I specifically said this was for a c++/sql job.
        > so you are fine with hiring someone who can write a select only with the help of google
        No, I'm not fine with that, even if it were do-able.
        > I can learn your language quickly (even C++ is not that hard - the dark corners means it takes years to become great but to be productive doesn't take very long)
        Wrongo. And not just for C++.
        > I don't want to force any particular language on the interview, I want something that proves they can write code.
        Obviously, we want very different things.
    - tmoertel 8 hours ago
      The goal of an interview isn't to get the candidate to write code that will be used in production. The goal is to observe the candidate doing something that predicts whether they're a viable hire. If a candidate cannot write a function to reverse a given sequence, especially in a situation where candidates have been led to expect that they'll be asked to do something just like that, then it becomes harder to believe that the candidate is a viable hire.
    - pjmlp 9 hours ago
      It is the closest thing to a casting call for programmers.
      I would rather have that question, instead of how many golf balls fit into a plane.
      At least the former has something to do with programming.
      [-]
      - billforsternz 1 hour ago
        Using my fingers I'm guessing a golf ball is about 3cm in diameter. A 737 or an A320 cabin is, again my guess/estimate, 30m long, 4m wide, 2m high. So approx 30cm^3 into 200m^3. One million cm^3 in a m^3. I'm going with 5 million golf balls and hoping I'm right within an order of magnitude or so. I miss those kinds of questions, which have sadly died out.
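The arithmetic behind that estimate can be written out as a small C++ sketch. The function name and parameters are hypothetical, the inputs are the commenter's guesses rather than measured values, and each ball is assumed to occupy a cube whose side equals its diameter.

```cpp
// One cubic metre expressed in cubic centimetres.
constexpr double kCm3PerM3 = 100.0 * 100.0 * 100.0;

// Fermi estimate: cabin volume divided by the volume of the cube a ball sits in.
constexpr double golf_balls_in_cabin(double ball_diameter_cm,
                                     double length_m,
                                     double width_m,
                                     double height_m) {
    const double ball_cm3 = ball_diameter_cm * ball_diameter_cm * ball_diameter_cm;
    const double cabin_cm3 = length_m * width_m * height_m * kCm3PerM3;
    return cabin_cm3 / ball_cm3;
}
```

With the unrounded guesses, `golf_balls_in_cabin(3.0, 30.0, 4.0, 2.0)` is 240,000,000 / 27, or roughly 8.9 million, which is within the same order of magnitude as the 5 million figure.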
        [-]
        pjmlp 1 hour ago
        And in what manner does that help prove someone is up to the task of writing a website using Spring in a Kubernetes cluster?
        [-]
        billforsternz 34 minutes ago
        Obviously it doesn't, because it's more a test of reasoning ability and intelligence than of specific domain skills. The theory is that smart programmers will be able to quickly pick up whatever specialised skills are needed for specific projects. Some people are good generalists. Others prefer to specialize. Employers are free to optimize for their circumstances and preferences, I guess.
  - jjmarr 9 hours ago
    If you want to reuse memory in C++, you'll either have to modify the string or return a string_view, because strings must be null terminated (string_views are not). If you just chop off the last n bytes of a string, it won't be a string anymore.
    I personally use std::string_view as much as possible especially for compile-time constants. Then you can slice as much as you want without reallocating.
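As a small illustration of slicing without reallocating, here is a sketch using `std::string_view::remove_suffix` (C++17). The names `kGreeting` and `drop_last` are made up for this example.

```cpp
#include <cstddef>
#include <string_view>

// A compile-time constant view over a string literal: no allocation involved.
constexpr std::string_view kGreeting = "hello, world";

// Drop the last n characters without copying. Only the view's length changes;
// the underlying characters stay where they are.
constexpr std::string_view drop_last(std::string_view text, std::size_t n) {
    text.remove_suffix(n < text.size() ? n : text.size());
    return text;
}

// The slice can be computed at compile time as well.
static_assert(drop_last(kGreeting, 7) == "hello");
```

The result still points into the original buffer and is not null terminated at its new end, so passing `drop_last(kGreeting, 7).data()` to a C API that expects a C string would read past the slice.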
- pjmlp 9 hours ago
  In C++ frameworks it has existed for ages.
  Why not in ISO C++?
  Welcome to the ways of ISO and committee driven development, apparently no one cared enough to submit a paper, and do the work to win the paper voting into the standard.
quibono 10 hours ago
A bit off-topic maybe, what is a good open source library to read through to see some clean&modern C++? I've not dealt with the language in a bit and was thinking of diving back in
[-]
- jeroenhd 10 hours ago
  I've always found SerenityOS to have quite nice C++ source code. The project runs on very recent versions of C++ (to the point where the standard compilation script for Ubuntu will compile a modern compiler first) and the OS intentionally doesn't stick to POSIX, allowing some very nice API improvements that only work in a C++-first world.
  Its main author moved on to Ladybird, though, so I haven't really browsed the code recently. I'm not sure if SerenityOS uses concepts and other such recent additions.
- pjmlp 9 hours ago
  In any case, having a go at the book A Tour of C++ is a good way to read about modern idioms.
  [-]
  - xdavidliu 9 hours ago
    the most recent version has `import std;` in the very first hello world, which to my knowledge is not close to working in the vast majority of compilers.
    [-]
    - pjmlp 5 hours ago
      It works alright on VC++ and clang/CMake/ninja.
      The main issue is VS where the EDG frontend still hasn't been properly updated, and clang/cmake can't handle header units.
      GCC is lagging behind, and everyone else is anyway mostly catching up to C++17.
      For my hobby coding, I am mostly doing C++23 on VC++, so it is modules all the way.
      At work, it is still C++17 land on native libraries for managed languages.
germandiago 12 hours ago
A bit off-topic, but as a Meson user, I would love to see C++ modules support, since they are starting to be usable in all three big compilers.
Nice experiment for the pyStd, though, as pointed out, this would break with pre-compiled 3rd party deps that use pystd in a different version :)
SleepyMyroslav 1 hour ago
Rant. I do not see any improvement in the resulting code. Let's not nitpick on fast parsing and just scroll through the unnecessary code actions.
> ... u8line(move(line))
We are not reusing parsed line object between iterations. Forcing fresh allocation per line.
> auto words = ...
Fresh allocation per line.
> lookup/insert
Lookup and hashing done 2 times for each word. Each unique word individually allocated on the heap.
> stats.push_back
Not preallocated. Likely doing full allocate + copy per each word.
> sort_relocatable
Could have been faster with additional memory provided. But this is minor because sorting probably was not ideal in the first place.
and the icing on the cake:
>printf("%d ... (int)count ...
As the old saying goes, "One can write a Fortran program in any language". There are zero reasons to write non-type-safe text output in C++ in 2025, but here we are.
TLDR: one can give a foundation library any name and use any namespace; it does not change how the code is written much. Right?
daemin 13 hours ago
I do wonder how much smaller the STL source code would be if it was pre-processed or written with only a single C++ standard in mind. So only for C++20 or only for C++23 etc. In that case how much faster would things be to compile where it doesn't need to filter through hundreds of preprocessor options?
[-]
- magicalhippo 13 hours ago
  From what I've read on mailing lists and whatnot, it seems a lot of complexity comes from explicit choices made, like iterators being unaffected by insertions[1] for maps and such, or time complexity guarantees that force the implementation into certain corners.
  [1]: https://kera.name/articles/2011/06/iterator-invalidation-rul...
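The guarantee mentioned for maps can be shown with a short sketch: an iterator into a `std::map` obtained before a batch of insertions still refers to the same element afterwards. The function name is hypothetical and the element count is arbitrary.

```cpp
#include <map>
#include <string>

// Returns true if an iterator taken before many insertions still refers to
// the same key and value afterwards, as the standard guarantees for std::map.
bool map_iterator_survives_inserts() {
    std::map<int, std::string> m{{0, "zero"}};
    const auto it = m.find(0);  // iterator to the only element

    for (int i = 1; i <= 1000; ++i) {
        m.emplace(i, "x");      // insertion never invalidates map iterators
    }

    return it != m.end() && it->first == 0 && it->second == "zero";
}
```

The same pattern is not safe with `std::vector`, where an insertion that triggers reallocation invalidates all existing iterators. Guarantees like this one are part of what pushes `std::map` implementations towards node-based structures.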
- Someone 11 hours ago
  > In that case how much faster would things be to compile where it doesn't need to filter through hundreds of preprocessor options?
  I think most of the time spent isn’t running the preprocessor, but parsing the declarations and definitions.
  Regardless, the way to speed up importing definitions in modern C++ is to use modules (`import`) instead of `#include`.
  https://news.ycombinator.com/item?id=38904758 says they could import the entire std namespace in under a second (that is long when you want to run C++ as a scripting language, but not when you compile large programs)
larusso 10 hours ago
How will this yearly release scheme scale when you need to work with newer compilers? I don't write C++, so I honestly don't know. Do you need to freeze your version of the compiler forever? Or are gcc / clang backwards compatible? Or do you sprinkle tons of pragmas on the files to control this? What I mean is: how can you make sure your version-one API still compiles in the future? Take the counterexample of Ruby, for instance. I could write a package with lots of namespaces and declare v1 frozen. But I still potentially need to update the code so it can run in newer versions of the runtime. Edit: typos
[-]
- 0xffff2 2 hours ago
  Overwhelmingly, new compilers will compile old code just fine. IIRC, the only time old code is broken intentionally is when fixing a bug in the compiler itself causes the code to break.
atombender 10 hours ago
> The C++ standard library (also know as the STL)
The C++ Standard Library is not the same as the STL.
The STL is the Standard Template Library, which provides containers such as vectors, as well as related functionality like iterators.
The C++ Standard Library includes STL, but is a lot more, including things like I/O, math, concurrency, and so on.
[-]
- Longhanks 10 hours ago
  The maintainers of Microsoft's C++ standard library use the term interchangeably, both "STL" and "C++ standard library" refer to the same thing. https://github.com/microsoft/STL/blame/main/README.md#L3
  [-]
  - atombender 9 hours ago
    Probably a historical artifact. It has never been the correct term.
usrnm 12 hours ago
Years ago every C++ project worth its salt had at least one implementation of strings, vectors and other basic things. I hoped we were finally past that.
[-]
- w4rh4wk5 11 hours ago
  Probably not as the way things are defined in the STL is often not in line with what C++ programmers want for their code base.
  STL adheres to zero-cost abstraction, which often puts safety in the backseat. Many programmers, myself included, prefer safety by default with an escape route for when it's really needed.
  Add to that things like exceptions, locale-dependent behavior, functions with a dozen overloads, an overly complex memory allocator interface (`std::vector` vs. `std::pmr::vector`), etc.
  Personally, I'd prefer a common alternative to STL that focuses on these points. ETL [1] and abseil [2] come to mind, but it's not exactly what I envision.
  1: https://github.com/ETLCPP/etl 2: https://github.com/abseil/abseil-cpp
  [-]
  - bluGill 8 hours ago
    There are people on the standard committee looking at safety, and some effort to see if more safety can/should be added to the standard library. Perfect safety is probably impossible, but there are several options being discussed. (operator[] is undefined if the value is out of range, but there is thought that perhaps checking range isn't that expensive and so we should throw an exception - it is possible that in many cases the compiler can prove the value is in range and thus the range check would be optimized out anyway). Safety profiles can also disable unsafe operations, and since they are opt-in (opt-out would be better but break too much existing C++) you choose when to pay the price. (if you have any other idea I'm interested)
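The standard library already offers both behaviours side by side for `std::vector`: `at()` is range checked and throws, while `operator[]` is unchecked. A minimal sketch of the difference, with hypothetical wrapper names:

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Checked read: std::vector::at throws std::out_of_range for a bad index,
// which is converted here into an empty optional.
std::optional<int> checked_get(const std::vector<int>& values, std::size_t index) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
}

// Unchecked read: the "escape route" for code that has already established
// index < values.size(). With an out-of-range index this is undefined
// behaviour, so callers must guarantee the precondition themselves.
int unchecked_get(const std::vector<int>& values, std::size_t index) {
    return values[index];
}
```

Whether a checked `operator[]` would be cheap enough in practice depends on how often the compiler can prove the index is in range, as the comment notes; this sketch does not measure that.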
  - usrnm 11 hours ago
    And I've seen numerous explanations: someone wants more safety, someone wants less safety, someone wants locale-aware strings, someone wants ABI stability, and so on, it's an endless list. Few people are brave enough to admit that reinventing the wheel is just easier and more fun than solving problems people actually have and care about.
    [-]
    - w4rh4wk5 11 hours ago
      Well, I am in game development and all of the issues I listed above are actual problems that we encountered either in the past or still wrangle with today. Especially locale-dependent functions are a bitch on Windows.
      Edit: one more thing I'd like to add is that reinventing a wheel in C++ is quite horrible, as it's such a complicated language.
- bluGill 8 hours ago
  We will never be past that, as there are tradeoffs in implementing things. Safety was brought up in other comments, but there are others. In some cases a slower algorithm will be faster in the real world because it is more CPU cache friendly (depending of course on what N is, but often N is small enough). In some cases you can accept a less precise answer, thus making your algorithms faster.
  For most of us, in most problem spaces, the above doesn't matter and so the standard library is good enough. There will always be those who correctly have a need that is strong enough to be worth building their own standard library though.
  [-]
  - ryandrake 7 hours ago
    Whenever I've worked for a company that decided to re-implement all their own strings and containers, I'd always try to suss out whether they actually did it for one of those actual good reasons, or if they were just cargo culting someone long ago who once said "STL BAD IT'S SLOW" and that was all it took to convince them to write 20K lines of unmaintainable and unnecessary OurString, OurVector, and OurList crap. So far, I've never found one place that knew why they reimplemented all that stuff. I do know one person who works in games whose company actually had a good reason to do their own containers, but these cases are rare.
- klaussilveira 8 hours ago
  https://github.com/electronicarts/EASTL
  Is pretty good for games.
- pjmlp 12 hours ago
  Yes, since 1998, but apparently legacy code lives on.
  At least in what concerns "strings, vectors and other basic things".
  Now if you consider something like networking part of "basic things", then it is another matter.
  Then again, vcpkg and conan exist now.
singularity2001 13 hours ago
I did that to get minimal wasm binaries; not sure if tree shaking today would sufficiently shrink C++ wasm apps.
zerr 13 hours ago
I recall years ago there was some kind of competition in implementing full C++ parser/front-end and standard library, or something like that, allegedly organized by nvidia.
[-]
- quuxplusone 6 hours ago
  Maybe you're thinking of cppgm.org — the "C++ Grandmaster Certification"?
  - http://web.archive.org/web/20190824232557/http://www.cppgm.o...
  - https://news.ycombinator.com/item?id=5148895
  Nothing to do with Nvidia, though.
  [-]
  - zerr 5 hours ago
    Exactly! I believe there were rumors suggesting Nvidia involvement, but it was never confirmed.
    I wonder what the outcome was, did anyone become a "Certified C++ Grandmaster"?