A faster path to container images in Bazel

(tweag.io)

35 points | by malt3 6 days ago

4 comments

  • cyberax 30 minutes ago
    I'm struggling with the caching right now. I'm trying to switch from GitHub Actions to just running stuff in containers, and it works. Except for caching.

    BuildKit from Docker is just a pure bullshit design. Instead of the elegant layer-based system, there are now two daemons that fling around TAR files. And for no real reason that I can discern. But the worst thing is that the caching is just plain broken.

    • klysm 20 minutes ago
      The layers are tar files; I’m confused about what behavior you actually want that isn’t supported.
      • cyberax 8 minutes ago
        The original Docker (and the current Podman) created each layer as an overlay filesystem. So each layer was essentially an ephemeral container. If a build failed, you could actually just run the last successful layer with a shell and see what's wrong.

        More importantly, the layers were represented as directories on the host system. So when you wanted to run something in the final container, Docker just needed to reassemble it.

        BuildKit has broken all of it. Now building is done, essentially, in a separate system; the "docker buildx" command talks with it over a socket. It transmits the context and gets the result back as an OCI image that it then needs to unpack.

        This is an entirely useless step. It also breaks caching all the time. If you build two images that differ only slightly, the host still gets two full OCI artifacts, even if the two images share most of their layers.

        It looks like their Bazel infrastructure optimized it by moving caching down to the file level.

  • jeffbee 1 hour ago
    Funny that the article only obliquely references the compression issues. The OCI users that I have seen are using gzip due to inertia, while zstd layers have been supported for a while and are radically faster.
  • forrestthewoods 59 minutes ago
    Uhhh what? Isn’t the whole point of Bazel that it’s a monorepo with all dependencies so you don’t need effing docker just to build or run a bloody computer program?

    It drives me absolutely batshit insane that modern systems are incapable of either building or running computer programs without Docker. Everyone should be profoundly embarrassed and ashamed by this.

    I’m a charlatan VR and gamedev that primarily uses Windows. But my deeply unpopular opinion is that Windows is a significantly better dev environment and runtime environment because it doesn’t require all this Docker garbage. I swear that building and running programs does not actually have to be that complicated!! Linux userspace got pretty much everything related to dependencies and packages very very very wrong.

    I am greatly pleased and amused that the most reliable API for gaming in Linux is Win32 via Proton. That should be a clear signal that Linux userspace has gone off the rails.

    • jakewins 48 minutes ago
      You’re covering a lot of ground here! The article is about producing container images for deployment, and has no relation to Bazel building stuff for you - if you’re not deploying as containers, you don’t need this?

      On Linux vs Win32 flame warring: can you be more specific? What specifically is very very wrong with Linux packaging and dependency resolution?

      • forrestthewoods 29 minutes ago
        > The article is about producing container images for deployment

        Fair. Docker does trigger my predator drive.

        I’m pretty shocked that the Bazel workflow involves downloading Docker base images from external URLs. That seems very un-Bazel-like! That belongs in the monorepo for sure.

        > What specifically is very very wrong with Linux packaging and dependency resolution?

        Linux userspace for the most part is built on a pool of global shared libraries and package managers. The theory is that this is good because you can upgrade libfoo.so just once for all programs on the system.

        In practice this turns into pure dependency hell. The total workaround is to use Docker, which completely nullifies the entire theoretical benefit.

        Linux toolchains and build systems are particularly egregious at just assuming a bunch of crap is magically available in the global search path.

        Docker is roughly correct in that computer programs should include their gosh darn dependencies. But it introduces so many layers of complexity, which then get solved by adding yet another layer. Why do I need estargz??

        If you’re going to deploy with Docker then you might as well just statically link everything. You can’t always get down to a single exe. But you can typically get pretty close!
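
        To make that concrete, here is roughly what I mean in Bazel terms - a rough sketch assuming rules_go, rules_pkg, and rules_oci, with placeholder names and a placeholder base image (not anything from the article):

            load("@rules_go//go:def.bzl", "go_binary")
            load("@rules_pkg//pkg:tar.bzl", "pkg_tar")
            load("@rules_oci//oci:defs.bzl", "oci_image")

            # Fully static Go binary: no cgo, no shared libraries from the host.
            go_binary(
                name = "app",
                srcs = ["main.go"],
                pure = "on",
                static = "on",
            )

            # Wrap the single binary in a layer tarball...
            pkg_tar(
                name = "app_layer",
                srcs = [":app"],
            )

            # ...and drop it onto a (nearly) empty placeholder base image.
            oci_image(
                name = "image",
                base = "@distroless_static",
                entrypoint = ["/app"],
                tars = [":app_layer"],
            )

        The binary carries its own dependencies, and the image ends up being little more than the exe plus whatever base you pick.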

        • dilyevsky 14 minutes ago
          > I’m pretty shocked that the Bazel workflow involves downloading Docker base images from external URLs. That seems very un-Bazel-like! That belongs in the monorepo for sure.

          Not every dependency in Bazel requires you to "first invent the universe" locally. There are lots of examples of this: toolchains, git_repository, http_archive rules, and so on. As long as they are checksummed (as they are in this case) so that you can still output a reproducible artifact, I don't see the problem.
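
          For example, pulling a base image pinned by digest with rules_oci looks roughly like this (a sketch; the repo name is arbitrary and the digest is a placeholder):

              load("@rules_oci//oci:pull.bzl", "oci_pull")

              oci_pull(
                  name = "distroless_base",
                  image = "gcr.io/distroless/base",
                  # Placeholder digest; Bazel verifies whatever it downloads against this.
                  digest = "sha256:<pinned-digest>",
              )

          The registry URL is just a place to fetch bytes from; the pinned digest is the source of truth, same as the sha256 on an http_archive.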

  • odie5533 1 hour ago
    Awful AI images everywhere. Can we not help ourselves?
    • CBLT 1 hour ago
      Is my adblocker blocking them? I only saw the stack of tars in a coat. Didn't break the article's flow for me.
      • comex 1 minute ago
        I also only saw that, but the text feels a bit fluffed out by AI as well, if I’m not mistaken.