I appreciate that Aaron is focusing on the practical algorithm/design improvements that could be made to Bundler, vs. prematurely going all in on "rewrite in Rust".
Really interesting post, but this part from the beginning stuck out to me:
Ruby Gems are tar files, and one of the files in the tar file is a YAML representation of the GemSpec. This YAML file declares all dependencies for the Gem, so RubyGems can know, without evaling anything, what dependencies it needs to install before it can install any particular Gem. Additionally, RubyGems.org provides an API for asking about dependency information, which is actually the normal way of getting dependency info (again, no eval required).
It would be interesting to compare and contrast the parsing speed for a large representative set of Python dependencies compared to a large representative set of Ruby dependencies. YAML is famously not the most efficient format to parse. We might have been better than `pip`, but I would be surprised if there isn't any room left on the table to parse dependency information in a more efficient format (JSON, protobufs, whatever).
That said, the points at the end about not needing to parse gemspecs to install "most" dependencies would make this pretty moot (if the information is already returned from the gemserver)
Although Yaml is a dreadful thing, given the context and the size of a normal gemspec I would be very surprised if it showed up in any significant capacity when psych should be in the low single digit MB/s throughput.
I’ve been squinting at the “global cache for all bundler instances” issue[1] and I’m trying to figure out if it’s a minefield of hidden complication or if it’s actually relatively straight forward.
It’s interesting as a target because it pays off more the longer it has been implemented as it only would be shared from versions going forward.
I've been doing software of all kinds for a long long time.
I've never, ever, been in a position where I was concerned about the speed of my package manager.
When you're in a big company, the problem really starts showing. A service can have 100+ dependencies, many in private repos, and once you start adding and modifying dependencies, it has to figure out how to version it to create the lock file across all of these and it can be really slow.
Cloud dev environments can also take several minutes to set up.
Many of these package managers get invoked countless times per day (e.g., in CI to prepare an environment and run tests, while spinning up new dev/AI agent environments, etc).
Is the package manager a significant amount of time compared to setting up containers, running tests etc? (Genuine question, I’m on holiday and can’t look up real stats for myself right now)
Anecdotally unless I'm doing something really dumb in my Dockerfile (recently I found a recursive `chown` that was taking 20+ to finish, grr) installing dependencies is longest step of the build. It's also the most failure prone (due to transient network issues).
Ye,s but if your CI isn't terrible, you have the dependencies cached, so that subsequent runs are almost instant, and more importantly, you don't have a hard dependency on a third party service.
The reason for speeding up bundler isn't CI, it's newcomer experience. `bundle install` is the overwhelming majority of the duration of `rails new`.
> Ye,s but if your CI isn't terrible, you have the dependencies cached, so that subsequent runs are almost instant, and more importantly, you don't have a hard dependency on a third party service.
I’d wager the majority of CI usage fits your bill of “terrible”. No provider provides OOTB caching in my experience, and I’ve worked with multiple in house providers, Jenkins, teamcity, GHA, buildkite.
Buildkite can be used in tons of different ways, but it's common to use it with docker and build a docker image with a layer dedicated to the gems (e.g. COPY Gemfile Gemfile.lock; RUN bundle install), effectively caching dependencies.
Well, now my opinion of uv has been damaged. It...
> Ignoring requires-python upper bounds. When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive
Man, it's easy to be fast when you're wrong. But of course it is fast because Rust not because it just skips the hard parts of dependency constraint solving and hopes people don't notice.
> When multiple package indexes are configured, pip checks all of them. uv picks from the first index that has the package, stopping there. This prevents dependency confusion attacks and avoids extra network requests.
Ambiguity detection is important.
> uv ignores pip’s configuration files entirely. No parsing, no environment variable lookups, no inheritance from system-wide and per-user locations.
Stuff like this sense unlikely to contribute to overall runtime, but it does decrease flexibility.
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install.
... thus shifting the bytecode compilation burden to first startup after install. You're still paying for the bytecode compilation (and it's serialized, so you're actually spending more time), but you don't associate the time with your package manager.
I mean, sure, avoiding tons of Python subprocesses helps, but in our bold new free threaded world, we don't have to spawn so many subprocesses.
>> Ignoring requires-python upper bounds. When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive
>
> Man, it's easy to be fast when you're wrong. But of course it is fast because Rust not because it just skips the hard parts of dependency constraint solving and hopes people don't notice.
Version bound checking is NP complete but becomes tractable by dropping the upper bound constraint. Russ Cox researched version selection in 2016 and described the problem in his "Version SAT" blog post (https://research.swtch.com/version-sat). This research is what informed Go's Minimal Version Selection (https://research.swtch.com/vgo-mvs) for modules.
It appears to me that uv is walking the same path. If most developers don't care about upper bounds and we can avoid expensive algorithms that may never converge, then dropping upper bound support is reasonable. And if uv becomes popular, then it'll be a sign that perhaps Python's ecosystem as a whole will drop package version upper bounds.
> uv ignores pip’s configuration files entirely. No parsing, no environment variable lookups, no inheritance from system-wide and per-user locations.
Stuff like this sense unlikely to contribute to overall runtime, but it does decrease flexibility.
Astral have been very clear that they have no intention of replicating all of pip. uv pip install was a way to smooth the transition from using pip to using uv. The point of uv wasn't to rewrite pip in rust - and thankfully so. For all of the good that pip did it has shortcomings which only a new package manager turned out capable of solving.
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install.
... thus shifting the bytecode compilation burden to first startup after install. You're still paying for the bytecode compilation (and it's serialized, so you're actually spending more time), but you don't associate the time with your package manager.
In most cases this will have no noticeable impact (so a sane default) - but when it does count you simply turn on --compile-bytecode.
There's never going to be a Python 4 so I don't think they are wrong. Even if lighting strikes thrice there's no way they could migrate people to Python 4 before uv could be updated to "fix" that.
> Ambiguity detection is important.
I'm not sure what you mean here. Pip doesn't detect any ambiguities. In fact Pip's behaviour is a gaping security hole that they've refused to fix, and as far as I know the only way to avoid it is to use `uv` (or register all of your internal company package names on PyPI which nobody wants to do).
> thus shifting the bytecode compilation burden to first startup after install
How uv got so fast - https://news.ycombinator.com/item?id=46393992 - Dec 2025 (457 comments)
That said, the points at the end about not needing to parse gemspecs to install "most" dependencies would make this pretty moot (if the information is already returned from the gemserver)
It’s interesting as a target because it pays off more the longer it has been implemented as it only would be shared from versions going forward.
[1] https://github.com/ruby/rubygems/issues/7249
Compiler, yes. Linker, sure. Package downloader. No.
Cloud dev environments can also take several minutes to set up.
The reason for speeding up bundler isn't CI, it's newcomer experience. `bundle install` is the overwhelming majority of the duration of `rails new`.
I’d wager the majority of CI usage fits your bill of “terrible”. No provider provides OOTB caching in my experience, and I’ve worked with multiple in house providers, Jenkins, teamcity, GHA, buildkite.
Buildkite can be used in tons of different ways, but it's common to use it with docker and build a docker image with a layer dedicated to the gems (e.g. COPY Gemfile Gemfile.lock; RUN bundle install), effectively caching dependencies.
But in public tooling, where the benefit is across tens of thousands or more? It's basically always worth it.
took an hour to install 30 dependencies
> Ignoring requires-python upper bounds. When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive
Man, it's easy to be fast when you're wrong. But of course it is fast because Rust not because it just skips the hard parts of dependency constraint solving and hopes people don't notice.
> When multiple package indexes are configured, pip checks all of them. uv picks from the first index that has the package, stopping there. This prevents dependency confusion attacks and avoids extra network requests.
Ambiguity detection is important.
> uv ignores pip’s configuration files entirely. No parsing, no environment variable lookups, no inheritance from system-wide and per-user locations.
Stuff like this sense unlikely to contribute to overall runtime, but it does decrease flexibility.
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install.
... thus shifting the bytecode compilation burden to first startup after install. You're still paying for the bytecode compilation (and it's serialized, so you're actually spending more time), but you don't associate the time with your package manager.
I mean, sure, avoiding tons of Python subprocesses helps, but in our bold new free threaded world, we don't have to spawn so many subprocesses.
Version bound checking is NP complete but becomes tractable by dropping the upper bound constraint. Russ Cox researched version selection in 2016 and described the problem in his "Version SAT" blog post (https://research.swtch.com/version-sat). This research is what informed Go's Minimal Version Selection (https://research.swtch.com/vgo-mvs) for modules.
It appears to me that uv is walking the same path. If most developers don't care about upper bounds and we can avoid expensive algorithms that may never converge, then dropping upper bound support is reasonable. And if uv becomes popular, then it'll be a sign that perhaps Python's ecosystem as a whole will drop package version upper bounds.
Stuff like this sense unlikely to contribute to overall runtime, but it does decrease flexibility.
Astral have been very clear that they have no intention of replicating all of pip. uv pip install was a way to smooth the transition from using pip to using uv. The point of uv wasn't to rewrite pip in rust - and thankfully so. For all of the good that pip did it has shortcomings which only a new package manager turned out capable of solving.
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install.
... thus shifting the bytecode compilation burden to first startup after install. You're still paying for the bytecode compilation (and it's serialized, so you're actually spending more time), but you don't associate the time with your package manager.
In most cases this will have no noticeable impact (so a sane default) - but when it does count you simply turn on --compile-bytecode.
There's never going to be a Python 4 so I don't think they are wrong. Even if lighting strikes thrice there's no way they could migrate people to Python 4 before uv could be updated to "fix" that.
> Ambiguity detection is important.
I'm not sure what you mean here. Pip doesn't detect any ambiguities. In fact Pip's behaviour is a gaping security hole that they've refused to fix, and as far as I know the only way to avoid it is to use `uv` (or register all of your internal company package names on PyPI which nobody wants to do).
> thus shifting the bytecode compilation burden to first startup after install
Which is a much better option.
There will, but called Pyku ...