> The fact that an idle Mac has over 2,000 threads running in over 600 processes is good news, and the more of those that are run on the E cores, the faster our apps will be
This doesn't make sense in a rather fundamental way - there is no way to design a real computer where doing some useless work is better than doing no work. Just think about energy consumption and battery life, since these are laptops.
Or that's just resources your current app can't use
Besides, they aren't that well engineered; bugs exist, linger, and come back, so even if the impact is small on average, you can get a few photo-analysis indexing jobs going haywire for a while and getting stuck
I think in the example the OP is making, the work is not useless. They're saying if you had a system doing the same work, with maybe 60 processes, you're better off splitting that into 600 processes and a couple thousand threads, since that will allow granular classification of tasks by their latency sensitivity
But it is, he's talking about real systems with real processes in a generic way, not a singular hypothetical where suddenly all that work must be done, so you can also apply your general knowledge that some of those background processes aren't useful (but can't even be disabled due to system lockdown)
I think you're right that the article didn't provide criteria for when this type of system is better or worse than another. For example, the cost of splitting work into threads and switching between threads needs to be factored in. If that cost is very high, then the multi-threaded system could very well be worse. And there are other factors too.
However, given the trend in modern software engineering to break work into units and the fact that on modern hardware thread switches happen very quickly, being able to distribute that work across different compute clusters that make different optimization choices is a good thing and allows schedulers to get results closer to optimal.
So really it boils down to this: if the gains from doing the work on different compute outweigh the cost of splitting and distributing it, then it's a win. And for most modern software on most modern hardware, the win is very significant.
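For a rough feel of that splitting/switching cost, here's a minimal sketch in Swift using libdispatch (the queue label and iteration count are arbitrary) that times a synchronous hop onto another queue and back; the per-hop overhead is typically on the order of microseconds or less, which is why it tends to be dwarfed by the scheduling wins described above:

    import Dispatch
    import Foundation

    // Measure the round-trip cost of handing a trivial task to another serial queue.
    let worker = DispatchQueue(label: "bench.worker")    // label is arbitrary
    let iterations = 10_000

    let start = DispatchTime.now()
    for _ in 0..<iterations {
        worker.sync { }                                  // hop to the worker queue and back
    }
    let elapsed = DispatchTime.now().uptimeNanoseconds - start.uptimeNanoseconds
    print("~\(elapsed / UInt64(iterations)) ns per queue hop")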
> (...) a singular hypothetical where suddenly all that work must be done (...)
This is far from being a hypothetical. It is an accurate description of your average workstation. I recommend you casually check the list of processes running at any given moment on any random desktop or laptop you find in a 5 meter radius.
> (...) where doing some useless work is better than doing no work (...)
This take expresses a fundamental misunderstanding of the whole problem domain. There is a workload comprised of hundreds of processes, some of which are multithreaded, that needs to be processed. That does not change nor go away. You have absolutely no suggestion that any of those hundreds of processes is "useless". What you will certainly have are processes that will be waiting for IO, but waiting for a request to return a response is not useless.
In this case the “useless” work is the cost of moving and distributing the threads between different compute clusters. That cost is nonzero, and does need to be factored in, but it’s also more than overwhelmed by the benefits gained from doing the move.
I would say a good number of those processes/threads are something you don't want running. And you can't turn them off unless you can modify the boot partition to disable the launch configs.
These processors are good all around. The P cores kick butt too.
I ran a performance test back in October comparing M4 laptops against high-end Windows desktops, and the results showed the M-series chips coming out on top.
This is likely more of a Windows filesystem benchmark than anything else: there are fundamental restrictions on how fast file access can be on Windows due to filesystem filter drivers. I would bet that if you tried again with Linux (or even in WSL2, as long as you stay in the WSL filesystem image), you'd see significantly improved results.
What you're seeing is basically the lithography node used to make the CPU.
Since Apple books more capacity than anyone else, they get their chips 5-6 months ahead of the market; you'll see chips with similar per-core performance soon after.
And what about the M3 Ultra, that sits at number 3 and came out ten months ago? Why was it not beaten five months ago? Might I add that the M3 Ultra is on an older node than the M5. And what about the A19 Pro, which is better at single core than every desktop chip in the world, and happens to be inside a phone!
Apple has the best silicon team in the world. They choose perf per watt over pure perf, which means they don't win on multi-core, but they're simply the best in the world at the most complicated, difficult, and hardest-to-game metric: single core perf.
Its single-thread benchmark score is 0.6% better than the Intel Core Ultra 9 285K, which has a lower TDP and was released 6 months earlier. Both use the same lithography node.
If you compare chips by their lithography node, Apple silicon is about the same as the others...
Apple's M-series chips are fantastic, but I do agree with you that it's mostly a combination of newer process and lots of cache.
Even when they were new, they competed with AMD's high end desktop chips. Many years later, they're still excellent in the laptop power range - but not in the desktop power range, where chips with a lot of cache match it in single core performance and obliterate it in multicore.
From your article it seems like you benchmark compile times. I am not an expert on the subject, but I don't see the point in comparing ARM compilation times with Intel. There are probably different tricks involved in compilation and the instruction sets are not the same.
I've often been suspicious of this too, having noticed that building one of my projects on Apple Silicon is way quicker than I'd expect relative to x64, given relative test suite run times and relative PassMark numbers.
I don't know how to set up a proper cross compile setup on Apple Silicon, so I tried compiling the same code on 2 macOS systems and 1 Linux system, running the corresponding test suite, and getting some numbers. It's not exactly conclusive, and if I was doing this properly then I'd try a bit harder to make everything match up, but it does indeed look like using clang to build x64 code is more expensive - for whatever reason - than using it to build ARM code.
Systems, including clang version and single-core PassMark:
M4 Max Mac Studio, clang-1700.6.3.2 (PassMark: 5000)
x64 i7-5557U Macbook Pro, clang-1500.1.0.2.5 (PassMark: 2290)
x64 AMD 2990WX Linux desktop, clang-20 (PassMark: 2431)
Single thread build times (in seconds). Code is a bunch of C++, plus some FOSS dependencies that are C, everything built with optimisation enabled:
Mac Studio: 365
x64 Macbook Pro: 1705
x64 Linux: 1422
(Linux time excludes build times for some of the FOSS dependencies, which on Linux come prebuilt via the package manager.)
Single thread test suite times (in seconds), an approximate indication of relative single thread performance:
Mac Studio: 120
x64 Macbook Pro: 350
x64 Linux: 309
Build time/test time makes it look like ARM clang is an outlier:
Mac Studio: 3.04
x64 Macbook Pro: 4.87
x64 Linux: 4.60
(The Linux value is flattered here, as it excludes dependency build times, as above. The C dependencies don't add much when building in parallel, but, looking at the above numbers, I wonder if they'd add up to enough when built in series to make the x64 figures the same.)
Which still comes out behind except on multi core, while using substantially more power.
Those Panther Lake comparisons are from the top end PTL to the base M series. If they were compared against comparable SKUs they’d be even further behind.
The article said the M5 has significantly higher single core CPU performance, Panther Lake has significantly higher GPU performance. The Panther Lake devices had OLED screens, which consume significantly more power than LCDs, so they were at a disadvantage.
They consume more power at the chip level. You can see this in Intel’s spec sheets. The base recommended power envelope of the PTL is the maximum power envelope of the M5. They’re completely different tiers. You’re comparing a 25-85W tier chip to a 5W-25W chip.
They also only win when it comes to multi core whether that’s CPU or GPU. If they were fairly compared to the correct SoC (an M4 Pro) they’d come out behind on both multicore CPU and GPU.
This was all mentioned in my comment addressing the article. This is the trick that Apple’s competitors are using: comparing across SKU ranges to grab the headlines. PTL is a strong chip, no doubt, but it’s still behind Apple across all the metrics in a like-for-like comparison.
Are the Intel systems plugged in when running those tests? Usually when Apple machines do the tests then the difference between battery/plugged in is small if any.
Genuine question, when people talk about apple silicon being fast, is the comparison to windows intel laptops, or Mac intel architecture?
Because, when running a Linux Intel laptop, even with CrowdStrike and a LOT of corporate ware, there is no slowness.
When blogs talk about "fast" like this I always assumed it was for heavy lifting, such as video editing or AI stuff, not just day to day regular stuff.
I'm confused, is there a speed difference in day to day corporate work between new Macs and new Linux laptops?
I use pretty much all platforms and architectures as my "daily drivers" - x64, Apple Silicon, and ARM Cortex, with various mixtures of Linux/Mac/Windows.
When Apple released Apple Silicon, it was a huge breath of fresh air - suddenly the web became snappy again! And the battery lasted forever! Software has bloated to slow down MacBooks again, RAM can often be a major limiting factor in performance, and battery life is more variable now.
Intel is finally catching up to Apple for the first time since 2020. Panther Lake is very competitive on everything except single-core performance (including battery life). Panther Lake CPUs arguably have better features as well - Intel QSV is great if you compile ffmpeg to use it for encoding, and it's easier to use local AI models with OpenVINO than it is to figure out how to use the Apple NPUs. Intel has better tools for sampling/tracing performance analysis, and you can actually see how much you're loading the iGPU (which is quite performant) and how much VRAM you're using. Last I looked, there was still no way to actually check if an AI model was running on Apple's CPU, GPU, or NPU. The iGPUs can also be configured to use varying amounts of system RAM - I'm not sure how that compares to Apple's unified memory for effective VRAM, and Apple has higher memory bandwidth/lower latency.
I'm not saying that Intel has matched Apple, but it's competitive in the latest generation.
This was the same for me. M4 Pro is my first Macbook ever and it's actually incredible how much I prefer the daily driving experience versus my brand new 9800x3d/RTX 5080 desktop, or my work HP ZBook with 13th Gen intel i9. The battery lasts forever without ANY thought. On previous Windows laptops I had to keep an eye on the battery, or make sure it's in power saving mode, or make sure all the background processes aren't running or whatever. My Macbook just lasts forever.
My work laptop will literally struggle to last 2 hours doing any actual work. That involves running IDEs, compiling code, browsing the web, etc. I've done the same on my Macbook on a personal level and it barely makes a dent in the battery.
I feel like the battery performance is definitely down to the hardware. Apple Silicon is an incredible innovation. But the general responsiveness of the OS has to be down to Windows being god-awful. I don't understand how a top of the line desktop can still feel sluggish versus even an M1 Macbook. When I'm running intensive applications like games or compiling code on my desktop, it's rapid. But it never actually feels fast doing day to day things. I feel like that's half the problem. Windows just FEELS so slow all the time. There's no polish.
Have you checked whether the work laptop's bad battery life is due to the OS, or due to the mountain of crapware security and monitoring stuff that many corporations put on all their computers?
I currently have a M3 Pro for a work laptop. The performance is fine, but the battery life is not particularly impressive. It often hits low battery after just 2-3 hours without me doing anything particularly CPU-intensive, and sometimes drains the battery from full to flat while sitting closed in a backpack overnight. I'm pretty sure this is due to the corporate crapware, not any issues with Apple's OS, though it's difficult to prove.
I've tended to think lately that all of the OSes are basically fine when set up reasonably well, but can be brought to their knees by a sufficient amount of low-quality corporate crapware.
My work MBP also can drain the battery in a couple hours of light use. But that's because of FireEye / Microsoft Defender. FireEye has a bug where it pegs the CPU at 100% indefinitely and needs to be killed to stop its infinite loop. Defender hates when a git checkout changes 30,000 files and uses up all my battery (but I can't monitor this because I can't view the processes).
It’s always the corporate wares that cause the issues; in my case it’s CrowdStrike and Zscaler. Even with these wares I can last a full day with my M1 Pro. I only noticed the battery drained to 0 once, when I went on vacation for a week - it never happened before these wares.
I also have to run Defender on my MacBook at work.
If you have access to the Defender settings, I found it to be much better after setting an exclusion for the folder that you clone your git repositories to. You can also set exclusions for the git binary and your IDE.
Part of why Windows feels sluggish is because a lot of the components in many Windows machines are dogshit - especially storage. Even the old M2 is at 1400 MB/s write speed [2], M5 is at 6068 MB/s [2]. Meanwhile in the Windows world, supposed "gamer" laptops struggle to get above 3 GB/s [3]. And on top of that, on Apple devices the storage is directly attached to the SoC - as far as I know, no PCIe, no nothing, just dumb NAND. That alone eliminates a lot of latency, and communication data paths are direct as well, with nothing pesky like sockets or cables degrading signal quality and requiring link training and whatnot.
That M2 MBA, however, only feels sluggish with > 400 Chrome tabs open, because only then does swapping become a real annoyance.
> Part of why Windows feels sluggish is because a lot of the components in many Windows machines are dogshit - especially storage.
Except that you can replace Windows with Linux and suddenly it doesn't feel like dogshit anymore. SSDs are fast enough that they should be adding zero perceived latency for ordinary day-to-day operation. In fact, Linux still runs great on a pure spinning disk setup, which is something no other OS can manage today.
Hmm, for most desktop stuff you're still limited by random access, where even if leagues above HDDs, NVMe still sucks compared to sequential. It's sad that Intel killed Optane/3D XPoint, because those are much better at random workloads and they still had lower latencies than the latest NVMe (not by much anymore).
I don't understand why Optane hasn't been revived already for modern AI datacenter workloads. Being able to augment and largely replace system RAM across the board with something cheaper (though not as cheap as NAND, and more power-hungry too) ought to be a huge plus, even if the technology isn't suitable for replacing HBM or VRAM due to bulk/power constraints.
Windows laptops have been pretty much exclusively NVMe for years. The 2.5" SATA form factor was a waste of space that laptop OEMs were very happy to be rid of, first with mSATA then with M.2 using SATA or NVMe. NVMe finished displacing SATA years ago, when the widespread availability of hardware supporting the NVMe Host Memory Buffer feature meant that entry-level NVMe SSDs could be both faster and cheaper than the good SATA SSDs. Most of the major SSD vendors discontinued their M.2 SATA SSDs long ago, indicating that demand for that product segment had collapsed.
Apple silicon is very fast per size/watt. The mind blowing thing is the MacBook Air, which weighs very little, doesn't have a fan, and feels competitive with top of the line desktop PCs.
I looked into this for the M1 MBA and it had the exact same performance at full load as the MBP...for 7 minutes. Then the thermal throttling hits and it slows down. I'm not sure what the time limit is for newer models. Regardless, the MBA's aren't offered with Pro/Ultra chips, which I desire (and would thermally throttle much sooner than 7 minutes).
My recommendation to friends asking about MBP / MBA is entirely based on whether they do anything that will load the CPU for more than 7 minutes. For me, I need the fans. I even use Macs Fan Control[0], a 3rd party utility, to control the fans for some of my workflows - pegging the fans to 100% to pre-cool the CPU between loads can help a lot.
I edit tons of raw images and 4K video like it’s going out of style.
My used M1 MBA is the fastest computer I’ve ever used. If a video render is going to take more than 7 minutes I walk away or just do something in another app anyway. The difference of a few minutes means nothing.
I've got a cheap laptop stand with built-in fans that blow against the bottom case of my MBA. With my previous M1 and current M3 the stand keeps them from thermal throttling for longer periods. Most of the time it's completely unnecessary but I use it occasionally when doing long duration compiles or other long term heavy loads. Even without using the stand the tasks would complete in a reasonable amount of time, it just gives me a few extra minutes of "full blast" which is often all I need.
I’ve been amazed that while it absolutely uses a ton of battery, so has to be plugged in, my kid is able to play 3D online games with me using my old M1 MacBook Air. Not top of the line stuff (and had to change the resolution to 1440x900), but still. It gets hot, but doesn’t thermal throttle. I had half expected it to start throttling but we played for 3 hours last night with no issues.
What’s surprising is it DOES throttle using Discord with video after an hour or so, unless the battery is already full (I’m guessing it tries to charge, which generates a lot of heat). You get way less heat with a full battery, since it runs on wall power instead of discharging/charging the battery during heavy usage.
My M1 MacBook Air is honestly the best laptop I’ve ever owned. Still snappy and responsive years after release. Fantastic machine. But I’m starting to crave an M5 Air…
I appreciate your helping to strengthen my resolve. More importantly, my wife thanks you as well. That said, the increased RAM available on the new models is really what I want. I have lots of programs open simultaneously.
I'm on an M1. I talk myself out of upgrading by remembering that I after a few hours of happiness my actual day-to-day experience won't noticably change.
Yea, that’s what I have been telling myself. The 16 GB of RAM I have on the M1 is starting to be a limiting factor now. If the RAM was upgradable, I would do that and probably keep the M1.
Apple chips are very good especially for their power envelope but let's not get ahead of ourselves, the only way a Macbook Air feels competitive with a top-of-the-line desktop is if you're not actually utilizing the full sustained power of the desktop. There's a reason why Apple sells much bigger Max/Ultra chips with active cooling.
I do believe Apple are still the fastest single-core (M5, A19 Pro, and M3 Ultra leading), which still matters for a shocking amount of my workloads. But only the M5 has any noticeable gap vs Intel (~16%). Also the rankings are a bit gamed because AMD and Intel put out a LOT of SKUs that are nearly the same product, so whenever they're "winning" on a benchmark they take up a bunch of slots right next to each other even though they're all basically the exact same chip.
Also, all the top nearly 50 multi-core benchmarks are taken up by Epyc and Xeon chips. For desktop/laptop chips that aren't Threadripper, Apple still leads with the M3 Ultra 32-core in multi-core passmark benchmark. The usual caveats of benchmarks not being representative of any actual workload still apply, of course.
And Apple does lag behind in multi-core benchmarks among laptop chips - the M3 Ultra is not offered in a laptop form factor, though it does beat every AMD/Intel laptop chip in multicore benchmarks.
No, the AMD headliners still dominate for single-core performance[1]. Even if you normalize for similar/"same" chips; which really just means you have five cores each generation: AMD's, Intel's, Apple's, and ARM Cortex-A and Cortex-X.
Obviously it's an Apple-to-Oranges (pardon the pun) comparison since the AMD options don't need to care about the power envelope nearly as much; and the comparison gets more equal when normalizing for Apple's optimized domain (power efficiency), but the high-end AMD laptop chips still edge it out.
But then this turns into some sort of religious war, where people want to assume that their "god" should win at everything. It's not, the Apple chips are great; amazing even, when considering they're powering laptops/phones for 10+ hours at a time in smaller chassis than their competitors. But they still have to give in certain metrics to hit that envelope.
I can't find which benchmarks those scores are from. It looks like sometimes they might have been comparing gaming FPS to AMDs paired with Nvidia 5090's? Something feels off about the site you linked - the methodology and scores aren't even cursorily explained, and gaming scores don't make sense. The 5600X doesn't even have an iGPU and the GFX card they had to have paired with it isn't listed.
What does "single core gaming performance" even mean for a CPU that doesn't have an iGPU? How could that not be a category error to compare against Apple Silicon?
Even at the time of announcement the M5 was not the fastest chip. Not even on single core benchmarks, where Apple usually shines due to the design choice of having fewer but more powerful cores (AMD for example does the opposite). For example on Geekbench the Core i9-14900KS and Core Ultra 9 285K were faster.
The distance was not huge, maybe 3%. You can obviously pick and choose your benchmarks until you find one where "your" CPU happens to be the best.
Apple leads all of these in single core, by a significant margin. Even at geekbench.com (3398 for AMD 9950X3D vs 3235 for the 14900KS vs ~4000 for various Apple chips)
I'm not sure I could find a single core benchmark it would lose no matter how hard I tried...
Don’t worry, my new M4 doesn’t feel much faster either due to all the corporate crapware. Since Windows Defender got ported to Mac it’s become terrible in I/O and overall responsiveness. Any file operations will consume an entire core or two on Defender processes.
My personal M1 feels just as fast as the work M4 due to this.
I was impressed with my M4 mini when I got it a year ago but sometime after the Liquid Glass update it is now: beachball… beachball… beachball… reboot… beachball… beachball… Reminds me of the bad old days of Win XP.
How much RAM do you have? That seems to be the main thing that slows down my MacBooks (original launch-day 16GB M1 MBP and 32 GB M2 Pro). The M1 CPU is finally starting to show its age for some things, but the M2 Pro is really only RAM limited in perceived speed for me.
The cores are. Nothing is beating an M4/M5 on single-core CPU performance, and per-cycle nothing is even particularly close.
At the whole-chip level, there are bigger devices from the x86 vendors which will pull ahead on parallel benchmarks. And Apple's unfortunate allergy to effective cooling techniques (like, "faster fans move more air") means that they tend to throttle on chip-scale loads[1].
But if you just want to Run One Thing really fast, which even today still correlates better to "machine feels fast" than parallel loads, Apple is the undisputed king.
[1] One of the reasons Geekbench 6, which controversially includes cooling pauses, looks so much better for Apple than version 5 did.
For laptops at least, I appreciate not having fans that sound like a helicopter. I guess for Mac Mini and Mac Studio having more fan noise is acceptable (maybe a switch would be nice). One of the things that I love about my Air is there is zero fan noise all the time. Yes, it throttles, and 99% of the time I don’t notice and don’t care. Yes, I know there are workloads where it would be very noticeable and I would care, but I don’t personally run too many CPU bound tasks.
Bigger fans can move a lot more air while being less noisy, so if you care about a silent profile for any given amount of work the Mac Studio (or the Mac Mini if you don't need the full power of a Studio) is the best choice.
Same. It’s always disappointing when otherwise promising competing laptops turn out to be considerably more noisy if you’re doing anything more intense than using MS Paint.
It’s probably the single most common corner to cut in x86 laptops. Manufacturers love to shove hot chips into a chassis too thin for them and then toss in whatever cheap tiny-whiny-fan cooling solution they happen to have on hand. Result: laptop sounds like a jet engine when the CPU is being pushed.
Even something like MS Paint can turn a laptop into an aircraft.
The issue is actually very simple. In order to gain more performance, manufacturers like AMD and Intel have for a long time been in a race for the highest frequency, but if you have some know-how in hardware, you know that higher frequency means more power draw the higher you clock.
So you open your MS Paint, and ... your CPU pushes to 5.2GHz and gets fed 15W on a single core. This creates a heat spike at the sensors, and laptop fans are all too often set to react very fast. And VROOOOEEEEM goes your fan as the CPU temp sensor hits 80C on a single core, just for a second. But wait, your MS Paint is just sitting open, and down goes the fan. And repeat, repeat, repeat ...
Notice how Apple focused on running their CPUs no higher than 4.2GHz or something... So even if their CPU boosts to 100%, that thermal peak will be maybe 7W.
Now combine that with Apple using a much more tolerant fan / temp sensor setup. They say: 100C is perfectly acceptable. So when your CPU boosts, it's not dumping 15W, but only 7W. And because the fan reaction threshold is so high, the fans do not react on any Apple product. Unless you run a single or MT process for a LONG time.
And even then, the fans will only ramp up slowly once your 100C has been going on for a few seconds, and yes, your CPU will be thermal throttling while the fans spin up. But you do not feel this effect.
That is the real magic of Apple. Yes, their CPUs are masterpieces at how they get so much performance from a lower frequency, but the real kicker is their thermal / fan profile design.
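A back-of-the-envelope sketch of why capping the clock matters so much: dynamic power scales roughly with V²·f, and voltage rises roughly with frequency near the top of the boost range, so power goes roughly with f³. The frequencies below are just the ones from the comment above, not measurements:

    import Foundation

    // Rough model: P ~ V^2 * f, with V rising roughly linearly with f in the boost range,
    // so P scales approximately with f^3. Illustrative numbers only.
    let fHigh = 5.2, fLow = 4.2                    // GHz
    let powerRatio = pow(fHigh / fLow, 3.0)        // roughly 1.9
    print(String(format: "~%.1fx the power for %.0f%% more frequency",
                 powerRatio, (fHigh / fLow - 1) * 100))

That roughly 2x is in the same ballpark as the ~15W vs ~7W peaks mentioned above.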
The wife has an old Apple-clone laptop from 2018. The thing is silent 99.9% of the time. No fans, nothing. Because Xiaomi used the same tricks on that laptop, allowing it to boost to the max without triggering the fan ramp. And when the fan does trigger with a long-running process, they use a very low RPM until the temperature gets way too high. I had laptops with the same CPU from other brands in the same time period, and they all had annoying fan profiles. That showed me that a lot of the Apple magic is good design around the hardware/software/fan.
But ironically, that magic has been forgotten in later models by Xiaomi ... Tsk!
Manufacturers think: it's better if millions of people suffer more noise than if a few thousand laptops die or get damaged from too much heat. So ramp up the fans!!!
And as a cherry on top, Apple uses custom fans designed to emit noise in less annoying frequencies and when multiple fans are in play, slightly varies their speeds to avoid harmonizing. So even when they do run, they're not perceived as being as loud at most speeds.
You can mostly fix this by running your CPU in "battery saving" mode. CPUs should basically never boost to the 5GHz+ range unless they're doing something that's absolutely latency-critical. It's a huge waste of energy for a negligible increase in performance.
I don't get your first line. When people talk about Apple's core speeds they're not talking about cycles per instruction or something, they're talking about single-thread performance on a benchmark like Geekbench. Geekbench runs various real-world code and it's the gross throughput that is measured, and it's there that Apple cores shine.
> It doesn't really make much sense to compare per-cycle performance across microarchitectures as there are multiple valid trade-offs.
That's true in principle, but IMHO a little too evasive. In point of fact Apple 100% won this round. Their wider architecture is actually faster than the competition in an absolute sense even at the deployed clock rates. There's really no significant market where you'd want to use anything different for CPU compute anywhere. Datacenters would absolutely buy M5 racks if they were offered. M5 efficiency cores are better than Intel's or Zen 5c every time they're measured too.
Just about the only spaces where Apple is behind[1] are die size and packaging: their cores take a little more area per benchmark point, and they're still shipping big single dies. And they finance both of those shortcomings with much higher per-part margins.
Intel and AMD have moved hard into tiled architectures and it seems to be working out for them. I'd expect Apple to do the same soon.
[1] Well, except the big elephant in the room that "CPU Performance Doesn't Matter Much Anymore". Consumer CPUs are fast enough and have been for years now, and the stuff that feels slow is on the GPU or the cloud these days. Apple's in critical danger of being commoditized out of its market space, but then that's true of every premium vendor throughout history.
Oh. Apple won this and the last few rounds for sure. They definitely picked the right microarchitecture and delivered masterfully.
Early on personally I had doubts they could scale their CPU to high end desktop performance, but obviously it hasn't been an issue.
My nitpick was purely about using per-cycle performance as a metric, which is as much nonsense as comparing GHz: AFAIK Apple CPUs still top out at 4.5 GHz, while AMD/Intel reach 6 GHz, so obviously the architectures are optimized for different target frequencies (which makes sense: the power costs of a high-GHz design are astronomical).
And as an microarchitecture nerd I'm definitely interested in how they can implement such a wide architecture, but wide-ness per-se is not a target.
Nowhere in the submission or even the comment you replied to did anyone say "fastest". The incredibly weird knee-jerk defensiveness by some is bizarre.
It was a discussion about how the P cores are left ready to speedily respond to input via the E cores satisfying background needs, in this case talking specifically about Apple Silicon because that's the writer's interest. But of course loads of chips have P and E cores, for the same reason.
I think you're bringing up a great question here. If you ask a random person on the street "is your laptop fast", the answer probably has more to do with what software that person is running, than what hardware.
My Apple silicon laptop feels super fast because I just open the lid and it's running. That's not because the CPU ran instructions super fast, it's because I can just close the lid and the battery lasts forever.
Replaced a good Windows machine (Ryzen 5? 32 GB) and I have a late Intel Mac and a Linux workstation (6-core Ryzen 5, 32 GB).
Obviously the Mac is newer. But wow. It's faster even on things that CPU shouldn't matter, like going through a remote samba mount through our corporate VPN.
- Much faster than my intel Mac
- Faster than my Windows
- Haven't noticed any improvements over my Linux machines, but with my current job I no longer get to use them much for desktop (unfortunately).
Of course, while I love my Debian setup, boot up is long on my workstation; screensaver/sleep/wake up is a nightmare on my entertainment box (my fault, but common!). The Mac just sleeps/wakes up with no problems.
The Mac (smallest Air) is also by far the best laptop I've ever had from a mobility POV. Immediate start up, long battery, decent enough keyboard (but I'd rather sacrifice for a longer keypress).
Part of it is that the data pipelines in the Mac are far more efficient with its soldered memory and enhanced buses. You would have to use something like Halo Strix on the PC side to see similar performance upticks at a somewhat affordable price bracket. Things like Samba/VPN mounting should not matter much (unless your Mac network interface is significantly better), but you might see a general snappiness improvement. Heavy compute tasks will be a give and take with modern PC hardware, but Apple is still the king of efficiency.
I still use an M1 MB Air for work mostly docked... the machine is insane for what it can still do, it sips power and has a perfect stability track record for me. I also have a Halo Strix machine that is the first machine that I can run linux and feel like I'm getting a "mac like" experience with virtually no compromises.
I've used Linux as a daily driver for 6 months and I am now back to my M1 Max for the past month.
I didn't find any reply mentioning the ease of use, benefits, and handy things the Mac does that Linux won't: Spotlight, the Photos app with all the face recognition and general image indexing, contact sync, etc. It takes ages to set those up on Linux, while on Macs everything just works with an Apple account. So I wonder, if Linux had to do all this background stuff, whether it would be able to run as smoothly as Macs do these days.
For context: I was running Linux for 6 months for the first time in 10 years (during which I was daily driving Macs). My M1 Max still beats my full-tower gaming PC, which was running Linux. I've used Windows and Linux before, and Windows for gaming too. My Linux setup was very snappy without any corporate stuff. But my office was getting warm because of the PC. My M1 barely turns on the fans, even with large DB migrations and other heavy operations during software development.
This is a metric I never really understood. how often are people booting? The only time I ever reboot a machine is if I have to. For instance the laptop I'm on right now has an uptime of just under 100 days.
Back in the bad old days of Intel Macs, I had a full system crash just as I was about to get up to give a presentation in class.
It rebooted and got to desktop, restoring all my open windows and app state, before I got to the podium (it was a very small room).
The Mac OS itself seems to be relatively fast to boot, the desktop environment does a good job recovering from failures, and now the underlying hardware is screaming fast.
I should never have to reboot, but in the rare instances when it happens, being fast can be a difference maker.
My Mac - couldn’t tell you, I just close the lid. My work laptop? Probably every day, as it makes its own mind up what it does when you close the lid. Even the “shut down” button in the start menu often restarts the machine in win 11.
My work desktop? Every day, and it takes > 30 seconds to go from off to desktop, and probably another minute or two for things like Docker to decide that they’ve actually started up.
Windows can boot pretty fast these days, I'm always surprised by it. I run LTSC on mine though, so zero bloat. Both my Macs and Windows LTSC have quick boots nowadays, I'm not sure I could say which is faster, but it might be the Windows.
It can boot and show a desktop fast after logging in. However, after that it seems still to be doing a lot in the background. If I try to open up Firefox, or any other app, immediately after I see the desktop it will take forever to load. When I let the desktop sit for a minute and then open Firefox it opens instantly.
Presumably a whole bunch of services are still being (lazy?) loaded.
On the other hand, my cachyos install takes a bit longer to boot, but after it jumps to the desktop all apps that are autostart just jump into view instantly.
Most time on boot seems to be spent initializing drives and finding the right boot drive and loading it.
What hardware? Up until a recent BIOS update my X870 board 9950X3D spent 3 minutes of a cold boot training the RAM… then booting up the OS in 4-8 seconds, so my Mac would always win these comparisons. Now it still takes a while at first boot, but subsequent reboots are snappy.
Something else to consider: a Chromebook on ARM boots significantly faster than an equivalent Intel one. Yes, nowadays MediaTek's latest CPUs wipe the floor with Intel N-whatever, but it has been like this since the early days when the ARM version was relatively underpowered.
My guess would be that ARM Chromebooks might run substantially more cut-down firmware? While intel might need a more full-fat EFI stack? But I haven't used either and am just speculating.
You can notice that memory bandwidth advantage even in workloads like photo editing and code compilation. That and the performance cores reserved for foreground compute, on top of the usual "Linux sucks at swap" (was it fixed? I haven't enabled swap on my Linux machines for ages by now), does make a day-to-day difference in my usage.
I love apple and mainly use one for personal use, but apple users consistently overrate how fast their machines are. I used to see sentiment like "how will nvidia ever catch up with apples unified silicon approach" a few years ago. But if you just try nvidia vs apple and compare on a per dollar level, nvidia is so obviously the winner.
I haven’t used a laptop other than a mac in 10 years. I remember being extremely frustrated with the Intel macs. What I hated most was getting into video meetings, which would make the Intel CPU sound like a 747 taxiing.
The switch from a top spec, new Intel Mac to a base model M1 Macbook Air was like a breath of fresh air. I still use that 5 year old laptop happily because it was such a leap forward in performance. I dont recall ever being happy with a 5 year old device.
I think you should spend some time looking at actual laptop review coverage before asking questions like this.
There are dozens of outlets out there that run synthetic and real world benchmarks that answer these questions.
Apple’s chips are very strong on creative tasks like video transcoding, they have the best single core performance as well as strong multi-core performance. They also have top tier power efficiency, battery life, and quiet operation, which is a lot of what people look for when doing corporate tasks.
Depending on the chip model, the graphics performance is impressive for the power draw, but you can get better integrated graphics from Intel Panther Lake, and you can get better dedicated class graphics from Nvidia.
Some outlets like Just Josh tech on YouTube are good at demonstrating these differences.
> The fact that an idle Mac has over 2,000 threads running in over 600 processes is good news
Not when one of those decides to wreak havoc - Spotlight indexing issues slowly eating away your disk space, iCloud sync spinning over and over and hanging any app that tries to read your Documents folder, Photos sync pegging all cores at 100%… it feels like things might be getting a little out of hand. How can anyone model/predict system behaviour with so many moving parts?
My pet peeve with the modern macOS architecture & its 600 coordinating processes & Grand Central Dispatch work queues is debuggability.
Fifteen years ago, if an application started spinning or mail stopped coming in, you could open up Console.app and have reasonable confidence the app in question would have logged an easy to tag error diagnostic. This was how the plague of mysterious DNS resolution issues got tied to the half-baked discoveryd so quickly.
Now, those 600 processes and 2000 threads are blasting thousands of log entries per second, with dozens of errors happening in unrecognizable daemons doing thrice-delegated work.
>with dozens of errors happening in unrecognizable daemons doing thrice-delegated work.
It seems like a perfect example of Jevons paradox (or andy/bill law): unified logging makes logging rich and cheap and free, but that causes everyone to throw it everywhere willy nilly. It's so noisy in there that I'm not sure who the logs are for anymore, it's useless for the user of the computer and even as a developer it seems impossible to debug things just by passively watching logs unless you already know the precise filter predicate.
In fact they must realize it's hopeless because the new Console doesn't even give you a mechanism to read past logs (I have to download eclecticlight's Ulbow for that).
> Now, those 600 processes and 2000 threads are blasting thousands of log entries per second, with dozens of errors happening in unrecognizable daemons doing thrice-delegated work.
This is the kind of thing that makes me want to grab Craig Federighi by the scruff and rub his nose in it. Every event that’s scrolling by here, an engineer thought was a bad enough scenario to log it at Error level. There should be zero of these on a standard customer install. How many of these are legitimate bugs? Do they even know? (Hahaha, of course they don’t.)
Something about the invisibility of background daemons makes them like flypaper for really stupid, face-palm level bugs. Because approximately zero customers look at the console errors and the crash files, they’re just sort of invisible and tolerated. Nobody seems to give a damn at Apple any more.
It's slowly approaching what SRE has been dealing with for distributed systems... You just have to accept things won't be fully understood and whip out your statistical tooling, it's ok. And if they get the engineering right, you might still keep your low latency corner where only an understandable set of things are allowed.
If open source projects like Slackware Linux can keep it stable on zero budget with a zoo of components since before we knew what SRE was, then Apple can afford to have operating system specialists who know the whole system. It’s like they gave up and welded the jukebox closed because it was making enough money.
Slackware Linux is way less complicated than MacOS. It runs far fewer, and much simpler, components and much less functionality in a default install. Like any Linux, there are myriad problems that can arise as users begin customizing the system, but until then, all those potential bugs remain deceptively hidden below the surface. And Slackware also has no constantly moving hardware team to keep track against and no hard timelines to hit for releases.
I wonder if that explains my intermittent keyboard lockups on macOS? The keyboard just fails to work for a few minutes. The keyboard, a Logitech one with a dongle, never has problems under Windows or Linux. M1 Mac mini, not upgraded to Tahoe yet.
Pretty heavy iMessage user here, but I can't say I experience any issues, and that's probably why your issue is not getting fixed - ie. nobody at Apple is able to reproduce it. Maybe you should gather some info about it and see if you can send a bug report?
It happens more often when switching devices, particularly if it’s a less than regularly used device like an iPad. Happens with my travel MacBook. Takes messages a solid day to catch up.
The most I’ve heard back from a reproducible bug report was ”cool, it shouldn’t do that”. The response came on the dev forums, the actual bug has never been acknowledged or fixed. Multiple times. Why bother?
What I've found is that if I open a picture in iMessage it tends to trigger the CPU-hungry behavior. I notice it after a while as my laptop starts getting hot and the battery drains much faster than expected. I hard quit iMessage, reopen it, and all is fine.
fair point, I should — one classic symptom I experience is the emoji picker will make it crash, not load quickly, and if it finishes loading they all appear as empty placeholders (maybe because I have way too many stickers and iCloud sync is tripping? idk)
and if it paid off, that would almost be acceptable! But no. After spotlight has indexed my /Applications folder, when I hit command-spacebar and type "preview.app", it takes ~4 seconds on my M4 laptop to search the sqlite database for it and return that entry.
On pre-Tahoe macOS there is the “Applications” view (accessible e.g. from the dock). Since the only thing I would use Spotlight for is searching through applications to start, I changed the Cmd+Space keybind to launch the Applications view. The search is instant.
Spotlight, aside from failing to find applications also pollutes the search results with random files it found on the filesystem, some shortcuts to search the web and whatnot. Also, at the start of me using a Mac it repeatedly got into the state of not displaying any results whatsoever. Fixing that each time required running some arcane commands in the terminal. Something that people associate with Linux, but ironically I think now Linux requires less of that than Mac.
But in Tahoe they removed the Applications view, so my solution is gone now.
All in all, with Apple destroying macOS in each release, crippling DTrace with SIP, Liquid Glass, poor performance monitoring compared to what I can see with tools like perf on Linux, or Intel VTune on Windows, Metal slowly becoming the only GPU programming option, I think I’m going to be switching back to Linux.
> I quickly found out that Apple Instruments doesn’t support fetching more than 10 counters, sometimes 8, and sometimes less. I was constantly getting errors like '<SOME_COUNTER>' conflicts with a previously added event. The maximum that I could get is 10 counters. So, the first takeaway was that there is a limit to how many counters I can fetch, and another is that counters are, in some way, incompatible with each other. Why and how they’re incompatible is a good question.
Your first example is a CPU limitation that Instruments doesn't model (does perf?), but is still mostly better than Intel chips that are limited to 4 dynamic counters (I think still? At least that's what I see in the Alder Lake's Golden Cove perfmon files...)
Your second example is the complaint that Instruments doesn't have flamegraph visualization? That was true a decade ago when it was written, and is not true today. Or that Instruments' trace file format isn't documented?
So far I have provided you with examples of how Instruments.app loses to perf. Perf does not have these limitations. You have not provided any examples in the reverse direction.
Both of your examples are actually very good at explaining my point. Both Instruments and perf largely expose the same information, since they use trace features in the hardware together with kernel support to profile code. Where they differ is the UI they provide. perf provides almost nothing; Instruments provides almost everything. This is because perf is basically a library and Instruments is a tool that you use to find performance problems.
Why do I like Instruments and think it is better? Because the people who designed it optimized it for solving real performance problems. There are a bunch of "templates" that are focused on issues like "why is my thing so slow, what is it doing" to "why am I using too much memory" to "what network traffic is coming out of this app". These are real, specific problems while perf will tell you things like "oh this instruction has a 12% cache miss rate because it got scheduled off the core 2ms ago". Which is something Instruments can also tell you, but the idea is that this is totally the wrong interface that you should be presenting for doing performance work since just presenting people with data is barely useful.
What people do instead with perf is they have like 17 scripts 12 of which were written by Brendan Gregg to load the info into something that can be half useful to them. This is to save you time if you don't know how the Linux kernel works. Part of the reason why flamegraphs and Perfetto are so popular is because everyone is so desperate to pull out the info and get something, anything, that's not the perf UI that they settle for what they can get. Instruments has exceptionally good UI for its tools, clearly designed by people who solve real performance problems. perf is a raw data dump from the kernel with some lipstick on it.
Mind you, I trust the data that perf is dumping because the tool is rock-solid. Instruments is not like that. It's buggy, sometimes undocumented (to be fair, perf is not great either, but at least it is open source), slow, and crashes a lot. This majorly sucks. But even with this I solve a lot more problems clicking around Instruments UI and cursing at it than I do with perf. And while they are slow to fix things they are directionally moving towards cleaning up bugs and allowing data export, so the problems that you brought up (which are very valid) are solved or on their way towards being solved.
> Because the people who designed it optimized it for solving real performance problems.
The implication that perf is not is frankly laughable. Perhaps one major difference is that perf assumes you know how the OS works, and what various syscalls are doing.
> perf assumes you know how the OS works, and what various syscalls are doing.
You just proved again that it's not optimized for reality, because that knowledge can't be assumed: the pool of people trying to solve real performance problems is much wider than the pool with that knowledge.
I have the same issue on my M4 Macbook Pro and I had it on my previous M2 Apple Mac Mini, on several macOS versions (pre-Tahoe). I suspect it has to do with the virtual filesystem layer, as I had used OneDrive for Mac and now Proton Drive. Whatever it is, it has been broken for years on several devices and OSes and I am pretty sure Apple doesn't care about it.
I’ve actually had worse problems as recently as last week: Apps stopped showing up completely in spotlight.
Only a system reinstall + manually deleting all index files fixed it. Meanwhile it was eating 20-30GB of disk space. There are tons of reports of this in the apple forums.
Even then, it feels a lot slower in MacOS 26 than it did before, and you often get the rug-pull effect of your results changing a millisecond before you press the enter key. I would pay good money to go back to Snow Leopard.
I had the same problem last year, re-indexing all the files fixed it for me[1].
That being said, macOS was definitely more snappy back on Catalina, which was the first version I had, so I can't vouch for Snow Leopard. Each update after Catalina felt gradually worse, and from what I heard Tahoe feels like the last nail in the coffin.
I hope the UX team will deliver a more polished, expressive and minimal design next time.
Catalina and Mojave were the closest releases in terms of quality that we got to Snow Leopard. Catalina in particular since it was the release that removed more 32-bit cruft (like Snow Leopard before it).
I may be a spotlight unicorn, but I’ve never seen this behavior people complain about. Spotlight has always been instant for me, since its introduction and I’ve never seen a regression.
It is completely useless on network mounts, however, where I resort to find/grep/rg
I've never had this issue. M1 Max. But I also disable some of the Spotlight indexes. Cmmd+Space has no files for me, when I know I am searching for a file I use Finder search instead.
It feels like the 2000s era of “Mac software is better but you have to tolerate their hardware to enjoy it” has inverted in the last 5 years. Incredible hardware down to in-house silicon, but software that would have given Steve Jobs a stroke.
Firstly performance issues like wtf is going on with search. Then there seems to be a need to constantly futz with stable established apps UXes every annual OS update for the sake of change. Moving buttons, adding clicks to workflows, etc.
My most recent enraging find was the date picker in the Reminders app. When editing a reminder, there is an up/down arrow interface to the side of the date, but if you click the arrows they change the MONTH. Who decided that makes any sense? In what world is bumping a reminder by a month the most common change? It’s actually worse than useless, it’s actively net negative.
I just got my first ARM Mac to replace my work Win machine (what has MS done to Windows!?!? :'()
Used to be I could type "display" and I'd get right to display settings in Settings. Now it shows thousands of useless links to who knows what. Instead I have to type "settings" and then, within Settings, type "display".
Still better than the Windows shit show.
Honestly, a well setup Linux machine has better user experience than anything on the market today.
Even if the degradation of the user interfaces is noticed especially by older people, I doubt that this has anything to do with them being old and I believe that it is caused only by them being more experienced, i.e. having seen more alternatives for user interfaces.
For several decades, I have used hundreds of different computers, from IBM mainframes, DEC minicomputers and early PCs with Intel 8080 or Motorola MC6800 until the latest computers with AMD Zen 5 or Intel Arrow Lake. I have used a variety of operating systems and user interfaces.
During the first decades, there has been a continuous and obvious improvement in user interfaces, so I never had any hesitation to switch to a new program with a completely different user interface for the same application, even every year or every few months, whenever such a change resulted in better results and productivity.
Nevertheless, an optimum seems to have been reached around 20 years ago, and since then more often than not I see only worse interfaces that make it harder to do what was simpler previously, so there is no incentive for an "upgrade".
Therefore I indeed customize my GUIs in Linux to a mode that resembles much more older Windows or MacOS than their recent versions and which prioritizes instant responses and minimum distractions over the coolest look.
In the rare occasions when I find a program that does something in a better way than what I am using, I still switch immediately to it, no matter how different it may be in comparison with what I am familiar, so conservatism has nothing to do with preferring the older GUIs.
> Nevertheless, an optimum seems to have been reached around 20 years ago, and since then more often than not I see only worse interfaces that make harder
A consequence of having "UI designers" paid on salary instead of individual contract jobs that expire when the specific fix is complete. In order to preserve their continuing salary, the UI designers have to continue making changes for changes sake (so that the accounting dept. does not begin asking: "why are we paying salary for all these UI designers if they are not creating any output"). So combining reaching an optimum 20 years ago with the fact that the UI designers must make changes for the sake of change, results in the changes being sub-optimal.
> Nevertheless, an optimum seems to have been reached around 20 years ago, and since then more often than not I see only worse interfaces that make harder to do what was simpler previously, so there is no incentive for an "upgrade".
“I've come up with a set of rules that describe our reactions to technologies:
1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.
2. Anything that's invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.
3. Anything invented after you're thirty-five is against the natural order of things.”
Because of Linux's Fragmentation there are so many different UI being created right now and so much more experimentation.
There is KDE with endless customization, GNOME for people who are okay with a more opinionated UI, COSMIC, the completely new desktop environment, and of course the good old LXDE. I have to use Windows at work these days, and every single one of these makes Windows feel like it's more suited for old people because of how slow it can be.
Yes, but most people just fire up the distro, UI, and settings that they're most comfortable with for their daily driver. People can and often do keep things the same.
Or, and bear with me here, there is a problem even if you aren't experiencing it.
I've been using spotlight since it was introduced for... everything.
In Tahoe it has been absolutely terrible. Unusable.
Always indexing.
Never showing me applications which is the main thing I use it for (yes, it is configured to show applications!).
They broke something.
Although there is the consistent trap of tools that assign threads/workers based on the number of cores (e.g. unit testing or bundling tools). This means the efficiency cores get dragged in and can absolutely tank the process.
This was particularly pronounced on the M1 due to the 50/50 split. We reduced the number of workers on our test suite based on the CPU type and it sped up considerably.
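If it helps anyone else dodging that trap: on Apple Silicon you can ask the OS how many of the logical CPUs are performance cores and size the worker pool from that instead of the total count. A minimal C sketch, assuming the hw.perflevel0.logicalcpu sysctl that newer macOS exposes (the fallback covers Intel Macs and older releases):

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysctl.h>

/* Read an integer sysctl by name; returns -1 if it doesn't exist. */
static int sysctl_int(const char *name) {
    int value = 0;
    size_t size = sizeof(value);
    if (sysctlbyname(name, &value, &size, NULL, 0) != 0) return -1;
    return value;
}

int main(void) {
    int p_cores = sysctl_int("hw.perflevel0.logicalcpu"); /* P cores only */
    int total   = sysctl_int("hw.logicalcpu");            /* all logical CPUs */
    int workers = (p_cores > 0) ? p_cores : total;        /* fallback if no perf levels */
    printf("spawning %d workers (machine has %d logical CPUs)\n", workers, total);
    return 0;
}
```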
Does anyone have any insight into the MacOS scheduler and the algorithm it uses to place threads on E vs. P cores? Is it as simple as noting whether a thread was last suspended blocking on I/O or for a time slice timeout and mapping I/O blockers to E cores and time slice blockers to P cores? Or does the programmer indicate a static mapping at thread creation? I write code on a Mac all the time, but I use Clojure and all the low level OS decisions are opaque to me.
The baseline is static: low QoS tasks are dispatched to the E cores, while high QoS tasks are dispatched to P cores. IIRC high QoS cores can migrate to the E cores if all P cores are loaded, but my understanding is that the lowest QoS tasks (background) never get promoted to P cores.
The Apple software stack makes heavy use of thread pools via libdispatch. Individual work items are tagged with QoS, which influences which thread picks up the work item from the queue.
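Roughly what that tagging looks like through the C libdispatch API (the queue label here is made up; compile with clang on macOS, where blocks are enabled by default). Per the comment above, the background-QoS queue should stay on the E cores while the user-initiated work is eligible for the P cores:

```c
#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void) {
    /* Serial queue tagged with background QoS; the scheduler keeps this on E cores. */
    dispatch_queue_t indexer = dispatch_queue_create(
        "com.example.indexer", /* hypothetical label */
        dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL,
                                                QOS_CLASS_BACKGROUND, 0));

    /* Global concurrent queue at user-initiated QoS; preferred for P cores. */
    dispatch_queue_t urgent = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0);

    dispatch_group_t group = dispatch_group_create();
    dispatch_group_async(group, indexer, ^{ puts("indexing in the background"); });
    dispatch_group_async(group, urgent,  ^{ puts("work the user is waiting on"); });
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return 0;
}
```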
The article mentions P or E is generally decided by whether it's a "background" process (whatever that means). Possibly some (undocumented) designation in the code, or a directive to the compiler of the binary, decides this at compile time.
My M2 MBA doesn't have a fan but literally smokes the majority of Intel systems, which are space heaters this time of year. Those legacy x86 apps don't really exist for the majority of people anymore.
If you place 1mm thermal pads between the sinks and the case, the CPUs/GPUs won't throttle as readily. At least for my M3 MBA (check your actual clearance).
I replaced a MacPro5,1 with an M2Pro — which uses soooooo much less energy performing similarly mundane tasks (~15x+). Idle is ~25W v. 160W
Can't Windows/Linux pin background threads to specific cores on Intel too? So that your foreground app isn't slowed down by all the background activity going on? Or there's something else to it that I don't understand. I thought E cores' main advantage is that they use less power which is good for battery life on laptops. But the article makes it sound like main advantage of Apple Silicon is that it splits foreground/background workloads better. Isn't it something that can already be done without a P/E distinction?
One thing that distinguishes macOS here is that the mach kernel has the concept of “vouchers” which helps the scheduler understand logical calls across IPC boundaries. So if you have a high-priority (UserInitiated) process, and it makes an IPC call out to a daemon that is usually a low-priority background daemon, the high-priority process passes a voucher to the low-priority one, which allows the daemon’s ipc handling thread to run high-priority (and thus access P-cores) so long as it’s holding the voucher.
This lets Apple architect things as small, single-responsibility processes, but make their priority dynamic, such that they’re usually low-priority unless a foreground user process is blocked on their work. I’m not sure the Linux kernel has this.
That is actually quite simple and nifty. It reminds me of the 4 priorities RPC requests can have within the Google stack: 0 being "if this fails it will result in a big fat error for the user," up to 3, "we don't care if this fails because we will run the analysis job again in a month or so."
IIRC in macOS you do need to pass the voucher, it isn’t inherited automatically. Linux has no knowledge of it, so first it has to be introduced as a concept and then apps have to start using it.
Multithreading has been more ubiquitous in Mac apps for a long time, thanks to Apple having offered mainstream multi-CPU machines very early on (circa 2000), predating even OS X itself, and having made a point of making multithreading easier in its SDKs. By contrast, multicore machines weren't common in the Windows/x86 world until around the late 2000s with the boom of Intel's Core series CPUs, but single-core x86 CPUs persisted for several years after that, and Windows developer culture still hasn't embraced multithreading as fully as its Mac counterpart has.
This then made it dead simple for Mac developers to adopt task prioritization/QoS. Work was already cleanly split into threads, so it’s just a matter of specifying which are best suited for putting on e-cores and which to keep on P-cores. And overwhelmingly, Mac devs have done that.
So the system scheduler is a good deal more effective than its Windows counterpart because third party devs have given it cues to guide it. The tasks most impactful to the user’s perception of snappiness remain on the P-cores, the E-cores stay busy with auxiliary work and keep the P-cores unblocked and able to sleep more quickly and often.
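To make that per-thread tagging concrete, here is a small sketch using the macOS pthread QoS call as I understand it (pthread_set_qos_class_self_np from <pthread/qos.h>); the "thumbnail worker" is just a hypothetical example of auxiliary work:

```c
#include <pthread.h>
#include <pthread/qos.h>
#include <stdio.h>

/* Thread entry point for auxiliary work the user isn't actively waiting on. */
static void *thumbnail_worker(void *arg) {
    (void)arg;
    /* Tag this thread as utility QoS; the scheduler may park it on E cores. */
    pthread_set_qos_class_self_np(QOS_CLASS_UTILITY, 0);
    puts("generating thumbnails at utility QoS");
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, thumbnail_worker, NULL);
    pthread_join(t, NULL);
    return 0;
}
```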
Yeah, this is my guess as well. The other OSes have the ability to pin to specific cores, but first-party Apple leaned hard into coding to that hardware vision. Since Apple would love to merge the desktop and mobile software, being very deliberate about what is background vs foreground work is essential. Windows and Linux have not had the hardware guarantees of differentiating between cores, so few programs have taken the effort to be explicit about how the work is executed.
When I ran Gnome, I was regularly annoyed at how often an indexing service would chew through CPU.
There was an article by Raymond Chen where he argued that giving app developers an API option to say "run me under high/low priority" rarely works, because every developer views their program as the main character on the stage and couldn't care less about other programs' performance, and they are incentivized to enable the "high priority" option if given the chance because it makes their program run better (at the expense of other programs). So unless there's a strict audit on some kind of app store, or API rules that enforce that developers don't abuse the priority API, it's sometimes better to let the OS decide all the scheduling dynamically as the programs run (say, a foreground UI window is automatically given high priority by the OS), so that the scheduling stays fair.
The way it’s conceptualized on Apple platforms is primarily user-initiated vs. program initiated, with the former getting priority. It’s positioned as being about tasks within a program competing for resources rather than programs competing with each other.
So for example, if in an email client the user has initiated the export of a mailbox, that is given utmost priority while things like indexing and periodic fetches get put on the back burner.
This works because even a selfish developer wants their program to run well, which setting all tasks as high priority actively and often visibly impedes, and so they push less essential work to the background.
It just happens that in this case, smart threading on the per-process level makes life for the system scheduler easier.
It's the combination of the two that yields the best of both worlds.
Android SoCs adopted heterogeneous CPU architectures ("big.LITTLE" in the ARM sphere) years before Apple did, and as a result, there have been multiple attempts to tackle this in Linux. The latest, upstream, and perhaps the most widely deployed way of efficiently using such processors involves Energy-Aware Scheduling [1]. This allows the kernel to differentiate between performant and efficient cores, and schedule work accordingly, avoiding situations in which brief workloads are put on P cores and the demanding ones start hogging E cores. Thanks to this, P cores can also be put to sleep when their extra power is not needed, saving power.
One advantage macOS still has over Linux is that its kernel can tell performance-critical and background workloads apart without taking guesses. This is beneficial on all sorts of systems, but it particularly shines on heterogeneous ones, allowing unimportant workloads to always occupy E cores, and freeing P cores for loads that would benefit from them, or simply letting them sleep for longer. Apple solved this problem by defining a standard interface for user space to communicate such information down [2]. As far as I'm aware, Linux currently lacks an equivalent [3].
Technically, your application can still pin its threads to individual cores, but to know which core is which, it would have to parse information internal to the scheduler. I haven't seen any Linux application that does this.
[1] https://www.kernel.org/doc/html/latest/scheduler/sched-energ...
[2] https://developer.apple.com/library/archive/documentation/Pe...
[3] https://github.com/swiftlang/swift-corelibs-libdispatch?tab=...
Similarly, are there any modern benchmarks of the performance impact of pinning programs to a core in Linux? Are we talking <1% or something actually notable for a CPU bound program?
I have read there are some potential security benefits if you were to keep your most exploitable programs (eg web browser) on its own dedicated core.
It's very heavily dependent on what your processes are doing. I've seen extreme cases where the gains of pinning were large (well over 2x when cooperative tasks were pinned to the same core), but thats primarily about preventing the CPU from idling long enough to enter deeper idle states.
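For anyone who wants to measure this themselves: on Linux, pinning the calling thread is a couple of lines with sched_setaffinity (taskset(1) or pthread_setaffinity_np work too). As noted above, which CPU index corresponds to a P or E core isn't exposed portably, so the core number below is an arbitrary placeholder:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set); /* arbitrary example: restrict ourselves to logical CPU 2 */

    /* pid 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    puts("pinned to CPU 2; benchmark your workload before and after");
    return 0;
}
```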
Pinning exists, but the interesting part is signal quality: macOS gets consistent “urgency” signals (QoS) from a lot of frameworks/apps, so scheduling on heterogeneous cores is less guessy than inferring it from runtime behavior.
> Admittedly the impression isn’t helped by a dreadful piece of psychology, as those E cores at 100% are probably running at a frequency a quarter of those of P cores shown at the same 100%
It’s about half, actually
> The fact that an idle Mac has over 2,000 threads running in over 600 processes is good news
>If you use an Apple silicon Mac I’m sure you have been impressed by its performance.
This article couldn't have come at a better time, because frankly speaking I am not that impressed after I tested Omarchy Linux. Everything was snappy. It was like being back in the DOS or Windows 3.11 era (not quite, but close). It makes me wonder why the Mac couldn't be like that.
Apple Silicon is fast, no doubt about it. It isn't just benchmarks: even under emulation, compiling, or other workloads it is fast, if not the fastest. So there is plenty of evidence it isn't benchmark-specific, despite some people's claims that Apple is only fast on Geekbench. The problem is that macOS is slow, and for whatever reason it hasn't improved much. I am hoping that dropping support for x86 in the next macOS means they have the time and an excuse to do a lot of work on macOS under the hood, especially with OOM handling and paging.
I have a ThinkPad besides my main MacBook. I recently switched to KDE, a full desktop environment, and it is just insane how much faster everything renders than on macOS. And that's on a relatively underpowered integrated Ryzen GPU. Window dragging is butter smooth on a 120Hz screen, which I cannot say of macOS (though it's outright terrible with the recent Electron issue).
Apple Silicon is awesome and was a game changer when it came out. Still very impressive that they have been able to keep the MacBook Air passively cooled since the first M1. But yeah, macOS is holding it back.
Modern machines are so insanely fast, the UI should almost react before you do an action.
My 5900X machine with relatively slow RAM running CachyOS actually almost feels as instant as DOS machines with incredible low latency.
Instead, we get tons of layers of abstraction (Electron etc), combined with endpoint security software in the enterprise world that bring any machine to its knees by inspecting every I/O action.
Zed is an awesome editor where every interaction is just “snappy”.
I feel all the animations and transitions have been introduced to macOS just to paper over slowness. I’ve disabled them via the accessibility settings, which helps perceived snappiness a bit, but it also makes jank visible where performance is not up to par.
That's just framing. A different wording could be: by moving more work to slow (but power efficient) cores, the other cores (let's call them performance cores) are free to do other stuff.
As always, YMMV
Hmm, I guess the Apple silicon laptops don't exist? Did I dream that I bought one this year? Maybe I did - it has been a confusing year.
> he's talking about real systems with real processes in a generic way
So which real but impossible to design systems are we discussing then if not the Apple silicon systems?
sigh.
also a mandatory: and yet the macbooks are faster and more battery efficient than any PC laptop with linux/windows
I ran a performance test back in October comparing M4 laptops against high-end Windows desktops, and the results showed the M-series chips coming out on top.
https://www.tyleo.com/blog/compiler-performance-on-2025-devi...
Apple has the best silicon team in the world. They choose perf per watt over pure perf, which means they don't win on multi-core, but they're simply the best in the world at the most complicated, difficult, and hardest-to-game metric: single-core perf.
Even when they were new, they competed with AMD's high end desktop chips. Many years later, they're still excellent in the laptop power range - but not in the desktop power range, where chips with a lot of cache match it in single core performance and obliterate it in multicore.
https://www.cpu-monkey.com/en/compare_cpu-apple_m4-vs-amd_ry...
I don't know how to set up a proper cross compile setup on Apple Silicon, so I tried compiling the same code on 2 macOS systems and 1 Linux system, running the corresponding test suite, and getting some numbers. It's not exactly conclusive, and if I was doing this properly properly then I'd try a bit harder to make everything match up, but it does indeed look like using clang to build x64 code is more expensive - for whatever reason - than using it to build ARM code.
Systems, including clang version and single-core PassMark: (figures omitted)
Single thread build times (in seconds), for code that is a bunch of C++ plus some FOSS dependencies that are C, everything built with optimisation enabled: (figures omitted; the Linux time excludes build times for some of the FOSS dependencies, which on Linux come prebuilt via the package manager.)
Single thread test suite times (in seconds), an approximate indication of relative single thread performance: (figures omitted)
Build time/test time makes it look like ARM clang is an outlier. (The Linux value is flattered here, as it excludes dependency build times, as above. The C dependencies don't add much when building in parallel, but, looking at the above numbers, I wonder if they'd add up to enough when built in series to make the x64 figures the same.)
Not even a bad little gaming machine on the rare occasion
Those Panther Lake comparisons are from the top-end PTL to the base M series. If they were compared to their comparable SKUs they’d be even further behind.
This was all mentioned in the article.
See the chart here for what the intel SKUs are: https://www.pcworld.com/article/3023938/intels-core-ultra-se...
They consume more power at the chip level. You can see this in Intel’s spec sheets. The base recommended power envelope of the PTL is the maximum power envelope of the M5. They’re completely different tiers. You’re comparing a 25-85W-tier chip to a 5-25W chip.
They also only win when it comes to multi core whether that’s CPU or GPU. If they were fairly compared to the correct SoC (an M4 Pro) they’d come out behind on both multicore CPU and GPU.
This was all mentioned in my comment addressing the article. This is the trick that Apple’s competitors are using: comparing across SKU ranges to grab the headlines. PTL is a strong chip, no doubt, but it’s still behind Apple across all the metrics in a like-for-like comparison.
That was literally my desktop CPU for a very long time.
Because, when running a Linux Intel laptop, even with CrowdStrike and a LOT of corporate-ware, there is no slowness.
When blogs talk about "fast" like this I always assumed it was for heavy lifting, such as video editing or AI stuff, not just day to day regular stuff.
I'm confused, is there a speed difference in day to day corporate work between new Macs and new Linux laptops?
Thank you
When Apple released Apple Silicon, it was a huge breath of fresh air - suddenly the web became snappy again! And the battery lasted forever! Software has bloated to slow down MacBooks again, RAM can often be a major limiting factor in performance, and battery life is more variable now.
Intel is finally catching up to Apple for the first time since 2020. Panther Lake is very competitive on everything except single-core performance (including battery life). Panther Lake CPUs arguably have better features as well - Intel QSV is great if you compile ffmpeg to use it for encoding, and it's easier to use local AI models with OpenVINO than it is to figure out how to use the Apple NPUs. Intel has better tools for sampling/tracing performance analysis, and you can actually see how hard you're loading the iGPU (which is quite performant) and how much VRAM you're using. Last I looked, there was still no way to actually check whether an AI model was running on Apple's CPU, GPU, or NPU. The iGPUs can also be configured to use varying amounts of system RAM - I'm not sure how that compares to Apple's unified memory for effective VRAM, and Apple has higher memory bandwidth/lower latency.
I'm not saying that Intel has matched Apple, but it's competitive in the latest generation.
My work laptop will literally struggle to last 2 hours doing any actual work. That involves running IDEs, compiling code, browsing the web, etc. I've done the same on my Macbook on a personal level and it barely makes a dent in the battery.
I feel like the battery performance is definitely down to the hardware. Apple Silicon is an incredible innovation. But the general responsiveness of the OS has to be down to Windows being god-awful. I don't understand how a top-of-the-line desktop can still feel sluggish versus even an M1 MacBook. When I'm running intensive applications like games or compiling code on my desktop, it's rapid. But it never actually feels fast doing day-to-day things. I feel like that's half the problem. Windows just FEELS so slow all the time. There's no polish.
I currently have a M3 Pro for a work laptop. The performance is fine, but the battery life is not particularly impressive. It often hits low battery after just 2-3 hours without me doing anything particularly CPU-intensive, and sometimes drains the battery from full to flat while sitting closed in a backpack overnight. I'm pretty sure this is due to the corporate crapware, not any issues with Apple's OS, though it's difficult to prove.
I've tended to think lately that all of the OSes are basically fine when set up reasonably well, but can be brought to their knees by a sufficient amount of low-quality corporate crapware.
If you have access to the Defender settings, I found it to be much better after setting an exclusion for the folder that you clone your git repositories to. You can also set exclusions for the git binary and your IDE.
That M2 MBA, however, only feels sluggish with > 400 Chrome tabs open, because only then does swapping become a real annoyance.
[1] https://9to5mac.com/2022/07/14/m2-macbook-air-slower-ssd-bas...
[2] https://www.tomshardware.com/laptops/macbooks/m5-macbook-pro...
[3] https://www.reddit.com/r/AcerNitro/comments/1i0nbt4/slow_ssd...
Except that you can replace Windows with Linux and suddenly it doesn't feel like dogshit anymore. SSDs are fast enough that they should be adding zero perceived latency for ordinary day-to-day operation. In fact, Linux still runs great on a pure spinning disk setup, which is something no other OS can manage today.
With Windows, you're probably still getting SATA and not even NVMe.
The options in that space are increasingly dwindling which is a problem when supporting older machines.
Sometimes it is cheaper to get a sketchy M.2 SSD and adapter than to get an actual SATA drive from one of the larger manufacturers.
(I love my MacBook Air, but it does have its limits.)
My recommendation to friends asking about MBP / MBA is entirely based on whether they do anything that will load the CPU for more than 7 minutes. For me, I need the fans. I even use Macs Fan Control[0], a 3rd party utility, to control the fans for some of my workflows - pegging the fans to 100% to pre-cool the CPU between loads can help a lot.
0: https://crystalidea.com/macs-fan-control
My used M1 MBA is the fastest computer I’ve ever used. If a video render is going to take more than 7 minutes I walk away or just do something in another app anyway. A difference of a few minutes means nothing.
What’s surprising is it DOES throttle using Discord with video after an hour or so, unless the battery is already full (I’m guessing it tries to charge, which generates a lot of heat). You get far less heat with a full battery, when it runs off mains power instead of discharging/charging the battery during heavy usage.
Happiness #1
Apple's CPUs are the most power-efficient, however, due to a bunch of design and manufacturing choices.
But to answer your question, yes, Windows 11 with modern security crap feels 2-3x slower than vanilla Linux on the same hardware.
Also, nearly all of the top 50 multi-core benchmark spots are taken up by Epyc and Xeon chips. For desktop/laptop chips that aren't Threadripper, Apple still leads with the 32-core M3 Ultra in the multi-core PassMark benchmark. The usual caveats about benchmarks not being representative of any actual workload still apply, of course.
And Apple does lag behind in multi-core benchmarks for laptop chips. The M3 Ultra is not offered in a laptop form factor, but it does beat every AMD/Intel laptop chip as well in multicore benchmarks.
Obviously it's an apples-to-oranges (pardon the pun) comparison, since the AMD options don't need to care about the power envelope nearly as much; the comparison gets more equal when normalizing for Apple's optimized domain (power efficiency), but the high-end AMD laptop chips still edge it out.
But then this turns into some sort of religious war, where people want to assume that their "god" should win at everything. It's not, the Apple chips are great; amazing even, when considering they're powering laptops/phones for 10+ hours at a time in smaller chassis than their competitors. But they still have to give in certain metrics to hit that envelope.
1 - https://thepcbottleneckcalculator.com/cpu-benchmarks-2026/
What does "single core gaming performance" even mean for a CPU that doesn't have an iGPU? How could that not be a category error to compare against Apple Silicon?
I was looking at https://www.cpubenchmark.net/single-thread/
See also:
https://nanoreview.net/en/cpu-list/cinebench-scores
https://browser.geekbench.com/mac-benchmarks vs https://browser.geekbench.com/processor-benchmarks
The distance was not huge, maybe 3%. You can obviously pick and choose your benchmarks until you find one where "your" CPU happens to be the best.
https://www.cpubenchmark.net/single-thread/
https://browser.geekbench.com/mac-benchmarks vs https://browser.geekbench.com/processor-benchmarks
Apple leads all of these in single core, by a significant margin. Even at geekbench.com (3398 for AMD 9950X3D vs 3235 for the 14900KS vs ~4000 for various Apple chips)
I'm not sure I could find a single core benchmark it would lose no matter how hard I tried...
My personal M1 feels just as fast as the work M4 due to this.
With maximum corporate spyware it consistently takes 1 second to get a visual feedback on Windows.
The cores are. Nothing is beating an M4/M5 on single-core performance, and per-cycle nothing is even particularly close.
At the whole-chip level, there are bigger devices from the x86 vendors which will pull ahead on parallel benchmarks. And Apple's unfortunate allergy to effective cooling techniques (like, "faster fans move more air") means that they tend to throttle on chip-scale loads[1].
But if you just want to Run One Thing really fast, which even today still correlates better to "machine feels fast" than parallel loads, Apple is the undisputed king.
[1] One of the reasons Geekbench 6, which controversially includes cooling pauses, looks so much better for Apple than version 5 did.
It’s probably the single most common corner to cut in x86 laptops. Manufacturers love to shove hot chips into a chassis too thin for them and then toss in whatever cheap tiny-whiny-fan cooling solution they happen to have on hand. Result: laptop sounds like a jet engine when the CPU is being pushed.
The issue is actually very simple. In order to gain more performance, manufacturers like AMD and Intel have long been in a race for the highest frequency, but if you have some know-how in hardware, you know that the higher you clock, the more power you draw.
So you open your MS Paint, and ... your CPU pushes to 5.2Ghz, and it gets fed 15W on a single core. This creates a heat spike in the sensors, and your fans on laptops, all too often are set to react very fast. And VROOOOEEEEM goes your fan as the CPU Temp sensor hits 80C on a single core, just for a second. But wait, your MS Paint is open, and down goes the fan. And repeat, repeat, repeat ...
Notice how Apple focused on running their CPUs no higher than 4.2GHz or so... So even if their CPU boosts to 100%, that thermal peak will be maybe 7W.
Now combine that with Apple using a much more tolerant fan / temp sensor setup. They say: 100C is perfectly acceptable. So when your CPU boosts, it's not dumping 15W, but only 7W. And because the fan reaction threshold is so high, the fans do not react on any Apple product, unless you run a single- or multi-threaded process for a LONG time.
And even then, the fans will only ramp up slowly once your 100C has been going on for a few seconds, and yes, your CPU will be thermal throttling while the fans spin up, but you do not feel this effect.
That is the real magic of Apple. Yes, their CPUs are masterpieces at how they get so much performance from a lower frequency, but the real kicker is their thermal / fan profile design.
The wife has an old Apple-clone laptop from 2018. The thing is silent 99.9% of the time. No fans, nothing. Because Xiaomi used the same tricks on that laptop, allowing it to boost to the max without triggering the fan ramping. And when it does trigger with a long-running process, they use a very low fan RPM until the temperature goes way too high. I had laptops with the same CPU from other brands in the same time period, and they all had annoying fan profiles. That showed me that a lot of Apple magic is good design around the hardware/software/fans.
But ironically, that magic has been forgotten in later models by Xiaomi ... Tsk!
Manufacturers think: it's better if millions of people suffer more noise than if a few thousand laptops die or get damaged from too much heat. So ramp up the fans!!!
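For reference, the textbook first-order relation behind those 15W-vs-7W numbers (the generic CMOS approximation, not a measurement of these particular chips) is that dynamic power scales with voltage squared times frequency:

```latex
P_{\mathrm{dyn}} \approx \alpha \, C \, V^{2} f
```

Since the voltage has to climb with frequency near the top of the boost range, power grows much faster than linearly with clock speed, which is why capping clocks around 4GHz buys a disproportionately large thermal margin.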
Of course Apple did pick a very good sweet spot, favoring a wide core as opposed to a speed demon more than the competition does.
That's true in principle, but IMHO a little too evasive. In point of fact Apple 100% won this round. Their wider architecture is actually faster than the competition in an absolute sense even at the deployed clock rates. There's really no significant market where you'd want to use anything different for CPU compute anywhere. Datacenters would absolutely buy M5 racks if they were offered. M5 efficiency cores are better than Intel's or Zen 5c every time they're measured too.
Just about the only spaces where Apple is behind[1] are die size and packaging: their cores take a little more area per benchmark point, and they're still shipping big single dies. And they finance both of those shortcomings with much higher per-part margins.
Intel and AMD have moved hard into tiled architectures and it seems to be working out for them. I'd expect Apple to do the same soon.
[1] Well, except the big elephant in the room that "CPU Performance Doesn't Matter Much Anymore". Consumer CPUs are fast enough and have been for years now, and the stuff that feels slow is on the GPU or the cloud these days. Apple's in critical danger of being commoditized out of its market space, but then that's true of every premium vendor throughout history.
Early on personally I had doubts they could scale their CPU to high end desktop performance, but obviously it hasn't been an issue.
My nitpick was purely about using per-cycle performance as a metric, which is as much nonsense as comparing GHz: AFAIK Apple CPUs still top out at 4.5 GHz, while AMD/Intel reach 6 GHz, so obviously the architectures are optimized for different target frequencies (which makes sense: the power costs of a high-GHz design are astronomical).
And as a microarchitecture nerd I'm definitely interested in how they can implement such a wide architecture, but wideness per se is not a target in itself.
It was a discussion about how the P cores are left ready to speedily respond to input via the E cores satisfying background needs, in this case talking specifically about Apple Silicon because that's the writer's interest. But of course loads of chips have P and E cores, for the same reason.
You are comparing 256 AMD Zen 6c cores to what? An M4 Max?
When people say CPU they mean CPU core, and in terms of raw speed, Apple holds the fastest single-core CPU benchmarks.
https://www.cpubenchmark.net/laptop.html#cpumark
https://www.cpubenchmark.net/single-thread/
Where the M5 (non-pro, the one that will be in the next MacBook Air) is on top.
When the M5 multicore scores arrive, the multi-core charts will be interesting.
My Apple silicon laptop feels super fast because I just open the lid and it's running. That's not because the CPU ran instructions super fast, it's because I can just close the lid and the battery lasts forever.
Replaced a good Windows machine (Ryzen 5? 32 GB), and I have a late Intel Mac and a Linux workstation (6-core Ryzen 5, 32 GB).
Obviously the Mac is newer. But wow. It's faster even on things where the CPU shouldn't matter, like going through a remote Samba mount over our corporate VPN.
- Much faster than my intel Mac
- Faster than my Windows
- Haven't noticed any improvements over my Linux machines, but with my current job I no longer get to use them much for desktop (unfortunately).
Of course, while I love my Debian setup, boot up is long on my workstation; screensaver/sleep/wake up is a nightmare on my entertainment box (my fault, but common!). The Mac just sleeps/wakes up with no problems.
The Mac (smallest Air) is also by far the best laptop I've ever had from a mobility POV. Immediate start up, long battery, decent enough keyboard (but I'd rather sacrifice for a longer keypress).
I still use an M1 MB Air for work, mostly docked... the machine is insane for what it can still do; it sips power and has a perfect stability track record for me. I also have a Strix Halo machine that is the first machine where I can run Linux and feel like I'm getting a "Mac-like" experience with virtually no compromises.
I didn't find any reply mentioning the ease of use, benefits, and handy things the Mac does that Linux won't: Spotlight, the Photos app with all the face recognition and general image indexing, contact sync, etc. It takes ages to set those up on Linux, while on Macs everything just works with an Apple account. So I wonder, if Linux had to do all this background stuff, whether it would be able to run as smoothly as Macs do these days.
For context: I was running Linux for 6 months for the first time in 10 years (during which I had been daily-driving Macs). My M1 Max still beats my full tower gaming PC, on which I was running Linux. I've used Windows and Linux before, and Windows for gaming too. My Linux setup was very snappy without any corporate stuff. But my office was getting warm because of the PC. My M1 barely turns on the fans, even with large DB migrations and other heavy operations during software development.
After I put an SSD in it, that is.
I wonder what my Apple silicon laptop is even doing sometimes.
Mac on Intel felt like it was about 2x slower at these basic functions. (I don’t have real data points.)
Intel Mac had lag when opening apps. Silicon Mac is instant and always responsive.
No idea how that compares to Linux.
This is a metric I never really understood. How often are people booting? The only time I ever reboot a machine is when I have to. For instance, the laptop I'm on right now has an uptime of just under 100 days.
It rebooted and got to desktop, restoring all my open windows and app state, before I got to the podium (it was a very small room).
The Mac OS itself seems to be relatively fast to boot, the desktop environment does a good job recovering from failures, and now the underlying hardware is screaming fast.
I should never have to reboot, but in the rare instances when it happens, being fast can be a difference maker.
My work desktop? Every day, and it takes > 30 seconds to go from off to desktop, and probably another minute or two for things like Docker to decide that they’ve actually started up.
Presumably a whole bunch of services are still being (lazy?) loaded.
On the other hand, my cachyos install takes a bit longer to boot, but after it jumps to the desktop all apps that are autostart just jump into view instantly.
Most of the boot time seems to be spent initializing drives, finding the right boot drive, and loading it.
But I'm running a fairly slim Archlinux install without a desktop environment or anything like that. (It's just XMonad as a window manager.)
Even Windows (or at least my install that doesn't have any crap besides visual studio on it) can run for weeks these days...
My work PC will decide to not idle and will spin up fans arbitrarily in the evenings so I shut it down when I’m not using it.
Something else to consider: a Chromebook on ARM boots significantly faster than an equivalent Intel one. Yes, nowadays MediaTek's latest CPUs wipe the floor with Intel N-whatever, but it has been like this since the early days, when the ARM version was relatively underpowered.
Why? I have no idea.
It’s all about the perf per watt.
The switch from a top-spec, new Intel Mac to a base model M1 MacBook Air was like a breath of fresh air. I still use that 5-year-old laptop happily because it was such a leap forward in performance. I don't recall ever being happy with a 5-year-old device.
There are dozens of outlets out there that run synthetic and real world benchmarks that answer these questions.
Apple’s chips are very strong on creative tasks like video transcoding, they have the best single core performance as well as strong multi-core performance. They also have top tier power efficiency, battery life, and quiet operation, which is a lot of what people look for when doing corporate tasks.
Depending on the chip model, the graphics performance is impressive for the power draw, but you can get better integrated graphics from Intel Panther Lake, and you can get better dedicated class graphics from Nvidia.
Some outlets like Just Josh tech on YouTube are good at demonstrating these differences.
Not when one of those decides to wreak havoc - Spotlight indexing issues slowly eating away your disk space, iCloud sync spinning over and over and hanging any app that tries to read your Documents folder, Photos sync pegging all cores at 100%… it feels like things might be getting a little out of hand. How can anyone model/predict system behaviour with so many moving parts?
Fifteen years ago, if an application started spinning or mail stopped coming in, you could open up Console.app and have reasonable confidence the app in question would have logged an easy to tag error diagnostic. This was how the plague of mysterious DNS resolution issues got tied to the half-baked discoveryd so quickly.
Now, those 600 processes and 2000 threads are blasting thousands of log entries per second, with dozens of errors happening in unrecognizable daemons doing thrice-delegated work.
It seems like a perfect example of Jevons paradox (or andy/bill law): unified logging makes logging rich and cheap and free, but that causes everyone to throw it everywhere willy nilly. It's so noisy in there that I'm not sure who the logs are for anymore, it's useless for the user of the computer and even as a developer it seems impossible to debug things just by passively watching logs unless you already know the precise filter predicate.
In fact they must realize it's hopeless because the new Console doesn't even give you a mechanism to read past logs (I have to download eclecticlight's Ulbow for that).
This is the kind of thing that makes me want to grab Craig Federighi by the scruff and rub his nose in it. Every event that’s scrolling by here, an engineer thought was a bad enough scenario to log it at Error level. There should be zero of these on a standard customer install. How many of these are legitimate bugs? Do they even know? (Hahaha, of course they don’t.)
Something about the invisibility of background daemons makes them like flypaper for really stupid, face-palm level bugs. Because approximately zero customers look at the console errors and the crash files, they’re just sort of invisible and tolerated. Nobody seems to give a damn at Apple any more.
grumble
Spotlight, aside from failing to find applications also pollutes the search results with random files it found on the filesystem, some shortcuts to search the web and whatnot. Also, at the start of me using a Mac it repeatedly got into the state of not displaying any results whatsoever. Fixing that each time required running some arcane commands in the terminal. Something that people associate with Linux, but ironically I think now Linux requires less of that than Mac.
But in Tahoe they removed the Applications view, so my solution is gone now.
All in all, with Apple destroying macOS in each release, crippling DTrace with SIP, Liquid Glass, poor performance monitoring compared to what I can see with tools like perf on Linux, or Intel VTune on Windows, Metal slowly becoming the only GPU programming option, I think I’m going to be switching back to Linux.
> I quickly found out that Apple Instruments doesn’t support fetching more than 10 counters, sometimes 8, and sometimes less. I was constantly getting errors like '<SOME_COUNTER>' conflicts with a previously added event. The maximum that I could get is 10 counters. So, the first takeaway was that there is a limit to how many counters I can fetch, and another is that counters are, in some way, incompatible with each other. Why and how they’re incompatible is a good question.
Also: https://hmijailblog.blogspot.com/2015/09/using-intels-perfor...
Your second example: is the complaint that Instruments doesn't have flamegraph visualization? That was true a decade ago when it was written, and is not true today. Or that Instruments' trace file format isn't documented?
Why do I like Instruments and think it is better? Because the people who designed it optimized it for solving real performance problems. There are a bunch of "templates" that are focused on issues like "why is my thing so slow, what is it doing" to "why am I using too much memory" to "what network traffic is coming out of this app". These are real, specific problems while perf will tell you things like "oh this instruction has a 12% cache miss rate because it got scheduled off the core 2ms ago". Which is something Instruments can also tell you, but the idea is that this is totally the wrong interface that you should be presenting for doing performance work since just presenting people with data is barely useful.
What people do instead with perf is they have like 17 scripts 12 of which were written by Brendan Gregg to load the info into something that can be half useful to them. This is to save you time if you don't know how the Linux kernel works. Part of the reason why flamegraphs and Perfetto are so popular is because everyone is so desperate to pull out the info and get something, anything, that's not the perf UI that they settle for what they can get. Instruments has exceptionally good UI for its tools, clearly designed by people who solve real performance problems. perf is a raw data dump from the kernel with some lipstick on it.
Mind you, I trust the data that perf is dumping because the tool is rock-solid. Instruments is not like that. It's buggy, sometimes undocumented (to be fair, perf is not great either, but at least it is open source), slow, and crashes a lot. This majorly sucks. But even with this I solve a lot more problems clicking around Instruments UI and cursing at it than I do with perf. And while they are slow to fix things they are directionally moving towards cleaning up bugs and allowing data export, so the problems that you brought up (which are very valid) are solved or on their way towards being solved.
The implication that perf is not is frankly laughable. Perhaps one major difference is that perf assumes you know how the OS works, and what various syscalls are doing.
You just proved again that it's not optimized for reality, because that knowledge can't be assumed: the pool of people trying to solve real performance problems is much wider than the pool with that knowledge.
Only a system reinstall + manually deleting all index files fixed it. Meanwhile it was eating 20-30GB of disk space. There are tons of reports of this in the apple forums.
Even then, it feels a lot slower in MacOS 26 than it did before, and you often get the rug-pull effect of your results changing a millisecond before you press the enter key. I would pay good money to go back to Snow Leopard.
That being said, macOS was definitely more snappy back on Catalina, which was the first version I had, so I can't vouch for Snow Leopard. Each update after Catalina felt gradually worse, and from what I heard Tahoe feels like the last nail in the coffin.
I hope the UX team will deliver a more polished, expressive and minimal design next time.
[1] - https://support.apple.com/en-us/102321
It is completely useless on network mounts, however, where I resort to find/grep/rg
Firstly performance issues like wtf is going on with search. Then there seems to be a need to constantly futz with stable established apps UXes every annual OS update for the sake of change. Moving buttons, adding clicks to workflows, etc.
My most recent enraging find was the date picker in the Reminders app. When editing a reminder, there is an up/down arrow interface to the side of the date, but if you click them they change the MONTH. Who decided that makes any sense? In what world is bumping a reminder by a month the most common change? It’s actually worse than useless, it’s actively net negative.
I just got my first ARM Mac to replace my work Win machine (what has MS done to Windows!?!? :'()
Used to be I could type "display" and I'd get right to display settings in Settings. Now it shows thousands of useless links to who knows what. Instead I have to type "settings" and then, within Settings, type "display".
Still better than the Windows shit show.
Honestly, a well setup Linux machine has better user experience than anything on the market today.
We probably have to preface that with “for older people”. IMO Linux has changed less UX wise than either Windows or MacOS in recent years
I just installed Plasma with EndeavourOS and use it. I used Cinnamon before it. They don't require much effort.
And yet on Windows 11, hit Win key, type display, it immediately shows display settings as the first result.
People are really unable to differentiate between “I am having issues” and “things are universally or even widely broken”.
It’s a QoS level: https://developer.apple.com/documentation/dispatch/dispatchq...
Edit: It looks like there was some discussion about this on the Asahi blog 2 years ago[0].
[0]: https://asahilinux.org/2024/01/fedora-asahi-new/
SCHED_BATCH and SCHED_IDLE scheduling policies. They've been there since forever.
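A minimal sketch of opting a background worker into one of those policies, so it only runs when nothing else wants the CPU (roughly the role low-QoS work plays on macOS); from a shell, the util-linux chrt tool can do the same for an existing command, if I recall its flags correctly:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    /* SCHED_IDLE ignores the priority field; it must be 0. */
    struct sched_param param = { .sched_priority = 0 };

    /* pid 0 means "this process". Lowering to SCHED_IDLE needs no privileges. */
    if (sched_setscheduler(0, SCHED_IDLE, &param) != 0) {
        perror("sched_setscheduler");
        return 1;
    }
    puts("now scheduled as SCHED_IDLE; heavy background work goes here");
    return 0;
}
```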
> The fact that an idle Mac has over 2,000 threads running in over 600 processes is good news
I mean, only if they’re doing something useful