>Overall the Intel Arc Pro B50 was at 1.47x the performance of the NVIDIA RTX A1000 with that mix of OpenGL, Vulkan, and OpenCL/Vulkan compute workloads both synthetic and real-world tests. That is just under Intel's own reported Windows figures of the Arc Pro B50 delivering 1.6x the performance of the RTX A1000 for graphics and 1.7x the performance of the A1000 for AI inference. This is all the more impressive when considering the Arc Pro B50 price of $349+ compared to the NVIDIA RTX A1000 at $420+.
They may not be disabling them maliciously -- they may be "binning" them -- running tests on the parts and then fusing off/disabling broken pieces of the silicon in order to avoid throwing away a chip that mostly works.
Yep, that's likely the case - but they still charge double for the reduced-performance binned chip, just because it's a "professional" GPU (which, last I heard, really just means it can use the pro variant of the GPU drivers)
Funny enough, maybe the fusing itself (if they go a bit above-and-beyond on it) is exactly why it is a pro model.
I.e. maybe Nvidia say "if we're going to fuse some random number of cores such that this is no longer a 3050, then let's not only fuse the damaged cores, but also do a long burn-in pass to observe TDP, and then fuse the top 10% of cores by measured TDP."
If they did that, it would mean that the resulting processor would be much more stable under a high duty cycle load, and so likely to last much longer in an inference-cluster deploy environment.
And the extra effort (= bottlenecking their supply of this model at the QC step) would at least partially justify the added cost. Since there'd really be no other way to produce a card with as many FLOPS/watt-dollar, without doing this expensive "make the chip so tiny it's beyond the state-of-the-art to make it stably, then analyze it long enough to precision-disable everything required to fully stabilize it for long-term operation" approach.
Comparing price to performance in this space might not make as much sense as it would seem. One of the (very few) interesting qualities of the A1000 is that it's a single-slot, low-profile workstation GPU. Intel kept the "powered by the PCIe slot" aspect, but made it dual slot and full height. Needing a "workstation" GPU in a tiny form factor (i.e. a system not meant to host and power full-sized GPUs) was something one could squeeze a premium out of, but the only selling point of this is the price.
I think you might be mistaken on the height of the card, if you look at the ports they are mini-DP on a low profile bracket. The picture also states that it includes both types of brackets.
Great catch, Serve The Home has a stacked picture of the two cards and they are indeed both low profile https://www.servethehome.com/wp-content/uploads/2025/09/NVID.... If your SFF box/1u server has a 2-thick NIC slot it may well be great for that use case then!
I'm still waiting for one of Nvidia/AMD/Intel to realize that if they make an inference-focused Thunderbolt eGPU "appliance" (not just a PCIe card in an eGPU chassis, but a sealed, vertically-integrated board-in-box design), then that would completely free them from design constraints around size/shape/airflow in an ATX chassis.
Such an appliance could plug into literally any modern computer — even a laptop or NUC. (And for inference, "running on an eGPU connected via Thunderbolt to a laptop" would actually work quite well; inference doesn't require much CPU, nor have tight latency constraints on the CPU<->GPU path; you mostly just need enough arbitrary-latency RAM<->VRAM DMA bandwidth to stream the model weights.)
(And yeah, maybe your workstation doesn't have Thunderbolt, because motherboard vendors are lame — but then you just need a Thunderbolt PCIe card, which is guaranteed to fit more easily into your workstation chassis than a GPU would!)
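To put rough numbers on the bandwidth point (ballpark assumptions on my part, not measurements -- Thunderbolt 4's usable PCIe tunnel is roughly 4 GB/s, and the model size is just an example):

    # One-time cost of streaming model weights into eGPU VRAM over Thunderbolt,
    # vs. the per-token traffic once the weights are resident.
    TB_USABLE_GB_PER_S = 4.0      # ~32 Gbit/s of usable PCIe tunnel bandwidth (assumed)
    model_weights_gb = 16.0       # hypothetical quantized model that fits in VRAM

    print(f"one-time weight upload: ~{model_weights_gb / TB_USABLE_GB_PER_S:.0f} s")  # ~4 s

    # After that, each decode step only moves token IDs in and logits/sampled
    # tokens out -- kilobytes per token -- so link latency and bandwidth barely
    # matter during generation.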
That misses the "vertically integrated" part. (As does everything else right now, which was my point.)
The thing you linked is just a regular Gigabyte-branded 5090 PCIe GPU card (that they produced first, for other purposes; and which does fit into a regular x16 PCIe slot in a standard ATX chassis), put into a (later-designed) custom eGPU enclosure. The eGPU box has some custom cooling [that replaces the card's usual cooling] and a nice little PSU — but this is not any more "designing the card around the idea it'll be used in an enclosure" than what you'd see if an aftermarket eGPU integrator built the same thing.
My point was rather that, if an OEM [that produces GPU cards] were to design one of their GPU cards specifically and only to be shipped inside an eGPU enclosure that was designed together with it — then you would probably get higher perf, with better thermals, at a better price(!), than you can get today from just buying standalone peripheral-card GPU (even with the cost of the eGPU enclosure and the rest of its components taken into account!)
Where by "designing the card and the enclosure together", that would look like:
- the card being this weird nonstandard-form-factor non-card-edged thing that won't fit into an ATX chassis or plug into a PCIe slot — its only means of computer connection would be via its Thunderbolt controller
- the eGPU chassis the card ships in, being the only chassis it'll comfortably live in
- the card being shaped less like a peripheral card and more like a motherboard, like the ones you see in embedded industrial GPU-SoC [e.g. automotive LiDAR] use-cases — spreading out the hottest components to ensure nothing blocks anything else in the airflow path
- the card/board being designed to expose additional water-cooling zones — where these zones would be pointless to expose on a peripheral card, as they'd be e.g. on the back of the card, where the required cooling block would jam up against the next card in the slot-array
...and so on.
It's the same logic that explains why those factory-sealed Samsung T-series external NVMe pucks can cost less than the equivalent amount of internal m.2 NVMe. With m.2 NVMe, you're not just forced into a specific form-factor (which may not be electrically or thermally optimal), but you're also constrained to a lowest-common-denominator assumption of deployment environment in terms of cooling — and yet you have to ensure that your chips stay stable in that environment over the long term. Which may require more-expensive chips, longer QC burn-in periods, etc.
But when you're shipping an appliance, the engineering tolerances are the tolerances of the board-and-chassis together. If the chassis of your little puck guarantees some level of cooling/heat-sinking, then you can cheap out on chips without increasing the RMA rate. And so on. This can (and often does) result in an overall-cheaper product, despite that product being an entire appliance vs. a bare component!
>were to design one of their GPU cards specifically and only to be shipped inside an eGPU enclosure that was designed together with it
And why would they do so?
Do you understand that it would drive the price up a lot?
> at a better price(!)
With less production/sales numbers than a regular 5090 GPU? No way. Economics 101.
> the card being this weird nonstandard-form-factor non-card-edged thing
Even if we skip the small-series nuances (which make this a non-starter on price alone), there is little that some other 'nonstandard form factor' can do for the cooling - you still need the RAM near the chip... and that's about it. You've just designed the same PCIe card for the sake of making it incompatible.
> won't ... plug into a PCIe slot
Again - why? What would this provide that the current PCIe GPU lacks? BTW, you still need the 16 lanes of PCIe, and you know which connector provides the most useful and cost-effective way to get them? A regular x16 PCIe connector. The one you ditched.
> the card being shaped less like a peripheral card and more like a motherboard
You don't need to 're-design it from scratch'; it's enough not to be constrained by a 25cm length limit to get proper airflow along a properly oriented radiator.
> why those factory-sealed Samsung T-series external NVMe pucks
The strength is the weakness here - if the appliance gets so little from plugging directly into the host system then requiring the appliance to plug in to the host system to work at all becomes more of a burden than a value.
With this wattage I'm not sure why they went double slot in this generation. Maybe they thought having a few dB more silence was a more unique placement for the card or something. The thickness of a GPU largely comes from the cooler, everything else typically fits under the height of the display connectors, and this GPU could certainly work with a single slot cooler.
My first software job was at a place doing municipal architecture. The modelers had and needed high-end GPUs in addition to the render farm, but plenty of roles at the company simply needed anything better than what the Intel integrated graphics of the time could produce in order to open the large, detailed models.
In these roles the types of work would include things like seeing where every pipe, wire, and plenum for a specific utility or service was in order to plan work between a central plant and a specific room. Stuff like that doesn’t need high amounts of VRAM since streaming textures in worked fine. A little lag never hurt anyone here as the software would simply drop detail until it caught up. Everything was pre-rendered so it didn’t need large amounts of power to display things. What did matter was having the grunt to handle a lot of content and do it across three to six displays.
Today I’m guessing the integrated chips could handle it fine but even my 13900K’s GPU only does DisplayPort 1.4 and up to only three displays on my motherboard. It should do four but it’s up to the ODMs at that point.
For a while Matrox owned a great big slice of this space, but eventually everyone fell by the wayside except Nvidia and AMD.
It's already got 2x the ram and roughly 1.5x the performance of the more expensive NVidia competitor... I'm not sure where you are getting your expectations from.
Sure... but even going consumer, for 16GB+, you can get an Arc A770 for $80 less, or an RX 9060 XT for a few dollars more... Will it perform better? I don't know. An RTX 5060 Ti 16GB is about $70 more.
Prices from Newegg on 16GB+ consumer cards, sold by Newegg and in stock.
I wonder why everyone keeps saying "just put more VRAM" yet no cards seem to do that. If it is that easy to compete with Nvidia, why don't we already have those cards?
Maybe because only AI enthusiasts want that much VRAM, and most of them will pony up for a higher-end GPU anyways? Everyone is suggesting it here because that's what they want, but I don't know if this crowd is really representative of broader market sentiment.
There are a lot of local AI hobbyists, just visit /r/LocalLLama to see how many are using 8GB cards, or all the people asking for higher RAM version of cards.
This makes it mysterious since clearly CUDA is an advantage, but higher VRAM lower cost cards with decent open library support would be compelling.
There is no point in using a low-bandwidth card like the B50 for AI. Attempting to use 2x or 4x cards to load a real model will result in poor performance and low generation speed. If you don’t need a larger model, use a 3060 or 2x 3060, and you’ll get significantly better performance than the B50—so much better that the higher power consumption won’t matter (70W vs. 170W for a single card). Higher VRAM won’t make the card 'better for AI'.
Are there any performance bottlenecks with using 2 cards instead of a single card? I don't think any of the consumer Nvidia cards use NVLink anymore, or at least they haven't for a while now.
If VRAM is ~$10/GB I suspect people paying $450 for a 12GB card would be happy to pay $1200 for a 64GB card. Running a local LLM only uses about 3-6% of my GPU's capability, but all of its VRAM. Local LLM use has no need for 6 3090s to serve a single user or a handful of users; it just needs the VRAM to run the model locally.
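Rough math on that, taking the ~$10/GB figure at face value (all of these numbers are assumptions from this comment, not vendor data):

    VRAM_COST_PER_GB = 10.0            # the ~$10/GB assumption above
    base_price, base_vram_gb = 450.0, 12
    target_vram_gb = 64

    extra_vram_cost = (target_vram_gb - base_vram_gb) * VRAM_COST_PER_GB
    print(f"added VRAM cost: ${extra_vram_cost:.0f}")                 # $520
    print(f"same-margin price: ${base_price + extra_vram_cost:.0f}")  # ~$970
    # A $1200 sticker price would still leave extra margin on top of the
    # 12GB card's, under these (very rough) assumptions.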
Exactly. People would be thrilled with a $1200 64GB card with ok processing power and transfer speed. It's a bit of a mystery why it doesn't exist. Intel is enabling vendors to 'glue' two 24GB cards together for a $1200 list price 48GB card, but it's a frankenstein monster and will probably not be available for that price.
Nvidia has zero incentives to undercut their enterprise GPUs by adding more RAM to “cheap” consumer cards like the 5090.
Intel and even AMD can’t compete or aren’t bothering. I guess we’ll see how the glued 48GB B60 will do, but that’s a still relatively slow GPU regardless of memory. Might be quite competitive with Macs, though.
r/LocalLLaMA has 90,000 subscribers. r/PCMasterRace has 9,000,000. I'll bet there are a lot more casual PC gamers who don't talk about it online than there are casual local AI users, too.
Because the cards already sell at very, very good prices with 16GB, and optimizations in generative AI are bringing down memory requirements. Optimizing profits means you sell with the least amount of VRAM possible, not only to save the direct cost of the RAM but also to guard future profit and your other market segments; the cost of the RAM itself is almost nothing compared to that. Any competitor to Nvidia, like Intel, can more easily release products with more than 16GB and smoke them. Intel is trying for a market segment that until now was only served by gaming cards twice as expensive. This frees those up to finally be sold at MSRP.
If intel was serious about staging a comeback, they would release a 64GB card.
But Intel is still lost in its hubris, and still thinks it's a serious player and "one of the boys", so it doesn't seem like they want to break the line.
I believe that VRAM has massively shot up in price, so this is where a large part of the costs are. Besides I wouldn't be very surprised if Nvidia has such strong market share they can effectively tell suppliers to not let others sell high capacity cards. Especially because VRAM suppliers might worry about ramping up production too much and then being left with an oversupply situation.
This could well be the reason why the rumored RDNA5 will use LPDDR5X instead of GDDR7 memory, at least for the low/mid range configurations (the top-spec and enthusiast AT0 and AT2 configurations will still use GDDR7, it seems).
I don't really know what I'm talking about (whether about graphic cards or in AI inference), but if someone figures out how to cut the compute needed for AI inference significantly then I'd guess the demand for graphic cards would suddenly drop?
Given how young and volatile this domain still is, it doesn't seem unreasonable to be wary of it. Big players (Google, OpenAI and the like) are probably pouring tons of money into trying to do exactly that.
I would suspect that for self hosted LLMs, quality >>> performance, so the newer releases will always expand to fill capacity of available hardware even when efficiency is improved.
There does seem to be a grey market for it in China. You can buy cards where they swap the memory modules with higher capacity ones on Aliexpress and ebay.
The Ryzen AI Max+ 395 with 128GB can do 256GB/s, so let's put all these "ifs" to bed once and for all. It is an absolute no-brainer to drop in more RAM as long as there are enough bits in the physical address space of the hardware. And there usually are, as the same silicon is branded and packaged differently for the commercial market and the consumer market. Look at how the Chinese are doubling the 4090's RAM from 24 to 48GB.
> If it is that easy to compete with Nvidia, why don't we already have those cards?
Businesswise? Because Intel management are morons. And because AMD, like Nvidia, don't want to cannibalize their high end.
Technically? "Double the RAM" is the most straightforward (that doesn't make it easy, necessarily ...) way to differentiate as it means that training sets you couldn't run yesterday because it wouldn't fit on the card can now be run today. It also takes a direct shot at how Nvidia is doing market segmentation with RAM sizes.
Note that "double the RAM" is necessary but not sufficient.
You need to get people to port all the software to your cards to make them useful. To do that, you need to have something compelling about the card. These Intel cards have nothing compelling about them.
Intel could also make these cards compelling by cutting the price in half or dropping two dozen of these cards on every single AI department in the US for free. Suddenly, every single grad student in AI will know everything about your cards.
The problem is that Intel institutionally sees zero value in software and is incapable of making the moves they need to compete in this market. Since software isn't worth anything to Intel, there is no way to justify any business action that isn't just "sell (kinda shitty) chips".
No. The A1000 was well over $500 last year. This is the #3 player coming out with a card that's a better deal than what the #1 player currently has to offer.
I don't get why there are people trying to twist this story or come up with strawmen like the A2000 or even the RTX 5000 series. Intel's coming into this market competitively, which as far as I know is a first, and it's also impressive.
Coming into the gaming GPU market had always been too ambitious a goal for Intel, they should have started with competing in the professional GPU market. It's well known that Nvidia and AMD have always been price gouging this market so it's fairly easy to enter it competitively.
If they can enter this market successfully and then work their way up the food chain, then that seems like a good way to recover from their initial fiasco.
NVIDIA is looking for profit, Intel is looking for market share, the pricing reflects this. Of course your product looks favorable to something released April 2024 when you're cutting pricing to get more attention.
Well, no. It doesn't. The comparison is to the A1000.
Toss in a 5060 Ti into the compare table, and we're in an entirely different playing field.
There are reasons to buy the workstation NVidia cards over the consumer ones, but those mostly go away when looking at something like the new Intel. Unless one is in an exceptionally power-constrained environment, yet has room for a full-sized card (not SFF or laptop), I can't see a time the B50 would even be in the running against a 5060 Ti, 4060 Ti, or even 3060 Ti.
> There are reasons to buy the workstation NVidia cards over the consumer ones
I seem to recall certain esoteric OpenGL things like lines being fast was a NVIDIA marketing differentiator, as only certain CAD packages or similar cared about that. Is this still the case, or has that software segment moved on now?
For me (not quite at the A1000 level, but just above -- still in the prosumer price range), a major one is ECC.
Thermals and size are a bit better too, but I don't see that as $500 better. I actually don't see (m)any meaningful reasons to step up to an Ax000 series if you don't need ECC, but I'd love to hear otherwise.
"release from a year and a half ago", that's technically true but a really generous assessment of the situation.
We could just as well compare it to the slightly more capable RTX A2000, which was released more than 4 years ago. Either way, Intel is competing with the EoL Ampere architecture.
> 1.7x the performance of the A1000 for AI inference
That's a bold claim when their acceleration software (IPEX) is barely maintained and incompatible with most inference stacks, and their Vulkan driver is far behind it in performance.
Really confused why Intel and AMD both continue to struggle and yet still refuse to offer what Nvidia won't, i.e. high-RAM consumer GPUs. I'd much prefer paying 3x cost for 3x VRAM (48GB/$1047), 6x cost for 6x VRAM (96GB/$2094), 12x cost for 12x VRAM (192GB/$4188), etc.
They'd sell like hotcakes and software support would quickly improve.
At 16GB I'd still prefer to pay a premium for NVidia GPUs given its superior ecosystem, I really want to get off NVidia but Intel/AMD isn't giving me any reason to.
Because the market of people who want huge RAM GPUs for home AI tinkering is basically about 3 Hacker News posters. Who probably won’t buy one because it doesn’t support CUDA.
PS5 has something like 16GB unified RAM, and no game is going to really push much beyond that in VRAM use, we don’t really get Crysis style system crushers anymore.
> PS5 has something like 16GB unified RAM, and no game is going to really push much beyond that in VRAM use, we don’t really get Crysis style system crushers anymore.
This isn't really true on the recreational card side; Nvidia themselves are reducing the number of 8GB models as a sign of market demand [1].
Games these days are regularly maxing out 6 & 8 GB when running anything above 1080p for 60fps.
The recent prevalence of Unreal Engine 5, with its low quality of optimization for weaker hardware, is also causing games to be released basically unplayable for many.
For recreational use the sentiment is that 8GB is scraping the bottom of the requirements. Again, this is partly due to bad optimization, but games are also being played at higher resolutions, which requires more memory for larger textures.
As someone who started on 8-bit computing, I think Tim Sweeney is right: the Electron garbage culture, when applied to Unreal 5, is one of the reasons so much RAM is needed, with such bad performance.
While I dislike some of the Handmade Hero culture, they are right about one thing: how badly modern hardware tends to be used.
I remember UE1 being playable even in software mode, e.g. the first Deus Ex.
Now, I think the Surreal Engine (a UE1 reimplementation) needs damn GL 3.3 (if not 4.5 and Vulkan) to play games I used to play on an Athlon. I can't use Surreal to play Deus Ex on my legacy N270 netbook with GL 2.1... hardware that was more than enough to play the game at 800x600 with everything turned on and then some.
A good thing is that I've turned to libre/indie gaming, with games such as Cataclysm DDA: Bright Nights, which have far lower requirements than a UE5 game and yet are enjoyable thanks to the playability and in-game lore (and a proper ending compared to vanilla CDDA).
UE1 was in the timeframe that 3D acceleration was only starting to get adopted, and IIRC from some interview Epic continued with a software option for UT2003/2004 (licensed pixomatic?) because they found out a lot of players were still playing their games on systems where full GPUs weren't always available, such as laptops.
I know this is going back to Intel's Larrabee, where they tried it, but I'd be real interested to see what the limits of a software renderer are now, considering the comparative strength of modern processors and the amount of multiprocessing. While I know DXVK or projects like dgVoodoo2 can be an option with sometimes better backwards compatibility, pure software would seem like a more stable reference target than the gradually shifting landscape of GPUs/drivers/APIs.
Lavapipe on Vulkan makes VKQuake playable even on Core 2 Duo systems. Just as a proof of concept, of course; I've known about software-rendered Quakes forever.
Vanilla CDDA has a lot of entertaining endings, proper or not. I tend to find one within the first one or two in-game days. Great game! I like to install it now and then just to marvel at all the new things that have been added and then be killed by not knowing what I am doing.
Never got far enough to interact with most systems in the game or worry about proper endings.
The Unreal Engine software renderer back then had a very distinct dithering pattern. I played it after I got a proper 3D card, but it didn't feel the same, felt very flat and lifeless.
Maybe today, but the more accessible and affordable they become, the more likely people can start offering "self hosted" options.
We're already seeing competitors to AWS that only target things like Qwen, DeepSeek, etc.
There are enterprise customers with compliance requirements who want AI but cannot use any of the top models, because everything has to run on their own infrastructure.
> PS5 has something like 16GB unified RAM, and no game is going to really push much beyond that in VRAM use
That's pretty funny considering that PC games are moving more towards 32GB RAM and 8GB+ VRAM. The next generation of consoles will of course increase to make room for higher quality assets.
Sure, but not always. Future games will have more detailed assets which will require more memory. Running at 4K or higher resolution will be more common which also requires more memory.
"Because in the real world, I have to write up lists of stuff I have to go to the grocery store to buy. And I have never thought to myself that realism is fun. I go play games to have fun."
Another use for high RAM GPUs is the simulation of turbulent flows for research. Compared to CPU, GPU Navier-Stokes solvers are super fast, but the size of the simulated domain is limited by the RAM.
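To make the RAM limit concrete, a back-of-envelope sketch (the ~10 double-precision fields per cell is an illustrative assumption; real solvers vary a lot):

    BYTES_PER_CELL = 10 * 8   # ~10 double-precision fields per grid cell (assumed):
                              # velocity components, pressure, scratch arrays, ...

    for vram_gb in (16, 48, 96):
        cells = vram_gb * 1024**3 / BYTES_PER_CELL
        side = round(cells ** (1 / 3))          # side length of a cubic domain
        print(f"{vram_gb:3d} GB VRAM -> roughly {side}^3 cells")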
Marketing is misreading the room. I believe there's a bunch of people buying no video cards right now that would if there were high vram options available
This card does have double the VRAM of the more expensive Nvidia competitor (the A1000, which has 8 GB), but I take your point that it doesn't feel like quite enough to justify giving up the Nvidia ecosystem. The memory bandwidth is also... not great.
They also announced a 24 GB B60 and a double-GPU version of the same (saves you physical slots), but it seems like they don't have a release date yet (?).
I am not sure there is a significant enough market for those - that is, selling enough consumer units to cover all the design and other costs. From a gamer perspective 16GB is now a reasonable point. 32GB is the most one would really want, and even that at no more than, say, a $100 higher price point.
This to me is the gamer perspective. This segment really does not need even 32GB, let alone 64GB or more.
Never underestimate bragging rights in gamers community. Majority of us run unoptimized systems with that one great piece of gear and as long as the game runs at decent FPS and we have some bragging rights it's all ok.
My work computer with Windows, Outlook, a few tabs open, and Excel already craps out with 16 GB.
If you have a private computer, why would you even buy something with 16GB in 2025?
My 10 year old laptop had that much.
I'm looking for a new laptop and I'm looking at a 128GB setup - so those 200 Chrome tabs can eat it and I still have space to run other stuff, like those horrible Electron chat apps + a game.
> I am not sure there is significant enough market for those.
How so? The prosumer local AI market is quite large and growing every day, and is much more lucrative per capita than the gamer market.
Gamers are an afterthought for GPU manufacturers. NVIDIA has been neglecting the segment for years, and is now much more focused on enterprise and AI workloads. Gamers get marginal performance bumps each generation, and side effect benefits from their AI R&D (DLSS, etc.). The exorbitant prices and performance per dollar are clear indications of this. It's plain extortion, and the worst part is that gamers accepted that paying $1000+ for a GPU is perfectly reasonable.
> This segment really does not need even 32GB, let alone 64GB or more.
4K is becoming a standard resolution, and 16GB is not enough for it. 24GB should be the minimum, and 32GB for some headroom. While it's true that 64GB is overkill for gaming, it would be nice if that would be accessible at reasonable prices. After all, GPUs are not exclusively for gaming, and we might want to run other workloads on them from time to time.
While I can imagine that VRAM manufacturing costs are much higher than DRAM costs, it's not unreasonable to conclude that NVIDIA, possibly in cahoots with AMD, has been artificially controlling the prices. While hardware has always become cheaper and more powerful over time, for some reason, GPUs buck that trend, and old GPUs somehow appreciate over time. Weird, huh. This can't be explained away as post-pandemic tax and chip shortages anymore.
Frankly, I would like some government body to investigate this industry, assuming they haven't been bought out yet. Label me a conspiracy theorist if you wish, but there is precedent for this behavior in many industries.
I think the timeline is roughly: SGI (90s), Nvidia gaming (with ATi and then AMD) eating that cake. Then cryptocurrency took off at the end '00s / start '10s, but if we are honest things like hashcat were also already happening. After that AI (LLMs) took off during the pandemic.
During the cryptocurrency hype, GPUs were already going for insane prices, and together with low energy prices or surplus (which solar can cause, but nuclear should too) that allowed even governments to make cheap money (and do hashcat cracking, too). If I were North Korea I'd know my target. Turns out they did, but in a different way. That was around 2014. Add on top of this Stadia and GeForce Now as examples of renting GPUs for gaming (there are more, and Stadia flopped).
I didn't mention LLMs since that has been the most recent development.
All in all, it turns out GPUs are more valuable than what they were sold for if your goal isn't personal computer gaming. Hence the prices have gone up.
Now, if you want to thoroughly investigate this market you need to figure out what large foreign forces (governments, businesses, and criminal enterprises) use these GPUs for. The US government has been aware of the above for a long time; hence the export restrictions on GPUs, which are meant to slow opponents down as they try to catch up. The opponent is the non-free world (China, North Korea, Russia, Iran, ...), though the current administration is acting insane.
You're right, high demand certainly plays a role. But it's one thing for the second-hand market to dictate the price of used hardware, and another for new hardware to steadily get more expensive while its objective capabilities only see marginal improvements. At a certain point it becomes blatant price gouging.
NVIDIA is also taking consumers for a ride by marketing performance based on frame generation, while trying to downplay and straight up silence anyone who points out that their flagship cards still struggle to deliver a steady 4K@60 without it. Their attempts to control the narrative of media outlets like Gamers Nexus should be illegal, and fined appropriately. Why we haven't seen class-action lawsuits for this in multiple jurisdictions is beyond me.
Their GPU business is a slow upstart. If they have a play that could massively disrupt the competition, and has a small chance of epic failure, that should be very attractive to them.
I doubt you'd get linear scaling of price/capacity - the larger capacity modules are more expensive per GB than smaller ones, and in some cases are supply constrained.
The number of chips on the bus is usually pretty low (1 or 2 of them on most GPUs), so GPUs tend to have to scale out their memory bus widths to get to higher capacity. That's expensive and takes up die space, and for the conventional case (games) isn't generally needed on low end cards.
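A rough sketch of how bus width and chip density bound capacity (2GB GDDR6 chips and the bus widths below are just common examples, nothing specific to this card):

    def vram_capacity_gb(bus_width_bits, gb_per_chip=2, chips_per_channel=1):
        # GDDR-style cards hang one 32-bit-wide chip off each channel;
        # "clamshell" designs put two chips per channel to double capacity.
        channels = bus_width_bits // 32
        return channels * chips_per_channel * gb_per_chip

    print(vram_capacity_gb(128))                        # 8 GB, typical low-end card
    print(vram_capacity_gb(128, chips_per_channel=2))   # 16 GB, clamshell
    print(vram_capacity_gb(256, chips_per_channel=2))   # 32 GB, wider bus + clamshell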
What really needs to happen is someone needs to make some "system seller" game that is incredibly popular and requires like 48GB of memory on the GPU to build demand. But then you have a chicken/egg problem.
I think it's a bit of planned obsolescence as well. The 1080 Ti has been a monster with its 11GB of VRAM up until this generation. A lot of enthusiasts basically call out that Nvidia won't make that mistake again, since it led to longer upgrade cycles.
For AI workloads? You're wrong. I use mine as a server; I just ssh into it. I don't even have a keyboard or display hooked up to it.
You can get 96GB of VRAM and about 40-70% of the speed of a 4090 for $4000.
Especially when you are running a large number of applications that you want to talk to each other, it makes sense... the only way to do it on a 4090 is to hit disk, shut the application down, start up the other application, read from disk... it's slowwww... the other option is a multi-GPU system, but then it gets into real money.
trust me, it's a gamechanger. I just have it sitting in a closet. Use it all the time.
The other nice thing is unlike with any Nvidia product, you can walk into an apple store, pay the retail price and get it right away. No scalpers, no hunting.
Why not just buy 3 cards then? These cards don't require active cooling anyway, and you can just fit 3 in a decent-sized case. You will get 3x the VRAM bandwidth and 3x the compute. And if your use case is LLM inference, it will be a lot faster than 1 card with 3x the VRAM.
We will buy 4 cards if they are 48 GB or more. At a measly 16 GB, we’re just going to stick with 3090s, P40s, MI50s, etc.
> 3x VRAM speed and 3x compute
LLM scaling doesn’t work this way. If you have 4 cards, you may get 2x performance increase if you use vLLM. But you’ll also need enough VRAM to run FP8. 3 cards would only run at 1x performance.
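One reason odd card counts don't help with tensor parallelism is the divisibility constraint: attention heads (and the weight matrices) get sliced evenly across GPUs, so the degree has to divide the head count. A minimal sketch of that check (32 heads is just an illustrative number, not a specific model):

    num_attention_heads = 32     # illustrative; varies by model
    for tp_size in (1, 2, 3, 4):
        ok = num_attention_heads % tp_size == 0
        print(f"tensor parallel degree {tp_size}: {'ok' if ok else 'does not divide evenly'}")
    # Frameworks like vLLM reject degrees that don't divide the head count,
    # which is part of why 3 cards often behaves like 1.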
Also less power efficient, takes up more PCIe slots, and a lot of software doesn't support GPU clustering. I already have 4x 16GB GPUs, which still can't run large models exceeding 16GB.
I'm currently running them in different VMs to be able to make full use of them; I used to have them running in different Docker containers, but OOM exceptions would frequently bring down the whole server, which running in VMs helped resolve.
For LLM inference at batch size 1, it's hard to saturate PCIe bandwidth, especially with less powerful chips. You would get close to linear performance[1]. The obvious issue is that running things on multiple GPUs is harder, and a lot of software doesn't fully support it or isn't optimized for it.
Nvidia uses VRAM amount for market segmentation. They can't make a 128GB consumer card without cannibalizing their enterprise sales.
Which means Intel or AMD making an affordable high-VRAM card is win-win. If Nvidia responds in kind, Nvidia loses a ton of revenue they'd otherwise have available to outspend their smaller competitors on R&D. If they don't, they keep more of those high-margin customers but now the ones who switch to consumer cards are switching to Intel or AMD, which both makes the company who offers it money and helps grow the ecosystem that isn't tied to CUDA.
People say things like "it would require higher pin counts" but that's boring. The increase in the amount people would be willing to pay for a card with more VRAM is unambiguously more than the increase in the manufacturing cost.
It's more plausible that there could actually be global supply constraints in the manufacture of GDDR, but if that's the case then just use ordinary DDR5 and a wider bus. That's what Apple does and it's fine, and it may even cost less in pins than you save because DDR is cheaper than GDDR.
It's not clear what they're thinking by not offering this.
> Intel or AMD making an affordable high-VRAM card is win-win.
100% agree. CUDA is a bit of a moat, but the earlier in the hype cycle viable alternatives appear, the more likely the non CUDA ecosystem becomes viable.
> It's not clear what they're thinking by not offering this.
They either don't like making money, or they have a fantasy that one day soon they will be able to sell pallets of $100,000 GPUs they made for $2.50 like Nvidia can. It doesn't take a PhD and an MBA to figure out that the only reason Nvidia still has what should be a short-term market available to them is the failure of Intel and AMD, and of the VC/innovation side, to offer any competition.
It is such an obvious win-win that it would probably be worth skipping the engineering and just announcing the product, for sale by the end of the year and force everyones hand.
> The increase in the amount people would be willing to pay for a card with more VRAM is unambiguously more than the increase in the manufacturing cost.
I guess you already have the paper if it is that unambiguous. Would you mind sharing the data/source?
The cost of more pins is linear in the number of pins, and the pins aren't the only component of the manufacturing cost, so a card with twice as many pins will have a manufacturing cost of significantly less than twice that of a card with half as many pins.
Cards with 16GB of VRAM exist for ~$300 retail.
Cards with 80GB of VRAM cost >$15,000 and customers pay that.
A card with 80GB of VRAM could be sold for <$1500 with five times the margin of the $300 card because the manufacturing cost is less than five times as much. <$1500 is unambiguously a smaller number than >$15,000. QED.
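To spell out the margin argument with made-up cost numbers (the retail prices are from above; the manufacturing-cost split is purely an assumption):

    cost_16gb, price_16gb = 250.0, 300.0     # assumed build cost vs. ~$300 retail
    margin_16gb = price_16gb - cost_16gb     # $50

    cost_80gb = 4.0 * cost_16gb              # assume <5x the build cost
    price_80gb = 1500.0
    margin_80gb = price_80gb - cost_80gb     # $500, i.e. 10x the 16GB margin
    print(margin_80gb / margin_16gb)
    # Even at <$1500 the per-card margin multiplies -- and <$1500 is still far
    # below the >$15,000 cards.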
This is almost true but not quite - I don't think much of the (dollar) spend on enterprise GPUs (H100, B200, etc.) would transfer if there was a 128 GB consumer card. The problem is both memory bandwidth (HBM) and networking (NVLink), which NVIDIA definitely uses to segment consumer vs enterprise hardware.
I think your argument is still true overall, though, since there are a lot of "gpu poors" (i.e. grad students) who write/invent in the CUDA ecosystem, and they often work in single card settings.
Fwiw Intel did try this with Arctic Sound / Ponte Vecchio, but it was late out the door and did not really perform (see https://chipsandcheese.com/p/intels-ponte-vecchio-chiplets-g...). It seems like they took on a lot of technical risk; hopefully some of that transfers over to a future project though Falcon Shores was cancelled. They really should should have released some of those chips even at a loss, but I don't know the cost of a tape out.
NVLink matters if you want to combine a whole bunch of GPUs, e.g. you need more VRAM than any individual GPU is available with. Many workloads exist that don't care about that or don't have working sets that large, particularly if the individual GPU actually has a lot of VRAM. If you need 128GB and you have GPUs with 40GB of VRAM then you need a fast interconnect. If you can get an individual GPU with 128GB, you don't.
There is also work being done to make this even less relevant because people are already interested in e.g. using four 16GB cards without a fast interconnect when you have a 64GB model. The simpler implementation of this is to put a quarter of the model on each card split in the order it's used and then have the performance equivalent of one card with 64GB of VRAM by only doing work on the card with that section of the data in its VRAM and then moving the (much smaller) output to the next card. A more sophisticated implementation does something similar but exploits parallelism by e.g. running four batches at once, each offset by a quarter, so that all the cards stay busy. Not all workloads can be split like this but for some of the important ones it works.
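A minimal sketch of the simpler scheme (not any particular library's API; the layer sizes are placeholders, and it falls back to CPU if no GPUs are present):

    import torch
    import torch.nn as nn

    if torch.cuda.is_available():
        devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())]
    else:
        devices = ["cpu"]

    # Toy "model": a stack of identical blocks standing in for transformer layers.
    layers = [nn.Linear(1024, 1024) for _ in range(8)]
    chunk = -(-len(layers) // len(devices))      # ceiling division
    placement = [devices[min(i // chunk, len(devices) - 1)] for i in range(len(layers))]

    # Put a contiguous chunk of layers on each device, in execution order.
    for layer, dev in zip(layers, placement):
        layer.to(dev)

    def forward(x):
        # Only the small activation tensor hops between cards; the large
        # weights never leave the card they were assigned to.
        for layer, dev in zip(layers, placement):
            x = layer(x.to(dev))
        return x

    print(forward(torch.randn(1, 1024)).shape)   # torch.Size([1, 1024])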
I think we might just disagree about how much of the GPU spend is on small vs large model (inference or training). I think it’s something like 99.9% of spending interest is on models that don’t fit into 128 GB (remember KV cache matters too). Happy to be proven wrong!
Even if they put out some super high memory models and just pass the ram through at cost it would increase sales -- potentially quite dramatically and increase their total income a lot and have a good chance of transitioning to being a market leader rather than an also-ran.
AMD has lagged so long because of the software ecosystem but the climate now is that they'd only need to support a couple popular model architectures to immediately grab a lot of business. The failure to do so is inexplicable.
I expect we will eventually learn that this was about yet another instance of anti-competitive collusion.
Lisa and Jensen are cousins. I think that explains it. Lisa can easily prove me wrong by releasing a high-memory GPU that significantly undercuts Nvidia's RTX 6000 Pro.
The whole RAM industry has been sanctioned twice for price fixing, so I agree: any business that deals with RAM is far more likely than businesses in other industries to engage in anti-competitive collusion.
The new CEO of Intel has said that Intel is giving up competing with Nvidia.
Why would you bother with any Intel product with an attitude like that, gives zero confidence in the company. What business is Intel in, if not competing with Nvidia and AMD. Is it giving up competing with AMD too?
> The new CEO of Intel has said that Intel is giving up competing with Nvidia.
No, he said they're giving up competing against Nvidia in training. Instead, he said Intel will focus on inference.
That's the correct call in my opinion. Training is far more complex and will span multi data centers soon. Intel is too far behind. Inference is much simpler and likely a bigger market going forward.
I disagree - training enormous LLMs is super complex and requires a data centre... But most research is not done at that scale. If you want researchers to use your hardware at scale you also have to make it so they can spend a few grand and do small scale research with one GPU on their desktop.
That's how you get things like good software support in AI frameworks.
I disagree with you. You don't need researchers to use your client hardware in order to make inference chips. All big tech are making inference chips in house. AMD and Apple are making local inference do-able on client.
Inference is vastly simpler than training or scientific compute.
AMD has also often said that they can't compete with Nvidia at the high end, and as the other commenter said: market segments exist. Not everyone needs a 5090. If anything, people are starved for options in the budget/mid-range market, which is where Intel could pick up a solid chunk of market share.
Regardless of what they say, they CAN compete in training and inference; there is literally no alternative to the W7900 at the moment. That's 4080 performance with 48GB of VRAM for half of what similar CUDA devices would cost.
FP16+ doesn't really matter for local LLM inference, no one can run reasonably big models at FP16.
Usually the models are quantized to 8/4 bits, where the 5090 again demolishes the w7900 by having a multiple of max TOPS.
With 48 GB of VRAM you could run a 20B model at FP16. It won't be a better GPU for everything, but it definitely beats a 5090 for some use cases. It's also a generation old, and the newer RX 9070 seems like it should be pretty competitive with a 5090 from a FLOPS perspective, so a workstation model with 32 GB of VRAM and a less cut-back core would be interesting.
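The arithmetic behind the 20B-at-FP16 claim, roughly (weights only; KV cache and activations need headroom on top):

    params = 20e9
    bytes_per_param = 2                            # FP16
    weights_gib = params * bytes_per_param / 1024**3
    print(f"~{weights_gib:.0f} GiB of weights")    # ~37 GiB, fits in 48 GB with room to spare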
>What business is Intel in, if not competing with Nvidia and AMD.
Foundry business. The latest report on discrete graphics market share has Nvidia at 94%, AMD at 6%, and Intel at 0%.
I may still have another 12 months to go. But in 2016 I made a bet against Intel engineers on Twitter and offline, suggesting GPUs are not a business they want to be in, or at least that they were too late. They said at the time they would get at least 20% market share by 2021. I said I would be happy if they did even 20% by 2026.
Intel is also losing money, and they need cashflow to compete in the foundry business. I have long argued they should have cut off the GPU segment when Pat Gelsinger arrived; it turns out Intel bound themselves to GPUs with all the government contracts and supercomputers they promised to deliver. Now that they have delivered all or most of it, they will need to think about whether to continue or not.
Unfortunately, unless the US points guns at TSMC, I just don't see how Intel will be able to compete, as Intel needs a leading-edge position in order to command the margin required for Intel to function. Right now, in terms of density, Intel 18A is closer to TSMC N3 than N2.
The problem is they can’t not attempt or they’ll simply die of irrelevance in a few years. GPUs will eat the world.
If NVidia gets complacent as Intel has become when they had the market share in the CPU space, there is opportunity for Intel, AMD and others in NVidias margin.
They may not have to, frankly, depending on when China decides to move on Taiwan. It's useless to speculate—but it was certainly a hell of a gamble to open a SOTA (or close to it—4 nm is nothing to sneeze at) fab outside of the island.
I thought that he said that they gave up at competing with Nvidia at training, not in general. He left the door open to compete on inference. Did he say otherwise more recently?
A feature I haven't seen someone comment about yet is Project Battlematrix [1][2] with these cards, this allows for multi-GPU AI orchestration. A feature Nvidia offers for enterprise AI workloads (Run:ai), but Intel is bringing this to consumers
Huh, I didn't realize these were just released, I came across it looking for a GPU that had AV1 hardware encoding and been putting a shopping cart together for a mini-ITX xeon server for all my ffmpeg shenanigans.
I like to Buy American when I can but it's hard to find out which fabs various CPUs and GPUs are made in. I read Kingston does some RAM here and Crucial some SSDs. Maybe the silicon is fabbed here but everything I found is "assembled in Taiwan", which made me feel like I should get my dream machine sooner rather than later
You may want to check that your Xeon may already support hardware encoding of AV1 in the iGPU. I saved a bundle building a media server when I realized the iGPU was more than sufficient (and more efficient) than chucking a GPU in the case.
I have a service that runs continuously and reencodes any videos I have into h265 and the iGPU barely even notices it.
Looks like Core Ultra is the only chip with an integrated Arc GPU that does AV1 encode. The Xeon series I was looking at, socket 1700 so the E-2400s, definitely don't have an iGPU. (The fact that the motherboard I'm looking at only has VGA is probably a clue xD)
I'll have to consider pros and cons with Ultra chips, thanks for the tip.
I don't know how big the impact really is, but Intel is mostly pretty far behind on encoder quality. Oh wait, on most codecs they are pretty far behind, but on AV1 they seem pretty competitive? Neat.
I have the answer for you, Intel's GPU chips are on TSMC's process. They are not made in Intel-owned fabs.
There really is no such thing as "buying American" in the computer hardware industry unless you are talking about the designs rather than the assembly. There are also critical parts of the lithography process that depend on US technology, which is why the US is able to enforce certain sanctions (and due to some alliances with other countries that own the other parts of the process).
Personally I think people get way too worked up about being protectionist when it comes to global trade. We all want to buy our own country's products over others but we definitely wouldn't like it if other countries stopped buying our exported products.
When Apple sells an iPhone in China (and they sure buy a lot of them), Apple is making most of the money in that transaction by a large margin, and in turn so are you since your 401k is probably full of Apple stock, and so are the 60+% of Americans who invest in the stock market. A typical iPhone user will give Apple more money in profit from services than the profit from the sale of the actual device. The value is really not in the hardware assembly.
In the case of electronics products like this, almost the entire value add is in the design of the chip and the software that is running on it, which represents all the high-wage work, and a whole lot of that labor is in the US.
US citizens really shouldn't look at a job where people sit at an electronics bench doing repetitive assembly work for 12 hours a day in a factory and wish we had more of those jobs in our country. They should instead be focused on making high-level education more available/affordable so that the US stays on top of the economic food chain, with most/all of its citizens doing high-value work, rather than making education expensive and begging foreign manufacturers to open satellite factories to employ our uneducated masses.
I think the current wave of populist protectionist ideology is essentially blaming the wrong causes of declining affordability and increasing inequality for the working class. People think that bringing the manufacturing jobs back and reversing globalism will right the ship on income inequality, but the reality is that the reason equality was so good for Americans in the mid-century was that the wealthy were taxed heavily, European manufacturing was decimated in WW2, and labor was in high demand.
The above of course is all my opinion on the situation, and a rather long tangent.
Thanks for that perspective. I am just in a place of puzzling why none of this says Made in USA on it. I can get socks and tshirts woven in north carolina which is nice, and furniture made in illinois. That's all a resurgence of 'arts & craft' I suppose, valuing a product made in small batches by someone passionate about quality instead of just getting whatever is lowest cost. Suppose there's not much in the way of artisan silicon yet :)
EDIT: I did think of what the closest thing to artisan silicon would be and came up with the POWER9 CPUs, and found out those are made in the USA. The Talos II is also manufactured in the US, with the IBM POWER9 processors fabbed in New York while the Raptor motherboard is manufactured in Texas, which is also where their systems are assembled.
I would go even further than that and point out that the US still makes plenty of cheap or just "normal" priced, non-artisan items! You'll actually have a hard time finding grocery store Consumer Packaged Goods (CPG) made outside of the US and Canada - things like dish soap, laundry detergent, paper products, shampoo, and a whole lot of food.
I randomly thought of paint companies as another example, with Sherwin-Williams and PPG having US plants.
The US is still the #2 manufacturer in the world, it's just a little less obvious in a lot of consumer-visible categories.
The thing with iPhone production is not about producing iPhones per se, it's about providing a large-volume customer for the supply chain below it - basic stuff like SMD resistors, capacitors, ICs, metal shields, frames, god knows what else - because you need that available domestically for weapons manufacturing, should China ever think of snacking on Taiwan. But a potential military market in 10 years is not even close to "worth it" for any private investors or even the government to build out a domestic supply chain for that stuff.
GPUs prices really surprise me. Most PC part prices have remained the same over the decades with storage and RAM actually getting cheaper. GPUs however have gotten extremely expensive. $350 used to get you a really good GPU about 20 years ago, I think top of the line was around $450-500--now it only gets you entry level. Top of the line is now $1500+!
Datacenter gpu margins are 80%+. Consumer margins are like 25%. Any company with a datacenter product that sells out is just going to put all their fab allocation toward that and ignore the consumer segment. Plus these companies are really worried about their consumer products being used in datacenters and consuming their money maker so they kneecap the consumer vram to make sure that doesnt happen
Kinda bummed that it’s $50 more than originally said. But if it works well, a single slot card that can be powered by the PCIe slot is super valuable. Hoping there will be some affordable prebuilds so I can run some MoE LLM models.
I am confused as a lot of comments here seem to argue around gaming, but isn't this supposed to be a workstation card, hence not intended to be used for games? The phoronix review also seems to only focus on computing usage, not gaming.
It’s interesting that it uses 4 Display Ports and not a single HDMI.
Is HDMI seen as a “gaming” feature, or is DP seen as a “workstation” interface? Ultimately HDMI is a brand that commands higher royalties than DP, so I suspect this decision was largely chosen to minimize costs. I wonder what percentage of the target audience has HDMI only displays.
I'd say that's a more recent development though because of how long it took for DisplayPort 2 products to make it to market. On both my RTX 4000 series GPU, and gaming 1440p240hz OLED monitor, HDMI 2.1 (~42 Gigabit) is the higher bandwidth port over its DisplayPort 1.4 (~26 Gigabit). So I use the HDMI ports. 26 Gigabit isn't enough for 1440p240z at 10-bit HDR colour. You can do it with DSC, but that comes with its own issues.
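Back-of-envelope on why DSC is needed there (raw pixel rate only, ignoring blanking overhead, which only makes it worse; the link rates are the usual published figures):

    width, height, refresh_hz, bits_per_pixel = 2560, 1440, 240, 30   # 10-bit RGB
    raw_gbps = width * height * refresh_hz * bits_per_pixel / 1e9
    print(f"raw video data: {raw_gbps:.1f} Gbit/s")   # ~26.5 Gbit/s

    # DP 1.4 (HBR3) carries ~25.9 Gbit/s of data after 8b/10b encoding, so this
    # mode doesn't fit uncompressed; HDMI 2.1 FRL (~42 Gbit/s usable) does.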
HDMI is still valuable for those of us who use KVMs. Cheap Display port KVMs don't have EDID emulation and expensive Display Port KVMs just don't work (in my experience).
I have a Level1Techs hdmi KVM and it's awesome, and I'd totally buy a display port one once it has built in EDID cloners, but even at their super premium price point, it's just not something they're willing to do yet.
I have Linux (AMD RDNA2), Windows (NVIDIA Ada), and Mac (M3) systems hooked up to my L1T DP1.4 KVM[1] without any other gadgets and they all work fine. What problem(s) are you trying to solve/did you solve with the EDID cloner?
Without the EDID cloner, when you switch the KVM away from the system, it receives a monitor disconnect event. When you switch it back, it receives a monitor connect event. There are OS settings that help make it so that the windows end up back where they started, but not all programs support this well. With a EDID cloner in place, the computer never detects that the monitor shifted at all and so nothing gets repositioned and apps just carry on.
I have one and it still sucks. I ordered it after the one I bought on Amazon kind of sucked thinking the L1T would be better and it was worse than the Amazon one.
This is the right answer. I see a bunch of people talking about licensing fees for HDMI, but when you’re plugging in 4 monitors it’s really nice to only use one type of cable. If you’re only using one type of cable, it’s gonna be DP.
You can also get GT730's with 4xHDMI - not fast, but great for office work and display/status boards type scenarios. Single slot passive design too, so you can stack several in a single PC. Currently just £63 UK each.
Because you can actually fit 4 of them without impinging airflow from the heatsink. Mini HDMI is mechanically ass and I've never seen it anywhere but junky Android tablets.
DP also isn't proprietary.
As far as things I care about go, the HDMI Forum’s overt hostility[1] to open-source drivers is the important part, but it would indeed be interesting to know what Intel cared about there.
(Note that some self-described “open” standards are not royalty-free, only RAND-licensed by somebody’s definition of “R” and “ND”. And some don’t have their text available free of charge, either, let alone have a development process open to all comers. I believe the only thing the phrase “open standard” reliably implies at this point is that access to the text does not require signing an NDA.
DisplayPort in particular is royalty-free—although of course with patents you can never really know—while legal access to the text is gated[2] behind a VESA membership with dues based on the company revenue—I can’t find the official formula, but Wikipedia claims $5k/yr minimum.)
See, the openness is one reason I'd lean towards Intel ARC. They literally provide programming manuals for Alchemist, which you could use to implement your own card driver. Far more complete and less whack than dealing with AMD's AtomBIOS.
As someone who has toyed with OS development, including a working NVMe driver, that's not to be underestimated. I mean, it's an absurd idea, graphics is insanely complex. But documentation makes it theoretically possible... a simple framebuffer and 2d acceleration for each screen might be genuinely doable.
I'm not 100% sure, but last time I looked it wasn't openly available anymore - it may still be royalty-free, but when I tried to download the specification the site said you now have to be a member of VESA to download the standard (it is still possible to find earlier versions openly).
That's because DP sources can (and nearly always do) support encoding HDMI as a secondary mode, so all you need is a passive adapter. Going the other way requires active conversion.
I assume you have to pay HDMI royalties for DP ports which support the full HDMI spec, but older HDMI versions were supersets of DVI, so you can encode a basic HDMI compatible signal without stepping on their IP.
As long as the port supports it passively (called "DP++ Dual Mode"), if you have a DP-only port then you need an active converter which are the same as the latter pricing you mentioned.
USB-C would fit and has display port alt mode, technically. Not much out there natively supports it. Mini DP can be passively converted to a lot however so I assume that was the choice. Also Nvidia workstation cards have similar port configuration.
DP is perfectly fine for gaming (it's better than HDMI). The only reason HDMI is lingering around is the cartel which profits from patents on it, and manufacturers of TVs which stuff them with HDMI and don't provide DP or USB-C ports.
Otherwise HDMI would have been dead a long time ago.
There’s also weirdness with the drivers and hdmi, I think around encryption mainly. But if you only have DP and include an adapter, it’s suddenly “not my problem” from the perspective of Intel.
HDMI is shit. If you've never had problems with random machine hdmi port -> hdmi cable -> hdmi port on monitor you just haven't had enough monitors.
> Is HDMI seen as a “gaming” feature
It's a tv content protection feature. Sometimes it degrades the signal so you feel like you're watching tv. I've had this monitor/machine combination that identified my monitor as a tv over hdmi and switched to ycbcr just because it wanted to, with assorted color bleed on red text.
It's not competing with amd/nvidia at twice the price in terms of performance, but it's also too expensive for a cheap gaming rig. And then there are people who are happy with integrated graphics.
Maybe I'm just lacking imagination here, I don't do anything fancy on my work and couch laptops and I have a proper gaming PC.
Last time I had anything to do with the low-mid range pro GPU world, the use case was 3D CAD and certain animation tasks. That was ~10 years ago, though.
CAD and medical were always the use cases for high end workstations and professional GPUs. Companies designing jets and cars need more than an iGPU, but they prefer slim desktops and something distanced from games.
An obvious use case is high-end NVRs. Low power, ample GPU for object detection/tracking, ample encoders for streaming. Should make a good surveillance platform.
With SR-IOV* there is a low cost path for GPU in virtual machines. Until now this has (mostly) been a feature exclusive to costly "enterprise" GPUs. Combine that with the good encoders and some VDI software and you have VM hosted GPU accelerated 3D graphics to remote displays. There are many business use cases for this, and no small number of "home lab" use cases as well.
Linux is a first class citizen with Intel's display products, and B50/60 is no different, so it's a nice choice when you want a GPU accelerated Linux desktop with minimum BS. Given the low cost and power, it could find its way into Steam consoles as well.
Finally, Intel is the scrappy competitor in this space: they are being very liberal with third parties and their designs, unlike the incumbents. We're already seeing this with Maxsun and others.
* Intel has promised this for B50/60 in Q4
I have an NVidia Tesla P40 in my home NAS/media server that I use for video encoding purposes. It doesn’t even have any video outputs, but it does have dual media encoders and a decent amount of VRAM for lots of (relatively) high quality simultaneous transcoding streams using NVENC/NVDEC to re-encode 4K Blu-ray remux’s on the fly.
A lot of them don’t though. My Xeon doesn’t, so I threw a cheap used Nvidia Tesla P40 in there to do the job. Also it can handle a lot more simultaneous streams than any iGPU I’m aware of.
Another advantage of Intel GPU is vGPU SR-IOV, while consumer video cards of NVIDIA and AMD didn't support it. But even the integrated GPU of N100, N97 support it[1],
Therefore I can install Proxmox VE and run multiple VMs, assigning a vGPU to each of them for video transcoding (IPCam NVR), AI and other applications.
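For context, here's a minimal sketch of the generic Linux SR-IOV mechanism this kind of setup relies on: virtual functions are created through the device's standard sysfs node. The PCI address (0000:00:02.0, a typical Intel iGPU) and the VF count are illustrative, and the Proxmox/driver-specific steps are outside this sketch.

    from pathlib import Path

    gpu = Path("/sys/bus/pci/devices/0000:00:02.0")   # illustrative PCI address of the GPU

    # How many virtual functions the device/driver advertises
    total = int((gpu / "sriov_totalvfs").read_text())
    print(f"device supports up to {total} virtual functions")

    # Create 4 vGPUs (needs root); each VF then appears as its own PCI device
    # that can be passed through to a VM.
    (gpu / "sriov_numvfs").write_text("4")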
I'm glad Intel is continuing to make GPUs, really. But ultimately it seems like an uphill battle against a very entrenched monopoly with a software and community moat that was built up over nearly 20 years at this point. I wonder what it will take to break through.
If you buy Intel Arc cards for their competitive video encoding/decoding capabilities, it appears that all of them are still capped at 8 parallel streams. The "B" series have more headroom at high resolutions and bitrates, on the other hand some "A" series cards need only a single PCIe slot so you can stick more of them into a single server.
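To make "parallel streams" concrete, here's a hedged sketch of kicking off several Quick Sync transcodes at once via ffmpeg (file names and bitrate are placeholders; real NVR/Plex setups drive this through their own pipelines):

    import subprocess

    jobs = []
    for i in range(8):  # the per-card cap discussed above
        jobs.append(subprocess.Popen([
            "ffmpeg", "-y",
            "-hwaccel", "qsv",                 # hardware decode on the Intel GPU
            "-i", f"camera_{i}.mp4",           # placeholder input
            "-c:v", "hevc_qsv", "-b:v", "4M",  # hardware HEVC encode
            f"camera_{i}_h265.mp4",
        ]))

    for job in jobs:
        job.wait()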
It's half-height (fits in "slim" desktops, those media center PCs, and in a 2U server without having to turn it sideways/use a riser), and barely longer than the PCIe socket. Phoronix has a picture with a full-height bracket which maybe gives a better point of comparison: https://www.phoronix.com/review/intel-arc-pro-b50-linux
(A half-height single-slot card would be even smaller, but those are vanishingly rare these days. This is pretty much as small as GPUs get unless you're looking more for a "video adapter" than a GPU.)
Agreed. I have an A40 GPU in an epyc system right now specifically because it's a single slot card. I did not pay for gobs of PCIE expansion in this system just to block slots with double wide GPUs. Sure it can't do the heavy lift of some beefier cards but there is a need for single space cards still.
Kind of. It's more two 24gb b60s in a trenchcoat. It connects to one slot but it's two completely separate gpus and requires the board to support pcie bifurcation.
> and requires the board to support pcie bifurcation
And lanes. My board has two PCIe x16 slots fed by the CPU, but if I use both they'll only get x8 lanes each. Thus if I plugged two of these in there, I'd still only have two working GPUs, not four.
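If you want to check what your cards actually negotiated, the standard Linux sysfs attributes expose it; a small sketch (the 0x03 display-class filter is just a convenience):

    from pathlib import Path

    for dev in Path("/sys/bus/pci/devices").iterdir():
        if (dev / "class").read_text().startswith("0x03"):   # display controllers
            cur = (dev / "current_link_width").read_text().strip()
            mx = (dev / "max_link_width").read_text().strip()
            print(f"{dev.name}: running at x{cur}, card supports up to x{mx}")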
I think the answer to that right now is highly workload dependent. From what I have seen, it is improving rapidly, but still very early days for the software stack compared to Nvidia
Intel is doing poorly, but I believe Apple was in much, much worse shape than this in the early 2000's. AMD was also in much, much worse shape than this.
Intel has many, many solid customers at the government, enterprise and consumer levels.
They will be around.
> Intel is doing poorly, but I believe Apple was in much, much worse shape than this in the early 2000's. AMD was also in much, much worse shape than this.
Were they really? I don't think Intel is going anywhere any time soon either, but damn do they seem in bad shape. AMD, didn't they just have lackluster products for a few years and they were kind of the scrappy budget underdogs? I don't recall their fate seeming so...hopeless.
Wasn't that before the era of hyperscalers? Intel offers nothing of value anymore. What's to stop one of the giants from just swallowing them up like 3dfx or ATI?
They sell a lot to the hyperscalers as well, suggesting they offer something of value. I don't think anything prevents them from being swallowed up, but I'm not sure of what value that would be to a hyperscaler unless they want to get into the chip making business.
It clocks in at 1503.4 samples per second, behind the NVidia RTX 2060 (1590.93 samples / sec, released Jan 2019), AMD Radeon RX 6750 XT (1539, May 2022), and Apple M3 Pro GPU 14 cores (1651.85, Oct 2023).
Note that this perf comparison is just ray-tracing rendering, useful for games, but might give some clarity on performance comparisons with its competition.
It wouldn't surprise me if there was 10-20% perf improvement in drivers/software for this. Intel's architecture is pretty new and nothing is optimized for it yet.
I really think Intel is on the right track to dethrone both AMD and NVIDIA, while also competing with ARM SoCs. It's fascinating to watch.
Both their integrated and dedicated GPUs have been steadily improving each generation. The Arc line is both cheaper and comparable in performance to more premium NVIDIA cards. The 140T/140V iGPUs do the same to AMD APUs. Their upcoming Panther Lake and Nova Lake architectures seem promising, and will likely push this further. Meanwhile, they're also more power efficient and cooler, to the point where Apple's lead with their ARM SoCs is not far off. Sure, the software ecosystem is not up to par with the competition yet, but that's a much easier problem to solve, and they've been working on that front as well.
I'm holding off on buying a new laptop for a while just to see how this plays out. But I really like how Intel is shaking things up, and not allowing the established players to rest on their laurels.
I really hope Intel continues with GPUs, or the GPU market is doomed until China catches up. Nvidia produces good products with great software, the best in the industry really, with long support lifecycles, but that doesn't excuse them from monopolistic practices. The fact that AMD refuses to compete really makes it look like this entire thing is organized from the top (US government).
This reminds me a lot of the LLM craze and how they wanted to charge so much for simple usage at the start until China released DeepSeek. Ideally we shouldn't rely on China, but do we have a choice? The entire US economy has become reliant on monopolies to keep their insanely high stock prices and profit margins.
A $350 “workstation” GPU with 16 GB of VRAM? I... guess, but is that really enough for the kinds of things that would have you looking for workstation-level GPUs in the first place?
>Overall the Intel Arc Pro B50 was at 1.47x the performance of the NVIDIA RTX A1000 with that mix of OpenGL, Vulkan, and OpenCL/Vulkan compute workloads both synthetic and real-world tests. That is just under Intel's own reported Windows figures of the Arc Pro B50 delivering 1.6x the performance of the RTX A1000 for graphics and 1.7x the performance of the A1000 for AI inference. This is all the more impressive when considering the Arc Pro B50 price of $349+ compared to the NVIDIA RTX A1000 at $420+.
I guess it's a boon for Intel that NVidia repeatedly shoots their own workstation GPUs in the foot...
(And yeah, maybe your workstation doesn't have Thunderbolt, because motherboard vendors are lame — but then you just need a Thunderbolt PCIe card, which is guaranteed to fit more easily into your workstation chassis than a GPU would!)
https://www.gigabyte.com/Graphics-Card/GV-N5090IXEB-32GD
The thing you linked is just a regular Gigabyte-branded 5090 PCIe GPU card (that they produced first, for other purposes; and which does fit into a regular x16 PCIe slot in a standard ATX chassis), put into a (later-designed) custom eGPU enclosure. The eGPU box has some custom cooling [that replaces the card's usual cooling] and a nice little PSU — but this is not any more "designing the card around the idea it'll be used in an enclosure" than what you'd see if an aftermarket eGPU integrator built the same thing.
My point was rather that, if an OEM [that produces GPU cards] were to design one of their GPU cards specifically and only to be shipped inside an eGPU enclosure that was designed together with it — then you would probably get higher perf, with better thermals, at a better price(!), than you can get today from just buying standalone peripheral-card GPU (even with the cost of the eGPU enclosure and the rest of its components taken into account!)
Where by "designing the card and the enclosure together", that would look like:
- the card being this weird nonstandard-form-factor non-card-edged thing that won't fit into an ATX chassis or plug into a PCIe slot — its only means of computer connection would be via its Thunderbolt controller
- the eGPU chassis the card ships in, being the only chassis it'll comfortably live in
- the card being shaped less like a peripheral card and more like a motherboard, like the ones you see in embedded industrial GPU-SoC [e.g. automotive LiDAR] use-cases — spreading out the hottest components to ensure nothing blocks anything else in the airflow path
- the card/board being designed to expose additional water-cooling zones — where these zones would be pointless to expose on a peripheral card, as they'd be e.g. on the back of the card, where the required cooling block would jam up against the next card in the slot-array
...and so on.
It's the same logic that explains why those factory-sealed Samsung T-series external NVMe pucks can cost less than the equivalent amount of internal m.2 NVMe. With m.2 NVMe, you're not just forced into a specific form-factor (which may not be electrically or thermally optimal), but you're also constrained to a lowest-common-denominator assumption of deployment environment in terms of cooling — and yet you have to ensure that your chips stay stable in that environment over the long term. Which may require more-expensive chips, longer QC burn-in periods, etc.
But when you're shipping an appliance, the engineering tolerances are the tolerances of the board-and-chassis together. If the chassis of your little puck guarantees some level of cooling/heat-sinking, then you can cheap out on chips without increasing the RMA rate. And so on. This can (and often does) result in an overall-cheaper product, despite that product being an entire appliance vs. a bare component!
The hottest one on the consumer market
> The eGPU box has some custom cooling
Custom liquid cooling to tame the enormous TDP
> and a nice little PSU
Yeah, an 850W one.
>were to design one of their GPU cards specifically and only to be shipped inside an eGPU enclosure that was designed together with it
And why would they do that?
Do you realize it would drive the price up a lot?
> at a better price(!)
With lower production/sales numbers than a regular 5090 GPU? No way. Economics 101.
> the card being this weird nonstandard-form-factor non-card-edged thing
Even if we skip the small-series nuances (which make this a non-starter on price alone), there is little that some other 'nonstandard form factor' can do for the cooling - you still need the RAM near the chip... and that's all. You've just designed the same PCIe card for the sake of making it incompatible.
> won't ... plug into a PCIe slot
Again - why? What would this provide that the current PCIe GPU lacks? BTW you still need the 16 lanes of PCIe, and you know which connector provides the most useful and cost-effective way to do that? A regular x16 PCIe connector. The one you ditched.
> the card being shaped less like a peripheral card and more like a motherboard
You don't need to 're-design it from scratch'; it's enough not to be constrained by a 25cm length limit, so you can have proper airflow along a properly oriented heatsink.
> why those factory-sealed Samsung T-series external NVMe pucks
Lol: https://www.zdnet.com/article/why-am-i-taking-this-samsung-t...
With 16GB everybody will just call it another in the long list of Intel failures.
My first software job was at a place doing municipal architecture. The modelers had and needed high end GPUs in addition to the render farm, but plenty of roles at the company simply needed anything with better than what the Intel integrated graphics of the time could produce in order to open the large detailed models.
In these roles the types of work would include things like seeing where every pipe, wire, and plenum for a specific utility or service was in order to plan work between a central plant and a specific room. Stuff like that doesn’t need high amounts of VRAM since streaming textures in worked fine. A little lag never hurt anyone here as the software would simply drop detail until it caught up. Everything was pre-rendered so it didn’t need large amounts of power to display things. What did matter was having the grunt to handle a lot of content and do it across three to six displays.
Today I’m guessing the integrated chips could handle it fine but even my 13900K’s GPU only does DisplayPort 1.4 and up to only three displays on my motherboard. It should do four but it’s up to the ODMs at that point.
For a while Matrox owned a great big slice of this space, but eventually everyone fell by the wayside except NVidia and AMD.
Prices from NewEgg on 16gb+ consumer cards, sold by NewEgg and in stock.
This makes it mysterious since clearly CUDA is an advantage, but higher VRAM lower cost cards with decent open library support would be compelling.
Are there any performance bottlenecks with using 2 cards instead of a single card? I don't think any of the consumer Nvidia cards use NVLink anymore, or at least they haven't for a while now.
Plenty of people use eg 2, 4 or 6 3090s to run large models at acceptable speeds.
Higher VRAM at decent (much faster than DDR5) speeds will make cards better for AI.
Intel and even AMD can’t compete or aren’t bothering. I guess we’ll see how the glued 48GB B60 will do, but that’s still a relatively slow GPU regardless of memory. Might be quite competitive with Macs, though.
People actually use loaded out M-series macs for some forms of AI training. So, total memory does seem to matter in certain cases.
But Intel is still lost in its hubris, and still thinks it's a serious player and "one of the boys", so it doesn't seem like they want to break the line.
Given the high demand of graphic cards, is this a plausible scenario?
Given how young and volatile this domain still is, it doesn't seem unreasonable to be wary of it. Big players (google, openai and the likes) are probably pouring tons of money into trying to do exactly that
Businesswise? Because Intel management are morons. And because AMD, like Nvidia, don't want to cannibalize their high end.
Technically? "Double the RAM" is the most straightforward (that doesn't make it easy, necessarily ...) way to differentiate as it means that training sets you couldn't run yesterday because it wouldn't fit on the card can now be run today. It also takes a direct shot at how Nvidia is doing market segmentation with RAM sizes.
Note that "double the RAM" is necessary but not sufficient.
You need to get people to port all the software to your cards to make them useful. To do that, you need to have something compelling about the card. These Intel cards have nothing compelling about them.
Intel could also make these cards compelling by cutting the price in half or dropping two dozen of these cards on every single AI department in the US for free. Suddenly, every single grad student in AI will know everything about your cards.
The problem is that Intel institutionally sees zero value in software and is incapable of making the moves it needs to compete in this market. Since software isn't worth anything to Intel, there is no way to justify any business action that isn't just "sell (kinda shitty) chips".
- fewer people care about VRAM than HN commenters give the impression of
- VRAM is expensive and wouldn't make such cards profitable at the HN-desired price points
I don't get why there are people trying to twist this story or come up with strawmen like the A2000 or even the RTX5000 series. Intel's coming into this market competitively, which as far as I know is a first, and it's also impressive.
Coming into the gaming GPU market had always been too ambitious a goal for Intel, they should have started with competing in the professional GPU market. It's well known that Nvidia and AMD have always been price gouging this market so it's fairly easy to enter it competitively.
If they can enter this market successfully and then work their way up the food chain, that seems like a good way to recover from their initial fiasco.
Toss in a 5060 Ti into the compare table, and we're in an entirely different playing field.
There are reasons to buy the workstation NVidia cards over the consumer ones, but those mostly go away when looking at something like the new Intel. Unless one is in an exceptionally power-constrained environment, yet has room for a full-sized card (not SFF or laptop), I can't see a time the B50 would even be in the running against a 5060 Ti, 4060 Ti, or even 3060 Ti.
I seem to recall certain esoteric OpenGL things, like fast line rendering, being an NVIDIA marketing differentiator, as only certain CAD packages or similar cared about that. Is this still the case, or has that software segment moved on now?
For me (not quite at the A1000 level, but just above -- still in the prosumer price range), a major one is ECC.
Thermals and size are a bit better too, but I don't see that as $500 better. I actually don't see (m)any meaningful reasons to step up to an Ax000 series if you don't need ECC, but I'd love to hear otherwise.
We could just as well compare it to the slightly more capable RTX A2000, which was released more than 4 years ago. Either way, Intel is competing with the EoL Ampere architecture.
There are huge markets that do not care about SOTA performance metrics but need to get a job done.
That's a bold claim when their acceleration software (IPEX) is barely maintained and incompatible with most inference stacks, and their Vulkan driver is far behind it in performance.
At 16GB I'd still prefer to pay a premium for NVidia GPUs given its superior ecosystem, I really want to get off NVidia but Intel/AMD isn't giving me any reason to.
PS5 has something like 16GB unified RAM, and no game is going to really push much beyond that in VRAM use, we don’t really get Crysis style system crushers anymore.
This isn't really true from the recreational card side, nVidia themselves are reducing the number of 8GB models as a sign of market demand [1]. Games these days are regularly maxing out 6 & 8 GB when running anything above 1080p for 60fps.
The recent prevalence of Unreal Engine 5, often with poor optimization for weaker hardware, is also causing games to be released basically unplayable for most.
For recreational use the sentiment is that 8GB is scraping the bottom of the requirements. Again this is partly due to bad optimizations, but games are also being played at higher resolutions, which requires more memory for larger textures.
[1] https://videocardz.com/newz/nvidia-reportedly-reduces-supply...
While I dislike some of the Handmade Hero culture, they are right about one thing: how badly modern hardware tends to be used.
A good thing is that I turned to libre/indie gaming, with games such as Cataclysm DDA: Bright Nights, which has far lower requirements than a UE5 game and is still enjoyable thanks to its playability and in-game lore (and a proper ending compared to vanilla CDDA).
I know this is going back to Intel's Larrabee where they tried it, but I'd be really interested to see what the limits of a software renderer are now, considering the comparative strength of modern processors and the amount of multiprocessing. While I know there's DXVK or projects like dgVoodoo2, which can be an option with sometimes better backwards compatibility, pure software would seem like a more stable reference target than the gradually shifting landscape of GPUs/drivers/APIs.
Lavapipe on Vulkan makes vkQuake playable even on Core 2 Duo systems. Just as a concept, of course. I know about software-rendered Quakes since forever.
Never got far enough to interact with most systems in the game or worry about proper endings.
The Unreal Engine software renderer back then had a very distinct dithering pattern. I played it after I got a proper 3D card, but it didn't feel the same, felt very flat and lifeless.
A $500 32GB consumer GPU would be an obvious best seller.
Thus let's call it how it is: they don't want to cannibalize their higher end GPUs.
Here is a better idea what to do with all that money,
https://themarcooffset.com/
We're already seeing competitors of AWS but only targeting things like Qwen , deepseek, etc.
There are enterprise customers who are bound by compliance laws and genuinely want AI, but cannot use any of the top models because everything has to run on their own infrastructure.
That's pretty funny considering that PC games are moving more towards 32GB RAM and 8GB+ VRAM. The next generation of consoles will of course increase to make room for higher quality assets.
Gabe Newell - https://www.gamesradar.com/gabe-newell-says-games-dont-need-...
Detailed assets don't equate to good games.
You're wrong. It's probably more like 9 HN posters.
They also announced a 24 GB B60 and a double-GPU version of the same (saves you physical slots), but it seems like they don't have a release date yet (?).
https://www.asrock.com/Graphics-Card/Intel/Intel%20Arc%20Pro...
This to me is the gamer perspective. This segment really does not need even 32GB, let alone 64GB or more.
The only time usage was "high" was when I created a VM with 48GB RAM just for kicks.
It was useless. But I could say I had 64GB RAM.
If you have a private computer, why would you even buy something with 16GB in 2025? My 10 year old laptop had that much.
I'm looking for a new laptop and I'm looking at a 128GB setup - so those 200 Chrome tabs can eat it, and I have space to run other stuff, like those horrible Electron chat apps + a game.
But if I were to upgrade, I'd still get at least 128GB. Nobody's gonna be impressed with 64GB anymore. I don't need that in my life...
How so? The prosumer local AI market is quite large and growing every day, and is much more lucrative per capita than the gamer market.
Gamers are an afterthought for GPU manufacturers. NVIDIA has been neglecting the segment for years, and is now much more focused on enterprise and AI workloads. Gamers get marginal performance bumps each generation, and side effect benefits from their AI R&D (DLSS, etc.). The exorbitant prices and performance per dollar are clear indications of this. It's plain extortion, and the worst part is that gamers accepted that paying $1000+ for a GPU is perfectly reasonable.
> This segment really does not need even 32GB, let alone 64GB or more.
4K is becoming a standard resolution, and 16GB is not enough for it. 24GB should be the minimum, and 32GB for some headroom. While it's true that 64GB is overkill for gaming, it would be nice if that would be accessible at reasonable prices. After all, GPUs are not exclusively for gaming, and we might want to run other workloads on them from time to time.
While I can imagine that VRAM manufacturing costs are much higher than DRAM costs, it's not unreasonable to conclude that NVIDIA, possibly in cahoots with AMD, has been artificially controlling the prices. While hardware has always become cheaper and more powerful over time, for some reason, GPUs buck that trend, and old GPUs somehow appreciate over time. Weird, huh. This can't be explained away as post-pandemic tax and chip shortages anymore.
Frankly, I would like some government body to investigate this industry, assuming they haven't been bought out yet. Label me a conspiracy theorist if you wish, but there is precedent for this behavior in many industries.
During the cryptocurrency hype, GPUs were already going for insane prices, and together with low energy prices or surplus (which solar can cause, but nuclear can too) that allowed even governments to make cheap money (and do hashcat cracking, too). If I were North Korea I'd know my target. Turns out they did, but in a different way. That was around 2014. Add on top of this Stadia and GeForce Now as examples of renting GPUs for gaming (there are more, and Stadia flopped).
I didn't mention LLMs since that has been the most recent development.
All in all, it turns out GPUs are more valuable than what they were sold for if your goal isn't personal computer gaming. Hence prices have gone up.
Now, if you want to thoroughly investigate this market you need to figure out what large foreign forces (governments, businesses, and criminal enterprises) use these GPUs for. The US government has been aware of the above for a long time; hence the export restrictions on GPUs, which are meant to slow opponents down while it catches up. The opponent is the non-free world (China, North Korea, Russia, Iran, ...), though the current administration is acting insane.
NVIDIA is also taking consumers for a ride by marketing performance based on frame generation, while trying to downplay and straight up silence anyone who points out that their flagship cards still struggle to deliver a steady 4K@60 without it. Their attempts to control the narrative of media outlets like Gamers Nexus should be illegal, and fined appropriately. Why we haven't seen class-action lawsuits for this in multiple jurisdictions is beyond me.
Their GPU business is a slow upstart. If they have a play that could massively disrupt the competition, and has a small chance of epic failure, that should be very attractive to them.
The number of chips on the bus is usually pretty low (1 or 2 of them on most GPUs), so GPUs tend to have to scale out their memory bus widths to get to higher capacity. That's expensive and takes up die space, and for the conventional case (games) isn't generally needed on low end cards.
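Rough arithmetic behind that, assuming typical GDDR6 configurations (32-bit-wide chips, 2 GB per chip, "clamshell" doubling the chips per channel); the numbers are illustrative:

    def capacity_gb(bus_width_bits: int, gb_per_chip: int = 2, clamshell: bool = False) -> int:
        channels = bus_width_bits // 32            # one chip position per 32-bit channel
        chips = channels * (2 if clamshell else 1) # clamshell puts a second chip on each channel
        return chips * gb_per_chip

    print(capacity_gb(128))                   # 8 GB  (typical low-end card)
    print(capacity_gb(128, clamshell=True))   # 16 GB (same bus, double the chips)
    print(capacity_gb(256))                   # 16 GB
    print(capacity_gb(512, clamshell=True))   # 64 GB (needs a big, expensive die)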
What really needs to happen is someone needs to make some "system seller" game that is incredibly popular and requires like 48GB of memory on the GPU to build demand. But then you have a chicken/egg problem.
Example: https://wccftech.com/nvidia-geforce-rtx-5090-128-gb-memory-g...
You can get 96gb of vram and about 40-70% the speed of a 4090 for $4000.
Especially when you are running a large number of applications that you want to talk to each other it makes sense ... the only way to do it on a 4090 is to hit disk, shut the application down, start up the other application, read from disk ... it's slowwww... the other option is a multi-GPU system, but then it gets into real money.
trust me, it's a gamechanger. I just have it sitting in a closet. Use it all the time.
The other nice thing is unlike with any Nvidia product, you can walk into an apple store, pay the retail price and get it right away. No scalpers, no hunting.
Why not just buy 3 cards then? These cards don't require active cooling anyway, and you can fit 3 in a decent sized case. You will get 3x VRAM speed and 3x compute. And if your use case is LLM inference, it will be a lot faster than 1 card with 3x the VRAM.
> 3x VRAM speed and 3x compute
LLM scaling doesn’t work this way. If you have 4 cards, you may get 2x performance increase if you use vLLM. But you’ll also need enough VRAM to run FP8. 3 cards would only run at 1x performance.
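For reference, this is roughly how that 4-card split is requested via vLLM's Python API (the model name is a placeholder, and exact behavior depends on the version and on having enough VRAM per card):

    from vllm import LLM, SamplingParams

    # Shard the model's weights across 4 GPUs (tensor parallelism)
    llm = LLM(model="some-org/some-70b-model", tensor_parallel_size=4)

    outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
    print(outputs[0].outputs[0].text)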
Currently running them in different VMs to be able to make full use of them. I used to have them running in different Docker containers, however OOM exceptions would frequently bring down the whole server, which running in VMs helped resolve.
[1]: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inferen...
Which means Intel or AMD making an affordable high-VRAM card is win-win. If Nvidia responds in kind, Nvidia loses a ton of revenue they'd otherwise have available to outspend their smaller competitors on R&D. If they don't, they keep more of those high-margin customers but now the ones who switch to consumer cards are switching to Intel or AMD, which both makes the company who offers it money and helps grow the ecosystem that isn't tied to CUDA.
People say things like "it would require higher pin counts" but that's boring. The increase in the amount people would be willing to pay for a card with more VRAM is unambiguously more than the increase in the manufacturing cost.
It's more plausible that there could actually be global supply constraints in the manufacture of GDDR, but if that's the case then just use ordinary DDR5 and a wider bus. That's what Apple does and it's fine, and it may even cost less in pins than you save because DDR is cheaper than GDDR.
It's not clear what they're thinking by not offering this.
100% agree. CUDA is a bit of a moat, but the earlier in the hype cycle viable alternatives appear, the more likely the non CUDA ecosystem becomes viable.
> It's not clear what they're thinking by not offering this.
They either don't like making money or have a fantasy that one day soon they will be able to sell pallets of $100,000 GPUs they made for $2.50 like Nvidia can. It doesn't take a PhD and an MBA to figure out that the only reason Nvidia has what should be a short-term market available to them is the failure of Intel, AMD, and the VC/innovation side to offer any competition.
It is such an obvious win-win that it would probably be worth skipping the engineering and just announcing the product, for sale by the end of the year, to force everyone's hand.
I guess you already have the paper if it is that unambiguous. Would you mind sharing the data/source?
Cards with 16GB of VRAM exist for ~$300 retail.
Cards with 80GB of VRAM cost >$15,000 and customers pay that.
A card with 80GB of VRAM could be sold for <$1500 with five times the margin of the $300 card because the manufacturing cost is less than five times as much. <$1500 is unambiguously a smaller number than >$15,000. QED.
I think your argument is still true overall, though, since there are a lot of "gpu poors" (i.e. grad students) who write/invent in the CUDA ecosystem, and they often work in single card settings.
Fwiw Intel did try this with Arctic Sound / Ponte Vecchio, but it was late out the door and did not really perform (see https://chipsandcheese.com/p/intels-ponte-vecchio-chiplets-g...). It seems like they took on a lot of technical risk; hopefully some of that transfers over to a future project, though Falcon Shores was cancelled. They really should have released some of those chips even at a loss, but I don't know the cost of a tape out.
There is also work being done to make this even less relevant because people are already interested in e.g. using four 16GB cards without a fast interconnect when you have a 64GB model. The simpler implementation of this is to put a quarter of the model on each card split in the order it's used and then have the performance equivalent of one card with 64GB of VRAM by only doing work on the card with that section of the data in its VRAM and then moving the (much smaller) output to the next card. A more sophisticated implementation does something similar but exploits parallelism by e.g. running four batches at once, each offset by a quarter, so that all the cards stay busy. Not all workloads can be split like this but for some of the important ones it works.
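A minimal sketch of that first, simpler split (naive pipeline parallelism) in PyTorch, with four hypothetical CUDA devices, made-up layer sizes, and no batching tricks:

    import torch
    import torch.nn as nn

    devices = [f"cuda:{i}" for i in range(4)]            # four hypothetical 16GB cards

    # One quarter of the model lives on each card; the weights never move.
    stages = nn.ModuleList([
        nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to(dev)
        for dev in devices
    ])

    def forward(x: torch.Tensor) -> torch.Tensor:
        # Only one card works at a time; the (much smaller) activations hop between cards.
        for stage, dev in zip(stages, devices):
            x = stage(x.to(dev))
        return x

    out = forward(torch.randn(8, 4096, device=devices[0]))
    print(out.shape, out.device)   # torch.Size([8, 4096]) on cuda:3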
AMD has lagged so long because of the software ecosystem but the climate now is that they'd only need to support a couple popular model architectures to immediately grab a lot of business. The failure to do so is inexplicable.
I expect we will eventually learn that this was about yet another instance of anti-competitive collusion.
Lisa and Jensen are cousins. I think that explains it. Lisa can easily prove me wrong by releasing a high-memory GPU that significantly undercuts Nvidia's RTX 6000 Pro.
Why would you bother with any Intel product with an attitude like that, gives zero confidence in the company. What business is Intel in, if not competing with Nvidia and AMD. Is it giving up competing with AMD too?
That's the correct call in my opinion. Training is far more complex and will span multi data centers soon. Intel is too far behind. Inference is much simpler and likely a bigger market going forward.
That's how you get things like good software support in AI frameworks.
Inference is vastly simpler than training or scientific compute.
In many cases where 32GB won't be enough, 48 wouldn't be enough either.
Oh and the 5090 is cheaper.
Foundry business. The latest report on discrete graphics market share has Nvidia at 94%, AMD at 6% and Intel at 0%.
I may still have another 12 months to go. But in 2016 I made a bet against Intel engineers, on Twitter and offline, suggesting that GPUs are not a business they want to be in, or at least that they were too late. They said at the time they would get at least 20% market share by 2021. I said I would be happy if they got even 20% by 2026.
Intel is also losing money; they need cashflow to compete in the foundry business. I have long argued they should have cut off the GPU segment when Pat Gelsinger arrived; it turns out Intel bound themselves to GPUs through all the government contracts and supercomputers they promised to deliver. Now that they have delivered it all, or mostly, they will need to think about whether to continue or not.
Unfortunately, unless the US points guns at TSMC I just don't see how Intel will be able to compete, as Intel needs to be in a leading-edge position in order to command the margin required for Intel to function. Right now, in terms of density, Intel 18A is closer to TSMC N3 than N2.
If NVidia gets complacent as Intel has become when they had the market share in the CPU space, there is opportunity for Intel, AMD and others in NVidias margin.
They may not have to, frankly, depending on when China decides to move on Taiwan. It's useless to speculate—but it was certainly a hell of a gamble to open a SOTA (or close to it—4 nm is nothing to sneeze at) fab outside of the island.
I want hardware that I can afford and own, not AI/datacenter crap that is useless to me.
1. https://youtu.be/iM58i3prTIU?si=JnErLQSHpxU-DlPP&t=225
2. https://www.intel.com/content/www/us/en/developer/articles/t...
I like to Buy American when I can but it's hard to find out which fabs various CPUs and GPUs are made in. I read Kingston does some RAM here and Crucial some SSDs. Maybe the silicon is fabbed here but everything I found is "assembled in Taiwan", which made me feel like I should get my dream machine sooner rather than later
I have a service that runs continuously and reencodes any videos I have into h265 and the iGPU barely even notices it.
I'll have to consider pros and cons with Ultra chips, thanks for the tip.
Apologies for the video link. But a recent pretty in depth comparison: https://youtu.be/kkf7q4L5xl8
There really is no such thing as "buying American" in the computer hardware industry unless you are talking about the designs rather than the assembly. There are also critical parts of the lithography process that depend on US technology, which is why the US is able to enforce certain sanctions (and due to some alliances with other countries that own the other parts of the process).
Personally I think people get way too worked up about being protectionist when it comes to global trade. We all want to buy our own country's products over others but we definitely wouldn't like it if other countries stopped buying our exported products.
When Apple sells an iPhone in China (and they sure buy a lot of them), Apple is making most of the money in that transaction by a large margin, and in turn so are you since your 401k is probably full of Apple stock, and so are the 60+% of Americans who invest in the stock market. A typical iPhone user will give Apple more money in profit from services than the profit from the sale of the actual device. The value is really not in the hardware assembly.
In the case of electronics products like this, almost the entire value add is in the design of the chip and the software that is running on it, which represents all the high-wage work, and a whole lot of that labor in the US.
US citizens really shouldn't envy a job where people sit at an electronics bench doing repetitive assembly work for 12 hours a day in a factory, wishing we had more of those jobs in our country. They should instead focus on making higher education more available/affordable so that the US stays on top of the economic food chain, with most/all of its citizens doing high-value work, rather than letting education stay expensive and begging foreign manufacturers to open satellite factories to employ our uneducated masses.
I think the current wave of populist protectionist ideology is essentially blaming the wrong causes for declining affordability and increasing inequality for the working class. Essentially, people think that bringing the manufacturing jobs back and reversing globalism will right the ship on income inequality, but the reality is that the reason equality was so good for Americans in the mid-century was because the wealthy were taxed heavily, European manufacturing was decimated in WW2, and labor was in high demand.
The above of course is all my opinion on the situation, and a rather long tangent.
EDIT: I did think about what the closest thing to artisan silicon would be and thought of the POWER9 CPUs, and found out those are made in the USA. The Talos II is also manufactured in the US, with the IBM POWER9 processors fabbed in New York and the Raptor motherboard manufactured in Texas, which is also where their systems are assembled.
https://www.phoronix.com/review/power9-threadripper-core9
I randomly thought of paint companies as another example, with Sherwin-Williams and PPG having US plants.
The US is still the #2 manufacturer in the world, it's just a little less obvious in a lot of consumer-visible categories.
Also, do these support SR-IOV, as in handing slices of the GPU to virtual machines?
Is HDMI seen as a “gaming” feature, or is DP seen as a “workstation” interface? Ultimately HDMI is a brand that commands higher royalties than DP, so I suspect this decision was largely chosen to minimize costs. I wonder what percentage of the target audience has HDMI only displays.
Converting from DisplayPort to HDMI is trivial with a cheap adapter if necessary.
HDMI is mostly used on TVs and older monitors now.
Only now are DisplayPort 2 monitors coming out
Not cheap though. And also not 100% caveat-free.
I have a Level1Techs hdmi KVM and it's awesome, and I'd totally buy a display port one once it has built in EDID cloners, but even at their super premium price point, it's just not something they're willing to do yet.
1. https://www.store.level1techs.com/products/p/14-display-port...
[0] https://www.amazon.co.uk/ASUS-GT730-4H-SL-2GD5-GeForce-multi...
https://www.theregister.com/2024/03/02/hdmi_blocks_amd_foss/
[1] https://hackaday.com/2023/07/11/displayport-a-better-video-i...
[2] https://vesa.org/vesa-standards/
https://www.x.org/docs/intel/ACM/
https://github.com/Upinel/PVE-Intel-vGPU
All current Intel Flex cards seem to be based on the previous gen "Xe".
[1] https://www.maxsun.com/products/intel-arc-pro-b60-dual-48g-t...
The biggest Deepseek V2 models would just fit, as would some of the giant Meta open source models. Those have rather pleasant performance.
In theory, how feasible is that?
I feel like the software stack might be like a Jenga tower. And PCIe limitations might hit pretty hard.
> "Because 48GB is for spreadsheets, feed your rendering beast with a buffet of VRAM."
Edit: I guess it must just be a bad translation or sloppy copywriting, and they mean it's not just for spreadsheets, rather than that it is...
I would happily buy 96 GB for $3490, but this makes very little sense.
I have this cool and quiet fetish so 70 W is making me extremely interested. IF it also works as a gaming GPU.