Your Laptop Isn't Ready for LLMs. That's About to Change

(spectrum.ieee.org)

30 points | by barqawiz 4 hours ago

11 comments

  • Morromist 1 hour ago
    I was in the market for a laptop this month. Many new laptops now advertise AI features, like this "HP OmniBook 5 Next Gen AI PC," whose listing reads:

    "SNAPDRAGON X PLUS PROCESSOR - Achieve more everyday with responsive performance for seamless multitasking with AI tools that enhance productivity and connectivity while providing long battery life"

    I don't want this garbage on my laptop, especially when it's running off its battery! Running AI on your laptop is like playing StarCraft Remastered on the Xbox or Factorio on your Steam Deck. I hear you can play DOOM on a pregnancy test too. Sure, you can, but it's just going to be a tedious, inferior experience.

    Really, this is just a fine example of how overhyped AI is right now.

    • Legend2440 1 hour ago
      Laptop manufacturers are desperate to cash in on the AI craze. There's nothing special about an 'AI PC'. It's just a regular PC with Windows Copilot... which is a standard Windows feature anyway.

      >I don't want this garbage on my laptop, especially when it's running off its battery!

      The one bit of good news is it's not going to impact your battery life because it doesn't do any on-device processing. It's just calling an LLM in the cloud.

      • marcus_holmes 41 minutes ago
        Doesn't this lead to a lot of tension between the hardware makers and Microsoft?

        MS wants everyone to run Copilot on their shiny new data centre, so they can collect the data on the way.

        Laptop manufacturers are making laptops that can run an LLM locally, but there's no point in that unless there's a local LLM to run (and Windows won't have that because Copilot). Are they going to be pre-installing Llama on new laptops?

        Are we going to see a new power user / normal user split, where power users buy laptops that can run local LLMs (and come with them installed), and normal folks buy something that just calls Copilot?

        Any ideas?

        • zdragnar 12 minutes ago
          It isn't just Copilot that these laptops come with; manufacturers are already bundling their own AI chat apps as well.

          For example, the LG gram I recently got came with just such an app, named Chat, though the "AI button" on the keyboard (really just the right Alt or Ctrl key, I forget which) defaults to Copilot.

          If there's any tension at all, it's just over who gets to be the default app for the "AI button" on the keyboard, which I assume almost nobody actually uses.

        • autoexec 35 minutes ago
          > MS wants everyone to run Copilot on their shiny new data centre, so they can collect the data on the way.

          MS doesn't care where your data is. They're happy to go digging through your C drive to collect/mine whatever they want (assuming you can even avoid all the dark patterns they use to push you to save everything on OneDrive anyway), and they'll record all your interactions with any other AI using Recall.

          • marcus_holmes 29 minutes ago
            I had assumed that they needed the usage to justify the investment in the data centre, but you could be right and they don't care.
      • zamadatix 22 minutes ago
        > It's just a regular PC with Windows Copilot... which is a standard Windows feature anyway.

        "AI PC" branded devices get "Copilot+" and additional crap that comes with that due to the NPU. Despite desktops having GPUs with up to 50x more TOPs than the requirement, they don't get all that for some reason https://www.thurrott.com/mobile/copilot-pc/323616/microsoft-...

      • autoexec 51 minutes ago
        Even collecting and sending all that data to the cloud is going to drain battery life. I'd really rather my devices only do what I ask them to than have AI running in the background all the time trying to be helpful or just silently collecting data.
        • Legend2440 13 minutes ago
          Copilot is just ChatGPT as an app.

          If you don't use it, it will have no impact on your device. And it's not sending your data to the cloud except for anything you paste into it.

        • sandworm101 44 minutes ago
          >> I'd really rather my devices only do what I ask them to

          Linux hears your cry. You have a choice. Make it.

      • bitwize 59 minutes ago
        AI PCs also have NPUs which I guess provide accelerated matmuls, albeit less accelerated than a good discrete GPU.
  • socketcluster 1 hour ago
    I feel like there's no point in getting a graphics card nowadays. Clearly, graphics cards are optimized for graphics; they just happened to be good for AI. But given the increased significance of AI, I'd be surprised if we don't get more specialized chips and specialized machines just for LLMs: one for LLMs, a different one for Stable Diffusion.

    With graphics processing, you need a lot of bandwidth to get stuff in and out of the graphics card for rendering on a high-resolution screen, lots of pixels, lots of refreshes, lots of bandwidth... With LLMs, a relatively small amount of text goes in and a relatively small amount of text comes out over a reasonably long amount of time. The amount of internal processing is huge relative to the size of input and output. I think NVIDIA and a few other companies already started going down that route.

    But graphics cards will probably still be useful for Stable Diffusion, and especially AI-generated video, as the input and output bandwidth is much higher.

    • zamadatix 29 minutes ago
      > Clearly, graphics cards are optimized for graphics; they just happened to be good for AI

      I feel like the reverse has been true since after the Pascal era.

    • Legend2440 59 minutes ago
      LLMs are enormously bandwidth-hungry. You have to shuffle your 800 GB neural network in and out of memory for every token, which can take more time/energy than actually doing the matrix multiplies. Even GPUs are barely high-bandwidth enough.
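      A rough back-of-envelope sketch in Python (all numbers are illustrative assumptions, not measurements): when decoding is bandwidth-bound, tokens/sec is roughly memory bandwidth divided by the bytes of weights read per token.

          # Decode speed when memory bandwidth is the bottleneck.
          # All figures are assumed/illustrative, not benchmarks.
          def tokens_per_sec(bandwidth_gb_s: float, weight_gb: float) -> float:
              """Upper bound on tokens/sec if every weight byte is read once per token."""
              return bandwidth_gb_s / weight_gb

          print(tokens_per_sec(3300, 800))  # ~4 tok/s on ~3.3 TB/s of HBM, ignoring compute
          print(tokens_per_sec(120, 800))   # ~0.15 tok/s on a ~120 GB/s laptop memory bus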
      • Zambyte 42 minutes ago
        This doesn't seem right. Where is it shuffling to and from? My drives aren't fast enough to reload the model for every token, and I don't have enough system memory to unload models to.
        • zamadatix 32 minutes ago
          If you're using a MoE model like DeepSeek V3 the full model is 671 GB but only 37 GB are active per token, so it's more like running a 37 GB model from the memory bandwidth perspective. If you do a quant of that it could e.g. be more like 18 GB.
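          Continuing that back-of-envelope (same caveat: the figures below are assumptions), a MoE model only has to stream the active expert weights for each token, which is why it behaves like a much smaller model bandwidth-wise:

              # MoE decode back-of-envelope using the numbers above (illustrative only).
              full_gb, active_gb = 671, 37       # DeepSeek V3-style MoE
              quant_active_gb = active_gb * 0.5  # rough 4-bit-style quant of the active experts
              bandwidth_gb_s = 250               # hypothetical unified-memory machine

              print(bandwidth_gb_s / full_gb)          # ~0.4 tok/s if all 671 GB had to stream
              print(bandwidth_gb_s / active_gb)        # ~7 tok/s with only active experts
              print(bandwidth_gb_s / quant_active_gb)  # ~13 tok/s quantized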
        • Legend2440 28 minutes ago
          From VRAM to the tensor cores and back. On a modern GPU you can have 1-2 TB moving around inside the GPU every second.

          This is why they use high bandwidth memory for VRAM.

        • p1esk 32 minutes ago
          It is right. The shuffling is from CPU memory to GPU memory, and from GPU memory to the GPU itself. If you don’t have enough memory you can’t run the model.
        • smallerize 36 minutes ago
          You're probably not using an 800GB model.
    • autoexec 48 minutes ago
      I don't doubt that there will be specialized chips that make AI easier, but they'll be more expensive than the graphics cards sold to consumers, which means a lot of companies will just go with graphics cards: either the extra speed of specialized chips won't be worth the cost, or they'll be flat-out too expensive, priced for the small number of massive spenders who'll shell out insane amounts of money for any/every advantage (whatever they think that means) they can get over everyone else.
  • aappleby 1 hour ago
    I predict we will see compute-in-flash before we see cheap laptops with 128+ gigs of RAM.
    • p1esk 29 minutes ago
      We’ve had “compute in flash” for a few years now: https://mythic.ai/product/
    • zamadatix 20 minutes ago
      I can't tell if this is optimism for compute-in-flash or pessimism with how RAM has been going lately!
    • aitchnyu 32 minutes ago
      Memristors are (IME) missing from the news. They promised to act as both persistent storage and fast RAM.
    • wkat4242 1 hour ago
      Yeah, especially given what is happening in the memory market.
      • noosphr 1 hour ago
        Feast and famine.

        In three years we will be swimming in more ram than we know what to do with.

        • fallat 1 hour ago
          Kind of feel that's already the case today... I find 4 GB is still plenty even for business workloads.
          • autoexec 40 minutes ago
            Video games have driven the need for hardware more than office work has. Sadly, games are already being scaled back, and more time is being spent on optimization instead of content, since consumers can't be expected to have the kind of RAM they normally would, and everyone will be forced to make do with whatever RAM they have for a long time.
  • j45 3 minutes ago
    This must be referring mostly to Windows, or non-Apple laptops.
  • wkat4242 1 hour ago
    This article is so dumb. It totally ignores the memory price explosion that will make large, fast-memory laptops infeasible for years, and it states stuff like this:

    > How many TOPS do you need to run state-of-the-art models with hundreds of millions of parameters? No one knows exactly. It’s not possible to run these models on today’s consumer hardware, so real-world tests just can’t be done.

    We know exactly the performance needed for a given level of responsiveness. TOPS is just a measurement, independent of the type of hardware it runs on.

    The fewer TOPS, the slower the model runs, so the user experience suffers. Memory bandwidth and latency play a huge role too. And context: increase the context and the LLM becomes much slower.

    We don't need to wait for consumer hardware to know how much is needed. We can calculate that for given situations.
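    A minimal sketch of that kind of calculation (Python; the model size, quantization, and target speed are assumptions picked for illustration): compute needed is roughly 2 ops per parameter per token, and bandwidth needed is roughly the active weight bytes per token.

        # Rough sizing for a target interactive decode speed (all inputs are assumptions).
        params = 8e9            # hypothetical 8B-parameter local model
        bytes_per_param = 0.5   # ~4-bit quantized weights
        target_tok_s = 20       # "feels responsive" decode speed

        required_tops = 2 * params * target_tok_s / 1e12               # ~2 ops per weight per token
        required_bw_gb_s = params * bytes_per_param * target_tok_s / 1e9

        print(f"~{required_tops:.1f} TOPS, ~{required_bw_gb_s:.0f} GB/s")
        # -> ~0.3 TOPS of useful matmul throughput but ~80 GB/s of sustained bandwidth;
        #    for decoding, memory is usually the limiting factor, as noted above.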

    It also pretends small models are not useful at all.

    I think the massive cloud investments will unfortunately pull pressure away from local AI. That trend makes local memory expensive, and all those cloud billions have to be made back, so all the vendors are pushing their cloud subscriptions. I'm sure some functions will be local, but the brunt of it will be cloud, sadly.

    • vegabook 1 hour ago
      also, state of the art models have hundreds of _billions_ of parameters.
      • omneity 1 hour ago
        It tells you about their ambitions...
  • bfrog 1 hour ago
    I suppose it depends on the model; for code it was useless. As a lossy copy of an interactive Wikipedia it could be OK. Not good or great, just OK.

    Maybe for creative suggestions and editing it'd be OK.

  • seanmcdirmid 1 hour ago
    I’ve been running LLMs on my laptop (M3 Max, 64 GB) for a year now and I think they are ready, especially with how good mid-sized models are getting. I’m pretty sure unified memory and energy-efficient GPUs will be more than just a thing on Apple laptops in the next few years.
    • allovertheworld 58 minutes ago
      Only because of Apple's unified memory architecture. The groundwork is there; we just need memory to be cheaper so we can fit 512+ GB now ;)
      • seanmcdirmid 14 minutes ago
        Memory prices will rise short term and generally fall long term; even with the current supply hiccup, the answer is to just build out more capacity (which will happen if there is healthy competition). I mean, I expect the other mobile chip providers to adopt a unified architecture, with beefy GPU cores on chip and lots of bandwidth connecting them to memory (at the Max or Ultra level, at least). I think AMD is already doing unified memory, at least?
  • fwipsy 1 hour ago
    Seems like wishful thinking.

    > How many TOPS do you need to run state-of-the-art models with hundreds of millions of parameters? No one knows exactly.

    Why not extrapolate from the open-source AIs which are available? The most powerful open-source AI (that I know of) is Kimi K2, at >600 GB. Running it at acceptable speed requires 600+ GB of GPU/NPU memory. Even $2000-3000 AI-focused PCs like the DGX Spark or Strix Halo typically top out at 128 GB. Frontier models will only run on something that costs many times what a typical consumer PC does, and that's only going to get worse with RAM pricing.

    In 2010 the typical consumer PC had 2-4 GB of RAM. Now the typical PC has 12-16 GB. This suggests RAM size doubles perhaps every 5 years at best. If that's the case, we're 25-30 years away from the typical PC having enough RAM to run Kimi K2.
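    The extrapolation itself is just a doubling-time estimate (figures taken from the paragraph above, and of course only a rough trend line):

        import math
        # ~16 GB typical today, ~600 GB needed, RAM doubling roughly every 5 years.
        doublings = math.log2(600 / 16)   # ~5.2 doublings
        print(doublings * 5)              # ~26 years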

    But the typical user will never need that much RAM for basic web browsing, etc. The typical computer RAM size is not going to keep growing indefinitely.

    What about cheaper models? It may be possible to run a "good enough" model on consumer hardware eventually. But I suspect that for at least 10-15 years, typical consumers (HN readers may not be typical!) will prefer capability, cheapness, and especially reliability (not making mistakes) over being able to run the model locally. (Yes AI datacenters are being subsidized by investors; but they will remain cheaper, even if that ends, due to economies of scale.)

    The economics dictate that AI PCs are going to remain a niche product, similar to gaming PCs. Useful AI capability is just too expensive to add to every PC by default. It's like saying flying is so important, everyone should own an airplane. For at least a decade, likely two, it's just not cost-effective.

    • sipjca 43 minutes ago
      > It may be possible to run a "good enough" model on consumer hardware eventually

      10-15 years?!!!! What is the definition of good enough? Qwen3 8B or A30B are quite capable models which run on a lot of hardware even today. SOTA is not just getting bigger; it's also getting more intelligent and more efficient to run. There have been massive gains in intelligence at the smaller model sizes; it is just highly task-dependent. Arguably some of these models are "good enough" already, and the level of intelligence and instruction following is much better than even a year ago. Sure, not Opus 4.5 level, but much could still be done without that level of intelligence.
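      As a rough illustration of why an 8B-class model already fits on ordinary hardware (the quantization and context figures below are assumptions, not a spec):

          # Approximate memory footprint of a quantized 8B model (illustrative numbers).
          params = 8e9
          weights_gb = params * 0.55 / 1e9   # ~4-bit quant plus overhead -> ~4.4 GB
          kv_cache_gb = 1.0                  # rough allowance for a few thousand tokens of context
          print(weights_gb + kv_cache_gb)    # ~5-6 GB: fits in 8 GB of VRAM or 16 GB of unified memory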

    • epicureanideal 43 minutes ago
      You may be correct, but I wonder if we'll see Mac Mini sized external AI boxes that do have the 1TB of RAM and other hardware for running local models.

      Maybe 100% of computer users wouldn't have one, but maybe 10-20% of power users would, including programmers who want to keep their personal code out of the training set, and so on.

      I would not be surprised though if some consumer application made it desirable for each individual, or each family, to have local AI compute.

      It's interesting to note that everyone owns their own computer, even though a personal computer sits idle half the day, and many personal computers hardly ever run at 80% of their CPU capacity. So the inefficiency of owning a personal AI server may not be as much of a barrier as it would seem.

      • seanmcdirmid 10 minutes ago
        > but I wonder if we'll see Mac Mini sized external AI boxes that do have the 1TB of RAM

        Isn't that the Mac Studio already? Ok, it seems to max at 512 GB.

    • marcus_holmes 30 minutes ago
      > In 2010 the typical consumer PC had 2-4 GB of RAM. Now the typical PC has 12-16 GB. This suggests RAM size doubles perhaps every 5 years at best. If that's the case, we're 25-30 years away from the typical PC having enough RAM to run Kimi K2.

      Part of the reason that RAM isn't growing faster is that there's no need for that much RAM at the moment. Technically you can put multiple TB of RAM in your machine, but no one does that because it's a complete waste of money [0]. Unless you're working in a specialist field, 16 GB of RAM is enough, and adding more doesn't make anything noticeably faster.

      But given a decent use case, like running an LLM locally, you'd find demand for lots more RAM, and that would drive supply and new technology developments, and in ten years it'll be normal to have 128 TB of RAM in a baseline laptop.

      Of course, that does require that there is a decent use-case for running an LLM locally, and your point that that is not necessarily true is well-made. I guess we'll find out.

      [0] apart from a friend of mine working on crypto who had a desktop Linux box with 4TB of RAM in it.

  • esses 1 hour ago
    I spent a good 30 seconds trying to figure out what DDS was an acronym for in this context.
  • gguncth 45 minutes ago
    I have no desire to run an LLM on my laptop when I can run one on a computer the size of six football fields.
    • sandworm101 33 minutes ago
      I've been playing around with my own home-built AI server for a couple of months now. It is so much better than using a cloud provider. It is the difference between drag racing in your own car and renting one from a dealership. You are going to learn far more doing things yourself, your tools will be much more consistent, and you will walk away with a far greater understanding of every process.

      A basic last-generation PC with something like a 3060 (12 GB) is more than enough to get started. My current rig pulls less than 500 W with two cards (3060 + 5060). And, given the current temperature outside, the rig helps heat my home, so I am not contributing to global warming, water consumption, or any other datacenter-related environmental evil.
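      For anyone wondering what "getting started" looks like in code, here is a minimal Python sketch. It assumes a llama.cpp-style server (or anything else exposing an OpenAI-compatible endpoint) is already running locally; the port, model name, and prompt are placeholders to adapt to your own setup.

          import json
          import urllib.request

          # Assumes a local server (e.g. llama.cpp's llama-server) is listening on
          # localhost:8080 with an OpenAI-compatible chat completions endpoint.
          URL = "http://localhost:8080/v1/chat/completions"

          payload = {
              "model": "local-model",  # placeholder; many local servers ignore or remap this
              "messages": [
                  {"role": "user",
                   "content": "Why do local LLMs need so much memory bandwidth?"}
              ],
              "max_tokens": 200,
          }

          req = urllib.request.Request(
              URL,
              data=json.dumps(payload).encode("utf-8"),
              headers={"Content-Type": "application/json"},
          )
          with urllib.request.urlopen(req) as resp:
              reply = json.loads(resp.read())

          print(reply["choices"][0]["message"]["content"])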