I am building a cloud

(crawshaw.io)

319 points | by bumbledraven 4 hours ago

47 comments

  • dajonker 1 hour ago
    > Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.

    So well put, my good sir, this describes exactly my feelings with k8s. It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

    After spending a lot of time "optimizing" or "hardening" the cluster, cloud spend has doubled or tripled. Incidents have also doubled or tripled, as has downtime. Debugging effort has doubled or tripled as well.

    I ended up saying goodbye to those devops folks, nuking the cluster, booting up a single VM with Debian, enabling the firewall, and using Kamal to deploy the app with Docker. Despite having only a single VM rather than a cluster, things have never been more stable and reliable from an infrastructure point of view. Costs have plummeted as well; it's so much cheaper to run. It's also much easier and more fun to debug.

    And yes, a single VM really is fine; you can get REALLY big VMs, which are fine for most business applications like ours. Most business applications only have hundreds to thousands of users. The cloud provider (Google in our case) manages hardware failures. In case we need to upgrade with downtime, we spin up a second VM next to it, provision it, and update the IP address in Cloudflare. Not even any need for a load balancer.

    • adamtulinius 1 hour ago
      If you spin up Kubernetes for "a couple of containers to run your web app", I think you're doing something wrong in the first place, especially coupled with your comment about adding SDN on top of Kubernetes.

      People use Kubernetes for way too small things, and it sounds like you don't have the scale for actually running Kubernetes.

      • dajonker 1 hour ago
        I totally agree, but that's not what happens in reality: the average devops knows k8s and will slap it onto anything they see (if only so they can put it on their resume). The average manager hears about k8s, gets convinced they need it, and hires the aforementioned devops to build it.
        • goombaskoop 48 minutes ago
          > the average devops knows k8s and will slap it onto anything they see

          This is certainly the case in all the third-person accounts I hear. Online. I've never actually met a single one like that; if anything, those same people are the first to tell me about their Hetzner setups.

          • hkt 17 minutes ago
            DevOps here.

            The trouble is that we are literally expected to do this everywhere we go. I've personally advocated for approaches which use, say, a pair of dedicated servers, or VMs as in the GP's example. If you want it outside of AWS/GCP/Azure, you're regarded as a crazy person. If you don't adopt "best practices" (as defined by vendors), then management is scared. Management very often trusts the sales and marketing departments of big vendors more than their own staff. Many of us have given up fighting this, because what it comes down to is a massive asymmetry of information and trust.

        • darkwater 33 minutes ago
          And the average developer doesn't even know where to start to deploy things in prod. When the feature the product folks asked for passes QA... on to the next sprint! We are done!
      • Thanemate 44 minutes ago
        I know that "resume-driven development" exists, where the tradeoffs between approaches aren't judged on the technical fit of the solution but on career trajectory. I've seen people write plain workstation-preparation scripts in Rust, only to have something to flex about in interviews.

        I'm not surprised in the slightest that DevOps workers will slap k8s on everything, to show "real industry experience" in a job market where resumes are matched against tools.

        • ororoo 35 minutes ago
          there are also people with a devops title who don't know anything other than the hammer, and then everything is a hammer problem.

          I mean, I worked with people who were surprised that you can run more than one application inside an EC2 VM.

      • rvz 1 hour ago
        They use it for inflating their resume for career progression rather than actually evaluating if they need it in the first place.

        This is why you get many folks over-thinking the solution, picking the most hyped technologies, and using them to solve the wrong problems without thinking about what they are selling.

        You don't need K8s + AWS EC2 + S3 just to host a web app. That tells me they like lighting money on fire and bankrupting the company and moving to the next one.

    • psviderski 16 minutes ago
      A single VM is indeed the most pragmatic setup that most apps really need. However, I still prefer to have at least two for a little redundancy and peace of mind. It’s just less stressful to do any upgrades or changes knowing there is another replica in case of a failure.

      And I’m building and happily using Uncloud (https://github.com/psviderski/uncloud) for this (inspired by Kamal). It makes multi-machine setups as simple as a single VM. Creates a zero-config WireGuard overlay network and uses the standard Docker Compose spec to deploy to multiple VMs. There is no orchestrator or control plane complexity. Start with one VM, then add another when needed, can even mix cloud VMs and on-prem.

    • eddythompson80 1 hour ago
      And those devops folks just let your single Debian VM be? It sounds like you have, like many of us, an organizational/people problem, not a k8s problem.

      Maybe those devops folks only pay attention to k8s clusters and you're flying under their radar with your single Debian VM + Kamal. But the same thinking that results in an overly complex, impossible-to-debug, expensive-to-run k8s cluster can absolutely result in the same using regular VMs unless, again, you are just left to your own devices because their policies don't apply to VMs, yet.

      The problem usually is you're one mistake away from someone shoving their nose in it. "What are you doing again? What about HA and redundancy? Slow rollout and rollback? You must have at least 3 VMs (ideally 5) and can't expose all VMs to the internet, of course. You must define a virtual network with policies that we can control, and no, WireGuard isn't approved. You must split the internet-facing load balancer from the backend resources and assign different identities with proper scoping to them. Install these 4 different security scanners, these 2 log processors, this watchdog and this network monitor. Are you doing mTLS between the VMs on the private network? What if there is an attacker that gains access to your network? What if your proxy is compromised? Do you have visibility into all traffic on the network? Everything must flow through this appliance."

      • onlybosshaskeys 43 minutes ago
        I mean, it's pretty clear the only reason they even got to swap to a single VM and take the glory is because they fired the devops in question. As in, they're the actual boss of a small operation. That's what saying goodbye and nuking the cluster implies here.
    • bfivyvysj 1 hour ago
      I thought we collectively learned this from Stack Overflow's engineering blog years ago.

      Scale vertically until you can't, because you're unlikely to hit a limit, and if you do, you'll have enough money to pay someone else to solve it.

      Docker is amazing development tooling but it makes for horrible production infrastructure.

      • KronisLV 7 minutes ago
        Docker is great development tooling (still some rough edges, of course).

        Docker Compose is good for running things on a single server as well.

        Docker Swarm and HashiCorp Nomad are good for multi-server setups.

        Kubernetes is... enterprise and I guess there's a scale where it makes sense. K3s and similar sort of fill the gap, but I guess it's a matter of what you know and prefer at that point.

        Throw Portainer on a server and the DX is pretty casual (when it works and doesn't have weird networking issues).

        Of course, there's also other options for OCI containers, like Podman.

    • sibellavia 1 hour ago
      Clearly, Kubernetes wasn’t the right solution for your case, and I also agree that using it for smaller architectures is overkill. That said, it’s the standard for large-scale production platforms that need reproducibility and high availability. As of today I don’t see many *truly* viable alternatives; honestly, I haven't seen any.
    • ferngodfather 1 hour ago
      Cloud providers have put a lot of time and effort into making you believe every web app needs 99.9999% availability. Making you pay for auto scaled compute, load balancers, shared storage, HA databases, etc, etc.

      All of this just adds so much extra complexity. If I'm running Amazon.com then sure, but your average app is just fine on a single VM.

      • gloomyday 36 minutes ago
        Marketing has such a gigantic influence in our field. It is absolutely insane. It feels unavoidable, since IT is (was?) constantly filled with new blood that picks up where people left off.
    • serbrech 29 minutes ago
      Yes, I mean, I’m an engineer on a cloud Kubernetes service, and I don’t run Kubernetes for my home services. I just run podman quadlets (systemd units). But that is entirely different from an enterprise-scale setup with monitoring, alerting, and scale in mind…
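      For readers who haven't seen quadlets: a minimal sketch of a .container unit (image, port, and paths are placeholders, not the commenter's actual setup). Dropped into ~/.config/containers/systemd/, podman's systemd generator turns it into a regular service:

```ini
# ~/.config/containers/systemd/web.container — hypothetical example
[Unit]
Description=Home web service

[Container]
Image=docker.io/library/nginx:alpine
PublishPort=8080:80
Volume=%h/webdata:/usr/share/nginx/html:Z

[Service]
Restart=always

[Install]
WantedBy=default.target
```

      After systemctl --user daemon-reload, it starts and stops like any other unit (systemctl --user start web.service), with no orchestrator involved.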
    • PunchyHamster 24 minutes ago
      Well, you used a tank to plow a field then complained about maintenance and fuel usage.

      If you have an actual need to deploy a few dozen services all talking with each other, k8s isn't a bad way to do it. It has its problems, but it allows your devs to mostly self-service their infrastructure needs versus having to file a ticket for each VM and firewall rule they need. That's speaking from the perspective of having migrated from the "old way" to a 14-node k8s cluster on actual hardware.

      It does make debugging harder, as you pretty much need a central logging solution, but at that scale you want a central logging solution anyway, so it isn't a big jump, and developers like it.

      The main problem with k8s is frankly nothing technical, just the "ooh shiny" problem developers have, where they see tech and want to use it regardless of anything.

    • dgb23 1 hour ago
      > Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

      I'm not familiar with kubernetes, but doesn't it already do SDN out of the box?

      • mystifyingpoi 44 minutes ago
        > doesn't it already do SDN out of the box

        Yes and no. Kubernetes defines specification about network behavior (in form of CNI), but it contains no actual implementation. You have to install the network plugin basically as the first setup step.
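        As an illustration of what "no actual implementation" means: the plugin's job is to hand each pod a network interface and an IP. A minimal bridge-plugin config (name and subnet are made up for this sketch) placed in /etc/cni/net.d/ looks like:

```json
{
  "cniVersion": "1.0.0",
  "name": "examplenet",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "subnet": "10.244.0.0/16"
      }
    }
  ]
}
```

        In practice most clusters install a full plugin like Flannel, Calico, or Cilium rather than the bare reference plugins; until one is installed, nodes stay NotReady.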

    • yard2010 1 hour ago
      I don't get it, I think that k8s is the best software written since win95. It redefines computing in the same way IMHO. I have some experience in working with k8s on prod and I loved every moment of it. I'm definitely missing something.
      • RyanHamilton 1 hour ago
        Can you expand how it redefined computing for you personally?
    • robshep 1 hour ago
      If you replaced k8s with a single app on a single VM then you’ve taken a hype fuelled circuitous route to where you should have been anyway.
    • 1dom 1 hour ago
      I think this comment and its replies capture the problem with Kubernetes. Nobody gets fired for choosing Kubernetes now.

      It's obvious to you, me, and the other two presumably techie people who've responded within 15 minutes that you shouldn't have been using Kubernetes. But you probably work in a company full of techie people who ended up using Kubernetes.

      We have HN, an environment full of techie people who immediately recognise not to use k8s in 99% of cases, yet in actual paid professional environments, in 99% of cases, the same techie people will tolerate, support, and converge on the idea that they should use k8s.

      I feel like there's an element of the emperor's new clothes here.

    • marcosscriven 1 hour ago
      First time I’ve heard of Kamal. Looks ideal!

      Do you pair it with some orchestration (to spin up the necessary VM)?

    • wernerb 1 hour ago
      DevOps lost the plot with the Operator model. When it was being widely introduced as THE pattern, I was dismayed. These operators abstract entirely complex services like databases behind YAML and custom Go services. At KubeCon I had one guy tell me he collects operators like candy. Questions about lifecycle management, and the inevitable large architectural changes in an ever-changing operator landscape, were handwaved away with a series of staging and development clusters. This adds so much cost. Fundamentally the issue is the abstractions being too much, and sitting entirely on the DevOps side of the "shared responsibility model". Taking an RDBMS from AWS or Azure is so vastly superior to taking all that responsibility yourself in the cluster. Meanwhile (being a bit of an infrastructure snob) I run NixOS with systemd OCI containers at home. With AI this is the easiest to maintain ever.
      • lifty 1 hour ago
        Those managed databases from the big cloud providers have even more machinery and operator patterns behind them to keep them up and running. The fact that it's hidden away is what you like. So the comparison makes no sense.
  • stingraycharles 4 hours ago
    Potentially useful context: OP is one of the cofounders of Tailscale.

    > Traditional Cloud 1.0 companies sell you a VM with a default of 3000 IOPS, while your laptop has 500k. Getting the defaults right (and the cost of those defaults right) requires careful thinking through the stack.

    I wish them a lot of luck! I admire the vision and am definitely a target customer, I'm just afraid this goes the way things always go: start with great ideals, but as success grows, so must profit.

    Cloud vendor pricing often isn't based on cost. Some services they lose money on, others they profit heavily from. These things are often carefully chosen: the type of costs that only go up when customers are heavily committed—bandwidth, NAT gateway, etc.

    But I'm fairly certain OP knows this.

    • faangguyindia 1 hour ago
      I was just curious, so I actually tested this.

      Using fio:

      Hetzner (CX23, 2 vCPU, 4 GB): ~3900 IOPS (read/write), ~15.3 MB/s, avg latency ~2.1 ms, 99.9th percentile ≈ 5 ms, max ≈ 7 ms

      DigitalOcean (SFO1, 2 GB RAM, 30 GB disk): ~3900 IOPS (same!), ~15.7 MB/s (same!), avg latency ~2.1 ms (same!), 99.9th percentile ≈ 18 ms, max ≈ 85 ms (!!)

      Using sequential dd:

      Hetzner: 1.9 GB/s
      DO: 850 MB/s

      Using the low-end plan on both, but the Hetzner instance is 4 euro and the DO instance is $18.

      • zuhsetaqi 45 minutes ago
        Just for comparison, I use the cheapest netcup root server:

        RS 1000 G12: AMD EPYC™ 9645, 8 GB DDR5 RAM (ECC), 4 dedicated cores, 256 GB NVMe

        Costs 12,79 €

        Results with the following command:

        fio --name=randreadwrite \
            --filename=testfile \
            --size=5G \
            --bs=4k \
            --rw=randrw \
            --rwmixread=70 \
            --iodepth=32 \
            --ioengine=libaio \
            --direct=1 \
            --numjobs=4 \
            --runtime=60 \
            --time_based \
            --group_reporting

        IOPS: read 70.1k, write 30.1k (~100k IOPS total)

        Throughput: read 274 MiB/s, write 117 MiB/s

        Latency: read avg 1.66 ms, P99.9 2.61 ms, max 5.644 ms; write avg 0.39 ms, P99.9 2.97 ms, max 15.307 ms

        • Medowar 24 minutes ago
          That is a bit of an unfair comparison. The Hetzner and DO instances are shared hosting; you are using dedicated resources.

          Using a netcup VPS 1000 G12 is more comparable.

          read: IOPS=18.7k, BW=73.1MiB/s

          write: IOPS=8053, BW=31.5MiB/s

          Latency Read avg: 5.39 ms, P99.9: 85.4 ms, max 482.6 ms

          Write avg: 3.36 ms, P99.9: 86.5 ms, max 488.7 ms

        • yread 27 minutes ago
          Nice, on Hetzner AX41-nvme (~50 eur, from 2020) non-raid I get:

          IOPS: read 325k, write 139k

          Throughput: read 1271MB/s, write 545MB/s

          Latency: read avg 0.3ms, P99.9 2.7ms, max 20ms; write: 0.14ms, P99.9 0.35ms max 3.3ms

          so roughly 100 times the IOPS and throughput of the cloud VMs

      • yard2010 1 hour ago
        I love Hetzner so much. I'm not affiliated, just a really happy customer; these guys do everything right.
    • torginus 1 hour ago
      >3000 IOPS

      If that's true, I wonder if this is a deliberate decision by cloud providers to push users towards microservice architectures with proprietary cloud storage like S3, so you can't do on-machine dbs even for simple servers.

    • sroussey 3 hours ago
      Many cloud vendors have you pay through the nose for IOPS and bandwidth.

      Edit: I posted this before reading, and these two are the same he points out.

      • stingraycharles 2 hours ago
        Yes, but you can’t directly compare SAN-style storage with local NVMe. I agree that it’s too expensive, though not nearly as insane as the bandwidth pricing. If you go to a vendor and ask for a petabyte of storage, and it needs to be fully redundant, and you need the ability to take PIT-consistent multi-volume snapshots, be ready to pay up. And this is what’s being offered here.

        And yes, IO typically happens in 4 KiB blocks, so you need a decent amount of IOPS to get the full bandwidth.
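        The arithmetic is worth spelling out: an IOPS cap times the block size bounds your throughput. A quick sketch with the numbers from the thread (illustrative, decimal megabytes):

```python
# Upper bound on throughput from an IOPS cap at a fixed block size.
def max_bandwidth_mb_s(iops: int, block_bytes: int = 4096) -> float:
    """Best-case MB/s when every request moves exactly one block."""
    return iops * block_bytes / 1e6  # decimal megabytes per second

# 3000 IOPS at 4 KiB: the cloud-VM default mentioned upthread
print(f"{max_bandwidth_mb_s(3_000):.1f} MB/s")    # 12.3 MB/s
# 500k IOPS: the local-NVMe ballpark from the article
print(f"{max_bandwidth_mb_s(500_000):.1f} MB/s")  # 2048.0 MB/s
```

        So a 3000-IOPS volume tops out around 12 MB/s of small-block traffic, which is why only large sequential IO (more bytes per request) ever reaches the advertised bandwidth.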

    • fragmede 1 hour ago
      > Cloud vendor pricing often isn't based on cost.

      Business 101 teaches us that pricing isn't based on cost. Call it top-down vs bottom-up pricing, but the first-principles approach of "it costs me $X to make a widget, so sell the product for $Y = 1.y * $X" is not how pricing works in practice.

      • _el1s7 1 hour ago
        That's not a business 101.
        • lelanthran 34 minutes ago
          > That's not a business 101.

          It kinda is, but obscured by GP's formula.

          More simply; if it costs you $X to produce a product and the market is willing to pay $Y (which has no relation to $X), why would you price it as a function of $X?

          If it costs me $10 to make a widget and the market is happy to pay $100, why would I base my pricing on $10 * 1.$MARGIN?

          • carefree-bob 30 minutes ago
            Exactly. The mechanism by which the price ends up as X plus margin is just competition. Others enter the market and compete with you until the returns are driven down to the rental rate of capital. Any barriers to entry result in higher margins.

            But that is an equilibrium result, and famously does not apply to monopolies, where elasticity of substitution will determine the premium over the rental rate of capital.

      • jeffrallen 1 hour ago
        Just to spell this out more clearly for the back row of the classroom:

        The price is what the customer will pay, regardless of your costs.

        • barrkel 1 hour ago
          Economics teaches us that a big difference between cost and price attracts competition which should make the price trend towards the cost.
          • ncruces 10 minutes ago
            Only if the barrier of entry is low.

            Which it won't be, if at every turn you choose the hyperscaler.

          • _el1s7 1 hour ago
            Exactly.
  • clktmr 2 hours ago
    > Agents, by making it easiest to write code, means there will be a lot more software. Economists would call this an instance of Jevons paradox. Each of us will write more programs, for fun and for work.

    There is already so much software out there that isn't used by anyone. Just take a look at any app store. I don't understand why we are so obsessed with cranking out even more, when the obvious use case for LLMs should be writing better software. Let's hope the focus shifts from code generation to something else. There are many ways LLMs can assist in writing better code.

    • croemer 6 minutes ago
      That's not what Jevons paradox means, though. He's just name-dropping a concept.

      Jevons paradox would be if, despite software becoming cheaper to produce, the total spend on producing software increased, because the increase in production outruns the savings.

      Jevons paradox applies when demand is very elastic, i.e. small changes in price cause large changes in quantity demanded. It's a property of the market.

    • delbronski 1 hour ago
      I think we, as engineers, are a bit stuck on what “software” has traditionally been. We think of systems that we carefully build, maintain, and update. Deterministic systems for interacting with computers. I think these “traditional” systems will still be around. But AI has already changed the way users interact with computers. This new interaction will give rise to another type of software. A more disposable type of software.

      I believe right now we are still in the phase of “how can AI help engineers write better software”, but are slowly shifting to “how can engineers help AI write better software.” This will bring in a new herd of engineers with completely different views on what software is, and how to best go about building computer interactions.

    • skybrian 2 hours ago
      Sometimes “better” means “customized for my specific use case.” I expect that there will be a lot of custom software that never appears in any app store.
      • stingraycharles 2 hours ago
        The amount of single purpose scripts in my ~/playground/ folder has increased dramatically over the past year. Super useful, wouldn’t have had the time for it otherwise, but not in any way shareable. Eg “parse this excel sheet I got from my obscure bank and upload it to my budgeting app’s REST API”. Wouldn’t have had the time or energy to do this before, now I have it and it scratches an itch.
      • AussieWog93 2 hours ago
        This. Just today I added a full on shopping list system to our internal dashboard at work (small business) simply because it was slightly annoying and could be solved in 3 prompts and 15 minutes.
    • esjeon 1 hour ago
      > Let's hope the focus shifts from code generation to something else. There are many ways LLMs can assist in writing better code.

      My view is actually the opposite. Software should now be treated as cattle, not pets. We should use one-offs. We should use micro-scale snippets. Speaking a natural language should be equivalent to programming. (I know, it's a bit of a pipe dream.)

      In that sense, exe.dev (and Tailscale) is a bit of a pet-driven project.

    • cush 1 hour ago
      > I don't understand why we are so obsessed with cranking out even more... the obvious usecase for LLMs should be to write better software

      I honestly think this is ideal. Video games aside, I think one day we'll look back and realize just how insane it was that we built software for millions or even billions of users to use. People can now finally build the software that does exactly what they've wanted their software to do without competing priorities and misaligned revenue models working against them. One could argue this kind of software, by definition, is higher quality.

    • dgb23 2 hours ago
      Both will likely happen to some degree.

      As for the average quality: it’s unclear.

      My intuition is that agents lift up the floor to some degree, but at the same time will lead to more software being produced that’s of mediocre quality, with outliers of higher quality emerging at a higher rate than before.

    • andai 2 hours ago
      Alas, we shifted from quality to quantity somewhere in the mid 19th century.
    • rvz 1 hour ago
      There will be only 1 Microsoft® Excel, 1 Google Sheets and 1 LibreOffice and the rest are billions of dead vibe-coded "Excel killers" that no-one uses.
      • fragmede 1 hour ago
        Except that list originally had one item, and that item was VisiCalc. Times change, but that list is going to stop being relevant before Excel gets knocked off it.

        If you're doing anything complicated, Excel just doesn't make sense anymore. It'll still be the data exchange format (at least, something more advanced than CSV), but it's no longer the only frontend.

        "No one uses" is no longer the insult it once was. I don't need or want to make software for every last person on the world to use. I have a very very small list of users (aka me) that I serve very well with most of the software that I generate these days outside of work.

  • farfatched 3 hours ago
    Nice post. exe.dev is a cool service that I enjoyed.

    I agree there is opportunity in making LLM development flows smooth, paired with the flexibility of root-on-a-Linux-machine.

    > Time and again I have said “this is the one” only to be betrayed by some half-assed, half-implemented, or half-thought-through abstraction. No thank you.

    The irony is that this is my experience of Tailscale.

    Finally, networking made easy. Oh god, why is my battery doing so poorly. Oh god, it's modified my firewall rules in a way that's incompatible with some other tool, and the bug tracker is silent. Now I have to understand their implementation, oh dear.

    No thank you.

    • LoganDark 2 hours ago
      I find it difficult to configure Tailscale for my use case because they seem to completely not support making ACL rules based on the identity of the device rather than a part of the address space. I'm not configuring a router here, I'm configuring a peer-to-peer networking layer... or at least I'm supposed to be...
      • spockz 1 hour ago
        I remember from the docs you can use node names. At the very least you can use tags for sure. Assign tags to nodes and define the ACL based on those.
        • LoganDark 1 hour ago
          Last I read the docs while troubleshooting this very problem, you cannot specify node names as the source or destination of a grant. You can specify direct IP address ranges, node groups (including autogenerated ones) or tags, but not names.

          Tags permanently erase the user identity from a device, and disable things like Taildrop. When I tried to assign a tag for ACLs, I found that I then could not remove it, and had to endure a very laborious process to re-register a Tailscale device that I had added for the express purpose of remote access.

  • faangguyindia 3 hours ago
    I just use Hetzner.

    Everything the cloud companies provide just costs so much; my own Postgres with an HA setup and backups has cost me 1/10th the price of RDS or Cloud SQL, running in production over 10 years with no downtime.

    I autoscale instances directly off the metrics harvested from Grafana; it works fine for us. We have the autoscaler configured via webhooks. Very simple, and it has never failed us.
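    As a sketch of what such a webhook autoscaler boils down to (thresholds, bounds, and names here are hypothetical, not the commenter's actual config): Grafana fires an alert webhook carrying the current metric, and a small handler picks the new instance count:

```python
# Hypothetical scaling decision for a Grafana-webhook-driven autoscaler.
def desired_instances(current: int, cpu_pct: float,
                      low: float = 30.0, high: float = 75.0,
                      min_n: int = 2, max_n: int = 10) -> int:
    """Add one instance above `high`, drop one below `low`, else hold."""
    if cpu_pct > high:
        return min(current + 1, max_n)
    if cpu_pct < low:
        return max(current - 1, min_n)
    return current

print(desired_instances(3, 90.0))  # 4  (scale up)
print(desired_instances(3, 10.0))  # 2  (scale down)
print(desired_instances(3, 50.0))  # 3  (hold)
```

    The point is that the decision logic fits in a dozen lines; what the cloud autoscalers mostly sell you is the plumbing around it.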

    I don't know why I would ever use GCP or AWS anymore.

    All my services are fully HA, and backups work like a charm every day.

    • Manfred 3 hours ago
      Companies buy cloud services because they want to reduce in-house server management and operations, for them it's a trade-off with hiring the right people. But you are right, when you can find the right people doing it yourself can be a lot cheaper.
      • mrweasel 1 hour ago
        In some sense I'm starting to think it has more to do with accounting. Hardware, datacenters, and software licenses (unless it's a subscription, which it probably is these days) are capital expenses; cloud is an operating expense. Management in a lot of companies hates capital expenditures, presumably because it forces long-term thinking, i.e. three to five years for server hardware. Better to go the cloud route and have "room for manoeuvrability". I worked for a company that would hire consultants because "you can fire those at two weeks' notice, with no severance". Sure, but they've been here for five years now, at twice the cost of actual staff. Companies like that also love the cloud.

        Whether or not cloud is viable for a company is very individual. It's very hard to pin point a size or a use case that will always make cloud the "correct" choice.

      • fnoef 3 hours ago
        Right... That's why they hire an "AWS Certified specialist ninja".
      • Tepix 3 hours ago
        I get the feeling that with LLMs in the mix, in-house server management can do a lot more than it used to.
        • tgv 2 hours ago
          Perhaps it saves some time looking through the docs, but do you really trust an LLM to do the actual work?
          • windex 2 hours ago
            Yes, and an LLM checks it as well. I have yet to find a sysadmin task that an LLM couldn't solve neatly.
            • jdkoeck 1 hour ago
              A nice bonus is that sysadmin tasks tend to be light in terms of token usage, that’s very convenient given the increasingly strict usage limits these days.
          • andoando 1 hour ago
            Yes, with a lot of reviewing what its doing/asking questions, 100%
          • fragmede 1 hour ago
            By this point? Absolutely. They still get stuck in rabbit holes and go down the wrong path sometimes, so it's not fully fire and forget, but if you aren't taking advantage of LLMs to perform generic sysadmin drudgery, you're wasting your time that could be better spent elsewhere.
    • alishayk 8 minutes ago
      I find it interesting that Hetzner was never a consideration, until... LLMs started recommending them.
    • huijzer 3 hours ago
      Agree. I used to always use Heroku- or Render-style platforms for my own software, but nowadays I just have a Linux server with Docker Compose and a cron job. Every minute the cron job runs docker compose pull (downloads the latest image) and docker compose up -d (switches to the new version only if there is one). And Caddy in front for HTTPS. This has been very cheap and reliable for years now.
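      For the curious, the whole deploy pipeline described above fits in one crontab line (the paths and project name here are illustrative, not the commenter's actual setup):

```
# m h dom mon dow  command
* * * * *  cd /srv/myapp && docker compose pull -q && docker compose up -d >> /var/log/deploy.log 2>&1
```

      docker compose up -d only recreates containers whose image or config changed, which is what makes the every-minute loop effectively free when nothing new was pushed.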
      • RandomBK 24 minutes ago
        One annoyance (I don't know if they've since fixed it) was that Docker Hub would count pulls that don't contain an update towards the rate limit. That ultimately prompted me to switch to alternate repositories.
      • saltmate 2 hours ago
        What images are you running that you'd need the latest version up after just a minute?
        • burner420042 2 hours ago
          I'm not the OP, but I'd clarify: the cron check for new versions is done every minute, so when new images are pushed they're picked up quickly.

          OP is not saying they push new versions at such a high frequency that they need checks every minute.

          The choice of one minute vs fifteen is an implementation detail and, when architected like this, costs nothing.

          I hope that helps. Again, this is my own take.

    • pants2 2 hours ago
      Especially these days you can SSH to a baremetal server and just tell Claude to set up Postgres. Job done. You don't need autoscaling because you can afford a server that's 5X faster from the start.
      • i5heu 2 hours ago
        Just use Docker.

        It's like four lines of config for Postgres; the only line you need to change is the path where Postgres should store its data.
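        A minimal sketch of what that looks like as a compose file (the image tag, password, and host path are placeholders):

```yaml
# docker-compose.yml — the host path on the volumes line is the only
# line that must change per machine; the password should change too.
services:
  db:
    image: postgres:17
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: change-me
    volumes:
      - /srv/pgdata:/var/lib/postgresql/data
    ports:
      - "127.0.0.1:5432:5432"
```

        Binding the port to 127.0.0.1 keeps the database off the public interface, which matters on a single exposed VM.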

        • spockz 1 hour ago
          You also probably want the Postgres storage on a different (set) of disks.

          Maybe change the filesystem?

    • kippinsula 1 hour ago
      We've done both. Hetzner dedicated was genuinely fine, until a disk started throwing SMART warnings on a Sunday morning and we remembered why we pay 10x elsewhere for some things. It's probably less about the raw cost and more about which weekends you want back.
      • omnimus 26 minutes ago
        Isn't this the nature of every dedicated server? You also take on the hardware-management burden; that's why they can be insanely cheap.

        But there is a middle ground in the form of a VPS, where the hardware is managed by the provider. It's still way, way cheaper than some cloud magic service.

        • RandomBK 21 minutes ago
          A VPS comes with the potential for oversubscription, even from the more reputable vendors. You never really know if you're actually getting what you're paying for.
    • TiccyRobby 2 hours ago
      Honestly I like Hetzner a lot, but lately it has been very unstable for us. https://status.hetzner.com/ always has a couple of incidents happening at the same time. I really appreciate the services they provide, but I wish they were more stable.
      • lifty 57 minutes ago
        There are several things going on even now, 1 hour after your comment. But I appreciate that they list them. That hopefully means that they have a good culture of honesty, and they can improve.
        • omnimus 30 minutes ago
          I looked through the issues, and basically the only ongoing thing is that backup power is not working in one of the data centers (which could be a problem). The rest are warnings about planned shutdowns of some services and a speed limitation on object storage in one location.

          I'm sure it's partly luck, but we have a few Hetzner VPSes in both German locations, and in the last 5 years they've never been down as far as I know. On our HTTP monitoring they show hundreds of days of uptime only because we restarted them ourselves.

    • kubb 3 hours ago
      [dead]
    • MagicMoonlight 2 hours ago
      Because if I have a government service with millions of users, I don't want cheap shitter servers to crap out on me.

      An employee is going to cost anywhere between $8k and $50k per month. Hiring an employee to save $200/month on servers by using a shitty VPS provider is not saving you any money.

      • kennywinker 1 hour ago
        If you have millions of users, you absolutely need to have someone whose whole job is managing infrastructure. Expecting servers or cloud services to not crap out on you without someone with the skills and time to keep things running seems foolish.
  • sahil-shubham 54 minutes ago
    The point about VMs being the wrong shape because they’re tied to CPU/memory resonates hard. The abstraction forces you to pay for time, not work.

    I ended up buying a cheap auctioned Hetzner server and using my self-hostable Firecracker orchestrator on top of it (https://github.com/sahil-shubham/bhatti, https://bhatti.sh) specifically because I wanted the thing he’s describing — buy some hardware, carve it into as many VMs as I want, and not think about provisioning or their lifecycle. Idle VMs snapshot to disk and free all RAM automatically. The hardware is mine, the VMs are disposable, and idle costs nothing.

    The thing that surprised me most, although it's obvious in hindsight, is that once you have memory-state snapshots, everything becomes resumable. I make a browser sandbox, get Chromium to a logged-in state, snapshot it, and resume copies of that session on demand. My agents work inside sandboxes, I run docker compose in them for preview environments, and when nothing's active the server is basically idle. One $100/month box does all of it.

  • celrenheit 1 hour ago
    Shameless plug: https://clawk.work/

    `ssh you/repo/[email protected]` → jump directly into Claude Code (or Codex) with your repo cloned and credentials injected. Firecracker VMs, 19€/mo.

    POC, please be kind.

    • chimpanzee2 1 hour ago
      Honestly, this sounds interesting.

      At 19€/mo, are you subsidizing it, given the sharp rise in LLM costs lately?

      Or are you heavily restricting model access? Surely there is no Opus?

      • celrenheit 52 minutes ago
        The 19€/mo is infra only. Claude Code inside the VM signs in via OAuth to the user's own Anthropic account. I'd love to explore bundling open models (Qwen, etc.) into the subscription down the line, but that needs product validation first; I'm not going to ship something I'm not sure people actually want.
  • socketcluster 1 hour ago
    Virtual machines are the wrong abstraction. Anyone who has worked with startups knows that average developers cannot produce secure code. If average developers are incapable of producing secure code, why would average non-technical vibe-coders be able to? They don't know what questions to ask.

    There's no way vibe coders can produce secure backend software, with or without AI. The average software that AI is trained on is insecure. If the LLM sees a massive pile of fugly vibe-coded spaghetti and you tell it "Make it secure please", it turns into a game of Whac-a-Mole: patch a vulnerability and two new ones appear.

    IMO, the right solution is to not allow vibe-coders to access the backend at all. It is beyond their capabilities to keep it secure, reliable and scalable, so don't make it their responsibility. I refuse to operate a platform where a non-technical user is "empowered" to build their own backend from scratch.

    It's too easy to blame the user for building insecure software. But IMO, as a platform provider, if you know that your target users don't have the capability to produce secure software, it's your fault; you're selling them footguns.
  • qxmat 1 hour ago
    Europe is crying out for sovereign clouds. If this is to be a viable alt cloud, US jurisdiction is a no.

    I'm not sure we can move away from CPU/memory/IO budgeting towards total metal saturation, because code isn't what it used to be: no one handles malloc failure any more, we just crash OOM.

    • Quothling 17 minutes ago
      Europe is already moving into the EU cloud: Hetzner, OVHcloud and so on, as well as local data centers where partner companies set up their own clouds with various offerings to rival Office 365. So far it's mainly the public sector. My own city cut its IT budget by 70% by switching away from Microsoft.

      The key point is the partner companies. Almost nobody is actually running their own cloud the way they would with the various 365 products, AWS or Azure. They buy the cloud from partners, similar to how they used to (and still do) buy solutions from Microsoft partners. So if you want to "sell cloud", you're probably going to struggle unless you get some of these on board. Which again would probably be hard, because I imagine a lot of what they sell is a sort of package that basically runs on VMs set up as part of the package they already have.

    • effisfor 42 minutes ago
      For anybody interested, the meat of 'EU sovereign' is EU companies, not US or UK companies with EU servers (because of the CLOUD Act and the UK-US bilateral arrangement connected to it).

      International visitors might tell us more about the benefits of companies with no EU, US, or UK nexus.

  • boesboes 1 hour ago
    I have mixed feelings about this concept. I agree that the way clouds work now is far from great and that stronger abstractions are possible, but this article offers nothing of the sort; it just handwaves "we solve some problem and that saves you tokens"???

    Checking the current offering, it's just prepaid cloud capacity with rather low flexibility. It's cheap though, so that's nice, I guess. But does this solve anything new? Anything fly.io or the like doesn't solve?

    What is the new idea here? Or is it just the vibes?

  • zackify 4 hours ago
    That's insane funding so congrats.

    Just shows I'm the Dropbox commentator. I have what exe provides on my own, and I'm shocked by the value these abstractions provide everyone else!! One-off containers on my own hardware that spin up, spin down, run async agents, etc., Tailscale auth, and the team can share or connect easily by name.

    • sixhobbits 2 hours ago
      Investment is driven by relationships, belief in a future vision and team, and growth metrics like the number of paying customers.

      The technology itself, in its current form, is not what's valuable.

      • isoprophlex 2 hours ago
        A sobering comment for all the little people like myself who dream of owning a business built on a vision of cool tech that just does what it promises (as opposed to all the corporate shovelware out there).
        • dgb23 2 hours ago
          You can still do that. Not every business needs to be a hyperscaling startup.
  • st-keller 3 hours ago
    Hahaha! Have fun! I'm doing the same, together with Claude Code, since August. With HTTPS (mTLS 1.3) everywhere, because I can. Just my money, just my servers, just for me. Just for fun. And what fun it is!
    • anonzzzies 3 hours ago
      Me too. I already moved our products to it and it's getting fairly robust. I guess many smaller companies got tired of the big guys asking a lot of money for things that should be cheap.
    • setnone 3 hours ago
      Yeah i feel like it's getting cloudy
  • tee-es-gee 46 minutes ago
    I will follow this one for sure. There are a few more companies with the extremely ambitious goal of "a better AWS", and I am interested in the various strategies they take to approach that goal incrementally.

    A service offering VMs for $20 is a long way from AWS, but I see how it makes sense as a first step. AWS also started with EC2, but in a completely different environment with no competition.

  • sroussey 3 hours ago
    > The standard price for a GB of egress from a cloud provider is 10x what you pay racking a server in a normal data center.

    Oh, that’s too kind. More like 100x to 1000x. Raw bandwidth is cheap.

  • bedstefar 1 hour ago
    This looks like an excellent platform for running a "homelab" in the cloud (no, the irony is not lost on me) for lighter stuff like Readeck, Calibre-web, Immich. Maybe even Home Assistant too if we can find a way (Tailscale?) to get the mDNS/multicast traffic tunnelled.
    • omnimus 18 minutes ago
      At $8 per 100 GB, Immich would be wildly uneconomical. Better to wait for the upcoming Immich hosting to support the project, or use ente.io; those are 1 TB for $10.
  • PunchyHamster 12 minutes ago
    The author seems to have no clue what is a cloud problem and what is a k8s problem, and blames everything on k8s. The whole post reeks of ignorance. I have no love for k8s, but he is just flat-out putting out false information.

    > Finally, clouds have painful APIs. This is where projects like K8S come in, papering over the pain so engineers suffer a bit less from using the cloud.

    K8s's main function isn't to paper over existing cloud APIs; that is just a necessity when you deploy it in the cloud. On normal hardware it's just an orchestration layer, and often just a way to pass config from one app to another in a structured format.

    > But VMs are hard with Kubernetes because the cloud makes you do it all yourself with lumpy nested virtualization.

    Man discovered system designed for containers is good with containers, not VMs. More news at 10

    > Disk is hard because back when they were designing K8S Google didn’t really even do usable remote block devices, and even if you can find a common pattern among clouds today to paper over, it will be slow.

    Ignorance. k8s has abstractions over a bunch of storage types; for example, using Ceph as a backend will just use KVM's Ceph backend, no extra overhead. It also supports "old-school" protocols used for VM storage like NFS or iSCSI. It might be slow in some cases in the cloud if the cloud doesn't provide enough control, but that's not k8s's fault.

    > Networking is hard because if it were easy you would private link in a few systems from a neighboring open DC and drop a zero from your cloud spend.

    He mistakes cloud problems for k8s problems (again). All k8s needs is visibility between nodes. There are multiple providers to achieve that, some with zero tunnelling, just routing. It's still complex, but no more than "run a routing daemon".

    I expect his project to slowly reinvent cloud APIs, copying what k8s and other projects did, once he starts hitting the problems those solutions solved. And to do it worse, because instead of researching the whys and why-nots, he seems to want to throw everything out while learning no lessons.

    Do not give him money.

  • k9294 2 hours ago
    That's really cool!

    One thing I'm confused about is how to create shared resources, e.g. a Redis server, and connect to them from other VMs. It looks quite cumbersome right now to set up Tailscale or connect via SSH between VMs. Also, what about egress? My guess is that all traffic is billed at $0.07 per GB. It looks like this cloud is made to run stateful agents and personal isolated projects, and distributed systems or horizontal scaling aren't a good fit for it?

    Also, I'm curious: why not Railway-like billing per resource utilization? It's very convenient and, I would argue, made for the agent era.

    I set up for my friends and family a Railway project that spawns a VM with a disk (stateful service) via a Telegram bot and runs an OpenClaw-like agent; it costs me something like $2 to run 9 VMs like this.

  • sudo_cowsay 1 hour ago
    I'm still new to cloud computing; I've only ever used Linode. What is this supposed to be? I couldn't quite figure out the design from the article. Please help!
  • pjc50 2 hours ago
    The "one price" is oddly small for a cloud company. I'm sure it's nice and fast but the $20/mo seems smaller than some companies' free tiers, especially for disk.

    The main reason clouds offer network block devices is abstraction.

    • imafish 2 hours ago
      Don’t worry - that will certainly change in the future if they have any kind of success :)
  • ianpurton 3 hours ago
    I don't get it, what is this, how is it different?
    • szszrk 1 hour ago
      You choose a region. Then you pay for some compute size (vCPU and memory), and you can create a lot of VMs within those limits. If some VMs don't consume all their resources, others can consume them in bursts.

      VMs have a built-in gateway to cloud providers with a fixed URL and no auth. You can top up via the service itself; no need for your own keys.

      So likely a good tool for managing AI agents. And "cloud" is a bit of a stretch; the service is very narrow.

      The complete lack of any detailed description of the regions beyond a city name makes it really only suitable for ephemeral/temporary deployments. We don't know what the datacenters are or what redundancy is in place; no backups or anything like that.

    • saltmate 2 hours ago
      As I understand it, a cloud provider where instead of paying for each VM (with a set of resources), you pay for the resources and can get as many VMs as you can fit within them.
  • qaq 2 hours ago
    With LLMs there is no real dev-velocity penalty for using high-performance languages like, say, Rust. A pair of 192-core AMD EPYC boxes will have enough headroom for 99.9% of projects.
    • kennywinker 1 hour ago
      That'll be true for the 0.1% of projects that were limited by the speed of their programming language. For the other 99.9% of projects, their vibe-coded Rust can fly and their database, network, or raw computation will still be the bottleneck.

      (Percentages cited above are tongue-in-cheek, actual numbers are probably different)

  • esher 2 hours ago
    Much respect for the ambitious plan; I wish I could do such bold thinking. I've been running a small PHP PaaS (fortrabbit) for more than 10 years. For me, it's not only "scratch your own itch" but also "know your audience": a limited feature set with a high level of abstraction can also be useful for some users, giving them a clear path.
  • _nhh 41 minutes ago
    Just take a look at Hetzner Cloud. It's everything 99% of people need, with good pricing. Convert that UX to the terminal and you're done.
  • 47872324 3 hours ago
    exe.dev. 111 IN A 52.35.87.134

    52.35.87.134 <- Amazon Technologies Inc. (AT-88-Z)

    • skybrian 2 hours ago
      Their first location (PDX) is on Amazon I believe and not accepting new customers. They’ve said it’s much more expensive for them than the others. Their other locations are listed here:

      https://exe.dev/docs/regions

    • MagicMoonlight 2 hours ago
      Well yes, because they needed high availability and flexibility and tons of features…

      Hey wait a minute!

    • awhitty 2 hours ago
      "I am white labeling a cloud"
      • transitorykris 2 hours ago
        FTA “Hence the Series A: we have some computers to buy.”
  • nopurpose 36 minutes ago
    From the linked blog post:

    > The standard price for a GB of egress from a cloud provider is 10x what you pay racking a server in a normal data center.

    From the exe.dev pricing page:

    > additional data transfer $0.07/GB/month

    So at least on the network-price promise they don't seem to deliver; it still costs an arm and a leg, like your neighbourhood hyperscaler.

    Overall the service looks interesting. I like simplicity with convenience, something packet.net deliberately decided not to offer at the time.

  • import 3 hours ago
    The article doesn't really say which fundamental problems will be solved, apart from fancy VM allocation. Nothing about hardware, networking, reliability, tooling and such. Well, nice, good luck.
  • speedgoose 2 hours ago
    I welcome the initiative, but it's pretty costly compared to the bare-metal cloud providers, so the value has to be in the platform-as-a-service too.
  • Growtika 2 hours ago
    Congrats. Just checked your homepage. I love the fact that you also show this comment:

    "That must be worst website ever made"

    It made me love the site and style even more.

  • achille 2 hours ago
    What will happen to my "Grandfathered Plan"? I signed up to test it; I don't recall if I gave you my credit card.
  • z3t4 3 hours ago
    You can run several VMs or containers with isolation on your phone's hardware; why even use the cloud when you just want to show your friends?
    • skybrian 1 hour ago
      For me it’s so my coding agent keeps running when I close my laptop lid and it goes to sleep. VM in the cloud because I’m too lazy to set up a computer to be running as a server all the time.
  • kjok 3 hours ago
    How difficult is it to build a second startup on the side?
  • tamimio 1 hour ago
    > $20/month for your VMs

    >One price, no surprises. You get 2 CPUs, 8 GB of RAM, and 25 GB of disk—shared across up to 25 VMs.

    This might sound like a good thing compared to the current state of clouds, but what's even better is having your own. The other day I got a used OptiPlex for $20; it had a 2 TB HDD, a 256 GB SSD, 16 GB of RAM, and a Core i7. That's a one-time payment, not monthly. You can set up Proxmox, have dozens of LXCs and VMs, and even nest more LXCs inside them: your hardware, physically with you, backed up by you, monitored by you, and accessed only by you. If you have stable internet and electricity, there's really no excuse not to invest in your own hardware. A small business can invest in this as well, not just individuals. Go to rackrat.net and grab a used server if you are a business, or a good workstation for personal use.

  • poly2it 4 hours ago
    Why is an imperative SSH interface a better way of setting up cloud resources than something like OpenTofu? In my experience, humans and agents work better in declarative environments. If an OpenTofu integration is offered in the future, will exe.dev offer any value over existing cost-effective VPS providers like Hetzner? Technically, Hetzner, for example, also allows you to set up shared disk volumes:

    https://github.com/hetzneronline/community-content/blob/mast...

    It also has a CLI, hcloud. Am I getting any value with exe.dev that I couldn't get with an 80-line hcloud wrapper?

    • ZihangZ 2 hours ago
      I don't think SSH vs OpenTofu is the core issue here.

      For agents, declarative plans are still valuable because they are reviewable. The interesting question is whether exe.dev changes the primitive: resource pools for many isolated VM-like processes, or just nicer VPS provisioning.

      • poly2it 2 hours ago
        It doesn't do either at competitive rates by the looks of it.
  • jeffrallen 1 hour ago
    So much good stuff is happening at https://exe.dev, keep it up guys!
  • Razengan 59 minutes ago
    Isn't it high time to figure out a distributed physical layer / swarm internet, or whatever the buzzword is? It would be perfect for distributed AI too.
  • troupo 2 hours ago
    Did... did you just scare Microsoft? They just announced a similar thing: https://x.com/satyanadella/status/2047033636923568440
  • rambambram 1 hour ago
    Now that we're talking about clouds... what happened to the word 'webhosting'?
  • piokoch 21 minutes ago
    How is this different from getting a dedicated server from any other provider? Typically you pay a bit more, $40-$50, but you get more RAM and cores.

    And what does it have to do with the "cloud"? Cloud means one uses cloud-provided services (security, queues, managed databases, etc.), and that's their selling point. This exe.dev is a bare server where I can install what I want, which is fine, but it's not a cloud and, frankly speaking, nothing new.

  • vasco 3 hours ago
    I know it's a personal blog, but the writing style is really full of himself. What a martyr, starting a second company.
    • Animats 2 hours ago
      It's hard to see the scale of what he's doing. Could be:

      - I'm building a server farm in my homelab.

      - I'm doing a small startup to see if this idea works.

      - We're taking on AWS by being more cost effective. Funding secured.

  • pelasaco 1 hour ago
    Such statement is so off:

    "In some tech circles, that is an unusual statement. (“In this house, we curse computers!”) I get it, computers can be really frustrating. But I like computers. I always have. It is really fun getting computers to do things. Painful, sure, but the results are worth it. Small microcontrollers are fun, desktops are fun, phones are fun, and servers are fun, whether racked in your basement or in a data center across the world. I like them all."

    The reality: Everyone reading his blog or this HN entry loves computers.

  • jrflowers 1 hour ago
    > The standard price for a GB of egress from a cloud provider is 10x what you pay racking a server in a normal data center.

    > $160/month

      50 VM
      25 GB disk+
      100 GB data transfer+
    
    100GB/mo is <1mbps sustained lmao
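    The arithmetic checks out; a quick sketch:

```python
# Sustained bitrate equivalent of a 100 GB/month transfer allowance.
GB = 10**9                      # decimal gigabyte, as bandwidth is usually billed
SECONDS = 30 * 24 * 3600        # a 30-day month: 2,592,000 seconds

mbps = (100 * GB * 8) / SECONDS / 1e6
print(f"{mbps:.2f} Mbps sustained")  # about 0.31 Mbps
```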
  • handsometong 1 hour ago
    [dead]
  • ZihangZ 2 hours ago
    [dead]
  • asiffareed 1 hour ago
    [dead]
  • hani1808 3 hours ago
    [dead]
  • WhereIsTheTruth 2 hours ago
    > 100 GB data transfer+

    > $20 a month

    2025 or 2005, what's the difference?