Why is the title of the post-mortem "GitHub Outage"? It makes it sound like Lovable somehow brought down GitHub, when in reality it seems like they were rate-limited by GitHub for creating lots of repositories, then got their GitHub App completely blocked for breaching the Terms of Service.
> Incident report for the GitHub outage on January 2-3, 2025
Writing it like that looks like you're shifting the blame for your downtime/outage onto GitHub, as if they're responsible for keeping your application up, instead of taking full responsibility for it.
> Writing it like that looks like you're shifting the blame for your downtime/outage onto GitHub, as if they're responsible for keeping your application up, instead of taking full responsibility for it.
Well, they did what they were supposed to do: they explicitly asked GitHub about what they were up to, GitHub gave an explicit "we're OK with this, go ahead", and then, whoops, once GitHub saw it causing errors they didn't even bother to check whether there were open support tickets with the customer; they just went and disabled their access.
But that's true for any 3rd party you'd depend on. Everything will work until it doesn't. Doesn't mean you're less responsible for your project being down.
A title like "GitHub caused our outage" would still make it clear the downtime wasn't the direct action of anyone on the team, yet still take responsibility for the fact that it happened. Instead, labeling it "Incident report for the GitHub outage" just seems like straight-up blaming someone else.
Well, GitHub explicitly took responsibility. The first thing GitHub did once Lovable reached out for support was "reinstate our app and apologize for the issues it caused us and our users."
And no, you are not responsible for every 3rd party service you use. Some services are unavoidable, some services are just nice-to-have, but if you can't trust a service to perform its advertised function, it is the service's fault.
> Well, GitHub explicitly took responsibility. The first thing GitHub did once Lovable reached out for support was "reinstate our app and apologize for the issues it caused us and our users."
No, GitHub won't ever take responsibility for what sits between you and your project's/app's/company's users. Nor did they do so in this case.
If I use service X for doing Y, and I write an email asking if it's OK that I upload 1000s of files every day, even if they say OK today, but then next week turn around and say "Nah", I'm still responsible for my users, who trust me and my team. Service X has no responsibility towards those users at all. It sucks though, I agree with that.
> And no, you are not responsible for every 3rd party service you use. Some services are unavoidable, some services are just nice-to-have, but if you can't trust a service to perform its advertised function, it is the service's fault.
Besides DNS and BGP, I suppose, what services are "unavoidable" exactly? Git hosting isn't some arcane distributed network technology that takes years of reading and experience to understand; the git CLI even ships with a web UI (gitweb, via `git instaweb`) that you can basically copy-paste to have your own Git hosting.
I'd say you are responsible for everything that you use and depend on. And if you think "Ah, I'll just chuck 10K repositories a day at GitHub, they say it's fine" and then don't have any plan in case GitHub suddenly says it isn't OK, you are responsible for the fallout when shit hits the fan.
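For what it's worth, bare-bones Git hosting really is a small lift. Here's a minimal Python sketch, purely illustrative (the paths and names are made up, and it assumes the stock `git` CLI is installed), of creating bare repos and serving them read-only with `git daemon`:

    # Minimal self-hosted Git sketch: bare repos on disk, served read-only
    # over git:// by the stock `git daemon`. Paths/names are hypothetical.
    import subprocess
    from pathlib import Path

    REPO_ROOT = Path("/srv/git")  # hypothetical storage location

    def create_repo(name: str) -> Path:
        """Create a bare repository that can be pushed to over SSH or file paths."""
        REPO_ROOT.mkdir(parents=True, exist_ok=True)
        repo = REPO_ROOT / f"{name}.git"
        subprocess.run(["git", "init", "--bare", str(repo)], check=True)
        return repo

    def serve_read_only() -> subprocess.Popen:
        """Expose every repo under REPO_ROOT at git://<host>/<name>.git."""
        return subprocess.Popen([
            "git", "daemon", "--reuseaddr", "--export-all",
            f"--base-path={REPO_ROOT}",
        ])

    if __name__ == "__main__":
        create_repo("example-project")
        serve_read_only().wait()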
Well, the app store, for example. Sure, it's of course a good idea to comply with the app store policies to the extent possible, but ultimately there's not much you can do to prevent Google or Apple from saying "we don't like this app" and pulling it, as happened with the UTM emulator, for example. So how then can Google or Apple making such a decision be "your responsibility"?
As another example, let's say you build a house in a hurricane-prone area. It's your responsibility to ensure the owner buys hurricane insurance, as mandated by law. It's not your responsibility to build a nuclear-bunker-grade house that is impervious to hurricanes. It is easy to point the finger and say "you should have thought of that", but in practice it is easier to deal with such catastrophes as they happen.
Besides the fact that you also need packet traversal via a multitude of layer-3 protocols, a lot of which would pass through some corporate cloud.
And net neutrality is dead in the USA.
As for the repositories, there are no good SLA terms for such storage. Nobody offers this service because it's expensive to offer. Not on this scale. So if you need it, you have to invest a lot in hardware or colocation or specialized clouds.
Plus admin. A whole datacenter, and suddenly you're an ant vs three goliaths.
If I read it correctly, it was a support person who provided them with assurance, not an executive, vice president, manager, or VP of sales. GitHub did not give them permission or approval; it was a single person in the support department.
Who relies on support people to determine the basis of their business when it’s obvious that they were concerned with the high usage rate and that it might cause problems for their customers?
Oh yeah, support staff aren't always aware of everything. For example, with one of our latest features, we didn’t have time to add a UI option to disable the feature. The expectation was that support staff could disable it via their special admin panel upon user request. However, I accidentally discovered that when users asked to hide the feature, tech support told them it wasn’t possible! It turned out the tech support lead forgot to share that information with the team.
As for the OP, they should have conducted load testing and implemented rate limits on their end, rather than blindly relying on someone’s word that GitHub was ready to handle all their product's load for free.
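As a rough illustration of the rate-limit half of that (the numbers and names here are made up, not anything from Lovable's stack), a simple client-side token bucket would cap outbound repo-creation calls no matter what the provider claims to tolerate:

    # Illustrative client-side token bucket; rates and call sites are hypothetical.
    import threading
    import time

    class TokenBucket:
        def __init__(self, rate_per_sec: float, burst: int):
            self.rate = rate_per_sec
            self.capacity = burst
            self.tokens = float(burst)
            self.updated = time.monotonic()
            self.lock = threading.Lock()

        def acquire(self) -> None:
            """Block until a token is available, then consume it."""
            while True:
                with self.lock:
                    now = time.monotonic()
                    self.tokens = min(self.capacity,
                                      self.tokens + (now - self.updated) * self.rate)
                    self.updated = now
                    if self.tokens >= 1:
                        self.tokens -= 1
                        return
                    wait = (1 - self.tokens) / self.rate
                time.sleep(wait)

    # ~0.12 requests/sec is roughly 10,000 repos/day, with a small burst allowance.
    repo_creation_limiter = TokenBucket(rate_per_sec=0.12, burst=5)

    def create_repo_throttled(name: str) -> None:
        repo_creation_limiter.acquire()
        ...  # call the hosting provider's "create repository" API here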
sounds like a pretty sane thing to do: github protects the majority of their customers from instability caused by a few.
unless you're a super important partner, the people on call might never have heard of your little app and just decide that it's the only safe thing to do to protect the reliability of the system.
The same user who posted this (Henrik501) also posted a comment two days ago (their only HN comment so far) praising the Lovable team for their incident response: https://news.ycombinator.com/item?id=42646297
And now this post with an exaggerated title. Seems like they're shilling and trying to make Lovable sound like a product with such huge traction that it "even brought Github down". They keep making outlandish claims on social media too, like reaching $4m ARR in 5 weeks etc. This company is very suspicious.
Looking more into it, Henrik Westerlund works on "growth" at lovable (via LinkedIn) and posted Lovable to Product Hunt https://www.producthunt.com/@henrik_westerlund. So it appears they are shilling.
>praising the Lovable team for their incident response
For 8 hours, no one is aware the service is down. It then takes ~3 more days to fix it. One of their first decisions is to 'make as much noise as possible on social media' (?), and every step seems to create additional problems (corrupt repos, etc.). Nothing appears to be well thought out; the blog post reads like they weren't ready for this at all, with panicked and chaotic decisions made without understanding the tech stack on a deeper level (race conditions, rate limits, etc.). Not a lot of confidence in the team behind a project that looks like nothing more than glue between an LLM and a storage backend.
I like that HN is so minimal, but obvious stuff like this makes me want to write a browser extension that lets me custom tag accounts for my own notes.
Many mistakes were made by Lovable that they could be berated for but on a more positive note, there is a lesson for us all: if you're doing something that you're worried about being problematic (e.g: creating a large volume of GitHub repositories) reaching out is a good thing but it is important to understand who you are reaching out to. GitHub is a huge organization, front-line support is not likely to have intimate knowledge of how exactly the acceptable usage policy is enforced nor the permission to make agreements. The key when reaching out is to find someone who has authority on the subject. Ideally, GitHub's front-line support would have escalated to the appropriate person/team but that isn't always possible (maybe they don't know who, maybe they're having a bad day and forgot). If the answer you get seems too convenient, it is probably not correct.
If I reach out and they can’t direct me to the right solution, that doesn’t seem like something I need to continue to solve for them. Seems too onerous.
I don't really understand how a well-funded startup like this decided to just shove something that is relatively trivial, yet critical to their product, into GitHub.
> Their product is the creation of Git repos. Putting it on the platform their customers want to use makes a lot of sense.
Maybe I read the landing page very wrong, but it seems to be an "app-building toolkit" of some sort? Not just "creation of git repos".
They could have made the GitHub repository creation happen when the user takes some meaningful action, instead of at the "create app" stage, which probably every single user hits at least once, even people with no intention of actually building apps.
Or better yet, offer their own viewer for Git repositories they themselves host. It's not overly difficult, and the `git` CLI tools even ship with a web UI you can take inspiration from.
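To sketch the first idea (deferring repo creation until a user actually changes something) in hypothetical Python with made-up names, since I obviously have no visibility into Lovable's real code:

    # Lazy remote-repo creation (illustrative only): nothing is created at
    # "create app" time; the provider is first touched on the first real change.
    from dataclasses import dataclass, field

    @dataclass
    class Project:
        name: str
        remote_repo: str | None = None          # created lazily
        pending_changes: list[str] = field(default_factory=list)

    def record_change(project: Project, change: str, repo_client) -> None:
        """repo_client stands in for whatever hosting API is actually used."""
        project.pending_changes.append(change)
        if project.remote_repo is None:
            # First change that needs persisting: only now create the remote repo.
            project.remote_repo = repo_client.create_repo(project.name)
        repo_client.push(project.remote_repo, project.pending_changes)
        project.pending_changes.clear()

That alone would cut repo creation from "every visitor who clicks create" down to users who actually generate something worth saving.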
If you read the post-mortem, you will see that this has nothing to do with the platform their customers want to use, and everything to do with creating repos internally for their own organization.
Their product has change tracking which is clearly powered by git repos on GitHub.
So again, I ask, for something that is a SPOF, why rely on a third party?
"How can somebody who does X just do Y?", implying that Y is somehow bad or wrong in the context of X, or contrasts with the experience level implied by X.
Maybe it's true of the subject. Maybe they should have known better. But I think it's beneficial to take a step back and view it from the perspective of readers: Many people - even people familiar with X and Y - may have no idea why Y is bad in the given context, or, more relevantly, why this should be obvious, as you imply by using the word "just". I think there's a lot of benefit to be gained by trying to observe yourself, and notice when you are writing in this style, and add more context.
Honestly, doesn't surprise me much. Homebrew, the most popular package manager/repository for macOS, basically lives on GitHub and bases everything on top of it. Over the years, I think there have been times when they've actually brought down GitHub (or come close to it, at least).
Most folks seem fine with it, at least it still lives on like normal as far as I know. I think engineering principles flew out the window a long time ago, all people care about now is shipping as fast as they possibly can.
Your observation about shipping fast to build an MVP house of cards is true. However, Homebrew is a free package manager that likely relies on GitHub because it offers cheap or free storage for a free tool.
My assumption is that Lovable is not free, so using a cheap or free service while taking money from its customers fits squarely into the category of poor engineering and possibly incompetence.
315,000 repositories, plus 10,000 more per day? They were obviously (and correctly) concerned this couldn't go on forever, hence the pre-emptive email, and of course they got the response saying it was okay. But I really feel like this is the kind of thing that's too dangerous to leave your company sitting on, because sooner or later they were going to be told "no". It feels too much like they'd found a point of arbitrage in GitHub's ToS, and indeed it ended up causing problems.
I suppose they did respond pretty fast, but if I were them I'd have liked to have the S3 option in my back pocket earlier. Maybe I'm just being too risk-averse here...
If nothing else, I was a little surprised they didn't (or at least didn't mention) having a fail over plan in place already. Seems like "Prepare for the worst, hope for the best" would have been the logical game plan.
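Even a thin abstraction over "where do project repos live" would make that kind of failover cheap to keep in the back pocket. Here's a rough sketch of the shape of it, with placeholder backends rather than anything taken from the post-mortem:

    # Sketch of a storage failover layer; the backend classes are placeholders.
    from typing import Protocol

    class RepoBackend(Protocol):
        def save_snapshot(self, project_id: str, data: bytes) -> None: ...

    class FailoverStore:
        """Write to the primary backend; fall back to the secondary on errors."""
        def __init__(self, primary: RepoBackend, secondary: RepoBackend):
            self.primary = primary
            self.secondary = secondary

        def save_snapshot(self, project_id: str, data: bytes) -> None:
            try:
                self.primary.save_snapshot(project_id, data)
            except Exception:
                # Primary (e.g. the Git host) is down or rate-limiting:
                # keep serving users from the secondary, reconcile later.
                self.secondary.save_snapshot(project_id, data)

    # Hypothetical wiring: store = FailoverStore(GitHubBackend(...), S3Backend(...))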
git is not github, and it's kind of funny that the best these guys could come up with for storing git repos of text files was using github. any competent dev could come up with a solution in an afternoon. using github at all tells you quite a bit.
The title ("Year old startup overloaded GitHub – Incident report") is a misrepresentation. A coding-LLM-as-a-service startup got banned by GitHub for abusing it by creating thousands of repositories.
> if you can't trust a service to perform its advertised function, it is the service's fault.
Your customers don't care if it's the service's fault and will blame you.
> Seems too onerous.
I would only say that about a non-critical personal project.
Getting a flag like this hardly “overloaded” anything. Alerts for these things usually trigger well below any actual risk to the system.
They probably should have had a backup location from day 2 though, I agree. If nothing else, in case of a GitHub outage.
>We're currently investigating issues. Please stay tuned until this error banner has been updated.
When I try to create a new project, it says "An unknown error occurred with the code sandbox"
Something about S3-backed repos didn't work out?
Update: "Under maintenance: Lovable is currently not able to reach the cloud provider for our previews, fly.io". Now it's fly.io's problem, not GitHub's.