This was apparently a Linde installation custom built for TSMC in Arizona.[1]
Nitrogen, oxygen, and argon are extracted from air on-site and purified.
That's Linde's primary business: liquefying and distilling air. This isn't some little local company or a company operating outside its area of expertise.
Those gases are storeable, so it's surprising there wasn't enough tank capacity to deal with outages.
The site plan [2] shows "Gas Plant 1", and future "Gas Plant 2" and "Gas Plant 3". The gas plants are across a small road from the fab and feed the plant directly. Once Gas Plants 2 and 3 were built, there would be redundancy, but at this stage, there isn't a backup. The plan doesn't show a large tank farm, so they can't store gases in bulk.
Unless things have changed a lot since I fled semiconductor manufacturing, you would still need silane tanks at least. I’m as surprised as you that they don’t have buffer tanks.
I don't think Linde™ needs an ad; everyone knows Linde™ is the most reliable partner in producing and storing gases in all available purity classes.
(joke off, it's probably not an ad, but they were excited to share the reason you see Linde on all sorts of gas tanks all over the place. It's actually quite common, and once you see it you see it everywhere.)
Possibly, but it's also plausible that it's a deep joke that everyone is in on.
When googling the company, the marketing slogan that comes up is "Linde is Everywhere" but that works on so many levels. They sell air, air is everywhere. Therefore Linde is everywhere.
They are a company that sells air: that stuff that people breathe. Forget this AI nonsense. Jensen has to constantly pull something out of his rear to keep food on the table. These guys sell air. What a business. :)
> Those gases are storeable, so it's surprising there wasn't enough tank capacity to deal with outages.
It probably depends on the duration of the outage. I'd expect they have some storage, and if they plan on having the compressor plant down for longer than that storage can manage, they'll bring in tanks.
This isn’t very big news. Issues occur often during bring-up. Linde’s processes are possibly so power-intensive that failing over to generator power is not possible. TSMC is right to put Linde on notice, since Linde should have a PFMEA and control plan to eliminate any root causes for downtime. I suspect that in the long term TSMC has plans to insource this if the issue persists. Scrap happens sometimes during manufacturing; if the writer has only journalism experience and no manufacturing experience, they may not have a conceptual understanding of acceptable first-pass yield. After all, the TSMC logo features failing parts!
In many ways I agree with you, but the problem statement (constrained/exhausted gas supply from vendor) makes it seem like this was not just line down, but the whole factory stopped for a few hours. Line down is a miserable migraine but still manageable... while a whole factory stoppage makes a lobotomy seem like a good idea. It also sounds like there was not enough forewarning to park critical customer wafers in a "safe" stage of the process.
Even so, I also would still call this another Monday at a semiconductor factory. Welcome! Here we play a nearly endless game of whack-a-mole. Here's your mallet and your towel. Now whack enough of the moles hard enough until they stop coming back (at least through the same holes). Beware the alpha moles.
By any road, I am surprised to see even this high-level perspective on a quality event disclosed to the mainstream public; I thought this was not standard practice. I enjoyed the read.
The opening paragraph feels a bit pearl clutching to me.
> the company had to scrap thousands of wafers that were in production for clients at the site which include Apple, Nvidia, and AMD.
Eh. So what? I am sure they scrap thousands of wafers for all kinds of other reasons. It would be better to know the cost per hour of a total plant shutdown. (Of course, I'm sure the author doesn't have this information.)
> After all, the TSMC logo features failing parts!
Normally a wafer would have die-sized spaces for test structures used for optical, electrical, chemical and other tests. Think the TV test card https://en.wikipedia.org/wiki/Test_card
> forcing the facility to shut down for at least a few hours
> As a result, the company had to scrap thousands of wafers
Anything involving wet chemistry, photoresist, furnaces, etc. is very time-constrained. You can't let wafers sit around indefinitely. Certain process steps must be followed up very quickly to avoid scrap.
This is why you don't see redundant power for manufacturing lines. A 3nm line needs hundreds of megawatts to operate. You can't clear queued lots without a fully functional line. There's not much you could save by keeping part of the line operational.
Some mods in modded minecraft had that and it's a very punishing mechanic unless implemented well.
It eats all of your power, and usually also very expensive items, very quickly. Assume you have something like 600 RF/tick generated, which is common with certain generator constellations. 1 tick: 1024 RF and one input consumed, crafting fails due to not enough power. 1 tick wait, 1 tick, 1024 RF and one input consumed, ... This can void 10+ items/second, which can hurt very badly, even for common items.
It also tends to kick you while you're down, because it only kicks in if everything else is already failing. Then the only thing to continue functioning is the thing voiding your energy and your expensive items. Or even worse, if you did one miscalculation about your power grid, and then all of your resources are gone, often before you can react.
It can be interesting in the right packs, but it is Gregtech level hardness.
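For anyone who wants to check the "10+ items / second" figure, here is a rough sketch of that loop. It assumes the machine tries a craft as soon as its buffer holds 1024 RF, that every failed attempt eats the 1024 RF plus one input, and that the game runs at 20 ticks per second. The function name and the buffer model are made up for illustration, not taken from any particular mod.

```python
TICKS_PER_SECOND = 20  # assumption: Minecraft's nominal tick rate


def items_voided(gen_rf_per_tick: int, cost_rf_per_attempt: int, ticks: int) -> int:
    """Count inputs voided by a machine that keeps attempting underpowered crafts."""
    buffer_rf = 0
    voided = 0
    for _ in range(ticks):
        buffer_rf += gen_rf_per_tick  # generators trickle power into the buffer
        if buffer_rf >= cost_rf_per_attempt:
            # The machine starts a craft, drains the buffer, then fails:
            # the energy and one input item are both gone.
            buffer_rf -= cost_rf_per_attempt
            voided += 1
    return voided


voided_per_second = items_voided(600, 1024, TICKS_PER_SECOND)
print(voided_per_second)  # 11
```

Under those assumptions the machine burns an input roughly every other tick, which is where the 10+ per second comes from.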
GT:NH has “easy mode” enabled in some regards - it won’t finish the craft but it WILL wait for power (actually keep trying) - so if you fix the power problems you can finish and not lose the mats.
Multiblocks power fail and void, but then your machine shuts down until you restart it. This is much better than suggested above, where you'd void over and over, but it can still utterly mess up a large craft being orchestrated thru AE2, which is still waiting forever for the failed craft to submit a part back into the system.
GregTech doesn't use RF though, at least it didn't. Machines pull packets of amps through the wires from the generators/batteries, the whole system is pretty interesting. Also high-level circuits have to be manufactured in cleanrooms with a pretty complex tech chain.
Oh, GT's power is absolutely not RF. Back in the day, even GT's power could be cruel though. You could over-volt your machines and thus void machines you spent literal days crafting. And the cables in the process too. And you could lose your entire infrastructure once it rained and you had no roof :)
Out of curiosity, which mods are this cruel? I've been playing GT (modern) lately and even it doesn't void your machine's items unless you break the machine itself.
Oh this was in the days of yore of modded 1.4 and early 1.7. I don't remember specific mods, I just remember the pain and frustration of this happening.
I'm currently playing Stoneblock 4 and have been playing GT:NH and Nomifactory some time ago, and the more modern mods have learned a lot from those old janky things. Heck, back in the day every mod had a different power system and you needed a nonsense amount of conversion infrastructure, unless the modpack did a lot of work to combine all of this somehow, haha.
Power brownouts are pretty rare outside of the very early game. It's too easy and cheap to massively overproduce power for that to really harm players outside the early game, so I don't think there'd be much interest. Usually brownouts rapidly develop into full-blown blackouts and black restarts: your miners reduce output during the brownout, which often reduces incoming fuel, which means even less power is generated, in a self-consuming cycle.
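A toy model of that self-consuming cycle, as a sketch only: it assumes miners slow down in direct proportion to how well powered they were on the previous step, and that all fuel goes straight to the generators. The function name, the parameters, and the numbers are invented for illustration and are not taken from the game.

```python
def simulate_brownout(demand_mw: float, fuel_capacity_mw: float, steps: int) -> list[float]:
    """Return power satisfaction (0..1) per step for a toy fuel/power feedback loop."""
    satisfaction = 1.0  # start fully powered
    history = []
    for _ in range(steps):
        # Underpowered miners deliver proportionally less fuel...
        fuel_mw = fuel_capacity_mw * satisfaction
        # ...and less fuel means the generators cover less of the demand.
        satisfaction = min(1.0, fuel_mw / demand_mw)
        history.append(satisfaction)
    return history


healthy = simulate_brownout(demand_mw=100.0, fuel_capacity_mw=110.0, steps=30)
spiral = simulate_brownout(demand_mw=100.0, fuel_capacity_mw=90.0, steps=30)
print(healthy[-1], round(spiral[-1], 3))
```

In this model a grid with any fuel headroom stays at full power, while one that is even slightly short decays geometrically toward a blackout, which matches the point that modest overproduction makes the problem disappear.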
Suppose you can start production with only 1 of each input required for a recipe, but to keep it going you need to keep feeding all of the inputs to finish it. If any of them run out, then the recipe fails, you lose the inputs, and the machine stalls.
This works better for high latency recipes (>10s) with lots of inputs, like low density structure, modules, and atomic bombs.
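The proposed mechanic can be sketched as a tiny simulation. Everything here is hypothetical (`run_craft`, `CraftResult`, and the one-unit-per-tick consumption rule are assumptions for illustration, not how any existing game implements recipes): the craft consumes one unit of each still-needed input per tick, and if any needed input is missing on a tick, the craft fails and everything consumed so far is lost.

```python
from dataclasses import dataclass


@dataclass
class CraftResult:
    status: str  # "done", "failed", or "in_progress"
    tick: int    # tick at which the craft finished, failed, or ran out of data
    lost: int    # units of input voided by a failure


def run_craft(recipe: dict[str, int], supply_per_tick: list[dict[str, int]]) -> CraftResult:
    """Feed a recipe one unit per input per tick; fail if any needed input is absent."""
    consumed = {name: 0 for name in recipe}
    for tick, supply in enumerate(supply_per_tick):
        pending = [name for name, need in recipe.items() if consumed[name] < need]
        if any(supply.get(name, 0) < 1 for name in pending):
            # A required input ran out mid-craft: everything fed so far is lost.
            return CraftResult("failed", tick, sum(consumed.values()))
        for name in pending:
            consumed[name] += 1
        if all(consumed[name] == need for name, need in recipe.items()):
            return CraftResult("done", tick, 0)
    return CraftResult("in_progress", len(supply_per_tick), 0)


recipe = {"plate": 3, "plastic": 2}
steady = run_craft(recipe, [{"plate": 1, "plastic": 1}] * 3)
starved = run_craft(recipe, [{"plate": 1, "plastic": 1}, {"plate": 1}])
print(steady, starved)
```

With a steady feed the craft completes on the third tick; when plastic runs dry on the second tick the craft fails and the two units already fed are voided. Longer recipes with more inputs give more ticks on which a shortage can hit, which is why the mechanic bites hardest there.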
Usually the answer is to just slightly overproduce the inputs; only the new planet Gleba even slightly discourages letting items just sit on the conveyors, with its freshness mechanic. What's the benefit?
Mindustry has something similar with pumping various gasses/liquids through plumbing. If you accidentally mix them while building new lines, things stop working when your gases get mixed up forcing you to purge the line.
Someone needs to make the whole chip manufacturing process into a factorio like game and let the gamers optimize it, then build the factories around that.
TSMC has backup generators in their AZ fab. You actually have to have backup power or a few hundred millisecond blip could cause days or weeks of tool down time. You should see what happens when you lose the ability to keep a clean room at temp/humidity/airflow...it's weeks or months.
It didn’t happen, but the facilities team at the fab where I worked was seriously considering installing a flywheel to cover power bumps. What I don’t get about this story is how this actually happened. All our process gasses were out in a tank farm and we knew how much pressure we had. We would have stopped the line if there wasn’t enough to proceed. Were they separating air onsite or something?
I was very impressed by the modest little fab I worked at having thousands of lead acid batteries for momentary takeover, and 8 five-megawatt locomotive engines for longer term redundancy. Apparently their steady state usage was 25MW, which allowed still having a hot spare and concurrent downtime for two of the locomotive generator units.
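The redundancy arithmetic in that setup works out as follows. This is a minimal sketch using only the figures quoted above (eight 5 MW units, 25 MW steady-state load); the function name is made up for illustration.

```python
import math


def generator_margin(total_units: int, unit_mw: float, load_mw: float) -> tuple[int, int]:
    """Return (units required to carry the load, units left over as spares)."""
    required = math.ceil(load_mw / unit_mw)
    return required, total_units - required


required, spare = generator_margin(total_units=8, unit_mw=5.0, load_mw=25.0)
print(required, spare)  # 5 3
```

Five units carry the load, leaving three: one hot spare plus two that can be down for maintenance at the same time.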
Yes, Linde has an onsite plant and is building two more.
For some processes, stopping will botch the wafer. In the event of a gas shortage, do plants plan which lines to take down first, and which lines should complete a process step?
The way this worked at the fab where I was, was that facilities would have paged everybody, and whoever needed to hold wafers would do so. You could mark your equipment down or unavailable for a particular step. I don’t know what we would have done if it was “hey, we lost dry nitrogen a minute ago.” I think at that point you lose a lot of wafers in wet cleans.
In the case of a power interruption at the fab, consequences were highly dependent on the equipment and the unit process. A prolonged power interruption to diffusion was the worst case scenario. You’d have 150 wafers in the furnace, and any significant deviation from the nominal temperature profile meant they were all scrap. Worse, if the furnace cooled off, you had to scrap the quartz boat the wafers rode in, too. Other processes had a smaller blast radius but were even more of a headache to disposition. Implant, you’d lose beam and probably lose vacuum too. Then the wafer in the chamber would be dusted and in an indeterminate state, and the rest of the wafers you’d have to sleuth out whether they were implanted or not. Sometimes you’d have a lot sitting in the end station and it wouldn’t be clear whether or not it had been run at all. At least in photolithography you could tell whether or not a wafer was patterned by looking at it.
It seems like what is often downplayed or passed over in silence in American media is the cultural mismatch between TSMC's Taiwanese engineers and their American counterparts,
so it always comes as a bit of a surprise to those out of the loop, but from what I've read from individual Taiwanese workers and their feedback, it's clear that there is significant regret on one side.
And it doesn't seem to be limited to just TSMC; another large company recently received an icy reception for its large investment in American manufacturing.
I think this is a big reason why a lot of these jobs simply wouldn't stay in America, as the consumer would not be able to foot the costs added by the "cultural premium" faster than innovation can reduce them.
TW workers have a majority of their compensation in bonuses, so the OT portion is quite small and many do not even bother to ask for it. The overall compensation gap between a TW and a US engineer at TSMC is also significant. Not to mention the lowest-paid hourly workers...where in TW they make 2-3X minimum wage, but in the US it's like 1.25X.
This reddit post captures what I've seen at TSMC in Taiwan. $120K is normal pay at the director level...engineers make $2500-5000 a month. TSMC AZ starting pay for a new college grad w/ BS is probably just under $100K/year with just salary, with the potential to make over $120K within a few years with full vested bonuses.
I think your numbers match what my cousin shared. In both my conversations with my cousin and in the reddit post, it is unclear if reported salaries are take-home or don't include the OT and bonuses, but I don't get your point?
My point is: Engineers in Taiwan work more hours because they are paid to work more hours (OT). Engineers in the USA are not paid more if they work 35 hours or 60 hours.
If TSMC wants to address the culture gap (get the Americans to work more), TSMC should pay up.
The entire approach is different. Especially with Taiwanese engineers, their entire focus is whatever work they are doing. Everything else (quite literally), their wives handle.
Americans typically ask for things like work-life balance, non-abusive working hours, etc. They also don’t (anymore) have the type of family life setup that allows them to actually focus so much - being pulled into child care duties, or taking care of family members, or whatever their next vacation should be, etc.
The general attitude is also more ‘yeah whatever’ to some extent.
The amount of singular obsessive engineering you get out of one vs the other is hard to compare.
Hmmm, this is interesting. I was always under the impression Taiwanese wives were more progressive and men had to do a lot more lifting vs. other regional cultures in East Asia.
My original thinking after reading some of the anecdotes from TSMC engineers is that they were obsessively dedicated, which means extreme hours by North American standards.
It's also the same in places like Samsung, where the company treats employees very well with perks and long career stability, but it's not free; it always requires huge sacrifice, I'd imagine similar to Japanese conglomerates.
I'm not sure which is better. In America it's definitely a transactional relationship, but it also comes with stability issues compared to what these East Asian giants offer, though theirs comes at the cost of not being able to switch if and when you find yourself at odds.
Not sure what it was like at Nokia, but that's another conglomerate that ultimately folded under competition, and in a country with more stringent labor/life constraints, which you would find less enforced in East Asia.
Getting a bit distracted here, but I'm noting how much culture plays a role in these large companies and their management styles.
There are like half a dozen semiconductor manufacturers in Phoenix that were here before TSMC arrived. There's a robust pipeline from ASU to these same manufacturers. Can we please just stop with the nonsensical notion that "Americans don't know how to fabricate semiconductors"?
TSMC is a publicly traded company just like the others. I'm not familiar with their governance but Google tells me the largest owner (a state development fund) has 6%.
They have a special advantage because they don't compete with their customers, which leads to trust, which leads to customers paying for their R&D for them.
Intel on the other hand just kind of sucks at their job. Skill issue basically. (But they aren't /that/ far behind.)
It's not that the USA can't produce semiconductors. It's that semiconductor production at TSMC's scale (in terms of number of units, yield rates, and depth) currently requires highly skilled workers to work a lot of hours to "babysit" the wafer production.
Maybe there is a world where TSMC can hire enough skilled workers and optimize processes enabling people to go home at 5pm, but that is not currently the case.
Yes. This. So, yeah, essentially fundamentally incompatible with the US economy.
The US is going to have to heavily subsidize the payroll of tens of thousands of very accomplished EEs/etc to make this work. By doing that they will also wreck the HW part of SV.
There isn't really a HW part of SV. Hardware engineers aren't paid well enough to live there in droves like programmers. There are some of course, but the ones I know are in San Diego or Bremerton or Israel.
Also, it's completely normal to run a factory 24/7. I think people are just impressed because TSMC is the only one they've read about?
(However, it's correct that a TSMC fab is the most advanced and complicated process on the planet.)
But it's only being reported now; before this it was only speculation on the financial reports. How quickly do they normally report disruptions like this?
I wouldn't think it would have to be too quickly since I've heard about fab disruptions from fires and such since the early 2000's. Probably just sometime after quarterly reporting to set the record straight? Why not in the report?
I also had the impression from the report that shareholders were miffed about this Q3 snag, so they had to publish this even though they were about to treat this as business as usual.
[1] https://www.aztechcouncil.org/utility-company-makes-progress...
[2] https://semiwiki.com/forum/threads/tsmc-phoenix-arizona-fab-...
From the outside, I would love to participate in semiconductor manufacturing.
(Well, it's cheaper.)
What is funny though, at least in the Australia and UK regions, they still use the BOC brand, which is a subsidiary under Linde.
Supagas tend to have better prices for smaller operators, and hobbyists.
I'm not sure about that, I think the blank spaces are just parts that have been picked. The dies have been cut and the good ones are being removed.
A new failure mode resets output progress back to zero if you lose power or some other input while crafting.
You could design circuit networks to cut power to non-essential systems so the rest of the factory can keep producing.
May or may not apply to multi blocks.
GT's system of only pulling power on-demand is very nice though; no wasting fuel
It still looks kinda easy, the machines just do it automatically on the default game.
Surely the reality might be much more complex (like... the yield/quality drop by time function?)
It's probably not 100% identical to TSMC's process.
He admitted that, even with their OT and bonuses, he probably makes more than they do on W-2 salary.
But my point still remains: if they want US (or TW) folks to work more hours, they need to pay for those hours.
I’m not an expert on Taiwanese labor laws but their list of exempt labor categories in the LSA is much shorter than the one in the American FLSA.
Think about how Intel, who pioneered the know-how, can't build cutting-edge nodes at the levels they need to make it profitable.
IBM had to sell their fabs to cater to the whims of "shareholders".
It's the greed of stockholders that you need to blame.
https://news.ycombinator.com/item?id=17686310 ("Computer Virus Cripples Several Taiwan Semiconductor Plants (bloomberg.com)"—2018, 100 comments)
https://news.ycombinator.com/item?id=19214952 ("TSMC's Photoresist Material Incident: $550M Loss (anandtech.com)"—2019, 15 comments)