You know the corporate screws are coming down hard when the model (which can run on a single A100) gets neither a code release nor a weight release but instead sits behind an API, and the authors say fuck it and copy-paste the entire model code as pseudocode on page 31 of the white paper.
Please Google/Demis/Sergey, just release the darn weights. This thing ain't gonna be curing cancer sitting behind an API, and it's not gonna generate that much GCloud revenue when the model is this tiny.
I wish there were some breakthrough in cell simulation that would let us create simulations as useful as molecular dynamics but feasible on modern supercomputers. Not being able to see what's happening inside cells seems like the main blocker to biological research.
STATE is not a simulation. It's a trained graphical model that does property prediction as a result of a perturbation. There is no physical model of a cell.
Personally, I think Arc's approach is more likely to produce usable scientific results in a reasonable amount of time. You would have to make a very coarse model of the cell to get any reasonable amount of sampling, and you would probably spend huge amounts of time computing things which are not relevant to the properties you care about. An embedding and graphical model seems well-suited to problems like this, as long as the underlying data is representative and comprehensive.
I don't think DM is the only lab doing high-impact AI applications research, but they really seem to punch above their weight in it. Why is that? Or is it just that they have better technical marketing for their work?
Agreed, there have been some interesting developments in this space recently (e.g. AgroNT). Very excited for it, particularly as genome sequencing gets cheaper and cheaper!
I’d pitch this paper as a very solid demonstration of the approach, and I'm sure it will lead to some pretty rapid developments (similar to what RoseTTAFold/AlphaFold did).
They have been at it for a long time and have a lot of resources courtesy of Google. According to Perplexity, the AlphaFold 2 database took "several million GPU hours".
Money and resources are only a partial explanation. There are some equally valuable or more valuable companies that aren’t having nearly as much success in applied AI.
This is such an interesting problem. Imagine expanding the input size to 3.2 Gbp, the size of the human genome. I wonder if previously unimaginable interactions would show up. Also interesting how everything revolves around U-Nets and transformers these days.
You would not need much more than 2 megabases. The genome is not one contiguous sequence. It is organized (physically segregated) into chromosomes and topologically associating domains. IIRC, 2 megabases is roughly the 3 SD threshold for interactions between cis-regulatory elements / variants and their effector genes.
Or to a man with a wheel and some magnets and copper wire...
There are technologies applicable broadly, across all business segments. Heat engines. Electricity. Liquid fuels. Gears. Glass. Plastics. Digital computers. And yes, transformers.
When I went to work at Google in 2008 I immediately advocated for spending significant resources on the biological sciences (this was well before DM started working on biology). I reasoned that Google had the data mangling and ML capabilities required to demonstrate world-leading results (and hopefully guide the way so other biologists could reproduce their techniques). We made some progress: we used Exacycle to demonstrate some exciting results in protein folding and design, and later launched Cloud Genomics to store and process large datasets for analytics.
I parted ways with Google a while ago (Sundar is a really uninspiring leader), and was never able to transfer into DeepMind, but I have to say that they are executing on my goals far better than I ever could have. It's nice to see ideas that I had germinating for decades finally playing out, and I hope these advances lead to great discoveries in biology.
It will take some time for the community to absorb this most recent work. I skimmed the paper and it's a monster, there's just so much going on.
I understand, but he made Google a cash machine. In the last quarter BEFORE he became CEO in 2015, Google made a quarterly profit of around 3B. Q1 2025 was 35B. A 10x profit growth at this scale is, well, unprecedented. The numbers are inspiring in themselves, and that's his job. He made mistakes, sure, but he stuck to Google's big gun, ads, and it paid off. The transition to AI started late, but Gemini is super competitive overall. DeepMind has been doing great as well.
Sundar is not a hypeman like Sam or Cook, but he delivers. He is very underrated imo.
Like Ballmer, he was set up for success by his predecessor(s), and didn't derail strong growth in existing businesses but made huge fumbles elsewhere. The question is, who is Google's Satya Nadella? Demis?
Since we're on the topic of Microsoft, I'm sure you'd agree that Satya has done a phenomenal job. If you look objectively, what are Satya's accomplishments? One word: Azure. Azure is #2, behind AWS, because of Satya's effective and strategic decisions. But that's it. The "vibes" for Microsoft have changed, but MS hasn't innovated at all.
Satya looked like a genius last year with the OpenAI partnership, but it is becoming increasingly clear that MS has no strategy. Nobody is using GitHub Copilot (the pioneer) or MS Copilot (a joke). They don't have any foundational models, nor a consumer product. Bing is still... Bing, and has barely gained any market share.
People nowadays don't understand how genius MS was in the 90s.
Their strategy and execution were insanely good, and I doubt we'll ever see anything so comprehensive ever again.
1. Clear mission statement: A PC in every house.
2. A nationwide training + certification program for software engineers and system admins across all of Microsoft's tooling
3. Programming lessons in schools and community centers across the country to ensure kids got started using MS tooling first
4. Their developer operations division was an insane powerhouse. They had an army of in-house technical writers creating some of the best documentation that has ever existed. Microsoft contracted out to real software engineering companies to create fully fledged demo apps to show off new technologies. These weren't hello world sample apps; they were real applications that had months of effort and testing put into them.
5. Because the internet wasn't a distribution platform yet, Microsoft mailed out huge binders of physical CDs with sample code, documentation, and dev editions of all their software.
6. Microsoft hired the top technical writers to write books on the top MS software stacks and SDKs.
7. Their internal test labs had thousands upon thousands of manual testers whose job was to run through manual tests of all the most popular software, dating back a decade+, ensuring it kept working with each new build of Windows.
8. Microsoft pressed PC OEMs to lower prices again and again. MS also put their weight behind standards like AC'97 to further drop costs.
9. Microsoft innovated relentlessly, from online gaming to smart TVs to tablets. Microsoft was an early entrant in a ton of fields. The first Windows tablet PC was in 1991! Microsoft tried to make smart TVs a thing before there was any content, or even widespread internet adoption (oops). They created some of the first e-readers, the first multimedia PDAs, the first smart infotainment systems, and so on and so forth.
And they did all this with a far leaner team than what they have now!
(IIRC the Windows CE kernel team was less than a dozen people!)
> the Windows CE kernel team was less than a dozen people!
It showed
CE was a dog and probably a big part of the reason Windows Phone failed. Migrating off of it was a huge distraction and prevented the app platform from being good for a long time. I was at Microsoft and worked on Silverlight for a bit back then.
> Azure is #2, behind AWS, because of Satya's effective and strategic decisions
I am going to have to disagree with this. Azure is number 2, because MS is number 1 in business software. Cloud is a very natural expansion for that market. They just had to build something that isn't horrible and the customers would have come crawling to MS.
You could just as easily make the argument that cloud is a very natural expansion for Google given their expertise in datacenters and cloud software infrastructure, but they are still behind. Satya absolutely deserves credit for Microsoft's success here.
Microsoft has become a lot more friendly to open source under Satya. VSCode, GitHub, and WSL happened during his tenure, and probably wouldn't have happened under Ballmer. Turning the ship from a focus on protecting platform lock-in to meeting developers where they are is a huge accomplishment IMO.
> Microsoft has become a lot more friendly to open source under Satya.
True, but that's just a few open source projects, albeit influential ones. There are so many other companies doing influential open source projects.
I don't disagree with anything you said, because turning a ship around is hard. But hand on heart, what big tech company is truly innovating for the future? Let's look at each company.
Apple - bets are on VR/AR. Apple Car is dead. So it is just Vision Pro
Amazon - No new bets. AWS is printing money, but nothing for the future.
Microsoft - No new bets. They fumbled their early lead in AI.
Google - Gemini, Waymo ..
I think Satya gets a lot more coverage than his peer at Google.
Waymo and DeepMind and the TPU program all predate Sundar as CEO.
IMO Google should have invested more in Waymo and scaled sooner. Instead they partnered with traditional automakers and rideshare companies, sought outside investment, and prioritized a prestige launch in SF over expanding as fast as possible in easier markets.
In other areas they utterly wasted huge initial investments in AR/VR and robotics, remain behind in cloud, and Google X has been a parade of boondoggles (excluding Waymo which, again, predates Sundar and even X itself).
You could also argue that they fumbled AI, literally inventing the transformer architecture but failing at building products. Gemini 2.5 Pro is good, but they started out many years ahead and lost their lead.
His genius is really just making good bets on people, and letting them do their thing.
People like Scott Guthrie, who was a key person behind .NET and went on to be the driving force behind Azure. Anyone who did any .NET work 10+ years ago would know the ScottGu blog and his red shirt.
Google similarly bet on Demis, and the results also show. For someone who got his start doing level design on Syndicate (still one of my all-time favourite games) he's come a long way.
Diversifying Microsoft away from the traditional cash cow of Windows and Office is the single most important strategy for Microsoft and he executed it well.
This is kind of bullshit. One can equally say Satya was set up for success by Ballmer, as he stepped away graciously, taking all the blame so the new CEO could start unencumbered.
He might have delivered a lot of revenue growth, yeah, but Google culture is basically gone. Internally we're not very far from Amazon-style "performance management".
Their brand is almost cooked though. At least the legacy search part. Maybe they'll morph into AI center of the future, but "Google" has been washed away.
The world is much, much bigger than the HN bubble. Last year, we were all so convinced that Microsoft had it all figured out, and now look at them. A billion is a very, very large number, and sometimes you fail to appreciate how big that is.
Oh, I'm conveying opinions other than mine: tech people I work with, who are actually very far removed from the HN mindset, were shitting on Google search for a long time this week.
Who didn't? I meant in the future, if this ends up producing long-term economic value (sorry, but video and image generation have no value; they're laughable, used for cheap needs, and most of the time people are very annoyed by them).
Read back what you just wrote. It is literally "willy nilly".
"Somethings are because of CEO, and some things are in spite of CEO"
And it was "willy nilly" attributed that enshittification was because of CEO (how do we know? maybe it was CFO, or board) and Gemini because of Demis (how do we know? maybe it was CEO, or CFO, or Demis himself).
You're misunderstanding what he's saying. He's saying Google has started enshittifying products and Sundar gets the blame for that. Sundar is also the CEO, so he gets credit for Gemini. Google's playbook is enshittification, though, and if Gemini ever gets a big enough moat, it will be enshittified. Even Gemini 2.5 Pro has gotten worse for me with the small updates, and it's not as good as when it first launched. Google topped the benchmarks and then made it worse.
I guess I don't understand why you so strongly believe that CuriouslyC's comment reflects an uninformed opinion without any basis in fact.
I see somebody saying something on here, I tend to assume that they have a reason for believing it.
If your opinions differ from theirs, you could talk about what you believe, instead of incorrectly saying that a CEO can only be responsible for everything or nothing that a company does.
Not really. The pressure to move into AI is so vast that in reality the CEO had little say in whether to move into it, and they already had smart employees to make it a reality. That is vastly different from what happened with enshittification, which Gemini is part of: just recently people were complaining that the turn-off button on their Android phones had been hijacked to start Gemini.
Demis reports to Sundar. All of Demis's decisions would have been vetted by and either approved, rejected, or refined by Sundar. There's no way to actually distinguish how much of the value was from whom, unless you have inside info.
> The transition to AI started late but gemini is super competitive overall.
If by competitive you mean "We spent $75 billion and now have a middle-of-the-pack model somewhere between Anthropic and a Chinese startup", that's a generous way to put it.
Citation needed. Gemini 2.5 Pro is one of the best models there is right now, and it doesn't look like they're slowing down. There is an LLM response to basically every single Google search query, it's built into billions of Android phones, etc. They're winning.
By competitive, I mean No. 1 on LMArena overall, in webdev, in image gen, in grounding, etc. Plus, leading the Chatbot Arena Elo. Flash is the most used model on OpenRouter this month as well. Gemma models are leading the on-device stats as well. So yes, competitive.
Gemini 2.5 Pro is excellent. Top model in public benchmarks and soundly beat the alternatives (including all Claudes and that Chinese startup’s flagship) in my company’s internal benchmarks.
I’m no Google lover — in fact I’m usually a detractor due to the overall enshittification of their products — but denying that Gemini tops the pile right now is pure ignorance.
> It's nice to see ideas that I had germinating for decades finally playing out
I'm sure you're a smart person, and you probably had super novel ideas, but your reply comes across as super arrogant / pretentious. Most of us have ideas, even impressive ones (here's an example: let's use LLMs to solve world hunger & poverty, and loneliness & fix capitalism), but it'd be odd to go and say "Finally! My ideas are finally getting the attention".
A charitable view is that they intended "ideas that I had germinating for decades" to be from their own perspective, and not necessarily spurred inside Google by their initiative. I think that what they stated prior to this conflated the two, so it may come across as bragging. I don't think they were trying to brag.
I don't find it rude or pretentious. Sometimes it's really hard to express yourself in an acceptable, neutral way when you worked on truly cool stuff. It may look like bragging, but that's probably not the intention. I often face this myself, especially when talking to non-tech people: how the heck do I explain what I work on without giving a primer on computer science!? Often "whenever you visit any website, it eventually uses my code" is a good enough answer (I worked on the AWS EC2 hypervisor, and well, whenever you visit any website, some dependency of it eventually hits AWS EC2).
It is a lot to expect of readers... It's also explicitly asked of us in this forum. https://news.ycombinator.com/newsguidelines.html. "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."
It's also natural language though, one can find however much ambiguity in there as they can inject. It hasn't for a single moment come across as pretentious to me for example.
Think of all the tiresome Twitter discussions that went like "I like bagels -> oh, so you hate croissants?".
Did you ride the Santa Cruz shuttle, by any chance? We might have had conversations about this a long while ago. It sounded so exciting then, and still does with AlphaGenome.
I have incredibly mixed feelings on Sundar. Where I can give him credit is really investing in AI early on: even if they were late to productize it, they were not late to invest in the infra and tooling to capitalize on it.
I also think people are giving maybe a little too much credit to Demis and not enough to Jeff Dean for the massive amount of AI progress they've made.
I found it disappointing that they ignored one of the biggest problems in the field, i.e. distinguishing between causal and non-causal variants among highly correlated DNA loci. In genetics jargon, this is called fine mapping. Perhaps, this is something for the next version, but it is really important to design effective drugs that target key regulatory regions.
One interesting example of such a problem and why it is important to solve it was recently published in Nature and has led to interesting drug candidates for modulating macrophage function in autoimmunity: https://www.nature.com/articles/s41586-024-07501-1
Does this get us closer? Pretty uninformed but seems that better functional predictions make it easier to pick out which variants actually matter versus the ones just along for the ride. Step 2 probably is integrating this with proper statistical fine mapping methods?
Yes, but it's not dramatically different from what is out there already.
There is a concerning gap between prediction and causality. In problems like this one, where lots of variables are highly correlated, prediction methods that only have an implicit notion of causality don't perform well.
Right now, SOTA seems to use huge population data to infer causality within each linkage block of interest in the genome. These types of methods are quite close to Pearl's notion of causal graphs.
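To make the correlated-variants problem concrete, here is a toy sketch. Everything in it is my own assumption rather than anything from the AlphaGenome paper: two haploid 0/1 variants, only one of which affects the trait, with the second simply copying the first 95% of the time. The function names, the effect size, and the copy probability are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, copy_prob=0.95, effect=1.0, seed=0):
    """Toy linkage block (assumed setup): `tag` copies `causal` with
    probability `copy_prob`; only `causal` shifts the trait."""
    rng = random.Random(seed)
    causal, tag, trait = [], [], []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0
        t = c if rng.random() < copy_prob else 1 - c
        y = effect * c + rng.gauss(0.0, 1.0)  # tag has no direct effect
        causal.append(c)
        tag.append(t)
        trait.append(y)
    return causal, tag, trait


causal, tag, trait = simulate()
r_linkage = pearson(causal, tag)      # how tightly the two variants travel together
r_causal = pearson(causal, trait)     # marginal association of the real driver
r_tag = pearson(tag, trait)           # marginal association of the passenger

print(f"causal vs tag:   {r_linkage:.2f}")
print(f"causal vs trait: {r_causal:.2f}")
print(f"tag vs trait:    {r_tag:.2f}")
```

The passenger variant ends up with a marginal association close to the causal one even though it does nothing, which is why a predictor that only learns associations struggles to tell them apart inside a linkage block.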
> This has existed for at least a decade, maybe two.
Methods have evolved a lot in a decade.
Note how AlphaGenome prediction at 1 bp resolution for CAGE is poor. Just Pearson r = 0.49. CAGE is very often used to pinpoint causal regulatory variants.
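For a sense of scale: under a simple linear fit, the share of variance explained is r squared, so r = 0.49 is roughly a quarter of the variance. A tiny check (the helper name is just for illustration):

```python
def variance_explained(r):
    """Fraction of variance captured by a linear fit with Pearson correlation r."""
    return r * r


share = variance_explained(0.49)
print(f"r = 0.49 -> r^2 = {share:.2f}")
```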
When I was restudying biology a few years ago, it was making me a little crazy trying to understand the structural geometry that gives rise to the major and minor grooves of DNA. I looked through several of the standard textbooks and relevant papers. I certainly didn't find any good diagrams or animations.
So out of my own frustration, I drew this. It's a cross-section of a single base pair, as if you are looking straight down the double helix.
Aka, picture a double strand of DNA as an earthworm. If one of the earthworm's segments is a base pair, and you cut the earthworm in half, turn it 90 degrees, and look into the body of the worm, you'd see this cross-sectional perspective.
Apologies for overly detailed explanation; it's for non-bio and non-chem people. :)
It's not really just base pairs forcing groove structure. The repulsion of the highly charged phosphates, the specific chemical nature of the dihedral bonds making up the backbone and sugar/base bond, the propensity of the sugar to pucker, the pi-pi stacking of adjacent pairs, salt concentration, and water hydration all contribute.
My graduate thesis was basically simulating RNA and DNA duplexes in boxes of water for long periods of time (if you can call 10 nanoseconds "long"), and RNA could get stuck for very long periods of time in the "wrong" (i.e., not what we see in reality) conformation, due to phosphate / 2' sugar hydroxyl interactions.
Jeffhwang is correct, and dekhn is thinking way too hard. If you have any asymmetric planar structure that stacks into a helix along the third dimension, there will be a minor groove and a major groove.
I bet the internal pitch is that the genome will help deliver better advertising: if you are at risk of colon cancer, they sell you "colon supplements". It's likely they will be able to infer a bit about your personality just from your genome: "these genes are correlated with liking dark humor, use them to promote our new movie".
Some pharmas like Genentech or GSK also have excellent AI groups.
https://arcinstitute.org/news/virtual-cell-model-state
To a man with a hammer…
You have got to be kidding. The 90s was my heyday, and Microsoft documentation was extravagantly unhelpful, always.
- Created the Windows Server product
- Created the "rent a server" business line
- Identified the need for a VM kernel and hired the right people
- Oversaw MSFT's build out of web services (MSN, Xbox Live, Bing) which gave them the distributed systems and uptime know-how
- Picked Satya to take over Azure, and then to succeed him
Google is not behind capability-wise; they are actually in front of MSFT. The customer relationships matter a whole lot more.
This is all the 1st step of embrace and extinguish.
Managing to keep the MS Office grift going and even expand it with MS Teams is something
100% it's Demis.
A Demis vs. Satya setup would be one for the ages.
He also happens to be a really nice guy in person.
The question will be when and how the LLMs will be hit with product placement.
Openly marked advertisements in premium models and integrated ads in free-tier ones?
I still hope for a mostly ad-free world, but in reality Google seems to be in a good position now for the transition towards AI (with ads).
Haven't you been watching the headlines here on HN? The volume of major high-quality Google AI releases has been almost shocking.
And, they've got the best data.
Google's revenue in 2014 was $75B and in 2024 it was $348B; that's 4.64 times growth in 10 years, or 3.1 times if corrected for inflation.
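For anyone who wants to redo the arithmetic, a quick sketch using only the figures quoted above. The function names are mine, and the "implied deflator" is simply what the 4.64 and 3.1 figures imply when divided; I haven't checked it against an actual price index.

```python
def growth_multiple(start, end):
    """How many times larger `end` is than `start`."""
    return end / start


def annualized_growth(start, end, years):
    """Compound annual growth rate over `years` years."""
    return (end / start) ** (1 / years) - 1


revenue_2014 = 75.0    # $B, as quoted in the comment
revenue_2024 = 348.0   # $B, as quoted in the comment

multiple = growth_multiple(revenue_2014, revenue_2024)
annual = annualized_growth(revenue_2014, revenue_2024, 10)
implied_deflator = multiple / 3.1  # cumulative inflation factor implied by the "3.1 times" figure

print(f"nominal multiple: {multiple:.2f}x")
print(f"annualized:       {annual:.1%}")
print(f"implied deflator: {implied_deflator:.2f}")
```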
And during this time, Google failed to launch any significant new revenue source.
"Somethings are because of CEO, and some things are in spite of CEO"
And it was "willy nilly" attributed that enshittification was because of CEO (how do we know? maybe it was CFO, or board) and Gemini because of Demis (how do we know? maybe it was CEO, or CFO, or Demis himself).
What makes you think that LLMs can do it?
[1] relapsed capitalist, at best, check the recent Doomscroll interview
This has existed for at least a decade, maybe two.
> There is a concerning gap between prediction and causality.
Which can be bridged with protein prediction (alphafold) and non-coding regulatory predictions (alphagenome) amongst all the other tools that exist.
What is it that does not exist that you "found it disappointing that they ignored"?
https://www.instagram.com/p/CWSH5qslm27/
Anyway, I think the way base pairs bond forces the major and minor groove structure observed in B-DNA.
At least they got the handedness right.
> AlphaGenome will be available for non-commercial use via an online API at http://deepmind.google.com/science/alphagenome
So, essentially the paper is a sales pitch for a new Google service.