Munich 1991: The Roots of the Current AI Boom

(people.idsia.ch)

140 points | by tosh 2 days ago

13 comments

  • HarHarVeryFunny 1 hour ago
    The current AI boom has more to do with NVIDIA, and the popularity of computer gaming giving us GPU compute, than with who was using neural networks back in the 1990s.

    More specifically, it was really AlexNet, the 2012 ImageNet entry, running on two NVIDIA GTX 580s, that highlighted the practicality and utility of running large-scale neural nets on affordable hardware. CUDA had been released in 2006, but cuDNN (the CUDA library for neural nets) didn't come out until 2014 - after AlexNet had already kickstarted the demand.

    What followed from AlexNet was a few years of intense competition on the ImageNet benchmark, with ever larger and deeper neural nets (CNNs), which gave rise to a lot of the algorithms and concepts still used today, such as residual connections (originally from ResNet), Adam (the training algorithm), ReLU, normalization, and dropout: all the fundamentals that made building large neural nets possible.

    Schmidhuber's continually reminding everyone that he was working on neural nets back in the 1990s is beyond tiresome. Yes, he should have been recognized alongside Hinton/Bengio/LeCun as one of the pioneers, but it's time for him to get over it.

    • LogicFailsMe 21 minutes ago
      And Google's acquisition of DNNresearch to get the ball rolling with conv nets and AI moneyball, followed by the acquisition of DeepMind. Schmidhuber IMO *has* been recognized as one of the 4 horsemen, and rightly so, but what has he done lately? Just noticed they now say the 3 godfathers of AI. This is what people hate about academia. It's not academia itself, it's the mean girl politics that emerge from the tenure system. And at this point, tenure should be abolished IMO, having been utterly weaponized to defend the status quo.
  • greenavocado 1 minute ago
    Schmidhuber will NEVER stop trying to aggressively preserve his relevance and it's endlessly amusing. Good for him.
  • MeteorMarc 4 hours ago
    Also see Schmidhuber's take on the Hinton + Hopfield Nobel prize: https://people.idsia.ch/~juergen/physics-nobel-2024-plagiari...
    • Hoasi 3 hours ago
      Not that surprising since the whole LLM ecosystem is based on plagiarism.
    • h8hawk 4 hours ago
      It's sad that he is the only one speaking out about Hinton. This whole Hinton glorification seems like it's being pushed by an agenda. I'm not sure if he would receive this much attention if he held a different view (closer to LeCun or Ng), rather than these Effective Altruism takes on current AI.
    • letssaythat 3 hours ago
      [flagged]
      • vld_chk 2 hours ago
        Hm, first time I see that Russian bots came to HN, but here we are. The history of comments of this account is insane.
        • snowpid 21 minutes ago
          I believe it's just someone who got too deep into the Russian propaganda rabbit hole. The comment is too leaky for somebody working in St. Petersburg / an LLM.
      • snowpid 2 hours ago
        well, so you think, all parts and peoples of USSR were voluntarily part of USSR?
  • cold_harbor 35 minutes ago
    worth separating: LSTM (Hochreiter & Schmidhuber 1997) is ironclad and widely cited. the transformer attention priority claims are far shakier. conflating them is how Schmidhuber undermines himself
  • practal 4 hours ago
    TU Munich and Nipkow, Makarius et al. are also at the center of the influential Isabelle theorem prover. TU Munich is cool :-)
  • trashburger 2 hours ago
    This article, too, was originally discovered by Jürgen Schmidhuber in 1991!
  • jcattle 6 hours ago
    There's this crowd on HN which is very vocal against academia. From what I've seen, the main points are that academia isn't efficient, that most of the science coming out of academia is useless, and that the whole system is just a waste of taxpayers' money. Instead, it is often argued, all good research is done in private labs, with people then pointing to SpaceX, Moderna, OpenAI, Google, etc.

    And while it is very true that often the research coming out of Academia is useless, what is always neglected are the roots of the research done in private labs.

    When Jürgen Schmidhuber and team published their work on Neural Nets back in 1991 it was also useless. Unless you had a supercomputer and very, very deep pockets you were not going to do anything with what came out of their lab.

    But still, 30 years later here we are, standing on top of the shoulders of this useless research.

    • jillesvangurp 4 minutes ago
      Private labs feed off academia. Without academics to staff them, they'd get a lot less far.

      I used to work at Nokia Research when they still made phones. Probably the closest thing Europe had to Silicon Valley twenty years ago. Except it was in Helsinki. Lots of stuff got invented there. Nokia didn't really manage to capitalize on its own inventions of course. Or rather it got caught up in its own clumsy attempts throwing babies out of the window by the bucket load. But others sure did. A lot of modern smart phones still have tech in them that Nokia pioneered before either Google or Apple shipped a smart phone.

      At the time there was a lot of talk about the demise of industrial research labs. Bell Labs (now actually owned by Nokia!), Xerox PARC, IBM, and all the other big US labs that produced amazing stuff are shadows of their former selves. There is some truth in that.

      But you could argue that Google and Apple picked up some of the slack. And the current AI boom came out of Google cherry-picking all the best universities for their AI talent and putting them all together in a research group that then got free rein. Like Nokia, that involved a lot of ejecting of babies with the bath water. But it seems to have spawned lots of new startups that can trace their roots back to that research group in Google.

    • yorwba 5 hours ago
      Like half of what Schmidhuber is always complaining about is that (except for LSTMs) people aren't standing on the shoulders of his research very much. They try to solve some of the same problems people have always wanted to solve, try some of the same approaches people always tend to try, and then tinker until it works. At no point do they consult Schmidhuber's decade-old papers where he tried something kind of similar but didn't get very impressive results, and hence they also do not think to cite him. Then he comes out of the woodwork to assert priority.
      • suddenlybananas 4 hours ago
        You can be influenced downstream by papers you haven't personally read.
        • bonzini 4 hours ago
          Shane Legg was in Schmidhuber's lab at IDSIA before being one of the founders of DeepMind, so he probably read the papers personally and knows what influenced him or not...
        • gillesjacobs 3 hours ago
          Of course, but if you haven't read them you also shouldn't cite them.

          And that's where Schmidhuber goes off the rails: publicly shaming published papers into citing you isn't good academic practice. It's bullying.

          • psb217 2 hours ago
            "if you haven't read them you also shouldn't cite them" -- this is wildly incorrect in an academic context. If I'm using ResNets, I should cite the original ResNet paper, even if I haven't read it. If I'm using Transformers, I should cite the original Transformer paper, even if I haven't read it. If my work is a direct extension of method B, and method B is a direct extension of method A, I should cite the source of A, even if I haven't read it.

            You can't claim independence from past work simply because you didn't look directly at it. The job of an academic researcher is to know the landscape of relevant ideas, where they come from, where they're going, and to hopefully contribute a few new good ones.

            Citation chains should extend back from your work, along a reasonable line of conceptual inheritance, back to a reasonable point of origin. Schmidhuber has different definitions for both of these reasonables than the bulk of the ML research community, to a point that makes him difficult to satisfy.

            • jasonhong 1 hour ago
              It's worth pointing out that sometimes, some papers just become part of the general context of things and are no longer explicitly cited. Or people cite textbooks or general survey papers instead.

              For example, take a look at Albert Einstein's Google Scholar profile. He's not the top cited physicist. Not even close. It's because other researchers don't explicitly cite his papers. https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en...

              Same with Tim Berners-Lee and the World Wide Web. Imagine if his original paper were cited every time someone deployed a web site.

            • inigyou 2 hours ago
              You should read those papers then
          • dividedbyzero 2 hours ago
            > Of course, but if you haven't read them you also shouldn't cite them.

            But if you build on them you should have read them. I don't know about the specifics and I don't know if Schmidhuber is out of line or not, and citations and impact factors are a terrible mess, but generally speaking, you are responsible for finding and reading and citing any related work that needs to be cited, and if you work on neural networks in an academic context you probably have been forced to read that particular one at some point. Citation obligations don't just disappear because you don't want to do the research.

    • elorant 3 hours ago
      I do a lot of work that is based on academic research, namely building a proprietary sparse embedding model. My issue with academia is that they don’t bother to solve the practical issues. They tell you how to build a PPMI model, but what about hitting a database that’s 500TB to find co-occurrence numbers? This isn’t even touched, so you’d then have to go and invent a bazillion algorithms yourself to make your life easier. So while the bedrock is based on academic research and we thank them for that, scaling anything requires a lot of work in uncharted territories.
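
      For readers unfamiliar with the term, PPMI (positive pointwise mutual information) is the textbook weighting applied to word/context co-occurrence counts. The following is only a minimal in-memory sketch of that definition; the function name `ppmi` and the toy counts are made up for illustration, and it does nothing to address the scaling problem described above.

```python
import math
from collections import Counter


def ppmi(pair_counts):
    """Return {(word, context): PPMI score} from raw co-occurrence counts.

    pair_counts maps (word, context) tuples to how often they co-occur.
    Pairs whose PMI is zero or negative are dropped, which is what keeps
    the resulting representation sparse.
    """
    total = sum(pair_counts.values())
    word_totals = Counter()
    context_totals = Counter()
    for (word, context), n in pair_counts.items():
        word_totals[word] += n
        context_totals[context] += n

    scores = {}
    for (word, context), n in pair_counts.items():
        # PMI = log2( P(w, c) / (P(w) * P(c)) ), written in terms of counts
        pmi = math.log2(n * total / (word_totals[word] * context_totals[context]))
        if pmi > 0:
            scores[(word, context)] = pmi
    return scores


# Toy counts, purely illustrative
counts = {
    ("cat", "purr"): 4,
    ("cat", "the"): 4,
    ("dog", "bark"): 4,
    ("dog", "the"): 4,
}
scores = ppmi(counts)
```

      In this toy example the uninformative context "the" scores a PMI of zero for both words and is dropped, while the distinctive contexts are kept.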
      • candu 1 hour ago
        Well, yeah. That's why we have "research & development" as a term.

        What you're referring to is the "development" part of that. In some sense: the job you have _exists precisely because it's not part of the research phase_, and it's equally as valuable as the research part. Research is the proof of concept; development is scaling up and making production-ready and finding small efficiencies and so on.

        From an industry perspective, it's tempting to conflate these, because that's what industry research labs are designed to do: integrated R&D. But that is not at all how academic research labs work.

      • jhbadger 2 hours ago
        But that isn't the purpose of academia -- the purpose of it is to discover new phenomena, not to make products. It is true that there is a lot of work to turn a new advance into a product, whether it is software or turning biological knowledge into a drug, but without the discovery of new phenomena, new products will come to a halt. While it is true that some corporate labs, most famously Bell Labs in its heyday, but also for example IBM's T.J. Watson and Xerox's PARC, did do basic research besides product-focused work, this is pretty rare because it is hard to justify the cost of something that may only be practical in decades and often helps your competitors as much as yourself.
      • gessha 1 hour ago
        I jest but database design is its own sub field of computer science, maybe look into their papers?
        • elorant 52 minutes ago
          I did that too. Ended up building my own reverse index with a fixed-size vocabulary. But that's my issue, you start building one product and you end up building ten in the process to solve all edge cases because no one bothered to research how things scale.
      • utopiah 2 hours ago
        The practical issue of academia is epistemological. It's about learning how a phenomenon came to exist. If you are looking for efficiency, the field of academia related to learning how to do so is computational complexity, and it works quite well.

        The goal of academia isn't to be practical, "only" learning.

    • fedeb95 1 hour ago
      I think most people forget the graph-like nature of scientific research. You don't have n useful papers and m useless ones by themselves, you have an interconnection of those. There may be isolated cliques of uselessness, but there isn't a clear correlation between academia and private research.

      Many ideas come from philosophy, which many find useless.

      Heraclitus discovered change back in ancient Greece, I don't know where we would be in scientific research without that (deliberately ignoring the debate about the originality of what we know about Heraclitus's work). I bet his contemporaries found his "research" useless.

    • ACCount37 4 hours ago
      Where is "this crowd" that you are talking about?

      The closest to that that I've seen is that traditional academia approaches are too far removed from practical applications for highly applied fields like software engineering, or too slow for fast-moving fields like modern day ML (thus, all the preprints).

    • tcp_handshaker 4 hours ago
      I think most of the criticism of academia is about the rampant fraud and unreproducible results, due to the way the incentives are structured.
    • FrustratedMonky 59 minutes ago
      It's like the old saying "only 10% of my marketing budget is making a difference, I just don't know which 10%"

      You don't know ahead of time, where the breakthrough will come from.

      There is a ton of research that sits on the shelf, and then years later, it gets re-combined with some other useless research, and boom, some big breakthrough.

      This current attitude that all research is worthless, so it should be cancelled, is shooting our future selves in the face.

    • wolfi1 3 hours ago
      and you still need tons of money
    • mschuster91 35 minutes ago
      > From what I've seen, the main points are that academia isn't efficient, most of the science coming out of academia is useless and that the whole system is just a waste of taxpayers money. Instead, what is often argued, all good research is done in private labs. Then pointing to SpaceX, Moderna, OpenAI, Google, etc.

      Well... that's "starve the beast" in action. A lot of things we take for granted, that underpin our modern ways of life, came to be due to government investing. Laser, radar, microwaves, the early Internet, that all was military R&D.

      "Unfortunately" (well, for the rich and the MIC, at least) there is no way for people to siphon off money in government-funded research, so once the libertarian/small-state BS completely took over following the collapse of the USSR, a lot of that got torn down or supplemented with enough bureaucracy to make Germans cry... and that's why reusable rockets were not invented at NASA but at SpaceX instead.

    • contingencies 1 hour ago
      Every western academic nearly systematically ignores eastern science and philosophy: classicism means "western European". Never mind Europe only flourished intellectually post Islam, which imported the science and engineering of China and India, critically including printing and zero[0]. IMHO this is why distaste for academia grows: it's based on appeals to authority which are demonstrably farcically misplaced. Alternatively stated: the emperor has no clothes, much less silk or paper!

      Just as the Dewey Decimal System really only served the purpose of providing the facetious nominal linearization of an arbitrary depth ontological oversimplification, so too humans are much more like random pattern matching machines than fastidious sense-makers glued to absolutes derived from false appeals to static mono-perspective ontological hierarchies. The same is becoming lived experience in the LLM age, although the tiktokked youth apparently cannot string ten words together or focus longer than three seconds to attest, I'd wager they can feel it. Are we losing something by rejecting the habit of rigorously manually tending to spurious and temporary ontologies? Yes. Is it necessarily a loss in the long term? Probably not, in the same way we no longer write long-form letters or leave calling cards. Are we gaining something in response? Yes, at a minimum much stronger cross-pollination between ivory towers by fearless exploratory pragmatists who disrespect the would-be scope of nominal professions in favor of holistic thinking... both AI and human.

      [0] https://en.wikipedia.org/wiki/Science_and_Civilisation_in_Ch...

    • MrBuddyCasino 3 hours ago
      This is a straw-man if I ever saw one.

      Practically no one is against hard science research, properly conducted. The issues are rampant fraud / p-hacking / unreproducible garbage mixed with an unhealthy dose of ideological monoculture and indoctrination, garnished with rising tuition prices while sitting on huge endowments in case of the Ivy Leagues.

      • eru 2 hours ago
        > Practically no one is against hard science research, properly conducted.

        As long as you do that with your own money (or money got freely given from other people), sure.

        If you use taxpayer money, that's a different game.

        • MrBuddyCasino 1 hour ago
          There is a long list of grievances I have regarding the (mis)use of taxpayer money, and funding the hard sciences is way, way down. I can’t even see it from where I stand.
      • jcattle 3 hours ago
        Yes all good points showing issues that academia has at the moment.

        However I often see this going from "there's issues" to discounting academia altogether and positioning private labs as a good or only alternative.

        After all, most people in the Open Science Collaboration, which published the seminal paper kicking off the replication crisis, were from academia.

        • MrBuddyCasino 2 hours ago
          Yes, there is no substitute for academia. Monopolists' research labs get close (Bell Labs etc), but they tend to be more "applied".
    • pembrook 2 hours ago
      I feel like you're constructing a strawman to argue against. I visit this site almost daily and the prevailing sentiment is usually the polar opposite of what you're suggesting.

      If sentiment on HN were as you say, how could your pro-academia and anti-big tech comment be sitting at the top as the most upvoted comment?

  • emmelaich 5 hours ago
  • gillesjacobs 3 hours ago
    Which work has more value: the abstract description of a catalogue of potential model architectures or their validated application trained on real data?

    In the Schmidhuber case there are 20 years and a chain of countless other works in between the two.

  • jacknews 6 hours ago
    Surely the roots, if we skip over the early perceptron work, are in backpropagation and Hinton, and the work going on at Edinburgh and elsewhere in the 80s.

    Indeed I remember buying a set of three conference-papers-as-books around that time, titled Artificial Neural Networks .. proceedings of the whatever the conference was.

    No doubt Schmidhuber made important contributions, but I see him pop up claiming to be the 'root' of it all every couple of years.

    • h8hawk 5 hours ago
      Hinton did not invent backpropagation.

      related paragraph from Wikipedia:

      Modern backpropagation was first published by Seppo Linnainmaa as "reverse mode of automatic differentiation" (1970)[26] for discrete connected networks of nested differentiable functions.[27][28][29]

      In 1982, Paul Werbos applied backpropagation to MLPs in the way that has become standard.

      • ogrisel 4 hours ago
        Paul Werbos did not apply backprop to MLPs as cleanly described in Hinton's paper, but rather to some kind of autoregressive non-linear parametrized functions with a much more specific application scope.

        Both papers are direct applications of the chain rule applied to estimate the gradient of a multivariate function.
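
        As a rough illustration of that point (not the formulation used in either paper), here is a minimal sketch of the chain rule giving the gradient of a tiny two-parameter network. The network shape, the function names `loss_and_grads` and `numeric_grad`, and all values are assumptions chosen only for the example.

```python
import math


def loss_and_grads(x, w1, w2, target):
    """Loss and its gradients for y = w2 * tanh(w1 * x), L = 0.5 * (y - target)^2."""
    # Forward pass, keeping the intermediate values
    h = math.tanh(w1 * x)
    y = w2 * h
    loss = 0.5 * (y - target) ** 2

    # Backward pass: apply the chain rule from the loss back to each parameter
    dloss_dy = y - target
    dloss_dw2 = dloss_dy * h                   # dy/dw2 = h
    dloss_dh = dloss_dy * w2                   # dy/dh = w2
    dloss_dw1 = dloss_dh * (1.0 - h * h) * x   # d tanh(u)/du = 1 - tanh(u)^2, du/dw1 = x
    return loss, dloss_dw1, dloss_dw2


def numeric_grad(f, v, eps=1e-6):
    """Central finite difference, used only to sanity-check the analytic gradients."""
    return (f(v + eps) - f(v - eps)) / (2 * eps)


x, w1, w2, target = 0.5, 0.8, -1.2, 0.3
loss, g1, g2 = loss_and_grads(x, w1, w2, target)
check_g1 = numeric_grad(lambda v: loss_and_grads(x, v, w2, target)[0], w1)
check_g2 = numeric_grad(lambda v: loss_and_grads(x, w1, v, target)[0], w2)
```

        The analytic gradients `g1` and `g2` should agree with the finite-difference estimates up to numerical error; reusing `dloss_dy` for both parameters is the "reverse mode" part.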

    • hyttioaoa 5 hours ago
      That's what bugs me about him. So much work has gone into today's models that calling his contributions "the root" isn't really warranted. He's always complaining that Hinton, LeCun, and Bengio get more credit than they deserve, and now he's over-claiming himself.
      • BoredPositron 4 hours ago
        Both can be right.
        • HarHarVeryFunny 29 minutes ago
          They could be, but they really aren't.

          Name a single aspect of something modern like the Transformer architecture or how it is trained, that is even indirectly attributable to Schmidhuber.

          No doubt he'd be jumping up and down wanting to take credit for residual connections, but where was Schmidhuber in the ImageNet era when everyone else was discovering how to build deep neural nets? Why didn't Schmidhuber invent ResNets, but instead waited until someone else (Kaiming He) did, then claim credit for it?

          I'll bet Schmidhuber isn't done yet ... when someone eventually comes up with an architecture for AGI, Schmidhuber will come out of the woodwork and point to a note he made on a napkin in 1800 that predicted it all.

    • emil-lp 5 hours ago
      Surely the roots go back to Turing, Gödel, Hilbert, Frege, Leibniz, Aristotle.
  • jongjong 2 hours ago
    It's crazy to think that if Elon Musk hadn't mentioned Schmidhuber, most people would have no idea.

    It's nauseating how all the researchers who happened to work for big tech got tons of media coverage but Schmidhuber and his team were getting zero coverage yet they made massive contributions. I bet there are many others not mentioned.

    Nobody even knows about Frank Rosenblatt. It's insane how distorted our perception of innovation is.

    Even science has been corrupted. It makes one doubt every story we're told about who invented what.

    • gom_jabbar 1 hour ago
      Yes, Rosenblatt is another good example. I recently looked deeper into the development of the perceptron and it's absolutely fascinating.
  • storus 1 hour ago
    Instead of focusing on the future, EU is busy rewriting history to please some eccentric researcher that claims he invented it all.
    • greggoB 34 minutes ago
      How does the EU feature in TFA exactly?
      • storus 3 minutes ago
        There seems to be a coordinated push around Schmidhuber all around media in the EU, even LinkedIn is full of "random" posts about him in the past week.
    • impossiblefork 1 hour ago
      Schmidhuber isn't in the EU, nor Switzerland at the moment.
  • sagex 3 hours ago
    I believe the invention of Transformers, and especially the attention mechanism, was influenced by past research, but definitely not only by Schmidhuber's work. That said, if we removed the papers mentioned by Schmidhuber from history, I am quite certain it would have had no effect on the discovery of Transformers, hence his works cannot be the root. He has to grow up and accept that work and equations can appear similar; looking at the inverse-square law and saying Newton stole it from someone is dishonest.