Meta 3D Gen

(ai.meta.com)

106 points | by meetpateltech 2 days ago

11 comments

  • LarsDu88 2 days ago
    This is crazy impressive, and the fact they have the whole thing running with a PBR texturing pipeline is really cool.

    That being said, I wonder if the use of signed distance fields (SDFs) results in bad topology.

    I saw a recently released paper earlier this week that seems to build "game-ready" topology --- stuff that might actually be riggable for animation. https://github.com/buaacyw/MeshAnything
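For readers unfamiliar with the term: a signed distance field represents a surface implicitly, as the zero level set of a distance function. Mesh extraction (e.g. marching cubes) then hunts for sign changes on a grid, which tends to yield dense, uniform triangles rather than artist-friendly edge loops. A minimal Python sketch of the idea, using a sphere as an illustrative example (not from the paper):

```python
# Minimal signed distance field (SDF) example: a sphere of radius 1.
# The surface is the zero level set; mesh extraction algorithms like
# marching cubes locate sign changes on a sample grid, which produces
# evenly tessellated but topology-agnostic meshes.
import math

def sphere_sdf(x, y, z, radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.sqrt(x * x + y * y + z * z) - radius

# Sample along a ray from the origin: the sign flip marks the surface.
for t in (0.0, 0.5, 1.0, 1.5):
    print(f"t={t:.1f}  sdf={sphere_sdf(t, 0.0, 0.0):+.2f}")
```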

  • wkat4242 2 days ago
    I can't wait for this to become usable. I love VR but the content generation is just sooooo labour intensive. Help creating 3D models would help so much and be the #1 enabler for the metaverse IMO.
    • jsheard 2 days ago
      VR is especially unforgiving of "fake" detailing, you need as much detail as possible in the actual geometry to really sell it. That's the opposite of how these models currently work: they output goopy low-res geometry and approximate most of the detailing with textures, which would be immediately obvious with stereoscopic depth perception.
      • SV_BubbleTime 2 days ago
        Agreed.

        Every time I see text to 3D, it’s ALWAYS textured. That is the obvious give-away that it is still garbage.

        Show me text to wireframe that looks good and I’ll get excited.

  • 999900000999 2 days ago
    Would love for an artist to provide some input, but I imagine this could be really good if it generates models that you can edit or start from later.

    Or, just throw a PS1 filter on top and make some retro games

  • explaininjs 2 days ago
    Looks fine, but you can tell the topology isn’t good based on the lack of wireframes.
    • tobyjsullivan 2 days ago
      They seem to admit as much in Table 1 which indicates this model is not capable of "clean topology". Somewhat annoyingly, they do not discuss topology anywhere else in the paper (at least, I could not find the word "topology" via Ctrl+F).
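One cheap sanity check for whether a generated mesh is even structurally sound (my own illustration, not something the paper does): for a closed, watertight, genus-0 triangle mesh the Euler characteristic V - E + F must equal 2, and "blobby" generated meshes often fail this or carry degenerate faces.

```python
# Euler characteristic check: V - E + F == 2 for a watertight,
# genus-0 triangle mesh. Faces are triples of vertex indices.
def euler_characteristic(faces):
    vertices = {v for face in faces for v in face}
    # Each undirected edge is counted once, as a sorted vertex pair.
    edges = {tuple(sorted((face[i], face[(i + 1) % 3])))
             for face in faces for i in range(3)}
    return len(vertices) - len(edges) + len(faces)

# A tetrahedron: 4 vertices, 6 edges, 4 triangular faces.
tetra = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(euler_characteristic(tetra))  # 4 - 6 + 4 = 2
```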
    • jsheard 2 days ago
      Credit where it's due, unlike most of these papers they do at least show some of their models sans textures on page 11, so you can see how undefined the actual geometry is (e.g. none of the characters have eyes until they are painted on).
      • SV_BubbleTime 2 days ago
        Sans texture is not wireframe though. They have a texture, it’s just all white.

        The wireframe is going to be unrecognizably bad.

        Still a ways to go.

    • dyauspitr 2 days ago
      That doesn’t matter for things like 3D printing and CNC machining. Additionally, there are other mesh fixer AI tools. This is going to be gold for me.
      • jsheard 2 days ago
        However if you 3DP/CNC these you'll only get the base shape, without any of the fake details it painted over the surface.

        Expectation vs. reality: https://i.imgur.com/82R5DAc.png

      • eropple 2 days ago
        > That doesn’t matter for things like 3D printing and CNC machining

        It absolutely does. But great, let's look forward to Printables being ruined by off-model nonsense.

        • SV_BubbleTime 2 days ago
          It matters so much more, GP is just being hopeful and soon to be disappointed.
    • nuz 2 days ago
      Such a silly argument. Fixing topology is a nearly solved problem in geometry processing. (Or just start with a good topology and 'paste' a texture onto it like they develop techniques for here.)
      • zemo 2 days ago
        depends on what you're talking about and what your criteria are. In gamedev, studios typically use a retopology tool like Topogun (https://www.topogun.com/) to aid in the creation of efficient topologies, but it's still a manual task, as different topologies have different tradeoffs in terms of poly count, texture detail, options for how the model deforms when animated, etc. For example, you may know that you're working on a model of a player character in a 3rd person game where the camera is typically behind you, so you want to spend more of your budget on the _back_ of the model than the _front_, because the player is typically looking at their character's back. If your criterion is "find the minimum number of polygons", sure, it's solved. But that's just one of many different goals, and not the goal typically pursued in gamedev, which I assume to be a primary audience of this research.
      • explaininjs 2 days ago
        No… it’s not. But if you know something I don’t the 5 primes will certainly be happy to pay you handsomely for the implementation!
        • nuz 2 days ago
          • explaininjs 2 days ago
            A piece of software that hasn’t been touched in 5 years, let alone adopted in any professional production environment? Cool…
            • portaouflop 2 days ago
              AFAICT it’s used in professional applications and software does not need to be constantly updated, especially if it’s not for the web.
      • RicoElectrico 2 days ago
        It's an essential skill for reading scientific papers to notice what isn't there. It's as important as what is there.

        In my field, analog IC design, if we hit a wall, we often do some literature review with a colleague, and more often than not the results are not relevant for commercial application. Forget about Monte Carlo; sometimes there aren't even full PVT corners.

  • iamleppert 2 days ago
    I tried the whole recent wave of text/image-to-3D model services, some touting 100 MM+ valuations and tens of millions raised, and found them all to produce unusable garbage.
    • architango 2 days ago
      I have too, and you’re quite right. Also the various 2D-to-3D face generators are mostly awful. I’ve done a deep dive on that and nearly all of them seem to only create slight perturbations on some base model, regardless of the input.
    • jampekka 2 days ago
      The gap from demos/papers to reality is huge. ML has a bad replication crisis.
  • rebuilder 2 days ago
    I’m puzzled by the poor texture quality in these. The colours are just bad - it looks like the textures are blown out (the detail at the bright end clips into white) and much too contrasty (the turkey does that transition from red to white via a band of yellow). I wonder why that is - was the training data just done on the cheap?
    • firtoz 2 days ago
      It seems to do very well compared to the alternatives, but there's indeed a long way to go.
  • anditherobot 2 days ago
    Can this potentially support:
    - Image input to 3D model output
    - 3D model (format) as input

    Question: What is the current state of the art commercially available product in that niche?

    • egnehots 2 days ago
      This is a pipeline for text to 3D.

      But for the 3D gen step it's using a model that is more flexible:

      https://assetgen.github.io/

      It can be conditioned on text or image.

    • moffkalast 2 days ago
      Meshroom, if you have enough images ;)
  • Simon_ORourke 2 days ago
    Are those guys still banging on about that Metaverse? That's taken a decided back seat to all the AI innovation in the past 18 months.
    • dvngnt_ 2 days ago
      zuck has said before that ML will help make the "metaverse" more viable.

      he still needs a moat with its own ecosystem like the iphone

  • GaggiX 2 days ago
    In the comparison between the models, only Rodin seems to produce clean topology. Hopefully in the future we will see a model with the strengths of both, ideally from Meta, as Rodin is a commercial model.
  • localfirst 2 days ago
    can somebody please please integrate SAM with 3D primitive RAGging? As a 3D modeler, this is the golden chalice solution; the "blobs" generated by Luma and the like aren't very useful
  • kgraves 2 days ago
    Can this be used for image to 3D generation? What is the SOTA in this area these days?
    • Fripplebubby 2 days ago
      I think what they did here was: text prompt -> generate multiple 2D views -> reconstruction network to go from multiple 2D images to a 3D representation -> mesh extraction from the 3D representation.

      That's a long way of saying: no, I don't think this introduces a component that specifically goes from a single 2D image to 3D.
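The multi-stage flow described above can be sketched with stubbed-out stages. Everything below is hypothetical scaffolding to show the data flow, not Meta's actual code; the function names and shapes are invented for illustration:

```python
# Hypothetical sketch of a text -> multi-view -> 3D pipeline, with
# stubs standing in for the actual networks. Only the data flow is real.

def generate_views(prompt, n_views=4, size=8):
    """Stub for a text-to-image model producing n camera views."""
    # Each 'image' is a size x size grid of grayscale values.
    return [[[0.5] * size for _ in range(size)] for _ in range(n_views)]

def reconstruct_volume(views, res=4):
    """Stub for a multi-view reconstruction network -> 3D scalar grid."""
    # Positive values mark 'inside'; here, a simple corner-shaped region.
    return [[[1.0 if x + y + z < res else -1.0
              for z in range(res)] for y in range(res)] for x in range(res)]

def extract_mesh(volume):
    """Stub mesh extraction: collect grid cells where the field is inside."""
    return [(x, y, z)
            for x, plane in enumerate(volume)
            for y, row in enumerate(plane)
            for z, v in enumerate(row) if v > 0]

views = generate_views("a garden gnome")
volume = reconstruct_volume(views)
mesh_cells = extract_mesh(volume)
print(len(views), len(mesh_cells))
```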

    • tobyjsullivan 2 days ago
      The paper suggests Rodin Gen-1 [0] is capable of image-to-shape generation.

      [0] https://hyperhuman.deemos.com/rodin