This is crazy impressive, and the fact they have the whole thing running with a PBR texturing pipeline is really cool.
That being said, I wonder if the use of signed distance fields (SDFs) results in bad topology.
Earlier this week I saw a recently released paper that seems to build "game-ready" topology --- stuff that might actually be riggable for animation.
https://github.com/buaacyw/MeshAnything
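To make the SDF concern concrete: marching-cubes-style extraction emits triangles for every grid cell the surface crosses, so the triangle count tracks grid resolution rather than the shape's structure. A toy sketch of my own (not from the paper), just counting surface-crossing cells on a sphere SDF:

```python
import numpy as np

def sphere_sdf(pts, radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(pts, axis=-1) - radius

def count_surface_cells(res):
    """Count grid cells the zero level set passes through at a given resolution.
    Marching cubes would emit a few triangles per such cell, so triangle count
    scales with grid resolution, not with the shape's actual complexity."""
    xs = np.linspace(-1.5, 1.5, res)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)
    sdf = sphere_sdf(grid)
    # A cell crosses the surface if its 8 corners don't all share the same sign.
    corners = np.stack([
        sdf[i:res - 1 + i, j:res - 1 + j, k:res - 1 + k]
        for i in (0, 1) for j in (0, 1) for k in (0, 1)
    ], axis=-1)
    crossing = (corners.min(axis=-1) < 0) & (corners.max(axis=-1) > 0)
    return int(crossing.sum())

# Doubling resolution roughly quadruples the surface cells (hence triangles),
# even though the shape itself is as simple as it gets.
print(count_surface_cells(32), count_surface_cells(64))
```

That uniform, resolution-driven tessellation is exactly what makes SDF-extracted meshes awkward to rig compared to hand-made edge loops.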
I can't wait for this to become usable. I love VR, but the content generation is just sooooo labour intensive. Help creating 3D models would go a long way and be the #1 enabler for the metaverse, IMO.
VR is especially unforgiving of "fake" detailing; you need as much detail as possible in the actual geometry to really sell it. That's the opposite of how these models currently work: they output goopy low-res geometry and approximate most of the detailing with textures, which would be immediately obvious with stereoscopic depth perception.
They seem to admit as much in Table 1 which indicates this model is not capable of "clean topology". Somewhat annoyingly, they do not discuss topology anywhere else in the paper (at least, I could not find the word "topology" via Ctrl+F).
Credit where it's due, unlike most of these papers they do at least show some of their models sans textures on page 11, so you can see how undefined the actual geometry is (e.g. none of the characters have eyes until they are painted on).
Such a silly argument. Fixing topology is a nearly solved problem in geometry processing. (Or just start with a good topology and 'paste' a texture onto it like they develop techniques for here.)
Depends what you're talking about and what your criteria are. In gamedev, studios typically use a retopology tool like Topogun (https://www.topogun.com/) to aid in the creation of efficient topologies, but it's still a manual task, as different topologies have different tradeoffs in terms of poly count, texture detail, options for how the model deforms when animated, etc.

For example, you may know that you're working on a model of a player character in a 3rd-person game where the camera is typically behind you, so you want to spend more of your budget on the _back_ of the model than the _front_, because the player is usually looking at their character's back.

If your criterion is "find the minimum number of polygons", sure, it's solved. But that's just one of many different goals, and not the one typically used in gamedev, which I assume to be a primary audience of this research.
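To make the budgeting point concrete, here's a toy sketch (the function and all the numbers are made up for illustration) of splitting a fixed triangle budget across mesh regions in proportion to how often the player actually sees each one:

```python
# Toy view-weighted polygon budgeting (all numbers hypothetical).
def allocate_budget(total_tris: int, view_time: dict) -> dict:
    """Split a triangle budget across regions proportionally to screen time."""
    total_time = sum(view_time.values())
    return {region: round(total_tris * t / total_time)
            for region, t in view_time.items()}

# Third-person camera: the back of the character dominates screen time,
# so it gets the bulk of the budget.
budget = allocate_budget(
    total_tris=20_000,
    view_time={"back": 0.70, "front": 0.15, "sides": 0.15},
)
print(budget)
```

A minimal-polygon solver has no notion of this kind of context, which is the point: "good topology" depends on how the asset will be used.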
It's an essential skill for reading scientific papers to notice what isn't there. It's as important as what is there.
In my field, analog IC design, when we hit a wall we often do a literature review with a colleague, and more often than not the results are not relevant for commercial application. Forget about Monte Carlo; sometimes there aren't even full PVT corners.
I tried all the recent wave of text/image-to-3D-model services, some touting $100MM+ valuations and tens of millions raised, and found them all to produce unusable garbage.
I have too, and you’re quite right. Also the various 2D-to-3D face generators are mostly awful. I’ve done a deep dive on that and nearly all of them seem to only create slight perturbations on some base model, regardless of the input.
I’m puzzled by the poor texture quality in these. The colours are just bad: it looks like the textures are blown out (the detail at the bright end clips to white) and much too contrasty (the turkey does that transition from red to white via a band of yellow). I wonder why that is. Was the training data just done on the cheap?
In the comparison between the models, only Rodin seems to produce clean topology. Hopefully in the future we will see a model with the strengths of both, ideally from Meta, as Rodin is a commercial model.
Can somebody please, please integrate SAM with 3D-primitive RAGging? That's the holy grail solution as a 3D modeler; the "blobs" generated by Luma and the like aren't very useful.
I think what they did here was go text prompt -> generate multiple 2d views -> reconstruction network to go multiple 2d images to 3d representation -> mesh extraction from 3d representation.
That's a long way of saying, no, I don't think that this introduces a component that specifically goes 2d -> 3d from a single 2d image.
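If I've read it right, the rough shape of that pipeline is something like the sketch below. Stub stages only: every name and internal here is made up for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical stage names; the paper's actual components will differ.
def generate_views(prompt: str, n_views: int = 4, size: int = 64) -> np.ndarray:
    """Stand-in for a text-to-multiview diffusion stage: returns n RGB views."""
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    return rng.random((n_views, size, size, 3))

def reconstruct_volume(views: np.ndarray, res: int = 32) -> np.ndarray:
    """Stand-in for the reconstruction network: multiview images -> an SDF grid."""
    mean_intensity = views.mean()
    xs = np.linspace(-1, 1, res)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)
    # Toy "volume": a sphere SDF whose radius depends on the input images.
    return np.linalg.norm(grid, axis=-1) - (0.5 + 0.4 * mean_intensity)

def extract_mesh(volume: np.ndarray) -> int:
    """Stand-in for mesh extraction (e.g. marching cubes): here we just count
    sign changes along one axis, a proxy for how many triangles get emitted."""
    crossings = np.diff(np.sign(volume), axis=0) != 0
    return int(crossings.sum())

views = generate_views("a turkey")    # text -> multiple 2D views
volume = reconstruct_volume(views)    # views -> 3D representation
tris = extract_mesh(volume)           # 3D representation -> mesh
```

Note there's no single-image 2D-to-3D stage anywhere in that chain; the 3D representation is only ever fit to the generated multiview set.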
Every time I see text-to-3D, it’s ALWAYS textured. That is the obvious give-away that it is still garbage.
Show me text to wireframe that looks good and I’ll get excited.
Or, just throw a PS1 filter on top and make some retro games
The wireframe is going to be unrecognizably bad.
Still a ways to go.
Expectation vs. reality: https://i.imgur.com/82R5DAc.png
It absolutely does. But great, let's look forward to Printables being ruined by off-model nonsense.
https://arstechnica.com/information-technology/2024/06/ridic...
Question: What is the current state of the art commercially available product in that niche?
But for 3D gen it's using a more flexible model:
https://assetgen.github.io/
It can be conditioned on text or image.
He still needs a moat with its own ecosystem, like the iPhone.
[0] https://hyperhuman.deemos.com/rodin