Grafeo – A fast, lean, embeddable graph database built in Rust

(grafeo.dev)

245 points | by 0x1997 22 days ago

25 comments

lmeyerov 22 days ago
Speaking of embeddable, we just announced cypher syntax for gfql, so it's the first OSS CPU/GPU cypher query engine you can use on dataframes.
Typically used with scaleout DBs like databricks & splunk for analytical apps: security/fraud/event/social data analysis pipelines, ML+AI embedding & enrichment pipelines, etc. We originally built it for the compute-tier gap here to help Graphistry users making embeddable interactive GPU graph viz apps and dashboards and not wanting to add an external graph DB phase into their interactive analytics flows.
Single GPU can do 1B+ edges/s, no need for a DB install, and can work straight on your dataframes / apache arrow / parquet: https://pygraphistry.readthedocs.io/en/latest/gfql/benchmark...
We took a multilayer approach to the GPU & vectorization acceleration, including a more parallelism-friendly core algorithm. This makes fancy features pay-as-you-go vs dragging everything down as in most columnar engines that are appearing. Our vectorized core conforms to over half of TCK already, and we are working to add trickier bits on different layers now that flow is established.
The core GFQL engine has been in production for a year or two now with a lot of analyst teams around the world (NATO, banks, US gov, ...) because it is part of Graphistry. The open-source cypher support is us starting to make it easy for others to directly use as well, including LLMs :)
Aurornis 22 days ago
Does anyone have any experience with this DB? Or context about where it came from?
From the commit history it's obvious that this is an AI coded project. It was started a few months ago, 99% of commits are from 1 contributor, and that 1 contributor has sometimes committed 100,000 lines of code per week. (EDIT: 200,000 lines of code in the first week)
I'm not anti-LLM, but I've done enough AI coding to know that one person submitting 100,000 lines of code a week is not doing deep thought and review on the AI output. I also know from experience that letting AI code the majority of a complex project leads to something very fragile, overly complicated, and not well thought out. I've been burned enough times by investigating projects that turned out to be AI slop with polished landing pages. In some cases the claimed benchmarks were improperly run or just hallucinated by the AI.
So is anyone actually using this? Or is this someone's personal experiment in building a resume portfolio project by letting AI run against a problem for a few months?
- StevenBtw 21 days ago
  Hi, I'm the one building grafeo, I have no idea why it is being posted everywhere. But I can probably answer your questions.
  The first version was largely a (slightly rearchitected) port of a local graph database I had been building called graphos. Most of the engine and core are handwritten, so are the python bindings and conformance tests. The rest is indeed largely AI generated, so is the documentation (Mkdocs). The AI generated parts are curated and validated, although it's not up to par for a production release yet.
  This is not a resume portfolio project and in no way related to my day job. I started writing grafeo (then graphos) out of frustration with Neo4j and being inspired by some discussions about database internals with Hännes from duckdb at a conference. I tried ladybug, but found memory usage insanely high and was sure I could do better. Anyone looking for an embedded battle tested graph database should probably still look at ladybug though. Grafeo is not that mature yet.
  And to be honest I also have no real plans with grafeo, I am using it myself for now and am very happy with it, but that's n=1. It's fully free and open source and contributors are very welcome, but it's also not yet fully where I would want it to be, hence the beta status. I have no commercial interest, but had a lot of fun pouring multiple hundreds of hours in and creating something that I enjoy using myself.
  Hope that clarifies some things!
  - adsharma 19 days ago
    Thank you for the shout out! I looked into your benchmark setup a bit. Two things going on:
    - Ladybug by default allocates 80% of the physical memory to the buffer pool. You can limit it. This wasn't the main reason.
    - Much of the RSS is in ladybug native memory connected to the python connection object. I noticed that you keep the connection open between benchmark runs. For whatever reason, python is not able to garbage collect the memory.
    We ran into similar lifetime issues with golang and nodejs bindings as well. Many race conditions where the garbage collector releases memory while another thread still has a reference to native memory. We now require that the connection be closed for the memory to be released.
```
  https://github.com/LadybugDB/ladybug/issues/320
  https://github.com/LadybugDB/go-ladybug/issues/7
  https://github.com/LadybugDB/ladybug-nodejs/pull/1
```
- jandrewrogers 22 days ago
  That is a lot of code for what appears to be a vanilla graph database with a conventional architecture. The thing I would be cautious about is that graph database engines in particular are known for hiding many sharp edges without a lot of subtle and sophisticated design. It isn't obvious that the necessary level of attention to detail has been paid here.
  - adsharma 22 days ago
    Are you talking about the Andy Pavlo bet here?
    https://news.ycombinator.com/item?id=29737326
    Kuzu folks took some of these discussions and implemented them. SIP, ASP joins, factorized joins and WCOJ.
    Internally it's structured very similar to DuckDB, except for the differences noted above.
    DuckDB 1.5 implemented sideways information passing (SIP). And LadybugDB is bringing in support for DuckDB node tables.
    So the idea that graph databases have shaky internals stems primarily from pre 2021 incumbents.
    4 more years to go to 2030!
    - jandrewrogers 22 days ago
      I wasn't referring to the Pavlo bet but I would make the same one! Poor algorithm and architecture scalability is a serious bottleneck. I was part of a research program working on the fundamental computer science of high-scale graph databases ~15 years ago. Even back then we could show that the architectures you mention couldn't scale even in theory. Just about everyone has been re-hashing the same basic design for decades.
      As I like to point out, for two decades DARPA has offered to pay many millions of dollars to anyone who can demonstrate a graph database that can handle a sparse trillion-edge graph. That data model easily fits on a single machine. No one has been able to claim the money.
      Inexplicably, major advances in this area 15-20 years ago under the auspices of government programs never bled into the academic literature even though it materially improved the situation. (This case is the best example I've seen of obviously valuable advanced research that became lost for mundane reasons, which is pretty wild if you think about it.)
      - zozbot234 22 days ago
        What do you need one trillion edges for? Wikidata is a huge, general purpose knowledge graph and it gets away with ~1B triples, give or take.
        - jandrewrogers 22 days ago
        Almost all analytic graphs of general scope surpass 1T edges, see below. DARPA also has an unfilled objective for 1B edge real-time continuously updated operational graphs. These are smaller and the write throughput requirements are in line with non-graph analytical databases but graph databases struggle to meet that standard.
        There are countless smaller graphs for narrow domains that may be <1B edges but many people have the ambition to stitch together these narrow graphs into a larger graph. When stitching graphs together, the number of edges is usually super-linear. A billion edges is kind of considered “Hello World” for system testing.
        The Semantic Web companies in the 2000s had graphs that were 100B+ edges. They wanted to go much larger but hit hard scaling walls around that point. That scaling wall killed them.
        Classic mapping data models are typically 10-100B edges. These could be much, much larger if they could process all the data available to them.
        Of course, intelligence agencies had all kinds of graphs far beyond trillions of edges 20 years ago. People, places, things, events.
        Any type of spatiotemporal entity graphs with large geographic scope are quadrillions of edges. It isn’t just a lot of inferred relationships between entities, the relationships evolve over time which also must be captured. These are probably the most commercially valuable type of graph. You could build hundreds of different graphs of this type with 1T+ edges in most regions, never mind doing it at scale. These are so large that we usually don’t store them. Subgraphs are generated on demand, which is computationally expensive.
        These spatiotemporal entity graphs also have the largest write loads. Single sources generate tens of PB/day of new edges. There is a ton of industrial data that looks like this; it isn’t just people slinging structured data.
        Graphs are everywhere but we furiously avoid them because the scalability of operations over anything but severely constrained graphs is so poor. Selection bias.
        NSA in particular heavily funded foundational theoretical and applied computer science research into scaling graph computing for decades. They had all kinds of boring graphs where trillions of edges was their Tuesday. The US military also uses large graph databases in fairly boring applications that probably didn’t require a graph database.
      - adsharma 22 days ago
        > many millions of dollars to anyone who can demonstrate a graph database that can handle a sparse trillion-edge graph.
        I wonder why no one has claimed it. It's possible to compress large graphs to 1 byte per edge via Graph reordering techniques. So a trillion scale graph becomes 1TB, which can fit into high end machines.
        Obviously it won't handle high write rates and mutations well. But with Apache Arrow based compression, it's certainly possible to handle read-only and read-mostly graphs.
        Also the single machine constraint feels artificial. For any columnar database written in the last 5 years, implementing object store support is tablestakes.
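The "1 byte per edge, so about 1TB" arithmetic above can be checked with a few lines. This is only a back-of-the-envelope sketch: the 16-byte figure for an uncompressed edge (two 8-byte vertex ids) is an assumption added for comparison, not something stated in the thread.

```python
def graph_size_bytes(num_edges: int, bytes_per_edge: int) -> int:
    """Raw storage needed for the edge list alone (no properties, no indexes)."""
    return num_edges * bytes_per_edge


TRILLION = 10**12
TB = 10**12  # decimal terabyte

# 1 byte/edge is the compression figure quoted in the comment.
compressed_bytes = graph_size_bytes(TRILLION, 1)

# Assumption for comparison: an uncompressed edge stored as two 8-byte vertex ids.
naive_bytes = graph_size_bytes(TRILLION, 16)

print(f"compressed: {compressed_bytes / TB:.0f} TB")
print(f"naive:      {naive_bytes / TB:.0f} TB")
```

Either number covers only the topology; edge properties and any write buffers would come on top.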
        - jandrewrogers 22 days ago
        Achieving adequate performance at 1T edges in one aspect requires severe tradeoffs in other aspects, making every implementation impractical at that scale. You touched on a couple of the key issues when I was working in this domain.
        There is no single machine constraint, just the observation that we routinely run non-graph databases at similar scale on single machines without issue. It doesn't scale on in-memory supercomputers either, so the hardware details are unrelated to the problem:
        - Graph database with good query performance typically has terrible write performance. It doesn't matter how fast queries are if it takes too long to get data into the system. At this scale there can be no secondary indexing structures into the graph; you need a graph cutting algorithm efficient for both scalable writes and join recursion. This was solved.
        - Graph workloads break cache replacement algorithms for well-understood theory reasons. Avoiding disk just removes one layer of broken caching among many but doesn't address the abstract purpose for which a cache exists. This is why in-memory systems still scale poorly. We've known how to solve this in theory since at least the 1980s. The caveat is it is surprisingly difficult to fully reduce to practice in software, especially at scale, so no one really has. This is a work in progress.
        - Most implementations use global synchronization barriers when parallelizing algorithms such as BFS, which greatly increases resource consumption while throttling hardware scalability and performance. My contribution to research was actually in this area: I discovered a way to efficiently use error correction algorithms to elide the barriers. I think there is room to make this even better but I don't think anyone has worked on it since.
        The pathological cache replacement behavior is the real killer here. It is what is left even if you don't care about write performance or parallelization.
        I haven't worked in this area for many years but I do keep tabs on new graph databases to see if someone is exploiting that prior R&D, even if developed independently.
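For readers unfamiliar with the barrier problem mentioned above, here is a minimal single-threaded sketch of level-synchronous BFS. It is not the commenter's barrier-eliding technique; it only marks where a conventional parallel implementation would have to synchronize. The function name and the dict-of-lists graph format are assumptions made for the example.

```python
def level_synchronous_bfs(adj, source):
    """Return {vertex: depth}. Each iteration of the while loop is one BFS level."""
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:
        next_frontier = []
        # A parallel version would split `frontier` across workers here.
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in depth:
                    depth[v] = level + 1
                    next_frontier.append(v)
        # Global barrier: no worker may start level + 1 until every worker
        # has finished this level, because `next_frontier` must be complete.
        frontier = next_frontier
        level += 1
    return depth


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(level_synchronous_bfs(graph, 0))
```

The cost the comment describes comes from that barrier being hit once per level on every worker, regardless of how unevenly the frontier is distributed.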
      - mleonhard 22 days ago
        > Inexplicably, major advances in this area 15-20 years ago under the auspices of government programs never bled into the academic literature even though it materially improved the situation.
        Would you please share some more info about this? Were the advances implemented in software and never written up and published? What are the names of the government programs?
      - rossjudson 22 days ago
        I guess it all depends on the meaning of the word "handle", and what the use cases are.
    - darkteflon 22 days ago
      KuzuDB, now in [maintenance mode](https://github.com/kuzudb/kuzu). Quite annoyed about that one, was using it extensively.
      - gdotv 20 days ago
        LadybugDB (https://github.com/LadybugDB/ladybug) at this point seems to be the only sustainable fork. When deciding what to do about the Kuzu archival on https://gdotv.com, we've gone with maintaining support for the last available version of Kuzu (it's still heavily used from what I'm seeing) whilst introducing support for LadybugDB. I've looked into a few other forks and at this point in time none seem to be actively maintained for more than a few weeks before getting dropped.
        - darkteflon 19 days ago
        Thanks, that’s interesting - I didn’t know that Ladybug was a KuzuDB fork.
    - adsharma 22 days ago
      Source: https://www.theregister.com/2023/03/08/great_graph_debate_we...
      > There are some additional optimizations that are specific to graphs that a relational DBMS needs to incorporate: [...]
      This is essentially what Kuzu implemented and DuckDB tried to implement (DuckPGQ), without touching relational storage.
      The jury is out on which one is a better approach.
  - justonceokay 22 days ago
    Yes, a graph database will happily lead you down an n^3 (or worse!) path when trying to query for a single relation if you are not wise about your indexes, etc.
    - cluckindan 22 days ago
      That sounds like a ”graph” DB which implements edges as separate tables, like building a graph in a standard SQL RDB.
      If you wish to avoid that particular caveat, look for a graph DB which materializes edges within vertices/nodes. The obvious caveat there is that the edges are not normalized, which may or may not be an issue for your particular application.
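The difference between the two layouts described above can be sketched in a few lines. This is a toy illustration with made-up names, not how any particular engine stores data: the first function scans an unindexed edge table, the second keeps each node's outgoing edges with the node.

```python
# Edge-table layout: without an index, every neighbor lookup scans all edges.
edge_table = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]


def neighbors_scan(edges, node):
    return [dst for src, dst in edges if src == node]


# Materialized layout: each node carries its own outgoing edges.
def build_adjacency(edges):
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
        adjacency.setdefault(dst, [])  # make sure sink nodes exist too
    return adjacency


def two_hop(adjacency, node):
    """Nodes reachable in exactly two hops, following stored edges only."""
    return sorted({w for v in adjacency[node] for w in adjacency[v]})


adjacency = build_adjacency(edge_table)
print(neighbors_scan(edge_table, "a"))
print(two_hop(adjacency, "a"))
```

Chaining `neighbors_scan` for a k-hop query multiplies full scans, which is where the polynomial blow-up comes from; the adjacency layout pays instead with duplicated, non-normalized edge data.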
    - adsharma 22 days ago
      Are you talking about the query plan for scanning the rel table? Kuzu used a hash index and a join.
      Trying to make it optional.
      Try
      explain match (a)-[b]->(c) return a.rowid, b.rowid, c.rowid;
  - stult 22 days ago
    It certainly does seem problematic to have a graph database hiding edges, sharp or not
- gdotv 22 days ago
  Agreed, there's been a literal explosion in the last 3 months of new graph databases coded from scratch, clearly largely LLM assisted. I'm having to keep track of the industry quite a bit to decide what to add support for on https://gdotv.com and frankly these days it's getting tedious.
  - piyh 22 days ago
    I'm turning off my brain and using neo4j
    - gdotv 22 days ago
      proof that Neo4j won the popularity contest!
    - UltraSane 22 days ago
      Neo4j is pretty nice.
  - aorth 22 days ago
    Figurative!
- ozgrakkurt 22 days ago
  Using a LLM coded database sounds like hell considering even major databases can have some rough edges and be painful to use.
- algolint 22 days ago
  [flagged]
- hrmtst93837 22 days ago
  [flagged]
- arthurjean 22 days ago
  Sounds about right for someone who ships fast and iterates. 54 days for a v0 that probably needs refactoring isn't that crazy if the dev has a real DB background. We've all seen open source projects drag on for 3 years without shipping anything, that's not necessarily better
  - Aurornis 22 days ago
    200,000 lines of code on week 1 is not a sign of a quality codebase with careful thought put into it.
    > We've all seen open source projects drag on for 3 years without shipping anything, that's not necessarily better
    There are more options than “never ship anything” and “use AI to slip 200,000 lines of code into a codebase”
  - TheJord 22 days ago
    shipping fast matters a lot less than shipping something you actually understand. 200k lines in a week means nobody knows what's in there, including the author. that's not a codebase, it's a liability
cjlm 22 days ago
Overwhelmed by the sheer number of graph databases? I released a new site this week that lists and categorises them. https://gdb-engines.com
- Sytten 22 days ago
  Knowing if it is embeddable or server would be nice in that table
  - cjlm 21 days ago
    Yes, I have the "embedded" kind in there but a dedicated column would be nice. Thanks!
- dbacar 22 days ago
  Did you generate the list using an LLM?
  - cjlm 22 days ago
    I was inspired by https://arxiv.org/abs/2505.24758 and collated their assessment into a table and then just kept adding databases :)
    Claude helped a lot but it's all reviewed and curated by me.
natdempk 22 days ago
Serious question: are there any actually good and useful graph databases that people would trust in production at reasonable scale and are available as a vendor or as open source? eg. not Meta's TAO
- szarnyasg 22 days ago
  That's a difficult question and I would like to avoid giving a direct answer (because I co-lead a nonprofit benchmarking graph databases) but even knowing what you need for a graph database can be a tricky decision. See my FOSDEM 2025 talk, where I tried to make sense of the field:
  https://archive.fosdem.org/2025/schedule/event/fosdem-2025-5...
- cjlm 22 days ago
  Serious answer: limiting to just Open Source: JanusGraph, DGraph, Apache AGE, HugeGraph, MemGraph and ArcadeDB all meet that criteria.
  - adsharma 22 days ago
    What is open source and what is a graph database are both hotly debated topics.
    Author of ArcadeDB critiques many nominally open source licenses here:
    https://www.linkedin.com/posts/garulli_why-arcadedb-will-nev...
    What is a graph database is also relevant:
```
  - Does it need index free adjacency?
  - Does it need to implement compressed sparse rows?
  - Does it need to implement ACID?
  - Does translating Cypher to SQL count as a graph database?
```
- adsharma 22 days ago
  What people perceive as "Facebook production graph" is not just TAO. There is an ecosystem around it and I wrote one piece of it.
  Full history here: https://www.linkedin.com/pulse/brief-history-graphs-facebook...
- flyingsilverfin 22 days ago
  I run the development of TypeDB, which doesn't use Cypher but works really well as a graph database. Certainly it, and other graph databases like neo4j, are used in production at scale. However, a lot of oss databases are open core on some level, it just depends on where they draw the line. We draw it at clustering/high availability for the time being, the rest is in the CE version.
- gdotv 22 days ago
  plenty of those - I've had to work with dozens of different graph databases integrating them on https://gdotv.com, save for maybe 1-2 exceptions in the list of supported databases on our website, they're all production ready and either backed by a vendor or open-source (or sometimes both, e.g. Apache AGE for Azure PostgreSQL). There are some technologies that have been around for a long time but really flying under the radar, despite being used a lot in enterprise (e.g. JanusGraph).
- pphysch 22 days ago
  Yeah: Postgres, etc.
  When you actually need to run graph algorithms against your relational data, you export the subset of that data into something like Grafeo (embedded mode is a big plus here) and run your analysis.
  - adsharma 22 days ago
    That importing is expensive and prevents you from handling billion scale graphs.
    It's possible to run cypher against duckdb (soon postgres as well via duckdb's postgres extension) without having to import anything. That's a game changer when everything is in the same process.
- lvca 19 days ago
  [dead]
adsharma 22 days ago
There are 25 graph databases all going me too in the AI/LLM driven cycle.
Writing it in Rust gets visibility because of the popularity of the language on HN.
Here's why we are not doing it for LadybugDB.
Would love to explore a more gradual/incremental path.
Also focusing on just one query language: strongly typed cypher.
https://github.com/LadybugDB/ladybug/discussions/141
- tadfisher 22 days ago
  Is LadybugDB not one of these 25 projects?
  - adsharma 22 days ago
    LadybugDB is backed by this tech (I didn't write it)
    https://vldb.org/cidrdb/2023/kuzu-graph-database-management-...
    You can judge for yourself what work has been done in the last 5 months. Many short videos here. New open source contributors who I didn't know before ramping up.
    https://youtube.com/@ladybugdb
  - wartywhoa23 22 days ago
    Those 25 are me too; this one is a me as well /s.
- pjmlp 22 days ago
  Good decision, as proven multiple times, it is the product not the programming language, that makes the customers.
- Sytten 22 days ago
  I really wish people would stop using the language as an argument and that commenter would also move on to a more interesting debate.
  In your discussion the first comment from an ex kuzu dev made an excellent point that rust for databases is an excellent language to ship faster with confidence while reducing real problems of concurrency and corruption.
  At some point it becomes intellectual dishonesty to dismiss a language because of vibes instead of merit.
  - adsharma 21 days ago
    I didn't dismiss the language. I called it a north star. Rust is still the best option if you desire memory safety.
    But rewriting a complex working piece of software in Rust is not trivial. Having an incremental path (where only parts are rewritten in Rust and compatible with C++ code) would be a good path to get there.
    Also open to new code and extensions getting written in Rust.
satvikpendem 22 days ago
There seem to be a lot of these; how does it compare to Helix DB, for example? Also, why would you ever want to query a database with GraphQL, which was explicitly not made for that purpose?
mark_l_watson 22 days ago
I just spent an hour with Grafeo, trying to also get the associated library grafeo_langchain working with a local Ollama model. Mixed results. I really like the Python Kuzu graph database, still use it even though the developers no longer support it.
- gdotv 22 days ago
  Ever try https://gdotv.com with it? Really interesting to see folks still using Kuzu despite the archival status. We decided to maintain support for that reason, it's been left in a fairly stable state which is fantastic. Might be worth checking out LadybugDB (the main fork), migration is pretty easy.
Kalizazi 21 days ago
Weird project, it's definitely AI assisted, high LoC, but when you see the commits it doesn't look like the average AI slop, and the design is definitely not conventional.
JS tests seem fully AI generated though. And big difference in quality between some of the ecosystem repos. Server, Web and memory all seem very well developed, llamaindex and langchain lower effort.
I think the main thing this project needs is more maintainers, but looking purely at the features of this database, and the fact that it's Apache-2.0, make it interesting, at least for me.
snissn 22 days ago
It's not clear that graph-bench in "Tested with the LDBC Social Network Benchmark via graph-bench" is a benchmark that you made. It seems more robust and reliable than "we built a db and a benchmark tool, and our benchmark tool says we're the best". Just a thing to be careful about. You should just state that it's your tool and you welcome feedback to help make it so that other projects being compared are compared in their best light. Something like that might help, I don't know though it's a hard problem.
- cynicalkane 22 days ago
  Strong chance the same robot that wrote the benchmark also wrote the sentence to sound impressive.
  This is another one of the vibe-coded slop projects that are routinely frontpaging HN now. As someone else pointed out, the single author has "written" >100kLOC in diffs per week. It's not possible that any human knows what's in the codebase in any reasonable detail.
SkyPuncher 22 days ago
Every time I look at graph databases, I just cannot figure out what problem they're solving. Particularly in an LLM based world.
Don't get me wrong, graphs have interesting properties and there's something intriguing about these dynamic, open ended queries. But, what features/products/customer journeys are people building with a graph DB?
Every time I explore, I end up back at "yea, but a standard DB will do 90% of this at 10% of the effort".
- jandrewrogers 22 days ago
  In virtually all cases, you want a normal relational database and a sensible schema. Far easier and fewer sharp edges. Reaching for a graph database should never be the default choice.
  A handful of data models have strongly graph-like characteristics where queries require recursive ad hoc joins and similar. If your data is small, this is nominally the use case for a graph database. Often you can make it work pretty well on a good relational database if you are an expert at (ab)using it. Relational databases usually have better features in other areas too.
  If you have a very large graph-like data model, then you have to consider more exotic solutions. You will know when you have one of these problems because you already tried everything and everything is terrible. But you still started with a relational database.
- zozbot234 22 days ago
  A standard DB ala Postgres will be a perfectly functional graph database unless you're doing very specialized network analysis queries, which is not what most of these "knowledge graph" databases are being used for. It's only querying and data modeling that's a bit fiddly (expressing the "graph" structure using SQL) and that's being improved by the new Property Graph Query (PGQ) in the latest SQL standards.
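As an illustration of the "fiddly but workable" SQL approach described above, here is a minimal sketch of a multi-hop reachability query written as a recursive CTE. It uses Python's built-in SQLite as a stand-in so the example is self-contained; the `edges` table and the `reachable` helper are made up for the example, and the `WITH RECURSIVE` form shown is also accepted by Postgres.

```python
import sqlite3


def reachable(conn, start):
    """All nodes reachable from `start`, including `start` itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION            -- UNION (not UNION ALL) dedups, so cycles terminate
            SELECT e.dst
            FROM edges AS e
            JOIN reach AS r ON e.src = r.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")],
)
print(reachable(conn, "a"))
```

The equivalent Cypher or SQL/PGQ pattern would be a single variable-length path expression, which is the ergonomic gap the comment refers to.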
  - ffsm8 22 days ago
    It'd be great if PG came with a serverless/embeddable mode, that'd be the main missing thing in comparison to this tool.
    I know pglite, and while it's great someone made that, it's definitely not the same
    - adsharma 21 days ago
      I maintain a fork of pgserver (pglite with native code). It's called pgembed. Comes with many vector and BM25 extensions.
      Just in case folks here were wondering if I'm some type of a graphdb bigot.
  - adsharma 21 days ago
    This is the same topic I had an intense argument about with my coworkers at the company formerly called FB a decade ago. There is a belief that most joins are 1-2 deep. And that many-hop queries with reasoning are rare or non-existent.
    I wonder how you reconcile the demand for LLMs with multihop reasoning with the statement above.
    I think a lot of what is stated here is how things work today and where established companies operate.
    The contradictions in their positions are plain and simple.
    - zozbot234 21 days ago
      There are worst-case optimal algorithms for multi-way and multi-hop joins. This does not require giving up the relational model.
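To give a flavor of the worst-case optimal join idea mentioned above, here is a toy triangle query that binds one variable at a time and intersects neighbor sets, instead of joining two edge tables and filtering afterwards. It is a sketch in the spirit of those algorithms only; it is not a full Generic Join or Leapfrog Triejoin implementation and makes no claim to match their complexity bounds.

```python
def triangles(edges):
    """List each undirected triangle once as (a, b, c) with a < b < c."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    found = []
    for a in sorted(adj):
        # Bind b among a's neighbors, then bind c by intersecting both
        # neighbor sets rather than materializing all two-hop paths first.
        for b in sorted(x for x in adj[a] if x > a):
            for c in sorted(x for x in adj[a] & adj[b] if x > b):
                found.append((a, b, c))
    return found


print(triangles([(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]))
```

A pairwise join plan would first produce every two-hop path and only then check the closing edge, which is the intermediate blow-up the intersection step avoids.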
      - adsharma 21 days ago
        I maintain LadybugDB which implements WCOJ (inherited from the KuzuDB days). So I don't disagree with the idea. Just that it's a graph database with relational internals and some internal warts that makes it hard to compose queries. Working on fixing them.
        https://github.com/LadybugDB/ladybug/discussions/204#discuss...
      - adsharma 21 days ago
        Also an important test is the check on whether it's WCOJ on top of relational storage or is the compressed sparse row (CSR) actually persisted to disk. The PGQ implementations don't.
        There are second order optimizations that LLMs logically implement that CSR implementing DBs don't. With sufficient funding, we'll be able to pursue those as well.
        - zozbot234 21 days ago
        CSR is an array-based trie hence very costly to update. It can serve as an index for parts of the graph that basically will almost never change, but not otherwise.
        - adsharma 21 days ago
        Makes it a good match for columnar databases which already operate on the read-only, read-mostly part of the spectrum.
        Perhaps people can invent LSM like structures on top of them.
        But at least establish that CSR on disk is a basic requirement before you claim that you're a legit graph database.
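For context on the CSR exchange above, here is a minimal in-memory sketch of compressed sparse row storage, with a naive insert that shows why updates are expensive. The function names are made up for the example and nothing here reflects how LadybugDB or any other engine persists CSR to disk.

```python
def build_csr(num_nodes, edges):
    """edges: list of (src, dst) with integer ids. Returns (offsets, targets)."""
    counts = [0] * num_nodes
    for src, _ in edges:
        counts[src] += 1

    # offsets[i]..offsets[i + 1] delimits node i's slice of `targets`.
    offsets = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        offsets[i + 1] = offsets[i] + counts[i]

    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets, targets, node):
    return targets[offsets[node]:offsets[node + 1]]


def insert_edge(offsets, targets, src, dst):
    """Naive insert: shifts the tail of `targets` and bumps every later offset."""
    targets.insert(offsets[src + 1], dst)
    for i in range(src + 1, len(offsets)):
        offsets[i] += 1


offsets, targets = build_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
print(offsets, targets)
insert_edge(offsets, targets, 1, 0)
print(neighbors(offsets, targets, 1))
```

Reads are a single slice, while one insert touches on the order of all offsets and all later targets, which is why the thread treats CSR as a fit for read-only or read-mostly data.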
  - gdotv 22 days ago
    That's coming to Postgres 19 this year, had a brief exchange with a committer earlier this week and it's actually available in the Postgres repo to try (need to run your own build of course). Very exciting development!
- adsharma 22 days ago
  For starters, LLMs themselves are a graph database with probabilistic edge traversal.
  Some apps want it to be deterministic.
  I'm surprised this question comes up so often.
  It's mainly from the vector embedding camp, who rightfully observe that vector + keyword search gets you to 70-80% on evals. What is all this hype about graphs for the last 20-30%?
  - Tsarp 22 days ago
    "LLMs themselves are a graph database with probabilistic edge traversal" whaat?
    Do you have any good demos to showcase where graph DBs clearly have an advantage? It's mostly just toy demos.
    Vector embeddings, on the other hand, no matter how limited, have clearly proven themselves useful beyond youtube/linkedin thought leader demos.
    - adsharma 21 days ago
      It comes from people who develop LLMs. Anthropic and Google. References below.
      My other favorite quote: transformers are GNNs which won the hardware lottery.
      Longer form at blog.ladybugmem.ai
      You want to believe that everything probabilistic has more value and determinism doesn't? Or that the world is made up of tabular data? You have a lot of company.
      The other side of the argument I believe has a lot of money.
      https://www.anthropic.com/research/mapping-mind-language-mod...
      https://research.google/blog/patchscopes-a-unifying-framewor...
      - Tsarp 21 days ago
        Not sure how that was the takeaway from both the posts above.
        I read the blog post and your website but unfortunately they didn't help change my perspective.
        Thanks for the share
lvca 21 days ago
Do you know if Grafeo ever implemented the LDBC benchmark? I'd love to compare it with other Graph Databases: https://arcadedb.com/blog/neo4j-alternatives-in-2026-a-fair-...
Especially with OLAP queries.
OtomotO 22 days ago
Interesting... Need to check how this differs from agdb, with which I had some success for a side project in the past.
https://github.com/agnesoft/agdb
Ah, yeah, a different query language.
dramm 22 days ago
ACID, so let’s see the Jepsen tests.
foota 22 days ago
I added a super cheap and bad embedding database in a project that allows the agent to call a tool for searching all the content it's built, it seems to work pretty well! This way the agent doesn't need to call a bunch of list tools (which I was worried would introduce lots of data to the context), and can find things based on fuzzy search.
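A "cheap and bad" search index of this kind can be sketched with the standard library alone. The version below is an assumption about what such a tool might look like, not the commenter's code: it uses character-trigram counts as a stand-in for real embeddings (a real setup would call an embedding model) and ranks by cosine similarity.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Character n-gram counts; a crude stand-in for a learned embedding."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyIndex:
    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, text):
        self.items.append((doc_id, embed(text)))

    def search(self, query, k=3):
        """Return up to k doc ids with non-zero similarity, best first."""
        q = embed(query)
        scored = sorted(
            ((cosine(q, vec), doc_id) for doc_id, vec in self.items),
            reverse=True,
        )
        return [doc_id for score, doc_id in scored[:k] if score > 0]


index = TinyIndex()
index.add("notes", "graph database benchmark notes")
index.add("recipe", "recipe for pancakes")
index.add("rust", "rust borrow checker")
print(index.search("grph databse", k=1))
```

Exposed as a single search tool, something like this lets an agent issue one fuzzy query instead of paging through several list calls.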
brunoborges 22 days ago
Why is everything "... built in Rust" trending so easily on HN?
- mattvr 22 days ago
  It implies high performance, reliability, and a higher degree of mastery on the developer's part.
  (Which may not all be true, but perhaps more so than for your average project)
- IshKebab 22 days ago
  Because Rust is an excellent language that pushes you into the "pit of success", and consequently software written in Rust tends to be fast, robust and easy to deploy.
  There's no big mystery. No conspiracy or organised evangelism. Rust is just really good.
  - macintux 22 days ago
    Worth noting that “robust” and “correct” are orthogonal. Graph databases (well, any database) seem like an area where correctness particularly matters, and I doubt Rust gives any meaningful advantage there.
    - IshKebab 21 days ago
      They absolutely are not orthogonal. They are closely related. In any case, Rust improves both.
      > I doubt Rust gives any meaningful advantage there.
      Advantage over what? Haskell & OCaml? Maybe not. C++ or Python? Absolutely. Its type system is far stronger than those, and its APIs are much better designed and harder to misuse.
ngburke 22 days ago
Been looking for something like this for a side project. The embedded mode with no external deps is the killer feature for me; I hate dragging in a server just to do graph traversal. Going to give it a shot.
xlii 22 days ago
I wonder if people are using (or intend to use) vibe-coded projects like the one linked.
I mean, I understand that some people have fun looking at new tech no matter the source, but my question is: is there anyone who, when tasked with picking a graph database, would ignore all the LLM flags and put it in production?
cluckindan 22 days ago
The d:Document syntax looks so happy!
nexxuz 22 days ago
I was ready to learn more about this but I saw "written in Rust" and I literally rolled my eyes and said never mind.
- ComputerGuru 22 days ago
  I think "written by genAI" should be a bigger turnoff than "written in Rust".
  - andriy_koval 22 days ago
    alternative opinion:
    * it is possible to write high quality software using GenAI
    * not using GenAI could mean project won't be competitive in current landscape
    - Aurornis 22 days ago
      > * it is possible to write high quality software using GenAI
      From examining this codebase, it doesn’t appear to have been written carefully with AI.
      It looks like code that was prompted into existence as fast as possible.
      - andriy_koval 22 days ago
        Sure, there are bad genAI projects and there are good genAI projects. You can remove the term "genAI" from the previous sentence and it still holds.
    - quantumHazer 22 days ago
      > not using GenAI could mean project won't be competitive in current landscape
      Why? This is false in my opinion; iterating fast is not a good indicator of quality or competitiveness.
      - andriy_koval 22 days ago
        Iterating fast over quality (e.g. refactoring, test coverage, benchmarks, documentation, trying new nontrivial ideas) is a good indicator of quality.
        - quantumHazer 22 days ago
          you can’t iterate fast over quality though. it takes patience and expertise, not a bloated repo like this.
          every example you mentioned is not something you should delegate to LLMs, unless quick prototyping
          - andriy_koval 21 days ago
            > every example you mentioned is not something you should delegate to LLMs, unless quick prototyping
            It works very well for me; an LLM with guidance produces good-quality code.
- chuckadams 22 days ago
  Too bad you don't do the same for commenting on HN.
- OtomotO 22 days ago
  Because it was explicitly advertising Rust and you can't stand the zealotry or because you hate Rust?
  Because the latter is really dumb. I don't mind software written in C, although I personally wouldn't want to write it anymore.
measurablefunc 22 days ago
This looks like another avant-garde "art" project.