Models have gotten good enough that they can (mostly) take on long-horizon, complex tasks. We believe the bottleneck now is that these smart-enough models often lack information about your company, which is scattered in people's heads, Slack threads, stale docs, and in back-and-forth convos with AI.
MCP is useful for getting some info in front of an agent, but there are problems: (1) Once the session dies, so does the insight, so instead of copy-pasting a whole doc each time, you're telling the agent to dig through Drive each time - not much of a win; (2) Even when MCP works, what it gathers isn't comprehensive, because people decide things on a whiteboard, brainstorm out loud, post a little in Slack, and scribble the rest in a doc, which leaves the agent working from partial information; (3) And even if it had everything, it doesn't do the meta-reasoning required to do a great job. If you paste in a Notion doc, it won't learn your design taste or your writing style unless you tell it to, and it won't know why a decision was made or when.
As undergrads 5 years ago, we were into the tools-for-thought wave and became power users of Notion, Obsidian, Roam, Anki, real believers in building a second brain. After GPT-3.5 came out we started to realize how much more powerful that second brain could be if an AI could actually read it, because suddenly it would know our backstory, our taste, our preferences, and unlock genuinely new capabilities. That’s why we’re building Hyper.
We know it’s not for everybody! But for people who do want to be on the cutting edge, this is a force multiplier that makes agents faster and better. It increases the number of tasks they can do, and how effectively they do them.
Hyper works by ingesting everything you give it access to (Docs, Slack, Email, Calendar, Granola) and synthesizing it into a knowledge graph of facts and their relationships, with embeddings for semantic search. The memory system we’ve built is hybrid, with two modalities. Episodes are the raw source items kept as the source of truth. Facts are the meaning pulled out of each episode, stored as subject-predicate-object records (subject=person, predicate=works_at, object=company) with a plain summary and timestamps for when the fact was introduced and when it was invalidated. Facts form a graph with typed edges between them: X is in tension with Y, A is derived from B, J supersedes K. Every time a new fact comes in we update the facts in its neighborhood, so the graph stays current, and that's how we handle stale information. When "we'll ship Friday" is later contradicted by "we're shipping Monday," the new fact supersedes the old one instead of both looking equally true, and we never auto-discard the superseded version, so you can still ask how you landed on Monday.
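To make the fact model above concrete, here is a minimal Python sketch of subject-predicate-object records with validity timestamps and a typed "supersedes" edge. It is an illustration under assumptions, not Hyper's actual schema: the names `Fact`, `Edge`, `FactGraph`, `supersede`, `current`, and `history`, along with the example IDs and dates, are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    fact_id: str
    subject: str
    predicate: str
    obj: str
    summary: str
    episode_id: str                      # provenance: the raw source item
    valid_from: datetime                 # when the fact was introduced
    invalidated_at: Optional[datetime] = None  # set when superseded


@dataclass
class Edge:
    kind: str   # e.g. "supersedes", "derived_from", "in_tension_with"
    src: str
    dst: str


class FactGraph:
    """Hypothetical in-memory stand-in for the graph store."""

    def __init__(self) -> None:
        self.facts: dict[str, Fact] = {}
        self.edges: list[Edge] = []

    def add(self, fact: Fact) -> None:
        self.facts[fact.fact_id] = fact

    def supersede(self, new: Fact, old_id: str) -> None:
        # Mark the old fact invalid but keep it, so its history stays queryable.
        self.facts[old_id].invalidated_at = new.valid_from
        self.add(new)
        self.edges.append(Edge("supersedes", new.fact_id, old_id))

    def current(self, subject: str, predicate: str) -> list[Fact]:
        return [
            f for f in self.facts.values()
            if f.subject == subject
            and f.predicate == predicate
            and f.invalidated_at is None
        ]

    def history(self, fact_id: str) -> list[Fact]:
        # Follow "supersedes" edges backwards, newest fact first.
        chain = [self.facts[fact_id]]
        cursor = fact_id
        while True:
            older = next(
                (e.dst for e in self.edges
                 if e.kind == "supersedes" and e.src == cursor),
                None,
            )
            if older is None:
                return chain
            chain.append(self.facts[older])
            cursor = older


# The "ship Friday" then "ship Monday" example from the post.
graph = FactGraph()
friday = Fact("f1", "launch", "ships_on", "Friday",
              "We'll ship Friday", "slack-msg-1", datetime(2024, 1, 1))
monday = Fact("f2", "launch", "ships_on", "Monday",
              "We're shipping Monday", "slack-msg-2", datetime(2024, 1, 3))
graph.add(friday)
graph.supersede(monday, old_id="f1")
```

In this sketch, `graph.current("launch", "ships_on")` returns only the Monday fact, while `graph.history("f2")` still walks back to the Friday fact, which is the "how did we land on Monday" query.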
Every fact carries provenance back to its source and access-control tags for who is allowed to see it. At retrieval we query-expand, then fuse semantic search over embeddings with Postgres full-text search using reciprocal rank fusion, and we only ever evaluate a query against the facts and episodes that person has access to, which means two people on the same team can ask the same question and get different answers. We keep information fresh with webhooks where they exist and polling where they don't, hashing contents to catch changes for sources that don’t handle native dedupe. Agents read and write through two paths: lifecycle hooks in tools like Claude Code, Cowork, Codex, and Cursor, where we inject relevant context on every prompt and pull interesting facts out of every response, and plain MCP tool calls for everything that doesn't expose hooks.
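As a rough illustration of the retrieval step, the sketch below fuses two ranked lists with reciprocal rank fusion after restricting them to what the caller may see. It assumes the two input lists are already the outputs of the semantic and full-text searches. The function names, the `acl` mapping, the group names, and the constant `k=60` (a commonly used default for RRF) are assumptions for the example, not Hyper's implementation. Filtering the hit lists here stands in for scoping the search itself to permitted facts.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Score each id by the sum of 1 / (k + rank) over every list it appears in."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, fact_id in enumerate(ranking, start=1):
            scores[fact_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda fid: (-scores[fid], fid))


def retrieve(semantic_hits, fulltext_hits, acl, user_groups):
    """Fuse two result lists, considering only facts the user may see.

    acl maps fact id -> set of groups allowed to read it (hypothetical shape).
    """
    def visible(fact_id):
        return bool(acl.get(fact_id, set()) & user_groups)

    semantic = [f for f in semantic_hits if visible(f)]
    fulltext = [f for f in fulltext_hits if visible(f)]
    return reciprocal_rank_fusion([semantic, fulltext])


# Hypothetical data: the same query, asked by two people with different access.
semantic_hits = ["f1", "f2", "f3"]
fulltext_hits = ["f3", "f1", "f4"]
acl = {
    "f1": {"eng"},
    "f2": {"eng"},
    "f3": {"exec"},
    "f4": {"eng", "exec"},
}

eng_results = retrieve(semantic_hits, fulltext_hits, acl, {"eng"})
exec_results = retrieve(semantic_hits, fulltext_hits, acl, {"exec"})
```

With this data, the engineer gets `["f1", "f2", "f4"]` and the executive gets `["f3", "f4"]`, which is the "same question, different answers" behavior described above.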
We love it, and so do our early users: one CEO uses Hyper to draft emails in his voice with full company context. What took hours per week now takes minutes, and it gets sharper each time Hyper learns more about how he thinks and how his company is changing. Another YC founder one-shotted a launch video script because Hyper already knew their product, voice, and positioning, accumulated over months.
We have a 3-day free trial, explained more on our pricing page (https://heyhyper.ai/pricing), and there are more details in our FAQ (https://heyhyper.ai/faq), including things like privacy, compliance, and how we’re different from other “memory” companies.
Give it a spin, break it, and tell us where it falls short: https://heyhyper.ai/. We'd love to build you a 10-star experience :) Comments welcome!
I've always thought that knowledge graphs/expert systems, and even the broader concept of entity-attribute-value storage, got an unfairly bad reputation because of the 1970s/1980s "AI Winter."
And I think that perhaps this reputation is why so much of the oxygen in the RAG space has been consumed by the notion that "RAG = retrieval of fragments by vector similarity."
The difference now from decades ago, of course, is that LLMs can both maintain that graph at scale and agentically run successive queries to explore for best practices in any situation! And these have reached the scalability where any small business can build and use its own expert system.
I really want to see this approach win, because I think there's such an opportunity to explore even more data structures and approaches from the past and how their impact can be reimagined. If LLMs do indeed approach AGI, it will be in large part due to the ability to use tools (there's some evolutionary irony there, too) - and we should be trying every kind of underlying storage for those tools that we can, standing on the shoulders of giants.
(And curious what database you use for the knowledge graph - those are also a place where we stand on the shoulders of giants!)
And re: the graph -- Postgres stays king here. There are a lot of fancy database mechanisms for building systems like this, but the convenience of a SQL data structure that can tie the graph into structured metadata is pretty unbeatable. This may evolve with time as well.
(Along those lines, I recall lots of this getting messy in a pre-LLM project the moment someone said "merge these two CRM accounts and their histories, but oh whoops turns out they were different all along, and only some of the updates should have applied" - there's a whole set of interesting challenges around attributing EAV when the very notion of object identity evolves over time. Whether a fact is relevant is really a judgment that can only be made with full context - but we now have tools that eat context for breakfast!)
That said, this is the ultimate moat. Once everything about how to operate a business lives in your product, the business must rely heavily on it. I personally would only use something like this if I knew it was open source and that data could live on my own servers. If agents and my own team are consulting Hyper for things and you go out of business or move upmarket or something, it's pretty much back to the stone age for us.
Very useful idea though with a lot of potential, especially for companies like OpenAI and Anthropic looking for a moat!
You lose sooooooo much meaningful context and information when you transform something into a knowledge graph. Simple cases like "Gabe is CEO of Valve" map nicely to a graph, but things like "Matt Garman is CEO of AWS" don't represent that AWS is a sub-company of Amazon (with its own CEO).
Additionally, one of my biggest gripes with Claude's memories and every memory system I've worked with is that they completely fail to capture intent. The architecture notes I documented while doing a wild spike on a critical infrastructure component absolutely should not be referenced in everyday work. Yet, somehow, that type of memory always works its way into unrelated sessions.
There is also a capture problem. Imagine you hire an intern and you tell them "John Smith is the CEO of Foo". If they've never heard of Foo, it would be impossible to infer anything about the nature of Foo, unless they're allowed to look into the outside world. No system (even humans!) can capture 100% of information, but that doesn't mean the system is broken. The question is, can you organize and collect enough information to be able to (a) address most queries and (b) initiate deeper investigation if the information is incomplete? We believe the answer is yes.
Intent is very much the same way. Will hybrid search uncover your architecture notes at some point, for an unrelated reason? Almost certainly. Should there be enough surrounding context to indicate that this was written for a spike? Also yes (this is where Claude/markdown memories fail). It should be enough to still be (net) massively useful, and the error rate will go down over time.
How are you handling cases where multiple sources of truth contradict each other?
Does Hyper assume best guess or is there any human in the loop verification?
Unlike many other memory systems, Hyper never actually deletes memories. It constantly reranks them based on confidence, which factors into how they're retrieved. So every statement has a full history and system of record for how it got there, and you can trace (with attribution) why Hyper gives the answers it does. If there's something that Hyper misses, we provide tools in-app and in-terminal-plugin that let a human explicitly correct what Hyper knows.
For instance, in history, newer information is mixed with older authoritative information.
The same goes for religious institutions, where the older items may be the more authoritative ones for their purposes.
This looks great and congratulations on the launch.
I am also building in this space and wanted to get your views on a few things.
1. Are you building your own connectors to 3p systems? 2. How are you finding the sales motion? I found people to get the problem fast, but actually converting them seems rather slow.
Good luck!
Would love to swap notes at some point if you are up for it?
Every new advancement from the model providers helps unlock new capabilities, but we are confident this "brain" idea is going to be core infrastructure for every company in the future. It extends beyond code and project management: we think about "what does the 'office of the future' look like? Ambient recording in every room? Smart whiteboards that turn drawings -> CAD -> kick off 3d printers?" and it's exciting to see how many unsolved challenges are on that road. Appreciate the support and excited to keep building :)
Right now our measurements are primarily subjective; several customers have told us "Hyper let my agent draft outbound/do market research/run experiments overnight with no intervention or follow-ups, when I would have to constantly babysit it in the past." We have also run Hyper's algorithms on common benchmarks versus more traditional methods. I don't want to claim numbers before we've verified them, but Hyper performs significantly better.
We do not use RAG in the traditional sense (semantic similarity across chunked source documents). We use hybrid retrieval methods to fetch relevant information across our carefully designed knowledge graph, and then have shallow agents consolidate retrieved information into a format that the invoking agent can understand.
2. How do you deal with conflicting facts? In tech, the new is constantly replacing the old.
3. Is knowledge extraction real time? How fast is it in general?
1. I'll address this in two parts.
(a) Memory vs. Enterprise Search. I consider search to address targeted, stateless retrieval whereas memory solves temporal, tacit, and derived problems. Glean can tell you why a ticket was filed or answer a specific question regarding a customer call. But in many companies, important questions are broader: "What went wrong the first time we went with this vendor?" "How has our brand shifted in tone over time?". These cannot be answered by a few documents, and it's not obvious whether this information would be in Slack or Notion or Drive. It requires an active, entropy-fighting system that is going to extract information and keep track of how it evolves over time.
(b) Benchmarks: absolutely. Don't want to claim anything before we've published results, but Hyper scores very well on LoCoMo and LongMemEval, and we are constantly trying to bolster our set of evals. We will publish results more openly in the coming weeks. I will caveat though: many SOTA memory providers are converging on the top end of these benchmarks, and yet we don't see mass adoption. We believe that UX affordances are underrated and critical to get "company brains" working in real, messy businesses. Many of our users have come to us from other providers purely because the competition was too difficult to use and maintain across the org.
2. Hyper maintains a graph of information where each node is an extracted "fact." This happens continuously, in the background, live from every connector or connected agent. At insertion-time, new information is compared against relevant information. Our system (a DAG of agentic nodes) determines the relationships between these facts and makes appropriate updates: X derives Y, A updates B. For now, we rely on recency as the primary indicator of conflict (i.e. we assume more recent information is generally more true than old information). We realize that this will need to become more sophisticated, and are iterating.
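A tiny sketch of the "recency as the primary indicator" rule described above, for illustration only. In the real system the relationships are determined by a DAG of agentic nodes; here a plain timestamp comparison stands in for that, and `Fact`, `resolve_by_recency`, the `observed_at` field, and the example values are all hypothetical. The point it shows is that the winner is the fact with the later timestamp, not whichever one happened to be ingested last.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    fact_id: str
    subject: str
    predicate: str
    obj: str
    observed_at: datetime  # when the statement was made in the source


def resolve_by_recency(incoming, neighbors):
    """Return (winner_id, "updates", loser_id) edges for conflicting neighbors.

    Two facts conflict here if they share subject and predicate but disagree
    on the object. The more recently observed fact wins.
    """
    edges = []
    for existing in neighbors:
        same_slot = (existing.subject, existing.predicate) == (
            incoming.subject, incoming.predicate)
        if not same_slot or existing.obj == incoming.obj:
            continue  # unrelated, or no disagreement
        if incoming.observed_at >= existing.observed_at:
            edges.append((incoming.fact_id, "updates", existing.fact_id))
        else:
            # Late-arriving but older information does not override newer facts.
            edges.append((existing.fact_id, "updates", incoming.fact_id))
    return edges


# Hypothetical example: a stale doc is ingested after a newer Slack message.
slack = Fact("f_new", "api", "auth_method", "OAuth", datetime(2024, 6, 1))
old_doc = Fact("f_old", "api", "auth_method", "API keys", datetime(2023, 1, 15))
unrelated = Fact("f_x", "api", "owner", "platform team", datetime(2024, 5, 1))

edges_for_old_doc = resolve_by_recency(old_doc, [slack, unrelated])
edges_for_slack = resolve_by_recency(slack, [old_doc, unrelated])
```

Both orders of arrival produce the same edge, `("f_new", "updates", "f_old")`, and the unrelated fact is left alone.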
3. Knowledge extraction is real-time and asynchronous, and should add next to zero latency to any existing system. We continually update the graph in our backend, without relying on a nightly compaction/dreams cycle, so information from the world should be reflected in Hyper's responses in close to real time. Retrieval can be slightly more expensive, but the latency is negligible compared to the overhead of the calling agent. We recognize the importance of performance (we both worked on on-device robotics!) and are happy to publish numbers as we measure them :)
This raises a follow-up question: what is your differentiation?
The main thing we see in the world is that (a) teams already struggle to coordinate information over many different personalities and data sources. This was a more dull problem before when the actual IC/execution overhead was so large. But now with AI the execution overhead is way smaller, and "being on the same page" is a much bigger problem. (b) As agents do more and more of the mechanical work in the company, it's vital that they have a consistent big picture-view to perform tasks efficiently without errors.
Hyper aims to solve this problem end-to-end; the memory system is a vital part of this, but Hyper does more. We already support native agentic email-writing and LinkedIn-drafting automations, and will be expanding on that front. Today it's a "brain that knows everything," but so much of the value is in using that brain to perform work in a self-improving way. And on the other side, we need to make sure that getting information into the system is as frictionless as possible. We care a ton about UX -- one-click integrations, using hooks to get context in and out invisibly and reliably.
Made me think this was for companies working on self-driving.
- as well as the Show HN guidelines, which apply when people are sharing their work:
"Be respectful. Anyone sharing work is making a contribution, however modest."
"When something isn't good, you needn't pretend that it is, but don't be gratuitously negative."
You're welcome to make your substantive points thoughtfully, but please don't post like this.
https://news.ycombinator.com/showhn.html
https://news.ycombinator.com/newsguidelines.html