My team deals with a lot of information: wikis, code repos, monitoring dashboards, internal chat messages, emails, task tickets, related systems, etc.
We often need to run ad-hoc searches for anything related to a concept. For instance, if someone changes a metric, we need to find every dashboard that might use that metric, to make sure they are still valid after the change.
I don't just want to fix this one problem; I want the ability to find related information in ad-hoc cases.
The ramp-up time is not important, as long as some positive value can be created with a small initial effort.
Any existing products (Paid/Free/Open Source, etc) and any references to existing knowledge (designs, discussions) about this would be really appreciated.
My experience has been to encourage public blogging / speaking about technical information. If it's public, there are several benefits. First, you have to explain things to people with little context about the company. You also feel scrutiny to make it accurate and not embarrass yourself. Readers will see the date of authorship and have a sense of when the information was true. And of course, Google is a better search engine than anything you'll have internally!
For example, when I worked on search at Reddit, I didn't point people at anything internal (that stuff rots) but instead I would point people at places like:
https://www.reddit.com/r/RedditEng/comments/1985mnj/bringing...
https://www.youtube.com/watch?v=gUtF1gyHsSM
The downside to this approach is that companies are too precious about IP, so they don't want you to be specific (despite it almost certainly not being special). Also, company blogs can get over-edited to the point where they lose authenticity in favor of SEO spam.
This isn't the tool to use for things like runbooks, etc. It's a more useful thing for broader context.
I wish more companies just gave their developers their own personal blogs, and were less precious about preventing speaking.
So my preference is a coherent story at a point in time.
People will not maintain knowledge bases unless you force them. So remove as much friction as possible and make it as accessible as possible. Hence the automation and text files. It doesn't need to be plain text, just something universal and human-readable. It could be formatted as Markdown, YAML, or JSON, or be single email files: anything you can search with simple tools and make connections with. Version control and its repo will then let you follow the trail of work, discover what was discussed and changed around the same time, and find connections. And it's never wrong to have a reversible history of your stuff.
And maybe along the way you can motivate people to also write some proper documentation here and there, and add some more fancy tools on-top.
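To make the "simple tools" point concrete, here is a minimal sketch of searching a folder of text files for a term. It assumes the knowledge base is a directory tree of UTF-8 text files; the suffix list and the name `find_mentions` are made up for illustration, and plain `grep -rn` would do the same job.

```python
from pathlib import Path

# Assumed set of "universal, human-readable" formats; adjust to taste.
TEXT_SUFFIXES = {".md", ".yaml", ".yml", ".json", ".txt", ".eml"}


def find_mentions(root, term):
    """Return (path, line_number, line) for every line that mentions term.

    The match is a case-insensitive substring test, nothing smarter.
    """
    needle = term.lower()
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything that is not one of the text formats.
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling something like `find_mentions("knowledge-base", "checkout_latency")` (both arguments hypothetical) gives you file and line number for every mention, which you can then line up against the version-control history to see what changed around the same time.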
It's markdown, with several plug-ins. Easier than mediawiki, which relies on manual effort. We preferred automation, without the insanities of other wikis, like WYSIWYG editing and similar management nonsense.
I also had the advantage of being the phpwiki maintainer at the time, so I could easily extend it to our needs. I wrote some custom plugins for them, with ajax tricks. It helped that phpwiki is not such a mess as mediawiki, which was also entirely insecure.
All these plugins were upstreamed. Later, Alcatel took over maintenance, and they run a similar knowledge base.
I have used these personally and professionally to do some cool things, like running audits across different systems simultaneously. A common stack would include Protege for creating the ontologies (i.e., a schema of how the things you're interested in link to each other), Ontotext Refine or Python scripts to populate the graphs, and Ontotext GraphDB or Neo4j AuraDB for storing them.
It's relatively easy to then connect this knowledge base to an LLM, and get more flexibility out of it.
That said, there aren't many user-friendly tools that get the most out of KGs. Most people I worked with weren't interested in KGs or knowledge bases themselves; they just wanted their particular problem solved. And often it was easier to justify purchasing a subscription to managed tools that (claim to) solve the problem.
So, unless you're OK with building some middleware to combine user apps with KGs, it won't stick with others, in my experience.
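As a rough illustration of the graph idea, here is a toy sketch in plain Python rather than in any of the tools named above. The triples, entity names, and predicates are all invented for the example; a real setup would keep them in a graph store and query them there.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples; every name is made up.
TRIPLES = [
    ("dashboard:checkout-latency", "uses_metric", "metric:http_request_duration"),
    ("dashboard:api-overview", "uses_metric", "metric:http_request_duration"),
    ("alert:slow-checkout", "uses_metric", "metric:http_request_duration"),
    ("ticket:OPS-1", "mentions", "dashboard:api-overview"),
    ("dashboard:billing", "uses_metric", "metric:invoice_count"),
]


def build_index(triples):
    """Index each entity by its neighbours, following edges in both directions."""
    neighbours = defaultdict(set)
    for subject, predicate, obj in triples:
        neighbours[subject].add((predicate, obj))
        neighbours[obj].add((predicate + "^-1", subject))
    return neighbours


def related(triples, start, max_hops=2):
    """Breadth-first search: map each entity within max_hops of start to its distance."""
    index = build_index(triples)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for _predicate, other in index[node]:
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    del distance[start]
    return distance


if __name__ == "__main__":
    for entity, hops in sorted(related(TRIPLES, "metric:http_request_duration").items()):
        print(hops, entity)
```

With one hop you get the dashboards and alerts that use the metric directly; with two hops the ticket that mentions one of those dashboards shows up as well, which is the kind of "anything related" lookup a schema makes possible.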
Neo4j, OpenSearch, and vector embeddings. I would use OpenAI API calls to generate the OpenSearch query based on the user's text input.
For example, a user could search “what tasks are assigned to Jake that are at least 50% complete and due in the next 2 weeks” and it would return relevant results.
Obviously it's only as good as the user's search query. I spent close to 100 hours writing tests to get it working close to 100% of the time. Eventually I dropped the embeddings because I could generate the OpenSearch query on the fly. So it was pretty lean and easy.
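For readers wondering what the generated query might look like, here is a sketch of a structured query for the Jake example, written by hand rather than produced by a model. The field names (`assignee`, `percent_complete`, `due_date`) are assumptions about the index mapping, not the schema actually used, and `assignee` is assumed to be an exact-match keyword field. The small `fields_used` check is one hypothetical kind of test you could run on generated queries before sending them to the cluster.

```python
import json

# Assumed index mapping; these field names are hypothetical.
ALLOWED_FIELDS = {"assignee", "percent_complete", "due_date"}


def build_task_query(assignee, min_percent, due_within_days):
    """Build a bool/filter query in the OpenSearch query DSL for the example search."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"assignee": assignee}},
                    {"range": {"percent_complete": {"gte": min_percent}}},
                    # Date math: from the start of today to N days from now.
                    {"range": {"due_date": {"gte": "now/d", "lte": f"now+{due_within_days}d/d"}}},
                ]
            }
        }
    }


def fields_used(query):
    """Collect the field names referenced by the clauses in the bool filter."""
    fields = set()
    for clause in query["query"]["bool"]["filter"]:
        for body in clause.values():
            fields.update(body.keys())
    return fields


query = build_task_query("Jake", 50, 14)
# Reject queries that reference fields the index does not have.
assert fields_used(query) <= ALLOWED_FIELDS
print(json.dumps(query, indent=2))
```

Because the clauses sit in `filter` rather than `must`, they only include or exclude documents and do not affect scoring, which fits a yes/no question like this one.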
vim knowledge.md
The key is to store both data and metadata... OpenMetadata may be what you need: https://open-metadata.org/ but I couldn't spot wiki, chat, GitHub, or Jira connectors :shrug:
Good luck, keep us posted.
It probably doesn’t really matter.
Good luck.