3 comments

  • bturtel 187 days ago
    This seems awesome for enabling RAG queries for on-device LLMs.
  • jerpint 189 days ago
    I wonder at what point passing a subset of the data through a small yet capable and fast LLM will be ~as much overhead as using a crude dot product for retrieval
    • Pringled 188 days ago
      I think a combination works quite well: first getting a small set of candidates from all the data using a lightweight model, and then using a heavy-duty model to rerank the results and get the final candidates.
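
    (A minimal sketch of the retrieve-then-rerank combination described above, using the sentence-transformers bi-encoder and cross-encoder APIs; the model names and documents are illustrative, not taken from the thread:)

        # Stage 1: crude dot-product retrieval; Stage 2: cross-encoder rerank.
        import numpy as np
        from sentence_transformers import SentenceTransformer, CrossEncoder

        docs = [
            "Static embedding models make encoding very fast.",
            "Dot products over normalized embeddings give cheap candidate retrieval.",
            "Cross-encoders score query-document pairs more accurately, but slowly.",
        ]

        # Lightweight bi-encoder embeds the whole corpus once up front.
        encoder = SentenceTransformer("all-MiniLM-L6-v2")
        doc_emb = encoder.encode(docs, normalize_embeddings=True)

        def retrieve(query: str, k: int = 2) -> list[int]:
            # Dot product over all documents (cosine, since embeddings are normalized).
            q = encoder.encode([query], normalize_embeddings=True)[0]
            return np.argsort(-(doc_emb @ q))[:k].tolist()

        # Heavy-duty cross-encoder scores only the small candidate set.
        reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

        def rerank(query: str, candidate_ids: list[int]) -> list[int]:
            scores = reranker.predict([(query, docs[i]) for i in candidate_ids])
            order = np.argsort(-np.asarray(scores))
            return [candidate_ids[i] for i in order]

        query = "How can I retrieve documents cheaply?"
        print([docs[i] for i in rerank(query, retrieve(query))])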
  • protoshell248 188 days ago
    10K embeddings generated in under 700 milliseconds!!!
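
    (A hedged sketch of how a throughput number like that could be timed, assuming a Model2Vec-style static embedding model with a batched encode(); the library and model name are assumptions, not confirmed by the thread:)

        # Time a single batched encode over 10K short texts; model name is illustrative.
        import time
        from model2vec import StaticModel

        texts = [f"document number {i}" for i in range(10_000)]
        model = StaticModel.from_pretrained("minishlab/potion-base-8M")

        start = time.perf_counter()
        embeddings = model.encode(texts)  # one batched pass over all 10K texts
        elapsed_ms = (time.perf_counter() - start) * 1000

        print(f"{len(texts)} embeddings in {elapsed_ms:.0f} ms, shape {embeddings.shape}")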