Why Agents Don't Scale: It's an Engineering Problem, Not an AI Problem

(blog.r-lopes.com)

5 points | by dovelome 12 days ago

2 comments

sermakarevich 12 days ago
I think they do scale:
-- check this bug report from AMD where they say they run fleet of 50 agents 24x7 https://github.com/anthropics/claude-code/issues/42796
-- here I am running 3 coding agents 24x5 (not yet 7 so far) https://news.ycombinator.com/item?id=48520757
I was using multi-coding agents for poc projects first, now for production research and poc second and soon plan to get to production.
[-]
- dovelome 9 days ago
  I like what you are doing! I suggest you create a proxy service that will route the traffic to the AI provider.
  You can cache questions and answers heavily and use a B25 search with a vector embedded to retrieve the best results for you.
```
   RAG pipeline
   ├── BM25 + TF-IDF + RRF retrieval
   ├── cross-encoder reranking
   ├── knowledge-graph entity linking
   └── multi-angle intent detection
        │
        ▼
   LLM synthesis  (Claude / local models)
```
  [-]
  - dovelome 9 days ago
    Check it out: https://blog.r-lopes.com/how-it-works
dovelome 12 days ago
[flagged]