Launch HN: Sentrial (YC W26) – Catch AI agent failures before your users do

(sentrial.com)

31 points | by anayrshukla 63 days ago

12 comments

ZekiAI2026 62 days ago
Interesting gap to explore: Sentrial catches drift and anomalies -- failures that happen by accident. What's the defense against failures that happen by design?
Prompt injection is the clearest example: an attacker embeds instructions in content your agent processes. The agent does exactly what it's told. No wrong tool invocations, no hallucinations in the traditional sense -- just an agent successfully executing injected instructions. From a monitoring perspective it looks like normal operation.
Same with adversarial inputs crafted to stay inside your learned "correct" patterns: tool calls are right, arguments are plausible, outputs pass quality checks. The manipulation is in what the agent was pointed at, not in how it behaved.
Curious whether your anomaly detection has a layer for adversarial intent vs. operational drift, or whether that's explicitly out of scope for now.
toniantunovi 60 days ago
Congrats on the launch! The production monitoring angle is genuinely underserved. Most teams only realize AI agent failures exist once users are complaining.
The most common failure mode we see: AI agents write code that passes all existing tests and looks fine in review, but has subtle IDOR issues, hardcoded secrets, or hallucinated package imports with vulnerable versions. Those don't surface at runtime until conditions are just right.
rajit 63 days ago
How do you identify "wrong tool" invocations (how is the "wrong tool" defined)?
[-]
- anayrshukla 63 days ago
  Good question. We don’t define “wrong tool” in some universal way, because that really depends on the workflow.
  What we do in practice is let the team mark a few tool calls as right or wrong in context, then use that to learn the pattern for that agent. From there, we can flag similar cases automatically by looking at the convo state, the tool chosen, the arguments, and what happened next.
  So we’re learning what “correct” looks like for your workflow and then catching repeats of the same kind of mistake.
Airdropaccount9 62 days ago
That sounds like a critical challenge—identifying failures early can save a lot of headaches. I’ve seen teams get stuck when issues pop up, unsure of the root cause. Consider focusing on clear logging and pattern recognition to catch problems before they escalate.
[-]
- nullpoint420 61 days ago
  That sounds like an AI written response. I’ve seen your last two posts follow the same pattern. Consider stopping your astroturf campaign.
mzelling 62 days ago
The landing page design reminds me of Perplexity's ad campaigns. It's a clean look. I'd find your product more enticing if you framed your offerings more around evaluation + automatic optimization of production agents. There's real value there. The current selling points — trace sessions, track tool calls, measure token usage, and calculate costs — seem easily implementable at home with a bit of vibe coding.
jamiemallers 62 days ago
[dead]
julius_eth_dev 62 days ago
[dead]
socialinteldev 62 days ago
[flagged]
entrustai 62 days ago
[dead]
BoorishBears 63 days ago
I know your homepage isn't your business, but I'm bet Claude could fix the janky horizontal overflow on mobile in a prompt. Makes for a very distracting read
[-]
- anayrshukla 63 days ago
  Will fix ASAP.
  [-]
  - _joel 62 days ago
    There's some serious irony in this thread.
  - lpellis 62 days ago
    The github link is also going to a 404.
    I built a tool to check for these issues, was curious if it would find it all, but yes.
    https://pagewatch.ai/s-bm6jq1qs6y1x/b560hmfx/dashboard/previ...
- claudeomusic 63 days ago
  Agreed - fix fast. No way to take a tool seriously about taking care of production that has such a blatant production issue
jc-myths 62 days ago
[flagged]
taskpod 62 days ago
Observability for agents is one piece of the puzzle, but the bigger gap is trust between agents. When agent A delegates work to agent B, how does A know B's track record? Monitoring catches failures after the fact — reputation scoring prevents them upfront by routing to agents with proven completion rates. Both layers needed.
[-]
- SomaticPirate 62 days ago
  This is an AI agent.