hero_agent: LLM provider fallback cascade + error UX #93

Closed
opened 2026-03-26 00:31:34 +00:00 by mik-tf · 1 comment
Owner

Problem

When the primary LLM provider (OpenRouter) fails (e.g. HTTP 402 credits exhausted), the agent returns "I'm here to help!" — a generic fallback that looks like a working response. The user has no idea the AI is broken and thinks it is just dumb.

Discovered when OpenRouter credits hit $0 — agent appeared completely non-functional but gave no error indication.

Three fixes needed

1. Provider fallback cascade (high priority)

When the primary model/provider fails, automatically try the next one:

claude-sonnet-4.5 (OpenRouter) → 402
  → claude-haiku-4.5 (OpenRouter) → 402
    → llama-3.3-70b-versatile (Groq) → 200 ✓ (free tier!)

Groq offers free inference for llama-3.3-70b-versatile, so the agent should always have a working fallback even when paid providers are exhausted. The model list already includes it; the agent just needs to cascade on provider errors (402, 429, 500, 503).
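A minimal sketch of the cascade described above, assuming the provider call can be abstracted as a closure that returns either the response text or an HTTP status code. The names `Candidate`, `should_cascade`, and `complete_with_fallback` are hypothetical and not taken from agent.rs; stopping at the first non-listed status is also an assumption.

```rust
/// One entry in the fallback chain (hypothetical type, not from agent.rs).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub model: &'static str,
    pub provider: &'static str,
}

/// Status codes the issue lists as reasons to move on to the next model.
pub fn should_cascade(status: u16) -> bool {
    matches!(status, 402 | 429 | 500 | 503)
}

/// Try each candidate in order.
/// Returns the first success together with the candidate that produced it,
/// or every failure seen so far if nothing succeeded.
pub fn complete_with_fallback<F>(
    chain: &[Candidate],
    mut call: F,
) -> Result<(Candidate, String), Vec<(Candidate, u16)>>
where
    F: FnMut(&Candidate) -> Result<String, u16>,
{
    let mut failures = Vec::new();
    for candidate in chain {
        match call(candidate) {
            Ok(text) => return Ok((candidate.clone(), text)),
            Err(status) => {
                failures.push((candidate.clone(), status));
                // Assumption: any status outside the listed set is treated
                // as permanent, so the cascade stops instead of continuing.
                if !should_cascade(status) {
                    break;
                }
            }
        }
    }
    Err(failures)
}
```

Returning the collected failures rather than a default string keeps the caller able to report exactly which providers were tried and what each returned.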

2. Error message passthrough (quick fix)

Stop masking errors as responses. Instead of "I'm here to help!", return:

"Sorry, I couldn't reach the AI model — the API returned: Payment Required (HTTP 402). Please check your OpenRouter credits."

This applies to both quick_response() and agent_loop() in agent.rs.
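One way the passthrough message could be built is sketched below. `reason_phrase` and `user_facing_error` are hypothetical helpers, not existing functions in agent.rs, and the wording is modeled on the example message above rather than copied from the shipped fix. The reason phrases for 402, 429, 500, and 503 are the standard HTTP ones; the fallback label for other codes is an assumption.

```rust
/// Standard HTTP reason phrases for the status codes named in this issue.
fn reason_phrase(status: u16) -> &'static str {
    match status {
        402 => "Payment Required",
        429 => "Too Many Requests",
        500 => "Internal Server Error",
        503 => "Service Unavailable",
        // Assumption: a generic label for anything not listed above.
        _ => "Unexpected Error",
    }
}

/// Build the text shown to the user when a provider call fails,
/// instead of returning a generic canned reply.
pub fn user_facing_error(provider: &str, status: u16) -> String {
    let mut msg = format!(
        "Sorry, I couldn't reach the AI model. The API returned: {} (HTTP {}).",
        reason_phrase(status),
        status
    );
    // Only a 402 points at exhausted credits, so the hint is added only then.
    if status == 402 {
        msg.push_str(&format!(" Please check your {} credits.", provider));
    }
    msg
}
```

Both quick_response() and agent_loop() could call the same helper so the two code paths report failures identically.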

3. Dashboard LLM health indicator (nice to have)

Add LLM provider status to the agent dashboard stats bar:

  • On startup or first request: test each provider with a minimal call
  • Show in stats: LLM: OpenRouter ✓ | Groq ✓ or LLM: OpenRouter ✗ (402) | Groq ✓
  • This makes provider issues immediately visible without needing to send a chat message
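The stats-bar string above could be rendered from per-provider probe results roughly as follows. `ProviderHealth` and `format_llm_status` are hypothetical names, and the sketch assumes each probe yields either success or a failing HTTP status; how the probe itself is performed is left out.

```rust
/// Result of probing one provider (hypothetical type).
/// `failure` is `None` when the minimal test call succeeded,
/// otherwise the HTTP status that came back.
pub struct ProviderHealth {
    pub name: String,
    pub failure: Option<u16>,
}

/// Render the stats-bar text in the format shown in the issue.
pub fn format_llm_status(providers: &[ProviderHealth]) -> String {
    let parts: Vec<String> = providers
        .iter()
        .map(|p| match p.failure {
            None => format!("{} ✓", p.name),
            Some(code) => format!("{} ✗ ({})", p.name, code),
        })
        .collect();
    format!("LLM: {}", parts.join(" | "))
}
```

The same string could be returned from a dedicated /api/llm-status endpoint or embedded in the /api/stats payload, whichever routes.rs ends up exposing.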

Files to modify

| File | Change |
|------|--------|
| agent.rs | Add retry loop with model fallback in quick_response() and agent_loop() |
| llm_client.rs | Detect retriable provider errors (402, 429, 500, 503) vs permanent errors |
| routes.rs | Add /api/llm-status endpoint or include in /api/stats |

Why this matters

  • Free Groq fallback means the agent never goes fully offline — even with $0 on OpenRouter
  • Error passthrough means users can self-diagnose instead of filing bugs
  • Dashboard status means admins can see provider health at a glance

Signed-off-by: mik-tf

Author
Owner

Fixed in v0.7.3-dev (https://forge.ourworld.tf/lhumina_code/hero_services/releases/tag/v0.7.3-dev). Deployed to herodev, visually verified via Hero Browser MCP. All E2E tests passing.

Reference
lhumina_code/home#93