
How to Troubleshoot AI Agent Failures

Quick answer

Troubleshoot agent failures by reviewing the last run, checking context and permissions, separating tool errors from decision errors, and replaying safely with approvals still on.


8 min read · Updated 20 March 2026

Why systematic troubleshooting matters

When an agent fails, the wrong next step is guessing. The right next step is to review exactly what the agent saw, what it tried to do, and where the breakdown happened.

A calm troubleshooting process protects trust. Teams can see whether the problem came from missing context, permissions, a changed external tool, or a task that should still stay approval-first.

Troubleshooting process

1. Start with the last failed run

Review the activity history, proposed action, tool response, and any approval notes. You need the full sequence, not just the final error line.
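As a sketch of what "the full sequence" means in practice, the helper below prints every phase of a run, not just the last error. The run record structure, field names, and the `sync-crm-contact` task are all hypothetical; real agent platforms expose run history differently.

```python
# Hypothetical run record; the shape is illustrative, not a real agent API.
failed_run = {
    "task": "sync-crm-contact",
    "steps": [
        {"phase": "plan", "detail": "Update contact email in CRM"},
        {"phase": "action", "detail": "PATCH /contacts/482", "approved": True},
        {"phase": "tool_response", "detail": "422 missing field: phone", "ok": False},
    ],
}

def summarize_run(run):
    """Render the whole step sequence so the breakdown point is visible."""
    lines = [f"Task: {run['task']}"]
    for step in run["steps"]:
        status = "" if step.get("ok", True) else "  <-- failure"
        lines.append(f"  {step['phase']}: {step['detail']}{status}")
    return "\n".join(lines)

print(summarize_run(failed_run))
```

Reading the plan and approved action alongside the tool response often shows the failure was downstream of a decision the team already signed off on.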

2. Check whether the agent had the right context

Look for missing fields, stale records, conflicting instructions, or incomplete attachments. Many agent failures begin as data problems.
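A minimal input check can catch these data problems before a run is approved. The required-field set below is an assumed schema for illustration; substitute whatever fields your task actually needs.

```python
REQUIRED_FIELDS = {"email", "account_id", "owner"}  # assumed schema, adjust per task

def missing_context(record):
    """Return required fields that are absent or empty in the input record."""
    return sorted(f for f in REQUIRED_FIELDS if not record.get(f))

record = {"email": "a@example.com", "account_id": "", "owner": "dana"}
print(missing_context(record))  # flags the empty account_id field
```

Running a check like this on the failed run's exact inputs quickly confirms or rules out a stale or incomplete record.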

3. Check permissions and tool access

Confirm the agent can still reach the system it needs and that the account still has the right access. Revoked permissions and expired credentials are common after setup changes.
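The sketch below checks for exactly those two failure modes, revoked grants and expired credentials, against hypothetical credential metadata. Real tools surface this information in their own way; the dictionary shape here is an assumption.

```python
from datetime import datetime, timezone

# Hypothetical credential metadata; real connectors expose this differently.
credentials = {
    "crm": {"granted": True, "expires": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    "email": {"granted": False, "expires": None},
}

def access_problems(creds, now=None):
    """List tools the agent can no longer use and why."""
    now = now or datetime.now(timezone.utc)
    problems = []
    for tool, c in creds.items():
        if not c["granted"]:
            problems.append(f"{tool}: permission revoked")
        elif c["expires"] and c["expires"] <= now:
            problems.append(f"{tool}: credential expired")
    return problems

print(access_problems(credentials, now=datetime(2026, 3, 20, tzinfo=timezone.utc)))
```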

4. Separate bad decisions from bad tool responses

If the agent chose the wrong action, tighten the instructions or examples. If the action was right but the tool rejected it, fix the external connection, data, or destination rules.
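That triage rule can be sketched as a small classifier. It is deliberately rough: it assumes you know what the expected action was and that the tool reports an HTTP-style status code, both of which are assumptions about your setup.

```python
def classify_failure(chosen_action, expected_action, tool_status):
    """Rough triage: was this a decision error or a tool error?"""
    if chosen_action != expected_action:
        return "decision error: tighten the instructions or examples"
    if tool_status >= 400:
        return "tool error: fix the connection, data, or destination rules"
    return "no error detected"
```

For example, a run that chose `delete` when `update` was expected is a decision error even if the tool call succeeded, while a correct `update` rejected with a 422 points at the destination system.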

5. Replay safely with approvals still on

Run the same task again in a safe review flow. Watch whether the failure repeats and note what changed between the failed and successful runs.
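One way to enforce "approvals still on" during a replay is to make the replay helper refuse to run unattended. The `runner` callable here is a stand-in for however your platform asks an agent to propose an action without executing it.

```python
def replay(task, runner, approvals_on=True):
    """Re-run a failed task in review mode; refuse to replay unattended."""
    if not approvals_on:
        raise RuntimeError("Replays must keep approvals on while investigating")
    proposal = runner(task)  # the agent proposes an action but does not execute it
    print(f"Proposed action for '{task}': {proposal} (awaiting approval)")
    return proposal
```

Comparing the proposal from the replay against the original failed run makes it easy to see what changed between the two.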

6. Write down the fix and the new guardrail

Capture what broke, how you fixed it, and what should prevent it next time. That might mean better instructions, clearer approvals, tighter permissions, or stronger input checks.
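Even a tiny structured record beats an unwritten fix. The field names below are illustrative, one minimal way to capture the three things this step asks for.

```python
from dataclasses import dataclass, asdict

@dataclass
class FixNote:
    """Minimal postmortem record; field names are illustrative."""
    what_broke: str
    how_fixed: str
    new_guardrail: str

note = FixNote(
    what_broke="CRM rejected updates missing the phone field",
    how_fixed="Added phone to the required inputs for the sync task",
    new_guardrail="Input check blocks runs without phone before approval",
)
print(asdict(note))
```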

Common issues and fixes

Issue: Missing or incomplete data
Fix: Ask for the required fields before approving the run.

Issue: Revoked permissions
Fix: Reconnect the tool or restore the right access before retrying.

Issue: Destination system rules changed
Fix: Update the task or field mapping to match the current rules.

Issue: Agent chooses the wrong action
Fix: Add better instructions and examples for that case.

Issue: External tool is slow or unavailable
Fix: Pause the run and retry only when the tool is healthy.

Issue: Team cannot tell why the agent failed
Fix: Make the plan, action, and error state easier to review.

Best practices

  • Start with the last failed run. Review what actually happened before you change anything.
  • Keep approvals on while you investigate. Do not let a shaky task keep acting on its own.
  • Change one thing at a time. Small, clear tests make the cause obvious faster.
  • Fix the root cause, not just the symptom. A retry is not a fix if the same condition will happen again.
  • Turn repeated failures into better guardrails. If a case keeps breaking, update instructions, approvals, or input checks.

Frequently asked questions

Where should we start when an agent fails?

Start with the last failed run. Review the plan, action, tool, input data, and error message together. Most failures come from missing context, revoked access, or a changed rule in the destination system.

How do we differentiate between agent errors and external system errors?

If the same action fails outside the agent, it is usually a tool or system issue. If the tool works normally but the agent chooses the wrong action or lacks context, tighten the instructions or fix the data it is using.

What if we cannot reproduce the failure?

Look at the activity history and approval trail. Transient failures often come from timing, permissions, or incomplete records. If you cannot replay it, capture the exact inputs and keep approvals on until the pattern is clear.
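Capturing the exact inputs can be as simple as snapshotting them to a file for later replay. This is a sketch; the file path and input shape are placeholders for whatever your run actually received.

```python
import json

def capture_inputs(run_inputs, path="failed_run_inputs.json"):
    """Snapshot the exact inputs of a non-reproducible failure for later replay."""
    with open(path, "w") as f:
        json.dump(run_inputs, f, indent=2, sort_keys=True)
    return path

saved = capture_inputs({"contact_id": 482, "email": "a@example.com"})
print(f"Inputs saved to {saved}")
```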

Should we fix failures immediately or batch them?

Fix anything that blocks a core task or creates risky suggestions right away. Lower-impact edge cases can be grouped into a review list once the main task is stable.