Until May 2026, debugging an Agentforce agent meant staring at Studio run history and guessing. Agent Platform Tracing and AgentLens changed that — but almost no one has written a practical walkthrough of how to actually use them.
When an agent produces a wrong answer, picks the wrong topic, or silently skips an action, Studio tells you it happened. It does not tell you why. The full observability stack — Simulate Mode, Live Test Mode, Agent Platform Tracing, and AgentLens — gives you the why. This guide walks through each layer and how they connect.
The observability stack, from surface to deep
Think of Agentforce debugging as four layers, each revealing more detail than the last:
- Simulate Mode — conversational testing inside Agentforce Builder. Useful for confirming happy-path flows but shows no internal state. Start here to reproduce the issue consistently.
- Interaction Summary panel — visible after each Simulate Mode turn. Shows the topic that was selected, the actions that were triggered, and each action's output. If the wrong topic was selected, you can see it here — but not why it won.
- Agent Platform Tracing — org-level feature that writes every LLM routing decision, Flow execution, Apex call, and external tool invocation as an OpenTelemetry trace event. This is where classification scores live.
- AgentLens — open-source visualisation layer that renders those traces as interactive finite-state machine diagrams and decision graphs. This is where you see the reasoning path, not just the outcome.
Step 1: Enable Agent Platform Tracing
In Setup, navigate to Einstein → Agentforce → Agent Platform Tracing and toggle it on at the org level. Once enabled, every subsequent agent interaction generates trace records. No restart or deployment is required — it takes effect immediately.
Trace data is stored as platform events and is queryable within seconds of a conversation ending. The default retention window is 24 hours in the platform event store. For longer retention and bulk querying, you need a Data Cloud (Data 360) allocation — traces flow into a dedicated AgentTelemetry data stream automatically once both features are active.
Enable tracing in your sandbox first, run a few Simulate Mode sessions to confirm trace records are populating, then enable in production during a low-traffic window.
Step 2: Install and connect AgentLens
AgentLens is an open-source project maintained by the Salesforce developer community. Install it from the AgentLens GitHub repository following the setup guide — it connects to your org via a Connected App and reads from the Agent Platform Tracing event stream.
Once connected, AgentLens surfaces two views: the decision graph (which topic was scored, what scores looked like across all topics, which won) and the FSM diagram (the sequence of states the agent passed through: topic selection → action execution → response generation). The FSM view is the one you want for diagnosing loops, dead ends, and unexpected state transitions.
Step 3: Read a trace tree
Each trace represents a single conversation turn. The tree has four node types:
- LLM call nodes — the Atlas routing decision. Contains the input message, the topic scores for all candidates, and the winning topic. This is where misrouting is visible: if your intended topic scored 0.71 and a competing topic scored 0.69, the agent made a reasonable choice given the classification descriptions you wrote — the fix is in the descriptions, not the platform.
- Flow execution nodes — shows each autolaunched flow that was triggered, inputs passed, outputs returned, and execution time. A flow that runs for 4.8 seconds in a turn chain is a CPU limit risk.
- Apex invocation nodes — the @InvocableMethod call, its inputs, HTTP response code, and whether the DML succeeded. A 200 with zero rows inserted is visible here; it looks like success in Studio.
- Tool call nodes — external API calls or MCP tool invocations. Latency and response payload visible. If you are running Synapse or another MCP integration, this is where those calls appear in the trace.
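To make the shape of a turn concrete, here is a minimal Python sketch of a trace tree built from the four node types above. The field names, the kind labels, and the nesting of actions under the routing call are assumptions made for illustration; they are not the actual Agent Platform Tracing schema.

```python
from dataclasses import dataclass, field
from typing import Iterator

# Hypothetical labels for the four node types described above.
NODE_KINDS = {"llm_call", "flow_execution", "apex_invocation", "tool_call"}


@dataclass
class TraceNode:
    kind: str                 # one of NODE_KINDS
    name: str                 # illustrative name of the step
    duration_ms: float
    attributes: dict = field(default_factory=dict)
    children: list["TraceNode"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {self.kind}")


def walk(node: TraceNode, depth: int = 0) -> Iterator[tuple[int, TraceNode]]:
    """Yield (depth, node) pairs in depth-first, pre-order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def render(root: TraceNode) -> str:
    """Render the tree as indented text, one node per line."""
    return "\n".join(
        f"{'  ' * depth}[{node.kind}] {node.name} ({node.duration_ms:.0f} ms)"
        for depth, node in walk(root)
    )


# Invented example turn: one routing decision followed by three actions.
turn = TraceNode("llm_call", "route_message", 420, {"winning_topic": "Refunds"}, [
    TraceNode("flow_execution", "Lookup_Order", 900),
    TraceNode("apex_invocation", "CreateRefund", 310, {"rows_inserted": 0}),
    TraceNode("tool_call", "shipping_api", 650),
])
print(render(turn))
```

Reading a real trace follows the same pattern: start at the routing node, then check each child's inputs, outputs, and duration in order.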
Step 4: Diagnose common failure patterns
Three failures appear consistently in production traces that Studio run history never surfaces:
Topic score ties: If you see two topics with scores within 0.03 of each other across multiple sessions, your classification descriptions are too similar. The agent is not malfunctioning — the descriptions genuinely both match. Rewrite them using contrast-first language: open each description with what this topic handles that no other topic handles. See our companion post on fixing topic misrouting for the full technique.
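The tie check can be sketched in a few lines of Python, assuming you have exported the per-turn topic scores as plain dictionaries. That export format and the `min_sessions` cutoff are assumptions for illustration; the 0.03 margin comes from the rule of thumb above.

```python
from collections import Counter

TIE_MARGIN = 0.03  # margin taken from the rule of thumb above


def near_tie(scores: dict[str, float], margin: float = TIE_MARGIN):
    """Return the top two topic names if their scores are within `margin`, else None."""
    if len(scores) < 2:
        return None
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (first, s1), (second, s2) = ranked[0], ranked[1]
    # Small epsilon so float rounding does not exclude scores exactly `margin` apart.
    if s1 - s2 <= margin + 1e-9:
        return (first, second)
    return None


def recurring_ties(sessions: list[dict[str, float]], min_sessions: int = 2) -> dict:
    """Count topic pairs that near-tie in at least `min_sessions` sessions."""
    counts: Counter = Counter()
    for scores in sessions:
        pair = near_tie(scores)
        if pair:
            counts[tuple(sorted(pair))] += 1  # order-insensitive pair key
    return {pair: n for pair, n in counts.items() if n >= min_sessions}


# Invented scores for three sessions.
sessions = [
    {"Billing": 0.71, "Refunds": 0.69, "Shipping": 0.20},
    {"Refunds": 0.66, "Billing": 0.65},
    {"Billing": 0.90, "Refunds": 0.40},
]
print(recurring_ties(sessions))
```

A pair that shows up here repeatedly is a candidate for the contrast-first rewrite; a pair that ties once is more likely an ambiguous user message.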
Silent FLS failures in Apex: An Apex invocable that calls Database.insert() with allOrNone set to false does not throw when a row fails; the failure is reported only in the returned SaveResult objects. If the code never inspects them, the action still returns HTTP 200 when FLS blocks individual field writes. The HTTP 200 is the transaction result, not the DML result. Fix: always check SaveResult objects and surface failures explicitly in the invocable's output schema so the agent can route to a human if DML failed.
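The pattern behind the fix is language-neutral: turn per-row results into an explicit output the agent can act on. The sketch below models it in Python rather than Apex; `RowResult` is a hypothetical stand-in for a per-row save result, and the output field names are assumptions, not a Salesforce schema.

```python
from dataclasses import dataclass


@dataclass
class RowResult:
    """Hypothetical stand-in for one per-row DML result."""
    success: bool
    errors: tuple[str, ...] = ()


def build_action_output(results: list[RowResult]) -> dict:
    """Summarise per-row results so a failed or empty write is never reported as success."""
    failed = [r for r in results if not r.success]
    return {
        "dml_succeeded": bool(results) and not failed,
        "rows_attempted": len(results),
        "rows_failed": len(failed),
        "error_messages": [msg for r in failed for msg in r.errors],
        # Escalate when anything failed or when nothing was written at all.
        "needs_human": bool(failed) or not results,
    }


# Invented example: one row saved, one blocked.
output = build_action_output([
    RowResult(True),
    RowResult(False, ("Field is not writeable: Case.Priority",)),
])
print(output)
```

The same idea carries over to an invocable's output class: expose a success flag and the error messages as output variables instead of relying on the HTTP status.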
CPU limit accumulation: Salesforce enforces a 10,000ms CPU time limit per synchronous transaction. When an agent chains three or more flows in a single turn, their execution times accumulate. A flow that takes 900ms in isolation may still be the one that tips the chain over the limit when it runs third. The trace shows cumulative CPU consumption per node; anything above 7,000ms in a multi-action turn is a risk. Fix: split multi-step operations across turns using the agent's conversation state, or move expensive logic into async Apex called via a Platform Event.
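The accumulation is simple arithmetic, sketched below in Python. The 10,000ms limit and 7,000ms risk threshold come from the paragraph above; the flow names and per-node timings are invented for illustration.

```python
from itertools import accumulate

CPU_LIMIT_MS = 10_000      # transaction limit quoted above
RISK_THRESHOLD_MS = 7_000  # risk threshold quoted above


def cpu_report(nodes: list[tuple[str, int]]) -> list[tuple[str, int, str]]:
    """Return (name, cumulative_ms, status) for each node in execution order."""
    running_totals = accumulate(ms for _, ms in nodes)
    report = []
    for (name, _), total in zip(nodes, running_totals):
        if total > CPU_LIMIT_MS:
            status = "over_limit"
        elif total > RISK_THRESHOLD_MS:
            status = "at_risk"
        else:
            status = "ok"
        report.append((name, total, status))
    return report


# Invented turn: two heavy flows followed by a 900ms flow.
chain = [("Lookup_Order", 4800), ("Calc_Refund", 4500), ("Update_Case", 900)]
for row in cpu_report(chain):
    print(row)
```

In this example the 900ms flow is the one that fails, even though the two flows before it consumed most of the budget.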
Step 5: Query bulk history in Data Cloud
For patterns across hundreds of sessions — not just one-off debugging — query the AgentTelemetry object in Data Cloud using SOQL or the Data Cloud Query API. Filter by topic name to find your highest-misrouting topics, or by action name to find your slowest flows:
SELECT TopicName, COUNT(Id) misroutes
FROM AgentTelemetry
WHERE SessionDate = LAST_N_DAYS:7
AND ClassificationResult = 'TopicMismatch'
GROUP BY TopicName
ORDER BY COUNT(Id) DESC
LIMIT 10
Running this weekly surfaces drift before it compounds. Classification descriptions that worked at launch degrade over time as users phrase requests in new ways — bulk querying catches this months before it becomes visible in CSAT.
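One way to turn the weekly numbers into a drift signal is to compare misroute rates between two weeks. The Python sketch below assumes each exported row is a dictionary with a `topic` name and a boolean `mismatch` flag; that row shape and the `min_increase` cutoff are assumptions for illustration.

```python
from collections import defaultdict


def misroute_rates(rows: list[dict]) -> dict[str, float]:
    """Return topic -> share of turns flagged as a mismatch."""
    totals: dict[str, int] = defaultdict(int)
    misses: dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row["topic"]] += 1
        if row["mismatch"]:
            misses[row["topic"]] += 1
    return {topic: misses[topic] / totals[topic] for topic in totals}


def drifting_topics(last_week: list[dict], this_week: list[dict],
                    min_increase: float = 0.05) -> list[tuple[str, float]]:
    """List topics whose misroute rate rose by at least `min_increase`, largest rise first."""
    before = misroute_rates(last_week)
    after = misroute_rates(this_week)
    rises = [
        (topic, round(rate - before.get(topic, 0.0), 4))
        for topic, rate in after.items()
        if rate - before.get(topic, 0.0) >= min_increase
    ]
    return sorted(rises, key=lambda item: item[1], reverse=True)


# Invented data: Billing goes from 1 in 10 misrouted to 3 in 10.
last_week = ([{"topic": "Billing", "mismatch": i < 1} for i in range(10)]
             + [{"topic": "Shipping", "mismatch": False} for _ in range(5)])
this_week = ([{"topic": "Billing", "mismatch": i < 3} for i in range(10)]
             + [{"topic": "Shipping", "mismatch": False} for _ in range(5)])
print(drifting_topics(last_week, this_week))
```

Comparing rates rather than raw counts keeps a busy week from looking like a regression.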
Frequently Asked Questions
What is AgentLens and how is it different from Agentforce Studio run history?
AgentLens is an observability tool that converts Agentforce OpenTelemetry traces into interactive agent graphs and FSM diagrams, showing the internal reasoning path the Atlas engine took. Studio run history shows pass/fail outcomes per turn. AgentLens shows the decision scores, action sequencing, and latency breakdown — what you need when an agent silently picks the wrong topic or skips an action with no error message.
How do I enable Agent Platform Tracing in Salesforce?
Go to Setup → Einstein → Agentforce → Agent Platform Tracing and toggle it on. Traces start populating immediately for all subsequent agent interactions. The default 24-hour retention window is sufficient for sandbox debugging; connect Data Cloud (Data 360) if you need longer retention or bulk querying via SOQL against the AgentTelemetry object.
What agent failures do trace logs reveal that Studio run history misses?
Three patterns: (1) topic score ties — two topics scoring within 0.03 of each other; the agent made a reasonable choice but the descriptions are too similar; (2) silent FLS failures — Apex actions returning HTTP 200 but writing no data due to field-level security blocking individual DML rows; (3) CPU limit accumulation — chained flows whose individual execution times sum past the 10,000ms transaction limit, causing the final flow to fail. All three look like successes in Studio but appear as anomalies in trace latency and decision-score columns.