Building + Managing an Agentic Legal Workforce


Morgan Llewellyn

Legal teams don’t need another bot. They need agents that improve outcomes, elevate data quality, and embed into real workflows. At the HIKE2 Lodge during Dreamforce, Salesforce’s Léo Murgel and HIKE2’s Morgan Llewellyn shared a practical blueprint for moving from experimentation to impact—with an operating model that keeps risk in check and value compounding.

“It’s not ‘did we build an agent?’ It’s ‘how is that agent embedded into our workflow—and who is accountable for its outcomes?’”

What leaders should take away

  • Start narrow, scale with intent. Ramp with a trusted cohort, set clear expectations, and expand only after you’ve validated usefulness, accuracy, and adoption.
  • Make governance a product feature. Guardrails, access control, “no answer” behavior, and auditability must be designed in, not bolted on.
  • Shift ownership to the business. IT secures and enables; domain leaders monitor accuracy, exceptions, and continuous improvement.
  • Treat agents like products. Define lifecycle, measure performance, manage backlog and tech debt, and evolve capabilities over time.
  • Orchestrate many agents. Users shouldn’t need to know which agent to ask; orchestration routes requests and resolves conflicts across knowledge domains.

A pragmatic maturity path for legal AI agents

Most programs stall because they try to jump to automation too soon. Use this progression:

  1. Assist — Grounded answers from a curated knowledge base (policies, FAQs).
  2. Advise — Add record-specific context (matters, accounts, contracts) to deliver precise, situational answers.
  3. Act — Execute approved actions (create an MSA, update a field, route intake) with controls, logging, and a warm human handoff.
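The progression above can be enforced in configuration rather than left to convention. A minimal sketch in Python (the maturity names mirror the list; the action allow-list is illustrative, not any product's API):

```python
from enum import IntEnum

class Maturity(IntEnum):
    ASSIST = 1  # grounded answers from a curated knowledge base
    ADVISE = 2  # adds record-specific context (matters, accounts, contracts)
    ACT = 3     # may execute approved actions, with controls and logging

# Hypothetical allow-list; a real one would come out of governance review.
APPROVED_ACTIONS = {"create_msa", "update_field", "route_intake"}

def can_execute(level: Maturity, action: str) -> bool:
    """Only an Act-level agent may act, and only on explicitly approved actions."""
    return level >= Maturity.ACT and action in APPROVED_ACTIONS
```

The point of the gate is that "jumping to automation" becomes a deliberate configuration change, not a side effect of a prompt.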

“Agents earn trust when they can say ‘I don’t know’ and hand off cleanly—confident guesses are the riskiest failures.”

Governance by design (not as a gate)

Access & content control

  • Use platform permissions to segment sensitive knowledge (e.g., legal-team-only policies).
  • Define what the agent will not do (e.g., interpreting certain contracts) to reduce harmful ambiguity.

Guardrails that prevent “confident wrong”

  • Configure refusal patterns for out-of-scope questions.
  • Keep an explicit human escalation path with context transfer.
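The two guardrails above—refusal on out-of-scope questions and escalation with context—can be wired together in a few lines. A hedged sketch (the `retrieve` callable, its confidence score, and the threshold are assumptions, not a specific platform's API):

```python
def answer_or_escalate(question, scope_keywords, retrieve, confidence_floor=0.7):
    """Refuse out-of-scope questions; hand off low-confidence answers with context."""
    if not any(kw in question.lower() for kw in scope_keywords):
        # An explicit refusal beats a confident guess on out-of-scope topics.
        return {"type": "refusal",
                "message": "That's outside my scope. Routing you to the legal team."}
    answer, confidence = retrieve(question)  # assumed: returns (text, score in 0..1)
    if confidence < confidence_floor:
        # Warm handoff: the human reviewer receives the question and the draft answer.
        return {"type": "handoff",
                "context": {"question": question, "draft": answer,
                            "confidence": confidence}}
    return {"type": "answer", "message": answer}
```

Note that the handoff carries context with it—the reviewer should never start from a blank screen.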

Auditability with restraint

  • Retain just-enough interaction data to defend SLAs, measure accuracy, and demonstrate policy alignment—within defined retention limits.

Data quality is a first-class use case

Agents can enforce better data at the source:

  • Flag conflicts across systems; prompt users to resolve before errors propagate.
  • Lean on a unified customer or matter view (e.g., a data fabric) so the agent isn’t whipsawed by duplicate or conflicting records.

“If you expose fractured data to an agent, it will amplify the fracture. Fix the fabric first.”
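A conflict check of this kind is simple to sketch. Assuming records from different systems share a matter ID (the field names here are illustrative):

```python
from collections import defaultdict

def flag_conflicts(records):
    """Flag any (matter, field) pair holding different values across source systems."""
    seen = defaultdict(set)
    for rec in records:
        for field_name, value in rec["fields"].items():
            seen[(rec["matter_id"], field_name)].add(value)
    return [{"matter_id": m, "field": f, "values": sorted(vals)}
            for (m, f), vals in seen.items() if len(vals) > 1]
```

Surfacing the conflict to the user at entry time, before any agent consumes the record, is what stops the error from propagating.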

From pilot to production: how to ramp responsibly

  • Narrow scope, clear expectations. State explicitly what the agent does today and what’s coming next.
  • Trusted cohort first. Roll out to a small, informed group to shape the experience and avoid credibility hits.
  • Continuous testing. Use LLM-based evaluators to spot drift and regressions; business experts must review nuanced, policy-grounded answers.
  • Measure adoption and exceptions. Track usage, refusal rates, handoffs, and time-to-resolution.
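The adoption and exception metrics above are cheap to instrument from day one. A minimal tracker (the metric names follow the bullet list; everything else is illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class AgentMetrics:
    """Running counts for usage, refusals, handoffs, and time-to-resolution."""
    interactions: int = 0
    refusals: int = 0
    handoffs: int = 0
    resolution_seconds: list = field(default_factory=list)

    def record(self, outcome: str, seconds: float) -> None:
        self.interactions += 1
        if outcome == "refusal":
            self.refusals += 1
        elif outcome == "handoff":
            self.handoffs += 1
        self.resolution_seconds.append(seconds)

    @property
    def refusal_rate(self) -> float:
        return self.refusals / self.interactions if self.interactions else 0.0

    @property
    def avg_resolution(self) -> float:
        secs = self.resolution_seconds
        return sum(secs) / len(secs) if secs else 0.0
```

A rising refusal rate is not automatically bad—it may mean users are probing the edges of scope, which is exactly the signal that shapes the next expansion.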

Operating model: who owns what

Business ownership

  • Define use cases, acceptance criteria, and “no-answer” boundaries.
  • Monitor accuracy, handle exceptions, and prioritize backlog.
  • Own lifecycle updates as policies and processes evolve.

IT/Platform enablement

  • Secure access, integration, observability, and orchestration across agents.
  • Provide tooling for evaluation, logging, and performance dashboards.

This shared responsibility keeps the agent aligned with legal risk tolerance while ensuring it actually moves work forward.

Orchestrating a many-agent enterprise

As capabilities grow, avoid “agent sprawl.” Establish:

  • Single entry experience where an orchestrator routes to the right specialist agent.
  • Conflict resolution rules when adjacent topics overlap (e.g., contracting vs. revenue recognition guidance).
  • Portfolio governance to prevent duplicated logic and to manage versioning and reuse.
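Routing can start as simply as affinity scores per specialist, with ties and no-matches escalated rather than guessed. A sketch under those assumptions (agent names and keywords are hypothetical; production orchestrators typically use classifiers or LLM routers, but the contract is the same—one entry point, deterministic conflict handling):

```python
def route(question: str, specialists: dict[str, set[str]]) -> str:
    """Pick the specialist whose keywords best match; escalate ties and no-matches."""
    q = question.lower()
    scores = {name: sum(kw in q for kw in kws) for name, kws in specialists.items()}
    best = max(scores.values(), default=0)
    winners = [name for name, s in scores.items() if s == best and s > 0]
    # A unique winner gets the request; overlap or no match goes to a human.
    return winners[0] if len(winners) == 1 else "human_triage"
```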

For vendors and buyers of “packaged agents”

  • Design for workflow change, not just automation. If the human process doesn’t change, value will stall.
  • Deliver opinionated actions and narrow domains. Packaged strengths lie in well-defined data and tasks; leave room for customer-specific guardrails.
  • Expose observability. Give business users dashboards for exceptions, drift, and outcome metrics.

Where to start next week

  • Pick one Assist-level use case (policy Q&A, intake deflection) with a clear refusal pattern and human handoff.
  • Harden the data fabric for that use case; remove duplicates and set access controls.
  • Define success metrics: answerability, accuracy bands, adoption, exception rate, time saved.
  • Stand up a cross-functional “agent owner” (business), with platform support from IT.
  • Pilot with a small cohort, iterate quickly, then layer Advise and—when ready—Act.

“The days of ‘we built a bot’ are over. The question now is how the agent changes your workflow, who’s accountable for it, and how you prove it’s working.”