background

Synthetic Auth Analysis - AI Agents As Employees


AI Agents As Employees

In October 2025, a research team led by Sandy Carter published "AI Agents As Employees," a 57-page paper examining how organizations are integrating AI agents into their workforce structures. The paper surveys current deployment practices, governance frameworks, and the shift from treating AI as a "tool" to treating it as a "teammate."

The document is comprehensive in scope—covering everything from technical architecture to emotional intelligence impacts, from ROI frameworks to regulatory compliance. It includes case studies from companies like JPMorgan, Shopify, PayPal, and Unstoppable Domains, and references major AI platforms from Microsoft, Salesforce, IBM, and others.

The paper's central thesis: AI agents represent a fundamental transformation in how work gets done, requiring organizations to rethink not just their technology stack but their organizational charts, governance structures, and even the "social contract" between humans and machines.

It's a well-researched survey that will likely influence enterprise AI strategy discussions. Which is precisely why it deserves rigorous scrutiny.

The following analysis examines the paper's claims against its own evidence, identifying several critical contradictions that practitioners should understand before adopting its recommendations. The issues aren't minor technical quibbles—they're fundamental gaps between the paper's aspirational framing and the operational realities it acknowledges but never reconciles.

This matters because words shape deployment decisions, and deployment decisions have consequences. When the paper calls AI agents "teammates" capable of entering "social contracts," it's not just using colorful metaphors—it's establishing frameworks that will guide how organizations think about accountability, liability, and human oversight.


SUMMARY OF KEY CLAIMS:

The paper makes several core assertions:

  • AI agents should be treated as "teammates" not "tools" (Section 4)
  • Organizations can achieve "decision explainability" through existing XAI frameworks and vendor platforms (Section 6.1)
  • We need a "new social and operational contract between humans and machines" (Section 12.1)
  • With proper governance (ethics committees, bias audits, human-in-the-loop systems), AI agents can be deployed responsibly as digital employees

These claims rest on case studies showing measurable success in narrow domains: customer support automation, document processing, compliance monitoring, and marketing campaign optimization.


1. THE EXPLAINABILITY CONTRADICTION

• Section 6.1 claims “decision explainability” through tools like LIME, XAI frameworks, and vendor platforms (Microsoft Entra, Credo AI, Salesforce)

• Section 6.3.7 admits models are “opaque black boxes” whose “decision logic is hidden”

The gap: These platforms provide workflow traceability (what API was called) not decision explainability (why the LLM produced that output)

• LIME was designed for traditional ML models; applying it to frontier LLMs produces feature attribution, not causal reasoning

• Most agents use closed-source LLMs (GPT-4, Claude) via API where you have no access to weights, training data, or internal reasoning

Bottom line: You can audit the process but not the reasoning—a critical distinction for accountability


2. THE IDENTITY MANAGEMENT GAP

• Paper devotes 57 pages to “AI agents as employees” but only ~100 words to identity management

• Brief mentions: Microsoft Entra ID assigns “unique identities” and “role-based access”

What’s missing:

  • Authentication architecture for agents
  • Authorization boundaries and privilege escalation risks
  • Identity lifecycle (provisioning, rotation, decommissioning)
  • Cross-system federation when agents span multiple platforms
  • Agent impersonation scenarios (who is liable when an agent acts “on behalf of” a human?)
  • Multi-tenant security in agent marketplaces

• Example: Synergetics agents are “crypto-wallet-enabled” and can “purchase datasets autonomously”—but no discussion of spending limits, fraud detection, or financial controls

Bottom line: Identity management is foundational to security and accountability, yet remains critically underspecified


3. THE “TEAMMATE” METAPHOR BREAKS DOWN

• Paper’s central framing: AI agents should be treated as “teammates” not “tools” (Section 4)

The problem: Teams are defined by shared understanding, mutual adjustment, and collaborative problem-solving

• Research cited in the paper (Section 8.1-8.2) admits agents:

  • “Lack natural emotional reciprocity” (line 1313-1314)
  • Cannot “genuinely understand, interpret, or reciprocate human emotions” (line 1289)
  • Lack “shared mental models” essential for team effectiveness (line 1312)
  • Have “programmed rather than emotionally grounded” responses (line 1304)

What teams actually require:

  • Negotiation of roles and responsibilities → Agents follow fixed parameters
  • Ability to question objectives when they don’t make sense → Agents execute goals as given
  • Learning from teammates’ tacit knowledge → Agents struggle with tacit knowledge (acknowledged in Section 4.3)
  • Covering for each other during unexpected challenges → Agents fail predictably at edge cases
  • Collective sense-making in ambiguous situations → Agents optimize for predefined metrics

The reality: Humans don’t have “teammates”—they have automated assistants that require constant monitoring

• If humans must verify agent outputs (Section 6.3.3’s “robust human review”), the agent isn’t a teammate, it’s a supervised intern who never graduates

Bottom line: ”Teammate” implies collaborative intelligence. What we have is conditional automation with conversational interfaces


4. THE “SOCIAL CONTRACT” FALLACY

• Conclusion claims we need “a new social and operational contract between humans and machines” (Section 12.1)

• Paper itself admits agents “lack natural emotional reciprocity” (8.2) and have “programmed rather than emotionally grounded” responses (8.2)

The problem: Contracts require reciprocity, agency, and mutual stakes. Agents have none of these.

• What the paper actually describes:

  • Operational SLAs (technical specifications)
  • Design protocols (unilateral human commitments)
  • Organizational policies (management directives)

Bottom line: This is a contract about machines among humans, not with machines. The framing obscures accountability.


5. THE BLACK BOX PROBLEM (ACKNOWLEDGED BUT UNRESOLVED)

• Section 6.3.7: “Organizations continue deploying opaque models despite these risks”

• Paper cites Air Canada liability case—company held responsible for chatbot statements

• Courts are signaling: the deploying organization (not the model provider) bears liability

The unresolved tension:

  • You’re liable for decisions made by opaque systems
  • You can’t explain why the system made that decision
  • Third-party API providers consider model internals “trade secrets”

• Proposed solution (human-in-the-loop) negates the efficiency argument for agents

Bottom line: Legal accountability requires explainability we don’t have


6. GOVERNANCE THEATER vs. GENUINE OVERSIGHT

• Paper recommends: AI ethics committees, Chief AI Officers, bias audits, transparency frameworks

Reality check: 96% of business leaders expect an AI-related data breach within a year (cited in paper, Section 6.3.6)

• These governance structures often exist to insulate C-suite from liability rather than fundamentally reshape deployment decisions

• Red teaming and Constitutional AI are valuable but don’t solve the fundamental attribution problem: when complex systems fail, determining “what went wrong” is often intractable

Bottom line: We’re retrofitting governance after deployment rather than requiring explainability as a prerequisite


7. THE DEPLOYMENT-REALITY GAP

• Paper cites success stories (Unstoppable Domains handling 32% of tickets, JPMorgan automating contracts)

What these share: Narrow scope, clear success criteria, heavy human oversight at boundaries

• Paper also cites failures: IBM laid off 8,000 HR staff for AI, then rehired; McDonald’s abandoned AI drive-through after persistent errors; Klarna rehired staff after AI “shortfalls in empathy”

Pattern: AI agents work for constrained, well-defined tasks. They struggle with context, nuance, and edge cases.

Bottom line: Current agents are sophisticated automation, not “teammates”


8. THE POWER ASYMMETRY (UNADDRESSED)

• Paper treats “AI agents as employees” as neutral technological evolution

• Never mentions: agents don’t need healthcare, don’t unionize, scale infinitely—making them infinitely preferable from a capital perspective

• 23% of workers fear replacement (cited in paper), but labor displacement framed as “organizational challenge” not systemic shift

Bottom line: This is a mechanism for labor displacement dressed in innovation rhetoric


RECOMMENDATIONS FOR PRACTITIONERS:

  1. Treat explainability claims skeptically: Ask whether you’re getting workflow logs or actual decision reasoning.
  2. Demand identity architecture: Before deployment, require detailed IAM specifications, not just “we use Entra ID”.
  3. Accept the black box reality: If using closed-source LLMs, acknowledge you’re assuming tail risk no governance can fully mitigate.
  4. Scope appropriately: Deploy agents in narrow domains where failure modes are acceptable and success criteria are measurable.
  5. Invest in adversarial testing, not aspirational governance: Red team your agents, don’t just document policies.
  6. Maintain expensive human verification where it matters: In high-stakes domains, human oversight isn’t optional—it’s liability protection.
  7. Be honest about displacement: If agents replace human roles, acknowledge it openly rather than euphemizing as “augmentation”.
  8. Call them what they are: ”Automated assistants” or “AI tools,” not “teammates”—the metaphor sets unrealistic expectations and obscures the supervision burden.

CONCLUSION:

This paper is competent surveying of the landscape but makes several conceptual leaps that practitioners should scrutinize. AI agents are powerful tools for specific, well-scoped automation tasks. Treating them as “employees” with whom we have “contracts” or as “teammates” capable of collaborative problem-solving obscures accountability and oversells current capabilities.

This analysis doesn’t diminish the paper’s value as a survey of current practices. It questions whether those practices rest on sound conceptual and technical foundations.


background

Subscribe to Synthetic Auth