Enterprises have accelerated their efforts to integrate AI into operations. More recently, many have launched pilot projects to develop agent-based systems across various domains. Most of these agents rely heavily on conversation-centric foundation models such as GPT or Gemini. But as we deploy more capable agent-based applications, we will need agent-centric models for reasoning, planning, coordination, and learning. Such agents will incorporate neurosymbolic components that access Large Reasoning Models, Large Action Models, and other agent-specific models. And instead of communicating in natural language, as they do today, multi-agent systems will need to exchange messages in specialized languages, governed by appropriate communication protocols.
The Models Agents Use
In a recent post, “Evolving the AI Agent Spectrum from Software to Embodied AI,” I arranged AI agents along a six-level spectrum based on their capabilities. In that framework, I explained how agents advance from simple, software-based assistants (Level 1) to embodied, fully autonomous agents (Level 6).
Today’s conversation-centric LLMs support Level 2 agents, which mainly interpret and respond to human instructions in natural language. They can automate routine tasks, assist with information retrieval, and aid decision-making through conversational interfaces. For example, a Level 2 “HR agent” that accesses a conversation-centric foundation model can screen candidate resumes for a particular corporate position. Similarly, a Level 2 “paralegal agent” can review corporate contracts.
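To make the Level 2 pattern concrete, here is a minimal sketch of such an HR agent. The call_chat_model helper, the data class, and the prompt wording are hypothetical stand-ins for whichever conversation-centric foundation model and client library the enterprise actually uses.

```python
from dataclasses import dataclass

@dataclass
class ScreeningResult:
    candidate: str
    summary: str

def call_chat_model(prompt: str) -> str:
    # Hypothetical stand-in for a call to a conversation-centric foundation
    # model (a GPT- or Gemini-class chat API); the real client is assumed.
    raise NotImplementedError

def screen_resume(candidate: str, resume_text: str, job_description: str) -> ScreeningResult:
    # A Level 2 agent frames the task as a natural-language instruction and
    # relays the model's conversational answer back to a human reviewer.
    prompt = (
        f"Job description:\n{job_description}\n\n"
        f"Resume:\n{resume_text}\n\n"
        "Summarize how well this candidate matches the role."
    )
    return ScreeningResult(candidate=candidate, summary=call_chat_model(prompt))
```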
As enterprises advance to Level 4–6 agents, the linguistic fluency that is the primary strength of LLMs will fall short for tasks involving multi-step reasoning, autonomous action, and coordination among agents. Consider, for instance, a comprehensive multi-agent predictive maintenance system used in manufacturing. Its agents need to predict equipment failures, dynamically reschedule production, order parts, and work together with robotic systems. Achieving this requires two distinct types of AI models:
Large Action Models (LAMs) for operational intelligence
Large Reasoning Models (LRMs) for reasoning and planning
Neurosymbolic components within these higher-level agents draw on such models to accomplish the tasks assigned to them.
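The division of labor might look roughly like the following sketch, in which an agent asks an LRM for a plan and hands each planned step to a LAM for execution. The ReasoningModel and ActionModel interfaces and the maintenance goal are illustrative assumptions, not any vendor’s actual API.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class MaintenancePlan:
    steps: list[str]   # e.g., ["reschedule line 3", "order part bearing-a117"]

class ReasoningModel(Protocol):      # stand-in for a Large Reasoning Model
    def plan(self, goal: str, telemetry: dict) -> MaintenancePlan: ...

class ActionModel(Protocol):         # stand-in for a Large Action Model
    def execute(self, step: str) -> None: ...

def handle_predicted_failure(telemetry: dict, lrm: ReasoningModel, lam: ActionModel) -> None:
    # The LRM deliberates over the goal and produces an ordered plan;
    # the LAM turns each planned step into concrete API calls or control signals.
    plan = lrm.plan(goal="avoid unplanned downtime on line 3", telemetry=telemetry)
    for step in plan.steps:
        lam.execute(step)
```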
Large Action Models
Over the past year, we have witnessed the rapid rise of LAMs, a new class of AI models designed to translate perception and context into action. LAMs generate actions, enabling agents to plan and execute complex sequences of operations in both digital and physical environments.
Large Action Models capture the relationships between states, goals, and actions. They learn from interaction data, such as API calls, robotic control signals, and software operations, rather than from static text. In effect, these models learn “how” to act in context, integrating perception (what is happening), reasoning (what should be done), and control (how to do it).
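The kind of interaction data such models learn from can be pictured as state-goal-action records. The field names, APIs, and values below are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    state: dict    # perception: screen contents, sensor readings, API responses
    goal: str      # what the agent is trying to achieve
    action: dict   # the API call, software operation, or control signal taken

# Illustrative (hypothetical) training records: a LAM learns the mapping from
# state and goal to action from logged interactions rather than static text.
examples = [
    Interaction(
        state={"app": "erp", "screen": "purchase_orders", "part": "bearing-a117", "stock": 0},
        goal="restock bearing-a117 before scheduled maintenance",
        action={"api": "erp.create_purchase_order", "args": {"part": "bearing-a117", "qty": 20}},
    ),
    Interaction(
        state={"robot": "arm-2", "gripper": "open", "pose": [0.4, 0.1, 0.3]},
        goal="position the gripper above the conveyor",
        action={"control": "arm-2.move_to", "args": {"pose": [0.4, 0.1, 0.55]}},
    ),
]
```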
Google DeepMind’s RT-2 and NVIDIA’s Cosmos platform are state-of-the-art LAMs.
Large Reasoning Models
To enable the transition to Level 5 (Autonomous Learning), agents need the specialized cognitive structures provided by Large Reasoning Models. LRMs provide the cognitive component powering LAMs; they are architecturally structured and specially trained for deliberation rather than general conversation.
They utilize formal deliberation techniques such as Tree-of-Thought (ToT) search, allowing the model to explore multiple potential solutions, evaluate their logic, and select the safest, most robust path. LRMs are engineered to move beyond pattern-matching (situational adaptation) to abstract deduction, enabling an agent to act effectively in novel situations, which is the defining characteristic of Level 5 autonomy.
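The search pattern behind ToT-style deliberation can be sketched as a simple beam search over candidate “thoughts.” In practice the expand and score callbacks would be backed by calls to the reasoning model; the strategy shown here is one simplified variant, not any specific model’s implementation.

```python
from typing import Callable

def tree_of_thought_search(
    root: str,
    expand: Callable[[str], list[str]],   # propose candidate next "thoughts"
    score: Callable[[str], float],        # estimate how promising/safe a partial solution is
    is_solution: Callable[[str], bool],
    beam_width: int = 3,
    max_depth: int = 5,
) -> str | None:
    # Explore several candidate reasoning paths, keep only the best-scoring
    # ones at each depth, and return the first complete solution found.
    frontier = [root]
    for _ in range(max_depth):
        candidates = [child for node in frontier for child in expand(node)]
        if not candidates:
            return None
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
        for node in frontier:
            if is_solution(node):
                return node
    return None
```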
The Language of Coordination
Using natural language for inter-agent communication in multi-agent systems is ineffective. Natural language is too slow and ambiguous. Inter-agent communication must prioritize efficiency and precision. Instead, agents must communicate using standardized data structures that explicitly articulate the message’s intent and content, replacing verbose text with high-bandwidth, structured exchange.
Agent Communication Languages (ACLs), such as KQML and FIPA ACL, require that every message contain an intent (a performative) and a structured content payload. Agents exchange formal commands like (REQUEST :sender SchedulingAgent :content check_part_availability), which are unambiguous and instantly machine-readable.
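In an enterprise setting, such a message might be represented as a typed data structure along the following lines. The fields are illustrative, loosely in the spirit of FIPA ACL rather than the normative specification.

```python
from dataclasses import dataclass

@dataclass
class ACLMessage:
    performative: str              # the intent, e.g. "request", "inform", "agree"
    sender: str
    receiver: str
    content: dict                  # structured, machine-readable payload
    ontology: str = "maintenance"  # shared vocabulary both agents agree on
    conversation_id: str = ""      # ties replies back to the originating request

# The scheduling agent asks the inventory agent to check part availability.
msg = ACLMessage(
    performative="request",
    sender="SchedulingAgent",
    receiver="InventoryAgent",
    content={"action": "check_part_availability", "part": "bearing-a117"},
    conversation_id="maint-2031",
)
```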
The structured messages are governed by protocols that enforce reliability and security, enabling agents to coordinate. The Agent-to-Agent (A2A) Protocol standardizes the secure and reliable transfer of structured messages between agents. This is a necessity for Level 4–6 multi-agent systems, in which agents from different enterprise applications, e.g., the Maintenance Agent and the ERP Ordering Agent, must collaborate seamlessly and maintain context across long-running workflows. The Model Context Protocol (MCP) standardizes the “model-to-tool” interaction: it ensures that an agent always supplies its underlying model with the necessary tools, memory, and grounded context, which leads to more reliable responses. MCP thus helps keep the reasoning process stable and grounded, reducing the risk of hallucination.
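The “model-to-tool” pattern that MCP standardizes can be illustrated with a declared tool plus a handler that routes the model’s structured calls to the real system. This is a simplified sketch, not the normative MCP schema; the tool name and field names are assumptions made for the maintenance example.

```python
# Simplified illustration of the "model-to-tool" pattern MCP standardizes:
# each tool is declared with a name, a description, and a typed input schema
# so the underlying model knows exactly what it may call. Field names are
# illustrative, not the normative MCP schema.
check_part_availability_tool = {
    "name": "check_part_availability",
    "description": "Look up current stock for a spare part in the ERP system.",
    "input_schema": {
        "type": "object",
        "properties": {"part_id": {"type": "string"}},
        "required": ["part_id"],
    },
}

def handle_tool_call(name: str, arguments: dict) -> dict:
    # The agent routes the model's structured tool call to the real system and
    # returns a grounded result the model can reason over.
    if name == "check_part_availability":
        return {"part_id": arguments["part_id"], "in_stock": 12}  # stubbed ERP lookup
    raise ValueError(f"unknown tool: {name}")
```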
Bridging LLMs and LAMs in Enterprise AI
Conversation-centric LLMs will continue to facilitate human-agent interaction, while LRMs/LAMs will handle planning and execution. Enterprises are experimenting with this paradigm in areas such as customer service automation, industrial maintenance, and digital operations. Over time, this integration could evolve toward agent-centric foundation models that incorporate conversational, perceptual, and operational intelligence. Such models would be capable of reasoning about both language and action within the same semantic framework. They would enable autonomous collaboration, long-term memory, and adaptive decision-making in dynamic enterprise environments.
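A rough sketch of how this division of labor could be wired together places a conversation-centric LLM at the human boundary with LRM/LAM components behind it. The chat_llm, lrm, and lam objects and their methods below are hypothetical placeholders, not an existing framework.

```python
def handle_user_request(user_message: str, chat_llm, lrm, lam) -> str:
    # The conversation-centric LLM interprets the user's intent...
    goal = chat_llm.extract_goal(user_message)
    # ...a Large Reasoning Model deliberates and produces a plan...
    plan = lrm.plan(goal)
    # ...a Large Action Model carries out each step against real systems...
    results = [lam.execute(step) for step in plan]
    # ...and the LLM reports the outcome back in natural language.
    return chat_llm.summarize(goal, results)
```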
As Professor Gary Marcus recently argued, the future of AI lies in specialized systems that integrate multiple forms of reasoning rather than relying on monolithic general-purpose models. Agent-centric foundation models embody precisely this vision. They move beyond general language understanding toward contextual, action-oriented intelligence to make consequential decisions.
Looking Ahead
Enterprises will transition from today’s AI-enhanced monolithic applications to AI-first multi-agent systems built on agent-centric models. The emergence of agent-centric models signals the rise of cognitive infrastructures in the enterprise. These infrastructures will integrate various model types (LLMs, LRMs, LAMs) with neurosymbolic reasoning components and formal communication protocols (A2A, MCP). Enterprises that succeed in this transition will understand and respond to their customers’ and users’ intents accurately, efficiently, and safely.


