Skip to content

AgentProcessingUnit (APU)

The ConversationalAPU (Agent Processing Unit) allows for multiple llm support including multimedia support and function calling (even if the llm does not support it). It does the by defining units that can patch in missing functionality for the llm. For example, defining an image_unit will allow that image unit to describe an image for you llm, allowing even a small model to understand the image and respond to it.

Basic Flow

The llm_unit will be called all the messages retrieved from memory (as controlled by the memory unit), all new messages, and the tools dynamically created from the logic_units. If the llm_unit executes any tools, it will be fed the response(s) and be executed again. When done the response is returned (or streamed).

Spec

KeyDescription
io_unittype: Reference[IOUnit]
Default: Reference[IOUnit]
Description: Manages the ingress and egress of data, converting messages to and from formats understandable by both the agent and external entities.
memory_unittype: Reference[MemoryUnit]
Default: Reference[MemoryUnit]
Description: Archives and retrieves messages exchanged during agent interactions, crucial for maintaining conversation history and context-aware responses.
logic_unitstype: List[Reference[LogicUnit]]
Default: []
Description: Hosts the essential logic for task execution, enabling the agent to perform computations and other actions.
llm_unittype: Reference[LLMUnit]
Default: Reference[LLMUnit]
Description: Interacts with Language Learning Models (LLMs), managing the complexities of forming requests and interpreting responses.
audio_unittype: Optional[Reference[AudioUnit]]
Default: None
Description: Processes audio inputs and outputs, integrated depending on the specific requirements of the deployment.
image_unittype: Optional[Reference[ImageUnit]]
Default: None
Description: Handles image processing tasks, integrated based on the deployment needs.
document_processortype: Reference[DocumentProcessor]
Default: Reference[DocumentProcessor]
Description: Manages document-related tasks within the agent’s operational scope.