CopilotKit & AG-UI: Building Fullstack Agents & Generative UI

This guide covers CopilotKit, which provides developer infrastructure for building user-interactive AI agents, and AG-UI, the protocol that connects those agents to user-facing applications. It walks through the Generative UI spectrum and agent state management.


Introduction

CopilotKit provides developer infrastructure, including open-source SDKs and a self-hostable cloud, to simplify the creation of full-stack, user-interactive AI agents. AG-UI (Agent-User Interaction Protocol) acts as a horizontal standard, enabling seamless communication between agent backends and diverse user-facing applications across web, mobile, and voice.

Configuration Checklist

Element	Version / Link
Language / Runtime	TypeScript, Python
Main library	CopilotKit, AG-UI
Required APIs	Google ADK, Microsoft Agent Framework, AWS Bedrock AgentCore, LangChain, Pydantic AI, Agno, LlamaIndex, Mistral, AG-SCORE
Keys / credentials needed	API keys for respective LLM providers and services (e.g., Google, Microsoft, AWS)

Step-by-Step Guide

Step 1 — Understanding the Agent-User Interaction Stack

Building full-stack agentic applications is challenging because agents break the traditional request/response paradigm of the web. Agentic software is long-running, requires streaming and reconnection, carries both structured and unstructured data (text, voice, tool calls, state updates), and demands composition (handing off control to sub-agents). CopilotKit and AG-UI provide the layers needed to connect the agentic world with user-facing applications.
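
To make the contrast with request/response concrete, here is a minimal sketch of a frontend folding a stream of agent events into view state. The event names and shapes are hypothetical and simplified for illustration; they are not the AG-UI wire format.

```typescript
// Hypothetical, simplified event shapes for illustration only.
type AgentEvent =
  | { type: "text_delta"; messageId: string; delta: string }
  | { type: "tool_call"; toolName: string; args: Record<string, unknown> }
  | { type: "state_update"; state: Record<string, unknown> }
  | { type: "run_finished" };

interface ViewState {
  messages: Record<string, string>; // message id -> accumulated text
  toolCalls: string[]; // names of tools the agent invoked
  shared: Record<string, unknown>; // latest structured state
  done: boolean;
}

const initialView: ViewState = { messages: {}, toolCalls: [], shared: {}, done: false };

// Fold one streamed event into the view state without mutating the input.
function applyEvent(view: ViewState, event: AgentEvent): ViewState {
  switch (event.type) {
    case "text_delta":
      return {
        ...view,
        messages: {
          ...view.messages,
          [event.messageId]: (view.messages[event.messageId] ?? "") + event.delta,
        },
      };
    case "tool_call":
      return { ...view, toolCalls: [...view.toolCalls, event.toolName] };
    case "state_update":
      return { ...view, shared: { ...view.shared, ...event.state } };
    case "run_finished":
      return { ...view, done: true };
  }
}

// Replaying the same event log (for example after a reconnect) rebuilds the same view.
function replay(events: AgentEvent[]): ViewState {
  return events.reduce(applyEvent, initialView);
}
```

Because the view is derived purely from the event log, a client that reconnects mid-run can replay the events it missed instead of restarting the interaction.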

Step 2 — Implementing Controlled Generative UI

Controlled Generative UI allows developers to define a collection of pre-built, pre-defined components. The agent then selects which of these components to show based on the user's query. This approach offers pixel-perfect designs and is maximally deterministic, making it ideal for the most frequently used surfaces in an application (e.g., flight tickets in an airline app).

Example Code (React Frontend with CopilotKit):

import React from "react";
import { useComponent } from "@copilotkit/react-core"; // [Editor's note: hook name to verify in the official documentation]
import { z } from "zod"; // Used for schema definition, as mentioned by the speaker

// Define a custom PieChart React component (implementation not shown)
interface PieChartProps {
  title: string;
  description?: string;
  data: { label: string; value: number }[];
}
const PieChart: React.FC<PieChartProps> = ({ title, description, data }) => {
  // ... PieChart rendering logic
  return (
    <div>
      <h3>{title}</h3>
      {description && <p>{description}</p>}
      {/* Render chart based on data */}
    </div>
  );
};

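// Register the PieChart component with CopilotKit.
// Hooks must be called inside a React component, not at module top level,
// so the registration lives in a small component mounted inside the app.
function PieChartRegistration() {
  useComponent({
    name: "PieChart",
    description: "Displays a pie chart.",
    parameters: z.object({ // Define expected parameters using Zod
      title: z.string(),
      description: z.string().optional(),
      data: z.array(z.object({
        label: z.string(),
        value: z.number(),
      })),
    }),
    render: (props) => <PieChart {...props} />, // Specify the React component to render
  });
  return null; // Renders nothing itself; it only registers the component
}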

Install Command (implied):

npm install @copilotkit/react-core zod
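
The selection step in Controlled Generative UI can also be sketched without any framework: the agent emits a component name plus props, and the frontend validates the props before rendering a component it already owns. Every name below (`RegisteredComponent`, `register`, `renderSelection`) is hypothetical and is not CopilotKit's API; rendering returns a string to keep the sketch self-contained, and the hand-written validation stands in for a Zod schema.

```typescript
interface PieChartProps {
  title: string;
  data: { label: string; value: number }[];
}

interface RegisteredComponent<P> {
  description: string;
  parse: (input: unknown) => P | null; // returns null when props are invalid
  render: (props: P) => string; // returns markup as a string for simplicity
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

const pieChart: RegisteredComponent<PieChartProps> = {
  description: "Displays a pie chart.",
  parse: (input) => {
    if (!isRecord(input)) return null;
    const { title, data } = input;
    if (typeof title !== "string" || !Array.isArray(data)) return null;
    const slices: PieChartProps["data"] = [];
    for (const item of data) {
      if (!isRecord(item)) return null;
      const { label, value } = item;
      if (typeof label !== "string" || typeof value !== "number") return null;
      slices.push({ label, value });
    }
    return { title, data: slices };
  },
  render: (props) =>
    `<figure><figcaption>${props.title}</figcaption>` +
    `${props.data.map((d) => `${d.label}: ${d.value}`).join(", ")}</figure>`,
};

type Renderer = (input: unknown) => string | null;

// Combine validation and rendering so the registry can hold mixed prop types.
function register<P>(component: RegisteredComponent<P>): Renderer {
  return (input) => {
    const props = component.parse(input);
    return props === null ? null : component.render(props);
  };
}

const registry = new Map<string, Renderer>([["PieChart", register(pieChart)]]);

// The agent only chooses a name and props; it never supplies markup.
function renderSelection(selection: { name: string; props: unknown }): string {
  const renderer = registry.get(selection.name);
  if (renderer === undefined) return `Unknown component: ${selection.name}`;
  return renderer(selection.props) ?? `Invalid props for ${selection.name}`;
}
```

The determinism described in this step comes from this split: the model's output is constrained to a closed set of names and validated props, and everything visual is authored ahead of time.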

Step 3 — Implementing Declarative Generative UI

Declarative Generative UI involves developers declaring a catalog of "lego-like" building blocks (your designs). The agent then dynamically assembles these components on demand to answer any user query. This approach is suitable for the "long tail" of user interfaces and internal enterprise applications, where flexibility and simplicity of implementation are more important than pixel-perfect designs. It allows for custom styling across experiences and works across web and mobile platforms.

Example Code (A2UI in Action, inspired by Google's A2UI and CopilotKit):

import { createCatalogFromDefinitions } from "@copilotkit/react-core"; // [Editor's note: function name to verify in the official documentation]
import { z } from "zod";
import React from "react";

// 1. Declare component definitions (schema for agent to understand)
export const catalogDefinitions = {
  Card: {
    description: "A styled card container.",
    props: z.object({
      title: z.string(),
      subtitle: z.string().optional(),
      body: z.string().optional(),
    }),
  },
  PrimaryButton: {
    description: "A styled primary button.",
    props: z.object({
      label: z.string(),
      onClick: z.any().optional(), // [Editor's note: type to verify in the official documentation]
    }),
  },
  // ... other component definitions like charts, tables, etc.
};

// 2. Define renderers for these components (how they look in React)
// Assuming custom React components are in a './components' directory
import { Card as CardComponent } from "./components/Card";
import { PrimaryButton as PrimaryButtonComponent } from "./components/PrimaryButton";

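// Props are typed from the Zod schemas above so the renderers stay in sync with the definitions
export const catalogRenderers = {
  Card: (props: z.infer<typeof catalogDefinitions.Card.props>) => <CardComponent {...props} />,
  PrimaryButton: (props: z.infer<typeof catalogDefinitions.PrimaryButton.props>) => (
    <PrimaryButtonComponent {...props} />
  ),
  // ... other component renderers
};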

// 3. Create the catalog that the agent will use
const catalog = createCatalogFromDefinitions(catalogDefinitions, catalogRenderers);

// To teach an AG-UI compatible agent to use this catalog (example with a hypothetical agent setup):
// [Editor's note: specific API for agent integration to verify in the official documentation]
// For a Python agent (e.g., LangChain):
// agent.add_tool(catalog.to_langchain_tool());
// For a React frontend, the catalog is passed to the CopilotKit provider.

Install Command (implied):

npm install @copilotkit/react-core zod
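
The assembly step can be illustrated independently of any library. In the sketch below, the agent returns a tree of nodes that reference catalog entries by name, and the frontend walks the tree and renders each node with its own building block. The `UINode` shape and the `blocks` catalog are assumptions made for illustration; they are not the A2UI or CopilotKit format.

```typescript
// Hypothetical node shape an agent might emit.
interface UINode {
  component: string;
  props?: Record<string, string>;
  children?: UINode[];
}

type BlockRenderer = (props: Record<string, string>, children: string) => string;

// Escape agent-provided text before placing it into markup.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// The catalog: each building block carries the developer's own styling.
const blocks = new Map<string, BlockRenderer>([
  [
    "Card",
    (props, children) =>
      `<section class="card"><h3>${escapeHtml(props.title ?? "")}</h3>${children}</section>`,
  ],
  [
    "PrimaryButton",
    (props) => `<button class="primary">${escapeHtml(props.label ?? "")}</button>`,
  ],
]);

// Recursively render a tree, rejecting anything that is not in the catalog.
function renderTree(node: UINode): string {
  const renderer = blocks.get(node.component);
  if (renderer === undefined) {
    throw new Error(`Component not in catalog: ${node.component}`);
  }
  const children = (node.children ?? []).map((child) => renderTree(child)).join("");
  return renderer(node.props ?? {}, children);
}
```

Compared with the controlled approach, the agent decides the layout here, but it can only compose blocks the developer has declared, which is where the custom styling guarantee comes from.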

Step 4 — Implementing Open-Ended Generative UI

Open-Ended Generative UI allows agents to return fully open-ended content (e.g., raw HTML/JavaScript) or embed third-party applications (MCP Apps) within a secure iframe. This provides maximum flexibility, as the agent effectively owns the entire visual layer. It's particularly useful for integrating third-party tools or creating highly custom, unforeseen user experiences.

Example Code (Python Agent for MCP Apps):

from google.adk.agents import LlmAgent  # Agent class from Google's Agent Development Kit (ADK)

agent = LlmAgent(
    name="open_gen_agent",
    model="gemini-2.0-flash",
    instruction="You can generate interactive HTML content for the user, or embed third-party apps.",
    # [Editor's note: specific API for MCP app integration to verify in the official documentation]
    # For example, to enable embedding Excalidraw:
    # mcp_apps=["https://excalidraw.app"]
)

Example Code (React Frontend for Open-Ended UI):

import { useCopilotKit } from "@copilotkit/react-core";
import React from "react";

function App() {
  useCopilotKit({
    // Enable the agent to generate open-ended UI content
    openGenerativeUI: true,
    // [Editor's note: specific API for MCP app integration to verify in the official documentation]
    // If supporting MCP apps, list their URLs:
    // mcpApps: ["https://excalidraw.app"]
  });

  return (
    <div>
      {/* Your main application UI */}
      {/* Agent-generated content will typically be rendered in a secure iframe */}
      <div id="agent-generated-ui-container"></div>
    </div>
  );
}

Install Commands (implied):

pip install google-adk # Example for Python agent
npm install @copilotkit/react-core
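
As a rough illustration of the "secure iframe" idea, the sketch below wraps agent-generated HTML in an iframe that uses the standard HTML `sandbox` and `srcdoc` attributes. The function names are hypothetical, and this is not how CopilotKit implements it; a production setup would likely add a Content Security Policy and a message-passing channel as well.

```typescript
// Escape the characters that would otherwise terminate the attribute value.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Build iframe markup that isolates agent-generated content from the host page.
function buildSandboxedFrame(agentHtml: string, title: string): string {
  // sandbox="allow-scripts" lets the content run scripts while keeping it in
  // an opaque origin, because allow-same-origin is deliberately omitted.
  return (
    `<iframe sandbox="allow-scripts" title="${escapeAttribute(title)}" ` +
    `srcdoc="${escapeAttribute(agentHtml)}"></iframe>`
  );
}
```

With `allow-same-origin` left out, scripts inside the frame cannot read the host application's cookies, storage, or DOM, which addresses the main risk of letting an agent own the visual layer.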

Step 5 — Managing Agent State (Shared State)

Agent state refers to the bidirectional synchronization of structured data between agents and user interfaces. This allows for collaborative interactions where both the user and the agent can read and write to a shared piece of state (e.g., a to-do list). This abstraction simplifies building complex, interactive agentic applications like custom cursors.

Example Code (Python Agent with Shared State):

from google.adk.agents import LlmAgent
from typing import List, Dict

# Define a function that represents the shared state (e.g., a list of todos)
def todos_context_state(todos: List[Dict]) -> Dict:
    """
    Manages a list of to-do items for the user.
    Each todo item should be a dictionary with 'id', 'text', and 'done' keys.
    """
    # In a real application, this would interact with a persistent storage
    return {"status": "success", "todos": todos}

# Create an agent that can interact with the todos state
todos_agent = LlmAgent(
    name="todo_agent",
    model="gemini-2.0-flash",
    tools=[todos_context_state], # Agent can call this tool to read/write todos
    # [Editor's note: specific API for initial state setup to verify in the official documentation]
    # Example: a schema describing the initial state, shaped like {"todos": [{"id": int, "text": str, "done": bool}]}
)

Example Code (React Frontend with Shared State):

import { useAgent } from "@copilotkit/react-core";
import React, { useState, useEffect } from "react";
import { z } from "zod";

// Define the schema for the shared todos state
const todoSchema = z.object({
  id: z.number(),
  text: z.string(),
  done: z.boolean(),
});
const todosStateSchema = z.object({
  todos: z.array(todoSchema),
});

function TodoApp() {
  // Use the useAgent hook to access and manage agent state
  const { agentState, setAgentState } = useAgent<z.infer<typeof todosStateSchema>>({
    initialState: { todos: [] }, // Initialize agent state
    stateSchema: todosStateSchema, // Provide schema for validation
  });

  // Local UI state, synchronized with agentState
  const [localTodos, setLocalTodos] = useState(agentState.todos);

  // Effect to keep local UI state in sync with agentState changes
  useEffect(() => {
    setLocalTodos(agentState.todos);
  }, [agentState.todos]);

  const addTodo = (text: string) => {
    const newTodos = [...localTodos, { id: Date.now(), text, done: false }];
    setLocalTodos(newTodos);
    setAgentState({ todos: newTodos }); // Update agent state, which syncs with the agent
  };

  const toggleTodo = (id: number) => {
    const newTodos = localTodos.map(todo =>
      todo.id === id ? { ...todo, done: !todo.done } : todo
    );
    setLocalTodos(newTodos);
    setAgentState({ todos: newTodos });
  };

  return (
    <div>
      <h2>My Todos</h2>
      <input
        type="text"
        placeholder="Add a new todo"
        onKeyDown={(e) => {
          if (e.key === 'Enter' && e.currentTarget.value) {
            addTodo(e.currentTarget.value);
            e.currentTarget.value = '';
          }
        }}
      />
      <ul>
        {localTodos.map((todo) => (
          <li key={todo.id} style={{ textDecoration: todo.done ? 'line-through' : 'none' }}>
            {todo.text}
            <button onClick={() => toggleTodo(todo.id)}>Toggle</button>
          </li>
        ))}
      </ul>
      {/* Agent can also modify 'todos' via its tools, and the UI will update */}
    </div>
  );
}

Install Commands (implied):

pip install google-adk # Example for Python agent
npm install @copilotkit/react-core zod
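
The bidirectional part of shared state can be sketched as a tiny store that both the user interface and the agent write through. The `SharedState` class and the `Writer` type are hypothetical and in-memory only; a real setup would carry these updates over the network between frontend and agent backend.

```typescript
interface Todo {
  id: number;
  text: string;
  done: boolean;
}

interface TodosState {
  todos: Todo[];
}

type Writer = "user" | "agent";
type Listener<S> = (state: S, writer: Writer) => void;

class SharedState<S> {
  private state: S;
  private listeners: Listener<S>[] = [];

  constructor(initial: S) {
    this.state = initial;
  }

  get(): S {
    return this.state;
  }

  // Both sides write through the same method, so every subscriber sees every change.
  set(update: (prev: S) => S, writer: Writer): void {
    this.state = update(this.state);
    this.listeners.forEach((listener) => listener(this.state, writer));
  }

  // Returns a function that removes the listener again.
  subscribe(listener: Listener<S>): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
}
```

A short usage example: the user adds a to-do and the agent marks it done, with both changes flowing through the same store.

```typescript
const todosStore = new SharedState<TodosState>({ todos: [] });
todosStore.set((prev) => ({ todos: [...prev.todos, { id: 1, text: "Book flight", done: false }] }), "user");
todosStore.set((prev) => ({ todos: prev.todos.map((t) => (t.id === 1 ? { ...t, done: true } : t)) }), "agent");
```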

Comparison Tables

The Generative UI Spectrum

Feature	Controlled Gen UI	Declarative Gen UI	Open-Ended Gen UI
What is it?	Pre-defined components. Agent selects which to show.	Developers declare a component catalog of lego-like building blocks (your designs). The agent assembles the components together.	Agents return fully open-ended content (HTML/JS), rendered in an embedded iframe.
Great for	Few most-used surfaces in your application — the workhorse of the Gen UI Universe.	The long tail of UIs in consumer apps & internal enterprise applications.	3rd party apps + fully custom experiences.
Why?	Pixel-perfect designs, maximally deterministic.	Flexibility and simplicity of implementation matter more than deterministic, pixel-perfect designs.	Maximum flexibility: the agent owns the entire visual layer.
Pros	Pixel-perfect designs, maximally deterministic.	Custom styling across experiences, works with web or mobile.	Fully open UI control.
Cons	One surface per interaction.	Less visual control, less deterministic.	Harder to design an app within an app, non-uniform presentation, security concerns.
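
One way to read the table is as a decision rule per surface. The sketch below encodes that reading; the trait names and the order of the checks are assumptions made for illustration, not guidance from CopilotKit.

```typescript
type Paradigm = "controlled" | "declarative" | "open-ended";

interface SurfaceTraits {
  highTraffic: boolean; // one of the few most-used surfaces in the app
  needsPixelPerfect: boolean; // branded or design-critical
  thirdPartyContent: boolean; // embeds an external app or arbitrary HTML/JS
}

// Map a surface's traits to a point on the Generative UI spectrum.
function chooseParadigm(traits: SurfaceTraits): Paradigm {
  if (traits.thirdPartyContent) return "open-ended";
  if (traits.highTraffic || traits.needsPixelPerfect) return "controlled";
  return "declarative"; // the long tail of surfaces
}
```

For example, an airline's flight-ticket view (high traffic, design-critical) maps to "controlled", while an occasional internal report maps to "declarative".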

⚠️ Common Mistakes & Pitfalls

Ignoring the "Agentic Break": Traditional request/response paradigms are insufficient for agentic systems. Developers often struggle with glue code to manage long-running, streaming, and stateful interactions. Fix: Adopt frameworks like CopilotKit that are purpose-built for the agentic paradigm, abstracting away these complexities.
Loss of User Analytics: Deploying agentic applications can invalidate existing user analytics because agent-mediated interactions are different from direct user-UI interactions. Fix: Implement an "Insight Layer" to specifically track and understand how users interact with agentic applications in production.
Struggling with Agent Improvement: Making agents smarter and more autonomous is challenging due to the need for high-quality feedback data on agent performance. Fix: Integrate Continuous Learning from Human Feedback (CLHF) mechanisms, where user actions (like accepting/rejecting agent suggestions or editing agent output) are used as signals to continuously train and improve agent models.
Security Risks with Open-Ended UI: Allowing agents to generate arbitrary HTML/JavaScript directly into the main application can introduce significant security vulnerabilities. Fix: Always render agent-generated open-ended content within a secure, isolated environment, such as an embedded iframe, to prevent malicious code execution.
Lack of UI Control for Complex Scenarios: Relying solely on fully open-ended generative UI can lead to unpredictable or non-uniform user experiences, especially for critical or branded surfaces. Fix: Utilize the Generative UI Spectrum by choosing the appropriate paradigm (Controlled, Declarative, or Open-Ended) based on the specific context, balancing control, flexibility, and determinism.
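
The feedback signals mentioned in the agent-improvement pitfall (accepting, rejecting, or editing agent output) can be captured with very little code. The sketch below is a hypothetical data model, not part of CopilotKit: it counts signals for an insight view and turns edits into before/after pairs that a training pipeline could consume.

```typescript
type FeedbackKind = "accepted" | "rejected" | "edited";

interface FeedbackSignal {
  suggestionId: string;
  kind: FeedbackKind;
  agentOutput: string;
  finalOutput?: string; // present when the user edited the suggestion
}

interface PreferencePair {
  rejected: string; // what the agent produced
  preferred: string; // what the user changed it to
}

// Count signals per kind, e.g. for an analytics dashboard.
function summarize(signals: FeedbackSignal[]): Record<FeedbackKind, number> {
  const counts: Record<FeedbackKind, number> = { accepted: 0, rejected: 0, edited: 0 };
  for (const signal of signals) {
    counts[signal.kind] += 1;
  }
  return counts;
}

// Keep only edits that include the user's final text.
function toPreferencePairs(signals: FeedbackSignal[]): PreferencePair[] {
  const pairs: PreferencePair[] = [];
  for (const signal of signals) {
    if (signal.kind === "edited" && signal.finalOutput !== undefined) {
      pairs.push({ rejected: signal.agentOutput, preferred: signal.finalOutput });
    }
  }
  return pairs;
}
```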

Glossary

Agentic System: A software system powered by AI agents that can perform tasks autonomously or semi-autonomously, often interacting with users and tools.
Generative UI: User interfaces that are dynamically generated or assembled by AI models (like LLMs) based on user input or agent actions.
AG-UI Protocol: A horizontal standard for Agent-User Interaction, enabling seamless communication and data exchange between agent backends and user-facing frontends.

Key Takeaways

The future of user interfaces is AI-driven, with agentic systems mediating interactions between humans and technology.
CopilotKit offers open-source SDKs and a self-hostable cloud solution specifically designed for building full-stack agentic applications.
AG-UI is a crucial protocol that standardizes agent-user interaction, allowing agents to connect with various frontends (web, mobile, voice, messaging).
Generative UI exists on a spectrum: Controlled (for deterministic, pixel-perfect designs), Declarative (for flexible assembly of building blocks), and Open-Ended (for maximum customization and third-party app integration).
Agent state management is essential for enabling collaborative experiences where agents and users can bidirectionally sync and operate on shared data.
The Agent-User Interaction Stack emphasizes Enablement (building agents), Insight (understanding user interaction with agents), and Continuous Learning from Human Feedback (improving agents over time).
Leveraging user feedback (e.g., accepting/rejecting agent output) is a powerful signal for continuously training and enhancing agent autonomy and quality.
The transition from command-line interfaces to graphical user interfaces in computing history parallels the current shift from text-based AI interactions to more sophisticated, generative UIs.

Resources

DeepLearning.AI: https://www.deeplearning.ai/
CopilotKit: https://www.copilotkit.ai/ (Implied official website)
AG-UI Protocol: https://www.ag-ui.com/ (Implied official website)
LangChain: https://www.langchain.com/
CrewAI: https://www.crewai.com/
Zod: https://zod.dev/
Pydantic: https://docs.pydantic.dev/
Excalidraw: https://excalidraw.com/
Cursor: https://cursor.sh/
