open-webui/Fu-Jie_openwebui-extensions

Fork 0

Files

fujie 219ba83df3 feat(infographic): release v1.5.0 with smart language detection & organize debug tools

2026-01-28 02:14:30 +08:00

27 KiB

Raw Blame History

GitHub Copilot SDK Integration Workflow

Author: Fu-Jie
Version: 0.2.3
Last Updated: 2026-01-27

Architecture Overview
Request Processing Flow
Session Management
Streaming Response Handling
Event Processing Mechanism
Tool Execution Flow
System Prompt Extraction
Configuration Parameters
Key Functions Reference

Architecture Overview

Component Diagram

┌─────────────────────────────────────────────────────────────┐
│                       OpenWebUI                              │
│  ┌───────────────────────────────────────────────────────┐  │
│  │              Pipe Interface (Entry Point)             │  │
│  └─────────────────────┬─────────────────────────────────┘  │
│                        │                                     │
│                        ▼                                     │
│  ┌───────────────────────────────────────────────────────┐  │
│  │           _pipe_impl (Main Logic)                     │  │
│  │  ┌──────────────────────────────────────────────────┐ │  │
│  │  │ 1. Environment Setup (_setup_env)               │ │  │
│  │  │ 2. Model Selection (request_model parsing)      │ │  │
│  │  │ 3. Chat Context Extraction                       │ │  │
│  │  │ 4. System Prompt Extraction                      │ │  │
│  │  │ 5. Session Management (create/resume)            │ │  │
│  │  │ 6. Streaming/Non-streaming Response              │ │  │
│  │  └──────────────────────────────────────────────────┘ │  │
│  └─────────────────────┬─────────────────────────────────┘  │
│                        │                                     │
│                        ▼                                     │
│  ┌───────────────────────────────────────────────────────┐  │
│  │           GitHub Copilot Client                       │  │
│  │  ┌──────────────────────────────────────────────────┐ │  │
│  │  │ • CopilotClient (SDK instance)                   │ │  │
│  │  │ • Session (conversation context)                 │ │  │
│  │  │ • Event Stream (async events)                    │ │  │
│  │  └──────────────────────────────────────────────────┘ │  │
│  └─────────────────────┬─────────────────────────────────┘  │
│                        │                                     │
└────────────────────────┼─────────────────────────────────────┘
                         ▼
              ┌──────────────────────┐
              │  Copilot CLI Process │
              │  (Backend Agent)     │
              └──────────────────────┘

Key Components

Pipe Interface: OpenWebUI's standard entry point
Environment Manager: CLI setup, token validation, environment variables
Session Manager: Persistent conversation state with automatic compaction
Event Processor: Asynchronous streaming event handler
Tool System: Custom tool registration and execution
Debug Logger: Frontend console logging for troubleshooting

Request Processing Flow

Complete Request Lifecycle

graph TD
    A[OpenWebUI Request] --> B[pipe Entry Point]
    B --> C[_pipe_impl]
    C --> D{Setup Environment}
    D --> E[Parse Model ID]
    E --> F[Extract Chat Context]
    F --> G[Extract System Prompt]
    G --> H{Session Exists?}
    H -->|Yes| I[Resume Session]
    H -->|No| J[Create New Session]
    I --> K[Initialize Tools]
    J --> K
    K --> L[Process Images]
    L --> M{Streaming Mode?}
    M -->|Yes| N[stream_response]
    M -->|No| O[send_and_wait]
    N --> P[Async Event Stream]
    O --> Q[Direct Response]
    P --> R[Return to OpenWebUI]
    Q --> R

Step-by-Step Breakdown

1. Environment Setup (`_setup_env`)

def _setup_env(self, __event_call__=None):
    """
    Priority:
    1. Check VALVES.CLI_PATH
    2. Search system PATH
    3. Auto-install via curl (if not found)
    4. Set GH_TOKEN environment variables
    """

Actions:

Locate Copilot CLI binary
Set COPILOT_CLI_PATH environment variable
Configure GH_TOKEN for authentication
Apply custom environment variables

2. Model Selection

# Input: body["model"] = "copilotsdk-claude-sonnet-4.5"
request_model = body.get("model", "")
if request_model.startswith(f"{self.id}-"):
    real_model_id = request_model[len(f"{self.id}-"):]  # "claude-sonnet-4.5"

3. Chat Context Extraction (`_get_chat_context`)

# Priority order for chat_id:
# 1. __metadata__ (most reliable)
# 2. body["chat_id"]
# 3. body["metadata"]["chat_id"]
chat_ctx = self._get_chat_context(body, __metadata__, __event_call__)
chat_id = chat_ctx.get("chat_id")

4. System Prompt Extraction (`_extract_system_prompt`)

Multi-source fallback strategy:

metadata.model.params.system
Model database lookup (by model_id)
body.params.system
Messages with role="system"

5. Session Creation/Resumption

New Session:

session_config = SessionConfig(
    session_id=chat_id,
    model=real_model_id,
    streaming=is_streaming,
    tools=custom_tools,
    system_message={"mode": "append", "content": system_prompt_content},
    infinite_sessions=InfiniteSessionConfig(
        enabled=True,
        background_compaction_threshold=0.8,
        buffer_exhaustion_threshold=0.95
    )
)
session = await client.create_session(config=session_config)

Resume Session:

try:
    session = await client.resume_session(chat_id)
    # Session state preserved: history, tools, workspace
except Exception:
    # Fallback to creating new session

Session Management

Infinite Sessions Architecture

┌─────────────────────────────────────────────────────────┐
│              Session Lifecycle                          │
│                                                         │
│  ┌──────────┐  create   ┌──────────┐  resume  ┌───────┴───┐
│  │ Chat ID  │─────────▶ │ Session  │ ◀────────│  OpenWebUI │
│  └──────────┘           │  State   │          └───────────┘
│                         └─────┬────┘                       │
│                               │                            │
│                               ▼                            │
│  ┌─────────────────────────────────────────────────────┐  │
│  │          Context Window Management                  │  │
│  │  ┌──────────────────────────────────────────────┐  │  │
│  │  │ Messages [user, assistant, tool_results...]  │  │  │
│  │  │ Token Usage: ████████████░░░░ (80%)          │  │  │
│  │  └──────────────────────────────────────────────┘  │  │
│  │                      │                              │  │
│  │                      ▼                              │  │
│  │  ┌──────────────────────────────────────────────┐  │  │
│  │  │  Threshold Reached (0.8)                     │  │  │
│  │  │  → Background Compaction Triggered           │  │  │
│  │  └──────────────────────────────────────────────┘  │  │
│  │                      │                              │  │
│  │                      ▼                              │  │
│  │  ┌──────────────────────────────────────────────┐  │  │
│  │  │  Compacted Summary + Recent Messages         │  │  │
│  │  │  Token Usage: ██████░░░░░░░░░░░ (40%)        │  │  │
│  │  └──────────────────────────────────────────────┘  │  │
│  └─────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘

Configuration Parameters

InfiniteSessionConfig(
    enabled=True,                              # Enable infinite sessions
    background_compaction_threshold=0.8,       # Start compaction at 80% token usage
    buffer_exhaustion_threshold=0.95           # Emergency threshold at 95%
)

Behavior:

< 80%: Normal operation, no compaction
80-95%: Background compaction (summarize older messages)
> 95%: Force compaction before next request

Streaming Response Handling

Event-Driven Architecture

async def stream_response(
    self, client, session, send_payload, init_message: str = "", __event_call__=None
) -> AsyncGenerator:
    """
    Asynchronous event processing with queue-based buffering.
    
    Flow:
    1. Start async send task
    2. Register event handler
    3. Process events via queue
    4. Yield chunks to OpenWebUI
    5. Clean up resources
    """

Event Processing Pipeline

┌────────────────────────────────────────────────────────────┐
│              Copilot SDK Event Stream                      │
└────────────────────┬───────────────────────────────────────┘
                     │
                     ▼
        ┌────────────────────────┐
        │  Event Handler         │
        │  (Sync Callback)       │
        └────────┬───────────────┘
                 │
                 ▼
        ┌────────────────────────┐
        │  Async Queue           │
        │  (Thread-safe)         │
        └────────┬───────────────┘
                 │
                 ▼
        ┌────────────────────────┐
        │  Consumer Loop         │
        │  (async for)           │
        └────────┬───────────────┘
                 │
                 ▼
        ┌────────────────────────┐
        │  yield to OpenWebUI    │
        └────────────────────────┘

State Management During Streaming

state = {
    "thinking_started": False,   # <think> tags opened
    "content_sent": False        # Main content has started
}
active_tools = {}  # Track concurrent tool executions

State Transitions:

reasoning_delta arrives → thinking_started = True → Output: <think>\n{reasoning}
message_delta arrives → Close </think> if open → content_sent = True → Output: {content}
tool.execution_start → Output tool indicator (inside/outside <think>)
session.complete → Finalize stream

Event Processing Mechanism

Event Type Reference

Following official SDK patterns (from copilot.SessionEventType):

Event Type	Description	Key Data Fields	Handler Action
`assistant.message_delta`	Main content streaming	`delta_content`	Yield text chunk
`assistant.reasoning_delta`	Chain-of-thought	`delta_content`	Wrap in `<think>` tags
`tool.execution_start`	Tool call initiated	`name`, `tool_call_id`	Display tool indicator
`tool.execution_complete`	Tool finished	`result.content`	Show completion status
`session.compaction_start`	Context compaction begins	-	Log debug info
`session.compaction_complete`	Compaction done	-	Log debug info
`session.error`	Error occurred	`error`, `message`	Emit error notification

Event Handler Implementation

def handler(event):
    """Process streaming events following official SDK patterns."""
    event_type = get_event_type(event)  # Handle enum/string types
    
    # Extract data using safe_get_data_attr (handles dict/object)
    if event_type == "assistant.message_delta":
        delta = safe_get_data_attr(event, "delta_content")
        if delta:
            queue.put_nowait(delta)  # Thread-safe enqueue

Official SDK Pattern Compliance

def safe_get_data_attr(event, attr: str, default=None):
    """
    Official pattern: event.data.delta_content
    Handles both dict and object access patterns.
    """
    if not hasattr(event, "data") or event.data is None:
        return default
    
    data = event.data
    
    # Dict access (JSON-like)
    if isinstance(data, dict):
        return data.get(attr, default)
    
    # Object attribute (Python SDK)
    return getattr(data, attr, default)

Tool Execution Flow

Tool Registration

# 1. Define tool at module level
@define_tool(description="Generate a random integer within a specified range.")
async def generate_random_number(params: RandomNumberParams) -> str:
    number = random.randint(params.min, params.max)
    return f"Generated random number: {number}"

# 2. Register in _initialize_custom_tools
def _initialize_custom_tools(self):
    if not self.valves.ENABLE_TOOLS:
        return []
    
    all_tools = {
        "generate_random_number": generate_random_number,
    }
    
    # Filter based on AVAILABLE_TOOLS valve
    if self.valves.AVAILABLE_TOOLS == "all":
        return list(all_tools.values())
    
    enabled = [t.strip() for t in self.valves.AVAILABLE_TOOLS.split(",")]
    return [all_tools[name] for name in enabled if name in all_tools]

Tool Execution Timeline

User Message: "Generate a random number between 1 and 100"
     │
     ▼
Model Decision: Use tool `generate_random_number`
     │
     ▼
Event: tool.execution_start
     │  → Display: "🔧 Running Tool: generate_random_number"
     ▼
Tool Function Execution (async)
     │
     ▼
Event: tool.execution_complete
     │  → Result: "Generated random number: 42"
     │  → Display: "✅ Tool Completed: 42"
     ▼
Model generates response using tool result
     │
     ▼
Event: assistant.message_delta
     │  → "I generated the number 42 for you."
     ▼
Stream Complete

Visual Indicators

Before Content:

<think>
Running Tool: generate_random_number...
Tool `generate_random_number` Completed. Result: 42
</think>

I generated the number 42 for you.

After Content Started:

The number is

> 🔧 **Running Tool**: `generate_random_number`

> ✅ **Tool Completed**: 42

actually 42.

System Prompt Extraction

Multi-Source Priority System

async def _extract_system_prompt(self, body, messages, request_model, real_model_id):
    """
    Priority order:
    1. metadata.model.params.system (highest)
    2. Model database lookup
    3. body.params.system
    4. messages[role="system"] (fallback)
    """

Source 1: Metadata Model Params

# OpenWebUI injects model configuration
metadata = body.get("metadata", {})
meta_model = metadata.get("model", {})
meta_params = meta_model.get("params", {})
system_prompt = meta_params.get("system")  # Priority 1

Source 2: Model Database

from open_webui.models.models import Models

# Try multiple model ID variations
model_ids_to_try = [
    request_model,                    # "copilotsdk-claude-sonnet-4.5"
    request_model.removeprefix(...),  # "claude-sonnet-4.5"
    real_model_id,                    # From valves
]

for mid in model_ids_to_try:
    model_record = Models.get_model_by_id(mid)
    if model_record and hasattr(model_record, "params"):
        system_prompt = model_record.params.get("system")
        if system_prompt:
            break

Source 3: Body Params

body_params = body.get("params", {})
system_prompt = body_params.get("system")

Source 4: System Message

for msg in messages:
    if msg.get("role") == "system":
        system_prompt = self._extract_text_from_content(msg.get("content"))
        break

Configuration in SessionConfig

system_message_config = {
    "mode": "append",           # Append to conversation context
    "content": system_prompt_content
}

session_config = SessionConfig(
    system_message=system_message_config,
    # ... other params
)

Configuration Parameters

Valve Definitions

Parameter	Type	Default	Description
`GH_TOKEN`	str	`""`	GitHub Fine-grained Token (requires 'Copilot Requests' permission)
`MODEL_ID`	str	`"claude-sonnet-4.5"`	Default model when dynamic fetching fails
`CLI_PATH`	str	`"/usr/local/bin/copilot"`	Path to Copilot CLI binary
`DEBUG`	bool	`False`	Enable frontend console debug logging
`LOG_LEVEL`	str	`"error"`	CLI log level: none, error, warning, info, debug, all
`SHOW_THINKING`	bool	`True`	Display model reasoning in `<think>` tags
`SHOW_WORKSPACE_INFO`	bool	`True`	Show session workspace path in debug mode
`EXCLUDE_KEYWORDS`	str	`""`	Comma-separated keywords to exclude models
`WORKSPACE_DIR`	str	`""`	Restricted workspace directory (empty = process cwd)
`INFINITE_SESSION`	bool	`True`	Enable automatic context compaction
`COMPACTION_THRESHOLD`	float	`0.8`	Background compaction at 80% token usage
`BUFFER_THRESHOLD`	float	`0.95`	Emergency threshold at 95%
`TIMEOUT`	int	`300`	Stream chunk timeout (seconds)
`CUSTOM_ENV_VARS`	str	`""`	JSON string of custom environment variables
`ENABLE_TOOLS`	bool	`False`	Enable custom tool system
`AVAILABLE_TOOLS`	str	`"all"`	Available tools: "all" or comma-separated list

Environment Variables

# Set by _setup_env
export COPILOT_CLI_PATH="/usr/local/bin/copilot"
export GH_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx"
export GITHUB_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx"

# Custom variables (from CUSTOM_ENV_VARS valve)
export CUSTOM_VAR_1="value1"
export CUSTOM_VAR_2="value2"

Key Functions Reference

Entry Points

`pipe(body, metadata, __event_emitter, event_call__)`

Purpose: OpenWebUI stable entry point
Returns: Delegates to _pipe_impl

`_pipe_impl(body, metadata, __event_emitter, event_call__)`

Purpose: Main request processing logic
Flow: Setup → Extract → Session → Response
Returns: str (non-streaming) or AsyncGenerator (streaming)

`pipes()`

Purpose: Dynamic model list fetching
Returns: List of available models with multiplier info
Caching: Uses _model_cache to avoid repeated API calls

Session Management

`_build_session_config(chat_id, real_model_id, custom_tools, system_prompt_content, is_streaming)`

Purpose: Construct SessionConfig object
Returns: SessionConfig with infinite sessions and tools

`_get_chat_context(body, metadata, __event_call__)`

Purpose: Extract chat_id with priority fallback
Returns: {"chat_id": str}

Streaming

`stream_response(client, session, send_payload, init_message, __event_call__)`

Purpose: Async streaming event processor
Yields: Text chunks to OpenWebUI
Resources: Auto-cleanup client and session

`handler(event)`

Purpose: Sync event callback (inside stream_response)
Action: Parse event → Enqueue chunks → Update state

Helpers

`_emit_debug_log(message, __event_call__)`

Purpose: Send debug logs to frontend console
Condition: Only when DEBUG=True

`_setup_env(__event_call__)`

Purpose: Locate CLI, set environment variables
Side Effects: Modifies os.environ

`_extract_system_prompt(body, messages, request_model, real_model_id, __event_call__)`

Purpose: Multi-source system prompt extraction
Returns: (system_prompt_content, source_name)

`_process_images(messages, __event_call__)`

Purpose: Extract text and images from multimodal messages
Returns: (text_content, attachments_list)

`_initialize_custom_tools()`

Purpose: Register and filter custom tools
Returns: List of tool functions

Utility Functions

`get_event_type(event) -> str`

Purpose: Extract event type string from enum/string
Handles: SessionEventType enum → .value extraction

`safe_get_data_attr(event, attr: str, default=None)`

Purpose: Safe attribute extraction from event.data
Handles: Both dict access and object attribute access

Troubleshooting Guide

Enable Debug Mode

# In OpenWebUI Valves UI:
DEBUG = True
SHOW_WORKSPACE_INFO = True
LOG_LEVEL = "debug"

Debug Output Location

Frontend Console:

// Open browser DevTools (F12)
// Look for logs with prefix: [Copilot Pipe]
console.debug("[Copilot Pipe] Extracted ChatID: abc123 (Source: __metadata__)")

Backend Logs:

# Python logging output
logger.debug(f"[Copilot Pipe] Session resumed: {chat_id}")

Common Issues

1. Session Not Resuming

Symptom: New session created every request
Causes:

chat_id not extracted correctly
Session expired on Copilot side
INFINITE_SESSION=False (sessions not persistent)

Solution:

# Check debug logs for:
"Extracted ChatID: <id> (Source: ...)"
"Session <id> not found (...), creating new."

2. System Prompt Not Applied

Symptom: Model ignores configured system prompt
Causes:

Not found in any of 4 sources
Session resumed (system prompt only set on creation)

Solution:

# Check debug logs for:
"Extracted system prompt from <source> (length: X)"
"Configured system message (mode: append)"

3. Tools Not Available

Symptom: Model can't use custom tools
Causes:

ENABLE_TOOLS=False
Tool not registered in _initialize_custom_tools
Wrong AVAILABLE_TOOLS filter

Solution:

# Check debug logs for:
"Enabled X custom tools: ['tool1', 'tool2']"

Performance Optimization

Model List Caching

# First request: Fetch from API
models = await client.list_models()
self._model_cache = [...]  # Cache result

# Subsequent requests: Use cache
if self._model_cache:
    return self._model_cache

Session Persistence

Impact: Eliminates redundant model initialization on every request

# Without session:
# Each request: Initialize model → Load context → Generate → Discard

# With session (chat_id):
# First request: Initialize model → Load context → Generate → Save
# Later: Resume → Generate (instant)

Streaming vs Non-streaming

Streaming:

Lower perceived latency (first token faster)
Better UX for long responses
Resource cleanup via generator exit

Non-streaming:

Simpler error handling
Atomic response (no partial output)
Use for short responses

Security Considerations

Token Protection

# ❌ Never log tokens
logger.debug(f"Token: {self.valves.GH_TOKEN}")  # DON'T DO THIS

# ✅ Mask sensitive data
logger.debug(f"Token configured: {'*' * 10}")

Workspace Isolation

# Set WORKSPACE_DIR to restrict file access
WORKSPACE_DIR = "/safe/sandbox/path"

# Copilot CLI respects this directory
client_config["cwd"] = WORKSPACE_DIR

Input Validation

# Validate chat_id format
if chat_id and not re.match(r'^[a-zA-Z0-9_-]+$', chat_id):
    logger.warning(f"Invalid chat_id format: {chat_id}")
    chat_id = None

Future Enhancements

Planned Features

Multi-Session Management: Support multiple parallel sessions per user
Session Analytics: Track token usage, compaction frequency
Tool Result Caching: Avoid redundant tool calls
Custom Event Filters: User-configurable event handling
Workspace Templates: Pre-configured workspace environments
Streaming Abort: Graceful cancellation of long-running requests

API Evolution

Monitoring Copilot SDK updates for:

New event types (e.g., assistant.function_call)
Enhanced tool capabilities
Improved session serialization

References

License: MIT
Maintainer: Fu-Jie (@Fu-Jie)

27 KiB Raw Blame History

GitHub Copilot SDK Integration Workflow

Table of Contents

Architecture Overview

Component Diagram

Key Components

Request Processing Flow

Complete Request Lifecycle

Step-by-Step Breakdown

1. Environment Setup (_setup_env)

2. Model Selection

3. Chat Context Extraction (_get_chat_context)

4. System Prompt Extraction (_extract_system_prompt)

5. Session Creation/Resumption

Session Management

Infinite Sessions Architecture

Configuration Parameters

Streaming Response Handling

Event-Driven Architecture

Event Processing Pipeline

State Management During Streaming

Event Processing Mechanism

Event Type Reference

Event Handler Implementation

Official SDK Pattern Compliance

Tool Execution Flow

Tool Registration

Tool Execution Timeline

Visual Indicators

System Prompt Extraction

Multi-Source Priority System

Source 1: Metadata Model Params

Source 2: Model Database

Source 3: Body Params

Source 4: System Message

Configuration in SessionConfig

Configuration Parameters

Valve Definitions

Environment Variables

Key Functions Reference

Entry Points

pipe(body, __metadata__, __event_emitter__, __event_call__)

_pipe_impl(body, __metadata__, __event_emitter__, __event_call__)

pipes()

Session Management

_build_session_config(chat_id, real_model_id, custom_tools, system_prompt_content, is_streaming)

_get_chat_context(body, __metadata__, __event_call__)

Streaming

stream_response(client, session, send_payload, init_message, __event_call__)

handler(event)

Helpers

_emit_debug_log(message, __event_call__)

_setup_env(__event_call__)

_extract_system_prompt(body, messages, request_model, real_model_id, __event_call__)

_process_images(messages, __event_call__)

_initialize_custom_tools()

Utility Functions

get_event_type(event) -> str

safe_get_data_attr(event, attr: str, default=None)

Troubleshooting Guide

Enable Debug Mode

Debug Output Location

Common Issues

1. Session Not Resuming

2. System Prompt Not Applied

3. Tools Not Available

Performance Optimization

Model List Caching

Session Persistence

Streaming vs Non-streaming

Security Considerations

Token Protection

Workspace Isolation

Input Validation

Future Enhancements

Planned Features

API Evolution

References

27 KiB

Raw Blame History

1. Environment Setup (`_setup_env`)

3. Chat Context Extraction (`_get_chat_context`)

4. System Prompt Extraction (`_extract_system_prompt`)

`pipe(body, metadata, __event_emitter, event_call__)`

`_pipe_impl(body, metadata, __event_emitter, event_call__)`

`pipes()`

`_build_session_config(chat_id, real_model_id, custom_tools, system_prompt_content, is_streaming)`

`_get_chat_context(body, metadata, __event_call__)`

`stream_response(client, session, send_payload, init_message, __event_call__)`

`handler(event)`

`_emit_debug_log(message, __event_call__)`

`_setup_env(__event_call__)`

`_extract_system_prompt(body, messages, request_model, real_model_id, __event_call__)`

`_process_images(messages, __event_call__)`

`_initialize_custom_tools()`

`get_event_type(event) -> str`

`safe_get_data_attr(event, attr: str, default=None)`