Files
Fu-Jie_openwebui-extensions/plugins/debug/legacy_docs/guides/WORKFLOW.md

836 lines
27 KiB
Markdown

# GitHub Copilot SDK Integration Workflow
**Author:** Fu-Jie
**Version:** 0.2.3
**Last Updated:** 2026-01-27
---
## Table of Contents
1. [Architecture Overview](#architecture-overview)
2. [Request Processing Flow](#request-processing-flow)
3. [Session Management](#session-management)
4. [Streaming Response Handling](#streaming-response-handling)
5. [Event Processing Mechanism](#event-processing-mechanism)
6. [Tool Execution Flow](#tool-execution-flow)
7. [System Prompt Extraction](#system-prompt-extraction)
8. [Configuration Parameters](#configuration-parameters)
9. [Key Functions Reference](#key-functions-reference)
---
## Architecture Overview
### Component Diagram
```
┌─────────────────────────────────────────────────────────────┐
│ OpenWebUI │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Pipe Interface (Entry Point) │ │
│ └─────────────────────┬─────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ _pipe_impl (Main Logic) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ 1. Environment Setup (_setup_env) │ │ │
│ │ │ 2. Model Selection (request_model parsing) │ │ │
│ │ │ 3. Chat Context Extraction │ │ │
│ │ │ 4. System Prompt Extraction │ │ │
│ │ │ 5. Session Management (create/resume) │ │ │
│ │ │ 6. Streaming/Non-streaming Response │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └─────────────────────┬─────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ GitHub Copilot Client │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ • CopilotClient (SDK instance) │ │ │
│ │ │ • Session (conversation context) │ │ │
│ │ │ • Event Stream (async events) │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └─────────────────────┬─────────────────────────────────┘ │
│ │ │
└────────────────────────┼─────────────────────────────────────┘
┌──────────────────────┐
│ Copilot CLI Process │
│ (Backend Agent) │
└──────────────────────┘
```
### Key Components
1. **Pipe Interface**: OpenWebUI's standard entry point
2. **Environment Manager**: CLI setup, token validation, environment variables
3. **Session Manager**: Persistent conversation state with automatic compaction
4. **Event Processor**: Asynchronous streaming event handler
5. **Tool System**: Custom tool registration and execution
6. **Debug Logger**: Frontend console logging for troubleshooting
---
## Request Processing Flow
### Complete Request Lifecycle
```mermaid
graph TD
A[OpenWebUI Request] --> B[pipe Entry Point]
B --> C[_pipe_impl]
C --> D{Setup Environment}
D --> E[Parse Model ID]
E --> F[Extract Chat Context]
F --> G[Extract System Prompt]
G --> H{Session Exists?}
H -->|Yes| I[Resume Session]
H -->|No| J[Create New Session]
I --> K[Initialize Tools]
J --> K
K --> L[Process Images]
L --> M{Streaming Mode?}
M -->|Yes| N[stream_response]
M -->|No| O[send_and_wait]
N --> P[Async Event Stream]
O --> Q[Direct Response]
P --> R[Return to OpenWebUI]
Q --> R
```
### Step-by-Step Breakdown
#### 1. Environment Setup (`_setup_env`)
```python
def _setup_env(self, __event_call__=None):
"""
Priority:
1. Check VALVES.CLI_PATH
2. Search system PATH
3. Auto-install via curl (if not found)
4. Set GH_TOKEN environment variables
"""
```
**Actions:**
- Locate Copilot CLI binary
- Set `COPILOT_CLI_PATH` environment variable
- Configure `GH_TOKEN` for authentication
- Apply custom environment variables
#### 2. Model Selection
```python
# Input: body["model"] = "copilotsdk-claude-sonnet-4.5"
request_model = body.get("model", "")
if request_model.startswith(f"{self.id}-"):
real_model_id = request_model[len(f"{self.id}-"):] # "claude-sonnet-4.5"
```
#### 3. Chat Context Extraction (`_get_chat_context`)
```python
# Priority order for chat_id:
# 1. __metadata__ (most reliable)
# 2. body["chat_id"]
# 3. body["metadata"]["chat_id"]
chat_ctx = self._get_chat_context(body, __metadata__, __event_call__)
chat_id = chat_ctx.get("chat_id")
```
#### 4. System Prompt Extraction (`_extract_system_prompt`)
Multi-source fallback strategy:
1. `metadata.model.params.system`
2. Model database lookup (by model_id)
3. `body.params.system`
4. Messages with `role="system"`
#### 5. Session Creation/Resumption
**New Session:**
```python
session_config = SessionConfig(
session_id=chat_id,
model=real_model_id,
streaming=is_streaming,
tools=custom_tools,
system_message={"mode": "append", "content": system_prompt_content},
infinite_sessions=InfiniteSessionConfig(
enabled=True,
background_compaction_threshold=0.8,
buffer_exhaustion_threshold=0.95
)
)
session = await client.create_session(config=session_config)
```
**Resume Session:**
```python
try:
session = await client.resume_session(chat_id)
# Session state preserved: history, tools, workspace
except Exception:
# Fallback to creating new session
```
---
## Session Management
### Infinite Sessions Architecture
```
┌─────────────────────────────────────────────────────────┐
│ Session Lifecycle │
│ │
│ ┌──────────┐ create ┌──────────┐ resume ┌───────┴───┐
│ │ Chat ID │─────────▶ │ Session │ ◀────────│ OpenWebUI │
│ └──────────┘ │ State │ └───────────┘
│ └─────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Context Window Management │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ Messages [user, assistant, tool_results...] │ │ │
│ │ │ Token Usage: ████████████░░░░ (80%) │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ Threshold Reached (0.8) │ │ │
│ │ │ → Background Compaction Triggered │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ Compacted Summary + Recent Messages │ │ │
│ │ │ Token Usage: ██████░░░░░░░░░░░ (40%) │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
### Configuration Parameters
```python
InfiniteSessionConfig(
enabled=True, # Enable infinite sessions
background_compaction_threshold=0.8, # Start compaction at 80% token usage
buffer_exhaustion_threshold=0.95 # Emergency threshold at 95%
)
```
**Behavior:**
- **< 80%**: Normal operation, no compaction
- **80-95%**: Background compaction (summarize older messages)
- **> 95%**: Force compaction before next request
---
## Streaming Response Handling
### Event-Driven Architecture
```python
async def stream_response(
self, client, session, send_payload, init_message: str = "", __event_call__=None
) -> AsyncGenerator:
"""
Asynchronous event processing with queue-based buffering.
Flow:
1. Start async send task
2. Register event handler
3. Process events via queue
4. Yield chunks to OpenWebUI
5. Clean up resources
"""
```
### Event Processing Pipeline
```
┌────────────────────────────────────────────────────────────┐
│ Copilot SDK Event Stream │
└────────────────────┬───────────────────────────────────────┘
┌────────────────────────┐
│ Event Handler │
│ (Sync Callback) │
└────────┬───────────────┘
┌────────────────────────┐
│ Async Queue │
│ (Thread-safe) │
└────────┬───────────────┘
┌────────────────────────┐
│ Consumer Loop │
│ (async for) │
└────────┬───────────────┘
┌────────────────────────┐
│ yield to OpenWebUI │
└────────────────────────┘
```
### State Management During Streaming
```python
state = {
"thinking_started": False, # <think> tags opened
"content_sent": False # Main content has started
}
active_tools = {} # Track concurrent tool executions
```
**State Transitions:**
1. `reasoning_delta` arrives → `thinking_started = True` → Output: `<think>\n{reasoning}`
2. `message_delta` arrives → Close `</think>` if open → `content_sent = True` → Output: `{content}`
3. `tool.execution_start` → Output tool indicator (inside/outside `<think>`)
4. `session.complete` → Finalize stream
---
## Event Processing Mechanism
### Event Type Reference
Following official SDK patterns (from `copilot.SessionEventType`):
| Event Type | Description | Key Data Fields | Handler Action |
|-----------|-------------|-----------------|----------------|
| `assistant.message_delta` | Main content streaming | `delta_content` | Yield text chunk |
| `assistant.reasoning_delta` | Chain-of-thought | `delta_content` | Wrap in `<think>` tags |
| `tool.execution_start` | Tool call initiated | `name`, `tool_call_id` | Display tool indicator |
| `tool.execution_complete` | Tool finished | `result.content` | Show completion status |
| `session.compaction_start` | Context compaction begins | - | Log debug info |
| `session.compaction_complete` | Compaction done | - | Log debug info |
| `session.error` | Error occurred | `error`, `message` | Emit error notification |
### Event Handler Implementation
```python
def handler(event):
"""Process streaming events following official SDK patterns."""
event_type = get_event_type(event) # Handle enum/string types
# Extract data using safe_get_data_attr (handles dict/object)
if event_type == "assistant.message_delta":
delta = safe_get_data_attr(event, "delta_content")
if delta:
queue.put_nowait(delta) # Thread-safe enqueue
```
### Official SDK Pattern Compliance
```python
def safe_get_data_attr(event, attr: str, default=None):
"""
Official pattern: event.data.delta_content
Handles both dict and object access patterns.
"""
if not hasattr(event, "data") or event.data is None:
return default
data = event.data
# Dict access (JSON-like)
if isinstance(data, dict):
return data.get(attr, default)
# Object attribute (Python SDK)
return getattr(data, attr, default)
```
---
## Tool Execution Flow
### Tool Registration
```python
# 1. Define tool at module level
@define_tool(description="Generate a random integer within a specified range.")
async def generate_random_number(params: RandomNumberParams) -> str:
number = random.randint(params.min, params.max)
return f"Generated random number: {number}"
# 2. Register in _initialize_custom_tools
def _initialize_custom_tools(self):
if not self.valves.ENABLE_TOOLS:
return []
all_tools = {
"generate_random_number": generate_random_number,
}
# Filter based on AVAILABLE_TOOLS valve
if self.valves.AVAILABLE_TOOLS == "all":
return list(all_tools.values())
enabled = [t.strip() for t in self.valves.AVAILABLE_TOOLS.split(",")]
return [all_tools[name] for name in enabled if name in all_tools]
```
### Tool Execution Timeline
```
User Message: "Generate a random number between 1 and 100"
Model Decision: Use tool `generate_random_number`
Event: tool.execution_start
│ → Display: "🔧 Running Tool: generate_random_number"
Tool Function Execution (async)
Event: tool.execution_complete
│ → Result: "Generated random number: 42"
│ → Display: "✅ Tool Completed: 42"
Model generates response using tool result
Event: assistant.message_delta
│ → "I generated the number 42 for you."
Stream Complete
```
### Visual Indicators
**Before Content:**
```markdown
<think>
Running Tool: generate_random_number...
Tool `generate_random_number` Completed. Result: 42
</think>
I generated the number 42 for you.
```
**After Content Started:**
```markdown
The number is
> 🔧 **Running Tool**: `generate_random_number`
> ✅ **Tool Completed**: 42
actually 42.
```
---
## System Prompt Extraction
### Multi-Source Priority System
```python
async def _extract_system_prompt(self, body, messages, request_model, real_model_id):
"""
Priority order:
1. metadata.model.params.system (highest)
2. Model database lookup
3. body.params.system
4. messages[role="system"] (fallback)
"""
```
### Source 1: Metadata Model Params
```python
# OpenWebUI injects model configuration
metadata = body.get("metadata", {})
meta_model = metadata.get("model", {})
meta_params = meta_model.get("params", {})
system_prompt = meta_params.get("system") # Priority 1
```
### Source 2: Model Database
```python
from open_webui.models.models import Models
# Try multiple model ID variations
model_ids_to_try = [
request_model, # "copilotsdk-claude-sonnet-4.5"
request_model.removeprefix(...), # "claude-sonnet-4.5"
real_model_id, # From valves
]
for mid in model_ids_to_try:
model_record = Models.get_model_by_id(mid)
if model_record and hasattr(model_record, "params"):
system_prompt = model_record.params.get("system")
if system_prompt:
break
```
### Source 3: Body Params
```python
body_params = body.get("params", {})
system_prompt = body_params.get("system")
```
### Source 4: System Message
```python
for msg in messages:
if msg.get("role") == "system":
system_prompt = self._extract_text_from_content(msg.get("content"))
break
```
### Configuration in SessionConfig
```python
system_message_config = {
"mode": "append", # Append to conversation context
"content": system_prompt_content
}
session_config = SessionConfig(
system_message=system_message_config,
# ... other params
)
```
---
## Configuration Parameters
### Valve Definitions
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `GH_TOKEN` | str | `""` | GitHub Fine-grained Token (requires 'Copilot Requests' permission) |
| `MODEL_ID` | str | `"claude-sonnet-4.5"` | Default model when dynamic fetching fails |
| `CLI_PATH` | str | `"/usr/local/bin/copilot"` | Path to Copilot CLI binary |
| `DEBUG` | bool | `False` | Enable frontend console debug logging |
| `LOG_LEVEL` | str | `"error"` | CLI log level: none, error, warning, info, debug, all |
| `SHOW_THINKING` | bool | `True` | Display model reasoning in `<think>` tags |
| `SHOW_WORKSPACE_INFO` | bool | `True` | Show session workspace path in debug mode |
| `EXCLUDE_KEYWORDS` | str | `""` | Comma-separated keywords to exclude models |
| `WORKSPACE_DIR` | str | `""` | Restricted workspace directory (empty = process cwd) |
| `INFINITE_SESSION` | bool | `True` | Enable automatic context compaction |
| `COMPACTION_THRESHOLD` | float | `0.8` | Background compaction at 80% token usage |
| `BUFFER_THRESHOLD` | float | `0.95` | Emergency threshold at 95% |
| `TIMEOUT` | int | `300` | Stream chunk timeout (seconds) |
| `CUSTOM_ENV_VARS` | str | `""` | JSON string of custom environment variables |
| `ENABLE_TOOLS` | bool | `False` | Enable custom tool system |
| `AVAILABLE_TOOLS` | str | `"all"` | Available tools: "all" or comma-separated list |
### Environment Variables
```bash
# Set by _setup_env
export COPILOT_CLI_PATH="/usr/local/bin/copilot"
export GH_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx"
export GITHUB_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx"
# Custom variables (from CUSTOM_ENV_VARS valve)
export CUSTOM_VAR_1="value1"
export CUSTOM_VAR_2="value2"
```
---
## Key Functions Reference
### Entry Points
#### `pipe(body, __metadata__, __event_emitter__, __event_call__)`
- **Purpose**: OpenWebUI stable entry point
- **Returns**: Delegates to `_pipe_impl`
#### `_pipe_impl(body, __metadata__, __event_emitter__, __event_call__)`
- **Purpose**: Main request processing logic
- **Flow**: Setup → Extract → Session → Response
- **Returns**: `str` (non-streaming) or `AsyncGenerator` (streaming)
#### `pipes()`
- **Purpose**: Dynamic model list fetching
- **Returns**: List of available models with multiplier info
- **Caching**: Uses `_model_cache` to avoid repeated API calls
### Session Management
#### `_build_session_config(chat_id, real_model_id, custom_tools, system_prompt_content, is_streaming)`
- **Purpose**: Construct SessionConfig object
- **Returns**: `SessionConfig` with infinite sessions and tools
#### `_get_chat_context(body, __metadata__, __event_call__)`
- **Purpose**: Extract chat_id with priority fallback
- **Returns**: `{"chat_id": str}`
### Streaming
#### `stream_response(client, session, send_payload, init_message, __event_call__)`
- **Purpose**: Async streaming event processor
- **Yields**: Text chunks to OpenWebUI
- **Resources**: Auto-cleanup client and session
#### `handler(event)`
- **Purpose**: Sync event callback (inside `stream_response`)
- **Action**: Parse event → Enqueue chunks → Update state
### Helpers
#### `_emit_debug_log(message, __event_call__)`
- **Purpose**: Send debug logs to frontend console
- **Condition**: Only when `DEBUG=True`
#### `_setup_env(__event_call__)`
- **Purpose**: Locate CLI, set environment variables
- **Side Effects**: Modifies `os.environ`
#### `_extract_system_prompt(body, messages, request_model, real_model_id, __event_call__)`
- **Purpose**: Multi-source system prompt extraction
- **Returns**: `(system_prompt_content, source_name)`
#### `_process_images(messages, __event_call__)`
- **Purpose**: Extract text and images from multimodal messages
- **Returns**: `(text_content, attachments_list)`
#### `_initialize_custom_tools()`
- **Purpose**: Register and filter custom tools
- **Returns**: List of tool functions
### Utility Functions
#### `get_event_type(event) -> str`
- **Purpose**: Extract event type string from enum/string
- **Handles**: `SessionEventType` enum → `.value` extraction
#### `safe_get_data_attr(event, attr: str, default=None)`
- **Purpose**: Safe attribute extraction from event.data
- **Handles**: Both dict access and object attribute access
---
## Troubleshooting Guide
### Enable Debug Mode
```python
# In OpenWebUI Valves UI:
DEBUG = True
SHOW_WORKSPACE_INFO = True
LOG_LEVEL = "debug"
```
### Debug Output Location
**Frontend Console:**
```javascript
// Open browser DevTools (F12)
// Look for logs with prefix: [Copilot Pipe]
console.debug("[Copilot Pipe] Extracted ChatID: abc123 (Source: __metadata__)")
```
**Backend Logs:**
```python
# Python logging output
logger.debug(f"[Copilot Pipe] Session resumed: {chat_id}")
```
### Common Issues
#### 1. Session Not Resuming
**Symptom:** New session created every request
**Causes:**
- `chat_id` not extracted correctly
- Session expired on Copilot side
- `INFINITE_SESSION=False` (sessions not persistent)
**Solution:**
```python
# Check debug logs for:
"Extracted ChatID: <id> (Source: ...)"
"Session <id> not found (...), creating new."
```
#### 2. System Prompt Not Applied
**Symptom:** Model ignores configured system prompt
**Causes:**
- Not found in any of 4 sources
- Session resumed (system prompt only set on creation)
**Solution:**
```python
# Check debug logs for:
"Extracted system prompt from <source> (length: X)"
"Configured system message (mode: append)"
```
#### 3. Tools Not Available
**Symptom:** Model can't use custom tools
**Causes:**
- `ENABLE_TOOLS=False`
- Tool not registered in `_initialize_custom_tools`
- Wrong `AVAILABLE_TOOLS` filter
**Solution:**
```python
# Check debug logs for:
"Enabled X custom tools: ['tool1', 'tool2']"
```
---
## Performance Optimization
### Model List Caching
```python
# First request: Fetch from API
models = await client.list_models()
self._model_cache = [...] # Cache result
# Subsequent requests: Use cache
if self._model_cache:
return self._model_cache
```
### Session Persistence
**Impact:** Eliminates redundant model initialization on every request
```python
# Without session:
# Each request: Initialize model → Load context → Generate → Discard
# With session (chat_id):
# First request: Initialize model → Load context → Generate → Save
# Later: Resume → Generate (instant)
```
### Streaming vs Non-streaming
**Streaming:**
- Lower perceived latency (first token faster)
- Better UX for long responses
- Resource cleanup via generator exit
**Non-streaming:**
- Simpler error handling
- Atomic response (no partial output)
- Use for short responses
---
## Security Considerations
### Token Protection
```python
# ❌ Never log tokens
logger.debug(f"Token: {self.valves.GH_TOKEN}") # DON'T DO THIS
# ✅ Mask sensitive data
logger.debug(f"Token configured: {'*' * 10}")
```
### Workspace Isolation
```python
# Set WORKSPACE_DIR to restrict file access
WORKSPACE_DIR = "/safe/sandbox/path"
# Copilot CLI respects this directory
client_config["cwd"] = WORKSPACE_DIR
```
### Input Validation
```python
# Validate chat_id format
if chat_id and not re.match(r'^[a-zA-Z0-9_-]+$', chat_id):
logger.warning(f"Invalid chat_id format: {chat_id}")
chat_id = None
```
---
## Future Enhancements
### Planned Features
1. **Multi-Session Management**: Support multiple parallel sessions per user
2. **Session Analytics**: Track token usage, compaction frequency
3. **Tool Result Caching**: Avoid redundant tool calls
4. **Custom Event Filters**: User-configurable event handling
5. **Workspace Templates**: Pre-configured workspace environments
6. **Streaming Abort**: Graceful cancellation of long-running requests
### API Evolution
Monitoring Copilot SDK updates for:
- New event types (e.g., `assistant.function_call`)
- Enhanced tool capabilities
- Improved session serialization
---
## References
- [GitHub Copilot SDK Documentation](https://github.com/github/copilot-sdk)
- [OpenWebUI Pipe Development](https://docs.openwebui.com/)
- [Awesome OpenWebUI Project](https://github.com/Fu-Jie/awesome-openwebui)
---
**License:** MIT
**Maintainer:** Fu-Jie ([@Fu-Jie](https://github.com/Fu-Jie))