# GitHub Copilot SDK Integration Workflow **Author:** Fu-Jie **Version:** 0.2.3 **Last Updated:** 2026-01-27 --- ## Table of Contents 1. [Architecture Overview](#architecture-overview) 2. [Request Processing Flow](#request-processing-flow) 3. [Session Management](#session-management) 4. [Streaming Response Handling](#streaming-response-handling) 5. [Event Processing Mechanism](#event-processing-mechanism) 6. [Tool Execution Flow](#tool-execution-flow) 7. [System Prompt Extraction](#system-prompt-extraction) 8. [Configuration Parameters](#configuration-parameters) 9. [Key Functions Reference](#key-functions-reference) --- ## Architecture Overview ### Component Diagram ``` ┌─────────────────────────────────────────────────────────────┐ │ OpenWebUI │ │ ┌───────────────────────────────────────────────────────┐ │ │ │ Pipe Interface (Entry Point) │ │ │ └─────────────────────┬─────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌───────────────────────────────────────────────────────┐ │ │ │ _pipe_impl (Main Logic) │ │ │ │ ┌──────────────────────────────────────────────────┐ │ │ │ │ │ 1. Environment Setup (_setup_env) │ │ │ │ │ │ 2. Model Selection (request_model parsing) │ │ │ │ │ │ 3. Chat Context Extraction │ │ │ │ │ │ 4. System Prompt Extraction │ │ │ │ │ │ 5. Session Management (create/resume) │ │ │ │ │ │ 6. Streaming/Non-streaming Response │ │ │ │ │ └──────────────────────────────────────────────────┘ │ │ │ └─────────────────────┬─────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌───────────────────────────────────────────────────────┐ │ │ │ GitHub Copilot Client │ │ │ │ ┌──────────────────────────────────────────────────┐ │ │ │ │ │ • CopilotClient (SDK instance) │ │ │ │ │ │ • Session (conversation context) │ │ │ │ │ │ • Event Stream (async events) │ │ │ │ │ └──────────────────────────────────────────────────┘ │ │ │ └─────────────────────┬─────────────────────────────────┘ │ │ │ │ └────────────────────────┼─────────────────────────────────────┘ ▼ ┌──────────────────────┐ │ Copilot CLI Process │ │ (Backend Agent) │ └──────────────────────┘ ``` ### Key Components 1. **Pipe Interface**: OpenWebUI's standard entry point 2. **Environment Manager**: CLI setup, token validation, environment variables 3. **Session Manager**: Persistent conversation state with automatic compaction 4. **Event Processor**: Asynchronous streaming event handler 5. **Tool System**: Custom tool registration and execution 6. **Debug Logger**: Frontend console logging for troubleshooting --- ## Request Processing Flow ### Complete Request Lifecycle ```mermaid graph TD A[OpenWebUI Request] --> B[pipe Entry Point] B --> C[_pipe_impl] C --> D{Setup Environment} D --> E[Parse Model ID] E --> F[Extract Chat Context] F --> G[Extract System Prompt] G --> H{Session Exists?} H -->|Yes| I[Resume Session] H -->|No| J[Create New Session] I --> K[Initialize Tools] J --> K K --> L[Process Images] L --> M{Streaming Mode?} M -->|Yes| N[stream_response] M -->|No| O[send_and_wait] N --> P[Async Event Stream] O --> Q[Direct Response] P --> R[Return to OpenWebUI] Q --> R ``` ### Step-by-Step Breakdown #### 1. Environment Setup (`_setup_env`) ```python def _setup_env(self, __event_call__=None): """ Priority: 1. Check VALVES.CLI_PATH 2. Search system PATH 3. Auto-install via curl (if not found) 4. Set GH_TOKEN environment variables """ ``` **Actions:** - Locate Copilot CLI binary - Set `COPILOT_CLI_PATH` environment variable - Configure `GH_TOKEN` for authentication - Apply custom environment variables #### 2. Model Selection ```python # Input: body["model"] = "copilotsdk-claude-sonnet-4.5" request_model = body.get("model", "") if request_model.startswith(f"{self.id}-"): real_model_id = request_model[len(f"{self.id}-"):] # "claude-sonnet-4.5" ``` #### 3. Chat Context Extraction (`_get_chat_context`) ```python # Priority order for chat_id: # 1. __metadata__ (most reliable) # 2. body["chat_id"] # 3. body["metadata"]["chat_id"] chat_ctx = self._get_chat_context(body, __metadata__, __event_call__) chat_id = chat_ctx.get("chat_id") ``` #### 4. System Prompt Extraction (`_extract_system_prompt`) Multi-source fallback strategy: 1. `metadata.model.params.system` 2. Model database lookup (by model_id) 3. `body.params.system` 4. Messages with `role="system"` #### 5. Session Creation/Resumption **New Session:** ```python session_config = SessionConfig( session_id=chat_id, model=real_model_id, streaming=is_streaming, tools=custom_tools, system_message={"mode": "append", "content": system_prompt_content}, infinite_sessions=InfiniteSessionConfig( enabled=True, background_compaction_threshold=0.8, buffer_exhaustion_threshold=0.95 ) ) session = await client.create_session(config=session_config) ``` **Resume Session:** ```python try: session = await client.resume_session(chat_id) # Session state preserved: history, tools, workspace except Exception: # Fallback to creating new session ``` --- ## Session Management ### Infinite Sessions Architecture ``` ┌─────────────────────────────────────────────────────────┐ │ Session Lifecycle │ │ │ │ ┌──────────┐ create ┌──────────┐ resume ┌───────┴───┐ │ │ Chat ID │─────────▶ │ Session │ ◀────────│ OpenWebUI │ │ └──────────┘ │ State │ └───────────┘ │ └─────┬────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────┐ │ │ │ Context Window Management │ │ │ │ ┌──────────────────────────────────────────────┐ │ │ │ │ │ Messages [user, assistant, tool_results...] │ │ │ │ │ │ Token Usage: ████████████░░░░ (80%) │ │ │ │ │ └──────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌──────────────────────────────────────────────┐ │ │ │ │ │ Threshold Reached (0.8) │ │ │ │ │ │ → Background Compaction Triggered │ │ │ │ │ └──────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌──────────────────────────────────────────────┐ │ │ │ │ │ Compacted Summary + Recent Messages │ │ │ │ │ │ Token Usage: ██████░░░░░░░░░░░ (40%) │ │ │ │ │ └──────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────┘ │ └─────────────────────────────────────────────────────────┘ ``` ### Configuration Parameters ```python InfiniteSessionConfig( enabled=True, # Enable infinite sessions background_compaction_threshold=0.8, # Start compaction at 80% token usage buffer_exhaustion_threshold=0.95 # Emergency threshold at 95% ) ``` **Behavior:** - **< 80%**: Normal operation, no compaction - **80-95%**: Background compaction (summarize older messages) - **> 95%**: Force compaction before next request --- ## Streaming Response Handling ### Event-Driven Architecture ```python async def stream_response( self, client, session, send_payload, init_message: str = "", __event_call__=None ) -> AsyncGenerator: """ Asynchronous event processing with queue-based buffering. Flow: 1. Start async send task 2. Register event handler 3. Process events via queue 4. Yield chunks to OpenWebUI 5. Clean up resources """ ``` ### Event Processing Pipeline ``` ┌────────────────────────────────────────────────────────────┐ │ Copilot SDK Event Stream │ └────────────────────┬───────────────────────────────────────┘ │ ▼ ┌────────────────────────┐ │ Event Handler │ │ (Sync Callback) │ └────────┬───────────────┘ │ ▼ ┌────────────────────────┐ │ Async Queue │ │ (Thread-safe) │ └────────┬───────────────┘ │ ▼ ┌────────────────────────┐ │ Consumer Loop │ │ (async for) │ └────────┬───────────────┘ │ ▼ ┌────────────────────────┐ │ yield to OpenWebUI │ └────────────────────────┘ ``` ### State Management During Streaming ```python state = { "thinking_started": False, # tags opened "content_sent": False # Main content has started } active_tools = {} # Track concurrent tool executions ``` **State Transitions:** 1. `reasoning_delta` arrives → `thinking_started = True` → Output: `\n{reasoning}` 2. `message_delta` arrives → Close `` if open → `content_sent = True` → Output: `{content}` 3. `tool.execution_start` → Output tool indicator (inside/outside ``) 4. `session.complete` → Finalize stream --- ## Event Processing Mechanism ### Event Type Reference Following official SDK patterns (from `copilot.SessionEventType`): | Event Type | Description | Key Data Fields | Handler Action | |-----------|-------------|-----------------|----------------| | `assistant.message_delta` | Main content streaming | `delta_content` | Yield text chunk | | `assistant.reasoning_delta` | Chain-of-thought | `delta_content` | Wrap in `` tags | | `tool.execution_start` | Tool call initiated | `name`, `tool_call_id` | Display tool indicator | | `tool.execution_complete` | Tool finished | `result.content` | Show completion status | | `session.compaction_start` | Context compaction begins | - | Log debug info | | `session.compaction_complete` | Compaction done | - | Log debug info | | `session.error` | Error occurred | `error`, `message` | Emit error notification | ### Event Handler Implementation ```python def handler(event): """Process streaming events following official SDK patterns.""" event_type = get_event_type(event) # Handle enum/string types # Extract data using safe_get_data_attr (handles dict/object) if event_type == "assistant.message_delta": delta = safe_get_data_attr(event, "delta_content") if delta: queue.put_nowait(delta) # Thread-safe enqueue ``` ### Official SDK Pattern Compliance ```python def safe_get_data_attr(event, attr: str, default=None): """ Official pattern: event.data.delta_content Handles both dict and object access patterns. """ if not hasattr(event, "data") or event.data is None: return default data = event.data # Dict access (JSON-like) if isinstance(data, dict): return data.get(attr, default) # Object attribute (Python SDK) return getattr(data, attr, default) ``` --- ## Tool Execution Flow ### Tool Registration ```python # 1. Define tool at module level @define_tool(description="Generate a random integer within a specified range.") async def generate_random_number(params: RandomNumberParams) -> str: number = random.randint(params.min, params.max) return f"Generated random number: {number}" # 2. Register in _initialize_custom_tools def _initialize_custom_tools(self): if not self.valves.ENABLE_TOOLS: return [] all_tools = { "generate_random_number": generate_random_number, } # Filter based on AVAILABLE_TOOLS valve if self.valves.AVAILABLE_TOOLS == "all": return list(all_tools.values()) enabled = [t.strip() for t in self.valves.AVAILABLE_TOOLS.split(",")] return [all_tools[name] for name in enabled if name in all_tools] ``` ### Tool Execution Timeline ``` User Message: "Generate a random number between 1 and 100" │ ▼ Model Decision: Use tool `generate_random_number` │ ▼ Event: tool.execution_start │ → Display: "🔧 Running Tool: generate_random_number" ▼ Tool Function Execution (async) │ ▼ Event: tool.execution_complete │ → Result: "Generated random number: 42" │ → Display: "✅ Tool Completed: 42" ▼ Model generates response using tool result │ ▼ Event: assistant.message_delta │ → "I generated the number 42 for you." ▼ Stream Complete ``` ### Visual Indicators **Before Content:** ```markdown Running Tool: generate_random_number... Tool `generate_random_number` Completed. Result: 42 I generated the number 42 for you. ``` **After Content Started:** ```markdown The number is > 🔧 **Running Tool**: `generate_random_number` > ✅ **Tool Completed**: 42 actually 42. ``` --- ## System Prompt Extraction ### Multi-Source Priority System ```python async def _extract_system_prompt(self, body, messages, request_model, real_model_id): """ Priority order: 1. metadata.model.params.system (highest) 2. Model database lookup 3. body.params.system 4. messages[role="system"] (fallback) """ ``` ### Source 1: Metadata Model Params ```python # OpenWebUI injects model configuration metadata = body.get("metadata", {}) meta_model = metadata.get("model", {}) meta_params = meta_model.get("params", {}) system_prompt = meta_params.get("system") # Priority 1 ``` ### Source 2: Model Database ```python from open_webui.models.models import Models # Try multiple model ID variations model_ids_to_try = [ request_model, # "copilotsdk-claude-sonnet-4.5" request_model.removeprefix(...), # "claude-sonnet-4.5" real_model_id, # From valves ] for mid in model_ids_to_try: model_record = Models.get_model_by_id(mid) if model_record and hasattr(model_record, "params"): system_prompt = model_record.params.get("system") if system_prompt: break ``` ### Source 3: Body Params ```python body_params = body.get("params", {}) system_prompt = body_params.get("system") ``` ### Source 4: System Message ```python for msg in messages: if msg.get("role") == "system": system_prompt = self._extract_text_from_content(msg.get("content")) break ``` ### Configuration in SessionConfig ```python system_message_config = { "mode": "append", # Append to conversation context "content": system_prompt_content } session_config = SessionConfig( system_message=system_message_config, # ... other params ) ``` --- ## Configuration Parameters ### Valve Definitions | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `GH_TOKEN` | str | `""` | GitHub Fine-grained Token (requires 'Copilot Requests' permission) | | `MODEL_ID` | str | `"claude-sonnet-4.5"` | Default model when dynamic fetching fails | | `CLI_PATH` | str | `"/usr/local/bin/copilot"` | Path to Copilot CLI binary | | `DEBUG` | bool | `False` | Enable frontend console debug logging | | `LOG_LEVEL` | str | `"error"` | CLI log level: none, error, warning, info, debug, all | | `SHOW_THINKING` | bool | `True` | Display model reasoning in `` tags | | `SHOW_WORKSPACE_INFO` | bool | `True` | Show session workspace path in debug mode | | `EXCLUDE_KEYWORDS` | str | `""` | Comma-separated keywords to exclude models | | `WORKSPACE_DIR` | str | `""` | Restricted workspace directory (empty = process cwd) | | `INFINITE_SESSION` | bool | `True` | Enable automatic context compaction | | `COMPACTION_THRESHOLD` | float | `0.8` | Background compaction at 80% token usage | | `BUFFER_THRESHOLD` | float | `0.95` | Emergency threshold at 95% | | `TIMEOUT` | int | `300` | Stream chunk timeout (seconds) | | `CUSTOM_ENV_VARS` | str | `""` | JSON string of custom environment variables | | `ENABLE_TOOLS` | bool | `False` | Enable custom tool system | | `AVAILABLE_TOOLS` | str | `"all"` | Available tools: "all" or comma-separated list | ### Environment Variables ```bash # Set by _setup_env export COPILOT_CLI_PATH="/usr/local/bin/copilot" export GH_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx" export GITHUB_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx" # Custom variables (from CUSTOM_ENV_VARS valve) export CUSTOM_VAR_1="value1" export CUSTOM_VAR_2="value2" ``` --- ## Key Functions Reference ### Entry Points #### `pipe(body, __metadata__, __event_emitter__, __event_call__)` - **Purpose**: OpenWebUI stable entry point - **Returns**: Delegates to `_pipe_impl` #### `_pipe_impl(body, __metadata__, __event_emitter__, __event_call__)` - **Purpose**: Main request processing logic - **Flow**: Setup → Extract → Session → Response - **Returns**: `str` (non-streaming) or `AsyncGenerator` (streaming) #### `pipes()` - **Purpose**: Dynamic model list fetching - **Returns**: List of available models with multiplier info - **Caching**: Uses `_model_cache` to avoid repeated API calls ### Session Management #### `_build_session_config(chat_id, real_model_id, custom_tools, system_prompt_content, is_streaming)` - **Purpose**: Construct SessionConfig object - **Returns**: `SessionConfig` with infinite sessions and tools #### `_get_chat_context(body, __metadata__, __event_call__)` - **Purpose**: Extract chat_id with priority fallback - **Returns**: `{"chat_id": str}` ### Streaming #### `stream_response(client, session, send_payload, init_message, __event_call__)` - **Purpose**: Async streaming event processor - **Yields**: Text chunks to OpenWebUI - **Resources**: Auto-cleanup client and session #### `handler(event)` - **Purpose**: Sync event callback (inside `stream_response`) - **Action**: Parse event → Enqueue chunks → Update state ### Helpers #### `_emit_debug_log(message, __event_call__)` - **Purpose**: Send debug logs to frontend console - **Condition**: Only when `DEBUG=True` #### `_setup_env(__event_call__)` - **Purpose**: Locate CLI, set environment variables - **Side Effects**: Modifies `os.environ` #### `_extract_system_prompt(body, messages, request_model, real_model_id, __event_call__)` - **Purpose**: Multi-source system prompt extraction - **Returns**: `(system_prompt_content, source_name)` #### `_process_images(messages, __event_call__)` - **Purpose**: Extract text and images from multimodal messages - **Returns**: `(text_content, attachments_list)` #### `_initialize_custom_tools()` - **Purpose**: Register and filter custom tools - **Returns**: List of tool functions ### Utility Functions #### `get_event_type(event) -> str` - **Purpose**: Extract event type string from enum/string - **Handles**: `SessionEventType` enum → `.value` extraction #### `safe_get_data_attr(event, attr: str, default=None)` - **Purpose**: Safe attribute extraction from event.data - **Handles**: Both dict access and object attribute access --- ## Troubleshooting Guide ### Enable Debug Mode ```python # In OpenWebUI Valves UI: DEBUG = True SHOW_WORKSPACE_INFO = True LOG_LEVEL = "debug" ``` ### Debug Output Location **Frontend Console:** ```javascript // Open browser DevTools (F12) // Look for logs with prefix: [Copilot Pipe] console.debug("[Copilot Pipe] Extracted ChatID: abc123 (Source: __metadata__)") ``` **Backend Logs:** ```python # Python logging output logger.debug(f"[Copilot Pipe] Session resumed: {chat_id}") ``` ### Common Issues #### 1. Session Not Resuming **Symptom:** New session created every request **Causes:** - `chat_id` not extracted correctly - Session expired on Copilot side - `INFINITE_SESSION=False` (sessions not persistent) **Solution:** ```python # Check debug logs for: "Extracted ChatID: (Source: ...)" "Session not found (...), creating new." ``` #### 2. System Prompt Not Applied **Symptom:** Model ignores configured system prompt **Causes:** - Not found in any of 4 sources - Session resumed (system prompt only set on creation) **Solution:** ```python # Check debug logs for: "Extracted system prompt from (length: X)" "Configured system message (mode: append)" ``` #### 3. Tools Not Available **Symptom:** Model can't use custom tools **Causes:** - `ENABLE_TOOLS=False` - Tool not registered in `_initialize_custom_tools` - Wrong `AVAILABLE_TOOLS` filter **Solution:** ```python # Check debug logs for: "Enabled X custom tools: ['tool1', 'tool2']" ``` --- ## Performance Optimization ### Model List Caching ```python # First request: Fetch from API models = await client.list_models() self._model_cache = [...] # Cache result # Subsequent requests: Use cache if self._model_cache: return self._model_cache ``` ### Session Persistence **Impact:** Eliminates redundant model initialization on every request ```python # Without session: # Each request: Initialize model → Load context → Generate → Discard # With session (chat_id): # First request: Initialize model → Load context → Generate → Save # Later: Resume → Generate (instant) ``` ### Streaming vs Non-streaming **Streaming:** - Lower perceived latency (first token faster) - Better UX for long responses - Resource cleanup via generator exit **Non-streaming:** - Simpler error handling - Atomic response (no partial output) - Use for short responses --- ## Security Considerations ### Token Protection ```python # ❌ Never log tokens logger.debug(f"Token: {self.valves.GH_TOKEN}") # DON'T DO THIS # ✅ Mask sensitive data logger.debug(f"Token configured: {'*' * 10}") ``` ### Workspace Isolation ```python # Set WORKSPACE_DIR to restrict file access WORKSPACE_DIR = "/safe/sandbox/path" # Copilot CLI respects this directory client_config["cwd"] = WORKSPACE_DIR ``` ### Input Validation ```python # Validate chat_id format if chat_id and not re.match(r'^[a-zA-Z0-9_-]+$', chat_id): logger.warning(f"Invalid chat_id format: {chat_id}") chat_id = None ``` --- ## Future Enhancements ### Planned Features 1. **Multi-Session Management**: Support multiple parallel sessions per user 2. **Session Analytics**: Track token usage, compaction frequency 3. **Tool Result Caching**: Avoid redundant tool calls 4. **Custom Event Filters**: User-configurable event handling 5. **Workspace Templates**: Pre-configured workspace environments 6. **Streaming Abort**: Graceful cancellation of long-running requests ### API Evolution Monitoring Copilot SDK updates for: - New event types (e.g., `assistant.function_call`) - Enhanced tool capabilities - Improved session serialization --- ## References - [GitHub Copilot SDK Documentation](https://github.com/github/copilot-sdk) - [OpenWebUI Pipe Development](https://docs.openwebui.com/) - [OpenWebUI Extensions Project](https://github.com/Fu-Jie/openwebui-extensions) --- **License:** MIT **Maintainer:** Fu-Jie ([@Fu-Jie](https://github.com/Fu-Jie))