fix(pipes): fix MCP tool filtering and force-enable autonomous web search

- Fix an issue where the MCP tool filtering logic (function_name_filter_list) in the admin backend caused all tools to be hidden due to an ID prefix mismatch
- Force-enable the web_search tool for the Copilot Agent regardless of UI toggles, giving it full autonomy for search-related intents
- Update README and bump version to v0.9.1
This commit is contained in:
fujie
2026-03-04 00:11:28 +08:00
parent a8a324500a
commit c6279240b9
26 changed files with 3109 additions and 59 deletions

View File

@@ -1,6 +1,6 @@
# GitHub Copilot SDK Pipe for OpenWebUI
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.9.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.9.1 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that integrates the official [GitHub Copilot SDK](https://github.com/github/copilot-sdk). It enables you to use **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) **AND** your own models via **BYOK** (OpenAI, Anthropic) directly within OpenWebUI, providing a unified agentic experience with **strict User & Chat-level Workspace Isolation**.
@@ -14,19 +14,17 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
---
## ✨ v0.9.0: The Skills Revolution & Stability Update
## ✨ v0.9.1: MCP Tool Filtering & Web Search Reliability Fix
- **🧩 Copilot SDK Skills Support**: Native support for Copilot SDK skill directories (`SKILL.md` + resources).
- **🔄 OpenWebUI Skills Bridge**: Full bidirectional sync between OpenWebUI Workspace > Skills and SDK skill directories.
- **🛠️ Deterministic `manage_skills` Tool**: Expert tool for stable install/create/list/edit/delete skill operations.
- **🌊 Reinforced Status Bar**: Multi-layered locking mechanism (`session_finalized` guard) and atomic async delivery to prevent "stuck" indicators.
- **🗂️ Persistent Config Directory**: Added `COPILOTSDK_CONFIG_DIR` for stable session-state persistence across container restarts.
- **🐛 Fixed MCP tool filtering logic**: Resolved a critical issue where configuring `function_name_filter_list` (or selecting specific tools in UI) would cause all tools from that MCP server to be incorrectly hidden due to ID prefix mismatches (`server:mcp:`).
- **🌐 Autonomous Web Search**: `web_search` is now always enabled for the agent (bypassing the UI toggle), leveraging the Copilot SDK's native ability to decide when to search.
- **🔍 Improved filter stability**: Ensured tool-level whitelists apply reliably without breaking the entire server connection.
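The core of the fix can be sketched as follows (hypothetical helper names; the actual Pipe implementation differs): tool IDs arriving from the chat UI carry a `server:mcp:` prefix, so both sides of the comparison must be normalized before `function_name_filter_list` is applied.

```python
MCP_PREFIX = "server:mcp:"

def normalize_tool_id(tool_id: str) -> str:
    """Strip the chat-UI prefix so IDs match backend filter entries."""
    return tool_id[len(MCP_PREFIX):] if tool_id.startswith(MCP_PREFIX) else tool_id

def filter_tools(tool_ids, function_name_filter_list):
    """Keep only whitelisted tools; an empty filter list keeps everything."""
    if not function_name_filter_list:
        return list(tool_ids)
    allowed = {normalize_tool_id(t) for t in function_name_filter_list}
    return [t for t in tool_ids if normalize_tool_id(t) in allowed]
```

Without the normalization step, a prefixed ID never matches an unprefixed whitelist entry, which is how a configured filter ended up hiding every tool on the server.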
---
## ✨ Key Capabilities
- **🔑 Unified Intelligence (Official + BYOK)**: Seamlessly switch between official GitHub Copilot models (o1, GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash) and your own models (OpenAI, Anthropic) via **Bring Your Own Key** mode.
- **🔑 Unified Intelligence (Official + BYOK)**: Seamlessly switch between official GitHub Copilot models and your own models (OpenAI, Anthropic, DeepSeek, xAI) via **Bring Your Own Key** mode.
- **🛡️ Physical Workspace Isolation**: Every session runs in its own isolated directory sandbox. This ensures absolute data privacy and prevents cross-chat file contamination while allowing the Agent full filesystem access.
- **🔌 Universal Tool Protocol**:
- **Native MCP**: Direct, high-performance connection to Model Context Protocol servers.

View File

@@ -1,6 +1,6 @@
# GitHub Copilot SDK Official Pipe
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 0.9.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 0.9.1 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) with deep integration of the **GitHub Copilot SDK**. It supports the official **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) as well as **BYOK (Bring Your Own Key)** mode for custom providers (OpenAI, Anthropic), and provides **strict user- and chat-level workspace isolation** for a unified, secure agent experience.
@@ -8,17 +8,11 @@
> **Core companion component**
> To enable file processing and data analysis, be sure to install the [GitHub Copilot SDK Files Filter](https://openwebui.com/posts/403a62ee-a596-45e7-be65-fab9cc24dd6).
>
>## ✨ 0.9.0 Core Update: The Skills Revolution & Stability Hardening
>## ✨ 0.9.1 Latest Update: MCP Tool Filtering & Web Search Reliability Fix
- **🧩 Native Copilot SDK Skills Support**: Skills can be loaded and used as first-class context capabilities
- **🔄 OpenWebUI Skills Bridge**: Deep bidirectional sync between OpenWebUI **Workspace > Skills** and SDK skill directories
- **🛠️ Deterministic `manage_skills` Tool**: Skill lifecycle management through a stable tool contract
- **🌊 Reinforced Status Bar Logic**: A multi-layered `session_finalized` locking mechanism fully resolves status-bar rebound or stuck indicators after task completion.
- **🗂️ Persistent Environment Directory**: Strengthened `COPILOTSDK_CONFIG_DIR` logic keeps session state stable across container restarts.
- **🌐 Persistent Shared Cache (extension)**: Skills are stored centrally under `OPENWEBUI_SKILLS_SHARED_DIR/shared/` and reused across sessions and container restarts.
- **🎯 Smart Intent Routing (extension)**: Skill-management requests are automatically detected and routed to `manage_skills` first, ensuring deterministic execution.
- **🗂️ Environment Directory Upgrade**: Added `COPILOTSDK_CONFIG_DIR` with automatic fallback to `/app/backend/data/.copilot`, so SDK config and session state persist reliably across container restarts.
- **🧭 CLI Prompt Guardrails**: The system prompt clearly separates executable **tools** from non-invocable **skills**, requires skill lifecycle operations to go through `manage_skills` first, and tightens CLI/Python execution rules.
- **🐛 Fixed MCP Tool Filtering Logic**: Resolved an issue where configuring `function_name_filter_list` in the admin backend (or selecting specific tools in the chat UI) disabled every tool on the selected server due to faulty ID prefix (`server:mcp:`) matching
- **🌐 Autonomous Web Search**: The `web_search` tool is now force-enabled for the Agent (bypassing the UI web-search toggle), making full use of Copilot's own judgment about when to search
- **🔍 Improved Filter Stability**: With the ID normalization logic fixed, tool whitelists selected manually in chat or configured in the backend now apply reliably and no longer knock out the entire server
> [!TIP]
> **BYOK mode requires no subscription**
@@ -28,7 +22,7 @@
## ✨ Key Capabilities
- **🔑 Unified Intelligence (Official + BYOK)**: Freely switch between official models (o1, GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash) and custom providers (OpenAI, Anthropic) with **BYOK (Bring Your Own Key)** support.
- **🔑 Unified Intelligence (Official + BYOK)**: Freely switch between official models and custom providers (OpenAI, Anthropic, DeepSeek, xAI) with **BYOK (Bring Your Own Key)** support.
- **🛡️ Physical Workspace Isolation**: Each session runs in its own sandboxed directory, ensuring absolute data privacy and preventing cross-chat file contamination while giving the Agent full filesystem access.
- **🔌 Universal Tool Protocol**:
- **Native MCP**: High-performance direct connection to Model Context Protocol servers.

View File

@@ -15,7 +15,7 @@ Pipes allow you to:
## Available Pipe Plugins
- [GitHub Copilot SDK](github-copilot-sdk.md) (v0.9.0) - Official GitHub Copilot SDK integration. Features **Workspace Isolation**, **Zero-config OpenWebUI Tool Bridge**, **BYOK** support, and **dynamic MCP discovery**. **NEW in v0.9.0: OpenWebUI Skills Bridge**, reinforced status bar stability, and persistent SDK config management. [View Deep Dive](github-copilot-sdk-deep-dive.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.md) | [**View Detailed Usage Guide**](github-copilot-sdk-usage-guide.md).
- [GitHub Copilot SDK](github-copilot-sdk.md) (v0.9.1) - Official GitHub Copilot SDK integration. Features **Workspace Isolation**, **Zero-config OpenWebUI Tool Bridge**, **BYOK** support, and **dynamic MCP discovery**. **NEW in v0.9.1: MCP filter reliability fix** for `server:mcp:{id}` chat selection and function filter consistency. [View Deep Dive](github-copilot-sdk-deep-dive.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.md) | [**View Detailed Usage Guide**](github-copilot-sdk-usage-guide.md).
- **[Case Study: GitHub 100 Star Growth Analysis](star-prediction-example.md)** - Learn how to use the GitHub Copilot SDK Pipe with Minimax 2.1 to automatically analyze CSV data and generate project growth reports.
- **[Case Study: High-Quality Video to GIF Conversion](video-processing-example.md)** - See how the model uses system-level FFmpeg to accelerate, scale, and optimize colors for screen recordings.

View File

@@ -15,7 +15,7 @@ Pipes can be used to:
## Available Pipe Plugins
- [GitHub Copilot SDK](github-copilot-sdk.zh.md) (v0.9.0) - Official GitHub Copilot SDK integration. Features **secure workspace isolation**, **zero-config tool bridging**, and **BYOK (Bring Your Own Key) support**. **Headline v0.9.0 update: OpenWebUI Skills Bridge**, reinforced status-bar stability, and persistent SDK config directory management (`COPILOTSDK_CONFIG_DIR`). [View Deep Dive](github-copilot-sdk-deep-dive.zh.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.zh.md) | [**View Detailed Usage Guide**](github-copilot-sdk-usage-guide.zh.md).
- [GitHub Copilot SDK](github-copilot-sdk.zh.md) (v0.9.1) - Official GitHub Copilot SDK integration. Features **secure workspace isolation**, **zero-config tool bridging**, and **BYOK (Bring Your Own Key) support**. **v0.9.1 update: MCP filter reliability fix**, correcting `server:mcp:{id}` chat-selection matching and improving function-filter consistency. [View Deep Dive](github-copilot-sdk-deep-dive.zh.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.zh.md) | [**View Detailed Usage Guide**](github-copilot-sdk-usage-guide.zh.md).
- **[Case Study: GitHub 100-Star Growth Prediction](star-prediction-example.zh.md)** - Shows how to use the GitHub Copilot SDK Pipe with the Minimax 2.1 model to automatically write scripts that analyze CSV data and generate a detailed project growth report.
- **[Case Study: High-Quality Video-to-GIF Conversion & Acceleration](video-processing-example.zh.md)** - Demonstrates how the model uses low-level FFmpeg tooling to accelerate, scale, and apply two-pass color optimization to screen recordings.

View File

@@ -0,0 +1,138 @@
# Copilot SDK Automated Task Script Guide
This directory provides a general-purpose task execution script plus two example task scripts:
- `auto_programming_task.py` (general-purpose)
- `run_mindmap_action_to_tool.sh` (example: mind map action → tool)
- `run_infographic_action_to_tool.sh` (example: infographic action → tool)
## 1. Prerequisites
- Run from the repository root (very important)
- Python 3 available
- The Copilot SDK / CLI works in the current environment
Verify first:
```shell
python3 plugins/debug/copilot-sdk/auto_programming_task.py --help | head -40
```
---
## 2. Core Behavior (current default)
`auto_programming_task.py` defaults to **two-phase automatic execution**:
1) Plan first (Planning): the AI fills in context from your request and expands it into an executable plan.
2) Then execute (Execution): the AI modifies the code according to the plan and reports the result.
To disable planning, use `--no-plan-first`.
---
## 3. Copy-Paste Commands (general)
### 3.1 Most common: pass the task text directly
```shell
python3 plugins/debug/copilot-sdk/auto_programming_task.py \
  --task "Convert plugins/actions/xxx/xxx.py into a single-file Tool plugin under plugins/tools/xxx-tool/. Preserve the i18n and language-fallback logic. Do not upgrade the SDK version." \
  --cwd "$PWD" \
  --model "gpt-5.3-codex" \
  --reasoning-effort "xhigh" \
  --timeout 3600 \
  --stream \
  --trace-events \
  --heartbeat-seconds 8
```
### 3.2 Use a task file (recommended for long tasks)
Write the task file first (e.g., task.txt), then run:
```shell
python3 plugins/debug/copilot-sdk/auto_programming_task.py \
  --task-file "./task.txt" \
  --cwd "$PWD" \
  --model "gpt-5.3-codex" \
  --reasoning-effort "xhigh" \
  --timeout 3600 \
  --stream \
  --trace-events \
  --heartbeat-seconds 8
```
### 3.3 Disable planning (direct execution only)
```shell
python3 plugins/debug/copilot-sdk/auto_programming_task.py \
  --task "your task" \
  --cwd "$PWD" \
  --model "gpt-5-mini" \
  --reasoning-effort "medium" \
  --timeout 1800 \
  --no-plan-first
```
---
## 4. Copy-Paste Commands (example scripts)
### 4.1 Mind Map example task
```shell
./plugins/debug/copilot-sdk/run_mindmap_action_to_tool.sh
```
### 4.2 Infographic example task
```shell
./plugins/debug/copilot-sdk/run_infographic_action_to_tool.sh
```
Note: these two scripts are fixed task templates tailored to this repository; when copying them to another repo you will usually need to change the task content.
---
## 5. How to Judge a Run as "Complete"
All of the following should hold:
1) The process exit code is 0
2) The output contains the end-of-phase message (including the final summary)
3) You see `session.idle` (a `session.error` means the run did not complete)
4) `git diff --name-only` shows changes only within the scope you constrained
Copy-paste check commands:
```shell
echo $?
git diff --name-only
git status --short
```
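The four checks above can also be combined into one small script (a sketch: the `session.idle` check assumes you captured the runner's output, and `plugins/` is a placeholder for whatever scope you stated in the task):

```python
import subprocess

def run_completed(exit_code, output, changed_files, allowed_prefixes):
    """Combine the four completion criteria from section 5 above."""
    return (
        exit_code == 0
        and "session.idle" in output
        and "session.error" not in output
        and all(f.startswith(allowed_prefixes) for f in changed_files)
    )

if __name__ == "__main__":
    # Feed in the real diff; exit_code and output come from your runner invocation.
    changed = subprocess.run(
        ["git", "diff", "--name-only"], capture_output=True, text=True
    ).stdout.splitlines()
    print(run_completed(0, "session.idle", changed, ("plugins/",)))
```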
---
## 6. Parameter Quick Reference
- `--task`: pass the task text directly
- `--task-file`: read the task text from a file (mutually exclusive with `--task`)
- `--cwd`: workspace directory (`$PWD` recommended)
- `--model`: model (e.g., `gpt-5.3-codex`, `gpt-5-mini`)
- `--reasoning-effort`: `low|medium|high|xhigh`
- `--timeout`: timeout in seconds
- `--stream`: stream incremental output in real time
- `--trace-events`: print the event stream for debugging
- `--heartbeat-seconds`: heartbeat output interval
- `--no-plan-first`: disable the default plan-then-execute behavior
---
## 7. FAQ
### Q1: Why does it say the script cannot be found?
You are most likely not in the repository root. Run:
```shell
pwd
```
to confirm, then re-run the command.
### Q2: Long runs with no output?
Add `--trace-events --stream` and increase `--timeout` as needed.
### Q3: Changes exceeded the expected scope?
State the scope constraint explicitly in the task text, for example:
"Do not modify code in other files; the entire project may be read as context."
Then verify after completion with:
```shell
git diff --name-only
```

View File

@@ -0,0 +1,137 @@
#!/usr/bin/env bash
# run_owui_api_docs_phases.sh
# One-click runner: generate OpenWebUI API documentation across 8 phases.
#
# Usage:
# ./plugins/debug/copilot-sdk/run_owui_api_docs_phases.sh
# ./plugins/debug/copilot-sdk/run_owui_api_docs_phases.sh --start-phase 3
# ./plugins/debug/copilot-sdk/run_owui_api_docs_phases.sh --only-phase 1
#
# Working directory: /Users/fujie/app/python/oui/open-webui (open-webui source)
# Task files: plugins/debug/copilot-sdk/tasks/owui-api-docs/phases/
set -euo pipefail
# ── Resolve paths ────────────────────────────────────────────────────────────
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/../../.." && pwd)" # openwebui-extensions root
TASKS_DIR="${SCRIPT_DIR}/tasks/owui-api-docs/phases"
TARGET_CWD="/Users/fujie/app/python/oui/open-webui" # source repo to scan
RUNNER="${SCRIPT_DIR}/auto_programming_task.py"
PYTHON="${PYTHON:-python3}"
# ── Arguments ────────────────────────────────────────────────────────────────
START_PHASE=1
ONLY_PHASE=""
while [[ $# -gt 0 ]]; do
case "$1" in
--start-phase)
START_PHASE="$2"; shift 2 ;;
--only-phase)
ONLY_PHASE="$2"; shift 2 ;;
*)
echo "Unknown argument: $1" >&2; exit 1 ;;
esac
done
# ── Phase definitions ─────────────────────────────────────────────────────────
declare -a PHASE_FILES=(
"01_route_index.txt"
"02_auth_users_groups_models.txt"
"03_chats_channels_memories_notes.txt"
"04_files_folders_knowledge_retrieval.txt"
"05_ollama_openai_audio_images.txt"
"06_tools_functions_pipelines_skills_tasks.txt"
"07_configs_prompts_evaluations_analytics_scim_utils.txt"
"08_consolidation_index.txt"
)
declare -a PHASE_LABELS=(
"Route Index (master table)"
"Auth / Users / Groups / Models"
"Chats / Channels / Memories / Notes"
"Files / Folders / Knowledge / Retrieval"
"Ollama / OpenAI / Audio / Images"
"Tools / Functions / Pipelines / Skills / Tasks"
"Configs / Prompts / Evaluations / Analytics / SCIM / Utils"
"Consolidation — README + JSON"
)
# ── Pre-flight checks ─────────────────────────────────────────────────────────
echo "============================================================"
echo " OpenWebUI API Docs — Phase Runner"
echo "============================================================"
echo " Source (--cwd): ${TARGET_CWD}"
echo " Task files: ${TASKS_DIR}"
echo " Runner: ${RUNNER}"
echo ""
if [[ ! -d "${TARGET_CWD}" ]]; then
echo "ERROR: Target source directory not found: ${TARGET_CWD}" >&2
exit 1
fi
if [[ ! -f "${RUNNER}" ]]; then
echo "ERROR: Runner script not found: ${RUNNER}" >&2
exit 1
fi
# ── Run phases ────────────────────────────────────────────────────────────────
TOTAL=${#PHASE_FILES[@]}
PASSED=0
FAILED=0
for i in "${!PHASE_FILES[@]}"; do
PHASE_NUM=$((i + 1))
TASK_FILE="${TASKS_DIR}/${PHASE_FILES[$i]}"
LABEL="${PHASE_LABELS[$i]}"
# --only-phase filter
if [[ -n "${ONLY_PHASE}" && "${PHASE_NUM}" != "${ONLY_PHASE}" ]]; then
echo " [SKIP] Phase ${PHASE_NUM}: ${LABEL}"
continue
fi
# --start-phase filter
if [[ "${PHASE_NUM}" -lt "${START_PHASE}" ]]; then
echo " [SKIP] Phase ${PHASE_NUM}: ${LABEL} (before start phase)"
continue
fi
if [[ ! -f "${TASK_FILE}" ]]; then
echo " [ERROR] Task file not found: ${TASK_FILE}" >&2
FAILED=$((FAILED + 1))
break
fi
echo ""
echo "──────────────────────────────────────────────────────────"
echo " Phase ${PHASE_NUM}/${TOTAL}: ${LABEL}"
echo " Task file: ${PHASE_FILES[$i]}"
echo "──────────────────────────────────────────────────────────"
if "${PYTHON}" "${RUNNER}" \
--task-file "${TASK_FILE}" \
--cwd "${TARGET_CWD}" \
--model "claude-sonnet-4.6" \
--reasoning-effort high \
--no-plan-first; then
echo " ✓ Phase ${PHASE_NUM} completed successfully."
PASSED=$((PASSED + 1))
else
EXIT_CODE=$?
echo ""
echo " ✗ Phase ${PHASE_NUM} FAILED (exit code: ${EXIT_CODE})." >&2
echo " Fix the issue and re-run with: --start-phase ${PHASE_NUM}" >&2
FAILED=$((FAILED + 1))
exit "${EXIT_CODE}"
fi
done
# ── Summary ──────────────────────────────────────────────────────────────────
echo ""
echo "============================================================"
echo " Run complete: ${PASSED} passed, ${FAILED} failed"
echo " Output: ${TARGET_CWD}/api_docs/"
echo "============================================================"

View File

@@ -0,0 +1,74 @@
# OpenWebUI API Documentation — Phase Run Order
## Overview
This task set reads the OpenWebUI backend source code and generates a complete
API reference in `api_docs/` inside the open-webui repository.
**Source repo:** `/Users/fujie/app/python/oui/open-webui`
**Output directory:** `/Users/fujie/app/python/oui/open-webui/api_docs/`
**Task files dir:** `plugins/debug/copilot-sdk/tasks/owui-api-docs/phases/`
---
## Phase Execution Order
Run phases sequentially. Each phase depends on the previous.
| Order | Task File | Coverage | ~Lines Read |
|-------|-----------|----------|-------------|
| 1 | `01_route_index.txt` | main.py + all 26 router files → master route table | ~15,000 |
| 2 | `02_auth_users_groups_models.txt` | auths, users, groups, models | ~4,600 |
| 3 | `03_chats_channels_memories_notes.txt` | chats, channels, memories, notes | ~5,500 |
| 4 | `04_files_folders_knowledge_retrieval.txt` | files, folders, knowledge, retrieval | ~5,200 |
| 5 | `05_ollama_openai_audio_images.txt` | ollama, openai, audio, images | ~6,900 |
| 6 | `06_tools_functions_pipelines_skills_tasks.txt` | tools, functions, pipelines, skills, tasks | ~3,200 |
| 7 | `07_configs_prompts_evaluations_analytics_scim_utils.txt` | configs, prompts, evaluations, analytics, scim, utils | ~3,400 |
| 8 | `08_consolidation_index.txt` | Consolidates all outputs → README.md + JSON | (reads generated files) |
---
## Output Files (after all phases complete)
```
open-webui/api_docs/
├── README.md ← Master index + quick reference
├── 00_route_index.md ← Complete route table (200+ endpoints)
├── 02_auths.md
├── 02_users.md
├── 02_groups.md
├── 02_models.md
├── 03_chats.md
├── 03_channels.md
├── 03_memories.md
├── 03_notes.md
├── 04_files.md
├── 04_folders.md
├── 04_knowledge.md
├── 04_retrieval.md
├── 05_ollama.md
├── 05_openai.md
├── 05_audio.md
├── 05_images.md
├── 06_tools.md
├── 06_functions.md
├── 06_pipelines.md
├── 06_skills.md
├── 06_tasks.md
├── 07_configs.md
├── 07_prompts.md
├── 07_evaluations.md
├── 07_analytics.md
├── 07_scim.md
├── 07_utils.md
└── openwebui_api.json ← Machine-readable summary (all routes)
```
---
## Notes
- Each phase uses `--no-plan-first` (detailed instructions already provided).
- Working directory for all phases: `/Users/fujie/app/python/oui/open-webui`
- The one-click runner: `run_owui_api_docs_phases.sh`
- If a phase fails, fix the issue and re-run that single phase before continuing.

View File

@@ -0,0 +1,41 @@
Phase 1 Mission:
Scan the entire OpenWebUI backend source and produce a master route index table.
Source root: backend/open_webui/
Target output directory: api_docs/
Constraints:
- Read-only on ALL files EXCEPT under api_docs/ (create it if missing).
- Do NOT generate per-endpoint detail yet — only the master table.
- Cover every router file in backend/open_webui/routers/.
- Also read backend/open_webui/main.py to capture route prefixes (app.include_router calls).
Deliverables:
1) Create directory: api_docs/
2) Create file: api_docs/00_route_index.md
Content of 00_route_index.md must contain:
- A table with columns: Module | HTTP Method | Path | Handler Function | Auth Required | Brief Description
- One row per route decorator found in every router file.
- "Auth Required" = YES if the route depends on get_verified_user / get_admin_user / similar dependency, NO otherwise.
- "Brief Description" = first sentence of the handler's docstring, or empty string if none.
- Group rows by Module (router file name without .py).
- At the top: a summary section listing total_route_count and module_count.
Process:
1. Read main.py — extract all app.include_router() calls, note prefix and tags per router.
2. For each router file in backend/open_webui/routers/, read it fully.
3. Find every @router.get/@router.post/@router.put/@router.delete/@router.patch decorator.
4. For each decorator: record path, method, function name, auth dependency, docstring.
5. Write the combined table to api_docs/00_route_index.md.
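As a rough illustration of steps 2–4 (a regex-based sketch only; decorators spanning multiple lines or routers bound under other names need real AST parsing):

```python
import re

# Matches simple single-line route decorators such as @router.get("/items")
ROUTE_RE = re.compile(
    r'@router\.(get|post|put|delete|patch)\(\s*["\']([^"\']+)["\']'
)

def scan_routes(source: str):
    """Return (method, path, function_name) per simple route decorator."""
    rows = []
    for m in ROUTE_RE.finditer(source):
        method, path = m.group(1).upper(), m.group(2)
        # The decorated handler follows the decorator block.
        fn = re.search(r"def\s+(\w+)", source[m.end():])
        rows.append((method, path, fn.group(1) if fn else ""))
    return rows
```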
Exit Criteria:
- api_docs/00_route_index.md exists.
- Table contains at least 100 rows (the codebase has 200+ routes).
- No placeholder or TBD.
- Total route count printed at the top.
Final output format:
- List of files created/updated.
- Total routes found.
- Any router files that could not be parsed and why.

View File

@@ -0,0 +1,82 @@
Phase 2 Mission:
Generate detailed API reference documentation for authentication, users, groups, and models endpoints.
Prerequisites:
- api_docs/00_route_index.md must already exist (from Phase 1).
Source files to read (fully):
- backend/open_webui/routers/auths.py
- backend/open_webui/routers/users.py
- backend/open_webui/routers/groups.py
- backend/open_webui/routers/models.py
- backend/open_webui/models/auths.py (Pydantic models)
- backend/open_webui/models/users.py
- backend/open_webui/models/groups.py (if exists)
- backend/open_webui/models/models.py
Output files to create under api_docs/:
- 02_auths.md
- 02_users.md
- 02_groups.md
- 02_models.md
Per-endpoint format (use this EXACTLY for every endpoint in each file):
---
### {HTTP_METHOD} {full_path}
**Summary:** One sentence description.
**Auth:** Admin only | Verified user | Public
**Request**
| Location | Field | Type | Required | Description |
|----------|-------|------|----------|-------------|
| Header | Authorization | Bearer token | Yes | JWT token |
| Body | field_name | type | Yes/No | description |
| Query | param_name | type | No | description |
| Path | param_name | type | Yes | description |
*If no request body/params, write: "No additional parameters."*
**Response `200`**
```json
{
"example_field": "example_value"
}
```
| Field | Type | Description |
|-------|------|-------------|
| field_name | type | description |
**Error Responses**
| Status | Meaning |
|--------|---------|
| 400 | Bad request / validation error |
| 401 | Not authenticated |
| 403 | Insufficient permissions |
| 404 | Resource not found |
---
Instructions:
1. Read each router file fully to understand every route.
2. Trace Pydantic model definitions from the corresponding models/ file.
3. Fill in every field from actual code — no guessing.
4. If a field is Optional with a default, mark Required = No.
5. For auth: check FastAPI dependency injection (Depends(get_verified_user) → "Verified user", Depends(get_admin_user) → "Admin only").
6. List ALL endpoints in the router — do not skip any.
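Step 5's dependency-to-label mapping can be approximated as follows (a sketch; routers that wrap these dependencies in custom helpers will not be caught by this):

```python
import re

# Ordered: admin check wins if both appear in the signature.
AUTH_LABELS = [
    ("get_admin_user", "Admin only"),
    ("get_verified_user", "Verified user"),
]

def auth_label(handler_source: str) -> str:
    """Map a FastAPI Depends(...) auth dependency to the doc's Auth label."""
    for dep, label in AUTH_LABELS:
        if re.search(rf"Depends\(\s*{dep}\s*\)", handler_source):
            return label
    return "Public"
```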
Exit Criteria:
- 4 output files created.
- Every route from 00_route_index.md for these modules is covered.
- No placeholder or TBD.
Final output format:
- List of files created.
- Count of endpoints documented per file.

View File

@@ -0,0 +1,87 @@
Phase 3 Mission:
Generate detailed API reference documentation for chat, channels, memories, and notes endpoints.
Prerequisites:
- api_docs/00_route_index.md must already exist (from Phase 1).
Source files to read (fully):
- backend/open_webui/routers/chats.py
- backend/open_webui/routers/channels.py
- backend/open_webui/routers/memories.py
- backend/open_webui/routers/notes.py
- backend/open_webui/models/chats.py (Pydantic models)
- backend/open_webui/models/channels.py
- backend/open_webui/models/memories.py
- backend/open_webui/models/notes.py (if exists)
- backend/open_webui/models/messages.py (shared message models)
Output files to create under api_docs/:
- 03_chats.md
- 03_channels.md
- 03_memories.md
- 03_notes.md
Per-endpoint format:
---
### {HTTP_METHOD} {full_path}
**Summary:** One sentence description.
**Auth:** Admin only | Verified user | Public
**Request**
| Location | Field | Type | Required | Description |
|----------|-------|------|----------|-------------|
| Body | field_name | type | Yes/No | description |
*If no parameters, write: "No additional parameters."*
**Response `200`**
```json
{
"example_field": "example_value"
}
```
| Field | Type | Description |
|-------|------|-------------|
| field_name | type | description |
**Error Responses**
| Status | Meaning |
|--------|---------|
| 401 | Not authenticated |
| 403 | Insufficient permissions |
| 404 | Resource not found |
---
Special notes for this phase:
- chats.py is 1527 lines with ~40 routes — document ALL of them.
- channels.py is 2133 lines — document ALL routes; note WebSocket upgrade endpoints separately.
- For WebSocket endpoints: note the protocol (ws://) and describe events/message payload format.
- Pay special attention to chat history structure: messages array, history.messages dict.
- Note pagination parameters (skip, limit, page) where applicable.
Instructions:
1. Read each router file fully.
2. Trace Pydantic model definitions from the corresponding models/ file.
3. For complex response types (list of chats, paginated results), show the wrapper structure.
4. If a route modifies chat history, document the exact history object shape.
5. List ALL endpoints — do not skip paginated variants.
Exit Criteria:
- 4 output files created.
- Every route from 00_route_index.md for these modules is covered.
- WebSocket endpoints documented with payload shape.
- No placeholder or TBD.
Final output format:
- List of files created.
- Count of endpoints documented per file.
- Note any complex schemas that required deep tracing.

View File

@@ -0,0 +1,94 @@
Phase 4 Mission:
Generate detailed API reference documentation for files, folders, knowledge base, and retrieval endpoints.
Prerequisites:
- api_docs/00_route_index.md must already exist (from Phase 1).
Source files to read (fully):
- backend/open_webui/routers/files.py (~911 lines)
- backend/open_webui/routers/folders.py (~351 lines)
- backend/open_webui/routers/knowledge.py (~1139 lines)
- backend/open_webui/routers/retrieval.py (~2820 lines — LARGEST FILE)
- backend/open_webui/models/files.py
- backend/open_webui/models/folders.py
- backend/open_webui/models/knowledge.py
Output files to create under api_docs/:
- 04_files.md
- 04_folders.md
- 04_knowledge.md
- 04_retrieval.md
Per-endpoint format:
---
### {HTTP_METHOD} {full_path}
**Summary:** One sentence description.
**Auth:** Admin only | Verified user | Public
**Request**
| Location | Field | Type | Required | Description |
|----------|-------|------|----------|-------------|
| Body | field_name | type | Yes/No | description |
*If no parameters, write: "No additional parameters."*
**Response `200`**
```json
{
"example_field": "example_value"
}
```
| Field | Type | Description |
|-------|------|-------------|
| field_name | type | description |
**Error Responses**
| Status | Meaning |
|--------|---------|
| 401 | Not authenticated |
| 404 | Resource not found |
---
Special notes for this phase:
FILES:
- File upload uses multipart/form-data — document the form fields.
- File metadata response: id, filename, meta.content_type, size, user_id.
- File content endpoint: returns raw bytes — note Content-Type header behavior.
KNOWLEDGE:
- Knowledge base endpoints interact with vector store — note which ones trigger embedding/indexing.
- Document the "files" array in knowledge base objects (which file IDs are linked).
- Add/remove files from knowledge base: document the exact request shape.
RETRIEVAL:
- retrieval.py is 2820 lines; it configures the RAG pipeline (embedding models, chunk settings, etc.).
- Prioritize documenting: query endpoint, config GET/POST endpoints, embedding model endpoints.
- For config endpoints: document ALL configuration fields (chunk_size, chunk_overlap, top_k, etc.).
- Document the "process" endpoints (process_doc, process_web, process_youtube) with their request shapes.
Instructions:
1. Read ALL source files listed above.
2. For retrieval.py: focus on public API surface (router endpoints), not internal helper functions.
3. Document file upload endpoints with multipart form fields clearly marked.
4. Trace vector DB config models in retrieval.py to document all configurable fields.
Exit Criteria:
- 4 output files created.
- retrieval.py endpoints fully documented including all config fields.
- File upload endpoints show form-data field names.
- No placeholder or TBD.
Final output format:
- List of files created.
- Count of endpoints documented per file.
- Note any tricky schemas (nested config objects, etc.).

View File

@@ -0,0 +1,98 @@
Phase 5 Mission:
Generate detailed API reference documentation for AI provider endpoints: Ollama, OpenAI-compatible, Audio, and Images.
Prerequisites:
- api_docs/00_route_index.md must already exist (from Phase 1).
Source files to read (fully):
- backend/open_webui/routers/ollama.py (~1884 lines)
- backend/open_webui/routers/openai.py (~1466 lines)
- backend/open_webui/routers/audio.py (~1397 lines)
- backend/open_webui/routers/images.py (~1164 lines)
Output files to create under api_docs/:
- 05_ollama.md
- 05_openai.md
- 05_audio.md
- 05_images.md
Per-endpoint format:
---
### {HTTP_METHOD} {full_path}
**Summary:** One sentence description.
**Auth:** Admin only | Verified user | Public
**Request**
| Location | Field | Type | Required | Description |
|----------|-------|------|----------|-------------|
| Body | field_name | type | Yes/No | description |
**Response `200`**
```json
{
"example_field": "example_value"
}
```
| Field | Type | Description |
|-------|------|-------------|
| field_name | type | description |
**Streaming:** Yes / No *(add this line for endpoints that support SSE/streaming)*
**Error Responses**
| Status | Meaning |
|--------|---------|
| 401 | Not authenticated |
| 503 | Upstream provider unavailable |
---
Special notes for this phase:
OLLAMA:
- Endpoints are mostly pass-through proxies to Ollama's own API.
- Document which endpoints are admin-only (model management) vs user-accessible (generate/chat).
- For streaming endpoints (generate, chat), note: "Supports SSE streaming via stream=true."
- Document the model pull/push/delete management endpoints carefully.
OPENAI:
- Endpoints proxy to configured OpenAI-compatible backend.
- Document the /api/openai/models endpoint (returns merged model list).
- Note which endpoints pass through request body to upstream unchanged.
- Document admin endpoints for adding/removing OpenAI API connections.
AUDIO:
- Document: transcription (STT), TTS synthesis, and audio config endpoints.
- For file upload endpoints: specify multipart/form-data field names.
- Document supported audio formats and any size limits visible in code.
- Note: Engine types (openai, whisper, etc.) and configuration endpoints.
IMAGES:
- Document: image generation endpoints and image engine config.
- Note DALL-E vs ComfyUI vs Automatic1111 backend differences if documented in code.
- Document image config GET/POST: size, steps, model, and other parameters.
Instructions:
1. Read each file fully — they are complex proxying routers.
2. For pass-through proxy routes: still document the expected request/response shape.
3. Distinguish between admin configuration routes and user-facing generation routes.
4. Streaming endpoints must be clearly marked with "Streaming: Yes" and note the SSE event format.
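For step 4, a hedged sketch of parsing one line of a streamed response body (an assumption about the two common wire formats: Ollama streams bare JSON lines, while OpenAI-style SSE prefixes each with `data: ` and terminates with `data: [DONE]`):

```python
import json

def parse_stream_line(line: bytes):
    """Parse one streamed line into a dict, or None for blanks/terminators."""
    line = line.strip()
    if not line:
        return None
    if line.startswith(b"data: "):  # OpenAI-style SSE framing
        line = line[len(b"data: "):]
    if line == b"[DONE]":  # SSE stream terminator
        return None
    return json.loads(line)
```

Feeding each line from a `requests.post(..., stream=True)` response's `iter_lines()` through this yields the incremental chunks whose shape the endpoint docs should describe.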
Exit Criteria:
- 4 output files created.
- Every route from 00_route_index.md for these modules is covered.
- Streaming endpoints clearly annotated.
- No placeholder or TBD.
Final output format:
- List of files created.
- Count of endpoints documented per file.
- Note streaming endpoints count per module.

View File

@@ -0,0 +1,103 @@
Phase 6 Mission:
Generate detailed API reference documentation for tools, functions, pipelines, skills, and tasks endpoints.
Prerequisites:
- api_docs/00_route_index.md must already exist (from Phase 1).
Source files to read (fully):
- backend/open_webui/routers/tools.py (~868 lines)
- backend/open_webui/routers/functions.py (~605 lines)
- backend/open_webui/routers/pipelines.py (~540 lines)
- backend/open_webui/routers/skills.py (~447 lines)
- backend/open_webui/routers/tasks.py (~764 lines)
- backend/open_webui/models/tools.py
- backend/open_webui/models/functions.py
- backend/open_webui/models/skills.py
Output files to create under api_docs/:
- 06_tools.md
- 06_functions.md
- 06_pipelines.md
- 06_skills.md
- 06_tasks.md
Per-endpoint format:
---
### {HTTP_METHOD} {full_path}
**Summary:** One sentence description.
**Auth:** Admin only | Verified user | Public
**Request**
| Location | Field | Type | Required | Description |
|----------|-------|------|----------|-------------|
| Body | field_name | type | Yes/No | description |
**Response `200`**
```json
{
"example_field": "example_value"
}
```
| Field | Type | Description |
|-------|------|-------------|
| field_name | type | description |
**Error Responses**
| Status | Meaning |
|--------|---------|
| 401 | Not authenticated |
| 404 | Resource not found |
---
Special notes for this phase:
TOOLS:
- Tools are user-created Python functions exposed to LLM. Document CRUD operations.
- The tool "specs" field: document its structure (list of OpenAI function call specs).
- Document the "export" endpoint if present.
FUNCTIONS:
- Functions include filters, actions, pipes registered by admin.
- Document the `type` field values: "filter", "action", "pipe".
- Document the `meta` and `valves` fields structure.
PIPELINES:
- Pipelines connect to external pipeline servers.
- Document: add pipeline (URL + API key), list pipelines, get valves, set valves.
- Note: pipelines proxy through to an external server; document that behavior.
SKILLS:
- Skills are agent-style plugins with multi-step execution.
- Document the skills schema: name, content (Python source), meta.
- Note if there's a "call" endpoint for executing a skill.
TASKS:
- Tasks module handles background processing (title generation, tag generation, etc.).
- Document config endpoints (GET/POST for task-specific LLM settings).
- Document any direct invocation endpoints.
Instructions:
1. Read all source files fully.
2. For valves/specs/meta fields with complex structure, show the full nested schema.
3. Distinguish admin-only CRUD from user-accessible execution endpoints.
4. For endpoints that execute code (tools, functions, skills), clearly note security implications.
Exit Criteria:
- 5 output files created.
- Every route from 00_route_index.md for these modules is covered.
- Complex nested schemas (valves, specs, meta) fully documented.
- No placeholder or TBD.
Final output format:
- List of files created.
- Count of endpoints documented per file.

View File

@@ -0,0 +1,109 @@
Phase 7 Mission:
Generate detailed API reference documentation for configuration, prompts, evaluations, analytics, SCIM, and utility endpoints.
Prerequisites:
- api_docs/00_route_index.md must already exist (from Phase 1).
Source files to read (fully):
- backend/open_webui/routers/configs.py (~548 lines)
- backend/open_webui/routers/prompts.py (~759 lines)
- backend/open_webui/routers/evaluations.py (~466 lines)
- backend/open_webui/routers/analytics.py (~454 lines)
- backend/open_webui/routers/scim.py (~1030 lines)
- backend/open_webui/routers/utils.py (~123 lines)
- backend/open_webui/models/prompts.py
- backend/open_webui/config.py (for config field definitions)
Output files to create under api_docs/:
- 07_configs.md
- 07_prompts.md
- 07_evaluations.md
- 07_analytics.md
- 07_scim.md
- 07_utils.md
Per-endpoint format:
---
### {HTTP_METHOD} {full_path}
**Summary:** One sentence description.
**Auth:** Admin only | Verified user | Public
**Request**
| Location | Field | Type | Required | Description |
|----------|-------|------|----------|-------------|
| Body | field_name | type | Yes/No | description |
**Response `200`**
```json
{
"example_field": "example_value"
}
```
| Field | Type | Description |
|-------|------|-------------|
| field_name | type | description |
**Error Responses**
| Status | Meaning |
|--------|---------|
| 401 | Not authenticated |
| 404 | Resource not found |
---
Special notes for this phase:
CONFIGS:
- This is the most important module in this phase.
- The global config GET/POST endpoints control system-wide settings.
- Read backend/open_webui/config.py to enumerate ALL configurable fields.
- Document every config field with its type, default, and effect.
- Group config fields by category (auth, RAG, models, UI, etc.) in the output.
PROMPTS:
- System prompts stored by users.
- Document CRUD operations and the command field (trigger word like "/summarize").
- Note the "access_control" field structure.
EVALUATIONS:
- Feedback/rating data for model responses.
- Document the feedback object structure (rating, comment, model_id, etc.).
- Note any aggregation/analytics endpoints.
ANALYTICS:
- Usage statistics endpoints.
- Document what metrics are tracked and aggregation options.
SCIM:
- SCIM 2.0 protocol for enterprise user/group provisioning.
- Document: /Users, /Groups, /ServiceProviderConfig, /ResourceTypes endpoints.
- Note: SCIM uses different Content-Type and auth mechanism — document these.
- Follow SCIM 2.0 RFC schema format for user/group objects.
UTILS:
- Miscellaneous utility endpoints.
- Document all available utilities (markdown renderer, code executor, etc.).
Instructions:
1. Read config.py in addition to router files to get complete field lists.
2. For SCIM: follow SCIM 2.0 RFC conventions in documentation format.
3. For configs: produce a separate "All Config Fields" appendix table.
Exit Criteria:
- 6 output files created.
- configs.md includes appendix table of ALL config fields with defaults.
- scim.md follows SCIM 2.0 documentation conventions.
- No placeholder or TBD.
Final output format:
- List of files created.
- Count of endpoints documented per file.
- Count of config fields documented in configs.md appendix.

View File

@@ -0,0 +1,89 @@
Phase 8 Mission:
Consolidate all previously generated phase outputs into a polished master index and a machine-readable summary.
Prerequisites:
- ALL phase 1-7 output files must exist under api_docs/.
- Specifically, these files must exist:
- api_docs/00_route_index.md
- api_docs/02_auths.md, 02_users.md, 02_groups.md, 02_models.md
- api_docs/03_chats.md, 03_channels.md, 03_memories.md, 03_notes.md
- api_docs/04_files.md, 04_folders.md, 04_knowledge.md, 04_retrieval.md
- api_docs/05_ollama.md, 05_openai.md, 05_audio.md, 05_images.md
- api_docs/06_tools.md, 06_functions.md, 06_pipelines.md, 06_skills.md, 06_tasks.md
- api_docs/07_configs.md, 07_prompts.md, 07_evaluations.md, 07_analytics.md, 07_scim.md, 07_utils.md
Output files to create/update under api_docs/:
1. api_docs/README.md — human-readable master index
2. api_docs/openwebui_api.json — machine-readable OpenAPI-style JSON summary
Content of README.md:
- Title: "OpenWebUI Backend API Reference"
- Subtitle: "Auto-generated from source code. Do not edit manually."
- Generation date (today's date)
- Table of Contents (links to every .md file above)
- Statistics:
- Total module count
- Total route count (from 00_route_index.md)
- Admin-only route count
- Public route count
- Streaming endpoint count
- Quick Reference: a condensed table of the 20 most commonly used endpoints (chat creation, message send, file upload, model list, auth login/logout, etc.)
- Authentication Guide section:
- How to get a JWT token (reference auths.md)
- How to include it in requests (Authorization: Bearer <token>)
- Token expiry behavior
- Common Patterns section:
- Pagination (skip/limit parameters)
- Error response shape: {detail: string}
- Rate limiting (if documented in code)
Content of openwebui_api.json:
A JSON object with this structure:
{
"meta": {
"generated_date": "YYYY-MM-DD",
"source": "backend/open_webui/routers/",
"total_routes": <number>,
"modules": [<list of module names>]
},
"routes": [
{
"module": "auths",
"method": "POST",
"path": "/api/v1/auths/signin",
"handler": "signin",
"auth_required": false,
"auth_type": "public",
"summary": "Authenticate user and return JWT token.",
"request_body": {
"email": {"type": "str", "required": true},
"password": {"type": "str", "required": true}
},
"response_200": {
"token": {"type": "str"},
"token_type": {"type": "str"},
"id": {"type": "str"}
},
"streaming": false
}
]
}
- Include ALL routes from all modules.
- For streaming endpoints: "streaming": true.
Instructions:
1. Read ALL generated phase output files (00 through 07).
2. Parse or summarize endpoint data from each file to populate the JSON.
3. Write README.md with complete statistics and quick reference.
4. Validate: total_routes in README.md must match count in openwebui_api.json.
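Step 4's cross-check can be sketched as a small helper (hypothetical: the "Total routes:" label and the JSON layout are assumed from the structures described above):

```python
import json
import re

def routes_are_consistent(readme_text: str, api_json_text: str) -> bool:
    """Check that the route count stated in README.md matches the JSON summary.

    Assumes the README contains a line like "Total routes: 123" and that the
    JSON follows the sketch above (meta.total_routes plus a routes[] array).
    """
    match = re.search(r"Total routes:\s*(\d+)", readme_text)
    if not match:
        return False
    readme_total = int(match.group(1))
    data = json.loads(api_json_text)
    # All three counts must agree: README, meta field, and actual array length.
    return readme_total == data["meta"]["total_routes"] == len(data["routes"])
```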
Exit Criteria:
- api_docs/README.md exists with statistics and ToC.
- api_docs/openwebui_api.json exists with all routes (valid JSON).
- Route counts in README.md and JSON are consistent.
- No placeholder or TBD.
Final output format:
- Confirmation of files created.
- Total routes count in JSON.
- Any modules with missing or incomplete data (for manual review).

View File

@@ -0,0 +1,192 @@
# 🧭 Agents Stability & Friendliness Guide
This guide focuses on how to improve **reliability** and **user experience** of agents in `github_copilot_sdk.py`.
---
## 1) Goals
- Reduce avoidable failures (timeouts, tool-call dead ends, invalid outputs).
- Keep responses predictable under stress (large context, unstable upstream, partial tool failures).
- Make interaction friendly (clear progress, clarification before risky actions, graceful fallback).
- Preserve backwards compatibility while introducing stronger defaults.
---
## 2) Stability model (4 layers)
### Layer A — Input safety
- Validate essential runtime context early (user/chat/model/tool availability).
- Use strict parsing for JSON-like user/task config (never trust raw free text).
- Add guardrails for unsupported mode combinations (e.g., no tools + tool-required task).
**Implementation hints**
- Add preflight validator before `create_session`.
- Return fast-fail structured errors with recovery actions.
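A minimal preflight validator sketch (field names such as `user_id` and `task_requires_tools` are illustrative, not the pipe's actual context schema):

```python
def preflight_validate(ctx: dict) -> list[str]:
    """Return a list of human-readable errors; an empty list means safe to proceed."""
    errors = []
    # Essential runtime context must be present before create_session.
    for field in ("user_id", "chat_id", "model"):
        if not ctx.get(field):
            errors.append(f"missing required context: {field}")
    # Guardrail: a tool-required task with no tools enabled is a dead end.
    if ctx.get("task_requires_tools") and not ctx.get("tools"):
        errors.append("task requires tools but no tools are enabled")
    return errors
```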
### Layer B — Session safety
- Use profile-driven defaults (`model`, `reasoning_effort`, `infinite_sessions` thresholds).
- Auto-fallback to safe profile when unknown profile is requested.
- Isolate each chat in a deterministic workspace path.
**Implementation hints**
- Add `AGENT_PROFILE` + fallback to `default`.
- Keep `infinite_sessions` enabled by default for long tasks.
### Layer C — Tool-call safety
- Add `on_pre_tool_use` to validate and sanitize args.
- Add denylist/allowlist checks for dangerous operations.
- Add timeout budget per tool class (file/network/shell).
**Implementation hints**
- Keep current `on_post_tool_use` behavior.
- Extend hooks gradually: `on_pre_tool_use` first, then `on_error_occurred`.
### Layer D — Recovery safety
- Retry only idempotent operations with capped attempts.
- Distinguish recoverable vs non-recoverable failures.
- Add deterministic fallback path (summary answer + explicit limitation).
**Implementation hints**
- Retry policy table by event type.
- Emit "what succeeded / what failed / what to do next" blocks.
---
## 3) Friendliness model (UX contract)
### A. Clarification first for ambiguity
Use `on_user_input_request` for:
- Missing constraints (scope, target path, output format)
- High-risk actions (delete/migrate/overwrite)
- Contradictory instructions
**Rule**: ask once with concise choices; avoid repeated back-and-forth.
### B. Progress visibility
Emit status in major phases:
1. Context check
2. Planning/analysis
3. Tool execution
4. Verification
5. Final result
**Rule**: no silent waits > 8 seconds.
### C. Friendly failure style
Every failure should include:
- what failed
- why (short)
- what was already done
- next recommended action
### D. Output readability
Standardize final response blocks:
- `Outcome`
- `Changes`
- `Validation`
- `Limitations`
- `Next Step`
---
## 4) High-value features to add (priority)
### P0 (immediate)
1. `on_user_input_request` handler with default answer strategy
2. `on_pre_tool_use` for argument checks + risk gates
3. Structured progress events (phase-based)
### P1 (short-term)
4. Error taxonomy + retry policy (`network`, `provider`, `tool`, `validation`)
5. Profile-based session factory with safe fallback
6. Auto quality gate for final output sections
### P2 (mid-term)
7. Transport flexibility (`cli_url`, `use_stdio`, `port`) for deployment resilience
8. Azure provider path completion
9. Foreground session lifecycle support for advanced multi-session control
---
## 5) Suggested valves for stability/friendliness
- `AGENT_PROFILE`: `default | builder | analyst | reviewer`
- `ENABLE_USER_INPUT_REQUEST`: `bool`
- `DEFAULT_USER_INPUT_ANSWER`: `str`
- `TOOL_CALL_TIMEOUT_SECONDS`: `int`
- `MAX_RETRY_ATTEMPTS`: `int`
- `ENABLE_SAFE_TOOL_GUARD`: `bool`
- `ENABLE_PHASE_STATUS_EVENTS`: `bool`
- `ENABLE_FRIENDLY_FAILURE_TEMPLATE`: `bool`
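A minimal sketch of these valves as a config object. The real pipe would define valves as a pydantic model on the Pipe class; a plain dataclass stands in here to keep the example dependency-free, and all defaults are illustrative:

```python
from dataclasses import dataclass

@dataclass
class StabilityValves:
    # default | builder | analyst | reviewer
    AGENT_PROFILE: str = "default"
    ENABLE_USER_INPUT_REQUEST: bool = False
    DEFAULT_USER_INPUT_ANSWER: str = "proceed with the safest option"
    TOOL_CALL_TIMEOUT_SECONDS: int = 120
    MAX_RETRY_ATTEMPTS: int = 2
    ENABLE_SAFE_TOOL_GUARD: bool = True
    ENABLE_PHASE_STATUS_EVENTS: bool = True
    ENABLE_FRIENDLY_FAILURE_TEMPLATE: bool = True
```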
---
## 6) Failure playbooks (practical)
### Playbook A — Provider timeout
- Retry once if request is idempotent.
- Downgrade reasoning effort if timeout persists.
- Return concise fallback and preserve partial result.
### Playbook B — Tool argument mismatch
- Block execution in `on_pre_tool_use`.
- Ask user one clarification question if recoverable.
- Otherwise skip tool and explain impact.
### Playbook C — Large output overflow
- Save large output to workspace file.
- Return file path + short summary.
- Avoid flooding chat with huge payload.
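Playbook C can be sketched as a spill-to-file helper (hypothetical: the threshold, file name, and preview length are illustrative choices, not pipe behavior):

```python
from pathlib import Path

def deliver_output(text: str, workspace: Path, max_chars: int = 8_000) -> str:
    """Return chat-safe content: inline if small, otherwise spill to a file."""
    if len(text) <= max_chars:
        return text
    out_file = workspace / "large_output.txt"
    out_file.write_text(text, encoding="utf-8")
    # Keep the chat readable: short preview plus a pointer to the full payload.
    preview = text[:500]
    return (f"Output was large ({len(text)} chars); saved to {out_file}.\n"
            f"Preview:\n{preview}")
```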
### Playbook D — Conflicting user instructions
- Surface conflict explicitly.
- Offer 2-3 fixed choices.
- Continue only after user selection.
---
## 7) Metrics to track
- Session success rate
- Tool-call success rate
- Average recovery rate after first failure
- Clarification rate vs hallucination rate
- Mean time to first useful output
- User follow-up dissatisfaction signals (e.g., “not what I asked”)
---
## 8) Minimal rollout plan
1. Add `on_user_input_request` + `on_pre_tool_use` (feature-gated).
2. Add phase status events and friendly failure template.
3. Add retry policy + error taxonomy.
4. Add profile fallback and deployment transport options.
5. Observe metrics for 1-2 weeks, then tighten defaults.
---
## 9) Quick acceptance checklist
- Agent asks clarification only when necessary.
- No long silent period without status updates.
- Failures always include next actionable step.
- Unknown profile/provider config does not crash session.
- Large outputs are safely redirected to file.
- Final response follows a stable structure.

View File

@@ -0,0 +1,192 @@
# 🧭 Agents 稳定性与友好性指南
本文聚焦如何提升 `github_copilot_sdk.py` 中 Agent 的**稳定性**与**交互友好性**。
---
## 1目标
- 降低可避免失败(超时、工具死路、输出不可解析)。
- 在高压场景保持可预期(大上下文、上游不稳定、部分工具失败)。
- 提升交互体验(进度可见、风险操作先澄清、优雅降级)。
- 在不破坏兼容性的前提下逐步增强默认行为。
---
## 2稳定性模型4 层)
### A 层:输入安全
- 会话创建前验证关键上下文(user/chat/model/tool 可用性)。
- 对 JSON/配置采用严格解析,不信任自由文本。
- 对不支持的模式组合做前置拦截(例如:任务需要工具但工具被禁用)。
**落地建议**
- `create_session` 前增加 preflight validator。
- 快速失败并返回结构化恢复建议。
### B 层:会话安全
- 使用 profile 驱动默认值(`model`、`reasoning_effort`、`infinite_sessions`)。
- 请求未知 profile 时自动回退到安全默认 profile。
- 每个 chat 使用确定性 workspace 路径隔离。
**落地建议**
- 增加 `AGENT_PROFILE`,未知值回退 `default`。
- 长任务默认开启 `infinite_sessions`。
### C 层:工具调用安全
- 增加 `on_pre_tool_use` 做参数校验与净化。
- 增加高风险操作 allow/deny 规则。
- 按工具类别配置超时预算(文件/网络/命令)。
**落地建议**
- 保留现有 `on_post_tool_use`。
- 先补 `on_pre_tool_use`,再补 `on_error_occurred`。
### D 层:恢复安全
- 仅对幂等操作重试,且有次数上限。
- 区分可恢复/不可恢复错误。
- 提供确定性降级输出(摘要 + 限制说明)。
**落地建议**
- 按错误类型配置重试策略。
- 统一输出“成功了什么 / 失败了什么 / 下一步”。
---
## 3友好性模型UX 合约)
### A. 歧义先澄清
通过 `on_user_input_request` 处理:
- 约束缺失(范围、目标路径、输出格式)
- 高风险动作(删除/迁移/覆盖)
- 用户指令互相冲突
**规则**:一次提问给出有限选项,避免反复追问。
### B. 进度可见
按阶段发状态:
1. 上下文检查
2. 规划/分析
3. 工具执行
4. 验证
5. 结果输出
**规则**:超过 8 秒不能无状态输出。
### C. 失败友好
每次失败都要包含:
- 失败点
- 简短原因
- 已完成部分
- 下一步可执行建议
### D. 输出可读
统一最终输出结构:
- `Outcome`
- `Changes`
- `Validation`
- `Limitations`
- `Next Step`
---
## 4高价值增强项(优先级)
### P0(立即)
1. `on_user_input_request` + 默认答复策略
2. `on_pre_tool_use` 参数检查 + 风险闸门
3. 阶段化状态事件
### P1(短期)
4. 错误分类 + 重试策略(`network/provider/tool/validation`)
5. profile 化 session 工厂 + 安全回退
6. 最终输出质量门(结构校验)
### P2(中期)
7. 传输配置能力(`cli_url/use_stdio/port`)
8. Azure provider 支持完善
9. foreground session 生命周期能力(高级多会话)
---
## 5建议新增 valves
- `AGENT_PROFILE`: `default | builder | analyst | reviewer`
- `ENABLE_USER_INPUT_REQUEST`: `bool`
- `DEFAULT_USER_INPUT_ANSWER`: `str`
- `TOOL_CALL_TIMEOUT_SECONDS`: `int`
- `MAX_RETRY_ATTEMPTS`: `int`
- `ENABLE_SAFE_TOOL_GUARD`: `bool`
- `ENABLE_PHASE_STATUS_EVENTS`: `bool`
- `ENABLE_FRIENDLY_FAILURE_TEMPLATE`: `bool`
---
## 6故障应对手册(实用)
### 场景 AProvider 超时
- 若请求幂等,重试一次。
- 仍超时则降低 reasoning 强度。
- 返回简洁降级结果并保留已有中间成果。
### 场景 B工具参数不匹配
- 在 `on_pre_tool_use` 阻断。
- 可恢复则提一个澄清问题。
- 不可恢复则跳过工具并说明影响。
### 场景 C输出过大
- 大输出落盘到 workspace 文件。
- 返回文件路径 + 简要摘要。
- 避免把超大内容直接刷屏。
### 场景 D用户指令冲突
- 明确指出冲突点。
- 给 2-3 个固定选项。
- 用户选定后再继续。
---
## 7建议监控指标
- 会话成功率
- 工具调用成功率
- 首次失败后的恢复率
- 澄清率 vs 幻觉率
- 首次可用输出耗时
- 用户不满意信号(如“不是我要的”)
---
## 8最小落地路径
1. 先加 `on_user_input_request` + `on_pre_tool_use`(功能开关控制)。
2. 增加阶段状态事件和失败友好模板。
3. 增加错误分类与重试策略。
4. 增加 profile 安全回退与传输配置能力。
5. 观察 1-2 周指标,再逐步收紧默认策略。
---
## 9验收速查
- 仅在必要时澄清,不重复追问。
- 无长时间无状态“沉默”。
- 失败输出包含下一步动作。
- profile/provider 配置异常不导致会话崩溃。
- 超大输出可安全转文件。
- 最终响应结构稳定一致。

View File

@@ -0,0 +1,294 @@
# 🤖 Custom Agents Reference (Copilot SDK Python)
This document explains how to create **custom agent profiles** using the SDK at:
- `/Users/fujie/app/python/oui/copilot-sdk/python`
and apply them in this pipe:
- `plugins/pipes/github-copilot-sdk/github_copilot_sdk.py`
---
## 1) What is a “Custom Agent” here?
In Copilot SDK Python, a custom agent is not a separate runtime class from the SDK itself.
It is typically a **session configuration bundle**:
- model + reasoning level
- system message/persona
- tools exposure
- hooks lifecycle behavior
- user input strategy
- infinite session compaction strategy
- provider (optional BYOK)
So the practical implementation is:
1. Define an `AgentProfile` data structure.
2. Convert profile -> `session_config`.
3. Call `client.create_session(session_config)`.
---
## 2) SDK capabilities you can use
From `copilot-sdk/python/README.md`, the key knobs are:
- `model`
- `reasoning_effort`
- `tools`
- `system_message`
- `streaming`
- `provider`
- `infinite_sessions`
- `on_user_input_request`
- `hooks`
These are enough to create different agent personas without forking core logic.
---
## 3) Recommended architecture in pipe
Use a **profile registry** + a single factory method.
```python
from dataclasses import dataclass
from typing import Any, Callable, Optional
@dataclass
class AgentProfile:
name: str
model: str
reasoning_effort: str = "medium"
system_message: Optional[str] = None
enable_tools: bool = True
enable_openwebui_tools: bool = True
enable_hooks: bool = False
enable_user_input: bool = False
infinite_sessions_enabled: bool = True
compaction_threshold: float = 0.8
buffer_exhaustion_threshold: float = 0.95
```
Then map profile -> session config:
```python
def build_session_config(profile: AgentProfile, tools: list, hooks: dict, user_input_handler: Optional[Callable[..., Any]]):
config = {
"model": profile.model,
"reasoning_effort": profile.reasoning_effort,
"streaming": True,
"infinite_sessions": {
"enabled": profile.infinite_sessions_enabled,
"background_compaction_threshold": profile.compaction_threshold,
"buffer_exhaustion_threshold": profile.buffer_exhaustion_threshold,
},
}
if profile.system_message:
config["system_message"] = {"content": profile.system_message}
if profile.enable_tools:
config["tools"] = tools
if profile.enable_hooks and hooks:
config["hooks"] = hooks
if profile.enable_user_input and user_input_handler:
config["on_user_input_request"] = user_input_handler
return config
```
---
## 4) Example profile presets
```python
AGENT_PROFILES = {
"builder": AgentProfile(
name="builder",
model="claude-sonnet-4.6",
reasoning_effort="high",
system_message="You are a precise coding agent. Prefer minimal, verifiable changes.",
enable_tools=True,
enable_hooks=True,
),
"analyst": AgentProfile(
name="analyst",
model="gpt-5-mini",
reasoning_effort="medium",
system_message="You analyze and summarize with clear evidence mapping.",
enable_tools=False,
enable_hooks=False,
),
"reviewer": AgentProfile(
name="reviewer",
model="claude-sonnet-4.6",
reasoning_effort="high",
system_message="Review diffs, identify risks, and propose minimal fixes.",
enable_tools=True,
enable_hooks=True,
),
}
```
---
## 5) Integrating with this pipe
In `github_copilot_sdk.py`:
1. Add a Valve like `AGENT_PROFILE` (default: `builder`).
2. Resolve profile from registry at runtime.
3. Build `session_config` from profile.
4. Merge existing valve toggles (`ENABLE_TOOLS`, `ENABLE_OPENWEBUI_TOOLS`) as final override.
Priority recommendation:
- explicit runtime override > valve toggle > profile default
This keeps backward compatibility while enabling profile-based behavior.
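The override priority can be reduced to a tiny resolver. A sentinel distinguishes "not provided" from legitimate falsy values like `False` (hypothetical helper, not code from the pipe):

```python
_UNSET = object()  # sentinel: "no value was explicitly provided"

def resolve_setting(runtime_override=_UNSET, valve=_UNSET, profile_default=None):
    """Return the highest-priority value that was explicitly provided.

    Priority: explicit runtime override > valve toggle > profile default.
    """
    if runtime_override is not _UNSET:
        return runtime_override
    if valve is not _UNSET:
        return valve
    return profile_default
```

For example, `resolve_setting(valve=False, profile_default=True)` returns `False`: the valve explicitly disables the feature even though the profile enables it.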
---
## 6) Hook strategy (safe defaults)
Use hooks only when needed:
- `on_pre_tool_use`: allow/deny tools, sanitize args
- `on_post_tool_use`: add short execution context
- `on_user_prompt_submitted`: normalize unsafe prompt patterns
- `on_error_occurred`: retry/skip/abort policy
Start with no-op hooks, then incrementally enforce policy.
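A no-op starting point might look like this. Hook names follow the list above; the payload shape each hook receives and the expected return value are assumptions to verify against the SDK before use:

```python
DENIED_TOOLS = {"delete_repo"}  # illustrative denylist

def on_pre_tool_use(event: dict) -> dict:
    """Deny listed tools; pass everything else through unchanged."""
    if event.get("tool_name") in DENIED_TOOLS:
        return {"allow": False, "reason": "tool is on the denylist"}
    return {"allow": True}

def on_post_tool_use(event: dict) -> None:
    # Start as a no-op; add logging/telemetry incrementally.
    pass

hooks = {
    "on_pre_tool_use": on_pre_tool_use,
    "on_post_tool_use": on_post_tool_use,
}
```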
---
## 7) Validation checklist
- Profile can be selected by valve and takes effect.
- Session created with expected model/reasoning.
- Tool availability matches profile + valve overrides.
- Hook handlers run only when enabled.
- Infinite-session compaction settings are applied.
- Fallback to default profile if unknown profile name is provided.
---
## 8) Anti-patterns to avoid
- Hardcoding profile behavior in multiple places.
- Mixing tool registration logic with prompt-format logic.
- Enabling expensive hooks for all profiles by default.
- Coupling profile name to exact model id with no fallback.
---
## 9) Minimal rollout plan
1. Add profile dataclass + registry.
2. Add one valve: `AGENT_PROFILE`.
3. Build session config factory.
4. Keep existing behavior as default profile.
5. Add 2 more profiles (`analyst`, `reviewer`) and test.
---
## 10) SDK gap analysis for current pipe (high-value missing features)
Current pipe already implements many advanced capabilities:
- `SessionConfig` with `tools`, `system_message`, `infinite_sessions`, `provider`, `mcp_servers`
- Session resume/create path
- `list_models()` cache path
- Attachments in `session.send(...)`
- Hook integration (currently `on_post_tool_use`)
Still missing (or partially implemented) high-value SDK features:
### A. `on_user_input_request` handler (ask-user loop)
**Why valuable**
- Enables safe clarification for ambiguous tasks instead of hallucinated assumptions.
**Current state**
- Not wired into `create_session(...)`.
**Implementation idea**
- Add valves:
- `ENABLE_USER_INPUT_REQUEST: bool`
- `DEFAULT_USER_INPUT_ANSWER: str`
- Add a handler function and pass:
- `session_params["on_user_input_request"] = handler`
### B. Full lifecycle hooks (beyond `on_post_tool_use`)
**Why valuable**
- Better policy control and observability.
**Current state**
- Only `on_post_tool_use` implemented.
**Implementation idea**
- Add optional handlers for:
- `on_pre_tool_use`
- `on_user_prompt_submitted`
- `on_session_start`
- `on_session_end`
- `on_error_occurred`
### C. Provider type coverage gap (`azure`)
**Why valuable**
- Azure OpenAI users cannot configure provider type natively.
**Current state**
- Valve type only allows `openai | anthropic`.
**Implementation idea**
- Extend valve enum to include `azure`.
- Add `BYOK_AZURE_API_VERSION` valve.
- Build `provider` payload with `azure` block when selected.
### D. Client transport options exposure (`cli_url`, `use_stdio`, `port`)
**Why valuable**
- Enables remote/shared Copilot server and tuning transport mode.
**Current state**
- `_build_client_config` sets `cli_path/cwd/config_dir/log_level/env`, but not transport options.
**Implementation idea**
- Add valves:
- `COPILOT_CLI_URL`
- `COPILOT_USE_STDIO`
- `COPILOT_PORT`
- Conditionally inject into `client_config`.
### E. Foreground session lifecycle APIs
**Why valuable**
- Better multi-session UX and control in TUI/server mode.
**Current state**
- No explicit usage of:
- `get_foreground_session_id()`
- `set_foreground_session_id()`
- `client.on("session.foreground", ...)`
**Implementation idea**
- Optional debug/admin feature only.
- Add event bridge for lifecycle notifications.
---
## 11) Recommended implementation priority
1. `on_user_input_request` (highest value / low risk)
2. Full lifecycle hooks (high value / medium risk)
3. Azure provider support (high value for enterprise users)
4. Client transport valves (`cli_url/use_stdio/port`)
5. Foreground session APIs (optional advanced ops)

View File

@@ -0,0 +1,292 @@
# 🤖 自定义 Agents 参考文档Copilot SDK Python
本文说明如何基于以下 SDK 创建**可复用的自定义 Agent 配置**
- `/Users/fujie/app/python/oui/copilot-sdk/python`
并接入当前 Pipe
- `plugins/pipes/github-copilot-sdk/github_copilot_sdk.py`
---
## 1这里的“自定义 Agent”是什么
在 Copilot SDK Python 中,自定义 Agent 通常不是 SDK 里的独立类,而是一个**会话配置组合**
- 模型与推理强度
- system message / 人设
- tools 暴露范围
- hooks 生命周期行为
- 用户输入策略
- infinite session 压缩策略
- provider可选
实际落地方式:
1. 定义 `AgentProfile` 数据结构。
2. 将 profile 转成 `session_config`
3. 调用 `client.create_session(session_config)`
---
## 2SDK 可用于定制 Agent 的能力
根据 `copilot-sdk/python/README.md`,关键可配置项包括:
- `model`
- `reasoning_effort`
- `tools`
- `system_message`
- `streaming`
- `provider`
- `infinite_sessions`
- `on_user_input_request`
- `hooks`
这些能力足够做出多个 agent 人设,而无需复制整套管线代码。
---
## 3在 Pipe 中推荐的架构
建议采用:**Profile 注册表 + 单一工厂函数**。
```python
from dataclasses import dataclass
from typing import Any, Callable, Optional
@dataclass
class AgentProfile:
name: str
model: str
reasoning_effort: str = "medium"
system_message: Optional[str] = None
enable_tools: bool = True
enable_openwebui_tools: bool = True
enable_hooks: bool = False
enable_user_input: bool = False
infinite_sessions_enabled: bool = True
compaction_threshold: float = 0.8
buffer_exhaustion_threshold: float = 0.95
```
profile -> session_config 的工厂函数:
```python
def build_session_config(profile: AgentProfile, tools: list, hooks: dict, user_input_handler: Optional[Callable[..., Any]]):
config = {
"model": profile.model,
"reasoning_effort": profile.reasoning_effort,
"streaming": True,
"infinite_sessions": {
"enabled": profile.infinite_sessions_enabled,
"background_compaction_threshold": profile.compaction_threshold,
"buffer_exhaustion_threshold": profile.buffer_exhaustion_threshold,
},
}
if profile.system_message:
config["system_message"] = {"content": profile.system_message}
if profile.enable_tools:
config["tools"] = tools
if profile.enable_hooks and hooks:
config["hooks"] = hooks
if profile.enable_user_input and user_input_handler:
config["on_user_input_request"] = user_input_handler
return config
```
---
## 4示例 Profile 预设
```python
AGENT_PROFILES = {
"builder": AgentProfile(
name="builder",
model="claude-sonnet-4.6",
reasoning_effort="high",
system_message="You are a precise coding agent. Prefer minimal, verifiable changes.",
enable_tools=True,
enable_hooks=True,
),
"analyst": AgentProfile(
name="analyst",
model="gpt-5-mini",
reasoning_effort="medium",
system_message="You analyze and summarize with clear evidence mapping.",
enable_tools=False,
enable_hooks=False,
),
"reviewer": AgentProfile(
name="reviewer",
model="claude-sonnet-4.6",
reasoning_effort="high",
system_message="Review diffs, identify risks, and propose minimal fixes.",
enable_tools=True,
enable_hooks=True,
),
}
```
---
## 5如何接入当前 Pipe
在 `github_copilot_sdk.py` 中:
1. 新增 Valve`AGENT_PROFILE`(默认 `builder`)。
2. 运行时从注册表解析 profile。
3. 通过工厂函数生成 `session_config`
4. 把已有开关(如 `ENABLE_TOOLS``ENABLE_OPENWEBUI_TOOLS`)作为最终覆盖层。
推荐优先级:
- 显式运行时参数 > valve 开关 > profile 默认值
这样能保持向后兼容,同时支持按 profile 切换 agent 行为。
---
## 6Hooks 策略(安全默认)
仅在必要时开启 hooks
- `on_pre_tool_use`:工具调用前 allow/deny、参数净化
- `on_post_tool_use`:补充简要上下文
- `on_user_prompt_submitted`:提示词规范化
- `on_error_occurred`:错误重试/跳过/中止策略
建议先用 no-op再逐步加策略。
---
## 7验证清单
- 可通过 valve 选择 profile且生效。
- session 使用了预期 model / reasoning。
- 工具可用性符合 profile + valve 覆盖后的结果。
- hooks 仅在启用时触发。
- infinite session 的阈值配置已生效。
- 传入未知 profile 时能安全回退到默认 profile。
---
## 8常见反模式
- 把 profile 逻辑硬编码在多个位置。
- 将工具注册逻辑与提示词格式化耦合。
- 默认给所有 profile 开启高开销 hooks。
- profile 名与模型 ID 强绑定且没有回退方案。
---
## 9最小落地步骤
1. 增加 profile dataclass + registry。
2. 增加一个 valve`AGENT_PROFILE`。
3. 增加 session_config 工厂函数。
4. 将现有行为作为 default profile。
5. 再加 `analyst`、`reviewer` 两个 profile 并验证。
---
## 10当前 Pipe 的 SDK 能力差距(高价值项)
当前 pipe 已实现不少高级能力:
- `SessionConfig` 里的 `tools`、`system_message`、`infinite_sessions`、`provider`、`mcp_servers`
- session 的 resume/create 路径
- `list_models()` 模型缓存路径
- `session.send(...)` 附件传递
- hooks 接入(目前仅 `on_post_tool_use`)
但仍有高价值能力未实现或仅部分实现:
### A. `on_user_input_request`(ask-user 交互回路)
**价值**
- 任务不明确时可主动追问,降低错误假设和幻觉。
**现状**
- 尚未接入 `create_session(...)`
**实现建议**
- 增加 valves
- `ENABLE_USER_INPUT_REQUEST: bool`
- `DEFAULT_USER_INPUT_ANSWER: str`
- 在 `session_params` 中注入:
- `session_params["on_user_input_request"] = handler`
### B. 完整生命周期 hooks(不仅 `on_post_tool_use`)
**价值**
- 增强策略控制与可观测性。
**现状**
- 目前只实现了 `on_post_tool_use`
**实现建议**
- 增加可选 handler
- `on_pre_tool_use`
- `on_user_prompt_submitted`
- `on_session_start`
- `on_session_end`
- `on_error_occurred`
### C. Provider 类型覆盖缺口(`azure`)
**价值**
- 企业 Azure OpenAI 场景可直接接入。
**现状**
- valve 仅支持 `openai | anthropic`
**实现建议**
- 扩展枚举支持 `azure`
- 增加 `BYOK_AZURE_API_VERSION`
- 选择 azure 时构造 provider 的 `azure` 配置块。
### D. Client 传输配置未暴露(`cli_url` / `use_stdio` / `port`)
**价值**
- 支持远程/共享 Copilot 服务,便于部署与调优。
**现状**
- `_build_client_config` 仅设置 `cli_path/cwd/config_dir/log_level/env`
**实现建议**
- 增加 valves
- `COPILOT_CLI_URL`
- `COPILOT_USE_STDIO`
- `COPILOT_PORT`
- 在 `client_config` 中按需注入。
### E. 前台会话生命周期 API 未使用
**价值**
- 多会话/运维场景下可增强可控性与可视化。
**现状**
- 尚未显式使用:
- `get_foreground_session_id()`
- `set_foreground_session_id()`
- `client.on("session.foreground", ...)`
**实现建议**
- 作为 debug/admin 高级功能逐步接入。
---
## 11建议实现优先级
1. `on_user_input_request`(收益高、风险低)
2. 完整 lifecycle hooks(收益高、风险中)
3. Azure provider 支持(企业价值高)
4. client 传输配置 valves(`cli_url/use_stdio/port`)
5. 前台会话生命周期 API(高级可选)

View File

@@ -1,6 +1,6 @@
# GitHub Copilot SDK Pipe for OpenWebUI
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.9.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.9.1 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that integrates the official [GitHub Copilot SDK](https://github.com/github/copilot-sdk). It enables you to use **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) **AND** your own models via **BYOK** (OpenAI, Anthropic) directly within OpenWebUI, providing a unified agentic experience with **strict User & Chat-level Workspace Isolation**.
@@ -14,21 +14,17 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
---
## ✨ v0.9.0: The Skills Revolution & Stability Update
## ✨ v0.9.1: MCP Tool Filtering & Web Search Reliability Fix
- **🧩 Copilot SDK Skills Support**: Native support for Copilot SDK skill directories (`SKILL.md` + resources). Skills can now be loaded as first-class runtime context.
- **🔄 OpenWebUI Skills Bridge**: Full bidirectional sync between OpenWebUI **Workspace > Skills** and SDK skill directories.
- **🛠️ Deterministic `manage_skills` Tool**: Expert tool for stable install/create/list/edit/delete skill operations.
- **🌊 Reinforced Status Bar**: Multi-layered locking mechanism (`session_finalized` guard) and atomic async delivery to prevent "stuck" indicators.
- **⚡ Asynchronous Integrity**: Refactored status emission to route all updates through a centralized helper, ensuring atomic delivery and preventing race conditions in parallel execution streams.
- **💓 Pulse-Lock Refresh**: Implemented a hardware-inspired "pulse" logic that forces a final UI state refresh at the end of each session, ensuring the status bar settles on "Task completed."
- **🗂️ Persistent Config Directory**: Added `COPILOTSDK_CONFIG_DIR` for stable session-state persistence across container restarts.
- **🐛 Fixed MCP tool filtering logic**: Resolved a critical issue where configuring `function_name_filter_list` (or selecting specific tools in the UI) would cause all tools from that MCP server to be incorrectly hidden due to ID prefix mismatches (`server:mcp:`).
- **🌐 Autonomous Web Search**: `web_search` is now always enabled for the agent (bypassing the UI toggle), leveraging the Copilot SDK's native ability to decide when to search.
- **🔍 Improved filter stability**: Ensured tool-level whitelists apply reliably without breaking the entire server connection.
---
## ✨ Key Capabilities
- **🔑 Unified Intelligence (Official + BYOK)**: Seamlessly switch between official GitHub Copilot models (o1, GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash) and your own models (OpenAI, Anthropic) via **Bring Your Own Key** mode.
- **🔑 Unified Intelligence (Official + BYOK)**: Seamlessly switch between official GitHub Copilot models and your own models (OpenAI, Anthropic, DeepSeek, xAI) via **Bring Your Own Key** mode.
- **🛡️ Physical Workspace Isolation**: Every session runs in its own isolated directory sandbox. This ensures absolute data privacy and prevents cross-chat file contamination while allowing the Agent full filesystem access.
- **🔌 Universal Tool Protocol**:
- **Native MCP**: Direct, high-performance connection to Model Context Protocol servers.

View File

@@ -1,6 +1,6 @@
# GitHub Copilot SDK Official Pipe
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 0.9.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 0.9.1 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that deeply integrates the **GitHub Copilot SDK**. It supports the official GitHub Copilot models (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) as well as **BYOK (Bring Your Own Key)** mode for custom providers (OpenAI, Anthropic), and enforces **strict user- and chat-level workspace isolation** for a unified, secure agent experience.
@@ -13,19 +13,17 @@
---
## ✨ v0.9.0 Core Update: The Skills Revolution & Stability Hardening
## ✨ v0.9.1 Latest Update: MCP Tool Filtering & Web Search Reliability Fix
- **🧩 Native Copilot SDK Skills Support**: Skills can be loaded and used as first-class context capabilities.
- **🔄 OpenWebUI Skills Bridge**: Deep bidirectional sync between OpenWebUI **Workspace > Skills** and SDK skill directories.
- **🛠️ Deterministic `manage_skills` Tool**: Full skill lifecycle management through a stable tool contract.
- **🌊 Hardened Status Bar Logic**: A multi-layered `session_finalized` locking mechanism eliminates status-bar bounce-back and stuck indicators after task completion.
- **🗂️ Persistent Environment Directory**: Strengthened `COPILOTSDK_CONFIG_DIR` handling so session state survives container restarts.
- **🐛 Fixed MCP tool filtering logic**: Resolved an issue where configuring `function_name_filter_list` in the admin backend (or selecting specific tools in the chat UI) unexpectedly disabled all tools on the selected server due to faulty ID prefix (`server:mcp:`) recognition.
- **🌐 Autonomous Web Search**: The `web_search` tool is now force-enabled for the Agent (bypassing the UI web-search toggle), fully leveraging Copilot's own judgment about when to search.
- **🔍 Improved filter stability**: With the ID normalization fixed, tool whitelists selected in the UI or configured in the backend now apply reliably and no longer exclude the entire server.
---
## ✨ Key Capabilities
- **🔑 Unified Intelligence (Official + BYOK)**: Freely switch between official models (o1, GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash) and custom providers (OpenAI, Anthropic) via **BYOK (Bring Your Own Key)** mode.
- **🔑 Unified Intelligence (Official + BYOK)**: Freely switch between official models and custom providers (OpenAI, Anthropic, DeepSeek, xAI) via **BYOK (Bring Your Own Key)** mode.
- **🛡️ Physical Workspace Isolation**: Every session runs in its own sandbox directory, ensuring absolute data privacy and preventing cross-chat file contamination while granting the Agent full filesystem access.
- **🔌 Universal Tool Protocol**:
- **Native MCP**: Direct, high-performance connection to Model Context Protocol servers.

View File

@@ -5,7 +5,7 @@ author_url: https://github.com/Fu-Jie/openwebui-extensions
funding_url: https://github.com/open-webui
openwebui_id: ce96f7b4-12fc-4ac3-9a01-875713e69359
description: Integrate GitHub Copilot SDK. Supports dynamic models, multi-turn conversation, streaming, multimodal input, infinite sessions, bidirectional OpenWebUI Skills bridge, and manage_skills tool.
version: 0.9.0
version: 0.9.1
requirements: github-copilot-sdk==0.1.25
"""
@@ -923,9 +923,9 @@ class Pipe:
return final_tools
# 4. Extract chat-level tool selection (P4: user selection from Chat UI)
chat_tool_ids = None
if __metadata__ and isinstance(__metadata__, dict):
chat_tool_ids = __metadata__.get("tool_ids") or None
chat_tool_ids = self._normalize_chat_tool_ids(
__metadata__.get("tool_ids") if isinstance(__metadata__, dict) else None
)
# 5. Load OpenWebUI tools dynamically (always fresh, no cache)
openwebui_tools = await self._load_openwebui_tools(
@@ -2190,11 +2190,12 @@ class Pipe:
return []
# P4: Chat tool_ids whitelist — only active when user explicitly selected tools
if chat_tool_ids:
chat_tool_ids_set = set(chat_tool_ids)
selected_custom_tool_ids = self._extract_selected_custom_tool_ids(chat_tool_ids)
if selected_custom_tool_ids:
chat_tool_ids_set = set(selected_custom_tool_ids)
filtered = [tid for tid in tool_ids if tid in chat_tool_ids_set]
await self._emit_debug_log(
f"[Tools] tool_ids whitelist active: {len(tool_ids)} → {len(filtered)} (selected: {chat_tool_ids})",
f"[Tools] custom tool_ids whitelist active: {len(tool_ids)} → {len(filtered)} (selected: {selected_custom_tool_ids})",
__event_call__,
)
tool_ids = filtered
@@ -2284,6 +2285,30 @@ class Pipe:
except Exception:
pass
# Force web_search enabled when OpenWebUI tools are enabled,
# regardless of request feature flags, model meta defaults, or UI toggles.
model_info = (
model_dict.get("info") if isinstance(model_dict, dict) else None
)
if isinstance(model_info, dict):
model_meta = model_info.get("meta")
if not isinstance(model_meta, dict):
model_meta = {}
model_info["meta"] = model_meta
builtin_meta = model_meta.get("builtinTools")
if not isinstance(builtin_meta, dict):
builtin_meta = {}
builtin_meta["web_search"] = True
model_meta["builtinTools"] = builtin_meta
# Force feature selection to True for web_search to bypass UI session toggles
if isinstance(body, dict):
features = body.get("features")
if not isinstance(features, dict):
features = {}
body["features"] = features
features["web_search"] = True
# Get builtin tools
# Code interpreter is STRICT opt-in: only enabled when request
# explicitly sets feature code_interpreter=true. Missing means disabled.
@@ -2380,6 +2405,13 @@ class Pipe:
converted_tools = []
for tool_name, t_dict in tools_dict.items():
if isinstance(tool_name, str) and tool_name.startswith("_"):
if self.valves.DEBUG:
await self._emit_debug_log(
f"[Tools] Skip private tool: {tool_name}",
__event_call__,
)
continue
try:
copilot_tool = self._convert_openwebui_tool_to_sdk(
tool_name,
@@ -2410,6 +2442,7 @@ class Pipe:
return None
mcp_servers = {}
selected_custom_tool_ids = self._extract_selected_custom_tool_ids(chat_tool_ids)
# Read MCP servers directly from DB to avoid stale in-memory cache
connections = self._read_tool_server_connections()
@@ -2440,8 +2473,15 @@ class Pipe:
)
continue
# P4: chat_tool_ids whitelist — if user selected tools, only include matching servers
if chat_tool_ids and f"server:{raw_id}" not in chat_tool_ids:
# P4: chat tool whitelist for MCP servers
# OpenWebUI MCP tool IDs use "server:mcp:{id}" (not just "server:{id}").
# Only enforce MCP server filtering when MCP server IDs are explicitly selected.
selected_mcp_server_ids = {
tid[len("server:mcp:") :]
for tid in selected_custom_tool_ids
if isinstance(tid, str) and tid.startswith("server:mcp:")
}
if selected_mcp_server_ids and raw_id not in selected_mcp_server_ids:
continue
# Sanitize server_id (using same logic as tools)
@@ -2478,13 +2518,18 @@ class Pipe:
function_filter = mcp_config.get("function_name_filter_list", "")
allowed_tools = ["*"]
if function_filter:
if isinstance(function_filter, str):
allowed_tools = [
f.strip() for f in function_filter.split(",") if f.strip()
]
elif isinstance(function_filter, list):
allowed_tools = function_filter
parsed_filter = self._parse_mcp_function_filter(function_filter)
expanded_filter = self._expand_mcp_filter_aliases(
parsed_filter,
raw_server_id=raw_id,
sanitized_server_id=server_id,
)
self._emit_debug_log_sync(
f"[MCP] function_name_filter_list raw={function_filter!r} parsed={parsed_filter} expanded={expanded_filter}",
__event_call__,
)
if expanded_filter:
allowed_tools = expanded_filter
mcp_servers[server_id] = {
"type": "http",
@@ -2630,6 +2675,142 @@ class Pipe:
items = [item.strip() for item in value.split(",")]
return self._dedupe_preserve_order([item for item in items if item])
def _normalize_chat_tool_ids(self, raw_tool_ids: Any) -> List[str]:
"""Normalize chat tool_ids payload to a clean list[str]."""
if not raw_tool_ids:
return []
normalized: List[str] = []
if isinstance(raw_tool_ids, str):
text = raw_tool_ids.strip()
if not text:
return []
if text.startswith("["):
try:
parsed = json.loads(text)
return self._normalize_chat_tool_ids(parsed)
except Exception:
pass
normalized = [p.strip() for p in re.split(r"[,\n;]+", text) if p.strip()]
return self._dedupe_preserve_order(normalized)
if isinstance(raw_tool_ids, (list, tuple, set)):
for item in raw_tool_ids:
if isinstance(item, str):
value = item.strip()
if value:
normalized.append(value)
continue
if isinstance(item, dict):
for key in ("id", "tool_id", "value", "name"):
value = item.get(key)
if isinstance(value, str) and value.strip():
normalized.append(value.strip())
break
return self._dedupe_preserve_order(normalized)
def _extract_selected_custom_tool_ids(self, chat_tool_ids: Any) -> List[str]:
"""Return selected non-builtin tool IDs only."""
normalized = self._normalize_chat_tool_ids(chat_tool_ids)
return self._dedupe_preserve_order(
[
tid
for tid in normalized
if isinstance(tid, str) and not tid.startswith("builtin:")
]
)
def _parse_mcp_function_filter(self, raw_filter: Any) -> List[str]:
"""Parse MCP function filter list from string/list/json into normalized names."""
if not raw_filter:
return []
if isinstance(raw_filter, (list, tuple, set)):
return self._dedupe_preserve_order(
[
str(item).strip().strip('"').strip("'")
for item in raw_filter
if str(item).strip().strip('"').strip("'")
]
)
if isinstance(raw_filter, str):
text = raw_filter.strip()
if not text:
return []
if text.startswith("["):
try:
parsed = json.loads(text)
return self._parse_mcp_function_filter(parsed)
except Exception:
pass
parts = re.split(r"[,\n;,、]+", text)
cleaned: List[str] = []
for part in parts:
value = part.strip().strip('"').strip("'")
if value.startswith("- "):
value = value[2:].strip()
if value:
cleaned.append(value)
return self._dedupe_preserve_order(cleaned)
return []
def _expand_mcp_filter_aliases(
self,
tool_names: List[str],
raw_server_id: str,
sanitized_server_id: str,
) -> List[str]:
"""Expand MCP filter names with common server-prefixed aliases.
Some MCP providers expose namespaced tool names such as:
- github__get_me
- github/get_me
- github.get_me
while admins often configure bare names like `get_me`.
"""
if not tool_names:
return []
prefixes = self._dedupe_preserve_order(
[
str(raw_server_id or "").strip(),
str(sanitized_server_id or "").strip(),
]
)
variants: List[str] = []
for name in tool_names:
clean_name = str(name).strip()
if not clean_name:
continue
# Keep original configured name first.
variants.append(clean_name)
# If admin already provided a namespaced value, keep it as-is only.
if any(sep in clean_name for sep in ("__", "/", ".")):
continue
for prefix in prefixes:
if not prefix:
continue
variants.extend(
[
f"{prefix}__{clean_name}",
f"{prefix}/{clean_name}",
f"{prefix}.{clean_name}",
]
)
return self._dedupe_preserve_order(variants)
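For reference, a standalone sketch of what this expansion yields; `expand_aliases` is a simplified mirror of the method above, not the pipe's actual helper:

```python
def expand_aliases(names, prefixes):
    """Add server-prefixed variants for bare tool names.

    Mirrors _expand_mcp_filter_aliases: a configured bare name like
    'get_me' also matches 'github__get_me', 'github/get_me', and
    'github.get_me'; already-namespaced names are kept as-is.
    """
    out = []
    for name in names:
        out.append(name)
        if any(sep in name for sep in ("__", "/", ".")):
            continue  # already namespaced: keep the exact configured value
        for p in prefixes:
            out.extend([f"{p}__{name}", f"{p}/{name}", f"{p}.{name}"])
    # dedupe while preserving order
    seen, deduped = set(), []
    for v in out:
        if v not in seen:
            seen.add(v)
            deduped.append(v)
    return deduped


print(expand_aliases(["get_me"], ["github"]))
# ['get_me', 'github__get_me', 'github/get_me', 'github.get_me']
```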
def _is_manage_skills_intent(self, text: str) -> bool:
"""Detect whether the user is asking to manage/install skills.
@@ -4343,9 +4524,9 @@ class Pipe:
)
# P4: Chat tool_ids whitelist — extract once, reuse for both OpenAPI and MCP
chat_tool_ids = None
if __metadata__ and isinstance(__metadata__, dict):
chat_tool_ids = __metadata__.get("tool_ids") or None
chat_tool_ids = self._normalize_chat_tool_ids(
__metadata__.get("tool_ids") if isinstance(__metadata__, dict) else None
)
user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
user_lang = user_ctx["user_language"]

View File

@@ -0,0 +1,39 @@
# iFlow Official SDK Pipe
This plugin integrates the [iFlow SDK](https://platform.iflow.cn/cli/sdk/sdk-python) into OpenWebUI as a `Pipe`.
## Features
- **Standard iFlow Integration**: Connects to the iFlow CLI process via WebSocket (ACP).
- **Auto-Process Management**: Automatically starts the iFlow process if it's not running.
- **Streaming Support**: Direct streaming from iFlow to the chat interface.
- **Status Updates**: Real-time status updates in the UI (thinking, tool usage, etc.).
- **Tool Execution Visibility**: See when iFlow is calling and completing tools.
## Configuration
Set the following `Valves`:
- `IFLOW_PORT`: The port for the iFlow CLI process (default: `8090`).
- `IFLOW_URL`: The WebSocket URL (default: `ws://localhost:8090/acp`).
- `AUTO_START`: Automatically start the process (default: `True`).
- `TIMEOUT`: Request timeout in seconds.
- `LOG_LEVEL`: SDK logging level (DEBUG, INFO, etc.).
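Internally, these valves are passed to the SDK's `IFlowOptions` constructor. A rough sketch of the wiring, where the option names follow this plugin's source and `build_iflow_kwargs` itself is purely illustrative:

```python
def build_iflow_kwargs(port=8090, url="", auto_start=True,
                       timeout=300.0, log_level="INFO"):
    """Map the plugin's valves onto iflow_sdk.IFlowOptions kwargs."""
    return {
        "url": url or f"ws://localhost:{port}/acp",  # IFLOW_URL / IFLOW_PORT
        "auto_start_process": auto_start,            # AUTO_START
        "process_start_port": port,
        "timeout": timeout,                          # TIMEOUT
        "log_level": log_level,                      # LOG_LEVEL
    }


print(build_iflow_kwargs()["url"])  # ws://localhost:8090/acp
```

Leaving `IFLOW_URL` empty derives the ACP WebSocket address from the port, matching the defaults listed above.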
## Installation
This plugin requires both the **iFlow CLI** binary and the **iflow-cli-sdk** Python package.
### 1. Install iFlow CLI (System level)
Run the following command in your terminal (Linux/macOS):
```bash
bash -c "$(curl -fsSL https://platform.iflow.cn/cli/install.sh)"
```
### 2. Install Python SDK (OpenWebUI environment)
```bash
pip install iflow-cli-sdk
```

View File

@@ -0,0 +1,37 @@
# iFlow Official SDK Pipe Plugin
This plugin integrates the [iFlow SDK](https://platform.iflow.cn/cli/sdk/sdk-python) into OpenWebUI.
## Features
- **Standard iFlow Integration**: Connects to the iFlow CLI process via WebSocket (ACP).
- **Automatic Process Management**: Starts the iFlow process automatically if it is not running.
- **Streaming Support**: Real-time streaming output from iFlow to the chat interface.
- **Live Status Updates**: Shows the assistant's status in the UI in real time (thinking, tool calls, etc.).
- **Tool Call Visibility**: Real-time feedback as iFlow invokes and completes tools.
## Configuration (Valves)
- `IFLOW_PORT`: Port for the iFlow CLI process (default: `8090`).
- `IFLOW_URL`: WebSocket URL (default: `ws://localhost:8090/acp`).
- `AUTO_START`: Whether to start the process automatically (default: `True`).
- `TIMEOUT`: Request timeout in seconds.
- `LOG_LEVEL`: SDK log level (DEBUG, INFO, etc.).
## Installation
This plugin requires both the **iFlow CLI** binary and the **iflow-cli-sdk** Python package.
### 1. Install iFlow CLI (system level)
Run the following command on your system (Linux/macOS):
```bash
bash -c "$(curl -fsSL https://gitee.com/iflow-ai/iflow-cli/raw/main/install.sh)"
```
### 2. Install the Python SDK (OpenWebUI environment)
```bash
pip install iflow-cli-sdk
```

View File

@@ -0,0 +1,544 @@
"""
title: iFlow Official SDK Pipe
author: Fu-Jie
author_url: https://github.com/Fu-Jie/openwebui-extensions
funding_url: https://github.com/open-webui
description: Integrate iFlow SDK. Supports dynamic models, multi-turn conversation, streaming, tool execution, and task planning.
version: 0.1.2
requirements: iflow-cli-sdk==0.1.11
"""
import shutil
import subprocess
import os
import json
import asyncio
import logging
from typing import Optional, Union, AsyncGenerator, List, Any, Dict, Literal
from pydantic import BaseModel, Field
# Setup logger
logger = logging.getLogger(__name__)
# Import iflow SDK modules with safety
IFlowClient = None
IFlowOptions = None
AssistantMessage = None
TaskFinishMessage = None
ToolCallMessage = None
PlanMessage = None
TaskStatusMessage = None
ApprovalMode = None
StopReason = None
try:
from iflow_sdk import (
IFlowClient,
IFlowOptions,
AssistantMessage,
TaskFinishMessage,
ToolCallMessage,
PlanMessage,
TaskStatusMessage,
ApprovalMode,
StopReason,
)
except ImportError:
logger.error(
"iflow-cli-sdk not found. Please install it with 'pip install iflow-cli-sdk'."
)
# Base guidelines for all users, adapted for iFlow
BASE_GUIDELINES = (
"\n\n[Environment & Capabilities Context]\n"
"You are an AI assistant operating within a high-capability Linux container environment (OpenWebUI) powered by **iFlow CLI**.\n"
"\n"
"**System Environment & User Privileges:**\n"
"- **Output Environment**: You are rendering in the **OpenWebUI Chat Page**. Optimize your output format to leverage Markdown for the best UI experience.\n"
"- **Root Access**: You are running as **root**. You have **READ access to the entire container file system**. You **MUST ONLY WRITE** to your designated persistent workspace directory.\n"
"- **STRICT FILE CREATION RULE**: You are **PROHIBITED** from creating or editing files outside of your specific workspace path. Never place files in `/root`, `/tmp`, or `/app`. All operations must use the absolute path provided in your session context.\n"
"- **iFlow Task Planning**: You possess **Task Planning** capabilities. When faced with complex requests, you SHOULD generate a structured plan. The iFlow SDK will visualize this plan as a task list for the user.\n"
"- **Tool Execution (ACP)**: You interact with tools via the **Agent Control Protocol (ACP)**. Depending on the `ApprovalMode`, your tool calls may be executed automatically or require user confirmation.\n"
"- **Rich Python Environment**: You can natively import and use any installed OpenWebUI dependencies.\n"
"\n"
"**Formatting & Presentation Directives:**\n"
"1. **Markdown Excellence**: Leverage headers, tables, and lists to structure your response professionally.\n"
"2. **Advanced Visualization**: Use **Mermaid** for diagrams and **LaTeX** for math. Always wrap Mermaid in standard ```mermaid blocks.\n"
"3. **Interactive Artifacts (HTML)**: **Premium Delivery Protocol**: For web applications, you MUST:\n"
" - 1. **Persist**: Create the file in the workspace (e.g., `index.html`).\n"
" - 2. **Publish**: Call `publish_file_from_workspace(filename='your_file.html')` (via provided tools if available). This triggers the premium embedded experience.\n"
" - **CRITICAL**: Never output raw HTML source code directly in the chat. Persist and publish.\n"
"4. **Media & Files**: ALWAYS embed generated media using `![caption](url)`. Never provide plain text links for images/videos.\n"
"5. **Dual-Channel Delivery**: Always aim to provide both an instant visual Insight in the chat AND a persistent downloadable file.\n"
"6. **Active & Autonomous**: Analyze the user's request -> Formulate a plan -> **EXECUTE** the plan immediately. Minimize user friction.\n"
)
# Sensitive extensions only for Administrators
ADMIN_EXTENSIONS = (
"\n**[ADMINISTRATOR PRIVILEGES - CONFIDENTIAL]**\n"
"Current user is an **ADMINISTRATOR**. Restricted access is lifted:\n"
"- **Full OS Interaction**: You can use shell tools to analyze any container process or system configuration.\n"
"- **Database Access**: You can connect to the **OpenWebUI Database** using credentials in environment variables.\n"
"- **iFlow CLI Debugging**: You can inspect iFlow configuration and logs for diagnostic purposes.\n"
"**SECURITY NOTE**: Protect sensitive internal details.\n"
)
# Strict restrictions for regular Users
USER_RESTRICTIONS = (
"\n**[USER ACCESS RESTRICTIONS - STRICT]**\n"
"Current user is a **REGULAR USER**. Adhere to boundaries:\n"
"- **NO Environment Access**: FORBIDDEN from accessing environment variables (e.g., via `env` or `os.environ`).\n"
"- **NO Database Access**: MUST NOT attempt to connect to OpenWebUI database.\n"
"- **NO Writing Outside Workspace**: All artifacts MUST be saved strictly inside the isolated workspace path provided.\n"
"- **Restricted Shell**: Use shell tools ONLY for operations within your isolated workspace. Do NOT explore system secrets.\n"
)
class Pipe:
class Valves(BaseModel):
IFLOW_PORT: int = Field(
default=8090,
description="Port for iFlow CLI process.",
)
IFLOW_URL: str = Field(
default="ws://localhost:8090/acp",
description="WebSocket URL for iFlow ACP.",
)
AUTO_START: bool = Field(
default=True,
description="Whether to automatically start the iFlow process.",
)
TIMEOUT: float = Field(
default=300.0,
description="Timeout for the message request (seconds).",
)
LOG_LEVEL: str = Field(
default="INFO",
description="Log level for iFlow SDK (DEBUG, INFO, WARNING, ERROR).",
)
CWD: str = Field(
default="",
description="CLI operation working directory. Empty for default.",
)
APPROVAL_MODE: Literal["DEFAULT", "AUTO_EDIT", "YOLO", "PLAN"] = Field(
default="YOLO",
description="Tool execution permission mode.",
)
FILE_ACCESS: bool = Field(
default=False,
description="Enable file system access (disabled by default for security).",
)
AUTO_INSTALL_CLI: bool = Field(
default=True,
description="Automatically install iFlow CLI if not found in PATH.",
)
IFLOW_BIN_DIR: str = Field(
default="/app/backend/data/bin",
description="Fixed path for iFlow CLI binary (recommended for persistence in Docker).",
)
# Auth Config
SELECTED_AUTH_TYPE: Literal["iflow", "openai-compatible"] = Field(
default="iflow",
description="Authentication type. 'iflow' for native, 'openai-compatible' for others.",
)
AUTH_API_KEY: str = Field(
default="",
description="API Key for the model provider.",
)
AUTH_BASE_URL: str = Field(
default="",
description="Base URL for the model provider.",
)
AUTH_MODEL: str = Field(
default="",
description="Model name to use.",
)
SYSTEM_PROMPT: str = Field(
default="",
description="System prompt to guide the AI's behavior.",
)
def __init__(self):
self.type = "pipe"
self.id = "iflow_sdk"
self.name = "iflow"
self.valves = self.Valves()
def _get_user_role(self, __user__: dict) -> str:
"""Determine if the user is an admin."""
return __user__.get("role", "user")
def _get_system_prompt(self, role: str) -> str:
"""Construct the dynamic system prompt based on user role."""
prompt = self.valves.SYSTEM_PROMPT if self.valves.SYSTEM_PROMPT else ""
prompt += BASE_GUIDELINES
if role == "admin":
prompt += ADMIN_EXTENSIONS
else:
prompt += USER_RESTRICTIONS
return prompt
async def _ensure_cli(self, _emit_status) -> bool:
"""Check for iFlow CLI and attempt installation if missing."""
async def _check_binary(name: str) -> Optional[str]:
# 1. Check in system PATH
path = shutil.which(name)
if path:
return path
# 2. Compile potential search paths
search_paths = []
# Try to resolve NPM global prefix
try:
proc = await asyncio.create_subprocess_exec(
"npm",
"config",
"get",
"prefix",
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
stdout, _ = await proc.communicate()
if proc.returncode == 0:
prefix = stdout.decode().strip()
search_paths.extend(
[
os.path.join(prefix, "bin"),
os.path.join(prefix, "node_modules", ".bin"),
prefix,
]
)
except Exception:
pass
if self.valves.IFLOW_BIN_DIR:
search_paths.extend(
[
self.valves.IFLOW_BIN_DIR,
os.path.join(self.valves.IFLOW_BIN_DIR, "bin"),
]
)
# Common/default locations
search_paths.extend(
[
os.path.expanduser("~/.iflow/bin"),
os.path.expanduser("~/.npm-global/bin"),
os.path.expanduser("~/.local/bin"),
"/usr/local/bin",
"/usr/bin",
"/bin",
os.path.expanduser("~/bin"),
]
)
for p in search_paths:
full_path = os.path.join(p, name)
if os.path.exists(full_path) and os.access(full_path, os.X_OK):
return full_path
return None
# Initial check
binary_path = await _check_binary("iflow")
if binary_path:
logger.info(f"iFlow CLI found at: {binary_path}")
bin_dir = os.path.dirname(binary_path)
if bin_dir not in os.environ["PATH"]:
os.environ["PATH"] = f"{bin_dir}:{os.environ['PATH']}"
return True
if not self.valves.AUTO_INSTALL_CLI:
return False
try:
install_loc_msg = (
self.valves.IFLOW_BIN_DIR
if self.valves.IFLOW_BIN_DIR
else "default location"
)
await _emit_status(
f"iFlow CLI not found. Attempting auto-installation to {install_loc_msg}..."
)
# Detection for package managers and official script
env = os.environ.copy()
has_npm = shutil.which("npm") is not None
has_curl = shutil.which("curl") is not None
if has_npm:
if self.valves.IFLOW_BIN_DIR:
os.makedirs(self.valves.IFLOW_BIN_DIR, exist_ok=True)
install_cmd = f"npm i -g --prefix {self.valves.IFLOW_BIN_DIR} @iflow-ai/iflow-cli@latest"
else:
install_cmd = "npm i -g @iflow-ai/iflow-cli@latest"
elif has_curl:
await _emit_status(
"npm not found. Attempting to use official shell installer via curl..."
)
# Official installer script from gitee/github as fallback
# We try gitee first as it's more reliable in some environments
install_cmd = 'bash -c "$(curl -fsSL https://gitee.com/iflow-ai/iflow-cli/raw/main/install.sh)"'
# If we have a custom bin dir, try to tell the installer (though it might not support it)
if self.valves.IFLOW_BIN_DIR:
env["IFLOW_BIN_DIR"] = self.valves.IFLOW_BIN_DIR
else:
await _emit_status(
"Error: Neither 'npm' nor 'curl' found. Cannot proceed with auto-installation."
)
return False
process = await asyncio.create_subprocess_shell(
install_cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
env=env,
)
stdout_data, stderr_data = await process.communicate()
# Even if the script returns non-zero (which it might if it tries to
# start an interactive shell at the end), we check if the binary exists.
await _emit_status(
"Installation script finished. Finalizing verification..."
)
binary_path = await _check_binary("iflow")
if binary_path:
try:
os.chmod(binary_path, 0o755)
except Exception:
pass
await _emit_status(f"iFlow CLI confirmed at {binary_path}.")
bin_dir = os.path.dirname(binary_path)
if bin_dir not in os.environ["PATH"]:
os.environ["PATH"] = f"{bin_dir}:{os.environ['PATH']}"
return True
else:
# Script failed and no binary
error_msg = (
stderr_data.decode().strip() or "Binary not found in search paths"
)
logger.error(
f"Installation failed with code {process.returncode}: {error_msg}"
)
await _emit_status(f"Installation failed: {error_msg}")
return False
except Exception as e:
logger.error(f"Error during installation: {str(e)}")
await _emit_status(f"Installation error: {str(e)}")
return False
async def _ensure_sdk(self, _emit_status) -> bool:
"""Check for iflow-cli-sdk Python package and attempt installation if missing."""
global IFlowClient, IFlowOptions, AssistantMessage, TaskFinishMessage, ToolCallMessage, PlanMessage, TaskStatusMessage, ApprovalMode, StopReason
if IFlowClient is not None:
return True
await _emit_status("iflow-cli-sdk not found. Attempting auto-installation...")
try:
# Use sys.executable to ensure we use the same Python environment
import sys
process = await asyncio.create_subprocess_exec(
sys.executable,
"-m",
"pip",
"install",
"iflow-cli-sdk",
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
stdout, stderr = await process.communicate()
if process.returncode == 0:
await _emit_status("iflow-cli-sdk installed successfully. Loading...")
# Try to import again
from iflow_sdk import (
IFlowClient as C,
IFlowOptions as O,
AssistantMessage as AM,
TaskFinishMessage as TM,
ToolCallMessage as TC,
PlanMessage as P,
TaskStatusMessage as TS,
ApprovalMode as AP,
StopReason as SR,
)
# Update global pointers
IFlowClient, IFlowOptions = C, O
AssistantMessage, TaskFinishMessage = AM, TM
ToolCallMessage, PlanMessage = TC, P
TaskStatusMessage, ApprovalMode, StopReason = TS, AP, SR
return True
else:
error_msg = stderr.decode().strip()
logger.error(f"SDK installation failed: {error_msg}")
await _emit_status(f"SDK installation failed: {error_msg}")
return False
except Exception as e:
logger.error(f"Error during SDK installation: {str(e)}")
await _emit_status(f"SDK installation error: {str(e)}")
return False
async def pipe(
self, body: dict, __user__: dict, __event_emitter__=None
) -> Union[str, AsyncGenerator[str, None]]:
"""Main entry point for the pipe."""
async def _emit_status(description: str, done: bool = False):
if __event_emitter__:
await __event_emitter__(
{
"type": "status",
"data": {
"description": description,
"done": done,
},
}
)
# 0. Ensure SDK and CLI are available
if not await self._ensure_sdk(_emit_status):
return "Error: iflow-cli-sdk (Python package) missing and auto-installation failed. Please install it with `pip install iflow-cli-sdk` manually."
# 1. Update PATH to include custom bin dir
if self.valves.IFLOW_BIN_DIR not in os.environ["PATH"]:
os.environ["PATH"] = f"{self.valves.IFLOW_BIN_DIR}:{os.environ['PATH']}"
# 2. Ensure CLI is installed and path is updated
if not await self._ensure_cli(_emit_status):
return f"Error: iFlow CLI not found and auto-installation failed. Please install it to {self.valves.IFLOW_BIN_DIR} manually."
messages = body.get("messages", [])
if not messages:
return "No messages provided."
# Get the last user message
last_message = messages[-1]
content = last_message.get("content", "")
# Determine user role and construct prompt
role = self._get_user_role(__user__)
dynamic_prompt = self._get_system_prompt(role)
# Prepare Auth Info
auth_info = None
if self.valves.AUTH_API_KEY:
auth_info = {
"api_key": self.valves.AUTH_API_KEY,
"base_url": self.valves.AUTH_BASE_URL,
"model_name": self.valves.AUTH_MODEL,
}
# Prepare Session Settings
session_settings = None
try:
from iflow_sdk import SessionSettings
session_settings = SessionSettings(system_prompt=dynamic_prompt)
except ImportError:
session_settings = {"system_prompt": dynamic_prompt}
# 2. Configure iFlow Options
# Use local references to ensure we're using the freshly imported SDK components
from iflow_sdk import (
IFlowOptions as SDKOptions,
ApprovalMode as SDKApprovalMode,
)
# Get approval mode with a safe fallback
try:
target_mode = getattr(SDKApprovalMode, self.valves.APPROVAL_MODE)
except (AttributeError, TypeError):
target_mode = (
SDKApprovalMode.YOLO if hasattr(SDKApprovalMode, "YOLO") else None
)
options = SDKOptions(
url=self.valves.IFLOW_URL,
auto_start_process=self.valves.AUTO_START,
process_start_port=self.valves.IFLOW_PORT,
timeout=self.valves.TIMEOUT,
log_level=self.valves.LOG_LEVEL,
cwd=self.valves.CWD or None,
approval_mode=target_mode,
file_access=self.valves.FILE_ACCESS,
auth_method_id=self.valves.SELECTED_AUTH_TYPE if auth_info else None,
auth_method_info=auth_info,
session_settings=session_settings,
)
# 3. Stream from iFlow
async def stream_generator():
try:
await _emit_status("Initializing iFlow connection...")
async with IFlowClient(options) as client:
await client.send_message(content)
await _emit_status("iFlow is processing...")
async for message in client.receive_messages():
if isinstance(message, AssistantMessage):
yield message.chunk.text
if message.agent_info and message.agent_info.agent_id:
logger.debug(
f"Message from agent: {message.agent_info.agent_id}"
)
elif isinstance(message, PlanMessage):
plan_str = "\n".join(
[
f"{'✅' if e.status == 'completed' else '⬜'} [{e.priority}] {e.content}"
for e in message.entries
]
)
await _emit_status(f"Execution Plan updated:\n{plan_str}")
elif isinstance(message, TaskStatusMessage):
await _emit_status(f"iFlow: {message.status}")
elif isinstance(message, ToolCallMessage):
tool_desc = (
f"Calling tool: {message.tool_name}"
if message.tool_name
else "Invoking tool"
)
await _emit_status(
f"{tool_desc}... (Status: {message.status})"
)
elif isinstance(message, TaskFinishMessage):
reason_msg = "Task completed."
if message.stop_reason == StopReason.MAX_TOKENS:
reason_msg = "Task stopped: Max tokens reached."
elif message.stop_reason == StopReason.END_TURN:
reason_msg = "Task completed successfully."
await _emit_status(reason_msg, done=True)
break
except Exception as e:
logger.error(f"Error in iFlow pipe: {str(e)}", exc_info=True)
error_msg = f"iFlow Error: {str(e)}"
yield error_msg
await _emit_status(error_msg, done=True)
return stream_generator()

View File

@@ -0,0 +1,241 @@
"""
title: Smart Mind Map Tool
author: Fu-Jie
author_url: https://github.com/Fu-Jie/openwebui-extensions
funding_url: https://github.com/open-webui
version: 1.1.0
description: Intelligently analyzes text content and generates interactive mind maps inline to help users structure and visualize knowledge.
"""
import asyncio
import logging
import re
import time
import json
from datetime import datetime, timezone
from typing import Any, Callable, Awaitable, Dict, Optional
from fastapi import Request
from pydantic import BaseModel, Field
from open_webui.utils.chat import generate_chat_completion
from open_webui.models.users import Users
logger = logging.getLogger(__name__)
class Tools:
class Valves(BaseModel):
MODEL_ID: str = Field(default="", description="The model ID to use for mind map generation. If empty, uses the current conversation model.")
MIN_TEXT_LENGTH: int = Field(default=50, description="Minimum text length required for analysis.")
SHOW_STATUS: bool = Field(default=True, description="Whether to show status messages.")
def __init__(self):
self.valves = self.Valves()
self.__translations = {
"en-US": {
"status_analyzing": "Smart Mind Map: Analyzing text structure...",
"status_drawing": "Smart Mind Map: Drawing completed!",
"notification_success": "Mind map has been generated, {user_name}!",
"error_text_too_short": "Text content is too short ({len} characters). Min: {min_len}.",
"error_user_facing": "Sorry, Smart Mind Map encountered an error: {error}",
"status_failed": "Smart Mind Map: Failed.",
"ui_title": "🧠 Smart Mind Map",
"ui_download_png": "PNG",
"ui_download_svg": "SVG",
"ui_download_md": "Markdown",
"ui_zoom_out": "Zoom Out",
"ui_zoom_reset": "Reset",
"ui_zoom_in": "Zoom In",
"ui_depth_select": "Expand Level",
"ui_depth_all": "All",
"ui_depth_2": "L2",
"ui_depth_3": "L3",
"ui_fullscreen": "Fullscreen",
"ui_theme": "Theme",
"ui_footer": "<b>Powered by</b> <a href='https://markmap.js.org/' target='_blank' rel='noopener noreferrer'>Markmap</a>",
"html_error_missing_content": "⚠️ Missing content.",
"html_error_load_failed": "⚠️ Resource load failed.",
"js_done": "Done",
},
"zh-CN": {
"status_analyzing": "思维导图:深入分析文本结构...",
"status_drawing": "思维导图:绘制完成!",
"notification_success": "思维导图已生成,{user_name}",
"error_text_too_short": "文本内容过短({len}字符),请提供至少{min_len}字符。",
"error_user_facing": "抱歉,思维导图处理出错:{error}",
"status_failed": "思维导图:处理失败。",
"ui_title": "🧠 智能思维导图",
"ui_download_png": "PNG",
"ui_download_svg": "SVG",
"ui_download_md": "Markdown",
"ui_zoom_out": "缩小",
"ui_zoom_reset": "重置",
"ui_zoom_in": "放大",
"ui_depth_select": "展开层级",
"ui_depth_all": "全部",
"ui_depth_2": "2级",
"ui_depth_3": "3级",
"ui_fullscreen": "全屏",
"ui_theme": "主题",
"ui_footer": "<b>Powered by</b> <a href='https://markmap.js.org/' target='_blank' rel='noopener noreferrer'>Markmap</a>",
"html_error_missing_content": "⚠️ 缺少有效内容。",
"html_error_load_failed": "⚠️ 资源加载失败。",
"js_done": "完成",
}
}
self.__system_prompt = """You are a professional mind map assistant. Analyze text and output Markdown list syntax for Markmap.js.
Guidelines:
- Root node (#) must be ultra-compact (max 10 chars for CJK, 5 words for Latin).
- Use '-' with 2-space indentation.
- Output ONLY Markdown wrapped in ```markdown.
- Match the language of the input text."""
self.__css_template = """
:root {
--primary-color: #1e88e5; --secondary-color: #43a047; --background-color: #f4f6f8;
--card-bg-color: #ffffff; --text-color: #000000; --link-color: #546e7a;
--node-stroke-color: #90a4ae; --muted-text-color: #546e7a; --border-color: #e0e0e0;
--shadow: 0 4px 12px rgba(0, 0, 0, 0.05); --font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
}
.theme-dark {
--primary-color: #3b82f6; --secondary-color: #22c55e; --background-color: #0d1117;
--card-bg-color: #161b22; --text-color: #ffffff; --link-color: #58a6ff;
--node-stroke-color: #8b949e; --muted-text-color: #7d8590; --border-color: #30363d;
}
html, body { margin: 0; padding: 0; width: 100%; height: 600px; background: transparent; overflow: hidden; font-family: var(--font-family); }
.mindmap-wrapper { display: flex; flex-direction: column; width: 100%; height: 100%; background: var(--card-bg-color); border: 1px solid var(--border-color); border-radius: 12px; overflow: hidden; box-shadow: var(--shadow); }
.header { display: flex; align-items: center; padding: 8px 16px; border-bottom: 1px solid var(--border-color); background: var(--card-bg-color); flex-shrink: 0; gap: 12px; }
.header h1 { margin: 0; font-size: 1rem; flex-grow: 1; color: var(--text-color); }
.btn-group { display: flex; gap: 2px; background: var(--background-color); padding: 2px; border-radius: 6px; }
.control-btn { border: none; background: transparent; color: var(--text-color); padding: 4px 8px; cursor: pointer; border-radius: 4px; font-size: 0.8rem; opacity: 0.7; }
.control-btn:hover { background: var(--card-bg-color); opacity: 1; }
.content { flex-grow: 1; position: relative; }
.markmap-container { position: absolute; top:0; left:0; right:0; bottom:0; }
svg text { fill: var(--text-color) !important; }
svg .markmap-link { stroke: var(--link-color) !important; }
"""
self.__content_template = """
<div class="mindmap-wrapper">
<div class="header">
<h1>{t_ui_title}</h1>
<div class="btn-group">
<button id="z-in-{uid}" class="control-btn">+</button>
<button id="z-out-{uid}" class="control-btn">-</button>
<button id="z-res-{uid}" class="control-btn">↺</button>
</div>
<div class="btn-group">
<select id="d-sel-{uid}" class="control-btn">
<option value="0">{t_ui_depth_all}</option>
<option value="2">{t_ui_depth_2}</option>
<option value="3" selected>{t_ui_depth_3}</option>
</select>
</div>
<button id="t-tog-{uid}" class="control-btn">◐</button>
</div>
<div class="content"><div class="markmap-container" id="mm-{uid}"></div></div>
</div>
<script type="text/template" id="src-{uid}">{md}</script>
"""
    async def generate_mind_map(
        self,
        text: str,
        __user__: Optional[Dict[str, Any]] = None,
        __metadata__: Optional[Dict[str, Any]] = None,
        __event_emitter__: Optional[Callable[[Any], Awaitable[None]]] = None,
        __request__: Optional[Request] = None,
    ) -> Any:
        """
        Generate an interactive mind map (Markmap HTML artifact) from the given text.

        :param text: The text to analyze and render as a mind map.
        :return: An inline HTML artifact on success, or an error message string.
        """
user_ctx = await self.__get_user_context(__user__, __request__)
lang = user_ctx["lang"]
name = user_ctx["name"]
if len(text) < self.valves.MIN_TEXT_LENGTH:
return f"⚠️ {self.__get_t(lang, 'error_text_too_short', len=len(text), min_len=self.valves.MIN_TEXT_LENGTH)}"
await self.__emit_status(__event_emitter__, self.__get_t(lang, "status_analyzing"), False)
try:
            target_model = self.valves.MODEL_ID or (__metadata__.get("model_id") if __metadata__ else "")
            if not target_model:
                raise ValueError("No model available: set MODEL_ID in the tool valves or invoke from a chat with a model selected.")
llm_payload = {
"model": target_model,
"messages": [
{"role": "system", "content": self.__system_prompt},
{"role": "user", "content": f"Language: {lang}\nText: {text}"},
],
"temperature": 0.5,
}
            user_obj = Users.get_user_by_id(user_ctx["id"])
            if user_obj is None:
                raise ValueError(f"Unable to resolve user '{user_ctx['id']}' for the LLM call.")
            response = await generate_chat_completion(__request__, llm_payload, user_obj)
md_content = self.__extract_md(response["choices"][0]["message"]["content"])
uid = str(int(time.time() * 1000))
ui_t = {f"t_{k}": self.__get_t(lang, k) for k in self.__translations["en-US"] if k.startswith("ui_")}
html_body = self.__content_template.format(uid=uid, md=md_content, **ui_t)
script = f"""
<script>
(function() {{
const uid = "{uid}";
const load = (s, c) => c() ? Promise.resolve() : new Promise((r,e) => {{
const t = document.createElement('script'); t.src = s; t.onload = r; t.onerror = e; document.head.appendChild(t);
}});
const init = () => load('https://cdn.jsdelivr.net/npm/d3@7', () => window.d3)
.then(() => load('https://cdn.jsdelivr.net/npm/markmap-lib@0.17', () => window.markmap?.Transformer))
.then(() => load('https://cdn.jsdelivr.net/npm/markmap-view@0.17', () => window.markmap?.Markmap))
.then(() => {{
const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
svg.style.width = svg.style.height = '100%';
const cnt = document.getElementById('mm-'+uid); cnt.appendChild(svg);
const {{ Transformer, Markmap }} = window.markmap;
const {{ root }} = new Transformer().transform(document.getElementById('src-'+uid).textContent);
const mm = Markmap.create(svg, {{ autoFit: true, initialExpandLevel: 3 }}, root);
document.getElementById('z-in-'+uid).onclick = () => mm.rescale(1.25);
document.getElementById('z-out-'+uid).onclick = () => mm.rescale(0.8);
document.getElementById('z-res-'+uid).onclick = () => mm.fit();
document.getElementById('t-tog-'+uid).onclick = () => document.body.classList.toggle('theme-dark');
document.getElementById('d-sel-'+uid).onchange = (e) => {{
mm.setOptions({{ initialExpandLevel: parseInt(e.target.value) || 99 }}); mm.setData(root); mm.fit();
}};
window.addEventListener('resize', () => mm.fit());
}});
if (document.readyState === 'loading') document.addEventListener('DOMContentLoaded', init); else init();
}})();
</script>
"""
final_html = f"<!DOCTYPE html><html lang='{lang}'><head><style>{self.__css_template}</style></head><body>{html_body}{script}</body></html>"
await self.__emit_status(__event_emitter__, self.__get_t(lang, "status_drawing"), True)
await self.__emit_notification(__event_emitter__, self.__get_t(lang, "notification_success", user_name=name), "success")
return (final_html.strip(), {"Content-Disposition": "inline", "Content-Type": "text/html"})
except Exception as e:
logger.error(f"Mind Map Error: {e}", exc_info=True)
await self.__emit_status(__event_emitter__, self.__get_t(lang, "status_failed"), True)
return f"{self.__get_t(lang, 'error_user_facing', error=str(e))}"
    async def __get_user_context(self, __user__, __request__) -> Dict[str, str]:
        u = __user__ or {}
        # __request__ may be None depending on how the tool is invoked, so guard
        # before reading headers.
        header_lang = __request__.headers.get("accept-language", "") if __request__ else ""
        lang = u.get("language") or (header_lang or "en-US").split(",")[0].split(";")[0]
        # Keep only language-tag characters so lang is safe to embed in the HTML lang attribute.
        lang = re.sub(r"[^A-Za-z-]", "", lang) or "en-US"
        return {"id": u.get("id", "unknown"), "name": u.get("name", "User"), "lang": lang}
    def __get_t(self, lang: str, key: str, **kwargs) -> str:
        table = self.__translations.get(lang)
        if table is None:
            # Fall back to any locale sharing the same base language
            # (e.g. "zh-TW" -> "zh-CN"), then to "en-US". A plain
            # self.__translations.get(base) would never match, since the
            # table keys are full locale tags, not bare base languages.
            base = lang.split("-")[0]
            table = next(
                (v for k, v in self.__translations.items() if k.split("-")[0] == base),
                self.__translations["en-US"],
            )
        t = table.get(key, key)
        return t.format(**kwargs) if kwargs else t
    def __extract_md(self, content: str) -> str:
        match = re.search(r"```markdown\s*(.*?)\s*```", content, re.DOTALL)
        md = match.group(1).strip() if match else content.strip()
        # Escape closing </script> tags (any letter case) so the markdown cannot
        # break out of the <script type="text/template"> container in the HTML.
        return re.sub(r"</script>", "<\\\\/script>", md, flags=re.IGNORECASE)
async def __emit_status(self, emitter, description: str, done: bool):
if self.valves.SHOW_STATUS and emitter:
await emitter({"type": "status", "data": {"description": description, "done": done}})
async def __emit_notification(self, emitter, content: str, ntype: str):
if emitter:
await emitter({"type": "notification", "data": {"type": ntype, "content": content}})