Compare commits


10 Commits

Author SHA1 Message Date
fujie
fc9f1ccb43 feat(pipes): v0.7.0 final release with native tool UI and CLI integration
- Core: Adapt to OpenWebUI native tool call UI and thinking process visualization
- Infra: Bundle Copilot CLI via pip package (no more background curl installation)
- Fix: Resolve "Error getting file content" on OpenWebUI v0.8.0+ via absolute paths
- i18n: Add native localization for status messages in 11 languages
- UX: Optimize reasoning status display logic and cleanup legacy code
2026-02-23 02:33:59 +08:00
fujie
272b959a44 fix(actions): fix white background bleed in dark mode when viewport is narrow 2026-02-22 00:24:37 +08:00
fujie
0bde066088 fix: correct Top 6 plugin name-data mapping to match Gist badge order 2026-02-22 00:10:04 +08:00
fujie
6334660e8d docs: add root README update date sync rule to consistency maintenance 2026-02-22 00:06:43 +08:00
fujie
c29d84f97a docs: update Smart Mind Map release date in root readmes to 2026-02-22 2026-02-22 00:02:51 +08:00
fujie
aac2e89022 chore: update smart mind map plugin image. 2026-02-22 00:00:19 +08:00
fujie
fea812d4f4 ci: remove redundant release title from github release body 2026-02-21 23:49:36 +08:00
fujie
b570cbfcde docs(filters): fix broken link to workflow guide in mkdocs strict build 2026-02-21 23:47:48 +08:00
fujie
adc5e0a1f4 feat(filters): release v1.3.0 for async context compression
- Add native i18n support across 9 languages
- Implement non-blocking frontend log emission for zero TTFB delay
- Add token_usage_status_threshold to intelligently control status notifications
- Automatically detect and skip compression for copilot_sdk models
- Set debug_mode default to false for a quieter production environment
- Update documentation and remove legacy bilingual code
2026-02-21 23:44:12 +08:00
fujie
04b8108890 chore: ignore and stop tracking .git-worktrees 2026-02-21 21:50:35 +08:00
25 changed files with 1458 additions and 4453 deletions

Submodule .git-worktrees/feature-copilot-cli deleted from 1bbddb2222


@@ -478,7 +478,117 @@ async def get_user_language(self):
**Note**: Even when a plugin provides `Valves` configuration, automatic detection should still be attempted first to improve the user experience.
- ### 8. Agent File Delivery Standards
+ ### 8. Internationalization (i18n) Standards
When developing plugins for a global audience, multi-language support (e.g., Chinese and English) must be built in.
#### Defining the i18n Dictionary
Define a `TRANSLATIONS` dictionary at the top of the file to hold the localized strings:
```python
TRANSLATIONS = {
    "en-US": {
        "status_starting": "Smart Mind Map is starting...",
    },
    "zh-CN": {
        "status_starting": "智能思维导图正在启动...",
    },
    # ... other languages
}

# Language fallback map
FALLBACK_MAP = {
    "zh": "zh-CN",
    "zh-TW": "zh-CN",
    "zh-HK": "zh-CN",
    "en": "en-US",
    "en-GB": "en-US",
}
```
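As a sketch of how the fallback map might be consumed, the helper below normalizes a detected locale to a supported dictionary key (the `resolve_language` name and the bare-prefix step are illustrative, not part of the original snippet):

```python
TRANSLATIONS = {"en-US": {}, "zh-CN": {}}  # same keys as defined above
FALLBACK_MAP = {
    "zh": "zh-CN", "zh-TW": "zh-CN", "zh-HK": "zh-CN",
    "en": "en-US", "en-GB": "en-US",
}

def resolve_language(raw_lang: str) -> str:
    """Map a detected locale (e.g. 'zh-TW', 'en', 'fr-CA') to a supported key."""
    if raw_lang in TRANSLATIONS:
        return raw_lang
    if raw_lang in FALLBACK_MAP:
        return FALLBACK_MAP[raw_lang]
    # Try the bare language prefix before giving up
    prefix = raw_lang.split("-")[0]
    return FALLBACK_MAP.get(prefix, "en-US")
```

This keeps unknown regional variants (`zh-Hans`, `fr-CA`) from falling straight to English when a same-language dictionary exists.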
#### Robust Language Detection
Open WebUI's frontend does not automatically sync the language setting in `localStorage` to the backend database, nor pass it through standard API parameters. To detect the user's preferred language accurately, you **must** use a multi-level fallback chain:
`JS probe (localStorage)` > `HTTP header (Accept-Language)` > `user profile default` > `en-US`
> **Warning: Anti-Deadlock Guide**
> When executing frontend JS via `__event_call__`, an uncaught exception in the frontend script means the callback `cb()` never fires, which blocks the backend `asyncio` task forever and stalls the entire request queue.
> Two safeguards are **mandatory**:
> 1. Wrap the JS body in `try...catch` so it always `return`s.
> 2. Enforce a timeout on the backend with `asyncio.wait_for` (2 seconds is a reasonable default).
```python
import asyncio
from typing import Optional

from fastapi import Request


async def _get_user_context(
    self,
    __user__: Optional[dict],
    __event_call__: Optional[callable] = None,
    __request__: Optional[Request] = None,
) -> dict:
    user_language = __user__.get("language", "en-US") if __user__ else "en-US"

    # 1st fallback: HTTP Accept-Language header
    if __request__ and hasattr(__request__, "headers") and "accept-language" in __request__.headers:
        raw_lang = __request__.headers.get("accept-language", "")
        if raw_lang:
            user_language = raw_lang.split(",")[0].split(";")[0]

    # 2nd fallback (best signal): execute JS in the frontend to read localStorage
    if __event_call__:
        try:
            js_code = """
            try {
                return (
                    document.documentElement.lang ||
                    localStorage.getItem('locale') ||
                    navigator.language ||
                    'en-US'
                );
            } catch (e) {
                return 'en-US';
            }
            """
            # CRITICAL: wait_for prevents a frozen frontend from deadlocking the backend
            frontend_lang = await asyncio.wait_for(
                __event_call__({"type": "execute", "data": {"code": js_code}}),
                timeout=2.0,
            )
            if frontend_lang and isinstance(frontend_lang, str):
                user_language = frontend_lang
        except Exception:
            pass  # fall back to Accept-Language or en-US
    return {
        "user_language": user_language,
        # ... user_name, user_id, etc.
    }
```
#### Usage in an Action/Filter
Call this context helper when the Action or Filter runs, then pass the detected language to your translation lookup:
```python
async def action(
    self,
    body: dict,
    __user__: Optional[dict] = None,
    __event_call__: Optional[callable] = None,
    __request__: Optional[Request] = None,
    **kwargs,
) -> Optional[dict]:
    user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
    user_lang = user_ctx["user_language"]
    # Look up localized text (via your translation.get() helper)
    # start_msg = self._get_translation(user_lang, "status_starting")
```
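One possible shape for the `_get_translation` helper referenced above, shown here as a standalone function; the double fallback (resolve the language, then fall back to English, then to the key itself for missing entries) is an assumption, not the plugin's actual code:

```python
TRANSLATIONS = {
    "en-US": {"status_starting": "Smart Mind Map is starting..."},
    "zh-CN": {"status_starting": "智能思维导图正在启动..."},
}
FALLBACK_MAP = {"zh": "zh-CN", "zh-TW": "zh-CN", "en": "en-US", "en-GB": "en-US"}

def get_translation(lang: str, key: str) -> str:
    """Resolve the language via FALLBACK_MAP, then fall back to en-US (or the key itself)."""
    if lang not in TRANSLATIONS:
        lang = FALLBACK_MAP.get(lang) or FALLBACK_MAP.get(lang.split("-")[0], "en-US")
    table = TRANSLATIONS.get(lang, TRANSLATIONS["en-US"])
    return table.get(key) or TRANSLATIONS["en-US"].get(key, key)
```

Returning the key itself as a last resort makes missing translations visible in the UI instead of crashing the plugin.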
### 9. Agent File Delivery Standards
When developing agent plugins that can generate files (e.g., the GitHub Copilot SDK integration), follow the standard workflow below to keep files available across storage backends (local/S3) and to bypass unnecessary RAG processing.
@@ -498,7 +608,7 @@ async def get_user_language(self):
- The agent should always treat the "current directory" as its protected private workspace.
- The `filename` argument of `publish_file_from_workspace` only needs the file name relative to the current directory.
- ### 9. Copilot SDK Tool Definition Standards
+ ### 10. Copilot SDK Tool Definition Standards
When developing custom tools for the GitHub Copilot SDK, follow the definition pattern below so the model can recognize parameters correctly (and avoid generating an empty `properties` schema):
@@ -532,6 +642,63 @@ my_tool = define_tool(
2. **Field descriptions**: Use `Field(..., description="...")` in the `BaseModel` to provide a detailed description for every parameter.
3. **Required vs. optional**: Explicitly mark required fields (no default) and optional fields (with `default`).
### 11. Copilot SDK Streaming & Tool Card Standards
When handling model reasoning (chain-of-thought) output and tool calls, the strict output format rules below must be followed to stay compatible with the OpenWebUI 0.8.x frontend Markdown parser and its native collapsible UI components.
#### Reasoning Streaming
For the frontend to render the "Thinking..." collapsible box and spinner animation, you **must** use the native `<think>` tag.
- **Correct tag wrapping**:
```html
<think>
The reasoning process goes here...
</think>
```
- **Key details**:
  - **Tag-close detection**: Maintain state in code (e.g., `state["thinking_started"]`). When (1) answer text is about to start, or (2) a tool call fires (`tool.execution_start`), you **must emit `\n</think>\n` first to force-close the tag**. If the tag is left open, the subsequent answer text or tool panel gets swallowed into the thinking box and the page layout breaks completely.
  - **Do not hand-assemble the UI**: Never emulate the thinking process by emitting large chunks of HTML such as `<details type="reasoning">`; streamed fragments of it easily corrupt the frontend DOM tree and misplace content.
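The tag-close bookkeeping can be sketched as a tiny state machine (the `emit_chunk` helper and `state` keys are illustrative, not the plugin's actual code):

```python
def emit_chunk(state: dict, chunk: str, is_reasoning: bool) -> str:
    """Prefix a streamed chunk with <think> / </think> as needed.

    Force-closes the thinking block the moment non-reasoning content
    (answer text or a tool panel) starts, so nothing gets swallowed.
    """
    out = ""
    if is_reasoning and not state.get("thinking_started"):
        state["thinking_started"] = True
        out += "<think>\n"
    elif not is_reasoning and state.get("thinking_started") and not state.get("thinking_closed"):
        state["thinking_closed"] = True
        out += "\n</think>\n"
    return out + chunk
```

The same force-close branch should also fire on `tool.execution_start`, before any tool card is emitted.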
#### Native Tool Call Cards
To render the standard, native collapsible "tool call" card in the chat UI, emit HTML to the queue in exactly this format when `event_type == "tool.execution_complete"`:
```python
# Double quotes inside attribute values must be escaped as &quot;
args_for_attr = args_json_str.replace('"', "&quot;")
result_for_attr = result_content.replace('"', "&quot;")
tool_block = (
    f'\n<details type="tool_calls"'
    f' id="{tool_call_id}"'
    f' name="{tool_name}"'
    f' arguments="{args_for_attr}"'
    f' result="{result_for_attr}"'
    f' done="true">\n'
    f"<summary>Tool Executed</summary>\n"
    f"</details>\n\n"
)
queue.put_nowait(tool_block)
```
- **Critical pitfalls**:
  1. **Attribute escaping (extremely important)**: The `arguments` and `result` attributes of `<details>` **must** have every inner double quote `"` replaced with `&quot;`. The OpenWebUI frontend extracts these values with the strict regex `="([^"]*)"`, so a raw double quote inside the content truncates the match instantly, rendering the arguments as empty and triggering parse errors.
  2. **Newline requirement**: The content immediately after the closing `>` of `<details ...>` **must** start on a new line (i.e., `>\n`), or the Markdown extension engine will not recognize it as a standalone UI block.
  3. **No redundant notices**: Do not emit a plain-text `🔧 Executing...` block to the chat stream on the `tool.execution_start` event; otherwise the final page shows two tool notices (one text, one collapsible card).
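A minimal reproduction of pitfall 1, showing how the `="([^"]*)"`-style extraction truncates unescaped attribute values:

```python
import json
import re

args_json = json.dumps({"query": 'say "hello"'})

# Unescaped: the regex stops at the first raw double quote inside the value
bad = f'<details type="tool_calls" arguments="{args_json}" done="true">'
assert re.search(r'arguments="([^"]*)"', bad).group(1) == "{"  # truncated!

# Escaped: the full payload survives extraction
escaped = args_json.replace('"', "&quot;")
good = f'<details type="tool_calls" arguments="{escaped}" done="true">'
assert re.search(r'arguments="([^"]*)"', good).group(1) == escaped
```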
#### Decoupling Debug Logs
For debug information about the *script's own runtime state* (connection setup, runtime environment, cache loading, etc.):
- **Forbidden**: Do not yield this content into the final answer stream (or stuff it into the `<think>` tag); it pollutes the answer.
- **Recommended**: Use OpenWebUI's native status bubble (status events) at the top of the chat instead:
```python
await __event_emitter__({
    "type": "status",
    "data": {"description": "Connection established, waiting for response...", "done": True},
})
```
---
## ⚡ Action Plugin Standards
@@ -947,8 +1114,10 @@ Filter instances are **singletons**.
Any plugin **addition, modification, or removal** must update all of the following in the same change:
1. **Plugin code** (version)
- 2. **Project docs** (`docs/`)
- 3. **README** (`README.md`)
+ 2. **Plugin READMEs** (`plugins/{type}/{name}/README.md` & `README_CN.md`)
+ 3. **Project docs** (`docs/plugins/{type}/{name}.md` & `.zh.md`)
+ 4. **Project doc indexes** (`docs/plugins/{type}/index.md` & `index.zh.md`, version number)
+ 5. **Root READMEs** (`README.md` & `README_CN.md`; the updated-date badge `![updated](https://img.shields.io/badge/YYYY--MM--DD-gray?style=flat)` must be synced to the release date)
### 3. Release Workflow


@@ -329,8 +329,7 @@ jobs:
DETECTED_CHANGES: ${{ needs.check-changes.outputs.release_notes }}
COMMITS: ${{ steps.commits.outputs.commits }}
run: |
- echo "# ${VERSION} Release" > release_notes.md
- echo "" >> release_notes.md
+ > release_notes.md
if [ -n "$TITLE" ]; then
echo "## $TITLE" >> release_notes.md

.gitignore

@@ -139,3 +139,4 @@ logs/
# OpenWebUI specific
# Add any specific ignores for OpenWebUI plugins if needed
.git-worktrees/


@@ -24,12 +24,12 @@ A collection of enhancements, plugins, and prompts for [OpenWebUI](https://githu
| Rank | Plugin | Version | Downloads | Views | 📅 Updated |
| :---: | :--- | :---: | :---: | :---: | :---: |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | ![p1_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_version.json&style=flat) | ![p1_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_dl.json&style=flat) | ![p1_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | ![p1_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_version.json&style=flat) | ![p1_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_dl.json&style=flat) | ![p1_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--22-gray?style=flat) |
| 🥈 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | ![p2_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_version.json&style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--31-gray?style=flat) |
| 🥉 | [Export to Word Enhanced](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | ![p3_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_version.json&style=flat) | ![p3_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_dl.json&style=flat) | ![p3_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--07-gray?style=flat) |
| 4⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![p4_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_version.json&style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 5⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | ![p5_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_version.json&style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--10-gray?style=flat) |
| 6⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![p6_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_version.json&style=flat) | ![p6_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_dl.json&style=flat) | ![p6_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 4⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![p4_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_version.json&style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 5⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![p5_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_version.json&style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 6⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | ![p6_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_version.json&style=flat) | ![p6_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_dl.json&style=flat) | ![p6_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--10-gray?style=flat) |
### 📈 Total Downloads Trend


@@ -21,12 +21,12 @@ A collection of OpenWebUI enhancements, including plugins and prompts that are personally developed or collected
| 排名 | 插件 | 版本 | 下载 | 浏览 | 📅 更新 |
| :---: | :--- | :---: | :---: | :---: | :---: |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | ![p1_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_version.json&style=flat) | ![p1_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_dl.json&style=flat) | ![p1_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | ![p1_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_version.json&style=flat) | ![p1_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_dl.json&style=flat) | ![p1_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--22-gray?style=flat) |
| 🥈 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | ![p2_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_version.json&style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--31-gray?style=flat) |
| 🥉 | [Export to Word Enhanced](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | ![p3_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_version.json&style=flat) | ![p3_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_dl.json&style=flat) | ![p3_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--07-gray?style=flat) |
| 4⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![p4_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_version.json&style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 5⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | ![p5_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_version.json&style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--10-gray?style=flat) |
| 6⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![p6_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_version.json&style=flat) | ![p6_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_dl.json&style=flat) | ![p6_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 4⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![p4_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_version.json&style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 5⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![p5_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_version.json&style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--01--29-gray?style=flat) |
| 6⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | ![p6_version](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_version.json&style=flat) | ![p6_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_dl.json&style=flat) | ![p6_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--02--10-gray?style=flat) |
### 📈 总下载量累计趋势


@@ -1,137 +1,81 @@
# Async Context Compression
# Async Context Compression Filter
<span class="category-badge filter">Filter</span>
<span class="version-badge">v1.2.2</span>
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
Reduces token consumption in long conversations through intelligent summarization while maintaining conversational coherence.
This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.
## What's new in 1.3.0
- **Internationalization (i18n)**: Complete localization of user-facing messages across 9 languages (including English, Chinese, Japanese, Korean, French, German, Spanish, and Italian).
- **Smart Status Display**: Added `token_usage_status_threshold` valve (default 80%) to intelligently control when token usage status is shown.
- **Improved Performance**: Frontend language detection and logging are optimized to be completely non-blocking, maintaining lightning-fast TTFB.
- **Copilot SDK Integration**: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
- **Configuration**: `debug_mode` is now set to `false` by default for a quieter production experience.
---
## Overview
## Core Features
The Async Context Compression filter helps manage token usage in long conversations by:
- Intelligently summarizing older messages
- Preserving important context
- Reducing API costs
- Maintaining conversation coherence
This is especially useful for:
- Long-running conversations
- Complex multi-turn discussions
- Cost optimization
- Token limit management
## Features
- :material-arrow-collapse-vertical: **Smart Compression**: AI-powered context summarization
- :material-clock-fast: **Async Processing**: Non-blocking background compression
- :material-memory: **Context Preservation**: Keeps important information
- :material-currency-usd-off: **Cost Reduction**: Minimize token usage
- :material-console: **Frontend Debugging**: Debug logs in browser console
- :material-alert-circle-check: **Enhanced Error Reporting**: Clear error status notifications
- :material-check-all: **Open WebUI v0.7.x Compatibility**: Dynamic DB session handling
- :material-account-convert: **Improved Compatibility**: Summary role changed to `assistant`
- :material-shield-check: **Enhanced Stability**: Resolved race conditions in state management
- :material-ruler: **Preflight Context Check**: Validates context fit before sending
- :material-format-align-justify: **Structure-Aware Trimming**: Preserves document structure
- :material-content-cut: **Native Tool Output Trimming**: Trims verbose tool outputs (Note: Non-native tool outputs are not fully injected into context)
- :material-chart-bar: **Detailed Token Logging**: Granular token breakdown
- :material-account-search: **Smart Model Matching**: Inherit config from base models
- :material-image-off: **Multimodal Support**: Images are preserved but tokens are **NOT** calculated
- ✅ **Full i18n Support**: Native localization across 9 languages.
- ✅ Automatic compression triggered by token thresholds.
- ✅ Asynchronous summarization that does not block chat responses.
- ✅ Persistent storage via Open WebUI's shared database connection (PostgreSQL, SQLite, etc.).
- ✅ Flexible retention policy to keep the first and last N messages.
- ✅ Smart injection of historical summaries back into the context.
- ✅ Structure-aware trimming that preserves document structure (headers, intro, conclusion).
- ✅ Native tool output trimming for cleaner context when using function calling.
- ✅ Real-time context usage monitoring with warning notifications (>90%).
- ✅ Detailed token logging for precise debugging and optimization.
- **Smart Model Matching**: Automatically inherits configuration from base models for custom presets.
- **Multimodal Support**: Images are preserved but their tokens are **NOT** calculated. Please adjust thresholds accordingly.
---
## Installation
## Installation & Configuration
1. Download the plugin file: [`async_context_compression.py`](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression)
2. Upload to OpenWebUI: **Admin Panel****Settings****Functions**
3. Configure compression settings
4. Enable the filter
### 1) Database (automatic)
- Uses Open WebUI's shared database connection; no extra configuration needed.
- The `chat_summary` table is created on first run.
### 2) Filter order
- Recommended order: pre-filters (<10) → this filter (10) → post-filters (>10).
---
## How It Works
## Configuration Parameters
```mermaid
graph TD
A[Incoming Messages] --> B{Token Count > Threshold?}
B -->|No| C[Pass Through]
B -->|Yes| D[Summarize Older Messages]
D --> E[Preserve Recent Messages]
E --> F[Combine Summary + Recent]
F --> G[Send to LLM]
```
| Parameter | Default | Description |
| :----------------------------- | :------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `priority` | `10` | Execution order; lower runs earlier. |
| `compression_threshold_tokens` | `64000` | Trigger asynchronous summary when total tokens exceed this value. Set to 50%-70% of your model's context window. |
| `max_context_tokens` | `128000` | Hard cap for context; older messages (except protected ones) are dropped if exceeded. |
| `keep_first` | `1` | Always keep the first N messages (protects system prompts). |
| `keep_last` | `6` | Always keep the last N messages to preserve recent context. |
| `summary_model` | `None` | Model for summaries. Strongly recommended to set a fast, economical model (e.g., `gemini-2.5-flash`, `deepseek-v3`). Falls back to the current chat model when empty. |
| `summary_model_max_context` | `0` | Max context tokens for the summary model. If 0, falls back to `model_thresholds` or global `max_context_tokens`. |
| `max_summary_tokens` | `16384` | Maximum tokens for the generated summary. |
| `summary_temperature` | `0.3` | Randomness for summary generation. Lower is more deterministic. |
| `model_thresholds` | `{}` | Per-model overrides for `compression_threshold_tokens` and `max_context_tokens` (useful for mixed models). |
| `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. |
| `debug_mode` | `false` | Log verbose debug info. Set to `false` in production. |
| `show_debug_log` | `false` | Print debug logs to browser console (F12). Useful for frontend debugging. |
| `show_token_usage_status` | `true` | Show token usage status notification in the chat interface. |
| `token_usage_status_threshold` | `80` | The minimum usage percentage (0-100) required to show a context usage status notification. |
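For instance, `model_thresholds` could carry per-model overrides like the sketch below (the model IDs, the numbers, and the `effective_thresholds` helper are illustrative, not the filter's actual code):

```python
# Per-model overrides; anything missing falls back to the global valves.
model_thresholds = {
    "gpt-4o-mini": {"compression_threshold_tokens": 80000, "max_context_tokens": 128000},
    "llama3:8b": {"compression_threshold_tokens": 4000, "max_context_tokens": 8192},
}

def effective_thresholds(model_id: str, trigger: int = 64000, cap: int = 128000):
    """Return (compression trigger, hard cap) for a model, honoring overrides."""
    override = model_thresholds.get(model_id, {})
    return (
        override.get("compression_threshold_tokens", trigger),
        override.get("max_context_tokens", cap),
    )
```

Models without an override simply inherit the global `compression_threshold_tokens` / `max_context_tokens` values.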
---
## Configuration
## ⭐ Support
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `compression_threshold_tokens` | integer | `64000` | Trigger compression above this token count |
| `max_context_tokens` | integer | `128000` | Hard limit for context |
| `keep_first` | integer | `1` | Always keep the first N messages |
| `keep_last` | integer | `6` | Always keep the last N messages |
| `summary_model` | string | `None` | Model to use for summarization |
| `summary_model_max_context` | integer | `0` | Max context tokens for summary model |
| `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary |
| `enable_tool_output_trimming` | boolean | `false` | Enable trimming of large tool outputs |
If this plugin has been useful, a star on [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) is a big motivation for me. Thank you for the support.
---
## Troubleshooting ❓
## Example
- **Initial system prompt is lost**: Keep `keep_first` greater than 0 to protect the initial message.
- **Compression effect is weak**: Raise `compression_threshold_tokens` or lower `keep_first` / `keep_last` to allow more aggressive compression.
- **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues)
### Before Compression
## Changelog
```
[Message 1] User: Tell me about Python...
[Message 2] AI: Python is a programming language...
[Message 3] User: What about its history?
[Message 4] AI: Python was created by Guido...
[Message 5] User: And its features?
[Message 6] AI: Python has many features...
... (many more messages)
[Message 20] User: Current question
```
### After Compression
```
[Summary] Previous conversation covered Python basics,
history, features, and common use cases...
[Message 18] User: Recent question about decorators
[Message 19] AI: Decorators in Python are...
[Message 20] User: Current question
```
---
## Requirements
!!! note "Prerequisites"
    - OpenWebUI v0.3.0 or later
    - Access to an LLM for summarization

!!! tip "Best Practices"
    - Set appropriate token thresholds based on your model's context window
    - Preserve more recent messages for technical discussions
    - Test compression settings in non-critical conversations first
---
## Troubleshooting
??? question "Compression not triggering?"
    Check if the token count exceeds your configured threshold. Enable debug logging for more details.

??? question "Important context being lost?"
    Increase the `preserve_recent` setting or lower the compression ratio.
---
## Source Code
[:fontawesome-brands-github: View on GitHub](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression){ .md-button }
See the full history on GitHub: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions)


@@ -1,137 +1,119 @@
# 异步上下文压缩过滤器
<span class="category-badge filter">Filter</span>
<span class="version-badge">v1.3.0</span>
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.3.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
> **重要提示**:为了确保所有过滤器的可维护性和易用性,每个过滤器都应附带清晰、完整的文档,以确保其功能、配置和使用方法得到充分说明。
本过滤器通过智能摘要和消息压缩技术,在保持对话连贯性的同时,显著降低长对话的 Token 消耗。
## 1.3.0 版本更新
- **国际化 (i18n) 支持**: 完成了所有用户可见消息的本地化,现已原生支持 9 种语言(含中、英、日、韩及欧洲主要语言)。
- **智能状态显示**: 新增 `token_usage_status_threshold` 阀门(默认 80%),可以智能控制何时显示 Token 用量状态,减少不必要的打扰。
- **性能大幅优化**: 对前端语言检测和日志处理流程进行了非阻塞重构,完全不影响首字节响应时间(TTFB),保持毫秒级极速推流。
- **Copilot SDK 兼容**: 自动检测并跳过基于 `copilot_sdk` 模型的上下文压缩,避免冲突。
- **配置项调整**: 为了提供更安静的生产环境体验,`debug_mode` 现已默认设置为 `false`。
---
## 概览
Async Context Compression 过滤器通过以下方式帮助管理长对话的 token 使用:
- 智能总结较早的消息
- 保留关键信息
- 降低 API 成本
- 保持对话一致性
特别适用于:
- 长时间会话
- 多轮复杂讨论
- 成本优化
- 上下文长度控制
## 核心特性
- ✅ **全方位国际化**: 原生支持 9 种界面语言。
- ✅ **自动压缩**: 基于 Token 阈值自动触发上下文压缩。
- ✅ **异步摘要**: 后台生成摘要,不阻塞当前对话响应。
- ✅ **持久化存储**: 复用 Open WebUI 共享数据库连接,自动支持 PostgreSQL/SQLite 等。
- ✅ **灵活保留策略**: 可配置保留对话头部和尾部消息,确保关键信息连贯。
- ✅ **智能注入**: 将历史摘要智能注入到新上下文中。
- ✅ **结构感知裁剪**: 智能折叠过长消息,保留文档骨架(标题、首尾)。
- ✅ **原生工具输出裁剪**: 支持裁剪冗长的工具调用输出。
- ✅ **实时监控**: 实时监控上下文使用情况,超过 90% 发出警告。
- ✅ **详细日志**: 提供精确的 Token 统计日志,便于调试。
- ✅ **智能模型匹配**: 自定义模型自动继承基础模型的阈值配置。
- ✅ **多模态支持**: 图片内容会被保留,但其 Token **不参与计算**。请相应调整阈值。
## 功能特性
- :material-arrow-collapse-vertical: **智能压缩**:AI 驱动的上下文摘要
- :material-clock-fast: **异步处理**:后台非阻塞压缩
- :material-memory: **保留上下文**:尽量保留重要信息
- :material-currency-usd-off: **降低成本**:减少 token 使用
- :material-console: **前端调试**:支持浏览器控制台日志
- :material-alert-circle-check: **增强错误报告**:清晰的错误状态通知
- :material-check-all: **Open WebUI v0.7.x 兼容性**:动态数据库会话处理
- :material-account-convert: **兼容性提升**:摘要角色改为 `assistant`
- :material-shield-check: **稳定性增强**:解决状态管理竞态条件
- :material-ruler: **预检上下文检查**:发送前验证上下文是否超限
- :material-format-align-justify: **结构感知裁剪**:保留文档结构的智能裁剪
- :material-content-cut: **原生工具输出裁剪**:自动裁剪冗长的工具输出(注意:非原生工具调用输出不会完整注入上下文)
- :material-chart-bar: **详细 Token 日志**:提供细粒度的 Token 统计
- :material-account-search: **智能模型匹配**:自定义模型自动继承基础模型配置
- :material-image-off: **多模态支持**:图片内容保留但 Token **不参与计算**
详细的工作原理和流程请参考 [工作流程指南](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/WORKFLOW_GUIDE_CN.md)。
---
## 安装
## 安装与配置
1. 下载插件文件:[`async_context_compression.py`](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression)
2. 上传到 OpenWebUI:**Admin Panel** → **Settings** → **Functions**
3. 配置压缩参数
4. 启用过滤器
### 1. 数据库(自动)
- 自动使用 Open WebUI 的共享数据库连接,**无需额外配置**。
- 首次运行自动创建 `chat_summary` 表。
### 2. 过滤器顺序
- 建议顺序:前置过滤器(<10)→ 本过滤器(10)→ 后置过滤器(>10)
---
## 工作原理
```mermaid
graph TD
    A[Incoming Messages] --> B{Token Count > Threshold?}
    B -->|No| C[Pass Through]
    B -->|Yes| D[Summarize Older Messages]
    D --> E[Preserve Recent Messages]
    E --> F[Combine Summary + Recent]
    F --> G[Send to LLM]
```
## 配置参数
您可以在过滤器的设置中调整以下参数:
### 核心参数
| 参数 | 默认值 | 描述 |
| :----------------------------- | :------- | :------------------------------------------------------------------------------------ |
| `priority` | `10` | 过滤器执行顺序,数值越小越先执行。 |
| `compression_threshold_tokens` | `64000` | **重要**: 当上下文总 Token 超过此值时后台生成摘要,建议设为模型上下文窗口的 50%-70%。 |
| `max_context_tokens` | `128000` | **重要**: 上下文硬上限,超过即移除最早消息(保留受保护消息)。 |
| `keep_first` | `1` | 始终保留对话开始的 N 条消息,保护系统提示或环境变量。 |
| `keep_last` | `6` | 始终保留对话末尾的 N 条消息,确保最近上下文连贯。 |
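下面是一个基于上述阀门语义的极简示意(假设性代码,并非插件源码),演示 `keep_first` / `keep_last` 的保护区间,以及超过 `max_context_tokens` 硬上限时从最早的未保护消息开始移除:

```python
# 示意代码(非插件源码):保护头尾消息,超限时按时间顺序移除最早的未保护消息。

def enforce_hard_limit(messages, token_counts, max_context_tokens,
                       keep_first=1, keep_last=6):
    n = len(messages)
    # keep_first 条头部消息与 keep_last 条尾部消息永远受保护
    protected = set(range(keep_first)) | set(range(n - keep_last, n))
    total = sum(token_counts)
    kept = []
    for i in range(n):
        if i not in protected and total > max_context_tokens:
            total -= token_counts[i]  # 丢弃该消息,释放其 Token
            continue
        kept.append(i)
    return [messages[i] for i in kept]
```

例如 10 条消息、各 10 Token、上限 85 时,会依次移除第 2、3 条消息(索引 1、2),头部与最近 6 条保持不变。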
### 摘要生成配置
| 参数 | 默认值 | 描述 |
| :-------------------- | :------ | :------------------------------------------------------------------------------------------------------------------------------------------ |
| `summary_model` | `None` | 用于生成摘要的模型 ID。**强烈建议**配置快速、经济、上下文窗口大的模型(如 `gemini-2.5-flash``deepseek-v3`)。留空则尝试复用当前对话模型。 |
| `summary_model_max_context` | `0` | 摘要模型的最大上下文 Token 数。如果为 0则回退到 `model_thresholds` 或全局 `max_context_tokens`。 |
| `max_summary_tokens` | `16384` | 生成摘要时允许的最大 Token 数。 |
| `summary_temperature` | `0.1` | 控制摘要生成的随机性,较低的值结果更稳定。 |
### 高级配置
#### `model_thresholds` (模型特定阈值)
这是一个字典配置,可为特定模型 ID 覆盖全局 `compression_threshold_tokens``max_context_tokens`,适用于混合不同上下文窗口的模型。
**默认包含 GPT-4、Claude 3.5、Gemini 1.5/2.0、Qwen 2.5/3、DeepSeek V3 等推荐阈值。**
**配置示例:**
```json
{
"gpt-4": {
"compression_threshold_tokens": 8000,
"max_context_tokens": 32000
},
"gemini-2.5-flash": {
"compression_threshold_tokens": 734000,
"max_context_tokens": 1048576
}
}
```
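阈值解析的回退逻辑可以用如下示意代码理解(假设性实现,非插件源码):命中 `model_thresholds` 中的模型 ID 时使用覆盖值,自定义模型按基础模型 ID 前缀继承,否则回退到全局默认。

```python
# 示意代码(非插件源码):按模型 ID 解析阈值,未命中时回退全局默认。

def resolve_thresholds(model_id, model_thresholds,
                       default_compress=64000, default_max=128000):
    cfg = model_thresholds.get(model_id)
    if cfg is None:
        # 自定义模型按基础模型 ID 前缀继承配置(示意)
        for base_id, base_cfg in model_thresholds.items():
            if model_id.startswith(base_id):
                cfg = base_cfg
                break
    cfg = cfg or {}
    return (cfg.get("compression_threshold_tokens", default_compress),
            cfg.get("max_context_tokens", default_max))
```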
---
## 配置项
| 选项 | 类型 | 默认值 | 说明 |
|--------|------|---------|-------------|
| `compression_threshold_tokens` | integer | `64000` | 超过该 token 数触发压缩 |
| `max_context_tokens` | integer | `128000` | 上下文硬性上限 |
| `keep_first` | integer | `1` | 始终保留的前 N 条消息 |
| `keep_last` | integer | `6` | 始终保留的后 N 条消息 |
| `summary_model` | string | `None` | 用于摘要的模型 |
| `summary_model_max_context` | integer | `0` | 摘要模型的最大上下文 Token 数 |
| `max_summary_tokens` | integer | `16384` | 摘要的最大 token 数 |
| `enable_tool_output_trimming` | boolean | `false` | 启用长工具输出裁剪 |
| 参数 | 默认值 | 描述 |
| :----------------------------- | :------- | :-------------------------------------------------------------------------------------------------------------------------------------- |
| `enable_tool_output_trimming` | `false` | 启用时,若 `function_calling: "native"` 激活,将裁剪冗长的工具输出以仅提取最终答案。 |
| `debug_mode` | `false` | 是否在 Open WebUI 的控制台日志中打印详细的调试信息。生产环境默认且建议设为 `false`。 |
| `show_debug_log` | `false` | 是否在浏览器控制台 (F12) 打印调试日志。便于前端调试。 |
| `show_token_usage_status` | `true` | 是否在对话结束时显示 Token 使用情况的状态通知。 |
| `token_usage_status_threshold` | `80` | 触发显示上下文用量状态通知的最低百分比阈值 (0-100)。 |
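`show_token_usage_status` 与 `token_usage_status_threshold` 的配合方式大致如下(示意代码,非插件源码):只有在用量百分比达到阈值时才发送状态通知。

```python
# 示意代码(非插件源码):按阈值决定是否显示 Token 用量状态通知。

def should_show_status(tokens, max_tokens,
                       show_token_usage_status=True,
                       token_usage_status_threshold=80):
    if not show_token_usage_status or max_tokens <= 0:
        return False
    ratio = tokens * 100 // max_tokens  # 用量百分比(向下取整)
    return ratio >= token_usage_status_threshold
```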
---
## 示例
### 压缩前
```
[Message 1] User: Tell me about Python...
[Message 2] AI: Python is a programming language...
[Message 3] User: What about its history?
[Message 4] AI: Python was created by Guido...
[Message 5] User: And its features?
[Message 6] AI: Python has many features...
... (many more messages)
[Message 20] User: Current question
```
### 压缩后
```
[Summary] Previous conversation covered Python basics,
history, features, and common use cases...
[Message 18] User: Recent question about decorators
[Message 19] AI: Decorators in Python are...
[Message 20] User: Current question
```
---
## 运行要求
!!! note "前置条件"
    - OpenWebUI v0.3.0 及以上
    - 需要可用的 LLM 用于摘要
!!! tip "最佳实践"
    - 根据模型上下文窗口设置合适的 token 阈值
    - 技术讨论可适当提高 `keep_last`
    - 先在非关键对话中测试压缩效果
---
## 故障排除 (Troubleshooting) ❓
- **初始系统提示丢失**:将 `keep_first` 设置为大于 0。
- **压缩效果不明显**:提高 `compression_threshold_tokens`,或降低 `keep_first` / `keep_last` 以增强压缩力度。
- **提交 Issue**:如果遇到任何问题,请在 GitHub 上提交 Issue:[OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues)
## 常见问题
??? question "没有触发压缩?"
    检查 token 数是否超过配置的阈值,并开启调试日志了解细节。
??? question "重要上下文丢失?"
    提高 `keep_last` 或降低压缩比例。
---
## 更新日志
完整历史请查看 GitHub 项目:[OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions)
---
## ⭐ 支持
如果这个插件对你有帮助,欢迎到 [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) 点个 Star!这将是我持续改进的动力,感谢支持。
---
## 源码
[:fontawesome-brands-github: 在 GitHub 查看](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression){ .md-button }


@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:
Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.
**Version:** 1.2.2
**Version:** 1.3.0
[:octicons-arrow-right-24: Documentation](async-context-compression.md)


@@ -22,7 +22,7 @@ Filter 充当消息管线中的中间件:
通过智能总结减少长对话的 token 消耗,同时保持连贯性。
**版本:** 1.2.2
**版本:** 1.3.0
[:octicons-arrow-right-24: 查看文档](async-context-compression.md)


@@ -1,6 +1,6 @@
# GitHub Copilot SDK Pipe for OpenWebUI
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.6.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.7.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that integrates the official [GitHub Copilot SDK](https://github.com/github/copilot-sdk). It enables you to use **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`,`gemini-3-pro`, `gpt-5-mini`) **AND** your own models via **BYOK** (OpenAI, Anthropic) directly within OpenWebUI, providing a unified agentic experience with **strict User & Chat-level Workspace Isolation**.
@@ -14,12 +14,13 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
---
## ✨ v0.6.2 Updates (What's New)
## ✨ v0.7.0 Updates (What's New)
- **🛠️ New Workspace Artifacts Tool**: Introduced `publish_file_from_workspace`. Agents can now generate files (e.g., Python-generated Excel/CSV) and provide direct download links for the user to click and save.
- **⚙️ Workflow Optimization**: Improved reliability of the internal agentic workspace management.
- **🛡️ Enhanced Security**: Refined access control for system resources within the isolated environment.
- **🔧 Performance Tuning**: Optimized stream processing for larger context windows.
- **🚀 Integrated CLI Management**: The Copilot CLI is now automatically managed and bundled via the `github-copilot-sdk` pip package. (v0.7.0)
- **🧠 Native Tool Call UI**: Full adaptation to **OpenWebUI's native tool call UI** and thinking process visualization. (v0.7.0)
- **🏠 OpenWebUI v0.8.0+ Fix**: Resolved "Error getting file content" download failure by switching to absolute path registration for published files. (v0.7.0)
- **🌐 Comprehensive Multi-language Support**: Native localization for status messages in 11 languages (EN, ZH, JA, KO, FR, DE, ES, IT, RU, VI, ID). (v0.7.0)
- **🧹 Architecture Cleanup**: Refactored core setup and optimized reasoning status display for a leaner experience. (v0.7.0)
---
@@ -31,8 +32,8 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
- **♾️ Infinite Session Management**: Smart context window management with automatic compaction for indefinite conversation capability.
- **🧠 Deep Database Integration**: Real-time persistence of TODO lists for long-running workflows.
- **🌊 Advanced Streaming**: Full support for thinking process/Chain of Thought visualization.
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support.
- **⚡ Full-Lifecycle File Agent**: Supports receiving uploaded files for raw bypass analysis and publishing results (Excel/reports) as downloadable links.
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support (bypasses RAG for direct binary access).
- **📤 Workspace Artifacts (`publish_file_from_workspace`)**: Agents can generate files (Excel, CSV, HTML reports, etc.) and provide **persistent download links** directly in the chat.
- **🖼️ Interactive Artifacts**: Automatically renders HTML/JS apps generated by the agent directly in the chat interface.
---
@@ -110,7 +111,7 @@ If this plugin has been useful, a **Star** on [OpenWebUI Extensions](https://git
- **Agent ignores files?**: Ensure the Files Filter is enabled, otherwise RAG will interfere with raw binaries.
- **No progress bar?**: The bar only appears when the Agent uses the `update_todo` tool.
- **Dependencies**: This Pipe automatically installs `github-copilot-sdk` (Python) and `github-copilot-cli` (Binary).
- **Dependencies**: This Pipe automatically manages `github-copilot-sdk` (Python) and utilizes the bundled binary CLI. No manual install required.
---


@@ -1,6 +1,6 @@
# GitHub Copilot SDK 官方管道
**作者:** [Fu-Jie](https://github.com/Fu-Jie) | **版本:** 0.6.2 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
**作者:** [Fu-Jie](https://github.com/Fu-Jie) | **版本:** 0.7.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
这是一个用于 [OpenWebUI](https://github.com/open-webui/open-webui) 的高级 Pipe 函数,深度集成了 **GitHub Copilot SDK**。它不仅支持 **GitHub Copilot 官方模型**(如 `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`),还支持 **BYOK (自带 Key)** 模式对接自定义服务商OpenAI, Anthropic并具备**严格的用户与会话级工作区隔离**能力,提供统一且安全的 Agent 交互体验。
@@ -14,12 +14,13 @@
---
## ✨ 0.6.2 更新内容 (What's New)
## ✨ 0.7.0 更新内容 (What's New)
- **🛠️ 新增工作区产物工具**: 引入 `publish_file_from_workspace`。Agent 现在可以生成物理文件(如使用 Python 生成的 Excel/CSV 报表),并直接在聊天界面提供点击下载链接。
- **⚙️ 工作流优化**: 提升了内部 Agent 物理工作区管理的可靠性与原子性。
- **🛡️ 安全增强**: 精细化了隔离环境下系统资源的访问控制策略。
- **🔧 性能微调**: 针对大上下文窗口优化了流式数据处理性能。
- **🚀 CLI 免维护集成**: Copilot CLI 现在通过 `github-copilot-sdk` pip 包自动同步管理,彻底告别手动 `curl | bash` 安装问题。(v0.7.0)
- **🧠 原生工具调用 UI**: 全面适配 **OpenWebUI 原生工具调用 UI** 与模型思考过程(思维链)展示。(v0.7.0)
- **🏠 OpenWebUI v0.8.0+ 兼容性修复**: 通过切换为绝对路径注册发布文件彻底解决了“Error getting file content”无法下载到本地的问题。(v0.7.0)
- **🌐 全面的多语言支持**: 针对状态消息进行了 11 国语言的原生本地化 (中/英/日/韩/法/德/西/意/俄/越/印尼)。(v0.7.0)
- **🧹 架构精简**: 重构了初始化逻辑并优化了推理状态显示,提供更轻量稳健的体验。(v0.7.0)
---
@@ -31,8 +32,8 @@
- **♾️ 无限会话管理**: 智能上下文窗口管理与自动压缩算法,支持无限时长的对话交互。
- **🧠 深度数据库集成**: 实时持久化 TODO 列表到 UI 进度条。
- **🌊 深度推理展示**: 完整支持模型思考过程 (Thinking Process) 的流式渲染。
- **🖼️ 智能多模态**: 完整支持图像识别与附件上传分析。
- **⚡ 全生命周期文件 Agent**: 支持接收上传文件进行绕过 RAG 的深度分析,并将处理结果(如 Excel/报告)发布为下载链接实现闭环
- **🖼️ 智能多模态**: 完整支持图像识别与附件上传分析(绕过 RAG 直接访问原始二进制内容)
- **📤 工作区产物工具 (`publish_file_from_workspace`)**: Agent 可生成文件Excel、CSV、HTML 报告等)并直接提供**持久化下载链接**。管理员还可额外获得通过 `/content/html` 接口的**聊天内 HTML 预览**链接
- **🖼️ 交互式伪影 (Artifacts)**: 自动渲染 Agent 生成的 HTML/JS 应用程序,直接在聊天界面交互。
---
@@ -95,7 +96,7 @@
### 1) 导入函数
1. 打开 OpenWebUI前往 **工作区** -> **函数**
2. 点击 **+** (创建函数),完整粘贴 `github_copilot_sdk_cn.py` 的内容。
2. 点击 **+** (创建函数),完整粘贴 `github_copilot_sdk.py` 的内容。
3. 点击保存并确保已启用。
### 2) 获取 Token (Get Token)
@@ -110,7 +111,7 @@
- **Agent 无法识别文件?**: 请确保已安装并启用了 Files Filter 插件,否则原始文件会被 RAG 干扰。
- **看不到 TODO 进度条?**: 进度条仅在 Agent 使用 `update_todo` 工具(通常是处理复杂任务)时出现。
- **依赖安装**: 本管道会自动尝试安装 `github-copilot-sdk` (Python 包) 与 `github-copilot-cli` (官方二进制)。
- **依赖安装**: 本管道会自动管理 `github-copilot-sdk` (Python 包) 并优先直接使用内置的二进制 CLI无需手动干预
---


@@ -15,7 +15,7 @@ Pipes allow you to:
## Available Pipe Plugins
- [GitHub Copilot SDK](github-copilot-sdk.md) (v0.6.2) - Official GitHub Copilot SDK integration. Features **Workspace Isolation**, **Database Persistence**, **Zero-config OpenWebUI Tool Bridge**, **BYOK** support, and **dynamic MCP discovery**. Supports streaming, multimodal, and infinite sessions. [View Deep Dive](github-copilot-sdk-deep-dive.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.md).
- [GitHub Copilot SDK](github-copilot-sdk.md) (v0.7.0) - Official GitHub Copilot SDK integration. Features **Workspace Isolation**, **Database Persistence**, **Zero-config OpenWebUI Tool Bridge**, **BYOK** support, and **dynamic MCP discovery**. Supports streaming, multimodal, and infinite sessions. [View Deep Dive](github-copilot-sdk-deep-dive.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.md).
- **[Case Study: GitHub 100 Star Growth Analysis](star-prediction-example.md)** - Learn how to use the GitHub Copilot SDK Pipe with Minimax 2.1 to automatically analyze CSV data and generate project growth reports.
- **[Case Study: High-Quality Video to GIF Conversion](video-processing-example.md)** - See how the model uses system-level FFmpeg to accelerate, scale, and optimize colors for screen recordings.


@@ -15,7 +15,7 @@ Pipes 可以用于:
## 可用的 Pipe 插件
- [GitHub Copilot SDK](github-copilot-sdk.zh.md) (v0.6.2) - GitHub Copilot SDK 官方集成。具备**工作区安全隔离**、**数据库持久化**、**零配置工具桥接**与**BYOK (自带 Key) 支持**。支持流式输出、打字机思考过程及无限会话。[查看深度架构解析](github-copilot-sdk-deep-dive.zh.md) | [**查看进阶实战教程**](github-copilot-sdk-tutorial.zh.md)。
- [GitHub Copilot SDK](github-copilot-sdk.zh.md) (v0.7.0) - GitHub Copilot SDK 官方集成。具备**工作区安全隔离**、**数据库持久化**、**零配置工具桥接**与**BYOK (自带 Key) 支持**。支持流式输出、打字机思考过程及无限会话。[查看深度架构解析](github-copilot-sdk-deep-dive.zh.md) | [**查看进阶实战教程**](github-copilot-sdk-tutorial.zh.md)。
- **[实战案例GitHub 100 Star 增长预测](star-prediction-example.zh.md)** - 展示如何使用 GitHub Copilot SDK Pipe 结合 Minimax 2.1 模型,自动编写脚本分析 CSV 数据并生成详细的项目增长报告。
- **[实战案例:视频高质量 GIF 转换与加速](video-processing-example.zh.md)** - 演示模型如何通过底层 FFmpeg 工具对录屏进行加速、缩放及双阶段色彩优化处理。

Binary file not shown (image updated: 752 KiB → 200 KiB)


@@ -9,6 +9,7 @@ icon_url: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAw
description: Intelligently analyzes text content and generates interactive mind maps to help users structure and visualize knowledge.
"""
import asyncio
import logging
import os
import re
@@ -693,7 +694,7 @@ CSS_TEMPLATE_MINDMAP = """
.content-area {
padding: 0;
flex: 1 1 0;
background: transparent;
background: var(--card-bg-color);
position: relative;
overflow: hidden;
width: 100%;
@@ -1514,6 +1515,7 @@ class Action:
self,
__user__: Optional[Dict[str, Any]],
__event_call__: Optional[Callable[[Any], Awaitable[None]]] = None,
__request__: Optional[Request] = None,
) -> Dict[str, str]:
"""Extract basic user context with safe fallbacks."""
if isinstance(__user__, (list, tuple)):
@@ -1528,20 +1530,36 @@ class Action:
# Default from profile
user_language = user_data.get("language", "en-US")
# Priority: Document Lang > LocalStorage (Frontend) > Browser > Profile (Default)
# Level 1 Fallback: Accept-Language from __request__ headers
if (
__request__
and hasattr(__request__, "headers")
and "accept-language" in __request__.headers
):
raw_lang = __request__.headers.get("accept-language", "")
if raw_lang:
user_language = raw_lang.split(",")[0].split(";")[0]
# Priority: Document Lang > LocalStorage (Frontend) > Browser > Request Header > Profile
if __event_call__:
try:
js_code = """
return (
document.documentElement.lang ||
localStorage.getItem('locale') ||
localStorage.getItem('language') ||
navigator.language ||
'en-US'
);
try {
return (
document.documentElement.lang ||
localStorage.getItem('locale') ||
localStorage.getItem('language') ||
navigator.language ||
'en-US'
);
} catch (e) {
return 'en-US';
}
"""
frontend_lang = await __event_call__(
{"type": "execute", "data": {"code": js_code}}
# Use asyncio.wait_for to prevent hanging if frontend fails to callback
frontend_lang = await asyncio.wait_for(
__event_call__({"type": "execute", "data": {"code": js_code}}),
timeout=2.0,
)
if frontend_lang and isinstance(frontend_lang, str):
user_language = frontend_lang
@@ -2204,7 +2222,7 @@ class Action:
flex-grow: 1;
position: relative;
overflow: hidden;
background: transparent;
background: var(--card-bg-color);
min-height: 0;
width: 100%;
height: 100%;
@@ -2387,7 +2405,7 @@ class Action:
__request__: Optional[Request] = None,
) -> Optional[dict]:
logger.info("Action: Smart Mind Map (v1.0.0) started")
user_ctx = await self._get_user_context(__user__, __event_call__)
user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
user_language = user_ctx["user_language"]
user_name = user_ctx["user_name"]
user_id = user_ctx["user_id"]


@@ -1,18 +1,22 @@
# Async Context Compression Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.2.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.
## What's new in 1.2.2
## What's new in 1.3.0
- **Critical Fix**: Resolved `TypeError: 'str' object is not callable` caused by variable name conflict in logging function.
- **Compatibility**: Enhanced `params` handling to support Pydantic objects, improving compatibility with different OpenWebUI versions.
- **Internationalization (i18n)**: Complete localization of user-facing messages across 9 languages (English, Chinese, Japanese, Korean, and major European languages).
- **Smart Status Display**: Added `token_usage_status_threshold` valve (default 80%) to intelligently control when token usage status is shown.
- **Improved Performance**: Frontend language detection and logging are optimized to be completely non-blocking, maintaining lightning-fast TTFB.
- **Copilot SDK Integration**: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
- **Configuration**: `debug_mode` is now set to `false` by default for a quieter production experience.
---
## Core Features
- ✅ **Full i18n Support**: Native localization across 9 languages.
- ✅ Automatic compression triggered by token thresholds.
- ✅ Asynchronous summarization that does not block chat responses.
- ✅ Persistent storage via Open WebUI's shared database connection (PostgreSQL, SQLite, etc.).
@@ -55,8 +59,10 @@ This filter reduces token consumption in long conversations through intelligent
| `summary_temperature` | `0.3` | Randomness for summary generation. Lower is more deterministic. |
| `model_thresholds` | `{}` | Per-model overrides for `compression_threshold_tokens` and `max_context_tokens` (useful for mixed models). |
| `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. |
| `debug_mode` | `true` | Log verbose debug info. Set to `false` in production. |
| `debug_mode` | `false` | Log verbose debug info. Set to `false` in production. |
| `show_debug_log` | `false` | Print debug logs to browser console (F12). Useful for frontend debugging. |
| `show_token_usage_status` | `true` | Show token usage status notification in the chat interface. |
| `token_usage_status_threshold` | `80` | The minimum usage percentage (0-100) required to show a context usage status notification. |
---


@@ -1,20 +1,24 @@
# 异步上下文压缩过滤器
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.2.2 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.3.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
> **重要提示**:为了确保所有过滤器的可维护性和易用性,每个过滤器都应附带清晰、完整的文档,以确保其功能、配置和使用方法得到充分说明。
本过滤器通过智能摘要和消息压缩技术,在保持对话连贯性的同时,显著降低长对话的 Token 消耗。
## 1.2.2 版本更新
## 1.3.0 版本更新
- **严重错误修复**: 解决了因日志函数变量名冲突导致的 `TypeError: 'str' object is not callable` 错误
- **兼容性增强**: 改进了 `params` 处理逻辑以支持 Pydantic 对象,提高了对不同 OpenWebUI 版本的兼容性
- **国际化 (i18n) 支持**: 完成了所有用户可见消息的本地化,现已原生支持 9 种语言(含中、英、日、韩及欧洲主要语言)。
- **智能状态显示**: 新增 `token_usage_status_threshold` 阀门(默认 80%),可以智能控制何时显示 Token 用量状态,减少不必要的打扰。
- **性能大幅优化**: 对前端语言检测和日志处理流程进行了非阻塞重构,完全不影响首字节响应时间(TTFB),保持毫秒级极速推流。
- **Copilot SDK 兼容**: 自动检测并跳过基于 `copilot_sdk` 模型的上下文压缩,避免冲突。
- **配置项调整**: 为了提供更安静的生产环境体验,`debug_mode` 现已默认设置为 `false`。
---
## 核心特性
- ✅ **全方位国际化**: 原生支持 9 种界面语言。
- ✅ **自动压缩**: 基于 Token 阈值自动触发上下文压缩。
- ✅ **异步摘要**: 后台生成摘要,不阻塞当前对话响应。
- ✅ **持久化存储**: 复用 Open WebUI 共享数据库连接,自动支持 PostgreSQL/SQLite 等。
@@ -27,7 +31,7 @@
- ✅ **智能模型匹配**: 自定义模型自动继承基础模型的阈值配置。
- ✅ **多模态支持**: 图片内容会被保留,但其 Token **不参与计算**。请相应调整阈值。
详细的工作原理和流程请参考 [工作流程指南](WORKFLOW_GUIDE_CN.md)。
详细的工作原理和流程请参考 [工作流程指南](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/WORKFLOW_GUIDE_CN.md)。
---
@@ -93,9 +97,10 @@
| 参数 | 默认值 | 描述 |
| :----------------------------- | :------- | :-------------------------------------------------------------------------------------------------------------------------------------- |
| `enable_tool_output_trimming` | `false` | 启用时,若 `function_calling: "native"` 激活,将裁剪冗长的工具输出以仅提取最终答案。 |
| `debug_mode` | `true` | 是否在 Open WebUI 的控制台日志中打印详细的调试信息(如 Token 计数、压缩进度、数据库操作等)。生产环境建议设为 `false`。 |
| `debug_mode` | `false` | 是否在 Open WebUI 的控制台日志中打印详细的调试信息。生产环境默认且建议设为 `false`。 |
| `show_debug_log` | `false` | 是否在浏览器控制台 (F12) 打印调试日志。便于前端调试。 |
| `show_token_usage_status` | `true` | 是否在对话结束时显示 Token 使用情况的状态通知。 |
| `token_usage_status_threshold` | `80` | 触发显示上下文用量状态通知的最低百分比阈值 (0-100)。 |
---


@@ -5,17 +5,17 @@ author: Fu-Jie
author_url: https://github.com/Fu-Jie/openwebui-extensions
funding_url: https://github.com/open-webui
description: Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression.
version: 1.2.2
version: 1.3.0
openwebui_id: b1655bc8-6de9-4cad-8cb5-a6f7829a02ce
license: MIT
═══════════════════════════════════════════════════════════════════════════════
📌 What's new in 1.2.1
📌 What's new in 1.3.0
═══════════════════════════════════════════════════════════════════════════════
✅ Smart Configuration: Automatically detects base model settings for custom models and adds `summary_model_max_context` for independent summary limits.
✅ Performance & Refactoring: Optimized threshold parsing with caching and removed redundant code for better efficiency.
✅ Bug Fixes & Modernization: Fixed `datetime` deprecation warnings and corrected type annotations.
✅ Smart Status Display: Added `token_usage_status_threshold` valve (default 80%) to control when token usage status is shown, reducing unnecessary notifications.
✅ Copilot SDK Integration: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
✅ Improved User Experience: Status messages now only appear when token usage exceeds the configured threshold, keeping the interface cleaner.
═══════════════════════════════════════════════════════════════════════════════
📌 Overview
@@ -150,7 +150,7 @@ summary_temperature
Description: Controls the randomness of the summary generation. Lower values produce more deterministic output.
debug_mode
Default: true
Default: false
Description: Prints detailed debug information to the log. Recommended to set to `false` in production.
show_debug_log
@@ -268,6 +268,7 @@ import hashlib
import time
import contextlib
import logging
from functools import lru_cache
# Setup logger
logger = logging.getLogger(__name__)
@@ -391,6 +392,130 @@ class ChatSummary(owui_Base):
)
TRANSLATIONS = {
"en-US": {
"status_context_usage": "Context Usage (Estimated): {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_high_usage": " | ⚠️ High Usage",
"status_loaded_summary": "Loaded historical summary (Hidden {count} historical messages)",
"status_context_summary_updated": "Context Summary Updated: {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_generating_summary": "Generating context summary in background...",
"status_summary_error": "Summary Error: {error}",
"summary_prompt_prefix": "【Previous Summary: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n",
"summary_prompt_suffix": "\n\n---\nBelow is the recent conversation:",
"tool_trimmed": "... [Tool outputs trimmed]\n{content}",
"content_collapsed": "\n... [Content collapsed] ...\n",
},
"zh-CN": {
"status_context_usage": "上下文用量 (预估): {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_high_usage": " | ⚠️ 用量较高",
"status_loaded_summary": "已加载历史总结 (隐藏了 {count} 条历史消息)",
"status_context_summary_updated": "上下文总结已更新: {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_generating_summary": "正在后台生成上下文总结...",
"status_summary_error": "总结生成错误: {error}",
"summary_prompt_prefix": "【前情提要:以下是历史对话的总结,仅供上下文参考。请不要回复总结内容本身,直接回答之后最新的问题。】\n\n",
"summary_prompt_suffix": "\n\n---\n以下是最近的对话:",
"tool_trimmed": "... [工具输出已裁剪]\n{content}",
"content_collapsed": "\n... [内容已折叠] ...\n",
},
"zh-HK": {
"status_context_usage": "上下文用量 (預估): {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_high_usage": " | ⚠️ 用量較高",
"status_loaded_summary": "已載入歷史總結 (隱藏了 {count} 條歷史訊息)",
"status_context_summary_updated": "上下文總結已更新: {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_generating_summary": "正在後台生成上下文總結...",
"status_summary_error": "總結生成錯誤: {error}",
"summary_prompt_prefix": "【前情提要:以下是歷史對話的總結,僅供上下文參考。請不要回覆總結內容本身,直接回答之後最新的問題。】\n\n",
"summary_prompt_suffix": "\n\n---\n以下是最近的對話:",
"tool_trimmed": "... [工具輸出已裁剪]\n{content}",
"content_collapsed": "\n... [內容已折疊] ...\n",
},
"zh-TW": {
"status_context_usage": "上下文用量 (預估): {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_high_usage": " | ⚠️ 用量較高",
"status_loaded_summary": "已載入歷史總結 (隱藏了 {count} 條歷史訊息)",
"status_context_summary_updated": "上下文總結已更新: {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_generating_summary": "正在後台生成上下文總結...",
"status_summary_error": "總結生成錯誤: {error}",
"summary_prompt_prefix": "【前情提要:以下是歷史對話的總結,僅供上下文參考。請不要回覆總結內容本身,直接回答之後最新的問題。】\n\n",
"summary_prompt_suffix": "\n\n---\n以下是最近的對話:",
"tool_trimmed": "... [工具輸出已裁剪]\n{content}",
"content_collapsed": "\n... [內容已折疊] ...\n",
},
"ja-JP": {
"status_context_usage": "コンテキスト使用量 (推定): {tokens} / {max_tokens} トークン ({ratio}%)",
"status_high_usage": " | ⚠️ 使用量高",
"status_loaded_summary": "履歴の要約を読み込みました ({count} 件の履歴メッセージを非表示)",
"status_context_summary_updated": "コンテキストの要約が更新されました: {tokens} / {max_tokens} トークン ({ratio}%)",
"status_generating_summary": "バックグラウンドでコンテキスト要約を生成しています...",
"status_summary_error": "要約エラー: {error}",
"summary_prompt_prefix": "【これまでのあらすじ:以下は過去の会話の要約であり、コンテキストの参考としてのみ提供されます。要約の内容自体には返答せず、その後の最新の質問に直接答えてください。】\n\n",
"summary_prompt_suffix": "\n\n---\n以下は最近の会話です:",
"tool_trimmed": "... [ツールの出力をトリミングしました]\n{content}",
"content_collapsed": "\n... [コンテンツが折りたたまれました] ...\n",
},
"ko-KR": {
"status_context_usage": "컨텍스트 사용량 (예상): {tokens} / {max_tokens} 토큰 ({ratio}%)",
"status_high_usage": " | ⚠️ 사용량 높음",
"status_loaded_summary": "이전 요약 불러옴 ({count}개의 이전 메시지 숨김)",
"status_context_summary_updated": "컨텍스트 요약 업데이트됨: {tokens} / {max_tokens} 토큰 ({ratio}%)",
"status_generating_summary": "백그라운드에서 컨텍스트 요약 생성 중...",
"status_summary_error": "요약 오류: {error}",
"summary_prompt_prefix": "【이전 요약: 다음은 이전 대화의 요약이며 문맥 참고용으로만 제공됩니다. 요약 내용 자체에 답하지 말고 이후의 최신 질문에 직접 답하세요.】\n\n",
"summary_prompt_suffix": "\n\n---\n다음은 최근 대화입니다:",
"tool_trimmed": "... [도구 출력 잘림]\n{content}",
"content_collapsed": "\n... [내용 접힘] ...\n",
},
"fr-FR": {
"status_context_usage": "Utilisation du contexte (estimée) : {tokens} / {max_tokens} jetons ({ratio}%)",
"status_high_usage": " | ⚠️ Utilisation élevée",
"status_loaded_summary": "Résumé historique chargé ({count} messages d'historique masqués)",
"status_context_summary_updated": "Résumé du contexte mis à jour : {tokens} / {max_tokens} jetons ({ratio}%)",
"status_generating_summary": "Génération du résumé du contexte en arrière-plan...",
"status_summary_error": "Erreur de résumé : {error}",
"summary_prompt_prefix": "【Résumé précédent : Ce qui suit est un résumé de la conversation historique, fourni uniquement pour le contexte. Ne répondez pas au contenu du résumé lui-même ; répondez directement aux dernières questions.】\n\n",
"summary_prompt_suffix": "\n\n---\nVoici la conversation récente :",
"tool_trimmed": "... [Sorties d'outils coupées]\n{content}",
"content_collapsed": "\n... [Contenu réduit] ...\n",
},
"de-DE": {
"status_context_usage": "Kontextnutzung (geschätzt): {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_high_usage": " | ⚠️ Hohe Nutzung",
"status_loaded_summary": "Historische Zusammenfassung geladen ({count} historische Nachrichten ausgeblendet)",
"status_context_summary_updated": "Kontextzusammenfassung aktualisiert: {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_generating_summary": "Kontextzusammenfassung wird im Hintergrund generiert...",
"status_summary_error": "Zusammenfassungsfehler: {error}",
"summary_prompt_prefix": "【Vorherige Zusammenfassung: Das Folgende ist eine Zusammenfassung der historischen Konversation, die nur als Kontext dient. Antworten Sie nicht auf den Inhalt der Zusammenfassung selbst, sondern direkt auf die nachfolgenden neuesten Fragen.】\n\n",
"summary_prompt_suffix": "\n\n---\nHier ist die jüngste Konversation:",
"tool_trimmed": "... [Werkzeugausgaben gekürzt]\n{content}",
"content_collapsed": "\n... [Inhalt ausgeblendet] ...\n",
},
"es-ES": {
"status_context_usage": "Uso del contexto (estimado): {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_high_usage": " | ⚠️ Uso elevado",
"status_loaded_summary": "Resumen histórico cargado ({count} mensajes históricos ocultos)",
"status_context_summary_updated": "Resumen del contexto actualizado: {tokens} / {max_tokens} Tokens ({ratio}%)",
"status_generating_summary": "Generando resumen del contexto en segundo plano...",
"status_summary_error": "Error de resumen: {error}",
"summary_prompt_prefix": "【Resumen anterior: El siguiente es un resumen de la conversación histórica, proporcionado solo como contexto. No responda al contenido del resumen en sí; responda directamente a las preguntas más recientes.】\n\n",
"summary_prompt_suffix": "\n\n---\nA continuación se muestra la conversación reciente:",
"tool_trimmed": "... [Salidas de herramientas recortadas]\n{content}",
"content_collapsed": "\n... [Contenido contraído] ...\n",
},
"it-IT": {
"status_context_usage": "Utilizzo contesto (stimato): {tokens} / {max_tokens} Token ({ratio}%)",
"status_high_usage": " | ⚠️ Utilizzo elevato",
"status_loaded_summary": "Riepilogo storico caricato ({count} messaggi storici nascosti)",
"status_context_summary_updated": "Riepilogo contesto aggiornato: {tokens} / {max_tokens} Token ({ratio}%)",
"status_generating_summary": "Generazione riepilogo contesto in background...",
"status_summary_error": "Errore riepilogo: {error}",
"summary_prompt_prefix": "【Riepilogo precedente: Il seguente è un riepilogo della conversazione storica, fornito solo per contesto. Non rispondere al contenuto del riepilogo stesso; rispondi direttamente alle domande più recenti.】\n\n",
"summary_prompt_suffix": "\n\n---\nDi seguito è riportata la conversazione recente:",
"tool_trimmed": "... [Output degli strumenti tagliati]\n{content}",
"content_collapsed": "\n... [Contenuto compresso] ...\n",
},
}
# Global cache for tiktoken encoding
TIKTOKEN_ENCODING = None
if tiktoken:
@@ -400,6 +525,26 @@ if tiktoken:
logger.error(f"[Init] Failed to load tiktoken encoding: {e}")
@lru_cache(maxsize=1024)
def _get_cached_tokens(text: str) -> int:
"""Calculates tokens with LRU caching for exact string matches."""
if not text:
return 0
if TIKTOKEN_ENCODING:
try:
# tiktoken logic is relatively fast, but caching it based on exact string match
# turns O(N) encoding time to O(1) dictionary lookup for historical messages.
return len(TIKTOKEN_ENCODING.encode(text))
except Exception as e:
logger.warning(
f"[Token Count] tiktoken error: {e}, falling back to character estimation"
)
pass
# Fallback strategy: Rough estimation (1 token ≈ 4 chars)
return len(text) // 4
class Filter:
def __init__(self):
self.valves = self.Valves()
@@ -409,8 +554,105 @@ class Filter:
sessionmaker(bind=self._db_engine) if self._db_engine else None
)
self._model_thresholds_cache: Optional[Dict[str, Any]] = None
# Fallback mapping for variants not in TRANSLATIONS keys
self.fallback_map = {
"es-AR": "es-ES",
"es-MX": "es-ES",
"fr-CA": "fr-FR",
"en-CA": "en-US",
"en-GB": "en-US",
"en-AU": "en-US",
"de-AT": "de-DE",
}
self._init_database()
def _resolve_language(self, lang: str) -> str:
"""Resolve the best matching language code from the TRANSLATIONS dict."""
target_lang = lang
# 1. Direct match
if target_lang in TRANSLATIONS:
return target_lang
# 2. Variant fallback (explicit mapping)
if target_lang in self.fallback_map:
target_lang = self.fallback_map[target_lang]
if target_lang in TRANSLATIONS:
return target_lang
# 3. Base language fallback (e.g. fr-BE -> fr-FR)
if "-" in lang:
base_lang = lang.split("-")[0]
for supported_lang in TRANSLATIONS:
if supported_lang.startswith(base_lang + "-"):
return supported_lang
# 4. Final Fallback to en-US
return "en-US"
def _get_translation(self, lang: str, key: str, **kwargs) -> str:
"""Get translated string for the given language and key."""
target_lang = self._resolve_language(lang)
lang_dict = TRANSLATIONS.get(target_lang, TRANSLATIONS["en-US"])
text = lang_dict.get(key, TRANSLATIONS["en-US"].get(key, key))
if kwargs:
try:
text = text.format(**kwargs)
except Exception as e:
logger.warning(f"Translation formatting failed for {key}: {e}")
return text
async def _get_user_context(
self,
__user__: Optional[Dict[str, Any]],
__event_call__: Optional[Callable[[Any], Awaitable[None]]] = None,
) -> Dict[str, str]:
"""Extract basic user context with safe fallbacks."""
if isinstance(__user__, (list, tuple)):
user_data = __user__[0] if __user__ else {}
elif isinstance(__user__, dict):
user_data = __user__
else:
user_data = {}
user_id = user_data.get("id", "unknown_user")
user_name = user_data.get("name", "User")
user_language = user_data.get("language", "en-US")
if __event_call__:
try:
js_code = """
return (
document.documentElement.lang ||
localStorage.getItem('locale') ||
localStorage.getItem('language') ||
navigator.language ||
'en-US'
);
"""
frontend_lang = await asyncio.wait_for(
__event_call__({"type": "execute", "data": {"code": js_code}}),
timeout=1.0,
)
if frontend_lang and isinstance(frontend_lang, str):
user_language = frontend_lang
except asyncio.TimeoutError:
logger.warning(
"Failed to retrieve frontend language: Timeout (using fallback)"
)
except Exception as e:
logger.warning(
f"Failed to retrieve frontend language: {type(e).__name__}: {e}"
)
return {
"user_id": user_id,
"user_name": user_name,
"user_language": user_language,
}
def _parse_model_thresholds(self) -> Dict[str, Any]:
"""Parse model_thresholds string into a dictionary.
@@ -574,7 +816,7 @@ class Filter:
description="The temperature for summary generation.",
)
debug_mode: bool = Field(
default=True, description="Enable detailed logging for debugging."
default=False, description="Enable detailed logging for debugging."
)
show_debug_log: bool = Field(
default=False, description="Show debug logs in the frontend console"
@@ -582,6 +824,12 @@ class Filter:
show_token_usage_status: bool = Field(
default=True, description="Show token usage status notification"
)
token_usage_status_threshold: int = Field(
default=80,
ge=0,
le=100,
description="Only show token usage status when usage exceeds this percentage (0-100). Set to 0 to always show.",
)
enable_tool_output_trimming: bool = Field(
default=False,
description="Enable trimming of large tool outputs (only works with native function calling).",
@@ -654,20 +902,7 @@ class Filter:
def _count_tokens(self, text: str) -> int:
"""Counts the number of tokens in the text."""
if not text:
return 0
if TIKTOKEN_ENCODING:
try:
return len(TIKTOKEN_ENCODING.encode(text))
except Exception as e:
if self.valves.debug_mode:
logger.warning(
f"[Token Count] tiktoken error: {e}, falling back to character estimation"
)
# Fallback strategy: Rough estimation (1 token ≈ 4 chars)
return len(text) // 4
return _get_cached_tokens(text)
def _calculate_messages_tokens(self, messages: List[Dict]) -> int:
"""Calculates the total tokens for a list of messages."""
@@ -693,6 +928,20 @@ class Filter:
return total_tokens
def _estimate_messages_tokens(self, messages: List[Dict]) -> int:
"""Fast estimation of tokens based on character count (1/4 ratio)."""
total_chars = 0
for msg in messages:
content = msg.get("content", "")
if isinstance(content, list):
for part in content:
if isinstance(part, dict) and part.get("type") == "text":
total_chars += len(part.get("text", ""))
else:
total_chars += len(str(content))
return total_chars // 4
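The character-based estimator above (1 token ≈ 4 characters) can be exercised standalone, including the list-shaped multimodal content path; the messages below are hypothetical inputs for illustration:

```python
# Standalone sketch of the fast estimator: counts characters only,
# so it costs O(len) string scans instead of tiktoken encoding.
def estimate_messages_tokens(messages):
    total_chars = 0
    for msg in messages:
        content = msg.get("content", "")
        if isinstance(content, list):
            # Multimodal messages: count only the text parts,
            # skipping image_url and other non-text entries.
            for part in content:
                if isinstance(part, dict) and part.get("type") == "text":
                    total_chars += len(part.get("text", ""))
        else:
            total_chars += len(str(content))
    return total_chars // 4

msgs = [
    {"role": "user", "content": "a" * 40},
    {"role": "user", "content": [
        {"type": "text", "text": "b" * 20},
        {"type": "image_url", "image_url": {"url": "..."}},
    ]},
]
print(estimate_messages_tokens(msgs))  # (40 + 20) // 4 = 15
```

This is why the preflight checks later in the diff only trust the estimate when it sits well under the limit (the 85% margin): the 4-chars-per-token heuristic can undercount for CJK text, so the grey zone falls back to exact tiktoken counting.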
def _get_model_thresholds(self, model_id: str) -> Dict[str, int]:
"""Gets threshold configuration for a specific model.
@@ -830,11 +1079,13 @@ class Filter:
}})();
"""
await __event_call__(
{
"type": "execute",
"data": {"code": js_code},
}
asyncio.create_task(
__event_call__(
{
"type": "execute",
"data": {"code": js_code},
}
)
)
except Exception as e:
logger.error(f"Error emitting debug log: {e}")
@@ -876,17 +1127,55 @@ class Filter:
js_code = f"""
console.log("%c[Compression] {safe_message}", "{css}");
"""
# Add timeout to prevent blocking if frontend connection is broken
await asyncio.wait_for(
event_call({"type": "execute", "data": {"code": js_code}}),
timeout=2.0,
)
except asyncio.TimeoutError:
logger.warning(
f"Failed to emit log to frontend: Timeout (connection may be broken)"
asyncio.create_task(
event_call({"type": "execute", "data": {"code": js_code}})
)
except Exception as e:
logger.error(f"Failed to emit log to frontend: {type(e).__name__}: {e}")
logger.error(
f"Failed to process log to frontend: {type(e).__name__}: {e}"
)
def _should_show_status(self, usage_ratio: float) -> bool:
"""
Check if token usage status should be shown based on threshold.
Args:
usage_ratio: Current usage ratio (0.0 to 1.0)
Returns:
True if status should be shown, False otherwise
"""
if not self.valves.show_token_usage_status:
return False
# If threshold is 0, always show
if self.valves.token_usage_status_threshold == 0:
return True
# Check if usage exceeds threshold
threshold_ratio = self.valves.token_usage_status_threshold / 100.0
return usage_ratio >= threshold_ratio
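The gate above reduces to a small pure function (a minimal sketch; the valve fields are passed as plain parameters here for illustration):

```python
# Standalone sketch of the status-notification gate: emit only when
# usage crosses token_usage_status_threshold; 0 means always show.
def should_show_status(usage_ratio, threshold_pct=80, enabled=True):
    if not enabled:                      # show_token_usage_status valve off
        return False
    if threshold_pct == 0:               # 0 disables the gate entirely
        return True
    return usage_ratio >= threshold_pct / 100.0

print(should_show_status(0.85))                   # above 80% default -> shown
print(should_show_status(0.50))                   # quiet below threshold
print(should_show_status(0.50, threshold_pct=0))  # threshold 0 -> always shown
```

With the default of 80, routine low-usage turns stay quiet and the status line only appears as the context actually fills up.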
def _should_skip_compression(
self, body: dict, __model__: Optional[dict] = None
) -> bool:
"""
Check if compression should be skipped.
Returns True if:
1. The base model includes 'copilot_sdk'
"""
# Check if base model includes copilot_sdk
if __model__:
base_model_id = __model__.get("base_model_id", "")
if "copilot_sdk" in base_model_id.lower():
return True
# Also check model in body
model_id = body.get("model", "")
if "copilot_sdk" in model_id.lower():
return True
return False
async def inlet(
self,
@@ -903,6 +1192,19 @@ class Filter:
Compression Strategy: Only responsible for injecting existing summaries, no Token calculation.
"""
# Check if compression should be skipped (e.g., for copilot_sdk)
if self._should_skip_compression(body, __model__):
if self.valves.debug_mode:
logger.info(
"[Inlet] Skipping compression: copilot_sdk detected in base model"
)
if self.valves.show_debug_log and __event_call__:
await self._log(
"[Inlet] ⏭️ Skipping compression: copilot_sdk detected",
event_call=__event_call__,
)
return body
messages = body.get("messages", [])
# --- Native Tool Output Trimming (Opt-in, only for native function calling) ---
@@ -966,8 +1268,14 @@ class Filter:
final_answer = content[last_match_end:].strip()
if final_answer:
msg["content"] = (
f"... [Tool outputs trimmed]\n{final_answer}"
msg["content"] = self._get_translation(
(
__user__.get("language", "en-US")
if __user__
else "en-US"
),
"tool_trimmed",
content=final_answer,
)
trimmed_count += 1
else:
@@ -980,8 +1288,14 @@ class Filter:
if len(parts) > 1:
final_answer = parts[-1].strip()
if final_answer:
msg["content"] = (
f"... [Tool outputs trimmed]\n{final_answer}"
msg["content"] = self._get_translation(
(
__user__.get("language", "en-US")
if __user__
else "en-US"
),
"tool_trimmed",
content=final_answer,
)
trimmed_count += 1
@@ -1173,6 +1487,10 @@ class Filter:
# Target is to compress up to the (total - keep_last) message
target_compressed_count = max(0, len(messages) - self.valves.keep_last)
# Get user context for i18n
user_ctx = await self._get_user_context(__user__, __event_call__)
lang = user_ctx["user_language"]
await self._log(
f"[Inlet] Recorded target compression progress: {target_compressed_count}",
event_call=__event_call__,
@@ -1207,10 +1525,9 @@ class Filter:
# 2. Summary message (Inserted as Assistant message)
summary_content = (
f"【Previous Summary: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n"
f"{summary_record.summary}\n\n"
f"---\n"
f"Below is the recent conversation:"
self._get_translation(lang, "summary_prompt_prefix")
+ f"{summary_record.summary}"
+ self._get_translation(lang, "summary_prompt_suffix")
)
summary_msg = {"role": "assistant", "content": summary_content}
@@ -1249,16 +1566,27 @@ class Filter:
"max_context_tokens", self.valves.max_context_tokens
)
# Calculate total tokens
total_tokens = await asyncio.to_thread(
self._calculate_messages_tokens, calc_messages
)
# --- Fast Estimation Check ---
estimated_tokens = self._estimate_messages_tokens(calc_messages)
# Preflight Check Log
await self._log(
f"[Inlet] 🔎 Preflight Check: {total_tokens}t / {max_context_tokens}t ({(total_tokens/max_context_tokens*100):.1f}%)",
event_call=__event_call__,
)
# Since this is a hard limit check, only skip precise calculation if we are far below it (margin of 15%)
if estimated_tokens < max_context_tokens * 0.85:
total_tokens = estimated_tokens
await self._log(
f"[Inlet] 🔎 Fast Preflight Check (Est): {total_tokens}t / {max_context_tokens}t (Well within limit)",
event_call=__event_call__,
)
else:
# Calculate exact total tokens via tiktoken
total_tokens = await asyncio.to_thread(
self._calculate_messages_tokens, calc_messages
)
# Preflight Check Log
await self._log(
f"[Inlet] 🔎 Precise Preflight Check: {total_tokens}t / {max_context_tokens}t ({(total_tokens/max_context_tokens*100):.1f}%)",
event_call=__event_call__,
)
# If over budget, reduce history (Keep Last)
if total_tokens > max_context_tokens:
@@ -1325,7 +1653,9 @@ class Filter:
first_line_found = True
# Add placeholder if there's more content coming
if idx < last_line_idx:
kept_lines.append("\n... [Content collapsed] ...\n")
kept_lines.append(
self._get_translation(lang, "content_collapsed")
)
continue
# Keep last non-empty line
@@ -1347,8 +1677,13 @@ class Filter:
target_msg["metadata"]["is_trimmed"] = True
# Calculate token reduction
old_tokens = self._count_tokens(content)
new_tokens = self._count_tokens(target_msg["content"])
# Use current token strategy
if total_tokens == estimated_tokens:
old_tokens = len(content) // 4
new_tokens = len(target_msg["content"]) // 4
else:
old_tokens = self._count_tokens(content)
new_tokens = self._count_tokens(target_msg["content"])
diff = old_tokens - new_tokens
total_tokens -= diff
@@ -1362,7 +1697,12 @@ class Filter:
# Strategy 2: Fallback - Drop Oldest Message Entirely (FIFO)
# (User requested to remove progressive trimming for other cases)
dropped = tail_messages.pop(0)
dropped_tokens = self._count_tokens(str(dropped.get("content", "")))
if total_tokens == estimated_tokens:
dropped_tokens = len(str(dropped.get("content", ""))) // 4
else:
dropped_tokens = self._count_tokens(
str(dropped.get("content", ""))
)
total_tokens -= dropped_tokens
if self.valves.show_debug_log and __event_call__:
@@ -1382,14 +1722,24 @@ class Filter:
final_messages = candidate_messages
# Calculate detailed token stats for logging
system_tokens = (
self._count_tokens(system_prompt_msg.get("content", ""))
if system_prompt_msg
else 0
)
head_tokens = self._calculate_messages_tokens(head_messages)
summary_tokens = self._count_tokens(summary_content)
tail_tokens = self._calculate_messages_tokens(tail_messages)
if total_tokens == estimated_tokens:
system_tokens = (
len(system_prompt_msg.get("content", "")) // 4
if system_prompt_msg
else 0
)
head_tokens = self._estimate_messages_tokens(head_messages)
summary_tokens = len(summary_content) // 4
tail_tokens = self._estimate_messages_tokens(tail_messages)
else:
system_tokens = (
self._count_tokens(system_prompt_msg.get("content", ""))
if system_prompt_msg
else 0
)
head_tokens = self._calculate_messages_tokens(head_messages)
summary_tokens = self._count_tokens(summary_content)
tail_tokens = self._calculate_messages_tokens(tail_messages)
system_info = (
f"System({system_tokens}t)" if system_prompt_msg else "System(0t)"
@@ -1408,22 +1758,43 @@ class Filter:
# Prepare status message (Context Usage format)
if max_context_tokens > 0:
usage_ratio = total_section_tokens / max_context_tokens
status_msg = f"Context Usage (Estimated): {total_section_tokens} / {max_context_tokens} Tokens ({usage_ratio*100:.1f}%)"
if usage_ratio > 0.9:
status_msg += " | ⚠️ High Usage"
else:
status_msg = f"Loaded historical summary (Hidden {compressed_count} historical messages)"
# Only show status if threshold is met
if self._should_show_status(usage_ratio):
status_msg = self._get_translation(
lang,
"status_context_usage",
tokens=total_section_tokens,
max_tokens=max_context_tokens,
ratio=f"{usage_ratio*100:.1f}",
)
if usage_ratio > 0.9:
status_msg += self._get_translation(lang, "status_high_usage")
if __event_emitter__:
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
if __event_emitter__:
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
else:
# For the case where max_context_tokens is 0, show summary info without threshold check
if self.valves.show_token_usage_status and __event_emitter__:
status_msg = self._get_translation(
lang, "status_loaded_summary", count=compressed_count
)
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
# Emit debug log to frontend (Keep the structured log as well)
await self._emit_debug_log(
@@ -1454,9 +1825,20 @@ class Filter:
"max_context_tokens", self.valves.max_context_tokens
)
total_tokens = await asyncio.to_thread(
self._calculate_messages_tokens, calc_messages
)
# --- Fast Estimation Check ---
estimated_tokens = self._estimate_messages_tokens(calc_messages)
# Only skip precise calculation if we are clearly below the limit
if estimated_tokens < max_context_tokens * 0.85:
total_tokens = estimated_tokens
await self._log(
f"[Inlet] 🔎 Fast limit check (Est): {total_tokens}t / {max_context_tokens}t",
event_call=__event_call__,
)
else:
total_tokens = await asyncio.to_thread(
self._calculate_messages_tokens, calc_messages
)
if total_tokens > max_context_tokens:
await self._log(
@@ -1476,7 +1858,12 @@ class Filter:
> start_trim_index + 1 # Keep at least 1 message after keep_first
):
dropped = final_messages.pop(start_trim_index)
dropped_tokens = self._count_tokens(str(dropped.get("content", "")))
if total_tokens == estimated_tokens:
dropped_tokens = len(str(dropped.get("content", ""))) // 4
else:
dropped_tokens = self._count_tokens(
str(dropped.get("content", ""))
)
total_tokens -= dropped_tokens
await self._log(
@@ -1485,23 +1872,30 @@ class Filter:
)
# Send status notification (Context Usage format)
if __event_emitter__:
status_msg = f"Context Usage (Estimated): {total_tokens} / {max_context_tokens} Tokens"
if max_context_tokens > 0:
usage_ratio = total_tokens / max_context_tokens
status_msg += f" ({usage_ratio*100:.1f}%)"
if max_context_tokens > 0:
usage_ratio = total_tokens / max_context_tokens
# Only show status if threshold is met
if self._should_show_status(usage_ratio):
status_msg = self._get_translation(
lang,
"status_context_usage",
tokens=total_tokens,
max_tokens=max_context_tokens,
ratio=f"{usage_ratio*100:.1f}",
)
if usage_ratio > 0.9:
status_msg += " | ⚠️ High Usage"
status_msg += self._get_translation(lang, "status_high_usage")
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
if __event_emitter__:
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
body["messages"] = final_messages
@@ -1517,6 +1911,7 @@ class Filter:
body: dict,
__user__: Optional[dict] = None,
__metadata__: dict = None,
__model__: dict = None,
__event_emitter__: Callable[[Any], Awaitable[None]] = None,
__event_call__: Callable[[Any], Awaitable[None]] = None,
) -> dict:
@@ -1524,6 +1919,23 @@ class Filter:
Executed after the LLM response is complete.
Calculates Token count in the background and triggers summary generation (does not block current response, does not affect content output).
"""
# Check if compression should be skipped (e.g., for copilot_sdk)
if self._should_skip_compression(body, __model__):
if self.valves.debug_mode:
logger.info(
"[Outlet] Skipping compression: copilot_sdk detected in base model"
)
if self.valves.show_debug_log and __event_call__:
await self._log(
"[Outlet] ⏭️ Skipping compression: copilot_sdk detected",
event_call=__event_call__,
)
return body
# Get user context for i18n
user_ctx = await self._get_user_context(__user__, __event_call__)
lang = user_ctx["user_language"]
chat_ctx = self._get_chat_context(body, __metadata__)
chat_id = chat_ctx["chat_id"]
if not chat_id:
@@ -1547,6 +1959,7 @@ class Filter:
body,
__user__,
target_compressed_count,
lang,
__event_emitter__,
__event_call__,
)
@@ -1561,6 +1974,7 @@ class Filter:
body: dict,
user_data: Optional[dict],
target_compressed_count: Optional[int],
lang: str = "en-US",
__event_emitter__: Callable[[Any], Awaitable[None]] = None,
__event_call__: Callable[[Any], Awaitable[None]] = None,
):
@@ -1595,37 +2009,58 @@ class Filter:
event_call=__event_call__,
)
# Calculate Token count in a background thread
current_tokens = await asyncio.to_thread(
self._calculate_messages_tokens, messages
)
# --- Fast Estimation Check ---
estimated_tokens = self._estimate_messages_tokens(messages)
await self._log(
f"[🔍 Background Calculation] Token count: {current_tokens}",
event_call=__event_call__,
)
# For triggering summary generation, we need to be more precise if we are in the grey zone
# Margin is 15% (skip tiktoken if estimated is < 85% of threshold)
# Note: We still use tiktoken if we exceed threshold, because we want an accurate usage status report
if estimated_tokens < compression_threshold_tokens * 0.85:
current_tokens = estimated_tokens
await self._log(
f"[🔍 Background Calculation] Fast estimate ({current_tokens}) is well below threshold ({compression_threshold_tokens}). Skipping tiktoken.",
event_call=__event_call__,
)
else:
# Calculate Token count precisely in a background thread
current_tokens = await asyncio.to_thread(
self._calculate_messages_tokens, messages
)
await self._log(
f"[🔍 Background Calculation] Precise token count: {current_tokens}",
event_call=__event_call__,
)
# Send status notification (Context Usage format)
if __event_emitter__ and self.valves.show_token_usage_status:
if __event_emitter__:
max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens
)
status_msg = f"Context Usage (Estimated): {current_tokens} / {max_context_tokens} Tokens"
if max_context_tokens > 0:
usage_ratio = current_tokens / max_context_tokens
status_msg += f" ({usage_ratio*100:.1f}%)"
if usage_ratio > 0.9:
status_msg += " | ⚠️ High Usage"
# Only show status if threshold is met
if self._should_show_status(usage_ratio):
status_msg = self._get_translation(
lang,
"status_context_usage",
tokens=current_tokens,
max_tokens=max_context_tokens,
ratio=f"{usage_ratio*100:.1f}",
)
if usage_ratio > 0.9:
status_msg += self._get_translation(
lang, "status_high_usage"
)
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
# Check if compression is needed
if current_tokens >= compression_threshold_tokens:
@@ -1642,6 +2077,7 @@ class Filter:
body,
user_data,
target_compressed_count,
lang,
__event_emitter__,
__event_call__,
)
@@ -1672,6 +2108,7 @@ class Filter:
body: dict,
user_data: Optional[dict],
target_compressed_count: Optional[int],
lang: str = "en-US",
__event_emitter__: Callable[[Any], Awaitable[None]] = None,
__event_call__: Callable[[Any], Awaitable[None]] = None,
):
@@ -1811,7 +2248,9 @@ class Filter:
{
"type": "status",
"data": {
"description": "Generating context summary in background...",
"description": self._get_translation(
lang, "status_generating_summary"
),
"done": False,
},
}
@@ -1849,7 +2288,11 @@ class Filter:
{
"type": "status",
"data": {
"description": f"Context summary updated (Compressed {len(middle_messages)} messages)",
"description": self._get_translation(
lang,
"status_loaded_summary",
count=len(middle_messages),
),
"done": True,
},
}
@@ -1910,10 +2353,9 @@ class Filter:
# Summary
summary_content = (
f"【System Prompt: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n"
f"{new_summary}\n\n"
f"---\n"
f"Below is the recent conversation:"
self._get_translation(lang, "summary_prompt_prefix")
+ f"{new_summary}"
+ self._get_translation(lang, "summary_prompt_suffix")
)
summary_msg = {"role": "assistant", "content": summary_content}
@@ -1943,23 +2385,32 @@ class Filter:
max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens
)
# 6. Emit Status
status_msg = f"Context Summary Updated: {token_count} / {max_context_tokens} Tokens"
# 6. Emit Status (only if threshold is met)
if max_context_tokens > 0:
ratio = (token_count / max_context_tokens) * 100
status_msg += f" ({ratio:.1f}%)"
if ratio > 90.0:
status_msg += " | ⚠️ High Usage"
usage_ratio = token_count / max_context_tokens
# Only show status if threshold is met
if self._should_show_status(usage_ratio):
status_msg = self._get_translation(
lang,
"status_context_summary_updated",
tokens=token_count,
max_tokens=max_context_tokens,
ratio=f"{usage_ratio*100:.1f}",
)
if usage_ratio > 0.9:
status_msg += self._get_translation(
lang, "status_high_usage"
)
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
except Exception as e:
await self._log(
f"[Status] Error calculating tokens: {e}",
@@ -1979,7 +2430,9 @@ class Filter:
{
"type": "status",
"data": {
"description": f"Summary Error: {str(e)[:100]}...",
"description": self._get_translation(
lang, "status_summary_error", error=str(e)[:100]
),
"done": True,
},
}

View File

@@ -1,6 +1,6 @@
# GitHub Copilot SDK Pipe for OpenWebUI
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.6.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.7.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that integrates the official [GitHub Copilot SDK](https://github.com/github/copilot-sdk). It enables you to use **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) **AND** your own models via **BYOK** (OpenAI, Anthropic) directly within OpenWebUI, providing a unified agentic experience with **strict User & Chat-level Workspace Isolation**.
@@ -14,12 +14,13 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
---
## ✨ v0.6.2 Updates (What's New)
## ✨ v0.7.0 Updates (What's New)
- **🛠️ New Workspace Artifacts Tool**: Introduced `publish_file_from_workspace`. Agents can now generate files (e.g., Python-generated Excel/CSV) and provide direct download links for the user to click and save.
- **⚙️ Workflow Optimization**: Improved reliability of the internal agentic workspace management.
- **🛡️ Enhanced Security**: Refined access control for system resources within the isolated environment.
- **🔧 Performance Tuning**: Optimized stream processing for larger context windows.
- **🚀 Integrated CLI Management**: The Copilot CLI is now automatically managed and bundled via the `github-copilot-sdk` pip package. No more manual `curl | bash` installation or version mismatches. (v0.7.0)
- **🧠 Native Tool Call UI**: Full adaptation to **OpenWebUI's native tool call UI** and thinking process visualization. (v0.7.0)
- **🏠 OpenWebUI v0.8.0+ Fix**: Resolved "Error getting file content" download failure by switching to absolute path registration for published files. (v0.7.0)
- **🌐 Comprehensive Multi-language Support**: Native localization for status messages in 11 languages (EN, ZH, JA, KO, FR, DE, ES, IT, RU, VI, ID). (v0.7.0)
- **🧹 Architecture Cleanup**: Refactored core setup and optimized reasoning status display for a leaner experience. (v0.7.0)
---
@@ -31,8 +32,8 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
- **♾️ Infinite Session Management**: Smart context window management with automatic compaction for indefinite conversation capability.
- **🧠 Deep Database Integration**: Real-time persistence of TODO lists for long-running workflows.
- **🌊 Advanced Streaming**: Full support for thinking process/Chain of Thought visualization.
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support.
- **⚡ Full-Lifecycle File Agent**: Supports receiving uploaded files for raw bypass analysis and publishing results (Excel/reports) as downloadable links.
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support (bypasses RAG for direct binary access).
- **📤 Workspace Artifacts (`publish_file_from_workspace`)**: Agents can generate files (Excel, CSV, HTML reports, etc.) and provide **persistent download links** directly in the chat.
- **🖼️ Interactive Artifacts**: Automatically renders HTML/JS apps generated by the agent directly in the chat interface.
---
@@ -110,7 +111,7 @@ If this plugin has been useful, a **Star** on [OpenWebUI Extensions](https://git
- **Agent ignores files?**: Ensure the Files Filter is enabled, otherwise RAG will interfere with raw binaries.
- **No progress bar?**: The bar only appears when the Agent uses the `update_todo` tool.
- **Dependencies**: This Pipe automatically installs `github-copilot-sdk` (Python) and `github-copilot-cli` (Binary).
- **Dependencies**: This Pipe automatically manages `github-copilot-sdk` (Python) and utilizes the bundled binary CLI. No manual install required.
---

View File

@@ -1,6 +1,6 @@
# GitHub Copilot SDK 官方管道
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 0.6.2 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 0.7.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
这是一个用于 [OpenWebUI](https://github.com/open-webui/open-webui) 的高级 Pipe 函数,深度集成了 **GitHub Copilot SDK**。它不仅支持 **GitHub Copilot 官方模型**(如 `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`),还支持 **BYOK (自带 Key)** 模式对接自定义服务商(OpenAI, Anthropic),并具备**严格的用户与会话级工作区隔离**能力,提供统一且安全的 Agent 交互体验。
@@ -14,12 +14,13 @@
---
## ✨ 0.6.2 更新内容 (What's New)
## ✨ 0.7.0 更新内容 (What's New)
- **🛠️ 新增工作区产物工具**: 引入 `publish_file_from_workspace`。Agent 现在可以生成物理文件(如使用 Python 生成的 Excel/CSV 报表),并直接在聊天界面提供点击下载链接。
- **⚙️ 工作流优化**: 提升了内部 Agent 物理工作区管理的可靠性与原子性。
- **🛡️ 安全增强**: 精细化了隔离环境下系统资源的访问控制策略。
- **🔧 性能微调**: 针对大上下文窗口优化了流式数据处理性能。
- **🚀 CLI 免维护集成**: Copilot CLI 现在通过 `github-copilot-sdk` pip 包自动同步管理,彻底告别手动 `curl | bash` 安装及版本不匹配问题。(v0.7.0)
- **🧠 原生工具调用 UI**: 全面适配 **OpenWebUI 原生工具调用 UI** 与模型思考过程(思维链)展示。(v0.7.0)
- **🏠 OpenWebUI v0.8.0+ 兼容性修复**: 通过切换为绝对路径注册发布文件彻底解决了“Error getting file content”无法下载到本地的问题。(v0.7.0)
- **🌐 全面的多语言支持**: 针对状态消息进行了 11 国语言的原生本地化 (中/英/日/韩/法/德/西/意/俄/越/印尼)。(v0.7.0)
- **🧹 架构精简**: 重构了初始化逻辑并优化了推理状态显示,提供更轻量稳健的体验。(v0.7.0)
---
@@ -31,8 +32,8 @@
- **♾️ 无限会话管理**: 智能上下文窗口管理与自动压缩算法,支持无限时长的对话交互。
- **🧠 深度数据库集成**: 实时持久化 TODO 列表到 UI 进度条。
- **🌊 深度推理展示**: 完整支持模型思考过程 (Thinking Process) 的流式渲染。
- **🖼️ 智能多模态**: 完整支持图像识别与附件上传分析。
- **⚡ 全生命周期文件 Agent**: 支持接收上传文件进行绕过 RAG 的深度分析,并将处理结果(如 Excel/报告)发布为下载链接。
- **🖼️ 智能多模态**: 完整支持图像识别与附件上传分析(绕过 RAG 直接访问原始二进制内容)
- **📤 工作区产物工具 (`publish_file_from_workspace`)**: Agent 可生成文件Excel、CSV、HTML 报告等)并直接在聊天中提供**持久化下载链接**
- **🖼️ 交互式产物 (Artifacts)**: 自动渲染 Agent 生成的 HTML/JS 应用程序,直接在聊天界面交互。
---
@@ -95,7 +96,7 @@
### 1) 导入函数
1. 打开 OpenWebUI,前往 **工作区** -> **函数**。
2. 点击 **+** (创建函数),完整粘贴 `github_copilot_sdk_cn.py` 的内容。
2. 点击 **+** (创建函数),完整粘贴 `github_copilot_sdk.py` 的内容。
3. 点击保存并确保已启用。
### 2) 获取 Token (Get Token)
@@ -114,7 +115,7 @@
- **Agent can't see your file?** Make sure the Files Filter plugin is installed and enabled; otherwise the raw file gets intercepted by RAG.
- **No TODO progress bar?** The progress bar only appears when the agent uses the `update_todo` tool (typically while handling complex tasks).
- **Dependency installation**: This pipe automatically attempts to install `github-copilot-sdk` (Python package) and `github-copilot-cli` (official binary).
- **Dependency installation**: This pipe now manages `github-copilot-sdk` (Python package) automatically and prefers the bundled CLI binary directly; no manual intervention is required.
---


@@ -1,12 +1,12 @@
"""
title: GitHub Copilot Official SDK Pipe
author: Fu-Jie
author_url: https://github.com/Fu-Jie/awesome-openwebui
author_url: https://github.com/Fu-Jie/openwebui-extensions
funding_url: https://github.com/open-webui
openwebui_id: ce96f7b4-12fc-4ac3-9a01-875713e69359
description: Integrate GitHub Copilot SDK. Supports dynamic models, multi-turn conversation, streaming, multimodal input, infinite sessions, and frontend debug logging.
version: 0.6.2
requirements: github-copilot-sdk==0.1.23
version: 0.7.0
requirements: github-copilot-sdk==0.1.25
"""
import os
@@ -226,10 +226,7 @@ class Pipe:
default=300,
description="Timeout for each stream chunk (seconds)",
)
COPILOT_CLI_VERSION: str = Field(
default="0.0.406",
description="Specific Copilot CLI version to install/enforce (e.g. '0.0.406'). Leave empty for latest.",
)
EXCLUDE_KEYWORDS: str = Field(
default="",
description="Exclude models containing these keywords (comma separated, e.g.: codex, haiku)",
@@ -360,6 +357,116 @@ class Pipe:
_env_setup_done = False # Track if env setup has been completed
_last_update_check = 0 # Timestamp of last CLI update check
TRANSLATIONS = {
"en-US": {
"status_conn_est": "Connection established, waiting for response...",
"status_reasoning_inj": "Reasoning Effort injected: {effort}",
"debug_agent_working_in": "Agent working in: {path}",
"debug_mcp_servers": "🔌 Connected MCP Servers: {servers}",
"publish_success": "File published successfully.",
"publish_hint_html": "Link: [View {filename}]({view_url}) | [Download]({download_url})",
"publish_hint_default": "Link: [Download {filename}]({download_url})",
},
"zh-CN": {
"status_conn_est": "已建立连接,等待响应...",
"status_reasoning_inj": "已注入推理级别:{effort}",
"debug_agent_working_in": "Agent 工作目录: {path}",
"debug_mcp_servers": "🔌 已连接 MCP 服务器: {servers}",
"publish_success": "文件发布成功。",
"publish_hint_html": "链接: [查看 {filename}]({view_url}) | [下载]({download_url})",
"publish_hint_default": "链接: [下载 {filename}]({download_url})",
},
"zh-HK": {
"status_conn_est": "已建立連接,等待響應...",
"status_reasoning_inj": "已注入推理級別:{effort}",
"debug_agent_working_in": "Agent 工作目錄: {path}",
"debug_mcp_servers": "🔌 已連接 MCP 伺服器: {servers}",
"publish_success": "文件發布成功。",
"publish_hint_html": "連結: [查看 {filename}]({view_url}) | [下載]({download_url})",
"publish_hint_default": "連結: [下載 {filename}]({download_url})",
},
"zh-TW": {
"status_conn_est": "已建立連接,等待響應...",
"status_reasoning_inj": "已注入推理級別:{effort}",
"debug_agent_working_in": "Agent 工作目錄: {path}",
"debug_mcp_servers": "🔌 已連接 MCP 伺服器: {servers}",
"publish_success": "文件發布成功。",
"publish_hint_html": "連結: [查看 {filename}]({view_url}) | [下載]({download_url})",
"publish_hint_default": "連結: [下載 {filename}]({download_url})",
},
"ja-JP": {
"status_conn_est": "接続が確立されました。応答を待っています...",
"status_reasoning_inj": "推論レベルが注入されました:{effort}",
"debug_agent_working_in": "Agent 作業ディレクトリ: {path}",
"debug_mcp_servers": "🔌 接続済み MCP サーバー: {servers}",
},
"ko-KR": {
"status_conn_est": "연결이 설정되었습니다. 응답을 기다리는 중...",
"status_reasoning_inj": "추론 수준 설정됨: {effort}",
"debug_agent_working_in": "Agent 작업 디렉토리: {path}",
"debug_mcp_servers": "🔌 연결된 MCP 서버: {servers}",
},
"fr-FR": {
"status_conn_est": "Connexion établie, en attente de réponse...",
"status_reasoning_inj": "Effort de raisonnement injecté : {effort}",
"debug_agent_working_in": "Répertoire de travail de l'Agent : {path}",
"debug_mcp_servers": "🔌 Serveurs MCP connectés : {servers}",
},
"de-DE": {
"status_conn_est": "Verbindung hergestellt, warte auf Antwort...",
"status_reasoning_inj": "Argumentationsaufwand injiziert: {effort}",
"debug_agent_working_in": "Agent-Arbeitsverzeichnis: {path}",
"debug_mcp_servers": "🔌 Verbundene MCP-Server: {servers}",
},
"es-ES": {
"status_conn_est": "Conexión establecida, esperando respuesta...",
"status_reasoning_inj": "Nivel de razonamiento inyectado: {effort}",
"debug_agent_working_in": "Directorio de trabajo del Agente: {path}",
"debug_mcp_servers": "🔌 Servidores MCP conectados: {servers}",
},
"it-IT": {
"status_conn_est": "Connessione stabilita, in attesa di risposta...",
"status_reasoning_inj": "Livello di ragionamento iniettato: {effort}",
"debug_agent_working_in": "Directory di lavoro dell'Agente: {path}",
"debug_mcp_servers": "🔌 Server MCP connessi: {servers}",
},
"ru-RU": {
"status_conn_est": "Соединение установлено, ожидание ответа...",
"status_reasoning_inj": "Уровень рассуждения внедрен: {effort}",
"debug_agent_working_in": "Рабочий каталог Агента: {path}",
"debug_mcp_servers": "🔌 Подключенные серверы MCP: {servers}",
},
"vi-VN": {
"status_conn_est": "Đã thiết lập kết nối, đang chờ phản hồi...",
"status_reasoning_inj": "Cấp độ suy luận đã được áp dụng: {effort}",
"debug_agent_working_in": "Thư mục làm việc của Agent: {path}",
"debug_mcp_servers": "🔌 Các máy chủ MCP đã kết nối: {servers}",
},
"id-ID": {
"status_conn_est": "Koneksi terjalin, menunggu respons...",
"status_reasoning_inj": "Tingkat penalaran diterapkan: {effort}",
"debug_agent_working_in": "Direktori kerja Agent: {path}",
"debug_mcp_servers": "🔌 Server MCP yang terhubung: {servers}",
},
}
FALLBACK_MAP = {
"zh": "zh-CN",
"zh-TW": "zh-TW",
"zh-HK": "zh-HK",
"en": "en-US",
"en-GB": "en-US",
"ja": "ja-JP",
"ko": "ko-KR",
"fr": "fr-FR",
"de": "de-DE",
"es": "es-ES",
"it": "it-IT",
"ru": "ru-RU",
"vi": "vi-VN",
"id": "id-ID",
}
def __init__(self):
self.type = "pipe"
self.id = "github_copilot_sdk"
@@ -390,6 +497,83 @@ class Pipe:
except Exception as e:
logger.error(f"[Database] ❌ Initialization failed: {str(e)}")
def _resolve_language(self, user_language: str) -> str:
"""Normalize user language code to a supported translation key."""
if not user_language:
return "en-US"
if user_language in self.TRANSLATIONS:
return user_language
lang_base = user_language.split("-")[0]
if user_language in self.FALLBACK_MAP:
return self.FALLBACK_MAP[user_language]
if lang_base in self.FALLBACK_MAP:
return self.FALLBACK_MAP[lang_base]
return "en-US"
def _get_translation(self, lang: str, key: str, **kwargs) -> str:
"""Helper function to get translated string for a key."""
lang_key = self._resolve_language(lang)
trans_map = self.TRANSLATIONS.get(lang_key, self.TRANSLATIONS["en-US"])
text = trans_map.get(key, self.TRANSLATIONS["en-US"].get(key, key))
if kwargs:
try:
text = text.format(**kwargs)
except Exception as e:
logger.warning(f"Translation formatting failed for {key}: {e}")
return text
async def _get_user_context(self, __user__, __event_call__=None, __request__=None):
"""Extract basic user context with safe fallbacks including JS localStorage."""
if isinstance(__user__, (list, tuple)):
user_data = __user__[0] if __user__ else {}
elif isinstance(__user__, dict):
user_data = __user__
else:
user_data = {}
user_id = user_data.get("id", "unknown_user")
user_name = user_data.get("name", "User")
user_language = user_data.get("language", "en-US")
if (
__request__
and hasattr(__request__, "headers")
and "accept-language" in __request__.headers
):
raw_lang = __request__.headers.get("accept-language", "")
if raw_lang:
user_language = raw_lang.split(",")[0].split(";")[0]
if __event_call__:
try:
js_code = """
try {
return (
document.documentElement.lang ||
localStorage.getItem('locale') ||
localStorage.getItem('language') ||
navigator.language ||
'en-US'
);
} catch (e) {
return 'en-US';
}
"""
frontend_lang = await asyncio.wait_for(
__event_call__({"type": "execute", "data": {"code": js_code}}),
timeout=2.0,
)
if frontend_lang and isinstance(frontend_lang, str):
user_language = frontend_lang
except Exception:
pass
return {
"user_id": user_id,
"user_name": user_name,
"user_language": user_language,
}
@contextlib.contextmanager
def _db_session(self):
"""Yield a database session using Open WebUI helpers with graceful fallbacks."""
@@ -611,6 +795,8 @@ class Pipe:
user_data = {}
user_id = user_data.get("id") or user_data.get("user_id")
user_lang = user_data.get("language") or "en-US"
is_admin = user_data.get("role") == "admin"
if not user_id:
return None
@@ -746,10 +932,7 @@ class Pipe:
dest_path = Path(UPLOAD_DIR) / f"{file_id}_{safe_filename}"
await asyncio.to_thread(shutil.copy2, target_path, dest_path)
try:
db_path = str(os.path.relpath(dest_path, DATA_DIR))
except:
db_path = str(dest_path)
db_path = str(dest_path)
file_form = FileForm(
id=file_id,
@@ -769,12 +952,37 @@ class Pipe:
# 5. Result
download_url = f"/api/v1/files/{file_id}/content"
view_url = download_url
is_html = safe_filename.lower().endswith(".html")
# For HTML files, if user is admin, provide a direct view link (/content/html)
if is_html and is_admin:
view_url = f"{download_url}/html"
# Localized output
msg = self._get_translation(user_lang, "publish_success")
if is_html and is_admin:
hint = self._get_translation(
user_lang,
"publish_hint_html",
filename=safe_filename,
view_url=view_url,
download_url=download_url,
)
else:
hint = self._get_translation(
user_lang,
"publish_hint_default",
filename=safe_filename,
download_url=download_url,
)
return {
"file_id": file_id,
"filename": safe_filename,
"download_url": download_url,
"message": "File published successfully.",
"hint": f"Link: [Download {safe_filename}]({download_url})",
"message": msg,
"hint": hint,
}
except Exception as e:
return {"error": str(e)}
@@ -1921,10 +2129,6 @@ class Pipe:
"on_post_tool_use": on_post_tool_use,
}
def _get_user_context(self):
"""Helper to get user context (placeholder for future use)."""
return {}
def _get_chat_context(
self,
body: dict,
@@ -2327,25 +2531,11 @@ class Pipe:
token: str = None,
enable_mcp: bool = True,
enable_cache: bool = True,
skip_cli_install: bool = False,
skip_cli_install: bool = False, # Kept for call-site compatibility, no longer used
__event_emitter__=None,
user_lang: str = "en-US",
):
"""Setup environment variables and verify Copilot CLI. Dynamic Token Injection."""
def emit_status_sync(description: str, done: bool = False):
if not __event_emitter__:
return
try:
loop = asyncio.get_running_loop()
loop.create_task(
__event_emitter__(
{
"type": "status",
"data": {"description": description, "done": done},
}
)
)
except Exception:
pass
"""Setup environment variables and resolve Copilot CLI path from SDK bundle."""
# 1. Real-time Token Injection (Always updates on each call)
effective_token = token or self.valves.GH_TOKEN
@@ -2353,8 +2543,6 @@ class Pipe:
os.environ["GH_TOKEN"] = os.environ["GITHUB_TOKEN"] = effective_token
if self._env_setup_done:
# If done, we only sync MCP if called explicitly or in debug mode
# To improve speed, we avoid redundant file I/O here for regular requests
if debug_enabled:
self._sync_mcp_config(
__event_call__,
@@ -2365,186 +2553,46 @@ class Pipe:
return
os.environ["COPILOT_AUTO_UPDATE"] = "false"
self._emit_debug_log_sync(
"Disabled CLI auto-update (COPILOT_AUTO_UPDATE=false)",
__event_call__,
debug_enabled=debug_enabled,
)
# 2. CLI Path Discovery
cli_path = "/usr/local/bin/copilot"
if os.environ.get("COPILOT_CLI_PATH"):
cli_path = os.environ["COPILOT_CLI_PATH"]
target_version = self.valves.COPILOT_CLI_VERSION.strip()
found = False
current_version = None
def get_cli_version(path):
try:
output = (
subprocess.check_output(
[path, "--version"], stderr=subprocess.STDOUT
)
.decode()
.strip()
)
import re
match = re.search(r"(\d+\.\d+\.\d+)", output)
return match.group(1) if match else output
except Exception:
return None
# Check existing version
if os.path.exists(cli_path):
found = True
current_version = get_cli_version(cli_path)
# 2. CLI Path Discovery (priority: env var > PATH > SDK bundle)
cli_path = os.environ.get("COPILOT_CLI_PATH", "")
found = bool(cli_path and os.path.exists(cli_path))
if not found:
sys_path = shutil.which("copilot")
if sys_path:
cli_path = sys_path
found = True
current_version = get_cli_version(cli_path)
if not found:
pkg_path = os.path.join(os.path.dirname(__file__), "bin", "copilot")
if os.path.exists(pkg_path):
cli_path = pkg_path
found = True
current_version = get_cli_version(cli_path)
# 3. Installation/Update Logic
should_install = not found
install_reason = "CLI not found"
if found and target_version:
norm_target = target_version.lstrip("v")
norm_current = current_version.lstrip("v") if current_version else ""
# Only install if target version is GREATER than current version
try:
from packaging.version import parse as parse_version
from copilot.client import _get_bundled_cli_path
if parse_version(norm_target) > parse_version(norm_current):
should_install = True
install_reason = (
f"Upgrade needed ({current_version} -> {target_version})"
)
elif parse_version(norm_target) < parse_version(norm_current):
self._emit_debug_log_sync(
f"Current version ({current_version}) is newer than specified ({target_version}). Skipping downgrade.",
__event_call__,
debug_enabled=debug_enabled,
)
except Exception as e:
# Fallback to string comparison if packaging is not available
if norm_target != norm_current:
should_install = True
install_reason = (
f"Version mismatch ({current_version} != {target_version})"
)
bundled_path = _get_bundled_cli_path()
if bundled_path and os.path.exists(bundled_path):
cli_path = bundled_path
found = True
except ImportError:
pass
if should_install and not skip_cli_install:
self._emit_debug_log_sync(
f"Installing/Updating Copilot CLI: {install_reason}...",
__event_call__,
debug_enabled=debug_enabled,
)
emit_status_sync(
"🔧 正在安装/更新 Copilot CLI首次可能需要 1-3 分钟)...",
done=False,
)
try:
env = os.environ.copy()
if target_version:
env["VERSION"] = target_version
proc = subprocess.Popen(
"curl -fsSL https://gh.io/copilot-install | bash",
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
bufsize=1,
env=env,
)
progress_percent = -1
line_count = 0
while True:
raw_line = proc.stdout.readline() if proc.stdout else ""
if raw_line == "" and proc.poll() is not None:
break
line = (raw_line or "").strip()
if not line:
continue
line_count += 1
percent_match = re.search(r"(\d{1,3})%", line)
if percent_match:
try:
pct = int(percent_match.group(1))
if pct >= progress_percent + 5:
progress_percent = pct
emit_status_sync(
f"📦 Copilot CLI 安装中:{pct}%", done=False
)
except Exception:
pass
elif line_count % 20 == 0:
emit_status_sync(
f"📦 Copilot CLI 安装中:{line[:120]}", done=False
)
return_code = proc.wait()
if return_code != 0:
raise subprocess.CalledProcessError(
return_code,
"curl -fsSL https://gh.io/copilot-install | bash",
)
# Re-verify
current_version = get_cli_version(cli_path)
emit_status_sync(
f"✅ Copilot CLI 安装完成v{current_version or target_version or 'latest'}",
done=False,
)
except Exception as e:
self._emit_debug_log_sync(
f"CLI installation failed: {e}",
__event_call__,
debug_enabled=debug_enabled,
)
emit_status_sync(
f"❌ Copilot CLI 安装失败:{str(e)[:120]}",
done=True,
)
elif should_install and skip_cli_install:
self._emit_debug_log_sync(
f"Skipping CLI install during model listing: {install_reason}",
__event_call__,
debug_enabled=debug_enabled,
)
# 4. Finalize
cli_ready = bool(cli_path and os.path.exists(cli_path))
# 3. Finalize
cli_ready = found
if cli_ready:
os.environ["COPILOT_CLI_PATH"] = cli_path
# Add the CLI's parent directory to PATH so subprocesses can invoke `copilot` directly
cli_bin_dir = os.path.dirname(cli_path)
current_path = os.environ.get("PATH", "")
if cli_bin_dir and cli_bin_dir not in current_path.split(os.pathsep):
os.environ["PATH"] = cli_bin_dir + os.pathsep + current_path
self.__class__._env_setup_done = cli_ready
self.__class__._last_update_check = datetime.now().timestamp()
self._emit_debug_log_sync(
f"Environment setup complete. CLI ready={cli_ready}. Path: {cli_path} (v{current_version})",
f"Environment setup complete. CLI ready={cli_ready}. Path: {cli_path}",
__event_call__,
debug_enabled=debug_enabled,
)
if not skip_cli_install:
if cli_ready:
emit_status_sync("✅ Copilot CLI 已就绪", done=True)
else:
emit_status_sync("⚠️ Copilot CLI 尚未就绪,请稍后重试。", done=True)
def _process_attachments(
self,
@@ -2822,6 +2870,9 @@ class Pipe:
effective_mcp = user_valves.ENABLE_MCP_SERVER
effective_cache = user_valves.ENABLE_TOOL_CACHE
user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
user_lang = user_ctx["user_language"]
# 2. Setup environment with effective settings
self._setup_env(
__event_call__,
@@ -2830,11 +2881,12 @@ class Pipe:
enable_mcp=effective_mcp,
enable_cache=effective_cache,
__event_emitter__=__event_emitter__,
user_lang=user_lang,
)
cwd = self._get_workspace_dir(user_id=user_id, chat_id=chat_id)
await self._emit_debug_log(
f"Agent working in: {cwd} (Admin: {is_admin}, MCP: {effective_mcp})",
f"{self._get_translation(user_lang, 'debug_agent_working_in', path=cwd)} (Admin: {is_admin}, MCP: {effective_mcp})",
__event_call__,
debug_enabled=effective_debug,
)
@@ -3269,9 +3321,9 @@ class Pipe:
if body.get("stream", False):
init_msg = ""
if effective_debug:
init_msg = f"> [Debug] Agent working in: {self._get_workspace_dir(user_id=user_id, chat_id=chat_id)}\n"
init_msg = f"> [Debug] {self._get_translation(user_lang, 'debug_agent_working_in', path=self._get_workspace_dir(user_id=user_id, chat_id=chat_id))}\n"
if mcp_server_names:
init_msg += f"> [Debug] 🔌 Connected MCP Servers: {', '.join(mcp_server_names)}\n"
init_msg += f"> [Debug] {self._get_translation(user_lang, 'debug_mcp_servers', servers=', '.join(mcp_server_names))}\n"
# Transfer client ownership to stream_response
should_stop_client = False
@@ -3284,9 +3336,14 @@ class Pipe:
init_message=init_msg,
__event_call__=__event_call__,
__event_emitter__=__event_emitter__,
reasoning_effort=effective_reasoning_effort,
reasoning_effort=(
effective_reasoning_effort
if (is_reasoning and not is_byok_model)
else "off"
),
show_thinking=show_thinking,
debug_enabled=effective_debug,
user_lang=user_lang,
)
else:
try:
@@ -3332,6 +3389,7 @@ class Pipe:
reasoning_effort: str = "",
show_thinking: bool = True,
debug_enabled: bool = False,
user_lang: str = "en-US",
) -> AsyncGenerator:
"""
Stream response from Copilot SDK, handling various event types.
@@ -3476,14 +3534,8 @@ class Pipe:
queue.put_nowait("\n</think>\n")
state["thinking_started"] = False
# Display tool call with improved formatting
if tool_args:
tool_args_json = json.dumps(tool_args, indent=2, ensure_ascii=False)
tool_display = f"\n\n<details>\n<summary>🔧 Executing Tool: {tool_name}</summary>\n\n**Parameters:**\n\n```json\n{tool_args_json}\n```\n\n</details>\n\n"
else:
tool_display = f"\n\n<details>\n<summary>🔧 Executing Tool: {tool_name}</summary>\n\n*No parameters*\n\n</details>\n\n"
queue.put_nowait(tool_display)
# Note: We do NOT emit a done="false" card here to avoid card duplication
# (unless we have a way to update text which SSE content stream doesn't)
self._emit_debug_log_sync(
f"Tool Start: {tool_name}",
@@ -3600,31 +3652,55 @@ class Pipe:
)
# ------------------------
# Try to detect content type for better formatting
is_json = False
try:
json_obj = (
json.loads(result_content)
if isinstance(result_content, str)
else result_content
# --- Build native OpenWebUI 0.8.3 tool_calls block ---
# Serialize input args (from execution_start)
tool_args_for_block = {}
if tool_call_id and tool_call_id in active_tools:
tool_args_for_block = active_tools[tool_call_id].get(
"arguments", {}
)
if isinstance(json_obj, (dict, list)):
result_content = json.dumps(
json_obj, indent=2, ensure_ascii=False
)
is_json = True
except:
pass
# Format based on content type
if is_json:
# JSON content: use code block with syntax highlighting
result_display = f"\n<details>\n<summary>{status_icon} Tool Result: {tool_name}</summary>\n\n```json\n{result_content}\n```\n\n</details>\n\n"
else:
# Plain text: use text code block to preserve formatting and add line breaks
result_display = f"\n<details>\n<summary>{status_icon} Tool Result: {tool_name}</summary>\n\n```text\n{result_content}\n```\n\n</details>\n\n"
try:
args_json_str = json.dumps(
tool_args_for_block, ensure_ascii=False
)
except Exception:
args_json_str = "{}"
queue.put_nowait(result_display)
def escape_html_attr(s: str) -> str:
if not isinstance(s, str):
return ""
return (
str(s)
.replace("&", "&amp;")
.replace("<", "&lt;")
.replace(">", "&gt;")
.replace('"', "&quot;")
.replace("\n", "&#10;")
.replace("\r", "&#13;")
)
# MUST escape both arguments and result with &quot; and &#10; to satisfy OpenWebUI's strict regex /="([^"]*)"/
# OpenWebUI `marked` extension does not match multiline attributes properly without &#10;
args_for_attr = (
escape_html_attr(args_json_str) if args_json_str else "{}"
)
result_for_attr = escape_html_attr(result_content)
# Emit the unified native tool_calls block:
# OpenWebUI 0.8.3 frontend regex explicitly expects: name="xxx" arguments="..." result="..." done="true"
# CRITICAL: <details> tag MUST be followed immediately by \n for the frontend Markdown extension to parse it!
tool_block = (
f'\n<details type="tool_calls"'
f' id="{tool_call_id}"'
f' name="{tool_name}"'
f' arguments="{args_for_attr}"'
f' result="{result_for_attr}"'
f' done="true">\n'
f"<summary>Tool Executed</summary>\n"
f"</details>\n\n"
)
queue.put_nowait(tool_block)
elif event_type == "tool.execution_progress":
# Tool execution progress update (for long-running tools)
@@ -3725,20 +3801,42 @@ class Pipe:
# Safe initial yield with error handling
try:
if debug_enabled and show_thinking:
yield "<think>\n"
if debug_enabled and __event_emitter__:
# Emit debug info as UI status rather than reasoning block
async def _emit_status(key: str, desc: str = None, **kwargs):
try:
final_desc = (
desc
if desc
else self._get_translation(user_lang, key, **kwargs)
)
await __event_emitter__(
{
"type": "status",
"data": {"description": final_desc, "done": True},
}
)
except Exception:
pass
if init_message:
yield init_message
for line in init_message.split("\n"):
if line.strip():
clean_msg = line.replace("> [Debug] ", "").strip()
asyncio.create_task(_emit_status("custom", desc=clean_msg))
if reasoning_effort and reasoning_effort != "off":
yield f"> [Debug] Reasoning Effort injected: {reasoning_effort.upper()}\n"
asyncio.create_task(
_emit_status(
"status_reasoning_inj", effort=reasoning_effort.upper()
)
)
yield "> [Debug] Connection established, waiting for response...\n"
state["thinking_started"] = True
asyncio.create_task(_emit_status("status_conn_est"))
except Exception as e:
# If initial yield fails, log but continue processing
self._emit_debug_log_sync(
f"Initial yield warning: {e}",
f"Initial status warning: {e}",
__event_call__,
debug_enabled=debug_enabled,
)
@@ -3766,12 +3864,21 @@ class Pipe:
except asyncio.TimeoutError:
if done.is_set():
break
if state["thinking_started"]:
if __event_emitter__ and debug_enabled:
try:
yield f"> [Debug] Waiting for response ({self.valves.TIMEOUT}s exceeded)...\n"
asyncio.create_task(
__event_emitter__(
{
"type": "status",
"data": {
"description": f"Waiting for response ({self.valves.TIMEOUT}s exceeded)...",
"done": True,
},
}
)
)
except:
# If yield fails during timeout, connection is gone
break
pass
continue
while not queue.empty():

File diff suppressed because it is too large


@@ -0,0 +1,82 @@
# GitHub Copilot SDK Pipe v0.7.0
**GitHub Copilot SDK Pipe v0.7.0** — A major infrastructure and UX upgrade. This release eliminates manual CLI management, fully embraces OpenWebUI's native tool calling interface, and ensures seamless compatibility with the latest OpenWebUI versions.
---
## 📦 Quick Installation
- **GitHub Copilot SDK (Pipe)**: [Install v0.7.0](https://openwebui.com/posts/ce96f7b4-12fc-4ac3-9a01-875713e69359)
- **GitHub Copilot SDK (Filter)**: [Install v0.1.2](https://openwebui.com/posts/403a62ee-a596-45e7-be65-fab9cc249dd6)
---
## 🚀 What's New in v0.7.0
### 1. Zero-Maintenance CLI Integration
The most significant infrastructure change: you no longer need to worry about CLI versions or background downloads.
| Before (v0.6.x) | After (v0.7.0) |
| :--- | :--- |
| CLI installed via background `curl \| bash` | CLI bundled inside the `github-copilot-sdk` pip package |
| Version mismatches between SDK and CLI | Versions are always in sync automatically |
| Fails in restricted networks | Works everywhere `pip install` works |
**How it works**: When you install `github-copilot-sdk==0.1.25`, the matching `copilot-cli v0.0.411` is included. The plugin auto-discovers the path and injects it into the environment—zero configuration required.
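The discovery order used by the plugin can be sketched as follows. This is a minimal illustration: `resolve_cli_path` and its injectable `which`/`bundled_path` parameters are hypothetical names for testability, not the plugin's actual API.

```python
import os
import shutil


def resolve_cli_path(env=None, which=shutil.which, bundled_path=None):
    """Sketch of the v0.7.0 CLI discovery order:
    explicit env var > system PATH > CLI bundled in the pip package."""
    env = os.environ if env is None else env

    # 1. Explicit override via COPILOT_CLI_PATH
    path = env.get("COPILOT_CLI_PATH", "")
    if path and os.path.exists(path):
        return path

    # 2. A `copilot` binary already available on PATH
    path = which("copilot")
    if path:
        return path

    # 3. The binary shipped inside the github-copilot-sdk package
    if bundled_path and os.path.exists(bundled_path):
        return bundled_path
    return None
```

Whichever path wins is exported back as `COPILOT_CLI_PATH` and prepended to `PATH`, so subprocesses can invoke `copilot` directly.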
### 2. Native OpenWebUI Tool Call UI
Tool calls from Copilot agents now render using **OpenWebUI's built-in tool call UI**.
- Tool execution status is displayed natively in the chat interface.
- Thinking processes (Chain of Thought) are visualized with the standard collapsible UI.
- Improved visual consistency and integration with the main OpenWebUI interface.
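The block the pipe now streams can be sketched like this, condensed from the escaping logic visible in the diff above. Quotes and newlines must be entity-escaped because the frontend extracts the attributes with a strict `="([^"]*)"` regex:

```python
import json


def escape_html_attr(s) -> str:
    """Entity-escape a value for use inside a double-quoted HTML attribute."""
    return (
        str(s)
        .replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("\n", "&#10;")
        .replace("\r", "&#13;")
    )


def build_tool_block(tool_call_id, tool_name, arguments, result):
    """Emit the <details type="tool_calls"> block OpenWebUI renders natively."""
    args_attr = escape_html_attr(json.dumps(arguments, ensure_ascii=False))
    result_attr = escape_html_attr(result)
    # The <details> tag must be followed immediately by a newline so the
    # frontend Markdown extension picks it up.
    return (
        f'\n<details type="tool_calls" id="{tool_call_id}" name="{tool_name}"'
        f' arguments="{args_attr}" result="{result_attr}" done="true">\n'
        f"<summary>Tool Executed</summary>\n</details>\n\n"
    )
```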
### 3. OpenWebUI v0.8.0+ Compatibility Fix (Bug Fix)
Resolved the **"Error getting file content"** failure that affected users on OpenWebUI v0.8.0 and later.
- **The Issue**: Relative path registration for published files was rejected by the latest OpenWebUI versions.
- **The Fix**: Switched to **absolute path registration**, restoring the ability to download generated artifacts to your local machine.
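The change boils down to what path string gets stored in the files table. A minimal sketch, assuming a typical OpenWebUI data root (the `DATA_DIR` value and filename below are illustrative):

```python
import os

DATA_DIR = "/app/backend/data"  # typical OpenWebUI data root (assumption)
dest_path = os.path.join(DATA_DIR, "uploads", "abc123_report.xlsx")

# v0.6.x stored a path relative to DATA_DIR, which OpenWebUI v0.8.0+
# no longer resolves when serving the file content:
old_db_path = os.path.relpath(dest_path, DATA_DIR)

# v0.7.0 registers the absolute path instead:
new_db_path = str(dest_path)
```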
### 4. Comprehensive Multi-language Support (i18n)
Native localization for status messages and UI hints in **11 languages**:
*English, Chinese (Simp/Trad/HK/TW), Japanese, Korean, French, German, Spanish, Italian, Russian, Vietnamese, and Indonesian.*
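Locale resolution is exact-match first, then an explicit fallback map, then the base language tag, finally defaulting to English. A trimmed sketch with only a few of the shipped locales:

```python
# Trimmed locale tables for illustration; the plugin ships 11 languages.
TRANSLATIONS = {"en-US": {}, "zh-CN": {}, "zh-TW": {}, "ja-JP": {}}
FALLBACK_MAP = {"zh": "zh-CN", "zh-TW": "zh-TW", "en": "en-US", "ja": "ja-JP"}


def resolve_language(user_language: str) -> str:
    """Normalize a browser/user locale to a supported translation key."""
    if not user_language:
        return "en-US"
    if user_language in TRANSLATIONS:   # exact match, e.g. "zh-TW"
        return user_language
    if user_language in FALLBACK_MAP:   # explicit alias
        return FALLBACK_MAP[user_language]
    base = user_language.split("-")[0]  # base tag, e.g. "ja" from "ja-Latn"
    if base in FALLBACK_MAP:
        return FALLBACK_MAP[base]
    return "en-US"
```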
### 5. Reasoning Status & UX Optimizations
- **Intelligent Status Display**: `Reasoning Effort injected` status is now only shown for native Copilot reasoning models.
- **Clean UI**: Removed redundant debug/status noise for BYOK and standard models.
- **Architecture Cleanup**: Refactored core setup and removed legacy installation code for a robust "one-click" experience.
---
## 🛠️ Key Capabilities
| Feature | Description |
| :--- | :--- |
| **Universal Tool Protocol** | Native support for **MCP**, **OpenAPI**, and **OpenWebUI built-in tools**. |
| **Native Tool Call UI** | Adapted to OpenWebUI's built-in tool call rendering. |
| **Workspace Isolation** | Strict sandboxing for per-session data privacy and security. |
| **Workspace Artifacts** | Agents generate files (Excel/CSV/HTML) with persistent download links via `publish_file_from_workspace`. |
| **Tool Execution** | Direct access to system binaries (Python, FFmpeg, Git, etc.). |
| **11-Language Localization** | Auto-detected, native status messages for global users. |
| **OpenWebUI v0.8.0+ Support** | Robust file handling for the latest OpenWebUI platform versions. |
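On a successful publish, `publish_file_from_workspace` returns a payload the agent can surface in chat. The shape below matches the code in the diff; the English `message`/`hint` strings are the en-US defaults (other locales are substituted automatically):

```python
def publish_result(file_id: str, filename: str) -> dict:
    """Shape of a successful publish_file_from_workspace return value."""
    download_url = f"/api/v1/files/{file_id}/content"
    return {
        "file_id": file_id,
        "filename": filename,
        "download_url": download_url,
        "message": "File published successfully.",
        "hint": f"Link: [Download {filename}]({download_url})",
    }
```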
---
## 📥 Import Chat Templates
- [📥 Star Prediction Chat log](https://fu-jie.github.io/awesome-openwebui/plugins/pipes/star-prediction-chat.json)
- [📥 Video Processing Chat log](https://fu-jie.github.io/awesome-openwebui/plugins/pipes/video-processing-chat.json)
*Settings -> Data -> Import Chats.*
---
## 🔗 Resources
- **GitHub Repository**: [openwebui-extensions](https://github.com/Fu-Jie/openwebui-extensions)
- **Full Changelog**: [README.md](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/pipes/github-copilot-sdk/README.md)