Compare commits
10 Commits
v2026.02.2
...
v2026.02.2
| Author | SHA1 | Date | |
|---|---|---|---|
| | fc9f1ccb43 | | |
| | 272b959a44 | | |
| | 0bde066088 | | |
| | 6334660e8d | | |
| | c29d84f97a | | |
| | aac2e89022 | | |
| | fea812d4f4 | | |
| | b570cbfcde | | |
| | adc5e0a1f4 | | |
| | 04b8108890 | | |
Submodule .git-worktrees/feature-copilot-cli deleted from 1bbddb2222
177
.github/copilot-instructions.md
vendored
@@ -478,7 +478,117 @@ async def get_user_language(self):

**Note**: Even if a plugin exposes a `Valves` configuration, try automatic detection first to improve the user experience.

### 8. Agent File Delivery Standards
### 8. Internationalization (i18n) Standards

Plugins intended for a global audience must ship with multi-language support (for example Chinese and English).

#### i18n Dictionary Definition

Define a `TRANSLATIONS` dictionary at the top of the file to hold the localized strings:

```python
TRANSLATIONS = {
    "en-US": {
        "status_starting": "Smart Mind Map is starting...",
    },
    "zh-CN": {
        "status_starting": "智能思维导图正在启动...",
    },
    # ... other languages
}

# Language fallback map
FALLBACK_MAP = {
    "zh": "zh-CN",
    "zh-TW": "zh-CN",
    "zh-HK": "zh-CN",
    "en": "en-US",
    "en-GB": "en-US",
}
```
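
A lookup over these two dictionaries might look like the sketch below. The names `resolve_language`, `get_translation`, and `DEFAULT_LANG` are hypothetical and not part of the source; the base-language step (`en-AU` to `en`) is likewise an assumption about how a fallback could reasonably behave.

```python
from typing import Dict

# Trimmed copies of the dictionaries defined above, so the sketch runs on its own.
TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "en-US": {"status_starting": "Smart Mind Map is starting..."},
    "zh-CN": {"status_starting": "智能思维导图正在启动..."},
}
FALLBACK_MAP: Dict[str, str] = {
    "zh": "zh-CN", "zh-TW": "zh-CN", "zh-HK": "zh-CN",
    "en": "en-US", "en-GB": "en-US",
}
DEFAULT_LANG = "en-US"  # assumption: English is the last-resort language


def resolve_language(lang: str) -> str:
    """Map a detected language tag to a key that exists in TRANSLATIONS."""
    if lang in TRANSLATIONS:
        return lang
    if lang in FALLBACK_MAP:
        return FALLBACK_MAP[lang]
    base = lang.split("-")[0]  # e.g. "en-AU" -> "en"
    return FALLBACK_MAP.get(base, DEFAULT_LANG)


def get_translation(lang: str, key: str) -> str:
    """Hypothetical helper: localized string, else English, else the key itself."""
    table = TRANSLATIONS[resolve_language(lang)]
    return table.get(key, TRANSLATIONS[DEFAULT_LANG].get(key, key))
```

Returning the key itself for an unknown string keeps a missing translation visible in the UI instead of raising inside a request.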
|
||||
|
||||
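#### Robust Language Detection

Open WebUI's frontend keeps the language setting in `localStorage`; it is neither synced to the backend database automatically nor passed through standard API parameters. To get the user's actual preferred language, you **must** use a multi-level fallback:

`JS runtime detection (localStorage)` > `HTTP browser header (Accept-Language)` > `default setting in the user profile` > `en-US`

> **Warning: Anti-Deadlock Guide**
>
> When frontend JS is executed through `__event_call__`, an exception thrown by the script means the callback `cb()` never runs. The backend `asyncio` await then blocks forever and stalls the entire request queue.
>
> Two layers of protection are **required**:
>
> 1. Wrap the JS in `try...catch` so that it always reaches a `return`.
> 2. On the backend, enforce a timeout with `asyncio.wait_for` (2 seconds is recommended).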
|
||||
|
||||
```python
import asyncio
from typing import Callable, Optional

from fastapi import Request


async def _get_user_context(
    self,
    __user__: Optional[dict],
    __event_call__: Optional[Callable] = None,
    __request__: Optional[Request] = None,
) -> dict:
    # Lowest priority: language stored in the user profile, defaulting to en-US
    user_language = __user__.get("language", "en-US") if __user__ else "en-US"

    # 1st fallback: HTTP Accept-Language header
    if __request__ and hasattr(__request__, "headers") and "accept-language" in __request__.headers:
        raw_lang = __request__.headers.get("accept-language", "")
        if raw_lang:
            # Take the first listed language and drop any ";q=" weight
            user_language = raw_lang.split(",")[0].split(";")[0]

    # 2nd fallback (best): execute JS in the frontend to read localStorage
    if __event_call__:
        try:
            js_code = """
                try {
                    return (
                        document.documentElement.lang ||
                        localStorage.getItem('locale') ||
                        navigator.language ||
                        'en-US'
                    );
                } catch (e) {
                    return 'en-US';
                }
            """
            # Critical: wait_for keeps an unresponsive frontend from hanging the backend
            frontend_lang = await asyncio.wait_for(
                __event_call__({"type": "execute", "data": {"code": js_code}}),
                timeout=2.0,
            )
            if frontend_lang and isinstance(frontend_lang, str):
                user_language = frontend_lang
        except Exception:
            pass  # keep the Accept-Language or profile value

    return {
        "user_language": user_language,
        # ... user_name, user_id etc.
    }
```
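
The timeout guard in the function above can be exercised in isolation. The sketch below is a minimal illustration, assuming two stand-in coroutines (`silent_frontend`, `healthy_frontend`) and a wrapper named `call_with_timeout`; none of these names come from Open WebUI.

```python
import asyncio
from typing import Awaitable, Callable


async def call_with_timeout(
    event_call: Callable[[dict], Awaitable[object]],
    payload: dict,
    default: str = "en-US",
    timeout: float = 2.0,
) -> str:
    """Await a frontend call, returning `default` if it fails or never answers."""
    try:
        result = await asyncio.wait_for(event_call(payload), timeout=timeout)
    except Exception:  # includes asyncio.TimeoutError
        return default
    return result if isinstance(result, str) and result else default


async def silent_frontend(payload: dict) -> str:
    """Stand-in for a frontend whose callback never fires."""
    await asyncio.Event().wait()  # blocks until wait_for cancels it
    return "unreachable"


async def healthy_frontend(payload: dict) -> str:
    """Stand-in for a frontend that answers immediately."""
    return "zh-CN"
```

Running `asyncio.run(call_with_timeout(silent_frontend, {}, timeout=0.05))` returns the default after the timeout instead of blocking, which is the behavior the guide asks for.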
|
||||
|
||||
#### Usage in Action/Filter

Call this context helper when an Action or Filter runs, then pass the result to your translation lookup to get the final string:

```python
async def action(
    self,
    body: dict,
    __user__: Optional[dict] = None,
    __event_call__: Optional[Callable] = None,
    __request__: Optional[Request] = None,
    **kwargs
) -> Optional[dict]:

    user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
    user_lang = user_ctx["user_language"]

    # Look up the localized text (via your own translation.get() wrapper)
    # start_msg = self._get_translation(user_lang, "status_starting")
```
|
||||
|
||||
### 9. Agent File Delivery Standards

When developing agent plugins that can generate files (such as the GitHub Copilot SDK integration), follow the standard process below so that files remain available across storage backends (local/S3) and bypass unnecessary RAG processing.

@@ -498,7 +608,7 @@ async def get_user_language(self):
- The agent should always treat the "current directory" as its protected, private workspace.
- The `filename` parameter of `publish_file_from_workspace` only needs the file name relative to the current directory.

### 9. Copilot SDK Tool Definition Standards
### 10. Copilot SDK Tool Definition Standards

When developing custom tools for the GitHub Copilot SDK, follow the definition pattern below so that the model recognizes the parameters correctly (and does not generate an empty `properties` schema):

@@ -532,6 +642,63 @@ my_tool = define_tool(
2. **Field descriptions**: In the `BaseModel`, use `Field(..., description="...")` to describe each parameter in detail.
3. **Required vs Optional**: Mark required fields (no default value) and optional fields (with a `default`) explicitly.
|
||||
|
||||
### 11. Copilot SDK Streaming & Tool Card Standards

When handling the model's reasoning output and tool calls, follow the strict output format below to stay compatible with the Markdown parser and native collapsible UI components of the OpenWebUI 0.8.x frontend.

#### Reasoning Streaming

For the frontend to show the collapsible "Thinking..." box and spinner animation, you **must** use the native `<think>` tag.

- **Correct tag wrapping**:
  ```html
  <think>
  The reasoning goes here...
  </think>
  ```
- **Key details**:
  - **Closing-tag detection**: Track state in code (for example `state["thinking_started"]`). When (1) the main answer is about to be emitted, or (2) a tool call fires (`tool.execution_start`), you **must first emit `\n</think>\n` to force the tag closed**. Otherwise the following answer text or tool panel is swallowed into the thinking box and the page layout breaks.
  - **No hand-assembled HTML**: Do not simulate the reasoning display by emitting large HTML fragments such as `<details type="reasoning">`. Streamed in chunks, this easily corrupts the frontend DOM tree and misplaces content.
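
The closing rule above can be sketched as a small state machine. This is an illustration only: the function name `render_event` and the event kinds `"reasoning"`, `"content"`, and `"tool_start"` are hypothetical labels, and opening the block with `<think>` followed by a newline is an assumption.

```python
from typing import Dict, List


def render_event(state: Dict[str, bool], kind: str, text: str = "") -> List[str]:
    """Return the chunks to stream for one event, closing <think> when needed.

    `kind` is one of "reasoning", "content", or "tool_start" (hypothetical names).
    """
    chunks: List[str] = []
    if kind == "reasoning":
        if not state.get("thinking_started"):
            chunks.append("<think>\n")
            state["thinking_started"] = True
        chunks.append(text)
        return chunks

    # Any non-reasoning event must close an open thinking block first.
    if state.get("thinking_started"):
        chunks.append("\n</think>\n")
        state["thinking_started"] = False
    if kind == "content":
        chunks.append(text)
    return chunks
```

Keeping the flag in a per-request `state` dict means the tag is closed exactly once, whichever of the two triggers arrives first.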
|
||||
|
||||
#### Native Tool Calls Block

To render the standard, native collapsible "tool call" card in the chat, emit HTML in exactly the following format to the queue when `event_type == "tool.execution_complete"`:

```python
# Double quotes inside attribute values must be escaped as &quot;
args_for_attr = args_json_str.replace('"', "&quot;")
result_for_attr = result_content.replace('"', "&quot;")

tool_block = (
    f'\n<details type="tool_calls"'
    f' id="{tool_call_id}"'
    f' name="{tool_name}"'
    f' arguments="{args_for_attr}"'
    f' result="{result_for_attr}"'
    f' done="true">\n'
    f"<summary>Tool Executed</summary>\n"
    f"</details>\n\n"
)
queue.put_nowait(tool_block)
```
|
||||
|
||||
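- **Critical Pitfalls**:
  1. **Attribute escaping (extremely important)**: Inside `<details>`, every double quote `"` within the `arguments` and `result` attributes **must** be replaced with `&quot;`. The OpenWebUI frontend extracts these values with the strict regex `="([^"]*)"`, so a raw double quote truncates the value immediately, the arguments render as empty, and parsing fails.
  2. **Newline requirement**: The content immediately after the closing angle bracket of `<details ...>` **must start on a new line** (that is, `>\n`); otherwise the Markdown extension engine does not recognize it as a standalone UI block.
  3. **No redundant notices**: Do not emit a plain-text `🔧 Executing...` block into the chat stream during the `tool.execution_start` event; the final page would then show two tool notices (one plain text, one collapsible card).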
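
The truncation described in the first pitfall is easy to reproduce with Python's `re` module. The sketch below applies the regex shape quoted above to one attribute; `build_attr` is a hypothetical helper, and the final `html.unescape` step only illustrates how a consumer could recover the original JSON, not what the frontend is confirmed to do.

```python
import html
import json
import re

ATTR_PATTERN = re.compile(r'arguments="([^"]*)"')  # regex shape quoted in the text


def build_attr(value: str, escape: bool) -> str:
    """Build an `arguments="..."` attribute, optionally escaping double quotes."""
    if escape:
        value = value.replace('"', "&quot;")
    return f'arguments="{value}"'


args_json = json.dumps({"path": "report.csv"})

# Unescaped: the capture stops at the first quote inside the JSON.
raw = ATTR_PATTERN.search(build_attr(args_json, escape=False)).group(1)

# Escaped: the whole value survives and can be unescaped back to JSON.
safe = ATTR_PATTERN.search(build_attr(args_json, escape=True)).group(1)
recovered = json.loads(html.unescape(safe))
```

Here `raw` holds only the opening brace, while `recovered` equals the original dictionary.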
|
||||
|
||||
#### Decoupling Debug Logs

For debug information about the *script's own runtime state*, such as connection setup, runtime environment, and cache loading:

- **Forbidden**: Do not yield this content into the final answer stream (or stuff it into the `<think>` tag); it pollutes the answer.
- **Recommended**: Use the native status bubble at the top of OpenWebUI (Status Events):
  ```python
  await __event_emitter__({
      "type": "status",
      "data": {"description": "Connection established, waiting for response...", "done": True}
  })
  ```
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Action Plugin Standards
@@ -947,8 +1114,10 @@ Filter instances are **singletons**.

Any **addition, modification, or removal** of a plugin must also update:
1. **Plugin code** (version)
2. **Project documentation** (`docs/`)
3. **README** (`README.md`)
2. **Plugin README** (`plugins/{type}/{name}/README.md` & `README_CN.md`)
3. **Project documentation** (`docs/plugins/{type}/{name}.md` & `.zh.md`)
4. **Project documentation index** (`docs/plugins/{type}/index.md` & `index.zh.md`: version number)
5. **Project root README** (`README.md` & `README_CN.md`: the update-date badge must be synced to the release date)

### 3. Release Workflow
|
||||
|
||||
|
||||
3
.github/workflows/release.yml
vendored
@@ -329,8 +329,7 @@ jobs:
DETECTED_CHANGES: ${{ needs.check-changes.outputs.release_notes }}
COMMITS: ${{ steps.commits.outputs.commits }}
run: |
echo "# ${VERSION} Release" > release_notes.md
echo "" >> release_notes.md
> release_notes.md

if [ -n "$TITLE" ]; then
echo "## $TITLE" >> release_notes.md
|
||||
|
||||
1
.gitignore
vendored
@@ -139,3 +139,4 @@ logs/
|
||||
|
||||
# OpenWebUI specific
|
||||
# Add any specific ignores for OpenWebUI plugins if needed
|
||||
.git-worktrees/
|
||||
|
||||
@@ -24,12 +24,12 @@ A collection of enhancements, plugins, and prompts for [OpenWebUI](https://githu
|
||||
|
||||
| Rank | Plugin | Version | Downloads | Views | 📅 Updated |
| :---: | :--- | :---: | :---: | :---: | :---: |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) |  |  |  |  |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) |  |  |  |  |
| 🥈 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) |  |  |  |  |
| 🥉 | [Export to Word Enhanced](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) |  |  |  |  |
| 4️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) |  |  |  |  |
| 5️⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) |  |  |  |  |
| 6️⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) |  |  |  |  |
| 4️⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) |  |  |  |  |
| 5️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) |  |  |  |  |
| 6️⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) |  |  |  |  |
|
||||
|
||||
### 📈 Total Downloads Trend
|
||||
|
||||
|
||||
@@ -21,12 +21,12 @@ OpenWebUI 增强功能集合。包含个人开发与收集的插件、提示词
|
||||
|
||||
| Rank | Plugin | Version | Downloads | Views | 📅 Updated |
| :---: | :--- | :---: | :---: | :---: | :---: |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) |  |  |  |  |
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) |  |  |  |  |
| 🥈 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) |  |  |  |  |
| 🥉 | [Export to Word Enhanced](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) |  |  |  |  |
| 4️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) |  |  |  |  |
| 5️⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) |  |  |  |  |
| 6️⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) |  |  |  |  |
| 4️⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) |  |  |  |  |
| 5️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) |  |  |  |  |
| 6️⃣ | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) |  |  |  |  |

### 📈 Cumulative Total Downloads Trend
|
||||
|
||||
|
||||
@@ -1,137 +1,81 @@
|
||||
# Async Context Compression
|
||||
# Async Context Compression Filter
|
||||
|
||||
<span class="category-badge filter">Filter</span>
|
||||
<span class="version-badge">v1.2.2</span>
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
|
||||
Reduces token consumption in long conversations through intelligent summarization while maintaining conversational coherence.
|
||||
This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.
|
||||
|
||||
## What's new in 1.3.0
|
||||
|
||||
- **Internationalization (i18n)**: Complete localization of user-facing messages across 9 languages (English, Chinese, Japanese, Korean, French, German, Spanish, Italian).
|
||||
- **Smart Status Display**: Added `token_usage_status_threshold` valve (default 80%) to intelligently control when token usage status is shown.
|
||||
- **Improved Performance**: Frontend language detection and logging are optimized to be completely non-blocking, maintaining lightning-fast TTFB.
|
||||
- **Copilot SDK Integration**: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
|
||||
- **Configuration**: `debug_mode` is now set to `false` by default for a quieter production experience.
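
The effect of the `token_usage_status_threshold` valve can be pictured with a small check. This is a sketch under assumptions: the function name `should_show_usage_status` is hypothetical, and treating the threshold as inclusive (`>=`) is a guess rather than the plugin's confirmed comparison.

```python
def should_show_usage_status(
    used_tokens: int,
    max_context_tokens: int,
    threshold_percent: float = 80.0,
) -> bool:
    """Return True when context usage reaches the configured percentage."""
    if max_context_tokens <= 0:
        return False  # no meaningful ratio without a positive context limit
    usage_percent = used_tokens / max_context_tokens * 100
    return usage_percent >= threshold_percent
```

With the default of 80, a conversation using 64,000 of 128,000 tokens (50%) stays silent, while one using 110,000 tokens (about 86%) would trigger the status notification.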
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
## Core Features
|
||||
|
||||
The Async Context Compression filter helps manage token usage in long conversations by:
|
||||
|
||||
- Intelligently summarizing older messages
|
||||
- Preserving important context
|
||||
- Reducing API costs
|
||||
- Maintaining conversation coherence
|
||||
|
||||
This is especially useful for:
|
||||
|
||||
- Long-running conversations
|
||||
- Complex multi-turn discussions
|
||||
- Cost optimization
|
||||
- Token limit management
|
||||
|
||||
## Features
|
||||
|
||||
- :material-arrow-collapse-vertical: **Smart Compression**: AI-powered context summarization
|
||||
- :material-clock-fast: **Async Processing**: Non-blocking background compression
|
||||
- :material-memory: **Context Preservation**: Keeps important information
|
||||
- :material-currency-usd-off: **Cost Reduction**: Minimize token usage
|
||||
- :material-console: **Frontend Debugging**: Debug logs in browser console
|
||||
- :material-alert-circle-check: **Enhanced Error Reporting**: Clear error status notifications
|
||||
- :material-check-all: **Open WebUI v0.7.x Compatibility**: Dynamic DB session handling
|
||||
- :material-account-convert: **Improved Compatibility**: Summary role changed to `assistant`
|
||||
- :material-shield-check: **Enhanced Stability**: Resolved race conditions in state management
|
||||
- :material-ruler: **Preflight Context Check**: Validates context fit before sending
|
||||
- :material-format-align-justify: **Structure-Aware Trimming**: Preserves document structure
|
||||
- :material-content-cut: **Native Tool Output Trimming**: Trims verbose tool outputs (Note: Non-native tool outputs are not fully injected into context)
|
||||
- :material-chart-bar: **Detailed Token Logging**: Granular token breakdown
|
||||
- :material-account-search: **Smart Model Matching**: Inherit config from base models
|
||||
- :material-image-off: **Multimodal Support**: Images are preserved but tokens are **NOT** calculated
|
||||
- ✅ **Full i18n Support**: Native localization across 9 languages.
|
||||
- ✅ Automatic compression triggered by token thresholds.
|
||||
- ✅ Asynchronous summarization that does not block chat responses.
|
||||
- ✅ Persistent storage via Open WebUI's shared database connection (PostgreSQL, SQLite, etc.).
|
||||
- ✅ Flexible retention policy to keep the first and last N messages.
|
||||
- ✅ Smart injection of historical summaries back into the context.
|
||||
- ✅ Structure-aware trimming that preserves document structure (headers, intro, conclusion).
|
||||
- ✅ Native tool output trimming for cleaner context when using function calling.
|
||||
- ✅ Real-time context usage monitoring with warning notifications (>90%).
|
||||
- ✅ Detailed token logging for precise debugging and optimization.
|
||||
- ✅ **Smart Model Matching**: Automatically inherits configuration from base models for custom presets.
|
||||
- ⚠ **Multimodal Support**: Images are preserved but their tokens are **NOT** calculated. Please adjust thresholds accordingly.
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
## Installation & Configuration
|
||||
|
||||
1. Download the plugin file: [`async_context_compression.py`](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression)
|
||||
2. Upload to OpenWebUI: **Admin Panel** → **Settings** → **Functions**
|
||||
3. Configure compression settings
|
||||
4. Enable the filter
|
||||
### 1) Database (automatic)
|
||||
|
||||
- Uses Open WebUI's shared database connection; no extra configuration needed.
|
||||
- The `chat_summary` table is created on first run.
|
||||
|
||||
### 2) Filter order
|
||||
|
||||
- Recommended order: pre-filters (<10) → this filter (10) → post-filters (>10).
|
||||
|
||||
---
|
||||
|
||||
## How It Works
|
||||
## Configuration Parameters
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[Incoming Messages] --> B{Token Count > Threshold?}
|
||||
B -->|No| C[Pass Through]
|
||||
B -->|Yes| D[Summarize Older Messages]
|
||||
D --> E[Preserve Recent Messages]
|
||||
E --> F[Combine Summary + Recent]
|
||||
F --> G[Send to LLM]
|
||||
```
|
||||
| Parameter | Default | Description |
| :--- | :--- | :--- |
| `priority` | `10` | Execution order; lower runs earlier. |
| `compression_threshold_tokens` | `64000` | Trigger asynchronous summary when total tokens exceed this value. Set to 50%-70% of your model's context window. |
| `max_context_tokens` | `128000` | Hard cap for context; older messages (except protected ones) are dropped if exceeded. |
| `keep_first` | `1` | Always keep the first N messages (protects system prompts). |
| `keep_last` | `6` | Always keep the last N messages to preserve recent context. |
| `summary_model` | `None` | Model for summaries. Strongly recommended to set a fast, economical model (e.g., `gemini-2.5-flash`, `deepseek-v3`). Falls back to the current chat model when empty. |
| `summary_model_max_context` | `0` | Max context tokens for the summary model. If 0, falls back to `model_thresholds` or global `max_context_tokens`. |
| `max_summary_tokens` | `16384` | Maximum tokens for the generated summary. |
| `summary_temperature` | `0.3` | Randomness for summary generation. Lower is more deterministic. |
| `model_thresholds` | `{}` | Per-model overrides for `compression_threshold_tokens` and `max_context_tokens` (useful for mixed models). |
| `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. |
| `debug_mode` | `false` | Log verbose debug info. Set to `false` in production. |
| `show_debug_log` | `false` | Print debug logs to browser console (F12). Useful for frontend debugging. |
| `show_token_usage_status` | `true` | Show token usage status notification in the chat interface. |
| `token_usage_status_threshold` | `80` | The minimum usage percentage (0-100) required to show a context usage status notification. |
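
To make the `keep_first` / `keep_last` retention policy concrete, here is a minimal sketch of how a message list could be split into a protected head, a middle section eligible for summarization, and a protected tail. The function name `split_for_summary` and the plain-dict message shape are assumptions for illustration; the filter's real selection logic may differ.

```python
from typing import List, Tuple

Message = dict  # assumption: messages are plain dicts


def split_for_summary(
    messages: List[Message],
    keep_first: int = 1,
    keep_last: int = 6,
) -> Tuple[List[Message], List[Message], List[Message]]:
    """Split into (protected head, middle eligible for summary, protected tail)."""
    n = len(messages)
    head_end = min(max(keep_first, 0), n)
    # The tail never overlaps the head, even in short conversations.
    tail_start = max(n - max(keep_last, 0), head_end)
    return messages[:head_end], messages[head_end:tail_start], messages[tail_start:]
```

For a 20-message conversation with the defaults, 1 message is protected at the start, 6 at the end, and the 13 in between are candidates for the summary; a 5-message conversation leaves nothing to summarize.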
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
## ⭐ Support
|
||||
|
||||
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `compression_threshold_tokens` | integer | `64000` | Trigger compression above this token count |
| `max_context_tokens` | integer | `128000` | Hard limit for context |
| `keep_first` | integer | `1` | Always keep the first N messages |
| `keep_last` | integer | `6` | Always keep the last N messages |
| `summary_model` | string | `None` | Model to use for summarization |
| `summary_model_max_context` | integer | `0` | Max context tokens for summary model |
| `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary |
| `enable_tool_output_trimming` | boolean | `false` | Enable trimming of large tool outputs |
|
||||
If this plugin has been useful, a star on [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) is a big motivation for me. Thank you for the support.
|
||||
|
||||
---
|
||||
## Troubleshooting ❓
|
||||
|
||||
## Example
|
||||
- **Initial system prompt is lost**: Keep `keep_first` greater than 0 to protect the initial message.
|
||||
- **Compression effect is weak**: Raise `compression_threshold_tokens` or lower `keep_first` / `keep_last` to allow more aggressive compression.
|
||||
- **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues)
|
||||
|
||||
### Before Compression
|
||||
## Changelog
|
||||
|
||||
```
|
||||
[Message 1] User: Tell me about Python...
|
||||
[Message 2] AI: Python is a programming language...
|
||||
[Message 3] User: What about its history?
|
||||
[Message 4] AI: Python was created by Guido...
|
||||
[Message 5] User: And its features?
|
||||
[Message 6] AI: Python has many features...
|
||||
... (many more messages)
|
||||
[Message 20] User: Current question
|
||||
```
|
||||
|
||||
### After Compression
|
||||
|
||||
```
|
||||
[Summary] Previous conversation covered Python basics,
|
||||
history, features, and common use cases...
|
||||
|
||||
[Message 18] User: Recent question about decorators
|
||||
[Message 19] AI: Decorators in Python are...
|
||||
[Message 20] User: Current question
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
!!! note "Prerequisites"
|
||||
- OpenWebUI v0.3.0 or later
|
||||
- Access to an LLM for summarization
|
||||
|
||||
!!! tip "Best Practices"
|
||||
- Set appropriate token thresholds based on your model's context window
|
||||
- Preserve more recent messages for technical discussions
|
||||
- Test compression settings in non-critical conversations first
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
??? question "Compression not triggering?"
|
||||
Check if the token count exceeds your configured threshold. Enable debug logging for more details.
|
||||
|
||||
??? question "Important context being lost?"
|
||||
Increase the `preserve_recent` setting or lower the compression ratio.
|
||||
|
||||
---
|
||||
|
||||
## Source Code
|
||||
|
||||
[:fontawesome-brands-github: View on GitHub](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression){ .md-button }
|
||||
See the full history on GitHub: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions)
|
||||
|
||||
@@ -1,137 +1,119 @@
|
||||
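# Async Context Compression
# Async Context Compression Filter

<span class="category-badge filter">Filter</span>
<span class="version-badge">v1.2.2</span>
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT

Reduces token consumption in long conversations through intelligent summarization while keeping the conversation coherent.
> **Important**: To keep all filters maintainable and easy to use, each filter should ship with clear, complete documentation that fully describes its features, configuration, and usage.

This filter uses intelligent summarization and message compression to significantly reduce token consumption in long conversations while keeping them coherent.

## What's new in 1.3.0

- **Internationalization (i18n) support**: All user-facing messages are localized, with native support for 9 languages (including Chinese, English, Japanese, Korean, and major European languages).
- **Smart status display**: The new `token_usage_status_threshold` valve (default 80%) controls when the token usage status is shown, reducing unnecessary interruptions.
- **Major performance improvement**: Frontend language detection and log handling were refactored to be non-blocking, so they do not affect time to first byte (TTFB) and streaming stays at millisecond-level latency.
- **Copilot SDK compatibility**: Automatically detects `copilot_sdk` based models and skips context compression for them to avoid conflicts.
- **Configuration change**: `debug_mode` now defaults to `false` for a quieter production experience.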
|
||||
|
||||
---
|
||||
|
||||
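## Overview
## Core Features

The Async Context Compression filter helps manage token usage in long conversations by:
- ✅ **Full internationalization**: Native support for 9 interface languages.
- ✅ **Automatic compression**: Context compression is triggered automatically by token thresholds.
- ✅ **Asynchronous summarization**: Summaries are generated in the background without blocking the current response.
- ✅ **Persistent storage**: Reuses Open WebUI's shared database connection, supporting PostgreSQL, SQLite, and others automatically.
- ✅ **Flexible retention policy**: Configurable retention of messages at the head and tail of the conversation keeps key information coherent.
- ✅ **Smart injection**: Historical summaries are injected into the new context.
- ✅ **Structure-aware trimming**: Collapses overly long messages while keeping the document skeleton (headings, opening, and ending).
- ✅ **Native tool output trimming**: Trims verbose tool call outputs.
- ✅ **Real-time monitoring**: Monitors context usage in real time and warns above 90%.
- ✅ **Detailed logging**: Provides precise token statistics for debugging.
- ✅ **Smart model matching**: Custom models automatically inherit the threshold configuration of their base models.
- ⚠ **Multimodal support**: Images are preserved, but their tokens are **NOT** counted. Adjust thresholds accordingly.

- Intelligently summarizing older messages
- Preserving key information
- Reducing API costs
- Maintaining conversation consistency

This is especially useful for:

- Long-running sessions
- Complex multi-turn discussions
- Cost optimization
- Context length control

## Features

- :material-arrow-collapse-vertical: **Smart Compression**: AI-powered context summarization
- :material-clock-fast: **Async Processing**: Non-blocking background compression
- :material-memory: **Context Preservation**: Keeps important information where possible
- :material-currency-usd-off: **Cost Reduction**: Reduces token usage
- :material-console: **Frontend Debugging**: Supports browser console logs
- :material-alert-circle-check: **Enhanced Error Reporting**: Clear error status notifications
- :material-check-all: **Open WebUI v0.7.x Compatibility**: Dynamic DB session handling
- :material-account-convert: **Improved Compatibility**: Summary role changed to `assistant`
- :material-shield-check: **Enhanced Stability**: Resolved race conditions in state management
- :material-ruler: **Preflight Context Check**: Validates that the context fits before sending
- :material-format-align-justify: **Structure-Aware Trimming**: Trimming that preserves document structure
- :material-content-cut: **Native Tool Output Trimming**: Automatically trims verbose tool outputs (Note: non-native tool call outputs are not fully injected into context)
- :material-chart-bar: **Detailed Token Logging**: Fine-grained token statistics
- :material-account-search: **Smart Model Matching**: Custom models automatically inherit base model configuration
- :material-image-off: **Multimodal Support**: Images are preserved but their tokens are **NOT** counted
For details on how the filter works and its processing flow, see the [Workflow Guide](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/WORKFLOW_GUIDE_CN.md).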
|
||||
|
||||
---
|
||||
|
||||
## Installation
## Installation & Configuration

1. Download the plugin file: [`async_context_compression.py`](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression)
2. Upload to OpenWebUI: **Admin Panel** → **Settings** → **Functions**
3. Configure the compression parameters
4. Enable the filter
### 1. Database (automatic)

- Uses Open WebUI's shared database connection automatically; **no extra configuration needed**.
- The `chat_summary` table is created automatically on first run.

### 2. Filter order

- Recommended order: pre-filters (<10) → this filter (10) → post-filters (>10).
|
||||
|
||||
---
|
||||
|
||||
## How It Works
## Configuration Parameters

```mermaid
graph TD
    A[Incoming Messages] --> B{Token Count > Threshold?}
    B -->|No| C[Pass Through]
    B -->|Yes| D[Summarize Older Messages]
    D --> E[Preserve Recent Messages]
    E --> F[Combine Summary + Recent]
    F --> G[Send to LLM]
```
You can adjust the following parameters in the filter's settings:

### Core Parameters

| Parameter | Default | Description |
| :--- | :--- | :--- |
| `priority` | `10` | Filter execution order; lower values run earlier. |
| `compression_threshold_tokens` | `64000` | **Important**: When total context tokens exceed this value, a summary is generated in the background. Set it to 50%-70% of the model's context window. |
| `max_context_tokens` | `128000` | **Important**: Hard cap for context; once exceeded, the earliest messages are removed (protected messages are kept). |
| `keep_first` | `1` | Always keep the first N messages of the conversation to protect system prompts or environment variables. |
| `keep_last` | `6` | Always keep the last N messages of the conversation so recent context stays coherent. |

### Summary Generation

| Parameter | Default | Description |
| :--- | :--- | :--- |
| `summary_model` | `None` | Model ID used to generate summaries. **Strongly recommended**: configure a fast, economical model with a large context window (e.g., `gemini-2.5-flash`, `deepseek-v3`). When empty, the filter tries to reuse the current chat model. |
| `summary_model_max_context` | `0` | Max context tokens for the summary model. If 0, falls back to `model_thresholds` or the global `max_context_tokens`. |
| `max_summary_tokens` | `16384` | Maximum tokens allowed for the generated summary. |
| `summary_temperature` | `0.1` | Controls the randomness of summary generation; lower values give more stable results. |

### Advanced Configuration

#### `model_thresholds` (Per-Model Thresholds)

This dictionary overrides the global `compression_threshold_tokens` and `max_context_tokens` for specific model IDs, which is useful when mixing models with different context windows.

**Recommended thresholds for GPT-4, Claude 3.5, Gemini 1.5/2.0, Qwen 2.5/3, DeepSeek V3, and others are included by default.**

**Configuration example:**

```json
{
  "gpt-4": {
    "compression_threshold_tokens": 8000,
    "max_context_tokens": 32000
  },
  "gemini-2.5-flash": {
    "compression_threshold_tokens": 734000,
    "max_context_tokens": 1048576
  }
}
```
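
A per-model override like the one configured above could be resolved as in the sketch below. It is a simplified illustration: `resolve_thresholds` and `GLOBAL_DEFAULTS` are hypothetical names, and only exact model-ID matches are handled, whereas the plugin's smart model matching (inheriting from base models) is not reproduced here.

```python
from typing import Dict

# Global defaults taken from the parameter tables in this document.
GLOBAL_DEFAULTS: Dict[str, int] = {
    "compression_threshold_tokens": 64000,
    "max_context_tokens": 128000,
}


def resolve_thresholds(
    model_id: str,
    model_thresholds: Dict[str, Dict[str, int]],
) -> Dict[str, int]:
    """Merge a model's overrides (exact ID match only) onto the global defaults."""
    resolved = dict(GLOBAL_DEFAULTS)
    resolved.update(model_thresholds.get(model_id, {}))
    return resolved


example_thresholds = {
    "gpt-4": {"compression_threshold_tokens": 8000, "max_context_tokens": 32000},
    "gemini-2.5-flash": {"compression_threshold_tokens": 734000, "max_context_tokens": 1048576},
}
```

Merging onto a copy of the defaults means a model entry may override just one of the two values and still get a complete configuration.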
|
||||
|
||||
---
|
||||
|
||||
## Configuration

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `compression_threshold_tokens` | integer | `64000` | Trigger compression above this token count |
| `max_context_tokens` | integer | `128000` | Hard limit for context |
| `keep_first` | integer | `1` | Always keep the first N messages |
| `keep_last` | integer | `6` | Always keep the last N messages |
| `summary_model` | string | `None` | Model to use for summarization |
| `summary_model_max_context` | integer | `0` | Max context tokens for the summary model |
| `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary |
| `enable_tool_output_trimming` | boolean | `false` | Enable trimming of long tool outputs |

| Parameter | Default | Description |
| :--- | :--- | :--- |
| `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. |
| `debug_mode` | `false` | Whether to print verbose debug information to the Open WebUI console log. Defaults to `false`, which is also recommended in production. |
| `show_debug_log` | `false` | Whether to print debug logs to the browser console (F12). Useful for frontend debugging. |
| `show_token_usage_status` | `true` | Whether to show a token usage status notification when a conversation turn ends. |
| `token_usage_status_threshold` | `80` | Minimum usage percentage (0-100) that triggers the context usage status notification. |
|
||||
|
||||
---
|
||||
|
||||
## Example
## ⭐ Support

### Before Compression
If this plugin has helped you, a Star on [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) would be welcome; it motivates me to keep improving it. Thank you for the support.

```
[Message 1] User: Tell me about Python...
[Message 2] AI: Python is a programming language...
[Message 3] User: What about its history?
[Message 4] AI: Python was created by Guido...
[Message 5] User: And its features?
[Message 6] AI: Python has many features...
... (many more messages)
[Message 20] User: Current question
```
## Troubleshooting ❓

### After Compression
- **Initial system prompt is lost**: Set `keep_first` to a value greater than 0.
- **Compression effect is weak**: Raise `compression_threshold_tokens`, or lower `keep_first` / `keep_last` for more aggressive compression.
- **Submit an issue**: If you run into any problems, please open an issue on GitHub: [OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues)

```
[Summary] Previous conversation covered Python basics,
history, features, and common use cases...
## Changelog

[Message 18] User: Recent question about decorators
[Message 19] AI: Decorators in Python are...
[Message 20] User: Current question
```
|
||||
|
||||
---
|
||||
|
||||
## Requirements

!!! note "Prerequisites"
    - OpenWebUI v0.3.0 or later
    - An available LLM for summarization

!!! tip "Best Practices"
    - Set appropriate token thresholds for your model's context window
    - For technical discussions, consider raising `preserve_recent`
    - Test compression in non-critical conversations first

---

## FAQ

??? question "Compression not triggering?"
    Check whether the token count exceeds the configured threshold, and enable debug logging for details.

??? question "Important context being lost?"
    Raise `preserve_recent` or lower the compression ratio.

---

## Source Code

[:fontawesome-brands-github: View on GitHub](https://github.com/Fu-Jie/openwebui-extensions/tree/main/plugins/filters/async-context-compression){ .md-button }
See the full history in the GitHub project: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions)
|
||||
|
||||
@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:
|
||||
|
||||
Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.
|
||||
|
||||
**Version:** 1.2.2
|
||||
**Version:** 1.3.0
|
||||
|
||||
[:octicons-arrow-right-24: Documentation](async-context-compression.md)
|
||||
|
||||
|
||||
@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:

Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.

**Version:** 1.2.2
**Version:** 1.3.0

[:octicons-arrow-right-24: Documentation](async-context-compression.md)
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# GitHub Copilot SDK Pipe for OpenWebUI
|
||||
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.6.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.7.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
|
||||
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that integrates the official [GitHub Copilot SDK](https://github.com/github/copilot-sdk). It enables you to use **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) **AND** your own models via **BYOK** (OpenAI, Anthropic) directly within OpenWebUI, providing a unified agentic experience with **strict User & Chat-level Workspace Isolation**.
|
||||
|
||||
@@ -14,12 +14,13 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
|
||||
|
||||
---
|
||||
|
||||
## ✨ v0.6.2 Updates (What's New)
|
||||
## ✨ v0.7.0 Updates (What's New)
|
||||
|
||||
- **🛠️ New Workspace Artifacts Tool**: Introduced `publish_file_from_workspace`. Agents can now generate files (e.g., Python-generated Excel/CSV) and provide direct download links for the user to click and save.
|
||||
- **⚙️ Workflow Optimization**: Improved reliability of the internal agentic workspace management.
|
||||
- **🛡️ Enhanced Security**: Refined access control for system resources within the isolated environment.
|
||||
- **🔧 Performance Tuning**: Optimized stream processing for larger context windows.
|
||||
- **🚀 Integrated CLI Management**: The Copilot CLI is now automatically managed and bundled via the `github-copilot-sdk` pip package. (v0.7.0)
|
||||
- **🧠 Native Tool Call UI**: Full adaptation to **OpenWebUI's native tool call UI** and thinking process visualization. (v0.7.0)
|
||||
- **🏠 OpenWebUI v0.8.0+ Fix**: Resolved "Error getting file content" download failure by switching to absolute path registration for published files. (v0.7.0)
|
||||
- **🌐 Comprehensive Multi-language Support**: Native localization for status messages in 11 languages (EN, ZH, JA, KO, FR, DE, ES, IT, RU, VI, ID). (v0.7.0)
|
||||
- **🧹 Architecture Cleanup**: Refactored core setup and optimized reasoning status display for a leaner experience. (v0.7.0)
|
||||
|
||||
---
|
||||
|
||||
@@ -31,8 +32,8 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
|
||||
- **♾️ Infinite Session Management**: Smart context window management with automatic compaction for indefinite conversation capability.
|
||||
- **🧠 Deep Database Integration**: Real-time persistence of TODO lists for long-running workflows.
|
||||
- **🌊 Advanced Streaming**: Full support for thinking process/Chain of Thought visualization.
|
||||
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support.
|
||||
- **⚡ Full-Lifecycle File Agent**: Supports receiving uploaded files for raw bypass analysis and publishing results (Excel/reports) as downloadable links.
|
||||
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support (bypasses RAG for direct binary access).
|
||||
- **📤 Workspace Artifacts (`publish_file_from_workspace`)**: Agents can generate files (Excel, CSV, HTML reports, etc.) and provide **persistent download links** directly in the chat.
|
||||
- **🖼️ Interactive Artifacts**: Automatically renders HTML/JS apps generated by the agent directly in the chat interface.
|
||||
|
||||
---
|
||||
@@ -110,7 +111,7 @@ If this plugin has been useful, a **Star** on [OpenWebUI Extensions](https://git
|
||||
|
||||
- **Agent ignores files?**: Ensure the Files Filter is enabled, otherwise RAG will interfere with raw binaries.
|
||||
- **No progress bar?**: The bar only appears when the Agent uses the `update_todo` tool.
|
||||
- **Dependencies**: This Pipe automatically installs `github-copilot-sdk` (Python) and `github-copilot-cli` (Binary).
|
||||
- **Dependencies**: This Pipe automatically manages `github-copilot-sdk` (Python) and utilizes the bundled binary CLI. No manual install required.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# GitHub Copilot SDK 官方管道
|
||||
|
||||
**作者:** [Fu-Jie](https://github.com/Fu-Jie) | **版本:** 0.6.2 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
|
||||
**作者:** [Fu-Jie](https://github.com/Fu-Jie) | **版本:** 0.7.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
|
||||
|
||||
这是一个用于 [OpenWebUI](https://github.com/open-webui/open-webui) 的高级 Pipe 函数,深度集成了 **GitHub Copilot SDK**。它不仅支持 **GitHub Copilot 官方模型**(如 `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`),还支持 **BYOK (自带 Key)** 模式对接自定义服务商(OpenAI, Anthropic),并具备**严格的用户与会话级工作区隔离**能力,提供统一且安全的 Agent 交互体验。
|
||||
|
||||
@@ -14,12 +14,13 @@
|
||||
|
||||
---
|
||||
|
||||
## ✨ 0.6.2 更新内容 (What's New)
|
||||
## ✨ 0.7.0 更新内容 (What's New)
|
||||
|
||||
- **🛠️ 新增工作区产物工具**: 引入 `publish_file_from_workspace`。Agent 现在可以生成物理文件(如使用 Python 生成的 Excel/CSV 报表),并直接在聊天界面提供点击下载链接。
|
||||
- **⚙️ 工作流优化**: 提升了内部 Agent 物理工作区管理的可靠性与原子性。
|
||||
- **🛡️ 安全增强**: 精细化了隔离环境下系统资源的访问控制策略。
|
||||
- **🔧 性能微调**: 针对大上下文窗口优化了流式数据处理性能。
|
||||
- **🚀 CLI 免维护集成**: Copilot CLI 现在通过 `github-copilot-sdk` pip 包自动同步管理,彻底告别手动 `curl | bash` 安装问题。(v0.7.0)
|
||||
- **🧠 原生工具调用 UI**: 全面适配 **OpenWebUI 原生工具调用 UI** 与模型思考过程(思维链)展示。(v0.7.0)
|
||||
- **🏠 OpenWebUI v0.8.0+ 兼容性修复**: 通过切换为绝对路径注册发布文件,彻底解决了“Error getting file content”无法下载到本地的问题。(v0.7.0)
|
||||
- **🌐 全面的多语言支持**: 针对状态消息进行了 11 国语言的原生本地化 (中/英/日/韩/法/德/西/意/俄/越/印尼)。(v0.7.0)
|
||||
- **🧹 架构精简**: 重构了初始化逻辑并优化了推理状态显示,提供更轻量稳健的体验。(v0.7.0)
|
||||
|
||||
---
|
||||
|
||||
@@ -31,8 +32,8 @@
|
||||
- **♾️ 无限会话管理**: 智能上下文窗口管理与自动压缩算法,支持无限时长的对话交互。
|
||||
- **🧠 深度数据库集成**: 实时持久化 TODO 列表到 UI 进度条。
|
||||
- **🌊 深度推理展示**: 完整支持模型思考过程 (Thinking Process) 的流式渲染。
|
||||
- **🖼️ 智能多模态**: 完整支持图像识别与附件上传分析。
|
||||
- **⚡ 全生命周期文件 Agent**: 支持接收上传文件进行绕过 RAG 的深度分析,并将处理结果(如 Excel/报告)发布为下载链接实现闭环。
|
||||
- **🖼️ 智能多模态**: 完整支持图像识别与附件上传分析(绕过 RAG 直接访问原始二进制内容)。
|
||||
- **📤 工作区产物工具 (`publish_file_from_workspace`)**: Agent 可生成文件(Excel、CSV、HTML 报告等)并直接提供**持久化下载链接**。管理员还可额外获得通过 `/content/html` 接口的**聊天内 HTML 预览**链接。
|
||||
- **🖼️ 交互式产物 (Artifacts)**: 自动渲染 Agent 生成的 HTML/JS 应用程序,直接在聊天界面交互。
|
||||
|
||||
---
|
||||
@@ -95,7 +96,7 @@
|
||||
### 1) 导入函数
|
||||
|
||||
1. 打开 OpenWebUI,前往 **工作区** -> **函数**。
|
||||
2. 点击 **+** (创建函数),完整粘贴 `github_copilot_sdk_cn.py` 的内容。
|
||||
2. 点击 **+** (创建函数),完整粘贴 `github_copilot_sdk.py` 的内容。
|
||||
3. 点击保存并确保已启用。
|
||||
|
||||
### 2) 获取 Token (Get Token)
|
||||
@@ -110,7 +111,7 @@
|
||||
|
||||
- **Agent 无法识别文件?**: 请确保已安装并启用了 Files Filter 插件,否则原始文件会被 RAG 干扰。
|
||||
- **看不到 TODO 进度条?**: 进度条仅在 Agent 使用 `update_todo` 工具(通常是处理复杂任务)时出现。
|
||||
- **依赖安装**: 本管道会自动尝试安装 `github-copilot-sdk` (Python 包) 和 `github-copilot-cli` (官方二进制)。
|
||||
- **依赖安装**: 本管道会自动管理 `github-copilot-sdk` (Python 包) 并优先直接使用内置的二进制 CLI,无需手动干预。
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -15,7 +15,7 @@ Pipes allow you to:
|
||||
|
||||
## Available Pipe Plugins
|
||||
|
||||
- [GitHub Copilot SDK](github-copilot-sdk.md) (v0.6.2) - Official GitHub Copilot SDK integration. Features **Workspace Isolation**, **Database Persistence**, **Zero-config OpenWebUI Tool Bridge**, **BYOK** support, and **dynamic MCP discovery**. Supports streaming, multimodal, and infinite sessions. [View Deep Dive](github-copilot-sdk-deep-dive.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.md).
|
||||
- [GitHub Copilot SDK](github-copilot-sdk.md) (v0.7.0) - Official GitHub Copilot SDK integration. Features **Workspace Isolation**, **Database Persistence**, **Zero-config OpenWebUI Tool Bridge**, **BYOK** support, and **dynamic MCP discovery**. Supports streaming, multimodal, and infinite sessions. [View Deep Dive](github-copilot-sdk-deep-dive.md) | [**View Advanced Tutorial**](github-copilot-sdk-tutorial.md).
|
||||
- **[Case Study: GitHub 100 Star Growth Analysis](star-prediction-example.md)** - Learn how to use the GitHub Copilot SDK Pipe with Minimax 2.1 to automatically analyze CSV data and generate project growth reports.
|
||||
- **[Case Study: High-Quality Video to GIF Conversion](video-processing-example.md)** - See how the model uses system-level FFmpeg to accelerate, scale, and optimize colors for screen recordings.
|
||||
|
||||
|
||||
@@ -15,7 +15,7 @@ Pipes 可以用于:
|
||||
|
||||
## 可用的 Pipe 插件
|
||||
|
||||
- [GitHub Copilot SDK](github-copilot-sdk.zh.md) (v0.6.2) - GitHub Copilot SDK 官方集成。具备**工作区安全隔离**、**数据库持久化**、**零配置工具桥接**与**BYOK (自带 Key) 支持**。支持流式输出、打字机思考过程及无限会话。[查看深度架构解析](github-copilot-sdk-deep-dive.zh.md) | [**查看进阶实战教程**](github-copilot-sdk-tutorial.zh.md)。
|
||||
- [GitHub Copilot SDK](github-copilot-sdk.zh.md) (v0.7.0) - GitHub Copilot SDK 官方集成。具备**工作区安全隔离**、**数据库持久化**、**零配置工具桥接**与**BYOK (自带 Key) 支持**。支持流式输出、打字机思考过程及无限会话。[查看深度架构解析](github-copilot-sdk-deep-dive.zh.md) | [**查看进阶实战教程**](github-copilot-sdk-tutorial.zh.md)。
|
||||
- **[实战案例:GitHub 100 Star 增长预测](star-prediction-example.zh.md)** - 展示如何使用 GitHub Copilot SDK Pipe 结合 Minimax 2.1 模型,自动编写脚本分析 CSV 数据并生成详细的项目增长报告。
|
||||
- **[实战案例:视频高质量 GIF 转换与加速](video-processing-example.zh.md)** - 演示模型如何通过底层 FFmpeg 工具对录屏进行加速、缩放及双阶段色彩优化处理。
|
||||
|
||||
|
||||
Binary file not shown.
|
Size before: 752 KiB | Size after: 200 KiB
@@ -9,6 +9,7 @@ icon_url: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAw
|
||||
description: Intelligently analyzes text content and generates interactive mind maps to help users structure and visualize knowledge.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
@@ -693,7 +694,7 @@ CSS_TEMPLATE_MINDMAP = """
|
||||
.content-area {
|
||||
padding: 0;
|
||||
flex: 1 1 0;
|
||||
background: transparent;
|
||||
background: var(--card-bg-color);
|
||||
position: relative;
|
||||
overflow: hidden;
|
||||
width: 100%;
|
||||
@@ -1514,6 +1515,7 @@ class Action:
|
||||
self,
|
||||
__user__: Optional[Dict[str, Any]],
|
||||
__event_call__: Optional[Callable[[Any], Awaitable[None]]] = None,
|
||||
__request__: Optional[Request] = None,
|
||||
) -> Dict[str, str]:
|
||||
"""Extract basic user context with safe fallbacks."""
|
||||
if isinstance(__user__, (list, tuple)):
|
||||
@@ -1528,20 +1530,36 @@ class Action:
|
||||
# Default from profile
|
||||
user_language = user_data.get("language", "en-US")
|
||||
|
||||
# Priority: Document Lang > LocalStorage (Frontend) > Browser > Profile (Default)
|
||||
# Level 1 Fallback: Accept-Language from __request__ headers
|
||||
if (
|
||||
__request__
|
||||
and hasattr(__request__, "headers")
|
||||
and "accept-language" in __request__.headers
|
||||
):
|
||||
raw_lang = __request__.headers.get("accept-language", "")
|
||||
if raw_lang:
|
||||
user_language = raw_lang.split(",")[0].split(";")[0]
|
||||
|
||||
# Priority: Document Lang > LocalStorage (Frontend) > Browser > Request Header > Profile
|
||||
if __event_call__:
|
||||
try:
|
||||
js_code = """
|
||||
return (
|
||||
document.documentElement.lang ||
|
||||
localStorage.getItem('locale') ||
|
||||
localStorage.getItem('language') ||
|
||||
navigator.language ||
|
||||
'en-US'
|
||||
);
|
||||
try {
|
||||
return (
|
||||
document.documentElement.lang ||
|
||||
localStorage.getItem('locale') ||
|
||||
localStorage.getItem('language') ||
|
||||
navigator.language ||
|
||||
'en-US'
|
||||
);
|
||||
} catch (e) {
|
||||
return 'en-US';
|
||||
}
|
||||
"""
|
||||
frontend_lang = await __event_call__(
|
||||
{"type": "execute", "data": {"code": js_code}}
|
||||
# Use asyncio.wait_for to prevent hanging if frontend fails to callback
|
||||
frontend_lang = await asyncio.wait_for(
|
||||
__event_call__({"type": "execute", "data": {"code": js_code}}),
|
||||
timeout=2.0,
|
||||
)
|
||||
if frontend_lang and isinstance(frontend_lang, str):
|
||||
user_language = frontend_lang
|
||||
@@ -2204,7 +2222,7 @@ class Action:
|
||||
flex-grow: 1;
|
||||
position: relative;
|
||||
overflow: hidden;
|
||||
background: transparent;
|
||||
background: var(--card-bg-color);
|
||||
min-height: 0;
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
@@ -2387,7 +2405,7 @@ class Action:
|
||||
__request__: Optional[Request] = None,
|
||||
) -> Optional[dict]:
|
||||
logger.info("Action: Smart Mind Map (v1.0.0) started")
|
||||
user_ctx = await self._get_user_context(__user__, __event_call__)
|
||||
user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
|
||||
user_language = user_ctx["user_language"]
|
||||
user_name = user_ctx["user_name"]
|
||||
user_id = user_ctx["user_id"]
|
||||
|
||||
@@ -1,18 +1,22 @@
|
||||
# Async Context Compression Filter
|
||||
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.2.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
|
||||
This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.
|
||||
|
||||
## What's new in 1.2.2
|
||||
## What's new in 1.3.0
|
||||
|
||||
- **Critical Fix**: Resolved `TypeError: 'str' object is not callable` caused by variable name conflict in logging function.
|
||||
- **Compatibility**: Enhanced `params` handling to support Pydantic objects, improving compatibility with different OpenWebUI versions.
|
||||
- **Internationalization (i18n)**: Complete localization of user-facing messages across 9 languages (English, Simplified Chinese, Traditional Chinese, Japanese, Korean, French, German, Spanish, Italian).
|
||||
- **Smart Status Display**: Added `token_usage_status_threshold` valve (default 80%) to intelligently control when token usage status is shown.
|
||||
- **Improved Performance**: Frontend language detection and logging are now non-blocking, so they do not add to time to first byte (TTFB).
|
||||
- **Copilot SDK Integration**: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
|
||||
- **Configuration**: `debug_mode` is now set to `false` by default for a quieter production experience.
|
||||
|
||||
---
|
||||
|
||||
## Core Features
|
||||
|
||||
- ✅ **Full i18n Support**: Native localization across 9 languages.
|
||||
- ✅ Automatic compression triggered by token thresholds.
|
||||
- ✅ Asynchronous summarization that does not block chat responses.
|
||||
- ✅ Persistent storage via Open WebUI's shared database connection (PostgreSQL, SQLite, etc.).
|
||||
@@ -55,8 +59,10 @@ This filter reduces token consumption in long conversations through intelligent
|
||||
| `summary_temperature` | `0.3` | Randomness for summary generation. Lower is more deterministic. |
|
||||
| `model_thresholds` | `{}` | Per-model overrides for `compression_threshold_tokens` and `max_context_tokens` (useful for mixed models). |
|
||||
| `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. |
|
||||
| `debug_mode` | `true` | Log verbose debug info. Set to `false` in production. |
|
||||
| `debug_mode` | `false` | Log verbose debug info. Set to `false` in production. |
|
||||
| `show_debug_log` | `false` | Print debug logs to browser console (F12). Useful for frontend debugging. |
|
||||
| `show_token_usage_status` | `true` | Show token usage status notification in the chat interface. |
|
||||
| `token_usage_status_threshold` | `80` | The minimum usage percentage (0-100) required to show a context usage status notification. |
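
As a rough illustration of how `show_token_usage_status` and `token_usage_status_threshold` might interact, here is a minimal sketch. The function name `should_show_usage_status` and the inclusive `>=` comparison are assumptions based on the table's wording ("minimum usage percentage"), not the filter's confirmed logic.

```python
def should_show_usage_status(
    tokens: int,
    max_tokens: int,
    threshold_percent: int = 80,
    enabled: bool = True,
) -> bool:
    """Return True when the usage notification should be displayed.

    Hypothetical helper: `enabled` mirrors `show_token_usage_status` and
    `threshold_percent` mirrors `token_usage_status_threshold`.
    """
    if not enabled or max_tokens <= 0:
        return False
    # Integer arithmetic avoids float rounding at the exact boundary.
    return tokens * 100 >= threshold_percent * max_tokens
```

Under these assumptions, 8,000 of 10,000 tokens triggers the notification at the default threshold of 80, while 7,999 does not.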
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,20 +1,24 @@
|
||||
# 异步上下文压缩过滤器
|
||||
|
||||
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.2.2 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
|
||||
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.3.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
|
||||
|
||||
> **重要提示**:为了确保所有过滤器的可维护性和易用性,每个过滤器都应附带清晰、完整的文档,以确保其功能、配置和使用方法得到充分说明。
|
||||
|
||||
本过滤器通过智能摘要和消息压缩技术,在保持对话连贯性的同时,显著降低长对话的 Token 消耗。
|
||||
|
||||
## 1.2.2 版本更新
|
||||
## 1.3.0 版本更新
|
||||
|
||||
- **严重错误修复**: 解决了因日志函数变量名冲突导致的 `TypeError: 'str' object is not callable` 错误。
|
||||
- **兼容性增强**: 改进了 `params` 处理逻辑以支持 Pydantic 对象,提高了对不同 OpenWebUI 版本的兼容性。
|
||||
- **国际化 (i18n) 支持**: 完成了所有用户可见消息的本地化,现已原生支持 9 种语言(含中、英、日、韩及欧洲主要语言)。
|
||||
- **智能状态显示**: 新增 `token_usage_status_threshold` 阀门(默认 80%),可以智能控制何时显示 Token 用量状态,减少不必要的打扰。
|
||||
- **性能大幅优化**: 对前端语言检测和日志处理流程进行了非阻塞重构,完全不影响首字节响应时间(TTFB),保持毫秒级极速推流。
|
||||
- **Copilot SDK 兼容**: 自动检测并跳过基于 `copilot_sdk` 模型的上下文压缩,避免冲突。
|
||||
- **配置项调整**: 为了提供更安静的生产环境体验,`debug_mode` 现已默认设置为 `false`。
|
||||
|
||||
---
|
||||
|
||||
## 核心特性
|
||||
|
||||
- ✅ **全方位国际化**: 原生支持 9 种界面语言。
|
||||
- ✅ **自动压缩**: 基于 Token 阈值自动触发上下文压缩。
|
||||
- ✅ **异步摘要**: 后台生成摘要,不阻塞当前对话响应。
|
||||
- ✅ **持久化存储**: 复用 Open WebUI 共享数据库连接,自动支持 PostgreSQL/SQLite 等。
|
||||
@@ -27,7 +31,7 @@
|
||||
- ✅ **智能模型匹配**: 自定义模型自动继承基础模型的阈值配置。
|
||||
- ⚠ **多模态支持**: 图片内容会被保留,但其 Token **不参与计算**。请相应调整阈值。
|
||||
|
||||
详细的工作原理和流程请参考 [工作流程指南](WORKFLOW_GUIDE_CN.md)。
|
||||
详细的工作原理和流程请参考 [工作流程指南](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/WORKFLOW_GUIDE_CN.md)。
|
||||
|
||||
---
|
||||
|
||||
@@ -93,9 +97,10 @@
|
||||
| 参数 | 默认值 | 描述 |
|
||||
| :----------------------------- | :------- | :-------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `enable_tool_output_trimming` | `false` | 启用时,若 `function_calling: "native"` 激活,将裁剪冗长的工具输出以仅提取最终答案。 |
|
||||
| `debug_mode` | `true` | 是否在 Open WebUI 的控制台日志中打印详细的调试信息(如 Token 计数、压缩进度、数据库操作等)。生产环境建议设为 `false`。 |
|
||||
| `debug_mode` | `false` | 是否在 Open WebUI 的控制台日志中打印详细的调试信息。生产环境默认且建议设为 `false`。 |
|
||||
| `show_debug_log` | `false` | 是否在浏览器控制台 (F12) 打印调试日志。便于前端调试。 |
|
||||
| `show_token_usage_status` | `true` | 是否在对话结束时显示 Token 使用情况的状态通知。 |
|
||||
| `token_usage_status_threshold` | `80` | 触发显示上下文用量状态通知的最低百分比阈值 (0-100)。 |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -5,17 +5,17 @@ author: Fu-Jie
|
||||
author_url: https://github.com/Fu-Jie/openwebui-extensions
|
||||
funding_url: https://github.com/open-webui
|
||||
description: Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression.
|
||||
version: 1.2.2
|
||||
version: 1.3.0
|
||||
openwebui_id: b1655bc8-6de9-4cad-8cb5-a6f7829a02ce
|
||||
license: MIT
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
📌 What's new in 1.2.1
|
||||
📌 What's new in 1.3.0
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
✅ Smart Configuration: Automatically detects base model settings for custom models and adds `summary_model_max_context` for independent summary limits.
|
||||
✅ Performance & Refactoring: Optimized threshold parsing with caching and removed redundant code for better efficiency.
|
||||
✅ Bug Fixes & Modernization: Fixed `datetime` deprecation warnings and corrected type annotations.
|
||||
✅ Smart Status Display: Added `token_usage_status_threshold` valve (default 80%) to control when token usage status is shown, reducing unnecessary notifications.
|
||||
✅ Copilot SDK Integration: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
|
||||
✅ Improved User Experience: Status messages now only appear when token usage exceeds the configured threshold, keeping the interface cleaner.
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
📌 Overview
|
||||
@@ -150,7 +150,7 @@ summary_temperature
|
||||
Description: Controls the randomness of the summary generation. Lower values produce more deterministic output.
|
||||
|
||||
debug_mode
|
||||
Default: true
|
||||
Default: false
|
||||
Description: Prints detailed debug information to the log. Recommended to set to `false` in production.
|
||||
|
||||
show_debug_log
|
||||
@@ -268,6 +268,7 @@ import hashlib
|
||||
import time
|
||||
import contextlib
|
||||
import logging
|
||||
from functools import lru_cache
|
||||
|
||||
# Setup logger
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -391,6 +392,130 @@ class ChatSummary(owui_Base):
|
||||
)
|
||||
|
||||
|
||||
TRANSLATIONS = {
|
||||
"en-US": {
|
||||
"status_context_usage": "Context Usage (Estimated): {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ High Usage",
|
||||
"status_loaded_summary": "Loaded historical summary (Hidden {count} historical messages)",
|
||||
"status_context_summary_updated": "Context Summary Updated: {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_generating_summary": "Generating context summary in background...",
|
||||
"status_summary_error": "Summary Error: {error}",
|
||||
"summary_prompt_prefix": "【Previous Summary: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\nBelow is the recent conversation:",
|
||||
"tool_trimmed": "... [Tool outputs trimmed]\n{content}",
|
||||
"content_collapsed": "\n... [Content collapsed] ...\n",
|
||||
},
|
||||
"zh-CN": {
|
||||
"status_context_usage": "上下文用量 (预估): {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ 用量较高",
|
||||
"status_loaded_summary": "已加载历史总结 (隐藏了 {count} 条历史消息)",
|
||||
"status_context_summary_updated": "上下文总结已更新: {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_generating_summary": "正在后台生成上下文总结...",
|
||||
"status_summary_error": "总结生成错误: {error}",
|
||||
"summary_prompt_prefix": "【前情提要:以下是历史对话的总结,仅供上下文参考。请不要回复总结内容本身,直接回答之后最新的问题。】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\n以下是最近的对话:",
|
||||
"tool_trimmed": "... [工具输出已裁剪]\n{content}",
|
||||
"content_collapsed": "\n... [内容已折叠] ...\n",
|
||||
},
|
||||
"zh-HK": {
|
||||
"status_context_usage": "上下文用量 (預估): {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ 用量較高",
|
||||
"status_loaded_summary": "已載入歷史總結 (隱藏了 {count} 條歷史訊息)",
|
||||
"status_context_summary_updated": "上下文總結已更新: {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_generating_summary": "正在後台生成上下文總結...",
|
||||
"status_summary_error": "總結生成錯誤: {error}",
|
||||
"summary_prompt_prefix": "【前情提要:以下是歷史對話的總結,僅供上下文參考。請不要回覆總結內容本身,直接回答之後最新的問題。】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\n以下是最近的對話:",
|
||||
"tool_trimmed": "... [工具輸出已裁剪]\n{content}",
|
||||
"content_collapsed": "\n... [內容已折疊] ...\n",
|
||||
},
|
||||
"zh-TW": {
|
||||
"status_context_usage": "上下文用量 (預估): {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ 用量較高",
|
||||
"status_loaded_summary": "已載入歷史總結 (隱藏了 {count} 條歷史訊息)",
|
||||
"status_context_summary_updated": "上下文總結已更新: {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_generating_summary": "正在後台生成上下文總結...",
|
||||
"status_summary_error": "總結生成錯誤: {error}",
|
||||
"summary_prompt_prefix": "【前情提要:以下是歷史對話的總結,僅供上下文參考。請不要回覆總結內容本身,直接回答之後最新的問題。】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\n以下是最近的對話:",
|
||||
"tool_trimmed": "... [工具輸出已裁剪]\n{content}",
|
||||
"content_collapsed": "\n... [內容已折疊] ...\n",
|
||||
},
|
||||
"ja-JP": {
|
||||
"status_context_usage": "コンテキスト使用量 (推定): {tokens} / {max_tokens} トークン ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ 使用量高",
|
||||
"status_loaded_summary": "履歴の要約を読み込みました ({count} 件の履歴メッセージを非表示)",
|
||||
"status_context_summary_updated": "コンテキストの要約が更新されました: {tokens} / {max_tokens} トークン ({ratio}%)",
|
||||
"status_generating_summary": "バックグラウンドでコンテキスト要約を生成しています...",
|
||||
"status_summary_error": "要約エラー: {error}",
|
||||
"summary_prompt_prefix": "【これまでのあらすじ:以下は過去の会話の要約であり、コンテキストの参考としてのみ提供されます。要約の内容自体には返答せず、その後の最新の質問に直接答えてください。】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\n以下は最近の会話です:",
|
||||
"tool_trimmed": "... [ツールの出力をトリミングしました]\n{content}",
|
||||
"content_collapsed": "\n... [コンテンツが折りたたまれました] ...\n",
|
||||
},
|
||||
"ko-KR": {
|
||||
"status_context_usage": "컨텍스트 사용량 (예상): {tokens} / {max_tokens} 토큰 ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ 사용량 높음",
|
||||
"status_loaded_summary": "이전 요약 불러옴 ({count}개의 이전 메시지 숨김)",
|
||||
"status_context_summary_updated": "컨텍스트 요약 업데이트됨: {tokens} / {max_tokens} 토큰 ({ratio}%)",
|
||||
"status_generating_summary": "백그라운드에서 컨텍스트 요약 생성 중...",
|
||||
"status_summary_error": "요약 오류: {error}",
|
||||
"summary_prompt_prefix": "【이전 요약: 다음은 이전 대화의 요약이며 문맥 참고용으로만 제공됩니다. 요약 내용 자체에 답하지 말고 이후의 최신 질문에 직접 답하세요.】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\n다음은 최근 대화입니다:",
|
||||
"tool_trimmed": "... [도구 출력 잘림]\n{content}",
|
||||
"content_collapsed": "\n... [내용 접힘] ...\n",
|
||||
},
|
||||
"fr-FR": {
|
||||
"status_context_usage": "Utilisation du contexte (estimée) : {tokens} / {max_tokens} jetons ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ Utilisation élevée",
|
||||
"status_loaded_summary": "Résumé historique chargé ({count} messages d'historique masqués)",
|
||||
"status_context_summary_updated": "Résumé du contexte mis à jour : {tokens} / {max_tokens} jetons ({ratio}%)",
|
||||
"status_generating_summary": "Génération du résumé du contexte en arrière-plan...",
|
||||
"status_summary_error": "Erreur de résumé : {error}",
|
||||
"summary_prompt_prefix": "【Résumé précédent : Ce qui suit est un résumé de la conversation historique, fourni uniquement pour le contexte. Ne répondez pas au contenu du résumé lui-même ; répondez directement aux dernières questions.】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\nVoici la conversation récente :",
|
||||
"tool_trimmed": "... [Sorties d'outils coupées]\n{content}",
|
||||
"content_collapsed": "\n... [Contenu réduit] ...\n",
|
||||
},
|
||||
"de-DE": {
|
||||
"status_context_usage": "Kontextnutzung (geschätzt): {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ Hohe Nutzung",
|
||||
"status_loaded_summary": "Historische Zusammenfassung geladen ({count} historische Nachrichten ausgeblendet)",
|
||||
"status_context_summary_updated": "Kontextzusammenfassung aktualisiert: {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_generating_summary": "Kontextzusammenfassung wird im Hintergrund generiert...",
|
||||
"status_summary_error": "Zusammenfassungsfehler: {error}",
|
||||
"summary_prompt_prefix": "【Vorherige Zusammenfassung: Das Folgende ist eine Zusammenfassung der historischen Konversation, die nur als Kontext dient. Antworten Sie nicht auf den Inhalt der Zusammenfassung selbst, sondern direkt auf die nachfolgenden neuesten Fragen.】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\nHier ist die jüngste Konversation:",
|
||||
"tool_trimmed": "... [Werkzeugausgaben gekürzt]\n{content}",
|
||||
"content_collapsed": "\n... [Inhalt ausgeblendet] ...\n",
|
||||
},
|
||||
"es-ES": {
|
||||
"status_context_usage": "Uso del contexto (estimado): {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ Uso elevado",
|
||||
"status_loaded_summary": "Resumen histórico cargado ({count} mensajes históricos ocultos)",
|
||||
"status_context_summary_updated": "Resumen del contexto actualizado: {tokens} / {max_tokens} Tokens ({ratio}%)",
|
||||
"status_generating_summary": "Generando resumen del contexto en segundo plano...",
|
||||
"status_summary_error": "Error de resumen: {error}",
|
||||
"summary_prompt_prefix": "【Resumen anterior: El siguiente es un resumen de la conversación histórica, proporcionado solo como contexto. No responda al contenido del resumen en sí; responda directamente a las preguntas más recientes.】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\nA continuación se muestra la conversación reciente:",
|
||||
"tool_trimmed": "... [Salidas de herramientas recortadas]\n{content}",
|
||||
"content_collapsed": "\n... [Contenido contraído] ...\n",
|
||||
},
|
||||
"it-IT": {
|
||||
"status_context_usage": "Utilizzo contesto (stimato): {tokens} / {max_tokens} Token ({ratio}%)",
|
||||
"status_high_usage": " | ⚠️ Utilizzo elevato",
|
||||
"status_loaded_summary": "Riepilogo storico caricato ({count} messaggi storici nascosti)",
|
||||
"status_context_summary_updated": "Riepilogo contesto aggiornato: {tokens} / {max_tokens} Token ({ratio}%)",
|
||||
"status_generating_summary": "Generazione riepilogo contesto in background...",
|
||||
"status_summary_error": "Errore riepilogo: {error}",
|
||||
"summary_prompt_prefix": "【Riepilogo precedente: Il seguente è un riepilogo della conversazione storica, fornito solo per contesto. Non rispondere al contenuto del riepilogo stesso; rispondi direttamente alle domande più recenti.】\n\n",
|
||||
"summary_prompt_suffix": "\n\n---\nDi seguito è riportata la conversazione recente:",
|
||||
"tool_trimmed": "... [Output degli strumenti tagliati]\n{content}",
|
||||
"content_collapsed": "\n... [Contenuto compresso] ...\n",
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
# Global cache for tiktoken encoding
|
||||
TIKTOKEN_ENCODING = None
|
||||
if tiktoken:
|
||||
@@ -400,6 +525,26 @@ if tiktoken:
|
||||
logger.error(f"[Init] Failed to load tiktoken encoding: {e}")
|
||||
|
||||
|
||||
@lru_cache(maxsize=1024)
|
||||
def _get_cached_tokens(text: str) -> int:
|
||||
"""Calculates tokens with LRU caching for exact string matches."""
|
||||
if not text:
|
||||
return 0
|
||||
if TIKTOKEN_ENCODING:
|
||||
try:
|
||||
# tiktoken is relatively fast, but caching on exact string match replaces repeated
|
||||
# encoding of historical messages with a much cheaper cache lookup.
|
||||
return len(TIKTOKEN_ENCODING.encode(text))
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
f"[Token Count] tiktoken error: {e}, falling back to character estimation"
|
||||
)
|
||||
|
||||
|
||||
# Fallback strategy: Rough estimation (1 token ≈ 4 chars)
|
||||
return len(text) // 4
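
The caching idea can be tried without `tiktoken` installed. The following is a minimal sketch that keeps only the character-based fallback. The name `estimate_tokens` is hypothetical, and the 4-characters-per-token ratio is the same rough heuristic used in the fallback above.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text`, memoized on the exact string."""
    if not text:
        return 0
    # Rough heuristic only: about 4 characters per token.
    return len(text) // 4
```

`estimate_tokens.cache_info()` reports hits and misses, which helps confirm that repeated historical messages are served from the cache. A cache hit still hashes and compares the string, so it is cheaper than re-encoding but not free for very long inputs.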
|
||||
|
||||
|
||||
class Filter:
|
||||
def __init__(self):
|
||||
self.valves = self.Valves()
|
||||
@@ -409,8 +554,105 @@ class Filter:
|
||||
sessionmaker(bind=self._db_engine) if self._db_engine else None
|
||||
)
|
||||
self._model_thresholds_cache: Optional[Dict[str, Any]] = None
|
||||
|
||||
# Fallback mapping for variants not in TRANSLATIONS keys
|
||||
self.fallback_map = {
|
||||
"es-AR": "es-ES",
|
||||
"es-MX": "es-ES",
|
||||
"fr-CA": "fr-FR",
|
||||
"en-CA": "en-US",
|
||||
"en-GB": "en-US",
|
||||
"en-AU": "en-US",
|
||||
"de-AT": "de-DE",
|
||||
}
|
||||
|
||||
self._init_database()
|
||||
|
||||
def _resolve_language(self, lang: str) -> str:
|
||||
"""Resolve the best matching language code from the TRANSLATIONS dict."""
|
||||
target_lang = lang
|
||||
|
||||
# 1. Direct match
|
||||
if target_lang in TRANSLATIONS:
|
||||
return target_lang
|
||||
|
||||
# 2. Variant fallback (explicit mapping)
|
||||
if target_lang in self.fallback_map:
|
||||
target_lang = self.fallback_map[target_lang]
|
||||
if target_lang in TRANSLATIONS:
|
||||
return target_lang
|
||||
|
||||
# 3. Base language fallback (e.g. fr-BE -> fr-FR)
|
||||
if "-" in lang:
|
||||
base_lang = lang.split("-")[0]
|
||||
for supported_lang in TRANSLATIONS:
|
||||
if supported_lang.startswith(base_lang + "-"):
|
||||
return supported_lang
|
||||
|
||||
# 4. Final Fallback to en-US
|
||||
return "en-US"
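
The fallback chain in `_resolve_language` can be sketched as a standalone function. This is an illustration under stated assumptions: `SUPPORTED`, `VARIANTS`, and `resolve_language` are hypothetical names with a reduced locale set. Unlike the method above, the sketch also maps bare codes such as `fr` or `zh`, which is a deliberate difference.

```python
from typing import Iterable, Mapping

# Hypothetical, reduced data for the example.
SUPPORTED = ("en-US", "zh-CN", "fr-FR", "es-ES")
VARIANTS = {"es-MX": "es-ES", "en-GB": "en-US"}


def resolve_language(
    lang: str,
    supported: Iterable[str] = SUPPORTED,
    variants: Mapping[str, str] = VARIANTS,
    default: str = "en-US",
) -> str:
    """Pick the best supported locale for `lang`."""
    supported = tuple(supported)
    # 1. Direct match.
    if lang in supported:
        return lang
    # 2. Explicit variant mapping.
    mapped = variants.get(lang)
    if mapped in supported:
        return mapped
    # 3. Base-language match, e.g. fr-BE -> fr-FR (and, here, fr -> fr-FR).
    base = lang.split("-")[0]
    for candidate in supported:
        if candidate.startswith(base + "-"):
            return candidate
    # 4. Final fallback.
    return default
```

In step 3 the first supported locale sharing the base language wins, so the ordering of the supported list matters when several regional variants exist.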
|
||||
|
||||
def _get_translation(self, lang: str, key: str, **kwargs) -> str:
|
||||
"""Get translated string for the given language and key."""
|
||||
target_lang = self._resolve_language(lang)
|
||||
lang_dict = TRANSLATIONS.get(target_lang, TRANSLATIONS["en-US"])
|
||||
text = lang_dict.get(key, TRANSLATIONS["en-US"].get(key, key))
|
||||
if kwargs:
|
||||
try:
|
||||
text = text.format(**kwargs)
|
||||
except Exception as e:
|
||||
logger.warning(f"Translation formatting failed for {key}: {e}")
|
||||
return text
|
||||
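Review note: `_get_translation` falls back twice (target language, then `en-US`, then the key itself) and swallows formatting errors. The sketch below shows that behavior with made-up strings; the keys `status_loaded_summary` and `greeting` and their texts are illustrative assumptions, and `get_translation` skips language resolution for brevity.

```python
import logging

logger = logging.getLogger(__name__)

# Illustrative strings only; not the plugin's real translation table.
TRANSLATIONS = {
    "en-US": {
        "status_loaded_summary": "Loaded summary ({count} messages hidden)",
        "greeting": "Hello",
    },
    "fr-FR": {
        "status_loaded_summary": "Résumé chargé ({count} messages masqués)",
    },
}


def get_translation(lang: str, key: str, **kwargs) -> str:
    """Look up `key` for `lang`, falling back to en-US and then to the key."""
    lang_dict = TRANSLATIONS.get(lang, TRANSLATIONS["en-US"])
    text = lang_dict.get(key, TRANSLATIONS["en-US"].get(key, key))
    if kwargs:
        try:
            text = text.format(**kwargs)
        except (KeyError, IndexError, ValueError) as exc:
            # Return the unformatted template rather than failing the request.
            logger.warning("Translation formatting failed for %s: %s", key, exc)
    return text
```

A consequence of swallowing the error is that a mismatched placeholder shows up to the user as a raw `{count}` in the status text, so translation tables benefit from a test that formats every key with the expected arguments.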
|
||||
async def _get_user_context(
|
||||
self,
|
||||
__user__: Optional[Dict[str, Any]],
|
||||
__event_call__: Optional[Callable[[Any], Awaitable[None]]] = None,
|
||||
) -> Dict[str, str]:
|
||||
"""Extract basic user context with safe fallbacks."""
|
||||
if isinstance(__user__, (list, tuple)):
|
||||
user_data = __user__[0] if __user__ else {}
|
||||
elif isinstance(__user__, dict):
|
||||
user_data = __user__
|
||||
else:
|
||||
user_data = {}
|
||||
|
||||
user_id = user_data.get("id", "unknown_user")
|
||||
user_name = user_data.get("name", "User")
|
||||
user_language = user_data.get("language", "en-US")
|
||||
|
||||
if __event_call__:
|
||||
try:
|
||||
js_code = """
|
||||
return (
|
||||
document.documentElement.lang ||
|
||||
localStorage.getItem('locale') ||
|
||||
localStorage.getItem('language') ||
|
||||
navigator.language ||
|
||||
'en-US'
|
||||
);
|
||||
"""
|
||||
frontend_lang = await asyncio.wait_for(
|
||||
__event_call__({"type": "execute", "data": {"code": js_code}}),
|
||||
timeout=1.0,
|
||||
)
|
||||
if frontend_lang and isinstance(frontend_lang, str):
|
||||
user_language = frontend_lang
|
||||
except asyncio.TimeoutError:
|
||||
logger.warning(
|
||||
"Failed to retrieve frontend language: Timeout (using fallback)"
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
f"Failed to retrieve frontend language: {type(e).__name__}: {e}"
|
||||
)
|
||||
|
||||
return {
|
||||
"user_id": user_id,
|
||||
"user_name": user_name,
|
||||
"user_language": user_language,
|
||||
}
|
||||
|
||||
def _parse_model_thresholds(self) -> Dict[str, Any]:
|
||||
"""Parse model_thresholds string into a dictionary.
|
||||
|
||||
@@ -574,7 +816,7 @@ class Filter:
|
||||
description="The temperature for summary generation.",
|
||||
)
|
||||
debug_mode: bool = Field(
|
||||
default=True, description="Enable detailed logging for debugging."
|
||||
default=False, description="Enable detailed logging for debugging."
|
||||
)
|
||||
show_debug_log: bool = Field(
|
||||
default=False, description="Show debug logs in the frontend console"
|
||||
@@ -582,6 +824,12 @@ class Filter:
|
||||
show_token_usage_status: bool = Field(
|
||||
default=True, description="Show token usage status notification"
|
||||
)
|
||||
token_usage_status_threshold: int = Field(
|
||||
default=80,
|
||||
ge=0,
|
||||
le=100,
|
||||
description="Only show token usage status when usage exceeds this percentage (0-100). Set to 0 to always show.",
|
||||
)
|
||||
enable_tool_output_trimming: bool = Field(
|
||||
default=False,
|
||||
description="Enable trimming of large tool outputs (only works with native function calling).",
|
||||
@@ -654,20 +902,7 @@ class Filter:
|
||||
|
||||
def _count_tokens(self, text: str) -> int:
|
||||
"""Counts the number of tokens in the text."""
|
||||
if not text:
|
||||
return 0
|
||||
|
||||
if TIKTOKEN_ENCODING:
|
||||
try:
|
||||
return len(TIKTOKEN_ENCODING.encode(text))
|
||||
except Exception as e:
|
||||
if self.valves.debug_mode:
|
||||
logger.warning(
|
||||
f"[Token Count] tiktoken error: {e}, falling back to character estimation"
|
||||
)
|
||||
|
||||
# Fallback strategy: Rough estimation (1 token ≈ 4 chars)
|
||||
return len(text) // 4
|
||||
return _get_cached_tokens(text)
|
||||
|
||||
def _calculate_messages_tokens(self, messages: List[Dict]) -> int:
|
||||
"""Calculates the total tokens for a list of messages."""
|
||||
@@ -693,6 +928,20 @@ class Filter:
|
||||
|
||||
return total_tokens
|
||||
|
||||
def _estimate_messages_tokens(self, messages: List[Dict]) -> int:
|
||||
"""Fast estimation of tokens based on character count (1/4 ratio)."""
|
||||
total_chars = 0
|
||||
for msg in messages:
|
||||
content = msg.get("content", "")
|
||||
if isinstance(content, list):
|
||||
for part in content:
|
||||
if isinstance(part, dict) and part.get("type") == "text":
|
||||
total_chars += len(part.get("text", ""))
|
||||
else:
|
||||
total_chars += len(str(content))
|
||||
|
||||
return total_chars // 4
|
||||
|
||||
def _get_model_thresholds(self, model_id: str) -> Dict[str, int]:
|
||||
"""Gets threshold configuration for a specific model.
|
||||
|
||||
@@ -830,11 +1079,13 @@ class Filter:
|
||||
}})();
|
||||
"""
|
||||
|
||||
await __event_call__(
|
||||
{
|
||||
"type": "execute",
|
||||
"data": {"code": js_code},
|
||||
}
|
||||
asyncio.create_task(
|
||||
__event_call__(
|
||||
{
|
||||
"type": "execute",
|
||||
"data": {"code": js_code},
|
||||
}
|
||||
)
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"Error emitting debug log: {e}")
|
||||
@@ -876,17 +1127,55 @@ class Filter:
|
||||
js_code = f"""
|
||||
console.log("%c[Compression] {safe_message}", "{css}");
|
||||
"""
|
||||
# Add timeout to prevent blocking if frontend connection is broken
|
||||
await asyncio.wait_for(
|
||||
event_call({"type": "execute", "data": {"code": js_code}}),
|
||||
timeout=2.0,
|
||||
)
|
||||
except asyncio.TimeoutError:
|
||||
logger.warning(
|
||||
f"Failed to emit log to frontend: Timeout (connection may be broken)"
|
||||
asyncio.create_task(
|
||||
event_call({"type": "execute", "data": {"code": js_code}})
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to emit log to frontend: {type(e).__name__}: {e}")
|
||||
logger.error(
|
||||
f"Failed to process log to frontend: {type(e).__name__}: {e}"
|
||||
)
|
||||
|
||||
def _should_show_status(self, usage_ratio: float) -> bool:
|
||||
"""
|
||||
Check if token usage status should be shown based on threshold.
|
||||
|
||||
Args:
|
||||
usage_ratio: Current usage ratio (0.0 to 1.0)
|
||||
|
||||
Returns:
|
||||
True if status should be shown, False otherwise
|
||||
"""
|
||||
if not self.valves.show_token_usage_status:
|
||||
return False
|
||||
|
||||
# If threshold is 0, always show
|
||||
if self.valves.token_usage_status_threshold == 0:
|
||||
return True
|
||||
|
||||
# Check if usage exceeds threshold
|
||||
threshold_ratio = self.valves.token_usage_status_threshold / 100.0
|
||||
return usage_ratio >= threshold_ratio
|
||||
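Review note: the interaction between `show_token_usage_status` and `token_usage_status_threshold` is easiest to check as a pure function. The sketch below restates `_should_show_status` with the valve values passed as parameters; the parameter names `enabled` and `threshold_percent` are assumptions for illustration.

```python
def should_show_status(
    usage_ratio: float,
    enabled: bool = True,
    threshold_percent: int = 80,
) -> bool:
    """Decide whether to emit the token usage status.

    usage_ratio is in the range 0.0 to 1.0; threshold_percent is 0 to 100.
    """
    if not enabled:
        return False
    # A threshold of 0 means "always show".
    if threshold_percent == 0:
        return True
    return usage_ratio >= threshold_percent / 100.0
```

With the default threshold of 80, a conversation at 50% usage stays silent, one at exactly 80% is reported, and disabling the valve suppresses the status regardless of usage.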
|
||||
def _should_skip_compression(
|
||||
self, body: dict, __model__: Optional[dict] = None
|
||||
) -> bool:
|
||||
"""
|
||||
Check if compression should be skipped.
|
||||
Returns True if:
|
||||
1. The base model includes 'copilot_sdk'
|
||||
"""
|
||||
# Check if base model includes copilot_sdk
|
||||
if __model__:
|
||||
base_model_id = __model__.get("base_model_id", "")
|
||||
if "copilot_sdk" in base_model_id.lower():
|
||||
return True
|
||||
|
||||
# Also check model in body
|
||||
model_id = body.get("model", "")
|
||||
if "copilot_sdk" in model_id.lower():
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
async def inlet(
|
||||
self,
|
||||
@@ -903,6 +1192,19 @@ class Filter:
|
||||
Compression Strategy: Only responsible for injecting existing summaries, no Token calculation.
|
||||
"""
|
||||
|
||||
# Check if compression should be skipped (e.g., for copilot_sdk)
|
||||
if self._should_skip_compression(body, __model__):
|
||||
if self.valves.debug_mode:
|
||||
logger.info(
|
||||
"[Inlet] Skipping compression: copilot_sdk detected in base model"
|
||||
)
|
||||
if self.valves.show_debug_log and __event_call__:
|
||||
await self._log(
|
||||
"[Inlet] ⏭️ Skipping compression: copilot_sdk detected",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
return body
|
||||
|
||||
messages = body.get("messages", [])
|
||||
|
||||
# --- Native Tool Output Trimming (Opt-in, only for native function calling) ---
|
||||
@@ -966,8 +1268,14 @@ class Filter:
|
||||
final_answer = content[last_match_end:].strip()
|
||||
|
||||
if final_answer:
|
||||
msg["content"] = (
|
||||
f"... [Tool outputs trimmed]\n{final_answer}"
|
||||
msg["content"] = self._get_translation(
|
||||
(
|
||||
__user__.get("language", "en-US")
|
||||
if __user__
|
||||
else "en-US"
|
||||
),
|
||||
"tool_trimmed",
|
||||
content=final_answer,
|
||||
)
|
||||
trimmed_count += 1
|
||||
else:
|
||||
@@ -980,8 +1288,14 @@ class Filter:
|
||||
if len(parts) > 1:
|
||||
final_answer = parts[-1].strip()
|
||||
if final_answer:
|
||||
msg["content"] = (
|
||||
f"... [Tool outputs trimmed]\n{final_answer}"
|
||||
msg["content"] = self._get_translation(
|
||||
(
|
||||
__user__.get("language", "en-US")
|
||||
if __user__
|
||||
else "en-US"
|
||||
),
|
||||
"tool_trimmed",
|
||||
content=final_answer,
|
||||
)
|
||||
trimmed_count += 1
|
||||
|
||||
@@ -1173,6 +1487,10 @@ class Filter:
|
||||
# Target is to compress up to the (total - keep_last) message
|
||||
target_compressed_count = max(0, len(messages) - self.valves.keep_last)
|
||||
|
||||
# Get user context for i18n
|
||||
user_ctx = await self._get_user_context(__user__, __event_call__)
|
||||
lang = user_ctx["user_language"]
|
||||
|
||||
await self._log(
|
||||
f"[Inlet] Recorded target compression progress: {target_compressed_count}",
|
||||
event_call=__event_call__,
|
||||
@@ -1207,10 +1525,9 @@ class Filter:
|
||||
|
||||
# 2. Summary message (Inserted as Assistant message)
|
||||
summary_content = (
|
||||
f"【Previous Summary: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n"
|
||||
f"{summary_record.summary}\n\n"
|
||||
f"---\n"
|
||||
f"Below is the recent conversation:"
|
||||
self._get_translation(lang, "summary_prompt_prefix")
|
||||
+ f"{summary_record.summary}"
|
||||
+ self._get_translation(lang, "summary_prompt_suffix")
|
||||
)
|
||||
summary_msg = {"role": "assistant", "content": summary_content}
|
||||
|
||||
@@ -1249,16 +1566,27 @@ class Filter:
|
||||
"max_context_tokens", self.valves.max_context_tokens
|
||||
)
|
||||
|
||||
# Calculate total tokens
|
||||
total_tokens = await asyncio.to_thread(
|
||||
self._calculate_messages_tokens, calc_messages
|
||||
)
|
||||
# --- Fast Estimation Check ---
|
||||
estimated_tokens = self._estimate_messages_tokens(calc_messages)
|
||||
|
||||
# Preflight Check Log
|
||||
await self._log(
|
||||
f"[Inlet] 🔎 Preflight Check: {total_tokens}t / {max_context_tokens}t ({(total_tokens/max_context_tokens*100):.1f}%)",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
# Since this is a hard limit check, only skip precise calculation if we are far below it (margin of 15%)
|
||||
if estimated_tokens < max_context_tokens * 0.85:
|
||||
total_tokens = estimated_tokens
|
||||
await self._log(
|
||||
f"[Inlet] 🔎 Fast Preflight Check (Est): {total_tokens}t / {max_context_tokens}t (Well within limit)",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
else:
|
||||
# Calculate exact total tokens via tiktoken
|
||||
total_tokens = await asyncio.to_thread(
|
||||
self._calculate_messages_tokens, calc_messages
|
||||
)
|
||||
|
||||
# Preflight Check Log
|
||||
await self._log(
|
||||
f"[Inlet] 🔎 Precise Preflight Check: {total_tokens}t / {max_context_tokens}t ({(total_tokens/max_context_tokens*100):.1f}%)",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
|
||||
# If over budget, reduce history (Keep Last)
|
||||
if total_tokens > max_context_tokens:
|
||||
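Review note: the preflight in this hunk uses the cheap character estimate when it is below 85% of the limit and only then pays for a precise count. Later hunks infer which path was taken by testing `total_tokens == estimated_tokens`; that test can match by coincidence on the precise path, and it would stop matching on the estimate path if it is evaluated again after `total_tokens` has been reduced. A sketch that returns an explicit flag instead is shown below. `count_with_preflight` and its `precise_count` callback are hypothetical names.

```python
from typing import Callable, Tuple


def count_with_preflight(
    text: str,
    limit: int,
    precise_count: Callable[[str], int],
    margin: float = 0.85,
) -> Tuple[int, bool]:
    """Return (token_count, used_precise).

    The rough estimate (1 token is about 4 characters) is trusted only when it
    is clearly below the limit; otherwise the precise counter is called.
    """
    estimate = len(text) // 4
    if estimate < limit * margin:
        return estimate, False
    return precise_count(text), True
```

Callers can then branch on `used_precise` when subtracting trimmed or dropped messages, which keeps the counting strategy stable for the whole trimming pass.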
@@ -1325,7 +1653,9 @@ class Filter:
|
||||
first_line_found = True
|
||||
# Add placeholder if there's more content coming
|
||||
if idx < last_line_idx:
|
||||
kept_lines.append("\n... [Content collapsed] ...\n")
|
||||
kept_lines.append(
|
||||
self._get_translation(lang, "content_collapsed")
|
||||
)
|
||||
continue
|
||||
|
||||
# Keep last non-empty line
|
||||
@@ -1347,8 +1677,13 @@ class Filter:
|
||||
target_msg["metadata"]["is_trimmed"] = True
|
||||
|
||||
# Calculate token reduction
|
||||
old_tokens = self._count_tokens(content)
|
||||
new_tokens = self._count_tokens(target_msg["content"])
|
||||
# Use current token strategy
|
||||
if total_tokens == estimated_tokens:
|
||||
old_tokens = len(content) // 4
|
||||
new_tokens = len(target_msg["content"]) // 4
|
||||
else:
|
||||
old_tokens = self._count_tokens(content)
|
||||
new_tokens = self._count_tokens(target_msg["content"])
|
||||
diff = old_tokens - new_tokens
|
||||
total_tokens -= diff
|
||||
|
||||
@@ -1362,7 +1697,12 @@ class Filter:
|
||||
# Strategy 2: Fallback - Drop Oldest Message Entirely (FIFO)
|
||||
# (User requested to remove progressive trimming for other cases)
|
||||
dropped = tail_messages.pop(0)
|
||||
dropped_tokens = self._count_tokens(str(dropped.get("content", "")))
|
||||
if total_tokens == estimated_tokens:
|
||||
dropped_tokens = len(str(dropped.get("content", ""))) // 4
|
||||
else:
|
||||
dropped_tokens = self._count_tokens(
|
||||
str(dropped.get("content", ""))
|
||||
)
|
||||
total_tokens -= dropped_tokens
|
||||
|
||||
if self.valves.show_debug_log and __event_call__:
|
||||
@@ -1382,14 +1722,24 @@ class Filter:
|
||||
final_messages = candidate_messages
|
||||
|
||||
# Calculate detailed token stats for logging
|
||||
system_tokens = (
|
||||
self._count_tokens(system_prompt_msg.get("content", ""))
|
||||
if system_prompt_msg
|
||||
else 0
|
||||
)
|
||||
head_tokens = self._calculate_messages_tokens(head_messages)
|
||||
summary_tokens = self._count_tokens(summary_content)
|
||||
tail_tokens = self._calculate_messages_tokens(tail_messages)
|
||||
if total_tokens == estimated_tokens:
|
||||
system_tokens = (
|
||||
len(system_prompt_msg.get("content", "")) // 4
|
||||
if system_prompt_msg
|
||||
else 0
|
||||
)
|
||||
head_tokens = self._estimate_messages_tokens(head_messages)
|
||||
summary_tokens = len(summary_content) // 4
|
||||
tail_tokens = self._estimate_messages_tokens(tail_messages)
|
||||
else:
|
||||
system_tokens = (
|
||||
self._count_tokens(system_prompt_msg.get("content", ""))
|
||||
if system_prompt_msg
|
||||
else 0
|
||||
)
|
||||
head_tokens = self._calculate_messages_tokens(head_messages)
|
||||
summary_tokens = self._count_tokens(summary_content)
|
||||
tail_tokens = self._calculate_messages_tokens(tail_messages)
|
||||
|
||||
system_info = (
|
||||
f"System({system_tokens}t)" if system_prompt_msg else "System(0t)"
|
||||
@@ -1408,22 +1758,43 @@ class Filter:
|
||||
# Prepare status message (Context Usage format)
|
||||
if max_context_tokens > 0:
|
||||
usage_ratio = total_section_tokens / max_context_tokens
|
||||
status_msg = f"Context Usage (Estimated): {total_section_tokens} / {max_context_tokens} Tokens ({usage_ratio*100:.1f}%)"
|
||||
if usage_ratio > 0.9:
|
||||
status_msg += " | ⚠️ High Usage"
|
||||
else:
|
||||
status_msg = f"Loaded historical summary (Hidden {compressed_count} historical messages)"
|
||||
# Only show status if threshold is met
|
||||
if self._should_show_status(usage_ratio):
|
||||
status_msg = self._get_translation(
|
||||
lang,
|
||||
"status_context_usage",
|
||||
tokens=total_section_tokens,
|
||||
max_tokens=max_context_tokens,
|
||||
ratio=f"{usage_ratio*100:.1f}",
|
||||
)
|
||||
if usage_ratio > 0.9:
|
||||
status_msg += self._get_translation(lang, "status_high_usage")
|
||||
|
||||
if __event_emitter__:
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
if __event_emitter__:
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
else:
|
||||
# For the case where max_context_tokens is 0, show summary info without threshold check
|
||||
if self.valves.show_token_usage_status and __event_emitter__:
|
||||
status_msg = self._get_translation(
|
||||
lang, "status_loaded_summary", count=compressed_count
|
||||
)
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
|
||||
# Emit debug log to frontend (Keep the structured log as well)
|
||||
await self._emit_debug_log(
|
||||
@@ -1454,9 +1825,20 @@ class Filter:
|
||||
"max_context_tokens", self.valves.max_context_tokens
|
||||
)
|
||||
|
||||
total_tokens = await asyncio.to_thread(
|
||||
self._calculate_messages_tokens, calc_messages
|
||||
)
|
||||
# --- Fast Estimation Check ---
|
||||
estimated_tokens = self._estimate_messages_tokens(calc_messages)
|
||||
|
||||
# Only skip precise calculation if we are clearly below the limit
|
||||
if estimated_tokens < max_context_tokens * 0.85:
|
||||
total_tokens = estimated_tokens
|
||||
await self._log(
|
||||
f"[Inlet] 🔎 Fast limit check (Est): {total_tokens}t / {max_context_tokens}t",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
else:
|
||||
total_tokens = await asyncio.to_thread(
|
||||
self._calculate_messages_tokens, calc_messages
|
||||
)
|
||||
|
||||
if total_tokens > max_context_tokens:
|
||||
await self._log(
|
||||
@@ -1476,7 +1858,12 @@ class Filter:
|
||||
> start_trim_index + 1 # Keep at least 1 message after keep_first
|
||||
):
|
||||
dropped = final_messages.pop(start_trim_index)
|
||||
dropped_tokens = self._count_tokens(str(dropped.get("content", "")))
|
||||
if total_tokens == estimated_tokens:
|
||||
dropped_tokens = len(str(dropped.get("content", ""))) // 4
|
||||
else:
|
||||
dropped_tokens = self._count_tokens(
|
||||
str(dropped.get("content", ""))
|
||||
)
|
||||
total_tokens -= dropped_tokens
|
||||
|
||||
await self._log(
|
||||
@@ -1485,23 +1872,30 @@ class Filter:
|
||||
)
|
||||
|
||||
# Send status notification (Context Usage format)
|
||||
if __event_emitter__:
|
||||
status_msg = f"Context Usage (Estimated): {total_tokens} / {max_context_tokens} Tokens"
|
||||
if max_context_tokens > 0:
|
||||
usage_ratio = total_tokens / max_context_tokens
|
||||
status_msg += f" ({usage_ratio*100:.1f}%)"
|
||||
if max_context_tokens > 0:
|
||||
usage_ratio = total_tokens / max_context_tokens
|
||||
# Only show status if threshold is met
|
||||
if self._should_show_status(usage_ratio):
|
||||
status_msg = self._get_translation(
|
||||
lang,
|
||||
"status_context_usage",
|
||||
tokens=total_tokens,
|
||||
max_tokens=max_context_tokens,
|
||||
ratio=f"{usage_ratio*100:.1f}",
|
||||
)
|
||||
if usage_ratio > 0.9:
|
||||
status_msg += " | ⚠️ High Usage"
|
||||
status_msg += self._get_translation(lang, "status_high_usage")
|
||||
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
if __event_emitter__:
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
|
||||
body["messages"] = final_messages
|
||||
|
||||
@@ -1517,6 +1911,7 @@ class Filter:
|
||||
body: dict,
|
||||
__user__: Optional[dict] = None,
|
||||
__metadata__: dict = None,
|
||||
__model__: dict = None,
|
||||
__event_emitter__: Callable[[Any], Awaitable[None]] = None,
|
||||
__event_call__: Callable[[Any], Awaitable[None]] = None,
|
||||
) -> dict:
|
||||
@@ -1524,6 +1919,23 @@ class Filter:
|
||||
Executed after the LLM response is complete.
|
||||
Calculates Token count in the background and triggers summary generation (does not block current response, does not affect content output).
|
||||
"""
|
||||
# Check if compression should be skipped (e.g., for copilot_sdk)
|
||||
if self._should_skip_compression(body, __model__):
|
||||
if self.valves.debug_mode:
|
||||
logger.info(
|
||||
"[Outlet] Skipping compression: copilot_sdk detected in base model"
|
||||
)
|
||||
if self.valves.show_debug_log and __event_call__:
|
||||
await self._log(
|
||||
"[Outlet] ⏭️ Skipping compression: copilot_sdk detected",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
return body
|
||||
|
||||
# Get user context for i18n
|
||||
user_ctx = await self._get_user_context(__user__, __event_call__)
|
||||
lang = user_ctx["user_language"]
|
||||
|
||||
chat_ctx = self._get_chat_context(body, __metadata__)
|
||||
chat_id = chat_ctx["chat_id"]
|
||||
if not chat_id:
|
||||
@@ -1547,6 +1959,7 @@ class Filter:
|
||||
body,
|
||||
__user__,
|
||||
target_compressed_count,
|
||||
lang,
|
||||
__event_emitter__,
|
||||
__event_call__,
|
||||
)
|
||||
@@ -1561,6 +1974,7 @@ class Filter:
|
||||
body: dict,
|
||||
user_data: Optional[dict],
|
||||
target_compressed_count: Optional[int],
|
||||
lang: str = "en-US",
|
||||
__event_emitter__: Callable[[Any], Awaitable[None]] = None,
|
||||
__event_call__: Callable[[Any], Awaitable[None]] = None,
|
||||
):
|
||||
@@ -1595,37 +2009,58 @@ class Filter:
|
||||
event_call=__event_call__,
|
||||
)
|
||||
|
||||
# Calculate Token count in a background thread
|
||||
current_tokens = await asyncio.to_thread(
|
||||
self._calculate_messages_tokens, messages
|
||||
)
|
||||
# --- Fast Estimation Check ---
|
||||
estimated_tokens = self._estimate_messages_tokens(messages)
|
||||
|
||||
await self._log(
|
||||
f"[🔍 Background Calculation] Token count: {current_tokens}",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
# For triggering summary generation, we need to be more precise if we are in the grey zone
|
||||
# Margin is 15% (skip tiktoken if estimated is < 85% of threshold)
|
||||
# Note: We still use tiktoken if we exceed threshold, because we want an accurate usage status report
|
||||
if estimated_tokens < compression_threshold_tokens * 0.85:
|
||||
current_tokens = estimated_tokens
|
||||
await self._log(
|
||||
f"[🔍 Background Calculation] Fast estimate ({current_tokens}) is well below threshold ({compression_threshold_tokens}). Skipping tiktoken.",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
else:
|
||||
# Calculate Token count precisely in a background thread
|
||||
current_tokens = await asyncio.to_thread(
|
||||
self._calculate_messages_tokens, messages
|
||||
)
|
||||
await self._log(
|
||||
f"[🔍 Background Calculation] Precise token count: {current_tokens}",
|
||||
event_call=__event_call__,
|
||||
)
|
||||
|
||||
# Send status notification (Context Usage format)
|
||||
if __event_emitter__ and self.valves.show_token_usage_status:
|
||||
if __event_emitter__:
|
||||
max_context_tokens = thresholds.get(
|
||||
"max_context_tokens", self.valves.max_context_tokens
|
||||
)
|
||||
status_msg = f"Context Usage (Estimated): {current_tokens} / {max_context_tokens} Tokens"
|
||||
if max_context_tokens > 0:
|
||||
usage_ratio = current_tokens / max_context_tokens
|
||||
status_msg += f" ({usage_ratio*100:.1f}%)"
|
||||
if usage_ratio > 0.9:
|
||||
status_msg += " | ⚠️ High Usage"
|
||||
# Only show status if threshold is met
|
||||
if self._should_show_status(usage_ratio):
|
||||
status_msg = self._get_translation(
|
||||
lang,
|
||||
"status_context_usage",
|
||||
tokens=current_tokens,
|
||||
max_tokens=max_context_tokens,
|
||||
ratio=f"{usage_ratio*100:.1f}",
|
||||
)
|
||||
if usage_ratio > 0.9:
|
||||
status_msg += self._get_translation(
|
||||
lang, "status_high_usage"
|
||||
)
|
||||
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
|
||||
# Check if compression is needed
|
||||
if current_tokens >= compression_threshold_tokens:
|
||||
@@ -1642,6 +2077,7 @@ class Filter:
|
||||
body,
|
||||
user_data,
|
||||
target_compressed_count,
|
||||
lang,
|
||||
__event_emitter__,
|
||||
__event_call__,
|
||||
)
|
||||
@@ -1672,6 +2108,7 @@ class Filter:
|
||||
body: dict,
|
||||
user_data: Optional[dict],
|
||||
target_compressed_count: Optional[int],
|
||||
lang: str = "en-US",
|
||||
__event_emitter__: Callable[[Any], Awaitable[None]] = None,
|
||||
__event_call__: Callable[[Any], Awaitable[None]] = None,
|
||||
):
|
||||
@@ -1811,7 +2248,9 @@ class Filter:
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": "Generating context summary in background...",
|
||||
"description": self._get_translation(
|
||||
lang, "status_generating_summary"
|
||||
),
|
||||
"done": False,
|
||||
},
|
||||
}
|
||||
@@ -1849,7 +2288,11 @@ class Filter:
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": f"Context summary updated (Compressed {len(middle_messages)} messages)",
|
||||
"description": self._get_translation(
|
||||
lang,
|
||||
"status_loaded_summary",
|
||||
count=len(middle_messages),
|
||||
),
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
@@ -1910,10 +2353,9 @@ class Filter:
|
||||
|
||||
# Summary
|
||||
summary_content = (
|
||||
f"【System Prompt: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n"
|
||||
f"{new_summary}\n\n"
|
||||
f"---\n"
|
||||
f"Below is the recent conversation:"
|
||||
self._get_translation(lang, "summary_prompt_prefix")
|
||||
+ f"{new_summary}"
|
||||
+ self._get_translation(lang, "summary_prompt_suffix")
|
||||
)
|
||||
summary_msg = {"role": "assistant", "content": summary_content}
|
||||
|
||||
@@ -1943,23 +2385,32 @@ class Filter:
|
||||
max_context_tokens = thresholds.get(
|
||||
"max_context_tokens", self.valves.max_context_tokens
|
||||
)
|
||||
# 6. Emit Status
|
||||
status_msg = f"Context Summary Updated: {token_count} / {max_context_tokens} Tokens"
|
||||
# 6. Emit Status (only if threshold is met)
|
||||
if max_context_tokens > 0:
|
||||
ratio = (token_count / max_context_tokens) * 100
|
||||
status_msg += f" ({ratio:.1f}%)"
|
||||
if ratio > 90.0:
|
||||
status_msg += " | ⚠️ High Usage"
|
||||
usage_ratio = token_count / max_context_tokens
|
||||
# Only show status if threshold is met
|
||||
if self._should_show_status(usage_ratio):
|
||||
status_msg = self._get_translation(
|
||||
lang,
|
||||
"status_context_summary_updated",
|
||||
tokens=token_count,
|
||||
max_tokens=max_context_tokens,
|
||||
ratio=f"{usage_ratio*100:.1f}",
|
||||
)
|
||||
if usage_ratio > 0.9:
|
||||
status_msg += self._get_translation(
|
||||
lang, "status_high_usage"
|
||||
)
|
||||
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": status_msg,
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
except Exception as e:
|
||||
await self._log(
|
||||
f"[Status] Error calculating tokens: {e}",
|
||||
@@ -1979,7 +2430,9 @@ class Filter:
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": f"Summary Error: {str(e)[:100]}...",
|
||||
"description": self._get_translation(
|
||||
lang, "status_summary_error", error=str(e)[:100]
|
||||
),
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
|
||||
File diff suppressed because it is too large
@@ -1,6 +1,6 @@
|
||||
# GitHub Copilot SDK Pipe for OpenWebUI
|
||||
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.6.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie) | **Version:** 0.7.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
|
||||
This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that integrates the official [GitHub Copilot SDK](https://github.com/github/copilot-sdk). It enables you to use **GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) **AND** your own models via **BYOK** (OpenAI, Anthropic) directly within OpenWebUI, providing a unified agentic experience with **strict User & Chat-level Workspace Isolation**.
|
||||
|
||||
@@ -14,12 +14,13 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
|
||||
|
||||
---
|
||||
|
||||
## ✨ v0.6.2 Updates (What's New)
|
||||
## ✨ v0.7.0 Updates (What's New)
|
||||
|
||||
- **🛠️ New Workspace Artifacts Tool**: Introduced `publish_file_from_workspace`. Agents can now generate files (e.g., Python-generated Excel/CSV) and provide direct download links for the user to click and save.
|
||||
- **⚙️ Workflow Optimization**: Improved reliability of the internal agentic workspace management.
|
||||
- **🛡️ Enhanced Security**: Refined access control for system resources within the isolated environment.
|
||||
- **🔧 Performance Tuning**: Optimized stream processing for larger context windows.
|
||||
- **🚀 Integrated CLI Management**: The Copilot CLI is now automatically managed and bundled via the `github-copilot-sdk` pip package. No more manual `curl | bash` installation or version mismatches. (v0.7.0)
|
||||
- **🧠 Native Tool Call UI**: Full adaptation to **OpenWebUI's native tool call UI** and thinking process visualization. (v0.7.0)
|
||||
- **🏠 OpenWebUI v0.8.0+ Fix**: Resolved "Error getting file content" download failure by switching to absolute path registration for published files. (v0.7.0)
|
||||
- **🌐 Comprehensive Multi-language Support**: Native localization for status messages in 11 languages (EN, ZH, JA, KO, FR, DE, ES, IT, RU, VI, ID). (v0.7.0)
|
||||
- **🧹 Architecture Cleanup**: Refactored core setup and optimized reasoning status display for a leaner experience. (v0.7.0)
|
||||
|
||||
---
|
||||
|
||||
@@ -31,8 +32,8 @@ This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/
|
||||
- **♾️ Infinite Session Management**: Smart context window management with automatic compaction for indefinite conversation capability.
|
||||
- **🧠 Deep Database Integration**: Real-time persistence of TODO lists for long-running workflows.
|
||||
- **🌊 Advanced Streaming**: Full support for thinking process/Chain of Thought visualization.
|
||||
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support.
|
||||
- **⚡ Full-Lifecycle File Agent**: Supports receiving uploaded files for raw bypass analysis and publishing results (Excel/reports) as downloadable links.
|
||||
- **🖼️ Intelligent Multimodal**: Vision capabilities and raw file analysis support (bypasses RAG for direct binary access).
|
||||
- **📤 Workspace Artifacts (`publish_file_from_workspace`)**: Agents can generate files (Excel, CSV, HTML reports, etc.) and provide **persistent download links** directly in the chat.
|
||||
- **🖼️ Interactive Artifacts**: Automatically renders HTML/JS apps generated by the agent directly in the chat interface.
|
||||
|
||||
---
|
||||
@@ -110,7 +111,7 @@ If this plugin has been useful, a **Star** on [OpenWebUI Extensions](https://git
|
||||
|
||||
- **Agent ignores files?**: Ensure the Files Filter is enabled, otherwise RAG will interfere with raw binaries.
|
||||
- **No progress bar?**: The bar only appears when the Agent uses the `update_todo` tool.
|
||||
- **Dependencies**: This Pipe automatically installs `github-copilot-sdk` (Python) and `github-copilot-cli` (Binary).
|
||||
- **Dependencies**: This Pipe automatically manages `github-copilot-sdk` (Python) and utilizes the bundled binary CLI. No manual install required.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# GitHub Copilot SDK Official Pipe
|
||||
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 0.6.2 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 0.7.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
|
||||
|
||||
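This is an advanced Pipe function for [OpenWebUI](https://github.com/open-webui/open-webui) that deeply integrates the **GitHub Copilot SDK**. It supports **official GitHub Copilot models** (e.g., `gpt-5.2-codex`, `claude-sonnet-4.5`, `gemini-3-pro`, `gpt-5-mini`) as well as a **BYOK (Bring Your Own Key)** mode for custom providers (OpenAI, Anthropic), and enforces **strict user- and chat-level workspace isolation**, providing a unified and secure agent experience.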
|
||||
|
||||
@@ -14,12 +14,13 @@
|
||||
|
||||
---
|
||||
|
||||
## ✨ What's New in 0.6.2
|
||||
## ✨ What's New in 0.7.0
|
||||
|
||||
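- **🛠️ New Workspace Artifacts Tool**: Introduces `publish_file_from_workspace`. The agent can now generate physical files (such as Excel/CSV reports produced with Python) and offer clickable download links directly in the chat.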
|
||||
- **⚙️ Workflow Optimization**: Improved the reliability and atomicity of the agent's internal physical workspace management.
|
||||
- **🛡️ Security Hardening**: Refined the access-control policy for system resources in isolated environments.
|
||||
- **🔧 Performance Tuning**: Optimized streaming data handling for large context windows.
|
||||
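- **🚀 Zero-Maintenance CLI Integration**: The Copilot CLI is now managed and kept in sync automatically through the `github-copilot-sdk` pip package, eliminating manual `curl | bash` installs and version mismatches. (v0.7.0)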
|
||||
- **🧠 Native Tool Call UI**: Fully adapted to **OpenWebUI's native tool call UI** and to displaying the model's thinking process (chain of thought). (v0.7.0)
|
||||
- **🏠 OpenWebUI v0.8.0+ Compatibility Fix**: Published files are now registered with absolute paths, which resolves the "Error getting file content" failure that prevented local downloads. (v0.7.0)
|
||||
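- **🌐 Comprehensive Multi-language Support**: Native localization of status messages in 11 languages (Chinese, English, Japanese, Korean, French, German, Spanish, Italian, Russian, Vietnamese, Indonesian). (v0.7.0)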
|
||||
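- **🧹 Architecture Cleanup**: Refactored the initialization logic and streamlined the reasoning status display for a lighter, more robust experience. (v0.7.0)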
|
||||
|
||||
---
|
||||
|
||||
@@ -31,8 +32,8 @@
|
||||
- **♾️ Infinite Session Management**: Smart context-window management with automatic compaction supports conversations of unlimited length.
|
||||
- **🧠 Deep Database Integration**: Persists the TODO list to the UI progress bar in real time.
|
||||
- **🌊 Deep Reasoning Display**: Full support for streaming the model's thinking process.
|
||||
- **🖼️ Intelligent Multimodal**: Full support for image recognition and analysis of uploaded attachments.
|
||||
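- **⚡ Full-Lifecycle File Agent**: Accepts uploaded files for in-depth analysis that bypasses RAG, and publishes the results (such as Excel files or reports) as download links.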
|
||||
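- **🖼️ Intelligent Multimodal**: Full support for image recognition and analysis of uploaded attachments (bypasses RAG for direct access to the raw binary content).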
|
||||
- **📤 Workspace Artifacts (`publish_file_from_workspace`)**: The agent can generate files (Excel, CSV, HTML reports, etc.) and provide **persistent download links** directly in the chat.
|
||||
- **🖼️ Interactive Artifacts**: Automatically renders HTML/JS apps generated by the agent so they can be used directly in the chat interface.
|
||||
|
||||
---
|
||||
@@ -95,7 +96,7 @@
|
||||
### 1) Import the Function
|
||||
|
||||
1. Open OpenWebUI and go to **Workspace** -> **Functions**.
|
||||
2. Click **+** (Create Function) and paste the full contents of `github_copilot_sdk_cn.py`.
|
||||
2. Click **+** (Create Function) and paste the full contents of `github_copilot_sdk.py`.
|
||||
3. Click Save and make sure the function is enabled.
|
||||
|
||||
### 2) Get Token
|
||||
@@ -114,7 +115,7 @@
|
||||
|
||||
- **Agent does not recognize files?**: Make sure the Files Filter plugin is installed and enabled; otherwise RAG will interfere with the raw files.
|
||||
- **No TODO progress bar?**: The progress bar only appears when the agent uses the `update_todo` tool (typically for complex tasks).
|
||||
- **Dependencies**: This Pipe automatically attempts to install `github-copilot-sdk` (Python package) and `github-copilot-cli` (official binary).
|
||||
- **Dependencies**: This Pipe automatically manages `github-copilot-sdk` (Python package) and prefers the bundled binary CLI directly; no manual intervention is needed.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,12 +1,12 @@
|
||||
"""
|
||||
title: GitHub Copilot Official SDK Pipe
|
||||
author: Fu-Jie
|
||||
author_url: https://github.com/Fu-Jie/awesome-openwebui
|
||||
author_url: https://github.com/Fu-Jie/openwebui-extensions
|
||||
funding_url: https://github.com/open-webui
|
||||
openwebui_id: ce96f7b4-12fc-4ac3-9a01-875713e69359
|
||||
description: Integrate GitHub Copilot SDK. Supports dynamic models, multi-turn conversation, streaming, multimodal input, infinite sessions, and frontend debug logging.
|
||||
version: 0.6.2
|
||||
requirements: github-copilot-sdk==0.1.23
|
||||
version: 0.7.0
|
||||
requirements: github-copilot-sdk==0.1.25
|
||||
"""
|
||||
|
||||
import os
|
||||
@@ -226,10 +226,7 @@ class Pipe:
|
||||
default=300,
|
||||
description="Timeout for each stream chunk (seconds)",
|
||||
)
|
||||
COPILOT_CLI_VERSION: str = Field(
|
||||
default="0.0.406",
|
||||
description="Specific Copilot CLI version to install/enforce (e.g. '0.0.406'). Leave empty for latest.",
|
||||
)
|
||||
|
||||
EXCLUDE_KEYWORDS: str = Field(
|
||||
default="",
|
||||
description="Exclude models containing these keywords (comma separated, e.g.: codex, haiku)",
|
||||
@@ -360,6 +357,116 @@ class Pipe:
|
||||
_env_setup_done = False # Track if env setup has been completed
|
||||
_last_update_check = 0 # Timestamp of last CLI update check
|
||||
|
||||
TRANSLATIONS = {
|
||||
"en-US": {
|
||||
"status_conn_est": "Connection established, waiting for response...",
|
||||
"status_reasoning_inj": "Reasoning Effort injected: {effort}",
|
||||
"debug_agent_working_in": "Agent working in: {path}",
|
||||
"debug_mcp_servers": "🔌 Connected MCP Servers: {servers}",
|
||||
"publish_success": "File published successfully.",
|
||||
"publish_hint_html": "Link: [View {filename}]({view_url}) | [Download]({download_url})",
|
||||
"publish_hint_default": "Link: [Download {filename}]({download_url})",
|
||||
},
|
||||
"zh-CN": {
|
||||
"status_conn_est": "已建立连接,等待响应...",
|
||||
"status_reasoning_inj": "已注入推理级别:{effort}",
|
||||
"debug_agent_working_in": "Agent 工作目录: {path}",
|
||||
"debug_mcp_servers": "🔌 已连接 MCP 服务器: {servers}",
|
||||
"publish_success": "文件发布成功。",
|
||||
"publish_hint_html": "链接: [查看 {filename}]({view_url}) | [下载]({download_url})",
|
||||
"publish_hint_default": "链接: [下载 {filename}]({download_url})",
|
||||
},
|
||||
"zh-HK": {
|
||||
"status_conn_est": "已建立連接,等待響應...",
|
||||
"status_reasoning_inj": "已注入推理級別:{effort}",
|
||||
"debug_agent_working_in": "Agent 工作目錄: {path}",
|
||||
"debug_mcp_servers": "🔌 已連接 MCP 伺服器: {servers}",
|
||||
"publish_success": "文件發布成功。",
|
||||
"publish_hint_html": "連結: [查看 {filename}]({view_url}) | [下載]({download_url})",
|
||||
"publish_hint_default": "連結: [下載 {filename}]({download_url})",
|
||||
},
|
||||
"zh-TW": {
|
||||
"status_conn_est": "已建立連接,等待響應...",
|
||||
"status_reasoning_inj": "已注入推理級別:{effort}",
|
||||
"debug_agent_working_in": "Agent 工作目錄: {path}",
|
||||
"debug_mcp_servers": "🔌 已連接 MCP 伺服器: {servers}",
|
||||
"publish_success": "文件發布成功。",
|
||||
"publish_hint_html": "連結: [查看 {filename}]({view_url}) | [下載]({download_url})",
|
||||
"publish_hint_default": "連結: [下載 {filename}]({download_url})",
|
||||
},
|
||||
"ja-JP": {
|
||||
"status_conn_est": "接続が確立されました。応答を待っています...",
|
||||
"status_reasoning_inj": "推論レベルが注入されました:{effort}",
|
||||
"debug_agent_working_in": "Agent 作業ディレクトリ: {path}",
|
||||
"debug_mcp_servers": "🔌 接続済み MCP サーバー: {servers}",
|
||||
},
|
||||
"ko-KR": {
|
||||
"status_conn_est": "연결이 설정되었습니다. 응답을 기다리는 중...",
|
||||
"status_reasoning_inj": "추론 수준 설정됨: {effort}",
|
||||
"debug_agent_working_in": "Agent 작업 디렉토리: {path}",
|
||||
"debug_mcp_servers": "🔌 연결된 MCP 서버: {servers}",
|
||||
},
|
||||
"fr-FR": {
|
||||
"status_conn_est": "Connexion établie, en attente de réponse...",
|
||||
"status_reasoning_inj": "Effort de raisonnement injecté : {effort}",
|
||||
"debug_agent_working_in": "Répertoire de travail de l'Agent : {path}",
|
||||
"debug_mcp_servers": "🔌 Serveurs MCP connectés : {servers}",
|
||||
},
|
||||
"de-DE": {
|
||||
"status_conn_est": "Verbindung hergestellt, warte auf Antwort...",
|
||||
"status_reasoning_inj": "Argumentationsaufwand injiziert: {effort}",
|
||||
"debug_agent_working_in": "Agent-Arbeitsverzeichnis: {path}",
|
||||
"debug_mcp_servers": "🔌 Verbundene MCP-Server: {servers}",
|
||||
},
|
||||
"es-ES": {
|
||||
"status_conn_est": "Conexión establecida, esperando respuesta...",
|
||||
"status_reasoning_inj": "Nivel de razonamiento inyectado: {effort}",
|
||||
"debug_agent_working_in": "Directorio de trabajo del Agente: {path}",
|
||||
"debug_mcp_servers": "🔌 Servidores MCP conectados: {servers}",
|
||||
},
|
||||
"it-IT": {
|
||||
"status_conn_est": "Connessione stabilita, in attesa di risposta...",
|
||||
"status_reasoning_inj": "Livello di ragionamento iniettato: {effort}",
|
||||
"debug_agent_working_in": "Directory di lavoro dell'Agente: {path}",
|
||||
"debug_mcp_servers": "🔌 Server MCP connessi: {servers}",
|
||||
},
|
||||
"ru-RU": {
|
||||
"status_conn_est": "Соединение установлено, ожидание ответа...",
|
||||
"status_reasoning_inj": "Уровень рассуждения внедрен: {effort}",
|
||||
"debug_agent_working_in": "Рабочий каталог Агента: {path}",
|
||||
"debug_mcp_servers": "🔌 Подключенные серверы MCP: {servers}",
|
||||
},
|
||||
"vi-VN": {
|
||||
"status_conn_est": "Đã thiết lập kết nối, đang chờ phản hồi...",
|
||||
"status_reasoning_inj": "Cấp độ suy luận đã được áp dụng: {effort}",
|
||||
"debug_agent_working_in": "Thư mục làm việc của Agent: {path}",
|
||||
"debug_mcp_servers": "🔌 Các máy chủ MCP đã kết nối: {servers}",
|
||||
},
|
||||
"id-ID": {
|
||||
"status_conn_est": "Koneksi terjalin, menunggu respons...",
|
||||
"status_reasoning_inj": "Tingkat penalaran diterapkan: {effort}",
|
||||
"debug_agent_working_in": "Direktori kerja Agent: {path}",
|
||||
"debug_mcp_servers": "🔌 Server MCP yang terhubung: {servers}",
|
||||
},
|
||||
}
|
||||
|
||||
FALLBACK_MAP = {
|
||||
"zh": "zh-CN",
|
||||
"zh-TW": "zh-TW",
|
||||
"zh-HK": "zh-HK",
|
||||
"en": "en-US",
|
||||
"en-GB": "en-US",
|
||||
"ja": "ja-JP",
|
||||
"ko": "ko-KR",
|
||||
"fr": "fr-FR",
|
||||
"de": "de-DE",
|
||||
"es": "es-ES",
|
||||
"it": "it-IT",
|
||||
"ru": "ru-RU",
|
||||
"vi": "vi-VN",
|
||||
"id": "id-ID",
|
||||
}
|
||||
|
||||
def __init__(self):
|
||||
self.type = "pipe"
|
||||
self.id = "github_copilot_sdk"
|
||||
@@ -390,6 +497,83 @@ class Pipe:
|
||||
except Exception as e:
|
||||
logger.error(f"[Database] ❌ Initialization failed: {str(e)}")
|
||||
|
||||
def _resolve_language(self, user_language: str) -> str:
|
||||
"""Normalize user language code to a supported translation key."""
|
||||
if not user_language:
|
||||
return "en-US"
|
||||
if user_language in self.TRANSLATIONS:
|
||||
return user_language
|
||||
lang_base = user_language.split("-")[0]
|
||||
if user_language in self.FALLBACK_MAP:
|
||||
return self.FALLBACK_MAP[user_language]
|
||||
if lang_base in self.FALLBACK_MAP:
|
||||
return self.FALLBACK_MAP[lang_base]
|
||||
return "en-US"
|
||||
|
||||
def _get_translation(self, lang: str, key: str, **kwargs) -> str:
|
||||
"""Helper function to get translated string for a key."""
|
||||
lang_key = self._resolve_language(lang)
|
||||
trans_map = self.TRANSLATIONS.get(lang_key, self.TRANSLATIONS["en-US"])
|
||||
text = trans_map.get(key, self.TRANSLATIONS["en-US"].get(key, key))
|
||||
if kwargs:
|
||||
try:
|
||||
text = text.format(**kwargs)
|
||||
except Exception as e:
|
||||
logger.warning(f"Translation formatting failed for {key}: {e}")
|
||||
return text
|
||||
|
||||
async def _get_user_context(self, __user__, __event_call__=None, __request__=None):
|
||||
"""Extract basic user context with safe fallbacks including JS localStorage."""
|
||||
if isinstance(__user__, (list, tuple)):
|
||||
user_data = __user__[0] if __user__ else {}
|
||||
elif isinstance(__user__, dict):
|
||||
user_data = __user__
|
||||
else:
|
||||
user_data = {}
|
||||
|
||||
user_id = user_data.get("id", "unknown_user")
|
||||
user_name = user_data.get("name", "User")
|
||||
user_language = user_data.get("language", "en-US")
|
||||
|
||||
if (
|
||||
__request__
|
||||
and hasattr(__request__, "headers")
|
||||
and "accept-language" in __request__.headers
|
||||
):
|
||||
raw_lang = __request__.headers.get("accept-language", "")
|
||||
if raw_lang:
|
||||
user_language = raw_lang.split(",")[0].split(";")[0]
|
||||
|
||||
if __event_call__:
|
||||
try:
|
||||
js_code = """
|
||||
try {
|
||||
return (
|
||||
document.documentElement.lang ||
|
||||
localStorage.getItem('locale') ||
|
||||
localStorage.getItem('language') ||
|
||||
navigator.language ||
|
||||
'en-US'
|
||||
);
|
||||
} catch (e) {
|
||||
return 'en-US';
|
||||
}
|
||||
"""
|
||||
frontend_lang = await asyncio.wait_for(
|
||||
__event_call__({"type": "execute", "data": {"code": js_code}}),
|
||||
timeout=2.0,
|
||||
)
|
||||
if frontend_lang and isinstance(frontend_lang, str):
|
||||
user_language = frontend_lang
|
||||
except Exception as e:
|
||||
pass
|
||||
|
||||
return {
|
||||
"user_id": user_id,
|
||||
"user_name": user_name,
|
||||
"user_language": user_language,
|
||||
}
|
||||
|
||||
@contextlib.contextmanager
|
||||
def _db_session(self):
|
||||
"""Yield a database session using Open WebUI helpers with graceful fallbacks."""
|
||||
@@ -611,6 +795,8 @@ class Pipe:
|
||||
user_data = {}
|
||||
|
||||
user_id = user_data.get("id") or user_data.get("user_id")
|
||||
user_lang = user_data.get("language") or "en-US"
|
||||
is_admin = user_data.get("role") == "admin"
|
||||
if not user_id:
|
||||
return None
|
||||
|
||||
@@ -746,10 +932,7 @@ class Pipe:
|
||||
dest_path = Path(UPLOAD_DIR) / f"{file_id}_{safe_filename}"
|
||||
await asyncio.to_thread(shutil.copy2, target_path, dest_path)
|
||||
|
||||
try:
|
||||
db_path = str(os.path.relpath(dest_path, DATA_DIR))
|
||||
except:
|
||||
db_path = str(dest_path)
|
||||
db_path = str(dest_path)
|
||||
|
||||
file_form = FileForm(
|
||||
id=file_id,
|
||||
@@ -769,12 +952,37 @@ class Pipe:
|
||||
|
||||
# 5. Result
|
||||
download_url = f"/api/v1/files/{file_id}/content"
|
||||
view_url = download_url
|
||||
is_html = safe_filename.lower().endswith(".html")
|
||||
|
||||
# For HTML files, if user is admin, provide a direct view link (/content/html)
|
||||
if is_html and is_admin:
|
||||
view_url = f"{download_url}/html"
|
||||
|
||||
# Localized output
|
||||
msg = self._get_translation(user_lang, "publish_success")
|
||||
if is_html and is_admin:
|
||||
hint = self._get_translation(
|
||||
user_lang,
|
||||
"publish_hint_html",
|
||||
filename=safe_filename,
|
||||
view_url=view_url,
|
||||
download_url=download_url,
|
||||
)
|
||||
else:
|
||||
hint = self._get_translation(
|
||||
user_lang,
|
||||
"publish_hint_default",
|
||||
filename=safe_filename,
|
||||
download_url=download_url,
|
||||
)
|
||||
|
||||
return {
|
||||
"file_id": file_id,
|
||||
"filename": safe_filename,
|
||||
"download_url": download_url,
|
||||
"message": "File published successfully.",
|
||||
"hint": f"Link: [Download {safe_filename}]({download_url})",
|
||||
"message": msg,
|
||||
"hint": hint,
|
||||
}
|
||||
except Exception as e:
|
||||
return {"error": str(e)}
|
||||
@@ -1921,10 +2129,6 @@ class Pipe:
|
||||
"on_post_tool_use": on_post_tool_use,
|
||||
}
|
||||
|
||||
def _get_user_context(self):
|
||||
"""Helper to get user context (placeholder for future use)."""
|
||||
return {}
|
||||
|
||||
def _get_chat_context(
|
||||
self,
|
||||
body: dict,
|
||||
@@ -2327,25 +2531,11 @@ class Pipe:
|
||||
token: str = None,
|
||||
enable_mcp: bool = True,
|
||||
enable_cache: bool = True,
|
||||
skip_cli_install: bool = False,
|
||||
skip_cli_install: bool = False, # Kept for call-site compatibility, no longer used
|
||||
__event_emitter__=None,
|
||||
user_lang: str = "en-US",
|
||||
):
|
||||
"""Setup environment variables and verify Copilot CLI. Dynamic Token Injection."""
|
||||
def emit_status_sync(description: str, done: bool = False):
|
||||
if not __event_emitter__:
|
||||
return
|
||||
try:
|
||||
loop = asyncio.get_running_loop()
|
||||
loop.create_task(
|
||||
__event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {"description": description, "done": done},
|
||||
}
|
||||
)
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
"""Setup environment variables and resolve Copilot CLI path from SDK bundle."""
|
||||
|
||||
# 1. Real-time Token Injection (Always updates on each call)
|
||||
effective_token = token or self.valves.GH_TOKEN
|
||||
@@ -2353,8 +2543,6 @@ class Pipe:
|
||||
os.environ["GH_TOKEN"] = os.environ["GITHUB_TOKEN"] = effective_token
|
||||
|
||||
if self._env_setup_done:
|
||||
# If done, we only sync MCP if called explicitly or in debug mode
|
||||
# To improve speed, we avoid redundant file I/O here for regular requests
|
||||
if debug_enabled:
|
||||
self._sync_mcp_config(
|
||||
__event_call__,
|
||||
@@ -2365,186 +2553,46 @@ class Pipe:
|
||||
return
|
||||
|
||||
os.environ["COPILOT_AUTO_UPDATE"] = "false"
|
||||
self._emit_debug_log_sync(
|
||||
"Disabled CLI auto-update (COPILOT_AUTO_UPDATE=false)",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
|
||||
# 2. CLI Path Discovery
|
||||
cli_path = "/usr/local/bin/copilot"
|
||||
if os.environ.get("COPILOT_CLI_PATH"):
|
||||
cli_path = os.environ["COPILOT_CLI_PATH"]
|
||||
|
||||
target_version = self.valves.COPILOT_CLI_VERSION.strip()
|
||||
found = False
|
||||
current_version = None
|
||||
|
||||
def get_cli_version(path):
|
||||
try:
|
||||
output = (
|
||||
subprocess.check_output(
|
||||
[path, "--version"], stderr=subprocess.STDOUT
|
||||
)
|
||||
.decode()
|
||||
.strip()
|
||||
)
|
||||
import re
|
||||
|
||||
match = re.search(r"(\d+\.\d+\.\d+)", output)
|
||||
return match.group(1) if match else output
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
# Check existing version
|
||||
if os.path.exists(cli_path):
|
||||
found = True
|
||||
current_version = get_cli_version(cli_path)
|
||||
# 2. CLI Path Discovery (priority: env var > PATH > SDK bundle)
|
||||
cli_path = os.environ.get("COPILOT_CLI_PATH", "")
|
||||
found = bool(cli_path and os.path.exists(cli_path))
|
||||
|
||||
if not found:
|
||||
sys_path = shutil.which("copilot")
|
||||
if sys_path:
|
||||
cli_path = sys_path
|
||||
found = True
|
||||
current_version = get_cli_version(cli_path)
|
||||
|
||||
if not found:
|
||||
pkg_path = os.path.join(os.path.dirname(__file__), "bin", "copilot")
|
||||
if os.path.exists(pkg_path):
|
||||
cli_path = pkg_path
|
||||
found = True
|
||||
current_version = get_cli_version(cli_path)
|
||||
|
||||
# 3. Installation/Update Logic
|
||||
should_install = not found
|
||||
install_reason = "CLI not found"
|
||||
if found and target_version:
|
||||
norm_target = target_version.lstrip("v")
|
||||
norm_current = current_version.lstrip("v") if current_version else ""
|
||||
|
||||
# Only install if target version is GREATER than current version
|
||||
try:
|
||||
from packaging.version import parse as parse_version
|
||||
from copilot.client import _get_bundled_cli_path
|
||||
|
||||
if parse_version(norm_target) > parse_version(norm_current):
|
||||
should_install = True
|
||||
install_reason = (
|
||||
f"Upgrade needed ({current_version} -> {target_version})"
|
||||
)
|
||||
elif parse_version(norm_target) < parse_version(norm_current):
|
||||
self._emit_debug_log_sync(
|
||||
f"Current version ({current_version}) is newer than specified ({target_version}). Skipping downgrade.",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
except Exception as e:
|
||||
# Fallback to string comparison if packaging is not available
|
||||
if norm_target != norm_current:
|
||||
should_install = True
|
||||
install_reason = (
|
||||
f"Version mismatch ({current_version} != {target_version})"
|
||||
)
|
||||
bundled_path = _get_bundled_cli_path()
|
||||
if bundled_path and os.path.exists(bundled_path):
|
||||
cli_path = bundled_path
|
||||
found = True
|
||||
except ImportError:
|
||||
pass
|
||||
|
||||
if should_install and not skip_cli_install:
|
||||
self._emit_debug_log_sync(
|
||||
f"Installing/Updating Copilot CLI: {install_reason}...",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
emit_status_sync(
|
||||
"🔧 正在安装/更新 Copilot CLI(首次可能需要 1-3 分钟)...",
|
||||
done=False,
|
||||
)
|
||||
try:
|
||||
env = os.environ.copy()
|
||||
if target_version:
|
||||
env["VERSION"] = target_version
|
||||
proc = subprocess.Popen(
|
||||
"curl -fsSL https://gh.io/copilot-install | bash",
|
||||
shell=True,
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.STDOUT,
|
||||
text=True,
|
||||
bufsize=1,
|
||||
env=env,
|
||||
)
|
||||
|
||||
progress_percent = -1
|
||||
line_count = 0
|
||||
while True:
|
||||
raw_line = proc.stdout.readline() if proc.stdout else ""
|
||||
if raw_line == "" and proc.poll() is not None:
|
||||
break
|
||||
|
||||
line = (raw_line or "").strip()
|
||||
if not line:
|
||||
continue
|
||||
|
||||
line_count += 1
|
||||
percent_match = re.search(r"(\d{1,3})%", line)
|
||||
if percent_match:
|
||||
try:
|
||||
pct = int(percent_match.group(1))
|
||||
if pct >= progress_percent + 5:
|
||||
progress_percent = pct
|
||||
emit_status_sync(
|
||||
f"📦 Copilot CLI 安装中:{pct}%", done=False
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
elif line_count % 20 == 0:
|
||||
emit_status_sync(
|
||||
f"📦 Copilot CLI 安装中:{line[:120]}", done=False
|
||||
)
|
||||
|
||||
return_code = proc.wait()
|
||||
if return_code != 0:
|
||||
raise subprocess.CalledProcessError(
|
||||
return_code,
|
||||
"curl -fsSL https://gh.io/copilot-install | bash",
|
||||
)
|
||||
|
||||
# Re-verify
|
||||
current_version = get_cli_version(cli_path)
|
||||
emit_status_sync(
|
||||
f"✅ Copilot CLI 安装完成(v{current_version or target_version or 'latest'})",
|
||||
done=False,
|
||||
)
|
||||
except Exception as e:
|
||||
self._emit_debug_log_sync(
|
||||
f"CLI installation failed: {e}",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
emit_status_sync(
|
||||
f"❌ Copilot CLI 安装失败:{str(e)[:120]}",
|
||||
done=True,
|
||||
)
|
||||
elif should_install and skip_cli_install:
|
||||
self._emit_debug_log_sync(
|
||||
f"Skipping CLI install during model listing: {install_reason}",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
|
||||
# 4. Finalize
|
||||
cli_ready = bool(cli_path and os.path.exists(cli_path))
|
||||
# 3. Finalize
|
||||
cli_ready = found
|
||||
if cli_ready:
|
||||
os.environ["COPILOT_CLI_PATH"] = cli_path
|
||||
# Add the CLI's parent directory to PATH so subprocesses can invoke `copilot` directly
|
||||
cli_bin_dir = os.path.dirname(cli_path)
|
||||
current_path = os.environ.get("PATH", "")
|
||||
if cli_bin_dir and cli_bin_dir not in current_path.split(os.pathsep):
|
||||
os.environ["PATH"] = cli_bin_dir + os.pathsep + current_path
|
||||
|
||||
self.__class__._env_setup_done = cli_ready
|
||||
self.__class__._last_update_check = datetime.now().timestamp()
|
||||
|
||||
self._emit_debug_log_sync(
|
||||
f"Environment setup complete. CLI ready={cli_ready}. Path: {cli_path} (v{current_version})",
|
||||
f"Environment setup complete. CLI ready={cli_ready}. Path: {cli_path}",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
if not skip_cli_install:
|
||||
if cli_ready:
|
||||
emit_status_sync("✅ Copilot CLI 已就绪", done=True)
|
||||
else:
|
||||
emit_status_sync("⚠️ Copilot CLI 尚未就绪,请稍后重试。", done=True)
|
||||
|
||||
def _process_attachments(
|
||||
self,
|
||||
@@ -2822,6 +2870,9 @@ class Pipe:
|
||||
effective_mcp = user_valves.ENABLE_MCP_SERVER
|
||||
effective_cache = user_valves.ENABLE_TOOL_CACHE
|
||||
|
||||
user_ctx = await self._get_user_context(__user__, __event_call__, __request__)
|
||||
user_lang = user_ctx["user_language"]
|
||||
|
||||
# 2. Setup environment with effective settings
|
||||
self._setup_env(
|
||||
__event_call__,
|
||||
@@ -2830,11 +2881,12 @@ class Pipe:
|
||||
enable_mcp=effective_mcp,
|
||||
enable_cache=effective_cache,
|
||||
__event_emitter__=__event_emitter__,
|
||||
user_lang=user_lang,
|
||||
)
|
||||
|
||||
cwd = self._get_workspace_dir(user_id=user_id, chat_id=chat_id)
|
||||
await self._emit_debug_log(
|
||||
f"Agent working in: {cwd} (Admin: {is_admin}, MCP: {effective_mcp})",
|
||||
f"{self._get_translation(user_lang, 'debug_agent_working_in', path=cwd)} (Admin: {is_admin}, MCP: {effective_mcp})",
|
||||
__event_call__,
|
||||
debug_enabled=effective_debug,
|
||||
)
|
||||
@@ -3269,9 +3321,9 @@ class Pipe:
|
||||
if body.get("stream", False):
|
||||
init_msg = ""
|
||||
if effective_debug:
|
||||
init_msg = f"> [Debug] Agent working in: {self._get_workspace_dir(user_id=user_id, chat_id=chat_id)}\n"
|
||||
init_msg = f"> [Debug] {self._get_translation(user_lang, 'debug_agent_working_in', path=self._get_workspace_dir(user_id=user_id, chat_id=chat_id))}\n"
|
||||
if mcp_server_names:
|
||||
init_msg += f"> [Debug] 🔌 Connected MCP Servers: {', '.join(mcp_server_names)}\n"
|
||||
init_msg += f"> [Debug] {self._get_translation(user_lang, 'debug_mcp_servers', servers=', '.join(mcp_server_names))}\n"
|
||||
|
||||
# Transfer client ownership to stream_response
|
||||
should_stop_client = False
|
||||
@@ -3284,9 +3336,14 @@ class Pipe:
|
||||
init_message=init_msg,
|
||||
__event_call__=__event_call__,
|
||||
__event_emitter__=__event_emitter__,
|
||||
reasoning_effort=effective_reasoning_effort,
|
||||
reasoning_effort=(
|
||||
effective_reasoning_effort
|
||||
if (is_reasoning and not is_byok_model)
|
||||
else "off"
|
||||
),
|
||||
show_thinking=show_thinking,
|
||||
debug_enabled=effective_debug,
|
||||
user_lang=user_lang,
|
||||
)
|
||||
else:
|
||||
try:
|
||||
@@ -3332,6 +3389,7 @@ class Pipe:
|
||||
reasoning_effort: str = "",
|
||||
show_thinking: bool = True,
|
||||
debug_enabled: bool = False,
|
||||
user_lang: str = "en-US",
|
||||
) -> AsyncGenerator:
|
||||
"""
|
||||
Stream response from Copilot SDK, handling various event types.
|
||||
@@ -3476,14 +3534,8 @@ class Pipe:
|
||||
queue.put_nowait("\n</think>\n")
|
||||
state["thinking_started"] = False
|
||||
|
||||
# Display tool call with improved formatting
|
||||
if tool_args:
|
||||
tool_args_json = json.dumps(tool_args, indent=2, ensure_ascii=False)
|
||||
tool_display = f"\n\n<details>\n<summary>🔧 Executing Tool: {tool_name}</summary>\n\n**Parameters:**\n\n```json\n{tool_args_json}\n```\n\n</details>\n\n"
|
||||
else:
|
||||
tool_display = f"\n\n<details>\n<summary>🔧 Executing Tool: {tool_name}</summary>\n\n*No parameters*\n\n</details>\n\n"
|
||||
|
||||
queue.put_nowait(tool_display)
|
||||
# Note: We do NOT emit a done="false" card here to avoid card duplication
|
||||
# (unless we have a way to update text which SSE content stream doesn't)
|
||||
|
||||
self._emit_debug_log_sync(
|
||||
f"Tool Start: {tool_name}",
|
||||
@@ -3600,31 +3652,55 @@ class Pipe:
|
||||
)
|
||||
# ------------------------
|
||||
|
||||
# Try to detect content type for better formatting
|
||||
is_json = False
|
||||
try:
|
||||
json_obj = (
|
||||
json.loads(result_content)
|
||||
if isinstance(result_content, str)
|
||||
else result_content
|
||||
# --- Build native OpenWebUI 0.8.3 tool_calls block ---
|
||||
# Serialize input args (from execution_start)
|
||||
tool_args_for_block = {}
|
||||
if tool_call_id and tool_call_id in active_tools:
|
||||
tool_args_for_block = active_tools[tool_call_id].get(
|
||||
"arguments", {}
|
||||
)
|
||||
if isinstance(json_obj, (dict, list)):
|
||||
result_content = json.dumps(
|
||||
json_obj, indent=2, ensure_ascii=False
|
||||
)
|
||||
is_json = True
|
||||
except:
|
||||
pass
|
||||
|
||||
# Format based on content type
|
||||
if is_json:
|
||||
# JSON content: use code block with syntax highlighting
|
||||
result_display = f"\n<details>\n<summary>{status_icon} Tool Result: {tool_name}</summary>\n\n```json\n{result_content}\n```\n\n</details>\n\n"
|
||||
else:
|
||||
# Plain text: use text code block to preserve formatting and add line breaks
|
||||
result_display = f"\n<details>\n<summary>{status_icon} Tool Result: {tool_name}</summary>\n\n```text\n{result_content}\n```\n\n</details>\n\n"
|
||||
try:
|
||||
args_json_str = json.dumps(
|
||||
tool_args_for_block, ensure_ascii=False
|
||||
)
|
||||
except Exception:
|
||||
args_json_str = "{}"
|
||||
|
||||
queue.put_nowait(result_display)
|
||||
def escape_html_attr(s: str) -> str:
    if not isinstance(s, str):
        return ""
    return (
        str(s)
        .replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("\n", "&#10;")
        .replace("\r", "&#13;")
    )
|
||||
|
||||
# MUST escape both arguments and result with &quot; and &#10; to satisfy OpenWebUI's strict regex /="([^"]*)"/
# OpenWebUI `marked` extension does not match multiline attributes properly without &#10;
|
||||
args_for_attr = (
|
||||
escape_html_attr(args_json_str) if args_json_str else "{}"
|
||||
)
|
||||
result_for_attr = escape_html_attr(result_content)
|
||||
|
||||
# Emit the unified native tool_calls block:
|
||||
# OpenWebUI 0.8.3 frontend regex explicitly expects: name="xxx" arguments="..." result="..." done="true"
|
||||
# CRITICAL: <details> tag MUST be followed immediately by \n for the frontend Markdown extension to parse it!
|
||||
tool_block = (
|
||||
f'\n<details type="tool_calls"'
|
||||
f' id="{tool_call_id}"'
|
||||
f' name="{tool_name}"'
|
||||
f' arguments="{args_for_attr}"'
|
||||
f' result="{result_for_attr}"'
|
||||
f' done="true">\n'
|
||||
f"<summary>Tool Executed</summary>\n"
|
||||
f"</details>\n\n"
|
||||
)
|
||||
queue.put_nowait(tool_block)
|
||||
|
||||
elif event_type == "tool.execution_progress":
|
||||
# Tool execution progress update (for long-running tools)
|
||||
@@ -3725,20 +3801,42 @@ class Pipe:
|
||||
|
||||
# Safe initial yield with error handling
|
||||
try:
|
||||
if debug_enabled and show_thinking:
|
||||
yield "<think>\n"
|
||||
if debug_enabled and __event_emitter__:
|
||||
# Emit debug info as UI status rather than reasoning block
|
||||
async def _emit_status(key: str, desc: str = None, **kwargs):
|
||||
try:
|
||||
final_desc = (
|
||||
desc
|
||||
if desc
|
||||
else self._get_translation(user_lang, key, **kwargs)
|
||||
)
|
||||
await __event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {"description": final_desc, "done": True},
|
||||
}
|
||||
)
|
||||
except:
|
||||
pass
|
||||
|
||||
if init_message:
|
||||
yield init_message
|
||||
for line in init_message.split("\n"):
|
||||
if line.strip():
|
||||
clean_msg = line.replace("> [Debug] ", "").strip()
|
||||
asyncio.create_task(_emit_status("custom", desc=clean_msg))
|
||||
|
||||
if reasoning_effort and reasoning_effort != "off":
|
||||
yield f"> [Debug] Reasoning Effort injected: {reasoning_effort.upper()}\n"
|
||||
asyncio.create_task(
|
||||
_emit_status(
|
||||
"status_reasoning_inj", effort=reasoning_effort.upper()
|
||||
)
|
||||
)
|
||||
|
||||
yield "> [Debug] Connection established, waiting for response...\n"
|
||||
state["thinking_started"] = True
|
||||
asyncio.create_task(_emit_status("status_conn_est"))
|
||||
except Exception as e:
|
||||
# If initial yield fails, log but continue processing
|
||||
self._emit_debug_log_sync(
|
||||
f"Initial yield warning: {e}",
|
||||
f"Initial status warning: {e}",
|
||||
__event_call__,
|
||||
debug_enabled=debug_enabled,
|
||||
)
|
||||
@@ -3766,12 +3864,21 @@ class Pipe:
|
||||
except asyncio.TimeoutError:
|
||||
if done.is_set():
|
||||
break
|
||||
if state["thinking_started"]:
|
||||
if __event_emitter__ and debug_enabled:
|
||||
try:
|
||||
yield f"> [Debug] Waiting for response ({self.valves.TIMEOUT}s exceeded)...\n"
|
||||
asyncio.create_task(
|
||||
__event_emitter__(
|
||||
{
|
||||
"type": "status",
|
||||
"data": {
|
||||
"description": f"Waiting for response ({self.valves.TIMEOUT}s exceeded)...",
|
||||
"done": True,
|
||||
},
|
||||
}
|
||||
)
|
||||
)
|
||||
except:
|
||||
# If yield fails during timeout, connection is gone
|
||||
break
|
||||
pass
|
||||
continue
|
||||
|
||||
while not queue.empty():
|
||||
|
||||
File diff suppressed because it is too large
82
plugins/pipes/github-copilot-sdk/v0.7.0.md
Normal file
@@ -0,0 +1,82 @@
# GitHub Copilot SDK Pipe v0.7.0

**GitHub Copilot SDK Pipe v0.7.0** is a major infrastructure and UX upgrade. This release eliminates manual CLI management, fully embraces OpenWebUI's native tool calling interface, and ensures seamless compatibility with the latest OpenWebUI versions.

---

## 📦 Quick Installation

- **GitHub Copilot SDK (Pipe)**: [Install v0.7.0](https://openwebui.com/posts/ce96f7b4-12fc-4ac3-9a01-875713e69359)
- **GitHub Copilot SDK (Filter)**: [Install v0.1.2](https://openwebui.com/posts/403a62ee-a596-45e7-be65-fab9cc249dd6)

---

## 🚀 What's New in v0.7.0

### 1. Zero-Maintenance CLI Integration

The most significant infrastructure change: you no longer need to worry about CLI versions or background downloads.

| Before (v0.6.x) | After (v0.7.0) |
| :--- | :--- |
| CLI installed via background `curl \| bash` | CLI bundled inside the `github-copilot-sdk` pip package |
| Version mismatches between SDK and CLI | Versions are always in sync automatically |
| Fails in restricted networks | Works everywhere `pip install` works |

**How it works**: When you install `github-copilot-sdk==0.1.25`, the matching `copilot-cli v0.0.411` is included. The plugin auto-discovers the path and injects it into the environment—zero configuration required.
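
The discovery order can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's exact code: `resolve_cli_path`, `export_cli_path`, and the `bundled_lookup` parameter are hypothetical names, and the SDK's bundled-path lookup is passed in as a callable so the sketch runs without the SDK installed.

```python
import os
import shutil
from typing import Callable, Optional


def resolve_cli_path(
    bundled_lookup: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[str]:
    """Return the first usable Copilot CLI path, or None if nothing is found.

    Priority: COPILOT_CLI_PATH env var > `copilot` on PATH > SDK-bundled binary.
    `bundled_lookup` is a hypothetical stand-in for the SDK's own lookup.
    """
    # 1. An explicit override wins, but only if the file really exists.
    env_path = os.environ.get("COPILOT_CLI_PATH", "")
    if env_path and os.path.exists(env_path):
        return env_path

    # 2. Fall back to whatever `copilot` executable is on PATH.
    sys_path = shutil.which("copilot")
    if sys_path:
        return sys_path

    # 3. Finally, ask the SDK bundle (if a lookup was supplied).
    if bundled_lookup is not None:
        bundled = bundled_lookup()
        if bundled and os.path.exists(bundled):
            return bundled
    return None


def export_cli_path(cli_path: str) -> None:
    """Expose the CLI to subprocesses via COPILOT_CLI_PATH and PATH."""
    os.environ["COPILOT_CLI_PATH"] = cli_path
    bin_dir = os.path.dirname(cli_path)
    parts = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if bin_dir and bin_dir not in parts:
        os.environ["PATH"] = os.pathsep.join([bin_dir] + parts)
```

Prepending the CLI's directory to `PATH` means tools spawned by the agent can call `copilot` by name, which is the reason the diff gives for the same step.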

### 2. Native OpenWebUI Tool Call UI

Tool calls from Copilot agents now render using **OpenWebUI's built-in tool call UI**.

- Tool execution status is displayed natively in the chat interface.
- Thinking processes (Chain of Thought) are visualized with the standard collapsible UI.
- Improved visual consistency and integration with the main OpenWebUI interface.
|
||||
|
||||
### 3. OpenWebUI v0.8.0+ Compatibility Fix (Bug Fix)
|
||||
|
||||
Resolved the **"Error getting file content"** failure that affected users on OpenWebUI v0.8.0 and later.
|
||||
|
||||
- **The Issue**: Relative path registration for published files was rejected by the latest OpenWebUI versions.
|
||||
- **The Fix**: Switched to **absolute path registration**, restoring the ability to download generated artifacts to your local machine.
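
The path normalization behind the fix might look like the sketch below. It only covers turning a workspace-relative path into an absolute one before registration; the registration call itself is not shown in these notes, and `absolute_artifact_path` is a hypothetical helper name. The containment check is an extra safeguard added for illustration, not something the release notes describe.

```python
from pathlib import Path


def absolute_artifact_path(workspace_root: str, file_path: str) -> Path:
    """Resolve `file_path` to an absolute path inside `workspace_root`."""
    root = Path(workspace_root).resolve()
    candidate = Path(file_path)
    if not candidate.is_absolute():
        # Relative paths are interpreted against the workspace root.
        candidate = root / candidate
    resolved = candidate.resolve()
    try:
        # Raises ValueError when the resolved path escapes the workspace.
        resolved.relative_to(root)
    except ValueError:
        raise ValueError(f"{file_path!r} is outside the workspace") from None
    return resolved
```

Resolving once, up front, means the registered path does not depend on the current working directory of the server process.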

### 4. Comprehensive Multi-language Support (i18n)

Native localization for status messages and UI hints in **11 languages**:
*English, Chinese (Simplified, Traditional, Hong Kong, Taiwan), Japanese, Korean, French, German, Spanish, Italian, Russian, Vietnamese, and Indonesian.*
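
Locale lookup with fallback could be sketched as below, following the `TRANSLATIONS` / `FALLBACK_MAP` pattern from the repository's plugin guidelines. The two-language table, the placeholder strings, and the helper names `resolve_language` and `translate` are assumptions for illustration; the plugin's real tables are not reproduced here.

```python
from typing import Optional

# Placeholder tables: two languages only, strings invented for this sketch.
TRANSLATIONS = {
    "en-US": {"status_starting": "Starting..."},
    "zh-CN": {"status_starting": "正在启动..."},
}
FALLBACK_MAP = {
    "zh": "zh-CN",
    "zh-TW": "zh-CN",
    "zh-HK": "zh-CN",
    "en": "en-US",
    "en-GB": "en-US",
}
DEFAULT_LANG = "en-US"


def resolve_language(user_lang: Optional[str]) -> str:
    """Pick the best available locale: exact match, mapped variant, base language, default."""
    if not user_lang:
        return DEFAULT_LANG
    if user_lang in TRANSLATIONS:
        return user_lang
    mapped = FALLBACK_MAP.get(user_lang)
    if mapped in TRANSLATIONS:
        return mapped
    base = user_lang.split("-")[0]
    mapped = FALLBACK_MAP.get(base, base)
    if mapped in TRANSLATIONS:
        return mapped
    return DEFAULT_LANG


def translate(user_lang: Optional[str], key: str) -> str:
    """Look up `key`, falling back to the default language and then to the key itself."""
    table = TRANSLATIONS[resolve_language(user_lang)]
    return table.get(key, TRANSLATIONS[DEFAULT_LANG].get(key, key))
```

Returning the key itself as the last resort keeps a missing translation visible in the UI instead of raising during a chat request.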

### 5. Reasoning Status & UX Optimizations

- **Intelligent Status Display**: The `Reasoning Effort injected` status is now shown only for native Copilot reasoning models.
- **Clean UI**: Removed redundant debug/status noise for BYOK and standard models.
- **Architecture Cleanup**: Refactored core setup and removed legacy installation code for a robust "one-click" experience.

---

## 🛠️ Key Capabilities

| Feature | Description |
| :--- | :--- |
| **Universal Tool Protocol** | Native support for **MCP**, **OpenAPI**, and **OpenWebUI built-in tools**. |
| **Native Tool Call UI** | Adapted to OpenWebUI's built-in tool call rendering. |
| **Workspace Isolation** | Strict sandboxing for per-session data privacy and security. |
| **Workspace Artifacts** | Agents generate files (Excel/CSV/HTML) with persistent download links via `publish_file_from_workspace`. |
| **Tool Execution** | Direct access to system binaries (Python, FFmpeg, Git, etc.). |
| **11-Language Localization** | Auto-detected, native status messages for global users. |
| **OpenWebUI v0.8.0+ Support** | Robust file handling for the latest OpenWebUI platform versions. |

---

## 📥 Import Chat Templates

- [📥 Star Prediction Chat log](https://fu-jie.github.io/awesome-openwebui/plugins/pipes/star-prediction-chat.json)
- [📥 Video Processing Chat log](https://fu-jie.github.io/awesome-openwebui/plugins/pipes/video-processing-chat.json)

*To import: Settings -> Data -> Import Chats.*

---

## 🔗 Resources

- **GitHub Repository**: [openwebui-extensions](https://github.com/Fu-Jie/openwebui-extensions)
- **Full Changelog**: [README.md](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/pipes/github-copilot-sdk/README.md)