Compare commits

6 Commits

Author SHA1 Message Date
fujie
25c9d20f3d feat(async-context-compression): release v1.2.1 with smart config & optimizations
This release introduces significant improvements to configuration flexibility, performance, and stability.

**Key Changes:**

*   **Smart Configuration:**
    *   Added `summary_model_max_context` to allow independent context limits for the summary model (e.g., using `gemini-flash` with 1M context to summarize `gpt-4` history).
    *   Implemented auto-detection of base model settings for custom models, ensuring correct threshold application.
*   **Performance & Refactoring:**
    *   Optimized `model_thresholds` parsing with caching to reduce overhead.
    *   Refactored `inlet` and `outlet` logic to remove redundant code and improve maintainability.
    *   Replaced all `print` statements with proper `logging` calls for better production monitoring.
*   **Bug Fixes & Modernization:**
    *   Fixed `datetime.utcnow()` deprecation warnings by switching to timezone-aware `datetime.now(timezone.utc)`.
    *   Corrected type annotations and improved error handling for `JSONResponse` objects from LLM backends.
    *   Removed hard truncation in summary generation to allow full context usage.

**Files Updated:**
*   Plugin source code (English & Chinese)
*   Documentation and READMEs
*   Version bumped to 1.2.1
2026-01-20 19:09:25 +08:00
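The `datetime` and logging changes listed in the v1.2.1 message above can be illustrated with a minimal sketch. This is not the plugin's code: the logger name and the `utc_timestamp` helper are hypothetical and exist only for illustration.

```python
import logging
from datetime import datetime, timezone

# Hypothetical logger name; the plugin's actual logger is not shown in this diff.
logger = logging.getLogger("async_context_compression_example")


def utc_timestamp() -> datetime:
    """Return the current time as a timezone-aware UTC datetime.

    datetime.utcnow() returns a naive datetime and is deprecated since
    Python 3.12; datetime.now(timezone.utc) is the timezone-aware replacement.
    """
    return datetime.now(timezone.utc)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    now = utc_timestamp()
    # Log through the logging module rather than print() so that
    # log levels and handlers apply in production.
    logger.info("summary saved at %s", now.isoformat())
```

An aware timestamp carries its UTC offset, so it can be compared with other aware datetimes without guessing the zone.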
github-actions[bot]
0d853577df chore: update community stats - followers increased (136 -> 137) 2026-01-20 09:15:24 +00:00
github-actions[bot]
f91f3d8692 chore: update community stats - followers increased (135 -> 136) 2026-01-20 07:14:01 +00:00
github-actions[bot]
0f7cad8dfa chore: update community stats - followers increased (134 -> 135) 2026-01-19 23:08:06 +00:00
fujie
db1a1e7ef0 fix(async-context-compression): sync CN version with EN version logic
- Add missing imports (contextlib, sessionmaker, Engine)
- Add database engine discovery functions (_discover_owui_engine, _discover_owui_schema)
- Fix ChatSummary table to support schema configuration
- Fix duplicate code in __init__ method
- Add _db_session context manager for robust session handling
- Fix inlet method signature (add __request__, __model__ parameters)
- Fix tool output trimming to check native function calling
- Add chat_id empty check in outlet method
2026-01-19 20:37:37 +08:00
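The "context manager for robust session handling" mentioned in the message above follows a common commit/rollback/close pattern. The sketch below shows that pattern only. It deliberately uses the standard-library `sqlite3` module as a stand-in for the SQLAlchemy `sessionmaker` the commit refers to, and `db_session` is a hypothetical name, not the plugin's `_db_session`.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_session(path: str):
    """Yield a connection; commit on success, roll back on error, always close.

    Stand-in sketch: sqlite3 replaces the SQLAlchemy session used by the
    plugin, but the try/except/finally shape is the same idea.
    """
    conn = sqlite3.connect(path)
    try:
        yield conn
        conn.commit()      # reached only if the with-body raised nothing
    except Exception:
        conn.rollback()    # undo partial writes from the failed block
        raise
    finally:
        conn.close()       # release the connection in every case
```

A caller wraps each unit of work in `with db_session(path) as conn:`, so a failure inside the block cannot leave a half-written transaction or an open connection behind.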
github-actions[bot]
e7de80a059 chore: update community stats - plugin version updated, followers increased (133 -> 134) 2026-01-19 12:15:44 +00:00
15 changed files with 763 additions and 335 deletions

View File

@@ -10,28 +10,28 @@ A collection of enhancements, plugins, and prompts for [OpenWebUI](https://githu
<!-- STATS_START --> <!-- STATS_START -->
## 📊 Community Stats ## 📊 Community Stats
> 🕐 Auto-updated: 2026-01-19 18:11 > 🕐 Auto-updated: 2026-01-20 17:15
| 👤 Author | 👥 Followers | ⭐ Points | 🏆 Contributions | | 👤 Author | 👥 Followers | ⭐ Points | 🏆 Contributions |
|:---:|:---:|:---:|:---:| |:---:|:---:|:---:|:---:|
| [Fu-Jie](https://openwebui.com/u/Fu-Jie) | **133** | **134** | **25** | | [Fu-Jie](https://openwebui.com/u/Fu-Jie) | **137** | **134** | **25** |
| 📝 Posts | ⬇️ Downloads | 👁️ Views | 👍 Upvotes | 💾 Saves | | 📝 Posts | ⬇️ Downloads | 👁️ Views | 👍 Upvotes | 💾 Saves |
|:---:|:---:|:---:|:---:|:---:| |:---:|:---:|:---:|:---:|:---:|
| **16** | **1792** | **21276** | **120** | **135** | | **16** | **1878** | **22027** | **120** | **147** |
### 🔥 Top 6 Popular Plugins ### 🔥 Top 6 Popular Plugins
> 🕐 Auto-updated: 2026-01-19 18:11 > 🕐 Auto-updated: 2026-01-20 17:15
| Rank | Plugin | Version | Downloads | Views | Updated | | Rank | Plugin | Version | Downloads | Views | Updated |
|:---:|------|:---:|:---:|:---:|:---:| |:---:|------|:---:|:---:|:---:|:---:|
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | 0.9.1 | 532 | 4822 | 2026-01-17 | | 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | 0.9.1 | 550 | 4933 | 2026-01-17 |
| 🥈 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | 1.4.9 | 260 | 2514 | 2026-01-18 | | 🥈 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | 1.4.9 | 281 | 2651 | 2026-01-18 |
| 🥉 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | 0.3.7 | 209 | 800 | 2026-01-07 | | 🥉 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | 0.3.7 | 213 | 835 | 2026-01-07 |
| 4⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | 1.1.3 | 180 | 1975 | 2026-01-17 | | 4⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | 1.2.0 | 189 | 2048 | 2026-01-19 |
| 5⃣ | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | 0.4.3 | 158 | 1377 | 2026-01-17 | | 5⃣ | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | 0.4.3 | 168 | 1449 | 2026-01-17 |
| 6⃣ | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | 0.2.4 | 138 | 2329 | 2026-01-17 | | 6⃣ | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | 0.2.4 | 143 | 2386 | 2026-01-17 |
*See full stats in [Community Stats Report](./docs/community-stats.md)* *See full stats in [Community Stats Report](./docs/community-stats.md)*
<!-- STATS_END --> <!-- STATS_END -->

View File

@@ -7,28 +7,28 @@ OpenWebUI 增强功能集合。包含个人开发与收集的插件、提示词
<!-- STATS_START --> <!-- STATS_START -->
## 📊 社区统计 ## 📊 社区统计
> 🕐 自动更新于 2026-01-19 18:11 > 🕐 自动更新于 2026-01-20 17:15
| 👤 作者 | 👥 粉丝 | ⭐ 积分 | 🏆 贡献 | | 👤 作者 | 👥 粉丝 | ⭐ 积分 | 🏆 贡献 |
|:---:|:---:|:---:|:---:| |:---:|:---:|:---:|:---:|
| [Fu-Jie](https://openwebui.com/u/Fu-Jie) | **133** | **134** | **25** | | [Fu-Jie](https://openwebui.com/u/Fu-Jie) | **137** | **134** | **25** |
| 📝 发布 | ⬇️ 下载 | 👁️ 浏览 | 👍 点赞 | 💾 收藏 | | 📝 发布 | ⬇️ 下载 | 👁️ 浏览 | 👍 点赞 | 💾 收藏 |
|:---:|:---:|:---:|:---:|:---:| |:---:|:---:|:---:|:---:|:---:|
| **16** | **1792** | **21276** | **120** | **135** | | **16** | **1878** | **22027** | **120** | **147** |
### 🔥 热门插件 Top 6 ### 🔥 热门插件 Top 6
> 🕐 自动更新于 2026-01-19 18:11 > 🕐 自动更新于 2026-01-20 17:15
| 排名 | 插件 | 版本 | 下载 | 浏览 | 更新日期 | | 排名 | 插件 | 版本 | 下载 | 浏览 | 更新日期 |
|:---:|------|:---:|:---:|:---:|:---:| |:---:|------|:---:|:---:|:---:|:---:|
| 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | 0.9.1 | 532 | 4822 | 2026-01-17 | | 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | 0.9.1 | 550 | 4933 | 2026-01-17 |
| 🥈 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | 1.4.9 | 260 | 2514 | 2026-01-18 | | 🥈 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | 1.4.9 | 281 | 2651 | 2026-01-18 |
| 🥉 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | 0.3.7 | 209 | 800 | 2026-01-07 | | 🥉 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | 0.3.7 | 213 | 835 | 2026-01-07 |
| 4⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | 1.1.3 | 180 | 1975 | 2026-01-17 | | 4⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | 1.2.0 | 189 | 2048 | 2026-01-19 |
| 5⃣ | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | 0.4.3 | 158 | 1377 | 2026-01-17 | | 5⃣ | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | 0.4.3 | 168 | 1449 | 2026-01-17 |
| 6⃣ | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | 0.2.4 | 138 | 2329 | 2026-01-17 | | 6⃣ | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | 0.2.4 | 143 | 2386 | 2026-01-17 |
*完整统计请查看 [社区统计报告](./docs/community-stats.zh.md)* *完整统计请查看 [社区统计报告](./docs/community-stats.zh.md)*
<!-- STATS_END --> <!-- STATS_END -->

View File

@@ -1,7 +1,7 @@
{ {
"schemaVersion": 1, "schemaVersion": 1,
"label": "downloads", "label": "downloads",
"message": "1.8k", "message": "1.9k",
"color": "blue", "color": "blue",
"namedLogo": "openwebui" "namedLogo": "openwebui"
} }

View File

@@ -1,6 +1,6 @@
{ {
"schemaVersion": 1, "schemaVersion": 1,
"label": "followers", "label": "followers",
"message": "133", "message": "137",
"color": "blue" "color": "blue"
} }

View File

@@ -1,13 +1,14 @@
{ {
"total_posts": 16, "total_posts": 16,
"total_downloads": 1792, "total_downloads": 1878,
"total_views": 21276, "total_views": 22027,
"total_upvotes": 120, "total_upvotes": 120,
"total_downvotes": 2, "total_downvotes": 2,
"total_saves": 135, "total_saves": 147,
"total_comments": 24, "total_comments": 24,
"by_type": { "by_type": {
"action": 14, "filter": 1,
"action": 13,
"unknown": 2 "unknown": 2
}, },
"posts": [ "posts": [
@@ -18,10 +19,10 @@
"version": "0.9.1", "version": "0.9.1",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "Intelligently analyzes text content and generates interactive mind maps to help users structure and visualize knowledge.", "description": "Intelligently analyzes text content and generates interactive mind maps to help users structure and visualize knowledge.",
"downloads": 532, "downloads": 550,
"views": 4822, "views": 4933,
"upvotes": 15, "upvotes": 15,
"saves": 28, "saves": 30,
"comments": 11, "comments": 11,
"created_at": "2025-12-30", "created_at": "2025-12-30",
"updated_at": "2026-01-17", "updated_at": "2026-01-17",
@@ -34,10 +35,10 @@
"version": "1.4.9", "version": "1.4.9",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "AI-powered infographic generator based on AntV Infographic. Supports professional templates, auto-icon matching, and SVG/PNG downloads.", "description": "AI-powered infographic generator based on AntV Infographic. Supports professional templates, auto-icon matching, and SVG/PNG downloads.",
"downloads": 260, "downloads": 281,
"views": 2514, "views": 2651,
"upvotes": 14, "upvotes": 14,
"saves": 20, "saves": 21,
"comments": 3, "comments": 3,
"created_at": "2025-12-28", "created_at": "2025-12-28",
"updated_at": "2026-01-18", "updated_at": "2026-01-18",
@@ -50,10 +51,10 @@
"version": "0.3.7", "version": "0.3.7",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "Extracts tables from chat messages and exports them to Excel (.xlsx) files with smart formatting.", "description": "Extracts tables from chat messages and exports them to Excel (.xlsx) files with smart formatting.",
"downloads": 209, "downloads": 213,
"views": 800, "views": 835,
"upvotes": 4, "upvotes": 4,
"saves": 5, "saves": 6,
"comments": 0, "comments": 0,
"created_at": "2025-05-30", "created_at": "2025-05-30",
"updated_at": "2026-01-07", "updated_at": "2026-01-07",
@@ -63,16 +64,16 @@
"title": "Async Context Compression", "title": "Async Context Compression",
"slug": "async_context_compression_b1655bc8", "slug": "async_context_compression_b1655bc8",
"type": "action", "type": "action",
"version": "1.1.3", "version": "1.2.0",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression.", "description": "Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression.",
"downloads": 180, "downloads": 189,
"views": 1975, "views": 2048,
"upvotes": 9, "upvotes": 9,
"saves": 19, "saves": 22,
"comments": 0, "comments": 0,
"created_at": "2025-11-08", "created_at": "2025-11-08",
"updated_at": "2026-01-17", "updated_at": "2026-01-19",
"url": "https://openwebui.com/posts/async_context_compression_b1655bc8" "url": "https://openwebui.com/posts/async_context_compression_b1655bc8"
}, },
{ {
@@ -82,10 +83,10 @@
"version": "0.4.3", "version": "0.4.3",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "Export current conversation from Markdown to Word (.docx) with Mermaid diagrams rendered client-side (Mermaid.js, SVG+PNG), LaTeX math, real hyperlinks, improved tables, syntax highlighting, and blockquote support.", "description": "Export current conversation from Markdown to Word (.docx) with Mermaid diagrams rendered client-side (Mermaid.js, SVG+PNG), LaTeX math, real hyperlinks, improved tables, syntax highlighting, and blockquote support.",
"downloads": 158, "downloads": 168,
"views": 1377, "views": 1449,
"upvotes": 8, "upvotes": 8,
"saves": 16, "saves": 17,
"comments": 0, "comments": 0,
"created_at": "2026-01-03", "created_at": "2026-01-03",
"updated_at": "2026-01-17", "updated_at": "2026-01-17",
@@ -98,10 +99,10 @@
"version": "0.2.4", "version": "0.2.4",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "Quickly generates beautiful flashcards from text, extracting key points and categories.", "description": "Quickly generates beautiful flashcards from text, extracting key points and categories.",
"downloads": 138, "downloads": 143,
"views": 2329, "views": 2386,
"upvotes": 10, "upvotes": 10,
"saves": 10, "saves": 12,
"comments": 2, "comments": 2,
"created_at": "2025-12-30", "created_at": "2025-12-30",
"updated_at": "2026-01-17", "updated_at": "2026-01-17",
@@ -111,16 +112,16 @@
"title": "Markdown Normalizer", "title": "Markdown Normalizer",
"slug": "markdown_normalizer_baaa8732", "slug": "markdown_normalizer_baaa8732",
"type": "action", "type": "action",
"version": "1.2.3", "version": "1.2.4",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "A content normalizer filter that fixes common Markdown formatting issues in LLM outputs, such as broken code blocks, LaTeX formulas, and list formatting.", "description": "A content normalizer filter that fixes common Markdown formatting issues in LLM outputs, such as broken code blocks, LaTeX formulas, and list formatting.",
"downloads": 84, "downloads": 95,
"views": 2100, "views": 2228,
"upvotes": 10, "upvotes": 10,
"saves": 17, "saves": 17,
"comments": 5, "comments": 5,
"created_at": "2026-01-12", "created_at": "2026-01-12",
"updated_at": "2026-01-17", "updated_at": "2026-01-19",
"url": "https://openwebui.com/posts/markdown_normalizer_baaa8732" "url": "https://openwebui.com/posts/markdown_normalizer_baaa8732"
}, },
{ {
@@ -130,10 +131,10 @@
"version": "1.0.0", "version": "1.0.0",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "A comprehensive thinking lens that dives deep into any content - from context to logic, insights, and action paths.", "description": "A comprehensive thinking lens that dives deep into any content - from context to logic, insights, and action paths.",
"downloads": 68, "downloads": 71,
"views": 663, "views": 703,
"upvotes": 4, "upvotes": 4,
"saves": 6, "saves": 7,
"comments": 0, "comments": 0,
"created_at": "2026-01-08", "created_at": "2026-01-08",
"updated_at": "2026-01-08", "updated_at": "2026-01-08",
@@ -146,8 +147,8 @@
"version": "0.4.3", "version": "0.4.3",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "将对话导出为 Word (.docx),支持 Mermaid 图表 (客户端渲染 SVG+PNG)、LaTeX 数学公式、真实超链接、增强表格格式、代码高亮和引用块。", "description": "将对话导出为 Word (.docx),支持 Mermaid 图表 (客户端渲染 SVG+PNG)、LaTeX 数学公式、真实超链接、增强表格格式、代码高亮和引用块。",
"downloads": 63, "downloads": 65,
"views": 1305, "views": 1329,
"upvotes": 11, "upvotes": 11,
"saves": 3, "saves": 3,
"comments": 1, "comments": 1,
@@ -162,8 +163,8 @@
"version": "1.4.9", "version": "1.4.9",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "基于 AntV Infographic 的智能信息图生成插件。支持多种专业模板,自动图标匹配,并提供 SVG/PNG 下载功能。", "description": "基于 AntV Infographic 的智能信息图生成插件。支持多种专业模板,自动图标匹配,并提供 SVG/PNG 下载功能。",
"downloads": 42, "downloads": 43,
"views": 683, "views": 702,
"upvotes": 6, "upvotes": 6,
"saves": 0, "saves": 0,
"comments": 0, "comments": 0,
@@ -178,8 +179,8 @@
"version": "0.9.1", "version": "0.9.1",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "智能分析文本内容,生成交互式思维导图,帮助用户结构化和可视化知识。", "description": "智能分析文本内容,生成交互式思维导图,帮助用户结构化和可视化知识。",
"downloads": 22, "downloads": 24,
"views": 398, "views": 406,
"upvotes": 3, "upvotes": 3,
"saves": 1, "saves": 1,
"comments": 0, "comments": 0,
@@ -195,7 +196,7 @@
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "快速将文本提炼为精美的学习记忆卡片,支持核心要点提取与分类。", "description": "快速将文本提炼为精美的学习记忆卡片,支持核心要点提取与分类。",
"downloads": 16, "downloads": 16,
"views": 443, "views": 452,
"upvotes": 5, "upvotes": 5,
"saves": 1, "saves": 1,
"comments": 0, "comments": 0,
@@ -206,17 +207,17 @@
{ {
"title": "异步上下文压缩", "title": "异步上下文压缩",
"slug": "异步上下文压缩_5c0617cb", "slug": "异步上下文压缩_5c0617cb",
"type": "action", "type": "filter",
"version": "1.1.3", "version": "1.2.0",
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "通过智能摘要和消息压缩,降低长对话的 token 消耗,同时保持对话连贯性。", "description": "通过智能摘要和消息压缩,降低长对话的 token 消耗,同时保持对话连贯性。",
"downloads": 14, "downloads": 14,
"views": 351, "views": 374,
"upvotes": 5, "upvotes": 5,
"saves": 1, "saves": 1,
"comments": 0, "comments": 0,
"created_at": "2025-11-08", "created_at": "2025-11-08",
"updated_at": "2026-01-17", "updated_at": "2026-01-19",
"url": "https://openwebui.com/posts/异步上下文压缩_5c0617cb" "url": "https://openwebui.com/posts/异步上下文压缩_5c0617cb"
}, },
{ {
@@ -227,7 +228,7 @@
"author": "Fu-Jie", "author": "Fu-Jie",
"description": "全方位的思维透镜 —— 从背景全景到逻辑脉络,从深度洞察到行动路径。", "description": "全方位的思维透镜 —— 从背景全景到逻辑脉络,从深度洞察到行动路径。",
"downloads": 6, "downloads": 6,
"views": 259, "views": 261,
"upvotes": 3, "upvotes": 3,
"saves": 1, "saves": 1,
"comments": 0, "comments": 0,
@@ -243,7 +244,7 @@
"author": "", "author": "",
"description": "", "description": "",
"downloads": 0, "downloads": 0,
"views": 59, "views": 62,
"upvotes": 1, "upvotes": 1,
"saves": 0, "saves": 0,
"comments": 0, "comments": 0,
@@ -259,9 +260,9 @@
"author": "", "author": "",
"description": "", "description": "",
"downloads": 0, "downloads": 0,
"views": 1198, "views": 1208,
"upvotes": 12, "upvotes": 12,
"saves": 7, "saves": 8,
"comments": 2, "comments": 2,
"created_at": "2026-01-10", "created_at": "2026-01-10",
"updated_at": "2026-01-10", "updated_at": "2026-01-10",
@@ -273,7 +274,7 @@
"name": "Fu-Jie", "name": "Fu-Jie",
"profile_url": "https://openwebui.com/u/Fu-Jie", "profile_url": "https://openwebui.com/u/Fu-Jie",
"profile_image": "https://community.s3.openwebui.com/uploads/users/b15d1348-4347-42b4-b815-e053342d6cb0/profile_d9510745-4bd4-4f8f-a997-4a21847d9300.webp", "profile_image": "https://community.s3.openwebui.com/uploads/users/b15d1348-4347-42b4-b815-e053342d6cb0/profile_d9510745-4bd4-4f8f-a997-4a21847d9300.webp",
"followers": 133, "followers": 137,
"following": 2, "following": 2,
"total_points": 134, "total_points": 134,
"post_points": 118, "post_points": 118,

View File

@@ -1,40 +1,41 @@
# 📊 OpenWebUI Community Stats Report # 📊 OpenWebUI Community Stats Report
> 📅 Updated: 2026-01-19 18:11 > 📅 Updated: 2026-01-20 17:15
## 📈 Overview ## 📈 Overview
| Metric | Value | | Metric | Value |
|------|------| |------|------|
| 📝 Total Posts | 16 | | 📝 Total Posts | 16 |
| ⬇️ Total Downloads | 1792 | | ⬇️ Total Downloads | 1878 |
| 👁️ Total Views | 21276 | | 👁️ Total Views | 22027 |
| 👍 Total Upvotes | 120 | | 👍 Total Upvotes | 120 |
| 💾 Total Saves | 135 | | 💾 Total Saves | 147 |
| 💬 Total Comments | 24 | | 💬 Total Comments | 24 |
## 📂 By Type ## 📂 By Type
- **action**: 14 - **filter**: 1
- **action**: 13
- **unknown**: 2 - **unknown**: 2
## 📋 Posts List ## 📋 Posts List
| Rank | Title | Type | Version | Downloads | Views | Upvotes | Saves | Updated | | Rank | Title | Type | Version | Downloads | Views | Upvotes | Saves | Updated |
|:---:|------|:---:|:---:|:---:|:---:|:---:|:---:|:---:| |:---:|------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 1 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | action | 0.9.1 | 532 | 4822 | 15 | 28 | 2026-01-17 | | 1 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | action | 0.9.1 | 550 | 4933 | 15 | 30 | 2026-01-17 |
| 2 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | action | 1.4.9 | 260 | 2514 | 14 | 20 | 2026-01-18 | | 2 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | action | 1.4.9 | 281 | 2651 | 14 | 21 | 2026-01-18 |
| 3 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | action | 0.3.7 | 209 | 800 | 4 | 5 | 2026-01-07 | | 3 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | action | 0.3.7 | 213 | 835 | 4 | 6 | 2026-01-07 |
| 4 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | action | 1.1.3 | 180 | 1975 | 9 | 19 | 2026-01-17 | | 4 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | action | 1.2.0 | 189 | 2048 | 9 | 22 | 2026-01-19 |
| 5 | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | action | 0.4.3 | 158 | 1377 | 8 | 16 | 2026-01-17 | | 5 | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | action | 0.4.3 | 168 | 1449 | 8 | 17 | 2026-01-17 |
| 6 | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | action | 0.2.4 | 138 | 2329 | 10 | 10 | 2026-01-17 | | 6 | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | action | 0.2.4 | 143 | 2386 | 10 | 12 | 2026-01-17 |
| 7 | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | action | 1.2.3 | 84 | 2100 | 10 | 17 | 2026-01-17 | | 7 | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | action | 1.2.4 | 95 | 2228 | 10 | 17 | 2026-01-19 |
| 8 | [Deep Dive](https://openwebui.com/posts/deep_dive_c0b846e4) | action | 1.0.0 | 68 | 663 | 4 | 6 | 2026-01-08 | | 8 | [Deep Dive](https://openwebui.com/posts/deep_dive_c0b846e4) | action | 1.0.0 | 71 | 703 | 4 | 7 | 2026-01-08 |
| 9 | [导出为 Word (增强版)](https://openwebui.com/posts/导出为_word_支持公式流程图表格和代码块_8a6306c0) | action | 0.4.3 | 63 | 1305 | 11 | 3 | 2026-01-17 | | 9 | [导出为 Word (增强版)](https://openwebui.com/posts/导出为_word_支持公式流程图表格和代码块_8a6306c0) | action | 0.4.3 | 65 | 1329 | 11 | 3 | 2026-01-17 |
| 10 | [📊 智能信息图 (AntV Infographic)](https://openwebui.com/posts/智能信息图_e04a48ff) | action | 1.4.9 | 42 | 683 | 6 | 0 | 2026-01-17 | | 10 | [📊 智能信息图 (AntV Infographic)](https://openwebui.com/posts/智能信息图_e04a48ff) | action | 1.4.9 | 43 | 702 | 6 | 0 | 2026-01-17 |
| 11 | [思维导图](https://openwebui.com/posts/智能生成交互式思维导图帮助用户可视化知识_8d4b097b) | action | 0.9.1 | 22 | 398 | 3 | 1 | 2026-01-17 | | 11 | [思维导图](https://openwebui.com/posts/智能生成交互式思维导图帮助用户可视化知识_8d4b097b) | action | 0.9.1 | 24 | 406 | 3 | 1 | 2026-01-17 |
| 12 | [闪记卡 (Flash Card)](https://openwebui.com/posts/闪记卡生成插件_4a31eac3) | action | 0.2.4 | 16 | 443 | 5 | 1 | 2026-01-17 | | 12 | [闪记卡 (Flash Card)](https://openwebui.com/posts/闪记卡生成插件_4a31eac3) | action | 0.2.4 | 16 | 452 | 5 | 1 | 2026-01-17 |
| 13 | [异步上下文压缩](https://openwebui.com/posts/异步上下文压缩_5c0617cb) | action | 1.1.3 | 14 | 351 | 5 | 1 | 2026-01-17 | | 13 | [异步上下文压缩](https://openwebui.com/posts/异步上下文压缩_5c0617cb) | filter | 1.2.0 | 14 | 374 | 5 | 1 | 2026-01-19 |
| 14 | [精读](https://openwebui.com/posts/精读_99830b0f) | action | 1.0.0 | 6 | 259 | 3 | 1 | 2026-01-08 | | 14 | [精读](https://openwebui.com/posts/精读_99830b0f) | action | 1.0.0 | 6 | 261 | 3 | 1 | 2026-01-08 |
| 15 | [Review of Claude Haiku 4.5](https://openwebui.com/posts/review_of_claude_haiku_45_41b0db39) | unknown | | 0 | 59 | 1 | 0 | 2026-01-14 | | 15 | [Review of Claude Haiku 4.5](https://openwebui.com/posts/review_of_claude_haiku_45_41b0db39) | unknown | | 0 | 62 | 1 | 0 | 2026-01-14 |
| 16 | [ 🛠️ Debug Open WebUI Plugins in Your Browser](https://openwebui.com/posts/debug_open_webui_plugins_in_your_browser_81bf7960) | unknown | | 0 | 1198 | 12 | 7 | 2026-01-10 | | 16 | [ 🛠️ Debug Open WebUI Plugins in Your Browser](https://openwebui.com/posts/debug_open_webui_plugins_in_your_browser_81bf7960) | unknown | | 0 | 1208 | 12 | 8 | 2026-01-10 |

View File

@@ -1,40 +1,41 @@
# 📊 OpenWebUI 社区统计报告 # 📊 OpenWebUI 社区统计报告
> 📅 更新时间: 2026-01-19 18:11 > 📅 更新时间: 2026-01-20 17:15
## 📈 总览 ## 📈 总览
| 指标 | 数值 | | 指标 | 数值 |
|------|------| |------|------|
| 📝 发布数量 | 16 | | 📝 发布数量 | 16 |
| ⬇️ 总下载量 | 1792 | | ⬇️ 总下载量 | 1878 |
| 👁️ 总浏览量 | 21276 | | 👁️ 总浏览量 | 22027 |
| 👍 总点赞数 | 120 | | 👍 总点赞数 | 120 |
| 💾 总收藏数 | 135 | | 💾 总收藏数 | 147 |
| 💬 总评论数 | 24 | | 💬 总评论数 | 24 |
## 📂 按类型分类 ## 📂 按类型分类
- **action**: 14 - **filter**: 1
- **action**: 13
- **unknown**: 2 - **unknown**: 2
## 📋 发布列表 ## 📋 发布列表
| 排名 | 标题 | 类型 | 版本 | 下载 | 浏览 | 点赞 | 收藏 | 更新日期 | | 排名 | 标题 | 类型 | 版本 | 下载 | 浏览 | 点赞 | 收藏 | 更新日期 |
|:---:|------|:---:|:---:|:---:|:---:|:---:|:---:|:---:| |:---:|------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 1 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | action | 0.9.1 | 532 | 4822 | 15 | 28 | 2026-01-17 | | 1 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | action | 0.9.1 | 550 | 4933 | 15 | 30 | 2026-01-17 |
| 2 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | action | 1.4.9 | 260 | 2514 | 14 | 20 | 2026-01-18 | | 2 | [📊 Smart Infographic (AntV)](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | action | 1.4.9 | 281 | 2651 | 14 | 21 | 2026-01-18 |
| 3 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | action | 0.3.7 | 209 | 800 | 4 | 5 | 2026-01-07 | | 3 | [Export to Excel](https://openwebui.com/posts/export_mulit_table_to_excel_244b8f9d) | action | 0.3.7 | 213 | 835 | 4 | 6 | 2026-01-07 |
| 4 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | action | 1.1.3 | 180 | 1975 | 9 | 19 | 2026-01-17 | | 4 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | action | 1.2.0 | 189 | 2048 | 9 | 22 | 2026-01-19 |
| 5 | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | action | 0.4.3 | 158 | 1377 | 8 | 16 | 2026-01-17 | | 5 | [Export to Word (Enhanced)](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | action | 0.4.3 | 168 | 1449 | 8 | 17 | 2026-01-17 |
| 6 | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | action | 0.2.4 | 138 | 2329 | 10 | 10 | 2026-01-17 | | 6 | [Flash Card](https://openwebui.com/posts/flash_card_65a2ea8f) | action | 0.2.4 | 143 | 2386 | 10 | 12 | 2026-01-17 |
| 7 | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | action | 1.2.3 | 84 | 2100 | 10 | 17 | 2026-01-17 | | 7 | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | action | 1.2.4 | 95 | 2228 | 10 | 17 | 2026-01-19 |
| 8 | [Deep Dive](https://openwebui.com/posts/deep_dive_c0b846e4) | action | 1.0.0 | 68 | 663 | 4 | 6 | 2026-01-08 | | 8 | [Deep Dive](https://openwebui.com/posts/deep_dive_c0b846e4) | action | 1.0.0 | 71 | 703 | 4 | 7 | 2026-01-08 |
| 9 | [导出为 Word (增强版)](https://openwebui.com/posts/导出为_word_支持公式流程图表格和代码块_8a6306c0) | action | 0.4.3 | 63 | 1305 | 11 | 3 | 2026-01-17 | | 9 | [导出为 Word (增强版)](https://openwebui.com/posts/导出为_word_支持公式流程图表格和代码块_8a6306c0) | action | 0.4.3 | 65 | 1329 | 11 | 3 | 2026-01-17 |
| 10 | [📊 智能信息图 (AntV Infographic)](https://openwebui.com/posts/智能信息图_e04a48ff) | action | 1.4.9 | 42 | 683 | 6 | 0 | 2026-01-17 | | 10 | [📊 智能信息图 (AntV Infographic)](https://openwebui.com/posts/智能信息图_e04a48ff) | action | 1.4.9 | 43 | 702 | 6 | 0 | 2026-01-17 |
| 11 | [思维导图](https://openwebui.com/posts/智能生成交互式思维导图帮助用户可视化知识_8d4b097b) | action | 0.9.1 | 22 | 398 | 3 | 1 | 2026-01-17 | | 11 | [思维导图](https://openwebui.com/posts/智能生成交互式思维导图帮助用户可视化知识_8d4b097b) | action | 0.9.1 | 24 | 406 | 3 | 1 | 2026-01-17 |
| 12 | [闪记卡 (Flash Card)](https://openwebui.com/posts/闪记卡生成插件_4a31eac3) | action | 0.2.4 | 16 | 443 | 5 | 1 | 2026-01-17 | | 12 | [闪记卡 (Flash Card)](https://openwebui.com/posts/闪记卡生成插件_4a31eac3) | action | 0.2.4 | 16 | 452 | 5 | 1 | 2026-01-17 |
| 13 | [异步上下文压缩](https://openwebui.com/posts/异步上下文压缩_5c0617cb) | action | 1.1.3 | 14 | 351 | 5 | 1 | 2026-01-17 | | 13 | [异步上下文压缩](https://openwebui.com/posts/异步上下文压缩_5c0617cb) | filter | 1.2.0 | 14 | 374 | 5 | 1 | 2026-01-19 |
| 14 | [精读](https://openwebui.com/posts/精读_99830b0f) | action | 1.0.0 | 6 | 259 | 3 | 1 | 2026-01-08 | | 14 | [精读](https://openwebui.com/posts/精读_99830b0f) | action | 1.0.0 | 6 | 261 | 3 | 1 | 2026-01-08 |
| 15 | [Review of Claude Haiku 4.5](https://openwebui.com/posts/review_of_claude_haiku_45_41b0db39) | unknown | | 0 | 59 | 1 | 0 | 2026-01-14 | | 15 | [Review of Claude Haiku 4.5](https://openwebui.com/posts/review_of_claude_haiku_45_41b0db39) | unknown | | 0 | 62 | 1 | 0 | 2026-01-14 |
| 16 | [ 🛠️ Debug Open WebUI Plugins in Your Browser](https://openwebui.com/posts/debug_open_webui_plugins_in_your_browser_81bf7960) | unknown | | 0 | 1198 | 12 | 7 | 2026-01-10 | | 16 | [ 🛠️ Debug Open WebUI Plugins in Your Browser](https://openwebui.com/posts/debug_open_webui_plugins_in_your_browser_81bf7960) | unknown | | 0 | 1208 | 12 | 8 | 2026-01-10 |

View File

@@ -1,7 +1,7 @@
# Async Context Compression # Async Context Compression
<span class="category-badge filter">Filter</span> <span class="category-badge filter">Filter</span>
<span class="version-badge">v1.2.0</span> <span class="version-badge">v1.2.1</span>
Reduces token consumption in long conversations through intelligent summarization while maintaining conversational coherence. Reduces token consumption in long conversations through intelligent summarization while maintaining conversational coherence.
@@ -38,6 +38,8 @@ This is especially useful for:
- :material-format-align-justify: **Structure-Aware Trimming**: Preserves document structure - :material-format-align-justify: **Structure-Aware Trimming**: Preserves document structure
- :material-content-cut: **Native Tool Output Trimming**: Trims verbose tool outputs (Note: Non-native tool outputs are not fully injected into context) - :material-content-cut: **Native Tool Output Trimming**: Trims verbose tool outputs (Note: Non-native tool outputs are not fully injected into context)
- :material-chart-bar: **Detailed Token Logging**: Granular token breakdown - :material-chart-bar: **Detailed Token Logging**: Granular token breakdown
- :material-account-search: **Smart Model Matching**: Inherit config from base models
- :material-image-off: **Multimodal Support**: Images are preserved but tokens are **NOT** calculated
--- ---
@@ -73,6 +75,7 @@ graph TD
| `keep_first` | integer | `1` | Always keep the first N messages | | `keep_first` | integer | `1` | Always keep the first N messages |
| `keep_last` | integer | `6` | Always keep the last N messages | | `keep_last` | integer | `6` | Always keep the last N messages |
| `summary_model` | string | `None` | Model to use for summarization | | `summary_model` | string | `None` | Model to use for summarization |
| `summary_model_max_context` | integer | `0` | Max context tokens for summary model |
| `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary | | `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary |
| `enable_tool_output_trimming` | boolean | `false` | Enable trimming of large tool outputs | | `enable_tool_output_trimming` | boolean | `false` | Enable trimming of large tool outputs |
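
One way to read the new `summary_model_max_context` row is sketched below. The diff does not say what the default `0` means, so this assumes that `0` means "no separate limit, fall back to the main model's limit". The function name and the `main_model_max_context` parameter are hypothetical.

```python
def resolve_summary_context_limit(
    summary_model_max_context: int,
    main_model_max_context: int,
) -> int:
    """Pick the context limit to apply when building the summary request.

    Assumption (not confirmed by the diff): a value of 0 or less means the
    summary model has no independent limit and the main limit applies.
    """
    if summary_model_max_context > 0:
        return summary_model_max_context
    return main_model_max_context


if __name__ == "__main__":
    # A long-context summary model summarizing a shorter-context chat model.
    print(resolve_summary_context_limit(1_000_000, 128_000))
    # Default value: fall back to the main model's limit.
    print(resolve_summary_context_limit(0, 128_000))
```

Under this reading, setting the option lets a long-context summary model see more history than the chat model itself could hold.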

View File

@@ -1,7 +1,7 @@
# Async Context Compression异步上下文压缩 # Async Context Compression异步上下文压缩
<span class="category-badge filter">Filter</span> <span class="category-badge filter">Filter</span>
<span class="version-badge">v1.2.0</span> <span class="version-badge">v1.2.1</span>
通过智能摘要减少长对话的 token 消耗,同时保持对话连贯。 通过智能摘要减少长对话的 token 消耗,同时保持对话连贯。
@@ -38,6 +38,8 @@ Async Context Compression 过滤器通过以下方式帮助管理长对话的 to
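- :material-format-align-justify: **Structure-Aware Trimming**: Smart trimming that preserves document structure - :material-format-align-justify: **Structure-Aware Trimming**: Smart trimming that preserves document structure
- :material-content-cut: **Native Tool Output Trimming**: Automatically trims verbose tool outputs (Note: Non-native tool call outputs are not fully injected into context) - :material-content-cut: **Native Tool Output Trimming**: Automatically trims verbose tool outputs (Note: Non-native tool call outputs are not fully injected into context)
- :material-chart-bar: **Detailed Token Logging**: Fine-grained token statistics - :material-chart-bar: **Detailed Token Logging**: Fine-grained token statistics
- :material-account-search: **Smart Model Matching**: Custom models automatically inherit base model configuration
- :material-image-off: **Multimodal Support**: Image content is preserved but its tokens are **NOT** counted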
--- ---
@@ -73,6 +75,7 @@ graph TD
| `keep_first` | integer | `1` | Always keep the first N messages | | `keep_first` | integer | `1` | Always keep the first N messages |
| `keep_last` | integer | `6` | Always keep the last N messages | | `keep_last` | integer | `6` | Always keep the last N messages |
| `summary_model` | string | `None` | Model to use for summarization | | `summary_model` | string | `None` | Model to use for summarization |
| `summary_model_max_context` | integer | `0` | Max context tokens for the summary model |
| `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary | | `max_summary_tokens` | integer | `16384` | Maximum tokens for the summary |
| `enable_tool_output_trimming` | boolean | `false` | Enable trimming of long tool outputs | | `enable_tool_output_trimming` | boolean | `false` | Enable trimming of long tool outputs |


@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:
Reduces token consumption in long conversations through intelligent summarization while maintaining coherence. Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.
**Version:** 1.1.3 **Version:** 1.2.1
[:octicons-arrow-right-24: Documentation](async-context-compression.md) [:octicons-arrow-right-24: Documentation](async-context-compression.md)


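@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:
Reduces token consumption in long conversations through intelligent summarization while maintaining coherence. Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.
**Version:** 1.1.3 **Version:** 1.2.1
[:octicons-arrow-right-24: Documentation](async-context-compression.md) [:octicons-arrow-right-24: Documentation](async-context-compression.md)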


@@ -1,9 +1,15 @@
# Async Context Compression Filter # Async Context Compression Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **Version:** 1.2.0 | **Project:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **License:** MIT **Author:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **Version:** 1.2.1 | **Project:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **License:** MIT
This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent. This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.
## What's new in 1.2.1
- **Smart Configuration**: Automatically detects base model settings for custom models and adds `summary_model_max_context` for independent summary limits.
- **Performance & Refactoring**: Optimized threshold parsing with caching, removed redundant code, and improved LLM response handling (JSONResponse support).
- **Bug Fixes & Modernization**: Fixed `datetime` deprecation warnings, corrected type annotations, and replaced print statements with proper logging.
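
The `datetime` fix in the last bullet follows the standard-library replacement for the deprecated `datetime.utcnow()`. A small sketch of the pattern; `utc_now` is an illustrative helper name, not part of the plugin:

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware current time in UTC (replaces datetime.utcnow())."""
    return datetime.now(timezone.utc)


stamp = utc_now()
print(stamp.tzinfo)       # the aware value carries its UTC offset
print(stamp.isoformat())  # ISO 8601 string ending in +00:00
```

Unlike `utcnow()`, the returned value is timezone-aware, so it can be compared safely with other aware timestamps.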
## What's new in 1.2.0 ## What's new in 1.2.0
- **Preflight Context Check**: Before sending to the model, validates that total tokens fit within the context window. Automatically trims or drops oldest messages if exceeded. - **Preflight Context Check**: Before sending to the model, validates that total tokens fit within the context window. Automatically trims or drops oldest messages if exceeded.
@@ -19,18 +25,6 @@ This filter reduces token consumption in long conversations through intelligent
- **Enhanced Stability**: Fixed a race condition in state management that could cause "inlet state not found" warnings in high-concurrency scenarios. - **Enhanced Stability**: Fixed a race condition in state management that could cause "inlet state not found" warnings in high-concurrency scenarios.
- **Bug Fixes**: Corrected default model handling to prevent misleading logs when no model is specified. - **Bug Fixes**: Corrected default model handling to prevent misleading logs when no model is specified.
## What's new in 1.1.2
- **Open WebUI v0.7.x Compatibility**: Resolved a critical database session binding error affecting Open WebUI v0.7.x users. The plugin now dynamically discovers the database engine and session context, ensuring compatibility across versions.
- **Enhanced Error Reporting**: Errors during background summary generation are now reported via both the status bar and browser console.
- **Robust Model Handling**: Improved handling of missing or invalid model IDs to prevent crashes.
## What's new in 1.1.1
- **Frontend Debugging**: Added `show_debug_log` option to print debug info to the browser console (F12).
- **Optimized Compression**: Improved token calculation logic to prevent aggressive truncation of history, ensuring more context is retained.
--- ---
@@ -45,6 +39,8 @@ This filter reduces token consumption in long conversations through intelligent
- ✅ Native tool output trimming for cleaner context when using function calling. - ✅ Native tool output trimming for cleaner context when using function calling.
- ✅ Real-time context usage monitoring with warning notifications (>90%). - ✅ Real-time context usage monitoring with warning notifications (>90%).
- ✅ Detailed token logging for precise debugging and optimization. - ✅ Detailed token logging for precise debugging and optimization.
- ✅ **Smart Model Matching**: Automatically inherits configuration from base models for custom presets.
- ✅ **Multimodal Support**: Images are preserved but their tokens are **NOT** calculated. Please adjust thresholds accordingly.
--- ---
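
One way to picture Smart Model Matching is a three-step lookup: an exact match on the model ID, then the custom model's base model, then the global defaults. The sketch below assumes the comma-separated `model_id:compression_threshold:max_context` format that the plugin source describes for `model_thresholds`; `parse_thresholds` and `pick_thresholds` are hypothetical names, and the base model ID is passed in directly rather than looked up through Open WebUI:

```python
from typing import Dict, Optional

Thresholds = Dict[str, int]


def parse_thresholds(raw: str) -> Dict[str, Thresholds]:
    """Parse 'model_id:compression_threshold:max_context' entries, skipping bad ones."""
    parsed: Dict[str, Thresholds] = {}
    for entry in raw.split(","):
        parts = [p.strip() for p in entry.split(":")]
        if len(parts) != 3:
            continue  # empty or malformed entry
        try:
            parsed[parts[0]] = {
                "compression_threshold_tokens": int(parts[1]),
                "max_context_tokens": int(parts[2]),
            }
        except ValueError:
            continue  # non-numeric threshold
    return parsed


def pick_thresholds(
    model_id: str,
    base_model_id: Optional[str],
    per_model: Dict[str, Thresholds],
    global_defaults: Thresholds,
) -> Thresholds:
    """Exact match first, then the base model, then the global defaults."""
    if model_id in per_model:
        return per_model[model_id]
    if base_model_id and base_model_id in per_model:
        return per_model[base_model_id]
    return global_defaults


per_model = parse_thresholds("gpt-4:8000:32000, claude-3:100000:200000")
# Illustrative global values, not the plugin's defaults.
defaults = {"compression_threshold_tokens": 64000, "max_context_tokens": 128000}
print(pick_thresholds("my-preset", "gpt-4", per_model, defaults))
```

Here the hypothetical custom preset `my-preset` has no entry of its own, so it picks up the `gpt-4` entry through its base model.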
@@ -75,7 +71,8 @@ It is recommended to keep this filter early in the chain so it runs before filte
| `keep_first` | `1` | Always keep the first N messages (protects system prompts). | | `keep_first` | `1` | Always keep the first N messages (protects system prompts). |
| `keep_last` | `6` | Always keep the last N messages to preserve recent context. | | `keep_last` | `6` | Always keep the last N messages to preserve recent context. |
| `summary_model` | `None` | Model for summaries. Strongly recommended to set a fast, economical model (e.g., `gemini-2.5-flash`, `deepseek-v3`). Falls back to the current chat model when empty. | | `summary_model` | `None` | Model for summaries. Strongly recommended to set a fast, economical model (e.g., `gemini-2.5-flash`, `deepseek-v3`). Falls back to the current chat model when empty. |
| `max_summary_tokens` | `4000` | Maximum tokens for the generated summary. | | `summary_model_max_context` | `0` | Max context tokens for the summary model. If 0, falls back to `model_thresholds` or global `max_context_tokens`. |
| `max_summary_tokens` | `16384` | Maximum tokens for the generated summary. |
| `summary_temperature` | `0.3` | Randomness for summary generation. Lower is more deterministic. | | `summary_temperature` | `0.3` | Randomness for summary generation. Lower is more deterministic. |
| `model_thresholds` | `{}` | Per-model overrides for `compression_threshold_tokens` and `max_context_tokens` (useful for mixed models). | | `model_thresholds` | `{}` | Per-model overrides for `compression_threshold_tokens` and `max_context_tokens` (useful for mixed models). |
| `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. | | `enable_tool_output_trimming` | `false` | When enabled and `function_calling: "native"` is active, trims verbose tool outputs to extract only the final answer. |
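
The fallback described for `summary_model_max_context` can be read as a short priority chain. This is a sketch under the assumption that per-model overrides are available as a dict keyed by model ID; `resolve_summary_context_limit` is a hypothetical helper, not the plugin's own function:

```python
from typing import Dict, Optional


def resolve_summary_context_limit(
    summary_model_max_context: int,
    summary_model: Optional[str],
    model_thresholds: Dict[str, Dict[str, int]],
    global_max_context_tokens: int,
) -> int:
    """Pick the context limit that applies to the summary model."""
    # 1. An explicit, non-zero valve wins.
    if summary_model_max_context > 0:
        return summary_model_max_context
    # 2. Otherwise use a per-model override for the summary model, if any.
    override = model_thresholds.get(summary_model or "", {})
    if "max_context_tokens" in override:
        return override["max_context_tokens"]
    # 3. Finally fall back to the global limit.
    return global_max_context_tokens


overrides = {"gpt-4": {"compression_threshold_tokens": 8000, "max_context_tokens": 32000}}
print(resolve_summary_context_limit(1_000_000, "gemini-flash", overrides, 128_000))
print(resolve_summary_context_limit(0, "gpt-4", overrides, 128_000))
print(resolve_summary_context_limit(0, None, overrides, 128_000))
```

Setting the valve explicitly is what lets a large-context summary model condense history from a chat model with a much smaller window.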


@@ -1,11 +1,17 @@
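# Async Context Compression Filter # Async Context Compression Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **Version:** 1.2.0 | **Project:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **License:** MIT **Author:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **Version:** 1.2.1 | **Project:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **License:** MIT
> **Important**: To keep all filters maintainable and easy to use, each filter should ship with clear, complete documentation that fully explains its features, configuration, and usage. > **Important**: To keep all filters maintainable and easy to use, each filter should ship with clear, complete documentation that fully explains its features, configuration, and usage.
This filter uses intelligent summarization and message compression to significantly reduce token consumption in long conversations while keeping them coherent. This filter uses intelligent summarization and message compression to significantly reduce token consumption in long conversations while keeping them coherent.
## What's new in 1.2.1
- **Smart Configuration Enhancements**: Automatically detects the base model configuration of custom models and adds the `summary_model_max_context` parameter to control the summary model's context limit independently.
- **Performance Optimization & Refactoring**: Refactored threshold parsing and added caching, removed redundant processing code, and improved LLM response handling (JSONResponse support).
- **Stability Improvements**: Fixed `datetime` deprecation warnings, corrected type annotations, and replaced print statements with standard logging.
## What's new in 1.2.0 ## What's new in 1.2.0
- **Preflight Context Check**: Before sending to the model, validates that the total tokens fit within the context window. If exceeded, automatically trims or drops the oldest messages. - **Preflight Context Check**: Before sending to the model, validates that the total tokens fit within the context window. If exceeded, automatically trims or drops the oldest messages.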
@@ -21,18 +27,6 @@
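- **Enhanced Stability**: Fixed a race condition in state management, resolving the "inlet state not found" warnings that could appear in high-concurrency scenarios. - **Enhanced Stability**: Fixed a race condition in state management, resolving the "inlet state not found" warnings that could appear in high-concurrency scenarios.
- **Bug Fixes**: Corrected default model handling to prevent misleading logs when no model is specified. - **Bug Fixes**: Corrected default model handling to prevent misleading logs when no model is specified.
## What's new in 1.1.2
- **Open WebUI v0.7.x Compatibility**: Fixed a critical database session binding error affecting Open WebUI v0.7.x users. The plugin now dynamically discovers the database engine and session context, ensuring compatibility across versions.
- **Enhanced Error Reporting**: Errors during background summary generation are now reported via both the status bar and the browser console.
- **Robust Model Handling**: Improved handling of missing or invalid model IDs to prevent crashes.
## What's new in 1.1.1
- **Frontend Debugging**: Added the `show_debug_log` option to print debug info to the browser console (F12).
- **Optimized Compression**: Improved token calculation logic to prevent over-truncation of history and retain more context.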
--- ---
@@ -47,6 +41,8 @@
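- ✅ **Native Tool Output Trimming**: Supports trimming verbose tool call outputs. - ✅ **Native Tool Output Trimming**: Supports trimming verbose tool call outputs.
- ✅ **Real-Time Monitoring**: Monitors context usage in real time and warns when it exceeds 90%. - ✅ **Real-Time Monitoring**: Monitors context usage in real time and warns when it exceeds 90%.
- ✅ **Detailed Logging**: Provides precise token statistics logs for easier debugging. - ✅ **Detailed Logging**: Provides precise token statistics logs for easier debugging.
- ✅ **Smart Model Matching**: Custom models automatically inherit the threshold configuration of their base model.
- ✅ **Multimodal Support**: Image content is preserved, but its tokens are **NOT** counted. Please adjust thresholds accordingly.
For details on how the filter works and its workflow, see the [Workflow Guide](WORKFLOW_GUIDE_CN.md). For details on how the filter works and its workflow, see the [Workflow Guide](WORKFLOW_GUIDE_CN.md).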
@@ -88,6 +84,7 @@
| Parameter | Default | Description | | Parameter | Default | Description |
| :-------------------- | :------ | :------------------------------------------------------------------------------------------------------------------------------------------ | | :-------------------- | :------ | :------------------------------------------------------------------------------------------------------------------------------------------ |
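| `summary_model` | `None` | Model ID used to generate summaries. **Strongly recommended** to configure a fast, economical model with a large context window (e.g., `gemini-2.5-flash`, `deepseek-v3`). If left empty, tries to reuse the current conversation model. | | `summary_model` | `None` | Model ID used to generate summaries. **Strongly recommended** to configure a fast, economical model with a large context window (e.g., `gemini-2.5-flash`, `deepseek-v3`). If left empty, tries to reuse the current conversation model. |
| `summary_model_max_context` | `0` | Max context tokens for the summary model. If 0, falls back to `model_thresholds` or the global `max_context_tokens`. |
| `max_summary_tokens` | `16384` | Maximum tokens allowed when generating the summary. | | `max_summary_tokens` | `16384` | Maximum tokens allowed when generating the summary. |
| `summary_temperature` | `0.1` | Controls the randomness of summary generation; lower values give more stable results. | | `summary_temperature` | `0.1` | Controls the randomness of summary generation; lower values give more stable results. |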


@@ -5,19 +5,17 @@ author: Fu-Jie
author_url: https://github.com/Fu-Jie/awesome-openwebui author_url: https://github.com/Fu-Jie/awesome-openwebui
funding_url: https://github.com/open-webui funding_url: https://github.com/open-webui
description: Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression. description: Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression.
version: 1.2.0 version: 1.2.1
openwebui_id: b1655bc8-6de9-4cad-8cb5-a6f7829a02ce openwebui_id: b1655bc8-6de9-4cad-8cb5-a6f7829a02ce
license: MIT license: MIT
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
📌 What's new in 1.2.0 📌 What's new in 1.2.1
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
Preflight Context Check: Validates context fit before sending to model. Smart Configuration: Automatically detects base model settings for custom models and adds `summary_model_max_context` for independent summary limits.
Structure-Aware Trimming: Collapses long AI responses while keeping H1-H6, intro, and conclusion. Performance & Refactoring: Optimized threshold parsing with caching and removed redundant code for better efficiency.
Native Tool Output Trimming: Cleaner context when using function calling. (Note: Non-native tool outputs are not fully injected into context) Bug Fixes & Modernization: Fixed `datetime` deprecation warnings and corrected type annotations.
✅ Context Usage Warning: Notification when usage exceeds 90%.
✅ Detailed Token Logging: Granular breakdown of System, Head, Summary, and Tail tokens.
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
📌 Overview 📌 Overview
@@ -229,6 +227,8 @@ Statistics:
✓ This filter supports multimodal messages containing images. ✓ This filter supports multimodal messages containing images.
✓ The summary is generated only from the text content. ✓ The summary is generated only from the text content.
✓ Non-text parts (like images) are preserved in their original messages during compression. ✓ Non-text parts (like images) are preserved in their original messages during compression.
⚠ Image tokens are NOT calculated. Different models have vastly different image token costs
(GPT-4o: 85-1105, Claude: ~1300, Gemini: ~258 per image). Plan your thresholds accordingly.
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
🐛 Troubleshooting 🐛 Troubleshooting
@@ -259,7 +259,7 @@ Solution:
""" """
from pydantic import BaseModel, Field, model_validator from pydantic import BaseModel, Field
from typing import Optional, Dict, Any, List, Union, Callable, Awaitable from typing import Optional, Dict, Any, List, Union, Callable, Awaitable
import re import re
import asyncio import asyncio
@@ -267,6 +267,10 @@ import json
import hashlib import hashlib
import time import time
import contextlib import contextlib
import logging
# Setup logger
logger = logging.getLogger(__name__)
# Open WebUI built-in imports # Open WebUI built-in imports
from open_webui.utils.chat import generate_chat_completion from open_webui.utils.chat import generate_chat_completion
@@ -291,7 +295,7 @@ except ImportError:
from sqlalchemy import Column, String, Text, DateTime, Integer, inspect from sqlalchemy import Column, String, Text, DateTime, Integer, inspect
from sqlalchemy.orm import declarative_base, sessionmaker from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.engine import Engine from sqlalchemy.engine import Engine
from datetime import datetime from datetime import datetime, timezone
def _discover_owui_engine(db_module: Any) -> Optional[Engine]: def _discover_owui_engine(db_module: Any) -> Optional[Engine]:
@@ -312,7 +316,7 @@ def _discover_owui_engine(db_module: Any) -> Optional[Engine]:
session, "engine", None session, "engine", None
) )
except Exception as exc: except Exception as exc:
print(f"[DB Discover] get_db_context failed: {exc}") logger.error(f"[DB Discover] get_db_context failed: {exc}")
for attr in ("engine", "ENGINE", "bind", "BIND"): for attr in ("engine", "ENGINE", "bind", "BIND"):
candidate = getattr(db_module, attr, None) candidate = getattr(db_module, attr, None)
@@ -334,7 +338,7 @@ def _discover_owui_schema(db_module: Any) -> Optional[str]:
if isinstance(candidate, str) and candidate.strip(): if isinstance(candidate, str) and candidate.strip():
return candidate.strip() return candidate.strip()
except Exception as exc: except Exception as exc:
print(f"[DB Discover] Base metadata schema lookup failed: {exc}") logger.error(f"[DB Discover] Base metadata schema lookup failed: {exc}")
try: try:
metadata_obj = getattr(db_module, "metadata_obj", None) metadata_obj = getattr(db_module, "metadata_obj", None)
@@ -344,7 +348,7 @@ def _discover_owui_schema(db_module: Any) -> Optional[str]:
if isinstance(candidate, str) and candidate.strip(): if isinstance(candidate, str) and candidate.strip():
return candidate.strip() return candidate.strip()
except Exception as exc: except Exception as exc:
print(f"[DB Discover] metadata_obj schema lookup failed: {exc}") logger.error(f"[DB Discover] metadata_obj schema lookup failed: {exc}")
try: try:
from open_webui import env as owui_env from open_webui import env as owui_env
@@ -353,7 +357,7 @@ def _discover_owui_schema(db_module: Any) -> Optional[str]:
if isinstance(candidate, str) and candidate.strip(): if isinstance(candidate, str) and candidate.strip():
return candidate.strip() return candidate.strip()
except Exception as exc: except Exception as exc:
print(f"[DB Discover] env schema lookup failed: {exc}") logger.error(f"[DB Discover] env schema lookup failed: {exc}")
return None return None
@@ -379,8 +383,21 @@ class ChatSummary(owui_Base):
chat_id = Column(String(255), unique=True, nullable=False, index=True) chat_id = Column(String(255), unique=True, nullable=False, index=True)
summary = Column(Text, nullable=False) summary = Column(Text, nullable=False)
compressed_message_count = Column(Integer, default=0) compressed_message_count = Column(Integer, default=0)
created_at = Column(DateTime, default=datetime.utcnow) created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))
updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow) updated_at = Column(
DateTime,
default=lambda: datetime.now(timezone.utc),
onupdate=lambda: datetime.now(timezone.utc),
)
# Global cache for tiktoken encoding
TIKTOKEN_ENCODING = None
if tiktoken:
try:
TIKTOKEN_ENCODING = tiktoken.get_encoding("o200k_base")
except Exception as e:
logger.error(f"[Init] Failed to load tiktoken encoding: {e}")
class Filter: class Filter:
@@ -391,8 +408,48 @@ class Filter:
self._fallback_session_factory = ( self._fallback_session_factory = (
sessionmaker(bind=self._db_engine) if self._db_engine else None sessionmaker(bind=self._db_engine) if self._db_engine else None
) )
self._model_thresholds_cache: Optional[Dict[str, Any]] = None
self._init_database() self._init_database()
def _parse_model_thresholds(self) -> Dict[str, Any]:
"""Parse model_thresholds string into a dictionary.
Format: model_id:compression_threshold:max_context, model_id2:threshold2:max2
Example: gpt-4:8000:32000, claude-3:100000:200000
Returns cached result if already parsed.
"""
if self._model_thresholds_cache is not None:
return self._model_thresholds_cache
self._model_thresholds_cache = {}
raw_config = self.valves.model_thresholds
if not raw_config:
return self._model_thresholds_cache
for entry in raw_config.split(","):
entry = entry.strip()
if not entry:
continue
parts = entry.split(":")
if len(parts) != 3:
continue
try:
model_id = parts[0].strip()
compression_threshold = int(parts[1].strip())
max_context = int(parts[2].strip())
self._model_thresholds_cache[model_id] = {
"compression_threshold_tokens": compression_threshold,
"max_context_tokens": max_context,
}
except ValueError:
continue
return self._model_thresholds_cache
@contextlib.contextmanager @contextlib.contextmanager
def _db_session(self): def _db_session(self):
"""Yield a database session using Open WebUI helpers with graceful fallbacks.""" """Yield a database session using Open WebUI helpers with graceful fallbacks."""
@@ -435,7 +492,7 @@ class Filter:
try: try:
session.close() session.close()
except Exception as exc: # pragma: no cover - best-effort cleanup except Exception as exc: # pragma: no cover - best-effort cleanup
print(f"[Database] ⚠️ Failed to close fallback session: {exc}") logger.warning(f"[Database] ⚠️ Failed to close fallback session: {exc}")
def _init_database(self): def _init_database(self):
"""Initializes the database table using Open WebUI's shared connection.""" """Initializes the database table using Open WebUI's shared connection."""
@@ -447,19 +504,26 @@ class Filter:
# Check if table exists using SQLAlchemy inspect # Check if table exists using SQLAlchemy inspect
inspector = inspect(self._db_engine) inspector = inspect(self._db_engine)
if not inspector.has_table("chat_summary"): # Support schema if configured
has_table = (
inspector.has_table("chat_summary", schema=owui_schema)
if owui_schema
else inspector.has_table("chat_summary")
)
if not has_table:
# Create the chat_summary table if it doesn't exist # Create the chat_summary table if it doesn't exist
ChatSummary.__table__.create(bind=self._db_engine, checkfirst=True) ChatSummary.__table__.create(bind=self._db_engine, checkfirst=True)
print( logger.info(
"[Database] ✅ Successfully created chat_summary table using Open WebUI's shared database connection." "[Database] ✅ Successfully created chat_summary table using Open WebUI's shared database connection."
) )
else: else:
print( logger.info(
"[Database] ✅ Using Open WebUI's shared database connection. chat_summary table already exists." "[Database] ✅ Using Open WebUI's shared database connection. chat_summary table already exists."
) )
except Exception as e: except Exception as e:
print(f"[Database] ❌ Initialization failed: {str(e)}") logger.error(f"[Database] ❌ Initialization failed: {str(e)}")
class Valves(BaseModel): class Valves(BaseModel):
priority: int = Field( priority: int = Field(
@@ -476,9 +540,9 @@ class Filter:
ge=0, ge=0,
description="Hard limit for context. Exceeding this value will force removal of earliest messages (Global Default)", description="Hard limit for context. Exceeding this value will force removal of earliest messages (Global Default)",
) )
model_thresholds: dict = Field( model_thresholds: str = Field(
default={}, default="",
description="Threshold override configuration for specific models. Only includes models requiring special configuration.", description="Per-model threshold overrides. Format: model_id:compression_threshold:max_context (comma-separated). Example: gpt-4:8000:32000, claude-3:100000:200000",
) )
keep_first: int = Field( keep_first: int = Field(
@@ -489,10 +553,15 @@ class Filter:
keep_last: int = Field( keep_last: int = Field(
default=6, ge=0, description="Always keep the last N full messages." default=6, ge=0, description="Always keep the last N full messages."
) )
summary_model: str = Field( summary_model: Optional[str] = Field(
default=None, default=None,
description="The model ID used to generate the summary. If empty, uses the current conversation's model. Used to match configurations in model_thresholds.", description="The model ID used to generate the summary. If empty, uses the current conversation's model. Used to match configurations in model_thresholds.",
) )
summary_model_max_context: int = Field(
default=0,
ge=0,
description="Max context tokens for the summary model. If 0, falls back to model_thresholds or global max_context_tokens. Example: gemini-flash=1000000, gpt-4o-mini=128000.",
)
max_summary_tokens: int = Field( max_summary_tokens: int = Field(
default=16384, default=16384,
ge=1, ge=1,
@@ -529,7 +598,7 @@ class Filter:
# [Optimization] Optimistic lock check: update only if progress moves forward # [Optimization] Optimistic lock check: update only if progress moves forward
if compressed_count <= existing.compressed_message_count: if compressed_count <= existing.compressed_message_count:
if self.valves.debug_mode: if self.valves.debug_mode:
print( logger.info(
f"[Storage] Skipping update: New progress ({compressed_count}) is not greater than existing progress ({existing.compressed_message_count})" f"[Storage] Skipping update: New progress ({compressed_count}) is not greater than existing progress ({existing.compressed_message_count})"
) )
return return
@@ -537,7 +606,7 @@ class Filter:
# Update existing record # Update existing record
existing.summary = summary existing.summary = summary
existing.compressed_message_count = compressed_count existing.compressed_message_count = compressed_count
existing.updated_at = datetime.utcnow() existing.updated_at = datetime.now(timezone.utc)
else: else:
# Create new record # Create new record
new_summary = ChatSummary( new_summary = ChatSummary(
@@ -551,12 +620,12 @@ class Filter:
if self.valves.debug_mode: if self.valves.debug_mode:
action = "Updated" if existing else "Created" action = "Updated" if existing else "Created"
print( logger.info(
f"[Storage] Summary has been {action.lower()} in the database (Chat ID: {chat_id})" f"[Storage] Summary has been {action.lower()} in the database (Chat ID: {chat_id})"
) )
except Exception as e: except Exception as e:
print(f"[Storage] ❌ Database save failed: {str(e)}") logger.error(f"[Storage] ❌ Database save failed: {str(e)}")
def _load_summary_record(self, chat_id: str) -> Optional[ChatSummary]: def _load_summary_record(self, chat_id: str) -> Optional[ChatSummary]:
"""Loads the summary record object from the database.""" """Loads the summary record object from the database."""
@@ -568,7 +637,7 @@ class Filter:
session.expunge(record) session.expunge(record)
return record return record
except Exception as e: except Exception as e:
print(f"[Load] ❌ Database read failed: {str(e)}") logger.error(f"[Load] ❌ Database read failed: {str(e)}")
return None return None
def _load_summary(self, chat_id: str, body: dict) -> Optional[str]: def _load_summary(self, chat_id: str, body: dict) -> Optional[str]:
@@ -576,8 +645,8 @@ class Filter:
record = self._load_summary_record(chat_id) record = self._load_summary_record(chat_id)
if record: if record:
if self.valves.debug_mode: if self.valves.debug_mode:
print(f"[Load] Loaded summary from database (Chat ID: {chat_id})") logger.info(f"[Load] Loaded summary from database (Chat ID: {chat_id})")
print( logger.info(
f"[Load] Last updated: {record.updated_at}, Compressed message count: {record.compressed_message_count}" f"[Load] Last updated: {record.updated_at}, Compressed message count: {record.compressed_message_count}"
) )
return record.summary return record.summary
@@ -588,14 +657,12 @@ class Filter:
if not text: if not text:
return 0 return 0
if tiktoken: if TIKTOKEN_ENCODING:
try: try:
# Uniformly use o200k_base encoding (adapted for latest models) return len(TIKTOKEN_ENCODING.encode(text))
encoding = tiktoken.get_encoding("o200k_base")
return len(encoding.encode(text))
except Exception as e: except Exception as e:
if self.valves.debug_mode: if self.valves.debug_mode:
print( logger.warning(
f"[Token Count] tiktoken error: {e}, falling back to character estimation" f"[Token Count] tiktoken error: {e}, falling back to character estimation"
) )
@@ -604,6 +671,7 @@ class Filter:
def _calculate_messages_tokens(self, messages: List[Dict]) -> int: def _calculate_messages_tokens(self, messages: List[Dict]) -> int:
"""Calculates the total tokens for a list of messages.""" """Calculates the total tokens for a list of messages."""
start_time = time.time()
total_tokens = 0 total_tokens = 0
for msg in messages: for msg in messages:
content = msg.get("content", "") content = msg.get("content", "")
@@ -616,6 +684,13 @@ class Filter:
total_tokens += self._count_tokens(text_content) total_tokens += self._count_tokens(text_content)
else: else:
total_tokens += self._count_tokens(str(content)) total_tokens += self._count_tokens(str(content))
duration = (time.time() - start_time) * 1000
if self.valves.debug_mode:
logger.info(
f"[Token Calc] Calculated {total_tokens} tokens for {len(messages)} messages in {duration:.2f}ms"
)
return total_tokens return total_tokens
def _get_model_thresholds(self, model_id: str) -> Dict[str, int]: def _get_model_thresholds(self, model_id: str) -> Dict[str, int]:
@@ -623,17 +698,48 @@ class Filter:
Priority: Priority:
1. If configuration exists for the model ID in model_thresholds, use it. 1. If configuration exists for the model ID in model_thresholds, use it.
2. Otherwise, use global parameters compression_threshold_tokens and max_context_tokens. 2. If model is a custom model, try to match its base_model_id.
3. Otherwise, use global parameters compression_threshold_tokens and max_context_tokens.
""" """
# Try to match from model-specific configuration parsed = self._parse_model_thresholds()
if model_id in self.valves.model_thresholds:
if self.valves.debug_mode:
print(f"[Config] Using model-specific configuration: {model_id}")
return self.valves.model_thresholds[model_id]
# Use global default configuration # 1. Direct match with model_id
if model_id in parsed:
if self.valves.debug_mode:
logger.info(f"[Config] Using model-specific configuration: {model_id}")
return parsed[model_id]
# 2. Try to find base_model_id for custom models
try:
model_obj = Models.get_model_by_id(model_id)
if model_obj:
# Check for base_model_id (custom model)
base_model_id = getattr(model_obj, "base_model_id", None)
if not base_model_id:
# Try base_model_ids (array) - take first one
base_model_ids = getattr(model_obj, "base_model_ids", None)
if (
base_model_ids
and isinstance(base_model_ids, list)
and len(base_model_ids) > 0
):
base_model_id = base_model_ids[0]
if base_model_id and base_model_id in parsed:
if self.valves.debug_mode:
logger.info(
f"[Config] Custom model '{model_id}' -> base_model '{base_model_id}': using base model configuration"
)
return parsed[base_model_id]
except Exception as e:
if self.valves.debug_mode:
logger.warning(
f"[Config] Failed to lookup base_model for '{model_id}': {e}"
)
# 3. Use global default configuration
if self.valves.debug_mode: if self.valves.debug_mode:
print( logger.info(
f"[Config] Model {model_id} not in model_thresholds, using global parameters" f"[Config] Model {model_id} not in model_thresholds, using global parameters"
) )
@@ -731,13 +837,13 @@ class Filter:
} }
) )
except Exception as e: except Exception as e:
print(f"Error emitting debug log: {e}") logger.error(f"Error emitting debug log: {e}")
async def _log(self, message: str, type: str = "info", event_call=None): async def _log(self, message: str, type: str = "info", event_call=None):
"""Unified logging to both backend (print) and frontend (console.log)""" """Unified logging to both backend (print) and frontend (console.log)"""
# Backend logging # Backend logging
if self.valves.debug_mode: if self.valves.debug_mode:
print(message) logger.info(message)
# Frontend logging # Frontend logging
if self.valves.show_debug_log and event_call: if self.valves.show_debug_log and event_call:
@@ -770,9 +876,17 @@ class Filter:
js_code = f""" js_code = f"""
console.log("%c[Compression] {safe_message}", "{css}"); console.log("%c[Compression] {safe_message}", "{css}");
""" """
await event_call({"type": "execute", "data": {"code": js_code}}) # Add timeout to prevent blocking if frontend connection is broken
await asyncio.wait_for(
event_call({"type": "execute", "data": {"code": js_code}}),
timeout=2.0,
)
except asyncio.TimeoutError:
logger.warning(
"Failed to emit log to frontend: Timeout (connection may be broken)"
)
except Exception as e: except Exception as e:
print(f"Failed to emit log to frontend: {e}") logger.error(f"Failed to emit log to frontend: {e.__class__.__name__}: {e}")  # `type` is shadowed by the `_log` parameter, so type(e) would fail here
async def inlet( async def inlet(
self, self,
@@ -819,42 +933,57 @@ class Filter:
event_call=__event_call__, event_call=__event_call__,
) )
# Extract the final answer (after last tool call metadata) # Strategy 1: Tool Output / Code Block Trimming
# Pattern: Matches escaped JSON strings like ""&quot;...&quot;"" followed by newlines # Detect if message contains large tool outputs or code blocks
# We look for the last occurrence of such a pattern and take everything after it # Improved regex to be less brittle
is_tool_output = (
# 1. Try matching the specific OpenWebUI tool output format: ""&quot;...&quot;"" "&quot;" in content
# This regex finds the last end-quote of a tool output block or "Arguments:" in content
tool_output_pattern = r'""&quot;.*?&quot;""\s*' or "```" in content
or "<tool_code>" in content
# Find all matches
matches = list(
re.finditer(tool_output_pattern, content, re.DOTALL)
) )
if matches: if is_tool_output:
# Get the end position of the last match # Regex to find the last occurrence of a tool output block or code block
last_match_end = matches[-1].end() # This pattern looks for:
# 1. OpenWebUI's escaped JSON format: ""&quot;...&quot;""
# 2. "Arguments: {...}" pattern
# 3. Generic code blocks: ```...```
# 4. <tool_code>...</tool_code>
# It captures the content *after* the last such block.
tool_output_pattern = r'(?:""&quot;.*?&quot;""|Arguments:\s*\{[^}]+\}|```.*?```|<tool_code>.*?</tool_code>)\s*'
# Everything after the last tool output is the final answer # Find all matches
final_answer = content[last_match_end:].strip() matches = list(
re.finditer(tool_output_pattern, content, re.DOTALL)
)
if matches:
# Get the end position of the last match
last_match_end = matches[-1].end()
# Everything after the last tool output is the final answer
final_answer = content[last_match_end:].strip()
if final_answer:
msg["content"] = (
f"... [Tool outputs trimmed]\n{final_answer}"
)
trimmed_count += 1
else:
# Fallback: Try splitting on "Arguments:" if the new format isn't found
# (Preserving backward compatibility or different model behaviors)
parts = re.split(r"(?:Arguments:\s*\{[^}]+\})\n+", content)
if len(parts) > 1:
final_answer = parts[-1].strip()
if final_answer: if final_answer:
msg["content"] = ( msg["content"] = (
f"... [Tool outputs trimmed]\n{final_answer}" f"... [Tool outputs trimmed]\n{final_answer}"
) )
trimmed_count += 1 trimmed_count += 1
else:
# Fallback: If no specific pattern matched, but it was identified as tool output,
# try a simpler split or just mark as trimmed if no final answer can be extracted.
# (Preserving backward compatibility or different model behaviors)
parts = re.split(
r"(?:Arguments:\s*\{[^}]+\})\n+", content
)
if len(parts) > 1:
final_answer = parts[-1].strip()
if final_answer:
msg["content"] = (
f"... [Tool outputs trimmed]\n{final_answer}"
)
trimmed_count += 1
if trimmed_count > 0 and self.valves.show_debug_log and __event_call__: if trimmed_count > 0 and self.valves.show_debug_log and __event_call__:
await self._log( await self._log(
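Reviewer note: the trimming strategy in this hunk keeps only the text after the last tool-output block. The sketch below reuses the regex from the diff in a standalone function so its behavior can be checked. `trim_tool_output` and `FENCE` are hypothetical names, and the fence string is built at runtime only to avoid nesting literal code fences in this example.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime

# Same alternatives as the pattern in the diff: escaped JSON block,
# "Arguments: {...}", fenced code block, <tool_code>...</tool_code>.
TOOL_OUTPUT_PATTERN = re.compile(
    r'(?:""&quot;.*?&quot;""'
    r"|Arguments:\s*\{[^}]+\}"
    r"|" + FENCE + r".*?" + FENCE +
    r"|<tool_code>.*?</tool_code>)\s*",
    re.DOTALL,
)


def trim_tool_output(content: str) -> str:
    """Return the text after the last tool-output block, with a marker prefix.

    The content is returned unchanged when no block matches or when nothing
    follows the last block.
    """
    matches = list(TOOL_OUTPUT_PATTERN.finditer(content))
    if not matches:
        return content
    final_answer = content[matches[-1].end():].strip()
    if not final_answer:
        return content
    return f"... [Tool outputs trimmed]\n{final_answer}"
```

Note that `[^}]+` stops at the first closing brace, so an `Arguments:` payload containing nested JSON objects is only partially matched; the sketch inherits that limitation from the pattern.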
@@ -881,7 +1010,8 @@ class Filter:
) )
# Clean model ID if needed (though get_model_by_id usually expects the full ID) # Clean model ID if needed (though get_model_by_id usually expects the full ID)
model_obj = Models.get_model_by_id(model_id) # Run in thread to avoid blocking event loop on slow DB queries
model_obj = await asyncio.to_thread(Models.get_model_by_id, model_id)
if model_obj: if model_obj:
if self.valves.show_debug_log and __event_call__: if self.valves.show_debug_log and __event_call__:
@@ -933,8 +1063,7 @@ class Filter:
else: else:
if self.valves.show_debug_log and __event_call__: if self.valves.show_debug_log and __event_call__:
await self._log( await self._log(
f"[Inlet] ❌ Model NOT found in DB", f"[Inlet] Not a custom model, skipping custom system prompt check",
type="warning",
event_call=__event_call__, event_call=__event_call__,
) )
@@ -946,7 +1075,7 @@ class Filter:
event_call=__event_call__, event_call=__event_call__,
) )
if self.valves.debug_mode: if self.valves.debug_mode:
print(f"[Inlet] Error fetching system prompt from DB: {e}") logger.error(f"[Inlet] Error fetching system prompt from DB: {e}")
# Fall back to checking messages (base model or already included) # Fall back to checking messages (base model or already included)
if not system_prompt_content: if not system_prompt_content:
@@ -960,7 +1089,7 @@ class Filter:
if system_prompt_content: if system_prompt_content:
system_prompt_msg = {"role": "system", "content": system_prompt_content} system_prompt_msg = {"role": "system", "content": system_prompt_content}
if self.valves.debug_mode: if self.valves.debug_mode:
print( logger.info(
f"[Inlet] Found system prompt ({len(system_prompt_content)} chars). Including in budget." f"[Inlet] Found system prompt ({len(system_prompt_content)} chars). Including in budget."
) )
@@ -991,7 +1120,7 @@ class Filter:
f"[Inlet] Message Stats: {stats_str}", event_call=__event_call__ f"[Inlet] Message Stats: {stats_str}", event_call=__event_call__
) )
except Exception as e: except Exception as e:
print(f"[Inlet] Error logging message stats: {e}") logger.error(f"[Inlet] Error logging message stats: {e}")
if not chat_id: if not chat_id:
await self._log( await self._log(
@@ -1007,6 +1136,33 @@ class Filter:
event_call=__event_call__, event_call=__event_call__,
) )
# Log custom model configurations
raw_config = self.valves.model_thresholds
parsed_configs = self._parse_model_thresholds()
if raw_config:
config_list = [
f"{model}: {cfg['compression_threshold_tokens']}t/{cfg['max_context_tokens']}t"
for model, cfg in parsed_configs.items()
]
if config_list:
await self._log(
f"[Inlet] 📋 Model Configs (Raw: '{raw_config}'): {', '.join(config_list)}",
event_call=__event_call__,
)
else:
await self._log(
f"[Inlet] ⚠️ Invalid Model Configs (Raw: '{raw_config}'): No valid configs parsed. Expected format: 'model_id:threshold:max_context'",
type="warning",
event_call=__event_call__,
)
else:
await self._log(
f"[Inlet] 📋 Model Configs: No custom configuration (Global defaults only)",
event_call=__event_call__,
)
# Record the target compression progress for the original messages, for use in outlet # Record the target compression progress for the original messages, for use in outlet
# Target is to compress up to the (total - keep_last) message # Target is to compress up to the (total - keep_last) message
target_compressed_count = max(0, len(messages) - self.valves.keep_last) target_compressed_count = max(0, len(messages) - self.valves.keep_last)
@@ -1043,9 +1199,9 @@ class Filter:
if effective_keep_first > 0: if effective_keep_first > 0:
head_messages = messages[:effective_keep_first] head_messages = messages[:effective_keep_first]
# 2. Summary message (Inserted as User message) # 2. Summary message (Inserted as Assistant message)
summary_content = ( summary_content = (
f"System Prompt: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n" f"Previous Summary: The following is a summary of the historical conversation, provided for context only. Do not reply to the summary content itself; answer the subsequent latest questions directly.】\n\n"
f"{summary_record.summary}\n\n" f"{summary_record.summary}\n\n"
f"---\n" f"---\n"
f"Below is the recent conversation:" f"Below is the recent conversation:"
@@ -1287,7 +1443,7 @@ class Filter:
# Get max context limit # Get max context limit
model = self._clean_model_id(body.get("model")) model = self._clean_model_id(body.get("model"))
thresholds = self._get_model_thresholds(model) thresholds = self._get_model_thresholds(model) or {}
max_context_tokens = thresholds.get( max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens "max_context_tokens", self.valves.max_context_tokens
) )
@@ -1314,7 +1470,8 @@ class Filter:
> start_trim_index + 1 # Keep at least 1 message after keep_first > start_trim_index + 1 # Keep at least 1 message after keep_first
): ):
dropped = final_messages.pop(start_trim_index) dropped = final_messages.pop(start_trim_index)
total_tokens -= self._count_tokens(str(dropped.get("content", ""))) dropped_tokens = self._count_tokens(str(dropped.get("content", "")))
total_tokens -= dropped_tokens
await self._log( await self._log(
f"[Inlet] ✂️ Messages reduced. New total: {total_tokens} Tokens", f"[Inlet] ✂️ Messages reduced. New total: {total_tokens} Tokens",
@@ -1371,18 +1528,11 @@ class Filter:
) )
return body return body
model = body.get("model") or "" model = body.get("model") or ""
messages = body.get("messages", [])
# Calculate target compression progress directly # Calculate target compression progress directly
# Assuming body['messages'] in outlet contains the full history (including new response)
messages = body.get("messages", [])
target_compressed_count = max(0, len(messages) - self.valves.keep_last) target_compressed_count = max(0, len(messages) - self.valves.keep_last)
if self.valves.debug_mode or self.valves.show_debug_log:
await self._log(
f"\n{'='*60}\n[Outlet] Chat ID: {chat_id}\n[Outlet] Response complete\n[Outlet] Calculated target compression progress: {target_compressed_count} (Messages: {len(messages)})",
event_call=__event_call__,
)
# Process Token calculation and summary generation asynchronously in the background (do not wait for completion, do not affect output) # Process Token calculation and summary generation asynchronously in the background (do not wait for completion, do not affect output)
asyncio.create_task( asyncio.create_task(
self._check_and_generate_summary_async( self._check_and_generate_summary_async(
@@ -1396,11 +1546,6 @@ class Filter:
) )
) )
await self._log(
f"[Outlet] Background processing started\n{'='*60}\n",
event_call=__event_call__,
)
return body return body
async def _check_and_generate_summary_async( async def _check_and_generate_summary_async(
@@ -1416,11 +1561,25 @@ class Filter:
""" """
Background processing: Calculates Token count and generates summary (does not block response). Background processing: Calculates Token count and generates summary (does not block response).
""" """
try: try:
messages = body.get("messages", []) messages = body.get("messages", [])
# Clean model ID
model = self._clean_model_id(model)
if self.valves.debug_mode or self.valves.show_debug_log:
await self._log(
f"\n{'='*60}\n[Outlet] Chat ID: {chat_id}\n[Outlet] Response complete\n[Outlet] Calculated target compression progress: {target_compressed_count} (Messages: {len(messages)})",
event_call=__event_call__,
)
await self._log(
f"[Outlet] Background processing started\n{'='*60}\n",
event_call=__event_call__,
)
# Get threshold configuration for current model # Get threshold configuration for current model
thresholds = self._get_model_thresholds(model) thresholds = self._get_model_thresholds(model) or {}
compression_threshold_tokens = thresholds.get( compression_threshold_tokens = thresholds.get(
"compression_threshold_tokens", self.valves.compression_threshold_tokens "compression_threshold_tokens", self.valves.compression_threshold_tokens
) )
@@ -1440,6 +1599,28 @@ class Filter:
event_call=__event_call__, event_call=__event_call__,
) )
# Send status notification (Context Usage format)
if __event_emitter__ and self.valves.show_token_usage_status:
max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens
)
status_msg = f"Context Usage (Estimated): {current_tokens} / {max_context_tokens} Tokens"
if max_context_tokens > 0:
usage_ratio = current_tokens / max_context_tokens
status_msg += f" ({usage_ratio*100:.1f}%)"
if usage_ratio > 0.9:
status_msg += " | ⚠️ High Usage"
await __event_emitter__(
{
"type": "status",
"data": {
"description": status_msg,
"done": True,
},
}
)
# Check if compression is needed # Check if compression is needed
if current_tokens >= compression_threshold_tokens: if current_tokens >= compression_threshold_tokens:
await self._log( await self._log(
@@ -1559,10 +1740,13 @@ class Filter:
return return
thresholds = self._get_model_thresholds(summary_model_id) thresholds = self._get_model_thresholds(summary_model_id)
# Note: Using the summary model's max context limit here # Priority: 1. summary_model_max_context (if > 0) -> 2. model_thresholds -> 3. global max_context_tokens
max_context_tokens = thresholds.get( if self.valves.summary_model_max_context > 0:
"max_context_tokens", self.valves.max_context_tokens max_context_tokens = self.valves.summary_model_max_context
) else:
max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens
)
await self._log( await self._log(
f"[🤖 Async Summary Task] Using max limit for model {summary_model_id}: {max_context_tokens} Tokens", f"[🤖 Async Summary Task] Using max limit for model {summary_model_id}: {max_context_tokens} Tokens",
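Reviewer note: the three-level priority described in the comment above (valve override, then per-model thresholds, then the global default) can be written as a small pure function. This is a sketch under the assumption that `thresholds` is a dict or `None`; `resolve_summary_max_context` is a hypothetical helper, not a method of the plugin.

```python
from typing import Dict, Optional


def resolve_summary_max_context(
    summary_model_max_context: int,
    thresholds: Optional[Dict[str, int]],
    global_max_context_tokens: int,
) -> int:
    """Pick the context limit for the summary model.

    Priority: 1. summary_model_max_context (if > 0)
              2. thresholds["max_context_tokens"]
              3. global_max_context_tokens
    """
    if summary_model_max_context > 0:
        return summary_model_max_context
    # `or {}` guards against a lookup that returned None (an assumption here).
    return (thresholds or {}).get("max_context_tokens", global_max_context_tokens)
```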
@@ -1753,7 +1937,6 @@ class Filter:
max_context_tokens = thresholds.get( max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens "max_context_tokens", self.valves.max_context_tokens
) )
# 6. Emit Status # 6. Emit Status
status_msg = f"Context Summary Updated: {token_count} / {max_context_tokens} Tokens" status_msg = f"Context Summary Updated: {token_count} / {max_context_tokens} Tokens"
if max_context_tokens > 0: if max_context_tokens > 0:
@@ -1798,7 +1981,7 @@ class Filter:
import traceback import traceback
traceback.print_exc() logger.exception("[🤖 Async Summary Task] Unhandled exception")
def _format_messages_for_summary(self, messages: list) -> str: def _format_messages_for_summary(self, messages: list) -> str:
"""Formats messages for summarization.""" """Formats messages for summarization."""
@@ -1818,9 +2001,8 @@ class Filter:
# Handle role name # Handle role name
role_name = {"user": "User", "assistant": "Assistant"}.get(role, role) role_name = {"user": "User", "assistant": "Assistant"}.get(role, role)
# Limit length of each message to avoid excessive length # User requested to remove truncation to allow full context for summary
if len(content) > 500: # unless it exceeds model limits (which is handled by the LLM call itself or max_tokens)
content = content[:500] + "..."
formatted.append(f"[{i}] {role_name}: {content}") formatted.append(f"[{i}] {role_name}: {content}")
@@ -1927,8 +2109,25 @@ Based on the content above, generate the summary:
# Call generate_chat_completion # Call generate_chat_completion
response = await generate_chat_completion(request, payload, user) response = await generate_chat_completion(request, payload, user)
if not response or "choices" not in response or not response["choices"]: # Handle JSONResponse (some backends return JSONResponse instead of dict)
raise ValueError("LLM response format incorrect or empty") if hasattr(response, "body"):
# It's a Response object, extract the body
import json as json_module
try:
response = json_module.loads(response.body.decode("utf-8"))
except Exception:
raise ValueError(f"Failed to parse JSONResponse body: {response}")
if (
not response
or not isinstance(response, dict)
or "choices" not in response
or not response["choices"]
):
raise ValueError(
f"LLM response format incorrect or empty: {type(response).__name__}"
)
summary = response["choices"][0]["message"]["content"].strip() summary = response["choices"][0]["message"]["content"].strip()
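Reviewer note: the response handling added in this hunk accepts either a plain dict or an object exposing a bytes `.body`. A minimal standalone sketch of that normalization follows; `extract_summary` and `FakeJSONResponse` are hypothetical names used so the example runs without a web framework.

```python
import json


class FakeJSONResponse:
    """Hypothetical stand-in for a response object with a bytes `.body`."""

    def __init__(self, payload):
        self.body = json.dumps(payload).encode("utf-8")


def extract_summary(response):
    """Return the first choice's message content from a dict or body-bearing object."""
    if hasattr(response, "body"):
        try:
            response = json.loads(response.body.decode("utf-8"))
        except Exception as exc:
            raise ValueError("Failed to parse response body") from exc
    if not isinstance(response, dict) or not response.get("choices"):
        raise ValueError(
            f"LLM response format incorrect or empty: {type(response).__name__}"
        )
    return response["choices"][0]["message"]["content"].strip()
```

Raising with `from exc` keeps the original decoding error in the traceback, which a bare `raise ValueError(...)` inside `except` would also chain implicitly but less explicitly.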


@@ -5,19 +5,17 @@ author: Fu-Jie
author_url: https://github.com/Fu-Jie/awesome-openwebui author_url: https://github.com/Fu-Jie/awesome-openwebui
funding_url: https://github.com/open-webui funding_url: https://github.com/open-webui
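description: Reduces token consumption in long conversations through intelligent summarization and message compression while keeping the conversation coherent. description: Reduces token consumption in long conversations through intelligent summarization and message compression while keeping the conversation coherent.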
version: 1.2.0 version: 1.2.1
openwebui_id: 5c0617cb-a9e4-4bd6-a440-d276534ebd18 openwebui_id: 5c0617cb-a9e4-4bd6-a440-d276534ebd18
license: MIT license: MIT
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
📌 What's new in 1.2.0 📌 What's new in 1.2.1
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
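Pre-flight context check: verifies that the context fits before it is sent to the model Smarter configuration: automatically detects the base model settings of custom models, and adds the `summary_model_max_context` parameter to control the summary model's context limit independently
Structure-aware trimming: collapses overly long AI responses while preserving headings (H1-H6), the opening, and the ending Performance and refactoring: reworked threshold parsing with added caching, removed redundant processing code, and improved LLM response handling (JSONResponse support)
Native tool output trimming: cleans up the context when function calling is used, removing redundant output. (Note: output from non-native tool calls is not fully injected into the context) Stability improvements: fixed `datetime` deprecation warnings, corrected type annotations, and replaced print statements with standard logging.
✅ Context usage warning: sends a notification when usage exceeds 90%.
✅ Detailed token logging: fine-grained logging of token consumption for System, Head, Summary, and Tail.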
═══════════════════════════════════════════════════════════════════════════════ ═══════════════════════════════════════════════════════════════════════════════
📌 Feature overview 📌 Feature overview
@@ -254,23 +252,36 @@ show_debug_log (frontend debug log)
from pydantic import BaseModel, Field, model_validator from pydantic import BaseModel, Field, model_validator
from typing import Optional, Dict, Any, List, Union, Callable, Awaitable from typing import Optional, Dict, Any, List, Union, Callable, Awaitable
import re
import asyncio import asyncio
import json import json
import hashlib import hashlib
import time import contextlib
import re import logging
# Configure logging
logger = logging.getLogger(__name__)
if not logger.handlers:
handler = logging.StreamHandler()
formatter = logging.Formatter(
"%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.INFO)
# Open WebUI built-in imports # Open WebUI built-in imports
from open_webui.utils.chat import generate_chat_completion from open_webui.utils.chat import generate_chat_completion
from open_webui.models.models import Models
from open_webui.models.users import Users from open_webui.models.users import Users
from open_webui.models.models import Models
from fastapi.requests import Request from fastapi.requests import Request
from open_webui.main import app as webui_app from open_webui.main import app as webui_app
# Open WebUI internal database (reuses the shared connection) # Open WebUI internal database (reuses the shared connection)
from open_webui.internal.db import engine as owui_engine try:
from open_webui.internal.db import Session as owui_Session from open_webui.internal import db as owui_db
from open_webui.internal.db import Base as owui_Base except ModuleNotFoundError: # pragma: no cover - filter runs inside Open WebUI
owui_db = None
# Try to import tiktoken # Try to import tiktoken
try: try:
@@ -280,35 +291,167 @@ except ImportError:
# Database imports # Database imports
from sqlalchemy import Column, String, Text, DateTime, Integer, inspect from sqlalchemy import Column, String, Text, DateTime, Integer, inspect
from datetime import datetime from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.engine import Engine
from datetime import datetime, timezone
def _discover_owui_engine(db_module: Any) -> Optional[Engine]:
"""Discover the Open WebUI SQLAlchemy engine via provided db module helpers."""
if db_module is None:
return None
db_context = getattr(db_module, "get_db_context", None) or getattr(
db_module, "get_db", None
)
if callable(db_context):
try:
with db_context() as session:
try:
return session.get_bind()
except AttributeError:
return getattr(session, "bind", None) or getattr(
session, "engine", None
)
except Exception as exc:
print(f"[DB Discover] get_db_context failed: {exc}")
for attr in ("engine", "ENGINE", "bind", "BIND"):
candidate = getattr(db_module, attr, None)
if candidate is not None:
return candidate
return None
def _discover_owui_schema(db_module: Any) -> Optional[str]:
"""Discover the Open WebUI database schema name if configured."""
if db_module is None:
return None
try:
base = getattr(db_module, "Base", None)
metadata = getattr(base, "metadata", None) if base is not None else None
candidate = getattr(metadata, "schema", None) if metadata is not None else None
if isinstance(candidate, str) and candidate.strip():
return candidate.strip()
except Exception as exc:
print(f"[DB Discover] Base metadata schema lookup failed: {exc}")
try:
metadata_obj = getattr(db_module, "metadata_obj", None)
candidate = (
getattr(metadata_obj, "schema", None) if metadata_obj is not None else None
)
if isinstance(candidate, str) and candidate.strip():
return candidate.strip()
except Exception as exc:
print(f"[DB Discover] metadata_obj schema lookup failed: {exc}")
try:
from open_webui import env as owui_env
candidate = getattr(owui_env, "DATABASE_SCHEMA", None)
if isinstance(candidate, str) and candidate.strip():
return candidate.strip()
except Exception as exc:
print(f"[DB Discover] env schema lookup failed: {exc}")
return None
owui_engine = _discover_owui_engine(owui_db)
owui_schema = _discover_owui_schema(owui_db)
owui_Base = getattr(owui_db, "Base", None) if owui_db is not None else None
if owui_Base is None:
owui_Base = declarative_base()
class ChatSummary(owui_Base): class ChatSummary(owui_Base):
"""Storage table for conversation summaries.""" """Storage table for conversation summaries."""
__tablename__ = "chat_summary" __tablename__ = "chat_summary"
__table_args__ = {"extend_existing": True} __table_args__ = (
{"extend_existing": True, "schema": owui_schema}
if owui_schema
else {"extend_existing": True}
)
id = Column(Integer, primary_key=True, autoincrement=True) id = Column(Integer, primary_key=True, autoincrement=True)
chat_id = Column(String(255), unique=True, nullable=False, index=True) chat_id = Column(String(255), unique=True, nullable=False, index=True)
summary = Column(Text, nullable=False) summary = Column(Text, nullable=False)
compressed_message_count = Column(Integer, default=0) compressed_message_count = Column(Integer, default=0)
created_at = Column(DateTime, default=datetime.utcnow) created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))
updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow) updated_at = Column(
DateTime,
default=lambda: datetime.now(timezone.utc),
onupdate=lambda: datetime.now(timezone.utc),
)
class Filter: class Filter:
def __init__(self): def __init__(self):
self.valves = self.Valves() self.valves = self.Valves()
self._owui_db = owui_db
self._db_engine = owui_engine self._db_engine = owui_engine
self._SessionLocal = owui_Session self._fallback_session_factory = (
self._SessionLocal = owui_Session sessionmaker(bind=self._db_engine) if self._db_engine else None
self._init_database() )
self._threshold_cache = {}
self._init_database() self._init_database()
@contextlib.contextmanager
def _db_session(self):
"""Yield a database session using Open WebUI helpers with graceful fallbacks."""
db_module = self._owui_db
db_context = None
if db_module is not None:
db_context = getattr(db_module, "get_db_context", None) or getattr(
db_module, "get_db", None
)
if callable(db_context):
with db_context() as session:
yield session
return
factory = None
if db_module is not None:
factory = getattr(db_module, "SessionLocal", None) or getattr(
db_module, "ScopedSession", None
)
if callable(factory):
session = factory()
try:
yield session
finally:
close = getattr(session, "close", None)
if callable(close):
close()
return
if self._fallback_session_factory is None:
raise RuntimeError(
"Open WebUI database session is unavailable. Ensure Open WebUI's database layer is initialized."
)
session = self._fallback_session_factory()
try:
yield session
finally:
try:
session.close()
except Exception as exc: # pragma: no cover - best-effort cleanup
print(f"[Database] ⚠️ Failed to close fallback session: {exc}")
def _init_database(self): def _init_database(self):
"""Initialize the database table using Open WebUI's shared connection.""" """Initialize the database table using Open WebUI's shared connection."""
try: try:
if self._db_engine is None:
raise RuntimeError(
"Open WebUI database engine is unavailable. Ensure Open WebUI is configured with a valid DATABASE_URL."
)
# Use SQLAlchemy inspect to check whether the table exists # Use SQLAlchemy inspect to check whether the table exists
inspector = inspect(self._db_engine) inspector = inspect(self._db_engine)
if not inspector.has_table("chat_summary"): if not inspector.has_table("chat_summary"):
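Reviewer note: the column defaults in this hunk switch from `datetime.utcnow` to a lambda around `datetime.now(timezone.utc)`. The sketch below illustrates, without SQLAlchemy, why the default is passed as a callable: it is evaluated each time a row is created rather than once at import time. `utc_now` and `make_row` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


def make_row(default=utc_now) -> dict:
    # The callable is invoked per row, so each row gets a fresh timestamp.
    return {"created_at": default()}
```

Unlike `datetime.utcnow()`, the returned value carries `tzinfo`, so comparing it with naive timestamps already stored in the table raises `TypeError`; whether that matters depends on how the column is read back.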
@@ -340,21 +483,38 @@ class Filter:
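ge=0, ge=0,
description="Hard upper limit for the context. When exceeded, the earliest messages are forcibly removed (global default)", description="Hard upper limit for the context. When exceeded, the earliest messages are forcibly removed (global default)",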
) )
model_thresholds: dict = Field( model_thresholds: Union[str, dict] = Field(
default={}, default={},
description="Per-model threshold overrides. Include only models that need special configuration", description="Per-model threshold overrides. Can be a JSON string or a dict",
) )
@model_validator(mode="before")
@classmethod
def parse_model_thresholds(cls, data: Any) -> Any:
if isinstance(data, dict):
thresholds = data.get("model_thresholds")
if isinstance(thresholds, str) and thresholds.strip():
try:
data["model_thresholds"] = json.loads(thresholds)
except Exception as e:
logger.error(f"Failed to parse model_thresholds JSON: {e}")
return data
keep_first: int = Field( keep_first: int = Field(
default=1, ge=0, description="Always keep the first N messages. Set to 0 to keep none." default=1, ge=0, description="Always keep the first N messages. Set to 0 to keep none."
) )
keep_last: int = Field( keep_last: int = Field(
default=6, ge=0, description="Always keep the most recent N complete messages." default=6, ge=0, description="Always keep the most recent N complete messages."
) )
summary_model: str = Field( summary_model: Optional[str] = Field(
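default=None, default=None,
description="ID of the model used to generate summaries. Leave empty to use the current conversation's model. Used to match entries in model_thresholds.", description="ID of the model used to generate summaries. Leave empty to use the current conversation's model. Used to match entries in model_thresholds.",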
) )
summary_model_max_context: int = Field(
default=0,
ge=0,
description="Maximum number of context tokens for the summary model. If 0, falls back to model_thresholds or the global max_context_tokens.",
)
max_summary_tokens: int = Field( max_summary_tokens: int = Field(
default=16384, ge=1, description="Maximum number of tokens in the summary" default=16384, ge=1, description="Maximum number of tokens in the summary"
) )
@@ -376,7 +536,7 @@ class Filter:
def _save_summary(self, chat_id: str, summary: str, compressed_count: int): def _save_summary(self, chat_id: str, summary: str, compressed_count: int):
"""Save the summary to the database.""" """Save the summary to the database."""
try: try:
with self._SessionLocal() as session: with self._db_session() as session:
# Look up the existing record # Look up the existing record
existing = session.query(ChatSummary).filter_by(chat_id=chat_id).first() existing = session.query(ChatSummary).filter_by(chat_id=chat_id).first()
@@ -384,7 +544,7 @@ class Filter:
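# [Optimization] Optimistic lock check: update only when progress moves forward # [Optimization] Optimistic lock check: update only when progress moves forward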
if compressed_count <= existing.compressed_message_count: if compressed_count <= existing.compressed_message_count:
if self.valves.debug_mode: if self.valves.debug_mode:
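print( logger.debug(
f"[Storage] Skipping update: new progress ({compressed_count}) is not greater than existing progress ({existing.compressed_message_count})" f"[Storage] Skipping update: new progress ({compressed_count}) is not greater than existing progress ({existing.compressed_message_count})"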
) )
return return
@@ -392,7 +552,7 @@ class Filter:
# Update the existing record # Update the existing record
existing.summary = summary existing.summary = summary
existing.compressed_message_count = compressed_count existing.compressed_message_count = compressed_count
existing.updated_at = datetime.utcnow() existing.updated_at = datetime.now(timezone.utc)
else: else:
# Create a new record # Create a new record
new_summary = ChatSummary( new_summary = ChatSummary(
@@ -406,22 +566,22 @@ class Filter:
if self.valves.debug_mode: if self.valves.debug_mode:
action = "updated" if existing else "created" action = "updated" if existing else "created"
print(f"[Storage] Summary {action} in database (Chat ID: {chat_id})") logger.info(f"[Storage] Summary {action} in database (Chat ID: {chat_id})")
except Exception as e: except Exception as e:
print(f"[Storage] ❌ Failed to save to database: {str(e)}") logger.error(f"[Storage] ❌ Failed to save to database: {str(e)}")
def _load_summary_record(self, chat_id: str) -> Optional[ChatSummary]: def _load_summary_record(self, chat_id: str) -> Optional[ChatSummary]:
"""Load the summary record object from the database.""" """Load the summary record object from the database."""
try: try:
with self._SessionLocal() as session: with self._db_session() as session:
record = session.query(ChatSummary).filter_by(chat_id=chat_id).first() record = session.query(ChatSummary).filter_by(chat_id=chat_id).first()
if record: if record:
# Detach the object from the session so it can be used after session close # Detach the object from the session so it can be used after session close
session.expunge(record) session.expunge(record)
return record return record
except Exception as e: except Exception as e:
print(f"[Load] ❌ Failed to read from database: {str(e)}") logger.error(f"[Load] ❌ Failed to read from database: {str(e)}")
return None return None
def _load_summary(self, chat_id: str, body: dict) -> Optional[str]: def _load_summary(self, chat_id: str, body: dict) -> Optional[str]:
@@ -429,8 +589,8 @@ class Filter:
record = self._load_summary_record(chat_id) record = self._load_summary_record(chat_id)
if record: if record:
if self.valves.debug_mode: if self.valves.debug_mode:
print(f"[Load] Loaded summary from database (Chat ID: {chat_id})") logger.debug(f"[Load] Loaded summary from database (Chat ID: {chat_id})")
print( logger.debug(
f"[Load] Updated at: {record.updated_at}, compressed message count: {record.compressed_message_count}" f"[Load] Updated at: {record.updated_at}, compressed message count: {record.compressed_message_count}"
) )
return record.summary return record.summary
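@@ -473,23 +633,68 @@ class Filter:
"""Get the threshold configuration for a specific model. """Get the threshold configuration for a specific model.
Priority: Priority:
1. If model_thresholds contains a configuration for this model ID, use it 1. Cache hit
2. Otherwise use the global parameters compression_threshold_tokens and max_context_tokens 2. Direct match in model_thresholds
3. Base model (base_model_id) match
4. Global default configuration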
""" """
# Try to match against the model-specific configuration if not model_id:
if model_id in self.valves.model_thresholds: return {
"compression_threshold_tokens": self.valves.compression_threshold_tokens,
"max_context_tokens": self.valves.max_context_tokens,
}
# 1. Check the cache
if model_id in self._threshold_cache:
return self._threshold_cache[model_id]
# Get the parsed threshold configuration
parsed = self.valves.model_thresholds
if isinstance(parsed, str):
try:
parsed = json.loads(parsed)
except Exception:
parsed = {}
# 2. Try a direct match
if model_id in parsed:
res = parsed[model_id]
self._threshold_cache[model_id] = res
if self.valves.debug_mode: if self.valves.debug_mode:
print(f"[Config] Using model-specific configuration: {model_id}") logger.debug(f"[Config] Model {model_id} matched a direct configuration")
return self.valves.model_thresholds[model_id] return res
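# Use the global default configuration # 3. Try to match the base model (base_model_id)
if self.valves.debug_mode: try:
print(f"[Config] Model {model_id} is not in model_thresholds, using global parameters") model_obj = Models.get_model_by_id(model_id)
if model_obj:
# Some models may have multiple base model IDs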
base_ids = []
if hasattr(model_obj, "base_model_id") and model_obj.base_model_id:
base_ids.append(model_obj.base_model_id)
if hasattr(model_obj, "base_model_ids") and model_obj.base_model_ids:
if isinstance(model_obj.base_model_ids, list):
base_ids.extend(model_obj.base_model_ids)
return { for b_id in base_ids:
if b_id in parsed:
res = parsed[b_id]
self._threshold_cache[model_id] = res
if self.valves.debug_mode:
logger.info(
f"[Config] Model {model_id} matched the configuration of base model {b_id}"
)
return res
except Exception as e:
logger.error(f"[Config] Failed to look up base model: {e}")
# 4. Use the global default configuration
res = {
"compression_threshold_tokens": self.valves.compression_threshold_tokens, "compression_threshold_tokens": self.valves.compression_threshold_tokens,
"max_context_tokens": self.valves.max_context_tokens, "max_context_tokens": self.valves.max_context_tokens,
} }
self._threshold_cache[model_id] = res
return res
def _get_chat_context( def _get_chat_context(
self, body: dict, __metadata__: Optional[dict] = None self, body: dict, __metadata__: Optional[dict] = None
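Reviewer note: the lookup order introduced in this hunk (cache, direct match, base model, global defaults) can be exercised outside Open WebUI. The following is a sketch under stated assumptions: `ThresholdResolver` is a hypothetical class, and `base_ids_for` is a hypothetical callable standing in for the model database query.

```python
from typing import Callable, Dict, List

Thresholds = Dict[str, int]


class ThresholdResolver:
    """Resolve per-model thresholds: cache, direct match, base model, defaults."""

    def __init__(
        self,
        model_thresholds: Dict[str, Thresholds],
        defaults: Thresholds,
        base_ids_for: Callable[[str], List[str]],
    ) -> None:
        self.model_thresholds = model_thresholds
        self.defaults = defaults
        # Hypothetical lookup standing in for the model database query.
        self.base_ids_for = base_ids_for
        self._cache: Dict[str, Thresholds] = {}

    def get(self, model_id: str) -> Thresholds:
        if not model_id:
            return dict(self.defaults)
        if model_id in self._cache:  # 1. cache
            return self._cache[model_id]
        result = self.model_thresholds.get(model_id)  # 2. direct match
        if result is None:
            for base_id in self.base_ids_for(model_id):  # 3. base model
                if base_id in self.model_thresholds:
                    result = self.model_thresholds[base_id]
                    break
        if result is None:  # 4. global defaults
            result = dict(self.defaults)
        self._cache[model_id] = result
        return result
```

One property worth keeping in mind: in this sketch the cache is never invalidated, so a configuration change is not picked up until the resolver is recreated.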
@@ -621,13 +826,15 @@ class Filter:
""" """
await event_call({"type": "execute", "data": {"code": js_code}}) await event_call({"type": "execute", "data": {"code": js_code}})
except Exception as e: except Exception as e:
print(f"Failed to emit log to frontend: {e}") logger.error(f"Failed to emit log to frontend: {e}")
async def inlet( async def inlet(
self, self,
body: dict, body: dict,
__user__: Optional[dict] = None, __user__: Optional[dict] = None,
__metadata__: dict = None, __metadata__: dict = None,
__request__: Request = None,
__model__: dict = None,
__event_emitter__: Callable[[Any], Awaitable[None]] = None, __event_emitter__: Callable[[Any], Awaitable[None]] = None,
__event_call__: Callable[[Any], Awaitable[None]] = None, __event_call__: Callable[[Any], Awaitable[None]] = None,
) -> dict: ) -> dict:
@@ -641,8 +848,10 @@ class Filter:
messages = body.get("messages", []) messages = body.get("messages", [])
# --- Native Tool Output Trimming --- # --- Native Tool Output Trimming ---
# Even when compression is disabled, always check for and trim overly long tool outputs to save tokens metadata = body.get("metadata", {})
if self.valves.enable_tool_output_trimming: is_native_func_calling = metadata.get("function_calling") == "native"
if self.valves.enable_tool_output_trimming and is_native_func_calling:
trimmed_count = 0 trimmed_count = 0
for msg in messages: for msg in messages:
content = msg.get("content", "") content = msg.get("content", "")
@@ -789,7 +998,7 @@ class Filter:
event_call=__event_call__, event_call=__event_call__,
) )
if self.valves.debug_mode: if self.valves.debug_mode:
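print(f"[Inlet] Error fetching system prompt from DB: {e}") logger.error(f"[Inlet] Error fetching system prompt from DB: {e}")
# Fall back to checking messages (base model or already included) # Fall back to checking messages (base model or already included)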
if not system_prompt_content: if not system_prompt_content:
@@ -803,7 +1012,7 @@ class Filter:
if system_prompt_content: if system_prompt_content:
system_prompt_msg = {"role": "system", "content": system_prompt_content} system_prompt_msg = {"role": "system", "content": system_prompt_content}
if self.valves.debug_mode: if self.valves.debug_mode:
print( logger.debug(
f"[Inlet] Found system prompt ({len(system_prompt_content)} chars). Including in budget." f"[Inlet] Found system prompt ({len(system_prompt_content)} chars). Including in budget."
) )
@@ -834,7 +1043,7 @@ class Filter:
f"[Inlet] Message Stats: {stats_str}", event_call=__event_call__ f"[Inlet] Message Stats: {stats_str}", event_call=__event_call__
) )
except Exception as e: except Exception as e:
print(f"[Inlet] Error logging message stats: {e}") logger.error(f"[Inlet] Error logging message stats: {e}")
if not chat_id: if not chat_id:
await self._log( await self._log(
@@ -925,7 +1134,7 @@ class Filter:
# Get the maximum context limit # Get the maximum context limit
model = self._clean_model_id(body.get("model")) model = self._clean_model_id(body.get("model"))
thresholds = self._get_model_thresholds(model) thresholds = self._get_model_thresholds(model) or {}
max_context_tokens = thresholds.get( max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens "max_context_tokens", self.valves.max_context_tokens
) )
@@ -1129,7 +1338,7 @@ class Filter:
# Get the maximum context limit # Get the maximum context limit
model = self._clean_model_id(body.get("model")) model = self._clean_model_id(body.get("model"))
thresholds = self._get_model_thresholds(model) thresholds = self._get_model_thresholds(model) or {}
max_context_tokens = thresholds.get( max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens "max_context_tokens", self.valves.max_context_tokens
) )
@@ -1156,7 +1365,8 @@ class Filter:
> start_trim_index + 1 # Keep at least 1 message after keep_first > start_trim_index + 1 # Keep at least 1 message after keep_first
): ):
dropped = final_messages.pop(start_trim_index) dropped = final_messages.pop(start_trim_index)
total_tokens -= self._count_tokens(str(dropped.get("content", ""))) dropped_tokens = self._count_tokens(str(dropped.get("content", "")))
total_tokens -= dropped_tokens
await self._log( await self._log(
f"[Inlet] ✂️ Messages trimmed. New total: {total_tokens} Tokens", f"[Inlet] ✂️ Messages trimmed. New total: {total_tokens} Tokens",
@@ -1207,18 +1417,18 @@ class Filter:
""" """
chat_ctx = self._get_chat_context(body, __metadata__) chat_ctx = self._get_chat_context(body, __metadata__)
chat_id = chat_ctx["chat_id"] chat_id = chat_ctx["chat_id"]
model = body.get("model") or "" if not chat_id:
# Compute the target compression progress directly
# Assumes body['messages'] in outlet contains the full history (including the new response)
messages = body.get("messages", [])
target_compressed_count = max(0, len(messages) - self.valves.keep_last)
if self.valves.debug_mode or self.valves.show_debug_log:
await self._log( await self._log(
f"\n{'='*60}\n[Outlet] Chat ID: {chat_id}\n[Outlet] Response complete\n[Outlet] Computed target compression progress: {target_compressed_count} (messages: {len(messages)})", "[Outlet] ❌ chat_id missing from metadata, skipping compression",
type="error",
event_call=__event_call__, event_call=__event_call__,
) )
return body
model = body.get("model") or ""
messages = body.get("messages", [])
# Compute the target compression progress directly
target_compressed_count = max(0, len(messages) - self.valves.keep_last)
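# Run token counting and summary generation asynchronously in the background (not awaited, does not affect output) # Run token counting and summary generation asynchronously in the background (not awaited, does not affect output)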
asyncio.create_task( asyncio.create_task(
@@ -1233,11 +1443,6 @@ class Filter:
) )
) )
await self._log(
f"[Outlet] Background processing started\n{'='*60}\n",
event_call=__event_call__,
)
return body return body
async def _check_and_generate_summary_async( async def _check_and_generate_summary_async(
@@ -1257,7 +1462,7 @@ class Filter:
messages = body.get("messages", []) messages = body.get("messages", [])
# Get the threshold configuration for the current model # Get the threshold configuration for the current model
thresholds = self._get_model_thresholds(model) thresholds = self._get_model_thresholds(model) or {}
compression_threshold_tokens = thresholds.get( compression_threshold_tokens = thresholds.get(
"compression_threshold_tokens", self.valves.compression_threshold_tokens "compression_threshold_tokens", self.valves.compression_threshold_tokens
) )
@@ -1393,11 +1598,14 @@ class Filter:
) )
return return
thresholds = self._get_model_thresholds(summary_model_id) thresholds = self._get_model_thresholds(summary_model_id) or {}
# Note: this uses the summary model's maximum context limit # Priority: 1. summary_model_max_context (if > 0) -> 2. model_thresholds -> 3. global max_context_tokens
max_context_tokens = thresholds.get( if self.valves.summary_model_max_context > 0:
"max_context_tokens", self.valves.max_context_tokens max_context_tokens = self.valves.summary_model_max_context
) else:
max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens
)
await self._log( await self._log(
f"[🤖 Async Summary Task] Using limit for model {summary_model_id}: {max_context_tokens} Tokens", f"[🤖 Async Summary Task] Using limit for model {summary_model_id}: {max_context_tokens} Tokens",
@@ -1582,10 +1790,14 @@ class Filter:
# 5. Get thresholds and compute the ratio # 5. Get thresholds and compute the ratio
model = self._clean_model_id(body.get("model")) model = self._clean_model_id(body.get("model"))
thresholds = self._get_model_thresholds(model) thresholds = self._get_model_thresholds(model) or {}
max_context_tokens = thresholds.get( # Priority: 1. summary_model_max_context (if > 0) -> 2. model_thresholds -> 3. global max_context_tokens
"max_context_tokens", self.valves.max_context_tokens if self.valves.summary_model_max_context > 0:
) max_context_tokens = self.valves.summary_model_max_context
else:
max_context_tokens = thresholds.get(
"max_context_tokens", self.valves.max_context_tokens
)
# 6. Send status # 6. Send status
status_msg = ( status_msg = (
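The priority comment in the two hunks above (summary_model_max_context, then model_thresholds, then the global max_context_tokens) can be sketched as a standalone function. This is a minimal illustration, not the plugin's code: `resolve_max_context` and its parameter names are hypothetical, and the `or {}` guard mirrors the one the diff adds after `_get_model_thresholds`.

```python
from typing import Optional


def resolve_max_context(
    summary_model_max_context: int,
    thresholds: Optional[dict],
    global_max_context_tokens: int,
) -> int:
    """Pick the context limit using the priority order stated in the diff comment."""
    # Guard against a missing per-model entry, as `... or {}` does in the diff.
    thresholds = thresholds or {}

    # 1. An explicit summary-model limit wins whenever it is positive.
    if summary_model_max_context > 0:
        return summary_model_max_context

    # 2. Per-model threshold, falling back to 3. the global default.
    return thresholds.get("max_context_tokens", global_max_context_tokens)
```

With a positive summary-model limit the other two values are ignored; with 0, a per-model entry is used if present, otherwise the global value applies.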
@@ -1631,9 +1843,7 @@ class Filter:
} }
) )
import traceback logger.exception("[🤖 Async Summary Task] ❌ Exception occurred")
traceback.print_exc()
def _format_messages_for_summary(self, messages: list) -> str: def _format_messages_for_summary(self, messages: list) -> str:
"""Formats messages for summarization.""" """Formats messages for summarization."""
@@ -1653,9 +1863,8 @@ class Filter:
# Handle role name # Handle role name
role_name = {"user": "User", "assistant": "Assistant"}.get(role, role) role_name = {"user": "User", "assistant": "Assistant"}.get(role, role)
# Limit length of each message to avoid excessive length # User requested to remove truncation to allow full context for summary
if len(content) > 500: # unless it exceeds model limits (which is handled by the LLM call itself or max_tokens)
content = content[:500] + "..."
formatted.append(f"[{i}] {role_name}: {content}") formatted.append(f"[{i}] {role_name}: {content}")
@@ -1762,8 +1971,25 @@ class Filter:
# Call generate_chat_completion # Call generate_chat_completion
response = await generate_chat_completion(request, payload, user) response = await generate_chat_completion(request, payload, user)
if not response or "choices" not in response or not response["choices"]: # Handle JSONResponse (some backends return JSONResponse instead of dict)
raise ValueError("LLM response format is incorrect or empty") if hasattr(response, "body"):
# It's a Response object, extract the body
import json as json_module
try:
response = json_module.loads(response.body.decode("utf-8"))
except Exception:
raise ValueError(f"Failed to parse JSONResponse body: {response}")
if (
not response
or not isinstance(response, dict)
or "choices" not in response
or not response["choices"]
):
raise ValueError(
f"LLM response format incorrect or empty: {type(response).__name__}"
)
summary = response["choices"][0]["message"]["content"].strip() summary = response["choices"][0]["message"]["content"].strip()
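The response handling in this last hunk can be read as a small normalization step: accept either a plain dict or a Response-like object carrying JSON bytes on `.body`, then validate `choices` before indexing. The sketch below is a standalone approximation under that reading. `extract_summary` and `FakeJSONResponse` are hypothetical names introduced for illustration, and chaining the parse error with `from exc` is a choice made here, not something the diff does.

```python
import json
from typing import Any


def extract_summary(response: Any) -> str:
    """Return the first choice's text from a dict or a Response-like object."""
    # Response-like objects expose the raw payload as bytes on `.body`.
    body = getattr(response, "body", None)
    if isinstance(body, (bytes, bytearray)):
        try:
            response = json.loads(body.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            raise ValueError("Failed to parse response body as JSON") from exc

    # Reject anything that is not a dict with a non-empty "choices" list.
    if not isinstance(response, dict) or not response.get("choices"):
        raise ValueError(
            f"LLM response format incorrect or empty: {type(response).__name__}"
        )
    return response["choices"][0]["message"]["content"].strip()


class FakeJSONResponse:
    """Hypothetical stand-in for a JSONResponse; only `.body` is modelled."""

    def __init__(self, payload: dict) -> None:
        self.body = json.dumps(payload).encode("utf-8")
```

Both input shapes then go through the same validation path, so the caller only ever indexes into a checked dict.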