feat(async-context-compression): release v1.4.0 with structure-aware grouping and session locking

- Introduced Atomic Message Grouping to prevent tool-calling corruption (Issue #56) - Implemented Tail Boundary Alignment for deterministic context truncation - Added per-chat asynchronous session locking to prevent duplicate background tasks - Enhanced summarization traceability with message IDs and names - Synchronized version and changelog across all documentation files - Optimized release-prep skill to remove redundant H1 titles Closes #56
2026-03-09 20:31:25 +08:00
parent 2eee7c5d35
commit f061d82409
27 changed files with 3540 additions and 286 deletions
--- a/docs/plugins/filters/async-context-compression.md
+++ b/docs/plugins/filters/async-context-compression.md
@@ -1,16 +1,15 @@
 # Async Context Compression Filter

-**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
+**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.4.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT

 This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.

-## What's new in 1.3.0
+## What's new in 1.4.0

- **Internationalization (i18n)**: Complete localization of user-facing messages across 9 languages (English, Chinese, Japanese, Korean, French, German, Spanish, Italian).
- **Smart Status Display**: Added `token_usage_status_threshold` valve (default 80%) to intelligently control when token usage status is shown.
- **Improved Performance**: Frontend language detection and logging are optimized to be completely non-blocking, maintaining lightning-fast TTFB.
- **Copilot SDK Integration**: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
- **Configuration**: `debug_mode` is now set to `false` by default for a quieter production experience.
+- **Atomic Message Grouping**: Introduced structure-aware grouping for `assistant-tool-tool-assistant` chains to prevent "No tool call found" errors.
+- **Tail Boundary Alignment**: Implemented automatic correction for truncation points to ensure they don't fall inside a tool-calling sequence.
+- **Chat Session Locking**: Added a session-based lock to prevent multiple concurrent summary tasks for the same chat ID.
+- **Enhanced Traceability**: Improved summary formatting to include message IDs, names, and metadata for better context tracking.

 ---

--- a/docs/plugins/filters/async-context-compression.zh.md
+++ b/docs/plugins/filters/async-context-compression.zh.md
@@ -1,18 +1,17 @@
 # 异步上下文压缩过滤器

-**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.3.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
+**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.4.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT

 > **重要提示**：为了确保所有过滤器的可维护性和易用性，每个过滤器都应附带清晰、完整的文档，以确保其功能、配置和使用方法得到充分说明。

 本过滤器通过智能摘要和消息压缩技术，在保持对话连贯性的同时，显著降低长对话的 Token 消耗。

-## 1.3.0 版本更新
+## 1.4.0 版本更新

- **国际化 (i18n) 支持**: 完成了所有用户可见消息的本地化，现已原生支持 9 种语言（含中、英、日、韩及欧洲主要语言）。
- **智能状态显示**: 新增 `token_usage_status_threshold` 阀门（默认 80%），可以智能控制何时显示 Token 用量状态，减少不必要的打扰。
- **性能大幅优化**: 对前端语言检测和日志处理流程进行了非阻塞重构，完全不影响首字节响应时间（TTFB），保持毫秒级极速推流。
- **Copilot SDK 兼容**: 自动检测并跳过基于 `copilot_sdk` 模型的上下文压缩，避免冲突。
- **配置项调整**: 为了提供更安静的生产环境体验，`debug_mode` 现已默认设置为 `false`。
+- **原子消息组 (Atomic Grouping)**: 引入结构感知的消息分组逻辑，确保工具调用链被整体保留或移除，彻底解决 "No tool call found" 错误。
+- **尾部边界自动对齐**: 实现了截断点的自动修正逻辑，确保历史上下文截断不会落在工具调用序列中间。
+- **会话级异步锁**: 增加了基于 `chat_id` 的后台任务锁，防止同一会话并发触发多个总结任务。
+- **元数据溯源增强**: 优化了总结输入格式，在总结中保留了消息 ID、参与者名称及关键元数据，提升上下文可追踪性。

 ---

--- a/docs/plugins/filters/index.md
+++ b/docs/plugins/filters/index.md
@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:

    Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.

-    **Version:** 1.3.0
+    **Version:** 1.4.0

    [:octicons-arrow-right-24: Documentation](async-context-compression.md)

--- a/docs/plugins/filters/index.zh.md
+++ b/docs/plugins/filters/index.zh.md
@@ -22,7 +22,7 @@ Filter 充当消息管线中的中间件：

    通过智能总结减少长对话的 token 消耗，同时保持连贯性。

-    **版本：** 1.3.0
+    **版本：** 1.4.0

    [:octicons-arrow-right-24: 查看文档](async-context-compression.md)