feat(async-context-compression): release v1.4.0 with structure-aware grouping and session locking

- Introduced Atomic Message Grouping to prevent tool-calling corruption (Issue #56)
- Implemented Tail Boundary Alignment for deterministic context truncation
- Added per-chat asynchronous session locking to prevent duplicate background tasks
- Enhanced summarization traceability with message IDs and names
- Synchronized version and changelog across all documentation files
- Optimized release-prep skill to remove redundant H1 titles

Closes #56
This commit is contained in:
fujie
2026-03-09 20:31:25 +08:00
parent 2eee7c5d35
commit f061d82409
27 changed files with 3540 additions and 286 deletions

View File

@@ -1,16 +1,15 @@
# Async Context Compression Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.3.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.4.0 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
This filter reduces token consumption in long conversations through intelligent summarization and message compression while keeping conversations coherent.
## What's new in 1.3.0
## What's new in 1.4.0
- **Internationalization (i18n)**: Complete localization of user-facing messages across 9 languages (English, Chinese, Japanese, Korean, French, German, Spanish, Italian).
- **Smart Status Display**: Added `token_usage_status_threshold` valve (default 80%) to intelligently control when token usage status is shown.
- **Improved Performance**: Frontend language detection and logging are optimized to be completely non-blocking, maintaining lightning-fast TTFB.
- **Copilot SDK Integration**: Automatically detects and skips compression for copilot_sdk based models to prevent conflicts.
- **Configuration**: `debug_mode` is now set to `false` by default for a quieter production experience.
- **Atomic Message Grouping**: Introduced structure-aware grouping for `assistant-tool-tool-assistant` chains to prevent "No tool call found" errors.
- **Tail Boundary Alignment**: Implemented automatic correction for truncation points to ensure they don't fall inside a tool-calling sequence.
- **Chat Session Locking**: Added a session-based lock to prevent multiple concurrent summary tasks for the same chat ID.
- **Enhanced Traceability**: Improved summary formatting to include message IDs, names, and metadata for better context tracking.
---

View File

@@ -1,18 +1,17 @@
# 异步上下文压缩过滤器
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.3.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **版本:** 1.4.0 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
> **重要提示**:为了确保所有过滤器的可维护性和易用性,每个过滤器都应附带清晰、完整的文档,以确保其功能、配置和使用方法得到充分说明。
本过滤器通过智能摘要和消息压缩技术,在保持对话连贯性的同时,显著降低长对话的 Token 消耗。
## 1.3.0 版本更新
## 1.4.0 版本更新
- **国际化 (i18n) 支持**: 完成了所有用户可见消息的本地化,现已原生支持 9 种语言(含中、英、日、韩及欧洲主要语言)
- **智能状态显示**: 新增 `token_usage_status_threshold` 阀门(默认 80%),可以智能控制何时显示 Token 用量状态,减少不必要的打扰
- **性能大幅优化**: 对前端语言检测和日志处理流程进行了非阻塞重构完全不影响首字节响应时间TTFB保持毫秒级极速推流
- **Copilot SDK 兼容**: 自动检测并跳过基于 `copilot_sdk` 模型的上下文压缩,避免冲突
- **配置项调整**: 为了提供更安静的生产环境体验,`debug_mode` 现已默认设置为 `false`
- **原子消息组 (Atomic Grouping)**: 引入结构感知的消息分组逻辑,确保工具调用链被整体保留或移除,彻底解决 "No tool call found" 错误
- **尾部边界自动对齐**: 实现了截断点的自动修正逻辑,确保历史上下文截断不会落在工具调用序列中间
- **会话级异步锁**: 增加了基于 `chat_id` 的后台任务锁,防止同一会话并发触发多个总结任务
- **元数据溯源增强**: 优化了总结输入格式,在总结中保留了消息 ID、参与者名称及关键元数据提升上下文可追踪性
---

View File

@@ -22,7 +22,7 @@ Filters act as middleware in the message pipeline:
Reduces token consumption in long conversations through intelligent summarization while maintaining coherence.
**Version:** 1.3.0
**Version:** 1.4.0
[:octicons-arrow-right-24: Documentation](async-context-compression.md)

View File

@@ -22,7 +22,7 @@ Filter 充当消息管线中的中间件:
通过智能总结减少长对话的 token 消耗,同时保持连贯性。
**版本:** 1.3.0
**版本:** 1.4.0
[:octicons-arrow-right-24: 查看文档](async-context-compression.md)