Files
Fu-Jie_openwebui-extensions/docs/development/fix-role-tool-error.zh.md
fujie 7efb64b16b feat(async-context-compression): release v1.4.0 with structure-aware grouping and session locking
- Introduced Atomic Message Grouping to prevent tool-calling corruption (Issue #56)
- Implemented Tail Boundary Alignment for deterministic context truncation
- Added per-chat asynchronous session locking to prevent duplicate background tasks
- Enhanced summarization traceability with message IDs and names
- Synchronized version and changelog across all documentation files
- Optimized release-prep skill to remove redundant H1 titles

Closes #56
2026-03-09 20:50:24 +08:00

127 lines
4.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 修复OpenAI API 错误 "messages with role 'tool' must be a response to a preceding message with 'tool_calls'"
## 问题描述
`async-context-compression` 过滤器中,当对话历史变长时,系统会对消息进行裁剪或摘要。如果保留下来的尾部历史恰好从一个原生工具调用序列的中间开始,那么下一次请求就可能以一条 `tool` 消息开头,而触发它的 `assistant` 消息已经被裁掉。
这就会触发 OpenAI API 的错误:
`"messages with role 'tool' must be a response to a preceding message with 'tool_calls'"`
## 根本原因
真正的缺陷在于历史压缩边界没有完整识别工具调用链的“原子性”。一个合法的工具调用链通常包括:
1. 一条带有 `tool_calls``assistant` 消息
2. 一条或多条 `tool` 消息
3. 一条可选的 assistant 跟进回复,用于消费工具结果
如果裁剪点落在这段链条内部,发给模型的消息序列就会变成非法格式。
## 解决方案:对齐原子边界
修复通过把工具调用序列分组为原子单元,并使裁剪边界对齐到这些单元。
### 1. `_get_atomic_groups()`
这个辅助函数会把消息索引分组为“必须一起保留或一起丢弃”的原子单元。它显式识别以下原生工具调用模式:
- `assistant(tool_calls)`
- `tool`
- assistant 跟进回复
也就是说,它不再把这些消息看成彼此独立的单条消息,而是把整段序列视为一个原子块。
```python
def _get_atomic_groups(self, messages: List[Dict]) -> List[List[int]]:
groups = []
current_group = []
for i, msg in enumerate(messages):
role = msg.get("role")
has_tool_calls = bool(msg.get("tool_calls"))
if role == "assistant" and has_tool_calls:
if current_group:
groups.append(current_group)
current_group = [i]
elif role == "tool":
if not current_group:
groups.append([i])
else:
current_group.append(i)
elif (
role == "assistant"
and current_group
and messages[current_group[-1]].get("role") == "tool"
):
current_group.append(i)
groups.append(current_group)
current_group = []
else:
if current_group:
groups.append(current_group)
current_group = []
groups.append([i])
if current_group:
groups.append(current_group)
return groups
```
### 2. `_align_tail_start_to_atomic_boundary()`
这个辅助函数会检查一个拟定的裁剪起点是否落在某个原子块内部。如果是,它会把起点向前回退到该原子块的开头位置。
```python
def _align_tail_start_to_atomic_boundary(
self, messages: List[Dict], raw_start_index: int, protected_prefix: int
) -> int:
aligned_start = max(raw_start_index, protected_prefix)
if aligned_start <= protected_prefix or aligned_start >= len(messages):
return aligned_start
trimmable = messages[protected_prefix:]
local_start = aligned_start - protected_prefix
for group in self._get_atomic_groups(trimmable):
group_start = group[0]
group_end = group[-1] + 1
if local_start == group_start:
return aligned_start
if group_start < local_start < group_end:
return protected_prefix + group_start
return aligned_start
```
### 3. 应用于尾部保留和摘要进度计算
这个对齐后的边界现在被用于重建保留尾部消息,以及计算可以安全摘要的历史范围。
当前实现中的示例:
```python
raw_start_index = max(compressed_count, effective_keep_first)
start_index = self._align_tail_start_to_atomic_boundary(
messages, raw_start_index, effective_keep_first
)
tail_messages = messages[start_index:]
```
在摘要进度计算中同样如此:
```python
raw_target_compressed_count = max(0, len(messages) - self.valves.keep_last)
target_compressed_count = self._align_tail_start_to_atomic_boundary(
messages, raw_target_compressed_count, effective_keep_first
)
```
## 验证结果
- **首次压缩边界**:当历史第一次越过压缩阈值时,保留尾部不再从工具调用块中间开始。
- **复杂会话验证**:在 30+ 条消息、多个工具调用和失败调用的真实场景下,后台摘要过程保持稳定。
- **回归行为更安全**:过滤器现在会优先选择合法边界,即使这意味着比原始的朴素切片稍微多保留一点上下文。
## 结论
通过让历史裁剪与摘要进度计算具备"工具调用原子块感知"能力,避免孤立的 `tool` 消息出现,消除长对话与后台压缩期间的 400 错误。