fix(markdown_normalizer): adopt safe-by-default strategy for escaping

- Set 'enable_escape_fix' to False by default to prevent accidental corruption
- Improve LaTeX display math identification using regex protection
- Update documentation to reflect opt-in recommendation for escape fixes
- Fix Issue #57 remaining aggressive escaping bugs
This commit is contained in:
fujie
2026-03-09 01:05:13 +08:00
parent 9bf31488ae
commit 2eee7c5d35
9 changed files with 230 additions and 39 deletions

View File

@@ -1,5 +1,4 @@
# Markdown Normalizer Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.2.8 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
A powerful, context-aware content normalizer filter for Open WebUI designed to fix common Markdown formatting issues in LLM outputs. It ensures that code blocks, LaTeX formulas, Mermaid diagrams, and other structural Markdown elements are rendered flawlessly, without destroying valid technical content.
@@ -10,6 +9,16 @@ A powerful, context-aware content normalizer filter for Open WebUI designed to f
---
## 🔥 What's New in v1.2.8
* **Safe-by-Default Strategy**: The `enable_escape_fix` feature is now **disabled by default**. This prevents unwanted modifications to valid technical text like Windows file paths (`C:\new\test`) or complex LaTeX formulas.
* **LaTeX Parsing Fix**: Improved the logic for identifying display math (`$$ ... $$`). Fixed a bug where LaTeX commands starting with `\n` (like `\nabla`) were incorrectly treated as newlines.
* **Reliability Enhancement**: Complete error fallback mechanism. Guarantees 0% data loss during processing.
* **Inline Code Protection**: Upgraded escaping logic to protect inline code blocks (`` `...` ``).
* **Code Block Escaping Control**: The `enable_escape_fix_in_code_blocks` Valve now correctly targets broken newlines inside code blocks (perfect for fixing flat SQL queries) when enabled.
* **Privacy Optimization**: `show_debug_log` now defaults to `False` to prevent console noise.
---
## 🚀 Why do you need this plugin? (What does it do?)
Language Models (LLMs) often generate malformed Markdown due to tokenization artifacts, aggressive escaping, or hallucinated formatting. If you've ever seen:
@@ -42,12 +51,6 @@ Before making any changes, the plugin builds a semantic map of the text to prote
### 3. Reliability & Safety
- **100% Rollback Guarantee**: If any normalization logic fails or crashes, the plugin catches the error and silently returns the exact original text, ensuring your chat never breaks.
## 🔥 What's New in v1.2.8
* **Reliability Enhancement**: Complete error fallback mechanism. Guarantees 0% data loss during processing.
* **Inline Code Protection**: Upgraded escaping logic to protect inline code blocks (`` `...` ``).
* **Code Block Escaping Control**: The `enable_escape_fix_in_code_blocks` Valve now correctly targets broken newlines inside code blocks (perfect for fixing flat SQL queries) when enabled.
* **Privacy Optimization**: `show_debug_log` now defaults to `False` to prevent console noise.
## 🌐 Multilingual Support
The plugin UI and status notifications automatically switch based on your language:
@@ -64,7 +67,7 @@ The plugin UI and status notifications automatically switch based on your langua
| Parameter | Default | Description |
| :--- | :--- | :--- |
| `priority` | `50` | Filter priority. Higher runs later (recommended to run this after all other content filters). |
| `enable_escape_fix` | `True` | Convert excessive literal escape characters (`\n`, `\t`) to real spacing. |
| `enable_escape_fix` | `False` | Convert excessive literal escape characters (`\n`, `\t`) to real spacing. (Default: False for safety). |
| `enable_escape_fix_in_code_blocks` | `False` | **Pro-tip**: Turn this ON if your SQL/HTML code blocks are constantly printing on a single line. Turn OFF for Python/C++. |
| `enable_thought_tag_fix` | `True` | Normalize `<think>` tags. |
| `enable_details_tag_fix` | `True` | Normalize `<details>` spacing. |

View File

@@ -10,6 +10,14 @@
---
## 🔥 最新更新 v1.2.8
* **“默认安全”策略 (Safe-by-Default)**`enable_escape_fix` 功能现在**默认禁用**。这能有效防止插件在未经授权的情况下误改 Windows 路径 (`C:\new\test`) 或复杂的 LaTeX 公式。
* **LaTeX 解析优化**:重构了显示数学公式 (`$$ ... $$`) 的识别逻辑。修复了 LaTeX 命令如果以 `\n` 开头(如 `\nabla`)会被错误识别为换行符的 Bug。
* **可靠性增强**:实现了完整的错误回滚机制。当修复过程发生意外错误时,保证 100% 返回原始文本,不丢失任何数据。
* **配置项修复**`enable_escape_fix_in_code_blocks` 配置项现在能正确作用于代码块了。**如果您遇到 SQL 挤在一行的问题,只需在设置中手动开启此项即可。**
---
## 🚀 为什么你需要这个插件?(它能解决什么问题?)
由于分词 (Tokenization) 伪影、过度转义或格式幻觉LLM 经常会生成破损的 Markdown。如果你遇到过以下情况
@@ -42,12 +50,6 @@
### 3. 绝对的可靠性与安全 (100% Rollback)
- **无损回滚机制**:如果在修复过程中发生任何意外错误或崩溃,插件会立即捕获异常,并静默返回**绝对原始**的文本,确保你的对话永远不会因插件报错而丢失。
## 🔥 最新更新 v1.2.8
* **可靠性增强**:修复了错误回滚机制。当规范化过程中发生意外错误时,插件现在会正确返回原始文本,而不是返回被部分修改的损坏内容。
* **内联代码保护**:优化了转义字符清理逻辑,现在会保护内联代码块(`` `...` ``)不被错误转义,防止破坏有效的代码片段。
* **配置项修复**`enable_escape_fix_in_code_blocks` 配置项现在能正确作用于代码块了。**在代码块内修复换行符(比如修复 SQL只需在设置中开启此选项即可。**
* **隐私与日志优化**:将 `show_debug_log` 默认值修改为 `False`,避免将可能敏感的内容自动输出到浏览器控制台,并减少不必要的日志噪音。
## 🌐 多语言支持 (i18n)
界面的状态提示气泡会根据你的浏览器语言自动切换:
@@ -64,7 +66,7 @@
| 参数 | 默认值 | 描述 |
| :--- | :--- | :--- |
| `priority` | `50` | 过滤器优先级。数值越大越靠后(建议放在其他内容过滤器之后运行)。 |
| `enable_escape_fix` | `True` | 修复过度的转义字符(将字面量 `\n` 转换为实际换行)。 |
| `enable_escape_fix` | `False` | 修复过度的转义字符(将字面量 `\n` 转换为实际换行)。**默认禁用以保证安全。** |
| `enable_escape_fix_in_code_blocks` | `False` | **高阶技巧**:如果你的 SQL 或 HTML 代码块总是挤在一行,**请开启此项**。如果你经常写 Python/C++,建议保持关闭。 |
| `enable_thought_tag_fix` | `True` | 规范化思维标签为 `<thought>`。 |
| `enable_details_tag_fix` | `True` | 修复 `<details>` 标签的排版间距。 |
@@ -84,4 +86,4 @@
如果这个插件拯救了你的排版,欢迎到 [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) 点个 Star这是我持续改进的最大动力。感谢支持
## 🧩 其他
* **故障排除**:遇到“负向修复”(即原本正常的排版被修坏了)?请开启 `show_debug_log`,在 F12 控制台复制出原始文本,并在 GitHub 提交 Issue[提交 Issue](https://github.com/Fu-Jie/openwebui-extensions/issues)
* **故障排除**:遇到“负向修复”(即原本正常的排版被修坏了)?请开启 `show_debug_log`,在 F12 控制台复制出原始文本,并在 GitHub 提交 Issue[提交 Issue](https://github.com/Fu-Jie/openwebui-extensions/issues)