feat(filters): upgrade markdown-normalizer to v1.2.7

- Fix Issue #49: resolve greedy regex matching in consecutive emphasis
- Add LaTeX formula protection to prevent corruption of \times, \nu, etc.
- Expand i18n support to 12 languages with strict alignment
- Fix NameError in Request import during testing
This commit is contained in:
fujie
2026-02-24 15:05:25 +08:00
parent 18ada2a177
commit 2da934dd92
16 changed files with 981 additions and 1009 deletions

View File

@@ -52,7 +52,7 @@ Filters act as middleware in the message pipeline:
Fixes common Markdown formatting issues in LLM outputs, including Mermaid syntax, code blocks, and LaTeX formulas.
**Version:** 1.2.4
**Version:** 1.2.7
[:octicons-arrow-right-24: Documentation](markdown_normalizer.md)

View File

@@ -52,7 +52,7 @@ Filter 充当消息管线中的中间件:
修复 LLM 输出中常见的 Markdown 格式问题,包括 Mermaid 语法、代码块和 LaTeX 公式。
**版本:** 1.2.4
**版本:** 1.2.7
[:octicons-arrow-right-24: 查看文档](markdown_normalizer.zh.md)

View File

@@ -1,8 +1,24 @@
# Markdown Normalizer Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.2.7 | **Project:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **License:** MIT
A content normalizer filter for Open WebUI that fixes common Markdown formatting issues in LLM outputs. It ensures that code blocks, LaTeX formulas, Mermaid diagrams, and other Markdown elements are rendered correctly.
## Features
## 🔥 What's New in v1.2.7
* **LaTeX Formula Protection**: Enhanced escape character cleaning to protect LaTeX commands like `\times`, `\nu`, and `\theta` from being corrupted.
* **Expanded i18n Support**: Now supports 12 languages with automatic detection and fallback.
* **Valves Optimization**: Optimized configuration descriptions to be English-only for better consistency.
* **Bug Fixes**:
* Resolved [Issue #49](https://github.com/Fu-Jie/openwebui-extensions/issues/49): Fixed a bug where consecutive bold parts on the same line caused spaces between them to be removed.
* Fixed a `NameError` in the plugin code that caused test collection failures.
## 🌐 Multilingual Support
Supports automatic interface and status switching for the following languages:
`English`, `简体中文`, `繁體中文 (香港)`, `繁體中文 (台灣)`, `한국어`, `日本語`, `Français`, `Deutsch`, `Español`, `Italiano`, `Tiếng Việt`, `Bahasa Indonesia`.
## ✨ Core Features
* **Details Tag Normalization**: Ensures proper spacing for `<details>` tags (used for thought chains). Adds a blank line after `</details>` and ensures a newline after self-closing `<details />` tags to prevent rendering issues.
* **Emphasis Spacing Fix**: Fixes extra spaces inside emphasis markers (e.g., `** text **` -> `**text**`) which can cause rendering failures. Includes safeguards to protect math expressions (e.g., `2 * 3 * 4`) and list variables.
@@ -17,7 +33,7 @@ A content normalizer filter for Open WebUI that fixes common Markdown formatting
* **Table Fix**: Adds missing closing pipes in tables.
* **XML Cleanup**: Removes leftover XML artifacts.
## Usage
## How to Use 🛠️
1. Install the plugin in Open WebUI.
2. Enable the filter globally or for specific models.
@@ -26,73 +42,38 @@ A content normalizer filter for Open WebUI that fixes common Markdown formatting
> [!WARNING]
> As this is an initial version, some "negative fixes" might occur (e.g., breaking valid Markdown). If you encounter issues, please check the console logs, copy the "Original" vs "Normalized" content, and submit an issue.
## Configuration (Valves)
## Configuration (Valves) ⚙️
* `priority`: Filter priority (default: 50).
* `enable_escape_fix`: Fix excessive escape characters.
* `enable_thought_tag_fix`: Normalize thought tags.
* `enable_details_tag_fix`: Normalize details tags (default: True).
* `enable_code_block_fix`: Fix code block formatting.
* `enable_latex_fix`: Normalize LaTeX formulas.
* `enable_list_fix`: Fix list item newlines (Experimental).
* `enable_unclosed_block_fix`: Auto-close unclosed code blocks.
* `enable_fullwidth_symbol_fix`: Fix full-width symbols in code blocks.
* `enable_mermaid_fix`: Fix Mermaid syntax errors.
* `enable_heading_fix`: Fix missing space in headings.
* `enable_table_fix`: Fix missing closing pipe in tables.
* `enable_xml_tag_cleanup`: Cleanup leftover XML tags.
* `enable_emphasis_spacing_fix`: Fix extra spaces in emphasis (default: True).
* `show_status`: Show status notification when fixes are applied.
* `show_debug_log`: Print debug logs to browser console.
| Parameter | Default | Description |
| :--- | :--- | :--- |
| `priority` | `50` | Filter priority. Higher runs later (recommended after other filters). |
| `enable_escape_fix` | `True` | Fix excessive escape characters (`\n`, `\t`, etc.). |
| `enable_escape_fix_in_code_blocks` | `False` | Apply escape fix inside code blocks (may affect valid code). |
| `enable_thought_tag_fix` | `True` | Normalize thought tags (`</thought>`). |
| `enable_details_tag_fix` | `True` | Normalize `<details>` tags and add safe spacing. |
| `enable_code_block_fix` | `True` | Fix code block formatting (indentation/newlines). |
| `enable_latex_fix` | `True` | Normalize LaTeX delimiters (`\[` -> `$$`, `\(` -> `$`). |
| `enable_list_fix` | `False` | Fix list item newlines (experimental). |
| `enable_unclosed_block_fix` | `True` | Auto-close unclosed code blocks. |
| `enable_fullwidth_symbol_fix` | `False` | Fix full-width symbols in code blocks. |
| `enable_mermaid_fix` | `True` | Fix common Mermaid syntax errors. |
| `enable_heading_fix` | `True` | Fix missing space in headings. |
| `enable_table_fix` | `True` | Fix missing closing pipe in tables. |
| `enable_xml_tag_cleanup` | `True` | Cleanup leftover XML tags. |
| `enable_emphasis_spacing_fix` | `False` | Fix extra spaces in emphasis. |
| `show_status` | `True` | Show status notification when fixes are applied. |
| `show_debug_log` | `True` | Print debug logs to browser console (F12). |
## Troubleshooting ❓
## ⭐ Support
If this plugin has been useful, a star on [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) is a big motivation for me. Thank you for the support.
## 🧩 Others
### Troubleshooting ❓
* **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues)
## Changelog
### Changelog
### v1.2.4
* **Documentation Updates**: Synchronized version numbers across all documentation and code files.
### v1.2.3
* **List Marker Protection Enhancement**: Fixed a bug where list markers (`*`) followed by plain text and emphasis were having their spaces incorrectly stripped (e.g., `* U16 forward` became `*U16 forward`).
* **Placeholder Support**: Confirmed that 4 or more underscores (e.g., `____`) are correctly treated as placeholders and not modified by the emphasis fix.
### v1.2.2
* **Code Block Indentation Fix**: Fixed an issue where code blocks nested inside lists were having their indentation incorrectly stripped. Now preserves proper indentation for nested code blocks.
* **Underscore Emphasis Support**: Extended emphasis spacing fix to support `__` (double underscore for bold) and `___` (triple underscore for bold+italic) syntax.
* **List Marker Protection**: Fixed a bug where list markers (`*`) followed by emphasis markers (`**`) were incorrectly merged (e.g., `* **Yes**` became `***Yes**`). Added safeguard to prevent this.
* **Test Suite**: Added comprehensive pytest test suite with 56 test cases covering all major features.
### v1.2.1
* **Emphasis Spacing Fix**: Added a new fix for extra spaces inside emphasis markers (e.g., `** text **` -> `**text**`).
* Uses a recursive approach to handle nested emphasis (e.g., `**bold _italic _**`).
* Includes safeguards to prevent modifying math expressions (e.g., `2 * 3 * 4`) or list variables.
* Controlled by the `enable_emphasis_spacing_fix` valve (default: True).
### v1.2.0
* **Details Tag Support**: Added normalization for `<details>` tags.
* Ensures a blank line is added after `</details>` closing tags to separate thought content from the main response.
* Ensures a newline is added after self-closing `<details ... />` tags to prevent them from interfering with subsequent Markdown headings (e.g., fixing `<details/>#Heading`).
* Includes safeguard to prevent modification of `<details>` tags inside code blocks.
### v1.1.2
* **Mermaid Edge Label Protection**: Implemented comprehensive protection for edge labels (text on connecting lines) to prevent them from being incorrectly modified. Now supports all Mermaid link types including solid (`--`), dotted (`-.`), and thick (`==`) lines with or without arrows.
* **Bug Fixes**: Fixed an issue where lines without arrows (e.g., `A -- text --- B`) were not correctly protected.
### v1.1.0
* **Mermaid Fix Refinement**: Improved regex to handle nested parentheses in node labels (e.g., `ID("Label (text)")`) and avoided matching connection labels.
* **HTML Safeguard Optimization**: Refined `_contains_html` to allow common tags like `<br/>`, `<b>`, `<i>`, etc., ensuring Mermaid diagrams with these tags are still normalized.
* **Full-width Symbol Cleanup**: Fixed duplicate keys and incorrect quote mapping in `FULLWIDTH_MAP`.
* **Bug Fixes**: Fixed missing `Dict` import in Python files.
## License
MIT
See the full history on GitHub: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions)

View File

@@ -1,8 +1,24 @@
# Markdown 格式化过滤器 (Markdown Normalizer)
**作者:** [Fu-Jie](https://github.com/Fu-Jie/openwebui-extensions) | **Version:** 1.2.7 | **项目:** [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) | **许可证:** MIT
这是一个用于 Open WebUI 的内容格式化过滤器,旨在修复 LLM 输出中常见的 Markdown 格式问题。它能确保代码块、LaTeX 公式、Mermaid 图表和其他 Markdown 元素被正确渲染。
## 功能特性
## 🔥 最新更新 v1.2.7
* **LaTeX 公式保护**: 增强了转义字符清理逻辑,自动保护 `$ $``$$ $$` 内的 LaTeX 命令(如 `\times``\nu``\theta`),防止渲染失效。
* **扩展国际化 (i18n) 支持**: 现已支持 12 种语言,具备自动探测与回退机制。
* **配置项优化**: 将 Valves 配置项的描述统一为英文,保持界面一致性。
* **修复 Bug**:
* 修复了 [Issue #49](https://github.com/Fu-Jie/openwebui-extensions/issues/49):解决了当同一行存在多个加粗部分时,由于正则匹配过于贪婪导致中间内容丢失空格的问题。
* 修复了插件代码中的 `NameError` 错误,确保测试脚本能正常运行。
## 🌐 多语言支持 (i18n)
支持以下语言的界面与状态自动切换:
`English`, `简体中文`, `繁體中文 (香港)`, `繁體中文 (台灣)`, `한국어`, `日本語`, `Français`, `Deutsch`, `Español`, `Italiano`, `Tiếng Việt`, `Bahasa Indonesia`
## ✨ 核心特性
* **Details 标签规范化**: 确保 `<details>` 标签(常用于思维链)有正确的间距。在 `</details>` 后添加空行,并在自闭合 `<details />` 标签后添加换行,防止渲染问题。
* **强调空格修复**: 修复强调标记内部的多余空格(例如 `** 文本 **` -> `**文本**`),这会导致 Markdown 渲染失败。包含保护机制,防止误修改数学表达式(如 `2 * 3 * 4`)或列表变量。
@@ -24,75 +40,40 @@
3.**Valves** 设置中配置需要启用的修复项。
4. (可选) **显示调试日志 (Show Debug Log)** 在 Valves 中默认开启。这会将结构化的日志打印到浏览器控制台 (F12)。
> [!WARNING]
> 由于这是初版,可能会出现“负向修复”的情况(例如破坏了原本正确的格式)。如果您遇到问题,请务查看控制台日志,复制“原始 (Original)”与“规范化 (Normalized)”的内容对比,并提交 Issue 反馈。
> 由于这是初版,可能会出现“负向修复”的情况(例如破坏了原本正确的格式)。如果您遇到问题,请务查看控制台日志,复制“原始 (Original)”与“规范化 (Normalized)”的内容对比,并提交 Issue 反馈。
## 配置 (Valves)
## 配置参数 (Valves) ⚙️
* `priority`: 过滤器优先级 (默认: 50)。
* `enable_escape_fix`: 修复过度的转义字符。
* `enable_thought_tag_fix`: 规范化思维标签。
* `enable_details_tag_fix`: 规范化 Details 标签 (默认: True)。
* `enable_code_block_fix`: 修复代码块格式。
* `enable_latex_fix`: 规范化 LaTeX 公式。
* `enable_list_fix`: 修复列表项换行 (实验性)。
* `enable_unclosed_block_fix`: 自动闭合未闭合的代码块。
* `enable_fullwidth_symbol_fix`: 修复代码块中的全角符号。
* `enable_mermaid_fix`: 修复 Mermaid 语法错误。
* `enable_heading_fix`: 修复标题中缺失的空格。
* `enable_table_fix`: 修复表格中缺失的闭合管道符。
* `enable_xml_tag_cleanup`: 清理残留的 XML 标签。
* `enable_emphasis_spacing_fix`: 修复强调语法中的多余空格 (默认: True)。
* `show_status`: 应用修复时显示状态通知。
* `show_debug_log`: 在浏览器控制台打印调试日志。
| 参数 | 默认值 | 描述 |
| :--- | :--- | :--- |
| `priority` | `50` | 过滤器优先级。数值越大越靠后(建议在其他过滤器之后运行)。 |
| `enable_escape_fix` | `True` | 修复过度的转义字符(`\n`, `\t` 等)。 |
| `enable_escape_fix_in_code_blocks` | `False` | 在代码块内应用转义修复(可能影响有效代码)。 |
| `enable_thought_tag_fix` | `True` | 规范化思维标签(`</thought>`)。 |
| `enable_details_tag_fix` | `True` | 规范化 `<details>` 标签并添加安全间距。 |
| `enable_code_block_fix` | `True` | 修复代码块格式(缩进/换行)。 |
| `enable_latex_fix` | `True` | 规范化 LaTeX 定界符(`\[` -> `$$`, `\(` -> `$`)。 |
| `enable_list_fix` | `False` | 修复列表项换行(实验性)。 |
| `enable_unclosed_block_fix` | `True` | 自动闭合未闭合的代码块。 |
| `enable_fullwidth_symbol_fix` | `False` | 修复代码块中的全角符号。 |
| `enable_mermaid_fix` | `True` | 修复常见 Mermaid 语法错误。 |
| `enable_heading_fix` | `True` | 修复标题中缺失的空格。 |
| `enable_table_fix` | `True` | 修复表格中缺失的闭合管道符。 |
| `enable_xml_tag_cleanup` | `True` | 清理残留的 XML 标签。 |
| `enable_emphasis_spacing_fix` | `False` | 修复强调语法中的多余空格。 |
| `show_status` | `True` | 应用修复时显示状态通知。 |
| `show_debug_log` | `True` | 在浏览器控制台打印调试日志。 |
## 故障排除 (Troubleshooting) ❓
## ⭐ 支持
如果这个插件对你有帮助,欢迎到 [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) 点个 Star这将是我持续改进的动力感谢支持。
## 其他
### 故障排除 (Troubleshooting) ❓
* **提交 Issue**: 如果遇到任何问题,请在 GitHub 上提交 Issue[OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues)
## 更新日志
### 更新日志
### v1.2.4
* **文档更新**: 同步了所有文档和代码文件的版本号。
### v1.2.3
* **列表标记保护增强**: 修复了列表标记 (`*`) 后跟普通文本和强调标记时,空格被错误剥离的问题(例如 `* U16 前锋` 变成 `*U16 前锋`)。
* **占位符支持**: 确认 4 个或更多下划线(如 `____`)会被正确视为占位符,不会被强调修复逻辑修改。
### v1.2.2
* **代码块缩进修复**: 修复了列表中嵌套代码块的缩进被错误剥离的问题。现在会正确保留嵌套代码块的缩进。
* **下划线强调语法支持**: 扩展强调空格修复以支持 `__` (双下划线加粗) 和 `___` (三下划线加粗斜体) 语法。
* **列表标记保护**: 修复了列表标记 (`*`) 后跟强调标记 (`**`) 被错误合并的 Bug例如 `* **是**` 变成 `***是**`)。添加了保护逻辑防止此问题。
* **测试套件**: 新增完整的 pytest 测试套件,包含 56 个测试用例,覆盖所有主要功能。
### v1.2.1
* **强调空格修复**: 新增了对强调标记内部多余空格的修复(例如 `** 文本 **` -> `**文本**`)。
* 采用递归方法处理嵌套强调(例如 `**加粗 _斜体 _**`)。
* 包含保护机制,防止误修改数学表达式(如 `2 * 3 * 4`)或列表变量。
* 通过 `enable_emphasis_spacing_fix` 开关控制(默认:开启)。
### v1.2.0
* **Details 标签支持**: 新增了对 `<details>` 标签的规范化支持。
* 确保在 `</details>` 闭合标签后添加空行,将思维内容与正文分隔开。
* 确保在自闭合 `<details ... />` 标签后添加换行,防止其干扰后续的 Markdown 标题(例如修复 `<details/>#标题`)。
* 包含保护机制,防止修改代码块内部的 `<details>` 标签。
### v1.1.2
* **Mermaid 连线标签保护**: 实现了全面的连线标签保护机制,防止连接线上的文字被误修改。现在支持所有 Mermaid 连线类型,包括实线 (`--`)、虚线 (`-.`) 和粗线 (`==`),无论是否带有箭头。
* **Bug 修复**: 修复了无箭头连线(如 `A -- text --- B`)未被正确保护的问题。
### v1.1.0
* **Mermaid 修复优化**: 改进了正则表达式以处理节点标签中的嵌套括号(如 `ID("标签 (文本)")`),并避免误匹配连接线上的文字。
* **HTML 保护机制优化**: 优化了 `_contains_html` 检测,允许 `<br/>`, `<b>`, `<i>` 等常见标签,确保包含这些标签的 Mermaid 图表能被正常规范化。
* **全角符号清理**: 修复了 `FULLWIDTH_MAP` 中的重复键名和错误的引号映射。
* **Bug 修复**: 修复了 Python 文件中缺失的 `Dict` 类型导入。
## 许可证
MIT
完整历史请查看 GitHub 项目: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions)