docs: sync markdown_normalizer 1.2.2

This commit is contained in:
fujie
2026-01-17 18:52:30 +08:00
parent e51d87ae80
commit 3b11537b5e
6 changed files with 381 additions and 156 deletions

View File

@@ -1,47 +1,85 @@
# Markdown Normalizer Filter
A production-grade content normalizer filter for Open WebUI that fixes common Markdown formatting issues in LLM outputs. It ensures that code blocks, LaTeX formulas, Mermaid diagrams, and other Markdown elements are rendered correctly.
A content normalizer filter for Open WebUI that fixes common Markdown formatting issues in LLM outputs. It ensures that code blocks, LaTeX formulas, Mermaid diagrams, and other Markdown elements are rendered correctly.
## Features
* **Details Tag Normalization**: Ensures proper spacing for `<details>` tags (used for thought chains). Adds a blank line after `</details>` and ensures a newline after self-closing `<details />` tags to prevent rendering issues.
* **Mermaid Syntax Fix**: Automatically fixes common Mermaid syntax errors, such as unquoted node labels (including multi-line labels and citations) and unclosed subgraphs, ensuring diagrams render correctly.
* **Frontend Console Debugging**: Supports printing structured debug logs directly to the browser console (F12) for easier troubleshooting.
* **Code Block Formatting**: Fixes broken code block prefixes, suffixes, and indentation.
* **LaTeX Normalization**: Standardizes LaTeX formula delimiters (`\[` -> `$$`, `\(` -> `$`).
* **Thought Tag Normalization**: Unifies thought tags (`<think>`, `<thinking>` -> `<thought>`).
* **Escape Character Fix**: Cleans up excessive escape characters (`\\n`, `\\t`).
* **List Formatting**: Ensures proper newlines in list items.
* **Heading Fix**: Adds missing spaces in headings (`#Heading` -> `# Heading`).
* **Table Fix**: Adds missing closing pipes in tables.
* **XML Cleanup**: Removes leftover XML artifacts.
* **Details Tag Normalization**: Ensures proper spacing for `<details>` tags (used for thought chains). Adds a blank line after `</details>` and ensures a newline after self-closing `<details />` tags to prevent rendering issues.
* **Emphasis Spacing Fix**: Fixes extra spaces inside emphasis markers (e.g., `** text **` -> `**text**`) which can cause rendering failures. Includes safeguards to protect math expressions (e.g., `2 * 3 * 4`) and list variables.
* **Mermaid Syntax Fix**: Automatically fixes common Mermaid syntax errors, such as unquoted node labels (including multi-line labels and citations) and unclosed subgraphs. **New in v1.1.2**: Comprehensive protection for edge labels (text on connecting lines) across all link types (solid, dotted, thick).
* **Frontend Console Debugging**: Supports printing structured debug logs directly to the browser console (F12) for easier troubleshooting.
* **Code Block Formatting**: Fixes broken code block prefixes, suffixes, and indentation.
* **LaTeX Normalization**: Standardizes LaTeX formula delimiters (`\[` -> `$$`, `\(` -> `$`).
* **Thought Tag Normalization**: Unifies thought tags (`<think>`, `<thinking>` -> `<thought>`).
* **Escape Character Fix**: Cleans up excessive escape characters (`\\n`, `\\t`).
* **List Formatting**: Ensures proper newlines in list items.
* **Heading Fix**: Adds missing spaces in headings (`#Heading` -> `# Heading`).
* **Table Fix**: Adds missing closing pipes in tables.
* **XML Cleanup**: Removes leftover XML artifacts.
## Usage
1. Install the plugin in Open WebUI.
2. Enable the filter globally or for specific models.
3. Configure the enabled fixes in the **Valves** settings.
4. (Optional) **Show Debug Log** is enabled by default in Valves. This prints structured logs to the browser console (F12).
1. Install the plugin in Open WebUI.
2. Enable the filter globally or for specific models.
3. Configure the enabled fixes in the **Valves** settings.
4. (Optional) **Show Debug Log** is enabled by default in Valves. This prints structured logs to the browser console (F12).
> [!WARNING]
> As this is an initial version, some "negative fixes" might occur (e.g., breaking valid Markdown). If you encounter issues, please check the console logs, copy the "Original" vs "Normalized" content, and submit an issue.
## Configuration (Valves)
* `priority`: Filter priority (default: 50).
* `enable_escape_fix`: Fix excessive escape characters.
* `enable_thought_tag_fix`: Normalize thought tags.
* `enable_details_tag_fix`: Normalize details tags (default: True).
* `enable_code_block_fix`: Fix code block formatting.
* `enable_latex_fix`: Normalize LaTeX formulas.
* `enable_list_fix`: Fix list item newlines (Experimental).
* `enable_unclosed_block_fix`: Auto-close unclosed code blocks.
* `enable_fullwidth_symbol_fix`: Fix full-width symbols in code blocks.
* `enable_mermaid_fix`: Fix Mermaid syntax errors.
* `enable_heading_fix`: Fix missing space in headings.
* `enable_table_fix`: Fix missing closing pipe in tables.
* `enable_xml_tag_cleanup`: Cleanup leftover XML tags.
* `show_status`: Show status notification when fixes are applied.
* `show_debug_log`: Print debug logs to browser console.
* `priority`: Filter priority (default: 50).
* `enable_escape_fix`: Fix excessive escape characters.
* `enable_thought_tag_fix`: Normalize thought tags.
* `enable_details_tag_fix`: Normalize details tags (default: True).
* `enable_code_block_fix`: Fix code block formatting.
* `enable_latex_fix`: Normalize LaTeX formulas.
* `enable_list_fix`: Fix list item newlines (Experimental).
* `enable_unclosed_block_fix`: Auto-close unclosed code blocks.
* `enable_fullwidth_symbol_fix`: Fix full-width symbols in code blocks.
* `enable_mermaid_fix`: Fix Mermaid syntax errors.
* `enable_heading_fix`: Fix missing space in headings.
* `enable_table_fix`: Fix missing closing pipe in tables.
* `enable_xml_tag_cleanup`: Cleanup leftover XML tags.
* `enable_emphasis_spacing_fix`: Fix extra spaces in emphasis (default: True).
* `show_status`: Show status notification when fixes are applied.
* `show_debug_log`: Print debug logs to browser console.
## Troubleshooting ❓
* **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [Awesome OpenWebUI Issues](https://github.com/Fu-Jie/awesome-openwebui/issues)
## Changelog
### v1.2.2
* **Version Bump**: Documentation and metadata updated for the latest release.
### v1.2.1
* **Emphasis Spacing Fix**: Added a new fix for extra spaces inside emphasis markers (e.g., `** text **` -> `**text**`).
* Uses a recursive approach to handle nested emphasis (e.g., `**bold _italic _**`).
* Includes safeguards to prevent modifying math expressions (e.g., `2 * 3 * 4`) or list variables.
* Controlled by the `enable_emphasis_spacing_fix` valve (default: True).
### v1.2.0
* **Details Tag Support**: Added normalization for `<details>` tags.
* Ensures a blank line is added after `</details>` closing tags to separate thought content from the main response.
* Ensures a newline is added after self-closing `<details ... />` tags to prevent them from interfering with subsequent Markdown headings (e.g., fixing `<details/>#Heading`).
* Includes safeguard to prevent modification of `<details>` tags inside code blocks.
### v1.1.2
* **Mermaid Edge Label Protection**: Implemented comprehensive protection for edge labels (text on connecting lines) to prevent them from being incorrectly modified. Now supports all Mermaid link types including solid (`--`), dotted (`-.`), and thick (`==`) lines with or without arrows.
* **Bug Fixes**: Fixed an issue where lines without arrows (e.g., `A -- text --- B`) were not correctly protected.
### v1.1.0
* **Mermaid Fix Refinement**: Improved regex to handle nested parentheses in node labels (e.g., `ID("Label (text)")`) and avoided matching connection labels.
* **HTML Safeguard Optimization**: Refined `_contains_html` to allow common tags like `<br/>`, `<b>`, `<i>`, etc., ensuring Mermaid diagrams with these tags are still normalized.
* **Full-width Symbol Cleanup**: Fixed duplicate keys and incorrect quote mapping in `FULLWIDTH_MAP`.
* **Bug Fixes**: Fixed missing `Dict` import in Python files.
## License

View File

@@ -1,47 +1,85 @@
# Markdown 格式化过滤器 (Markdown Normalizer)
这是一个用于 Open WebUI 的生产级内容格式化过滤器,旨在修复 LLM 输出中常见的 Markdown 格式问题。它能确保代码块、LaTeX 公式、Mermaid 图表和其他 Markdown 元素被正确渲染。
这是一个用于 Open WebUI 的内容格式化过滤器,旨在修复 LLM 输出中常见的 Markdown 格式问题。它能确保代码块、LaTeX 公式、Mermaid 图表和其他 Markdown 元素被正确渲染。
## 功能特性
* **Details 标签规范化**: 确保 `<details>` 标签(常用于思维链)有正确的间距。在 `</details>` 后添加空行,并在自闭合 `<details />` 标签后添加换行,防止渲染问题。
* **Mermaid 语法修复**: 自动修复常见的 Mermaid 语法错误,如未加引号的节点标签(支持多行标签和引用标记)和未闭合的子图 (Subgraph),确保图表能正确渲染
* **前端控制台调试**: 支持将结构化的调试日志直接打印到浏览器控制台 (F12),方便排查问题
* **代码块格式化**: 修复破损的代码块前缀、后缀和缩进问题。
* **LaTeX 规范化**: 标准化 LaTeX 公式定界符 (`\[` -> `$$`, `\(` -> `$`)
* **思维标签规范化**: 统一思维链标签 (`<think>`, `<thinking>` -> `<thought>`)。
* **转义字符修复**: 清理过度的转义字符 (`\\n`, `\\t`)。
* **列表格式化**: 确保列表项有正确的换行
* **标题修复**: 修复标题中缺失的空格 (`#标题` -> `# 标题`)
* **表格修复**: 修复表格中缺失的闭合管道符
* **XML 清理**: 移除残留的 XML 标签
* **Details 标签规范化**: 确保 `<details>` 标签(常用于思维链)有正确的间距。在 `</details>` 后添加空行,并在自闭合 `<details />` 标签后添加换行,防止渲染问题。
* **强调空格修复**: 修复强调标记内部的多余空格(例如 `** 文本 **` -> `**文本**`),这会导致 Markdown 渲染失败。包含保护机制,防止误修改数学表达式(如 `2 * 3 * 4`)或列表变量
* **Mermaid 语法修复**: 自动修复常见的 Mermaid 语法错误,如未加引号的节点标签(支持多行标签和引用标记)和未闭合的子图 (Subgraph)。**v1.1.2 新增**: 全面保护各种类型的连线标签(实线、虚线、粗线),防止被误修改
* **前端控制台调试**: 支持将结构化的调试日志直接打印到浏览器控制台 (F12),方便排查问题。
* **代码块格式化**: 修复破损的代码块前缀、后缀和缩进问题
* **LaTeX 规范化**: 标准化 LaTeX 公式定界符 (`\[` -> `$$`, `\(` -> `$`)。
* **思维标签规范化**: 统一思维链标签 (`<think>`, `<thinking>` -> `<thought>`)。
* **转义字符修复**: 清理过度的转义字符 (`\\n`, `\\t`)
* **列表格式化**: 确保列表项有正确的换行
* **标题修复**: 修复标题中缺失的空格 (`#标题` -> `# 标题`)
* **表格修复**: 修复表格中缺失的闭合管道符
* **XML 清理**: 移除残留的 XML 标签。
## 使用方法
1. 在 Open WebUI 中安装此插件。
2. 全局启用或为特定模型启用此过滤器。
3. **Valves** 设置中配置需要启用的修复项。
4. (可选) **显示调试日志 (Show Debug Log)** 在 Valves 中默认开启。这会将结构化的日志打印到浏览器控制台 (F12)。
1. 在 Open WebUI 中安装此插件。
2. 全局启用或为特定模型启用此过滤器。
3.**Valves** 设置中配置需要启用的修复项。
4. (可选) **显示调试日志 (Show Debug Log)** 在 Valves 中默认开启。这会将结构化的日志打印到浏览器控制台 (F12)。
> [!WARNING]
> 由于这是初版,可能会出现“负向修复”的情况(例如破坏了原本正确的格式)。如果您遇到问题,请务必查看控制台日志,复制“原始 (Original)”与“规范化 (Normalized)”的内容对比,并提交 Issue 反馈。
## 配置项 (Valves)
* `priority`: 过滤器优先级 (默认: 50)。
* `enable_escape_fix`: 修复过度的转义字符。
* `enable_thought_tag_fix`: 规范化思维标签。
* `enable_details_tag_fix`: 规范化 Details 标签 (默认: True)。
* `enable_code_block_fix`: 修复代码块格式。
* `enable_latex_fix`: 规范化 LaTeX 公式。
* `enable_list_fix`: 修复列表项换行 (实验性)。
* `enable_unclosed_block_fix`: 自动闭合未闭合的代码块。
* `enable_fullwidth_symbol_fix`: 修复代码块中的全角符号。
* `enable_mermaid_fix`: 修复 Mermaid 语法错误。
* `enable_heading_fix`: 修复标题中缺失的空格。
* `enable_table_fix`: 修复表格中缺失的闭合管道符。
* `enable_xml_tag_cleanup`: 清理残留的 XML 标签。
* `show_status`: 应用修复时显示状态通知
* `show_debug_log`: 在浏览器控制台打印调试日志
* `priority`: 过滤器优先级 (默认: 50)。
* `enable_escape_fix`: 修复过度的转义字符。
* `enable_thought_tag_fix`: 规范化思维标签。
* `enable_details_tag_fix`: 规范化 Details 标签 (默认: True)。
* `enable_code_block_fix`: 修复代码块格式。
* `enable_latex_fix`: 规范化 LaTeX 公式。
* `enable_list_fix`: 修复列表项换行 (实验性)。
* `enable_unclosed_block_fix`: 自动闭合未闭合的代码块。
* `enable_fullwidth_symbol_fix`: 修复代码块中的全角符号。
* `enable_mermaid_fix`: 修复 Mermaid 语法错误。
* `enable_heading_fix`: 修复标题中缺失的空格。
* `enable_table_fix`: 修复表格中缺失的闭合管道符。
* `enable_xml_tag_cleanup`: 清理残留的 XML 标签。
* `enable_emphasis_spacing_fix`: 修复强调语法中的多余空格 (默认: True)
* `show_status`: 应用修复时显示状态通知
* `show_debug_log`: 在浏览器控制台打印调试日志。
## 故障排除 (Troubleshooting) ❓
* **提交 Issue**: 如果遇到任何问题,请在 GitHub 上提交 Issue[Awesome OpenWebUI Issues](https://github.com/Fu-Jie/awesome-openwebui/issues)
## 更新日志
### v1.2.2
* **版本更新**: 文档与元数据已同步到最新版本。
### v1.2.1
* **强调空格修复**: 新增了对强调标记内部多余空格的修复(例如 `** 文本 **` -> `**文本**`)。
* 采用递归方法处理嵌套强调(例如 `**加粗 _斜体 _**`)。
* 包含保护机制,防止误修改数学表达式(如 `2 * 3 * 4`)或列表变量。
* 通过 `enable_emphasis_spacing_fix` 开关控制(默认:开启)。
### v1.2.0
* **Details 标签支持**: 新增了对 `<details>` 标签的规范化支持。
* 确保在 `</details>` 闭合标签后添加空行,将思维内容与正文分隔开。
* 确保在自闭合 `<details ... />` 标签后添加换行,防止其干扰后续的 Markdown 标题(例如修复 `<details/>#标题`)。
* 包含保护机制,防止修改代码块内部的 `<details>` 标签。
### v1.1.2
* **Mermaid 连线标签保护**: 实现了全面的连线标签保护机制,防止连接线上的文字被误修改。现在支持所有 Mermaid 连线类型,包括实线 (`--`)、虚线 (`-.`) 和粗线 (`==`),无论是否带有箭头。
* **Bug 修复**: 修复了无箭头连线(如 `A -- text --- B`)未被正确保护的问题。
### v1.1.0
* **Mermaid 修复优化**: 改进了正则表达式以处理节点标签中的嵌套括号(如 `ID("标签 (文本)")`),并避免误匹配连接线上的文字。
* **HTML 保护机制优化**: 优化了 `_contains_html` 检测,允许 `<br/>`, `<b>`, `<i>` 等常见标签,确保包含这些标签的 Mermaid 图表能被正常规范化。
* **全角符号清理**: 修复了 `FULLWIDTH_MAP` 中的重复键名和错误的引号映射。
* **Bug 修复**: 修复了 Python 文件中缺失的 `Dict` 类型导入。
## 许可证

View File

@@ -1,69 +1,84 @@
# Markdown Normalizer Filter
**Author:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **Version:** 1.2.0 | **Project:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **License:** MIT
**Author:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **Version:** 1.2.2 | **Project:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **License:** MIT
A content normalizer filter for Open WebUI that fixes common Markdown formatting issues in LLM outputs. It ensures that code blocks, LaTeX formulas, Mermaid diagrams, and other Markdown elements are rendered correctly.
## Features
* **Details Tag Normalization**: Ensures proper spacing for `<details>` tags (used for thought chains). Adds a blank line after `</details>` and ensures a newline after self-closing `<details />` tags to prevent rendering issues.
* **Mermaid Syntax Fix**: Automatically fixes common Mermaid syntax errors, such as unquoted node labels (including multi-line labels and citations) and unclosed subgraphs. **New in v1.1.2**: Comprehensive protection for edge labels (text on connecting lines) across all link types (solid, dotted, thick).
* **Frontend Console Debugging**: Supports printing structured debug logs directly to the browser console (F12) for easier troubleshooting.
* **Code Block Formatting**: Fixes broken code block prefixes, suffixes, and indentation.
* **LaTeX Normalization**: Standardizes LaTeX formula delimiters (`\[` -> `$$`, `\(` -> `$`).
* **Thought Tag Normalization**: Unifies thought tags (`<think>`, `<thinking>` -> `<thought>`).
* **Escape Character Fix**: Cleans up excessive escape characters (`\\n`, `\\t`).
* **List Formatting**: Ensures proper newlines in list items.
* **Heading Fix**: Adds missing spaces in headings (`#Heading` -> `# Heading`).
* **Table Fix**: Adds missing closing pipes in tables.
* **XML Cleanup**: Removes leftover XML artifacts.
* **Details Tag Normalization**: Ensures proper spacing for `<details>` tags (used for thought chains). Adds a blank line after `</details>` and ensures a newline after self-closing `<details />` tags to prevent rendering issues.
* **Emphasis Spacing Fix**: Fixes extra spaces inside emphasis markers (e.g., `** text **` -> `**text**`) which can cause rendering failures. Includes safeguards to protect math expressions (e.g., `2 * 3 * 4`) and list variables.
* **Mermaid Syntax Fix**: Automatically fixes common Mermaid syntax errors, such as unquoted node labels (including multi-line labels and citations) and unclosed subgraphs. **New in v1.1.2**: Comprehensive protection for edge labels (text on connecting lines) across all link types (solid, dotted, thick).
* **Frontend Console Debugging**: Supports printing structured debug logs directly to the browser console (F12) for easier troubleshooting.
* **Code Block Formatting**: Fixes broken code block prefixes, suffixes, and indentation.
* **LaTeX Normalization**: Standardizes LaTeX formula delimiters (`\[` -> `$$`, `\(` -> `$`).
* **Thought Tag Normalization**: Unifies thought tags (`<think>`, `<thinking>` -> `<thought>`).
* **Escape Character Fix**: Cleans up excessive escape characters (`\\n`, `\\t`).
* **List Formatting**: Ensures proper newlines in list items.
* **Heading Fix**: Adds missing spaces in headings (`#Heading` -> `# Heading`).
* **Table Fix**: Adds missing closing pipes in tables.
* **XML Cleanup**: Removes leftover XML artifacts.
## Usage
1. Install the plugin in Open WebUI.
2. Enable the filter globally or for specific models.
3. Configure the enabled fixes in the **Valves** settings.
4. (Optional) **Show Debug Log** is enabled by default in Valves. This prints structured logs to the browser console (F12).
1. Install the plugin in Open WebUI.
2. Enable the filter globally or for specific models.
3. Configure the enabled fixes in the **Valves** settings.
4. (Optional) **Show Debug Log** is enabled by default in Valves. This prints structured logs to the browser console (F12).
> [!WARNING]
> As this is an initial version, some "negative fixes" might occur (e.g., breaking valid Markdown). If you encounter issues, please check the console logs, copy the "Original" vs "Normalized" content, and submit an issue.
## Configuration (Valves)
* `priority`: Filter priority (default: 50).
* `enable_escape_fix`: Fix excessive escape characters.
* `enable_thought_tag_fix`: Normalize thought tags.
* `enable_details_tag_fix`: Normalize details tags (default: True).
* `enable_code_block_fix`: Fix code block formatting.
* `enable_latex_fix`: Normalize LaTeX formulas.
* `enable_list_fix`: Fix list item newlines (Experimental).
* `enable_unclosed_block_fix`: Auto-close unclosed code blocks.
* `enable_fullwidth_symbol_fix`: Fix full-width symbols in code blocks.
* `enable_mermaid_fix`: Fix Mermaid syntax errors.
* `enable_heading_fix`: Fix missing space in headings.
* `enable_table_fix`: Fix missing closing pipe in tables.
* `enable_xml_tag_cleanup`: Cleanup leftover XML tags.
* `show_status`: Show status notification when fixes are applied.
* `show_debug_log`: Print debug logs to browser console.
* `priority`: Filter priority (default: 50).
* `enable_escape_fix`: Fix excessive escape characters.
* `enable_thought_tag_fix`: Normalize thought tags.
* `enable_details_tag_fix`: Normalize details tags (default: True).
* `enable_code_block_fix`: Fix code block formatting.
* `enable_latex_fix`: Normalize LaTeX formulas.
* `enable_list_fix`: Fix list item newlines (Experimental).
* `enable_unclosed_block_fix`: Auto-close unclosed code blocks.
* `enable_fullwidth_symbol_fix`: Fix full-width symbols in code blocks.
* `enable_mermaid_fix`: Fix Mermaid syntax errors.
* `enable_heading_fix`: Fix missing space in headings.
* `enable_table_fix`: Fix missing closing pipe in tables.
* `enable_xml_tag_cleanup`: Cleanup leftover XML tags.
* `enable_emphasis_spacing_fix`: Fix extra spaces in emphasis (default: True).
* `show_status`: Show status notification when fixes are applied.
* `show_debug_log`: Print debug logs to browser console.
## Troubleshooting ❓
- **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [Awesome OpenWebUI Issues](https://github.com/Fu-Jie/awesome-openwebui/issues)
* **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [Awesome OpenWebUI Issues](https://github.com/Fu-Jie/awesome-openwebui/issues)
## Changelog
### v1.2.2
* **Version Bump**: Documentation and metadata updated for the latest release.
### v1.2.1
* **Emphasis Spacing Fix**: Added a new fix for extra spaces inside emphasis markers (e.g., `** text **` -> `**text**`).
* Uses a recursive approach to handle nested emphasis (e.g., `**bold _italic _**`).
* Includes safeguards to prevent modifying math expressions (e.g., `2 * 3 * 4`) or list variables.
* Controlled by the `enable_emphasis_spacing_fix` valve (default: True).
### v1.2.0
* **Details Tag Support**: Added normalization for `<details>` tags.
* Ensures a blank line is added after `</details>` closing tags to separate thought content from the main response.
* Ensures a newline is added after self-closing `<details ... />` tags to prevent them from interfering with subsequent Markdown headings (e.g., fixing `<details/>#Heading`).
* Includes safeguard to prevent modification of `<details>` tags inside code blocks.
* **Details Tag Support**: Added normalization for `<details>` tags.
* Ensures a blank line is added after `</details>` closing tags to separate thought content from the main response.
* Ensures a newline is added after self-closing `<details ... />` tags to prevent them from interfering with subsequent Markdown headings (e.g., fixing `<details/>#Heading`).
* Includes safeguard to prevent modification of `<details>` tags inside code blocks.
### v1.1.2
* **Mermaid Edge Label Protection**: Implemented comprehensive protection for edge labels (text on connecting lines) to prevent them from being incorrectly modified. Now supports all Mermaid link types including solid (`--`), dotted (`-.`), and thick (`==`) lines with or without arrows.
* **Bug Fixes**: Fixed an issue where lines without arrows (e.g., `A -- text --- B`) were not correctly protected.
* **Mermaid Edge Label Protection**: Implemented comprehensive protection for edge labels (text on connecting lines) to prevent them from being incorrectly modified. Now supports all Mermaid link types including solid (`--`), dotted (`-.`), and thick (`==`) lines with or without arrows.
* **Bug Fixes**: Fixed an issue where lines without arrows (e.g., `A -- text --- B`) were not correctly protected.
### v1.1.0
* **Mermaid Fix Refinement**: Improved regex to handle nested parentheses in node labels (e.g., `ID("Label (text)")`) and avoided matching connection labels.
* **HTML Safeguard Optimization**: Refined `_contains_html` to allow common tags like `<br/>`, `<b>`, `<i>`, etc., ensuring Mermaid diagrams with these tags are still normalized.
* **Full-width Symbol Cleanup**: Fixed duplicate keys and incorrect quote mapping in `FULLWIDTH_MAP`.
* **Bug Fixes**: Fixed missing `Dict` import in Python files.
* **Mermaid Fix Refinement**: Improved regex to handle nested parentheses in node labels (e.g., `ID("Label (text)")`) and avoided matching connection labels.
* **HTML Safeguard Optimization**: Refined `_contains_html` to allow common tags like `<br/>`, `<b>`, `<i>`, etc., ensuring Mermaid diagrams with these tags are still normalized.
* **Full-width Symbol Cleanup**: Fixed duplicate keys and incorrect quote mapping in `FULLWIDTH_MAP`.
* **Bug Fixes**: Fixed missing `Dict` import in Python files.

View File

@@ -1,69 +1,84 @@
# Markdown 格式化过滤器 (Markdown Normalizer)
**作者:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **版本:** 1.2.0 | **项目:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **许可证:** MIT
**作者:** [Fu-Jie](https://github.com/Fu-Jie/awesome-openwebui) | **版本:** 1.2.2 | **项目:** [Awesome OpenWebUI](https://github.com/Fu-Jie/awesome-openwebui) | **许可证:** MIT
这是一个用于 Open WebUI 的内容格式化过滤器,旨在修复 LLM 输出中常见的 Markdown 格式问题。它能确保代码块、LaTeX 公式、Mermaid 图表和其他 Markdown 元素被正确渲染。
## 功能特性
* **Details 标签规范化**: 确保 `<details>` 标签(常用于思维链)有正确的间距。在 `</details>` 后添加空行,并在自闭合 `<details />` 标签后添加换行,防止渲染问题。
* **Mermaid 语法修复**: 自动修复常见的 Mermaid 语法错误,如未加引号的节点标签(支持多行标签和引用标记)和未闭合的子图 (Subgraph)。**v1.1.2 新增**: 全面保护各种类型的连线标签(实线、虚线、粗线),防止误修改。
* **前端控制台调试**: 支持将结构化的调试日志直接打印到浏览器控制台 (F12),方便排查问题
* **代码块格式化**: 修复破损的代码块前缀、后缀和缩进问题。
* **LaTeX 规范化**: 标准化 LaTeX 公式定界符 (`\[` -> `$$`, `\(` -> `$`)
* **思维标签规范化**: 统一思维链标签 (`<think>`, `<thinking>` -> `<thought>`)。
* **转义字符修复**: 清理过度的转义字符 (`\\n`, `\\t`)。
* **列表格式化**: 确保列表项有正确的换行
* **标题修复**: 修复标题中缺失的空格 (`#标题` -> `# 标题`)
* **表格修复**: 修复表格中缺失的闭合管道符
* **XML 清理**: 移除残留的 XML 标签
* **Details 标签规范化**: 确保 `<details>` 标签(常用于思维链)有正确的间距。在 `</details>` 后添加空行,并在自闭合 `<details />` 标签后添加换行,防止渲染问题。
* **强调空格修复**: 修复强调标记内部的多余空格(例如 `** 文本 **` -> `**文本**`),这会导致 Markdown 渲染失败。包含保护机制,防止误修改数学表达式(如 `2 * 3 * 4`)或列表变量
* **Mermaid 语法修复**: 自动修复常见的 Mermaid 语法错误,如未加引号的节点标签(支持多行标签和引用标记)和未闭合的子图 (Subgraph)。**v1.1.2 新增**: 全面保护各种类型的连线标签(实线、虚线、粗线),防止被误修改
* **前端控制台调试**: 支持将结构化的调试日志直接打印到浏览器控制台 (F12),方便排查问题。
* **代码块格式化**: 修复破损的代码块前缀、后缀和缩进问题
* **LaTeX 规范化**: 标准化 LaTeX 公式定界符 (`\[` -> `$$`, `\(` -> `$`)。
* **思维标签规范化**: 统一思维链标签 (`<think>`, `<thinking>` -> `<thought>`)。
* **转义字符修复**: 清理过度的转义字符 (`\\n`, `\\t`)
* **列表格式化**: 确保列表项有正确的换行
* **标题修复**: 修复标题中缺失的空格 (`#标题` -> `# 标题`)
* **表格修复**: 修复表格中缺失的闭合管道符
* **XML 清理**: 移除残留的 XML 标签。
## 使用方法
1. 在 Open WebUI 中安装此插件。
2. 全局启用或为特定模型启用此过滤器。
3. **Valves** 设置中配置需要启用的修复项。
4. (可选) **显示调试日志 (Show Debug Log)** 在 Valves 中默认开启。这会将结构化的日志打印到浏览器控制台 (F12)。
1. 在 Open WebUI 中安装此插件。
2. 全局启用或为特定模型启用此过滤器。
3.**Valves** 设置中配置需要启用的修复项。
4. (可选) **显示调试日志 (Show Debug Log)** 在 Valves 中默认开启。这会将结构化的日志打印到浏览器控制台 (F12)。
> [!WARNING]
> 由于这是初版,可能会出现“负向修复”的情况(例如破坏了原本正确的格式)。如果您遇到问题,请务必查看控制台日志,复制“原始 (Original)”与“规范化 (Normalized)”的内容对比,并提交 Issue 反馈。
## 配置项 (Valves)
* `priority`: 过滤器优先级 (默认: 50)。
* `enable_escape_fix`: 修复过度的转义字符。
* `enable_thought_tag_fix`: 规范化思维标签。
* `enable_details_tag_fix`: 规范化 Details 标签 (默认: True)。
* `enable_code_block_fix`: 修复代码块格式。
* `enable_latex_fix`: 规范化 LaTeX 公式。
* `enable_list_fix`: 修复列表项换行 (实验性)。
* `enable_unclosed_block_fix`: 自动闭合未闭合的代码块。
* `enable_fullwidth_symbol_fix`: 修复代码块中的全角符号。
* `enable_mermaid_fix`: 修复 Mermaid 语法错误。
* `enable_heading_fix`: 修复标题中缺失的空格。
* `enable_table_fix`: 修复表格中缺失的闭合管道符。
* `enable_xml_tag_cleanup`: 清理残留的 XML 标签。
* `show_status`: 应用修复时显示状态通知
* `show_debug_log`: 在浏览器控制台打印调试日志
* `priority`: 过滤器优先级 (默认: 50)。
* `enable_escape_fix`: 修复过度的转义字符。
* `enable_thought_tag_fix`: 规范化思维标签。
* `enable_details_tag_fix`: 规范化 Details 标签 (默认: True)。
* `enable_code_block_fix`: 修复代码块格式。
* `enable_latex_fix`: 规范化 LaTeX 公式。
* `enable_list_fix`: 修复列表项换行 (实验性)。
* `enable_unclosed_block_fix`: 自动闭合未闭合的代码块。
* `enable_fullwidth_symbol_fix`: 修复代码块中的全角符号。
* `enable_mermaid_fix`: 修复 Mermaid 语法错误。
* `enable_heading_fix`: 修复标题中缺失的空格。
* `enable_table_fix`: 修复表格中缺失的闭合管道符。
* `enable_xml_tag_cleanup`: 清理残留的 XML 标签。
* `enable_emphasis_spacing_fix`: 修复强调语法中的多余空格 (默认: True)
* `show_status`: 应用修复时显示状态通知
* `show_debug_log`: 在浏览器控制台打印调试日志。
## 故障排除 (Troubleshooting) ❓
- **提交 Issue**: 如果遇到任何问题,请在 GitHub 上提交 Issue[Awesome OpenWebUI Issues](https://github.com/Fu-Jie/awesome-openwebui/issues)
* **提交 Issue**: 如果遇到任何问题,请在 GitHub 上提交 Issue[Awesome OpenWebUI Issues](https://github.com/Fu-Jie/awesome-openwebui/issues)
## 更新日志
### v1.2.2
* **版本更新**: 文档与元数据已同步到最新版本。
### v1.2.1
* **强调空格修复**: 新增了对强调标记内部多余空格的修复(例如 `** 文本 **` -> `**文本**`)。
* 采用递归方法处理嵌套强调(例如 `**加粗 _斜体 _**`)。
* 包含保护机制,防止误修改数学表达式(如 `2 * 3 * 4`)或列表变量。
* 通过 `enable_emphasis_spacing_fix` 开关控制(默认:开启)。
### v1.2.0
* **Details 标签支持**: 新增了对 `<details>` 标签的规范化支持。
* 确保在 `</details>` 闭合标签后添加空行,将思维内容与正文分隔开
* 确保在自闭合 `<details ... />` 标签后添加行,防止其干扰后续的 Markdown 标题(例如修复 `<details/>#标题`
* 包含保护机制,防止修改代码块内部的 `<details>` 标签
* **Details 标签支持**: 新增了对 `<details>` 标签的规范化支持
* 确保在 `</details>` 闭合标签后添加行,将思维内容与正文分隔开
* 确保在自闭合 `<details ... />` 标签后添加换行,防止其干扰后续的 Markdown 标题(例如修复 `<details/>#标题`
* 包含保护机制,防止修改代码块内部的 `<details>` 标签。
### v1.1.2
* **Mermaid 连线标签保护**: 实现了全面的连线标签保护机制,防止连接线上的文字被误修改。现在支持所有 Mermaid 连线类型,包括实线 (`--`)、虚线 (`-.`) 和粗线 (`==`),无论是否带有箭头。
* **Bug 修复**: 修复了无箭头连线(如 `A -- text --- B`)未被正确保护的问题
* **Mermaid 连线标签保护**: 实现了全面的连线标签保护机制,防止连接线上的文字被误修改。现在支持所有 Mermaid 连线类型,包括实线 (`--`)、虚线 (`-.`) 和粗线 (`==`),无论是否带有箭头
* **Bug 修复**: 修复了无箭头连线(如 `A -- text --- B`)未被正确保护的问题。
### v1.1.0
* **Mermaid 修复优化**: 改进了正则表达式以处理节点标签中的嵌套括号(如 `ID("标签 (文本)")`),并避免误匹配连接线上的文字。
* **HTML 保护机制优化**: 优化了 `_contains_html` 检测,允许 `<br/>`, `<b>`, `<i>` 等常见标签,确保包含这些标签的 Mermaid 图表能被正常规范化。
* **全角符号清理**: 修复了 `FULLWIDTH_MAP` 中的重复键名和错误的引号映射。
* **Bug 修复**: 修复了 Python 文件中缺失的 `Dict` 类型导入。
* **Mermaid 修复优化**: 改进了正则表达式以处理节点标签中的嵌套括号(如 `ID("标签 (文本)")`),并避免误匹配连接线上的文字。
* **HTML 保护机制优化**: 优化了 `_contains_html` 检测,允许 `<br/>`, `<b>`, `<i>` 等常见标签,确保包含这些标签的 Mermaid 图表能被正常规范化。
* **全角符号清理**: 修复了 `FULLWIDTH_MAP` 中的重复键名和错误的引号映射。
* **Bug 修复**: 修复了 Python 文件中缺失的 `Dict` 类型导入。

View File

@@ -3,7 +3,7 @@ title: Markdown Normalizer
author: Fu-Jie
author_url: https://github.com/Fu-Jie/awesome-openwebui
funding_url: https://github.com/open-webui
version: 1.2.0
version: 1.2.2
openwebui_id: baaa8732-9348-40b7-8359-7e009660e23c
description: A content normalizer filter that fixes common Markdown formatting issues in LLM outputs, such as broken code blocks, LaTeX formulas, and list formatting.
"""
@@ -43,6 +43,7 @@ class NormalizerConfig:
)
enable_table_fix: bool = True # Fix missing closing pipe in tables
enable_xml_tag_cleanup: bool = True # Cleanup leftover XML tags
enable_emphasis_spacing_fix: bool = True # Fix spaces inside **emphasis**
# Custom cleaner functions (for advanced extension)
custom_cleaners: List[Callable[[str], str]] = field(default_factory=list)
@@ -53,8 +54,8 @@ class ContentNormalizer:
# --- 1. Pre-compiled Regex Patterns (Performance Optimization) ---
_PATTERNS = {
# Code block prefix: if ``` is not at start of line or file
"code_block_prefix": re.compile(r"(?<!^)(?<!\n)(```)", re.MULTILINE),
# Code block prefix: if ``` is not at start of line (ignoring whitespace)
"code_block_prefix": re.compile(r"(\S[ \t]*)(```)"),
# Code block suffix: ```lang followed by non-whitespace (no newline)
"code_block_suffix": re.compile(r"(```[\w\+\-\.]*)[ \t]+([^\n\r])"),
# Code block indent: whitespace at start of line + ```
@@ -108,6 +109,13 @@ class ContentNormalizer:
"heading_space": re.compile(r"^(#+)([^ \n#])", re.MULTILINE),
# Table: | col1 | col2 -> | col1 | col2 |
"table_pipe": re.compile(r"^(\|.*[^|\n])$", re.MULTILINE),
# Emphasis spacing: ** text ** -> **text**
# Matches emphasis blocks within a single line. We use a recursive approach
# in _fix_emphasis_spacing to handle nesting and spaces correctly.
# NOTE: We use [^\n] instead of . to prevent cross-line matching.
"emphasis_spacing": re.compile(
r"(?<!\*|_)(\*{1,3}|_)(?P<inner>[^\n]*?)(\1)(?!\*|_)"
),
}
def __init__(self, config: Optional[NormalizerConfig] = None):
@@ -207,6 +215,13 @@ class ContentNormalizer:
if content != original:
self.applied_fixes.append("Cleanup XML Tags")
# 12. Emphasis spacing fix
if self.config.enable_emphasis_spacing_fix:
original = content
content = self._fix_emphasis_spacing(content)
if content != original:
self.applied_fixes.append("Fix Emphasis Spacing")
# 9. Custom cleaners
for cleaner in self.config.custom_cleaners:
original = content
@@ -283,8 +298,6 @@ class ContentNormalizer:
def _fix_code_blocks(self, content: str) -> str:
"""Fix code block formatting (prefixes, suffixes, indentation)"""
# Remove indentation before code blocks
content = self._PATTERNS["code_block_indent"].sub(r"\1", content)
# Ensure newline before ```
content = self._PATTERNS["code_block_prefix"].sub(r"\n\1", content)
# Ensure newline after ```lang
@@ -443,6 +456,47 @@ class ContentNormalizer:
"""Remove leftover XML tags"""
return self._PATTERNS["xml_artifacts"].sub("", content)
def _fix_emphasis_spacing(self, content: str) -> str:
"""Fix spaces inside **emphasis** or _emphasis_
Example: ** text ** -> **text**, **text ** -> **text**, ** text** -> **text**
"""
def replacer(match):
symbol = match.group(1)
inner = match.group("inner")
# Recursive step: Fix emphasis spacing INSIDE the current block first
# This ensures that ** _ italic _ ** becomes ** _italic_ ** before we strip outer spaces.
inner = self._PATTERNS["emphasis_spacing"].sub(replacer, inner)
# If no leading/trailing whitespace, nothing to fix at this level
stripped_inner = inner.strip()
if stripped_inner == inner:
return f"{symbol}{inner}{symbol}"
# Safeguard: If inner content is just whitespace, don't touch it
if not stripped_inner:
return match.group(0)
# Safeguard: If it looks like a math expression or list of variables (e.g. " * 3 * " or " _ b _ ")
# If the symbol is surrounded by spaces in the original text, it's likely an operator.
if inner.startswith(" ") and inner.endswith(" "):
# If it's single '*' or '_', and both sides have spaces, it's almost certainly an operator.
if symbol in ["*", "_"]:
return match.group(0)
return f"{symbol}{stripped_inner}{symbol}"
parts = content.split("```")
for i in range(0, len(parts), 2): # Even indices are markdown text
# We use a while loop to handle overlapping or multiple occurrences at the top level
while True:
new_part = self._PATTERNS["emphasis_spacing"].sub(replacer, parts[i])
if new_part == parts[i]:
break
parts[i] = new_part
return "```".join(parts)
class Filter:
class Valves(BaseModel):
@@ -494,6 +548,10 @@ class Filter:
enable_xml_tag_cleanup: bool = Field(
default=True, description="Cleanup leftover XML tags"
)
enable_emphasis_spacing_fix: bool = Field(
default=True,
description="Fix spaces inside **emphasis** (e.g. ** text ** -> **text**)",
)
show_status: bool = Field(
default=True, description="Show status notification when fixes are applied"
)
@@ -637,6 +695,7 @@ class Filter:
enable_heading_fix=self.valves.enable_heading_fix,
enable_table_fix=self.valves.enable_table_fix,
enable_xml_tag_cleanup=self.valves.enable_xml_tag_cleanup,
enable_emphasis_spacing_fix=self.valves.enable_emphasis_spacing_fix,
)
normalizer = ContentNormalizer(config)
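A standalone sketch of the emphasis-spacing pass added above, for trying inputs outside the filter. The regex is copied from the `emphasis_spacing` pattern in this commit; the module-level names `EMPHASIS`, `_replacer` and `fix_emphasis_spacing` are hypothetical stand-ins for the class members, so treat this as an illustration under those assumptions rather than the plugin's own code.

```python
import re

# Copied from the "emphasis_spacing" entry in this commit.
EMPHASIS = re.compile(r"(?<!\*|_)(\*{1,3}|_)(?P<inner>[^\n]*?)(\1)(?!\*|_)")


def _replacer(match):
    symbol = match.group(1)
    # Fix nested emphasis first, then look at this level.
    inner = EMPHASIS.sub(_replacer, match.group("inner"))
    stripped = inner.strip()
    if stripped == inner:
        return f"{symbol}{inner}{symbol}"  # nothing to trim at this level
    if not stripped:
        return match.group(0)  # whitespace only: leave untouched
    # A single '*' or '_' padded on both sides is treated as an operator.
    if inner.startswith(" ") and inner.endswith(" ") and symbol in ("*", "_"):
        return match.group(0)
    return f"{symbol}{stripped}{symbol}"


def fix_emphasis_spacing(content):
    parts = content.split("```")
    # Even indices are text outside fenced code blocks.
    for i in range(0, len(parts), 2):
        while True:
            new_part = EMPHASIS.sub(_replacer, parts[i])
            if new_part == parts[i]:
                break
            parts[i] = new_part
    return "```".join(parts)


if __name__ == "__main__":
    print(fix_emphasis_spacing("** text **"))       # -> **text**
    print(fix_emphasis_spacing("2 * 3 * 4"))        # -> 2 * 3 * 4
    print(fix_emphasis_spacing("** _italic _ **"))  # -> **_italic_**
```

Because the text is split on triple backticks, content inside fenced code blocks is skipped, while inline code spans are not protected by this sketch.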


@@ -3,7 +3,7 @@ title: Markdown 格式修复器 (Markdown Normalizer)
author: Fu-Jie
author_url: https://github.com/Fu-Jie/awesome-openwebui
funding_url: https://github.com/open-webui
version: 1.2.0
version: 1.2.2
description: 内容规范化过滤器,修复 LLM 输出中常见的 Markdown 格式问题如损坏的代码块、LaTeX 公式、Mermaid 图表和列表格式。
"""
@@ -35,6 +35,7 @@ class NormalizerConfig:
enable_heading_fix: bool = True # Fix missing space in headings (#Header -> # Header)
enable_table_fix: bool = True # Fix missing closing pipes in tables
enable_xml_tag_cleanup: bool = True # Clean up leftover XML tags
enable_emphasis_spacing_fix: bool = True # Fix extra spaces inside **emphasis**
# Custom cleaner functions (for advanced extension)
custom_cleaners: List[Callable[[str], str]] = field(default_factory=list)
@@ -45,8 +46,8 @@ class ContentNormalizer:
# --- 1. Pre-compiled Regex Patterns (Performance Optimization) ---
_PATTERNS = {
# Code block prefix: if ``` is not at start of line or file
"code_block_prefix": re.compile(r"(?<!^)(?<!\n)(```)", re.MULTILINE),
# Code block prefix: if ``` is not at start of line (ignoring whitespace)
"code_block_prefix": re.compile(r"(\S[ \t]*)(```)"),
# Code block suffix: ```lang followed by non-whitespace (no newline)
"code_block_suffix": re.compile(r"(```[\w\+\-\.]*)[ \t]+([^\n\r])"),
# Code block indent: whitespace at start of line + ```
@@ -100,6 +101,13 @@ class ContentNormalizer:
"heading_space": re.compile(r"^(#+)([^ \n#])", re.MULTILINE),
# Table: | col1 | col2 -> | col1 | col2 |
"table_pipe": re.compile(r"^(\|.*[^|\n])$", re.MULTILINE),
# Emphasis spacing: ** text ** -> **text**
# Matches emphasis blocks within a single line. We use a recursive approach
# in _fix_emphasis_spacing to handle nesting and spaces correctly.
# NOTE: We use [^\n] instead of . to prevent cross-line matching.
"emphasis_spacing": re.compile(
r"(?<!\*|_)(\*{1,3}|_)(?P<inner>[^\n]*?)(\1)(?!\*|_)"
),
}
def __init__(self, config: Optional[NormalizerConfig] = None):
@@ -199,6 +207,13 @@ class ContentNormalizer:
if content != original:
self.applied_fixes.append("Cleanup XML Tags")
# 12. Emphasis spacing fix
if self.config.enable_emphasis_spacing_fix:
original = content
content = self._fix_emphasis_spacing(content)
if content != original:
self.applied_fixes.append("Fix Emphasis Spacing")
# 9. Custom cleaners
for cleaner in self.config.custom_cleaners:
original = content
@@ -257,8 +272,6 @@ class ContentNormalizer:
def _fix_code_blocks(self, content: str) -> str:
"""Fix code block formatting (prefixes, suffixes, indentation)"""
# Remove indentation before code blocks
content = self._PATTERNS["code_block_indent"].sub(r"\1", content)
# Ensure newline before ```
content = self._PATTERNS["code_block_prefix"].sub(r"\n\1", content)
# Ensure newline after ```lang
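The two `code_block_prefix` patterns that appear earlier in this file's diff differ mainly in how they treat indented fences. The sketch below compares them on two inputs. Both regexes are copied from the diff; the names `OLD_PREFIX`, `NEW_PREFIX` and `split_fence_from_text`, and the replacement template `\1\n\2`, are assumptions made for illustration and are not taken from the commit.

```python
import re

# Both patterns copied from the diff.
OLD_PREFIX = re.compile(r"(?<!^)(?<!\n)(```)", re.MULTILINE)
NEW_PREFIX = re.compile(r"(\S[ \t]*)(```)")


def split_fence_from_text(text):
    """Move a fence that follows text on the same line onto its own line.

    The template r"\\1\\n\\2" is an assumed replacement for the two-group
    pattern, chosen for this sketch.
    """
    return NEW_PREFIX.sub(r"\1\n\2", text)


inline = "Here is code:```python\nprint(1)\n```"
indented = "  ```python\nprint(1)\n  ```"

if __name__ == "__main__":
    # The old pattern flags any fence not at column 0, including indented ones.
    print(bool(OLD_PREFIX.search(indented)))  # -> True
    # The new pattern needs a non-whitespace character before the fence.
    print(bool(NEW_PREFIX.search(indented)))  # -> False
    print(split_fence_from_text(inline))
```

Under these assumptions, the new pattern only fires when real text shares the line with the fence, so fences that are merely indented are left where they are.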
@@ -422,6 +435,47 @@ class ContentNormalizer:
"""Remove leftover XML tags"""
return self._PATTERNS["xml_artifacts"].sub("", content)
def _fix_emphasis_spacing(self, content: str) -> str:
"""Fix spaces inside **emphasis** or _emphasis_
Example: ** text ** -> **text**, **text ** -> **text**, ** text** -> **text**
"""
def replacer(match):
symbol = match.group(1)
inner = match.group("inner")
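# Recursive step: fix emphasis spacing INSIDE the current block first,
# so that ** _italic _ ** has its inner block fixed to _italic_ before the outer spaces are stripped.
# (An inner block padded on both sides, like _ italic _, is left alone by the operator safeguard below.)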
inner = self._PATTERNS["emphasis_spacing"].sub(replacer, inner)
# If no leading/trailing whitespace, nothing to fix at this level
stripped_inner = inner.strip()
if stripped_inner == inner:
return f"{symbol}{inner}{symbol}"
# Safeguard: If inner content is just whitespace, don't touch it
if not stripped_inner:
return match.group(0)
# Safeguard: If it looks like a math expression or list of variables (e.g. " * 3 * " or " _ b _ ")
# If the symbol is surrounded by spaces in the original text, it's likely an operator.
if inner.startswith(" ") and inner.endswith(" "):
# If it's single '*' or '_', and both sides have spaces, it's almost certainly an operator.
if symbol in ["*", "_"]:
return match.group(0)
return f"{symbol}{stripped_inner}{symbol}"
parts = content.split("```")
for i in range(0, len(parts), 2): # Even indices are markdown text
# We use a while loop to handle overlapping or multiple occurrences at the top level
while True:
new_part = self._PATTERNS["emphasis_spacing"].sub(replacer, parts[i])
if new_part == parts[i]:
break
parts[i] = new_part
return "```".join(parts)
class Filter:
class Valves(BaseModel):
@@ -469,6 +523,10 @@ class Filter:
enable_xml_tag_cleanup: bool = Field(
default=True, description="清理残留的 XML 标签"
)
enable_emphasis_spacing_fix: bool = Field(
default=True,
description="修复强调语法中的多余空格 (例如 ** 文本 ** -> **文本**)",
)
show_status: bool = Field(default=True, description="应用修复时显示状态通知")
show_debug_log: bool = Field(
default=True, description="在浏览器控制台打印调试日志 (F12)"
@@ -540,6 +598,7 @@ class Filter:
"Fix Headings": "标题格式",
"Fix Tables": "表格格式",
"Cleanup XML Tags": "XML清理",
"Fix Emphasis Spacing": "强调空格",
"Custom Cleaner": "自定义清理",
}
translated_fixes = [fix_map.get(fix, fix) for fix in applied_fixes]
@@ -626,6 +685,7 @@ class Filter:
enable_heading_fix=self.valves.enable_heading_fix,
enable_table_fix=self.valves.enable_table_fix,
enable_xml_tag_cleanup=self.valves.enable_xml_tag_cleanup,
enable_emphasis_spacing_fix=self.valves.enable_emphasis_spacing_fix,
)
normalizer = ContentNormalizer(config)
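The pieces touched by this commit fit together as follows: a valve sets a config flag, the flag gates the fix, the fix records a label in `applied_fixes`, and the label is translated for the status message. The sketch below traces that path. `DemoConfig`, `DemoNormalizer`, `toy_emphasis_fix` and `translate_fixes` are hypothetical, reduced stand-ins. The `"Custom Cleaner"` label in the loop is an assumption, since the diff truncates that loop body.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DemoConfig:
    """Hypothetical, reduced stand-in for NormalizerConfig."""
    enable_emphasis_spacing_fix: bool = True
    custom_cleaners: List[Callable[[str], str]] = field(default_factory=list)


class DemoNormalizer:
    """Hypothetical stand-in showing only the gating and bookkeeping."""

    def __init__(self, config: DemoConfig, emphasis_fix: Callable[[str], str]):
        self.config = config
        self.emphasis_fix = emphasis_fix
        self.applied_fixes: List[str] = []

    def normalize(self, content: str) -> str:
        self.applied_fixes = []
        if self.config.enable_emphasis_spacing_fix:
            original = content
            content = self.emphasis_fix(content)
            if content != original:
                self.applied_fixes.append("Fix Emphasis Spacing")
        for cleaner in self.config.custom_cleaners:
            original = content
            content = cleaner(content)
            if content != original:
                # Assumed label; the diff does not show this line.
                self.applied_fixes.append("Custom Cleaner")
        return content


# Two entries taken from the fix_map in the Chinese variant of the filter.
FIX_MAP = {"Fix Emphasis Spacing": "强调空格", "Custom Cleaner": "自定义清理"}


def translate_fixes(applied_fixes: List[str]) -> List[str]:
    # Unknown labels fall back to the English name.
    return [FIX_MAP.get(fix, fix) for fix in applied_fixes]


def toy_emphasis_fix(text: str) -> str:
    # Toy replacement used only to make the sketch self-contained.
    return text.replace("** text **", "**text**")
```

A label is recorded only when a step actually changes the content, so a status notification built from `applied_fixes` stays empty for text that was already clean.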