[codex] Add WeChat public account article converter #1912

Draft
Lego1997 wants to merge 1 commit into microsoft:main from Lego1997:codex/wechat-article-cleanup
@Lego1997

Summary

  • Add a dedicated WeChat public account article converter for mp.weixin.qq.com/s/... pages.
  • Use browser-like request headers for WeChat article URLs so the CLI can fetch article HTML instead of the platform verification page.
  • Preserve article title, account metadata, cover image, body content, and in-article images while omitting common WeChat platform chrome.
  • Document direct web page conversion and WeChat article behavior in the README.
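As a rough illustration of the second bullet, the sketch below picks browser-like headers only for WeChat article URLs. It is not the code from this PR: the names `is_wechat_article_url`, `headers_for`, and `BROWSER_LIKE_HEADERS`, as well as the specific header values, are assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical header set; the values actually used by the PR are not shown
# in this description.
BROWSER_LIKE_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
}


def is_wechat_article_url(url: str) -> bool:
    """Return True for http(s) URLs of the form mp.weixin.qq.com/s/<id>."""
    parsed = urlparse(url)
    return (
        parsed.scheme in ("http", "https")
        and parsed.hostname == "mp.weixin.qq.com"
        and parsed.path.startswith("/s/")
    )


def headers_for(url: str, base_headers=None) -> dict:
    """Copy base_headers and add browser-like headers for WeChat articles only."""
    headers = dict(base_headers or {})
    if is_wechat_article_url(url):
        headers.update(BROWSER_LIKE_HEADERS)
    return headers
```

The resulting dictionary could then be passed to an HTTP client, for example as the `headers` argument of `requests.get`. Restricting the override to the WeChat host leaves requests to every other site unchanged.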

Validation

  • uv run --with pytest pytest -q 'tests/test_module_vectors.py::test_guess_stream_info[test_vector8]' 'tests/test_module_vectors.py::test_convert_local[test_vector8]' 'tests/test_module_vectors.py::test_convert_stream_with_hints[test_vector8]' 'tests/test_module_vectors.py::test_convert_stream_without_hints[test_vector8]' 'tests/test_module_vectors.py::test_convert_file_uri[test_vector8]' 'tests/test_module_vectors.py::test_convert_data_uri[test_vector8]' tests/test_module_misc.py::test_wechat_uri_uses_browser_like_headers
  • uv run markitdown 'https://mp.weixin.qq.com/s/dQPd5jgGobw-Q4LxLq3wHA' -o /tmp/upstream-wechat-live.md
  • uvx black==23.7.0 --check packages/markitdown/src/markitdown/converters/_wechat_converter.py packages/markitdown/src/markitdown/converters/__init__.py packages/markitdown/src/markitdown/_markitdown.py packages/markitdown/tests/_test_vectors.py packages/markitdown/tests/test_module_misc.py
  • uvx pre-commit run --all-files

@Lego1997
Author

@microsoft-github-policy-service agree
