Prefer HTML pasteboard flavor in rich-text-to-markdown for hyperlinked text fidelity#1033
alibrohde wants to merge 2 commits into
Conversation
…d text fidelity Modern web sources (browsers, Gmail, Google Docs, Notion, Linear, etc.) place higher-fidelity HTML on the macOS pasteboard than RTF. Hyperlinked text in those sources arrives as real anchor tags in the HTML flavor, which pandoc cleanly renders as [label](url). The RTF flavor from the same sources often drops or mangles those links. The script now tries the HTML flavor first and falls back to the original RTF pipeline when HTML is not present, preserving existing behavior for all sources that only expose RTF.
unnamedd
left a comment
Hi Ali,
Thank you for your contribution to our repository.
I've made only two comments. Please check them and re-request a review when you finish addressing them, okay?
| # @raycast.author Adam Zethraeus | ||
| # @raycast.authorURL https://github.com/adam-zethraeus |
If you are making changes to a Script Command originally created by someone else, it is nice to credit yourself too.
| # @raycast.author Adam Zethraeus | |
| # @raycast.authorURL https://github.com/adam-zethraeus | |
| # @raycast.author Ali Rohde | |
| # @raycast.authorURL https://github.com/alibrohde |
Done — pushed in a95cef2. Took your literal suggestion since the rest of the repo uses a single @raycast.author field rather than co-authoring. Thanks for catching this!
| # Prefer HTML: most modern web sources (browsers, Gmail, Google Docs, Notion, | ||
| # Linear, etc.) place higher-fidelity HTML on the pasteboard than RTF, and it | ||
| # preserves hyperlinked text as real anchor tags that pandoc renders as | ||
| # [label](url) in markdown. | ||
| html=$(osascript -e 'try' -e 'the clipboard as «class HTML»' -e 'on error' -e 'return ""' -e 'end try' 2>/dev/null \ | ||
| | perl -ne 'chomp; next unless s/^«data HTML//; s/»$//; print pack("H*", $_)') |
I am concerned about that because if the editor works with both, and the person prefers to have Markdown instead of RTF, the editor will kind of force the user to make use of RTF. Right?
The case I am thinking of here is Google Docs, where we can have blocks of code and want to paste Markdown inside a code block. I am not sure if it will respect that.
Perhaps it is better to have a second Script Command to convert to RTF instead of doing two things with a single Script Command.
What do you think?
Thanks for raising this — I think there might be a small misread of what the script does (and that's on me for not making it clearer). Both branches produce markdown as the final output: pandoc … --to=gfm | pbcopy runs on either path. The HTML-vs-RTF choice only controls which input flavor on the pasteboard we read from before converting.
The reason HTML is preferred: when you copy from Gmail, Google Docs, Notion, Linear, etc., the pasteboard holds both flavors, but only the HTML version preserves hyperlinks as real <a href="…"> anchor tags. Pandoc turns those into [label](url) in markdown. The RTF flavor on the same sources often flattens links to plain text, so users get markdown with the link text but no URL. RTF stays as a fallback for sources that only expose rich text (some older native apps).
So for the Google Docs code-block case — pasting markdown into a ``` block — the output is markdown regardless of which branch fires. The user always gets markdown on the clipboard; no second script command needed.
Happy to update the @raycast.description to make the input-vs-output distinction more obvious if that would help. Let me know!
This issue/pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in the next 7 days to keep our backlog clean. Thanks for your contributions.
Reviewer asked the author field reflect whoever modified the script, matching the single-author convention used throughout the rest of the repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Description
Extends commands/conversions/rich-text-clipboard-to-markdown.sh to prefer the HTML pasteboard flavor over the RTF flavor, with the original RTF pipeline kept as a fallback.
Why
When you copy hyperlinked text from modern web sources (browsers, Gmail, Google Docs, Notion, Linear, Slack web, etc.), macOS puts both an RTF and an HTML flavor on the pasteboard. The HTML flavor carries the hyperlinks as real anchor tags, which pandoc renders cleanly as [label](url) in markdown. The RTF flavor from the same sources often drops or mangles those links, leaving the visible text intact but stripping the URL.
For anyone who frequently pastes rich text into CLIs or plain-text editors (Claude Code, terminals, vim, etc.), the HTML path meaningfully reduces link loss.
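As a rough illustration of how the HTML flavor is recovered, the sketch below wraps the perl one-liner from the diff in a function. It assumes osascript hands back the flavor in the «data HTML…» hex form that the one-liner expects. The function name decode_html_flavor and the sample input are hypothetical and are not part of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name not from the PR) wrapping the perl one-liner
# shown in the diff. Reads a line of the form «data HTML<hex>» on stdin
# and prints the bytes that the hex digits encode.
decode_html_flavor() {
  perl -ne 'chomp; next unless s/^«data HTML//; s/»$//; print pack("H*", $_)'
}

# Made-up sample input: the hex digits encode <a href="u">x</a>.
printf '«data HTML3C6120687265663D2275223E783C2F613E»\n' | decode_html_flavor
```

Lines that do not start with «data HTML are skipped, so an empty or unexpected osascript result yields empty output, which is what lets a caller detect that the HTML flavor is absent.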
Before / after
Copy from a webpage:
Before (RTF path): hyperlinks often stripped, only visible text survives.
After (HTML path):
```
Check out Raycast and Anthropic.
```
Backward compatibility
No breaking changes. When HTML is absent from the pasteboard (some native apps, older editors), the script falls through to the original RTF pipeline unchanged.
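The fall-through can be pictured with a small sketch. It is only an approximation of the control flow described here, not the script's actual code: the function name pandoc_input_format is hypothetical, and it assumes an empty string stands for "HTML flavor absent".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): choose which pasteboard flavor to
# feed to pandoc. $1 is the decoded HTML flavor; empty means it was absent.
pandoc_input_format() {
  if [ -n "$1" ]; then
    echo html   # HTML present: prefer it, since links survive as anchor tags
  else
    echo rtf    # HTML absent: fall back to the original RTF pipeline
  fi
}

pandoc_input_format '<a href="https://example.com">label</a>'
pandoc_input_format ''
```

Only the input side changes between the two cases. Per the author's reply in the review thread, both paths end in pandoc with --to=gfm, so the clipboard receives markdown either way.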
Type of change
Dependencies / Requirements
Unchanged: still only requires pandoc (brew install pandoc).
Checklist