Prefer HTML pasteboard flavor in rich-text-to-markdown for hyperlinked text fidelity #1033

Open
alibrohde wants to merge 2 commits into raycast:master from alibrohde:ali/hyperlinked-html-rich-text-to-markdown

@alibrohde

Description

Extends commands/conversions/rich-text-clipboard-to-markdown.sh to prefer the HTML pasteboard flavor over the RTF flavor, with the original RTF pipeline kept as a fallback.

Why

When you copy hyperlinked text from modern web sources (browsers, Gmail, Google Docs, Notion, Linear, Slack web, etc.), macOS puts both an RTF and an HTML flavor on the pasteboard. The HTML flavor carries the hyperlinks as real anchor tags, which pandoc renders cleanly as [label](url) in markdown. The RTF flavor from the same sources often drops or mangles those links, leaving the visible text intact but stripping the URL.
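As a rough illustration of the anchor-tag conversion described above, here is a minimal sketch. The helper name `html_to_gfm`, the sample fragment, and the `--from=html` flag are assumptions made for this example; only `--to=gfm` is quoted elsewhere in this thread, and the actual script may invoke pandoc differently.

```shell
#!/bin/sh
# Sketch: convert an HTML fragment on stdin to GitHub-flavored markdown.
# html_to_gfm is a hypothetical helper name; --from=html is an assumption,
# only --to=gfm is quoted in this pull request.
html_to_gfm() {
  pandoc --from=html --to=gfm
}

# Made-up sample fragment; example.com stands in for a real link target.
sample='<p>Check out <a href="https://example.com">Example</a>.</p>'

if command -v pandoc >/dev/null 2>&1; then
  # The anchor tag should come out as a [label](url) markdown link.
  printf '%s\n' "$sample" | html_to_gfm
else
  echo "pandoc not found; install it with: brew install pandoc" >&2
fi
```

The guard around the call only keeps the sketch from failing on machines without pandoc; the script itself already lists pandoc as a requirement.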

For anyone who frequently pastes rich text into CLIs or plain-text editors (Claude Code, terminals, vim, etc.), the HTML path meaningfully reduces link loss.

Before / after

Copy from a webpage:

Check out Raycast and Anthropic.

Before (RTF path): hyperlinks often stripped, only visible text survives.

After (HTML path):

```
Check out [Raycast](…) and [Anthropic](…).
```

Backward compatibility

No breaking changes. When HTML is absent from the pasteboard (some native apps, older editors), the script falls through to the original RTF pipeline unchanged.
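The fallback order described above can be sketched as a small selection step. This is an illustration only: `choose_flavor` is a hypothetical helper, not a function from the script, and it assumes the HTML and RTF pasteboard contents have already been read into strings (empty when a flavor is absent).

```shell
#!/bin/sh
# Sketch: pick which pasteboard flavor to convert, preferring HTML.
# choose_flavor is a hypothetical name used for illustration.
#   $1 = contents of the HTML flavor (empty if absent)
#   $2 = contents of the RTF flavor  (empty if absent)
choose_flavor() {
  if [ -n "$1" ]; then
    printf 'html\n'      # HTML present: use it, hyperlinks survive as anchors
  elif [ -n "$2" ]; then
    printf 'rtf\n'       # HTML absent: fall through to the original RTF path
  else
    return 1             # neither flavor present: nothing to convert
  fi
}

# Made-up inputs standing in for real pasteboard contents.
choose_flavor '<p>hi</p>' '{\rtf1 hi}'
choose_flavor '' '{\rtf1 hi}'
choose_flavor '' '' || echo "nothing to convert" >&2
```

Because the RTF branch is reached only when the HTML string is empty, sources that expose RTF alone take the same path they did before the change.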

Type of change

  • Improvement of an existing script

Dependencies / Requirements

Unchanged — still only requires pandoc (brew install pandoc).

…d text fidelity

Modern web sources (browsers, Gmail, Google Docs, Notion, Linear, etc.) place
higher-fidelity HTML on the macOS pasteboard than RTF. Hyperlinked text in
those sources arrives as real anchor tags in the HTML flavor, which pandoc
cleanly renders as [label](url). The RTF flavor from the same sources often
drops or mangles those links.

The script now tries the HTML flavor first and falls back to the original RTF
pipeline when HTML is not present, preserving existing behavior for all
sources that only expose RTF.
Contributor

@unnamedd left a comment

Hi Ali,
Thank you for your contribution to our repository.

I've made only two comments, please check them and re-request a review when you finish addressing them, okay?

Comment on lines 7 to 8
# @raycast.author Adam Zethraeus
# @raycast.authorURL https://github.com/adam-zethraeus
Contributor

When you change a Script Command originally created by someone else, it is nice to credit the person making the changes too; in this case, yourself.

Suggested change
- # @raycast.author Adam Zethraeus
- # @raycast.authorURL https://github.com/adam-zethraeus
+ # @raycast.author Ali Rohde
+ # @raycast.authorURL https://github.com/alibrohde

Author

Done — pushed in a95cef2. Took your literal suggestion since the rest of the repo uses a single @raycast.author field rather than co-authoring. Thanks for catching this!

Comment on lines +24 to +29
# Prefer HTML: most modern web sources (browsers, Gmail, Google Docs, Notion,
# Linear, etc.) place higher-fidelity HTML on the pasteboard than RTF, and it
# preserves hyperlinked text as real anchor tags that pandoc renders as
# [label](url) in markdown.
html=$(osascript -e 'try' -e 'the clipboard as «class HTML»' -e 'on error' -e 'return ""' -e 'end try' 2>/dev/null \
| perl -ne 'chomp; next unless s/^«data HTML//; s/»$//; print pack("H*", $_)')
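For readers unfamiliar with the decode stage in the hunk above: `osascript` prints the HTML flavor as a hex dump wrapped in `«data HTML…»`, and the perl stage strips that wrapper and unpacks the hex back into bytes. A minimal sketch of that stage in isolation follows; the function name `decode_html_flavor` and the sample input are made up for illustration, while the perl one-liner is the one from the diff.

```shell
#!/bin/sh
# Sketch: decode the «data HTML<hex>» form that osascript prints for the
# HTML pasteboard flavor. decode_html_flavor is a hypothetical wrapper name;
# the perl program is copied from the diff under review.
decode_html_flavor() {
  # - skip any line that does not start with the «data HTML prefix
  # - strip the trailing » and turn the remaining hex digits into raw bytes
  perl -ne 'chomp; next unless s/^«data HTML//; s/»$//; print pack("H*", $_)'
}

# Example with a made-up payload (3C703E68693C2F703E is hex for "<p>hi</p>"):
#   printf '%s\n' '«data HTML3C703E68693C2F703E»' | decode_html_flavor
```

Lines without the expected prefix produce no output, so an empty result from the `on error` branch leaves `html` empty and lets the RTF fallback run.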
Contributor

I am concerned about this: if the editor works with both formats and the person prefers Markdown over RTF, the editor will effectively force the user to use RTF. Right?

The case I am thinking of here is Google Docs, where we can have blocks of code and want to paste Markdown inside a code block. I am not sure if it will respect that.

Perhaps it is better to have a second Script Command to convert to RTF instead of doing two things with a single Script Command.

What do you think?

Author

Thanks for raising this — I think there might be a small misread of what the script does (and that's on me for not making it clearer). Both branches produce markdown as the final output: pandoc … --to=gfm | pbcopy runs on either path. The HTML-vs-RTF choice only controls which input flavor on the pasteboard we read from before converting.

The reason HTML is preferred: when you copy from Gmail, Google Docs, Notion, Linear, etc., the pasteboard holds both flavors, but only the HTML version preserves hyperlinks as real <a href="…"> anchor tags. Pandoc turns those into [label](url) in markdown. The RTF flavor on the same sources often flattens links to plain text, so users get markdown with the link text but no URL. RTF stays as a fallback for sources that only expose rich text (some older native apps).

So for the Google Docs code-block case — pasting markdown into a ``` block — the output is markdown regardless of which branch fires. The user always gets markdown on the clipboard; no second script command needed.

Happy to update the @raycast.description to make the input-vs-output distinction more obvious if that would help. Let me know!

stale Bot commented May 22, 2026

This issue/pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in the next 7 days to keep our backlog clean. Thanks for your contributions.

@stale Bot added the wontfix label May 22, 2026
Reviewer asked the author field reflect whoever modified the
script, matching the single-author convention used throughout the
rest of the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@stale Bot removed the wontfix label May 22, 2026
@alibrohde requested a review from unnamedd May 22, 2026 20:35