Email attachments with a text/* content-type need to identify the encoding used for non-ASCII characters in the charset MIME parameter. Anymail (v13.1 and earlier) is not currently passing charset to ESPs whose APIs support it, which can lead to mojibake if the recipient's email app guesses the wrong charset.
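To make the failure mode concrete, here is a minimal sketch of how a wrong charset guess produces mojibake. It is an illustration only (not Anymail code): the bytes are utf-8, but the "recipient" decodes them as iso-8859-1.

```python
# Text attachment content, encoded by the sender as utf-8
text = "café"
raw = text.encode("utf-8")

# With no charset parameter, a mail client has to guess.
# Guessing iso-8859-1 turns the two utf-8 bytes of "é" into two characters.
wrong = raw.decode("iso-8859-1")
print(wrong)  # cafÃ©

# With the correct charset, the text round-trips cleanly.
right = raw.decode("utf-8")
print(right)  # café
```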
Based on recent Unicode testing related to #448, here's each ESP's support for text attachment charsets and what Anymail needs to do:
- ESP APIs that allow `charset=...` in their attachment type fields. Anymail needs to provide the charset for text attachments (it's a bug that we don't):
  - Postmark (`ContentType` field)
  - Mailgun (in the multipart/form-data field's Content-Type header, which requests allows passing in the `files` list as the third element of a (name, content, content_type) tuple)
  - Mailtrap (`type` field)
  - Mandrill (`type` field)
  - Resend (`content_type` field, added to Resend's API at some point and not currently sent by Anymail)
  - Scaleway (`type` field)
  - Sparkpost (`type` field)
  - Unisender Go (`type` field)
- ESP APIs that don't have a way to specify the charset. Anymail should ensure any text attachment content is encoded as utf-8 before calling the API:
  - Brevo: API guesses content-type from filename extension; seems to unconditionally add `charset=utf-8` to all text attachments (so Anymail should ensure utf-8 encoding)
  - Mailersend: guesses content-type from filename extension; no way to get a `charset` on a text attachment (⚠️ and document that lack of charset can cause mojibake in some email clients)
  - Mailjet: accepts `charset=...` in attachment `type` field and includes that in the attachment headers, but also unconditionally adds `charset=utf-8` for text attachments. To avoid duplicate, conflicting charset headers, Anymail should just ensure utf-8 encoding.
- Already works correctly:
  - Amazon SES (Anymail uses Python's email package to build the raw MIME message, which handles attachment charset correctly)
- Unknown:
  - Postal
  - Sendgrid: Sendgrid's API includes a `type` field, but it didn't support `charset` in 2019; see Sendgrid: charset of text attachment is always iso-8859-1 #150 (and I no longer have access to check if that's changed)
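The two fixes described above could look roughly like this. This is a sketch, not Anymail's implementation: `with_charset` and `to_utf8` are hypothetical helper names, and the `"attachment"` form field name in the Mailgun-style example is an assumption about how the payload would be built.

```python
from typing import Optional


def with_charset(mimetype: str, charset: Optional[str]) -> str:
    """Return mimetype with a charset parameter added, for text/* types only."""
    is_text = mimetype.lower().startswith("text/")
    if is_text and charset and "charset=" not in mimetype.lower():
        return f"{mimetype}; charset={charset}"
    return mimetype


def to_utf8(content: bytes, charset: str) -> bytes:
    """Transcode attachment bytes from the given charset to utf-8."""
    return content.decode(charset).encode("utf-8")


# A text attachment whose content is NOT utf-8
latin1_bytes = "café".encode("iso-8859-1")

# Case 1: the ESP API accepts charset in its attachment type field
type_field = with_charset("text/plain", "iso-8859-1")

# Mailgun-style multipart upload: requests accepts a
# (filename, content, content_type) tuple for each entry in `files`
files = [("attachment", ("notes.txt", latin1_bytes, type_field))]

# Case 2: the ESP API has no charset option (or always assumes utf-8),
# so transcode the content before sending
utf8_bytes = to_utf8(latin1_bytes, "iso-8859-1")
```

Non-text types pass through `with_charset` unchanged, so the same helper could be applied to every attachment.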
While we're updating the docs, we should also note ESPs that incorrectly send Unicode attachment filenames as raw 8-bit utf-8 (in violation of rfc2231). This can lead to mojibake filenames:
- Brevo
- Mailtrap (they have a fix in the pipeline)
- Mandrill (see Mandrill: Attachment file name characters garbled #257)
- Sendgrid (tested 1/2025)
Nearly all of the other ESPs incorrectly send attachment filenames using rfc2047. This is invalid, but a lot of email clients seem to allow it, and because rfc2047 includes the charset you at least won't get mojibake. (The two that handle Unicode attachment filenames correctly are Mailjet, which correctly uses rfc2231, and Amazon SES, because we let Python build the raw MIME message.) I'm inclined not to document this unless someone can identify an email app that displays the undecoded =?utf-8?...?= rfc2047 encoded-word rather than the decoded Unicode attachment filename.
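For reference, the difference between the two filename encodings can be seen with Python's standard email package. This is a standalone sketch with a made-up filename, not the code path Anymail uses for Amazon SES, though it relies on the same package behavior.

```python
from email.header import Header
from email.message import EmailMessage

filename = "résumé.txt"  # illustrative Unicode filename

msg = EmailMessage()
msg.set_content("See attached.")
# Passing a str with charset= makes the package emit the charset parameter too
msg.add_attachment("café", subtype="plain", charset="utf-8", filename=filename)

raw = msg.as_string()
part = next(msg.iter_attachments())

# rfc2231 form, as generated by the email package for non-ASCII parameters
rfc2231_param = "filename*=utf-8''r%C3%A9sum%C3%A9.txt"
print(rfc2231_param in raw)

# rfc2047 encoded-word form, which is meant for header text, not parameters
rfc2047_word = Header(filename, "utf-8").encode()
print(rfc2047_word)
```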