Some attachment properties are null terminated #464

@martinburchell

Description

Bug Metadata

  • Version of extract_msg: [0.54.1]
  • Your python version: Python [3.9.22]
  • How did you launch extract_msg?
    • I used the extract_msg package

Describe the bug
First of all, thanks for providing msg-extractor. It's going to save us a ton of effort.

I've noticed that with the .msg files I'm seeing in the real world, some attachment properties, such as mimetype, displayName and extension, are terminated with \x00. I don't see this with your unicode.msg example. The workaround is to call str.replace("\x00", "") on each affected value wherever it is used.
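
For anyone else hitting this, here is a minimal sketch of that workaround wrapped in helpers. The names strip_nulls and clean_attachment_properties are hypothetical and not part of extract_msg. The sketch uses str.rstrip rather than str.replace, on the assumption that the nulls only ever appear at the end of the value, and it uses a stand-in object instead of a real attachment so it runs without a .msg file.

```python
from types import SimpleNamespace


def strip_nulls(value):
    """Remove trailing NUL characters from a string; return anything else unchanged."""
    if isinstance(value, str):
        return value.rstrip("\x00")
    return value


def clean_attachment_properties(attachment, names=("mimetype", "displayName", "extension")):
    """Read the named attributes from an attachment-like object and strip trailing NULs.

    Hypothetical helper: it only relies on getattr, so any object exposing
    these attribute names works. Missing attributes come back as None.
    """
    return {name: strip_nulls(getattr(attachment, name, None)) for name in names}


# Stand-in for an attachment whose properties are null terminated (assumed shape).
fake_attachment = SimpleNamespace(
    mimetype="application/pdf\x00",
    displayName="report.pdf\x00",
    extension=".pdf\x00",
)
cleaned = clean_attachment_properties(fake_attachment)
print(cleaned)
```

In real use, each item from message.attachments would be passed in place of the stand-in object.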

What code did you use or can we use to reproduce this error?

Run this script with the attached .msg file (remove .txt extension first!).

#!/usr/bin/env python

import sys

from extract_msg import openMsg

filename = sys.argv[1]

message = openMsg(filename, delayAttachments=False)
print(f"Properties for {filename}:")
for attachment in message.attachments:
    for name in ("mimetype", "displayName", "extension"):
        if value := getattr(attachment, name):
            print(f"{name}: {value.encode()}")

Is there a message.msg file you want to share to help us reproduce this?

  • [X] Uploaded message (drag and drop on this window)

null_terminated_attachment_properties_test.msg.txt
