M4A: calling Save() twice on the same TagLib.File instance corrupts the container #378

@noriyahd28v

Summary

When an MPEG-4 (M4A/M4B) file is opened with TagLib.File.Create(...) and Save() is called more than once on the same instance, the second save can leave uninitialized bytes inside moov. The next reader interprets those bytes as a stray atom with an enormous size header (e.g. "Box header specified a size of 809055744 bytes but only 14831634 bytes left in the file"), which causes PossiblyCorrupt to become true and Writeable to become false. Subsequent saves on that file then throw InvalidOperationException: File not writeable.

The corruption is not visible to lenient parsers (ffprobe still reads tags and decodes audio fine), but strict players treat the file as broken.

Affected version

  • TagLibSharp 2.3.0 (latest on NuGet as of 2026-05)
  • Reproduced on .NET 8 / Windows 11

Minimal reproduction

1. Generate a small input M4A with ffmpeg

ffmpeg -y -f lavfi -i "sine=frequency=440:duration=5" \
       -f lavfi -i "color=c=red:s=400x400:d=5" \
       -map 0:a -map 1:v -c:a alac -c:v mjpeg -frames:v 1 \
       -metadata title="Original" -metadata track="1/10" -metadata disc="1/1" \
       -disposition:v attached_pic \
       input.m4a

2. Drive it with TagLibSharp

using TagLib;

var src = "input.m4a";

void RunCase(string label, string dst, Action<TagLib.File> run)
{
    System.IO.File.Copy(src, dst, overwrite: true);
    using var f = TagLib.File.Create(dst, ReadStyle.Average);
    run(f);
    using var verify = TagLib.File.Create(dst, ReadStyle.Average);
    Console.WriteLine($"[{label}] PossiblyCorrupt={verify.PossiblyCorrupt}");
    if (verify.CorruptionReasons != null)
        foreach (var r in verify.CorruptionReasons) Console.WriteLine($"   {r}");
}

// CASE 1 — same instance, two Save() calls
RunCase("two-saves on same instance", "case1.m4a", f =>
{
    f.Tag.Title = "First";
    f.Tag.Track = 1;
    f.Save();
    f.Tag.Title = "Second";
    f.Save();
});

// CASE 2 — fresh TagLib.File for each save
{
    System.IO.File.Copy(src, "case2.m4a", overwrite: true);
    using (var f = TagLib.File.Create("case2.m4a", ReadStyle.Average))
    {
        f.Tag.Title = "First"; f.Tag.Track = 1; f.Save();
    }
    using (var f = TagLib.File.Create("case2.m4a", ReadStyle.Average))
    {
        f.Tag.Title = "Second"; f.Save();
    }
    using var verify = TagLib.File.Create("case2.m4a", ReadStyle.Average);
    Console.WriteLine($"[fresh per save] PossiblyCorrupt={verify.PossiblyCorrupt}");
}

3. Output

[two-saves on same instance] PossiblyCorrupt=True
   Box header specified a size of 134243954 bytes but only 2046 bytes left in the file
[fresh per save] PossiblyCorrupt=False

Expected behavior

Calling Save() multiple times on the same TagLib.File instance for an M4A should leave the file in a structurally valid state, identical to disposing and reopening between saves.

Actual behavior

After the second save, moov ends with a region of bytes that are read back as a stray atom with a garbage size header. The atom appears immediately after the last legitimate child of moov (typically udta):

type:'udta' parent:'moov' sz: 770951
type:'[0][8][0]f' parent:'moov' sz: 809055744   ← stray
type:'mdat' parent:'root' ...

The 8 bytes interpreted as the stray atom header are uninitialized — the exact "size" value varies between runs.
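
The stray atom can be confirmed without TagLibSharp by walking the container directly. The following is a minimal sketch, assuming .NET 7 or later and the standard MPEG-4 atom header layout (32-bit big-endian size, 4-byte type, optional 64-bit extended size). FindStrayAtoms, TryReadHeader and Header are hypothetical helper names, not TagLibSharp APIs, and the demo input is a synthetic in-memory buffer rather than a real file.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Reads one atom header at `offset`; returns false if fewer than 8 bytes remain before `limit`.
static bool TryReadHeader(Stream s, long offset, long limit,
                          out string type, out long size, out int headerLength)
{
    type = ""; size = 0; headerLength = 8;
    if (limit - offset < 8) return false;
    var buf = new byte[16];
    s.Position = offset;
    s.ReadExactly(buf, 0, 8);
    size = BinaryPrimitives.ReadUInt32BigEndian(buf);
    type = Encoding.Latin1.GetString(buf, 4, 4);
    if (size == 1)                       // 64-bit extended size follows the type
    {
        if (limit - offset < 16) return false;
        s.ReadExactly(buf, 8, 8);
        size = (long)BinaryPrimitives.ReadUInt64BigEndian(buf.AsSpan(8, 8));
        headerLength = 16;
    }
    else if (size == 0)                  // atom extends to the end of its container
    {
        size = limit - offset;
    }
    return true;
}

// Walks the top-level atoms and the direct children of 'moov', reporting any
// atom whose declared size does not fit inside its container.
static List<string> FindStrayAtoms(Stream s)
{
    var problems = new List<string>();
    long pos = 0, fileEnd = s.Length;
    while (TryReadHeader(s, pos, fileEnd, out var type, out var size, out var hdr))
    {
        if (size < hdr || size > fileEnd - pos)
        {
            problems.Add($"top-level atom at {pos}: size {size}, but only {fileEnd - pos} bytes left in the file");
            break;
        }
        if (type == "moov")
        {
            long child = pos + hdr, moovEnd = pos + size;
            while (child < moovEnd)
            {
                if (!TryReadHeader(s, child, moovEnd, out _, out var cs, out var ch))
                {
                    problems.Add($"{moovEnd - child} trailing bytes in moov at {child}");
                    break;
                }
                if (cs < ch || cs > moovEnd - child)
                {
                    problems.Add($"child at {child}: size {cs}, but only {moovEnd - child} bytes left in moov");
                    break;
                }
                child += cs;
            }
        }
        pos += size;
    }
    return problems;
}

// Builds an 8-byte atom header (demo only).
static byte[] Header(uint size, string type)
{
    var b = new byte[8];
    BinaryPrimitives.WriteUInt32BigEndian(b, size);
    Encoding.Latin1.GetBytes(type, 0, 4, b, 4);
    return b;
}

// Synthetic demo: moov = empty udta + 8 leftover bytes that parse as a huge atom.
var ms = new MemoryStream();
ms.Write(Header(24, "moov"));                 // 8 header + 8 udta + 8 stray bytes
ms.Write(Header(8, "udta"));
ms.Write(Header(809055744, "\0\u0008\0f"));   // garbage "header", as in the dump above
ms.Write(Header(8, "mdat"));
foreach (var p in FindStrayAtoms(ms)) Console.WriteLine(p);
```

To check a real file, pass a System.IO.File.OpenRead(path) stream instead of the MemoryStream. A clean file should report no problems; case1.m4a from the reproduction is expected to report one child of moov.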

Sensitivity

Whether the second save corrupts the file depends on which tags it writes, in a way that is not obviously predictable. A handful of combinations from a matrix test on the same input file:

second-save tag set                                   corrupt
only Track                                            yes
only Disc                                             yes
AlbumArtists + Title                                  no
AlbumArtists + Track                                  no
AlbumArtists + Title + Track + Disc                   yes
full set (all standard tags including Track/Disc)     no

Setting Track or Disc (the integer-typed properties that map to the trkn/disk boxes), alone or together with a partial subset of the other tags, is the most reliable trigger, although not every such combination corrupts the file (AlbumArtists + Track did not).

Probable area

TagLib.Mpeg4.File.Save and the udta/ilst rewrite path. When the new udta is shorter than the old, padding is added — but in the scenarios above the padding write appears not to fully cover the freed region inside moov. The leftover bytes are previous file contents, which become the stray atom on the next read.
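
For illustration, this is what fully covering a freed region would look like if the gap is filled with a single 'free' atom. It is a minimal sketch under that assumption, not TagLibSharp's actual padding code. WriteFreeAtom is a hypothetical helper, and the demo operates on an in-memory buffer pre-filled with 0xAA to stand in for stale file contents.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper: overwrites exactly `length` bytes at `offset` with one
// 'free' atom (8-byte header + zeroed payload), so no stale bytes remain in the gap.
static void WriteFreeAtom(Stream s, long offset, int length)
{
    if (length < 8)
        throw new ArgumentOutOfRangeException(nameof(length),
            "An atom header needs at least 8 bytes.");
    var atom = new byte[length];                       // payload stays zero-filled
    BinaryPrimitives.WriteUInt32BigEndian(atom, (uint)length);
    atom[4] = (byte)'f'; atom[5] = (byte)'r'; atom[6] = (byte)'e'; atom[7] = (byte)'e';
    s.Position = offset;
    s.Write(atom, 0, atom.Length);
}

// Demo: 32 bytes of "old contents", with a 16-byte gap at offset 8 to cover.
var data = new byte[32];
Array.Fill(data, (byte)0xAA);
var buffer = new MemoryStream(data);
WriteFreeAtom(buffer, offset: 8, length: 16);
Console.WriteLine(Convert.ToHexString(data));
```

The point of the sketch is that the padding length must equal the size of the freed region exactly. If the write covers fewer bytes than were freed, whatever follows the padding inside moov is old data, and the next reader parses it as an atom header.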

Workaround

Always create a fresh TagLib.File for each save:

using (var f = TagLib.File.Create(path)) { /* set tags */; f.Save(); }
// Next save: open again, do not reuse the previous instance
using (var f = TagLib.File.Create(path)) { /* set tags */; f.Save(); }

This is what we have applied in our application; it eliminates the corruption end-to-end.

Impact context

We hit this in a tag editor where each file's TagLib.File instance is kept alive for the lifetime of the GUI. Users would batch-edit tags, save, then edit one more field and save again — the second save corrupted the M4A. Recovery requires ffmpeg -c copy -movflags +faststart to remux the container.
