Serializing a CArray results in a broken document. #300

@m-gallesio

The way CArray is serialized just concatenates its elements, with no whitespace between them.

This means a document gets broken when a page's content is replaced via PdfPage.Contents.ReplaceContent(CSequence) and the new content contains any CArray.
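To illustrate the effect outside of PDFsharp, here is a minimal self-contained sketch. The helpers `SerializeConcatenated`, `SerializeSpaced` and `Format` are hypothetical names made up for this example and are not PDFsharp APIs; the number formatting is also an assumption. It only shows why missing separators change the meaning of an array operand such as the one passed to the `d` (dash pattern) operator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-ins, not PDFsharp code: an array operand is modelled
// as a plain sequence of numbers.

// Mimics the reported behaviour: elements are appended back to back.
static string SerializeConcatenated(IEnumerable<double> elements) =>
    "[" + string.Concat(elements.Select(e => Format(e))) + "]";

// Separates elements with a single space so each number stays its own token.
static string SerializeSpaced(IEnumerable<double> elements) =>
    "[" + string.Join(" ", elements.Select(e => Format(e))) + "]";

// Culture-invariant formatting so the decimal separator is always '.'.
static string Format(double value) =>
    value.ToString("0.###", CultureInfo.InvariantCulture);

// A dash pattern of 3 units on, 1.5 units off, with phase 0.
var dashPattern = new[] { 3.0, 1.5 };
Console.WriteLine(SerializeConcatenated(dashPattern) + " 0 d"); // [31.5] 0 d
Console.WriteLine(SerializeSpaced(dashPattern) + " 0 d");       // [3 1.5] 0 d
```

In the first line of output the two elements collapse into the single number 31.5, which is a different dash pattern. With elements that each contain a decimal point, the concatenation would not even be a valid number, which would match the parser failure described further down.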

See the reproduction code sample, copied here:

using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;
using System.IO;

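// Open the PDF given as the first command-line argument for modification.
using var inputStream = File.OpenRead(args[0]);
using var document = PdfReader.Open(inputStream, PdfDocumentOpenMode.Modify);

foreach (var page in document.Pages)
{
    // Copy every parsed content object, unchanged, into a new sequence.
    var newContent = new CSequence();
    foreach (var item in ContentReader.ReadContent(page))
        newContent.Add(item);
    // Serializing the sequence back is where any CArray gets broken.
    page.Contents.ReplaceContent(newContent);
}

// Save next to the input file with an "_EDITED" suffix.
document.Save(Path.Combine(Path.GetDirectoryName(args[0]), Path.GetFileNameWithoutExtension(args[0]) + "_EDITED.pdf"));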

This sample reads each page's content via ContentReader and re-creates it by adding the same items, unchanged, to a new CSequence.

The sample files included in the /files folder are:

  • A DOCX document containing a box with dashed borders
  • Its PDF version converted by Microsoft Word. The dashed borders are rendered via a d operator with a CArray of CReals as its operand
  • The result of processing said PDF document with the sample code.

Exactly how broken the document appears depends on the viewer; in the sample case:

  • Microsoft Edge's embedded viewer renders the box with dashed borders but ignores the original dash spacing
  • Firefox's embedded viewer renders the box correctly
  • Adobe Acrobat Reader completely stops rendering the document once it reaches the dashed box
  • The original document I discovered this in breaks PDFBox's parser because it tries to read the concatenated floats as a single float

Labels: bug