PDFsharp/MigraDoc: Get Smaller PDF Files by Tweaking Compression Options

PDFsharp has some options you can use to control compression.
You can activate some options to generate smaller PDF files – at the price of longer creation times.

The most commonly used filter (encoder and decoder) in PDF is “/FlateDecode”. It uses the Deflate algorithm, the same compression method commonly used for ZIP files. Many ZIP tools allow you to optimise for size or optimise for speed, and so does PDFsharp.

PDF allows multiple filters to be applied to one object. PDFsharp supports this in some cases.

One case where two filters can be combined is embedded JPEG files. JPEG files already use a very efficient compression, but sometimes ZIP can squeeze out another 1% to 3%. When you enable zipping for JPEG images, PDFsharp will ZIP the image to see if that reduces the file size, but will use the zipped version only if it is smaller. You can also enforce zipping of JPEG files; then the zipped version will be used even if it is larger.

For monochrome images (bilevel images) PDFsharp can use the compression method used by fax machines. If you enable fax compression in PDFsharp then PDFsharp will try fax compression for bilevel images and will use the smallest version – this can be the zipped version, the version using fax encoding, or a combination of both.

When you enable zipping for JPEG files or fax encoding for bilevel images then PDFsharp has to do some extra work during PDF creation. And sometimes this work will be in vain, leading to larger compressed objects that will be discarded.

So when should I use that extra compression?

You decide. When you create a static file that will be stored on a server for download by many visitors, using the strongest compression looks like a good idea. You lose a bit of time during the one-time creation, but every visitor will save a bit of time when downloading the PDF.
If the static PDF file is placed inside a ZIP file, then maybe skip the extra compression: the file gets zipped anyway, so the visitor will download a smaller version either way.
If you create files on demand, it is a trade-off: use extra compression on a fast server if visitors use a slow connection; do not use extra compression on a slow server or if visitors use fast connections.

Here’s how to do it

First thing: set the compression options before adding any contents to the PDF file.
In the code snippets, “document” is my object of class PdfDocument.
To create small PDF files, I recommend these settings:

document.Options.FlateEncodeMode =
    PdfFlateEncodeMode.BestCompression;
document.Options.UseFlateDecoderForJpegImages =
    PdfUseFlateDecoderForJpegImages.Automatic;
document.Options.NoCompression = false;
// CompressContentStreams defaults to false
// in debug builds, so we set it to true explicitly.
document.Options.CompressContentStreams = true;
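
The recommended settings above do not switch on the fax encoding for bilevel images described earlier. A minimal sketch of how that might look, assuming the option is exposed in PDFsharp 1.50 as EnableCcittCompressionForBilevelImages (check your version; the class and method names here are only for illustration):

```csharp
using PdfSharp.Pdf;

static class FaxCompressionExample
{
    // Hypothetical helper: creates a document with fax (CCITT) encoding
    // enabled for monochrome images.
    public static PdfDocument CreateDocument()
    {
        var document = new PdfDocument();
        // Assumption: this is the option name used by PDFsharp 1.50.
        // Set it before adding any images to the document.
        document.Options.EnableCcittCompressionForBilevelImages = true;
        return document;
    }
}
```

This only has an effect on bilevel images; documents without monochrome images should come out the same size.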

If speed is important, then maybe try these settings:

document.Options.FlateEncodeMode =
    PdfFlateEncodeMode.BestSpeed;
document.Options.NoCompression = true;
document.Options.CompressContentStreams = false;
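
If both variants are needed in the same application, the two groups of settings can be wrapped in one place. This is only a sketch: CompressionPresets and its preferSmallFiles parameter are made-up names, not part of PDFsharp, and the values are simply the ones from the two snippets above.

```csharp
using PdfSharp.Pdf;

static class CompressionPresets
{
    // Hypothetical helper, not part of PDFsharp.
    // Call it right after creating the document, before adding content.
    public static void Apply(PdfDocument document, bool preferSmallFiles)
    {
        PdfDocumentOptions options = document.Options;
        if (preferSmallFiles)
        {
            // Settings for small files: spend more time on compression.
            options.NoCompression = false;
            options.CompressContentStreams = true;
            options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;
            options.UseFlateDecoderForJpegImages =
                PdfUseFlateDecoderForJpegImages.Automatic;
        }
        else
        {
            // Settings for fast creation: skip compression work.
            options.FlateEncodeMode = PdfFlateEncodeMode.BestSpeed;
            options.NoCompression = true;
            options.CompressContentStreams = false;
        }
    }
}
```

A call such as CompressionPresets.Apply(document, true) would then replace the individual assignments.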

And how to do it with MigraDoc?

The “PdfDocumentRenderer” class creates the “PdfDocument” object when needed. To set compression options, just create the PdfDocument object yourself and assign it to the renderer before doing anything else with the renderer.

PdfDocumentRenderer pdfRenderer =
    new PdfDocumentRenderer(true);
pdfRenderer.PdfDocument = new PdfDocument();
pdfRenderer.PdfDocument.Options.FlateEncodeMode =
    PdfFlateEncodeMode.BestCompression;
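
For context, here is a sketch of a complete program built around that snippet. The one-paragraph document and the file name compressed.pdf are placeholders I made up for illustration; the rest follows the usual MigraDoc 1.50 rendering steps.

```csharp
using MigraDoc.DocumentObjectModel;
using MigraDoc.Rendering;
using PdfSharp.Pdf;

static class MigraDocCompressionExample
{
    static void Main()
    {
        // Placeholder content: one section with one paragraph.
        var document = new Document();
        Section section = document.AddSection();
        section.AddParagraph("Hello, compressed world!");

        // Create the PdfDocument ourselves so the options are set
        // before the renderer adds any content.
        var pdfRenderer = new PdfDocumentRenderer(true);
        pdfRenderer.PdfDocument = new PdfDocument();
        pdfRenderer.PdfDocument.Options.FlateEncodeMode =
            PdfFlateEncodeMode.BestCompression;

        // Hand the MigraDoc document to the renderer and create the PDF.
        pdfRenderer.Document = document;
        pdfRenderer.RenderDocument();
        pdfRenderer.PdfDocument.Save("compressed.pdf");
    }
}
```

The other options shown for PDFsharp can be set on pdfRenderer.PdfDocument.Options in the same way.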

This post applies to PDFsharp 1.50. Older versions may not have the settings described here; newer versions may have more or different options.

What can I expect?

Do not expect miracles.
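With my small test PDF I got 18,677 bytes with the default settings, 21,844 bytes with the BestSpeed setting, and 18,243 bytes with the BestCompression setting and zipped JPEG files. That is about 2.3% smaller than the default with the strongest compression and about 17% larger with BestSpeed.
With my large test PDF I got 9,394,532 bytes with the default settings, 9,397,783 bytes with the BestSpeed setting, and 9,333,141 bytes with the BestCompression setting and zipped JPEG files. Here the strongest compression saves only about 0.7%, and BestSpeed adds less than 0.1%.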
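
Your own files will behave differently, so it may be worth measuring. The following is a rough sketch, not the test I used for the numbers above: it builds the same made-up page (500 plain lines, no images) twice with different options and saves each version to memory to compare sizes. SizeComparison and Measure are names invented for this example.

```csharp
using System;
using System.IO;
using PdfSharp.Drawing;
using PdfSharp.Pdf;

static class SizeComparison
{
    // Builds a small test document with the given options
    // and returns the size of the saved PDF in bytes.
    public static long Measure(Action<PdfDocumentOptions> configure)
    {
        using (var document = new PdfDocument())
        {
            configure(document.Options); // options first, content second

            PdfPage page = document.AddPage();
            using (XGraphics gfx = XGraphics.FromPdfPage(page))
            {
                // Made-up content: 500 horizontal lines.
                for (int i = 0; i < 500; i++)
                    gfx.DrawLine(XPens.Black, 20, 20 + i, 500, 20 + i);
            }

            using (var stream = new MemoryStream())
            {
                document.Save(stream, false);
                return stream.Length;
            }
        }
    }

    static void Main()
    {
        long uncompressed = Measure(o =>
        {
            o.NoCompression = true;
            o.CompressContentStreams = false;
        });
        long compressed = Measure(o =>
        {
            o.NoCompression = false;
            o.CompressContentStreams = true;
            o.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;
        });
        Console.WriteLine("Uncompressed: {0} bytes", uncompressed);
        Console.WriteLine("BestCompression: {0} bytes", compressed);
    }
}
```

To get meaningful numbers, replace the drawing loop with content that resembles your real documents, including the images.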

What else can I do?

300 dpi should normally be enough for PDF files printed “at home”, while even 100 or 150 dpi is enough for PDFs viewed on the screen. You can drastically shrink your PDF files by reducing 20 megapixel photos from modern cameras to around 1 megapixel (depending on how large you show them in the PDF) and maybe by saving images with a lower JPEG quality.
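
The pixel size an image really needs follows from its size on the page and the target resolution. A small sketch of that arithmetic, with ImageSizing and PixelsFor being names made up for this example:

```csharp
using System;

static class ImageSizing
{
    // Pixels needed along one edge of an image that is shown
    // lengthInInches long on the page at the given resolution.
    public static int PixelsFor(double lengthInInches, double dpi)
    {
        return (int)Math.Ceiling(lengthInInches * dpi);
    }

    // Total megapixels for an image shown at the given size on the page.
    public static double MegapixelsFor(
        double widthInInches, double heightInInches, double dpi)
    {
        return (double)PixelsFor(widthInInches, dpi)
             * PixelsFor(heightInInches, dpi) / 1e6;
    }
}
```

For a photo shown at 6 x 4 inches on the page, this gives 900 x 600 pixels (0.54 megapixels) at 150 dpi and 1800 x 1200 pixels (2.16 megapixels) at 300 dpi, far below the 20 megapixels a camera delivers.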
