A 48MB PDF sits between you and an email send button. The attachment limit is 25MB. You could split the document, but that defeats the purpose — the recipient needs the whole thing. What you actually need is a smaller file that looks identical to the original.
PDF compression can shrink files dramatically, but the wrong settings will turn sharp text into mush and crisp photos into pixelated blocks. The trick is understanding what's inside your PDF and which compression knobs to turn. This guide covers the mechanics: what makes PDFs large, how compression works at a technical level, and which approaches preserve quality while cutting file size.
Why PDFs get large
A PDF is a container format. The file size depends entirely on what's inside the container. Three things account for nearly all the bloat:
Embedded images
Images are the biggest offender by far. A single uncompressed 300 DPI photograph at letter size occupies roughly 25MB of raw pixel data. Most PDF creators apply some image compression, but the defaults vary wildly. Scanners in particular tend to produce enormous files — a 10-page scanned document can easily hit 80MB because each page is a full-resolution raster image.
The image format matters too. Some PDF generators embed images as uncompressed bitmaps or lossless PNG data when JPEG would be perfectly fine for photographs. Others use JPEG but at quality 100, which produces files almost as large as uncompressed while adding compression artifacts (JPEG at quality 100 is still lossy — it just wastes bytes trying to minimize that loss).
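The 25MB figure follows directly from the page dimensions. Here is a quick sketch of the arithmetic, assuming a US letter page (8.5 × 11 inches) and 8-bit RGB at 3 bytes per pixel; the function name is illustrative only:

```python
def raw_image_bytes(width_in: float, height_in: float, dpi: int, channels: int = 3) -> int:
    """Raw (uncompressed) size of a raster page, one byte per channel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels


letter_300 = raw_image_bytes(8.5, 11, 300)
print(f"{letter_300 / 1_000_000:.1f} MB")  # 25.2 MB
```

Doubling the resolution to 600 DPI quadruples the pixel count, which is why scanner output grows so quickly.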
Embedded fonts
PDFs embed the fonts they use to guarantee consistent rendering on any device. A full TrueType font file can be 500KB to several megabytes. If a document uses five fonts (body, bold, italic, headings, code), that's potentially 5–15MB of font data alone — even if the document only uses a fraction of the characters in each font.
Font subsetting solves this: instead of embedding the entire font, the PDF includes only the glyphs actually used. A document that uses 200 unique characters from a font only needs those 200 glyphs, not all 3,000. Most modern PDF generators subset by default, but older tools and some exporters don't.
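To get a feel for how few glyphs a document actually needs, you can count the distinct characters in its text. The sketch below is a rough approximation, not a subsetter: it assumes one glyph per character, which ignores ligatures and other shaping, and the 3,000-glyph total is simply the figure from the example above.

```python
def characters_used(text: str) -> set[str]:
    """Distinct non-whitespace characters a subsetter would need to keep."""
    return {ch for ch in text if not ch.isspace()}


sample = "Font subsetting keeps only the glyphs a document uses."
used = characters_used(sample)
full_font_glyphs = 3000  # illustrative figure, not a measured value
print(f"{len(used)} of {full_font_glyphs} glyphs needed")
```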
Duplicated objects and metadata
PDFs built by merging multiple source documents often contain duplicated resources — the same font embedded three times, the same ICC color profile repeated on every page, identical images stored as separate objects. Each redundancy adds to the file size without adding any visible content. A well-optimized PDF deduplicates these objects so each resource appears once and is referenced everywhere it's used.
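The idea behind deduplication can be sketched with content hashing. This is a simplified model, not how any particular tool does it: objects are represented as a hypothetical mapping from object id to raw stream bytes, and real PDF objects also carry dictionaries that would need comparing.

```python
import hashlib


def deduplicate(objects: dict[int, bytes]) -> tuple[dict[int, bytes], dict[int, int]]:
    """Keep one copy of each distinct payload.

    Returns the surviving objects and a map from every original id to the
    id that should be referenced instead.
    """
    first_seen: dict[str, int] = {}  # content hash -> id of the first copy
    kept: dict[int, bytes] = {}
    remap: dict[int, int] = {}
    for obj_id, data in objects.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in first_seen:
            first_seen[digest] = obj_id
            kept[obj_id] = data
        remap[obj_id] = first_seen[digest]
    return kept, remap


# Hypothetical merged file: the same font stream embedded three times.
objects = {1: b"FONT-A", 2: b"IMAGE-1", 3: b"FONT-A", 4: b"FONT-A"}
kept, remap = deduplicate(objects)
print(sorted(kept))  # [1, 2]
print(remap)         # {1: 1, 2: 2, 3: 1, 4: 1}
```

Every reference to objects 3 and 4 would then be rewritten to point at object 1, and the two extra copies dropped.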
Lossless vs lossy compression
This is the core distinction. Every compression decision you make falls into one of these two categories.
Lossless compression
Lossless compression reduces file size without changing the content at all. The decompressed output is bit-for-bit identical to the original. Techniques include:
- Object stream compression (Flate/zlib). The PDF spec supports compressing internal object streams using zlib deflation — the same algorithm behind gzip and PNG. This is pure overhead removal. If a PDF was written without stream compression (some older generators skip it), applying Flate encoding can cut file size by 20–40% with zero quality impact.
- Object deduplication. Removing duplicate fonts, images, and metadata objects. Merged documents benefit the most. Savings vary: a document assembled from 20 separate PDFs sharing the same corporate template might shrink by 50% just from deduplication.
- Unused object removal. PDFs accumulate dead objects over time: earlier revisions of pages, deleted annotations, orphaned form fields. Stripping these out is usually called "garbage collecting" the object tree. (Linearizing is a different operation: it reorders the file so the first page can display before the rest has downloaded.) Garbage collection is safe and sometimes recovers significant space in heavily edited documents.
- Font subsetting. Trimming embedded fonts to only the glyphs used in the document. A font that originally embedded 3,000 glyphs but only uses 150 can shrink from 2MB to under 100KB. No visual change whatsoever.
Lossless compression is always worth doing. There is no trade-off — the output looks identical and the file is smaller. If your PDF was produced by a tool that doesn't optimize well (Word's PDF export, many scanners, some web-to-PDF converters), lossless compression alone can dramatically reduce file size.
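Flate is the same deflate algorithm exposed by Python's zlib module, so the lossless property is easy to demonstrate. The content stream below is made up and far more repetitive than a real page, so the ratio it prints says nothing about real-world savings; the point is the exact round trip.

```python
import zlib

# A made-up, uncompressed page content stream. PDF drawing operators repeat heavily.
stream = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 200

compressed = zlib.compress(stream, level=9)
restored = zlib.decompress(compressed)

assert restored == stream  # lossless: bit-for-bit identical
print(len(stream), "->", len(compressed), "bytes")
```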
Lossy compression
Lossy compression reduces file size by permanently discarding information. The output looks similar to the original but is not identical. The question is whether the differences are noticeable:
- Image recompression. Re-encoding embedded images at a lower JPEG quality level. Dropping from quality 95 to quality 75 typically shrinks images by 3–5x with differences invisible at normal viewing distance. Below quality 50, compression artifacts become noticeable on photographs. Text rendered as images (scanned documents) shows artifacts sooner.
- Image downsampling. Reducing image resolution. A typical screen at 100% zoom shows roughly 96 pixels per inch, so most of the pixels in a 600 DPI scan never reach the display and only matter for high-quality print. Downsampling to 150 DPI cuts image data by roughly 94% compared to 600 DPI, because the pixel count falls with the square of the resolution. For documents that will only be viewed on screens, 150 DPI is usually sufficient. For documents that will be printed, 300 DPI is the standard target.
- Color space conversion. Converting images from CMYK (4 channels) to RGB (3 channels) saves 25% per image. If the document is only going to screens, not to a commercial printer, the CMYK data is unnecessary. Some compression tools do this automatically.
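The percentages in the last two items come from simple ratios. Here is a small sketch of the arithmetic, measuring raw sample data before any JPEG encoding; the function names are illustrative:

```python
def pixel_reduction(source_dpi: int, target_dpi: int) -> float:
    """Fraction of pixels removed when downsampling both axes."""
    return 1 - (target_dpi / source_dpi) ** 2


def channel_reduction(source_channels: int, target_channels: int) -> float:
    """Fraction of raw sample data removed when dropping color channels."""
    return 1 - target_channels / source_channels


print(f"{pixel_reduction(600, 150):.2%}")  # 93.75%
print(f"{pixel_reduction(600, 300):.2%}")  # 75.00%
print(f"{channel_reduction(4, 3):.2%}")    # 25.00%
```

Resolution is squared because it applies to both width and height, which is why halving the DPI removes three quarters of the pixels rather than half.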
How Ghostscript compression works
Ghostscript is the engine behind most PDF compression tools — including many online services that don't mention it. When you upload a PDF to a "free PDF compressor," there's a good chance Ghostscript is doing the work on the server side.
Ghostscript provides five presets through its -dPDFSETTINGS flag:
- /default: Output intended to be useful across a wide range of uses, possibly at the expense of a larger file. Suitable for general-purpose documents when size is not the main concern.
- /screen: Aggressive compression. Images downsampled to 72 DPI, low JPEG quality. Produces the smallest files but with visible quality loss on images. Acceptable for documents that will only be viewed on screen at normal zoom.
- /ebook: Slightly better than screen. Images at 150 DPI with moderate quality. A good balance for documents that might be viewed on high-DPI tablets or printed occasionally.
- /printer: 300 DPI images with high quality. The file size reduction comes mainly from font subsetting, object deduplication, and stream compression. Minimal visible impact on image quality.
- /prepress: Preserves everything for commercial printing. Minimal compression. Use this when image quality must stay essentially untouched and a modest size reduction is acceptable.
The command looks like this:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dBATCH -sOutputFile=compressed.pdf input.pdf
For finer control, you can override individual settings. For example, to keep 300 DPI images but still apply font subsetting and object compression:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -dColorImageResolution=300 -dGrayImageResolution=300 -dMonoImageResolution=300 -dNOPAUSE -dBATCH -sOutputFile=compressed.pdf input.pdf
One important caveat: Ghostscript re-renders the PDF internally. It parses the input, builds its own representation, and writes a new output. This process can subtly change font hinting, line weights, and color values. For most documents the differences are imperceptible. But if you're compressing a design file where exact color reproduction matters, test the output carefully.
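If you compress files regularly, the same invocation can be wrapped in a small script. The sketch below assumes gs is installed and on your PATH; build_gs_command and compress are hypothetical helper names, and the flags are exactly the ones shown in the commands above.

```python
import subprocess

PRESETS = {"/default", "/screen", "/ebook", "/printer", "/prepress"}


def build_gs_command(input_pdf: str, output_pdf: str, preset: str = "/ebook") -> list[str]:
    """Assemble the Ghostscript argument list for a given preset."""
    if preset not in PRESETS:
        raise ValueError(f"unknown -dPDFSETTINGS preset: {preset}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS={preset}",
        "-dNOPAUSE",
        "-dBATCH",
        f"-sOutputFile={output_pdf}",
        input_pdf,
    ]


def compress(input_pdf: str, output_pdf: str, preset: str = "/ebook") -> None:
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(build_gs_command(input_pdf, output_pdf, preset), check=True)


# Printing the command is safe to do without Ghostscript installed.
print(" ".join(build_gs_command("input.pdf", "compressed.pdf")))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.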
Compression strategies by document type
The right compression approach depends on what your PDF contains:
Scanned documents
Every page is a raster image, so image compression is everything. Start by checking the scan resolution — many scanners default to 600 DPI, which is excessive for most uses. Downsampling to 200–300 DPI and applying JPEG compression at quality 60–75 will typically reduce a scanned document to 10–20% of its original size. If the scans are black-and-white text, converting to grayscale (or 1-bit monochrome with CCITT Group 4 compression) is even more effective.
Office-generated PDFs (Word, PowerPoint, Excel)
These tend to have unsubsetted fonts and unoptimized images pasted from other sources. Lossless compression (font subsetting + object cleanup) often cuts 30–50% without any quality impact. If the document contains high-resolution photos, downsampling to 150–200 DPI adds further savings.
Design-heavy PDFs (InDesign, Illustrator exports)
These files contain vector graphics, CMYK color data, spot colors, and embedded ICC profiles. Aggressive compression can damage color accuracy or flatten transparency in unexpected ways. Stick to the /prepress or /printer settings in Ghostscript. Apply lossless optimizations only. If the file needs to go to commercial print, don't compress at all; send the original.
Text-heavy PDFs (reports, manuals, academic papers)
Most of the file size comes from fonts and a few embedded figures. Font subsetting is the biggest win. Images can usually be downsampled to 150 DPI without anyone noticing. These documents compress well because text itself is extremely compact — the entire text content of a 200-page novel is about 500KB before fonts.
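The four recommendations can be summarized as a small lookup table. This is only a restatement of the guidance above in code form; the Strategy class, its field names, and the dictionary keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Strategy:
    lossless_only: bool                       # True = never touch image data
    image_dpi: Optional[Tuple[int, int]]      # target DPI range, None = leave as is
    jpeg_quality: Optional[Tuple[int, int]]   # quality range, None = not specified


STRATEGIES = {
    "scanned": Strategy(False, (200, 300), (60, 75)),
    "office": Strategy(False, (150, 200), None),
    "design": Strategy(True, None, None),
    "text": Strategy(False, (150, 150), None),
}

for doc_type, strategy in STRATEGIES.items():
    print(doc_type, strategy)
```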
Tools for compressing PDFs
Browser-based (no installation)
Our Compress PDF tool applies lossless optimizations directly in your browser — your file never leaves your device. For documents where the bloat comes from inefficient encoding rather than oversized images, this is often enough.
For heavier compression involving image resampling and Ghostscript-level processing, you'll need a tool that runs server-side. Many online services (iLovePDF, Smallpdf, Adobe Acrobat online) offer this, though your file is uploaded to their servers for processing.
Command-line
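Ghostscript (covered above) is the most powerful option. Install with brew install ghostscript on macOS or sudo apt install ghostscript on Ubuntu. For lossless-only optimization, qpdf is faster: qpdf --object-streams=generate --recompress-flate --compression-level=9 input.pdf output.pdf. Avoid qpdf's --optimize-images flag if you want to stay lossless, since it re-encodes images as JPEG.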
Desktop applications
Adobe Acrobat Pro has "Save As > Reduced Size PDF" and "Save As > Optimized PDF" (which gives granular control over image compression, font handling, and object cleanup). Preview on macOS has a "Reduce File Size" Quartz filter under File > Export — it's aggressive and gives you no control over settings, but it's built into every Mac.
A practical compression workflow
For most situations, this sequence produces the best results with minimal risk:
- Check the file size breakdown. Open the PDF in Acrobat Pro, or run pdfinfo, pdfimages -list, and pdffonts from poppler-utils. Identify whether images, fonts, or structural overhead is the main contributor.
- Apply lossless compression first. Run font subsetting, object deduplication, and stream compression. These cost nothing in quality, and the output is almost always smaller.
- If the file is still too large, apply targeted lossy compression. Downsample images to the minimum resolution your use case requires (150 DPI for screen, 300 DPI for print). Apply JPEG compression at quality 70–80.
- Compare the result. Open both files side by side at 100% zoom. Check images, fine text, and any diagrams with thin lines. If the compressed version looks identical at your intended viewing conditions, you're done.
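The ordering of the middle two steps can be expressed as a short function. This is a sketch under simplifying assumptions: the PDF is held in memory as bytes, and lossless and lossy are placeholder callables standing in for whatever tool you use. The visual comparison in the last step still has to be done by eye.

```python
from typing import Callable

Step = Callable[[bytes], bytes]


def compress_in_stages(pdf: bytes, limit: int, lossless: Step, lossy: Step) -> bytes:
    """Apply lossless first; apply lossy only if still over the limit.

    A step's output is kept only when it is actually smaller than its input.
    """
    best = pdf
    candidate = lossless(best)
    if len(candidate) < len(best):
        best = candidate
    if len(best) > limit:
        candidate = lossy(best)
        if len(candidate) < len(best):
            best = candidate
    return best


# Stand-in steps that just truncate bytes to simulate smaller output.
original = bytes(100)
result = compress_in_stages(original, limit=80,
                            lossless=lambda b: b[:70],
                            lossy=lambda b: b[:10])
print(len(result))  # 70
```

In the usage example the lossless pass already brings the file under the limit, so the lossy step never runs. The size check after each step also covers the case where a tool produces a larger file than it was given.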
FAQ
Will compression remove text from my PDF?
No. Text in a PDF is stored as character codes with position data — it's already extremely compact. Compression tools don't touch the text layer. Font subsetting removes unused glyphs from the font file, but the characters actually used in the document are preserved.
Can I compress a password-protected PDF?
You'll need to unlock the PDF first. Compression tools need to read and rewrite the internal structure, which encryption prevents. Remove the password, compress, then re-apply password protection if needed.
Why did my compressed file get larger?
This happens occasionally. If the input PDF was already well-optimized, some compression tools add overhead (metadata, a new cross-reference table) that outweighs the savings. Ghostscript in particular can increase file size on already-compressed PDFs because it re-renders and re-encodes everything from scratch. If your file gets larger, the original was already efficient — keep it as is.
What's the smallest I can make a PDF without quality loss?
It depends entirely on the content. A text-only document with subsetted fonts can be a few hundred kilobytes regardless of page count. A document with photographs has a floor determined by the image data — you can't make a 10-megapixel photo smaller than about 200KB at JPEG quality 50, and quality suffers noticeably below that. The practical answer: apply all lossless optimizations, then decide if you're willing to resample images. That gets you to the minimum without visible degradation.