How to convert a PDF to grayscale from the command line without rasterizing it?
A bit late in the day, but the top answer doesn't work for me with a different file. The underlying problem appears to be old color-management code in Ghostscript; a newer implementation exists but is not enabled by default. More on that here: http://bugs.ghostscript.com/show_bug.cgi?id=694608
The page above also gives a command that works for me:
gs \
-sDEVICE=pdfwrite \
-dProcessColorModel=/DeviceGray \
-dColorConversionStrategy=/Gray \
-dPDFUseOldCMS=false \
-o out.pdf \
-f in.pdf
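If you need to run this on more than one file, the command can be wrapped in a small shell function. This is only a sketch: the name pdf_to_gray and the argument checks are my own additions, and the gs flags are copied unchanged from the command above.

```shell
# Hypothetical helper (the name is made up for this example).
# Usage: pdf_to_gray INPUT.pdf OUTPUT.pdf
pdf_to_gray() {
  if [ "$#" -ne 2 ]; then
    echo "usage: pdf_to_gray INPUT.pdf OUTPUT.pdf" >&2
    return 2
  fi
  # Writing the output over the input would clobber the file being read.
  if [ "$1" = "$2" ]; then
    echo "pdf_to_gray: input and output must differ" >&2
    return 2
  fi
  # Same flags as the command above; only the file names are parameters.
  gs \
    -sDEVICE=pdfwrite \
    -dProcessColorModel=/DeviceGray \
    -dColorConversionStrategy=/Gray \
    -dPDFUseOldCMS=false \
    -o "$2" \
    -f "$1"
}
```

For example, pdf_to_gray in.pdf out.pdf runs exactly the command shown above.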
If you crack open the file, you'll find that most of the colors are defined through an RGB ICC-based color space (look for 8 0 R to find all the references to this color space). Perhaps gs is complaining about that? Who knows.
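A quick way to look for such references without opening the file in an editor is grep. A rough sketch, assuming the relevant dictionaries are stored uncompressed (references inside compressed object streams will not be found); the function name count_colorspace_refs is made up for this example.

```shell
# Hypothetical helper: count textual occurrences of a pattern in a PDF.
# Usage: count_colorspace_refs FILE.pdf PATTERN
count_colorspace_refs() {
  # -a treats the binary PDF as text, -o prints each match on its own line,
  # -F takes the pattern literally. Note that a pattern such as "8 0 R"
  # also matches inside "18 0 R", so treat the count as an upper bound.
  grep -a -o -F -- "$2" "$1" | wc -l | tr -d ' '
}
```

For example, count_colorspace_refs in.pdf '/ICCBased' counts ICC-based color space definitions, and count_colorspace_refs in.pdf '8 0 R' counts references to object 8.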
The takeaway is that converting a page from one color space to another without affecting the content is non-trivial. You need to be able to render the page, trap every change to the current color or color space, and substitute an equivalent in the target space. You also need to convert every image XObject that is in the wrong color space, which requires decoding the image data and re-encoding it in the target space. Form XObjects need converting too, and that task is similar to converting the parent page, since form XObjects (I think your doc has 4) also contain resources and a content stream of page-marking operators (which may include more XObjects).
It's certainly doable, but the process is nearly the same as rendering but with some fairly special-purpose code.
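To give a feel for the "trap and substitute" step, here is a toy sketch that rewrites only the simplest case: DeviceRGB fill and stroke operators (rg and RG) in an already-decompressed content stream, one operator per line. It is not what Ghostscript does. It ignores ICC-based spaces, the cs/scn operators, images and form XObjects, and the weights are the plain 0.30/0.59/0.11 luminance approximation rather than a color-managed conversion. The function name rgb_ops_to_gray is made up.

```shell
# Toy illustration only: read content-stream text on stdin and replace
# "r g b rg" / "r g b RG" with the DeviceGray operators "gray g" / "gray G".
rgb_ops_to_gray() {
  # LC_ALL=C keeps the decimal point a "." regardless of the user's locale.
  LC_ALL=C awk '
    NF == 4 && ($4 == "rg" || $4 == "RG") {
      gray = 0.30 * $1 + 0.59 * $2 + 0.11 * $3
      printf "%.3f %s\n", gray, ($4 == "rg" ? "g" : "G")
      next
    }
    { print }   # every other operator passes through unchanged
  '
}
```

Everything this sketch skips (nested resources, image data, named color spaces) is the part that makes the real job nearly as much work as rendering.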
gs \
-sDEVICE=pdfwrite \
-sProcessColorModel=DeviceGray \
-sColorConversionStrategy=Gray \
-dOverrideICC \
-o out.pdf \
-f page-27.pdf
The command above converts your file to grayscale (tested with GS 9.10).
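One way to check the result is Ghostscript's inkcov device, which prints per-page C, M, Y and K coverage. The sketch below rests on two assumptions: that each page is reported as a line of four numbers followed by "CMYK", and that gray content ends up in the K channel only. The function name pdf_is_gray is made up for this example.

```shell
# Hypothetical check: succeed (exit 0) when every page reports zero
# cyan, magenta and yellow coverage.
pdf_is_gray() {
  gs -q -o - -sDEVICE=inkcov "$1" | LC_ALL=C awk '
    # Assumed line shape: "C M Y K CMYK OK", one line per page.
    $5 == "CMYK" { pages++; if ($1 + $2 + $3 > 0) coloured++ }
    END { exit !(pages > 0 && coloured == 0) }
  '
}
```

For example: pdf_is_gray out.pdf && echo "no colour ink found".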
Use the most recent Ghostscript code (not yet released) and set ColorConversionStrategy=Gray.