linux - Using Ghostscript to convert multi-page PDF into single JPG?

2013-09-07
  • Andrew

    I know Ghostscript can convert PDFs to JPGs, and in the case of a multi-page PDF, can rip each page to an individual JPG. But is it possible to have it rip them to one JPG, so that the pages are pasted below each other, e.g. the top half of the JPG is page 1, the bottom half is page 2? Or do I have to use another program (and can ImageMagick do this?) to combine the JPG pages into one image?

  • Answers
  • Dennis Williamson

    Yes, you'll have to convert each PDF page into a single JPG file (Ghostscript can do that).

    Then stitch together the resulting JPG files using another program (ImageMagick or GraphicsMagick can do that with their montage sub-commands).

    I'm not aware of any software which can do that in one go.

    PDF-to-JPG conversion (with Ghostscript): You'll want the best possible result, so tweak the command-line options until they work for you. I'd start with this:

    gswin32c.exe ^
        -dBATCH ^
        -dNOPAUSE ^
        -dSAFER ^
        -sDEVICE=jpeg ^
        -dJPEGQ=95 ^
        -r600x600 ^
        -sOutputFile=c:/path/to/jpeg-dir/pdffile-%03d.jpeg ^
        c:/path/to/pdffile.pdf
    

    This will create JPGs called pdffile-001.jpeg, pdffile-002.jpeg etc. The parameter -dJPEGQ=95 sets the JPEG quality to 95%, and -r600x600 sets a resolution of 600x600 dpi. You may additionally need to control the page size of the resulting JPGs in case your Ghostscript's default doesn't fit your needs:

    gswin32c.exe ^
        -dBATCH ^
        -dNOPAUSE ^
        -dSAFER ^
        -sDEVICE=jpeg ^
        -dJPEGQ=95 ^
        -r600x600 ^
        -dPDFFitPage ^
        -dFIXEDMEDIA ^
        -dDEVICEWIDTHPOINTS=800 ^
        -dDEVICEHEIGHTPOINTS=600 ^
        -sOutputFile=c:/path/to/jpeg-dir/pdffile-%03d.jpeg ^
        c:/path/to/pdffile.pdf
    

    or

    gswin32c.exe ^
        -dBATCH ^
        -dNOPAUSE ^
        -dSAFER ^
        -sDEVICE=jpeg ^
        -dJPEGQ=95 ^
        -r600x600 ^
        -dPDFFitPage ^
        -dFIXEDMEDIA ^
        -sDEFAULTPAPERSIZE=a4 ^
        -sOutputFile=c:/path/to/jpeg-dir/pdffile-%03d.jpeg ^
        c:/path/to/pdffile.pdf
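
    Since the question is tagged linux, the first command above can be sketched for a POSIX shell as well. This is only a sketch: it assumes Ghostscript is installed as gs, and the function names jpeg_pattern and pdf_to_jpegs and all paths are placeholders that do not come from the original answer.

```shell
#!/bin/sh
# Sketch of a POSIX-shell equivalent of the Windows commands above.
# Assumption: Ghostscript is available as `gs`; names and paths are placeholders.

# Print the -sOutputFile pattern for a PDF and an output directory.
# Ghostscript itself expands %03d to the page number (001, 002, ...).
jpeg_pattern() {
    pdf=$1
    outdir=$2
    base=$(basename "$pdf" .pdf)
    printf '%s/%s-%%03d.jpeg\n' "$outdir" "$base"
}

# Render every page of a PDF to its own JPEG (default: 600 dpi).
pdf_to_jpegs() {
    pdf=$1
    outdir=$2
    dpi=${3:-600}
    gs -dBATCH -dNOPAUSE -dSAFER \
       -sDEVICE=jpeg -dJPEGQ=95 \
       -r"$dpi" \
       -sOutputFile="$(jpeg_pattern "$pdf" "$outdir")" \
       "$pdf"
}
```

    A call such as pdf_to_jpegs /path/to/pdffile.pdf /path/to/jpeg-dir would then write pdffile-001.jpeg, pdffile-002.jpeg and so on into that directory.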
    

    Multiple-to-single JPG stitching with montage (ImageMagick or GraphicsMagick): The montage command (this example uses ImageMagick's) lets you control the tiling pattern. With -tile 4x3, for example, you'd get this imposition layout:

    1  2  3  4    
    5  6  7  8    
    9 10 11 12    
    

    You could use this command to stitch together 12 individual JPGs into one:

    montage ^
        -border 0  ^
        -tile 4x3  ^
        c:/path/to/jpeg-dir/pdffile-*.jpeg  ^
        c:/path/to/final.jpg
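
    For the layout asked about in the question (pages pasted below each other), the tiling would be a single column. The sketch below computes a -tile value for any column count and uses it to stack the pages. The function names tile_spec and stack_pages are hypothetical, the pdffile-*.jpeg naming is taken from the Ghostscript step, and -geometry +0+0 is used on the assumption that the pages should keep their rendered size with no spacing.

```shell
#!/bin/sh
# Sketch: derive a montage -tile value and stack pages in one column.
# Assumption: ImageMagick's `montage` is on the PATH; names are placeholders.

# Print COLSxROWS for a page count and a column count (rows rounded up).
tile_spec() {
    pages=$1
    cols=$2
    rows=$(( (pages + cols - 1) / cols ))
    printf '%sx%s\n' "$cols" "$rows"
}

# Stack all pdffile-*.jpeg files from a directory into one tall JPEG.
stack_pages() {
    dir=$1
    out=$2
    # Zero-padded names (pdffile-001.jpeg, ...) sort in page order.
    set -- "$dir"/pdffile-*.jpeg
    [ -e "$1" ] || { echo "no JPEGs found in $dir" >&2; return 1; }
    montage -border 0 -tile "$(tile_spec "$#" 1)" -geometry +0+0 "$@" "$out"
}
```

    With 12 pages, tile_spec 12 4 reproduces the 4x3 layout shown above, while tile_spec 12 1 gives the 1x12 column used by stack_pages.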
    

    Of course, montage has many dozens of additional parameters which allow you to determine background, spacing, offsets, decoration, labels, rotation, cropping, caption etc. for the input and the resulting JPG.


    EDIT: (I had wanted to give this hint in my original answer, but forgot.) By default, montage uses tile sizes of 120x120 pixels. If you want to keep the original page size for each tile, you have to add -geometry to the command line. Assuming you had A4 (=595x842 pt) pages in your PDF and want to keep that size, while also adding a spacing of 11 pt in the horizontal and 22 pt in the vertical direction of the tiling (plus 4 pt gray border/frame lines around each tile), do this:

    montage ^
        -border 4 ^
        -tile 4x3 ^
        -geometry 595x842+11+22 ^
        c:/path/to/jpeg-dir/pdffile-*.jpeg ^
        c:/path/to/final.jpg
    

    EDIT 2: (I missed yet another important hint.) If you do not want the stitching/montage step to lose the image quality that your PDF-to-JPG conversion produced, also add the -quality 100 parameter to your command line, like this:

    montage ^
        -border 4 ^
        -tile 4x3 ^
        -geometry 595x842+11+22 ^
        -quality 100 ^
        c:/path/to/jpeg-dir/pdffile-*.jpeg ^
        c:/path/to/final.jpg
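
    One point worth checking here: montage reads the width and height in -geometry as pixels, so 595x842 matches an A4 page only at 72 dpi. Pages rendered at 600 dpi are far larger and would be scaled down to fit that tile. The sketch below converts points to pixels (pixels = points * dpi / 72) and builds a matching -geometry value. The helper names pt_to_px and tile_geometry are hypothetical.

```shell
#!/bin/sh
# Sketch: build a montage -geometry value in pixels from a page size in points.
# Assumption: helper names are placeholders, not ImageMagick features.

# Convert points to pixels at a given dpi, rounded to the nearest integer.
pt_to_px() {
    echo $(( ($1 * $2 + 36) / 72 ))
}

# Arguments: width_pt height_pt dpi x_spacing y_spacing
# Prints WIDTHxHEIGHT+X+Y with width and height in pixels.
tile_geometry() {
    printf '%sx%s+%s+%s\n' \
        "$(pt_to_px "$1" "$3")" "$(pt_to_px "$2" "$3")" "$4" "$5"
}
```

    For A4 pages rendered with -r600x600, tile_geometry 595 842 600 11 22 prints 4958x7017+11+22, which could be passed to -geometry in place of 595x842+11+22 to keep the full resolution.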
    
  • erjiang

    Since ImageMagick has support for Ghostscript built in, you can do the whole thing in one go:

    montage -tile 5 thispdfis25pages.pdf tiledoverview.jpg
    

    which will take every page and create one long jpeg of them end-to-end.
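
    A variant of this one-step approach for the single-column layout from the question might look like the sketch below. It rests on several assumptions: that ImageMagick can call Ghostscript as its PDF delegate, that -density (placed before the input) sets the rasterization resolution, and that -geometry +0+0 is wanted so the pages are not shrunk to montage's default thumbnail tiles. The function name pdf_to_strip and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch: rasterize a PDF and stack its pages vertically in one montage call.
# Assumption: `montage` is ImageMagick's, with a working Ghostscript delegate.

# Arguments: input.pdf output.jpg [dpi]
# Set DRY_RUN=1 to print the command instead of running it.
pdf_to_strip() {
    pdf=$1
    out=$2
    dpi=${3:-150}
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
    ${DRY_RUN:+echo} montage -density "$dpi" "$pdf" \
        -tile 1x -geometry +0+0 "$out"
}
```

    Running DRY_RUN=1 pdf_to_strip thispdfis25pages.pdf tiledoverview.jpg first shows the exact command before any large image is produced.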


  • Related Question

    jpeg - Imagemagick PDF to JPG conversion failing
  • Scott

    I'm trying to convert the first page of a PDF to a JPG. I'm pretty sure I got this to work with certain PDFs, but is it really possible that certain PDFs are made incorrectly and cannot be converted?

    I tried running this first:

    $ convert 10-03-26.pdf[1] test.jpg
    

    And I got the following:

    Error: /syntaxerror in readxref
    Operand stack:
    
    Execution stack:
       %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1   3   %oparray_pop   1   3   %oparray_pop   --nostringval--   --nostringval--   --nostringval--   --nostringval--   --nostringval--   --nostringval--
    Dictionary stack:
       --dict:1062/1417(ro)(G)--   --dict:0/20(G)--   --dict:73/200(L)--   --dict:73/200(L)--   --dict:97/127(ro)(G)--   --dict:229/230(ro)(G)--   --dict:14/15(L)--
    Current allocation mode is local
    ESP Ghostscript 7.07.1: Unrecoverable error, exit code 1
    convert: Postscript delegate failed `10-03-26.pdf'.
    

    Running this instead:

    $ convert -verbose -colorspace rgb '10-03-26.pdf[1]' test.jpg
    

    I get the following:

    Error: /syntaxerror in readxref
    Operand stack:
    
    Execution stack:
       %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1   3   %oparray_pop   1   3   %oparray_pop   --nostringval--   --nostringval--   --nostringval--   --nostringval--   --nostringval--   --nostringval--
    Dictionary stack:
       --dict:1062/1417(ro)(G)--   --dict:0/20(G)--   --dict:73/200(L)--   --dict:73/200(L)--   --dict:97/127(ro)(G)--   --dict:229/230(ro)(G)--   --dict:14/15(L)--
    Current allocation mode is local
    ESP Ghostscript 7.07.1: Unrecoverable error, exit code 1
    "gs" -q -dBATCH -dSAFER -dMaxBitmap=500000000 -dNOPAUSE -dAlignToPixels=0 "-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-g792x1611" "-r72x72" -dFirstPage=2 -dLastPage=2 "-sOutputFile=/tmp/magick-XXU3T44P" "-f/tmp/magick-XXoMKL8Z" "-f/tmp/magic2eec1F"Start of Image
    Define Huffman Table 0x00
              0   1   5   1   1   1   1   1
              1   0   0   0   0   0   0   0
    Define Huffman Table 0x01
              0   3   1   1   1   1   1   1
              1   1   1   0   0   0   0   0
    Define Huffman Table 0x10
              0   2   1   3   3   2   4   3
              5   5   4   4   0   0   1 125
    Define Huffman Table 0x11
              0   2   1   2   4   4   3   4
              7   5   4   4   0   1   2 119
    End Of Image
    convert: Postscript delegate failed `10-03-26.pdf'.
    

    Why would the conversion fail?

    Just as an aside, this is happening on a (gs) Grid-Service on (mt) Media Temple hosting. I cannot install programs on the server, but both ImageMagick and Ghostscript are installed.

    Thanks!


  • Related Answers
  • Scott

    The issue was that the files need to be made compatible with Acrobat 5.0 in order to work with such an old version of Ghostscript.
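
    Acrobat 5.0 compatibility corresponds to PDF version 1.4, and the version is stated in the first bytes of the file (for example %PDF-1.6). A rough way to check a file before handing it to an old Ghostscript, and to down-convert it on another machine with a newer Ghostscript, is sketched below. The function names are hypothetical, and head -c is assumed to be available (it is on common Linux and BSD systems).

```shell
#!/bin/sh
# Sketch: inspect the PDF header version and down-convert to PDF 1.4.
# Assumption: function names are placeholders; `gs` in downgrade_pdf must be
# a Ghostscript recent enough to read the input file.

# Print the version from the "%PDF-x.y" header, e.g. "1.6".
pdf_version() {
    head -c 8 "$1" | sed -n 's/^%PDF-\([0-9]\.[0-9]\).*/\1/p'
}

# Exit 0 if the header version is newer than 1.4, 1 if not, 2 if unreadable.
needs_downgrade() {
    v=$(pdf_version "$1")
    [ -n "$v" ] || return 2
    major=${v%.*}
    minor=${v#*.}
    [ "$major" -gt 1 ] || [ "$minor" -gt 4 ]
}

# Rewrite a PDF as version 1.4 using Ghostscript's pdfwrite device.
downgrade_pdf() {
    gs -dBATCH -dNOPAUSE -dSAFER -sDEVICE=pdfwrite \
       -dCompatibilityLevel=1.4 -sOutputFile="$2" "$1"
}
```

    The header only records the declared version, so this is a first check rather than a guarantee. Note also that ImageMagick page indexes start at zero: the logged -dFirstPage=2 shows that 10-03-26.pdf[1] selected the second page, so the first page would be 10-03-26.pdf[0].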