Re: Service manual scan post processing


 

I use Ghostscript to compress PDFs, among many other PDF operations. I never notice any quality difference in the output after compression with the default settings. I'm usually working with music that was poorly scanned but has a lot of fine detail that needs to stay easily readable afterwards.

I also use ScanTailor Advanced for fixing up the initial scans. It fixes rotation / skew problems, can do page splitting (if you scan or photograph 2 pages of a book simultaneously and want to split those images into separate PDF pages), re-adjusts margins, de-speckles, and has an adjustable threshold for converting from grayscale to true black and white. Changing that threshold has the effect of fattening or thinning all of the lines in the document. I find that, especially with poorly scanned music, thick lines fill up the white space and make the page much harder to read later, so adjusting the threshold at the grayscale -> b/w conversion can really help thin the lines down and open the whitespace back up, increasing legibility a lot.

Both Ghostscript and ScanTailor Advanced are free and open source. My workflow usually involves various other Linux TIFF / PDF utilities as well, including PDFJam, which is a wrapper for some LaTeX PDF utilities.

My typical workflow is:
1. Start with a poorly scanned input PDF (pages not straight / skewed, margins all over the place)
2. Convert the PDF into a multi-page TIFF file
3. Run it through ScanTailor, which produces an individual TIFF per page
4. Recombine those individual page TIFFs into a multi-page TIFF
5. Convert the multi-page TIFF to PDF
6. Compress with Ghostscript if needed
7. Use PDFJam to either "n-up" (for only 2 pages) or "pdfbook"-ize (for more than 2 pages) the resulting 8.5x11 PDF onto 11x17 paper
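
The non-interactive steps above can be sketched as a handful of shell functions. This is only a sketch under assumptions: the function names, the file names, the 600 dpi resolution, and the use of libtiff's tiffcp and tiff2pdf are illustrative choices, not tools named in this post. The ScanTailor step is left as a manual GUI step, and only the 2-up case of PDFJam is shown. Setting DRY_RUN=1 prints each command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch of the scan clean-up workflow. All function names are hypothetical.
# Assumed tools: gs (Ghostscript), tiffcp and tiff2pdf (libtiff), pdfjam.

# Print a command, then run it unless DRY_RUN=1 is set.
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Convert a PDF into one multi-page grayscale TIFF.
# The 600 dpi resolution is an assumption; pick what suits the scan.
pdf_to_tiff() {    # usage: pdf_to_tiff input.pdf output.tif
    run gs -dBATCH -dNOPAUSE -q -sDEVICE=tiffgray -r600 -sOutputFile="$2" "$1"
}

# (Manual step: clean up the TIFF in ScanTailor Advanced, which writes
# one TIFF per page into its output directory.)

# Recombine the per-page TIFFs into a single multi-page TIFF.
combine_tiffs() {  # usage: combine_tiffs output.tif page1.tif page2.tif ...
    local out=$1
    shift
    run tiffcp "$@" "$out"
}

# Convert the multi-page TIFF back to a PDF.
tiff_to_pdf() {    # usage: tiff_to_pdf input.tif output.pdf
    run tiff2pdf -o "$2" "$1"
}

# Place two pages side by side on 11x17 paper.
# The paper-size and orientation options are assumptions; check the result.
two_up() {         # usage: two_up input.pdf output.pdf
    run pdfjam --nup 2x1 --landscape --papersize '{11in,17in}' --outfile "$2" "$1"
}
```

For example, after sourcing the file, `DRY_RUN=1 pdf_to_tiff scan.pdf pages.tif` prints the Ghostscript command without needing any of the tools installed.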

The main command I use with Ghostscript to compress PDFs is:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=./out.pdf [list of input files - .ps or .pdf]

I often see original PDFs that are a few hundred MB uncompressed shrink down to 1 or 2 MB, with no discernible loss of quality.
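
To check how much a given file actually shrinks, the command can be wrapped in a small function that reports the sizes before and after. This is a minimal sketch: the function names are hypothetical, and the optional preset argument uses Ghostscript's -dPDFSETTINGS option (screen, ebook, printer, prepress), which the command above does not use since it relies on the defaults.

```shell
#!/usr/bin/env bash
# Sketch of a compression wrapper. Function names are hypothetical.

# Percentage saved, given sizes in bytes before and after (integer arithmetic).
percent_saved() {  # usage: percent_saved BEFORE AFTER
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Compress IN to OUT with the pdfwrite device, then report the size change.
# With no PRESET this is the same command as above, applied to a single file.
compress_pdf() {   # usage: compress_pdf IN OUT [PRESET]
    local in=$1 out=$2 preset=${3:-}
    local args=(-dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite)
    if [ -n "$preset" ]; then
        # Optional and not part of the original command: a downsampling preset.
        args+=("-dPDFSETTINGS=/$preset")
    fi
    gs "${args[@]}" -sOutputFile="$out" "$in" || return 1

    local before after
    before=$(( $(wc -c < "$in") ))
    after=$(( $(wc -c < "$out") ))
    echo "$in: $before -> $after bytes ($(percent_saved "$before" "$after")% smaller)"
}
```

Running `compress_pdf scan.pdf out.pdf` keeps the default settings. Since the presets downsample images, it is worth comparing fine print in the output before adopting one.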


On Mon, Feb 17, 2025 at 6:42 AM Peter Brown via <peter=penreach.com@groups.io> wrote:
I have recently been scanning sections of microfiched service manuals for a couple of group members using a Canon MS-800
There is a significant tradeoff between file size and readability (especially with circuit diagrams)
To simplify the scanning process I have been acquiring everything at maximum equipment resolution but this leads to files that might be 200 MB+ per fiche
These are unwieldy but get the job done
Does anyone in the group have experience with tools that might be used to post process these scans to reduce size whilst maintaining small font fidelity?
Any recommendations?
