
Re: Service manual scan post processing


 

I'm hesitant to bring this up because I'm only just beginning to understand it and build a workflow, but as an alternative to Adobe, the Google Cloud Vision API can OCR PDF files. According to ChatGPT, it does a better job than the various open source tools, though I don't know how it compares to Acrobat.

You need a Google Cloud or Workspace account. From there you set up a Cloud Storage bucket to hold the raw PDFs, then create an API key for the Vision API. A Python script can then call the Google APIs to convert each PDF to a text-only document. Most of the pain is getting the bucket and API set up with the right permissions and account info.
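For anyone curious what such a script might look like, here is a minimal sketch. It is not the script mentioned in this post. It assumes the `google-cloud-vision` client library is installed and that credentials are supplied through a service account key (the `GOOGLE_APPLICATION_CREDENTIALS` environment variable) rather than a bare API key. The function names, the timeout, and the batch size are my own choices for illustration.

```python
"""Sketch: OCR a PDF stored in a Cloud Storage bucket with the Vision API."""


def extract_text(output_json):
    """Join the page texts from one parsed Vision output file (a dict)."""
    pages = []
    for response in output_json.get("responses", []):
        text = response.get("fullTextAnnotation", {}).get("text", "")
        if text:  # skip pages where no text was detected
            pages.append(text)
    return "\n".join(pages)


def ocr_pdf(source_uri, dest_uri, timeout=600):
    """Run OCR on source_uri and wait until results are written under dest_uri.

    Both arguments are gs:// URIs. The timeout (seconds) is an assumption;
    long manuals may need more.
    """
    # Imported here so extract_text() works without the client library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    request = vision.AsyncAnnotateFileRequest(
        features=[
            vision.Feature(type_=vision.Feature.Type.DOCUMENT_TEXT_DETECTION)
        ],
        input_config=vision.InputConfig(
            gcs_source=vision.GcsSource(uri=source_uri),
            mime_type="application/pdf",
        ),
        output_config=vision.OutputConfig(
            gcs_destination=vision.GcsDestination(uri=dest_uri),
            batch_size=20,  # pages per output JSON file (illustrative value)
        ),
    )
    operation = client.async_batch_annotate_files(requests=[request])
    operation.result(timeout=timeout)  # blocks until the OCR job finishes
```

A call would look something like `ocr_pdf("gs://my-bucket/manual.pdf", "gs://my-bucket/ocr/manual-")`, where the bucket and file names are placeholders. The service writes JSON files under the destination prefix; after downloading one and loading it with `json.load`, `extract_text` pulls out the plain text.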

Believe it or not, I used ChatGPT to walk me through the whole process and even write the Python script! (Which I'm happy to share.)

Google lets you process 1000 pages per month for free, and it's an additional $1.50/1000 pages thereafter. But I found that my Google Workspace account gave me a $300 credit, so I can do a lot of conversion before I have to pay any real money.
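To put numbers on that, here is a rough cost estimate using only the figures quoted above (1000 free pages per month, $1.50 per 1000 pages after that). Treat the rates as assumptions, since Google's pricing may change; the function names are just for illustration.

```python
FREE_PAGES_PER_MONTH = 1000   # free tier, as quoted above
PRICE_PER_1000_PAGES = 1.50   # US dollars, as quoted above


def estimate_cost(pages_in_month):
    """Return the estimated charge in dollars for one month's pages."""
    billable = max(0, pages_in_month - FREE_PAGES_PER_MONTH)
    return billable / 1000 * PRICE_PER_1000_PAGES


def pages_covered_by_credit(credit_dollars):
    """Return how many billable pages a credit of this size would pay for."""
    return int(credit_dollars / PRICE_PER_1000_PAGES * 1000)


print(estimate_cost(5000))           # 4000 billable pages -> 6.0
print(pages_covered_by_credit(300))  # 200000
```

So a 5000-page month would come to about $6, and a $300 credit would cover roughly 200,000 pages beyond the free tier.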

Anyway, this may be too far down the rabbit hole, but it looks like it would work well for processing large numbers of documents automatically. Even at $1.50 per thousand pages, it's pretty inexpensive.

John
----

On 2/17/25 10:13, Peter Brown via groups.io wrote:
Thanks, Alexandre. I will take a look.
Seems like Acrobat Pro v11 is no longer supported. Any experience with their current product?
