Hi Dave,
My comments are embedded below.
On Tue, Sep 1, 2020 at 01:46 AM, Dave Brown wrote:
I am of the minority opinion here. The file is linear. You grab the scroll bar
and scroll to where you need to be. For me it's quick.
That's true when looking for Tek part numbers. The added value of OCR here would be the *reverse* use: "What's the Tek no. of a 2N4275?" That is, unless a reverse list is available.
I'm just waiting for someone to post an OCR'd/searchable file.
I was trying to find out whether posting such a file was acceptable; see my earlier messages.
I would like to copy and paste a couple of pages and see how well it matches the RPR.
The OCR'ed pages look no different from the originals, because the searchable text is normally an invisible layer "behind" the visible page image.
Believe me, I tried different tools and the accuracy was absolutely awful. My opinion
is that a false search is worse than no search, hence the lack of OCR. I don't
think microfiched 132 column computer printout works all that well for this
purpose.
After looking further into my results this evening I have to agree with you, unfortunately.
Tektronix gave the museum a release. I personally spent 80+ hours scanning
these documents and put a lot of care into them. I have an OCR version of the
670 RPR. Here's what the part numbers look like when I copy and paste to see
what's really there:
670-0070-01
67o-oo71-00
67(R)07~:.:oo
670-0073-00
:67(R)07S~tr
A full search of this document is taking a very long time and it will never
find 670-0071-00. Your mileage may vary. I chose to use the scroll bar as it
is faster and accurate. Somebody prove me wrong. I'd like to know what tool
you used that gave accurate results.
As I wrote above, my results have on further study proven quite disappointing as well. One category of items, such as module numbers and serial number groups, came out fairly well, but that's not what the effort was for. Searches on e.g. a JEDEC number are far less successful, and that is what I would consider an important benefit. Obviously, I didn't try "looking for" Tek numbers, because that's not the interesting case: simple scrolling gets you there. That is never what OCR'ing is about. It is for searches like the example I gave above.
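For illustration only, here is a minimal Python sketch of why an exact search never finds 670-0071-00 in the pasted sample, and how far a simple cleanup gets. It is not a tool anyone in this thread used. The look-alike table (o/O for 0, l/I for 1, S for 5) and the function names are assumptions made up for this example.

```python
import re

# The five lines as pasted from the OCR'd 670 RPR earlier in this message.
ocr_lines = [
    "670-0070-01",
    "67o-oo71-00",
    "67(R)07~:.:oo",
    "670-0073-00",
    ":67(R)07S~tr",
]

# Assumed letter-for-digit confusions; illustrative, not exhaustive.
LOOKALIKES = str.maketrans({"o": "0", "O": "0", "l": "1", "I": "1", "S": "5"})


def normalize(text):
    """Replace the assumed look-alike letters, then keep digits only."""
    return re.sub(r"[^0-9]", "", text.translate(LOOKALIKES))


def exact_hits(query, lines):
    """Plain substring search, as a PDF viewer's Find would do."""
    return [line for line in lines if query in line]


def normalized_hits(query, lines):
    """Substring search after normalizing both the query and each line."""
    target = normalize(query)
    return [line for line in lines if target in normalize(line)]


print(exact_hits("670-0071-00", ocr_lines))       # nothing found
print(normalized_hits("670-0071-00", ocr_lines))  # finds the 'o'-for-'0' line
print(normalized_hits("670-0072-00", ocr_lines))  # still nothing found
```

The cleanup recovers the line where only "o" stood in for "0", but the two heavily garbled lines stay unfindable no matter what is searched for, so this does not rescue the OCR results.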
It is my impression that the discussion in this thread has been hampered by a misunderstanding of what was to be achieved: smaller files, OCR, and if OCR, for what purpose. The latter two especially have caused confusion.
It turns out that OCR'ing the available files does not produce satisfactory results, as was stated early in the discussion. That aspect got confused with the question of why one would want OCR at all.
Raymond