On Fri, Mar 28, 2025 at 03:19 PM, Jon Templin wrote:
I've uploaded the file ibm_pubs_pdf_metadata.zip, which contains the data from the full run over all the files.
A small sample is elsewhere in this thread.
I wrote the code quickly today because I'm leaving town on vacation and I may or may not have access to this forum while away.
I forgot to add that I dropped "Keywords" from the final JSON because the vast majority of the files had no such data. In the few files that did, it just duplicated the "DocumentNumber" data, so I didn't deem it useful.
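
For anyone who wants to reproduce that filtering step, here is a minimal sketch in Python. It assumes each file's metadata is held as a plain dict before being written out. The function names and the sample values are made up for illustration and are not taken from the actual script.

```python
import json


def keywords_is_redundant(record: dict) -> bool:
    """True when "Keywords" is missing, empty, or just repeats "DocumentNumber"."""
    keywords = str(record.get("Keywords") or "").strip()
    doc_number = str(record.get("DocumentNumber") or "").strip()
    return not keywords or keywords == doc_number


def drop_keywords(record: dict) -> dict:
    """Return a copy of one metadata record without the "Keywords" field."""
    return {key: value for key, value in record.items() if key != "Keywords"}


# Hypothetical record; the values are placeholders, not real data from the run.
sample = {
    "Title": "Sample manual",
    "DocumentNumber": "X00-0000-00",
    "Keywords": "X00-0000-00",
}

if keywords_is_redundant(sample):
    sample = drop_keywords(sample)

print(json.dumps(sample, indent=2))
```

Running the redundancy check over every record first is a cheap way to confirm that nothing useful is lost before removing the field everywhere.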