By: [ Editor ] Asked

How to determine if PDFs are searchable

Does anyone know of a utility that can scan a directory of PDFs and identify which ones are image-only and which are searchable?

Viewed 602 times. Latest activity about 1 year ago.

2 answers

  • 1

rowan [ Admin ]

I can't name a utility that does this off the top of my head, but here is a tip for determining programmatically whether a file is searchable.

Basically, to determine whether a PDF is searchable, check whether it has any font resources. If it has none, you can presume that it doesn't contain any searchable text.

So you could build your own utility that scans all the PDFs in a directory and uses a PDF developer library to quickly check each one for font resources.
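As a rough illustration of that idea, here is a minimal Python sketch. It assumes the third-party pypdf package as the "PDF developer library" (my choice, not something named in the answer), and the function names are made up for this example. It only looks at each page's own /Resources dictionary, so fonts used solely inside nested form XObjects would be missed, and a font being present is only a hint that there is searchable text, not proof.

```python
import sys
from pathlib import Path


def _resolve(obj):
    # pypdf objects (including indirect references) expose get_object();
    # plain dicts do not, so they are returned unchanged.
    return obj.get_object() if hasattr(obj, "get_object") else obj


def resources_have_fonts(resources):
    """Return True if a page /Resources dictionary lists at least one font."""
    if resources is None:
        return False
    fonts = _resolve(resources).get("/Font")
    if fonts is None:
        return False
    return len(_resolve(fonts)) > 0


def pdf_has_fonts(path):
    """Return True if any page of the PDF declares a font resource."""
    from pypdf import PdfReader  # assumption: installed via `pip install pypdf`

    reader = PdfReader(str(path))
    return any(resources_have_fonts(page.get("/Resources")) for page in reader.pages)


def classify_directory(directory):
    """Map each *.pdf file name in the directory to a verdict string."""
    results = {}
    for pdf in sorted(Path(directory).glob("*.pdf")):
        try:
            results[pdf.name] = "searchable" if pdf_has_fonts(pdf) else "image only"
        except Exception as exc:  # encrypted or damaged files end up here
            results[pdf.name] = f"unreadable ({exc})"
    return results


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, verdict in classify_directory(target).items():
        print(f"{name}: {verdict}")
```

Saved under a name of your choosing, the script takes the directory as its only argument and prints one verdict per PDF. Anything it reports as "image only" is a candidate for OCR; anything it reports as "searchable" is worth a spot check if accuracy matters.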
