By: [ Editor ] Asked

How to programmatically count the number of pages in a PDF?

I need a way to programmatically count the number of pages in PDF files. I'm working in a Windows environment, and I'd rather not use any third-party libraries.

Any suggestions?

Viewed 1,095 times. Latest activity over 1 year ago.

1 answer

rowan [ Admin ]

If you don't want to use any third-party libraries, then you'll need to parse the internals of the PDF yourself.

Some background: each page in a PDF is represented by a page object. The page object is a dictionary which includes references to the page's content and other attributes. The individual page objects are tied together in a structure called the page tree.

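To count the number of pages, you can scan the PDF as if it were a text file and count the page objects, i.e. the dictionaries whose /Type entry is /Page. Be careful not to count /Type /Pages as well: that marks the nodes of the page tree, not the pages themselves, and a naive search for "/Page" will match both. The total number of /Type /Page entries will normally equal the total number of pages in the document. One caveat: this only works when the page objects are stored as plain text. PDF 1.5 and later allow objects to be packed into compressed object streams, and in those files the entries aren't visible until the streams have been decompressed.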

I've included an example of what one of these /Page entries might look like here:

10 0 obj    % <-- Page object
<</Type /Page
/Parent 5 0 R
/Resources 20 0 R
/Contents 40 0 R
>>
endobj
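
As a rough illustration of the scan described above, here's a minimal sketch in Python using only the standard library. The function names (count_pages, count_pages_in_file) are my own, not from any PDF API, and it assumes the page objects are stored uncompressed and the file isn't encrypted. It will undercount on files that use compressed object streams.

```python
import re

# Matches "/Type /Page" with optional whitespace between the two names.
# The negative lookahead rejects "/Type /Pages" (page tree nodes).
PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def count_pages(pdf_bytes: bytes) -> int:
    """Count page objects in raw PDF bytes.

    Assumption: page dictionaries are plain text, i.e. not inside
    compressed object streams and not encrypted.
    """
    return len(PAGE_OBJECT.findall(pdf_bytes))


def count_pages_in_file(path: str) -> int:
    """Read a PDF in binary mode and count its page objects."""
    with open(path, "rb") as f:
        return count_pages(f.read())
```

You'd call it with something like count_pages_in_file(r"C:\docs\example.pdf") (a made-up path). Opening the file in binary mode matters, because a PDF usually contains binary stream data that would break a text-mode read.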

For more information about this I'd recommend taking a look at the PDF specification.

NN comments
pdf seeder

Thanks for the answer, very useful!