
Analyze PDF - Get PDF Page Text (Formatted)


I am curious whether anyone has insight into how the PDF text extraction works behind the scenes. The results this command produces differ from those of every other method I have used to extract text from PDF files in code. In some cases that's a good thing, in other cases not so much. Just curious if anyone has any ideas.


3 replies

@andy Brommel, you want us to share our top secrets?

@Ivgeni Rapoport, what can we share on this?

 


Kryon's "read PDF" command works on searchable PDFs, that is, PDFs that contain a text layer. The command extracts this layer into a string variable. The "Formatted" option adds tabs and newlines to the string.
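
To illustrate the difference between the plain and "Formatted" output, here is a rough Python sketch. It is not Kryon's actual implementation, which has not been shared here. It assumes the text layer has already been read into positioned fragments. The `Fragment` type, the sample data, and the `line_tolerance` and `tab_gap` thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One piece of text from a PDF text layer (hypothetical structure)."""
    text: str
    x: float      # left edge of the fragment
    y: float      # vertical position, measured from the top of the page
    width: float  # rendered width of the fragment


def plain_text(fragments):
    """Join fragments in stored order with single spaces, ignoring layout."""
    return " ".join(f.text for f in fragments)


def formatted_text(fragments, line_tolerance=2.0, tab_gap=20.0):
    """Rebuild layout: newline between lines, tab across wide horizontal gaps.

    Both thresholds are assumed values, not taken from any real product.
    """
    # Group fragments into lines by comparing vertical positions.
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])

    rendered = []
    for line in lines:
        line.sort(key=lambda f: f.x)
        parts = [line[0].text]
        for prev, cur in zip(line, line[1:]):
            # Distance between the end of one fragment and the start of the next.
            gap = cur.x - (prev.x + prev.width)
            parts.append("\t" if gap >= tab_gap else " ")
            parts.append(cur.text)
        rendered.append("".join(parts))
    return "\n".join(rendered)


# Made-up fragments resembling two rows of an invoice.
SAMPLE = [
    Fragment("Invoice", 50, 100, 40),
    Fragment("No.", 94, 100, 18),
    Fragment("12345", 200, 100, 30),
    Fragment("Total", 50, 120, 28),
    Fragment("99.50", 200, 120, 30),
]

if __name__ == "__main__":
    print(repr(plain_text(SAMPLE)))
    print(repr(formatted_text(SAMPLE)))
```

With this sample, the plain version runs everything together on one line, while the formatted version keeps each row on its own line and puts a tab between the label and the value. That kind of difference may explain why the output looks unlike what other extraction libraries return.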


  • Author
  • 1 reply
  • August 29, 2022

Thanks for the insight! After I started digging "under the hood" (through the application folders) I believe I was able to gain a better understanding of how it works. I really appreciate the assistance.
