Knowledge extraction and annotation fails on PDF

If the formatting in a PDF isn’t just right, knowledge extraction with annotation fails. How do you get around this?

Hello @ken.domen :smiley:,

Welcome to Kore Community :clap:

Your query is about knowledge extraction from a PDF. As you mentioned, the content in the PDF has to follow a certain structure for the extraction to work.

To work around it, the only options are to modify the PDF itself, or to convert the text into structured data (i.e., CSV) and use that file for extraction instead.
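As a rough illustration of the CSV route, here is a minimal Python sketch. It assumes you have already copied the text out of the PDF and marked each pair with `Q:` and `A:` prefixes. The prefixes, the `qa_text_to_csv` helper, and the `question`/`answer` column names are all hypothetical, so please check the column layout the platform expects for CSV import before using it.

```python
import csv
import io


def qa_text_to_csv(text: str) -> str:
    """Convert 'Q: ... / A: ...' plain text into CSV text.

    Assumption: the 'Q:'/'A:' prefixes and the question/answer
    column names are illustrative, not a documented import format.
    """
    rows = []
    question = None
    answer_lines = []

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Q:"):
            # A new question closes the previous pair, if there was one.
            if question is not None:
                rows.append((question, " ".join(answer_lines)))
            question = line[2:].strip()
            answer_lines = []
        elif line.startswith("A:"):
            answer_lines = [line[2:].strip()]
        elif line and question is not None:
            # Wrapped line: continue the answer if one has started,
            # otherwise continue the question.
            if answer_lines:
                answer_lines.append(line)
            else:
                question += " " + line

    if question is not None:
        rows.append((question, " ".join(answer_lines)))

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["question", "answer"])  # hypothetical header row
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = (
        "Q: How do I reset my password?\n"
        "A: Open Settings,\n"
        "then choose Reset.\n"
        "Q: Is there a fee?\n"
        "A: No.\n"
    )
    print(qa_text_to_csv(sample))
```

The `csv` module quotes any field that contains a comma, so answers with punctuation stay in a single column.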

Thank you for stopping by!!