Document Extraction: Key Components
This page introduces the key components of document data extraction, also known as Intelligent Document Processing (IDP).
Where to start
Before you build a process that involves document understanding and extraction, make sure the OCR book has been learned in your department.
How Can I Extract Data From a Document?
You can use the following tools to extract information from a document:
- Query
- Tables
- Other ways to extract information from a document
  - Lines based
  - GPT (Large Language Model)
  - Labels
get the document
get the document's "invoice number", "total", "invoice date"
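The two lines above ask for three named fields from a document. To illustrate the general idea of label-based extraction outside Kognitos, here is a minimal Python sketch. It assumes the OCR text is already available as a plain string. The function `extract_fields`, the regular expressions in `LABEL_PATTERNS`, and the sample text are all hypothetical and are not part of Kognitos or a description of how it works internally.

```python
import re

# Hypothetical label patterns for illustration only; real documents
# will need patterns tuned to their own layouts.
LABEL_PATTERNS = {
    "invoice number": r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*(\S+)",
    "total": r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})",
    "invoice date": r"Invoice\s*Date\s*:?\s*(\d{4}-\d{2}-\d{2})",
}


def extract_fields(ocr_text, labels):
    """Return a dict mapping each requested label to its value, or None if not found."""
    results = {}
    for label in labels:
        match = re.search(LABEL_PATTERNS[label], ocr_text, flags=re.IGNORECASE)
        results[label] = match.group(1) if match else None
    return results


# Made-up sample text standing in for OCR output.
sample_text = """ACME Corp
Invoice Number: INV-1042
Invoice Date: 2024-03-15
Total: $1,250.00
"""

fields = extract_fields(sample_text, ["invoice number", "total", "invoice date"])
print(fields)
```

A field whose label is missing from the text comes back as None, so the caller can decide whether to fall back to another extraction method.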
Choosing an OCR Engine
Kognitos makes it easy to apply OCR to a document. The out-of-the-box OCR engines include:
- AWS Textract
- Azure
- OpenAI
As long as these Books have been learned, you can specify which OCR engine to use. Unless you specify otherwise, Textract is used by default. The following pages cover this in more detail.
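The default rule can be pictured with a small Python sketch. It is only an illustration of "use the requested engine if it is available, otherwise fall back to Textract". The names `choose_engine`, `DEFAULT_ENGINE`, and `AVAILABLE_ENGINES` are hypothetical and do not come from Kognitos.

```python
# Hypothetical names for illustration; not a Kognitos API.
DEFAULT_ENGINE = "AWS Textract"
AVAILABLE_ENGINES = ("AWS Textract", "Azure", "OpenAI")


def choose_engine(requested=None, learned=AVAILABLE_ENGINES):
    """Pick the OCR engine: the requested one if learned, else the default."""
    if requested is None:
        # Nothing was specified, so fall back to the default engine.
        return DEFAULT_ENGINE
    if requested not in learned:
        # The requested engine's Book has not been learned.
        raise ValueError(f"OCR engine not available: {requested}")
    return requested


print(choose_engine())         # no engine specified
print(choose_engine("Azure"))  # explicit choice
```

Raising an error for an engine that has not been learned, rather than silently using the default, is a design choice in this sketch and is not stated in the text above.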