Extract Tables
Extracts table data from documents, text, or files using AI-powered analysis.
Overview
This procedure extracts structured table data from various sources including documents, text, and files. It uses AI to identify and parse table structures, even when they don't have clear visual boundaries. You can customize the extraction with specific descriptions, models, and processing modes for optimal accuracy.
Make sure to add the Document Processing Book to your agent before using this automation procedure.
Syntax
Below is a line-by-line overview of the automation syntax.
the description is "table-description"
What does it do?
Describes the table structure and content to be extracted.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
✅ Yes — This phrase is required.
Does it require data?
✅ Yes — Replace table-description with a detailed description of the table you want to extract.
Example
the description is "A table with invoice number, date and amount columns"
the openai model is "openai-model"
What does it do?
Specifies the OpenAI model to use for extraction.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace openai-model with a valid OpenAI model name. The default is gpt-4o.
Example
the openai model is "gpt-4o"
the gemini model is "gemini-model"
What does it do?
Specifies the Gemini model to use for extraction.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace gemini-model with a valid Gemini model name. The default is gemini-2.5-pro.
Example
the gemini model is "gemini-2.0-flash"
the extraction mode is {"precise" | "no ocr"}
What does it do?
Specifies the extraction mode for improved accuracy.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Use "precise" to use location data for higher accuracy, or "no ocr" to skip OCR (Optical Character Recognition).
Example
the extraction mode is "precise"
the subdocument size is s
What does it do?
Specifies the maximum number of pages per subdocument for large documents.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace s with a number representing the page limit per subdocument.
Example
the subdocument size is 10
the strict mode is "mode"
What does it do?
Controls whether extraction runs in two passes with header validation or in a single pass without validation.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace mode with "on" for two-pass extraction with header validation, or "off" for single-pass extraction without validation. The default is "on".
Example
the strict mode is "off"
Examples
1. Extract Invoice Table with Precise Mode
2. Extract Simple Table from Text
3. Extract Table from Large Document
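As a minimal sketch of how these three cases might be written, the snippets below combine only the phrases documented in the Syntax section. The source names (the document, the text), the description strings, and the subdocument size of 10 are illustrative assumptions; substitute the source and values that apply to your automation.

For an invoice table with precise mode:

```
extract a table from the document
    the description is "A table with invoice number, date and amount columns"
    the extraction mode is "precise"
```

For a simple table from text, where only the required description is supplied:

```
extract a table from the text
    the description is "A table with product name and price columns"
```

For a large document, limiting each subdocument to an assumed 10 pages:

```
extract a table from the document
    the description is "A table with invoice number, date and amount columns"
    the subdocument size is 10
```

In each sketch the optional phrases are omitted unless the case calls for them, so the documented defaults (for example, strict mode "on") would apply.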
