Extracts table data from documents, text, or files using AI-powered analysis.
Overview
This procedure extracts structured table data from various sources including documents, text, and files. It uses AI to identify and parse table structures, even when they don't have clear visual boundaries. You can customize the extraction with specific descriptions, models, and processing modes for optimal accuracy.
Make sure to add the Document Processing Book to your agent before using this automation procedure.
Syntax
Below is a line-by-line overview of the automation syntax.
extract a table from {the source}
What does it do?
Begins table extraction from the specified source.
Where does it go?
This phrase should be written on a new line.
Is it required?
✅ Yes — This phrase is required.
Does it require data?
✅ Yes — Replace the source with the document, text, or file from which to extract the table.
the description is "table-description"
What does it do?
Describes the table structure and content to be extracted.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
✅ Yes — This phrase is required.
Does it require data?
✅ Yes — Replace table-description with a detailed description of the table you want to extract.
Example
the description is "A table with invoice number, date and amount columns"
the openai model is "openai-model"
What does it do?
Specifies the OpenAI model to use for extraction.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace openai-model with a valid OpenAI model name. The default is gpt-4o.
Example
the openai model is "gpt-4o"
the gemini model is "gemini-model"
What does it do?
Specifies the Gemini model to use for extraction.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace gemini-model with a valid Gemini model name. The default is gemini-2.5-pro.
Example
the gemini model is "gemini-2.0-flash"
the visual reference is the document
What does it do?
Specifies a visual reference to guide the extraction.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — A reference to the document must be defined in the automation.
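Unlike the other optional phrases, this one has no example of its own, so here is a minimal sketch of it in context. It assumes a document named the document is already defined earlier in the automation, and the description text is purely illustrative.

```text
extract a table from the document where
    the description is "A table with invoice number, date and amount columns"
    the visual reference is the document
```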
the extraction mode is {"precise" | "no ocr"}
What does it do?
Specifies the extraction mode for improved accuracy.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Use "precise" to apply location data for higher accuracy, or "no ocr" to skip OCR (Optical Character Recognition).
Example
the extraction mode is "precise"
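The "no ocr" value is written the same way. The sketch below assumes the source is already machine-readable text, so skipping OCR loses nothing. That assumption and the description text are illustrative and are not stated by the procedure itself.

```text
extract a table from the text where
    the description is "A table with employee names and their departments"
    the extraction mode is "no ocr"
```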
the subdocument size is s
What does it do?
Specifies the maximum number of pages in each subdocument when a large document is split for processing.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace s with a number representing the page limit per subdocument.
Example
the subdocument size is 5
the creativity is x
What does it do?
Adjusts the creativity of the response.
Where does it go?
Indented under extract a table from {the source}.
Is it required?
❌ No — This phrase is optional.
Does it require data?
✅ Yes — Replace x with a number between 0.0 and 1.0. Higher values produce more creative responses.
Example
the creativity is 0.2
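In a full call, the phrase sits alongside the required description. The sketch below uses a low value on the assumption that table extraction usually benefits from consistent rather than varied output. Both the value and the description are illustrative.

```text
extract a table from the document where
    the description is "Transaction history table with date, description, and amount columns"
    the creativity is 0.2
```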
Examples
1. Extract Invoice Table with Precise Mode
extract a table from the document where
    the openai model is "gpt-4o"
    the visual reference is the document
    the extraction mode is "precise"
    the description is "The table has invoice number, date and amount columns. Remove currency symbols from amounts. Keep only the first 4 digits of invoice numbers."
2. Extract Simple Table from Text
extract a table from the text where
    the description is "A table with employee names and their departments"
    the gemini model is "gemini-2.0-flash"
3. Extract Table from Large Document
extract a table from the document where
    the openai model is "gpt-4o"
    the subdocument size is 3
    the description is "Transaction history table with date, description, and amount columns"
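As a further sketch, several optional phrases can be stacked in one call. Every phrase below is documented in the Syntax section, but this particular combination (a Gemini model with a subdocument size and a creativity value) is an assumption and is not shown in the source. The values are illustrative.

```text
extract a table from the document where
    the description is "Transaction history table with date, description, and amount columns"
    the gemini model is "gemini-2.0-flash"
    the subdocument size is 5
    the creativity is 0.2
```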