Extract Data
Extracts data from sources such as texts, documents, or files using LLMs.
This procedure uses AI models to extract data from texts, documents, and files. It identifies and retrieves the requested content, making the information easy to access and work with. For example, it can be used to get information from a PDF, document, or other file.
Required
source
A variable that represents the information source to extract data from.
Examples: document, file, text.
field
A field to be extracted from the text or file.
You can add any number of fields, but at least 1 is required.
Examples: invoice number, date.
Optional
creativity
A number that controls the creativity of the response. Higher values produce more creative responses.
Default: 0.0
Range: 0.0-2.0.
model
The OpenAI model to use to generate the response.
Default: gpt-4o-latest
gemini model
Specifies the Gemini model to use to generate the response.
A model must be specified when using this field; no default is set.
Example: gemini-2.5-pro
output format
The desired format of the response.
Default: string
Allowed Values:
string
text
date
table
list of texts
list of numbers
list of dates
list of records
structured data
json
visual reference
A variable to represent a document or image that serves as a visual reference, helping GPT enhance its accuracy. This refers to data defined earlier in the automation.
Examples: the file, the document, the text.
field rule
The rule to be followed for the field extraction.
field default
The field's default value.
field format
The format of the fields that need to be extracted from the text.
Possible values: number, string, date.
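The constraints listed above (at least one field, creativity between 0.0 and 2.0, a fixed set of output formats and field formats) can be sketched as a small validation routine. This is only an illustrative Python sketch, not the procedure's actual implementation: the names `FieldSpec`, `ExtractDataConfig`, and `validate` are hypothetical, and only the defaults, ranges, and allowed values come from the parameter descriptions on this page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values copied from the parameter reference above.
OUTPUT_FORMATS = {
    "string", "text", "date", "table", "list of texts", "list of numbers",
    "list of dates", "list of records", "structured data", "json",
}
FIELD_FORMATS = {"number", "string", "date"}


@dataclass
class FieldSpec:
    """One field to extract (hypothetical container, for illustration only)."""
    name: str
    rule: Optional[str] = None      # "field rule"
    default: Optional[str] = None   # "field default"
    format: Optional[str] = None    # "field format"


@dataclass
class ExtractDataConfig:
    """Hypothetical bundle of the procedure's parameters."""
    source: str
    fields: List[FieldSpec]
    creativity: float = 0.0                 # documented default
    model: str = "gpt-4o-latest"            # documented default
    gemini_model: Optional[str] = None      # no default is set
    output_format: str = "string"           # documented default
    visual_reference: Optional[str] = None


def validate(config: ExtractDataConfig) -> List[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    if not config.source:
        problems.append("source is required")
    if not config.fields:
        problems.append("at least 1 field is required")
    if not 0.0 <= config.creativity <= 2.0:
        problems.append("creativity must be between 0.0 and 2.0")
    if config.output_format not in OUTPUT_FORMATS:
        problems.append(f"unknown output format: {config.output_format}")
    for spec in config.fields:
        if spec.format is not None and spec.format not in FIELD_FORMATS:
            problems.append(f"unknown field format for {spec.name}: {spec.format}")
    return problems


if __name__ == "__main__":
    config = ExtractDataConfig(
        source="the document",
        fields=[FieldSpec("invoice number"), FieldSpec("date", format="date")],
    )
    print(validate(config))
```

Collecting all problems in a list, rather than stopping at the first one, mirrors how a configuration form would typically report every invalid parameter at once; that choice is part of the sketch, not something the procedure is documented to do.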