Extract Data from Multiple Documents
Extracts data from multiple documents in a single LLM call using file upload.
Overview
This procedure extracts data from multiple documents at once: it uploads all of the documents to the LLM provider through the provider's Files API (OpenAI or Gemini) and then makes a single API call. Because the extraction runs as one request rather than one request per document, it is more efficient than processing documents individually, which makes it particularly useful for batch processing of large document sets.
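The difference between per-document and batched extraction can be sketched in Python. This is an illustration only, not the procedure's actual implementation: `FakeProvider` and its `upload` and `complete` methods are hypothetical stand-ins for a provider's Files API and completion endpoint, and they count calls instead of contacting a real service.

```python
from dataclasses import dataclass


@dataclass
class FakeProvider:
    """Hypothetical stand-in for an LLM provider; counts calls instead of making them."""
    uploads: int = 0
    completions: int = 0

    def upload(self, path: str) -> str:
        # Pretend to upload a file and return a made-up file ID.
        self.uploads += 1
        return f"file-{self.uploads}"

    def complete(self, file_ids: list[str], fields: list[str]) -> list[dict]:
        # Pretend to run one extraction request over the given files.
        self.completions += 1
        return [{"file_id": fid, **{name: None for name in fields}} for fid in file_ids]


def extract_individually(provider, paths, fields):
    # One upload and one completion call per document.
    results = []
    for path in paths:
        results.extend(provider.complete([provider.upload(path)], fields))
    return results


def extract_batched(provider, paths, fields):
    # Upload every document first, then make a single completion call.
    file_ids = [provider.upload(path) for path in paths]
    return provider.complete(file_ids, fields)
```

In both cases every document is uploaded once; only the number of completion calls changes, from one per document to one in total.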
Before using this procedure, ensure you have added the Document Processing Book to your agent. After learning the Book, make sure to create a new Playground for it to take effect.
Syntax
Below is a line-by-line overview of the automation syntax.
Examples
1. Extract Multiple Fields from Multiple Documents
the documents are the list of files
extract data from the documents
the openai model is "gpt-4o"
the first field is "invoice number"
the first field's format is "string"
the second field is "invoice date"
the second field's format is "date"
the third field is "total amount"
the third field's format is "number"
Returns a list of JSON objects, one for each document, containing the extracted fields.
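To illustrate the return shape, the Python sketch below parses a hypothetical result for two documents. The sample values are made up, and the exact key names, value formats, and ordering that the procedure returns are assumptions; the keys here simply mirror the field names from the example.

```python
import json

# Hypothetical result: one JSON object per input document (keys and order assumed).
raw_result = """
[
  {"invoice number": "INV-001", "invoice date": "2024-01-15", "total amount": 1250.0},
  {"invoice number": "INV-002", "invoice date": "2024-02-03", "total amount": 89.5}
]
"""

records = json.loads(raw_result)


def total_of(records):
    # Sum the numeric "total amount" field across all documents.
    return sum(r["total amount"] for r in records)


invoice_numbers = [r["invoice number"] for r in records]
grand_total = total_of(records)
```

Since each list entry corresponds to one document, downstream steps can iterate over the list to validate or aggregate fields per document.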
