Extract Subdocuments
This procedure extracts subdocuments from a document or file.
📘 Prerequisites
Learn the Document Processing Book before using this operation. After learning the Book, publish your Agent and create a new Playground so that the change takes effect.
Overview
This operation extracts one or more subdocuments from a document or file. A subdocument is a subsection of a larger document.
Syntax
The syntax for this operation begins with `extract subdocuments from the document`. Nested within this statement, you must specify the subdocuments to be extracted.
⚠️ If the subdocuments to be extracted are not specified, a question will be raised.
Ways to Specify Subdocuments
To specify which subdocuments should be extracted, include one or more of the following lines under `extract subdocuments from the document`:
Defining the Start Page Marker
Specifying the Subdocument Size
Specifying the Subdocument Overlap Size
OpenAI Model Specification
Data Components
This table lists the names of the data components in this operation and their properties.
the document | Yes | Required
the start page marker | No | Optional
the subdocument size | No | Optional
the subdocument overlap size | No | Optional
the openai model | No | Optional
Values
The table below lists the parameters in the operation. Parameters are placeholders for data values. In the operation's syntax, replace the parameters with your own data values as needed.
Parameter | Description | Example | Required/Optional
start marker | Text indicating where a subdocument should begin. | Section 2 | Optional
size | The maximum number of pages per subdocument. | 10 | Optional
overlap size | The number of pages that overlap between consecutive subdocuments. | 1 | Optional
model | The OpenAI model to use. | gpt-4o | Optional
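To make the interaction between size and overlap size concrete, the following Python sketch computes which pages would fall into each subdocument. It is an illustration only, not the platform's implementation: the function name `page_ranges` and its parameters are hypothetical, and it assumes that each subdocument holds at most `size` pages and repeats the last `overlap` pages of the previous one.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, size: int, overlap: int = 0) -> List[Tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges.

    Assumption (not confirmed by the product documentation): each
    subdocument holds at most `size` pages and shares `overlap` pages
    with the subdocument before it.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")

    step = size - overlap  # number of new pages each subdocument advances by
    ranges: List[Tuple[int, int]] = []
    first = 1
    while first <= total_pages:
        last = min(first + size - 1, total_pages)
        ranges.append((first, last))
        if last == total_pages:
            break  # the final page is covered, so stop
        first += step
    return ranges


if __name__ == "__main__":
    # A 25-page document with the example values from the table above.
    print(page_ranges(25, size=10, overlap=1))  # [(1, 10), (10, 19), (19, 25)]
```

Under these assumptions, a 25-page document with a size of 10 and an overlap size of 1 yields three subdocuments, with pages 10 and 19 each appearing in two of them.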
Examples
1. Using Start Page Marker and Additional Fields
2. Using Subdocument Size & Overlap Size
3. Using Start Marker and OpenAI Model