Extract a Subdocument
This procedure extracts a subdocument from a document or file.
Overview
A subdocument is a subsection of a larger document. You identify it either by page numbers or by textual markers that describe where it begins and ends.
Syntax
The syntax for this operation begins with the statement extract subdocument from the document. Nested within this statement, you must specify the subdocument to be extracted.
⚠️ If the subdocument to be extracted is not specified, a question will be raised.
Ways to Specify a Subdocument for Extraction
1. Page Numbers
Specify the start and/or end page numbers of the subdocument:
extract subdocument from the document
the start page is {start}
the end page is {end}
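As a rough illustration of how the two page numbers bound the subdocument, here is a minimal Python sketch. It is not the product's implementation: the function name `extract_by_page_numbers` and the list-of-pages model are hypothetical, and it assumes page numbers are 1-based and that the end page is part of the result. The defaults (first page, last page) follow the parameter descriptions in this document.

```python
from typing import List, Optional, Sequence


def extract_by_page_numbers(
    pages: Sequence[str],
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[str]:
    """Return pages start..end (hypothetical helper; 1-based, end inclusive)."""
    # Omitted start defaults to the first page, omitted end to the last page.
    first = 1 if start is None else start
    last = len(pages) if end is None else end
    # Convert the 1-based inclusive range to a Python slice.
    return list(pages[first - 1:last])


pages = ["p1", "p2", "p3", "p4", "p5", "p6"]
print(extract_by_page_numbers(pages, start=2, end=5))
print(extract_by_page_numbers(pages, start=4))
```

Under these assumptions, giving only the start page yields everything from that page to the end of the document, and giving neither value yields the whole document.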
2. Start and End Markers
Use textual markers to define where the subdocument begins and/or ends:
Included End Marker
extract subdocument from the document
the start page marker is "{start marker}"
the end page marker is "{end marker}"
Excluded End Marker
extract subdocument from the document
the start page marker is "{start marker}"
the excluded end page marker is "{excluded end marker}"
ℹ️ Included vs. Excluded End Markers
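An included end marker (written simply as the end page marker) keeps the page that matches the marker as the last page of the subdocument.
An excluded end marker stops the subdocument just before the page that matches the marker, so the matching page itself is left out.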
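The difference between the two end-marker kinds can be sketched in Python. This is only an illustration under stated assumptions, not how the product works internally: `extract_by_markers` is a hypothetical name, marker matching is modeled as a caller-supplied predicate per page, and the search for the end marker is assumed to begin on the page after the start page.

```python
from typing import Callable, List, Optional, Sequence

PageTest = Callable[[str], bool]


def extract_by_markers(
    pages: Sequence[str],
    is_start: PageTest,
    is_end: Optional[PageTest] = None,
    end_included: bool = True,
) -> List[str]:
    """Hypothetical model of marker-based extraction over a list of pages."""
    # Find the first page that matches the start marker.
    start = next((i for i, p in enumerate(pages) if is_start(p)), None)
    if start is None:
        return []
    # Without an end marker (or without a match), run to the last page.
    end = len(pages) - 1
    if is_end is not None:
        # Assumption: the end marker is looked for after the start page.
        for i in range(start + 1, len(pages)):
            if is_end(pages[i]):
                # Included: keep the matching page. Excluded: stop before it.
                end = i if end_included else i - 1
                break
    return list(pages[start:end + 1])


pages = ["Cover", "Introduction", "Body", "Conclusion", "Appendix"]
starts = lambda p: "Introduction" in p
ends = lambda p: "Conclusion" in p
print(extract_by_markers(pages, starts, ends, end_included=True))
print(extract_by_markers(pages, starts, ends, end_included=False))
```

In this model the included variant ends on the "Conclusion" page, while the excluded variant ends on the page before it.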
Optional: Specifying the OpenAI Model
To choose the OpenAI model used for the extraction, add the following line to the statement:
the openai model is "{model}"
Data
Components
This table lists the names of the data components in this operation and their properties.
the document | Yes | Required
the start page | No | Optional
the end page | No | Optional
the start page marker | No | Optional
the end page marker | No | Optional
the excluded end page marker | No | Optional
the openai model | No | Optional
Values
The table below lists the parameters in the operation. Parameters are placeholders for data values. In the operation's syntax, replace the parameters with your own data values as needed.
Parameter | Description | Example | Required
start | The starting page number. Defaults to the first page. | 2 | Optional
end | The ending page number. Defaults to the last page. | 5 | Optional
start marker | Text indicating where the subdocument should begin. | Section 2 | Optional
end marker | Text indicating where the subdocument should end, including the specified marker. | Page is the beginning of an invoice | Optional
excluded end marker | Text indicating where the subdocument should end, excluding the specified marker. | Page containing the text 'Conclusion' | Optional
Examples
1. Extracting by Page Numbers
extract subdocument from the file
the start page is 2
the end page is 5
2. Extracting by Page Markers
Included End Marker
extract subdocument from the report
the start page marker is "Page containing the text 'Introduction'"
the end page marker is "Page containing the text 'Conclusion'"
Excluded End Marker
extract subdocument from the manuscript
the start page marker is "Page is the beginning of an invoice"
the excluded end page marker is "Page is the beginning of a different invoice"
3. Extracting by Page Numbers with OpenAI Model
extract subdocument from the transcript
the start page is 2
the end page is 5
the openai model is "gpt-4o-mini"
4. Extracting by Start Marker Only
extract subdocument from the document
the start page marker is "Invoice 230320-01"