Extract Subdocuments

This procedure extracts subdocuments from a document or file.

📘 Prerequisites

Ensure you have learned the Document Processing Book before using this operation. After learning the Book, make sure to publish your Agent and create a new Playground for it to take effect.

Overview

This operation extracts one or more subdocuments from a document or file. A subdocument is a subsection of a larger document.

Syntax

The syntax for this operation begins with the statement "extract subdocuments from the document". Nested beneath this statement, on indented lines, you must specify the subdocuments to be extracted.

⚠️ If the subdocuments to be extracted are not specified, a question will be raised.

Ways to Specify Subdocuments

To specify which subdocuments should be extracted, include one or more of the following lines, indented, under the "extract subdocuments from the document" statement:

Defining the Start Page Marker

extract subdocuments from the document
	the start page marker is "{start marker}"
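
To make the idea of a start page marker concrete, here is a minimal Python sketch of splitting a document wherever a page matches the marker. It is only an illustration: the function name `split_at_markers`, the substring check, and the choice to keep pages that precede the first marker as their own group are all assumptions, and the actual operation may decide matches differently (for example, by using the configured OpenAI model).

```python
from typing import Callable, List


def split_at_markers(pages: List[str], is_start: Callable[[str], bool]) -> List[List[int]]:
    """Group 1-based page numbers, opening a new group on every page where is_start is true.

    Assumption: pages that come before the first matching page form a leading group.
    """
    groups: List[List[int]] = []
    for number, text in enumerate(pages, start=1):
        # Open a new group on a marker page, or on the very first page.
        if is_start(text) or not groups:
            groups.append([])
        groups[-1].append(number)
    return groups


# Hypothetical page texts; a simple substring test stands in for the marker check.
pages = ["Chapter 1 Intro", "more text", "Chapter 2 Methods", "details", "appendix notes"]
print(split_at_markers(pages, lambda text: "Chapter" in text))
```

In this sketch, pages 1 and 3 contain the word "Chapter", so the five pages are grouped as pages 1-2 and pages 3-5.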

Specifying the Subdocument Size

extract subdocuments from the document
	the subdocument size is {size}

Specifying the Subdocument Overlap Size

extract subdocuments from the document
	the subdocument size is {size}
	the subdocument overlap size is {overlap size}

OpenAI Model Specification

Optionally, the OpenAI model can be specified for the document extraction.

extract subdocuments from the document
	the openai model is "{model}"
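
If the statement is generated from a script rather than typed by hand, a small helper can keep the quoting consistent with the syntax shown on this page (quoted marker and model, unquoted sizes). The Python sketch below is hypothetical: `build_statement` and its parameters are illustrative names, not part of the product, and it does not escape double quotes inside values.

```python
from typing import Optional


def build_statement(
    source: str = "the document",
    start_marker: Optional[str] = None,
    size: Optional[int] = None,
    overlap: Optional[int] = None,
    model: Optional[str] = None,
) -> str:
    """Compose the statement text; each optional line is tab-indented under the first line."""
    lines = [f"extract subdocuments from {source}"]
    if start_marker is not None:
        lines.append(f'\tthe start page marker is "{start_marker}"')
    if size is not None:
        lines.append(f"\tthe subdocument size is {size}")
    if overlap is not None:
        lines.append(f"\tthe subdocument overlap size is {overlap}")
    if model is not None:
        # The whole model name goes inside the quotes.
        lines.append(f'\tthe openai model is "{model}"')
    return "\n".join(lines)


print(build_statement(source="the invoice", model="gpt-4o-mini"))
```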

Data

Components

This table lists the names of the data components in this operation and their properties.

| Label | Renamable | Required |
| --- | --- | --- |
| the document | Yes | Required |
| the start page marker | No | Optional |
| the subdocument size | No | Optional |
| the subdocument overlap size | No | Optional |
| the openai model | No | Optional |

Values

The table below lists the parameters in the operation. Parameters are placeholders for data values. In the operation's syntax, replace the parameters with your own data values as needed.

| Parameter | Description | Example Value | Required |
| --- | --- | --- | --- |
| start marker | Text indicating where a subdocument should begin. | Section 2 | Optional |
| size | The maximum number of pages per subdocument. | 10 | Optional |
| overlap size | The number of pages that overlap between consecutive subdocuments. | 1 | Optional |
| model | The OpenAI model used for the extraction. | gpt-4o | Optional |

Examples

1. Using Start Page Marker and Additional Fields

extract subdocuments from the file
	the start page marker is "Page containing the text 'Chapter'"
	the first field is "invoice number"
	the first field's format is "number"
	the second field is "invoice date"
	the second field's format is "string"

2. Using Subdocument Size & Overlap Size

extract subdocuments from the report
	the subdocument size is 5
	the subdocument overlap size is 1
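
To see how a size and an overlap size could interact, the following Python sketch computes page ranges for a sliding window. It assumes each subdocument starts `size - overlap` pages after the previous one and that the last subdocument may be shorter; the page does not spell out the exact splitting rule, so treat this as one plausible reading. The name `subdocument_ranges` and the 12-page report are hypothetical.

```python
from typing import List, Tuple


def subdocument_ranges(total_pages: int, size: int, overlap: int = 0) -> List[Tuple[int, int]]:
    """Return inclusive 1-based (first_page, last_page) ranges.

    Assumption: consecutive ranges share exactly `overlap` pages,
    so each range starts size - overlap pages after the previous one.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")
    step = size - overlap
    ranges: List[Tuple[int, int]] = []
    start = 1
    while start <= total_pages:
        end = min(start + size - 1, total_pages)
        ranges.append((start, end))
        if end == total_pages:
            break  # the final page is covered; stop before emitting a redundant tail
        start += step
    return ranges


# Size 5 and overlap 1, as in the example above, on a hypothetical 12-page report.
print(subdocument_ranges(12, size=5, overlap=1))
```

Under these assumptions, the 12-page report would be split into pages 1-5, 5-9, and 9-12, with pages 5 and 9 each appearing in two subdocuments.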

3. Using Start Page Marker and OpenAI Model

extract subdocuments from the invoice
	the start page marker is "Page is the beginning of a new invoice"
	the openai model is "gpt-4o-mini"

