
Document Processing: Workflow Patterns and Best Practices

Discover how to design efficient document processing workflows in Kognitos.

Last updated 1 day ago


Overview

This guide outlines best practices for designing document processing workflows, including structuring automations for single versus multiple document inputs. It covers when to use subprocesses and when to apply run parallelization.

Single Document Input

When the automation handles one document at a time, focus on clean, modular design. Use subprocesses to separate distinct stages, such as extraction, processing, and output creation, for clarity and reusability. A typical document processing workflow might look like:

# Step 1: Get attachments 
get the attachments 

# Step 2: Extract data 
extract the sales data with
  the attachment
get the answer as the data 

# Step 3: Process data 
process the sales data with
  the data 
get the answer as the sales record 

# Step 4: Create Sales Order 
create a sales order in oracle with
  the sales record 
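
For readers who think in general-purpose code, the same stage separation can be sketched as a Python analogy. This is not Kognitos syntax and not an Oracle API: every function name and data shape below is hypothetical, and each function merely stands in for one subprocess.

```python
# Illustrative analogy only: each function stands in for one subprocess.
# All names and data shapes are hypothetical, not Kognitos or Oracle APIs.

def extract_sales_data(attachment: str) -> dict:
    """Extraction stage: turn "Key: value" lines into a dict of fields."""
    fields = {}
    for line in attachment.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def process_sales_data(data: dict) -> dict:
    """Processing stage: normalize extracted fields into a sales record."""
    return {"customer": data["customer"], "total": float(data["total"])}


def create_sales_order(record: dict) -> str:
    """Output stage: stand-in for order creation; returns a summary string."""
    return f"Sales order for {record['customer']}: {record['total']:.2f}"


def run_workflow(attachment: str) -> str:
    """Chain the stages; each one can be tested or reused on its own."""
    data = extract_sales_data(attachment)
    record = process_sales_data(data)
    return create_sales_order(record)
```

Because each stage has a single input and a single output, a failure can be traced to one stage, and a stage can be swapped without touching the others.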

Multi-Document Input

When the automation processes multiple documents, you can choose between three approaches:

  • Calling a Subprocess
  • Invoking a Subprocess
  • Start Parallel Runs

Choosing the right approach ensures your workflow is scalable, efficient, and easy to troubleshoot.

Typically, calling or invoking a subprocess is used when documents are independent and can be processed separately. Run parallelization is used when documents are related and their outputs need to be combined or aggregated. For a detailed comparison, refer to Which Mechanism Should I Use?
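
The difference between handling documents independently and combining their outputs can be sketched as a Python analogy. This is only an illustration under stated assumptions: the function names and the document shape are hypothetical, and the thread pool merely stands in for the idea of parallel runs; it does not describe how Kognitos implements subprocesses or run parallelization.

```python
# Illustrative analogy only; all names and data shapes are hypothetical.
from concurrent.futures import ThreadPoolExecutor


def process_document(doc: dict) -> dict:
    """Per-document work: sum the amounts found in one document."""
    return {"name": doc["name"], "total": sum(doc["amounts"])}


def process_independently(docs: list) -> list:
    """Independent documents: each result stands alone, nothing is merged."""
    return [process_document(doc) for doc in docs]


def process_in_parallel_and_combine(docs: list) -> dict:
    """Related documents: process concurrently, then aggregate the outputs."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process_document, docs))
    return {
        "documents": len(results),
        "grand_total": sum(result["total"] for result in results),
    }
```

In the first function a failure affects only one document's result; in the second, the aggregation step needs every result before it can produce its output, which is why the combined case calls for a mechanism that collects all runs.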