Learn how to operate on PDF files in Kognitos.
PDF Operations
Getting Information from a PDF
Getting a Label
To retrieve a label from a PDF, use the following syntax:
get the pdf
get the pdf's label where
    the label is "Invoice Number"
Getting a Field
To retrieve a field from a PDF, use the following syntax:
get the pdf
get the pdf's field where
    the field contains "Date"
Setting or Changing Information in a PDF
This section covers how to set or change field values in a PDF file.
Setting a Field to a String
To set a field in a PDF to a string value, use the following syntax:
get the pdf
the field is "Date"
set the pdf's field to "2023-01-01"
Changing a Field to a String
To change a field in a PDF to a string value, use the following syntax:
get the pdf
the field is "Name"
change the pdf's field to "John Doe"
Setting a Field to a Number
To set a field in a PDF to a numeric value, use the following syntax:
get the pdf
the field name is "Total Amount"
set the pdf's field to 150
Changing a Field to a Number
To change a field in a PDF to a numeric value, use the following syntax:
get the pdf
the field name is "Page Count"
change the pdf's field to 5
Saving a PDF
This section covers how to save a PDF file after making changes.
Saving to a Local Path
To save a PDF to a local path, use the following syntax:
get the pdf
save the pdf to a file with
    the target is "/local/path/to/save/the/pdf"
Saving to an S3 URL
To save a PDF to an S3 URL, use the following syntax:
get the pdf
save the pdf to a file with
    the target is "s3://bucket-name/path/to/save/the/pdf"
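The two save targets above differ only in their shape: a plain file path versus an `s3://` URL. As a rough illustration (in Python, not Kognitos syntax), the sketch below shows how such a target string breaks down into its parts. The function name `describe_target` is hypothetical, and how Kognitos itself resolves a target is not described on this page.

```python
from urllib.parse import urlparse


def describe_target(target: str) -> dict:
    """Split a save target into its parts (illustrative only)."""
    parsed = urlparse(target)
    if parsed.scheme == "s3":
        # For s3://bucket-name/key, the netloc is the bucket and the
        # path (minus its leading slash) is the object key.
        return {
            "kind": "s3",
            "bucket": parsed.netloc,
            "key": parsed.path.lstrip("/"),
        }
    # Assumption: anything without an s3:// scheme is a local path.
    return {"kind": "local", "path": target}


if __name__ == "__main__":
    print(describe_target("/local/path/to/save/the/pdf"))
    print(describe_target("s3://bucket-name/path/to/save/the/pdf"))
```

In the S3 case, the first path segment after `s3://` is the bucket name and the remainder is the object key.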
Removing Duplicates from a PDF
This section covers how to remove duplicates from a PDF file.
Removing Duplicates with a Confidence Threshold
To remove duplicates from a PDF with a specified confidence threshold, use the following syntax:
get the pdf
the department's duplicate confidence threshold is 0.95
remove duplicates from the pdf
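One plausible reading of a threshold such as 0.95 is a minimum similarity score at or above which two items count as duplicates. How Kognitos actually scores duplicates is not described on this page, so the Python sketch below is only an illustration of the general idea. It uses `difflib.SequenceMatcher` on page text as a stand-in similarity measure; the function name `remove_duplicate_pages` and the list-of-strings page representation are hypothetical.

```python
from difflib import SequenceMatcher


def remove_duplicate_pages(pages, threshold=0.95):
    """Keep the first occurrence of each page.

    A later page is dropped when its text similarity to any page
    already kept is at or above the threshold (assumed semantics).
    """
    kept = []
    for text in pages:
        is_duplicate = any(
            SequenceMatcher(None, text, earlier).ratio() >= threshold
            for earlier in kept
        )
        if not is_duplicate:
            kept.append(text)
    return kept


if __name__ == "__main__":
    pages = [
        "Invoice 1001 total 150",
        "Invoice 1001 total 150",  # exact repeat of the first page
        "Statement for March",
    ]
    print(remove_duplicate_pages(pages, threshold=0.95))
```

Under this reading, a higher threshold is stricter (only near-identical pages are removed), while a lower threshold removes pages that are merely similar.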
Removing Duplicates without a Confidence Threshold
To remove duplicates from a PDF without specifying a confidence threshold, use the following syntax:
get the pdf
remove duplicates from the pdf
Convert from Word Document to PDF
This operation allows you to convert from Word (.doc or .docx) to portable document format (.pdf):
the file is the document
read the file as a pdf
Convert Picture to PDF
To convert a picture (.jpg or .png) to portable document format (.pdf):
convert a file to a pdf file with
    the file is the picture
Merge (Document)
This operation combines multiple PDFs into a single PDF file:
get the attachments
get the above as the scanned documents
merge the scanned documents into a single document where
    the document name is "statements.pdf"
Working with .tif and .tiff Files
To work with .tif or .tiff files in Kognitos, you can convert them to PDF format and use PDF operations on them. To convert to PDF, use the following syntax:
read a file as a pdf
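The conversions covered on this page depend on the input file type. As a quick reference, the Python sketch below maps each file extension mentioned here to the statements this page shows for it. The statements are copied from the sections above; the `CONVERSIONS` table and the `conversion_for` helper are hypothetical and are not part of Kognitos.

```python
from pathlib import PurePosixPath

# Kognitos statements as shown on this page, keyed by file extension.
WORD = "the file is the document\nread the file as a pdf"
PICTURE = "convert a file to a pdf file with\n    the file is the picture"
TIFF = "read a file as a pdf"

CONVERSIONS = {
    ".doc": WORD,
    ".docx": WORD,
    ".jpg": PICTURE,
    ".png": PICTURE,
    ".tif": TIFF,
    ".tiff": TIFF,
}


def conversion_for(filename):
    """Return the statements this page lists for the file's extension.

    Returns None when the page lists no conversion for that extension
    (a .pdf file, for example, needs none).
    """
    extension = PurePosixPath(filename).suffix.lower()
    return CONVERSIONS.get(extension)


if __name__ == "__main__":
    for name in ["contract.docx", "photo.png", "scan.TIFF", "report.pdf"]:
        print(name, "->", conversion_for(name))
```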