File Conversion and Manipulation
Manipulation of different file types
You can convert files to different formats and manipulate then when working with Kognitos
Convert from Word (.Docx & .Doc) to PDF
Many tools require a document to be in PDF format for processing. Kognitos gives you the ability to convert the filetype from Word (.doc or .docx) to a portable document format (.pdf). Here is an example of a command that is used to convert a file as above:
the file is the document
read the file as a pdf
Convert a picture (.jpg / .png) to PDF
In many cases, a picture needs to be converted to a PDF prior to processing. This capability is build into Kognitos. Here is an example of how to convert a file to PDF:
convert a file to a pdf file with
the file is the picture
Converting CSV to JSON
To convert a CSV file to JSON format, use the following function:
import new_library
csv_file_path = 'path/to/your/file.csv'
json_file_path = 'path/to/your/file.json'
new_library.convert_csv_to_json(csv_file_path, json_file_path)
Converting JSON to CSV
To convert a JSON file to CSV format, use the following function:
import new_library
json_file_path = 'path/to/your/file.json'
csv_file_path = 'path/to/your/file.csv'
new_library.convert_json_to_csv(json_file_path, csv_file_path)
Merge (Document)
On occasion, it will be necessary to combine multiple PDFs into a single document. Here is an example where all input files are merged into a single file:
get the attachments
get the above as the scanned documents
merge the scanned documents into a single document where
the document name is "statements.pdf"
Data Extraction
Extracting Data from XML
To extract data from an XML file, use the following function:
import new_library
xml_file_path = 'path/to/your/file.xml'
data = new_library.extract_data_from_xml(xml_file_path)
print(data)
Text Processing
Transcribing Text
To transcribe text from an audio file, use the following function:
import new_library
audio_file_path = 'path/to/your/file.wav'
transcription = new_library.transcribe_audio(audio_file_path)
print(transcription)
Updated 3 months ago