Data Input File Ingestion Overview
Overview
This section describes the File Ingestion functionality.
The File Ingestion functionality enables you to easily upload files containing data, such as Business Events, in the CSV format. The data is then stored in Fynapse in the corresponding Entities, e.g., data from a Business Event file is stored in the Business Event entity.
Journals ingested this way are created according to the defined Journal Type, exactly as if they had been generated by the Accounting Engine. For example, if your ingested Journals have the Reversing Journal Type, both the base and the reversing Journals will be created.
File Requirements
A file must meet the following requirements to be properly ingested and processed:
- It must be in the CSV format and should have a unique name
Files with the same name can be uploaded into the system. They are differentiated by the upload timestamp. However, it is recommended to use unique names for clarity and tracking within the system.
For a list of characters that can be used in file names, refer to https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html. Additionally, the space character is handled properly in file names.
- The CSV file syntax constraints:
- Values must be separated using a comma
- A dot can be used as the decimal separator; no thousands separator should be used
- Values enclosed in double quotation marks (") are treated as a single varchar value, e.g., "OTC,5" is understood as the value of one attribute
- The date format must be: yyyy-mm-dd
- The data structure for a particular entity must align with its definition available in Schema Repository. This means that the order of the columns defined in the CSV file has to correspond to the order of the fields defined for the selected entity.
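The syntax constraints above can be illustrated with a short Python sketch that serializes records using the standard csv module. The column names and values are hypothetical and are not taken from any Fynapse entity definition; in a real file, the columns and their order must follow the entity definition in Schema Repository.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical column names; the real ones come from the entity definition.
HEADER = ["eventId", "productType", "amount", "eventDate"]


def format_value(value):
    """Render one value according to the CSV constraints listed above."""
    if isinstance(value, date):
        return value.isoformat()  # yyyy-mm-dd
    if isinstance(value, Decimal):
        return format(value, "f")  # dot as decimal separator, no thousands separator
    return str(value)


def to_csv(rows):
    """Serialize rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    # QUOTE_MINIMAL wraps a value in double quotes only when it contains
    # a comma, a quote character, or a line break.
    writer = csv.writer(buffer, delimiter=",", quoting=csv.QUOTE_MINIMAL,
                        lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        writer.writerow([format_value(v) for v in row])
    return buffer.getvalue()
```

With this sketch, a value such as OTC,5 is written in double quotation marks so that it is read back as one attribute rather than two.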
Example of a correct CSV file with Business Event input data
Example of an incorrect CSV file with Business Event input data
The file below is incorrect because it lacks one of the mandatory fields, the Source System.
Fynapse allows the ingestion of files with UTF-8 file encoding both with and without BOM.
Uploading a file that is not UTF-8 encoded results in an Unable to read file error.
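To avoid this error, you can check the encoding locally before uploading. The following is a minimal Python sketch, not part of Fynapse; the function name is hypothetical. It relies on the standard utf-8-sig codec, which accepts UTF-8 content both with and without a BOM.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8, with or without a BOM."""
    try:
        # "utf-8-sig" strips a leading BOM if present and otherwise
        # behaves like plain "utf-8".
        data.decode("utf-8-sig")
        return True
    except UnicodeDecodeError:
        return False
```

Note that a file containing only ASCII characters passes this check regardless of the encoding it was saved in, because ASCII is a subset of UTF-8.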
Fynapse allows the ingestion of files in the gzip format, which allows you to upload more data in one file. The file format is detected automatically by the system, so you do not have to add a file extension (i.e., .gz). Such files can be ingested only via the backend; there is no option to ingest them using the Fynapse UI. After being uploaded to the dedicated cloud storage folder, they are decompressed, checked against the same requirements as files in the CSV format, and put through the same validation and deduplication processes before they are finally ingested. The details of ingested gzip files are available in the File Ingestion grid.
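How Fynapse detects the gzip format is not described here. One common way to recognize gzip content without relying on the file extension is to inspect the first two bytes of the file, as in the following Python sketch. The function name is hypothetical, and the approach is an assumption used for illustration only.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def read_payload(data: bytes) -> bytes:
    """Return the CSV bytes, decompressing first if the content is gzip."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```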
Validations
To eliminate problems with file processing, various technical validations are performed for each uploaded file. After the validation process is complete, you can download a file containing all error messages from the Errors column.
File-Level Validation
The following validations are performed:
- The system checks data integrity, that is, whether the checksum of the sent file matches the checksum of the received one. If this validation fails, you will see a notification and the file will not be uploaded. You need to correct the content of the file and upload it again.
- The system prevents files with an extension other than CSV from being added to the list of files to be uploaded.
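The two file-level checks can be sketched in Python as follows. The checksum algorithm used by Fynapse is not stated in this section, so SHA-256 is an assumption, as are the function names and the case-insensitive extension comparison.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex digest of the file content (SHA-256 is an assumed algorithm)."""
    return hashlib.sha256(data).hexdigest()


def integrity_ok(sent: bytes, received: bytes) -> bool:
    """True if the received content has the same checksum as the sent content."""
    return checksum(sent) == checksum(received)


def has_csv_extension(filename: str) -> bool:
    """True if the file name ends with .csv (case-insensitive by assumption)."""
    return filename.lower().endswith(".csv")
```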
File-Header Validation
The first row of a file is always treated as a header and validated, that is, the system compares the column names with the labels taken from the entity definition. If the header does not match, the file is not processed and no further validations are performed. Additionally, the following error message is added to the error file: The file header does not match the entity definition. The expected header is: {the proper header from the definition}
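The header comparison can be sketched in Python as follows. The function name is hypothetical, and rendering the expected header as a comma-separated list in the message is an assumption; only the message text itself comes from this section.

```python
import csv
import io
from typing import List, Optional


def validate_header(first_line: str, expected: List[str]) -> Optional[str]:
    """Return None if the header matches, otherwise the error message."""
    # Parse the first line with the csv module so quoted names are handled.
    actual = next(csv.reader(io.StringIO(first_line)), [])
    if actual == expected:
        return None
    return ("The file header does not match the entity definition. "
            "The expected header is: " + ",".join(expected))
```

Because the comparison is made on ordered lists, both a misspelled column name and a correct set of names in the wrong order are reported as a mismatch.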
Record-Level Validation
Record-level validation is performed record by record, so even if there are errors in one line, the remaining successfully validated lines are still processed. The first validation checks whether the number of columns is correct. If it fails for a particular record, the other validations are not performed for this record.
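The behavior described above (each record is validated independently, and a column-count failure skips the remaining checks for that record) can be sketched in Python as follows. The function names and the wording of the column-count message are hypothetical.

```python
def validate_record(record, expected_columns, checks):
    """Return a list of error messages for one record (empty if valid)."""
    if len(record) != expected_columns:
        # A column-count failure skips all other checks for this record.
        return [f"Expected {expected_columns} columns, found {len(record)}"]
    # Each check returns an error message, or None when the record passes.
    return [msg for check in checks if (msg := check(record)) is not None]


def split_records(records, expected_columns, checks):
    """Validate every record independently; one bad line does not stop the rest."""
    valid, invalid = [], []
    for record in records:
        errors = validate_record(record, expected_columns, checks)
        if errors:
            invalid.append((record, errors))
        else:
            valid.append(record)
    return valid, invalid
```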
The headers are not treated as data input and are only used to align the data structure of the CSV file with the definition of the particular entity. However, the header is subjected to the same record-level validation as data input records and returns a validation error in the following cases:
- The order of the attributes in the header does not correspond to the order of the fields defined for the particular entity
- A name in the header contains an error, e.g., a typo
The following validations are performed:
If you want to edit an uploaded file, ensure you use Notepad and not Excel. Editing the upload file in Excel can cause issues with formatting, which will cause an upload to fail.
Ingestion Process
The upload process consists of the following steps:
- A user uploads a file or files via the Fynapse GUI.
- The system checks if the files have the CSV extension. If not, the system prevents them from being added to the list of files to be uploaded.
- The system checks data integrity, that is, whether the sent files match the received ones. If this validation fails, the files will not be uploaded.
- The system performs the file-header validation. If this validation fails, the file will not be processed.
- The system performs the record-level validation to check whether particular records are correct. If some of the records are incorrect, only the correct records are processed (the number of processed records is shown in the Success column). The incorrect records are not processed, and a link is provided to a file containing the invalid records, with the error messages added in a new column next to each record. The user corrects the invalid records and uploads a new file.
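The content of the error file produced in the last step can be pictured with the following Python sketch. It is an illustration only: the function name, the name of the added column, the presence of a header row, and the way multiple messages are joined are all assumptions.

```python
import csv
import io


def build_error_file(header, invalid):
    """Build error-file text from a list of (record, [error messages]) pairs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # "Errors" is a hypothetical name for the added column.
    writer.writerow(header + ["Errors"])
    for record, messages in invalid:
        # The error text goes in a new column next to the original record.
        writer.writerow(record + ["; ".join(messages)])
    return buffer.getvalue()
```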
Duplicated Data
Ingesting transactional data comes with the risk of duplicating records that have already been ingested by Fynapse. To avoid duplication of data, a mechanism verifies the ID of incoming data upon ingestion. This verification is based on the Primary Key set up for the data. If incoming data is recognized as having already been ingested into Fynapse, it is rejected right after ingestion and not processed. The system throws an error and logs it in the Error log.
Deduplicated records are not submitted for reprocessing, as they are duplicates of records already in the system. The original record is the one submitted for processing.
Deduplication occurs at the point where data are ingested into Fynapse, either via the File Ingestion screen or direct input to the cloud storage. The window for deduplication to occur is set to 48 hours from the moment of ingestion of the record into Fynapse. After 48 hours have passed, the record will not be deduplicated.
If the Primary Key is not set, the data will not be verified, and duplicated records will appear in the system.
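The deduplication rules above can be summarized in a small Python sketch. It is a simplified in-memory model, not the Fynapse implementation: the class and method names are hypothetical, and treating a record that arrives exactly 48 hours after the original as a duplicate is an assumption.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(hours=48)


class Deduplicator:
    def __init__(self):
        # Primary key -> ingestion time of the record that was accepted.
        self._seen = {}

    def accept(self, primary_key, ingested_at: datetime) -> bool:
        """Return True if the record is processed, False if rejected as a duplicate."""
        if primary_key is None:
            # No Primary Key set: nothing to verify, duplicates get through.
            return True
        first_seen = self._seen.get(primary_key)
        if first_seen is not None and ingested_at - first_seen <= DEDUP_WINDOW:
            return False  # duplicate inside the 48-hour window
        self._seen[primary_key] = ingested_at
        return True
```

In this model, a record with the same Primary Key that arrives after the window has passed is accepted again, which mirrors the statement that records are no longer deduplicated after 48 hours.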
File Ingestion Screen
The File Ingestion screen comprises the following elements:
- The Upload button - click it to upload new files to Fynapse
- The Refresh button - click it to see if any new files were added or if their statuses changed
- The grid - use it to see details of ingested files. The grid has the following columns:
- Name - the name of an original file. You can click it to download this file.
- Ingestion ID - an ID that is used to link Journals with a particular ingestion
- Transfer date - a date and time when a file was transferred to a server
- Processing start date - a date and time when data processing started
- Processing end date - a date and time when data processing ended
- Namespace - a namespace to which the file was uploaded. The namespace is a set of entities used to prevent entities’ name collisions.
- Type - an entity type, for example, a Business Event
- Status - a status of the file processing:
- Failed - a critical error occurred or there were no valid rows in this file (all rows contained errors)
- Success - the file processing ended successfully
- Warning - the file processing ended successfully but some of the rows in this file contain errors. Check the error file to find out more about these errors.
- In Queue - the file was uploaded by the user via GUI but the file processing has not started yet
- Processing - the file is being processed (that is, rows are read, checked, and validated against errors)
- Total - the total number of processed records
- Success - a number of records that were successfully processed
- Failed - a number of records that were not processed due to errors
- Uploaded by - a user or a process that uploaded the file
- Journals - when you upload a BusinessEvent or BusinessEventRollback file and it has been processed and is in either the Success or Warning status, i.e., it has generated Journals, this column contains a link that redirects you to the Journals screen in Subledger, with Ingestion ID set as the search criterion, so that you can browse the Journals created by the Business Event records uploaded in the file.
- Errors - consists of a Download error file link to a file containing all incorrect records from the processed file with error messages corresponding to each invalid record. The name of the file that you can download after clicking the link uses the following syntax: {filename}-errors.csv. You can filter the grid using the following options:
- All - to show all files
- Is true - to show files that had some errors
- Is false - to show files that did not contain errors
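The relationship between the final values of the Status column and the Success and Failed record counts can be expressed as a short Python sketch. The function name is hypothetical, and the sketch covers only the three final statuses; it does not model the In Queue and Processing statuses or critical errors unrelated to record counts.

```python
def file_status(success: int, failed: int) -> str:
    """Derive the final file status from the record counts."""
    if success == 0:
        return "Failed"   # no valid rows in the file
    if failed > 0:
        return "Warning"  # processed, but some rows contain errors
    return "Success"      # every row was processed successfully
```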
Additionally, at the bottom of the screen, you can:
- See the number of pages that display ingested files and buttons to navigate between pages
- See which page you are currently viewing and how many pages are available in total
- Decide how many ingested files you want to display on one page
- See how many ingested files are presented on the page
Saving Exported Files
Fynapse uses your web browser configuration for saving downloaded files. This means that the exported file will be saved in the default download location configured in your web browser. If you configured your browser to always ask for a download location, you will be prompted to provide a location for the exported file.
Tutorials
How to Manually Upload a File?
Prerequisites: The file that you want to upload needs to have the CSV extension and cannot be empty. Empty files and files having different extensions will not be added to the list of files that will be uploaded to Fynapse.
- Go to Integration > File Ingestion.
- From the Type list, select an entity to which the file will be uploaded.
- Click the Upload button.
- Drag a file or click the Select files button.
To upload many files simultaneously, drag or select multiple files.
You cannot drag the same file twice because the duplicate will not be added to the list of files to be uploaded. Keep in mind that none of the files will be added if one or more of the selected files is empty or is already on the list.
- Click the Upload button. The file will be validated, uploaded, and made visible in the grid. To see the current status of the processing, click the Refresh button. You can later check an error file and download the original file.
How to View an Error File?
- Go to Integration > File Ingestion.
- Find a file you uploaded by filtering the column by the file name.
- Click the Download error file link in the Errors column.
- You may be prompted to provide a location where the file will be saved or the location will be automatically selected (it depends on your web browser configuration). The file will be downloaded. It contains all incorrect records from the original file that were not processed with an error description next to each record.
How to Correct Invalid Data?
- Go to Integration > File Ingestion.
- Find a file you uploaded by filtering the column by the file name.
- Click the Download error file link in the Errors column.
- You may be prompted to provide a location where the file will be saved or the location will be automatically selected (it depends on your web browser configuration). The file will be downloaded.
- Based on the error file, correct invalid records in your new file and upload it to Fynapse.
If you want to edit an uploaded file, please ensure you use Notepad and not Excel. Editing the upload file in Excel can cause issues with formatting, which will cause an upload to fail.
How to Download an Original File?
- Go to Integration > File Ingestion.
- Find an original file by filtering the column by the file name.
- Click the name of the file you want to download.
- You may be prompted to provide a location where the file will be saved or the location will be automatically selected (it depends on your web browser configuration). The file will be downloaded.
How to View Journals Created from a Selected Ingest File?
- Go to Integration > File Ingestion.
- Find a file you uploaded by filtering the column by the file name.
- Click the Open journals link in the Journals column. You will be redirected to Subledger > Journals, where Ingestion ID is set as a search criterion so that you can view all Journals created from this data ingestion.