Data Input File Ingestion Overview

A guide to file ingestion in Fynapse.

Overview

This section describes the File Ingestion functionality, which enables you to upload files containing data in the CSV format.

The File Ingestion functionality enables you to easily upload files containing data, such as Business Events, in the CSV format. The data are then stored in Fynapse in the corresponding Entities, e.g., data from a Business Event file are stored in the Business Event Entity.

This mode of Journal input into Fynapse will create the same Journals according to the defined Journal Type as if they were generated from the Accounting Engine, so if your ingested Journals have Reversing Journal Type, both base and reversing Journals will be created.

File Requirements

A file must meet the following requirements to be properly ingested and processed:

  • It must be in the CSV format and should have a unique name

Files with the same name can be uploaded into the system. They are differentiated by the upload timestamp. However, it is recommended to use unique names for clarity and tracking within the system.

For a list of characters that can be used in file names, refer to https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html. Additionally, the space character is properly handled in file names.

  • The CSV file syntax constraints:
    • Values must be separated using a comma
    • Dot can be used as a decimal separator and no separator should be used for thousands
    • Text enclosed in double quotation marks (") is treated as a single varchar value, e.g., "OTC,5" is understood as the value of one attribute
    • The date format must be: yyyy-mm-dd
  • The data structure for a particular entity must align with its definition available in Schema Repository. This means that the order of the columns defined in the CSV file has to correspond to the order of the fields defined for the selected entity.

The file below is an example of a correctly structured file.
Source System,Source System Id,Event Type,Transaction Amount,Transaction Currency,Transaction Date,Business Entity
OTC,1,EV1,100.10,USD,2022-05-13,Aptitude
OTC,2,EV1,200.10,USD,2022-05-13,Aptitude
OTC,3,EV1,300.10,USD,2022-05-13,Aptitude
OTC,4,EV1,400.10,USD,2022-05-13,Aptitude
OTC,5,EV1,500.10,USD,2022-05-13,Aptitude
OTC,6,EV1,600.10,USD,2022-05-13,Aptitude
OTC,7,EV1,700.10,USD,2022-05-13,Aptitude
OTC,8,EV1,800.10,USD,2022-05-13,Aptitude
OTC,9,EV1,900.10,USD,2022-05-13,Aptitude
OTC,10,EV1,110.10,USD,2022-05-13,Aptitude

The file below is incorrect because it lacks one of the mandatory fields, the Source System.

Source System Id,Event Type,Transaction Amount,Transaction Currency,Transaction Date,Business Entity
1,EV1,100.10,USD,2022-05-13,Aptitude
2,EV1,200.10,USD,2022-05-13,Aptitude
3,EV1,300.10,USD,2022-05-13,Aptitude
4,EV1,400.10,USD,2022-05-13,Aptitude
5,EV1,500.10,USD,2022-05-13,Aptitude
6,EV1,600.10,USD,2022-05-13,Aptitude
7,EV1,700.10,USD,2022-05-13,Aptitude
8,EV1,800.10,USD,2022-05-13,Aptitude
9,EV1,900.10,USD,2022-05-13,Aptitude
10,EV1,110.10,USD,2022-05-13,Aptitude
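
The syntax and column-order requirements above can be checked locally before a file is uploaded. The following is a minimal sketch using Python's standard csv module. It is not part of Fynapse: the function names are hypothetical, and the expected header is an assumption copied from the first example file above, whereas in practice it comes from the entity definition in Schema Repository.

```python
import csv
import io

# Assumed field order, copied from the first example file above. In practice
# the order comes from the entity definition in Schema Repository.
EXPECTED_HEADER = [
    "Source System", "Source System Id", "Event Type", "Transaction Amount",
    "Transaction Currency", "Transaction Date", "Business Entity",
]


def header_matches(csv_text, expected=EXPECTED_HEADER):
    """Return True if the first row lists the expected columns in order."""
    header = next(csv.reader(io.StringIO(csv_text)), None)
    return header == expected


def parse_line(line):
    """Split one CSV line on commas, keeping quoted values whole."""
    return next(csv.reader([line]))


# A quoted value that contains a comma is read as one attribute value.
print(parse_line('"OTC,5",1,EV1'))
```

A file whose header omits Source System, like the incorrect example above, makes header_matches return False.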

Fynapse allows the ingestion of files with UTF-8 file encoding both with and without BOM. Uploading a file without UTF-8 encoding will result in an Unable to read file error being thrown.
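
To check the encoding of a file before uploading it, you can try to decode its content locally. The sketch below is an illustration only and does not show how Fynapse itself reads files. The function name read_utf8 is hypothetical, and the error text simply reuses the message quoted above.

```python
import codecs


def read_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8 with or without a BOM; raise ValueError otherwise."""
    try:
        # "utf-8-sig" strips a leading BOM if one is present and otherwise
        # behaves like plain "utf-8".
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise ValueError("Unable to read file") from exc


with_bom = codecs.BOM_UTF8 + "Source System,Event Type\n".encode("utf-8")
without_bom = "Source System,Event Type\n".encode("utf-8")
print(read_utf8(with_bom) == read_utf8(without_bom))
```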

Fynapse allows the ingestion of files in the gzip format, which lets you upload more data in one file. The system detects the file format automatically, so you do not have to add a file extension such as .gz. Such files can be ingested only via the backend; there is no option to ingest them using the Fynapse UI. After being uploaded to the dedicated cloud storage folder, they are decompressed, checked against the same requirements as files in the CSV format, and put through the same validation and deduplication processes before they are finally ingested. The details of the ingested files in the gzip format are available in the File Ingestion grid.
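
As an illustration, a CSV file can be compressed with Python's standard gzip module. The sketch below also shows one common way to recognize gzip content without relying on the file extension, namely checking the two leading magic bytes. How Fynapse actually detects the format is not documented here, so treat that check and the function names as assumptions.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream (RFC 1952)


def is_gzip(raw: bytes) -> bool:
    """Detect gzip content from its leading bytes, not from the file name."""
    return raw[:2] == GZIP_MAGIC


def to_csv_bytes(raw: bytes) -> bytes:
    """Return the CSV payload, decompressing first if the content is gzip."""
    return gzip.decompress(raw) if is_gzip(raw) else raw


csv_bytes = b"Source System,Source System Id\nOTC,1\n"
compressed = gzip.compress(csv_bytes)
print(is_gzip(compressed), is_gzip(csv_bytes))
```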

Validations

To eliminate problems with file processing, various technical validations are performed for each uploaded file. After the validation process is complete, you can download a file containing all error messages from the Errors column.

File-Level Validation

The following validations are performed:

  • The system checks data integrity, that is, whether the checksum of the sent file matches the checksum of the received one. If this validation fails, you will see a notification and the file will not be uploaded. You need to correct the content of the file and upload it again.
  • The system does not allow files with an extension other than CSV to be added to the list of files to upload.
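
The integrity check can be pictured as comparing a checksum computed before the transfer with one computed after it. The sketch below assumes SHA-256 purely for illustration; the algorithm Fynapse uses is not stated in this guide, and the function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest of the file content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def integrity_ok(sent: bytes, received: bytes) -> bool:
    """Return True if the received content has the same checksum as the sent one."""
    return checksum(sent) == checksum(received)


sent = b"OTC,1,EV1,100.10,USD,2022-05-13,Aptitude\n"
print(integrity_ok(sent, sent))
```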

File-Header Validation

The first row of a file is always treated as a header and validated, that is, the system compares column names with labels taken from the entity definition. If the header does not match, then the file is not processed and further validations are not performed. Additionally, the following error message: The file header does not match the entity definition. The expected header is: {the proper header from the definition} is added to the error file.

Record-Level Validation

Record-level validation is performed record after record, so even if there are errors in one line, the remaining successfully validated lines are still processed. The first validation checks if the number of columns is correct and if it fails for a particular record, then the other validations are not performed for this record.

The headers are not treated as data input and are only used to align the data structure of the CSV file with the definition of the particular entity. However:

  • If the order of the attributes in the header does not correspond to the order of the fields defined for a particular entity
  • If the name of the header contains an error, e.g., a typo

Then the header is subjected to the same record-level validation as data input records and returns a validation error.

The following validations are performed:

  1. Error message shown in the file: The record has {more/fewer} columns than expected.
     Validation details: The system checks if the number of columns in the record is correct. If there are more or fewer columns than expected, the error message is created. This validation is performed first, and if it fails for a particular record, then the other validations are not performed for this record.
  2. Error message shown in the file: The mandatory {attribute name} attribute cannot be empty.
     Validation details: The system checks if a particular record contains all mandatory values.
  3. Error message shown in the file: The expected data type in {attribute name} attribute is {data type}.
     Validation details: The system checks if the data type for a particular attribute is correct.
  4. Error message shown in the file: The record contains an invalid date format. The expected date format is: {date format}.
     Validation details: The system checks if the date format is correct. Only the yyyy-mm-dd format is accepted.
  5. Error message shown in the file: The expected number scale in the {attribute name} attribute should be less than or equal to {expected value} but it is {actual value}.
     Validation details: The system checks if the number scale for a particular attribute is correct.
  6. Error message shown in the file: Numeric values in the {attribute name} attribute should not be written in scientific notation.
     Validation details: The system checks that a numeric value in a particular attribute is not written in scientific notation.
  7. Error message shown in the file: The expected number precision in the {attribute name} attribute should be less than or equal to {expected value} but it is {actual value}.
     Validation details: The system checks if the number precision for a particular attribute is correct.
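
The record-level checks can be approximated with a short script to pre-validate a file before uploading it. The sketch below is not the Fynapse implementation. The FIELDS list (names, types, precision, and scale) is a hypothetical entity definition, the function names are invented for illustration, and the way precision is counted is an assumption.

```python
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical field definitions: (name, type, mandatory, precision, scale).
FIELDS = [
    ("Source System", "string", True, None, None),
    ("Transaction Amount", "number", True, 10, 2),
    ("Transaction Date", "date", True, None, None),
]
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_valid_date(value):
    """Accept only real calendar dates written as yyyy-mm-dd."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_number(name, value, precision, scale):
    """Return the error messages for one numeric value."""
    try:
        number = Decimal(value)
    except InvalidOperation:
        number = None
    if number is None or not number.is_finite():
        return [f"The expected data type in {name} attribute is number."]
    if "e" in value.lower():
        return [f"Numeric values in the {name} attribute should not be "
                "written in scientific notation."]
    parts = number.as_tuple()
    actual_scale = max(-parts.exponent, 0)
    # Assumption: precision counts all digits, including the fractional ones.
    actual_precision = max(len(parts.digits), actual_scale)
    errors = []
    if actual_scale > scale:
        errors.append(f"The expected number scale in the {name} attribute should be "
                      f"less than or equal to {scale} but it is {actual_scale}.")
    if actual_precision > precision:
        errors.append(f"The expected number precision in the {name} attribute should be "
                      f"less than or equal to {precision} but it is {actual_precision}.")
    return errors


def validate_record(values, fields=FIELDS):
    """Return the error messages for one record (an empty list means valid)."""
    if len(values) != len(fields):
        # The column-count check runs first; if it fails, nothing else is checked.
        kind = "more" if len(values) > len(fields) else "fewer"
        return [f"The record has {kind} columns than expected."]
    errors = []
    for value, (name, data_type, mandatory, precision, scale) in zip(values, fields):
        if value == "":
            if mandatory:
                errors.append(f"The mandatory {name} attribute cannot be empty.")
        elif data_type == "number":
            errors.extend(check_number(name, value, precision, scale))
        elif data_type == "date" and not is_valid_date(value):
            errors.append("The record contains an invalid date format. "
                          "The expected date format is: yyyy-mm-dd.")
    return errors


print(validate_record(["OTC", "100.123", "13/05/2022"]))
```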

If you want to edit an uploaded file, ensure you use Notepad and not Excel. Editing the upload file in Excel can cause issues with formatting, which will cause an upload to fail.

Ingestion Process

The upload process consists of the following steps:

  • A user uploads a file or files via Fynapse GUI.
  • The system checks if the files have the CSV extension. If not, then the system blocks adding them to a list of files that will be uploaded.
  • The system checks data integrity, that is, whether the sent files match the received ones. If this validation fails, then the files will not be uploaded.
  • The system performs the file-header validation. If this validation fails, then the file will not be processed.
  • The system performs the record-level validation and checks whether particular records are correct. If some of the records are incorrect, only the correct records are processed (the number of processed records is shown in the Success column). The incorrect records are not processed, and a link is provided to a file that contains the invalid records, with the error messages added in a new column next to each record. The user corrects the invalid records and uploads a new file.
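
The last step, in which correct records are processed and incorrect ones are written to an error file with the messages in an extra column, can be sketched as follows. This is an illustration under assumptions: split_records and count_check are hypothetical names, the validator is passed in as a callable, and the "Errors" label for the added column is invented.

```python
import csv
import io


def split_records(csv_text, validate):
    """Return (valid_rows, errors_csv_text) for a CSV text with a header row.

    `validate` is a hypothetical callable that returns a list of error
    messages for one row; an empty list means the row is valid.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    valid = []
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header + ["Errors"])  # assumed label for the added column
    for row in records:
        errors = validate(row)
        if errors:
            # Invalid records go to the error file, message next to the record.
            writer.writerow(row + [" ".join(errors)])
        else:
            valid.append(row)
    return valid, out.getvalue()


def count_check(row):
    """Toy validator: every record must have exactly two columns."""
    return [] if len(row) == 2 else ["The record has fewer columns than expected."]


valid_rows, errors_csv = split_records("A,B\n1,2\n3\n", count_check)
print(valid_rows)
print(errors_csv)
```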

Duplicated Data

Ingesting transactional data comes with the risk of duplicating records that have already been ingested by Fynapse. To avoid duplication of data, we implemented a mechanism that verifies the ID of incoming data upon ingestion. This verification is based on the Primary Key set up for the data. If incoming data is recognized as already having been ingested into Fynapse, it is rejected right after ingestion and not processed. The system throws an error and logs it in the Error log.

Deduplicated records are not submitted for reprocessing, as they are duplicates of records already in the system. The original record is the one submitted for processing.

Deduplication occurs at the point where data are ingested into Fynapse, either via the File Ingestion screen or direct input to the cloud storage. The window for deduplication to occur is set to 48 hours from the moment of ingestion of the record into Fynapse. After 48 hours have passed, the record will not be deduplicated.

If the Primary Key is not set, the data will not be verified, and duplicated records will appear in the system.
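
The deduplication rules can be summarized in a small model. The sketch below is illustrative only and does not reflect how Fynapse stores or looks up keys. The Deduplicator class is hypothetical, the Primary Key is assumed to be a tuple of attribute values, and the handling of a record that arrives exactly at the 48-hour boundary is an assumption.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(hours=48)


class Deduplicator:
    """Rejects records whose Primary Key was ingested within the last 48 hours."""

    def __init__(self):
        self._seen = {}  # primary key -> time the original record was ingested

    def accept(self, primary_key, now):
        """Return True if the record should be processed, False if it is a duplicate."""
        if primary_key is None:
            # No Primary Key set: the data is not verified, so duplicates pass.
            return True
        first_seen = self._seen.get(primary_key)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW:
            return False
        self._seen[primary_key] = now
        return True


dedup = Deduplicator()
t0 = datetime(2022, 5, 13, 12, 0)
key = ("OTC", "1")  # hypothetical Primary Key: Source System + Source System Id
print(dedup.accept(key, t0))                        # original record
print(dedup.accept(key, t0 + timedelta(hours=47)))  # inside the window
print(dedup.accept(key, t0 + timedelta(hours=49)))  # after the window
```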

File Ingestion Screen

The File Ingestion screen comprises the following elements:

  • The Upload button - click it to upload new files to Fynapse
  • The Refresh button - click it to see if any new files were added or if their statuses changed
  • The grid - use it to see details of ingested files. The grid has the following columns:
    • Name - the name of an original file. You can click it to download this file.
    • Ingestion ID - an ID that is used to link Journals with a particular ingestion
    • Transfer date - a date and time when a file was transferred to a server
    • Processing start date - a date and time when data processing started
    • Processing end date - a date and time when data processing ended
    • Namespace - a namespace to which the file was uploaded. The namespace is a set of entities used to prevent entities’ name collisions.
    • Type - an entity type, for example, a Business Event
    • Status - a status of the file processing:
      • Failed - a critical error occurred or there were no valid rows in this file (all rows contained errors)
      • Success - the file processing ended successfully
      • Warning - the file processing ended successfully but some of the rows in this file contain errors. Check the error file to find out more about these errors.
      • In Queue - the file was uploaded by the user via GUI but the file processing has not started yet
      • Processing - the file is being processed (that is, rows are read, checked, and validated against errors)
    • Total - the total number of processed records
    • Success - a number of records that were successfully processed
    • Failed - a number of records that were not processed due to errors
    • Uploaded by - a user or a process that uploaded the file
    • Journals - when you upload a BusinessEvent or BusinessEventRollback file and it has been processed and is in either Success or Warning status (i.e., it has generated Journals), this column contains a link that redirects you to the Journals screen in Subledger, with Ingestion ID set as the search criterion so that you can browse Journals created by the Business Event records uploaded in the file.
    • Errors - consists of a Download error file link to a file containing all incorrect records from the processed file with error messages corresponding to each invalid record. The name of the file that you can download after clicking the link uses the following syntax: {filename}-errors.csv. You can filter the grid using the following options:
      • All - to show all files
      • Is true - to show files that had some errors
      • Is false - to show files that did not contain errors

Additionally, at the bottom of the screen, you can:

  • See the number of pages that display ingested files and buttons to navigate between pages
  • See which page you are currently viewing and how many pages are available in total
  • Decide how many ingested files you want to display on one page
  • See how many ingested files are presented on the page

Saving Exported Files

Fynapse uses your web browser configuration for saving downloaded files. This means that the exported file will be saved in the default download location configured in your web browser. If you configured your browser to always ask for a download location, you will be prompted to provide a location for the exported file.

Tutorials

Prerequisites: The file that you want to upload needs to have the CSV extension and cannot be empty. Empty files and files having different extensions will not be added to the list of files that will be uploaded to Fynapse.

To upload a file:

  1. Go to Integration > File Ingestion.
  2. From the Type list, select an entity to which the file will be uploaded.
  3. Click the Upload button.
  4. Drag a file or click the Select files button.

    To upload many files simultaneously, drag or select multiple files.

    You cannot drag the same file twice because the duplicate will not be added to the list of files to be uploaded. Keep in mind that none of the files will be added if one or more of the selected files is empty or is already on the list.

  5. Click the Upload button. The file will be validated, uploaded, and made visible in the grid. To see the current status of the processing, click the Refresh button. You can later check an error file and download the original file.

To download an error file:

  1. Go to Integration > File Ingestion.
  2. Find a file you uploaded by filtering the column by the file name.
  3. Click the Download error file link in the Errors column.
  4. You may be prompted to provide a location where the file will be saved or the location will be automatically selected (it depends on your web browser configuration). The file will be downloaded. It contains all incorrect records from the original file that were not processed with an error description next to each record.

To correct invalid records:

  1. Go to Integration > File Ingestion.
  2. Find a file you uploaded by filtering the column by the file name.
  3. Click the Download error file link in the Errors column.
  4. You may be prompted to provide a location where the file will be saved or the location will be automatically selected (it depends on your web browser configuration). The file will be downloaded.
  5. Based on the error file, correct invalid records in your new file and upload it to Fynapse.

If you want to edit an uploaded file, please ensure you use Notepad and not Excel. Editing the upload file in Excel can cause issues with formatting, which will cause an upload to fail.

To download the original file:

  1. Go to Integration > File Ingestion.
  2. Find an original file by filtering the column by the file name.
  3. Click the name of the file you want to download.
  4. You may be prompted to provide a location where the file will be saved or the location will be automatically selected (it depends on your web browser configuration). The file will be downloaded.

To view the Journals created from an ingested file:

  1. Go to Integration > File Ingestion.
  2. Find a file you uploaded by filtering the column by the file name.
  3. Click the Open journals link in the Journals column. You will be redirected to the Subledger > Journals where Ingestion ID is set as a search criterion so that you can view all Journals created out of this data ingestion.
