Can I Add Sources Other Than CSV?
Yes. The application can also index JSON and Parquet files.

What Happens When You Push a New File?
Data Factory Studio detects when new source files are added to the S3 bucket (specified in Pipeline > Source > S3 Bucket) and indexes them automatically. It also detects updates made to existing source files and reindexes those files to keep the index up to date.

What Happens When Source Files Are Deleted?
The behavior depends on the pipeline type.

What Happens With Lines Deleted in Source Files?
Data Factory Studio detects when lines are deleted or updated in source files. It then deletes all associated items from the index and recrawls the files.

How To Manage URIs (Item Identifiers)?
It is better to specify URI values in your source files. For CSV files, create a dedicated
column with
To point to other elements in the index, use an attribute of
uri, name, neighbor
1, Michael, 5
5, Jane, 1
Note:
If you do not specify URI values in your source files, Data Factory Studio creates them automatically for all items (CSV lines) in the Index Unit, using the
following elements:
[s3bucketname]:[csvFolderPath]/[csvFileName]:[lineNumber]
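
As a rough illustration of the pattern above, the following Python sketch keeps the value of a `uri` column when the file provides one and otherwise builds a default URI from the bucket name, folder path, file name, and line number. It is not Data Factory Studio's implementation. The function names `default_uri` and `assign_uris`, the sample bucket, folder, and file values, and the choice to number data lines from 1 with the header row excluded are all assumptions made for the example.

```python
import csv
import io


def default_uri(bucket: str, folder: str, file_name: str, line_number: int) -> str:
    """Build a URI shaped like [s3bucketname]:[csvFolderPath]/[csvFileName]:[lineNumber]."""
    return f"{bucket}:{folder}/{file_name}:{line_number}"


def assign_uris(csv_text: str, bucket: str, folder: str, file_name: str) -> list:
    """Return one dict per CSV line, each carrying a 'uri' key.

    Assumption: data lines are numbered from 1 and the header row is not counted.
    """
    # skipinitialspace tolerates the "1, Michael, 5" style used in the sample.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    items = []
    for line_number, row in enumerate(reader, start=1):
        # Prefer the URI given in the file; fall back to the generated one.
        uri = row.get("uri") or default_uri(bucket, folder, file_name, line_number)
        items.append({**row, "uri": uri})
    return items


if __name__ == "__main__":
    # Hypothetical bucket, folder, and file names, for illustration only.
    with_uri = "uri, name, neighbor\n1, Michael, 5\n5, Jane, 1\n"
    without_uri = "name\nMichael\nJane\n"
    print(assign_uris(with_uri, "my-bucket", "data", "people.csv"))
    print(assign_uris(without_uri, "my-bucket", "data", "people.csv"))
```

A generated URI of this shape depends on the line number, so it can change when lines are inserted or removed above an item, whereas a URI stored in a dedicated column stays with its line.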
For example,
|