Expression Input
The record object represents the current line of the source file, with column headers
as attribute names and cells as values. It is a Named Tuple that holds each value of the
line as a String. For more information, see Expression Language.
For example, with the following CSV file:
name, age, status
Alice, 20, single
Bob, 40, married
Camille, 70, widow
Data Factory Studio calls the expression three times, once per data line, and record
successively holds the following values:
{name: "Alice", age: "20", status: "single"}
{name: "Bob", age: "40", status: "married"}
{name: "Camille", age: "70", status: "widow"}
In this example, you can access the age value with record.age. The value is a String
(for example "20"), not a number.
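The per-line behavior described above can be mimicked outside Data Factory Studio. The following is a minimal Python sketch, not Expression Language: the SOURCE text and the read_records helper are hypothetical stand-ins, and the sketch only assumes that headers become attribute names and that every cell is kept as a string.

```python
import csv
import io

# Hypothetical stand-in for the CSV file shown above.
SOURCE = """name, age, status
Alice, 20, single
Bob, 40, married
Camille, 70, widow
"""

def read_records(text):
    """Yield one dict per data line, keyed by header; every value stays a string."""
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        yield dict(row)

records = list(read_records(SOURCE))

# Equivalent of reading record.age on each call: the result is still text.
ages = [record["age"] for record in records]
```

Because each age is a string, any numeric use requires an explicit conversion in the expression.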
Expression Output
Your expression must return a Tuple matching the targeted Index Unit data model. For more
information, see Expression Language.
Note:
The attribute order does not matter. Only attribute names are relevant. For example,
{x: 123, y: 456} is equivalent to {y: 456, x: 123}.
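As a loose analogy only, Python dictionaries compare the same way: equality depends on the attribute names and their values, not on the order in which they were written. This sketch illustrates the rule and says nothing about how Data Factory Studio implements it.

```python
# Two outputs with the same attribute names and values, written in a different order.
first = {"x": 123, "y": 456}
second = {"y": 456, "x": 123}

# Equality ignores the order of the attributes.
same_content = first == second

# A different attribute name, however, makes the outputs differ.
renamed = {"x": 123, "z": 456}
differs = first != renamed
```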
Supported Types for the Output Tuple Elements
Type | Support
NULL | Yes
STRING | Yes
TEXT_MATCH | Yes
HIERARCHICAL_STRING | No
BOOLEAN | Yes
DECIMAL | Yes
FLOAT | Yes
INTEGER | Yes
DATE | Yes
DATE_TIME | Yes
LOCAL_DATE_TIME | Yes
PERIOD | Yes
DURATION | Yes
TYPE | No
UNIT | No
BINARY | No
GEOMETRY | No
FUNCTION | No
ZONE_OFFSET | No
LIST | Yes
TUPLE | No
SET | Yes
MAP | No
ITEM | No
DICTIONARY_CODED_STRING | No
TYPE_PARAMETER | No
GRAPH_TYPE_PARAMETER | No
Examples
Parsing Dates and Time
Input data:
a_date, a_datetime, a_time
01/01/2020, 01/01/2020 20:15:36, 20:15:36
01/01/1970, 01/01/1970 00:01:01, 00:01:01
29/12/1969, 29/12/1969 23:59:59, 23:59:59
You can parse dates and times that use a format other than ISO 8601 with an expression like:
{a_date: Date.parse(text: record.a_date, format: "dd/MM/yyyy"),
a_datetime: LocalDateTime.parse(text: record.a_datetime, format: "dd/MM/yyyy HH:mm:ss"),
a_time: LocalDateTime.parse(text: "01/01/1970 "+record.a_time, format: "dd/MM/yyyy HH:mm:ss")}
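To check the expected results outside Data Factory Studio, the same parsing can be sketched in Python. This is an analogue, not Expression Language: parse_row is a hypothetical helper, and the strptime patterns are assumed to correspond to "dd/MM/yyyy" and "dd/MM/yyyy HH:mm:ss". As in the expression above, the time-only column is prefixed with 01/01/1970 so that it can be parsed as a date and time.

```python
from datetime import date, datetime

DATE_FORMAT = "%d/%m/%Y"               # assumed counterpart of dd/MM/yyyy
DATETIME_FORMAT = "%d/%m/%Y %H:%M:%S"  # assumed counterpart of dd/MM/yyyy HH:mm:ss

def parse_row(record):
    """Convert the three text columns of one line into date and time values."""
    return {
        "a_date": datetime.strptime(record["a_date"], DATE_FORMAT).date(),
        "a_datetime": datetime.strptime(record["a_datetime"], DATETIME_FORMAT),
        # The time has no date part, so a fixed date is prepended before parsing.
        "a_time": datetime.strptime("01/01/1970 " + record["a_time"], DATETIME_FORMAT),
    }

# Third data line of the sample input.
row = {"a_date": "29/12/1969", "a_datetime": "29/12/1969 23:59:59", "a_time": "23:59:59"}
parsed = parse_row(row)
```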
Parsing Lists
Input data:
name, neighbors, neighbors_distance
alice, bob|camille, 123;456
bob, david|camille|alice, 147;258;369
camille, ,
david, emilie, 741
emilie, bob|david, 159;753
You can process this data with the following expression:
{...record,
 neighbors: record.neighbors IS NULL ? [] AS List<String> : record.neighbors.split("|"),
 neighbors_distance: record.neighbors_distance IS NULL ? [] AS List<Integer> :
   record.neighbors_distance.split(";").map(function: x => Integer.parse(value: x))}
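The logic of this expression can be sketched in Python to make the expected output explicit. This is an analogue rather than Expression Language: parse_neighbors is a hypothetical helper, and the sketch assumes that an empty cell reaches the expression as NULL, represented here by None.

```python
def parse_neighbors(record):
    """Split the two delimited columns into lists, using empty lists for missing cells."""
    neighbors = record.get("neighbors")
    distances = record.get("neighbors_distance")
    return {
        **record,  # keep every other attribute unchanged, like {...record}
        "neighbors": [] if neighbors is None else neighbors.split("|"),
        "neighbors_distance": (
            [] if distances is None else [int(x) for x in distances.split(";")]
        ),
    }

bob = parse_neighbors(
    {"name": "bob", "neighbors": "david|camille|alice", "neighbors_distance": "147;258;369"}
)
# Assumption: the empty cells of the "camille" line arrive as None.
camille = parse_neighbors({"name": "camille", "neighbors": None, "neighbors_distance": None})
```

The empty-list branches matter: without them, a line with no neighbors would fail on the split call instead of producing empty lists.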
Parsing Lists with Trimming and Unquoting
Input data:
name, neighbors
alice, "bob"| camille
bob, david | "camille" | alice
camille,
david, "emilie"
emilie, "bob"| david
You can process this data with the following expression:
{...record,
 neighbors: record.neighbors IS NULL ? [] AS List<String> :
   record.neighbors.split("|").map(s => s.trim()).map(s => s.endsWith("\"") AND s.startsWith("\"") ? s.substring(1, -1) : s)}
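The trim-then-unquote steps can be sketched in Python as follows. The helpers unquote and parse_quoted_neighbors are hypothetical, an empty cell is assumed to arrive as None, and the length check is an extra guard added in this sketch so that a lone quote character is left untouched.

```python
def unquote(value):
    """Remove one pair of surrounding double quotes, if both are present."""
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value[1:-1]
    return value

def parse_quoted_neighbors(record):
    """Split on '|', trim whitespace around each item, then strip the quotes."""
    raw = record.get("neighbors")
    items = [] if raw is None else [unquote(part.strip()) for part in raw.split("|")]
    return {**record, "neighbors": items}

alice = parse_quoted_neighbors({"name": "alice", "neighbors": '"bob"| camille'})
bob = parse_quoted_neighbors({"name": "bob", "neighbors": 'david | "camille" | alice'})
camille = parse_quoted_neighbors({"name": "camille", "neighbors": None})
```

Trimming has to happen before unquoting: an item such as ` "camille" ` only starts and ends with a quote once the surrounding blanks are removed.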
Simple Cleaning
Input data:
name, age, status
Alice, 20, Single
Bob, 40, married
Camille, 70, Widow
David, 42, single
Emilie, 25, not married
The following expression cleans this CSV file by lowercasing the status and replacing
"not married" with "single":
{...record, status: record.status.toLowerCase().replace("not married", "single")}
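For reference, the same cleaning rule can be sketched in Python, which makes the result for each sample line easy to verify. The clean_status helper is hypothetical and only mirrors the two steps of the expression: lowercase first, then replace.

```python
def clean_status(record):
    """Lowercase the status, then map 'not married' to 'single'."""
    return {**record, "status": record["status"].lower().replace("not married", "single")}

rows = [
    {"name": "Alice", "age": "20", "status": "Single"},
    {"name": "Bob", "age": "40", "status": "married"},
    {"name": "Camille", "age": "70", "status": "Widow"},
    {"name": "David", "age": "42", "status": "single"},
    {"name": "Emilie", "age": "25", "status": "not married"},
]
cleaned = [clean_status(row)["status"] for row in rows]
```

Lowercasing before replacing means that a variant such as "Not Married" would also be mapped to "single".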