Row Denormalizer Analytics

Plugin version: 2.11.0

Converts raw data into denormalized data based on a key column. User is able to specify the list of fields that should be used in the denormalized record, with an option to use an alias for the output field name. For example, ‘ADDRESS’ in the input is mapped to ‘addr’ in the output schema.

The transform takes input record that stores a variable set of custom attributes for an entity, denormalizes it on the basis of the key field, and then returns a denormalized table according to the output schema specified by the user. The denormalized data is easier to query.

Configuration

Property	Marco Enabled?	Description

Property	Marco Enabled?	Description
Key Field	Yes	Required. Name of the column in the input record which will be used to group the raw data. For Example, id.
Input Name Field	Yes	Required. Name of the column in the input record which contains the names of output schema columns. For example, input records have columns ‘id’, ‘attribute’, ‘value’ and the ‘attribute’ column contains ‘FirstName’, ‘LastName’, ‘Address’. “So the output record will have column names as ‘FirstName’, ‘LastName’, ‘Address’.
Input Value Field	Yes	Required. Name of the column in the input record which contains the values for output schema columns. For example, input records have columns ‘id’, ‘attribute’, ‘value’ and the ‘value’ column contains ‘John’, ‘Wagh’, ‘NE Lakeside’. So the output record will have values for columns ‘FirstName’, ‘LastName’, ‘Address’ as ‘John’, ‘Wagh’, ‘NE Lakeside’ respectively.
Output fields to include	Yes	Required. List of the output fields to be included in denormalized output.
Output fields to rename	Yes	Optional. List of the output fields to rename. The key specifies the name of the field to rename, with its corresponding value specifying the new name for that field.
numPartitions	Yes	Optional. Number of partitions to use when grouping data. If not specified, the execution framework will decide on the number to use.
Output Schema	No	Required. The output schema for the data.

Conditions

In case a field value is not present, then it will be considered as NULL.

For Example,

If keyfield(‘id’) in the input record is NULL, then that particular record will be filtered out.

If namefield(‘attribute’) or valuefield(‘value’) is not present for a particular keyfield(‘id’) value, then the denormalized output value for that namefield will be NULL.

If user provides output field which is not present in the input record, then it will be considered as NULL.

Example

The transform takes input records that have columns id, attribute, value, denormalizes it on the basis of id, and then returns a denormalized table according to the output schema specified by the user.

Property	Value

Property	Value
Key Field	`id`
Input Name Field	`attribute`
Input Value Field	`value`
Output fields to include	`Firstname,Lastname,Address`
Output fields to rename	`Address:Office Address`

For example, suppose the aggregator receives the input record:

id	attribute	value

id	attribute	value
joltie	Firstname	John
joltie	Lastname	Wagh
joltie	Address	NE Lakeside

Output records will contain all the output fields specified by user:

id	Firstname	Lastname	Office Address

id	Firstname	Lastname	Office Address
joltie	John	Wagh	NE Lakeside

Now, let’s suppose the aggregator receives the input record with NULL values:

id	attribute	value

id	attribute	value
joltie	Firstname	John
joltie	Lastname	Wagh
joltie	Address	NE Lakeside
brett	Firstname	Brett
	Lastname	Lee
brett	Address	SE Lakeside
bob	Firstname	Bob
bob		Smith
bob	Address

Output records will contain all the output fields specified by user:

id	Firstname	Lastname	Office Address

id	Firstname	Lastname	Office Address
joltie	John	Wagh	NE Lakeside
brett	Brett		SE Lakeside
bob	Bob