Module 5 : JOLT
JOLT stands for JsOn Language Transform
Jolt is a Java-based JSON to JSON transformation library that is used to transform JSON data from one structure to another. It provides a set of transformation specifications that can be used to map and manipulate JSON data in a declarative way.
JSON (JavaScript Object Notation) is a lightweight, text-based data interchange format. It is used to represent data objects in a structured way that can be easily parsed and manipulated by software programs. JSON is often used as an alternative to XML because it is more concise, easier to read, and can be parsed more quickly.
JSON data is made up of key-value pairs, which are enclosed in curly braces and separated by commas. The keys are strings, and the values can be strings, numbers, Boolean values, arrays, or other JSON objects. The data can be nested, with objects containing other objects, and arrays containing other arrays or objects.
Example of JSON representation
{
"key": "value"
}
Jolt is particularly useful in situations where JSON data needs to be transformed into a different format or structure, such as when integrating with external systems or preparing data for consumption by downstream systems. It provides a way to define complex transformation logic in a simple, human-readable format that can be easily modified and maintained over time.
Advantages
- It is easy to use and understand.
- JOLT provides a wide range of transformation operations, allowing you to perform complex data manipulations on JSON data.
- Active community: Jolt has an active and supportive community of users and contributors, who are constantly working to improve the library and provide support to other developers.
Disadvantages
- JOLT is not a "stream" based library: the whole JSON document has to be held in memory, so it can't handle large JSON documents efficiently.
- It increases the work of the garbage collector.
- Error messages are not very helpful.
jq
jq is a command-line tool for processing and transforming JSON data. It provides a simple, powerful, and flexible way to manipulate JSON data from the command line, and can be used in a variety of scenarios where JSON data needs to be transformed in real-time.
JSONPath
JSONPath is a query language for JSON data that allows you to extract data from JSON documents. It is similar to XPath, which is used to query XML documents. JSONPath uses a dot notation syntax to navigate through the JSON data and extract the required data.
JSONPath can be used in a variety of programming languages, including Java, JavaScript, and Python. It is particularly useful when working with JSON data in RESTful web services and APIs.
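To make the dot notation idea concrete, here is a minimal Python sketch that follows a dotted path through nested dictionaries. It is not a JSONPath implementation (real JSONPath also supports array indexes, wildcards and filters), and the helper name get_by_dot_path is made up for this illustration.

```python
def get_by_dot_path(document, path):
    """Follow a dot-separated path such as '$.user.name' through nested dicts."""
    if path.startswith("$."):
        path = path[2:]  # drop the JSONPath-style root marker
    current = document
    for key in path.split("."):
        current = current[key]  # raises KeyError if the path does not exist
    return current


document = {"user": {"name": "Rishabh", "gmail": "rishabh@gmail.com"}}
print(get_by_dot_path(document, "$.user.name"))  # prints: Rishabh
```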
JsonSurfer
JsonSurfer is a Java-based library that provides a simple and efficient way to extract data from JSON documents. It is built on top of Jackson, which is a popular Java-based JSON processing library.
JsonSurfer uses a streaming approach to parse JSON data, which allows it to handle large JSON documents efficiently without consuming too much memory. It provides a simple API that allows you to extract data from JSON documents using JSONPath expressions.
Here is a list of operations available in JOLT:
The shift operation is one of the most commonly used operations in Jolt. It allows you to restructure JSON data by shifting data from one location to another.
Example of shift :
input.json
{
  "user": {
    "name": "Rishabh",
    "gmail": "rishabh@gmail.com"
  }
}
spec.json
[
  {
    "operation": "shift",
    "spec": {
      "user": {
        "name": "user.firstName",
        "gmail": "user.email"
      }
    }
  }
]
output.json
{
"user": {
"firstName": "Rishabh",
"email": "rishabh@gmail.com"
}
}
input.json contains the JSON data that needs to be transformed. In this case, the JSON data contains a single user object with two key-value pairs.
spec.json contains the transformation specification that defines how the JSON data should be transformed. In this case, the transformation specification contains a single shift operation that specifies that the value of the firstName key should be copied from the name key, and the value of the email key should be copied from the gmail key.
output.json contains the transformed JSON data. In this case, the transformed JSON data contains the same user object with its two keys renamed.
The source key is specified on the left side of the colon, nested in the same way as in the input. The destination key is specified on the right side of the colon as a path in dot notation (for example, user.firstName).
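The idea of source keys on the left and dot-notation destinations on the right can be sketched in Python. This is a simplified illustration only: it handles literal keys, not wildcards, and the names simple_shift and set_by_dot_path are invented for this sketch rather than taken from the Jolt library.

```python
def set_by_dot_path(target, path, value):
    """Write value into nested dicts at a dot-separated path, creating levels as needed."""
    keys = path.split(".")
    for key in keys[:-1]:
        target = target.setdefault(key, {})
    target[keys[-1]] = value


def simple_shift(data, spec, output=None):
    """Walk spec and data together; a string in the spec is the destination path."""
    if output is None:
        output = {}
    if not isinstance(data, dict):
        return output  # nothing to descend into
    for key, destination in spec.items():
        if key not in data:
            continue  # keys missing from the input are simply skipped
        if isinstance(destination, dict):
            simple_shift(data[key], destination, output)
        else:
            set_by_dot_path(output, destination, data[key])
    return output


source = {"user": {"name": "Rishabh", "gmail": "rishabh@gmail.com"}}
spec = {"user": {"name": "user.firstName", "gmail": "user.email"}}
print(simple_shift(source, spec))
```

Keys that the spec does not mention are dropped from the output, which is also how a shift behaves: only what is explicitly moved survives.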
The default operation is used to set a default value for a key if the key is not present in the JSON data.
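A rough Python sketch of this behaviour follows. It assumes that only missing keys receive a default and that existing values are never touched; the function name apply_default is hypothetical and not part of Jolt.

```python
import copy


def apply_default(data, spec):
    """Add every key from spec that is missing in data; keep existing values."""
    for key, default_value in spec.items():
        if key not in data:
            # Copy so that later changes to data do not alter the spec.
            data[key] = copy.deepcopy(default_value)
        elif isinstance(default_value, dict) and isinstance(data[key], dict):
            apply_default(data[key], default_value)
    return data


record = {"user": {"name": "Rishabh"}}
print(apply_default(record, {"user": {"email": "rishu@gmail.com"}}))
```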
Example of default :
input.json
{
"user": {
"name": "Rishabh"
}
}
spec.json
[
{
"operation": "default",
"spec": {
"user": {
"email": "rishu@gmail.com"
}
}
}
]
output.json
{
"user": {
"name": "Rishabh",
"email": "rishu@gmail.com"
}
}
The remove operation is used to remove a key from the JSON data.
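As a hedged illustration, the Python sketch below mimics a remove spec in which nested objects lead down to the key to delete and any non-object value (such as an empty string) marks that key for removal. The function name apply_remove is an assumption made for this example, and wildcards are not handled.

```python
def apply_remove(data, spec):
    """Delete the keys marked in spec from data, descending through nested dicts."""
    for key, sub_spec in spec.items():
        if key not in data:
            continue
        if isinstance(sub_spec, dict):
            if isinstance(data[key], dict):
                apply_remove(data[key], sub_spec)
        else:
            del data[key]  # a leaf in the spec marks the key for removal
    return data


record = {"user": {"name": "Rishabh", "email": "rishu@gmail.com"}}
print(apply_remove(record, {"user": {"email": ""}}))
```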
Example of remove :
input.json
{
"user": {
"name": "Rishabh",
"email": "rishu@gmail.com"
}
}
spec.json
[
  {
    "operation": "remove",
    "spec": {
      "user": {
        "email": ""
      }
    }
  }
]
output.json
{
"user": {
"name": "Rishabh"
}
}
The cardinality operation changes the cardinality of a value, that is, whether a key holds a single value or a list. Each key in the spec is marked either ONE (if the value is a list, keep only its first element) or MANY (if the value is not a list, wrap it in one).
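In the Jolt library, a cardinality spec assigns ONE or MANY to a key. The Python sketch below illustrates those two rules under simplifying assumptions: no wildcards, and an empty list under ONE becomes null, which is a guess rather than documented behaviour. The name apply_cardinality is hypothetical.

```python
def apply_cardinality(data, spec):
    """Apply ONE / MANY rules from spec to the matching keys of data."""
    for key, rule in spec.items():
        if key not in data or data[key] is None:
            continue  # nothing to adjust
        value = data[key]
        if isinstance(rule, dict):
            if isinstance(value, dict):
                apply_cardinality(value, rule)
        elif rule == "MANY" and not isinstance(value, list):
            data[key] = [value]  # wrap a single value in a list
        elif rule == "ONE" and isinstance(value, list):
            # Keep the first element; None for an empty list is an assumption.
            data[key] = value[0] if value else None
    return data


record = {"user": {"name": "Rishabh", "email": "rishu@gmail.com"}}
print(apply_cardinality(record, {"user": "MANY"}))
```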
Example of cardinality (MANY wraps a single value in a list) :
input.json
{
  "user": {
    "name": "Rishabh",
    "email": "rishu@gmail.com"
  }
}
spec.json
[
  {
    "operation": "cardinality",
    "spec": {
      "user": "MANY"
    }
  }
]
output.json
{
"user": [
{
"name": "Rishabh",
"email": "rishu@gmail.com"
}
]
}
The modify-default-beta operation is used to set a default value for a key if the key is not present in the JSON data. It is similar to the default operation, but it allows you to specify a default value using a function.
Example of modify-default-beta :
input.json
{
"user": {
"name": "Rishabh"
}
}
spec.json
[
{
"operation": "modify-default-beta",
"spec": {
"user": {
"email": "=concat('rishu', '@gmail.com')"
}
}
}
]
output.json
{
"user": {
"name": "Rishabh",
"email": "rishu@gmail.com"
}
}
The modify-overwrite-beta operation is used to modify the value of a key in the JSON data. It is similar to the modify-default-beta operation, but it overwrites the value of a key that already exists, using a function to compute the new value.
Example of modify-overwrite-beta :
input.json
{
"user": {
"name": "Rishabh",
"email": ""
}
}
spec.json
[
{
"operation": "modify-overwrite-beta",
"spec": {
"user": {
"email": "=concat('rishu', '@gmail.com')"
}
}
}
]
output.json
{
"user": {
"name": "Rishabh",
"email": "rishu@gmail.com"
}
}
| | modify-default-beta | modify-overwrite-beta |
|---|---|---|
| Field exists | ❌ (no change) | Field value changed to the new one |
| Field does not exist | New field created | New field created |
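The four cells of the comparison above can be reproduced with a small Python sketch. Here a zero-argument callable stands in for a Jolt function expression such as =concat(...); the names modify, modify_default and modify_overwrite are invented for this illustration, and only a flat object is handled.

```python
def modify(data, spec, overwrite):
    """Return a copy of data with spec applied; spec maps key -> callable giving the value."""
    result = dict(data)
    for key, compute in spec.items():
        if overwrite or key not in result:
            result[key] = compute()
    return result


def modify_default(data, spec):
    return modify(data, spec, overwrite=False)


def modify_overwrite(data, spec):
    return modify(data, spec, overwrite=True)


spec = {"email": lambda: "rishu" + "@gmail.com"}
print(modify_default({"name": "Rishabh", "email": ""}, spec))    # email stays ""
print(modify_overwrite({"name": "Rishabh", "email": ""}, spec))  # email is replaced
```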
The sort operation sorts the keys of every object in the JSON data alphabetically, recursively. It does not reorder the elements of arrays.
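In the Jolt library, the sort operation orders the keys of objects and leaves the order of array elements untouched. A minimal Python sketch of that behaviour, with the hypothetical name sort_keys and plain alphabetical ordering assumed, could look like this:

```python
def sort_keys(value):
    """Return a copy of value with the keys of every nested dict in sorted order."""
    if isinstance(value, dict):
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # Lists keep their order; only the objects inside them are sorted.
        return [sort_keys(item) for item in value]
    return value


record = {"user": [{"name": "Rishabh", "email": "rishu@gmail.com"}]}
print(sort_keys(record))  # "email" now comes before "name"
```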
Example of sort (object keys are reordered, array order is kept) :
input.json
{
  "user": [
    {
      "name": "Rishabh",
      "email": "rishu@gmail.com"
    },
    {
      "name": "Adi",
      "email": "adi@gmail.com"
    }
  ]
}
spec.json
[
  {
    "operation": "sort",
    "spec": {}
  }
]
output.json
{
  "user": [
    {
      "email": "rishu@gmail.com",
      "name": "Rishabh"
    },
    {
      "email": "adi@gmail.com",
      "name": "Adi"
    }
  ]
}
LHS and RHS name the two sides of each entry in a transformation specification.
LHS
The LHS (left-hand side) specifies the source key in the transformation specification.
RHS
The RHS (right-hand side) specifies the destination key in the transformation specification.
Everything before the : in a spec entry is the LHS, and everything after it is the RHS.
A wildcard can behave differently depending on where it is used (LHS or RHS). In addition, different wildcards can be combined in the same transformation.
The * wildcard is used to match any key in the JSON data.
In a JSON object that consists of n key-value pairs, the * wildcard matches all of the keys. It works much like looping over every key.
Usage: LHS
Operations: shift, remove, cardinality, modify-default-beta and modify-overwrite-beta
Example of * :
input.json
{
"user": {
"name": "Rishabh",
"phone": "1234567890",
"email": "rishu@gmail.com",
"birthDate": "10/31/1990",
"address": "Customer Example Street"
}
}
spec.json
[
{
"operation": "shift",
"spec": {
"user": {
"*": "user.&",
"name": "user.Firstname"
}
}
}
]
output.json
{
"user": {
"Firstname": "Rishabh",
"phone": "1234567890",
"email": "rishu@gmail.com",
"birthDate": "10/31/1990",
"address": "Customer Example Street"
}
}
The & wildcard uses the key matched on the LHS to compose the structure of the output JSON, so that the key name does not have to be written out explicitly in the transformation.
Usage: RHS
Operation: shift
Example of &
input.json
{
  "name": "Rishabh Malviya",
  "email": "rishu@gmail.com",
  "desh": "china"
}
spec.json
[
  {
    "operation": "shift",
    "spec": {
      "name": "customer.&",
      "email": "customer.&",
      "desh": "customer.country"
    }
  }
]
output.json
{
  "customer": {
    "name": "Rishabh Malviya",
    "email": "rishu@gmail.com",
    "country": "china"
  }
}
The # wildcard, when used in LHS, has the function of entering values manually in the output JSON.
In RHS, on the other hand, it is applicable only to create lists and has the function of grouping certain content of the input JSON within the list to be created.
Usage: LHS and RHS
Operations: shift
input.json
{
"products": [
{
"code": "PROD-A",
"value": 10
},
{
"code": "PROD-B",
"value": 20
}
]
}
spec.json
[
{
"operation": "shift",
"spec": {
"products": {
"*": {
"code": "products[#2].code",
"value": "products[#2].value"
}
}
}
}
]
output.json
{
"products": [
{
"code": "PROD-A",
"value": 10
},
{
"code": "PROD-B",
"value": 20
}
]
}
The @ wildcard references the value of a field or object contained in the input JSON, but has different effects depending on its usage.
Usage: LHS and RHS
Operations: shift (LHS and RHS), modify-default-beta (RHS) and modify-overwrite-beta (RHS).
input.json
{
"key": "node",
"value": "2001"
}
spec.json
[
{
"operation": "shift",
"spec": {
"value": "product.@(1,key)"
}
}
]
output.json
{
  "product": {
    "node": "2001"
  }
}
The $ wildcard references the name of a field or object contained in the input JSON so that it can be used as the value of a field or object in the output JSON.
Usage: LHS
Operations: shift
input.json
{
"user": {
"name": "Rishu",
"height": 173,
"weight": 68,
"gender": "M"
}
}
spec.json
[
  {
    "operation": "shift",
    "spec": {
      "user": {
        "*": {
          "$": "user[]"
        }
      }
    }
  }
]
output.json
{
  "user": ["name", "height", "weight", "gender"]
}
The | wildcard allows referencing multiple fields or objects of an input JSON so that, regardless of the name of the field or object, its value is allocated to the same destination in the output JSON.
Usage: LHS
Operations: shift
input.json
{
"user": [
{
"name": "Rishabh",
"email": "rishu@gmail.com"
},
{
"fullName": "Aditya",
"email": "adi@gmail.com"
}
]
}
spec.json
[
{
"operation": "shift",
"spec": {
"user": {
"*": {
"name|fullName": "user.name",
"email": "user.email"
}
}
}
}
]
output.json
{
  "user": {
    "name": ["Rishabh", "Aditya"],
    "email": ["rishu@gmail.com", "adi@gmail.com"]
  }
}
JOLT has a set of functions that can be used to manipulate the data in the input JSON. These functions are used in the RHS of the transformation.
modify-overwrite-beta and modify-default-beta allow us to apply functions to our JSON.
Two functions cannot be nested inside one another in the same expression, for example:
=toLower(toUpper(@(1,product)))
String
toLower, toUpper, concat, join, split, substring, trim, leftPad, rightPad, size
String Function Example
input.json
{
  "product": {
    "product": "Product A",
    "company": "company a",
    "value": "100",
    "measureWithSpaces": "  10 meters  "
  }
}
spec.json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "product": {
        "product": "=toLower(@(1,product))",
        "company": "=toUpper(@(1,company))",
        "product_company": "=concat(@(1,product),'_',@(1,company))",
        "joinProductCompany": "=join(' - ',@(1,product),@(1,company))",
        "splitProductCompany": "=split('[-]',@(1,joinProductCompany))",
        "substringProduct": "=substring(@(1,product),0,4)",
        "value": "=leftPad(@(1,value),6,'A')",
        "measure": "=trim(@(1,measureWithSpaces))",
        "length": "=size(@(1,measure))"
      }
    }
  }
]
output.json
{
  "product": {
    "product": "product a",
    "company": "COMPANY A",
    "value": "AAA100",
    "measureWithSpaces": "  10 meters  ",
    "product_company": "product a_COMPANY A",
    "joinProductCompany": "product a - COMPANY A",
    "splitProductCompany": ["product a ", " COMPANY A"],
    "substringProduct": "prod",
    "measure": "10 meters",
    "length": 9
  }
}
Numeric
min, max, abs, avg, intSum, doubleSum, longSum, intSubtract, doubleSubtract, longSubtract, divide and divideAndRound
Numeric Function Example
input.json
{
  "product": {
    "value": 100,
    "measure": 10
  }
}
spec.json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "product": {
        "min": "=min(@(1,value),@(1,measure))",
        "max": "=max(@(1,value),@(1,measure))",
        "abs": "=abs(@(1,value))",
        "avg": "=avg(@(1,value),@(1,measure))",
        "intSum": "=intSum(@(1,value),@(1,measure))",
        "doubleSum": "=doubleSum(@(1,value),@(1,measure))",
        "longSum": "=longSum(@(1,value),@(1,measure))",
        "intSubtract": "=intSubtract(@(1,value),@(1,measure))",
        "doubleSubtract": "=doubleSubtract(@(1,value),@(1,measure))",
        "longSubtract": "=longSubtract(@(1,value),@(1,measure))",
        "divide": "=divide(@(1,value),@(1,measure))",
        "divideAndRound": "=divideAndRound(2,@(1,value),@(1,measure))"
      }
    }
  }
]
output.json
{
  "product": {
    "value": 100,
    "measure": 10,
    "min": 10,
    "max": 100,
    "abs": 100,
    "avg": 55.0,
    "intSum": 110,
    "doubleSum": 110.0,
    "longSum": 110,
    "intSubtract": 90,
    "doubleSubtract": 90.0,
    "longSubtract": 90,
    "divide": 10.0,
    "divideAndRound": 10.0
  }
}
Type
toInteger, toDouble, toLong, toBoolean, toString, recursivelySquashNulls, squashNulls, size
Type Function Example
input.json
{
  "product": {
    "value": 10.5,
    "stringBoolean": "true",
    "objectWithNull": {
      "fielWithValue": "ABC",
      "nullField": null
    }
  }
}
spec.json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "product": {
        "toInteger": "=toInteger(@(1,value))",
        "toDouble": "=toDouble(@(1,value))",
        "toLong": "=toLong(@(1,value))",
        "toBoolean": "=toBoolean(@(1,stringBoolean))",
        "toString": "=toString(@(1,value))",
        "recursivelySquashNulls": "=recursivelySquashNulls(@(1,objectWithNull))",
        "squashNulls": "=squashNulls(@(1,objectWithNull))",
        "size": "=size(@(1,objectWithNull))"
      }
    }
  }
]
output.json
{
  "product": {
    "value": 10.5,
    "stringBoolean": "true",
    "objectWithNull": {
      "fielWithValue": "ABC"
    },
    "toInteger": 10,
    "toDouble": 10.5,
    "toLong": 10,
    "toBoolean": true,
    "toString": "10.5",
    "recursivelySquashNulls": {
      "fielWithValue": "ABC"
    },
    "squashNulls": {
      "fielWithValue": "ABC"
    },
    "size": 1
  }
}
List/Array
firstElement, lastElement, elementAt, toList, sort
List/Array Function Example
input.json
{
  "product": {
    "array": ["c", "t", "m", "a"],
    "stringField": "123"
  }
}
spec.json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "product": {
        "firstElement": "=firstElement(@(1,array))",
        "lastElement": "=lastElement(@(1,array))",
        "elementAt": "=elementAt(2,@(1,array))",
        "toList": "=toList(@(1,stringField))",
        "sort": "=sort(@(1,array))"
      }
    }
  }
]
output.json
{
  "product": {
    "array": ["c", "t", "m", "a"],
    "stringField": "123",
    "firstElement": "c",
    "lastElement": "a",
    "elementAt": "m",
    "toList": ["123"],
    "sort": ["a", "c", "m", "t"]
  }
}
Multiplication in JOLT
JOLT has no function for multiplying two numbers directly, but we can multiply by dividing by the inverse: num1 × num2 = num2 ÷ (1 ÷ num1).
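The arithmetic behind this trick can be checked with a short Python sketch. The divide helper below only stands in for JOLT's divide function, and multiply_via_divide is a name made up for this example.

```python
def divide(numerator, denominator):
    """Stand-in for a two-argument divide function."""
    return numerator / denominator


def multiply_via_divide(num1, num2):
    """Multiply using division only: num2 / (1 / num1)."""
    inverse = divide(1, num1)     # first step: the inverse of num1
    return divide(num2, inverse)  # second step: dividing by the inverse multiplies


print(multiply_via_divide(2, 3))  # prints: 6.0
```

The trick breaks down when num1 is 0, because its inverse is undefined.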
input.json
{
"numbers": {
"num1": 2,
"num2": 3
}
}
spec.json
[
{
"operation": "modify-overwrite-beta",
"spec": {
"numbers": {
"inverse": "=divide(1,@(1,num1))",
"multiplied": "=divide(@(1,num2),@(1,inverse))"
}
}
}
]
output.json
{
"numbers": {
"num1": 2,
"num2": 3,
"inverse": 0.5,
"multiplied": 6.0
}
}
Convert Map/Object to List/Array
- $ is used to get each key of the object so that it can be written out as a value (here, under Title).
- @ is used to get each value so that it can be reassigned to a different key (here, Value).
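The same reshaping, with each key emitted as a Title and each value as a Value, can be sketched in Python. The function name map_to_list is an assumption made for this illustration.

```python
def map_to_list(ratings):
    """Turn {"Bad": 1, ...} into [{"Title": "Bad", "Value": 1}, ...]."""
    return [{"Title": key, "Value": value} for key, value in ratings.items()]


print(map_to_list({"Very Bad": 0, "Bad": 1}))
```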
input.json
{
"ratings": {
"Very Bad": 0,
"Bad": 1,
"Average": 2,
"Good": 3,
"Very Good ": 4,
"Exellent": 5
}
}
spec.json
[
{
"operation": "shift",
"spec": {
"ratings": {
"*": {
"$": "Ratings[#2].Title",
"@": "Ratings[#2].Value"
}
}
}
}
]
output.json
{
"Ratings": [
{
"Title": "Very Bad",
"Value": 0
},
{
"Title": "Bad",
"Value": 1
},
{
"Title": "Average",
"Value": 2
},
{
"Title": "Good",
"Value": 3
},
{
"Title": "Very Good ",
"Value": 4
},
{
"Title": "Exellent",
"Value": 5
}
]
}
Convert List/Array to Map/Object
input.json
{
"Photos": [
{
"Id": "327703",
"username": "rishu",
"Url": "http://bob.com/0001/327703/photo.jpg"
},
{
"Id": "327704",
"username": "sinchan",
"Url": "http://bob.com/0001/327704/photo.jpg"
}
]
}
spec.json
[
{
"operation": "shift",
"spec": {
"Photos": {
"*": {
"Id": "photo-&1-id",
"username": "photo-&1-caption",
"Url": "photo-&1-url"
}
}
}
}
]
output.json
{
"photo-0-id" : "327703",
"photo-0-caption" : "rishu",
"photo-0-url" : "http://bob.com/0001/327703/photo.jpg",
"photo-1-id" : "327704",
"photo-1-caption" : "sinchan",
"photo-1-url" : "http://bob.com/0001/327704/photo.jpg"
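}
Escaping Characters
A backslash (\) is used to escape special characters. Inside a JSON spec it is written as \\, because the backslash itself has to be escaped in a JSON string.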
input.json
{
"product": {
"@price": 10.5,
"quantity^": 10
}
}
spec.json
[
  {
    "operation": "shift",
    "spec": {
      "product": {
        "\\@price": "product.price",
        "quantity\\^": "product.quantity"
      }
    }
  }
]
output.json
{
  "product": {
    "price": 10.5,
    "quantity": 10
  }
}
Converting a list of maps to map of maps
- @ in LHS is used to select each element of the array.
- @ in RHS is used to get the value of a key (here, name), which becomes the key in the output.
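In Python terms, this amounts to indexing the list by one of its fields. The sketch below uses the hypothetical name index_by. Note one difference that is assumed away here: if two elements share the same name, a Python dict keeps only the last one, whereas a shift would collect them into a list.

```python
def index_by(items, key_field):
    """Build a dict that maps item[key_field] to the whole item."""
    return {item[key_field]: item for item in items}


cars = [
    {"name": "BMW", "power": "1700HP"},
    {"name": "Audi", "power": "1000HP"},
]
print(index_by(cars, "name"))
```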
input.json
{
"cars": [
{
"name": "BMW",
"power": "1700HP"
},
{
"name": "Mercedes",
"power": "1400HP"
},
{
"name": "Audi",
"power": "1000HP"
},
{
"name": "Ferrari",
"power": "1200HP"
}
]
}
spec.json
[
{
"operation": "shift",
"spec": {
"cars": {
"*": {
"@": "cars.@(1,name)"
}
}
}
}
]
output.json
{
"cars": {
"BMW": {
"name": "BMW",
"power": "1700HP"
},
"Mercedes": {
"name": "Mercedes",
"power": "1400HP"
},
"Audi": {
"name": "Audi",
"power": "1000HP"
},
"Ferrari": {
"name": "Ferrari",
"power": "1200HP"
}
}
}
Slice keys from both left and right
This requires three transformations:
- Convert the keys to values so that a string function can be applied to them.
- Apply a string function (here, substring) to trim each string.
- Convert the values back to keys.
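The three steps above can be sketched in Python as follows. The function name slice_keys is invented for this example, and the sketch assumes that all values are unique and hashable, since they temporarily serve as keys.

```python
def slice_keys(data, start, end):
    """Trim every key of data to key[start:end] using the three-step approach."""
    # Step 1: swap keys and values so the old keys become values.
    swapped = {value: key for key, value in data.items()}
    # Step 2: apply the string function (a substring) to each value.
    trimmed = {value: old_key[start:end] for value, old_key in swapped.items()}
    # Step 3: swap back so the trimmed strings become the keys again.
    return {new_key: value for value, new_key in trimmed.items()}


print(slice_keys({"a-p-x-g": "1", "a-q-y-h": "2"}, 2, 5))
```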
input.json
{
"a-p-x-g": "1",
"a-q-y-h": "2"
}
spec.json
[
{
"operation": "shift",
"spec": {
"*": {
"$": "@0"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=substring(@(1,&), 2, 5)"
}
},
{
"operation": "shift",
"spec": {
"*": {
"$": "@0"
}
}
}
]
output.json
{
"p-x": "1",
"q-y": "2"
}
| Wildcard | LHS | RHS | Operations |
|---|---|---|---|
| * | Used to loop over an array of objects or a map of objects | ❌ | shift, remove, cardinality, modify-default-beta, modify-overwrite-beta, default |
| $ | Selects keys; also used with * to match a pattern | ❌ | shift |
| & | Used to select subparts of a key when used with * | Used to specify a key name instead of writing it explicitly | shift, modify-default-beta, modify-overwrite-beta |
| @ | Selects the value of each key when used with * | Used to get a value | shift (LHS and RHS), modify-default-beta (RHS) and modify-overwrite-beta (RHS) |
| # | Used to apply a default or constant value | Only valid in the context of arrays | shift |
| \| | Matches multiple input keys | ❌ | shift |