Upper Case
Overview
Upper Case converts specified key values to uppercase while data is in-flight.
Number of Parameters
1
Parameter: Uppercase
Provide comma-separated keys in double quotes.
"first_name","last_name"
Lower Case
Overview
Lower Case converts specified key values to lowercase during data processing.
Number of Parameters
1
Parameter: Lowercase
"first_name","last_name"
Data Type
Overview
Data Type converts string values into Boolean, Float, Integer, or DateTime data types.
Number of Parameters
4
Boolean
"test_passed"
Float
"Amount"
Integer
"Quantity"
Date Time
"startweekdate1","%Y-%m-%d %H:%M:%S.%f%z","%Y-%m-%d %H:%M:%S", "startweekdate2","%Y-%m-%d %H:%M:%S","%Y-%m-%d"
Goldfinch Datalake Date Format:
%Y-%m-%dT%H:%M:%S.%f%z
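As a rough illustration of the conversions above, the Python sketch below casts string values and reformats a date string. It assumes each Date Time entry is a triple of key, input format, and output format, which is how the example parameter reads. The function name `convert_types` and the accepted Boolean spellings are hypothetical and are not taken from the product.

```python
from datetime import datetime

def convert_types(record, booleans=(), floats=(), integers=(), datetimes=()):
    """Return a copy of record with the listed keys converted from strings."""
    result = dict(record)
    for key in booleans:
        # Assumed truthy spellings; the real operation may accept others.
        result[key] = str(result[key]).strip().lower() in ("true", "1", "yes")
    for key in floats:
        result[key] = float(result[key])
    for key in integers:
        result[key] = int(result[key])
    for key, input_format, output_format in datetimes:
        # Parse with the input format, then write back in the output format.
        parsed = datetime.strptime(result[key], input_format)
        result[key] = parsed.strftime(output_format)
    return result

row = {
    "test_passed": "true",
    "Amount": "19.99",
    "Quantity": "3",
    "startweekdate2": "2024-01-15 08:30:00",
}
converted = convert_types(
    row,
    booleans=["test_passed"],
    floats=["Amount"],
    integers=["Quantity"],
    datetimes=[("startweekdate2", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d")],
)
```

The format strings are ordinary strftime/strptime directives, so a value that does not match its input format raises an error in this sketch rather than passing through silently.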
Append
Overview
Append adds new keys with static or dynamic values during data flow.
Number of Parameters
1
Examples
"export_flag_y":"Y","export_flag_p":"P"
"concatenate_key_name":"{%ORDERNUMBER%}|{%ORDER_TYPE%}"
Title Case
Overview
Title Case converts specified key values into title case format.
"amount","first_name"
Data Extractor
Overview
Extracts specified keys and their values from JSON responses.
"['access_token']","['feedDocumentId']"
Trim
"first_name"
JSON to String / String to JSON
"key1","key2"
JSON to XML / XML to JSON
product_data_response data_response
Base64 Encoding / Decoding
"email"
Generate Array Sequence Number
key_name DATA
Today Timestamp
%Y-%m-%dT%H:%M:%S.%f%z dl_insert_date
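A minimal Python sketch of this operation, assuming the first parameter is a strftime format and the second is the key that receives the current time. Using UTC is also an assumption; this document does not say which time zone the operation applies.

```python
from datetime import datetime, timezone

def add_today_timestamp(record, date_format, output_key):
    """Return a copy of record with the current UTC time stored under output_key."""
    result = dict(record)
    result[output_key] = datetime.now(timezone.utc).strftime(date_format)
    return result

stamped = add_today_timestamp({"id": 1}, "%Y-%m-%dT%H:%M:%S.%f%z", "dl_insert_date")
```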
Calculator
"Amount1-Amount2","Amount1+Amount2" "key1","key2"
Grok Pattern
Overview
Grok extracts structured data from unstructured text using predefined patterns.
This is endpoint url %{URI:endpoint_url} for mac add %{MAC:mac_address} and v4 %{IPV4:ip_address_v4} and V6 %{IPV6:ip_address_v6}
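To show how a `%{PATTERN:field}` expression maps onto ordinary regular expressions, the Python sketch below translates each token into a named group. The three pattern definitions are simplified stand-ins written for this example, not the real Grok definitions, and IPV6 is omitted for brevity.

```python
import re

# Simplified stand-ins for a few Grok patterns (real definitions are stricter).
GROK_PATTERNS = {
    "IPV4": r"(?:\d{1,3}\.){3}\d{1,3}",
    "MAC": r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}",
    "URI": r"[A-Za-z][A-Za-z0-9+.-]*://\S+",
}

def grok_to_regex(expression):
    """Turn %{PATTERN:field} tokens into named groups; escape the literal text."""
    parts = []
    position = 0
    for token in re.finditer(r"%\{(\w+):(\w+)\}", expression):
        parts.append(re.escape(expression[position:token.start()]))
        parts.append("(?P<%s>%s)" % (token.group(2), GROK_PATTERNS[token.group(1)]))
        position = token.end()
    parts.append(re.escape(expression[position:]))
    return re.compile("".join(parts))

def grok_match(expression, text):
    """Return the captured fields, or an empty dict when the text does not match."""
    match = grok_to_regex(expression).search(text)
    return match.groupdict() if match else {}

pattern = ("endpoint url %{URI:endpoint_url} for mac add %{MAC:mac_address} "
           "and v4 %{IPV4:ip_address_v4}")
line = ("This is endpoint url https://example.com/api for mac add "
        "00:1A:2B:3C:4D:5E and v4 192.168.1.10")
fields = grok_match(pattern, line)
```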
PDF Extractor
Items @xyz.grapgh.downloadUrl
Array Count
['bizdata_dataset_response']['data']
Raw Sentence Generator
"Name","Commands" Response
Time Units
timestamp "year","month","day","hour","minute","second","microsecond"
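One plausible reading of the example is that the first parameter names a timestamp key and the second lists the units to split out. The Python sketch below follows that reading. It assumes the value is in the Goldfinch Datalake date format and that each unit is written as a new key; both are assumptions, and `split_time_units` is a hypothetical name.

```python
from datetime import datetime

def split_time_units(record, key, units, date_format="%Y-%m-%dT%H:%M:%S.%f%z"):
    """Return a copy of record with one new key per requested time unit."""
    parsed = datetime.strptime(record[key], date_format)
    result = dict(record)
    for unit in units:
        result[unit] = getattr(parsed, unit)  # e.g. parsed.year, parsed.month
    return result

event = {"timestamp": "2024-01-15T08:30:45.000123+0000"}
parts = split_time_units(
    event, "timestamp",
    ["year", "month", "day", "hour", "minute", "second", "microsecond"],
)
```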
Data Chunking
['text'] chunks 1000 Data_Chunks
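A fixed-size split is the simplest form of chunking. The Python sketch below assumes that 1000 is a chunk size in characters and that Data_Chunks is the output key. The role of the `chunks` parameter is not described in this document, so it is left out.

```python
def chunk_text(text, chunk_size):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[start:start + chunk_size]
            for start in range(0, len(text), chunk_size)]

document = {"text": "a" * 2500}
document["Data_Chunks"] = chunk_text(document["text"], 1000)
```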
Extract to Array Operation
"data" ["id","content"] ["ids","documents"]
HTML Extractor
Description:
The HTML Extractor operation extracts textual and structured data from given HTML content.
Number of Parameters : 2
Parameter : Input HTML Key
Provide the key name that contains the raw HTML data we aim to extract information from.
Below is an example where we provide the key containing HTML content.
bizdata_dataset_response
Parameter : Output Data Key
Specify the key name where you want to store the extracted structured data.
Below is an example where we aim to store the extracted data in the html_text_data key.
html_text_data
Various Use Cases for the Parameters:
Case 1:
When it’s necessary to extract plain text content from an HTML string.
Input HTML Key
bizdata_dataset_response
Output Data Key
html_text_data
Example for Case 1:
Input JSON
{
"bizdata_dataset_response": "<html><body><h1>Hello World</h1><p>This is a paragraph.</p></body></html>"
}
Result
{
"html_text_data": "Hello World This is a paragraph."
}
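The Case 1 result can be reproduced with Python's standard `html.parser` module. This is a minimal sketch of plain-text extraction only, not the operation's actual implementation; it also keeps the text of script and style elements, which a production extractor would likely drop.

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Collect the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.pieces.append(text)

def extract_html_text(record, input_key, output_key):
    """Return the text found under input_key, joined by single spaces."""
    collector = _TextCollector()
    collector.feed(record[input_key])
    collector.close()
    return {output_key: " ".join(collector.pieces)}

page = {"bizdata_dataset_response":
        "<html><body><h1>Hello World</h1><p>This is a paragraph.</p></body></html>"}
extracted = extract_html_text(page, "bizdata_dataset_response", "html_text_data")
```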
File Extractor
Description:
The File Extractor operation extracts textual data from various file formats such as txt, docs, ppt, pdf, and many others.
Number of Parameters : 1
Parameter : File Data Key
Provide the key name that contains the bytes data of the file to be extracted.
Below is an example where we provide the key containing the file data.
bizdata_dataset_response
Various Use Cases for the Parameters:
Case 1:
When you have a document represented as bytes and want to pull out its text content.
File Data Key
bizdata_dataset_response
Example for Case 1:
Input JSON
{
"bizdata_dataset_response": "b'%PDF-1.1\\n1 0 obj\\n<<>>\\nstream\\nBT (Hello , This is a File Extractor ops.) Tj ET\\nendstream\\nendobj\\n%%EOF'"
}
Result
{
"extracted_text": "Hello , This is a File Extractor ops.",
"extracted_Images": [],
"extracted_tables": []
}
JSON to Avro
Description:
The JSON to AVRO operation converts your structured JSON data into the AVRO format using a specified valid schema.
Number of Parameters : 3
Parameter : JSON Data Key
Provide the key name that contains the JSON structured data which we aim to convert into the AVRO format.
Below is an example where we provide the data key containing the JSON data.
bizdata_dataset_response
Parameter : AVRO Schema
Provide the AVRO schema in JSON format that will be used to validate and parse the JSON data into AVRO bytes.
Below is an example where we specify the schema.
{"type": "record", "name": "User", "fields": [{"name": "name", "type": "string"}, {"name":
Parameter : AVRO Data Key
Specify the key name where you want to store the converted AVRO byte data.
Below is an example where we aim to store the converted data in the avro_data_key.
avro_data_key
Various Use Cases for the Parameters:
Case 1:
When you have a simple JSON record and a matching AVRO schema and want to serialize it.
JSON Data Key
bizdata_dataset_response
AVRO Schema
{
"type": "record",
"name": "Customer",
"fields": [
{"name": "CREATEDDATE", "type": "string"},
{"name": "CUSTOMERCITY", "type": "string"},
{"name": "CUSTOMERCOUNTRY", "type": "string"},
{"name": "CUSTOMEREMAIL", "type": "string"},
{"name": "CUSTOMERNAME", "type": "string"},
{"name": "CUSTOMERPHONE", "type": "string"},
{"name": "CUSTOMERSTATE", "type": "string"},
{"name": "CUSTOMERZIPCODE", "type": "string"},
{"name": "ERPCUSTOMER", "type": "string"},
{"name": "CUSTOMER_ID", "type": "string"},
{"name": "ID", "type": "string"}
]
}
AVRO Data Key
avro_data_key
Example for Case 1:
Input JSON
{
"bizdata_dataset_response": {
"CREATEDDATE": "15-01-2024",
"CUSTOMERCITY": "Bengaluru",
"CUSTOMERCOUNTRY": "India",
"CUSTOMEREMAIL": "john.doe@email.com",
"CUSTOMERNAME": "John Doe",
"CUSTOMERPHONE": "9876543210",
"CUSTOMERSTATE": "Karnataka",
"CUSTOMERZIPCODE": "560001",
"ERPCUSTOMER": "ERP001",
"CUSTOMER_ID": "CUST001",
"ID": "1"
}
}
Result
{
"avro_data_key": "b'Obj\\x01\\x04\\x14avro.codec\\x08null\\x16avro.schema\\xa4\\x08{\"type\": \"record\", \"name\": \"Customer\", \"fields\": [{\"name\": \"CREATEDDATE\", \"type\": \"string\"}, {\"name\": \"CUSTOMERCITY\", \"type\": \"string\"}, {\"name\": \"CUSTOMERCOUNTRY\", \"type\": \"string\"}, {\"name\": \"CUSTOMEREMAIL\", \"type\": \"string\"}, {\"name\": \"CUSTOMERNAME\", \"type\": \"string\"}, {\"name\": \"CUSTOMERPHONE\", \"type\": \"string\"}, {\"name\": \"CUSTOMERSTATE\", \"type\": \"string\"}, {\"name\": \"CUSTOMERZIPCODE\", \"type\": \"string\"}, {\"name\": \"ERPCUSTOMER\", \"type\": \"string\"}, {\"name\": \"CUSTOMER_ID\", \"type\": \"string\"}, {\"name\": \"ID\", \"type\": \"string\"}]}\\x00\\xb7\\x0e\\xecd\\x1b\\x1bZ]\\xa8 \\xddC\\xa9\\xf9j\\x01\\x02\\xc8\\x01\\x1415-01-2024\\x12Bengaluru\\nIndia$john.doe@email.com\\x10John Doe\\x149876543210\\x12Karnataka\\x0c560001\\x0cERP001\\x0eCUST001\\x021\\xb7\\x0e\\xecd\\x1b\\x1bZ]\\xa8 \\xddC\\xa9\\xf9j\\x01'"
}
Avro to JSON
Description:
The AVRO to JSON operation converts AVRO formatted byte data back into structured JSON data.
Number of Parameters : 2
Parameter : AVRO Data Key
Provide the key name that contains the AVRO byte data which we aim to parse and convert into JSON.
Below is an example where we provide the key containing AVRO data.
avro_data_key
Parameter : JSON Data Key
Specify the key name where you want to store the parsed and converted JSON data.
Below is an example where we aim to store the parsed data.
json_data_response
Various Use Cases for the Parameters:
Case 1:
When you have valid AVRO bytes and need to convert them into a readable JSON object.
AVRO Data Key
avro_data_key
JSON Data Key
json_data_response
Example for Case 1:
Input JSON
{
"avro_data_key": "b'Obj\\x01\\x04\\x14avro.codec\\x08null\\x16avro.schema\\xa4\\x08{\"type\": \"record\", \"name\": \"Customer\", \"fields\": [{\"name\": \"CREATEDDATE\", \"type\": \"string\"}, {\"name\": \"CUSTOMERCITY\", \"type\": \"string\"}, {\"name\": \"CUSTOMERCOUNTRY\", \"type\": \"string\"}, {\"name\": \"CUSTOMEREMAIL\", \"type\": \"string\"}, {\"name\": \"CUSTOMERNAME\", \"type\": \"string\"}, {\"name\": \"CUSTOMERPHONE\", \"type\": \"string\"}, {\"name\": \"CUSTOMERSTATE\", \"type\": \"string\"}, {\"name\": \"CUSTOMERZIPCODE\", \"type\": \"string\"}, {\"name\": \"ERPCUSTOMER\", \"type\": \"string\"}, {\"name\": \"CUSTOMER_ID\", \"type\": \"string\"}, {\"name\": \"ID\", \"type\": \"string\"}]}\\x00\\xb7\\x0e\\xecd\\x1b\\x1bZ]\\xa8 \\xddC\\xa9\\xf9j\\x01\\x02\\xc8\\x01\\x1415-01-2024\\x12Bengaluru\\nIndia$john.doe@email.com\\x10John Doe\\x149876543210\\x12Karnataka\\x0c560001\\x0cERP001\\x0eCUST001\\x021\\xb7\\x0e\\xecd\\x1b\\x1bZ]\\xa8 \\xddC\\xa9\\xf9j\\x01'"
}
Result
{
"json_data_response": {
"CREATEDDATE": "15-01-2024",
"CUSTOMERCITY": "Bengaluru",
"CUSTOMERCOUNTRY": "India",
"CUSTOMEREMAIL": "john.doe@email.com",
"CUSTOMERNAME": "John Doe",
"CUSTOMERPHONE": "9876543210",
"CUSTOMERSTATE": "Karnataka",
"CUSTOMERZIPCODE": "560001",
"ERPCUSTOMER": "ERP001",
"CUSTOMER_ID": "CUST001",
"ID": "1"
}
}
Filter Ends
Description:
This operation ends the filter for a given streaming pipeline.
Number of Parameters : 0
When Ending a Filter Condition:
Always use the Filter End operation once data transformation and data transfer are completed.
The Filter End operation finalizes the active filter condition. After Filter End, no further filtering is applied, and the original data flow is restored.
Important:
If more than one filter condition is used within the same flow, applying Filter End after each filter block is mandatory to avoid unintended data filtering and to ensure correct data processing.
Frequently Asked Questions
What does the Upper Case operation do?
The Upper Case operation converts the values of specified keys into uppercase format while the data is in-flight within the pipeline.
What does the Lower Case operation do?
The Lower Case operation converts the values of selected keys into lowercase format during data processing.
How does the Data Type operation work?
The Data Type operation converts string values into their respective data types such as Boolean, Float, Integer, or DateTime based on the provided configuration.
When should I use the Append operation?
Use the Append operation when you need to add new keys with static values or dynamic values derived from existing pipeline data.
What is the purpose of Title Case?
Title Case converts the values of specified keys so that each word starts with an uppercase letter.
How does Data Extractor help?
Data Extractor retrieves specific keys and their corresponding values from a JSON response for further processing.
What is the difference between JSON to String and String to JSON?
JSON to String converts structured JSON data into a string format, while String to JSON parses a string and converts it into structured JSON.
When should JSON to XML or XML to JSON be used?
These operations are used when converting data between JSON and XML formats to meet integration or system requirements.
Notes
- Validate key names before execution.
- Ensure correct datatype formats during conversion.
- Test transformations using sample datasets.