Integration
Lyftrondata Data Mirror is a feature within the Lyftrondata platform that replicates and synchronizes data from various sources to a centralized data warehouse or data lake in real time or near real time. This helps create an up-to-date, unified view of data across the organization.
Integration is divided into 5 simple steps:
Prep
Select Source
Select Target
Configuration
Confirm
Warehouse:
A warehouse (also known as a virtual warehouse) is a key component that plays a central role in processing integrations. It is a cluster of computing resources (e.g., CPU, memory) that users can provision to perform data processing tasks.
Components of Warehouse:
Whitelist Warehouse IP:
To ensure proper functionality, whitelist the warehouse IP address in your environment. This allows the necessary access and prevents connectivity issues.
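Once the IP is whitelisted, you can verify basic network reachability from your environment. A minimal sketch, assuming a hypothetical warehouse IP and port (take the real values from the warehouse components listed later on this page):

```python
import socket

# Hypothetical values: replace with the public IP shown for your
# Lyftrondata warehouse and the port your firewall rule opens.
WAREHOUSE_IP = "203.0.113.10"
PORT = 443

# Attempt a TCP connection to confirm the whitelist rule is in effect.
try:
    with socket.create_connection((WAREHOUSE_IP, PORT), timeout=5):
        print("Connection succeeded: the IP appears to be whitelisted.")
except OSError as exc:
    print(f"Connection failed: {exc}. Check your firewall/whitelist rules.")
```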
Select Source: Lyftrondata integrates with over 300 data sources. You simply need to select the source from which to load your data into the target.
You need to complete the prerequisites for the API in order to obtain the credentials. Some APIs require payment, while others are free to use. This guide uses the Freshsales API as an example.
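As an illustration, the personal token and base URL gathered in the prerequisites step are enough to call the Freshsales API directly. A minimal sketch using the requests library; the domain and endpoint path are placeholders, so consult the Freshsales API documentation for the exact routes available to your account:

```python
import requests

# Placeholders: use the credentials gathered in the prerequisites step.
BASE_URL = "https://yourdomain.myfreshworks.com/crm/sales"  # your Freshsales base URL
PERSONAL_TOKEN = "your-personal-token"

# Freshsales authenticates with a personal token in the Authorization header.
headers = {
    "Authorization": f"Token token={PERSONAL_TOKEN}",
    "Content-Type": "application/json",
}

# Fetch one page of contacts to confirm the credentials work
# (the view id in the path is illustrative).
response = requests.get(f"{BASE_URL}/api/contacts/view/1", headers=headers, timeout=30)
response.raise_for_status()
print(response.json())
```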
Select Target: Lyftrondata's target refers to the destination where data is transferred, transformed, or loaded during data integration processes. It could include databases, data warehouses, data lakes, cloud storage services, or other platforms where the processed data is ultimately stored or used for further analysis and reporting.
Target Snowflake Connection Video:
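Before entering these details in Lyftrondata, you can sanity-check the same credentials yourself. A minimal sketch using the snowflake-connector-python package; all values are placeholders that mirror the required Snowflake connection fields listed later on this page:

```python
import snowflake.connector

# Placeholders: these mirror the required Snowflake connection fields.
conn = snowflake.connector.connect(
    account="your_account",      # derived from your Snowflake account URL
    user="your_username",
    password="your_password",
    role="your_role",
    warehouse="your_warehouse",
    database="your_database",
    schema="your_schema",
)

# A trivial query to confirm the target is reachable and the session
# context (warehouse/database/schema) resolved as expected.
cur = conn.cursor()
try:
    cur.execute("SELECT CURRENT_WAREHOUSE(), CURRENT_DATABASE(), CURRENT_SCHEMA()")
    print(cur.fetchone())
finally:
    cur.close()
    conn.close()
```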
After setting up the target, the integration configuration process begins, defining data flow through mappings, transformations, and schedules for efficient, accurate processing. Batches manage data transfer size and frequency to optimize performance, while logging tracks each step for troubleshooting and monitoring. Webhooks trigger actions on event-based notifications, enhancing automation in real-time data workflows.
You need to select the target schema in the load configuration.
You can schedule the integration to run at a time of your choosing.
If you want to receive notifications through email or a Slack channel, you can configure that. You will receive a notification for every event, whether it passes or fails.
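Slack notifications of this kind are typically delivered through an incoming-webhook URL. A minimal sketch of the sort of message such a webhook receives; the URL and integration name are placeholders, and Lyftrondata sends these for you once configured:

```python
import requests

# Placeholder: an incoming-webhook URL generated in your Slack workspace.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"

# Slack incoming webhooks expect a simple JSON object with a "text" field.
payload = {"text": "Integration 'freshsales_to_snowflake' finished: status=PASSED"}

response = requests.post(SLACK_WEBHOOK_URL, json=payload, timeout=10)
response.raise_for_status()
```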
You have the option to select your preferred logging service for tracking and monitoring your data integration processes. Choose between Lyftrondata or CloudWatch to ensure you receive timely and detailed logs of all activities.
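If you route logs to CloudWatch, you can also query them outside the Lyftrondata UI. A minimal sketch using boto3; the log group name is hypothetical, so substitute whatever group your configuration writes to:

```python
import boto3

# Hypothetical log group: substitute the group your integration logs to.
LOG_GROUP = "/lyftrondata/integrations/freshsales_to_snowflake"

client = boto3.client("logs", region_name="us-east-1")

# Pull the most recent events, e.g. to troubleshoot a failed run.
events = client.filter_log_events(logGroupName=LOG_GROUP, limit=20)
for event in events["events"]:
    print(event["message"])
```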
You can also set up webhook calls to receive real-time notifications and updates. This allows you to react instantly to events and integrate seamlessly with other systems.
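On the receiving side, a webhook call is just an HTTP POST to an endpoint you control. A minimal sketch of a receiver using Flask; the route and payload shape are illustrative, so inspect what your configured webhook actually sends:

```python
from flask import Flask, request

app = Flask(__name__)

# The endpoint you would register as the webhook URL in the integration config.
@app.route("/lyftrondata/webhook", methods=["POST"])
def handle_event():
    event = request.get_json(silent=True) or {}
    # React to the event, e.g. kick off a downstream job on success.
    print(f"Received event: {event.get('status', 'unknown')}")
    return {"ok": True}, 200

if __name__ == "__main__":
    app.run(port=8080)
```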
Data Mirror Integration:
The tables below summarize the parameters used at each step of the integration.

Warehouse Components:

Key | Value |
---|---|
Name | Warehouse name |
CPU | Number of CPUs |
Memory | Amount of memory |
IP address | Public IP address |
Source Connection Parameters (Freshsales):

Key | Description |
---|---|
Connection Name | A meaningful name for the connection. |
Description | A short description of the connection. |
Tag | Keywords or labels assigned to a data connection to categorize and organize it. |

Key | Value |
---|---|
Personal Token | Your Freshsales API personal token. |
Base URL | Your Freshsales API base URL. |
Hostname | Your Freshsales hostname. |
Target Connection Parameters (Snowflake):

Key | Description |
---|---|
Connection Name | A meaningful name for the connection. |
Description | A short description of the connection. |
Tag | Keywords or labels assigned to a data connection to categorize and organize it. |

Key | Value | Field |
---|---|---|
URL | Your Snowflake account URL. | Required |
Username | Your Snowflake username. | Required |
Password | Your Snowflake password. | Required |
Schema | Your Snowflake schema. | Required |
Role | Your Snowflake role. | Required |
Warehouse | Your Snowflake warehouse. | Required |
Database | Your Snowflake database. | Required |
Load Configuration Parameters:

Config Parameters | Description |
---|---|
Batch Size | The number of data records processed together in a single operation, optimizing performance and resource use. |
Select Memory Size | The amount of memory allocated for the task. |
Regex | A sequence of characters that defines a search pattern for matching, replacing, and extracting text. |
Die on Error | Immediately stops the process when an error occurs, preventing any further execution. |
Process Method | The choice between the Parquet and Avro file formats during data processing. |
Pipeline Parallelism | The number of pipelines that can run in parallel. |
Enable Multithreading | Executes multiple threads concurrently, improving performance by utilizing multiple CPU cores effectively. |
Pipeline Per DAG Limit | The maximum number of pipelines/tables you can select in a single integration. |
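To make the batching and parallelism parameters concrete, here is a small sketch of how records might be chunked into batches and loaded by multiple worker threads. It illustrates the concepts only, not Lyftrondata's implementation; the constants correspond to the Batch Size, Enable Multithreading/Pipeline Parallelism, and Die on Error settings above:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 500      # cf. "Batch Size": records per operation
MAX_WORKERS = 4       # cf. multithreading/parallelism settings
DIE_ON_ERROR = True   # cf. "Die on Error": stop on the first failure

def batches(records, size):
    """Yield successive fixed-size chunks of the record list."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

def load_batch(batch):
    # Stand-in for loading one batch of records into the target.
    print(f"Loading {len(batch)} records...")
    return len(batch)

records = [{"id": i} for i in range(2_000)]

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    futures = [pool.submit(load_batch, b) for b in batches(records, BATCH_SIZE)]
    for future in futures:
        try:
            future.result()
        except Exception:
            if DIE_ON_ERROR:
                raise  # stop immediately, mirroring "Die on Error"
```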