K3 Documents

Amazon Redshift Source Adaptor

Updated on

Amazon Redshift is an Internet hosting service and data warehouse product which forms part of the larger cloud-computing platform Amazon Web Services.

The K3 Other SQL Source Adaptor supports pulling data from an Amazon Redshift database. The following instructions will walk you through how to set up a Redshift adaptor.

1. Select Other SQL as the Source Adaptor 

2. Establish a Connection to the Redshift Database

Click Database Credentials. Enter the required parameters and click Test Credentials. If the connection information is correct, a checkmark will appear next to the Test Credentials button. Click OK to close the Credentials window.

Configuration Parameter Description
Classname Java class name for the database driver
Subprotocol The official database protocol. This defines the connection string for the database.
Subname

Enter the Database Server and database using the following format:

//<DB_ServerName>:<Port>;databaseName=<DB Name>

User
Enter a user that has permission to read from the Redshift database the user will be connecting to.
Password Enter a password for the user provided in the User field

3. Build the query

From the configuration page, click Build Query. A pop up window will appear with a premade SQL query started for the user. At this point the user may follow one of the options below:

Option 1: The user can  keep the premade query and select a table from the drop down menu. This will pull all data from the chosen table without any filtering.

Option 2:  A custom query can be written in place of the premade query to pull from any available table with whatever filtering is desired.

Once the query is written, click Execute Query. A sample of the data received will be displayed in the right hand window. 

4. Select the delta poll parameter

On the right-hand side of the query pop-up box, K3 will highlight columns it recommends using for delta polling. K3 will only recommend columns that are DateTime columns or incrementing integer columns. K3 will not allow users to select a column that is not highlighted. If the user wants to use a non-highlighted column as the delta poll parameter, the user must manually enter the delta poll parameter.

To use one of the recommended columns for delta polling, add a WHERE clause with the column's name followed by a greater than sign, then click the column name from the right-hand plane.  

Example: 

WHERE snapshot_timestamp > ${rs.col('snapshot_timestamp').max()|$date.toDate('iso_tz','1970-01-01T00:00:00Z')}

In this example, the delta poll parameter will be snapshot_timestamp. Each new poll of the database will only ingest data with a snapshot timestamp that occurred after the latest snapshot timestamp K3 previously ingested.

Once the delta poll parameter has been chosen, click OK.


5. Adjust the poll schedule

K3 currently supports Cron scheduling, which is the default, and periodic scheduling.

For assistance in Cron syntax, please visit https://crontab.guru/. 

6. Name the Route and click Create

Once the database connection is established, the SQL query is approved, and the poll schedule is set, name the route by clicking Route Name located at the top left of the configuration page and typing the name, then click Create to create the route.

Configuration Field Value
Schedule Cron (only available if you select Cron Scheduling) Enter the Cron schedule script to determine when the poll will run.
Poll Delay  (only available if you select Periodic Scheduling) Enter the Poll Delay in milliseconds to determine when the poll will run.
Previous Article Change your password
Next Article Amazon Redshift Target Adaptor