Follow

Connecting to Redshift

You can configure Domino to query an external data warehouse such as Redshift during a run. This guide shows how to create a connection and retrieve the results of a query using Python as well as R.

Note: if your database is behind a firewall, you may need to speak with your ops/IT/networking team in order to allow the connection from Domino. Please contact us for more information, or to set up a meeting with your team to work out the details.

Credentials

Your database is likely secured with a username and password. We recommend storing these credentials as environment variables in your project so you can access them at runtime without needing to include them in your code.

Python

To establish a connection to Redshift with the psycopg2 library:

import psycopg2
import os

HOST = os.environ['REDSHIFT_HOST']
PORT = 5439 # redshift default
USER = os.environ['REDSHIFT_USER']
PASSWORD = os.environ['REDSHIFT_PASSWD']
DATABASE = 'mydatabase'

def db_connection():
    conn = psycopg2.connect(
        host=HOST,
        port=PORT,
        user=USER,
        password=PASSWORD,
        database=DATABASE,
    )
    return conn

example_query = "SELECT * FROM my_table LIMIT 5"

conn = db_connection()
try:
    cursor = conn.cursor()
    cursor.execute(example_query)
    results = cursor.fetchall() # careful, the results could be huge
    conn.commit()
    print results
finally:
    conn.close()

# using pandas
import pandas as pd
conn = db_connection()
try:
    df = pd.read_sql(example_query, conn)
    df.to_csv('results/outfile.csv', index=False)
finally:
    conn.close()

 

R

To establish a connection to Redshift with the RPostgreSQL library:

install.packages("RPostgreSQL")
library(RPostgreSQL)

redshift_host <- Sys.getenv("REDSHIFT_HOST")
redshift_port <- "5439"
redshift_user <- Sys.getenv("REDSHIFT_USER")
redshift_password <- Sys.getenv("REDSHIFT_PASSWORD")
redshift_db <- "mydatabase"

drv <- dbDriver("PostgreSQL")
conn <- dbConnect(
drv,
host=redshift_host,
port=redshift_port,
user=redshift_user,
password=redshift_password,
dbname=redshift_db)

tryCatch({
example_query <- "SELECT * FROM my_table LIMIT 5"
results <- dbGetQuery(conn, example_query)
print(results)
}, finally = {
dbDisconnect(conn)
})
Was this article helpful?
0 out of 0 found this helpful

Comments