First steps in Domino

This article will guide you through your first steps in Domino. You'll be working with some sample data from the Global Power Plant Database. You'll see examples of Jupyter, Dash, pandas, and NumPy used in Domino.

In Part 1, you'll learn how to:

  • create a project
  • launch a workspace
  • retrieve data for use in Domino

In Part 2, you'll learn how to:

  • create new files in your project
  • publish an App
  • share your work with others

 

Part 1

The first thing you'll see after logging into Domino is the Projects page, displaying your projects.

 

starting-out.png

 

Every new user owns an automatically created quick-start project. This project contains a README with tips and instructions for working in Domino. It's a useful reference, but for this tutorial you'll want a fresh project.

Click New Project to get started.

 

project-creation.png

 

Give your project an informative name, set its visibility to Private, then click Create Project.

Usually, after you create a project you'll want to apply some settings appropriate for the work you plan to do. The software environment your code will run in is controlled by the Domino Environment your project is configured to use. For this tutorial, any of the prepackaged Domino Standard Environments will work, so you can leave the project settings at their defaults.

Click Workspaces from the project menu to continue.

 


 

Select Jupyter, give the workspace an informative name, then click Launch Jupyter Workspace.

When you launch a workspace in Domino, a new containerized session is created on a machine in the required hardware tier. The workspace tool you requested is launched in that container, and your browser is automatically redirected to the workspace's UI when it's ready.

Once your workspace is up and running, you will see a fresh Jupyter interface.

 

new-workspace.png

 

You can see from the file path that you're in /mnt. By default, this is considered the root of your Domino project. When your project files are loaded onto an executor machine like this one, they'll be placed here. If you add or modify files in /mnt, you can save them back to your project when you stop or sync the workspace.
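
To confirm where a notebook is running and which files are present, a small helper like the one below can be used. This is a minimal sketch; list_project_files is a hypothetical name, not part of Domino, and the printed path will only be /mnt when run inside a Domino workspace.

```python
from pathlib import Path


def list_project_files(root=None):
    """Return the sorted names of regular files directly under root.

    Hypothetical helper for illustration; defaults to the current
    working directory, which is /mnt in a default Domino workspace.
    """
    root = Path(root) if root is not None else Path.cwd()
    return sorted(p.name for p in root.iterdir() if p.is_file())


print(Path.cwd())            # working directory of the notebook or script
print(list_project_files())  # files that sit next to it
```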

Your next step is to download some data to the executor.

Use the New menu to open a Jupyter terminal.

 

new-jupyter-terminal.png

 

Once your terminal is open, run the following command to fetch some data exported from the Global Power Plant Database:

wget https://s3-us-west-2.amazonaws.com/dominodatalab-gppd/global_power_plant_database.csv
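
If wget is not available in your environment, the same download can be sketched from a notebook cell with the Python standard library. The download function below is a hypothetical helper written for this example, and the sketch assumes the executor has outbound network access.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url, dest):
    """Stream url to dest and return the destination path.

    Hypothetical helper: skips the download if dest already exists.
    """
    dest = Path(dest)
    if dest.exists():
        return dest
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call (left commented out so the cell has no side effects):
# download(
#     "https://s3-us-west-2.amazonaws.com/dominodatalab-gppd/global_power_plant_database.csv",
#     "global_power_plant_database.csv",
# )
```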

 


 

Click the Jupyter logo at the top to return to the files browser. You should see a file named global_power_plant_database.csv in /mnt. Your next step is to do some basic manipulation of the data.

Use the New menu to create a Python notebook.

 


 

In Jupyter, you enter Python code in cells and hit Shift+Enter to execute the focused cell. Each time you execute a cell, you step forward to the next program state, as though you were typing each cell into the Python interpreter.

In your first cell, enter these lines to import some necessary packages, then hit Shift+Enter to execute:

import pandas as pd
import numpy as np

In the next cell, read the file you downloaded into a pandas dataframe, then display the data:

df = pd.read_csv('global_power_plant_database.csv')
df
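
Before filtering, it can help to check the shape, column types, and missing values of a dataframe. The sketch below does this on a made-up three-row sample rather than the real file; the rows and the name column are illustrative assumptions, while fuel1, commissioning_year, and estimated_generation_gwh are the columns used in this tutorial.

```python
import io

import pandas as pd

# Made-up rows for illustration only; not taken from the real database.
sample_csv = io.StringIO(
    "name,fuel1,commissioning_year,estimated_generation_gwh\n"
    "Plant A,Coal,1995.0,1200.5\n"
    "Plant B,Hydro,,300.0\n"
    "Plant C,Gas,2004.5,\n"
)
sample = pd.read_csv(sample_csv)

print(sample.shape)         # (rows, columns)
print(sample.dtypes)        # numeric columns with gaps are read as float64
print(sample.isna().sum())  # count of missing values per column
```

The same three calls work unchanged on the df loaded from global_power_plant_database.csv.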

 


 

You can now see a truncated table of the data. Let's answer a simple question:

For power plants commissioned between 1990 and 2000, how many gigawatt hours are produced by each fuel source today?

To answer, you'll want to clean up the data a bit. Filter down to only those plants that have a positive estimated generation:

df = df[df.estimated_generation_gwh > 0]

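After that, filter down to plants with a commissioning_year in the target range, discard any decimal places on those dates, and then display the data again. Note that the comparisons below are strict, so plants dated exactly 1990 or 2000 are excluded; use >= and <= if you want to include them.

# Keep plants commissioned strictly after 1990 and strictly before 2000.
df = df[df.commissioning_year > 1990]
# Copy the filtered rows so the column assignment below modifies a real dataframe.
df = df[df.commissioning_year < 2000].copy()
# Truncate fractional years, e.g. 1995.5 becomes 1995.
df.commissioning_year = df['commissioning_year'].astype(int)
df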
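
The effect of strict versus inclusive bounds on commissioning_year can be checked on a toy column. This is a minimal sketch with made-up values; Series.between is a standard pandas method whose bounds are inclusive by default.

```python
import pandas as pd

# Made-up years, including fractional values like those in the real data.
toy = pd.DataFrame({"commissioning_year": [1989.0, 1990.0, 1995.5, 2000.0, 2001.0]})

# Strict comparison: excludes 1990.0 and 2000.0.
strict = toy[(toy.commissioning_year > 1990) & (toy.commissioning_year < 2000)]

# Inclusive comparison: keeps both endpoints.
inclusive = toy[toy.commissioning_year.between(1990, 2000)]

print(strict.commissioning_year.tolist())     # [1995.5]
print(inclusive.commissioning_year.tolist())  # [1990.0, 1995.5, 2000.0]

# astype(int) truncates the fractional part rather than rounding.
print(inclusive.commissioning_year.astype(int).tolist())  # [1990, 1995, 2000]
```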

 


 

Now that you've got the data you're interested in, you can group the plants by the fuel1 column, and then aggregate estimated_generation_gwh and display the results:

fuels = df.groupby('fuel1')
totals = fuels['estimated_generation_gwh'].agg(np.sum)
totals
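
To read off which fuel contributes most, the grouped totals can be sorted and converted to shares. The sketch below uses made-up numbers, not values from the database, so the ranking it prints says nothing about the real data.

```python
import pandas as pd

# Made-up generation figures for illustration only.
toy = pd.DataFrame({
    "fuel1": ["Coal", "Gas", "Coal", "Hydro", "Gas"],
    "estimated_generation_gwh": [100.0, 40.0, 50.0, 30.0, 20.0],
})

# Sum generation per fuel, then rank from largest to smallest.
totals = toy.groupby("fuel1")["estimated_generation_gwh"].sum()
ranked = totals.sort_values(ascending=False)

# Each fuel's share of the overall total, as a percentage.
share = (ranked / ranked.sum() * 100).round(1)

print(ranked)
print(share)
```

Applied to the tutorial's totals series, totals.sort_values(ascending=False) gives the same kind of ranking.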

 


 

The totals show that, among power plants commissioned between 1990 and 2000, coal accounts for the most estimated gigawatt hours, followed by natural gas and hydroelectric power. This is an interesting start, so you should save your work.

Click Stop from the top menu.

 


 

Domino will show you which files have changed during your workspace session, and prompt you to commit them back to your project files. Enter an informative commit message, then click Stop and Commit. Once the workspace shuts down you can close it. If you return to your project in Domino and look at the Files page, you'll see the raw data and the notebook file have been saved in the latest revision. If you start a new workspace, those files will be loaded into /mnt and you can resume where you left off.

 

Part 2

Now that you've found some interesting data, you can use Domino to share it in a way that's easier to browse and consume than a Python notebook. Domino can host Apps built with popular web application frameworks. This lets you power interactive visualizations with your Domino data, and quickly share insights.

Return to your project from Part 1, and click Files from the project menu.

 


 

Click the Add File button. Name the new file app.py, and paste in the following code for a Dash application. This code builds on the aggregation-by-fuel idea from Part 1, and shows the estimated generation, by fuel type, of the plants commissioned in each year.

 

import dash
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd
import numpy as np
import plotly.graph_objs as go

app = dash.Dash()
app.config.requests_pathname_prefix = ''

# Load the data and keep plants commissioned from 1990 through 2000
# that have a positive estimated generation.
df = pd.read_csv('global_power_plant_database.csv')
df = df[df.estimated_generation_gwh > 0]
df = df[df.commissioning_year >= 1990]
df = df[df.commissioning_year <= 2000].copy()
df.commissioning_year = df['commissioning_year'].astype(int)

# Page layout: a bar chart with a year slider underneath.
app.layout = html.Div(style={'paddingLeft': '40px', 'paddingRight': '40px'}, children=[
    dcc.Graph(id='graph-with-slider'),
    dcc.Slider(
        id='year-slider',
        min=df['commissioning_year'].min(),
        max=df['commissioning_year'].max(),
        value=df['commissioning_year'].min(),
        step=None,
        marks={str(year): str(year) for year in df['commissioning_year'].unique()}
    )
])

# Redraw the chart whenever the slider value changes.
@app.callback(
    dash.dependencies.Output('graph-with-slider', 'figure'),
    [dash.dependencies.Input('year-slider', 'value')])
def update_figure(selected_year):
    filtered_df = df[df.commissioning_year == selected_year]
    print(filtered_df)  # written to the App's logs, useful when debugging
    grouped_df = filtered_df.groupby('fuel1')
    # Total estimated generation per fuel for the selected year.
    totals = grouped_df['estimated_generation_gwh'].agg(np.sum)
    fuel_values = []
    production_values = []
    for fuel, total in totals.items():
        fuel_values.append(fuel)
        production_values.append(total)
    bars = [go.Bar(x=fuel_values, y=production_values)]

    return {
        'data': bars,
        'layout': go.Layout(
            xaxis={'type': 'category', 'title': 'Fuel type'},
            yaxis={'title': 'GWH commissioned'},
            margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
        )
    }

if __name__ == '__main__':
    app.run_server(port=8888, host='0.0.0.0', debug=True)
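
The aggregation performed inside update_figure can also be written as a plain function and checked without starting Dash. This is a minimal sketch under that assumption; generation_by_fuel is a hypothetical name that does not appear in app.py, and the toy rows are made up.

```python
import pandas as pd


def generation_by_fuel(df, selected_year):
    """Return (fuels, totals) for plants commissioned in selected_year.

    Hypothetical helper mirroring the filter, group, and sum steps
    of the callback; fuels come back in alphabetical order because
    groupby sorts its keys by default.
    """
    filtered = df[df["commissioning_year"] == selected_year]
    totals = filtered.groupby("fuel1")["estimated_generation_gwh"].sum()
    return totals.index.tolist(), totals.tolist()


# Made-up rows for illustration only.
toy = pd.DataFrame({
    "fuel1": ["Hydro", "Coal", "Coal", "Gas"],
    "commissioning_year": [1995, 1995, 1995, 1996],
    "estimated_generation_gwh": [10.0, 20.0, 5.0, 7.0],
})

fuels, gwh = generation_by_fuel(toy, 1995)
print(fuels)  # ['Coal', 'Hydro']
print(gwh)    # [25.0, 10.0]
```

The two lists correspond to the x and y values passed to go.Bar in the callback. A year with no plants returns two empty lists, which produces an empty chart.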

 

Click Save when finished. The last thing you need to do before publishing this App is set up an app.sh file. When you publish an App from a project, Domino looks in the project files for a file named app.sh with the necessary commands to launch the application.

Return to the Files page and click Add File again. Name this file app.sh and paste in the following single line, then click Save:

python app.py

You're now ready to publish your App.

Click Publish from the project menu.

 


 

Title and describe your App, then click Publish.

If your app starts successfully, you can click View App to open it.

 


 

All that's left is to share your data with a colleague. Return to the project and click Publish again from the project menu. Under App Permissions you'll find a field you can use to send email invites to view your App.

 

