Using Jupyter Notebooks for Model Data Analysis
Welcome to the DKRZ tutorials and use cases repository!
This repository collects and prepares Jupyter notebooks with coding examples on how to use state-of-the-art processing tools on big data collections. The Jupyter notebooks highlight the optimal usage of High-Performance Computing resources and address data analysts and researchers who are beginning to work with the resources of the German Climate Computing Center (DKRZ).
While Jupyter notebooks with demonstrations are provided in the notebooks/demo directory, we also host notebooks for hands-on sessions.
Getting a DKRZ account:
for model data users working in the EU:
for model data users with partners in the German Earth system research community, see here.
To run the notebooks, you need a browser (such as Firefox, Chrome, or Safari) and an internet connection.
Open the DKRZ Jupyterhub in your browser.
Login with your DKRZ account (if you do not have an account yet, see the links above).
Select a preset spawner option.
Choose a job profile that matches your processing requirements. We recommend using at least 10 GB of memory. Find information about the partitions here, or hover the mouse over the options. Specify an account (the luv account your user belongs to, e.g. bk1088).
Press “Start” and your Jupyter server will start (this is also known as spawning). The server will run for the specified time, during which you can always come back to it (i.e. reopen the web URL) and continue to work.
In the upper bar, click on Git -> Clone a Repository.
In the dialog window, enter https://gitlab.dkrz.de/data-infrastructure-services/tutorials-and-use-cases.git. When the clone is successful, a new folder with the cloned repository appears in the file browser.
In the file browser, change to the tutorials-and-use-cases/notebooks directory and browse and open a notebook from this folder.
Make sure you use a recent Python 3 kernel (Kernel -> Change Kernel).
Some notebooks need an individual Jupyter kernel:
Open a terminal.
Run the following lines to create a conda environment and a kernel for the notebook:

```bash
module load python3   # works on Levante; otherwise, install conda or mamba
# create the environment (set -p TARGETPATH to install outside your home directory):
mamba env create -f environment.yml
# activate the environment (sometimes you need 'source' instead of 'conda'
# and the full path instead of 'nbdemo'):
conda activate nbdemo
# create the kernel:
python -m ipykernel install --user --name nbdemokernel
```
When done, go back to your Jupyter session and choose the new kernel we just created.
Now you can also run the summer days notebook.
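As a quick sanity check (not part of the official instructions), you can confirm from a notebook cell that the selected kernel really runs the interpreter of the environment created above; the environment name nbdemo refers to the one from the setup lines:

```python
# Sanity check: print which Python interpreter this kernel uses.
# With the 'nbdemo' kernel selected, the path should point into the
# nbdemo conda environment (the exact location depends on your setup).
import sys

print(sys.executable)
```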
Content and structure
In this hands-on, we will find, analyze, and visualize data from our DKRZ data pool. The goal is to create two maps: one showing the number of tropical nights for 2014 (the most recent year of the historical dataset) and another showing a chosen year in the past. The hands-on is split into two exercises:
Search for an appropriate list of data files. The datasets should contain the variable tasmin on a daily basis.
Save your selection as a .csv file so it can be used by another notebook.
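The save-and-reload step can be sketched with pandas; the file paths below are illustrative placeholders, not real data-pool entries:

```python
# Hypothetical sketch: persist a file selection as CSV so a second
# notebook can pick it up later. Paths are placeholders only.
import pandas as pd

selection = pd.DataFrame({
    "path": [
        "/pool/data/example/tasmin_day_model_historical_1960.nc",  # placeholder
        "/pool/data/example/tasmin_day_model_historical_2014.nc",  # placeholder
    ]
})
selection.to_csv("tasmin_selection.csv", index=False)

# In the second notebook, read the selection back as a plain list of paths:
files = pd.read_csv("tasmin_selection.csv")["path"].tolist()
```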
Read the saved selection and open the two files, which are needed.
Calculate the number of tropical nights for both years.
Visualize the results on a map. You can use your preferred visualization package or stick to the example in the demo notebook.
Reach us at firstname.lastname@example.org