Generate Health Report for IBM Cloud Pak for Data Using Jupyter Notebook

Nikola Samardzic, Mario Galjanic

2022-02-03

Introduction

Cloud Pak for Data is an open, extensible data platform that provides a data fabric to make all data available for AI and analytics, both on premises and in the cloud (for example, AWS, Azure, or IBM Cloud). It helps you prepare data and enables data engineers, data stewards, data scientists, and business analysts to collaborate using integrated tools.

Some of the key features, and the reasons behind the increased usage of IBM Cloud Pak for Data, are:

  1. Connects all data and eliminates data silos for self-service analytics
  2. Automates and governs a unified data & AI lifecycle
  3. Analyzes your data in smarter ways, with ML and deep learning capabilities
  4. Operationalizes AI with trust and transparency

Like any other system or service, Cloud Pak for Data needs to be kept healthy. We achieve this by monitoring resource usage and potential hardware/software issues, cleaning up completed pods, and performing similar maintenance activities. Currently, health monitoring of Cloud Pak for Data has to be done manually in the web console by a user with administration privileges. Our goal was to generate static reports similar to the ones available in Cloud Pak for Data's user interface, and to add a few other interesting metrics, such as the number of projects owned by each user.

In this article, we would like to present our idea for automating that process so that commands no longer need to be executed manually. We will demonstrate the approach using a combination of OpenShift and Bash commands together with a Jupyter Notebook integrated in Cloud Pak for Data. The main purpose of the resulting report is to share the metrics with power users and team leaders who can take action if needed. Good examples of such actions are shutting down unused runtimes or resizing unnecessarily large environments.

Generating data

As in any other data science project, the first and most important step is to generate the data. If the generated data does not follow the same pattern in every iteration, the automated process can easily fail or, even worse, produce a report with incorrect data. Being extra careful in this step saves us a lot of time later.

After careful planning and deciding which outputs are of interest, we prepare the commands so that their output can be easily processed in the subsequent Python steps. Using Red Hat OpenShift's oc commands on the Cloud Pak for Data layer, we can shape the output to satisfy our data collection needs.

For example, if we want to see pods in a non-running status, we use a simple oc command and pipe its output to grep.

oc get pods | grep -iv Running

Part of the output of the command above:

[Screenshot: output of the command, showing non-running pods along with several other columns]

In the image above, we can see that, besides the pod name, there are a few other columns that we are not interested in. In addition, we would prefer the same delimiter in every row, while here the rows can contain a different number of spaces between columns. With a slight modification, the command generates data that is much more "developer-friendly" and easier to read, prepare, and present using Python. The new command looks something like this:

oc get pods | grep -iv Running | awk '{print $1 " " $3}'

In this case, part of the output will look something like:

[Screenshot: two-column output with pod name and status, separated by a single space]

Unlike the previous output, this one is straight to the point: we have only the two columns we are interested in, with a known, unique delimiter between them, a single space.
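
To make this data available for the later Python steps, each command can be wrapped in a small script that writes its output to a file in a shared location. Below is a minimal sketch of such a script; the script name matches the one used later in the scheduling section, while the output directory is a hypothetical example.

#!/bin/bash
# get_non_running_pods.sh: minimal sketch that saves non-running pods to a file.
# The output directory is a hypothetical example; use the folder shared with the Jupyter pod.
OUTPUT_DIR=/mnt/health-report/data
mkdir -p "$OUTPUT_DIR"
# Keep only the pod name and status, separated by a single space
oc get pods | grep -iv Running | awk '{print $1 " " $3}' > "$OUTPUT_DIR/non_running_pods.txt"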

Of course, pods in a non-running status are just one example of such a command. The image below shows the final, user-friendly output presented to the end user after the whole process is completed.

[Screenshot: the final, user-friendly report presented to the end user]

Preparing Jupyter environment

One of the most popular tools integrated into Cloud Pak for Data is Jupyter Notebooks, an open-source, interactive web tool that has become the go-to option for data scientists when it comes to Python development. In Cloud Pak for Data, we can create a dedicated project for the purpose of generating reports. After creating the new project, running oc get pods | grep jupyter returns our Jupyter pod; save its name, because it will likely be needed later. Since we do not expect the notebook to be heavy on the system, we do not need to assign many resources to it; in our case, 2 GB RAM with 1 VPC proved to be just enough. Furthermore, it is advised to use as recent a Python version as possible. The newly created environment comes with popular Python libraries such as NumPy, Matplotlib, and pandas pre-installed.

In our code, we will need to read the files generated in the previous step. It is advised to mount a directory shared between Cloud Pak for Data's bare-metal node and every Jupyter pod. With such a directory in place, we can redirect the generated data to files located in it, making the data accessible in the newly created project. In this article, we will not get into the Python coding itself, because every client is different and the way we presented our data is specific and should be treated separately. Finally, we need to execute the notebook from scratch using Restart and run all to make sure everything runs without errors.
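
As a quick sanity check, we can confirm the pod name and verify that the shared directory is visible from inside the Jupyter pod. The snippet below is a minimal sketch; the mount point /mnt/health-report is a hypothetical example and will differ in your installation.

# Find the Jupyter pod created for the reporting project (pick the first match)
JUPYTER_POD=$(oc get pods | grep jupyter | awk '{print $1}' | head -n 1)
# Check that the shared directory (hypothetical mount point) is accessible inside the pod
oc exec "$JUPYTER_POD" -- ls -l /mnt/health-report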

Scheduling

After we have the scripts/commands used to generate the data and the Python notebook ready to present it, we must schedule the process itself so that the report is generated at agreed intervals without manual intervention. For scheduling, we used crontab on the Cloud Pak for Data bare-metal node. There is a general pattern we should follow while creating the crontab file (a minimal wrapper-script sketch tying these steps together follows the list):

  • Set up global variables (if you have them)

    e.g.

    export JUPYTER_POD=jupyter-py37-cd6a24d1-9bd5-4867-b0b3-1941ce6e0d40-7f4d974dpg7bz

  • Create a log file to which we redirect both the output and errors of each command

    <command> &>> <logfile>

  • Run all scripts/commands used to generate data

    e.g.

    <path_to_script_folder>/get_non_running_pods.sh &>> <logfile>

  • Execute Jupyter notebook and generate a report file

    find the location of the notebook we want to execute

      find / -type f -name "*<notebook_name>*"
    
      output e.g.: 
    
      /var/lib/osd/pxns/649015473893017782/projects/1c4bf706-feb8-4174-9257-786669d91dbc/assets/notebook/test-notebook_WDNEk6C6z.ipynb 
    

    copy the latest notebook version to the folder shared between the bare-metal node and the Jupyter pod. We always include this step so that we work with the latest version of the developed report. In your case, you might prefer to copy the notebook manually every time.

      \cp -f <notebook_path_from_previous_step> <path_on_shared_folder>
    

    execute Jupyter's nbconvert to generate an HTML file using resources from the project's pod

      e.g.  
    
      oc exec <project_pod_name> -- <jupyter_path_on_pod> nbconvert --execute --to html --no-input <notebook_path_on_shared_folder> --output <html_file_path_on_shared_folder>
    
  • Change permissions of the files in the shared folder

    chmod -R 766 <shared_mounted_folder> &>> <logfile>

  • Change ownership of files in the shared folder

    chown -R nfsnobody:nfsnobody <shared_mounted_folder>

  • Email HTML file

    use a Python script to send an email with the HTML file attached

    utilize Python's EmailMessage class; good examples can be found on Stack Overflow

    tip: utilize Python from the project's pod, even though the bare-metal node has its own Python (probably an older version)

    e.g.

    oc exec <project_pod_name> -- <python_path_on_pod> <mailing_script_path> &>> <logfile>
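
Putting the steps above together, a wrapper script called from crontab could look like the sketch below. All paths, the notebook name, the schedule, and the binary locations inside the pod are hypothetical examples and need to be adjusted to your environment.

#!/bin/bash
# run_health_report.sh: minimal sketch combining the steps described above
export JUPYTER_POD=jupyter-py37-cd6a24d1-9bd5-4867-b0b3-1941ce6e0d40-7f4d974dpg7bz
SHARED_DIR=/mnt/health-report                                  # hypothetical shared mount
LOGFILE=/opt/health-report/logs/report_$(date +%Y%m%d).log     # hypothetical log location

# 1. Run the data-collection scripts
/opt/health-report/scripts/get_non_running_pods.sh &>> "$LOGFILE"
# ... run the remaining data-collection scripts here ...

# 2. Copy the latest notebook version to the shared folder and execute it with nbconvert
#    (notebook name is a hypothetical example; the mount path inside the pod may differ)
NOTEBOOK_PATH=$(find / -type f -name "*health-report*.ipynb" 2>/dev/null | head -n 1)
\cp -f "$NOTEBOOK_PATH" "$SHARED_DIR/health-report.ipynb" &>> "$LOGFILE"
# 'jupyter' assumes the binary is on PATH inside the pod; otherwise use its full path
oc exec "$JUPYTER_POD" -- jupyter nbconvert --execute --to html --no-input \
    "$SHARED_DIR/health-report.ipynb" --output "$SHARED_DIR/health-report.html" &>> "$LOGFILE"

# 3. Fix permissions and ownership, then send the report by email
chmod -R 766 "$SHARED_DIR" &>> "$LOGFILE"
chown -R nfsnobody:nfsnobody "$SHARED_DIR" &>> "$LOGFILE"
oc exec "$JUPYTER_POD" -- python "$SHARED_DIR/send_report_mail.py" &>> "$LOGFILE"

The corresponding crontab entry (hypothetical schedule, every Monday at 06:00) could then be:

0 6 * * 1 /opt/health-report/run_health_report.sh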

Sample output:

[Screenshot: sample of the generated health report]

Conclusion

A combination of Python, Jupyter, Bash, and OpenShift can be used to automate the process of monitoring Cloud Pak for Data. By scheduling crontab scripts and following good coding practices, we can make sure that every change we make in the notebook is reflected in the next run (after saving the version!). The emailing step ensures everyone gets the report in a user-friendly format, without unnecessary code cells, only their output. This topic is still new in the IT world, but we believe this article is a good starting point: something that can save you time, and also something interesting to build, develop, and present.

If you would like to learn more about IBM Cloud Pak for Data and its possibilities, you can find more information here about the 2-day workshop we organize.

We hope this article helps you streamline your usual processes. If you have any questions, feel free to reach out to us.
