Generate Health Report for IBM Cloud Pak for Data Using Jupyter Notebook

Nikola Samardzic, Mario Galjanic

2021-12-14

Introduction

Cloud Pak for Data is an open, extensible data platform that provides a data fabric to make all data available for AI and analytics, both on premises and in the cloud (for example: AWS, Azure, IBM Cloud). It helps you prepare data and enables data engineers, data stewards, data scientists, and business analysts to collaborate using integrated tools.

Some of the key features, and the reasons behind the increased usage of IBM Cloud Pak for Data, are:

  1. Connects all data and eliminates data silos for self-service analytics
  2. Automates and governs a unified data & AI lifecycle
  3. Analyzes your data in smarter ways, with ML and deep learning capabilities
  4. Operationalizes AI with trust and transparency

Like any other system or service, we need to make sure it stays healthy. We achieve this by monitoring resource usage, watching for potential hardware/software issues, cleaning up completed pods, and similar maintenance activities. Right now, health monitoring of Cloud Pak for Data has to be done manually in the web console by a user with administration privileges. Our goal was to generate static reports similar to the ones available in Cloud Pak for Data’s user interface. We also wanted to add a few other interesting metrics, such as the number of projects owned by each user.

In this article, we would like to present our idea for automating that process so that commands no longer need to be executed manually. We will demonstrate it using a combination of OpenShift and Bash commands within a Jupyter Notebook integrated in Cloud Pak for Data. The main purpose of the resulting report is to share the metrics with the power users and team leaders who may be able to take action if needed. A good example of such an action would be shutting down unused runtimes or resizing unnecessarily large environments.

Generating data

Like any other data science project, first and most importantly we need to generate data. If the generated data does not follow the same pattern in every iteration, the automated process can easily fail or, even worse, generate a report with incorrect data. Being extra careful at this step saves a lot of time later on.

After carefully planning and collecting the outputs of interest, we prepare the commands so that they generate outputs in a form that can be easily processed in the subsequent Python steps. Using Red Hat OpenShift’s oc commands on the Cloud Pak for Data layer, we can modify the outputs to satisfy the data collection needs.

For example, if we want to see pods in a non-running status, we can use a simple oc command and append a grep command to it.

oc get pods | grep -iv Running

Part of the output of the command above:

[Image: partial output of the command above, showing pod names and several additional columns]

In the image above, we can see that, besides the pod name, there are a few other columns we are not interested in. We would also prefer the same delimiter for every row, while here rows can have a varying number of spaces between columns. With a slight modification to our command, the generated data becomes much more “developer-friendly”: the results end up being easier to read, prepare, and present using Python. The new command will look something like this:

oc get pods | grep -iv Running | awk '{print $1 " " $3}'

In this case, part of the output will look something like:

[Image: partial output showing only the pod name and status, separated by a single space]

Unlike the previous image, this output is straight to the point. We have only the two columns we are interested in, with a known, unique delimiter between them: a single space.
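To illustrate why this format helps, here is a minimal Python sketch that parses such output. The sample lines are made up for illustration; in practice they would come from the command above:

# A made-up sample in the exact format produced by the pipeline above:
# pod name and status, separated by a single space.
sample = """example-pod-1 Completed
example-pod-2 ImagePullBackOff"""

# With a known, unique delimiter, parsing is a one-liner per row.
for line in sample.splitlines():
    name, status = line.split(" ")
    print(f"Pod {name} has status {status}")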

Of course, pods in a non-running status are just one example of a command. The image below shows the final, user-friendly output presented to the end user after the whole process is completed.

[Image: final, user-friendly report output presented to the end user]

Preparing Jupyter environment

One of the most popular tools integrated in Cloud Pak for Data is Jupyter Notebooks, an open-source, interactive web tool that has become the data scientist’s go-to option for Python development. In Cloud Pak for Data, we can create a dedicated project for the purpose of generating reports. After creating the new project, running oc get pods | grep jupyter returns our Jupyter pod. Save that name, because it will probably be needed later.

Since we do not expect the notebook to be heavy on the system, we do not need to assign many resources to it. In our case, 2 GB RAM with 1 vCPU proved to be just enough. Furthermore, it is advised to use the newest Python version available. Our newly created environment comes with popular Python libraries such as NumPy, Matplotlib, Pandas, etc. pre-installed.

In the coding part, we need to read the files that we generated in the previous step. It is advisable to mount a directory shared between Cloud Pak for Data’s bare-metal host and every Jupyter pod. With such a directory in place, we can redirect the generated data to files located in it, making it accessible in the newly created project. We will not get into the Python coding itself, because every client is different and the way we presented our data is specific and should be treated separately; the sketch below only illustrates the reading step. Finally, we need to execute the notebook from scratch using Restart and run all to make sure everything runs without errors.
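As an illustration of that reading step, here is a minimal sketch. The mount point /shared and the file name are assumptions that will differ in your environment:

import pandas as pd

# Hypothetical mount point of the directory shared between
# the bare-metal host and the Jupyter pod.
DATA_DIR = "/shared"

# Load the space-delimited, two-column file produced by the oc/awk pipeline.
non_running = pd.read_csv(f"{DATA_DIR}/non_running_pods.txt",
                          sep=" ", names=["pod", "status"])

# A quick summary that renders directly in the notebook/report.
non_running["status"].value_counts()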

Scheduling

Once we have the scripts/commands used to generate the data and the Python notebook ready to present it, we must schedule the process itself so that the report is generated at agreed intervals without needing to be run manually. For scheduling, we used crontab on the Cloud Pak for Data bare-metal host. There is a general pattern we should follow while creating the crontab file:

  • Set up global variables (if you have them)

    e.g.

    export JUPYTER_POD=jupyter-py37-cd6a24d1-9bd5-4867-b0b3-1941ce6e0d40-7f4d974dpg7bz

  • Create a log file to which we will redirect both the output and errors of each command

    <command> &>> <logfile>

  • Run all scripts/commands used to generate data

    e.g.

    <path_to_script_folder>/get_non_running_pods.sh &>> <logfile>

  • Execute Jupyter notebook and generate report file

    find the location of the notebook we want to execute

      find / -type f -name "*<notebook_name>*"
    
      output e.g.: 
    
      /var/lib/osd/pxns/649015473893017782/projects/1c4bf706-feb8-4174-9257-786669d91dbc/assets/notebook/test-notebook_WDNEk6C6z.ipynb 

    copy the latest notebook version to the folder shared between the bare-metal host and the Jupyter pod. We always want this step in order to work with the latest version of the developed report. In your case, you might prefer to copy the notebook manually every time.

      \cp -f <notebook_path_from_previous_step> <path_on_shared_folder>

    execute Jupyter’s nbconvert to generate the HTML file using resources from the project’s pod

      e.g.  
    
      oc exec <project_pod_name> -- <jupyter_path_on_pod> nbconvert --execute --to html --no-input <notebook_path_on_shared_folder> --output <html_file_path_on_shared_folder>
  • Change permissions of files in the shared folder

    chmod -R 766 <shared_mounted_folder> &>> <logfile>

  • Change ownership of files in the shared folder

    chown -R nfsnobody:nfsnobody <shared_mounted_folder>

  • Email HTML file

    use a Python script to send an email with the HTML file attached

    utilize the EmailMessage class; good examples can be found on Stack Overflow, and a minimal sketch follows this list

    tip: utilize Python from the project’s pod even though the bare-metal host has its own Python (probably an older version)

    e.g.

    oc exec <project_pod_name> -- <python_path_on_pod> <mailing_script_path> &>> <logfile>
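Below is a minimal sketch of such a mailing script using Python’s standard library. The SMTP host, addresses, and report path are all placeholder assumptions to adjust to your environment:

import smtplib
from email.message import EmailMessage

# All values below are placeholders; adjust them to your environment.
SMTP_HOST = "smtp.example.com"
REPORT_PATH = "/shared/health_report.html"

msg = EmailMessage()
msg["Subject"] = "Cloud Pak for Data health report"
msg["From"] = "reports@example.com"
msg["To"] = "team@example.com"
msg.set_content("The latest Cloud Pak for Data health report is attached.")

# Attach the HTML report generated by nbconvert.
with open(REPORT_PATH, "rb") as f:
    msg.add_attachment(f.read(), maintype="text", subtype="html",
                       filename="health_report.html")

# Send through the SMTP server/relay.
with smtplib.SMTP(SMTP_HOST) as smtp:
    smtp.send_message(msg)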

Sample output:

[Image: sample emailed HTML report]

Conclusion

A combination of Python, Jupyter, Bash, and OpenShift can be used to automate the process of monitoring Cloud Pak for Data. By scheduling crontab scripts and following good coding practices, we can make sure that every change we make in our notebook is reflected in the next run (after saving the version!). The emailing step ensures that everyone gets the report in a user-friendly format, without unnecessary code cells and with only the ones that produce output. This topic is still new in the IT world, but we believe this article is a good starting point: something that can save you time, and something that is interesting to create, develop, and present.

We hope this article helps you streamline your usual processes. If you have any questions, feel free to reach out to us.
