OBIEE restart of system components using WLST

October 28th, 2019 | Technology, Oracle, Business Intelligence, Administration, OBIEE

Introduction

OBIEE (Oracle Business Intelligence Enterprise Edition) is Oracle's solution for business intelligence analytics. The platform encompasses a wide range of tools with various uses and purposes. If you want to know more about OBIEE, the Oracle site provides a good high-level overview of its capabilities. This blog post tackles scripting one of the common OBIEE administration tasks: restarting OBIEE system components.

Customer Issue

The customer's ETL process extracts most of the data the users are interested in once a day. In addition to the main process, some of the data needs to be refreshed during regular work hours. To keep the data consistent until the smaller loads are done, the client uses two separate physical schemas for reporting. While the process is updating one schema, the users get their data from the other one, which was last updated during the bigger load run the previous night. The ETL process "switches" schemas at the end of the smaller load so the users get new data. The schema the reports point to is determined by a variable in the RPD (the Oracle BI Repository, a file that stores the BI server metadata). The users reported that the dynamic session variables weren't always refreshing properly, which led to reports showing old data.

Sometimes the variable wouldn't refresh until the components were restarted, so the reports that used it kept displaying old data. This would usually happen a couple of days after the last restart, so the suggested workaround (which would also be part of routine maintenance to keep everything running as smoothly as possible) included periodic restarts of OBIEE components.

On a single node instance this wouldn’t be a problem. We could connect to the machine running the WebLogic server, navigate to the bitools/bin folder and execute stop.sh/start.sh with a hardcoded list of components. However, this was a clustered environment, with failover and load balancing set up in a way that required going through the EM console, or going to each node and repeating the process.

Our Solution

The EM approach was used for a while, but it was taking time away from our admins because it had to be done manually. The simpler stop.sh/start.sh method would either require a hardcoded list of components or restart the administration server, the managed server, and all system components, which we didn't want. A hardcoded list is also fragile: if another node were added, the person in charge of the system would need to know the list needs updating, or the command would only do a partial restart.

There is a simpler, more dynamic process and that is using WLST.

Architecture

First, it’s necessary to address the architecture of the environment. OBIEE is run on two Linux boxes with high availability and load balancing due to the number of concurrent users connecting to it each day. The Admin server is running on a shared drive so it will still be running even if one of the nodes fails. Each node runs a node manager, one bi_server, and a list of system components: obips, obis, obisch, obiccs and obijh.

The diagram of the system and its components looks like this:

Utility and Approach

The approach we decided to take is to utilize WLST. What is WLST? The name stands for WebLogic Scripting Tool, and it is an Oracle utility that allows us to manage and monitor servers and WebLogic domains from the command line. It is extendable and customizable, and it is based on Jython, which is very convenient because it lets us use Python syntax to control the flow and logic of our scripts. It also comes with built-in functions specific to WebLogic that can help us in our task. The full list of WLST commands and variables can be found here. By connecting to the Admin server, it is possible to dynamically get the list of system components across all the nodes it administers.

High level overview of what needs to be done:

  1. connect to the Admin server
  2. list out all system components
  3. stop components
  4. start components
  5. notify certain users whether the process was a success or failure
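
The five steps above map onto a small driver function. The sketch below is only an outline under stated assumptions: the name run_restart and its five callable parameters are hypothetical stand-ins (they are not part of WLST or of our script), chosen so the flow can be read and exercised without a WebLogic domain.

```python
def run_restart(connect_fn, list_fn, stop_fn, start_fn, notify_fn):
    """Run the five steps in order. All five arguments are hypothetical
    callables; stop_fn and start_fn return a true value on success."""
    failures = []
    try:
        connect_fn()                    # 1. connect to the Admin server
        names = list_fn()               # 2. list out all system components
        for name in names:              # 3. stop components
            if not stop_fn(name):
                failures.append(('stop', name))
        for name in names:              # 4. start components
            if not start_fn(name):
                failures.append(('start', name))
    except:
        # Bare except on purpose: connecting or listing failed, so
        # record it and still fall through to the notification step.
        failures.append(('connect_or_list', ''))
    ok = len(failures) == 0
    notify_fn(ok, failures)             # 5. notify users of the outcome
    return ok
```

Keeping the notification outside the try block means a message goes out whether the run succeeded, partially failed, or never got past the connection.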

Sounds easy enough, so let's get started.

Security

There are two ways to connect to the server:

  1. using a connection string
  2. using a configuration file for the user that is authenticated with a key file

The first approach is simple enough: after starting WLST, use the connect(<user_name>, <password>, <URL>) command to connect to the Admin server. The drawback is that the script then has to store the administrator username and password in plain text. We wanted to avoid that route because multiple users had access to those boxes, and they could potentially access the script and the administrator password with it.

The second approach is to log into the Admin server using the connection string with the connect() command and store the user and password information in encrypted config and key files. This way an admin would only have to log in once and generate the files, which can be kept in a secure directory. It is not ideal, because if someone has access to the script, they also have the permissions to access the configuration files, but files are a lot more difficult to smuggle out of a system than a plain text password.

The process is straightforward: an admin runs WLST and connects to the Admin server using the standard connect(<user_name>, <password>, <URL>) command. Then they run the storeUserConfig(userConfigFile=<user_config_file_name>, userKeyFile=<key_file_name>, nm='false') command. The last parameter stands for Node Manager, and the flag indicates whether to save the Node Manager username and password instead. We don't need that for what we're doing, since we're connecting directly to the Admin server. However, for those who might need it, the command used for connecting to the Node Manager is nmConnect.
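
Since the generated files are only as safe as the directory they live in, one possible follow-up is to restrict them to the owning account. This is a minimal sketch, assuming a POSIX file system and a regular Python interpreter (the older Jython bundled with WLST may not offer os.chmod, in which case chmod 600 from the shell does the same job). The helper name restrict_to_owner is hypothetical.

```python
import os
import stat


def restrict_to_owner(path):
    """Make a file readable and writable by its owner only (mode 600)
    and return the resulting permission bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```

It would be called once for the user configuration file and once for the key file, right after storeUserConfig has created them.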

Script Body

With connecting and authorization out of the way, we can focus on getting WLST to do what we need it to do. Remember when I said WLST is based on Jython? This means we can use Python syntax for processing, which can refer to reading files, error handling, executing OS commands, using conditional logic, etc.

Since we have the ability to read files, we created a separate file containing the following: the location of the generated user file, the location of the key file, the address and port of the server we're connecting to, the list of user email addresses that receive the status of the restart action, and the email address of the admins who are notified in case something goes wrong. Because these are two separate addresses, the admin one can be set to a pager service that pages the admins to look into the issue.

Getting the information from our configuration file is a simple matter of opening the file and loading the key-value pairs into a dictionary. Note that the paths were changed and some methods simplified. We suggest you use a degree of obfuscation to hide the files (especially the user configuration and key file).

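cf_file = open('/u01/obi_restart/obiconfig.ini', 'r')

config = {}
try:
    for line in cf_file:
        line = line.strip()
        if '=' not in line:
            continue  # skip blank or malformed lines
        (key, val) = line.split('=', 1)  # split on the first '=' only
        config[key.strip()] = val.strip()
finally:
    cf_file.close()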

Once we have that, we can directly call the value by its key and assign the values to variables we can manipulate if we need to.
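
The layout of the configuration file is not shown in this post, so here is a hypothetical one for illustration. The keys hostname, port, userConfig and userKey are the ones the script reads; the two email keys (notifyEmail, adminEmail) and every value are made-up placeholders. The sketch parses the text from memory so it can be tried without touching the file system.

```python
# Hypothetical obiconfig.ini contents; every value is a placeholder.
SAMPLE = """\
# lines starting with '#' are treated as comments in this sketch
hostname=adminhost.example.com
port=7001
userConfig=/path/to/userconfig.secure
userKey=/path/to/userkey.secure
notifyEmail=bi-users@example.com
adminEmail=bi-admins@example.com
"""


def parse_config(lines):
    """Turn key=value lines into a dictionary, ignoring blank lines,
    comments, and lines without an '=' sign."""
    config = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, val = line.split('=', 1)   # keep any '=' inside the value
        config[key.strip()] = val.strip()
    return config


config = parse_config(SAMPLE.splitlines())
host_url = config['hostname'] + ':' + config['port']
```

Stripping the values while parsing means the later lookups do not have to deal with trailing newlines.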

For the connection string we have to have the user file location, the key file location, and the URL of the admin server. The URL we can construct from the host and port, while the locations to the files can be pulled directly from the configuration file.

host_url = str(config['hostname'].rstrip('\n') + ':' + config['port'].rstrip('\n'))
connect(userConfigFile=config['userConfig'].rstrip('\n'),
    userKeyFile=config['userKey'].rstrip('\n'),
    url=host_url)

In the actual script we wrapped the connect() command in a try-except block so we can raise an error and trigger an email on exit. Regular Python error handling can be used here.
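
A minimal sketch of that wrapping, not the exact code from our script: the helper name try_connect is hypothetical, and the connect function is passed in as an argument so the snippet can be tried outside WLST with a stand-in.

```python
import sys


def try_connect(connect_fn, **kwargs):
    """Call connect_fn with the given keyword arguments and return
    (True, '') on success or (False, error text) on failure."""
    try:
        connect_fn(**kwargs)
        return (True, '')
    except:
        # Bare except on purpose, so that whatever the call raises is
        # reported back to the caller instead of ending the script.
        return (False, str(sys.exc_info()[1]))
```

Inside WLST the call would look like try_connect(connect, userConfigFile=..., userKeyFile=..., url=host_url). When the first element of the result is false, the error text can go into the notification email before the script exits.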

If the connection was successful, we can continue and list out all the OBIEE system components across all nodes that are a part of the environment. To do that we can call the getSystemComponents() method of the current instance we’re connected to, our Admin server. In the code we store the components in a list we can iterate through.

Before doing that, it is necessary to define what we will do with our components. For that we created a function that either starts or shuts down a component, based on the parameter we give it.

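def perform_action(action, comp_name):
    # 'stop' shuts the component down; any other value starts it
    try:
        if action == 'stop':
            shutdown(comp_name)
        else:
            start(comp_name)

        return 'success'
    except:
        # WLST raised an error; report it to the caller instead of aborting
        return 'failed'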

This simple function takes the action we're trying to perform and the name of the component. Depending on the action, it calls either shutdown(comp_name) or start(comp_name) for the specified component. Once we have created the function, we can call it in the script body like this:

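components = cmo.getSystemComponents()
results = []  # (action, component name, status) for the log and the email

# Stop all
for component in components:
    status = perform_action('stop', component.getName())
    results.append(('stop', component.getName(), status))

# Start all
for component in components:
    status = perform_action('start', component.getName())
    results.append(('start', component.getName(), status))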

The list of system components across all nodes is obtained through the getSystemComponents() method of the cmo object, our Current Management Object. The function returns the status of each action, so we can add it to a list that we use later to create a physical log file and to send an email, either to the regular distribution list or, in case of errors, to the paging service. All of this is easy with Python.
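
The reporting part is not shown in this post, so the following is only a sketch of how the collected statuses could be turned into a message. It assumes each status was stored together with its action and component name as an (action, component, status) tuple. The function name build_report, the subject lines, and the two address parameters are hypothetical.

```python
def build_report(results, users_addr, admins_addr):
    """results is a list of (action, component, status) tuples.
    Returns (recipient, subject, body) for the notification."""
    failed = [r for r in results if r[2] != 'success']
    lines = ['%s %s: %s' % r for r in results]
    if failed:
        # Any failure goes to the admins (or their paging service)
        to = admins_addr
        subject = 'OBIEE restart FAILED (%d of %d actions)' % (
            len(failed), len(results))
    else:
        to = users_addr
        subject = 'OBIEE restart succeeded'
    return to, subject, '\n'.join(lines)
```

The same body text can be written to the log file, and the three returned values can be handed to Python's smtplib or to the mail command of the operating system.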

The script can then be saved as a Python script and called with WLST directly from the command line, passing the script location as a parameter to WLST. The syntax to do that is:

<path_to_wlst_directory>/wlst.sh <path_to_script>/restart_comp.py

Conclusion

In this blog post we took a brief look at scripting a common OBIEE maintenance task, which can then be scheduled in a variety of ways. In our case, we wrapped the script within a shell script and scheduled it with crontab to run at a time when we know no users will be online.

Within the actual script we added conditional logic, logging, email capability, error handling and easy modification through a configuration file, all through the power of the Jython-based framework. We could have done a lot more if needed, because WLST is capable of everything that can be done through the console, and then some.

In conclusion, WLST is a powerful and versatile utility through which common OBIEE maintenance tasks can be scripted and automated. The code shown here is the core of our script and shows off only a tiny bit of WLST's capability, but much more is possible! Hopefully this post showed how easy it is to use WLST and how it can ultimately free up time to spend on other important tasks.