An Overview of IBM’s Information Management Platform for Data Warehousing and Big Data

December 6th, 2013 | Insight Post

The success of any company increasingly depends on unlocking the value of data and turning it into trusted information for critical decision making.  The ability to deliver the right information at the right time and in the right context is crucial.  Today, organizations are bursting with data, yet most executives would agree they need to improve how they leverage information to prevent multiple versions of the truth, improve trust and control, and respond quickly to change.

If you are an IBM customer, you have very likely received some level of education about IBM’s Information Management solutions platform, which includes IBM’s Big Data strategy.  IBM’s goal is to help its clients unlock the business value of data by turning it into information for competitive advantage, business optimization and improved business outcomes.  This article presents a brief summary of the scope and components of IBM’s current data and information strategy, which sits primarily under the Information Management brand.

Hopefully your organization has created its own information and data strategy.  As a very experienced consulting firm, we are often surprised by how many large enterprise organizations do not have a formalized plan and strategy.  We spend a lot of time with our customers helping them assess where they are and then build an information strategy and roadmap.

If you have set the right direction, the end result should be one where trusted and accurate information can be delivered anywhere it’s needed in your business.  For many companies, that strategy should now also include a Big Data component, so that your analytics can take advantage of the variety, velocity and volume of today’s structured and unstructured information.

To achieve customer goals in data and information requirements, IBM breaks down their Information Management offerings into five primary categories:

  • Database Management
  • Information Integration and Governance
  • PureData System
  • Data Warehousing and Analytics
  • Big Data

Let’s take a closer look at each of these five functional areas.  There is not enough space in this article to talk in detail about the solutions IBM offers in each area, so I will just select a few where we see a lot of client interest and current development, particularly in Big Data and Analytics.

Database Management Systems

You are probably most familiar with the systems and tools in this category.  IBM’s DB2 family of products has long been a proven and reliable system for large-scale enterprises and now runs on many platforms.  IBM has been in the Leaders quadrant of Gartner’s Data Warehousing DBMS Magic Quadrant since the very first one was issued in June 2001.  The Database Management portfolio also includes other database platforms such as Informix, IBM solidDB and IMS, plus tools for all of these systems.

This component of Information Management is very mature.  The most recent innovations revolve around DB2 with BLU Acceleration, which I will mention later.

Information Integration

Information Integration comprises a set of applications and platforms that help bring together data from diverse sources, manage data quality and maintain master data for multiple, complex environments.  It also covers protecting data and enabling information-based collaboration across business and technical teams.  There are various products in this group under the InfoSphere banner, including:

  • InfoSphere DataStage
  • InfoSphere Data Replication
  • InfoSphere Information Server
  • InfoSphere QualityStage
  • InfoSphere Master Data Management

InfoSphere DataStage integrates data across multiple systems using a high-performance parallel framework.  It includes functions such as ETL (extract, transform and load), Hadoop support to directly access Big Data, near real-time integration and tools to manage a data integration infrastructure.
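For readers new to the term, the ETL pattern itself can be sketched in a few lines of plain Python.  This is only an illustration of the extract, transform and load steps, not DataStage code: the sample records, the `customers` table and the function names are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from an operational system.
SOURCE_CSV = """customer_id,name,country
1, Alice Smith ,us
2,Bob Jones,CA
3,,us
"""


def extract(text):
    """Extract: read raw rows from the source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, upper-case country codes, drop rows with no name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # reject incomplete records
        country = row["country"].strip().upper()
        cleaned.append((int(row["customer_id"]), name, country))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps end to end and return what landed in the target."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the warehouse
    load(transform(extract(SOURCE_CSV)), conn)
    return conn.execute(
        "SELECT id, name, country FROM customers ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

A product such as DataStage runs the same three stages in parallel across many sources and targets; the sketch only shows the shape of the work.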

Our clients typically need a lot of help in this area.  They want to integrate and transform information so that everyone who needs it can get to it.  They also want to create a single “version of the truth” across all silos, while protecting databases from intrusion and meeting regulatory compliance requirements.

IBM PureData System

The IBM PureData System includes a group of products that deliver data services to different applications.  Many of the PureData products offer built-in expertise and industry-specific integration, all focused on making the entire analytics process simpler.  IBM claims that when open integration is supported with a third-party application, data can be ready for loading within a few hours.

PureData products include:

  • PureData System for Hadoop
  • PureData System for Analytics, powered by Netezza
  • PureData System for Operational Analytics
  • PureData System for Transactions
  • IBM Smart Analytics System

Data Warehousing and Analytics

A data warehouse is often the foundation for much of the analytics taking place in the enterprise today.  We often see DB2 as the database of choice in enterprise data warehouses.  In addition, IBM offers the IBM InfoSphere Warehouse family, which provides a data warehouse and analytics software platform.  It is designed for companies that need to transform large amounts of data.  It is powered by the IBM DB2 data server and includes various administration tools, embedded analytics software and a library of pre-built data models for various uses and industries.  IBM also recently released DB2 with BLU Acceleration, which brings dynamic in-memory processing, actionable compression and other technologies to enable speed-of-thought analytics.  All of these enhancements were put in place to make analytical queries run much faster.

In addition, the Information Management category encompasses some of the PureData System solutions for analytics, including:

  • PureData System for Analytics, powered by Netezza
  • PureData System for Operational Analytics
  • PureData System for Transactions
  • IBM Smart Analytics System

We are seeing more and more interest in analytics systems, where your organization can gain insight into subtle trends and patterns so you can anticipate and shape events and improve business outcomes.  In many cases, properly architected analytics systems can drive revenue growth, help control costs, identify risks, and even compare “what if” scenarios to predict potential threats as well as new opportunities.  Analytics are also great aids for planning, budgeting and forecasting resources.

Big Data

Big Data brings a new era in data exploration and utilization.  This is an interesting and very visible area.  Our company is actively working with clients who see an exploding volume of information, from terabytes to petabytes.  The velocity at which their information is increasing is also at the forefront.  Internal and external customers now expect responses in milliseconds.  And of course this new enterprise data goes well beyond traditional formats, including text, audio, social media, streaming data from RFID tags and other sources people keep finding uses for.

IBM offers a wide range of Big Data solutions, including the following components:

  • InfoSphere BigInsights
  • InfoSphere Streams
  • InfoSphere Data Explorer
  • PureData System for Analytics, powered by Netezza
  • DB2 with BLU Acceleration
  • IBM Smart Analytics System
  • InfoSphere Master Data Management
  • InfoSphere Information Server

There is not enough room in this article to go into detail about each of these, but I will highlight those that are receiving the most attention from our consulting clients.

InfoSphere BigInsights brings the power of Hadoop to your enterprise.  Hadoop is a much talked about open source framework used to manage large volumes of structured and unstructured data.  IBM has taken the open source Hadoop kernel and added capabilities including administrative functions, discovery tools, provisioning, security and analytical research capabilities.  IBM claims the result is a more user-friendly environment for large-scale analytics.

We are seeing environments where Hadoop is linked with a traditional architecture to augment and enhance a traditional data warehouse environment.  For example, it can be used as a queryable archive, allowing you to analyze large volumes of multi-structured data without straining your data warehouse.  Other customers are using it as a pre-processing hub to offload tedious tasks from normal warehouse operations.
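The pre-processing idea can be illustrated with a small map, shuffle and reduce sketch in plain Python.  It does not use the Hadoop or BigInsights APIs; it only mimics the programming model on a handful of hypothetical clickstream lines, collapsing raw events into compact summary rows of the kind a warehouse would then load.

```python
from collections import defaultdict

# Hypothetical raw clickstream lines in the form "timestamp,user_id,page".
RAW_EVENTS = [
    "2013-12-01T10:00:00,u1,/home",
    "2013-12-01T10:00:05,u2,/home",
    "2013-12-01T10:01:00,u1,/products",
    "malformed line",
    "2013-12-02T09:30:00,u3,/home",
]


def map_phase(line):
    """Map: emit ((date, page), 1) for each well-formed event; skip bad lines."""
    parts = line.split(",")
    if len(parts) != 3:
        return []
    timestamp, _user, page = parts
    return [((timestamp[:10], page), 1)]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse each group into one summary row (date, page, views)."""
    date, page = key
    return (date, page, sum(values))


def summarize(lines):
    """Run map, shuffle and reduce in sequence and return sorted summary rows."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return sorted(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    for row in summarize(RAW_EVENTS):
        print(row)
```

In a real cluster the map and reduce steps run in parallel on many nodes over far larger inputs.  The point here is only that the warehouse receives the small summary rather than every raw event.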

PureData System for Analytics, powered by Netezza, is a fairly straightforward data appliance for large-scale analytics.  It simplifies and optimizes the performance of data services for analytic applications, sometimes enabling complex algorithms to run in minutes instead of hours.  This is an area where we are seeing substantial interest from our clients.  With all the new sources of information, it seems many organizations cannot get enough query speed, fast load times, ease of use and analytic insight.

Summary

Every day 2.5 quintillion bytes of data are created.  In fact, 90 percent of the data in the world was created in the last two years alone.  The sources of that data are changing in variety, volume and velocity.  We can get buried by this information explosion, or we can figure out ways to use it to our advantage.  The next generation of solutions and tools from IBM and others is likely to change the way we do our jobs and make decisions.