As if you haven’t already noticed, the world is growing. Quickly.
Okay, it’s not literally expanding at the equator, but the population is growing, the connections among us are ever-increasing and the data generated by this growth is rocketing skyward. As the amount of data increases exponentially, so does the demand for people who know how to digest it. Whether big or small, companies thrive on data. It takes data to paint accurate financial pictures, judge the effectiveness of product promotions or analyze sales trends. A business may hold onto decades of data thanks to a legacy ERP system that resembles a black box more than a useful tool, but having a vast amount of data does not make that data useful. Amidst the frustrating chaos, two titans have arisen…
First, let’s look at the Data Scientist. The science of data (and therefore those who dedicate themselves to it) hasn’t so much arisen recently as it has simply evolved. According to Steve Miller from Information Management, the typical Data Scientist is trained in one or more of the “hard” sciences. In other words, a Data Scientist typically has a Physics, Computer Science, or Mathematics background and has since shifted their analytical focus to data.
The Data Scientist also lives for computer programming. Mike Driscoll from Dataspora suggests that it’s not a far stretch for the Data Scientist to be well-versed in numerous computer languages, specifically those that are suited to data analysis, organization and visualization (e.g. R, Perl and Java). What about all of the imperfect data out there? That’s exactly what the Data Scientist would say: what about it? With a modest amount of work, they can create a predictive model to plug many data holes, often with reasonable accuracy. That’s why the Data Scientist is more and more in demand in a corporate setting. The ease with which they can navigate complex data environments sets them apart. Projects that rely on Data Scientists tend to be quicker, one-off efforts that are typically not as tightly governed as their traditional-analytics counterparts. Instead of taking specific directives from a project manager or coordinator, the Data Scientist likes to let the data speak for itself. That’s not to say a Data Scientist could not successfully create a KPI dashboard, but their heart is in trend analysis and data mining.
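To make the idea of plugging data holes with a predictive model concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `sales` figures, the `fit_line` and `fill_gaps` helpers, and the choice of a straight-line (least-squares) model are all hypothetical and do not come from any of the sources cited here. A real project would pick a model suited to the data and check its accuracy.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def fill_gaps(rows):
    """Return a copy of rows where missing y-values (None) are predicted."""
    known = [(x, y) for x, y in rows if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [(x, y if y is not None else slope * x + intercept) for x, y in rows]

# Hypothetical data: (month, units sold); None marks a hole in the record.
sales = [(1, 12.0), (2, 14.0), (3, None), (4, 18.0), (5, 20.0)]
filled = fill_gaps(sales)
print(filled)
```

The known points are left untouched and only the gap is estimated, so the original record is never overwritten by the model.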
Now let’s take a look at Business Intelligence (BI) professionals. They are by no means at odds with the Data Scientist, even if their approaches differ. The background of the BI professional is business through and through. They have the experience to know what kind of data is useful to each level of an organization and are not afraid to write a pretty complex SQL statement. Because their commitment is to the business and its users, they are less concerned with complexity than with usability. Their aim is to support better decision making.
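As a rough illustration of the kind of SQL a BI professional might write to judge a product promotion, the sketch below runs a small report against an in-memory SQLite database from Python. The `sales` table, its columns and its values are invented for this example; a real report would run against the organization’s own warehouse and would likely be far more involved.

```python
import sqlite3

# Hypothetical sales table: promo = 1 means the sale happened under a promotion.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, promo INTEGER, amount REAL);
    INSERT INTO sales VALUES
        ('East', 1, 120.0), ('East', 0, 80.0),
        ('West', 1, 200.0), ('West', 0, 150.0), ('West', 1, 100.0);
""")

# Per region: sales made under a promotion next to total sales.
query = """
    SELECT region,
           SUM(CASE WHEN promo = 1 THEN amount ELSE 0 END) AS promo_sales,
           SUM(amount) AS total_sales
    FROM sales
    GROUP BY region
    ORDER BY region
"""
report = conn.execute(query).fetchall()
conn.close()

for region, promo_sales, total_sales in report:
    print(region, promo_sales, total_sales)
```

The point of such a query is usability: each row answers a business question (how much of a region’s revenue came from the promotion) in a form a decision maker can read directly.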
Where the Data Scientist uses XML and Java, the BI professional leverages ETL tools (like Informatica or DataStage). While the Data Scientist is enamored of the complexities of R and Perl, the BI professional understands OLAP cubes and data modeling best practices. Data Scientist projects are often quicker engagements with very specific short-term objectives. The BI professional’s work, by contrast, is typically managed quite closely and carries a larger budget. BI initiatives can span many months and even years. It takes many teams working at all stages of the process to design, deliver and test large BI implementations.
In years past, the convergence of these two groups in the business realm has resembled a clash more than cooperation. Data Scientists argue for the purity of the data while BI professionals advocate for the business (because it is the ultimate client). The lines once drawn in the sand, however, are being washed over by the tides of ever-growing business needs.
It is the hope of writer Steve Miller (again, from Information Management) that the best practices of Business Intelligence will influence and mold Data Science, and vice versa. Each has strengths that will continue to have profoundly positive effects on the other. For example, Data Scientists can leverage a BI professional’s affinity for dashboards to explore potential relationships and possible correlations. They can also continue to grow in appreciation for BI’s best practices and structured approach. On the other hand, the Business Intelligence industry has garnered a lot of attention from the Data Science sphere due to innovations in big data. Furthermore, open-source BI solutions (like Pentaho, Jaspersoft and Palo) are attracting Data Scientists due to their highly customizable nature.
As needs change and approaches evolve, Business Intelligence and Data Science can remain in harmony. Even though the two come from different backgrounds, the end goal is the same: turn all of that data into something useful and actionable. Leveraging each other’s strengths and covering for each other’s weaknesses will grow the industry and further prove its usefulness.
Driscoll, M. E. (2011, May 27). The Three Sexy Skills of Data Geeks. Dataspora, from http://www.dataspora.com/2009/05/sexy-data-geeks/
Henschen, D. (2010, January 4). Analytics at Work: Q&A with Tom Davenport. Information Week, from http://www.informationweek.com/news/software/bi/222200096
Miller, S. (2011, May 3). Data Science – Part 2. Information Management, from http://www.information-management.com/blogs/data_science_BI_analytics_big_data_visualizations-10020259-1.html