Top Data Science Trends in 2020

Barbara Pongračić, Kristina Ban - 12 minute(s) read.

Data Science Achievements in 2019

Over the past year, we saw a lot of breakthroughs and improvements in the AI field, and we’re excited to see what is coming next. Huge advancements were made in reinforcement learning and NLP, where AI agents now perform as well as, or in some cases even better than, humans.
For example, AlphaStar, an AI agent developed by Google-owned DeepMind, is now rated above 99.8 percent of ranked human players in StarCraft II [1]. It reached this level thanks to reinforcement learning, which was used to train it. Reinforcement learning can also be used to train self-learning robots and cars, or image and object recognition systems [2].

The image above displays a visualization of the AlphaStar agent showing the game from the agent’s point of view, including the considered actions of the agent with the predicted outcome.
There is another great DS project called Dactyl, a robot hand that taught itself to solve the Rubik’s cube in a simulated environment before trying it out for real. Its performance still leaves a lot of room for improvement, especially when compared to the people who can solve the cube in a matter of seconds [3].


2019 was very successful in the NLP area, with several breakthroughs in pre-training language models on large text corpora. The GLUE benchmark is used to evaluate NLP systems on tasks spanning logic, common sense understanding, and lexical semantics. The score of state-of-the-art models increased to 88 over 13 months, while the human baseline level is 87 [4].

These few examples alone show interesting advances in AI, but they also raise some questions:

  • Is AI able to communicate with others and explain its decision-making process?
  • Is AI able to produce instant insights and show possible outcomes of business decisions?
  • Is it secure to use AI?
  • How can AI generate new relationships within data?
  • Why did the AI system make a specific prediction or decision?
  • Does AI possess some level of creativity?

We hope that some of those will get answered in the following year. Have a look at some of the technologies we expect great, new improvements from.


Conversational AI

The main focus is to give decision-makers an easier way to ask questions about data and to receive an explanation of the insights. NLP enables a computer to understand spoken or written human language, while conversational analytics enables questions to be posed and answered verbally rather than through text. For example, financial risk assessment for a single company is an extensive and labor-intensive effort. By using NLP and conversational analytics, managers can obtain the information they need faster and more easily.

Gartner’s report sees a significant future for NLP in analytics. They predict that 50% of analytical queries will be generated by search, voice or NLP (or automatically generated) by 2020.

Conversational AI has changed customer service in banking, online shopping, and food delivery. We should soon be able to hold conversations with AI that feel as natural as those with human agents [5]. Chatbots also have the potential to resolve basic issues without a human agent’s involvement.

Currently, chatbots still can’t maintain a decent conversation, but they can generate short texts. You can try the following state-of-the-art NLP system to generate some articles or ask a question:


As you can see from the example above, NLP is not yet mature enough, but huge improvements are definitely visible in the NLP model performance and its implementation in business cases. Next year, we hope to see even more tools like IBM Watson Analytics, Tableau Insights, and Qlik Sense to boost it.

Towards Automated Business Processes

Automated business processes are accomplished through augmented and prescriptive analytics, and continuous intelligence.

In its report, Gartner describes augmented analytics as an approach that automates insights using advanced machine learning and natural-language generation algorithms [6]. It is designed to conduct analyses and generate insights automatically, without supervision or assistance from a business analyst or data scientist. Augmented analytics uses statistics and NLP to assist teams and to improve the interpretation of patterns found in data.

While descriptive analytics gives insight into the past and predictive analytics helps us understand the future, prescriptive analytics can automate the decision-making process and advise on possible outcomes. It uses the information provided by descriptive and predictive analytics and creates models based on heuristics (Business Rules Management Systems, BRMS) and mathematical optimization. BRMS typically evaluate their rules with a pattern-matching algorithm known as the Rete algorithm, which lets them automate complex calculations when a large number of rules has to be executed. Examples include calculating the price of an insurance policy according to the customer’s profile or the product to be insured [7], or supporting the decision to add more promotions in the airline business [8]. Models based on mathematical optimization evaluate candidate solutions according to a chosen objective: the model compares the feasible results and finds an optimal one. This is usually applied to minimizing the costs of a process, maximizing profitability, etc.
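To make the two modelling styles above more concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions: the rules, prices, capacities and function names (`price_policy`, `best_plan`) are hypothetical, the rule loop is a naive stand-in for a real BRMS (it does not implement the Rete algorithm), and the optimization is a brute-force search rather than a proper solver.

```python
from itertools import product

# Hypothetical business rules: (condition on the customer profile, price factor).
RULES = [
    (lambda c: c["age"] < 25, 1.30),             # young customers pay 30% more
    (lambda c: c["claims_last_5y"] > 2, 1.50),   # frequent claimants pay 50% more
    (lambda c: c["years_insured"] >= 10, 0.90),  # loyal customers get 10% off
]


def price_policy(customer, base_price=500.0):
    """Heuristic model: apply every matching rule to the base price."""
    price = base_price
    for condition, factor in RULES:
        if condition(customer):
            price *= factor
    return round(price, 2)


def best_plan():
    """Optimization model: try every integer plan and keep the most profitable
    one that respects the (assumed) capacity and demand limits."""
    best = None
    # Product A is limited to 12 units by demand, product B to 13 units.
    for a, b in product(range(13), range(14)):
        machine_hours = 2 * a + 3 * b
        if machine_hours > 40:  # capacity constraint
            continue
        profit = 30 * a + 40 * b  # objective to maximize
        if best is None or profit > best[0]:
            best = (profit, a, b)
    return best


if __name__ == "__main__":
    young = {"age": 22, "claims_last_5y": 0, "years_insured": 1}
    print(price_policy(young))
    print(best_plan())
```

In practice, a rule engine would replace the simple loop once the number of rules grows, and a linear or integer programming solver would replace the exhaustive search.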

Continuous intelligence is AI-based real-time analytics integrated to prescribe actions in each step of the data pipeline, discovering complex patterns in data and allowing businesses to respond in near real-time.

Gartner predicted that the prescriptive analytics market would reach $1.88 billion by 2020, together with the development of augmented analytics and continuous intelligence.

Although the trend is to move towards automation of processes, companies will still need a data scientist to develop models that can operate in such processes.

Graph Analytics and Graph Databases

Graph theory (network analysis) gives us the ability to model and analyze real-life interactions between different entities such as people, molecules, words, places or things that are related to each other. This field of research spans areas as different as biology, linguistics, fraud detection, traffic route optimization, social networks and disease spreading.
Graphs are made of vertices (nodes) and edges. Vertices can represent customers, groups, companies and institutions, physical entities such as buildings, cities, houses, stores and ports, or items such as customer accounts, devices, products and bank accounts. Edges are the relationships that connect vertices, and they help to find patterns and connections in the data.
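As a small illustration of vertices and edges, the Python sketch below builds an undirected graph and uses breadth-first search to find how two accounts are connected. The accounts, devices and cards are made-up data, and the helper names (`build_graph`, `shortest_path`) are our own; a real project would more likely rely on a graph database or a graph library.

```python
from collections import defaultdict, deque

# Hypothetical edges: each pair links an account to something it has used.
EDGES = [
    ("account:alice", "device:phone-1"),
    ("account:bob", "device:phone-1"),
    ("account:bob", "card:1234"),
    ("account:carol", "card:1234"),
    ("account:dave", "device:laptop-9"),
]


def build_graph(edges):
    """Store the graph as an adjacency list: vertex -> set of neighbours."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)  # undirected: the relationship holds both ways
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of vertices or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(EDGES)

if __name__ == "__main__":
    # Alice and Carol never interact directly, but the graph reveals a link.
    print(shortest_path(graph, "account:alice", "account:carol"))
    print(shortest_path(graph, "account:alice", "account:dave"))
```

Indirect links of this kind, such as two accounts connected through a shared device and a shared card, are the sort of pattern that graph analytics is used to surface in fraud detection.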

Knowledge graphs have become part of our everyday reality through voice assistants, search results, and recommendations created by machine learning. Machine learning can, for example, identify topics in documents and treat them as classes, which helps humans create new relations.

Gartner predicts that the application of graph processing and graph databases will grow 100% annually over the next few years with the goal of accelerating data preparation and enabling more complex and adaptive data science.


The Rise of Regulation and DS Ethics

The introduction of the GDPR in Europe and the California Consumer Privacy Act (which will go into effect in 2020) limited the ability to process and profile data, imposed model transparency, and brought forth the possibility of organizations being held accountable for adverse consequences. These regulations pressured businesses to comply with the requirements and to understand their current operations. With both data access and data processing limited, many enterprises faced difficulties trying to comply.

Data scientists and analysts have a new role in guiding the organization through these new guidelines, especially if they possess thorough knowledge of privacy and security regulations. The trend towards new regulation in the field of data governance is not ebbing, so we can be sure that more regulations are bound to spring up. The Cambridge Analytica scandal is not something that should be repeated.

Explainable AI

AI is sometimes very hard to explain, especially the “black box” of neural networks. People want to know how the system made a decision, why it didn’t do something else, what will happen if the system fails, and whether those systems can be trusted.

It’s not strange that people are distrustful when it comes to AI systems, since they are capable of learning tricks. For example, researchers created a neural network to improve the process of turning satellite images into maps. The AI was graded on how close the generated image was to the satellite image, and what happened is that it learned to encode features of the original into the generated picture. That was something the human eye couldn’t catch, but it was easily detected by the computer [9].

Explainable AI’s task is to create results and solutions which can be easily explained to human experts, exactly the opposite of today’s concept of the black box.


Maps that are shown in (c) show differences between the original map (a) and the crafted map (b).

AI in Media

Eventually, recommendation systems will be integrated with different types of media and digital assistants to create new content. AI is great with unstructured data such as text, image and video and that is the reason why it is widely used in media to improve customer experience.

For example, the AR/VR experience for gaming can be developed using AI. The creation of weather reports can be powered by AI too. Recently, AI emotion analytics was applied to music and voice tones to target specific customers.

There is another way to use AI in media, and lately it usually carries a negative connotation: deepfakes. Deepfakes are so named because they use deep learning, typically generative adversarial networks (GANs), to create fake images and sounds, with results reminiscent of the digital recreation of Peter Cushing in Rogue One: A Star Wars Story. AI can also be used to detect deepfakes and help large companies work on removing deepfake images and videos from media.

We hope to see more AI applications in the media to create immersive and personalized experiences. While there is a great risk that AI can help create fake news, we hope that it will be used for the presentation of accurate information.

To Conclude…

For years, science and open-source platforms dominated the fields of machine learning and artificial intelligence, innovating by developing different algorithms and environments. Commercial vendors are now starting to develop tools and environments, as well as the enterprise features necessary to scale AI and ML and to accelerate deployment. Gartner predicts that by 2022, 75% of new end-user solutions leveraging AI and ML will be built on commercial platforms. With the changes in software design brought by AI, there is a need for platforms that can analyze big data using different tools and devices to detect and react to potential issues. Because of that, we definitely expect to see an increase in the importance of AIOps in the following years.

We can already note the emergence of DS analysis tools that follow the trends mentioned above, such as AWS machine learning services (Machine Learning, SageMaker and DeepLens), Microsoft Machine Learning Server, and Azure Machine Learning. Also, new tools for conversational analytics have emerged recently, such as AWS Amazon Connect and Microsoft Bot.

Companies have the opportunity to implement AI in their business processes even if they are not ready to form an in-house data science team. With the help of education tailored to their needs, in the form of data science primers or workshops from consultants, they can quickly become an enterprise with very mature AI capabilities. We think that in 2020 even more enterprises will adopt AI.

The idea is that we will become more reliant on AI systems. But, if we start basing all our efforts on these new artificial intelligence technologies and adopting them proves successful, what changes can we expect to our society? Is this even possible? We must question the applications of technology and ask how and why.

What do you think? What breakthroughs and trends can we expect in the following year?

Before you go, check out the references we used: