It is time we start basing sustainable actions on data. Great article written by our technical director Marc de Beurs, shedding some light on the different types of analytics for businesses to become more sustainable. It's your turn to be #GeekyWise!
It is common knowledge that basing decisions on data is more effective than basing them on gut feelings. So why is this applied almost exclusively to financial questions, and hardly ever to decisions about becoming more sustainable?
We at Yabba Data Doo help decision makers steer their company’s sustainability challenges with data: data science for sustainability. In this article we shed some light on the different types of analytics that exist and how they can help your business become more sustainable.
Unlock your data potential - from data to action
Often people perceive data science as a ‘black box’, something quite complex and abstract. In the past, businesses decided to open this black box for financial challenges, and reaped the financial benefits. Outside of finance, however, the fear of huge investments still keeps the box closed. Although complex and abstract analytical methods exist (which would take data engineers and scientists many expensive hours to implement), there are plenty of less complex analytics that can deliver quick wins. Basically, there are four types of analytics you can adopt. They are listed below in ascending order of complexity. As you might guess, the more complex the method, the more value you can achieve.
Descriptive - explains what happened
Diagnostic - explains why it happened
Predictive - forecasts what might happen
Prescriptive - recommends which actions to take
A business that wants to start using data science in its decision making would never start with predictive or prescriptive analytics. The first step is always to get all the desired data in one place and make it visible to the end users, for instance through a report or dashboard. This provides insight into what happened and is called descriptive analytics. Getting the data in place is not always straightforward, so some initial investment is necessary. But once that is done, the translation from raw data to clear insights is just around the corner.
Basing decisions on a single snapshot (one report) is hard and does not give decision makers the right handles to achieve their sustainability goals. At Yabba Data Doo we support the decision-making process with automated data flows and dashboards that continuously visualise critical data. This way sustainability becomes concrete and steerable.
Once you know what is happening, the question of why often arises. Why are my CO2 emissions high at certain locations? Why do I get complaints about unequal working opportunities? Finding the answers to these questions is crucial and belongs to a slightly more complex type of analytics called diagnostic analytics. The tactic here is to drill down to the root cause and filter out confounding information. This can be done using three different strategies:
Uncover relationships (correlations between different data streams): does energy consumption change with machine temperature, or with the number of people walking around in the building and causing vibrations?
Isolate patterns (statistical analysis): look at the mean, spread and shape of the energy consumption distribution. Is it symmetric or uniform, and are there multiple peaks? All this will help you understand why consumption is as it is.
Identify outliers (anomaly detection): when CO2 emissions (or energy usage, etc.) exceed a certain threshold, an email is sent to the operator (or, in a more extreme case, a machine is shut down).
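To make the three strategies a bit more tangible, here is a minimal Python sketch using only the standard library. The readings, the function names (pearson, describe, find_outliers) and the 15 kWh threshold are all made up for illustration; a real setup would read from your own data streams and would probably send an email rather than print.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between two equally long series (-1 to 1)."""
    mx, my = mean(xs), mean(ys)
    covariance = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return covariance / (pstdev(xs) * pstdev(ys))


def describe(values):
    """Mean, spread (population standard deviation) and skewness of a series."""
    m, s = mean(values), pstdev(values)
    skewness = mean([((v - m) / s) ** 3 for v in values])
    return {"mean": m, "spread": s, "skewness": skewness}


def find_outliers(values, threshold):
    """Return (index, value) pairs for readings above a fixed threshold."""
    return [(i, v) for i, v in enumerate(values) if v > threshold]


# Made-up hourly readings, for illustration only.
machine_temp_c = [60, 62, 65, 70, 72, 75, 90]
energy_kwh = [10.0, 10.5, 11.2, 12.4, 12.9, 13.6, 17.5]

# 1. Uncover relationships: does energy use move with machine temperature?
print("correlation:", round(pearson(machine_temp_c, energy_kwh), 3))

# 2. Isolate patterns: positive skewness hints at a tail of high readings.
print("shape:", describe(energy_kwh))

# 3. Identify outliers: flag readings above the (hypothetical) threshold.
for hour, value in find_outliers(energy_kwh, threshold=15.0):
    # A real system might send an email here; this sketch only prints.
    print(f"alert: hour {hour} used {value} kWh")
```

A high correlation only tells you that two streams move together, not that one causes the other, which is why the three strategies are best used side by side.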
Based on the understanding of what happened and why, you are ready (if you want to) to predict the future. What will happen if we continue with business as usual (predictive)? What will happen if we turn this knob or invest in that technology (prescriptive)? Here we cannot avoid getting a little more technical, but we will try to keep it comprehensible by using example use cases.
There are basically two types of predictive models: qualitative and quantitative. We want to step away from gut feeling and therefore suggest looking only at quantitative (purely data-driven) techniques. The most popular is regression, which is a fancy word for modelling the relationship between the different variables in the data. This can be done analytically, by fitting a mathematical function (e.g. a polynomial) to the data, or in a more advanced way using machine learning, such as a neural network.
An example use case of predictive analytics is using historical data, such as energy consumption, to obtain a forecast of what might happen in the future (if nothing changes). A simple analytical function (like a polynomial) might be sufficient and can provide insight into the future of, let's say, costs when energy becomes more expensive (or emissions are taxed more heavily). The benefit of a neural network is that deeper relationships in the data can be uncovered, since there is technically no limit to which data is included in the training of the model (the downside is the complexity of building and testing the model and of understanding the outcome). Finally, independent of the technique used, the forecast can be extended with automated actions, like sending an email or shutting down a machine, before a threshold is actually exceeded. That is a significant difference compared to anomaly detection, which only reacts afterwards.
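As a small illustration of such a forecast, the sketch below fits the simplest possible polynomial (a straight line) to historical consumption with ordinary least squares and then checks in which future month a limit would be crossed. The data, the 120 MWh limit and the function names (fit_line, first_breach) are assumptions made up for this example, and real consumption is rarely this tidy.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (a degree-one polynomial)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def first_breach(slope, intercept, start, horizon, limit):
    """First future month whose forecast exceeds the limit, or None."""
    for month in range(start, start + horizon):
        if slope * month + intercept > limit:
            return month
    return None


# Made-up monthly energy consumption in MWh, for illustration only.
months = [0, 1, 2, 3, 4, 5]
energy_mwh = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]

slope, intercept = fit_line(months, energy_mwh)

# Look twelve months ahead, assuming business as usual.
breach_month = first_breach(slope, intercept, start=6, horizon=12, limit=120.0)
if breach_month is not None:
    # This is where an automated action (email, shutdown) could be triggered,
    # months before the limit is actually exceeded.
    print(f"forecast exceeds the limit in month {breach_month}")
```

Because the alert is based on the forecast rather than on a measured value, the operator gets time to act, which is exactly the difference with threshold-based anomaly detection.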
The goal of prescriptive analytics is to answer the “what if?” question. This is done by investigating different scenarios and adding them to the forecast model (which could additionally be extended with external data sources and/or deep learning). To extend the model, simulations can be used, which are most often stochastic in nature (using pseudo-random numbers). These types of analytics are highly use-case specific. Nonetheless it can be informative to describe two use cases, one without and one with the use of simulations:
Use case without simulation: Let's say the diagnostic analysis has shown that energy consumption rises with the temperature of the machine. In this step one could investigate whether total energy consumption would decrease if the machines were covered with insulating material, or if additional cooling equipment were installed, etc.
Use case with simulation: Perhaps the vibrations caused by people walking around affect energy consumption (and the vibration isolation of the machine is already maxed out). New walking routes might then be beneficial. These could be tested in a simulation in which people walk around randomly on the newly allowed routes and produce vibrations, etc.
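A toy version of such a stochastic simulation could look like the sketch below. Everything in it is a hypothetical assumption chosen for illustration: the random step strength, the inverse-square drop-off of vibration with distance, the route distances and the function name simulate_vibration_load. A real model would be calibrated on measured data.

```python
import random


def simulate_vibration_load(route_distance_m, n_people, n_runs, seed=0):
    """Average vibration load on the machine over many random runs.

    Assumption (not measured): each passer-by causes a vibration of random
    strength that weakens with the square of the distance to the machine.
    """
    rng = random.Random(seed)  # fixed seed makes the runs reproducible
    totals = []
    for _ in range(n_runs):
        total = 0.0
        for _ in range(n_people):
            step_strength = rng.uniform(0.5, 1.5)  # arbitrary units
            total += step_strength / route_distance_m ** 2
        totals.append(total)
    return sum(totals) / n_runs


# Scenario A: current route passes 2 m from the machine.
# Scenario B: proposed route passes 5 m from the machine.
current = simulate_vibration_load(route_distance_m=2.0, n_people=50, n_runs=1000)
proposed = simulate_vibration_load(route_distance_m=5.0, n_people=50, n_runs=1000)

print(f"current route load:  {current:.2f}")
print(f"proposed route load: {proposed:.2f}")
print(f"relative reduction:  {1 - proposed / current:.0%}")
```

The simulated load can then be fed into the forecast model as a “what if” scenario, so the effect of a new walking route on energy consumption can be estimated before anything is changed on the work floor.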
Implementing analytics remains work for data scientists or analysts. Nonetheless, after reading this article you should have a better idea of what types of analytics are possible and what they entail. No need to be afraid anymore of the complexity or the black box feeling around data science. Let’s start making the right decisions based on facts: use data science for sustainability. But enough talking, let’s start doing. Yabba Data Doo!