A company's ability to innovate is one of the key factors in becoming and staying competitive today. In a world that spins faster and faster, business models must be constantly renewed in line with trends and developments. The basis for this is the intelligent handling of data, made possible by modern data innovation platforms. What do such platforms have to achieve? Which tools are frequently used? We asked Olaf Nimz, Chief Data Scientist at Trivadis.
Why is the intelligent use of data central to a company's ability to innovate?
Olaf Nimz: To be innovative, you must understand your business, your products, your processes and especially your customers - the information needed for this understanding lies in data. Data can also be used to support decisions. You can challenge the creative gut feeling of managers with data-driven trends and hypotheses. This goes as far as simulating a portfolio of different scenarios, i.e. virtually testing the expectations and risks of certain measures. Provided, of course, that you use the data accordingly. While many companies today are already able to collect large amounts of data, there is still enormous untapped potential in putting it to operational use.
How do you explain that?
Olaf Nimz: I think it is mainly a matter of mindset: many people grew up in a world where machine learning was less of a topic and not as easy to use as it is today. It's a slow transformation process. In Switzerland and Germany in particular, companies are also rather cautious - they wait until a technology has established itself. One of the reasons for this is probably that we are not massively challenged by disruptive approaches from start-ups that make life difficult for large companies. In the last two years, however, companies have increasingly woken up and are now on their way.
How can companies make their handling of data more efficient?
Olaf Nimz: With a modern data innovation platform that enables data collection and analysis. The data collection must be comprehensive, clear and accessible. Quite a few big data initiatives have led to data lakes that have become confusing for users because they lack a comprehensive means of orientation. When storing the data, one should therefore pay particular attention to the business context behind it - this is the only way to derive the origin, processing chain, quality and significance of the data and ultimately record it in a traceable data catalog. In addition, such a platform should serve both the need for spontaneous ad hoc analyses and the need for regular, recurring analyses. For example, one should be able to observe what is happening today and thus evaluate creative ideas, while at the same time deriving forecasts for the next days or weeks.
What does such a data innovation platform consist of?
Olaf Nimz: On the one hand, data storage is important. This should be as inexpensive as possible in comparison to conventional storage, since data is stored in large quantities and a very long history may be kept. On the other hand, the platform must allow event processing, i.e. allow information from all source systems to be collected and processed live. Kafka, an Apache project, is a tool that collects events in any format and makes them available for further processing. Finally, when processing the data, it is important that the platform allows parallelisation, i.e. that the processing can be scaled as required. The best tool for this is Spark, also an open source initiative. You also need a data catalog that maps the metadata - in other words, it shows the meaning and structure of the data, from which sources it comes, and the quality status of the individual steps in the process.
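To make the idea of a data catalog more concrete, here is a minimal sketch in Python of what a single catalog entry might record: the meaning and structure of a dataset, its source system, and the quality status after each processing step. The class names, field names, quality labels and the sample dataset are all illustrative assumptions; they are not taken from the interview or from any particular catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingStep:
    """One step in the processing chain (hypothetical structure)."""
    name: str
    quality_status: str  # illustrative labels such as "raw" or "validated"


@dataclass
class CatalogEntry:
    """Metadata for one dataset: meaning, structure, source and lineage."""
    dataset: str
    meaning: str
    source_system: str
    schema: dict[str, str]  # column name -> type, kept as plain strings
    steps: list[ProcessingStep] = field(default_factory=list)

    def lineage(self) -> str:
        # Origin followed by each processing step, in order.
        return " -> ".join([self.source_system] + [s.name for s in self.steps])

    def current_quality(self) -> str:
        # Quality status after the last step, or "unknown" if none is recorded.
        return self.steps[-1].quality_status if self.steps else "unknown"


# Hypothetical example entry.
entry = CatalogEntry(
    dataset="orders_clean",
    meaning="Customer orders after deduplication",
    source_system="webshop",
    schema={"order_id": "string", "amount": "float"},
    steps=[
        ProcessingStep("ingest", "raw"),
        ProcessingStep("deduplicate", "validated"),
    ],
)

print(entry.lineage())
print(entry.current_quality())
```

In a real platform this metadata would typically live in a dedicated catalog service rather than in application code; the sketch only shows which questions such an entry should be able to answer.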
What is the fundamental role of open source tools in data science?
Olaf Nimz: Our corporate partners Microsoft and Oracle both use open source technologies. These technologies are the most advanced because a lot of people are working on them. It is generally a global trend to rely on the knowledge and experience of many people. The best example of this from Europe is Spotify, which is developing its tools together with the open source community.
In the Business Breakfast "Data Innovation as a Service" you set up a data innovation platform with Oracle in just two hours. Why is Oracle ideal for this?
Olaf Nimz: Oracle's integration capabilities are excellent. In other words, Oracle knows how to integrate a wide variety of tools into its own cloud platform and turn them into a working product. Basically, the world has changed completely for infrastructure manufacturers in recent years - they can no longer afford to live in their own proprietary world and instead benefit from being open to the outside world. This ready-made service in the cloud offers unprecedented flexibility to quickly try out new ideas.
What do you advise companies that want to build such a platform?
Olaf Nimz: Building a modern platform for data management and analysis is one thing; configuring the zoo of interdependencies and making it fly smoothly is another. The skill set required for this is enormously broad. Companies are often absorbed by other processes or do not have the necessary know-how. The support of an external partner who masters all aspects of setting up, configuring and operating a data innovation platform can therefore be very valuable.