- Data-handling turnaround to gain more insights from self-care app.
- New big-data solution supported by AWS and assortment of software tools.
Deutsche Telekom embarked on a programme to change the way it deals with the enormous amounts of data generated by OneApp, the Group’s customer self-care app (Deutsche Telekomwatch, #85).
In a blog post, DT Software Engineer Parteek Singhal said the "previous system" (implying the revamp, if not entirely done and dusted, is well on its way to completion) was deemed unsuited to providing "advance analytics and reporting capabilities".
OneApp generates click events that, Singhal explained, were previously streamed using RabbitMQ over MQTT. Click events were then stored in Elasticsearch, which could be queried via Kibana software.
According to Singhal, this way of doing things was fine when it came to providing real-time analytics capability, which enabled DT to gain real-time insight into customer engagement (and launch real-time campaigns). "Though this system is ideal for real-time use cases", said Singhal, "data-driven decision making is not that easy with the growing data size and hunger for insights".
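Singhal did not publish the event schema, but the shape of the old pipeline can be pictured with a minimal sketch: a JSON click event of the kind streamed over MQTT, and an illustrative Elasticsearch aggregation of the sort a Kibana dashboard would issue. All field names here are assumptions, not DT's actual schema.

```python
import json
from datetime import datetime, timezone

# Hypothetical shape of a OneApp click event; field names are
# illustrative, not taken from DT's actual schema.
click_event = {
    "event_type": "click",
    "screen": "billing_overview",
    "user_id": "anon-12345",
    "timestamp": datetime(2020, 2, 1, 9, 30, tzinfo=timezone.utc).isoformat(),
}

# The kind of aggregation a Kibana dashboard runs against Elasticsearch:
# count click events per screen over the last hour (illustrative query body).
query_body = {
    "query": {"range": {"timestamp": {"gte": "now-1h"}}},
    "aggs": {"clicks_per_screen": {"terms": {"field": "screen.keyword"}}},
}

# Events are serialised to JSON before being published over MQTT.
payload = json.dumps(click_event)
```

Real-time dashboards of this kind work well for monitoring engagement as it happens; the limitation Singhal describes appears once the same store is asked to serve long-range, cross-source analytical questions.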
Data volumes generated by OneApp — which, as of February 2020, had been downloaded by 55% of customers across DT’s European subsidiaries — are huge. Real-time analytics, monitoring, and campaign management are based on poring over 50 million documents a day. When OneApp was launched in May 2018, the goal was to channel more than 50% of the Europe segment’s digital service interactions through the application by 2021. OneApp is one of the products DT has developed based on the Group’s harmonised application programming interface layer (also known as “HAL”), which is linked in with the TM Forum’s Open APIs scheme.
DT set out various system requirements before embarking on the design of its big data solution. These were as follows, as highlighted by Singhal:
- Storing data over a longer period in a cost- and performance-effective manner.
- Aggregating different types of data from different sources.
- Executing complex analytical queries that require multiple joins over huge datasets.
- Scaling distributed processing engines in line with growing data volumes.
- Enabling machine learning use cases.
- Providing a general-purpose business intelligence application able to integrate with any data source.
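The multi-join requirement is the one hardest to satisfy in a document store like Elasticsearch. Its shape can be illustrated with a small sketch; DT runs such queries on distributed engines, but SQLite stands in here, and the schema (events, users, screens) is invented for the example.

```python
import sqlite3

# Illustration of an analytical query needing multiple joins. The
# tables and data are invented for the example; DT's actual system
# executes queries like this on a distributed processing engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id TEXT, country TEXT);
    CREATE TABLE screens (screen_id TEXT, feature TEXT);
    CREATE TABLE events (user_id TEXT, screen_id TEXT, clicks INTEGER);
    INSERT INTO users VALUES ('u1', 'DE'), ('u2', 'HR');
    INSERT INTO screens VALUES ('s1', 'billing'), ('s2', 'roaming');
    INSERT INTO events VALUES ('u1', 's1', 3), ('u2', 's1', 1), ('u2', 's2', 4);
""")

# Clicks per country and feature: two joins plus an aggregation.
rows = conn.execute("""
    SELECT u.country, s.feature, SUM(e.clicks)
    FROM events e
    JOIN users u ON u.user_id = e.user_id
    JOIN screens s ON s.screen_id = e.screen_id
    GROUP BY u.country, s.feature
    ORDER BY u.country, s.feature
""").fetchall()
# rows -> [('DE', 'billing', 3), ('HR', 'billing', 1), ('HR', 'roaming', 4)]
```

At 50 million documents a day, joins like this are what push a design towards a data lake with a distributed query engine rather than a search index.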
Digging deeper with AWS (and other software tools)
Singhal outlined a new workflow architecture, using various software tools, to support OneApp and meet the system requirements. It is implemented and managed on Amazon Web Services (AWS) public-cloud infrastructure, which acts as a "centralised data lake".
Among the software-tool contributors are Logstash, Apache Spark, Parquet, and Airflow.
Open-source Logstash collects data from a variety of sources; Apache Spark processes large amounts of data in distributed fashion; Parquet is a columnar storage format that compresses data efficiently; and Airflow helps with data pipeline management.
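To give a flavour of Logstash's role in such a setup, a minimal pipeline configuration might look like the following. The broker host, queue name, and S3 bucket are illustrative placeholders, not details of DT's actual deployment.

```
input {
  rabbitmq {
    host  => "broker.example.com"   # illustrative host
    queue => "oneapp-click-events"  # illustrative queue name
  }
}
filter {
  json { source => "message" }      # parse the click-event payload
}
output {
  s3 {
    bucket => "oneapp-data-lake"    # landing zone in the AWS data lake
    codec  => "json_lines"
  }
}
```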
Singhal promised to provide more details of the new system in subsequent blogs.