Data analysis in collaboration with WirelessCar

From ISLAB/CAISR
Title Data analysis in collaboration with WirelessCar
Summary Data analysis in collaboration with WirelessCar
Keywords
TimeFrame
References
Prerequisites
Author
Supervisor Mahmoud Rahat, Peyman Mashhadi, Slawomir Nowaczyk
Level Master
Status Open


WirelessCar is a promising company whose main mantra is leading the automotive industry towards a digital society. The company has provided us with an interesting dataset. The goal of this thesis is twofold. The first goal is advanced exploratory data analysis (EDA) to gain insight from the data and, ideally, turn it into a story. The second goal is to use that insight to define an exciting application with both research and business value. From a machine learning point of view, this application could involve supervised learning, unsupervised learning, feature representation learning, adversarial learning, and so on. The real advantage of this project is that you get to work with a real dataset and go through the entire pipeline of designing a successful project from both research and business perspectives, in close collaboration with WirelessCar.

The dataset contains information about many trips taken by different vehicles. It is organized as a hierarchy with three levels. The first level holds coarse-grained information about each trip, including start and end GPS positions, total fuel consumption, and other high-level information. The second level gives a more detailed representation of each trip. The third level breaks each trip down into segments called waypoints and contains information sampled for each segment. There are many use cases that WirelessCar could be interested in. To name a few, the company is interested in finding driver-related behavior patterns and their relation to cost, time, and traffic. One interesting question is whether the route a driver takes to a specific location is cost-optimal. This can be addressed with EDA by comparing the route with the behavior of other drivers. As a next step, it can be accompanied by a machine learning algorithm that predicts the optimal route, taking time and cost into account. Another use case is the possibility of carpooling, which would reduce traffic and cost. These use cases can be understood and analyzed through advanced exploratory data analysis and then converted into a high-impact machine learning problem.
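
As an illustration of the cost-optimality question, the following is a minimal sketch of how a first-level trip record could be compared with other trips between roughly the same start and end positions. It is not based on the actual dataset schema: the `Trip` fields, the grid-based grouping of GPS positions, and the use of total fuel consumption as the only cost measure are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Tuple


@dataclass
class Trip:
    """Hypothetical first-level trip record; the field names are assumptions."""
    trip_id: str
    start: Tuple[float, float]  # (latitude, longitude)
    end: Tuple[float, float]
    fuel_litres: float


def od_key(trip, decimals=2):
    """Bucket start and end positions on a coarse grid (0.01 degrees of latitude is roughly 1 km)."""
    return (
        tuple(round(c, decimals) for c in trip.start),
        tuple(round(c, decimals) for c in trip.end),
    )


def fuel_vs_peers(trips, decimals=2):
    """Map trip_id to fuel used divided by the median fuel of all trips in the same bucket."""
    groups = defaultdict(list)
    for trip in trips:
        groups[od_key(trip, decimals)].append(trip)

    ratios = {}
    for members in groups.values():
        if len(members) < 2:
            continue  # no other trip to compare against
        baseline = median(m.fuel_litres for m in members)
        if baseline <= 0:
            continue  # avoid dividing by a zero or invalid baseline
        for m in members:
            ratios[m.trip_id] = m.fuel_litres / baseline
    return ratios


# Made-up example data: three trips share an origin and destination, one does not.
trips = [
    Trip("a", (57.7089, 11.9746), (56.6744, 12.8578), 9.0),
    Trip("b", (57.7091, 11.9748), (56.6741, 12.8580), 10.0),
    Trip("c", (57.7088, 11.9749), (56.6743, 12.8575), 14.0),
    Trip("d", (59.3293, 18.0686), (59.8586, 17.6389), 5.0),
]
ratios = fuel_vs_peers(trips)
```

A ratio well above 1 marks a trip that used more fuel than is typical for the same origin and destination, which makes it a candidate for closer inspection at the waypoint level. A real analysis would probably need a better grouping than a fixed grid, and would need to account for time of day, vehicle type, and traffic.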