SmarterRoutes - Data-driven road complexity estimation for level-of-detail adaptation of navigation services

SmarterRoutes aims to improve navigational services and make them more dynamic and personalised by data-driven and environmentally-aware road scene complexity estimation. SmarterRoutes divides complexity into two subtypes: perceived and descriptive complexity. In the SmarterRoutes architecture, the overall road scene complexity is indicated by combining and merging parameters from both types of complexity. Descriptive complexity is derived from geospatial data sources, traffic data and sensor analysis. The architecture is currently using OpenStreetMap (OSM) tag analysis, Meten-In-Vlaanderen (MIV) derived traffic info and the Alaro weather model of the Royal Meteorological Institute of Belgium (RMI) as descriptive complexity indicators. For the perceived complexity an image based complexity estimation mechanism is presented. This image based DenseNet Convolutional Neural Network (CNN) uses Street View images as input and was pretrained on buildings with Bag-of-Words and Structure-from-Motion features. The model calculates an image descriptor allowing comparison of images by calculation of the Euclidean distances between descriptors. SmarterRoutes extends this model by additional hand-labelled rankings of road scene images to predict visual road complexity. The reuse of an existing pretrained model with an additional ranking mechanism produces results corresponding with subjective assessments of end-users. Finally, the global complexity mechanism combines the aforementioned sub-mechanisms and produces a service which should facilitate user-centred context-aware navigation by intelligent data selection and/or omission based on SmarterRoutes' complexity input.


Introduction
As roads get busier and our living areas more densely populated, driving has become challenging, especially in cognitively demanding circumstances such as complex junctions or traffic congestion. In those highly critical situations, extra help would be highly beneficial for a road user. Advanced Driver Assistance Systems (ADAS) is the collective term to describe these supportive technologies ((EC), 2016) (Research et al., 2002). Braking support, lane detection, parking aid and emergency calls are just a handful of the numerous efforts that have been made to improve driving safety. However, these state-of-the-art assistance features are usually designed for the average driver. There is little or no input from the user, meaning that they generally rely only on the sensors of the vehicle to make decisions (Hasenjäger and Wersing, 2017). Recently, some major contributions towards user-centred data-driven navigation assistance have been made. As an example, Okamoto and Tsiotras (Okamoto and Tsiotras, 2019) demonstrate, by means of a data-driven steering wheel torque prediction model, that additional data might improve ADAS implementations. Combined with driver behaviour profiling (Ferreira et al., 2017), such a model could help driving assistance platforms better predict and prevent possible crashes based on a driver's steering combined with knowledge about his/her behaviour profile.

Complexity driven route suggestion
Along with the aforementioned assisting technologies, an optimal route with appropriate, well-timed navigation instructions can also contribute to the user-centred context-aware driving experience. Giannopoulos et al. discovered, for instance, that driver age and spatial abilities have a considerable influence on the desired timings of the navigation instructions (Giannopoulos et al., 2017). Sladewski et al. (Sladewski et al., 2017) implemented a route planning layout based on weights originating from ranking the road turns and their accompanying complexity. This is a great example of less apparent characteristics which make navigation challenging. In complex urban situations, drivers (and especially older or inexperienced ones (Giannopoulos et al., 2017)) might not want the fastest but the safest or easiest route. Duckham and Kulik (Duckham and Kulik, 2003) implemented such a routing algorithm and favoured routes with less complex manoeuvres over the absolute shortest path. They also found that an easier route from A to B was on average 16% longer than the corresponding shortest path. Similar approaches are the algorithm of Krisp and Keler (Krisp and Keler, 2015), estimating complexity based on the number of nearby nodes detected in OSM, and the Least Angle Strategy of Hochmair (Hochmair, 2000), which prefers roads with the least deviation from the direct target direction at an intersection.
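To make the idea behind the Least Angle Strategy concrete, the following minimal sketch picks, at a single intersection, the outgoing road whose direction deviates least from the straight line towards the destination. It is an illustration only and not Hochmair's implementation: the function names, the planar (x, y) coordinates and the example intersection are assumptions.

```python
import math

def bearing_deg(origin, target):
    """Planar bearing in degrees (0 = north/+y, clockwise) from origin to target."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0

def angular_deviation(a_deg, b_deg):
    """Smallest absolute difference between two bearings, in [0, 180]."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def least_angle_choice(intersection, neighbours, destination):
    """Return the name of the neighbour whose direction deviates least
    from the direct direction towards the destination."""
    target_bearing = bearing_deg(intersection, destination)
    return min(
        neighbours,
        key=lambda name: angular_deviation(
            bearing_deg(intersection, neighbours[name]), target_bearing
        ),
    )

# Hypothetical intersection with three outgoing roads (planar coordinates).
intersection = (0.0, 0.0)
destination = (10.0, 10.0)  # lies to the north-east
neighbours = {"north": (0.0, 5.0), "east": (5.0, 1.0), "south": (0.0, -5.0)}
choice = least_angle_choice(intersection, neighbours, destination)
```

In this example the road labelled "east" is selected, as its bearing (about 79 degrees) is closer to the target bearing of 45 degrees than the road heading due north.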

User driving preferences
Another contributing factor towards user-centred context-adaptive navigation is the inclusion of user preferences and customisation. Michon and Denis (Michon and Denis, 2001) performed a user study to get an idea of how users would formulate navigation instructions after they were shown the way from a starting point to a destination in Paris. They found that the test persons often fell back on landmarks (e.g. objects or buildings standing out of the environment) when manoeuvres became more difficult and complex. Various efforts have been made to elaborate landmark based orientation in navigation instructions. Richter and Klippel (Richter and Klippel, 2005) implemented a solution which started from basic and abstract routing directions that were then projected on the actual environmental situation. The finding that users prefer landmark based navigation when complexity increases can be incorporated in SmarterRoutes' implementation. The utilisation of meta-information from geospatial data sources (e.g. does a given intersection have traffic lights or is there a bus stop nearby) serves as an input for complexity estimation during navigation.
The overall advancement of technology and the availability of extensive (real-time) data introduce the need for data management, as navigation and driving can become very challenging with an overload of data at inappropriate moments. A mixture of bad weather conditions, busy traffic and complex driving environments combined with data overload and very complex driving instructions might cause dangerous situations for the driver (Rolison et al., 2018). To prevent such situations from happening, data filtering and the provision of appropriate, well-timed instructions should be implemented. This paper tries to contribute to the risk-assessment and data-filtering process by proposing a road-scene complexity judgement model based on the combination of geospatial, sensor and image data.

What is complexity?
Before the components of our complexity system are introduced, we should first define what a complex system actually is. The exact definition of complexity has evolved over the years. In the early days, the Latin word "complexus" literally meant "weaving things together" (Schlindwein and Ison, 2004). Ottino (Ottino, 2003) proposed a definition for complex systems: a system can be considered complex if a lot of individual components and interactions exist. As a lot of things happen simultaneously and are dependent on each other, it is acceptable to consider the road network and the accompanying traffic as a complex system. The different types of roads with all their road furniture and the interactions of drivers with each other and with the road infrastructure are perfect examples of elements which define the overall complexity of a road scene. Schlindwein and Ison concluded that complexity is best described by classifying its possible assessments into one of two subcategories: descriptive or perceived complexity. The former is the category in which quantitative measurements of complexity can be placed. In our road based context we can sum up a number of examples such as the number of cars on the road, the distance between them or the number of speed bumps. The latter, perceived complexity, is the group which tries to quantify the perception of an observer. In our situation, this can be the thoughts of a driver whilst driving along a certain road segment (e.g. this looks dangerous or this sector is badly illuminated or I can barely see the road).

Figure 1. A visualisation of context-aware data display whilst driving. Show less information when there is congestion.
A distinct complexity indication might in turn serve as the ultimate input for a smart navigation implementation. In the following subsections we discuss the aforementioned building blocks of the complexity mechanism in further detail.

Level-of-detail driven data management
As already discussed in the introduction, proper route planning should pay attention to the potential factors which make driving hard. A context-aware path-calculation implementation should provide the best (or least bad) possible route by considering both the available environmental circumstances and the user. The next important step towards user-centred navigation is the provision of clear instructions tailored to a specific user and/or use case. In a user-centred navigation implementation we have to make the following consideration: "how much and at which moment do users want to see the relevant (navigation) information?". The complexity model proposed here contributes towards a concise answer to this question by providing a road-scene complexity gauge. Several studies have been performed around this exact topic. Giannopoulos et al. (Giannopoulos et al., 2017) performed a user study to get an insight into when and how much a user needed navigation instructions to perform an A to B routing task. The participants could ask for the same audio navigation instruction multiple times for the same manoeuvre, but this only happened in 14.4% of the situations. They also found that the timing of the instructions was highly dependent on a number of environmental properties such as direct visibility of the decision point (DP), the length of the segment (between DPs) and the type of intersection of the DP. This useful insight is elaborated in the proposed complexity model by geospatial and image based analysis. Furthermore, they also investigated the user characteristics which were related to the actual timing of the instructions and concluded that the age of the driver and their general sense of direction are the most correlated with appropriate instruction timing. Every individual behaves in a different way and has their own preferences. One driver might be obsessed with data and would love to know every single detail whilst driving. Another one might want the least information possible and have a navigational device which provides just enough information to conveniently orient themselves. In a user-centred navigational approach this choice should really be up to the users themselves. A parametric dashboard architecture which offers a predefined collection of data fields and services based on the type of user might be a good design to fulfil this important aspect of user-driven navigation.
Nonetheless, Cellario (Cellario, 2001) pointed out that a huge amount of data might cause information overload during driving. Research performed by Morris et al. (Morris et al., 2015) showed that distractions of as little as 2 seconds considerably increase the risk of an accident. The US Department of Transportation also made considerable efforts in this field of study. They concluded that information should be checked against the following criteria: information type; priority and complexity; trip status and driving load; and the driver profile (Hulse et al., 1998). Although the latter two criteria are undeniably very important to the safety of an intelligent navigation system, we will mainly focus on the former aspect of data filtering. The added dimension of data and distraction management introduces a supplementary use case for the context-aware complexity indication mechanism. In the following subsections, a mechanism to estimate environmental road scene complexity based on image, geospatial and sensor data will be discussed.

Scoring an environment's complexity
Generally, humans are capable of visually judging the immediate risk of a certain road situation. This is what we previously called the perceived complexity of an individual. In stark contrast, traffic accident studies show that the major part of incidents is caused by human mistakes (Rolison et al., 2018). A human's judgement results from combining the input of our senses with our gathered background knowledge. The eyes take a snapshot of the environment and the human brain tries to link this snapshot with previously encountered situations. Background knowledge about the type of environment ultimately provides us with an idea of the road scene's complexity. To mimic this natural behaviour, the proposed complexity measurement mechanism uses transfer learning on Convolutional Neural Networks (CNNs) with human judgements of visually perceived risk of a road scene as training labels. A Convolutional Neural Network is a deep neural network which is especially suitable for image input. Image input is processed by a variety of matrix operations which reduce the individual pixels to a feature matrix. The exact feature matrix is the result of training the model with training data for a specific use case (e.g. road type classification). For readers who want a more in-depth introduction to convolutional neural networks we refer to the work of O'Shea and Nash (O'Shea and Nash, 2015).
Visual complexity score: Convolutional Neural Net Image Retrieval (CIR)
As mentioned, increasing road complexity clearly imposes the need for context-adaptive navigation and driving assistance. The first component of the complexity estimation mechanism is computer-aided visual road scene complexity estimation. In the next paragraph we will introduce such an implementation which emulates the human's visual road scene perception. The suggested implementation uses the proposed framework of Radenović et al. (Radenović et al., 2018). Their solution excels in finding similarity among images and achieves this by careful selection of descriptor features. Positive matches were detected in large image data sets based on a combination of Bag-of-Words (BoW) and Structure-from-Motion techniques. As negative matches the closest negative image and its k-nearest neighbours are selected (a match is marked as negative by additional 3D reconstruction methods verifying that the match isn't just showing the object from another point of view). When the tuples (image, best positive match, closest negative matches) are calculated for a training dataset, a Convolutional Neural Network can be fine-tuned to minimise positive matching distances and maximise the negative matching ones (see Fig. 2 for the schematic overview of this approach). Fig. 3 illustrates that the model is very capable of finding a visually very similar image in its collection of labelled training data. (Naik et al., 2014). In this study, head-to-head comparisons between 2 street scenes were performed. During this experiment volunteers were asked to select the pictures that look most safe, wealthy, lively, beautiful or depressing. For our purpose we weighed and normalised the individual comparisons and favoured the 'safety, wealthy and lively' comparisons of the set of available decision variables. The other images (18) were hand labelled. Rankings accompanied by their image paths are stored in a separate csv file. Likewise, image descriptors are calculated using the pretrained model of Radenović et al. (Radenović et al., 2018) and are stored in .h5 files (Hierarchical Data Format). Road scene complexity can now be estimated by calculating the descriptor of a new/unseen street scene image and by comparing it with the descriptors in the indexed h5 file consisting of the training images. Image similarity is defined as the Euclidean distance between their descriptors. Eq. 1 and Eq. 2 calculate the best and second best match for an image query ($I_Q$). When the best ($I_{best}$) and the second best match ($I_{2nd\,best}$) are known, a weighted final score (Eq. 3) can be determined. The exact weights $w_1$ and $w_2$ are currently set to $\frac{3}{4}$ and $\frac{1}{4}$. Further and thorough evaluation is needed to fine-tune and verify the values for the weights.

$$I_{best} = \underset{I \in D}{\arg\min}\; \lVert d(I_Q) - d(I) \rVert_2 \tag{1}$$

$$I_{2nd\,best} = \underset{I \in D \setminus \{I_{best}\}}{\arg\min}\; \lVert d(I_Q) - d(I) \rVert_2 \tag{2}$$

$$C(I_Q) = w_1\, r(I_{best}) + w_2\, r(I_{2nd\,best}) \tag{3}$$

With: $D$ the collection of indexed training images, $d(I)$ the descriptor of image $I$, $r(I)$ the human-labelled ranking of image $I$ and $C(I_Q)$ the resulting complexity score of the query image.

The approach looks promising as the real power lies in the flexibility, relative simplicity and ease of use.
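The lookup and scoring step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SmarterRoutes implementation: descriptors are plain lists of floats rather than entries read from an h5 file, the function names are hypothetical, and the weight of 3/4 is assumed to belong to the best match and 1/4 to the second best match.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate_complexity(query_descriptor, index, w_best=0.75, w_second=0.25):
    """Weighted complexity score from the two closest indexed images.

    index is a list of (descriptor, human_ranking) pairs.
    Assumption: the larger weight applies to the best match.
    """
    if len(index) < 2:
        raise ValueError("at least two indexed images are required")
    # Sort the indexed images by descriptor distance to the query image.
    ranked = sorted(index, key=lambda item: euclidean(query_descriptor, item[0]))
    best, second = ranked[0], ranked[1]
    return w_best * best[1] + w_second * second[1]

# Toy index with two-dimensional descriptors (real descriptors are much longer).
index = [([0.0, 0.0], 2.0), ([1.0, 0.0], 6.0), ([5.0, 5.0], 9.0)]
score = estimate_complexity([0.2, 0.0], index)  # 0.75 * 2.0 + 0.25 * 6.0 = 3.0
```

A linear scan is sufficient for a small index such as the one used here; a larger collection of labelled images would probably call for a dedicated nearest-neighbour search structure.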
The biggest contributor to its overall ranking potential is the adoption of a well established pretrained model tailored for recognising similarity in 3D objects (buildings or road scenes). When this is combined with an additional ranking (and lookup) mechanism, a basic but functional ranking mechanism is obtained. Had we opted to retrain an entire model for the image-based complexity judgement task, we would have needed a lot of labelled training data (Foody et al., 1995), which is unfortunately not yet available, but is an aspect we are actively working on. The re-use of an existing CNN-based similarity model, combined with an initial collection of road scene images which are accompanied by human labelled rankings, already provides us with a basic, but usable model. Initial user tests show that the generated complexity scores correspond with subjective evaluations of the testers.
Additionally, to verify this first iteration of the CNN based complexity gauging mechanism, 20 hand labelled images were used to validate the model. As previously discussed, the model used to determine perceived complexity looks for similarity in the images to find the best match in the dataset (images with calculated CNN descriptors and the human-labelled complexity). The bigger and more universal this dataset is, the bigger the chance of finding a road scene with similar looking characteristics (e.g. bridges, zebra crossings). With this consideration in mind, the initial model's achieved mean squared error (MSE) of 2.92 is a promising starting point for the following versions with more human-labelled street scene images covering a wider geographical area.
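For reference, the mean squared error is the average of the squared differences between predicted and human-labelled scores. The sketch below shows the calculation on four made-up score pairs (illustrative values, not taken from the validation set) and converts the reported MSE of 2.92 into a root mean squared error, which is roughly 1.7 points on the 10-point scale used for the complexity scores.

```python
import math

def mean_squared_error(predicted, labelled):
    """Average squared difference between predicted and labelled scores."""
    if len(predicted) != len(labelled) or not predicted:
        raise ValueError("both lists must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, labelled)) / len(predicted)

# Illustrative values only; these are not the 20 validation images.
predicted = [5.0, 3.0, 8.0, 6.5]
labelled = [4.0, 5.0, 7.0, 6.5]
mse = mean_squared_error(predicted, labelled)  # (1 + 4 + 1 + 0) / 4 = 1.5

# Root mean squared error corresponding to the reported MSE of 2.92.
rmse_reported = math.sqrt(2.92)
```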

Geospatial and sensor complexity analysis of a road scene
The CNN-based model and ranking system emulates the visually perceived road scene safety.As mentioned in the introduction, human perceived complexity isn't always an accurate gauge for the actual complexity.
The large number of road accidents which occur due to human error is a perfect example of this fact. When we want to obtain a complexity model that minimises the potential danger of human (mis)perception, we should also include descriptive complexity indicators in our complexity mechanism. The image based complexity can be considered as a baseline for complexity. This is a discrete indicator of how safe or dangerous the road situation is perceived to be by a road user. In a next step, the initial value can be checked against or further fine-tuned by the descriptive complexity indicators of the architecture. Currently we have implemented geospatial analysis (Fig. 5) using OpenStreetMap as its data source. This mechanism provides us with meta-information about the road which the driver is currently using. Table 1 shows, by way of an example, the geospatial information which is obtained for our complexity mechanism. The properties allow categorisation of the environment based on land use (e.g. grassland, woodland and urban) and road type (e.g. highway, primary, secondary) during the fusion and reasoning stage (Fig. 5). The other currently implemented descriptive measures of complexity are traffic and weather analysis.
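As an illustration of the kind of meta-information that tag analysis yields, the sketch below summarises a dictionary of OSM tags into the properties mentioned above. The keys `highway`, `landuse`, `maxspeed` and `surface` are existing OSM tag keys; the function name, the output field names and the assumption that all tags have already been collected into one dictionary are hypothetical. Missing tags are kept explicit, since such properties are often not present in the OSM database.

```python
def describe_way(tags):
    """Summarise OSM tags into road properties, keeping unknown values as None."""
    maxspeed = tags.get("maxspeed")
    summary = {
        "road_type": tags.get("highway"),
        "land_use": tags.get("landuse"),
        # Only plain numeric values are parsed; values such as "30 mph" stay unknown.
        "max_speed_kmh": int(maxspeed) if maxspeed and maxspeed.isdigit() else None,
        "surface": tags.get("surface"),
    }
    # Names of the properties that could not be derived from the tags.
    unknown = sorted(key for key, value in summary.items() if value is None)
    return summary, unknown

# Hypothetical tags of a tertiary road without land use or surface information.
way_tags = {"highway": "tertiary", "maxspeed": "50"}
summary, unknown = describe_way(way_tags)
```

Returning the list of unknown properties separately lets a later fusion step decide whether to fall back on another data source or to ignore the missing indicator.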
This modular approach facilitates additional inclusion of sensor data to adapt the complexity scoring to a specific use case based on additional and carefully selected environmental and peripheral input (e.g. extra inclusion of turning angle or driver stress level). Furthermore, as more and more people tend to carry a smartphone, active research around additional useful sensor data should definitely be considered. An example of a possibly useful sensor is the accelerometer, which is standard equipment for most everyday smartphones. Logging acceleration and monitoring changes over time might provide valuable contributions towards overall complexity estimation. Zang et al. (Zang et al., 2018) proposed a method for road surface roughness estimation using a mobile application. Their work shows the possibility of getting road surface insights from smartphone acceleration sensors. For certain transport modes (e.g. motor riders or cyclists) accelerometer data might also provide an approximate leaning angle during turns (Lingesan and Rajesh, 2018). When both surface knowledge and tilt angles are combined, a general idea of how "sporty" a bike rider is taking turns can be obtained, and the rider can be warned if a certain leaning angle was too extreme for a given type of road.
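A very simple way to turn logged acceleration into a roughness indication is to compute, per time window, the root mean square (RMS) of the deviation of the vertical acceleration from its window mean. The sketch below illustrates this generic proxy; it is not the method of Zang et al., and the function name, window size and sample values are assumptions.

```python
import math

def windowed_rms(vertical_acc, window):
    """RMS deviation from the window mean for each non-overlapping window.

    vertical_acc holds vertical acceleration samples in m/s^2; a trailing
    incomplete window is ignored.
    """
    values = []
    for start in range(0, len(vertical_acc) - window + 1, window):
        chunk = vertical_acc[start:start + window]
        mean = sum(chunk) / window
        values.append(math.sqrt(sum((a - mean) ** 2 for a in chunk) / window))
    return values

# Hypothetical samples: a smooth stretch followed by a bumpy stretch.
smooth = [9.8, 9.9, 9.7, 9.8]
rough = [9.8, 12.8, 6.8, 9.8]
rms = windowed_rms(smooth + rough, window=4)
```

The second window yields a clearly higher value than the first, which is the kind of change over time that could feed into the descriptive complexity indicators.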
Another interesting input source might be real-time imagery coming from action or dash cameras. This bypasses the use of Street View based road scene images, which do not always optimally represent the current road situation (e.g. different time of the day, road works). Additionally, together with the guarantee of real-time footage which can be used for more accurate image based street scene complexity, more profound insights into the environment might be obtained as well. Some examples of possible insights are the weather situation (clouded or clear sky or even temperature (Chu et al., 2018)) or image based road surface categorisation (Slavkovikj et al., 2014).

The influence of user characteristics on complexity
The previously introduced complexity mechanisms were all based on the road environment and its related context. The characteristics of the end-user are another equally important factor which should also be considered in a context-aware LoD-management platform. As mentioned in the introduction of this paper (section 1), several interesting studies around a user's navigational behaviour exist. Two important user specific characteristics are their age and their general sense of direction. Combination of both characteristics can result in various driver profiles which can be linked to specific rules on how they behave at a certain complexity level. For instance, when a driver is profiled as "a novice driver with mediocre geospatial capabilities", these rules might result in the omission of certain data in favour of more thorough route guidance during complex driving situations.
Finally, a global complexity indication mechanism can be compiled using the various sources of geospatial and sensor data combined with knowledge about the type of driver.
With the help of this indication, a data-driven navigation service should now be able to make a well considered decision about which information to omit/display and how exactly to display the necessary information, based on the complexity ranking and the provided user preferences.
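Such a decision could, for example, take the form of a small rule set that maps the complexity score and the driver profile to the data fields that remain visible. The sketch below is purely illustrative: the thresholds, profile names, field names and priorities are hypothetical and would have to be derived from user studies.

```python
def select_fields(complexity, profile, available):
    """Return the data fields to display for a given complexity and profile.

    available maps a field name to its priority (1 = essential, 3 = optional).
    Thresholds and profile rules are hypothetical.
    """
    if complexity >= 7:
        max_priority = 1   # high complexity: essential information only
    elif complexity >= 4:
        max_priority = 2   # average complexity: drop optional information
    else:
        max_priority = 3   # low complexity: everything may be shown
    # Novice drivers get one level less information in favour of route guidance.
    if profile == "novice" and max_priority > 1:
        max_priority -= 1
    return sorted(name for name, prio in available.items() if prio <= max_priority)

fields = {"next_turn": 1, "speed_limit": 2, "eta": 2, "points_of_interest": 3}
shown_experienced = select_fields(5.25, "experienced", fields)
shown_novice = select_fields(5.25, "novice", fields)
```

With an average complexity of 5.25, the experienced driver still sees the speed limit and estimated time of arrival, whereas the novice driver only receives the next turn instruction.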

Evaluation
In the previous section the various components of the complexity framework have been discussed in detail. The following step consisted of testing the introduced mechanisms against some real-life scenarios to check whether the indicators correspond with these scenarios. The data selection and processing of potential complexity analysis mechanisms was realised in correspondence with SmarterRoutes' general concepts and principles about modularity. Extra components can be added as needed and the subsequent reasoning and unification of the selected mechanisms ideally happen at the level of a user or user group.
To demonstrate and verify the perceived complexity CNN model, we implemented a basic set of descriptive complexity parameters to suit a broad group of users' needs. Currently, geospatial analysis is performed by OSM way tag analysis supplying maximum speed, land use, road type and basic intersection information. Sensor based analytics are obtained from the Alaro weather model of the Royal Meteorological Institute of Belgium (RMI), providing the complexity mechanism with precipitation, temperature and wind condition. Traffic information was obtained by periodically polling the data resulting from the governmental initiative "Meten in Vlaanderen (MIV)", which analyses the traffic statuses of Belgium's major roads and provides the model with real-time vehicle speeds, road occupation and speed difference for a certain location.
As a showcase for the various mechanisms we show the analysis for a road scene of a Flemish village centre. Figure 8 shows the full complexity analysis for the centre of Moerbeke-Waas, Flanders. The perceived image based complexity mechanism indicates average complexity (5.25/10). As shown under "Closest intersection" (Table 1) and also visible on the map in Figure 8, a 3-way intersection is nearing. This fact should also be considered in the overall complexity estimation. Traffic info is irrelevant for this exact location as the closest measurement point is more than 2 km away. Additionally, the geospatial (tertiary road) and sensor based analysis (i.e. good weather and relatively low wind speed) indicate that the road scene circumstances are relatively safe. The final recommendation coming out of the model would be "average complexity, intersection coming up".
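The reasoning in this example can be mimicked by a handful of rules. The following sketch reproduces the recommendation for the village centre under assumed thresholds: the score boundaries for "low", "average" and "high", the 2 km cut-off for traffic measurement points and the function names are hypothetical, and the distance of 2.5 km merely stands in for "more than 2 km".

```python
def complexity_label(score):
    """Map a 0-10 complexity score to a coarse label (hypothetical boundaries)."""
    if score < 3.5:
        return "low"
    if score < 6.5:
        return "average"
    return "high"

def recommend(perceived, intersection_nearby, traffic_point_km, congested,
              max_traffic_km=2.0):
    """Fuse perceived complexity with simple descriptive indicators."""
    label = complexity_label(perceived)
    # Traffic data only counts when the measurement point is close enough.
    if traffic_point_km <= max_traffic_km and congested:
        label = "high"
    message = f"{label} complexity"
    if intersection_nearby:
        message += ", intersection coming up"
    return message

# Village centre example: score 5.25, 3-way intersection ahead,
# closest traffic measurement point assumed to be 2.5 km away.
recommendation = recommend(5.25, True, 2.5, False)
# recommendation == "average complexity, intersection coming up"
```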
As mentioned, the image based complexity mechanism was using a minimal amount of training data and the initial model gives an MSE of 2.92 on a verification set of 20 images. We are currently gathering more training data to get a bigger and geographically broader dataset. For this additional data gathering a number of A to B routes were calculated and Street View footage was obtained for these routes. A total of 6 routes (see Fig. 9), spread across the Flanders region (Belgium), resulted in a total of 2245 images which will be hand labelled by test users using an in-house web-based labelling tool (see Fig. 10).
Although the model might become more accurate when provided with more training data, perceived complexity will still have several shortcomings. First, an important consideration is the fact that Street View images are just snapshots of the environment. This footage can be outdated or even unavailable for certain regions. Another important consideration is that complexity can change throughout the day. For instance, a highway will look more complex when a lot of vehicles are on the Street View image than it would when traffic was entirely quiet.

Conclusion
The introduced SmarterRoutes complexity mechanism was implemented with end-user customisation and simplicity in mind.The human's perceived complexity is replicated by a CNN-model with an additional ranking mechanism.
In addition, geospatial and sensor based analysis provide descriptive complexity measurements which fine-tune or correct the user perception of road complexity. Basic and straightforward fusion and reasoning has already been experimented with (e.g. primary roads are more complex than secondary ones, or traffic congestion means high complexity), but additional future work should be done to streamline the process.
A big consideration is the fact that the whole mechanism is centred around data. The mechanism's performance is highly dependent on the quality of the supplied data. As shown in Table 1, land use, road surface or maximum speed are often unknown as users haven't yet contributed this information to the OSM database (see https://taginfo.openstreetmap.org for an idea of the coverage for certain OSM tags). Additional geospatial datasets or data augmentation methods (for instance, the work of Slavkovikj et al. (Slavkovikj et al., 2014)) could be used to get more coverage and could be considered in future iterations of the geospatial complexity mechanism.
Additionally, as we try to focus the whole complexity and level-of-detail aware navigation and data management around an individual, the need for user studies arises. As previously mentioned, the suggested image based complexity indication component would certainly benefit from additional hand-labelled Street View images. When more labelled test and training Street View data is used in our model, the MSE for the visual complexity model could be further reduced. Another direct benefit of user studies would be insight into the impact of different image sources on the perceived complexity. It would be very interesting to present the user with the same road scene both as a Street View capture and as an action cam image. This would potentially give further insights into the influence of weather situation, season, time of day or other nuances on the perceived complexity. Combined with the knowledge of traffic (and accident) experts, we could possibly also gain more insight into the circumstances under which drivers are more likely to misestimate complexity.

Figure 4. Step-by-step schematic overview of the image based complexity ranking mechanism.

Figure 5. A visual representation of the complexity estimation model with all of its sub-ranking mechanisms.

Figure 6. R1, Antwerp, relatively quiet traffic, image based complexity 1.5/10 (i.e. low complexity).

This further fortifies our decision to split complexity into a perceived and a descriptive sub-component. The time- and context-dependency of perceived complexity could be bypassed by the use of action or dash cams as they would provide the model with a real-time snapshot of the environment. As an additional bonus, they also avoid the process of downloading and processing Street View footage. Experiments have shown that this exact process takes the longest time in the entire complexity ranking mechanism (5.42 seconds were needed to download and process the Street View image, while the entire complexity indication process took 9.75 seconds).

Figure 8. Complexity analysis for a rural town centre, showing the various mechanism parameters in Table 1. Further reasoning about the parameters indicates average complexity.

Figure 9. Geographical spread of the routes used to gather additional training data.

Figure 10. The user-friendly web interface to conveniently label the routes' (see Fig. 9) Street View images.

Table 1. Complexity analysis results for rural town centre of