Eye-tracking comparison of two road atlases

Road atlas use is declining with the rise of modern GPS navigation systems. However, some people still use road atlases, either in combination with mobile navigation or on their own when navigating during a trip. Road atlases therefore continue to be published. They are regularly updated and gradually change, for example when the map symbology changes or when map production takes new technological possibilities into account. Changes in map symbology are the main focus of this paper. The experiment was motivated by a statement made during an expert discussion by the head of the largest cartographic publishing house in the Czech Republic, who suggested that a 20-year-old road atlas is easier to read and that the required information can be found more quickly in it than in the new atlas. For the comparison of the "old" and "new" atlases, three hypotheses were established: (1) Accuracy of responses will be higher for the "old" atlas, (2) Time for task completion will be lower for the "old" atlas, and (3) Orientation in the maps from the "old" atlas will be easier. The eye-tracking experiment did not confirm the first hypothesis, but statistical testing confirmed the second and third hypotheses. The reasons for the different results for the "old" and "new" atlases were sometimes obvious (a change in the graphic variables of a particular map symbol); at other times, the causes were not completely clear. All experimental results were provided to the publishing house for further use in practice.


Road maps and atlases
Road maps are among the most widely used maps worldwide. In general, road maps show roads and their specifications, are intended especially for vehicular movement, and are usually used for finding automotive transit routes. The vehicle is usually a car or motorcycle, but there are also special road atlases for trucks. In other words, road maps are used for the road navigation of motorised vehicles. Road map content depends on the map scale and the level of detail. As the primary focus, road maps show the topology of the whole road network with a distinction between road classes/types. The other content usually includes the topography of built-up areas, forested areas, points of interest (important landmarks, petrol stations, car repair shops, hotels, restaurants, railway network with stations, etc.) and a relatively large number of text labels in different formats (road numbers, names of settlements, significant slopes of roads, motorway exits, etc.). Although road maps have a long history, their graphic design varies greatly around the world in the graphic variables of individual map symbols. Regarding the colour of main roads, almost all hues can be found: yellow, green, red, pink, purple, orange, black; blue, however, is used very rarely because it is the standard colour for watercourses. This diversity is not just a matter of differences between states or continents; it can also be found between the maps of individual publishers in one state or region. An example is the Czech Republic, where the study presented in this article was carried out.
While old national road maps were published as single large-format maps, they were later replaced by more detailed road atlases. This is a specific type of cartographic atlas because the atlas is not a set of different maps but is one large map cut into individual pages in a predetermined system of pages.

The role of road atlases today
It might seem that road atlases are being replaced by mobile navigation. However, this is only partially true. In an online survey that we conducted (Drahosova, 2015; 179 respondents; 101 males and 78 females; average age 26.35; higher education 58%, secondary school 31%, other education 21%), the most preferred option in practice was the use of GPS navigation together with a road atlas (46% of respondents). The use of GPS navigation alone was ranked the same as the use of a road atlas alone (19%). Given the number of respondents and the purpose of the survey as a basic overview of user preferences, these values are only indicative. However, the results do show that there is still user demand for road atlases. The main reasons are the possible unreliability of navigation due to a missing signal or erroneous information processing, habit, and the cautious relationship of (mainly) older users to modern technologies.

Road map evaluation
Navigational map reading is a complex task composed of relatively simpler cognitive subtasks (Lobben, 2007). Lobben (2007) developed the Navigational Map Reading Ability Test (NMRAT), which contains five parts (map rotation, place recognition, self-location, route memory, and a way-finding exercise) and predicts a person's ability to read maps and navigate with them. The test can be useful for those who need to identify people with map reading/navigation abilities without the need for real-world assessment. Kozlowski & Bryant (1977) evaluated university students' map reading abilities and sense of direction. These objectively measured abilities correlated with their self-assessment based on a user study conducted with a university campus map. French, Ekstrom, & Price (1963) proposed a Building Memory Test, which measures the ability to remember the position of buildings on a street map. Phillips & Noyes (1977) performed a study investigating how map design affects the search speed in a street map. The road map-reading abilities of drivers were analysed by Streeter & Vitello (1986), who designed an experiment in which three categories of drivers (experts, experienced, novices) drew a route from place A to place B. Besides this experiment, participants performed several cognitive tests and a self-appraisal of navigational abilities. Kovach Jr, Surrette, & Aamodt (1988) used a different approach. They performed an experiment with drivers, whose task was to drive from place A to place B according to one of six (informal, hand-drawn) maps that differed in design and complexity. The results showed that verbatim voice instructions were more appropriate than maps. The authors used these results as an argument for the development of a voice-oriented navigational guidance system.
Differences in user behaviour while using electronic navigation systems and paper-based maps were analysed in a study by Aichinger et al. (2014). The study was designed in a real-world environment, and the participants were driving a car in Vienna's surroundings. Visual distractions caused by the navigation device were detected through the use of eye-tracking. The study confirmed that navigation systems help to decrease distance and travel time. The results indicated a lower distractive potential for navigation systems when compared with paper maps. The influence of GPS navigation on people's attention was also analysed using eye-tracking in Hejtmánek, Oravcová, Motýl, Horáček, & Fajnerová (2018). Their results showed that the use of navigational systems negatively influences the spatial knowledge of their users. Online map portals like Google Maps or OpenStreetMap share similarities with road maps. Alacam & Dalci (2009) conducted an eye-tracking study comparing four different web maps. Eye-movement analysis showed that participants were highly fixated on local traffic signs (highway numbers). Similar stimuli were used in a comparative study by Schnur, Bektaş, & Çöltekin (2018), who analysed the perceived complexity of web maps using an online survey and compared the results with measured complexity. Another partially related type of map is the so-called metro map, a schematic map designed to represent the transportation system by preserving topological aspects. User performance and reading strategies for these maps were assessed by Netzel et al. (2017) and Burch, Kurzhals, & Weiskopf (2014).
To the best of our knowledge, no study has yet evaluated the understandability and readability of road atlases or road maps.
Advances in Cartography and GIScience of the International Cartographic Association, 3, 2021. 30th International Cartographic Conference (ICC 2021), 14-18 December 2021, Florence, Italy. This contribution underwent double-blind peer review based on the full paper. https://doi.org/10.5194/ica-adv-3-12-2021 | © Author(s) 2021. CC BY 4.0 License

Motivation and hypotheses
The motivation for the experiment's design was a statement by the director and chief cartographer of the largest cartographic publishing house in the Czech Republic (Kartografie PRAHA). In an expert discussion, she said that although the publishing house tries to create maps in a modern and experience-based way, she is better acquainted with the atlas that was published 20 years prior to the current edition. She argued that she finds the required information much easier and faster to access in the road atlas from 1994 (OLD) than in the newer atlas from 2013 (NEW), and that she perceives this to be the experience of other users of the atlas as well. We defined three hypotheses to compare the atlases from the user perspective:
H1: Accuracy of responses will be higher for the OLD atlas.
H2: Time for task completion will be lower for the OLD atlas.
H3: Orientation in the maps from the OLD atlas will be easier.

Methods
The method of eye-tracking was used for the evaluation of user perception of the two road atlases.

Stimuli and tasks
The eye-tracking experiment contained stimuli from two Czech road atlases at a scale of 1:200,000 published by Kartografie PRAHA in 1994 (OLD) and 2013 (NEW). Both atlases are compared in Table 1. The experiment contained seven pairs of stimuli (see Figure 1 for an example of one pair). Each pair depicted the same area. The tasks were to find the optimal route, municipalities, and selected symbols (gas stations and road numbers), and to determine the distance between municipalities. Tasks were presented in random order.

Apparatus
The eye-tracking experiment was conducted at the eye-tracking laboratory of the Department of Geoinformatics, Palacký University Olomouc, Czech Republic, using the SMI RED 250 eye-tracker with a sampling frequency of 250 Hz. Stimuli were presented on a 24" screen with a resolution of 1920×1200 px. Data were analysed in SMI BeGaze, OGAMA and V-Analytics software.

Participants
There were twenty-two participants in the study (14 males and 8 females). A majority of them were students of geoinformatics.

Procedure
The experimental procedure is depicted in Figure 2. At the start, the participants answered a few demographic questions. Then a nine-point calibration was performed. Calibration was considered successful when the deviation was lower than 1°. The experiment was designed as within-subject, so all participants saw all stimuli. To avoid a learning effect, the tasks were presented in random order. Moreover, the tasks for each pair were not exactly the same; however, emphasis was placed on making them as comparable as possible. At the end of the experiment, participants were asked if they were familiar with the depicted area.
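The random presentation order described above can be illustrated with a minimal sketch. It assumes that every participant sees all 14 stimuli (seven pairs, each in the OLD and NEW version) in an independently shuffled order; the function name, the seed handling, and the absence of any constraint on how close the two members of a pair may appear are assumptions made for illustration, not a description of how the order was generated in the eye-tracking software.

```python
import random

ATLASES = ("OLD", "NEW")
N_PAIRS = 7  # seven pairs of stimuli, as stated in the Stimuli and tasks section


def presentation_order(participant_id, seed=0):
    """Return a shuffled list of (pair, atlas) stimuli for one participant.

    Hypothetical helper: the per-participant seed only makes the sketch
    reproducible; it is not taken from the original experiment.
    """
    stimuli = [(pair, atlas)
               for pair in range(1, N_PAIRS + 1)
               for atlas in ATLASES]
    rng = random.Random(f"{seed}-{participant_id}")
    rng.shuffle(stimuli)
    return stimuli


if __name__ == "__main__":
    for pid in ("P11", "P21"):
        print(pid, presentation_order(pid))
```

Because the design is within-subject, each returned list contains every stimulus exactly once; only the order differs between participants.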

Accuracy of responses
As the first step of the analysis, the accuracy of the responses was evaluated. The percentage of correct answers is displayed in Figure 3. The graph shows that for the majority of tasks, maps from both atlases allow for correct perception. The number of correct responses was higher for the OLD atlas for all tasks except for the last one. These results may indicate better readability of the OLD maps. However, these differences were not significant according to the Wilcoxon signed-rank test, and therefore, further analyses were performed. The only task where the situation was opposite was the last one, in which participants were asked to estimate the distance between two municipalities. In this case, the percentage of correct answers was higher for the NEW atlas, and the difference was statistically significant (p=0.002). The distances in kilometres for particular road sections are displayed on the map. As is evident from Figure 4, the numbers are not salient in the OLD atlas, which leads to lower response accuracy. Only four participants (18.2%) answered correctly with the OLD version of the map. In contrast, with the NEW version of the map, answers from 13 participants (59.1%) were correct.
Figure 4. Example of the stimuli from Pair 7 for the OLD atlas (above) and NEW atlas (below). Numbers next to the road segments represent the distance.
Some participants did not notice those numbers and tried to find the scale in the map. As a result, their answers were incorrect. An example of this behaviour is the scanpath of participant P11 (male) displayed in the upper part of Figure 5. In contrast, participant P21 (male) recognised the numbers representing distances correctly, and all his fixations were aimed at the connection between the municipalities (lower part of Figure 5).
Figure 5. Scanpaths of two selected participants over the OLD atlas. The task was to estimate the distance between the two municipalities. Participant P11 (top) did not notice the numbers and tried to find the scale. Participant P21 (bottom) used the correct approach.
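The accuracy percentages reported for Pair 7 follow directly from the counts of correct answers among the 22 participants. The short sketch below reproduces this arithmetic; the function name is hypothetical and the counts are those given in the text.

```python
def percent_correct(n_correct, n_participants):
    """Percentage of correct answers, rounded to one decimal place."""
    return round(100 * n_correct / n_participants, 1)


N_PARTICIPANTS = 22  # from the Participants section

old_accuracy = percent_correct(4, N_PARTICIPANTS)   # OLD atlas, Pair 7
new_accuracy = percent_correct(13, N_PARTICIPANTS)  # NEW atlas, Pair 7

if __name__ == "__main__":
    print(old_accuracy)  # 18.2
    print(new_accuracy)  # 59.1
```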

Trial Duration and Scanpath Length
In the next step, the eye-tracking metrics Trial Duration and Scanpath Length were statistically evaluated using the Wilcoxon rank-sum test at the significance level α = 0.05. First, the time needed to solve a task was evaluated. The time limit of 60 seconds was reached eight times during the experiment. As shown in Figure 6, a statistically significant difference was observed between Trial Duration for tasks using the OLD and NEW atlases: participants needed more time for task completion with the NEW atlas. Detailed analysis of Trial Duration for individual tasks showed that more time was needed for the NEW atlas for all tasks except the last one (Figure 7). Statistically significant differences were observed in pair 3 (finding the symbol) and pair 6 (finding the route).
A similar situation was observed for the eye-tracking metric Scanpath Length, which describes the length of the eye-movement trajectory in pixels. This metric reflects the complexity of the task or the understandability of the stimulus (Goldberg & Helfman, 2011). The metrics Trial Duration and Scanpath Length are usually correlated; in this experiment, the correlation was 0.89. However, the analysis of Scanpath Length led to an interesting result regarding the style of the atlases. A statistically significant difference was observed for pair 3, where the task was to find particular roads (road numbers). While the median Scanpath Length for the OLD map was 5496 px, the value for the NEW map was almost double (10654 px). This difference was caused by the very different style of the road number labels in the OLD versus the NEW atlas (see Figure 8). The different style of road labels evidently affects user perception, as is visible from the values of the Scanpath Length metric and from the Flow Map visualisation created using V-Analytics software (Andrienko, Andrienko, Burch, & Weiskopf, 2012) and shown in Figure 9. In the Flow Map, arrows represent the number of gaze moves between the created Voronoi polygons; only arrows representing more than five moves are displayed. It is evident that participants scanned the map from the NEW atlas (lower part of Figure 9) much more intensively than the map from the OLD atlas (upper part of Figure 9).
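The two measures discussed in this section can be sketched in a few lines. The code below is an illustrative approximation, not the implementation used in SMI BeGaze, OGAMA or V-Analytics: Scanpath Length is taken as the sum of Euclidean distances between consecutive fixations (in pixels), and gaze moves are counted between Voronoi cells by assigning each fixation to its nearest seed point. The function names, the seed coordinates and the example fixations are all hypothetical.

```python
import math
from collections import Counter


def scanpath_length(fixations):
    """Sum of Euclidean distances (px) between consecutive fixations."""
    return sum(math.dist(a, b) for a, b in zip(fixations, fixations[1:]))


def nearest_seed(point, seeds):
    """Index of the closest seed, i.e. the Voronoi cell containing the point."""
    return min(range(len(seeds)), key=lambda i: math.dist(point, seeds[i]))


def gaze_moves(scanpaths, seeds, more_than=5):
    """Count moves between different Voronoi cells over all scanpaths.

    Only moves that occur more than `more_than` times are returned,
    mirroring the "more than five moves" filter described in the text.
    """
    counts = Counter()
    for fixations in scanpaths:
        cells = [nearest_seed(p, seeds) for p in fixations]
        for src, dst in zip(cells, cells[1:]):
            if src != dst:
                counts[(src, dst)] += 1
    return {move: n for move, n in counts.items() if n > more_than}


if __name__ == "__main__":
    # Hypothetical fixation coordinates in screen pixels.
    path = [(0, 0), (3, 4), (3, 10)]
    print(scanpath_length(path))  # 11.0

    seeds = [(0, 0), (100, 0)]          # two hypothetical Voronoi seeds
    back_and_forth = [(10, 0), (90, 0)] * 6
    print(gaze_moves([back_and_forth], seeds))  # {(0, 1): 6}
```

In the second example, the gaze moves six times from cell 0 to cell 1 and five times back, so only the first direction passes the filter.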

Conclusion
This eye-tracking experiment proved that the feedback from cartographic publishers' customers (and from the director of the publishing house) was valid and that the maps from the older road atlas are more easily readable than the maps from the newer atlas.
At the beginning of the study, three hypotheses were formulated. The first one stated that the accuracy of responses will be higher for the OLD atlas. The accuracy of answers was higher for the OLD atlas for all but the last task; however, the differences were not statistically significant. The only significant difference was found in the last task (distance estimation), but in this case the situation was opposite. Higher accuracy was observed for the NEW atlas. This difference was caused by the different visual style of the numbers representing lengths of the road segments. The numbers were more salient in the NEW atlas. Thus the first hypothesis was not confirmed.
The second hypothesis stated that task completion using the OLD atlas will be faster. This hypothesis was confirmed, because a statistically significant difference for the Trial Duration metric was observed and the values were lower for the OLD atlas. Detailed analysis of particular tasks showed that less time was needed for the OLD atlas in all but the final task. Statistically significant differences were observed for two tasks (pairs 3 and 6).
The last hypothesis concerned orientation in the maps. Orientation was investigated using the Scanpath Length metric, which represents the length of the eye-movement trajectory. Although this metric is highly correlated with Trial Duration, its analysis helped us to identify one interesting drawback of the NEW atlas. Road numbers in the NEW atlas are not very salient and locating them was problematic for participants. The median Scanpath Length for task 3 was almost twice as long for the NEW atlas as for the OLD atlas. It can be said that the last hypothesis was also confirmed.
The results showed that the NEW atlas is not ideal and that the OLD one is, in many aspects, more readable. The results were provided to the cartographic publishing house, and the findings were taken into account during the creation of the new version of the road atlas (2019).

Acknowledgements
This paper was created within the project "Application of geospatial technologies for spatial analysis, modelling, and visualisation of spatial phenomena" (IGA_PrF_2021_020) with the support of the Internal Grant Agency of Palacký University Olomouc.