Automated Extraction of Driving Lines from Mobile Laser Scanning Point Clouds

This paper presents a novel approach to the automated generation of driving lines from mobile laser scanning (MLS) point cloud data. The proposed method consists of three steps: road surface segmentation, road marking extraction and classification, and driving line generation. First, a voxel-based upward-growing algorithm was used to extract ground points from the raw MLS point clouds, followed by segmentation of the road surface using a region-growing algorithm. Then, a statistical outlier removal filter was applied to separate and refine the road marking points, and the lane markings were extracted and classified based on the geometric features of different road markings using empirical hierarchical decision trees. Finally, lane node structures were constructed and driving lines were generated using a curve-fitting algorithm. The proposed method was tested on both circular road sections and irregular intersections. The smoothing spline curve-fitting model was tested on the circular road sections, while the Catmull-Rom spline with five control points was used to generate the driving lines at road intersections. The overall performance of the proposed algorithms is promising, with 96.0% recall, 100.0% precision, and 98.0% F1-score for the lane marking extraction. Most significantly, the validation results demonstrate that driving lines can be generated from MLS point clouds within 20 cm-level localization accuracy at an average of 3.5% miscoding, which meets the localization accuracy requirement of fully autonomous driving functions. The results demonstrate that the proposed methods can successfully generate road driving lines in the test datasets to support the development of high-definition maps.


Introduction
Digital maps created specifically for autonomous driving functions are normally called high-definition (HD) maps. Compared to commercial standard maps with metre-level localization accuracy, the sub-lane level accuracy of HD maps can reach 20 cm according to the report of the Enhanced Digital Mapping Project released by the U.S. Department of Transportation. HD maps provide highly accurate localization and navigation services to support the development of the emerging market for autonomous vehicles (AVs) (Máttyus et al., 2016). These HD maps provide rich road information and lane geometry, such as road markings, road boundaries, traffic signs, lane lines, and reference lines. Moreover, the driving lines that lie between two adjacent lane lines are elaborately depicted in HD maps, as these driving lines can be regarded as driving routes with highly precise localization information for sub-lane level navigation during autonomous driving (Bétaille and Toledo-Moreo, 2010).
Recently, mobile laser scanning (MLS) systems have proven well suited to HD map generation owing to their high data collection rate and high flexibility in large-scale urban environments. However, one of the most difficult tasks in HD map generation from unordered and noisy point clouds is to extract road markings and generate driving lines on complex road networks, including road curves and irregular intersections (Seif and Hu, 2016). Road markings and road driving lines convey the traffic rules and provide clear guidance for existing road users and autonomous vehicles. By estimating driving lines, the use of HD maps can be expanded from human-readable to machine-readable and thereby contribute to the navigation of AVs. Nevertheless, the variations in point resolutions and intensities, the low contrast between road surfaces and road markings, and the lack of consistency in MLS point clouds make accurate road marking classification and driving line generation very challenging (Cheng et al., 2017).
The first step of driving line generation is to extract road surfaces from MLS point clouds. Existing methods for road surface extraction are mainly divided into 2D georeferenced image-based methods, 3D point-based methods, and methods assisted by other input data (Kumar et al., 2013; Yu et al., 2015). Similarly, existing methods for road marking extraction, which rely on semantic knowledge (e.g., shape) and MLS intensity properties, fall into two main categories: georeferenced image-driven methods and 3D point-driven methods (Ma et al., 2018). However, it is still very challenging to propose automated algorithms that can efficiently segment 3D MLS point clouds acquired in large-scale urban environments into semantic objects. The correctness and completeness of the extracted road markings have a great impact on the performance of driving line generation. Various studies have been carried out in recent years (Zai et al., 2018; Soilán et al., 2017). Nevertheless, the reliance on prior knowledge for LiDAR-derived control point selection makes accurate driving line generation very challenging (Cabo et al., 2016). Therefore, an efficient and robust algorithm is needed that generates driving lines from large-scale and unordered MLS point clouds while precisely recording their geometry and coordinate information. In this paper, we mainly concentrate on exploring the underlying rationale and proposing a semi-automated algorithm for driving line generation using MLS point clouds to support HD maps.

Method
The developed method can be viewed as a stepwise process for MLS point cloud interpretation: (1) the input MLS point clouds are first preprocessed by a random sampling algorithm and a voxel-based upward growing algorithm to remove off-ground points; (2) road surfaces are then extracted using a region-growing based method; (3) a multi-threshold road marking extraction method is then adopted to determine the optimal intensity thresholds, followed by road marking classification using a hierarchical classification tree; (4) finally, the line nodes are constructed from the extracted road markings and road geometry information, which supports driving line generation using the Catmull-Rom spline algorithm. As shown in Fig. 1, the proposed algorithms in this study mainly include three modules: road surface extraction, road marking extraction and classification, and driving line generation. Two MLS point cloud datasets collected in urban environments, containing curved roads and irregular intersections, are used to test the performance of the proposed algorithms.

Figure 2. Voxel-based upward growing algorithm: (a) point cloud segmented into blocks; (b) voxelization process; (c) adjacent voxels of V_i; and (d) upward growing process.

Road surface extraction
The raw point clouds collected by the RIEGL VMX-450 MLS system are large in scale and thus impose a heavy computational burden. Accordingly, the random sampling algorithm packaged in the CloudCompare software was applied with a predefined number of points. Since the primary objective of this study is to generate driving lines, the non-ground points are irrelevant and were removed to improve the computational efficiency.
A voxel-based upward growing algorithm (Yu et al., 2015) was adopted. This algorithm first partitioned each dataset into blocks with a predefined block width (W_b) in the XY plane, and each block was then processed separately (see Fig. 2(a)). For each block, the voxelization process divided the point cloud into an array of voxels with a predefined width (W_v) based on their locations in 3D space. The voxels containing points were flagged as true, while the others were flagged as false. Fig. 2(b) shows the result after the voxelization of the point clouds. Two criteria were then considered: the global elevation difference (H_g) and the local elevation difference (H_l). Accordingly, the algorithm identified all the voxels with H_g smaller than the predefined threshold (T_g) as candidate ground voxels. As shown in Fig. 2(c), the algorithm then recursively grew upward from a voxel to its nine closest neighbors (i.e., N_1, N_2, ..., N_9). If a grown voxel contained points, it became the new starting point and the growth continued in the same pattern. This process terminated at the highest voxel (V_h), which has no neighbors above it that contain points. The H_l of the origin voxel V_l was then determined and compared with the predefined threshold of local height difference (T_l). Only the voxels with H_l smaller than T_l were regarded as ground voxels, and the points within these voxels were output as the ground points (see Fig. 2(d)).
The rationale of the region-growing surface extraction was to expand the road surface points from the trajectory to the curb points. The algorithm first initialized the seed voxels with the voxels that contain trajectory points. Starting from the first seed, the algorithm exhaustively searched the eight neighbors of the seed voxel in each direction. Road surface points form a group of points with similar elevation and little elevation undulation or slope change.
Based on these characteristics, the region-growing algorithm employed in this study utilized four parameters: the local elevation difference within each voxel (H_ld), the global height difference relative to the lowest voxel (H_gd), the slope difference (S_d), and the distance of the processing voxel to its origin seed voxel (D_s).
The thresholds of these four parameters were specified as T_ld, T_gd, T_slope, and T_search, respectively. If the processing voxel had a neighbor that met the requirements H_ld < T_ld, H_gd < T_gd, S_d < T_slope, and D_s < T_search, that neighbor was marked as a road surface voxel and added to the processing list. However, if the candidate voxel had already been marked as road surface, it was ignored, and the search continued with the next seed voxel in the processing list. When the region-growing algorithm ended, the points inside the road surface voxels were exported as the road surface points.
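As a rough illustration of the two voxel-based steps of this module, the following Python sketch first filters ground voxels by upward growing and then grows the road surface from seed cells. It is a simplified sketch under stated assumptions, not the implementation used in this study: voxels are represented by integer indices, the nine neighbors are taken to be the 3 x 3 voxels in the layer directly above, H_l is approximated from voxel index differences, S_d is interpreted as the slope angle between neighbouring cells, and all function names, variable names, and toy data are hypothetical.

```python
import math
from collections import deque


def highest_reachable(start, occupied):
    """Return the top layer index reached by growing upward from `start`.

    From each voxel the search moves to the nine neighbours in the layer
    directly above and continues from every neighbour that contains points.
    """
    stack, seen, top = [start], {start}, start[2]
    while stack:
        i, j, k = stack.pop()
        top = max(top, k)
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                nb = (i + di, j + dj, k + 1)
                if nb in occupied and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return top


def ground_voxels(occupied, w_v, t_g, t_l):
    """Keep voxels whose global (H_g) and local (H_l) differences are small."""
    k_low = min(k for _, _, k in occupied)
    ground = set()
    for v in occupied:
        h_g = (v[2] - k_low) * w_v  # assumed: height above the lowest voxel
        if h_g >= t_g:
            continue
        h_l = (highest_reachable(v, occupied) - v[2]) * w_v  # assumed H_l
        if h_l < t_l:
            ground.add(v)
    return ground


def grow_road_surface(cells, seeds, w_v, t_ld, t_gd, t_slope, t_search):
    """Region growing over 2D cells; `cells` maps (i, j) -> (z_min, z_max)."""
    z_low = min(z_min for z_min, _ in cells.values())
    road = {s for s in seeds if s in cells}
    queue = deque((s, s) for s in road)  # (current cell, origin seed)
    while queue:
        cur, origin = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                nb = (cur[0] + di, cur[1] + dj)
                if nb == cur or nb not in cells or nb in road:
                    continue
                z_min, z_max = cells[nb]
                h_ld = z_max - z_min
                h_gd = z_min - z_low
                s_d = math.atan2(abs(z_min - cells[cur][0]),
                                 w_v * math.hypot(di, dj))
                d_s = w_v * math.hypot(nb[0] - origin[0], nb[1] - origin[1])
                if h_ld < t_ld and h_gd < t_gd and s_d < t_slope and d_s < t_search:
                    road.add(nb)
                    queue.append((nb, origin))
    return road


# Toy block: a flat row of voxels with a vertical pole above (2, 0).
occupied = {(i, 0, 0) for i in range(5)} | {(2, 0, k) for k in range(1, 6)}
ground = ground_voxels(occupied, w_v=0.5, t_g=1.0, t_l=1.0)

# Toy road: four flat cells followed by a raised kerb and sidewalk.
cells = {(i, 0): (0.0, 0.02) for i in range(4)}
cells.update({(4, 0): (0.15, 0.17), (5, 0): (0.15, 0.17)})
road = grow_road_surface(cells, [(0, 0)], w_v=0.1, t_ld=0.05, t_gd=0.8,
                         t_slope=math.pi / 6, t_search=15.0)
```

In the toy block, the voxels next to the pole are also rejected because the pole lies among their nine upper neighbors, which shows how sensitive the filter is to the chosen voxel width.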

Road marking extraction and classification
In module II, a multi-threshold road marking extraction method was utilized to determine multiple optimal intensity thresholds. This algorithm first partitioned the road surfaces into bins with a width of W_bin. Then, the intensity threshold of each individual bin was determined using Otsu's algorithm. The extracted road markings still contained many noisy points, which were further removed using the statistical outlier removal (SOR) filter.
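The idea of one Otsu threshold per bin can be sketched as follows. This is a minimal illustration rather than the code used in this study: points are reduced to hypothetical (lateral offset, intensity) pairs, the histogram resolution `levels` and the function names are assumptions, and the SOR filtering step is not included.

```python
from collections import defaultdict


def otsu_threshold(values, levels=256):
    """Return the intensity threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    hist = [0] * levels
    for v in values:
        hist[min(int((v - lo) / (hi - lo) * levels), levels - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = sum0 = 0
    best_var, best_level = -1.0, 0
    for level, count in enumerate(hist):
        w0 += count
        sum0 += level * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_level = between_var, level
    # Map the histogram level back to the intensity scale (upper level edge).
    return lo + (best_level + 1) * (hi - lo) / levels


def extract_markings(points, w_bin):
    """Threshold intensities separately in each lateral bin of width `w_bin`.

    `points` is a list of (lateral_offset, intensity) pairs.
    """
    bins = defaultdict(list)
    for offset, intensity in points:
        bins[int(offset // w_bin)].append((offset, intensity))
    kept = []
    for _, members in sorted(bins.items()):
        threshold = otsu_threshold([i for _, i in members])
        kept.extend(p for p in members if p[1] > threshold)
    return kept


# Hypothetical data: the pavement in the second bin is brighter than the
# markings in the first bin, so a single global threshold would fail.
points = [(0.1, 10), (0.3, 12), (0.5, 11), (0.7, 50), (0.9, 52),
          (1.1, 55), (1.3, 56), (1.5, 54), (1.7, 120), (1.9, 125)]
markings = extract_markings(points, w_bin=1.0)
```

Thresholding each bin separately compensates for intensity variations across the road, which is the motivation for using multiple thresholds instead of one.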
For road marking classification, the conditional Euclidean clustering (CEC) algorithm from the Point Cloud Library (PCL) was first employed to cluster the extracted road markings into different groups based on their spatial distributions and locations. Then, as shown in Fig. 3, five types of road markings were classified using a hierarchical decision tree based on the road design standards. Additionally, a minimum bounding box calculation based on PCA transformation was conducted to calculate the width and length of the road markings, and lane markings were extracted using the predefined geometric thresholds.
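The width and length calculation can be illustrated with a small 2D sketch. It assumes that each cluster has been projected onto the XY plane and uses the closed-form principal axis of the 2 x 2 covariance matrix; the function name and the synthetic cluster are hypothetical, and the geometric thresholds of the decision tree are not reproduced here.

```python
import math


def oriented_extent(points):
    """Return (length, width) of a 2D point cluster along its principal axes."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Orientation of the first principal axis of the covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotate the centred points into the principal-axis frame.
    u = [(x - cx) * cos_t + (y - cy) * sin_t for x, y in points]
    v = [-(x - cx) * sin_t + (y - cy) * cos_t for x, y in points]
    extent_u, extent_v = max(u) - min(u), max(v) - min(v)
    return max(extent_u, extent_v), min(extent_u, extent_v)


# Synthetic marking of 3.0 m x 0.15 m, rotated by 30 degrees.
angle = math.radians(30.0)
cluster = []
for a in range(31):        # 0.0 to 3.0 m along the marking
    for b in range(4):     # 0.0 to 0.15 m across the marking
        along, across = 0.1 * a, 0.05 * b
        cluster.append((along * math.cos(angle) - across * math.sin(angle),
                        along * math.sin(angle) + across * math.cos(angle)))
length, width = oriented_extent(cluster)
```

The resulting length and width would then be compared with the geometric thresholds of the decision tree to separate lane markings from other marking types.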

Driving line generation
First, to improve the computational efficiency, the existing road system was abstracted into a line node structure with the assistance of trajectory data. If a lane centreline intersects a stop line, it is labelled as an 'exit' lane, and its vertex is labelled as an 'exit' node. Otherwise, the lane is labelled as an 'entry' lane, and its vertex is labelled as an 'entry' node. Then, the exit point e_xi was paired with the entry point e_nj. The interpoint m_ij was determined by the location of the center point O of the intersection and the offset D. To generate smooth driving lines, the Catmull-Rom spline method was employed (Catmull and Rom, 1974). Based on the characteristics of the Catmull-Rom spline, e_xi, e_nj, and m_ij were selected as three of the control points. Two other control points were selected to ensure that the curve is parallel to lane L_i and parallel to lane L_j. The functions that define the driving lines at the crossroad were thereby determined with the Catmull-Rom spline functions in Eqs. (1) and (2).
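A driving line through five control points can be sketched with the standard uniform Catmull-Rom formulation. This is an illustrative sketch only: it assumes that the exit point, the interpoint, and the entry point are the three interior control points that the curve passes through, with the two outer points controlling the end tangents. The coordinates and function names are hypothetical, and the formulation is not claimed to match Eqs. (1) and (2) exactly.

```python
def catmull_rom_point(p0, p1, p2, p3, t):
    """Evaluate the uniform Catmull-Rom segment between p1 and p2, t in [0, 1]."""
    return tuple(
        0.5 * (2.0 * b
               + (c - a) * t
               + (2.0 * a - 5.0 * b + 4.0 * c - d) * t ** 2
               + (-a + 3.0 * b - 3.0 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def driving_line(control_points, samples_per_segment=20):
    """Sample a curve that passes through the interior control points."""
    if len(control_points) < 4:
        raise ValueError("at least four control points are required")
    curve = []
    for s in range(len(control_points) - 3):
        p0, p1, p2, p3 = control_points[s:s + 4]
        for k in range(samples_per_segment):
            curve.append(
                catmull_rom_point(p0, p1, p2, p3, k / samples_per_segment))
    # Close the curve at the last interior control point.
    curve.append(tuple(float(c) for c in control_points[-2]))
    return curve


# Hypothetical left turn: outer point, exit point, interpoint, entry point,
# outer point (all coordinates in metres, made up for illustration).
controls = [(0.0, -20.0), (0.0, -10.0), (-3.0, -3.0), (-10.0, 0.0), (-20.0, 0.0)]
line = driving_line(controls, samples_per_segment=10)
```

With five control points the curve consists of two segments, so the sampled line starts at the second control point, passes through the third, and ends at the fourth.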

MLS system and point cloud data
The RIEGL VMX-450 system was used in this study. This system provides 8 mm absolute measurement accuracy with 5 mm relative precision and can generate high-density point clouds of nearly 7,000 points/m² at normal driving speed in urban environments. The primary research target in this study is urban road scenes, as illustrated in Fig. 4.

Validation
For the road marking extraction, a confusion matrix was created to analyze the classification accuracy of the different road marking types based on visual interpretation. The numbers of correctly classified and misclassified points were recorded in the matrix accordingly. Three criteria, recall, precision, and F1-score, were calculated to quantify the overall performance of the road marking extraction, see Eqs. (3)-(5):
$$\mathrm{recall} = \frac{C_p}{R_{cpt}} \quad (3)$$
$$\mathrm{precision} = \frac{C_p}{R_{crt}} \quad (4)$$
$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{recall} \times \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}} \quad (5)$$
where C_p is the number of pixels that were correctly extracted as road markings, R_cpt is the number of road marking pixels in the reference data, and R_crt is the number of road marking pixels extracted by the proposed algorithm.
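The three criteria can be computed directly from these counts. The short sketch below assumes the standard definitions of recall, precision, and F1-score in terms of C_p, R_cpt, and R_crt; the function name is hypothetical and the counts are made-up values chosen only to illustrate the arithmetic.

```python
def extraction_scores(c_p, r_cpt, r_crt):
    """Return (recall, precision, f1) from extraction counts.

    c_p:   correctly extracted road marking pixels
    r_cpt: road marking pixels in the reference data
    r_crt: road marking pixels extracted by the algorithm
    """
    if c_p == 0:
        return 0.0, 0.0, 0.0
    recall = c_p / r_cpt
    precision = c_p / r_crt
    f1 = 2.0 * recall * precision / (recall + precision)
    return recall, precision, f1


# Hypothetical counts: 960 of 1000 reference pixels found, no false positives.
recall, precision, f1 = extraction_scores(c_p=960, r_cpt=1000, r_crt=960)
```

A recall of 96% combined with a precision of 100% gives an F1-score of about 98%, because the F1-score is the harmonic mean of the two.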
In addition, to evaluate the accuracy of the generated driving lines, the buffer-overlay-statistics (BOS) method was adopted. Buffers with a predefined distance were created around the calculated driving line (X) and the reference driving line (Q), respectively. Then, the accuracy of the driving lines was assessed by calculating the completeness and miscoding.

Results and discussion

Driving line generation results
The thresholds used in the region-growing algorithm were defined as T_ld = 0.05 m, T_gd = 0.8 m, T_slope = π/6, and T_search = 15 m for Dataset I, and T_ld = 0.05 m, T_gd = 0.95 m, T_slope = π/6, and T_search = 60 m for Dataset II. As shown in Fig. 5(b), road surfaces were effectively extracted from both point cloud datasets. The region-growing algorithm performed better on Dataset I, because its road surface is flatter and has fewer undulations than that of Dataset II. Based on the length and width attributes, road markings were classified into six types using the predefined hierarchical decision tree: boundary lanes, centerlines between two directions, centerlines of one-direction roads, zebra crossings, arrows, and non-road-marking noise. The overall classification accuracies for the two test datasets are 81.98% and 89.26%, respectively. An accuracy assessment was then conducted based on the road marking extraction results (see Fig. 5(c)). Moreover, as shown in Table 2, the proposed algorithms achieve an average completeness of 71.1% within the 5 cm reference buffer, 86.6% within the 10 cm reference buffer, and 90.4% within the 15 cm reference buffer for the two generated driving lines. The miscoding values decrease as the width of the reference buffer increases, which reveals that the majority of the generated driving lines are located within the allowable reference buffers. It can be concluded that the proposed driving line generation method is capable of achieving 20 cm-level accuracy with an average of 3.5% miscoding for both test datasets, which meets the localization accuracy requirements of HD maps. The accuracy reduction is caused by the incompleteness of the extracted road markings. Fig. 5(e) shows a prototype of HD maps including the extracted road markings and generated driving lines in complex urban road environments. It is worth noting that this HD map can be vectorized and overlaid with other road information (e.g., driver behaviours) to provide highly precise road information and thus support autonomous driving functions.

Concluding Remarks
In this paper, we have proposed a semi-automated workflow for driving line generation using MLS point clouds to support the development of HD maps. Two MLS point cloud datasets including road curves and intersections were used to test the performance and robustness of the proposed algorithms. The F1-scores for road marking extraction are 97.6% and 98.3% for the two test datasets, respectively. Moreover, the miscoding for driving line generation is 5.5% and 1.5% within 20 cm buffer zones, respectively. The experimental results demonstrate that the proposed method can successfully extract and classify different kinds of road markings and then generate smooth driving lines with promising accuracy. However, the proposed decision tree algorithm for road marking classification failed to distinguish between lane marking types and complex road markings (e.g., diamonds and words). In future research, we intend to propose deep learning based methods to achieve automated road marking classification with sufficient labelled point cloud data.