Stereoscopic Building Reconstruction Using High-Resolution Satellite Image Data
Anonymous submission

Abstract—This paper presents a novel approach for the generation of 3D building models from satellite image data. The core of the 3D modeling procedure is the grouping of 3D line segments, for which the divergence-based centroid neural network is employed. Prior to the grouping process, 3D line segments are extracted with the aid of elevation information obtained by area-based stereo matching of the satellite image data. High-resolution IKONOS stereo images are used for the experiments. The experimental results demonstrate the applicability and efficiency of the approach to 3D building modeling from high-resolution satellite imagery.

Index Terms—building model, satellite image, 3D modeling, line segment, stereo

I. INTRODUCTION
Extraction of 3D building models is one of the important problems in the generation of an urban model. The process aims to detect and describe 3D rooftop models from the complex scene of a satellite image. The automated extraction of 3D rooftop models can be considered an essential step in 3D modeling of urban areas. There has been a significant body of research on 3D reconstruction from high-resolution satellite imagery. Even though natural terrain can be reconstructed precisely using correlation-based stereoscopic processing of satellite images [1], 3D building reconstruction remains a difficult process because of the elevation discontinuities of man-made objects. In this context, most studies rely on 3D feature analysis. The perceptual grouping technique [2] has been broadly used for detecting and describing buildings in aerial or satellite images. This traditional method demonstrates the usefulness of structural relationships, called collated features, which can be explored by perceptual organization in complex image analysis. All reasonable feature groupings are first detected, and the candidates are then selected by a constraint satisfaction network. However, this approach involves all extracted line segments in the image and consequently incurs a large computational cost. It also depends on the accurate extraction of line segments. Fischer et al. [3] suggested a model-based approach in which a building is considered a set of connected features such as points, lines, and regions. This is a better approach compared to models using only non-stereo images. The result, however, critically depends on the selection of the correct building model and on the number and quality of features extracted from the background. Huertas [4] suggested using cues extracted from IFSAR data, while Kim [5] addressed the problem of segmenting

objects of interest from a commercial DEM (Digital Elevation Map). The extracted cues do not give the shape of the buildings, but they do indicate where the buildings are located in the image. Unfortunately, IFSAR data or a DEM is not available in all cases. Some approaches use a hypothesis-and-verification paradigm based on perceptual grouping. Jaynes [6] proposed task-driven perceptual organization for rooftop extraction. Features such as corners and line segments are first extracted and assigned certainty values. The features and their groupings are then stored in a feature relation graph, in which closed cycles represent grouped polygon hypotheses. The independent set of closed groups with the maximum sum of certainty values of its parts is the final grouping choice. This approach is limited to rectangular buildings and tends to produce false hypotheses in complex images. In spite of the efforts made by many researchers, a fully automated building reconstruction system has not yet been achieved, or is used only in limited environments. This is mainly due to the fact that 3D feature extraction is not complete enough to support the subsequent reconstruction process [7]. In the absence of a fully automatic reconstruction system, semi-automatic systems [8], [9] have emerged as an alternative solution. However, the human intervention required by semi-automatic systems makes the overall processing extremely slow. In this context, we are concerned with an automated building reconstruction system that can efficiently deal with the uncertainty introduced by incompletely extracted 3D features. Our approach consists of three steps. First, we extract 3D lines from the satellite images by associating the DEM data with the extracted 2D line information. Second, the divergence-based centroid neural network [10] algorithm is used to group these line segments into groups of parallel lines. Finally, 3D rooftop shapes are reconstructed from the result of the second step. The main contribution of our approach lies in the 3D line grouping process. Many researchers have studied the grouping of 2D line segments [11], [12], most often by solving a perceptual grouping problem. In this paper we apply a grouping method to 3D line segments, employing the divergence-based centroid neural network algorithm. The advantages of the divergence-based centroid neural network, which has been verified as an excellent unsupervised learning algorithm, are that it requires neither a predetermined schedule for learning coefficients nor a total number of iterations for

clustering, while converging much faster than conventional algorithms with comparable results. The remainder of this paper is organized as follows. Section II summarizes the background theory of the divergence-based centroid neural network algorithm and presents the core element of our approach, the 3D line grouping process. The clustering of 3D line segments into rooftop models is described in Section III. Experimental results and conclusions are presented in Sections IV and V, respectively.

II. GROUPING OF 3D LINE SEGMENTS

A. Divergence-Based Centroid Neural Network
The divergence-based centroid neural network is an unsupervised competitive learning algorithm based on the classical k-means clustering algorithm. It introduces the notions of a "winner neuron" and a "loser neuron", which correspond to clusters in the clustering process. A neuron is a winner when it wins a data vector in the current presentation but did not win it in the previous presentation. Conversely, a loser neuron won the data vector in the previous presentation but loses it in the current one. Only the centroids of the winner and loser clusters change, so only their weights need to be recalculated at each epoch, rather than those of all clusters. An "epoch" is one presentation of all data vectors in the data set to the network. When a data vector x is presented to the network at time n, the weights of the winner and loser neurons are updated as follows:

w_i(n+1) = w_i(n) + [x - w_i(n)] / (N_i + 1)    (1)

w_j(n+1) = w_j(n) - [x - w_j(n)] / (N_j - 1)    (2)

where w_i(n) and w_j(n) are the weights of winner neuron i and loser neuron j at time n, and N_i and N_j are the numbers of data vectors in cluster i and cluster j, respectively.

B. Grouping of parallel line segments
In order to deal with the uncertainty introduced by the incompletely extracted 3D lines, we group the extracted 3D lines. The divergence-based centroid neural network algorithm is used to gather the 3D lines into parallel groups based on their angles. The angle between two 3D lines, however, cannot be determined unless they have an intersection point, a situation that rarely occurs in space. To decide whether two 3D lines are parallel, we therefore examine them in two phases. First, we ignore the z value (height) and consider only x (column) and y (row). The divergence-based centroid neural network algorithm assembles the lines into groups that are parallel in the x-y plane. The winner group is the one whose neuron angle is closest to the angle of the line under consideration. The angles of the winner and loser neurons are updated as follows:

ang_i(n+1) = ang_i(n) + [line_ang - ang_i(n)] / (N_i + 1)    (3)

ang_j(n+1) = ang_j(n) - [line_ang - ang_j(n)] / (N_j - 1)    (4)

If the line under consideration does not already belong to any cluster, there is a winner but no loser, and equation (4) is not executed. Second, the parallelism of the 3D lines within a group is checked by considering the z value together with either x or y. This eliminates pairs of lines that are parallel when projected onto the 2D coordinate system but not parallel in space. For example, the two 3D lines in Fig. 1a are not parallel: they are parallel when projected onto the x-y plane (Fig. 1b) but not onto the x-z plane (Fig. 1c). In summary, a parallel group contains the 3D lines that fall into the same group in the first phase and pass the parallelism check in the second phase.

Fig. 1. Example of 3D line parallelism: (a) two 3D lines, (b) their projection onto the x-y plane, (c) their projection onto the x-z plane.
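To make the first grouping phase concrete, the following Python sketch implements the winner/loser angle updates of equations (1)-(4). It is a minimal illustration under our own assumptions: the function name, the initialization, the fixed number of epochs, and the empty-cluster guard are not taken from the paper, angle wrap-around is ignored, and the second-phase 3D parallelism check is omitted.

import numpy as np

def group_line_angles(angles, num_groups, epochs=10):
    """Cluster 2D line angles (in radians) with winner/loser centroid updates."""
    angles = np.asarray(angles, dtype=float)
    # Simple evenly spaced initialization of the neuron angles (an assumption).
    neurons = np.linspace(angles.min(), angles.max(), num_groups)
    counts = np.zeros(num_groups, dtype=int)          # N_i: lines in each cluster
    assignment = np.full(len(angles), -1, dtype=int)  # previous cluster of each line

    for _ in range(epochs):                           # one epoch = one pass over the data
        changed = False
        for idx, ang in enumerate(angles):
            winner = int(np.argmin(np.abs(neurons - ang)))  # closest neuron angle
            loser = assignment[idx]
            if winner == loser:
                continue                              # the line keeps its cluster
            changed = True
            # Equation (3): pull the winner neuron toward the line's angle.
            counts[winner] += 1
            neurons[winner] += (ang - neurons[winner]) / counts[winner]
            # Equation (4): executed only if the line already belonged to a cluster.
            if loser >= 0:
                counts[loser] -= 1
                if counts[loser] > 0:                 # guard against an emptied cluster
                    neurons[loser] -= (ang - neurons[loser]) / counts[loser]
            assignment[idx] = winner
        if not changed:                               # no reassignment: converged
            break
    return assignment, neurons

# Example: three groups of roughly 0, 45 and 90 degree lines.
demo = np.deg2rad([1, 2, 44, 46, 89, 91, 0, 45, 90])
labels, centers = group_line_angles(demo, num_groups=3)
print(labels, np.rad2deg(centers))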

The lines in a parallel group may, however, be spread throughout the scene. The purpose of the next step is to divide the current groups into smaller parts, so-called building segment groups, that contain the lines belonging to the same suspected building. Lines in the same building segment group must satisfy two conditions: the overlap between two lines must be equal to or greater than a certain percentage of the length of the shorter line, and the distance between lines in a group must be smaller than a threshold to guarantee that they are not too far apart.

C. Regrouping 3D lines
Because of errors in extracting 3D lines, a true line may be reconstructed more than once, creating several corresponding 3D line segments that lie close together and parallel to one another. In other cases, line segments are parts of the same linear structure in the image but become fragmented during the extraction process. This step employs the divergence-based centroid neural network algorithm again to regroup those segments and replace

them with corrected lines. The possibility of regrouping two segments depends on their collinearity, their nearness, and their lengths. Collinearity is measured as the difference in direction of the two line segments, whereas nearness is calculated from the distances between their end points. These factors have been synthesized into a metric for line segments [4], which is adopted as the distance measure function for the divergence-based centroid neural network. The distance between a line segment u and a neuron w, which is itself a line segment, is defined as

d(u, w) = 1 / f(u, w)    (5)

where

f(u, w) = G_σangle(θ_w - θ_u) · G_σlength,σwidth( R_θu^(-1) (x_w - x_u, y_w - y_u) )    (6)

with
• θ_w - θ_u: the difference in direction of the two segments
• (x_w - x_u, y_w - y_u): the displacement between the midpoints of the two segments
• G_σ(x): the Gaussian function for the orientation component
• G_σx,σy(x, y): the Gaussian function for the displacement component
• R_θu^(-1): the inverse rotation by θ_u, which expresses the displacement along and across the direction of segment u

Fig. 2. Example of a parallel line cluster.

Fig. 3. Pair of parallel lines and their folding spaces.
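The Python sketch below shows one way to evaluate this distance for segments described by their midpoint and direction. The sigma values, the (mid_x, mid_y, theta) segment representation, and the interpretation of the two-dimensional Gaussian as a product of along- and across-segment Gaussians are our own assumptions for illustration; they are not values or choices taken from the paper.

import numpy as np

def gaussian(x, sigma):
    """Unnormalized Gaussian kernel used for each component."""
    return np.exp(-0.5 * (np.asarray(x, dtype=float) / sigma) ** 2)

def segment_distance(u, w, sigma_angle=np.deg2rad(10.0),
                     sigma_length=20.0, sigma_width=5.0):
    """d(u, w) = 1 / f(u, w) for segments given as (mid_x, mid_y, theta)."""
    xu, yu, theta_u = u
    xw, yw, theta_w = w
    # Orientation component of equation (6): difference in direction.
    g_angle = gaussian(theta_w - theta_u, sigma_angle)
    # Displacement of the midpoints, rotated into the frame of segment u
    # (the R_theta_u^-1 term), so that sigma_length acts along the segment
    # and sigma_width across it.
    dx, dy = xw - xu, yw - yu
    along = np.cos(theta_u) * dx + np.sin(theta_u) * dy
    across = -np.sin(theta_u) * dx + np.cos(theta_u) * dy
    g_disp = gaussian(along, sigma_length) * gaussian(across, sigma_width)
    f = g_angle * g_disp
    return np.inf if f == 0.0 else 1.0 / f            # equation (5): d = 1 / f

# Example: two nearly collinear, nearby segments yield a small distance.
print(segment_distance((0.0, 0.0, 0.0), (15.0, 1.0, np.deg2rad(3.0))))

Large direction differences or large across-segment offsets shrink f(u, w) quickly, so dissimilar segments receive a large distance and are not regrouped.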

The angle and the length of a neuron are updated similarly to equations (3) and (4). A neuron wins a line segment if it has the smallest distance to that segment among all neurons. Since (5) applies only to 2D lines, using it for 3D lines requires the additional condition that the height values of the two line segments under consideration must not differ too much.

III. CLUSTERING OF 3D ROOFTOPS
The idea of forming rooftops from pairs of parallel lines is not new. In general, a parallel pair must exist first, and another pair perpendicular to it is then sought. Accordingly, we first determine the pairs of parallel lines in each group that can belong to a cluster. Not every two lines in a group can form a pair: they must be the nearest lines in their group, and their lengths must not differ too much. For example, from the cluster in Fig. 2 we can extract the two pairs (1, 2) and (2, 3). Lines 1 and 3 cannot form a pair because of the first condition, while line 3 is much longer than line 4, so they cannot form a pair either. For every pair, we search the other groups for lines with perpendicular orientation. The chosen lines must lie in the folding spaces created by the end points of the two parallel lines, as in Fig. 3. They become the third and fourth edges and, together with the two prior lines, form a cluster of candidates for the rooftop shape. In some cases the fourth edge can be omitted. The rooftop is formed by extending the above lines and defining the rooftop corners as the points at which the distances between the lines are shortest. If the fourth edge is missing, an orthogonal line is drawn connecting the two remaining end points of the two parallel edges.
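As a concrete illustration of this clustering step, the Python sketch below pairs nearby parallel segments of comparable length and then collects roughly perpendicular segments lying between them. The helper names and thresholds are our own assumptions, the folding-space test of Fig. 3 is approximated by a simple rectangle test, and corner computation and the handling of a missing fourth edge are omitted.

import numpy as np

def direction(seg):
    """Unit direction vector of a segment given as ((x1, y1), (x2, y2))."""
    p, q = np.asarray(seg[0], float), np.asarray(seg[1], float)
    d = q - p
    return d / np.linalg.norm(d)

def length(seg):
    return float(np.linalg.norm(np.asarray(seg[1], float) - np.asarray(seg[0], float)))

def midpoint(seg):
    return (np.asarray(seg[0], float) + np.asarray(seg[1], float)) / 2.0

def parallel_pairs(group, max_length_ratio=1.5):
    """Pair each segment with its nearest neighbour of comparable length."""
    pairs = set()
    for i, a in enumerate(group):
        best, best_dist = None, np.inf
        for j, b in enumerate(group):
            if i == j:
                continue
            dist = np.linalg.norm(midpoint(a) - midpoint(b))
            ratio = max(length(a), length(b)) / min(length(a), length(b))
            if ratio <= max_length_ratio and dist < best_dist:
                best, best_dist = j, dist
        if best is not None:
            pairs.add(tuple(sorted((i, best))))
    return sorted(pairs)

def perpendicular_candidates(pair_segs, other_segments, angle_tol_deg=15.0):
    """Roughly perpendicular segments whose midpoints fall inside the
    rectangle spanned by the parallel pair (a crude stand-in for the
    folding-space test of Fig. 3)."""
    a, b = pair_segs
    d = direction(a)                          # direction of the parallel pair
    n = np.array([-d[1], d[0]])               # normal to the pair
    pts = np.array([a[0], a[1], b[0], b[1]], dtype=float)
    lo_d, hi_d = (pts @ d).min(), (pts @ d).max()
    lo_n, hi_n = (pts @ n).min(), (pts @ n).max()
    picked = []
    for s in other_segments:
        if abs(direction(s) @ d) > np.sin(np.deg2rad(angle_tol_deg)):
            continue                          # not close enough to perpendicular
        m = midpoint(s)
        if lo_d <= m @ d <= hi_d and lo_n <= m @ n <= hi_n:
            picked.append(s)
    return picked

# Toy example: two parallel edges of a rectangle and its two side edges.
para = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 5.0), (10.0, 5.0))]
perp = [((0.0, 0.0), (0.0, 5.0)), ((10.0, 0.0), (10.0, 5.0))]
for i, j in parallel_pairs(para):
    print(perpendicular_candidates((para[i], para[j]), perp))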

IV. EXPERIMENTAL RESULTS
To evaluate the proposed method, we set up an experiment using IKONOS satellite images of the Daejun area. The resolution of each image is 1 meter, and its size is 1536x1536 pixels. The images contain several buildings with different shapes and heights among many other objects, as shown in Fig. 4. Note the complex building structure that appears prominently in the center of each image.

Fig. 4. Two IKONOS satellite images (a) and (b).

In order to detect 3D lines, we first carry out epipolar resampling (Fig. 5) and generate a DEM (Fig. 6). 2D lines are then extracted, as shown in Fig. 7(a). The number of detected 2D lines is very large because of line fragmentation and noise caused by interfering objects; not only the edges of buildings but also the boundaries of trivial structures are extracted. With the aid of the elevation information in the DEM, we extract 3D lines. Even though the stereo matching of 2D lines reduces the number of lines somewhat, there are still too many 3D lines. For efficient rooftop detection, we therefore perform 3D line grouping based on the divergence-based centroid neural network. After the grouping of parallel line segments and the regrouping of 3D lines, the number of 3D line segments is significantly reduced, as shown in Fig. 7(b).
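For reference, the short Python sketch below shows one simple way a 2D line can be lifted to 3D using the elevation values of a DEM raster, as described above. The nearest-pixel elevation lookup, the array conventions, and all names are illustrative assumptions and do not reproduce the paper's implementation.

import numpy as np

def lift_line_to_3d(endpoints_2d, dem):
    """Attach DEM elevations to the two endpoints of a 2D line.

    endpoints_2d: ((col1, row1), (col2, row2)) in image coordinates.
    dem: 2D numpy array of elevations indexed as dem[row, col].
    """
    line_3d = []
    for col, row in endpoints_2d:
        r = int(np.clip(round(row), 0, dem.shape[0] - 1))   # nearest-pixel lookup
        c = int(np.clip(round(col), 0, dem.shape[1] - 1))
        line_3d.append((float(col), float(row), float(dem[r, c])))
    return tuple(line_3d)

# Toy DEM: a flat 20 m plain with a 50 m "building" block in the middle.
dem = np.full((100, 100), 20.0)
dem[40:60, 40:60] = 50.0
print(lift_line_to_3d(((45.0, 45.0), (55.0, 45.0)), dem))   # rooftop edge at 50 m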

Fig. 5. Epipolar resampled images (a) and (b).

Fig. 6. DEM data.

Fig. 7. 2D and 3D line images: (a) extracted 2D lines, (b) 3D lines after grouping.

Rooftop clustering is performed by taking parallel pairs out of each group and seeking the lines that lie in their folding spaces and are almost perpendicular to them. Fig. 8(a) shows the rooftop detection result for the entire area. From the detected rooftops and the known geometric parameters of image acquisition, we reconstructed the 3D buildings presented in Fig. 8(b). The experimental result shows that all buildings that are detected completely in the DEM generation phase are clustered correctly into rooftop models.

Fig. 8. Building reconstruction result: (a) detected rooftops, (b) reconstructed 3D buildings.

V. CONCLUSIONS
A new method to detect rooftops from two satellite images using the divergence-based centroid neural network has been proposed. In this approach, 3D lines are extracted with the aid of elevation information. The divergence-based centroid neural network algorithm is applied to classify the lines into parallel groups and to refine them. Rooftops are then obtained from parallel pairs and the orthogonal lines connecting them. Using the grouping process, we can efficiently deal with the uncertainty arising from incompletely extracted 3D line segments. The experimental results show that the proposed method can be utilized efficiently for rooftop detection and building reconstruction.

REFERENCES
[1] J. Grodecki and G. Dial, "IKONOS geometric accuracy," Joint Workshop of ISPRS Working Groups I/2, I/5 and IV/7 on High Resolution Mapping from Space, pp. 19-21, 2001.
[2] A. Huertas and R. Nevatia, "Detecting buildings in aerial images," Computer Vision, Graphics, and Image Processing, vol. 41, pp. 131-152, 1988.
[3] A. Fischer, T. Kolbe, and F. Lang, "Integration of 2D and 3D reasoning for building reconstruction using a generic hierarchical model," Workshop on Semantic Modeling for the Acquisition of Topographic Information, W. Forstner (ed.), Bonn, Germany, 1997.
[4] A. Huertas, Z. Kim, and R. Nevatia, "Use of cues from range data for building modeling," DARPA Image Understanding Workshop, pp. 577-582, 1998.
[5] Z. Kim and R. Nevatia, "Automatic description of complex buildings from multiple images," Computer Vision and Image Understanding, vol. 96, pp. 60-95, 2004.
[6] C. Jaynes, F. Stolle, and R. Collins, "Task driven perceptual organization for extraction of rooftop polygons," IEEE Workshop on Applications of Computer Vision, pp. 152-159, 1994.
[7] G. Sohn and I. Dowman, "Extraction of buildings from high-resolution satellite data," in Automatic Extraction of Man-Made Objects from Aerial and Space Images (III), E. Baltsavias, A. Gruen, and L. Van Gool (eds.), pp. 345-354, A. A. Balkema Publishers, 2001.
[8] S. Mayunga, Y. Zhang, and D. Coleman, "A semi-automated approach for extracting buildings from QuickBird imagery applied to informal settlement mapping," International Journal of Remote Sensing, vol. 28, pp. 2343-2357, 2007.
[9] T. Kim, T. Javzandulam, and T. Lee, "Semiautomatic reconstruction of building height and footprints from single satellite images," IEEE International Geoscience and Remote Sensing Symposium, pp. 4737-4740, 2007.
[10] D. Park, "Centroid neural network for unsupervised competitive learning," IEEE Transactions on Neural Networks, vol. 11, pp. 520-528, 2000.
[11] P. Perona and W. Freeman, "A factorization approach to grouping," Proc. ECCV 1998, pp. 655-670, 1998.
[12] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, pp. 888-905, 2000.
