Accurate, Dense, and Robust Multiview Stereopsis

Abstract: This paper proposes a novel algorithm for multiview stereopsis that outputs a dense set of small rectangular patches covering the surfaces visible in the images. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these before using visibility constraints to filter away false matches. The keys to the performance of the proposed algorithm are effective techniques for enforcing local photometric consistency and global visibility constraints. Simple but effective methods are also proposed to turn the resulting patch model into a mesh which can be further refined by an algorithm that enforces both photometric consistency and regularization constraints. The proposed approach automatically detects and discards outliers and obstacles and does not require any initialization in the form of a visual hull, a bounding box, or valid depth ranges. We have tested our algorithm on various data sets including objects with fine surface details, deep concavities, and thin structures, outdoor scenes observed from a restricted set of viewpoints, and "crowded" scenes where moving obstacles appear in front of a static structure of interest.
... rectangular patches covering the surfaces visible in the input images. The keys to its performance are effective techniques for enforcing local photometric consistency and global visibility constraints. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these to nearby pixel correspondences before using visibility constraints to filter away false matches. A simple but effective method for turning the resulting patch model into a mesh appropriate for image-based modeling is also presented. The proposed approach is demonstrated on various datasets including objects with fine surface details, deep concavities, and thin structures, outdoor scenes observed from a restricted set of viewpoints, ...

1. Introduction

As in the binocular case, although most early work in multi-view stereopsis (e.g., ...) ... Competing approaches mostly differ in the type of optimization techniques that they use, ranging from local methods such as gradient descent [3, 4, 7], level sets [1, 9, 18], or expectation maximization [...], to global ones such as graph cuts [3, 8, 17, 22, 23]. The variational approach has led to impressive progress, and several of the methods recently surveyed by Seitz et al. ...

Footnote 1: In addition, variational approaches typically involve massive optimization tasks with tens of thousands of coupled variables, potentially limiting the resolution of the corresponding reconstructions (see, however, [...] for a fast GPU implementation). We will revisit tradeoffs between ...

... initialization, is capable of detecting and discarding outliers and obstacles, and outputs a quasi dense collection of small oriented rectangular patches [6, 13], obtained from pixel-level correspondences and tightly covering the observed surfaces except in small textureless or occluded regions. It does not perform any smoothing across nearby features, yet is currently the top performer in terms of both coverage and accuracy for four of the six benchmark datasets provided in [...]. The keys to its performance are effective techniques for enforcing local photometric consistency and global visibility constraints. A simple but effective method for turning the resulting patch model into a mesh suitable for image-based modeling is also presented. The proposed approach is applied to three classes of datasets: ... effective bounding volumes (typical examples are outdoor scenes with buildings or walls); and ... different places in multiple images of a static structure of interest ...

Object datasets are the ideal input for these algorithms ... to the more challenging scene datasets. Crowded scenes are even more difficult. The method proposed in [...] uses expectation maximization and multiple depth maps to reconstruct a crowded scene despite the presence of occluders, but it is limited to a small number of images (typically three). As shown by qualitative and quantitative experiments ... multi-view stereopsis as a simple match, expand, and filter procedure (Fig. ...) ... in all our experiments); (2) Expansion: a technique similar to [16, 2, 11, 13] is used to spread the initial matches to nearby pixels and obtain a dense set of patches. ...

This approach is similar to the method proposed by Lhuillier and Quan [...], but their expansion procedure is greedy, ... Furthermore, outliers cannot be handled in their method. In addition, only a pair of images can be handled at once in [...], while our method can process an arbitrary number of images uniformly.

Figure 1. Overall approach. From left to right: a sample input image; detected features; reconstructed patches after the initial matching; final patches after expansion and filtering; polygonal surface extracted from reconstructed patches.

Key Elements of the Proposed Approach

Before detailing our algorithm in Sect. ... fragments have been matched, and determine its visibility.

Patch Models

A patch p is a rectangle with center c(p) and unit normal vector n(p) oriented toward the cameras observing it (Fig. ...). We associate with p a reference image R(p), chosen so that its retinal plane is close to parallel to p with little distortion. ... Two sets of pictures are also attached to each patch p: the images S(p) ...

We enforce the following two constraints on the model: First, we enforce local photometric consistency by requiring that the projected textures of every patch p be consistent ... Second, we enforce global visibility consistency by requiring that no patch p be occluded by any other patch in any image in S(p).

Footnote 2: A patch p may be occluded in one or several of the images in S(p) by moving obstacles, but these are not reconstructed by our algorithm and ...

Image Models
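The patch model described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the authors' implementation: the `Camera` summary (optical center plus unit viewing direction), the helper names, and the rule "pick the image whose viewing direction is most nearly opposite to the normal" as a reading of "retinal plane close to parallel to p" are all hypothetical. T(p) is named elsewhere in the text but its definition is truncated in this copy, so it is kept as an opaque set of image indices.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    """Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


@dataclass
class Camera:
    """Hypothetical camera summary: optical center and unit viewing direction."""
    center: Vec3
    view_dir: Vec3


@dataclass
class Patch:
    """A patch p with center c(p), unit normal n(p), and reference image R(p)."""
    center: Vec3
    normal: Vec3
    ref_image: int
    S: Set[int] = field(default_factory=set)  # images attached to p as S(p)
    T: Set[int] = field(default_factory=set)  # T(p); definition truncated in the text


def orient_toward(normal: Vec3, center: Vec3, camera: Camera) -> Vec3:
    """Flip the normal if needed so that it points toward the observing camera."""
    to_camera = tuple(c - p for c, p in zip(camera.center, center))
    if dot(normal, to_camera) >= 0:
        return normal
    return tuple(-x for x in normal)


def choose_reference_image(normal: Vec3, cameras: Dict[int, Camera],
                           candidates: Iterable[int]) -> int:
    """Assumed rule: prefer the image that views the patch most nearly head-on."""
    return max(candidates, key=lambda i: -dot(normal, cameras[i].view_dir))
```

With this representation, the visibility constraint becomes a statement about the image indices stored in `S`: no other patch may lie in front of p in any of those images.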
... This amounts to attaching a depth map to I, which will prove useful in the visibility calculations of Sect. ...

... On the other hand, in the expansion phase of our algorithm (Sect. ...) ... algorithm is designed to handle this problem. Iterating its ...

(Figure caption fragment) ... S(p) ... See text for the details.

Enforcing Photometric Consistency

Given a patch p, we use the normalized cross correlation ... grid is overlaid on p and projected into the two images, the correlated values being obtained through bilinear interpolation. Given a patch p, its reference image R(p), and the ... To simplify computations, ... methods for computing reasonable initial guesses for c(p) and n(p) are given in Sects. ...

3. Matching

As the first step of our algorithm, we detect corner and blob features in each image using the Harris and Difference-of-Gaussian (DoG) operators. ... cells C(i, j) overlaid on each image (Fig. ...). We then consider these points in order of increasing distance from O as potential ...
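The photometric consistency measure mentioned above (normalized cross correlation of a grid projected into two images, with values read through bilinear interpolation) can be sketched as follows. This is a minimal sketch under assumptions: images are plain lists of rows of gray values, the projected grid positions are supplied directly as `(x, y)` pixel coordinates (the projection of the patch itself is not shown), and all sample points lie inside the image. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]


def bilinear(img: Image, x: float, y: float) -> float:
    """Sample img at real-valued (x, y), where x is the column and y the row."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(img[0]) - 1)  # clamp at the right border
    y1 = min(y0 + 1, len(img) - 1)     # clamp at the bottom border
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def sample_grid(img: Image, points: Sequence[Tuple[float, float]]) -> List[float]:
    """Read interpolated values at the projected grid positions."""
    return [bilinear(img, x, y) for x, y in points]


def ncc(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross correlation of two equally long sample vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # constant texture: correlation is undefined, report 0
    return sum(p * q for p, q in zip(da, db)) / denom
```

Because the score subtracts the mean and divides by the spread of each sample vector, it is unchanged by an affine change of intensity between the two images, which is one common reason for choosing it as a photometric measure.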
More concretely, for each ... In the matching phase (Sect. ...) ... After initializing T(p) by using photometric consistency as in Sect. ... Note that since the purpose of this step is only to reconstruct an initial, ... Also note that the patch generation ...

... exit innermost For loop, and add p to P.

Feature matching algorithm. See Fig. ...

Expansion

At this stage, we iteratively add new neighbors to existing patches until they cover the surfaces visible in the scene. ... associated with the corresponding images (Sect. ...). This does not prevent, however, the reconstruction of the corresponding ... neighbor of p, or be separated from it by a depth discontinuity, neither case warranting the addition of a new neighbor.

Since some matches (and thus the corresponding ...) ... erroneous matches. The first filter focuses on removing ... removed. The second filter focuses on outliers lying inside the actual surface (Fig. ...).
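The expansion step ("iteratively add new neighbors to existing patches until they cover the surfaces visible in the scene") has the shape of a work queue. The sketch below is a deliberately simplified stand-in, not the paper's procedure: it works on a single 2D grid of cells, treats "this cell already holds a patch" as the only reason to skip a neighbor (the depth-discontinuity test named in the text is omitted), and delegates the actual patch reconstruction to a hypothetical callback `try_reconstruct`.

```python
from collections import deque
from typing import Callable, Dict, Optional, Tuple

Cell = Tuple[int, int]


def expand(seeds: Dict[Cell, object], width: int, height: int,
           try_reconstruct: Callable[[Cell, object], Optional[object]]
           ) -> Dict[Cell, object]:
    """Grow an initial sparse set of patches into neighboring empty cells.

    seeds maps a cell to its initial patch. try_reconstruct(cell, source)
    is a hypothetical callback that returns a new patch for `cell`, derived
    from the neighboring patch `source`, or None if reconstruction fails.
    """
    patches = dict(seeds)
    queue = deque(seeds.items())
    while queue:  # mirrors "While P is not empty"
        (i, j), source = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cell = (i + di, j + dj)
            if not (0 <= cell[0] < width and 0 <= cell[1] < height):
                continue  # outside the grid
            if cell in patches:
                continue  # a patch already covers this cell
            new_patch = try_reconstruct(cell, source)
            if new_patch is not None:
                patches[cell] = new_patch
                queue.append((cell, new_patch))  # new patches expand too
    return patches
```

Newly accepted patches are pushed back onto the queue, so growth continues from them; cells where reconstruction keeps failing (for example textureless regions) simply remain empty.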
Output: Expanded set of reconstructed patches.
While P is not empty ...

Patch expansion algorithm.

Figure 5. ... U denotes a set of patches occluded by an outlier. ... lowered by 0.[...] ...

4. Polygonal Surface Reconstruction

The reconstructed patches form an oriented point, or surfel model. Despite the growing popularity of this type of models in the computer graphics community [...], it re... consistency term.

Figure 6. Polygonal surface reconstruction. ...

In the first phase, the photometric consistency term for each vertex v essentially drives the surface towards reconstructed patches ... In the second phase, the photometric ... At each vertex v, we create a patch p by initializing c(p) with v, n(p) with a surface normal estimated at v on S, and a set of visible images S(p) from a depth-map testing on the mesh S at v, then apply the patch optimization routine described in Sect. ... In the first phase, we iterate until convergence ... The second phase is applied to the mesh only in its ...

Table 1. Characteristics of the datasets used in our experiments. The roman and skull datasets have been acquired in our lab, while other datasets have been kindly provided by S. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski (temple and dino, see also [...]); C. Schmitt and the Museum of Cherbourg (polynesian); S. Sullivan and Industrial Light and Magic (face, face-2, body, steps, and wall); and C. Strecha (city-hall and brussels).

5. Experiments and Discussion

The datasets used in our experiments are listed in Table 1, together with the number of input images, their approximate size and a choice of parameters for each data set. ... A visual hull model is thus used to initialize the iterative deformation process for all these datasets, except for face and body, where a limited set of viewpoints is available, and the convex hull of the reconstructed patches is used instead. The segmentation mask is also used by our stereo algorithm, which simply ignores the background during feature detection and matching. The rim consistency term has only been used in the surface deformation process for the roman and skull datasets, for which accurate contours are available. The bounding volume information has not been used to filter out erroneous matches in our experiments. Our algorithm has successfully reconstructed ...

Quantitative comparisons kindly provided by D. Scharstein on the datasets presented in [...] show that the proposed method outperforms all the other evaluated techniques in terms of accuracy (distance d such that a given percentage of the reconstruction is within d from the ground truth model) and completeness (percent...) ... measures as before.

Reconstruction results for scene datasets are shown in Fig. ... Additional information such as segmentation masks, bounding boxes, or valid depth ranges is not available in this case. Nonetheless, our algorithm has successfully reconstructed the whole scene with fine structural details. The wall dataset is challenging since a large portion of several of the input ...

Finally, Fig. ... on crowded scene datasets. Our algorithm reconstructs the background building from the brussels dataset, despite people occluding various parts of the scene. The steps-2 dataset is an artificially generated example, where we have manually painted a red cartoonish human in each image of the steps dataset. To further test the robustness of our algorithm against outliers, the steps-3 dataset has been created from steps-2 by copying its images but replacing the fifth one with the third, without changing camera parameters. This is a particularly challenging example, since the whole fifth image must be detected as an outlier. We have successfully reconstructed the details of both despite these outliers.

Sample results on object datasets. From left to right and top to bottom: temple, dino, skull, polynesian, face, face-2, and body datasets.
In each case, one of the input images is shown, along with two views of texture-mapped reconstructed patches and shaded polygonal surfaces.

The bottleneck of our multi-view stereo matching algorithm ... varies from about 20 minutes, for small datasets such as ... consisting of high-resolution images, such as polynesian and city-hall. The running times of polygonal surface extraction also range from 30 minutes to a few hours depending ...

Acknowledgments: This work was supported in part by the National Science Foundation under grant IIS[...]. We thank S. ... Curless, J. ... Szeliski for the temple and dino datasets and evaluations, C. Schmitt, and the Museum of Cherbourg for polynesian, S. Sullivan, A. Suter, and Industrial Light and Magic for face, face-2, body, steps, and wall, C. Strecha for city-hall and brussels, and finally J. Blumenfeld and S. ...
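The accuracy and completeness measures quoted in the quantitative comparison of the experiments can be made concrete with a small sketch. Accuracy follows the definition given in the text (the distance d such that a given percentage of the reconstruction is within d of the ground truth). The completeness definition is truncated in this copy, so the version below, the percentage of ground-truth points within a given distance of the reconstruction, is an assumption. Both models are represented as plain point lists and distances are computed by brute force, which is only suitable for small examples; the default percentage is illustrative, not a value taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def nearest_distance(p: Point, points: Sequence[Point]) -> float:
    """Distance from p to the closest point of a point set (brute force)."""
    return min(math.dist(p, q) for q in points)


def accuracy(reconstruction: Sequence[Point], ground_truth: Sequence[Point],
             percentage: float = 90.0) -> float:
    """Smallest d such that `percentage` percent of the reconstructed points
    lie within d of the ground truth."""
    distances = sorted(nearest_distance(p, ground_truth) for p in reconstruction)
    k = max(1, math.ceil(len(distances) * percentage / 100.0))
    return distances[k - 1]


def completeness(reconstruction: Sequence[Point], ground_truth: Sequence[Point],
                 threshold: float) -> float:
    """Assumed definition: percentage of ground-truth points that lie within
    `threshold` of the reconstruction."""
    covered = sum(1 for g in ground_truth
                  if nearest_distance(g, reconstruction) <= threshold)
    return 100.0 * covered / len(ground_truth)
```

Lower accuracy values and higher completeness values are better: the first measures how far reconstructed points stray from the true surface, the second how much of the true surface has been recovered.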
Accurate, Dense, and Robust Multi-View Stereopsis: Paper Analysis and Code Implementation (Part 1)
Abstract: This paper proposes a novel algorithm for calibrated multi-view stereopsis that outputs a quasi dense set of rectangular patches covering the surfaces visible in the input images. This algorithm does not require any initialization in the form of a bounding volume, and it automatically detects and discards outliers and obstacles. It does not perform any smoothing across nearby features, yet is currently the top performer in terms of both coverage and accuracy for four of the six benchmark datasets presented in [...]. The keys to its performance are effective techniques for enforcing local photometric consistency and global visibility constraints. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these to nearby pixel correspondences before using visibility constraints to filter away false matches.