Morphable 3D-mosaics

The difference between fiction and reality? Fiction has to make sense.
- Tom Clancy

Morphable 3D-mosaics: Photorealistic Reconstruction and Real-Time Exploration of Large Natural Environments

Introduction

Two main approaches have been proposed so far for the automatic visual reconstruction of 3D environments. The purely geometric approach requires a full 3D geometric model of the scene to be constructed. Image-based modeling and rendering (IBMR) methods, on the other hand, skip the geometric modeling step and create novel views by appropriately resampling a given set of input images. A disadvantage of IBMR methods, however, is that they require a large amount of input image data. As a result, although a lot of IBMR research has addressed small-scale scenes, only a few of these methods can deal with large-scale environments.

Here we present a hybrid (geometry- and image-based) framework capable of providing photorealistic walkthroughs of large-scale, complex outdoor environments, using as input only a sparse set of stereoscopic images of the scene. For this purpose, we propose a new data representation of a 3D scene, called "morphable 3D-mosaics", which consists of a set of 3D models that are morphable both geometrically and photometrically.

High-level overview

A high-level overview of our framework is illustrated in the following video:


The above avi was encoded with Microsoft MPEG-4 VKI Codec V3 (FOURCC code is 'mp43').
It should normally play without problems.

Motion is assumed to be taking place along a predefined path inside the 3D environment and the input to our system is a sparse set of stereoscopic views at certain positions (called key-positions hereafter) along that path (see Figure a).

A series of approximate local 3D models are then constructed, one model for each stereoscopic view, with these local models capturing the photometric and geometric properties of the scene only at a local level (see Figure b).

Then, instead of creating a global 3D model of the scene out of all these local models (a task that can prove extremely difficult for large natural scenes), we follow a rather different approach. The key idea is that, during the transition between any two successive key-positions along the path, a morphable 3D-model Lmorph is displayed by the rendering process (see Figure c). At the current key-position, say pos1, this model coincides with the local model at that position, say L1. As we approach the next key-position, say pos2, it is gradually transformed into the next local model, say L2, and coincides with that model upon reaching pos2. Therefore, during the rendering process, a continuous morphing between successive local 3D models takes place all along the path.
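The transition described above can be sketched in a few lines of Python. This is only a minimal illustration under assumed simplifications: the two local models are given as vertex lists that are already in one-to-one correspondence, the blend is linear, and the parameter t is the normalized progress between pos1 (t = 0) and pos2 (t = 1). The function name morph_geometry is hypothetical and the actual system may blend the models differently.

```python
def morph_geometry(verts1, verts2, t):
    """Blend two corresponding vertex lists into an intermediate model.

    verts1, verts2: lists of (x, y, z) tuples; verts1[i] is assumed to
    correspond to verts2[i] (the correspondences are taken as given here).
    t: normalized progress between the two key-positions, in [0, 1].
    Linear blending is an illustrative assumption.
    """
    if len(verts1) != len(verts2):
        raise ValueError("models must have the same number of vertices")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [
        tuple((1.0 - t) * a + t * b for a, b in zip(p1, p2))
        for p1, p2 in zip(verts1, verts2)
    ]
```

With t = 0 the result coincides with the first model and with t = 1 it coincides with the second, which mirrors the behaviour of Lmorph at pos1 and pos2.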

The morphing is both photometric and geometric, and it proceeds in a physically valid way (i.e. it is transparent to the user). For the photometric morphing, a robust algorithm capable of extracting a dense field of correspondences between wide-baseline images has been used, whereas, for the geometric morphing, a novel method of computing 3D correspondences between local models has been employed. For more technical details, see our paper [1], which describes a preliminary version of our system.
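To make the role of the dense correspondence field more concrete, here is a small Python sketch of photometric morphing between two grayscale images. It is an illustration only: the correspondence field (called flow here) is assumed to be already computed and to hold integer pixel displacements, intensities are blended linearly, and the function name morph_pixels is hypothetical. It does not reproduce the correspondence algorithm of [1].

```python
def morph_pixels(img1, img2, flow, t):
    """Return samples (x, y, value) of the in-between image at time t.

    img1, img2: 2D lists of grayscale values (rows of pixels).
    flow[y][x] = (dx, dy): assumed integer displacement that maps pixel
    (x, y) of img1 to pixel (x + dx, y + dy) of img2.
    t: blend parameter in [0, 1].
    """
    samples = []
    for y, row in enumerate(img1):
        for x, v1 in enumerate(row):
            dx, dy = flow[y][x]
            x2, y2 = x + dx, y + dy
            # Skip pixels whose correspondence falls outside img2.
            if not (0 <= y2 < len(img2) and 0 <= x2 < len(img2[0])):
                continue
            v2 = img2[y2][x2]
            # Move the pixel along its correspondence and blend the values.
            samples.append((x + t * dx, y + t * dy, (1.0 - t) * v1 + t * v2))
    return samples
```

Each pixel both travels along its correspondence and changes its intensity, so at t = 0 the samples reproduce the first image and at t = 1 they land on the matching pixels of the second one.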

The following mpeg video contains an example of a very short walkthrough that uses just one morphable 3D-model and takes place between two successive key-positions:


Click here or on the image above to watch the mpeg video

The walkthrough was generated in real time by our framework's 3D graphics engine. This is possible because the hybrid representation of a scene proposed by our framework allows the rendering pipeline to run fully in hardware, with OpenGL pixel and vertex shaders simulating the photometric and geometric morphing, respectively. Renderings of photorealistic quality can thus be achieved at very high frame rates.

Our system can also handle multiple stereoscopic views (related by a camera rotation) per key-position of the path. In this case, there will also be multiple local models per key-position and so, before applying the morphing procedure, a 3D-mosaic needs to be constructed for each key-position. Each 3D-mosaic simply comprises the multiple local models at the corresponding key-position and is itself a bigger local model. Morphing can then proceed in the same way as before, the only difference being that these 3D-mosaics are the new local 3D models used during the morphing stage (in place of the smaller individual ones). So, during morphing, instead of a simple morphable 3D-model we now have a morphable 3D-mosaic.
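As a rough illustration of how several local models at one key-position could be gathered into a single, bigger model, the Python sketch below rotates each local model into a common frame and concatenates the points. Everything specific here is an assumption made for the example: the rotation is taken to be a known yaw angle about the vertical (y) axis, models are plain point lists, and the names rotate_y and build_mosaic are hypothetical. The sketch performs no geometric rectification, so on its own it would not guarantee a consistent mosaic.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by angle radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def build_mosaic(local_models):
    """Merge local models captured at the same key-position.

    local_models: list of (yaw, points) pairs, where yaw is the assumed
    camera rotation (radians) of that stereoscopic view and points is a
    list of (x, y, z) tuples in the view's own coordinate frame.
    """
    mosaic = []
    for yaw, points in local_models:
        mosaic.extend(rotate_y(p, yaw) for p in points)
    return mosaic
```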

Our method for constructing the 3D-mosaics is based on solving a standard partial differential equation and can always ensure a geometrically consistent mosaic. To this end, geometric rectifications are applied to each one of the local 3D models during their merging (see [1] for more details). The following example contains a rendered view of a 3D-mosaic that was constructed by our method, using as input three local models:

A virtual tour into the Samaria gorge

Our system has been successfully applied to the visual reconstruction of the Samaria Gorge in Crete, one of the most beautiful gorges in Europe, which has been awarded a first-class diploma by the Council of Europe.

Some rendered views from the gorge using the morphable 3D-mosaics framework

A 3D virtual reality installation at the Natural History Museum of Crete has thus been used to provide a lifelike virtual tour of the Gorge to the visitors of the museum. The hardware used for the virtual reality system consisted of a PC (with a Pentium 4 2.4 GHz CPU), a single-channel stereoscopic projection system from Barco with 2 circular polarized LCD projectors (Barco Gemini), an active-to-passive stereo converter and a projection screen. The rendering was done on a GeForce 6800 3D graphics card (installed in the PC) and, to produce the stereoscopic effect, 2 views (corresponding to the left and right eye) were rendered by the graphics card at any time. Museum visitors were then able to participate in the virtual tour simply by wearing stereo glasses matched to the circular polarization of the projectors.
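Rendering one view per eye amounts to placing two virtual cameras slightly apart. The following Python sketch shows one simple way the two eye positions might be derived from a single viewpoint. The function name stereo_eye_positions, the "right" direction vector and the default eye separation of 0.065 (in scene units, assumed to be metres) are illustrative assumptions and not values taken from the installation.

```python
import math


def stereo_eye_positions(center, right, eye_separation=0.065):
    """Return (left_eye, right_eye) positions for stereoscopic rendering.

    center: (x, y, z) position of the viewer between the two eyes.
    right: vector pointing to the viewer's right (need not be unit length).
    eye_separation: assumed distance between the eyes, in scene units.
    """
    norm = math.sqrt(sum(c * c for c in right))
    if norm == 0.0:
        raise ValueError("right vector must be non-zero")
    half = eye_separation / 2.0
    offset = tuple(half * c / norm for c in right)
    left_eye = tuple(c - o for c, o in zip(center, offset))
    right_eye = tuple(c + o for c, o in zip(center, offset))
    return left_eye, right_eye
```

The scene is then drawn once from each of the two positions, and the projection hardware delivers each image to the matching eye.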

The projection screen of the VR system

Two views as would be rendered by the VR system (for illustration purposes, they are shown in the form of red-blue images)

An additional benefit of having a virtual 3D reconstruction of the gorge is the ability to add synthetic visual effects or to integrate synthetic objects into the environment. In this way, the visual experience of the virtual tour inside the gorge can be enhanced even further. For example, the figure below shows some rendered views of the gorge to which a synthetic volumetric fog has been added.

Rendered views with a synthetic volumetric fog

Similarly, the following two images show synthetic views in which an "agrimi" (a wild goat which can be found only in the area of the Samaria Gorge) and an oleander plant have been integrated into the 3D virtual environment.


A 3D-model of the "agrimi" animal has been integrated into the gorge.


The "agrimi" animal as well as an oleander plant integrated into the virtual gorge.

Also, the following avi video contains a very short clip from a virtual tour of the gorge (using the morphable 3D-mosaics framework) as well as a high-level overview of our system:


To watch the video, click here or on the image above
The avi was encoded with Microsoft MPEG-4 VKI Codec V3 (FOURCC code is 'mp43').

Advantages of the "morphable 3D-mosaics" framework

The main advantages of our approach are the following:

  • It is scalable to large-scale scenes, since only one morphable 3D-mosaic needs to be displayed at any time (no matter how large the actual scene is). A constant high frame rate can thus be maintained throughout the walkthrough.

  • No global 3D model of the environment needs to be assembled, a process which can be extremely cumbersome and error-prone for large-scale scenes. For instance, the global registration of multiple local models (which is needed for creating a global 3D model) can accumulate a great amount of error, especially if the number of local models is large.

  • It uses a highly optimized rendering path, since both the photometric and the geometric morphing can be performed in graphics hardware using OpenGL pixel and vertex shaders.

  • It can reproduce the photorealistic richness of a scene.

  • It is fully automatic.

  • Data acquisition is easy: only a sparse set of stereoscopic images is needed, which can be captured very easily using a tripod and a pair of digital cameras.
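The scalability argument in the first point above can be illustrated with a short Python sketch: whatever the length of the path, the renderer only has to find the pair of successive key-positions that brackets the viewer and work with that single morphable model. The representation of key-positions as sorted distances along the path, and the function name active_segment, are assumptions made for this example.

```python
from bisect import bisect_right


def active_segment(key_positions, s):
    """Locate the viewer on the path.

    key_positions: sorted, strictly increasing distances of the
    key-positions along the path (an assumed representation).
    s: current distance of the viewer along the path.
    Returns (i, t): the index i of the segment between key-positions i
    and i + 1, and the normalized progress t in [0, 1] inside it.
    """
    if len(key_positions) < 2:
        raise ValueError("at least two key-positions are required")
    # Clamp the viewer to the extent of the path.
    s = min(max(s, key_positions[0]), key_positions[-1])
    # Index of the last key-position not beyond s (capped at the last segment).
    i = min(bisect_right(key_positions, s) - 1, len(key_positions) - 2)
    start, end = key_positions[i], key_positions[i + 1]
    return i, (s - start) / (end - start)
```

Only the two local models (or 3D-mosaics) of segment i need to be available to the renderer at that moment, so the per-frame workload does not grow with the total number of key-positions.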

References

[1] N. Komodakis, G. Pagonis and G. Tziritas, "Interactive walkthroughs using morphable 3D-mosaics", in 2nd International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), 2004.
[2] N. Komodakis and G. Tziritas, "Morphable 3D-Mosaics: a Framework for the Visual Reconstruction of Large Natural Scenes", in video proceedings of CVPR 2006.


Contact information


E-mail: komod@csd.uoc.gr

Phone: +30 2810 393547

Address:
Computer Science Department
University of Crete
P.O. Box 2208
Heraklion, GREECE