Documentation
Specific optical conditions in the deep sea - the problems for correspondence search
Fig. 1: Pictures of the deep sea floor. The same spot, photographed from different distances, illustrates the problems of correspondence search in such images.
There are mainly three problems for correspondence search in deep sea images: scattering, absorption, and the absence of natural light, i.e. the need for artificial lighting. Scattering and absorption depend on the distance and the type of water and cause blurring, reduced contrast and color variations. The artificial lighting in turn introduces dynamically changing patterns into the images, such as light cones and moving shadows.
These conditions make it hard for a computer algorithm, and sometimes even for the human eye, to recognize similarities between different images.
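To make the distance dependence of scattering and absorption more concrete, a commonly used simplified underwater image formation model combines the attenuated scene radiance with a backscattered veiling-light component. The NumPy sketch below is purely illustrative: the coefficient values and the function name are placeholders, not settings from this project, and forward-scattering blur is not modeled here.

```python
import numpy as np

def simulate_underwater(image, depth_map, beta=(0.8, 0.35, 0.25), veil=(0.05, 0.15, 0.25)):
    """Simplified underwater image formation (illustrative placeholder values).

    image:      clean scene radiance, float array (H, W, 3) in [0, 1]
    depth_map:  camera-to-scene distance in metres, array (H, W)
    beta:       per-channel attenuation coefficient [1/m]; red decays fastest
    veil:       per-channel backscatter ("veiling light") color
    """
    d = depth_map[..., None]
    transmission = np.exp(-np.asarray(beta) * d)            # direct signal decays with distance
    backscatter = np.asarray(veil) * (1.0 - transmission)   # scattered light fills in
    return image * transmission + backscatter
```

The farther the camera is from the sea floor, the weaker and more color-shifted the direct signal becomes and the more the veiling light dominates, which is exactly what removes the contrast that a correspondence search relies on.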
The following images show matching results on deep sea images obtained with a D2Net model trained on a dataset of photographs of tourist sights (MegaDepth), which is the model provided with the D2Net paper and source code. (Paper: Dusmanu et al., 2019: https://arxiv.org/abs/1905.03561, Source Code: https://github.com/mihaidusmanu/d2-net, MegaDepth Dataset: https://openaccess.thecvf.com/content_cvpr_2018/html/Li_MegaDepth_Learning_Single-View_CVPR_2018_paper.html)
Fig. 2: The matching algorithm recognizes the pattern of the light source instead of the depicted structure (left) or cannot find any correspondences at all (right).
D2Net architecture - why I believe this approach can yield good results on challenging deep sea images
Fig. 3: Matching results (D2Net model trained on the MegaDepth dataset as above) on challenging image pairs are promising.
How this is achieved:
(soon in more detail)
describe-and-detect instead of the more conventional detect-then-describe
→ the extracted 3D feature map yields a descriptor vector for every image pixel and at the same time serves as an indicator for keypoint detection
a triplet margin ranking loss extended with a detection term (to jointly detect and describe) minimizes the descriptor distance between correct matches and maximizes the distance to the most confounding wrong matches (see the sketch below)
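The following PyTorch sketch illustrates the describe-and-detect idea and the margin ranking term in a simplified form. It is my rough reading of the D2Net paper, not the original implementation: the function names are my own, the 3x3 soft local maximum is approximated with average pooling, and the full loss additionally weights each correspondence by its detection scores.

```python
import torch
import torch.nn.functional as F

def describe_and_detect(feature_map):
    """Simplified D2Net-style head: one dense feature map serves both purposes.

    feature_map: (C, H, W) output of a CNN backbone, assumed non-negative (post-ReLU).
    Returns per-pixel L2-normalized descriptors and a soft detection score map.
    """
    C, H, W = feature_map.shape

    # Descriptors: the channel vector at every pixel, L2-normalized.
    descriptors = F.normalize(feature_map.permute(1, 2, 0).reshape(H * W, C), dim=1)

    # Channel-wise ratio-to-max: how dominant is each channel at this pixel?
    beta = feature_map / (feature_map.max(dim=0, keepdim=True).values + 1e-8)

    # Soft local maximum over a 3x3 neighbourhood (approximated via average pooling).
    exp_map = feature_map.exp()
    local_sum = 9 * F.avg_pool2d(exp_map.unsqueeze(0), 3, stride=1, padding=1).squeeze(0)
    alpha = exp_map / (local_sum + 1e-8)

    # Detection score: best channel per pixel, normalized over the whole image.
    gamma = (alpha * beta).max(dim=0).values
    score = gamma / gamma.sum()
    return descriptors, score


def triplet_margin_term(d_anchor, d_positive, d_negatives, margin=1.0):
    """Margin ranking term for one ground-truth correspondence.

    d_anchor / d_positive: descriptors of a correct match across the two images,
    d_negatives: (N, C) descriptors of non-matching candidate pixels.
    The hardest ("most confounding") negative is the one closest to the anchor.
    """
    positive_dist = (d_anchor - d_positive).norm()
    hardest_negative_dist = (d_anchor.unsqueeze(0) - d_negatives).norm(dim=1).min()
    return torch.clamp(margin + positive_dist - hardest_negative_dist, min=0.0)
```

In the full loss these per-correspondence terms are averaged with weights given by the product of the detection scores of the two matched pixels, so pixels that can be described distinctively are also the ones that end up being detected.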
Dataset - how to obtain realistic training data
Image generation pipeline:
find a dense 3D model of a suitable scene
find or add a texture so that the surface of the scene appears realistic
with a rendering tool (Mitsuba2 and Blender were used; see the sketch below the list):
define the scattering properties of the medium between the simulated camera sensor and the objects in the scene
define the light configuration
render images with multiple different camera poses, light configurations and scattering properties
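As a rough illustration of such a setup, the sketch below builds a scene with Mitsuba2's Python dictionary interface and renders it from several distances through a homogeneous scattering medium. All concrete values (extinction coefficient, albedo, light position and intensity, mesh path, image size) are placeholders and not the settings actually used in this project; plugin and parameter names should be checked against the Mitsuba2 documentation.

```python
import mitsuba
mitsuba.set_variant("scalar_rgb")
from mitsuba.core import ScalarTransform4f
from mitsuba.core.xml import load_dict

def render_view(camera_origin, out_file, sigma_t=0.4, light_z=3.0):
    """Render one sea-floor view through a homogeneous scattering medium.

    All numbers and the mesh filename are placeholders for illustration.
    """
    scene = load_dict({
        "type": "scene",
        "integrator": {"type": "volpath"},           # volumetric path tracer for the medium
        "sensor": {
            "type": "perspective",
            "to_world": ScalarTransform4f.look_at(
                origin=camera_origin, target=[0, 0, 0], up=[0, 1, 0]),
            "medium": {                               # water between sensor and scene
                "type": "homogeneous",
                "sigma_t": sigma_t,                   # extinction (absorption + scattering)
                "albedo": {"type": "rgb", "value": [0.3, 0.6, 0.7]},
            },
            "film": {"type": "hdrfilm", "width": 1024, "height": 768},
        },
        "seafloor": {
            "type": "obj",
            "filename": "seafloor_scene.obj",         # textured dense 3D model (placeholder path)
        },
        "lamp": {                                     # artificial light source carried by the camera platform
            "type": "point",
            "position": [0.5, 0.5, light_z],
            "intensity": {"type": "rgb", "value": [80, 80, 80]},
        },
    })
    sensor = scene.sensors()[0]
    scene.integrator().render(scene, sensor)
    film = sensor.film()
    film.set_destination_file(out_file)
    film.develop()

# Render the same scene from several distances to the ground.
for i, z in enumerate([1.5, 2.5, 4.0]):
    render_view([0, 0, z], f"render_dist_{i}.exr", sigma_t=0.4)
```

Varying the camera origin, the light position and sigma_t in the loop produces image pairs with known ground-truth geometry under different water and lighting conditions, which is exactly what the training data needs to reflect.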
First rendering results with a small test scene
Fig. 4: Rendered images of a sea floor scene with some stones and particles at different distances from the ground.