What the deep ocean looks like is still unknown for large parts of the earth, yet visual maps of the sea floor are becoming increasingly important for marine research. In particular, optical methods are desirable for underwater vehicles that collect various research data. Just as the development of land vehicles is moving towards autonomous navigation, the same goal is pursued for underwater vehicles.

To improve autonomous navigation, large image maps of the deep sea need to be generated from single photos by image registration. However, while various feature matching methods have been developed for applications in air, solving the matching problem is much more challenging under the very different optical conditions in water, especially in the deep sea.

Existing correspondence search algorithms struggle in deep sea conditions because the contrast, blur, color and intensity of the same object change considerably depending on the distance to the camera and the characteristics of the water. Therefore, this work aims to let a neural network learn a deep sea feature matcher that can tackle this task, by training a model on images that contain the special deep sea optical characteristics.

An additional challenge in training a deep sea correspondence search model is the lack of annotated images. To overcome this problem, a database of simulated deep sea images will be generated using existing raytracing algorithms, which model the underwater optical conditions and apply them to three-dimensional textured scene models.



Specific optical conditions in the deep sea - the problems for correspondence search. 

Fig. 1: Pictures of the deep sea floor. The same place photographed from different distances illustrates problems of correspondence search on those images. 

There are three main problems for correspondence search in deep sea images: scattering, absorption, and the absence of natural light, which makes artificial lighting necessary. Scattering and absorption depend on the distance and the type of water and cause blurring, reduced contrast and color variations. Artificial lighting introduces dynamically changing patterns into the images, such as light cones and moving shadows.

These circumstances make it hard for a computer algorithm, and sometimes even for the human eye, to recognize similarities between different images.
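The distance dependence described above can be illustrated with a strongly simplified image formation model: the scene color is attenuated exponentially with distance, and a veiling (backscatter) color is blended in. This is only a sketch under stated assumptions. The function name `attenuate`, the coefficients in `beta` and the `veiling` color are illustrative values and not measured water properties, and the model ignores the artificial light cone and the blur caused by forward scattering.

```python
import numpy as np

def attenuate(scene_rgb, distance_m, beta=(0.40, 0.10, 0.05), veiling=(0.0, 0.05, 0.10)):
    """Simplified model: I = J * exp(-beta * d) + B * (1 - exp(-beta * d)).

    scene_rgb  : array (..., 3), unattenuated scene color J in [0, 1]
    distance_m : camera-to-object distance d in metres
    beta       : assumed per-channel attenuation coefficients (R, G, B) in 1/m
    veiling    : assumed backscatter color B that dominates at large distances
    """
    scene_rgb = np.asarray(scene_rgb, dtype=np.float64)
    beta = np.asarray(beta, dtype=np.float64)
    veiling = np.asarray(veiling, dtype=np.float64)
    transmission = np.exp(-beta * distance_m)  # fraction of light that survives
    return scene_rgb * transmission + veiling * (1.0 - transmission)

def channel_contrast(img):
    """Per-channel difference between the brightest and darkest pixel."""
    return img.max(axis=(0, 1)) - img.min(axis=(0, 1))

# Two pixels of the same sea floor patch, seen from 1 m and from 5 m.
patch = np.array([[[0.8, 0.6, 0.4], [0.2, 0.3, 0.3]]])
near = attenuate(patch, 1.0)
far = attenuate(patch, 5.0)

print("contrast at 1 m:", channel_contrast(near))
print("contrast at 5 m:", channel_contrast(far))
```

Even in this toy setting the same patch loses contrast in every channel and shifts in color as the distance grows, which is the kind of appearance change a matcher has to cope with.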

The following images show the matching results on deep sea images obtained with a D2Net model trained on a dataset of photographs of various landmarks; this is the model provided with the D2Net paper and source code. (Paper: Dusmanu et al., 2019: https://arxiv.org/abs/1905.03561, Source Code: https://github.com/mihaidusmanu/d2-net, MegaDepth Dataset: https://openaccess.thecvf.com/content_cvpr_2018/html/Li_MegaDepth_Learning_Single-View_CVPR_2018_paper.html)


Fig. 2: The matching algorithm recognizes the pattern of the light source instead of the depicted structure (left) or cannot find any correspondences at all (right).

D2Net architecture - why I believe this approach can yield good results on challenging deep sea images


Fig. 3: Matching results (D2Net model trained on the MegaDepth dataset, as above) on challenging image pairs are promising.

How is this achieved: 

(soon in more detail)

  • describe-and-detect instead of the more conventional detect-then-describe

  • → the extracted 3D feature map serves both as a descriptor vector for every image pixel and as an indicator for keypoint detection

  • a triplet margin ranking loss function extended with a detection part (to jointly detect and describe) minimizes the distance between correct matches and maximizes the distance to the most confounding wrong matches
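The loss in the last bullet can be sketched in a few lines of NumPy. This is a minimal illustration, not the D2Net implementation: the function name `detection_weighted_triplet_loss` and its arguments are hypothetical, the detection scores are simply passed in instead of being derived from the feature map, and every other correspondence in the batch is treated as a possible negative (the original method restricts negatives to points outside a neighborhood of the correct match).

```python
import numpy as np

def detection_weighted_triplet_loss(desc1, desc2, scores1, scores2, margin=1.0):
    """Triplet margin ranking loss weighted by detection scores.

    desc1[i] and desc2[i] are the descriptors of the i-th ground-truth
    correspondence in image 1 and image 2; scores1[i] and scores2[i] are
    the (positive) detection scores at those locations.
    """
    desc1 = np.asarray(desc1, dtype=np.float64)
    desc2 = np.asarray(desc2, dtype=np.float64)
    scores1 = np.asarray(scores1, dtype=np.float64)
    scores2 = np.asarray(scores2, dtype=np.float64)

    # Pairwise Euclidean distances between all descriptors of both images.
    dist = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    positive = np.diag(dist)  # distance between correct matches

    # Exclude the correct match, then pick the most confounding wrong match
    # in either direction (image 1 -> 2 and image 2 -> 1).
    masked = dist + np.diag(np.full(len(dist), np.inf))
    hardest_negative = np.minimum(masked.min(axis=1), masked.min(axis=0))

    triplet = np.maximum(0.0, margin + positive**2 - hardest_negative**2)

    # Detection part: correspondences with high detection scores in both
    # images contribute more to the loss.
    weights = scores1 * scores2
    weights = weights / weights.sum()
    return float(np.sum(weights * triplet))

# Three well separated descriptors, then the same set with two of them swapped.
good = np.eye(3)
confused = np.eye(3)[[1, 0, 2]]
uniform = np.ones(3)

print(detection_weighted_triplet_loss(good, good, uniform, uniform))
print(detection_weighted_triplet_loss(good, confused, uniform, uniform))
```

Because the triplet term is multiplied by the detection scores, the loss can be lowered either by making descriptors more distinctive or by lowering the detection score at locations that cannot be matched reliably, which is what couples detection and description.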

Dataset - how to obtain realistic training data

Image generation pipeline:

  • find dense 3D model of a suitable scene

  • find or add texture to the scene to let the surface appear realistic

with a rendering tool (Mitsuba2 and Blender were used):

  • define scattering properties of medium between simulated camera sensor and objects in scene

  • define light configuration

  • render images with multiple different camera poses, light configurations, scattering properties
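The last step of the pipeline amounts to enumerating combinations of camera pose, light configuration and scattering properties. The sketch below only builds such a job list and does not call Mitsuba2 or Blender. All names (`RenderJob`, `camera_positions`, `build_jobs`) and all parameter values are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class RenderJob:
    camera_position: tuple   # (x, y, z) in metres, z = altitude above the sea floor
    light_offset: tuple      # position of the light relative to the camera
    scattering_coeff: float  # assumed scattering coefficient of the medium in 1/m

def camera_positions(altitudes, n_views, radius):
    """Place n_views cameras on a circle of the given radius at each altitude."""
    for z in altitudes:
        for k in range(n_views):
            angle = 2.0 * math.pi * k / n_views
            yield (radius * math.cos(angle), radius * math.sin(angle), z)

def build_jobs(altitudes, n_views, radius, light_offsets, scattering_coeffs):
    """Combine every camera position with every light and medium setting."""
    cameras = list(camera_positions(altitudes, n_views, radius))
    return [
        RenderJob(camera, light, coeff)
        for camera, light, coeff in product(cameras, light_offsets, scattering_coeffs)
    ]

jobs = build_jobs(
    altitudes=(1.0, 2.0, 4.0),
    n_views=4,
    radius=0.5,
    light_offsets=((0.3, 0.0, 0.0), (-0.3, 0.0, 0.0)),
    scattering_coeffs=(0.05, 0.2),
)
print(len(jobs), "render jobs, first:", jobs[0])
```

Each job would then be translated into a scene description for the rendering tool. Since the camera poses are known for every rendered image, ground-truth correspondences between image pairs can be derived from the scene geometry.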

First rendering results with small test scene

Fig. 4: Rendered images of a sea floor scene with some stones and particles at different distances to the ground.
