Theoretical and practical limits of CNNs

 

Throughout this wiki, many theoretical and practical aspects of convolutional neural networks (CNNs) have been discussed. Although CNNs are one of the most promising innovations in recent history, engineers and researchers still face problems in several fields. This page focuses on some major aspects concerning training data retrieval, the understanding of trained networks, and training times.

Keywords: limits, problems, issues, overfitting

 

Date 15.01.2017

Author Florian Knitz

 

Overfitting

In the research fields of artificial intelligence (AI), deep learning, and CNNs, “overfitting” is a common problem. It generally describes the use of models or procedures that include more terms than necessary or rely on overly complex approaches. Overfitting can be categorized into two types:

  • Using a network that is more flexible than it needs to be.
  • Using a model that includes irrelevant components.

There are several reasons why overfitting should be avoided:

  • Unused predictors waste resources.
  • Overly complex models increase the probability of prediction errors.
  • Irrelevant predictors add random variation to subsequent predictions.
  • Overly complex models show poor portability to other situations, as they are fitted too closely to one specific environment.

To avoid overfitting in CNN models, this wiki proposes the dropout method (1).
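The placement of dropout can be illustrated with a short sketch. The following PyTorch code is a minimal, hypothetical example (all layer sizes and the dropout probability of 0.5 are assumptions, not values taken from this wiki); it only shows where a dropout layer typically sits in a small CNN classifier:

    import torch
    import torch.nn as nn

    # Minimal sketch of a small CNN classifier with dropout
    # (all sizes are illustrative assumptions).
    class SmallCNN(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),            # 28x28 -> 14x14
                nn.Conv2d(16, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),            # 14x14 -> 7x7
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(32 * 7 * 7, 128),
                nn.ReLU(),
                nn.Dropout(p=0.5),          # randomly zeroes activations during training
                nn.Linear(128, num_classes),
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    model = SmallCNN()
    model.train()   # dropout is active during training
    model.eval()    # dropout is disabled for inference

Note that dropout only acts during training; in evaluation mode the layer is effectively switched off.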

Degradation Problem

When deeper networks are able to start converging, a degradation problem has been exposed: as the network depth increases, accuracy becomes saturated and then degrades rapidly. This degradation is not caused by overfitting, and adding more layers to a suitably deep model leads to higher training error. More detailed information on this problem can be found in (2).

As a remedy, the method of deep residual networks has been proposed.
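To make the core idea more concrete, here is a minimal sketch of a residual block with an identity shortcut, roughly following the structure described in (2). The channel counts and layer choices are illustrative assumptions, not a definitive implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Minimal residual block sketch (identity shortcut; input and output
    # have the same number of channels, sizes are illustrative assumptions).
    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            identity = x                         # shortcut connection
            out = F.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            out = out + identity                 # the stacked layers only learn the residual
            return F.relu(out)

    block = ResidualBlock(channels=64)
    y = block(torch.randn(1, 64, 32, 32))        # output keeps the input shape

Because the shortcut passes the input through unchanged, the stacked layers only have to learn the residual correction, which makes very deep networks easier to optimize.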

Spatial invariance

Another problem in the field of image classification is the lack of spatial invariance: objects can vary in size, rotation, and several other transformations. So far, the CNNs presented in this wiki show little flexibility towards such transformed objects.

Have a look at Spatial Transformer Networks for insights into an approach that addresses this issue and for more information on spatial invariance.
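As a very rough sketch of the core idea behind a spatial transformer module (heavily simplified compared to the original approach, and with a hypothetical localization network whose sizes are assumptions), the following PyTorch code predicts an affine transformation from the input and resamples the image accordingly:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Simplified spatial transformer sketch: a small localization network
    # predicts a 2x3 affine matrix, which is used to resample the input.
    # Input size (1x28x28) and layer widths are illustrative assumptions.
    class SimpleSTN(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc_loc = nn.Sequential(
                nn.Flatten(),
                nn.Linear(1 * 28 * 28, 32),
                nn.ReLU(),
            )
            self.fc_theta = nn.Linear(32, 6)     # 6 affine parameters
            # initialize to the identity transformation
            self.fc_theta.weight.data.zero_()
            self.fc_theta.bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

        def forward(self, x):
            theta = self.fc_theta(self.fc_loc(x)).view(-1, 2, 3)
            grid = F.affine_grid(theta, x.size(), align_corners=False)
            return F.grid_sample(x, grid, align_corners=False)

    stn = SimpleSTN()
    y = stn(torch.randn(4, 1, 28, 28))           # resampled batch, same shape as input

Because the transformation is learned end to end, the network can undo rotations, scalings, and translations of the input before classification.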

Long training times / performance issues

Large networks need long training times. Complex models can take days to train, depending on the hardware they run on. In addition, a steady increase in CNN complexity has been observed in the past and can be expected for future nets, meaning additional layers, filters, and other processing steps. Growing training data sets are another factor driving up the computational demands of CNNs. As a result, researchers have to optimize the performance of their nets in order to keep training times reasonable.

Several approaches therefore switch to GPUs for training networks. Read more here.
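To give an impression of what such a switch looks like in practice, here is a small PyTorch sketch (the model, optimizer settings, and batch are placeholders, not values from this wiki) in which the model and each batch are moved to a GPU if one is available:

    import torch
    import torch.nn as nn

    # Sketch: train on the GPU if one is available (model and data are placeholders).
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Flatten(), nn.Linear(8 * 28 * 28, 10)).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    # Inside the training loop, every batch has to be moved to the same device.
    images = torch.randn(16, 1, 28, 28).to(device)    # placeholder batch
    labels = torch.randint(0, 10, (16,)).to(device)

    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()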

Gathering training data

With CNNs being able to differentiate between increasing numbers of features while also improving feature detection accuracy, it is getting harder to provide enough labeled training data for new fields of application. By now there are several well-known labeled data sets, such as MNIST, PASCAL, and ImageNet (see the weblinks below).

These libraries are well suited to cover topics such as handwriting recognition and other common image processing tasks. Unusual tasks, on the other hand, are not covered by such big libraries and therefore have to be created specifically. Unfortunately, constructing and verifying such libraries can be a time- and labor-intensive task.
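As an example of how easily one of the established data sets can be used, the following sketch loads MNIST through torchvision (the download path, batch size, and normalization values are common choices assumed for illustration):

    import torch
    from torchvision import datasets, transforms

    # Sketch: download MNIST and wrap it in a DataLoader (path and sizes are assumptions).
    transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize((0.1307,), (0.3081,))])

    train_set = datasets.MNIST(root="./data", train=True, download=True,
                               transform=transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

    images, labels = next(iter(train_loader))
    print(images.shape)   # torch.Size([64, 1, 28, 28])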

One proposition on this issue is made by generative adversarial networks (GANs), which are able to generate new training samples by learning the distribution of existing labeled data sets.
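To give an impression of the structure of such a network, here is a heavily simplified sketch of the two competing models, a generator that produces fake samples from noise and a discriminator that judges them. All sizes are illustrative assumptions and the adversarial training loop is omitted:

    import torch
    import torch.nn as nn

    # Heavily simplified GAN sketch for 28x28 gray-scale images
    # (all sizes are illustrative assumptions; training loop omitted).
    latent_dim = 100

    generator = nn.Sequential(          # noise vector -> fake image
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, 28 * 28), nn.Tanh(),
    )

    discriminator = nn.Sequential(      # image -> probability of being real
        nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1), nn.Sigmoid(),
    )

    noise = torch.randn(16, latent_dim)
    fake_images = generator(noise).view(16, 1, 28, 28)
    score = discriminator(fake_images.view(16, -1))   # discriminator judges the fakes

During training, the two models are optimized against each other until the generated samples become hard to distinguish from real ones.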

Understanding trained networks

Large convolutional network models have recently demonstrated impressive classification performance on the ImageNet benchmark. However, there is no clear understanding of why they perform so well, or how they might be improved. Scientifically and from a safety point of view this is deeply unsatisfying: when such models are used in safety-relevant fields, autonomous driving being one example, a high assurance of functionality is desired.

First steps towards visualizing and better understanding CNNs have been proposed with “deconvolutional networks”. Researchers were able to observe individual feature maps in any layer of the model, watch the evolution of features during training, and diagnose potential problems with the model. Read more in their elaborations (see literature 3).
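The deconvolutional-network approach itself is beyond the scope of this page, but a much simpler first step towards inspecting a trained network is to record the feature maps of a single layer with a forward hook. The following PyTorch sketch uses a hypothetical toy model; the model and the chosen layer are assumptions for illustration:

    import torch
    import torch.nn as nn

    # Sketch: capture the activations (feature maps) of one convolutional layer
    # with a forward hook; model and layer choice are illustrative assumptions.
    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    )

    captured = {}

    def save_activation(module, inputs, output):
        captured["feature_maps"] = output.detach()

    hook = model[2].register_forward_hook(save_activation)   # watch the second conv layer
    model(torch.randn(1, 1, 28, 28))
    hook.remove()

    print(captured["feature_maps"].shape)   # torch.Size([1, 16, 28, 28])

The captured tensors can then be plotted channel by channel to get a first impression of what the layer responds to.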

Future prospects

As CNNs are a young, growing field of research with much potential, it is hard to say at this point where this technology will find its limits. Many approaches to overcome the preceding limits have already been proposed. CNNs have shown brilliant results in categorizing and processing 2D graphics. The future of deep learning will likely consist of combinations of different AI technologies, where each contributes its own specialization.

Literature

  1. Douglas M. Hawkins, The problem of overfitting , Journal of Chemical Information and Computer Sciences 44 (2004), no. 1, 1–12, PMID: 14741005. http://pubs.acs.org/doi/abs/10.1021/ci0342472
  2. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Deep residual learning for image recognition , CoRR abs/1512.03385 (2015). https://arxiv.org/abs/1512.03385
  3. Matthew D. Zeiler and Rob Fergus, Visualizing and understanding convolutional networks, pp. 818–833, Springer International Publishing, Cham, 2014. http://link.springer.com/chapter/10.1007/978-3-319-10590-1_53

Weblinks to Datasets

MNIST: http://yann.lecun.com/exdb/mnist/ (Last visited 29.01.2017)

PASCAL: http://host.robots.ox.ac.uk/pascal/VOC/databases.html (Last visited 29.01.2017)

IMAGE-NET: http://image-net.org (Last visited 29.01.2017)


2 Comments

  1. Unknown user (ga25gux) says:

    I have noticed that you are trying to formulate very minimal sentences. It makes your writing hard to understand; try to explain more, explain the reasoning behind it, and give a short interpretation of the content. I know it is very technical, but adding your interpretation and reasoning will make it easier for others to understand. The current formulation you use makes it hard to understand.

    Please add links, dates, and Google Scholar or home page links of the authors to your literature.

    Try to add superscript to your numerical references, e.g. in "Read more in their elaborations (see literature 3)" make the (3) superscript; it would look better.

    Give numbers to your web links as well

     

    Here are some small mistakes and corrections:

    In the research field of AI, Deep Learning or CNNs “overfitting” is a common problem. It generally describes the use of models or procedures that include more terms than are necessary or use more complicated approaches than are necessary. (too much use of than)

     

    In the research field of AI, Deep Learning or CNNs “overfitting” is a common problem. It generally describes the use of models or procedures that include more terms than necessary or usage of overly complicated approaches.

     

    Too complex models increase the possibility of prediction mistakes (I would have used probability)

    Too complex models are impair portability to other situations, as they are fitted to close to one specific environment (the meaning is not clear and you should use impairing portability rather than ‘’are impair’’)

    When deeper networks are able to start converging, a degradation problem has been exposed

    When deeper networks start converging, a degradation problem has been exposed (also the meaning is not so clear)

     

    With the network depth increasing, accuracy gets saturated and then degrades rapidly. 

    As the network depth increases, accuracy gets saturated and then starts to degrade rapidly. 

     

    In addition to that, a steady increase of CNN complexity has been observed in the past and can be predicted for future nets.

    In addition, a steady increase of CNN complexity has been observed in the past and can be predicted for future nets.

     

    Growing training data sets are another factor to the greedy computation consumption of CNNs. (Here the meaning is not so clear)

     

    By now there are several acquainted labeled data sets like: MNIST, PASCAL, IMAGENET. (can you separate them and offer link to each of them)

     

    Unusual tasks on the other hand are not covered by such big libraries therefore have to be created specifically.

    Unusual tasks, on the other hand, are not covered by such big libraries therefore have to be created specifically.

    However there is no clear understanding of why they perform so well, or how they might be improved.

    However, there is no clear understanding of why they perform so well, or how they might be improved.

  2. Unknown user (ga67vim) says:

    Concerning the abbreviations: right at the beginning you could put "(CNN)" behind "convolutional neural networks", and I know some abbreviations are kind of obvious, but maybe write out "AI". This also attracted my attention in your other chapter. (smile)