Illusory Motion Reproduced by Deep Neural Networks Trained for Prediction

Deep neural networks (DNNs), which were developed with reference to the network structures and operational algorithms of the brain, have achieved notable success in a broad range of fields, including computer vision, where they have produced results comparable to, and in some cases better than, those of human experts. More recently, DNNs have also come to be seen as a promising tool for studying the brain.

A research team led by Associate Professor Eiji Watanabe of the National Institute for Basic Biology has now reproduced illusory motion using DNNs trained for prediction.

The DNNs are based on predictive coding theory (Figure 1), which assumes that the internal models of the brain continuously predict the visual world and that errors between the prediction and the actual sensory input are used to refine those internal models. If the theory captures the essentials of visual information processing in the brain, then DNNs built on it can be expected to reproduce human visual perception of motion.
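The predict-compare-refine loop described above can be illustrated with a toy example. This is only a minimal sketch and not the network used in the study: the "internal model" is reduced to a single velocity estimate for an object on a 1-D track, and the function name `learn_velocity` and the `learning_rate` value are assumptions made for illustration.

```python
def learn_velocity(positions, learning_rate=0.5):
    """Refine a velocity estimate from prediction errors on a 1-D track."""
    velocity = 0.0  # internal model: starts out assuming no motion
    errors = []
    for current, actual_next in zip(positions, positions[1:]):
        predicted_next = current + velocity    # prediction of the next input
        error = actual_next - predicted_next   # mismatch with the sensory input
        velocity += learning_rate * error      # the error refines the model
        errors.append(error)
    return velocity, errors


# Hypothetical input: an object moving 2.0 units per frame for 21 frames.
positions = [2.0 * t for t in range(21)]
velocity, errors = learn_velocity(positions)
print(velocity, errors[0], errors[-1])
```

The first prediction misses by the full displacement, and each later error is half the previous one, so the velocity estimate approaches the true value as the prediction error shrinks.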

In this research, the DNNs were trained with natural scene videos recorded from the viewer's own point of view (Figure 2), and the motion prediction ability of the resulting computer model was verified using videos of a rotating propeller that were not part of the training data and the “Rotating Snake Illusion” (Figure 3). The computer model accurately predicted the magnitude and direction of motion of the rotating propeller in the unlearned videos. Surprisingly, it also predicted rotational motion in the illusion images, which do not physically move, much as human observers perceive it (Figure 4). The trained network accurately reproduced the direction of illusory rotation, yet it did not detect motion components in negative control pictures, in which people do not perceive illusory motion.

This research supports the exciting idea that the mechanism assumed by the predictive coding theory is a basis of motion illusion generation. Using sensory illusions as indicators of human perception, deep neural networks are expected to contribute significantly to the development of brain research.

These results were published in Frontiers in Psychology on March 15, 2018. The work was carried out as a collaborative research project by the National Institute for Basic Biology, SOKENDAI (the Graduate University for Advanced Studies), Ritsumeikan University, the National Institute for Physiological Sciences, and Sakura Research Office.

Figure 1: A schematic diagram of PredNet (modified from Figure 1 in Lotter et al. 2016, arXiv:1605.08104). The information flow within a single layer is illustrated. Vertical arrows represent connections with other layers. Each layer consists of “Representation” neurons, which output a layer-specific “Prediction” at each time step. The prediction is subtracted from “Target” to produce an error, which is then propagated laterally and vertically in the network. External data or a lower-layer error signal is input to “Target”. In each layer, it is the prediction error signal, rather than the input information itself, that is processed.
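The error computation in the caption above can be sketched in a few lines. The caption only states that the prediction is subtracted from the target; splitting the difference into rectified positive and negative parts follows the formulation in Lotter et al. (2016). The function names and the use of flat lists in place of convolutional feature maps are simplifying assumptions for illustration.

```python
def relu(x):
    """Rectifier: pass positive values, clamp everything else to zero."""
    return x if x > 0.0 else 0.0


def prediction_error(target, prediction):
    """Return the rectified positive and negative errors, concatenated."""
    if len(target) != len(prediction):
        raise ValueError("target and prediction must have the same length")
    # Where the target exceeds the prediction (under-prediction).
    positive = [relu(t - p) for t, p in zip(target, prediction)]
    # Where the prediction exceeds the target (over-prediction).
    negative = [relu(p - t) for t, p in zip(target, prediction)]
    return positive + negative


# Hypothetical three-unit layer: one under-prediction, one exact match,
# one over-prediction.
error = prediction_error([1.0, 0.5, 0.0], [0.5, 0.5, 1.0])
print(error)
```

A perfectly predicted unit contributes zero to both halves, so only the mismatched units pass a signal on to the rest of the network.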

Figure 2: Training videos. Models were trained using videos from the First-Person Social Interactions Dataset (Fathi et al., 2012, CVPR 2012, Providence), which contains day-long videos of eight subjects at Disney World Resort in Orlando, Florida. The cameras were mounted on caps worn by the subjects.

Figure 3: Akiyoshi Kitaoka’s rotating snake illusions (the left panel). People perceive clockwise or counter-clockwise motion depending on colour alignment. Negative controls (non-illusions) for which people perceive no motion are presented in the right panel. To experience stronger illusory motion perception, please refer to “Akiyoshi’s illusion pages”, http://www.ritsumei.ac.jp/~akitaoka/index-e.html.

Figure 4: Optical flow vectors detected in the illusion by the DNNs. The vectors were computed between a pair of consecutive predicted images of the illusion. Red bars denote the direction and magnitude of the vectors, and yellow dots denote their start points. The left panel shows a single ring of the rotating snake illusion, and the right panel shows a negative control image.
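To make the notion of an optical flow vector between two consecutive images concrete, the sketch below estimates a single vector for a whole image window. The text here does not say which optical-flow method was used in the study, so the Lucas-Kanade least-squares formulation is used as a stand-in; the synthetic sinusoidal frames and the names `make_frame` and `lucas_kanade` are assumptions for illustration.

```python
import math


def make_frame(shift_x, shift_y, size=32):
    """Synthetic smooth pattern displaced by (shift_x, shift_y) pixels."""
    return [[math.sin(0.3 * (x - shift_x)) + math.sin(0.3 * (y - shift_y))
             for x in range(size)] for y in range(size)]


def lucas_kanade(frame_a, frame_b):
    """Estimate one (u, v) flow vector for the window, in pixels per frame."""
    height, width = len(frame_a), len(frame_a[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix = (frame_a[y][x + 1] - frame_a[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (frame_a[y + 1][x] - frame_a[y - 1][x]) / 2.0  # spatial gradient in y
            it = frame_b[y][x] - frame_a[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return 0.0, 0.0  # no usable gradient structure in the window
    # Solve the 2x2 normal equations of Ix*u + Iy*v + It = 0.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


still = make_frame(0.0, 0.0)
moved = make_frame(0.5, -0.25)       # pattern displaced by (0.5, -0.25) pixels
u, v = lucas_kanade(still, moved)
static_flow = lucas_kanade(still, still)
print(u, v, static_flow)
```

For the displaced pair the estimate lands close to the true displacement, while two identical frames give a zero vector, which is the behaviour one would expect for an image pair with no motion between frames.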

##############################
Frontiers in Psychology (2018) Volume 9, Article 345.
“Illusory Motion Reproduced by Deep Neural Networks Trained for Prediction” Eiji Watanabe, Akiyoshi Kitaoka, Kiwako Sakamoto, Masaki Yasugi and Kenta Tanaka
DOI: 10.3389/fpsyg.2018.00345
https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00345/

Frontiers Featured NEWS (2018) April 26
https://blog.frontiersin.org/2018/04/26/artificial-intelligence-tricked-by-optical-illusion-just-like-humans/
##############################


Additional notes:
This paper tests the delta model hypothesis proposed in Watanabe et al. (2010). The delta model is an extension of predictive coding theory (see the related articles below for details).

Refer to:
Watanabe, E., Matsunaga, W., and Kitaoka, A., Motion signals deflect relative positions of moving objects, Vision Research 50, 2381-2390 (2010)

Related articles:
1) Motion signals deflect relative positions of moving objects
2) Delta model

