Dev 007: Perception


Wednesday, January 23, 2019

DDoC #05: Visual memory: What do you know about what you saw?

A random read on "Gist of a Scene" this time, by Dr. Jeremy M. Wolfe.

When it comes to remembering a scene, humans do not go through all of its details. What matters is only the gist of the scene. However, what constitutes the scene gist is not agreed upon yet. Some of the findings in that research direction are as follows:

1. A change in appearance does not change the scene gist (e.g., people remember a scene of two women talking irrespective of the color of the clothes they wear). This is called “change blindness”
2. Scene gist is not just a collection of objects, relationships between the objects in the scene also matter (e.g., milk being poured from a carton into a glass is not the same as a picture of milk being poured from a carton into the space next to a glass)
3. Scene gist involves some information about the spatial layout of the scene
4. Scene gist also involves the presence of unidentified objects (people do not see all the objects, but they know that certain objects should be there even if they are not visible)

You can find more information in his article.

Thursday, January 17, 2019

DDoC #04: Connecting Look and Feel

Today's paper is "Connecting Look and Feel: Associating the Visual and Tactile Properties of Physical Materials". (Figure 1: a self pat on my back for working on this, even though I'm a little sick and in a bad mood. And then I found this cool image!)

Figure 1: self pat on my back. Source: http://massagebywil.com/2011/10/25/pat-yourself-on-the-back/

Why?

Humans use visual cues to infer the material properties of objects. Further, touch is an important mode of perception that lets both robots and humans interact effectively with the outside world.

What?

Project the inputs from different modalities into a shared embedding space to associate visual and tactile information.

How?

Fabrics are clustered into different groups using the K-nearest neighbor algorithm, based on physical properties such as thickness, stiffness, stretchiness and density. To a human, fabrics in the same cluster have similar properties.

Clusters of Fabrics with different properties

Input: Images of a fabric in different modalities (depth, color and tactile images from a touch sensor)

Output: Whether the different modalities come from the same fabric or from different fabrics

Process:

First, a low-dimensional representation (an embedding) of each input is extracted using a CNN. Then the distance between these embeddings is measured. The idea is to have a smaller distance between different modalities of the same fabric and a larger distance between modalities of different fabrics.

So, the goal of the optimization is to minimize the distance between different modalities of the same fabric using a contrastive loss. (In layman's terms, neighbors are pulled together and non-neighbors are pushed apart.)
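To make the "pulled together / pushed apart" idea concrete, here is a minimal sketch of a standard pairwise contrastive loss in plain Python. It is not the paper's implementation: the function names, the margin value of 1.0, and the toy two-dimensional embeddings are all assumptions for illustration, and in the real setting the embeddings would come from the CNNs.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_fabric, margin=1.0):
    """Pairwise contrastive loss (hypothetical helper, margin is an assumption).

    same_fabric=True : penalize any distance, so the pair is pulled together.
    same_fabric=False: penalize only pairs closer than `margin`,
                       so the pair is pushed apart until it clears the margin.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_fabric:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Toy embeddings (made up): a color image and two tactile images.
color_emb = [0.1, 0.9]
touch_same = [0.2, 0.8]   # tactile image of the same fabric
touch_other = [0.9, 0.1]  # tactile image of a different fabric

loss_same = contrastive_loss(color_emb, touch_same, same_fabric=True)
loss_other = contrastive_loss(color_emb, touch_other, same_fabric=False)
print(loss_same, loss_other)
```

In this toy case the same-fabric pair still has a small loss because its embeddings are not identical, while the different-fabric pair is already farther apart than the margin and contributes nothing.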

More information can be found in their paper.

Sunday, October 28, 2018

On Creativity and Abstractions of Neural Networks

"Are GANs just a tool for human artists? Or are human artists at
the risk of becoming tools for GANs?"

Today we had a guest lecture titled "Creativity and Abstractions of Neural Networks" by David Ha (@HardMaru), Research Scientist at Google Brain, facilitated by Michal Fabinger.

Among all the interesting topics he discussed, such as Sketch-RNN, Kanji-RNN and world models, what captivated me most were his ideas about abstraction, machine creativity and evolutionary models. What he discussed on those topics (as I understood it) is:

Generating images from latent vectors in autoencoders is a useful way to understand how the network forms abstract representations of data. In world models [1], he used an RNN to predict the next latent vector, which can be thought of as an abstract representation of reality.

Creative machines learn and form new policies to survive or to perform better. This can be somewhat evolutionary (maybe not within the lifetime of one agent). The agents can also adapt to different scenarios by modifying themselves (self-modifying agents).

Some other quotes or facts about human perception that (I think) have inspired his work:

Sketch-RNN [2]:

"The function of vision is to update the internal model of the world inside our head, but what we put on a piece of paper is the internal model" ~ Harold Cohen (1928-2016), Reflections of design and building AARON

World Models:

"The image of the world around us, which we carry in our head, is just a model. Nobody in their head imagines all the world, government or country. We have only selected concepts, and relationships between them, and we use those to represent the real system." ~ Jay Wright Forrester (1918-2016), father of system dynamics

[1] https://worldmodels.github.io/
[2] https://arxiv.org/abs/1704.03477

Wednesday, July 4, 2018

teamLab: Blurring the Boundaries between Art and Science

My Lizard Painting

Yesterday, we visited the MORI Building Digital Art Museum: teamLab Borderless (the name is quite long and too hard to remember in the right order :P), which opened recently in Odaiba... to make Odaiba, or Tokyo for that matter, even greater!

Even though some exhibits looked a bit trivial at first (I felt that the exhibition ticket was somewhat overpriced, even at the discounted price and even though we did not pay for it), a second thought after further reading left me overwhelmed, impressed and fascinated by the extent of innovation, creativity and philosophical thought that they have put into each piece of art.

This museum gives a great feel for how digital technology can complement traditional forms of art and overcome their inherent limitations. The museum is based on a few great concepts. One that highly captivated my curious mind (curious about perception in all its forms, that is) is their notion of ultra subjective space. Comparing that concept with the Western perspective in paintings and with ancient Japanese spatial recognition made the idea even more intriguing.

If you are planning to visit this museum, I highly recommend that you understand those concepts before you visit the museum, in addition to other "things you should read before you visit", to make your museum experience even better.

On the other hand, some activities were quite fun too. Look at how I painted a cute lizard and the way it came alive a little later with all those "natural lizard-like moves"!

Sunday, July 1, 2018

Seeing is 'Not Necessarily' Believing?

Recently, I read the Nature article “Our useful inability to see the reality”, which introduced a book called “Deviate: The Science of Seeing Differently”. I haven’t read the book yet, but it seems to shed some light on the idea of perception. The basic idea seems to be that we see what we want to see, based on our past experience, and not necessarily what’s out there in reality. In other words, the information we acquire through our eyes has much less to do with how we derive its actual meaning (in relation to ourselves, of course) than we might expect. Specifically, it is being discovered that 90% of the neurons responsible for making sense of what we see are not part of the visual areas of the brain.
