
Saturday, June 30, 2018

Favorite quotes from "Lab Girl" by Hope Jahren

"While looking at the graph, I thought about how I now knew something for certain that only an hour ago had been an absolutely unknown, and I slowly began to appreciate how my life had just changed. I was the only person in an infinite exploding universe who knew that the powder was made of opal. In a wide, wide world, full of unimaginable numbers of people, I was - in addition to being small and insufficient - special. I was not only a quirky bundle of genes, but I was also unique existentially, because of the tiny detail that I knew about Creation, because of what I had seen and then understood. Until I phoned someone, the concrete knowledge that opal was the mineral that fortified each seed on each hackberry tree was mine alone. Whether or not this was something worth knowing seemed another problem for another day. I stood and absorbed this revelation as my life turned a page, and my first scientific discovery shone, as even the cheapest plastic toy does when it is new."

"I had worked and waited for this day. In solving this mystery I had also proved something, at least to myself, and I finally knew what real research would feel like."

"Afterward I reward myself by sitting in my office choosing and ordering chemicals and equipment, feeling like a giddy bride picking out her gift registry"

"As research scientists, we will never ever be secure."

"He knows about the many nextstory.doc files in my hard drive; He knows how I like to sift through the thesaurus for hours; he knows that nothing feels better to me than finding the exactly right word that stabs cleanly at the heart of what you are trying to say."

Thursday, June 28, 2018

Look Closer to See Better

Image source: Wikipedia
Hearing about this recent research made me feel a little dumb, and hopefully it will make you feel the same way too. Still, it's quite impressive to see the advanced tasks that machines are becoming capable of. What we usually hear is that even though recognizing a cat is a simple task for humans, it is quite a challenging task for a machine, or let's say, for a computer.

Now, try to recognize what's in this image. If I were given this task, I would have just said that it's a 'bird', and hopefully you would too, unless you are a bird expert or enthusiast. Of course it's a bird, but what if your computer is smart enough to say that it's a 'Laysan albatross'? 😂 Not feeling dumb enough yet? It seems the computer is even aware of which features in which areas of the bird's body make it a 'Laysan albatross'.

Even though there is some promising research on region detection and fine-grained feature learning (e.g., finding which regions of this bird contain features that discriminate it from other bird species, and then learning those features so that we can recognize the species in a new, previously unseen image), existing methods still have some limitations.

So this research [1] proposes a method in which the two components, attention-based region detection and fine-grained feature learning, strengthen or reinforce each other through feedback so that they perform better as a whole. The first component starts by looking at the coarse-grained features of a given image to identify which areas to pay more attention to. The second component then analyzes the fine-grained details of those areas to learn which features make them unique to this species. If the second component struggles to make confident decisions about the bird species, it informs the first component, since the selected region might not have been very accurate.
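To make the idea concrete, here is a minimal sketch of that mutual-reinforcement loop, assuming PyTorch. The tiny backbone, attention head, and shared classifier below are simplified stand-ins for the paper's actual networks (which use full VGG backbones and a separate attention proposal sub-network); all names here are illustrative, not the authors' implementation.

```python
# A toy sketch of the recurrent attention idea, assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRACNN(nn.Module):
    def __init__(self, num_classes=200):
        super().__init__()
        # Coarse-scale feature extractor (stand-in for a full CNN backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, num_classes)
        # Attention head: predicts a (cx, cy, half-width) square region.
        self.attention = nn.Linear(16, 3)

    def forward(self, x):
        feat = self.backbone(x).flatten(1)
        coarse_logits = self.classifier(feat)

        # Propose an attended region from the coarse features.
        cx, cy, hw = torch.sigmoid(self.attention(feat)).unbind(1)
        # Crop-and-zoom via differentiable resampling (affine grid).
        theta = torch.zeros(x.size(0), 2, 3, device=x.device)
        theta[:, 0, 0] = hw
        theta[:, 1, 1] = hw
        theta[:, 0, 2] = 2 * cx - 1
        theta[:, 1, 2] = 2 * cy - 1
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        zoomed = F.grid_sample(x, grid, align_corners=False)

        # Finer-scale classification on the attended, zoomed-in crop.
        fine_logits = self.classifier(self.backbone(zoomed).flatten(1))
        return coarse_logits, fine_logits

# Training couples the two scales: a ranking loss pushes the fine scale
# to be *more* confident about the true class than the coarse scale,
# which in turn forces the attention head to propose better regions.
def rank_loss(coarse_logits, fine_logits, labels, margin=0.05):
    p_coarse = F.softmax(coarse_logits, 1).gather(1, labels[:, None])
    p_fine = F.softmax(fine_logits, 1).gather(1, labels[:, None])
    return F.relu(p_coarse - p_fine + margin).mean()
```

The key design choice is the ranking loss: by requiring the finer scale to beat the coarser scale's confidence on the true class, the gradient flowing back through the differentiable crop pushes the attention head toward regions that actually help classification.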

More information about this research can be found here.

[1] J. Fu, H. Zheng and T. Mei, "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 4476-4484.


Sunday, June 24, 2018

What makes Paris look like Paris?

windows with railings in Paris
Cities have their own character. Maybe that's what makes some cities more notable than others. In her award-winning memoir "Eat, Pray, Love", Elizabeth Gilbert mentions that there's a 'word' for each city. She assigns 'Sex' to Rome, 'Achieve' to New York, and 'Conform' to Stockholm (to add more cities that I have been to, how about 'Tranquility' for Kyoto, 'Elegance' for Ginza, and 'Vibrant' for Shibuya?). When terrorists attacked Paris in 2015, more than 7 million people shared their support for Paris under the #PrayforParis hashtag within 10 hours. Have you ever thought about what characteristics make a city feel the way it does? Can we make a machine that can 'feel' the same way about cities as humans do?

Maybe we are not there yet. Nevertheless, researchers from Carnegie Mellon University and Inria have taken an innovative baby step in this research direction by asking the question "What makes Paris look like Paris?" [1]. Is it the Eiffel Tower that makes Paris look like Paris? How can we tell whether a given image was taken in Paris if the Eiffel Tower is not present in it?


To start with, they asked people who had been to Paris before to distinguish Paris from other cities such as London or Prague. Humans could achieve this task with a significant level of accuracy. To make a machine that can perceive a city the same way a human does, we first need to figure out, "What characteristics of Paris help humans perceive Paris as Paris?". So their research focuses on automatically mining the frequently occurring patterns or characteristics (features) that make Paris geographically discriminative from other cities. Even though there can be both local and global features, the researchers focused only on local, high-dimensional features. Hence, image patches at different resolutions, represented as HOG+color descriptors, are used for the experiments. The image patches are labeled as two sets, Paris and non-Paris (London, Prague, etc.). Initially, the non-discriminative patches, things that can occur in any city such as cars or sidewalks, are eliminated using a nearest-neighbor algorithm: if an image patch is similar to other image patches in 'both' the Paris set and the non-Paris set, then that patch is considered non-discriminative, and vice versa. A simplified sketch of this filtering step appears after the image below.
Paris Window painting
by Janis McElmurry
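To make that filtering step concrete, here is a simplified sketch in Python, assuming scikit-learn. The array names, the neighbor count, and the 80% threshold are hypothetical choices for illustration, not the paper's exact parameters.

```python
# A simplified sketch of the nearest-neighbor filtering step.
# `paris_patches` and `nonparis_patches` are hypothetical arrays of
# HOG+color descriptors, one row per image patch.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def discriminative_patches(paris_patches, nonparis_patches, k=20,
                           min_paris_ratio=0.8):
    """Keep Paris patches whose k nearest neighbors (searched over
    both cities) come mostly from Paris. A patch whose neighbors are
    split across both sets (cars, sidewalks, ...) is discarded."""
    all_patches = np.vstack([paris_patches, nonparis_patches])
    is_paris = np.arange(len(all_patches)) < len(paris_patches)

    nn = NearestNeighbors(n_neighbors=k + 1).fit(all_patches)
    # k+1 because each Paris patch is its own nearest neighbor.
    _, idx = nn.kneighbors(paris_patches)
    neighbor_is_paris = is_paris[idx[:, 1:]]  # drop the self-match
    paris_ratio = neighbor_is_paris.mean(axis=1)
    return np.where(paris_ratio >= min_paris_ratio)[0]
```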

However, the notion of "similarity" can be quite subjective when it comes to comparing different visual aspects, so the standard similarity measurements used in a nearest-neighbor algorithm might not represent the similarity between elements from different cities well. Accordingly, the researchers came up with a distance (similarity) metric that is learned, or adapted, from the available image patches in an iterative manner to find discriminative features. This algorithm is run on images from different cities, such as Paris and Barcelona, to find the distinctive stylistic elements of each city.
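A rough sketch of such an iterative loop might look like the following, again with hypothetical names and parameters; here the learned linear SVM weights play the role of the adapted similarity metric.

```python
# A rough sketch of iterative discriminative learning, assuming
# scikit-learn. Thresholds and iteration counts are made up.
import numpy as np
from sklearn.svm import LinearSVC

def mine_discriminative_detector(seed_patches, paris_patches,
                                 nonparis_patches, iterations=5):
    """Iteratively refine a patch detector: train a linear SVM to
    separate the current positives from non-Paris patches, then
    re-select the top-scoring Paris patches as the next positives."""
    positives = seed_patches
    svm = None
    for _ in range(iterations):
        X = np.vstack([positives, nonparis_patches])
        y = [1] * len(positives) + [0] * len(nonparis_patches)
        svm = LinearSVC(C=0.1).fit(X, y)
        # The SVM decision value replaces a fixed distance metric:
        # it scores how "Paris-like" each patch is.
        scores = svm.decision_function(paris_patches)
        top = np.argsort(scores)[::-1][:len(seed_patches)]
        positives = paris_patches[top]
    return svm
```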

An interesting fact about this research (well, at least for me) is that artists can use its findings as useful cues to better capture the style of a given place. More details about this research can be found here.

[1] C. Doersch, S. Singh, A. Gupta, J. Sivic, and A. A. Efros, "What Makes Paris Look like Paris?," ACM Transactions on Graphics (SIGGRAPH 2012), vol. 31, no. 3, August 2012.

Tuesday, June 19, 2018

Panoptic Segmentation


Panoptic segmentation is a topic we discussed during our lab seminar recently, because it could potentially improve scene understanding in autonomous vehicles that use vision sensors.

Successful approaches based on convolutional nets have previously been proposed for the semantic segmentation task. Further, methods based on object or region proposals have become popular for detecting individual objects as well.

Image source: [1]
The idea behind panoptic segmentation [1] is to unify the tasks of semantic segmentation (studying uncountable 'stuff' such as sky, grass, and road regions) and instance segmentation using object detectors (studying countable 'things', e.g., different instances of cars).

A novel metric called 'panoptic quality' (PQ) is proposed to evaluate the approach. More details about this can be found here and a simpler version here. A toy illustration of the metric is given below.
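For intuition, here is a toy, single-class sketch of how PQ can be computed, assuming segments are given as boolean pixel masks; the IoU > 0.5 matching rule comes from the paper, while the function names here are mine.

```python
# A toy illustration of panoptic quality (PQ) for a single class.
# Matched pairs contribute their IoU to the numerator; unmatched
# predictions (FP) and unmatched ground truth (FN) each count as 1/2
# in the denominator, as defined in [1].
import numpy as np

def iou(a, b):
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def panoptic_quality(pred_segments, gt_segments):
    matched, iou_sum, tp = set(), 0.0, 0
    for g in gt_segments:
        for p_idx, p in enumerate(pred_segments):
            if p_idx in matched:
                continue
            v = iou(p, g)
            if v > 0.5:  # IoU > 0.5 guarantees the match is unique
                matched.add(p_idx)
                iou_sum += v
                tp += 1
                break
    fp = len(pred_segments) - tp  # unmatched predicted segments
    fn = len(gt_segments) - tp    # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0
```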

[1] A. Kirillov, K. He, R. Girshick, C. Rother, and P. Dollár, "Panoptic Segmentation," arXiv preprint, 2018.

Monday, June 18, 2018

Theoretical Understanding of Deep Learning using Information Theory


Information theory in communication
Information theory has proven useful in many applications such as digital communication, computer networking, and multimedia compression. For example, the concept of entropy (average information content) can be used to determine the minimum number of bits into which a given message can be encoded before transmission, and the concept of mutual information can be used to determine the capacity of a given communication channel. A small worked example follows.
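Consider a toy four-symbol source whose probabilities happen to be powers of two, so a simple prefix code exactly meets the entropy bound:

```python
# A small worked example of the entropy lower bound on code length.
import math

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Shannon entropy: the minimum average number of bits per symbol
# that any lossless code can achieve for this source.
entropy = -sum(p * math.log2(p) for p in probs.values())
print(entropy)  # 1.75 bits/symbol

# A Huffman-style prefix code meets the bound here because all the
# probabilities are powers of two: a->0, b->10, c->110, d->111.
avg_len = 0.5 * 1 + 0.25 * 2 + 0.125 * 3 + 0.125 * 3
print(avg_len)  # 1.75 bits/symbol
```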
Information bottleneck
(analogous to optimal representations)

Despite the recent breakthroughs of deep learning, 'why do these models work the way they do?' still remains a mystery. Recently, I read about an interesting attempt [1] in which the researchers tried to understand the inner workings of deep learning algorithms using notions from information theory. They use a concept called the "information bottleneck" to describe the most relevant features or signals, those with sufficient discriminative ability, excluding all the noise, for solving a given problem. In other words, the goal of the deep neural network is to optimize the information bottleneck trade-off: compressing the representation as much as possible while preserving the prediction accuracy. The objective is sketched below.
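Formally (following the standard information bottleneck formulation that the paper builds on), the learned representation $T$ of the input $X$ is chosen to trade off compression of $X$ against prediction of the label $Y$:

$$\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)$$

where $I(\cdot\,;\cdot)$ denotes mutual information and $\beta$ controls how much prediction-relevant information is kept relative to how aggressively the input is compressed.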

More information can be found here (their original paper) and here (a much simplified version!).

[1] R. Shwartz-Ziv and N. Tishby, "Opening the Black Box of Deep Neural Networks via Information," ICRI-CI, 2017.

Image sources:
[1] http://www.nairaland.com/3943054/information-theory-made-really-simple
[2] https://www.techwell.com/techwell-insights/2017/05/finding-bottlenecks-agile-and-devops-delivery-cycle