When you accidentally learn something new
Deep learning is everywhere now. Sometimes I feel that if a topic does not include that term, it must be out of date (or maybe too novel, to some extent ;-).
That impression strongly discouraged me from learning it: if something is universal to the whole crowd, it does not feel that special to me. At least, that was true until I read a paper today.
I am always into papers about sketches and object detection. To be honest, the approaches to these goals are so different from one another that the field feels like a boundless forest. This paper's title and abstract did not mention any learning-related keywords; the learning part stayed well hidden until I reached the section describing the proposal. In order to better recognize both large and small items, especially in indoor scenes, the authors add different features step by step. (I have to say it is really essential to present an idea with a complete logic chain; otherwise it is hard to convince people that every thought is well grounded.)

One feature is context: we see a lamp on a desk far more often than a lamp on a bed. The relationships between different categories can be learned after feeding in large sets of labeled data.

Another feature is the latent support surface, which is one highlight of this approach. Some furniture has hidden surfaces inside its model; for example, a desk with a small bookshelf attached. In that case the cuboid representation covers the whole item, but the surface of the desk, on which small cups and vases might sit, is actually a really important feature. (I sketch this idea in code below.)

The third feature is the orientation from the object itself to the viewpoint, a.k.a. the camera.

This is unlike other deep learning papers, which tend to be filled with explanations of the layers or of how to handle the dimensions of the input. Those details sound very important, but I easily lose track of them.
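To make the latent-surface idea concrete for myself, here is a toy sketch. Everything in it, the axis-aligned cuboids, the support_heights list, the could_rest_on helper, is my own illustration under simplifying assumptions, not the paper's parameterization (the paper works with oriented cuboids estimated from RGB-D input):

```python
from dataclasses import dataclass, field

@dataclass
class Cuboid3D:
    """A toy axis-aligned cuboid: a footprint in x/y plus a vertical extent.
    support_heights lists the heights of latent surfaces hidden inside the
    cuboid, e.g. the desktop of a desk whose cuboid also spans an attached
    bookshelf. (My own simplification, not the paper's representation.)"""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float
    support_heights: list = field(default_factory=list)

def could_rest_on(small_obj: Cuboid3D, furniture: Cuboid3D, tol: float = 0.02) -> bool:
    """True if the small object's footprint lies inside the furniture's
    footprint and its base sits on one of the latent support surfaces."""
    inside = (furniture.x_min <= small_obj.x_min and small_obj.x_max <= furniture.x_max
              and furniture.y_min <= small_obj.y_min and small_obj.y_max <= furniture.y_max)
    on_surface = any(abs(small_obj.z_min - h) <= tol for h in furniture.support_heights)
    return inside and on_surface

# A desk whose cuboid reaches 1.5 m because of an attached shelf,
# but whose actual desktop (the latent support surface) is at 0.75 m.
desk = Cuboid3D(0.0, 1.2, 0.0, 0.6, 0.0, 1.5, support_heights=[0.75])
cup = Cuboid3D(0.5, 0.6, 0.2, 0.3, 0.75, 0.85)
print(could_rest_on(cup, desk))  # True: the cup sits on the hidden desktop
```

The point, as I read it, is exactly this: the cup would be lost if the desk were treated only as the full 1.5 m cuboid.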
For this paper, however, the implementation still relies on learning those features, but to me the way the authors came up with the features is the key point. By coincidence, I also tried some CPM code to detect keypoints for fun today. I had no expectations, since the input felt really random and meaningless. After a day of debugging and customizing, more than 95% of the detected keypoints were correct, which really surprised me. I feel like I should pick the deeplearning.ai online course back up, the one I started last summer. This is way more interesting than just checking whether a cat is a cat.
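Assuming CPM here means Convolutional Pose Machines, the network predicts one belief map per keypoint, and the final positions come from picking the peak of each map. Below is a minimal sketch of that last post-processing step; the extract_keypoints helper, the threshold value, and the fake belief maps are my own assumptions, not taken from any particular repository:

```python
import numpy as np

def extract_keypoints(belief_maps: np.ndarray, threshold: float = 0.3):
    """From per-keypoint belief maps of shape (K, H, W), return for each
    keypoint its (x, y) location and confidence, or None if the peak is
    below the threshold. Real pipelines also rescale (x, y) back to the
    original image resolution."""
    keypoints = []
    for heatmap in belief_maps:
        y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
        conf = float(heatmap[y, x])
        keypoints.append(((int(x), int(y)), conf) if conf >= threshold else None)
    return keypoints

# Fake network output: 2 keypoints over an 8x8 grid.
maps = np.zeros((2, 8, 8))
maps[0, 3, 5] = 0.9  # a confident detection at (x=5, y=3)
maps[1, 6, 2] = 0.1  # too weak, gets filtered out
print(extract_keypoints(maps))  # [((5, 3), 0.9), None]
```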
By the way, the paper is “3D Object Detection with Latent Support Surfaces”, if you are curious.