Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/47
Title: | Deep Learning for free-hand sketch object recognition |
Authors: | Byaravalli Arun, Suhag |
Department: | Department of Computer Science |
Issue Date: | 2017 |
Supervisor: | Supervisor: Dr. Lau, W. H. Rynson; First Reader: Dr. Hou, Junhui David; Second Reader: Prof. Zhang, Qingfu |
Abstract: | In this project, we propose a novel deep learning architecture that achieves state-of-the-art results in free-hand sketch object recognition. The highly iconic and abstract nature of sketch objects makes them hard for a computer algorithm to recognize. Because sketch recognition is not a new problem in computer vision, we conducted a detailed study of previous work in this domain. Hand-crafted models failed to capture the iconic nature of sketches, and existing deep learning architectures are tailored to photographic images and do not adapt to the varying levels of abstraction present in sketch objects. These shortcomings led to Sketch-A-Net, which surpassed human-level accuracy. However, Sketch-A-Net requires stroke-order information to recognize sketch objects accurately: the framework only considers real-time sketch inputs and cannot handle the large datasets of sketch objects available online. These findings underline the need for a new deep learning architecture tailored to sketch recognition. Our model is designed on the Hebbian principle, which states that neurons that fire together wire together. We address common model-design issues that previous work overlooked. We mitigate the overfitting problems of wider networks by introducing a sparse structure of convolutional blocks into our model. We handle the iconic and abstract nature of sketch objects by training on a large number of samples. Our model is trained on the TU-Berlin sketch dataset, which consists of 20,000 objects from 250 categories. We apply data-augmentation techniques to the dataset to increase its size. Our model achieves a recognition accuracy of 84.7%, approximately 10% higher than that of its predecessors. We then deployed our model on a cloud platform and set up a web application to process sketch recognition requests. 
Even though our model achieves high accuracy, it still fails to handle intra-class deformations, which shows that it has room for improvement. Having addressed sketch recognition, we can now move towards multi-object recognition, sketch object segmentation, image retrieval based on sketch queries, and the most popular current trend in computer vision: the use of Generative Adversarial Networks to synthesize sketch objects, or to synthesize a complete photo-realistic image from a sketch object. The possibilities in this domain are endless, and we plan to revisit and continue our research in deep learning for free-hand sketch objects in the future. |
Appears in Collections: | Computer Science - Undergraduate Final Year Projects |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
fulltext.html | | 148 B | HTML | View/Open |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.