Please use this identifier to cite or link to this item:
http://dspace.cityu.edu.hk/handle/2031/9448
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wu, Kwun Yu | en_US |
dc.date.accessioned | 2021-11-16T06:48:29Z | - |
dc.date.available | 2021-11-16T06:48:29Z | - |
dc.date.issued | 2021 | en_US |
dc.identifier.other | 2021eewky500 | en_US |
dc.identifier.uri | http://dspace.cityu.edu.hk/handle/2031/9448 | - |
dc.description.abstract | This project addresses the heavy computational and memory demands of deep neural network models, which are especially significant at inference time, by applying low-precision techniques to quantize the model and thereby reduce latency and memory footprint. Different model compression and acceleration techniques are surveyed. Experiments with some of these compression techniques were conducted on an image classification task: convolutional neural network (CNN) models were trained on the CIFAR-10 dataset, and low-precision techniques were then applied to quantize the model parameters to a lower bit width, improving both memory utilization and inference time. The models were also deployed and evaluated on the Alveo U50, an FPGA-based accelerator card for computation-intensive tasks. Two approaches to implementing quantized neural network inference were explored and evaluated, based on different frameworks and hardware. One of the approaches led to the successful deployment of the quantized model onto the hardware for inference and produced promising results on the testing dataset: inference time and memory utilization improved, with only negligible accuracy loss after quantization and deployment. It can therefore be considered a feasible solution to the problem. | en_US |
dc.rights | This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner. | en_US |
dc.rights | Access is restricted to CityU users. | en_US |
dc.title | Quantized Neural Network Inference on FPGA | en_US |
dc.contributor.department | Department of Electrical Engineering | en_US |
dc.description.supervisor | Supervisor: Dr. Cheung, Ray C C; Assessor: Dr. Chan, K L | en_US |
Appears in Collections: | Electrical Engineering - Undergraduate Final Year Projects |
Files in This Item:
File | Size | Format |
---|---|---|
fulltext.html | 148 B | HTML |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.