|Title: ||ActiveCrowd: integrating active learning with Amazon Mechanical Turk|
|Authors: ||Chi, Fung Cheung (池鳳翔)|
|Department: ||Department of Computer Science|
|Issue Date: ||2015|
|Course: ||CS4514 Project|
|Programme: ||Bachelor of Science (Honours) in Computer Science|
|Award: ||Won the Second Runner-up in the Final Year Project Competition 2014-2015 organized by the IEEE (Hong Kong) Computational Intelligence Chapter; the Third Prize in the Challenge Cup 2015 - Hong Kong University Students Extra-Curriculum Technology Contest under the category of Information Technology; and the Merit Award in Crossover 2015 Pan-Pearl River Delta Region Universities IT Project Competition (Hong Kong SAR Region).|
|Supervisor: ||Dr. Nutanong, Sarana|
|Citation: ||Chi, F. C. (2015). ActiveCrowd: Integrating active learning with Amazon Mechanical Turk (Outstanding Academic Papers by Students (OAPS)). Retrieved from City University of Hong Kong, CityU Institutional Repository.|
|Type: ||Research Project|
|Abstract: ||Machine learning is a technique that builds classification and prediction models by learning from samples. It has proven useful in scientific research such as DNA pattern recognition and climate modeling, and it has been adopted in many real-life applications, including spam filtering, image search and optical character recognition (OCR). In theory, the more samples a learning model is given, the more accurate it can become. However, supervised learning requires each sample to be provided along with its label, and labels can be expensive to obtain because of the human effort that labeling tasks require. This cost greatly hinders the adoption of machine learning in resource-limited environments.
Meanwhile, crowdsourcing allows requesters to obtain a scalable workforce and services from a large crowd of people. Amazon Mechanical Turk (MTurk) is a popular online crowdsourcing platform that enables requesters to publish requests to more than 500 thousand registered workers. It has the potential to solve the sample labeling problem, but so far no integration of machine learning and crowdsourcing has been implemented in a way that serves general machine learning purposes.
In this project, a machine learning framework named ActiveCrowd was designed and implemented to allow anyone with basic programming knowledge to build machine learning models for general purposes. The framework adopts the active learning technique and integrates scikit-learn, a powerful machine learning library written in Python and published under the BSD license, with Amazon Mechanical Turk, which serves as a low-cost and efficient label annotator. The framework reduces the implementation effort required to build machine learning models and makes the supervised learning process completely automated.|
|Appears in Collections:||Student Works With External Awards|
OAPS - Dept. of Computer Science
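The active learning loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ActiveCrowd's actual code: the function names `least_confident` and `active_learning_loop`, the `annotate` callback (a stand-in for a round trip to Amazon Mechanical Turk) and the choice of least-confidence sampling as the query strategy are all hypothetical. The `model` argument is assumed to be any object exposing the scikit-learn style `fit` and `predict_proba` methods.

```python
from typing import Callable, List, Sequence, Tuple


def least_confident(probabilities: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return the indices of the batch_size samples the model is least sure about.

    Confidence is taken as the highest class probability in each row, so the
    rows with the smallest maximum are queried first (least-confidence sampling).
    """
    confidence = [max(row) for row in probabilities]
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return ranked[:batch_size]


def active_learning_loop(
    model,
    labeled_X: Sequence[Sequence[float]],
    labeled_y: Sequence[int],
    pool_X: Sequence[Sequence[float]],
    annotate: Callable[[List[Sequence[float]]], List[int]],
    rounds: int,
    batch_size: int,
) -> Tuple[object, list, list]:
    """Repeatedly train, pick uncertain samples, and ask the annotator for labels.

    `annotate` is a hypothetical callback: in a crowdsourcing setting it would
    publish the queried samples as labeling tasks and return the collected labels.
    """
    labeled_X, labeled_y, pool_X = list(labeled_X), list(labeled_y), list(pool_X)
    for _ in range(rounds):
        if not pool_X:
            break  # nothing left to label
        model.fit(labeled_X, labeled_y)
        picks = least_confident(model.predict_proba(pool_X), batch_size)
        queries = [pool_X[i] for i in picks]
        labels = annotate(queries)
        labeled_X.extend(queries)
        labeled_y.extend(labels)
        chosen = set(picks)
        pool_X = [x for i, x in enumerate(pool_X) if i not in chosen]
    model.fit(labeled_X, labeled_y)  # final fit on everything labeled so far
    return model, labeled_X, labeled_y
```

With scikit-learn installed, a classifier such as `sklearn.linear_model.LogisticRegression()` could be passed as `model`, provided the initial labeled set contains at least two classes. Only the samples the model is least certain about are sent to the annotator, which is how active learning keeps the number of paid labeling tasks small.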
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.