Data valuation for machine learning and federated learning

Chen, Jiaqing (陳佳晴)

Please use this identifier to cite or link to this item: http://dspace.cityu.edu.hk/handle/2031/9527

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chen, Jiaqing (陳佳晴)	en_US
dc.date.accessioned	2022-04-27T03:04:43Z	-
dc.date.available	2022-04-27T03:04:43Z	-
dc.date.issued	2021	en_US
dc.identifier.citation	Chen, J. (2021). Data valuation for machine learning and federated learning (Outstanding Academic Papers by Students (OAPS), City University of Hong Kong).	en_US
dc.identifier.other	cs2021-4514-cj577	en_US
dc.identifier.uri	http://dspace.cityu.edu.hk/handle/2031/9527	-
dc.description.abstract	Federated learning (FL) has recently emerged as a promising framework for collecting dispersed data and training a collaborative machine learning (ML) model with privacy protection. An incentive scheme plays a crucial role in an FL system because it encourages long-term client participation. However, due to information asymmetry between the central server and local users, a key challenge is to evaluate participants' contributions objectively and efficiently so that the payoff can be allocated fairly. Data valuation in the ML context is the systematic study of quantifying the usefulness of a specific data point to a prediction model. It provides a potential way for FL to measure the quality of each local client. However, exponential computational complexity and additional communication costs are critical challenges in applying data valuation-based incentive schemes. In this project, we propose a new round-based data valuation (RDV) approach to serve as a real-time incentive mechanism. It takes advantage of the FL system's unique model aggregation property to increase valuation efficiency and provide a fine-grained contribution estimate on a per-round basis. It also offers a guideline for the central server to selectively aggregate local updates to train a better-performing model. We empirically demonstrate RDV's effectiveness in identifying high-quality participants, its efficiency in allocating payoff, and its potential for federation optimization.	en_US
dc.rights	This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner.	en_US
dc.rights	Access is unrestricted.	en_US
dc.title	Data valuation for machine learning and federated learning	en_US
dc.contributor.department	Department of Computer Science	en_US
dc.description.course	CS4514 Project	en_US
dc.description.programme	Bachelor of Science (Honours) in Computer Science	en_US
dc.description.supervisor	Dr. Wang, Cong	en_US
Appears in Collections:	OAPS - Dept. of Computer Science

Files in This Item:

File	Size	Format
fulltext.html	153 B	HTML