Please use this identifier to cite or link to this item: http://dspace.cityu.edu.hk/handle/2031/9559
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Misra, Kartikeya | en_US |
dc.date.accessioned | 2023-03-15T10:12:07Z | - |
dc.date.available | 2023-03-15T10:12:07Z | - |
dc.date.issued | 2022 | en_US |
dc.identifier.other | 2022csmk173 | en_US |
dc.identifier.uri | http://dspace.cityu.edu.hk/handle/2031/9559 | - |
dc.description.abstract | Recent advances in NLP have made it possible to build efficient text summarisation systems. This research focuses on summarising financial texts, since reading every report issued by firms, along with the other articles relevant to the industry, is very tedious. We compare several summarisation methods, both abstractive and extractive. The model used is an LSTM-based sequence-to-sequence encoder-decoder with multiple attention mechanisms. The research aims to generate more human-like, abstractive summaries that reflect the context of the text while still extracting important information verbatim, for example the numbers in the source. To this end we draw on existing work on improving abstractive summarisation, blend those concepts together, and propose our own modifications to the model to obtain better results. To build a good financial text summariser, the primary focus was a model that generates good abstractive summaries. Testing such a model requires a large dataset, so that poor performance cannot be attributed to a lack of data; since no large financial news summary dataset is available, this research uses the Gigaword dataset, which contains close to 4 million articles. After extensive review, the scope narrowed to pointer-generator networks combined with a mixture of maximum likelihood estimation and reinforcement learning, together with multiple attention mechanisms, because of the high potential of these techniques (an illustrative sketch of these two components follows the metadata record below). Combining these concepts with newly proposed architectural changes, and computing the encoder and decoder attentions differently from existing approaches, produced better results than existing models, which is a key contribution of this project. The resulting model achieves a ROUGE-L score of 42.47, compared with 35.33 for the baseline model used in this research. Once satisfactory results had been obtained on this dataset and the model had been compared with several others, the financial dataset was prepared. The financial data comes from Reuters and was preprocessed and structured for use in this research. This preprocessed dataset, containing close to 100k financial articles, is another contribution, as the community can use it to build financial text summarisers. Because this dataset has not previously been used for summarisation, the model could not be compared directly with prior work; nevertheless it achieved a reasonable ROUGE score of 31.73, and randomly sampled outputs were read to assess summary quality. The model's performance was satisfactory, and several directions were identified that could improve its accuracy if the research is carried further. The final dataset used is CNN/DailyMail, chosen to test whether the model can match its Gigaword results when each article is longer than 500 words rather than the roughly 60 words typical of Gigaword. This experiment yielded a ROUGE-1 score of 35.99, which suggests that the model struggles with longer articles. This finding is another contribution, since it shows the direction in which the research should continue: the model is capable of better overall results, as it demonstrates by beating state-of-the-art models on the Gigaword dataset, provided this problem is tackled correctly. | en_US
dc.rights | This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner. | en_US |
dc.rights | Access is restricted to CityU users. | en_US |
dc.title | NLP for FinTech | en_US |
dc.contributor.department | Department of Computer Science | en_US |
dc.description.supervisor | Supervisor: Dr. Song, Linqi; First Reader: Dr. Wong, Ka Chun; Second Reader: Prof. Wang, Cong | en_US |
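The abstract above names two components that recur throughout the thesis: the pointer-generator mixture of generating and copying, and a combined maximum-likelihood / reinforcement-learning training objective. The sketch below illustrates both ideas under their standard formulations from the literature the abstract draws on (pointer-generator networks and self-critical mixed ML/RL training); the tensor names, shapes, and the mixing weight `gamma` are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch only: standard pointer-generator mixing and a mixed
# ML/RL objective, as described in the literature the abstract refers to.
# Names, shapes, and hyperparameters here are assumptions for clarity.
import torch
import torch.nn.functional as F

def pointer_generator_distribution(vocab_logits, attention, src_ids, p_gen):
    """Blend generating from the vocabulary with copying source tokens.

    vocab_logits: (batch, vocab_size) decoder scores over the vocabulary
    attention:    (batch, src_len)    attention weights over source positions
    src_ids:      (batch, src_len)    source token ids used to place copy mass
    p_gen:        (batch, 1)          probability of generating vs. copying
    """
    p_vocab = F.softmax(vocab_logits, dim=-1)           # generation distribution
    copy_dist = torch.zeros_like(p_vocab)
    copy_dist.scatter_add_(1, src_ids, attention)       # copy distribution
    return p_gen * p_vocab + (1.0 - p_gen) * copy_dist  # final word distribution

def mixed_ml_rl_loss(nll_loss, sample_logprob, sample_reward, greedy_reward, gamma=0.98):
    """Self-critical mixed objective: gamma * L_rl + (1 - gamma) * L_ml.

    The RL term increases the log-probability of sampled summaries whose
    ROUGE reward beats that of the greedy-decoded baseline summary.
    """
    rl_loss = (greedy_reward - sample_reward) * sample_logprob
    return gamma * rl_loss + (1.0 - gamma) * nll_loss

# Tiny usage example with dummy tensors.
batch, src_len, vocab_size = 2, 5, 11
vocab_logits = torch.randn(batch, vocab_size)
attention = F.softmax(torch.randn(batch, src_len), dim=-1)
src_ids = torch.randint(0, vocab_size, (batch, src_len))
p_gen = torch.sigmoid(torch.randn(batch, 1))
final_dist = pointer_generator_distribution(vocab_logits, attention, src_ids, p_gen)
assert torch.allclose(final_dist.sum(dim=-1), torch.ones(batch))
```

ROUGE-1 and ROUGE-L figures such as those quoted in the abstract are conventionally computed with standard tooling (for example the `rouge-score` Python package); the exact evaluation setup used in the thesis is not described in this record.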
Appears in Collections: | Computer Science - Undergraduate Final Year Projects |
Files in This Item:
File | Size | Format | |
---|---|---|---|
fulltext.html | 147 B | HTML | View/Open |
Items in Digital CityU Collections are protected by copyright, with all rights reserved, unless otherwise indicated.