|Title: ||Web museum|
|Authors: ||Lee, Hang|
|Department: ||Department of Computer Science|
|Issue Date: ||2006|
|Supervisor: ||Dr. Chan Edward. First Reader: Dr. Poon C K. Second Reader: Dr. Chun Andy H W|
|Abstract: ||As the Internet has developed rapidly, electronic information has become an
essential asset. It is, however, not easy to trace historical online information on the
World Wide Web. A Web archival system serves as an Internet library that keeps track of
all online information, so that researchers, historians, scholars and even the next
generation can access this cyber history.
The project aims to design a Web archival system that visits user-selected Web
sites periodically, determines whether anything has changed, retrieves the
modified version of the Web site and saves a copy on the local machine, and also
provides a user-friendly interface for searching and browsing the archived
Most existing Web archival systems, however, suffer from high storage and network
overhead. To reduce this overhead, the project develops a “change
detection mechanism” and a “Web object change interval estimation”.
The change detection mechanism is an algorithm that identifies any “meaningful” change
between the last archived version and the latest downloaded version. The algorithm
decides whether the latest downloaded version is worth saving.
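The abstract does not say how a “meaningful” change is defined, so the following is only a minimal sketch under stated assumptions: it treats a change as meaningful when the visible text of the page, after removing markup and whitespace differences, falls below a similarity threshold. The function names, the crude tag stripping, and the 0.95 threshold are hypothetical and are not taken from the project.

```python
import difflib
import re


def normalize(html: str) -> str:
    """Reduce a page to its visible text (assumed approach, not the project's)."""
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag removal, enough for a sketch
    return re.sub(r"\s+", " ", text).strip().lower()


def is_meaningful_change(archived: str, latest: str, threshold: float = 0.95) -> bool:
    """Return True when the latest download differs enough to be worth saving."""
    old_text, new_text = normalize(archived), normalize(latest)
    if old_text == new_text:
        return False  # only markup, case, or whitespace changed
    similarity = difflib.SequenceMatcher(None, old_text, new_text).ratio()
    return similarity < threshold  # hypothetical cut-off


# Markup-only edit: not worth storing a new copy.
print(is_meaningful_change("<p>Hello   world</p>", "<div>hello world</div>"))
# Different visible content: worth storing.
print(is_meaningful_change("<p>Opening hours: 9 to 5</p>",
                           "<p>The museum is closed for renovation</p>"))
```

Skipping versions whose visible text is unchanged is one way such a check could cut storage overhead; a real system would probably also need to ignore volatile fragments such as embedded timestamps or rotating advertisements.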
Web object change interval estimation is a scheduler that predicts the future modification
time of a Web page. By using the scheduler to organize Web archival tasks, the system
can allocate more resources to frequently changing Web sites and reduce the
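The estimation method itself is not described in this record. As a hedged illustration only, the sketch below assumes the simplest possible estimator: the mean gap between previously observed changes, added to the time of the last observed change. The function names and the one-day default interval are hypothetical.

```python
from typing import Sequence

ONE_DAY = 86400.0  # assumed default revisit interval, in seconds


def estimate_interval(change_times: Sequence[float], default: float = ONE_DAY) -> float:
    """Average the gaps between observed changes (timestamps in ascending order)."""
    if len(change_times) < 2:
        return default  # not enough history to estimate anything
    gaps = [later - earlier for earlier, later in zip(change_times, change_times[1:])]
    return sum(gaps) / len(gaps)


def next_visit(change_times: Sequence[float], default: float = ONE_DAY) -> float:
    """Project the next modification time as last change plus estimated interval."""
    last = change_times[-1] if change_times else 0.0
    return last + estimate_interval(change_times, default)


# A page observed to change at t=0, t=100 and t=300 seconds.
history = [0.0, 100.0, 300.0]
print(estimate_interval(history))
print(next_visit(history))
```

Under this assumption, pages with short estimated intervals are revisited more often, while rarely changing pages are visited less often, which is one way a scheduler could limit network traffic.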
|Appears in Collections:||Computer Science - Undergraduate Final Year Projects|
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.