Please use this identifier to cite or link to this item:
http://hdl.handle.net/2031/6201
| Title: | On the labeling and indexing of graph structured data |
| Other Titles: | Tu jie gou hua shu ju de biao ji yu suo yin 圖結構化數據的標記與索引 |
| Authors: | Cai, Jing (蔡晶) |
| Department: | Department of Computer Science |
| Degree: | Master of Philosophy |
| Issue Date: | 2010 |
| Publisher: | City University of Hong Kong |
| Subjects: | Database management; Data structures (Computer science); XML (Document markup language); Indexing |
| Notes: | CityU Call Number: QA76.9.D3 C326 2010. x, 79 leaves : ill. 30 cm. Thesis (M.Phil.)--City University of Hong Kong, 2010. Includes bibliographical references (leaves 73-78). |
| Type: | thesis |
| Abstract: | Graph structured data are widely used in many applications
because of their expressiveness in representing complex relationships among data
objects. XML, the de facto standard for information representation and
exchange over the Internet, is one typical use of graph structured data. Other
disciplines, such as social networks, geographic navigation, bioinformatics,
and web ontologies, also involve large amounts of graph structured data.
Managing, analyzing, and querying graph structured data is therefore of great
importance.
This thesis studies the dynamic XML labeling problem and the indexing
of large graphs to support reachability queries.
First, we design a novel XML labeling scheme, referred to as OrdPathX,
which supports both leaf and parent node insertions for dynamic XML. Dynamic
XML labeling has been studied for years. However, almost all existing
labeling schemes allow only leaf and sibling node insertions. Inspired by
the caret-in technique of OrdPath, we propose a new labeling scheme which
supports parent node insertions gracefully without relabeling. We introduce
the labeling scheme and the associated algorithms for determining inter-node
relationships. Experimental results show that OrdPathX can
handle parent node insertions efficiently.
Moreover, with the recent widespread use of ID/IDREF attributes for representing references among the elements of an XML document, the directed graph is gradually replacing the tree as the more appropriate model.
More general labeling/indexing schemes are needed to support queries on a
directed graph. A query on such a directed graph is essentially a reachability
test.
Therefore, we are motivated to study the reachability problem for general
graph data. Given two vertices, u and v, in a directed graph, a reachability
query asks if there is a directed path from u to v. Over the last two decades,
many indexing schemes have been proposed to support reachability queries
on large graphs. Typically, schemes based on chain or tree covers work
well when the graph is sparse. For dense graphs, they still have fast query
time but require large storage for their indices. In contrast, the 2-Hop cover
and its variations/extensions produce compact indices even for dense graphs
but have slower query times than chain/tree cover schemes. We propose a new
indexing scheme, called Path-Hop, which is even more space-efficient than
schemes based on the 2-Hop cover and yet has query processing speed
comparable to that of chain/tree cover schemes. We conduct extensive experiments
to demonstrate the effectiveness of our approach relative to other state-of-the-art methods. |
| Online Catalog Link: | http://lib.cityu.edu.hk/record=b3947793 |
| Appears in Collections: | CS - Master of Philosophy |
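
The reachability query defined in the abstract (is there a directed path from u to v?) can be illustrated with a minimal sketch. This is not the Path-Hop scheme or any index from the thesis; it is only the index-free baseline, a breadth-first search over an adjacency list, which indexing schemes of this kind aim to outperform on large graphs. The function name `reachable`, the dictionary-based graph representation, and the toy graph are assumptions made for illustration.

```python
from collections import deque


def reachable(graph, u, v):
    """Return True if a directed path leads from u to v.

    `graph` maps each vertex to an iterable of its out-neighbours
    (an assumed representation). A vertex is treated as reaching itself.
    """
    seen = {u}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        if node == v:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical toy graph: a -> b -> c and a -> d
toy_graph = {"a": ["b", "d"], "b": ["c"], "c": [], "d": []}

print(reachable(toy_graph, "a", "c"))  # True: a -> b -> c
print(reachable(toy_graph, "c", "a"))  # False: c has no outgoing edges
```

Each such query may visit every vertex and edge in the worst case, which is why precomputed indices are attractive when many queries are issued against the same large graph.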
Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.