City University of Hong Kong

Please use this identifier to cite or link to this item: http://hdl.handle.net/2031/6201

Title: On the labeling and indexing of graph structured data
Other Titles: Tu jie gou hua shu ju de biao ji yu suo yin
圖結構化數據的標記與索引
Authors: Cai, Jing (蔡晶)
Department: Department of Computer Science
Degree: Master of Philosophy
Issue Date: 2010
Publisher: City University of Hong Kong
Subjects: Database management.
Data structures (Computer science)
XML (Document markup language)
Indexing.
Notes: CityU Call Number: QA76.9.D3 C326 2010
x, 79 leaves : ill. 30 cm.
Thesis (M.Phil.)--City University of Hong Kong, 2010.
Includes bibliographical references (leaves 73-78)
Type: thesis
Abstract: Graph structured data are present and widely used in many applications because of their expressiveness in representing complex relationships among data objects. XML, the de facto standard for information representation and exchange over the Internet, is one typical use of graph structured data. Other fields, such as social networks, geographic navigation, bioinformatics and web ontologies, involve large amounts of graph structured data as well. Managing, analyzing and querying graph structured data is therefore of great importance. This thesis studies the dynamic XML labeling problem and the indexing of large graphs for reachability queries. First, we design a novel XML labeling scheme, referred to as OrdPathX, which supports both leaf and parent node insertions for dynamic XML. Dynamic XML labeling has been studied for years. However, almost all existing labeling schemes allow only leaf and sibling node insertions. Inspired by the caret-in technique of OrdPath, we propose a new labeling scheme which supports parent node insertions gracefully without relabeling. We introduce the labeling scheme and the associated algorithms for determining various inter-node relationships. Experimental results show that OrdPathX can handle parent node insertions efficiently. Moreover, with the recent widespread use of ID/IDREF tags for representing referenced information among the elements in an XML document, the directed graph is gradually replacing the tree as the more appropriate model. More general labeling/indexing schemes are needed to support queries on a directed graph. A query on such a directed graph is essentially a reachability test. Therefore, we are motivated to study the reachability problem for general graph data. Given two vertices, u and v, in a directed graph, a reachability query asks if there is a directed path from u to v. Over the last two decades, many indexing schemes have been proposed to support reachability queries on large graphs.
Typically, schemes based on chain or tree covers work well when the graph is sparse. For dense graphs, they still have fast query times but require large storage for their indices. In contrast, the 2-Hop cover and its variations/extensions produce compact indices even for dense graphs but have slower query times than the chain/tree cover schemes. We propose a new indexing scheme, called Path-Hop, which is even more space-efficient than the schemes based on 2-Hop cover and yet has query processing speed comparable to that of the chain/tree cover schemes. We conduct extensive experiments to demonstrate the effectiveness of our approach relative to other state-of-the-art methods.
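Illustrative note: the reachability query defined in the abstract can be sketched with two naive baselines. This is not the Path-Hop scheme, nor any index described in the thesis; the function names `is_reachable` and `transitive_closure` and the adjacency-dictionary graph format are assumptions made for this example. The first function answers a query by searching the graph with no index at all; the second precomputes every answer, which makes lookups fast but can need storage quadratic in the number of vertices. Reachability indexing schemes aim to sit between these two extremes.

```python
from collections import deque


def is_reachable(adj, u, v):
    """Return True if a directed path leads from u to v.

    `adj` maps each vertex to an iterable of its out-neighbours.
    By convention here, every vertex reaches itself.
    """
    if u == v:
        return True
    seen = {u}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == v:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def transitive_closure(adj):
    """Precompute, for every vertex, the set of vertices it can reach."""
    vertices = set(adj) | {w for targets in adj.values() for w in targets}
    return {s: {t for t in vertices if is_reachable(adj, s, t)} for s in vertices}


# A small hypothetical graph: d -> a -> b -> c
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}
closure = transitive_closure(graph)
print(is_reachable(graph, "d", "c"))  # query answered by online search
print("c" in closure["d"])            # same query answered from the precomputed closure
```

The online search uses no extra storage but may visit the whole graph on each query, while the closure turns a query into a set lookup at the cost of storing every reachable pair.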
Online Catalog Link: http://lib.cityu.edu.hk/record=b3947793
Appears in Collections: CS - Master of Philosophy

Files in This Item:

abstract.html (134 B, HTML)
fulltext.html (134 B, HTML)

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 
