City University of Hong Kong

CityU Institutional Repository >
3_CityU Electronic Theses and Dissertations >
ETD - Dept. of Computer Science  >
CS - Master of Philosophy  >

Please use this identifier to cite or link to this item:

Title: 3D protein substructure identification and nearest neighbor search
Other Titles: 3D dan bai zhi zi jie gou de jian ding he zui jin lin cha zhao
3D 蛋白質子結構的鑒定和最近鄰查找
Authors: Yang, Yong (楊勇)
Department: Department of Computer Science
Degree: Master of Philosophy
Issue Date: 2010
Publisher: City University of Hong Kong
Subjects: Proteins -- Analysis -- Data processing.
Notes: CityU Call Number: QP551 .Y36 2010
xi, 56 leaves : ill. 30 cm.
Thesis (M.Phil.)--City University of Hong Kong, 2010.
Includes bibliographical references (leaves 53-56)
Type: thesis
Abstract: Identifying the location of binding sites on proteins is of fundamental importance for a wide range of applications including molecular docking, de novo drug design, structure identification and comparison of functional sites. Structural genomics projects are beginning to produce protein structures with unknown functions. Therefore, efficient methods are required if all these structures are to be properly annotated. Identifying the interface between two interacting proteins provides important clues to the function of a protein and can reduce the search space required by docking algorithms to predict the structures of complexes. Here we develop a software package for identifying similar three-dimensional (3D) protein substructures via direct comparison of 3D structures. We develop an efficient heuristic algorithm for finding protein 3D substructures in a 3D protein structure that are similar to a given 3D protein substructure. This algorithm can also be used for searching a database of 3D protein structures. We also design an algorithm for finding similar 3D substructures from two given 3D protein structures. Our approach is to directly compare the substructures by computing a rigid transformation such that the distance between a pair of corresponding points in the matched substructures is bounded by a given value d. We propose an efficient local search heuristic approach for finding an approximate rigid transformation. We implement the algorithms and produce a software package that works well in practice. Experiments show that our software package can find pairs of similar 3D substructures in reasonable time. We also provide graphical interface to allow users to view the pairs of 3D protein substructures from different angles. Finding the closest objective for a query in a database is a classical problem in computer science. For some modern biological applications, computing the similarity between two objects might be very time consuming. For example, it takes a long time to compute the edit distance between two whole chromosomes and the alignment cost of two 3D protein structures. So we study the nearest neighbor search problem in metric space, where the pair-wise distance between two objects in the database is known and we want to minimize the number of distances computed on-line between the query and objects in the database in order to find the closest object. We have designed two randomized approaches for indexing metric space databases, where objects are purely described by their distances with each other. Analysis and experiments show that our approaches need to compute O(log n) objects in order to find the closest object, where n is the number of objects in the database.
Online Catalog Link:
Appears in Collections:CS - Master of Philosophy

Files in This Item:

File Description SizeFormat
abstract.html134 BHTMLView/Open
fulltext.html134 BHTMLView/Open

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.


Valid XHTML 1.0!
DSpace Software © 2013 CityU Library - Send feedback to Library Systems
Privacy Policy · Copyright · Disclaimer