City University of Hong Kong


Title: Identification of linked regions and reconstruction of tandem repeats duplication history
Other Titles: Ji yin lian suo qu yu shi bie ji chong jian chuan lian chong fu xu lie de fu zhi li shi
Authors: Wang, Zhanyong (王占永)
Department: Department of Computer Science
Degree: Master of Philosophy
Issue Date: 2008
Publisher: City University of Hong Kong
Subjects: Computer algorithms.
Genetic recombination.
Notes: CityU Call Number: QA76.9.A43 W36 2008
ix, 86 leaves : ill. 30 cm.
Thesis (M.Phil.)--City University of Hong Kong, 2008.
Includes bibliographical references (leaves 83-85)
Type: thesis
Abstract: In this thesis, we study two important problems in computational biology and bioinformatics: identification of linked regions and reconstruction of tandem repeats duplication history. With knowledge of the large number of SNPs in the human genome and the fast development of high-throughput genotyping technologies, identification of linked regions in linkage analysis through allele sharing status determination will play an increasingly important role, while consideration of recombination fractions becomes unnecessary. In Chapter 2, we develop a rule-based program that identifies linked regions for underlying diseases using allele sharing information among family members. Our program uses high-density SNP genotype data and works in the face of genotyping errors. It works on nuclear family structures with two or more siblings. The program graphically displays allele sharing status for all members in a pedigree and identifies regions that are potentially linked to the underlying diseases according to user-specified inheritance mode and penetrance. Extensive simulations based on the Chi-square model for recombination show that our program identifies linked regions with high sensitivity and accuracy. Graphical display of allele sharing status helps to detect misspecification of inheritance mode and penetrance, as well as mislabeling or misdiagnosis. Allele sharing determination may represent the future direction of linkage analysis due to its better adaptation to high-density SNP genotyping data.
The genomes of many species are dominated by short segments repeated consecutively. It is estimated that over 10% of the human genome consists of repeated segments. About 10-25% of all known proteins have some form of repeated structure. Computing the duplication history of a tandem repeated region is an important problem in computational biology [7, 25, 12].
In Chapter 3, we design a polynomial-time approximation scheme (PTAS) for the case where the size of the duplication block is 1. Our PTAS is faster than the existing PTAS [12]. For example, to achieve a ratio of 1.5, our PTAS takes O(n^5) time, while the previous PTAS in [12] takes O(n^11) time. We also design a ratio-6 polynomial-time approximation algorithm for the case where the size of each duplication block is at most 2. This is the first polynomial-time approximation algorithm with a guaranteed ratio for this case.
Appears in Collections: CS - Master of Philosophy

Files in This Item:

File            Size    Format
abstract.html   134 B   HTML
fulltext.html   134 B   HTML

Items in CityU IR are protected by copyright, with all rights reserved, unless otherwise indicated.

