"Multiple Sequence Alignment Using External Sources Of Information"
by Layal Yasin
Date of Examination: 2016-01-28
Date of issue: 2016-03-29
Advisor: Prof. Dr. Burkhard Morgenstern
Referee: Prof. Dr. Carsten Damm
Referee: Prof. Dr. Kifah Tout
A multiple sequence alignment is an alignment of three or more protein or nucleic acid sequences. Sequence alignment has long attracted the interest of researchers because many scientific studies depend on alignments in their workflow, so high alignment quality is essential. Much work has been done, and is still being carried out, to improve the quality of alignments. Many approaches have been developed for pairwise and multiple sequence alignment, yet most of them rely on the sequences to be aligned as their only input. Recently, some approaches have begun to incorporate additional sources of information into the alignment process; this external data can come from user knowledge or from online databases. When integrated into the workflow of an alignment program, such data may add new constraints to the produced alignment and improve its quality by making it biologically more meaningful. In this thesis, I introduce new approaches to multiple sequence alignment that use the alignment software DIALIGN together with external information from databases: useful information is extracted from the databases and then integrated into the alignment process. By testing these approaches on benchmark databases, I show that using additional data during alignment produces better results than using DIALIGN alone, with no input other than the sequences to be aligned.
Keywords: PFAM; Alignment; PROSITE; DIALIGN; Domains; Patterns; Profiles; Multiple sequence alignment