Abstract:
The application of analytical software in the life sciences, especially to DNA and protein sequences, has been increasing. When aligning sequences, introducing gaps allows an alignment algorithm to match more terms than a gapless alignment would. Gap penalties are used to adjust alignment scores based on the number and length of gaps. Similarly, ascertaining the effect of increasing sequence length on alignment quality is important. This work focused on the development of a DNA sequence alignment application and on the analysis of the effect of increased sequence length and gaps on the resulting alignment. The Agile software development methodology was adopted, and the PHP (Hypertext Preprocessor) programming language was used. A dataset of 1000 alignments of varying sequence lengths (610-570 residues) was generated. The analysis showed that sequence length had a weaker effect on alignment; nevertheless, in the sequence-length analysis, T-Coffee achieved the highest average scores. MUSCLE, ProbCons and GLprobs-MSAprobs-MSprobs came second, third and fourth, respectively. The other MSA tools, such as MEGA, MAFFT, Clustal Omega and Kalign, achieved the lowest scores. Although the results show that multiple sequence alignment tools are freely available, GeneSuite would assist young and inexperienced bioinformaticians in choosing the right tools for sequence alignment analysis.
Keywords: Sequence length, Sequence gap, Alignment Algorithm, Gap Penalties
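
As an illustration of how gap penalties make an alignment score depend on both the number and the length of gaps, the following is a minimal sketch in Python. It is not the scoring scheme of the application described here or of any of the tools named above. The function name `score_alignment` and the parameter values (match +1, mismatch -1, gap opening -2, gap extension -1) are assumptions chosen only for this example, as is the convention that the first position of a gap costs the opening penalty and each further position costs the extension penalty.

```python
GAP = "-"

def score_alignment(row_a, row_b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Score two already-aligned rows under an affine gap penalty.

    All parameter values are illustrative assumptions. The first position
    of a gap costs gap_open; each further position costs gap_extend.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    prev_gap_row = None  # row ("a" or "b") that held a gap in the previous column
    for x, y in zip(row_a, row_b):
        if x == GAP and y == GAP:
            raise ValueError("a column cannot contain two gaps")
        if x == GAP or y == GAP:
            gap_row = "a" if x == GAP else "b"
            # Continuing a gap in the same row is cheaper than opening a new one.
            score += gap_extend if gap_row == prev_gap_row else gap_open
            prev_gap_row = gap_row
        else:
            score += match if x == y else mismatch
            prev_gap_row = None
    return score

# One gap of length 1, one gap of length 3, and two separate gaps of length 1.
single_gap = score_alignment("GATTACA", "GAT-ACA")
long_gap = score_alignment("GATTTTACA", "GAT---ACA")
two_gaps = score_alignment("GATTACAT", "GA-TAC-T")
```

Under these assumed values, a single gap of length three is penalised by -4 in total (one opening and two extensions), the same as two separate gaps of length one. This shows how the choice of opening and extension penalties trades off the number of gaps against their length.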