Good article on comparative genomics (aka why common descent is a useful science)

Pete Harcoff

Comparative Genomics Background

Bit of an overview of genome analysis and why comparative genomics is important:

Unfortunately, although great advances have been made in our ability to annotate gene sequences within the human genome, the annotation of cis-regulatory elements that control gene expression is lagging far behind. This is predominantly because we know so little about their language, mode of action and origin, and traditional ways of identifying them are difficult and laborious. Indeed, protein-coding sequences make up just 1-2% of the genome, with around half of the rest made up of repetitive elements (Lander et al. 2001). This leaves almost half of the genome with little or no ascribed function, with at least a proportion likely to harbour the complex networks of sequences required to control correct gene expression. Furthermore, a lack of knowledge on the location of such sequences has meant that few have been linked to disease susceptibility, despite the likely role in many common human disorders. Identification and characterisation of cis-regulatory regions within the non-coding sections of the human genome therefore remains one of the greatest challenges for the post-genomic era.

Access to the sequences of several vertebrate genomes has presented an unprecedented opportunity for the discovery of functional elements in the human genome through comparative genomics. The discovery of putative functional elements through comparison of sequences from several species, known as phylogenetic footprinting, is based on the assumption that these elements evolve more slowly than surrounding non-functional DNA as they are under negative (purifying) selection. Thus, sequences that are more highly conserved than would be expected under a reasonable model of neutral evolution are likely to be important for function. One of the key decisions inherent in phylogenetic footprinting is the choice of organisms with which the comparison will be made. The type of functional element we hope to identify and the ability to distinguish between sequences conserved by purifying selection and those conserved incidentally by slow neutral divergence depend largely on how species are related on an evolutionary timescale.

Here are some specifics on mouse/human comparative genomics and why an evolutionary approach yields better results:

Indeed, although approximately 40% of the human and mouse genomes are alignable, only ~5% is estimated to be under evolutionary constraint (Waterston et al. 2002). Consequently, specific conservation criteria were proposed to distinguish functional conservation from background. An arbitrary criterion of 70% identity over at least 100bp of ungapped alignment (which is above the average rate of neutral conservation) between human and mouse sequences has been used to successfully identify a number of regulatory elements (Gottgens et al. 2000; Loots et al. 2000). Using this criterion across whole-genome human-mouse alignments identified ~327,000 conserved elements, making up around 1% of the human genome, which were located in non-coding regions and had little or no evidence of transcription (Dermitzakis et al. 2003, Dermitzakis et al. 2005). These sequences appear to be distributed uniformly across the genome and are negatively correlated with the distribution of genes, suggesting roles which are distance independent or that are not directly involved in gene expression.
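Not from the article, but to make the "70% identity over at least 100bp of ungapped alignment" criterion concrete, here is a rough Python sketch of how such a scan might look. The function name `conserved_windows` and its parameters are my own invention, and it assumes the two sequences are already aligned column by column with `-` marking gaps. It is an illustration of the idea, not the pipeline the cited papers actually used.

```python
def conserved_windows(human, mouse, window=100, min_identity=0.70):
    """Scan a pairwise alignment for ungapped windows that meet the
    identity threshold. Returns merged (start, end) spans in alignment
    columns, with end exclusive. Hypothetical helper for illustration."""
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have the same length")

    spans = []
    match_flags = []  # 1 where the two bases agree, 0 otherwise
    run = 0           # length of the current gap-free stretch

    for i, (a, b) in enumerate(zip(human.upper(), mouse.upper())):
        if a == "-" or b == "-":
            # A gap in either sequence breaks the ungapped stretch.
            run = 0
            match_flags.append(0)
            continue

        run += 1
        match_flags.append(1 if a == b else 0)

        if run >= window:
            start = i - window + 1
            identity = sum(match_flags[start:i + 1]) / window
            if identity >= min_identity:
                # Merge with the previous span if they overlap or touch.
                if spans and start <= spans[-1][1]:
                    spans[-1] = (spans[-1][0], i + 1)
                else:
                    spans.append((start, i + 1))
    return spans
```

Calling `conserved_windows(human_aln, mouse_aln)` on two aligned strings gives back the stretches that pass the cutoff. Note that the cutoff itself knows nothing about local mutation rates, which is exactly the weakness the multi-species methods in the next paragraph try to address.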

An alternative approach for the identification of non-coding constrained elements was proposed by Margulies et al. (2003) who devised two strategies based on parsimony and binomial-based models applied to multi-species alignments. Unlike simple percent-identity based approaches, these models take into account the derived local neutral mutation rate as well as the divergence times between sequences based on a phylogenetic tree. Combining these approaches, on a 1.7Mb region around the CFTR locus, they were able to successfully distinguish between neutrally evolving sequence such as known ancestral repeats and constrained elements such as exons, and identified a large number of conserved elements, ~70% of which were located in non-coding regions. Many more constrained sequences were identified using this multiple-alignment approach than could be identified using human-mouse pairwise alignments alone, demonstrating the power of multi-species alignments. In addition, a number of other computational approaches have recently been described that identify regions under evolutionary constraint. These include the use of a statistical model of sequence evolution (implemented as a phylogenetic hidden Markov model) (Siepel et al. 2005) and the use of a neutral indel model (Lunter et al. 2006), which instead of using patterns of nucleotide substitution, infers selection through the presence or absence of insertions and deletions.
(emphasis all mine)
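For anyone wondering what a "binomial-based" approach means in practice, here is a toy sketch of the general idea as I understand it: ask how surprising the observed number of identical columns in a window would be if the window were evolving at the local neutral rate. The function name and the parameter `p_neutral` (the per-column chance of identity under neutral evolution, which one would estimate from something like ancestral repeats) are my own assumptions. This is a deliberate simplification and not the actual method of Margulies et al. (2003), which works on multi-species alignments and a phylogenetic tree.

```python
from math import comb


def conservation_p_value(identical, columns, p_neutral):
    """Probability of seeing at least `identical` matching columns out of
    `columns` if each column matches independently with probability
    `p_neutral`. Small values suggest constraint. Hypothetical helper."""
    if not 0 <= identical <= columns:
        raise ValueError("identical must be between 0 and columns")
    if not 0.0 <= p_neutral <= 1.0:
        raise ValueError("p_neutral must be a probability")

    # Upper tail of a Binomial(columns, p_neutral) distribution.
    return sum(
        comb(columns, k) * p_neutral**k * (1 - p_neutral) ** (columns - k)
        for k in range(identical, columns + 1)
    )
```

So a 100-column window with 90 identities against a neutral expectation of 60% identity gets a tiny p-value, while the same 90 identities in a slowly diverging region with a 90% neutral expectation would not stand out at all. That is the sense in which these models "take into account the derived local neutral mutation rate" instead of using one fixed percent-identity cutoff.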

So despite all the creationist protests over evolution and common descent, it's a useful science and is here to stay.