Vance suggested I start a thread on information. What you will find are four posts (all lengthy, I'm afraid) covering what I know of information theory (not much) and genetic information, followed by a critique of the Carl Wieland article "Variation, Information and the Created Kind" to which Liberty Wing referred us. If you are reading this before I have completed all four posts, please refrain from posting a reply until all four introductory posts are up. Then feel free to critique.
Part One
The latest attempt of YECists to refute evolution is the concept of "no new information" illustrated recently by the brief appearance of Liberty Wing on Vance's "Why I post" thread. Vance has suggested that it is time we had a thread on genetic information.
Typically, creationists never define "information" or provide any means of measuring it. Yet they claim that information decreases but never increases. Clearly the claim cannot be backed up without a measuring tool.
Scientists, on the other hand, have defined information and measured it. There are two recognized theories of information, and each treats information in a slightly different (though compatible) way.
One is the Kolmogorov-Chaitin (K-C) theory. It was developed from their work on computer systems. The problem K-C faced was that of a computer receiving information from another computer. How does the receiving computer know it has received the whole message and not just a part of it?
This breaks down into two questions. The first is: "What is the total amount of information in the message?" Since computer messages are transmitted in binary code, there is no problem with defining the unit of information. It is the "bit". Each bit is either a 1 or a 0. The total amount of information in a message is the total number of bits. And as you know, every time you create a file, your computer counts how many bits there are in the message.
The tougher question is: "How does the receiving computer know it is getting the right bits?" There is no point in having the right number of bits if the bits themselves are wrong, if one is receiving 1s when one ought to be receiving 0s.
Yet if each bit had to be described separately, the description of what is in the message would be as long as the message itself. The description of the content therefore needs to be compressed as far as possible. In the K-C system, the amount of information in a message is measured by how much computer code it takes to describe the content.
Repetitious information is easily described in a single line. If you have a string of 20 0s, it can be described as
"0 x 20"
If you double the number of 0s, the string can be described as
"0 x 40"
Both descriptions take the same amount of code. It is in this sense that duplication can be said to add "no new information".
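To make that concrete, here is a minimal Python sketch of the idea (my own illustration, not anything from Kolmogorov or Chaitin): a string is described by its runs, and doubling the zeros does not lengthen the description at all.

    def run_length_description(s):
        # Describe a string as (symbol, count) pairs, one pair per run of identical symbols.
        description = []
        for ch in s:
            if description and description[-1][0] == ch:
                description[-1] = (ch, description[-1][1] + 1)
            else:
                description.append((ch, 1))
        return description

    print(run_length_description("0" * 20))   # [('0', 20)] -- one description line
    print(run_length_description("0" * 40))   # [('0', 40)] -- still one description line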
Repetitious patterns can also be described with a few lines of code. If you have a string that consists of ten repetitions of 110, it can be described as
1. "1 x 2"
2. "0 x 1"
3. Repeat 1 & 2 10 times
Again, duplicating this pattern merely involves changing the actual number of repetitions. It takes no more code to describe 100 repetitions than it takes to describe 10.
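Again as a rough sketch of my own, the description of such a string is just the pattern plus a count, so describing 100 repetitions takes no more space than describing 10:

    def expand(pattern, repeats):
        # Reconstruct the full string from its short description.
        return pattern * repeats

    short_description = ("110", 10)    # describes a 30-bit string
    long_description = ("110", 100)    # describes a 300-bit string
    assert expand(*short_description) == "110" * 10
    assert expand(*long_description) == "110" * 100
    # Both descriptions are the same size: a three-character pattern and one number.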
One thing creationists seem not to notice: while it is trivially true that, if you measure information by how much code it takes to describe it, duplications add no new information, the reverse is also true. If you decrease the number of repetitions, it takes the same amount of code to describe the smaller number of bits as the larger number. So if you reduce the string from 20 repetitions to 10, there has been no loss of information either!
Finally, as K-C noted, repetition is compressible. It takes no more code to describe 500 repetitions of a bit or a repeated pattern than to describe 5. The reverse of this is that random strings are not as easily compressible. If a string reads something like:
1001110101100101
a line of code must be created for each run of identical digits, since every switch from 1 to 0 or back starts a new run.
This string would need 11 lines of code as compared to the one or three lines for the duplicated strings above. By this measure, a random string contains more information than a string of duplications or duplicated patterns. Randomness=MORE information.
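Applying the same sort of run-length description to that string (again, just my own illustration) shows the point:

    from itertools import groupby

    s = "1001110101100101"
    runs = [(symbol, len(list(group))) for symbol, group in groupby(s)]
    print(len(runs))   # 11 -- one description line per run
    print(runs)        # [('1', 1), ('0', 2), ('1', 3), ('0', 1), ('1', 1), ('0', 1),
                       #  ('1', 2), ('0', 2), ('1', 1), ('0', 1), ('1', 1)]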
Now, let's look at the Shannon theory. Shannon's theory was developed out of his work for Bell Labs studying ways of overcoming "noise" in a system of information transfer.
The problem with transmitting a message is that the system gets between the sender and the receiver and adds "noise" to the message. Sometimes this is quite obvious to the people at either end of the transmission, as static. But even a very quiet system still has a discernible level of noise. Noise damages, destroys, blocks or garbles the message so that what is received is not precisely what was sent.
In Shannon theory "randomness" describes the level of information-destroying noise in the system. So in the Shannon system Randomness=LESS information.
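As a rough illustration of my own (a simple binary symmetric channel standing in for Shannon's noisy channel), each transmitted bit is flipped with some small probability, and what comes out is no longer what went in:

    import random

    def transmit(message, flip_probability=0.1):
        # Simulate a noisy channel: each bit is flipped with the given probability.
        return "".join(
            ("1" if bit == "0" else "0") if random.random() < flip_probability else bit
            for bit in message
        )

    random.seed(0)                      # fixed seed so the illustration is reproducible
    sent = "110" * 10
    received = transmit(sent)
    errors = sum(a != b for a, b in zip(sent, received))
    print(sent)
    print(received)
    print(errors, "of", len(sent), "bits corrupted by noise")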
Still a third way to measure information (though I hesitate to call it a scientific measure) is by what Dembski calls its "specified complexity". For example, take the following letters of the English alphabet: A, C, E, N, O. In addition to this alphabetical arrangement, one can create 119 other orders if repeats of the same letter are not allowed (5^5 = 3,125 arrangements if repeats are allowed). But most of these are meaningless strings of letters. Only an arrangement such as CANOE or OCEAN is specified as having a meaning, as conveying useful information.
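A quick Python check of the arithmetic (the tiny "meaningful" word list here is just my own assumption for the example):

    from itertools import permutations

    letters = "ACENO"
    meaningful = {"CANOE", "OCEAN"}           # toy lexicon, assumed for the example

    arrangements = {"".join(p) for p in permutations(letters)}
    print(len(arrangements))                  # 120 orders with no repeated letters
    print(5 ** 5)                             # 3125 strings if repeats are allowed
    print(sorted(arrangements & meaningful))  # ['CANOE', 'OCEAN'] -- the "specified" ones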
Note that K-C and Shannon each handle this question of specificity differently. In K-C the question of meaning does not arise: ACENO, NOECA, OCEAN and CANOE all contain the same amount of information. In Shannon theory, if the original message is OCEAN, then all other arrangements, including CANOE, are equally random and represent a loss of information.