I noticed you added a base and not replaced a base.
Yes, it's a mutation known as an insertion. You can read about the various types of mutations here, or, if you want to read studies detailing this phenomenon, there are countless. Insertions and deletions are common types of mutations, and because of them you can't compare sequences by simply putting one over the other and counting the differences. This is one of the reasons why we have to insert gaps, or "chop up" the sequences, as you initially phrased it.
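To make that concrete, here is a minimal Python sketch. The two sequences are toy examples I made up for illustration, not real data, and the edit-distance function is a plain Levenshtein calculation standing in for what real alignment software does far more carefully. It shows how one inserted base shifts everything after it, so a naive overlay reports many differences where only a single mutation occurred.

```python
def naive_mismatches(a, b):
    """Overlay the sequences position by position and count differences.

    Any leftover bases (length difference) are counted as differences too.
    """
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


def edit_distance(a, b):
    """Minimum number of substitutions, insertions and deletions
    needed to turn a into b (Levenshtein distance, i.e. gaps allowed)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion from a
                curr[j - 1] + 1,         # insertion into a
                prev[j - 1] + (x != y),  # match or substitution
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy sequences: 'mutated' is 'original' with one extra T
# inserted after the fourth base.
original = "ACGTACGTAC"
mutated = "ACGTTACGTAC"

print(naive_mismatches(original, mutated))  # 7
print(edit_distance(original, mutated))     # 1
```

The naive overlay counts 7 differences, while the gap-aware comparison finds the single insertion.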
Interestingly enough in your example there is an important correction mechanism in the cell’s self repair mechanism (it is length sensitive); this inserted base is easily removed.
Yes, there are repair mechanisms, but they aren't flawless, and once in a while a mutation gets through, insertions and deletions included. You have about 120 mutations unique to you that weren't present in your parents' genomes. Most of these are neutral of course, but they are still present in the genome.
Just for the sake of argument you added enough sequences to bump the coding insertion point. Do you discard the inserted sequence because it does not align? Or does it become part of the difference data?
As you can see from the numbers in my previous post, I counted the insertion, since the sequences are obviously different after the mutation (97%). But of course the counting method depends on what a study sets out to measure. These numbers are often oversimplified in the popular press, so it's usually best to read the study in question to find out exactly what was measured.
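As a rough illustration of how much the counting method matters, here is a small Python sketch. The aligned pair is a made-up toy example (it is not the sequences behind the 97% figure), and the `count_gaps` switch is just a name I chose for the two conventions: either a gap column counts as a difference, or gap columns are left out of the comparison entirely.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=True):
    """Percent identity of two already-aligned sequences ('-' marks a gap).

    count_gaps=True:  gap columns count as differences.
    count_gaps=False: gap columns are skipped altogether.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a, aligned_b):
        has_gap = x == "-" or y == "-"
        if has_gap and not count_gaps:
            continue  # ignore this column entirely
        columns += 1
        if x == y and not has_gap:
            matches += 1
    return 100.0 * matches / columns


# Hypothetical toy alignment: one inserted base, shown as a gap.
seq_a = "ACGT-ACGTAC"
seq_b = "ACGTTACGTAC"

print(round(percent_identity(seq_a, seq_b, count_gaps=True), 1))   # 90.9
print(round(percent_identity(seq_a, seq_b, count_gaps=False), 1))  # 100.0
```

Same alignment, two different answers, which is why you need to know what a given study actually counted.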
You might take note that long stretches of meaningful bps have never been inserted by chance (that is fact).
Are you omniscient? If not, try not to make statements that require omniscience. In any case, your statement is wrong. The field of genetic engineering is built around the fact that DNA can be more or less randomly absorbed from the environment (as naked DNA strands, or via viruses or other organisms) and be more or less randomly inserted into the genomes of various life forms. There are other ways for long stretches of DNA to be inserted into genomes: retroviruses and transposons have been mentioned, and unequal crossovers during recombination can result in duplications of long sequences.
I would say 70% is an optimistic appraisal.
But you don't know why the researchers chose to "only" align 75.8% of the genomes. As pointed out earlier, there can be reasons why the other 24.2% weren't aligned. It's likely partly because of the huge amount of work involved (6 billion bases are a lot to sit and align manually), so they focused on the areas that were most easily aligned via software, for instance by leaving out highly repetitive sequences. Also, as mentioned earlier, the study had to omit a lot of sequences because they hadn't been adequately sequenced to be of use. Assuming that the 24.2% not included in the analysis has nothing in common between the two species is a huge leap of faith on your part. You have to actually read the study and try to understand what they did, why they did it and what they found, instead of simply quote-mining a snippet you found on Wikipedia.
This is the paper linked by wiki. Read it.
Lesson for today: Just because you read something on the internet that you like to read, doesn't make it a fact.
Peter