Some Fun Science Trivia

Chesterton

Whats So Funny bout Peace Love and Understanding
Site Supporter
May 24, 2008
25,970
21,362
Flatland
✟1,039,116.00
Faith
Eastern Orthodox
Marital Status
Single
This is insane. At first I thought it was a hoax, but it's not. There's this protein called "titin". Nice enough word. Two syllables. Well within my pay grade. But the official name of titin is a single word comprised of 189,819 letters. A couple of websites say it can take 3 hours to say this single word. Here's a link to a PDF containing the word. It's 42 pages long.

https://cw39.com/wp-content/uploads/sites/10/2020/09/longest-word.pdf

I learned there's this thing called the International Union of Pure and Applied Chemistry, which has a recommended method for naming, and according to Wiki, "Ideally, every possible organic compound should have a name from which an unambiguous structural formula can be created." They say titin is the largest known protein, and thus contains LOTS of parts, and so the official name should contain all the parts. However this might not qualify as the longest word in the world because Wiki also says "...lexicographers regard generic names of chemical compounds as verbal formulae rather than English words."

So I get it. It makes sense, if there're rules or conventions for naming things they should be followed. But for practical purposes, do you think any geneticist or chemist has ever had to use or even look at this word, like to remind themselves what titin is composed of? Seems like a long road to nowhere, lol.
 

The IbanezerScrooge

I can't believe what I'm hearing...
Sep 1, 2015
3,420
5,804
51
Florida
✟306,691.00
Country
United States
Gender
Male
Faith
Atheist
Marital Status
Private
Politics
US-Democrat
Someone feed that into Dragon speech and record it. It's glorious!

To answer your question, probably not. They have other chemical notations that can be used to represent the structure and contents that are more useful. And they have the word titin. ;)
 

jayem

Naturalist
Jun 24, 2003
15,423
7,157
73
St. Louis, MO.
✟414,591.00
Country
United States
Gender
Male
Faith
Atheist
Marital Status
Married
In medicine especially, there are shortened names for some illnesses and longer names for specific physiological phenomena.
Example: We all know anorexia nervosa is an eating disorder that can be fatal without treatment. But we don't need to use the term neurotrophasthenia when referring to nervous system damage caused by poor nutrition. In any case, I'll leave vocabulary matters to Shakespeare:

"What's in a name? That which we call a rose, by any other name, would smell as sweet."
Romeo and Juliet, Act 2
 

Hans Blaster

One nation indivisible
Mar 11, 2017
20,825
15,749
55
USA
✟397,046.00
Country
United States
Gender
Male
Faith
Atheist
Marital Status
Private
IUPAC has been doing this for years. For example, the organic pesticide DDT

DDT - Wikipedia

abbreviates dichlorodiphenyltrichloroethane (literally just 2-2-3, counting the components), but the proper IUPAC name is:

1,1′-(2,2,2-Trichloroethane-1,1-diyl)bis(4-chlorobenzene)

which allows the cognoscenti to draw the structure properly and distinguish it from other compounds with the stoichiometry C14H9Cl5.

Even a simple base pair has a long name. (This is a significant reason why I was not tempted to try organic chemistry.)
 

Chesterton

For those of you who woke up this morning asking yourself "How can I waste three and a half hours of my life today?" No worries, I got you covered. Almost as entertaining as Lou Reed's Metal Machine Music.

 

AV1611VET

SCIENCE CAN TAKE A HIKE
Site Supporter
Jun 18, 2006
3,854,978
52,378
Guam
✟5,105,161.00
Country
United States
Gender
Male
Faith
Baptist
Marital Status
Married
Politics
US-Republican
For those of you who woke up this morning asking yourself "How can I waste three and a half hours of my life today?" No worries, I got you covered. Almost as entertaining as Lou Reed's Metal Machine Music.

^_^

The comments are funny too!

On death row:
"Any last words?"
"Yeah, just one..."

When your password is good but it still says "Weak"
 

sjastro

Newbie
May 14, 2014
5,691
4,628
✟333,661.00
Faith
Christian
Marital Status
Single
This looks like an AI challenge: derive the (approximate) empirical formula for titin. It appears we dumb humans haven't even accomplished this, as the AI could not find anything in the literature.
I gave the task to GPT-4o, but what ultimately defeated it was a lack of bandwidth. It might take a few days for it to get the answer, but then there is no human around yet to verify the answer.

In an unrelated topic, GPT-4o asked me if I was familiar with Python programming, to which I replied "sort of"; now its responses to me are a combination of English and Python code.

This is what GPT-4o has come up with so far...

Tintin.png
 

sesquiterpene

Well-Known Member
Sep 14, 2018
745
618
USA
✟189,619.00
Country
United States
Faith
Agnostic
Marital Status
Private
So I get it. It makes sense, if there're rules or conventions for naming things they should be followed. But for practical purposes, do you think any geneticist or chemist has ever had to use or even look at this word, like to remind themselves what titin is composed of? Seems like a long road to nowhere, lol.
Not really, this is just a bit of nerd humor. As others have pointed out there are abbreviations in common use; one set where each amino acid is denoted by a three letter code, and another one with a one letter code. There are chemistry programs to convert one set of abbreviations into another, and into IUPAC names, which was probably how the long "word" was generated.
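The conversion between the two abbreviation sets and the long concatenated name can be sketched in a few lines of Python. This is only an illustration, not how any particular chemistry program works: the table, the function names, and the four-residue example are all assumptions made for the sketch, and it ignores everything beyond plain concatenation of residue names.

```python
# One-letter code -> (three-letter code, name used inside the chain, name of the free amino acid).
# Hypothetical helper table for illustration; real chemistry software handles far more cases.
AMINO_ACIDS = {
    "A": ("Ala", "alanyl", "alanine"),
    "R": ("Arg", "arginyl", "arginine"),
    "N": ("Asn", "asparaginyl", "asparagine"),
    "D": ("Asp", "aspartyl", "aspartic acid"),
    "C": ("Cys", "cysteinyl", "cysteine"),
    "E": ("Glu", "glutamyl", "glutamic acid"),
    "Q": ("Gln", "glutaminyl", "glutamine"),
    "G": ("Gly", "glycyl", "glycine"),
    "H": ("His", "histidyl", "histidine"),
    "I": ("Ile", "isoleucyl", "isoleucine"),
    "L": ("Leu", "leucyl", "leucine"),
    "K": ("Lys", "lysyl", "lysine"),
    "M": ("Met", "methionyl", "methionine"),
    "F": ("Phe", "phenylalanyl", "phenylalanine"),
    "P": ("Pro", "prolyl", "proline"),
    "S": ("Ser", "seryl", "serine"),
    "T": ("Thr", "threonyl", "threonine"),
    "W": ("Trp", "tryptophyl", "tryptophan"),
    "Y": ("Tyr", "tyrosyl", "tyrosine"),
    "V": ("Val", "valyl", "valine"),
}

# Reverse lookup: three-letter code -> one-letter code.
THREE_TO_ONE = {three: one for one, (three, _, _) in AMINO_ACIDS.items()}


def three_to_one(seq3: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[code] for code in seq3.split("-"))


def one_to_three(seq1: str) -> str:
    """Convert a one-letter sequence to hyphen-separated three-letter codes."""
    return "-".join(AMINO_ACIDS[code][0] for code in seq1)


def long_name(seq1: str) -> str:
    """Join the '-yl' names of every residue except the last, which keeps its full name."""
    *chain, last = seq1
    return "".join(AMINO_ACIDS[code][1] for code in chain) + AMINO_ACIDS[last][2]


if __name__ == "__main__":
    print(three_to_one("Met-Thr-Thr-Gln"))  # MTTQ
    print(long_name("MTTQ"))                # methionylthreonylthreonylglutamine
```

Each residue adds roughly 5 to 12 letters to the name, which is why a protein with tens of thousands of residues ends up with a six-digit letter count.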

It's also by no means the longest "word" used in biochemistry. Counting only exons, the DNA that codes for the titin protein has three times as many bases as the protein has amino acids, since each amino acid is encoded by three bases.

And that's just the tip of the iceberg - each of your chromosomes consists of two very long molecules bound together. The largest human chromosome (#1) is 247,199,719 base pairs long, and with a sugar and a phosphate attached to each base. It might well take ten syllables for each unit, leading to ~2.5 billion syllables for the chromosome 1 "word". You aren't going to get through repeating that in an afternoon.

In practice, each base pair is abbreviated to a single letter. But you can find some very long abbreviations in genomic databases, and you need every single letter.
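The arithmetic above is easy to check. The sketch below uses the chromosome 1 length and the ten-syllables-per-unit guess from the post; the speaking rate of five syllables per second is purely an assumption added here, as are the function names.

```python
CHR1_BASE_PAIRS = 247_199_719   # figure quoted in the post
SYLLABLES_PER_UNIT = 10         # the post's rough guess per base-sugar-phosphate unit
SYLLABLES_PER_SECOND = 5        # assumption, not from the thread

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def coding_bases(n_amino_acids: int) -> int:
    """Three bases encode one amino acid (exons only, stop codon ignored)."""
    return 3 * n_amino_acids


def spoken_years(n_syllables: int, rate: float = SYLLABLES_PER_SECOND) -> float:
    """Years of non-stop talking needed to pronounce n_syllables at the given rate."""
    return n_syllables / rate / SECONDS_PER_YEAR


chr1_syllables = CHR1_BASE_PAIRS * SYLLABLES_PER_UNIT

if __name__ == "__main__":
    print(f"{chr1_syllables:,} syllables")  # 2,471,997,190 syllables
    print(f"{spoken_years(chr1_syllables):.1f} years of continuous speech")
```

Under that assumed speaking rate the chromosome 1 "word" takes somewhere between 15 and 16 years to say without pausing, so "not in an afternoon" is an understatement.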
 

sesquiterpene

This looks like an AI challenge: derive the (approximate) empirical formula for titin. It appears we dumb humans haven't even accomplished this, as the AI could not find anything in the literature.
Doesn't the AI consult Wikipedia? It gives the formula as C169,719H270,466N45,688O52,238S911. (not sure how to get the correct subscripts here).
Titin - Wikipedia
 

sjastro

Doesn't the AI consult Wikipedia? It gives the formula as C169,719H270,466N45,688O52,238S911. (not sure how to get the correct subscripts here).
Titin - Wikipedia
Here is the excuse offered by GPT-4o.

excuse.png


Given that there is no universal agreement on titin's empirical formula, perhaps GPT-4o's response was based on this.
Still, it will be an interesting exercise to see what empirical formula GPT-4o comes up with.
 

Chesterton

But are you talking about a word, or just a string of characters? Have a link to it?
 

sjastro

After overcoming the bandwidth issues, GPT-4o came up with code that gave the empirical formula for titin.

Titin_code.png


The result was C₁₅₅₂₂₀H₂₄₇₇₉₂N₄₄₆₅₀O₄₆₈₇₆S₁₀₉₀, compared to the Wiki result of C₁₆₉₇₁₉H₂₇₀₄₆₆N₄₅₆₈₈O₅₂₂₃₈S₉₁₁.
Whether this is right or wrong is anyone's guess, given the different values reported by various sources.

I decided to give GPT-4o a far simpler challenge: identifying a ten-amino-acid peptide given by the IUPAC identifier aspartylarginylvalyltyrosylisoleucylhistidylprolylphenylalanylhistidylleucine.

formula.png

Peptide_code.png


It correctly identified the empirical formula as C₆₂H₈₉N₁₇O₁₄.
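The peptide result can be reproduced by hand or with a short script. The sketch below is not the code GPT-4o produced (that survives only as an image); it is a minimal version of the usual approach, assuming a linear chain with no modifications: add up the formulas of the free amino acids and subtract one water for each peptide bond. The table and function names are made up for the example.

```python
from collections import Counter

# Molecular formulas of the 20 standard free amino acids, keyed by one-letter code.
FREE_AMINO_ACIDS = {
    "A": Counter(C=3, H=7, N=1, O=2),
    "R": Counter(C=6, H=14, N=4, O=2),
    "N": Counter(C=4, H=8, N=2, O=3),
    "D": Counter(C=4, H=7, N=1, O=4),
    "C": Counter(C=3, H=7, N=1, O=2, S=1),
    "E": Counter(C=5, H=9, N=1, O=4),
    "Q": Counter(C=5, H=10, N=2, O=3),
    "G": Counter(C=2, H=5, N=1, O=2),
    "H": Counter(C=6, H=9, N=3, O=2),
    "I": Counter(C=6, H=13, N=1, O=2),
    "L": Counter(C=6, H=13, N=1, O=2),
    "K": Counter(C=6, H=14, N=2, O=2),
    "M": Counter(C=5, H=11, N=1, O=2, S=1),
    "F": Counter(C=9, H=11, N=1, O=2),
    "P": Counter(C=5, H=9, N=1, O=2),
    "S": Counter(C=3, H=7, N=1, O=3),
    "T": Counter(C=4, H=9, N=1, O=3),
    "W": Counter(C=11, H=12, N=2, O=2),
    "Y": Counter(C=9, H=11, N=1, O=3),
    "V": Counter(C=5, H=11, N=1, O=2),
}


def peptide_formula(seq: str) -> Counter:
    """Element counts for a linear, unmodified peptide given in one-letter codes."""
    if not seq:
        raise ValueError("sequence must contain at least one residue")
    total = Counter()
    for residue in seq:
        total.update(FREE_AMINO_ACIDS[residue])
    bonds = len(seq) - 1  # each peptide bond releases one H2O
    total.subtract({"H": 2 * bonds, "O": bonds})
    return total


def format_formula(counts: Counter) -> str:
    """Render counts in C, H, N, O, S order, omitting a count of 1."""
    parts = []
    for element in ("C", "H", "N", "O", "S"):
        n = counts[element]
        if n > 0:
            parts.append(f"{element}{n if n > 1 else ''}")
    return "".join(parts)


if __name__ == "__main__":
    # Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu, the ten-residue peptide named above.
    print(format_formula(peptide_formula("DRVYIHPFHL")))  # C62H89N17O14
```

The same function would work on the full titin sequence, but the answer then depends entirely on which sequence is fed in, which may be one reason different sources report different totals.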

 

sesquiterpene

Well-Known Member
Sep 14, 2018
745
618
USA
✟189,619.00
Country
United States
Faith
Agnostic
Marital Status
Private
But are you talking about a word, or just a string of characters?
The chemical (IUPAC) names are words in the same sense - and with the same caveats - as the name of titin in your original post. The abbreviations are just strings of characters.
Have a link to it?
Wikipedia has a link to the full sequence of Chromosome 1 here, under the full DNA sequence (GenBank). It's 225 MBytes, and is a string of characters. I don't know if any of the chemistry programs such as ChemDraw will translate it into an IUPAC name. I suspect it would be too long.
 

sesquiterpene

Well-Known Member
Sep 14, 2018
745
618
USA
✟189,619.00
Country
United States
Faith
Agnostic
Marital Status
Private
After overcoming bandwidth issues GPT-4o came up with a code which gave the empirical formula for Titin.
The "word" in the OP doesn't account for the possible disulfide linkages within the titin molecule(s). Each disulfide link would result in the loss of two H atoms, with an almost negligible effect on the total formula. Does GPT-4o have any comment on this?
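To put a number on "almost negligible", here is a small sketch that removes two hydrogens per disulfide link from the Wikipedia formula quoted earlier in the thread. The count of 100 links is a made-up figure for illustration only, and whether the Wikipedia formula already assumes any disulfides is not something the thread establishes.

```python
from collections import Counter

# Formula quoted from Wikipedia earlier in the thread.
WIKI_TITIN = Counter(C=169_719, H=270_466, N=45_688, O=52_238, S=911)


def with_disulfides(formula: Counter, n_links: int) -> Counter:
    """Return a copy of the formula with 2 H removed per disulfide link."""
    adjusted = Counter(formula)
    adjusted["H"] -= 2 * n_links
    return adjusted


if __name__ == "__main__":
    n_links = 100  # hypothetical number of links, not a measured value
    adjusted = with_disulfides(WIKI_TITIN, n_links)
    lost = WIKI_TITIN["H"] - adjusted["H"]
    print(adjusted["H"])                                       # 270266
    print(f"{lost / WIKI_TITIN['H']:.3%} of the hydrogens removed")
```

Even a hundred links would change the hydrogen count by less than a tenth of a percent, and no other element is affected.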
 

sjastro

Newbie
May 14, 2014
5,691
4,628
✟333,661.00
Faith
Christian
Marital Status
Single
The "word" in the OP doesn't account for the possible disulfide linkages within the titin molecule(s). Each disulfide link would result in the loss of two H atoms, with an almost negligible effect on the total formula. Does GPT-4o have any comment on this?
sjastro wrote,

The link https://cw39.com/wp-content/uploads/sites/10/2020/09/longest-word.pdf doesn't account for the possible disulfide linkages within the titin molecule(s). Each disulfide link would result in the loss of two H atoms, with an almost negligible effect on the total formula.
What are your comments?
Response.png


I'm no biochemist but shouldn't GPT-4o also take into account methionine residues?
 

sesquiterpene

Well-Known Member
Sep 14, 2018
745
618
USA
✟189,619.00
Country
United States
Faith
Agnostic
Marital Status
Private
I'm no biochemist but shouldn't GPT-4o also take into account methionine residues?
I think that is covered by simply converting the amino acid sequence of methionine into a molecular formula. The methionine residues don't form disulfide links because the methyl group in methionine blocks the bond. There will be cases where that methyl group is catalytically removed. It will be rare. What does GPT-4o say?
 

sjastro

Newbie
May 14, 2014
5,691
4,628
✟333,661.00
Faith
Christian
Marital Status
Single
I think that is covered by simply converting the amino acid sequence of methionine into a molecular formula. The methionine residues don't form disulfide links because the methyl group in methionine blocks the bond. There will be cases where that methyl group is catalytically removed. It will be rare. What does GPT-4o say?
Response1.png


GPT-4o is freely available but is limited to a number of messages within a three-hour window.
 

Chesterton

Whats So Funny bout Peace Love and Understanding
Site Supporter
May 24, 2008
25,970
21,362
Flatland
✟1,039,116.00
Faith
Eastern Orthodox
Marital Status
Single
The chemical (IUPAC) names are words in the same sense - and with the same caveats - as the name of titin in your original post. The abbreviations are just strings of characters.

Wikipedia has a link to the full sequence of Chromosome 1 here, under the full DNA sequence (GenBank). It's 225 MBytes, and is a string of characters. I don't know if any of the chemistry programs such as ChemDraw will translate it into an IUPAC name. I suspect it would be too long.
I'm afraid I'll have to quibble about that. All words are strings of characters, but not all strings of characters are words.
 

SelfSim

A non "-ist"
Jun 23, 2014
7,040
2,230
✟208,007.00
Faith
Humanist
Marital Status
Private
I'm afraid I'll have to quibble about that. All words are strings of characters, but not all strings of characters are words.
Hmm... information technology has quite a specific meaning for what 'a word' is, based on the size of a given string measured in bits (and bytes). Word size there refers to the number of bits processed, stored, or transmitted simultaneously by a computer's processor or memory. It determines the amount of data a processor can handle in a single operation, which impacts the system's overall performance, addressable memory, and the data types it can manage. (E.g., modern architectures typically use 32- or 64-bit words, built of four or eight bytes, respectively.)

I'm not sure this is relevant to the points raised by @sesquiterpene as far as biochemistry goes, but the common-language meaning of what a word is falls a little short when dealing with the large amounts of coding information being discussed here(?)
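To connect the two senses of "word", here is a rough sketch of how many machine words the chromosome 1 sequence mentioned earlier would occupy. It assumes each base is packed into 2 bits (enough to tell A, C, G and T apart), which is an assumption for the example and not how the GenBank text file stores it; the function names are likewise invented.

```python
CHR1_BASE_PAIRS = 247_199_719  # figure quoted earlier in the thread
BITS_PER_BASE = 2              # assumption: A, C, G, T packed into 2 bits each


def bytes_per_word(word_bits: int) -> int:
    """A machine word of word_bits bits expressed in 8-bit bytes."""
    return word_bits // 8


def words_needed(n_bases: int, word_bits: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Machine words required to hold n_bases, rounding up to a whole word."""
    total_bits = n_bases * bits_per_base
    return -(-total_bits // word_bits)  # integer ceiling division


if __name__ == "__main__":
    for word_bits in (32, 64):
        print(
            f"{word_bits}-bit words ({bytes_per_word(word_bits)} bytes each): "
            f"{words_needed(CHR1_BASE_PAIRS, word_bits):,}"
        )
```

So in the computing sense the chromosome is not one word but several million of them, whichever word size is used.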
 