The vBulletin Unicode How-To

Heathen Dawn

Gesta Dei per Francos
Aug 13, 2003
Israel
Here begins the vBulletin Unicode How-To

What is Unicode?

Your US PC keyboard contains the following characters:

  ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~

Those are the graphical characters of the ASCII character set, a character set capable of encoding 128 characters (the 33 that don’t appear are control characters). ASCII is adequate only for English and for a few other Latin-script languages that don’t use any diacritics (accents, umlauts etc.) on the letters. Once you step outside of English, ASCII isn’t enough. For that purpose, a new character set called Unicode was devised. Unicode can encode 1,112,064 characters, enough for all living languages and for many historical scripts. Each character has its own unique, unambiguous number in Unicode.
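The figures above can be checked with a short sketch. Python is an assumption here (the How-To itself does not depend on any programming language), and so is the breakdown of 1,112,064 as the full Unicode code space of 17 planes minus the 2,048 surrogate code points, which the text above does not spell out.

```python
# ASCII: codes 32..126 are the graphical characters (space included);
# codes 0..31 and 127 are control characters.
graphical = [chr(c) for c in range(32, 127)]
control = [c for c in range(128) if c < 32 or c == 127]

# Unicode code space: 17 planes of 65,536 code points each,
# minus the surrogate range U+D800..U+DFFF, which encodes no characters.
code_points = 17 * 65536
surrogates = 0xDFFF - 0xD800 + 1
encodable = code_points - surrogates

print(len(graphical), len(control), encodable)
```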

With Unicode, one can write, for example, mathematical symbols, polytonic Greek as in the New Testament, Biblical Hebrew with cantillation marks and the Arabic of the Qur’an. The ability of Unicode is therefore an asset in debates on science and religion, which are so common on the boards.

Using Unicode on vBulletin boards

The way to use Unicode on boards based on vBulletin software (and also UltimateBB, but not EZBoard) is to use numeric character references, or NCRs for short. NCRs are escape sequences for representing Unicode characters in web pages. A valid NCR consists of the following sequence: ampersand (&), hash mark (#), Unicode number in decimal, and semicolon. For example, the Unicode number of Hebrew Letter Alef is 1488 in decimal, so it is written as &#1488; which gives the following: א
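For readers who prefer to see the rule as code, here is a minimal sketch in Python (an assumption; any language with access to a character's number would do). The function name to_ncr is hypothetical and chosen only for illustration.

```python
def to_ncr(char: str) -> str:
    """Build a decimal NCR: ampersand, hash mark, decimal number, semicolon."""
    return f"&#{ord(char)};"

alef = "\u05d0"  # Hebrew Letter Alef, Unicode number 1488 in decimal
print(to_ncr(alef))
```

The printed string is what would be typed into the input box of the board.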

Looking up the Unicode number

For an individual character it is efficient to look up its number and type it manually. This can be done either with a character map utility, such as the one available on Windows XP (charmap.exe), or by looking it up in the code charts at http://www.unicode.org/charts/. Both sources give the Unicode number in hexadecimal (base 16); to convert it to decimal, use a calculator (such as Windows’ calc.exe). For example, 05D0 is hexadecimal for 1488. Another good resource is Alan Wood’s pages on Unicode, at http://www.alanwood.net/unicode/, which give the decimal value as well.
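If a calculator is not at hand, the hexadecimal-to-decimal step can also be done in a few lines of Python. This is only a sketch of the arithmetic, not part of the original procedure; the variable names are illustrative.

```python
import unicodedata

hex_number = "05D0"             # as shown by charmap.exe or the Unicode charts
decimal = int(hex_number, 16)   # convert base 16 to base 10
name = unicodedata.name(chr(decimal))  # official character name, as a sanity check
ncr = f"&#{decimal};"           # the NCR to type on the board

print(decimal, name, ncr)
```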

Batch converting

For a long run of characters, manually typing the numeric character references is too tedious and time-consuming. If you have an operating system that supports Unicode, such as Windows 2000 or XP or Linux (e.g. Red Hat 7.0 and onwards), it is best to input the characters the natural way, or to copy them from a Web source. Then it is possible to convert them all to NCRs by software.

On Linux there are utilities such as iconv or recode enabling one to convert from Unicode to HTML notation, which means NCRs. All that is necessary is to run the Unicode text file through such a utility and produce a new file in which there are NCRs to be copied and pasted on the boards.

On Windows XP (or 2000), such utilities are usually lacking. The best way to convert Unicode to NCRs is to use Internet Explorer. Follow these stages:

1. Write your text file and save in Unicode.
2. Drag the text file icon into an Internet Explorer window.
3. Choose “Save As” from the file menu.
4. Save as a text file, but in a different encoding, such as “Baltic (ISO)”.

The new text file will contain your Unicode text in NCRs. It is important to save the text file in an encoding that does not cover the characters in the file, because only characters missing from the chosen encoding are written out as NCRs. For example, if your Unicode text file contains Hebrew characters, don’t save as “Hebrew (Windows)”, but as a different encoding, such as “Greek (ISO)” or “Baltic (ISO)”.

For example, the following Hebrew text, Genesis 1:1:

בראשית ברא אלהים את השמים ואת הארץ

can be converted into NCRs:

&#1489;&#1512;&#1488;&#1513;&#1497;&#1514; &#1489;&#1512;&#1488; &#1488;&#1500;&#1492;&#1497;&#1501; &#1488;&#1514; &#1492;&#1513;&#1502;&#1497;&#1501; &#1493;&#1488;&#1514; &#1492;&#1488;&#1512;&#1509;

Pasting the above NCRs into the input box of the board will result in the Hebrew text.
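The same batch conversion can be sketched in Python, as an alternative to the Linux utilities and the Internet Explorer procedure rather than a description of either. It relies on the standard "xmlcharrefreplace" error handler, which replaces every character that the target encoding lacks with a decimal NCR; the function name text_to_ncrs is hypothetical.

```python
import html

def text_to_ncrs(text: str) -> str:
    # Encoding to ASCII keeps ASCII characters (spaces, punctuation) as they are
    # and turns every other character into a decimal NCR.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

word = "\u05d1\u05e8\u05d0"      # bet, resh, alef: the second word of Genesis 1:1
ncrs = text_to_ncrs(word)
round_trip = html.unescape(ncrs)  # decode the NCRs again to check the result

print(ncrs)
print(round_trip == word)
```

This mirrors the advice about choosing a different encoding: ASCII contains no Hebrew letters, so all of them come out as NCRs.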

Font issues

For a character to be displayed correctly, there needs to be not only a Unicode number (that is the easy part) but also a matching font. Here is where things get tricky. For modern monotonic Greek, modern Hebrew without cantillation marks and Arabic without Qur’anic marks, most fonts suffice, so setting a font isn’t necessary. However, for polytonic NT Greek or Hebrew with cantillation marks, a special font must be specified. The problem is that the special font is not always available on the viewer’s computer.

For polytonic NT Greek, three suitable fonts are Palatino Linotype, Athena and Arial Unicode MS. In Linux there is no problem, because Linux comes complete with polytonic Greek in its system fonts. To specify a font, surround the NCR text with a font markup:

[font="Palatino Linotype, Athena, Arial Unicode MS"]&#7977; &#7936;&#947;&#8049;&#960;&#951; &#956;&#945;&#954;&#961;&#959;&#952;&#965;&#956;&#949;&#8150;, &#967;&#961;&#951;&#963;&#964;&#949;&#8059;&#949;&#964;&#945;&#953; &#7969; &#7936;&#947;&#8049;&#960;&#951;[/font]

giving

Ἡ ἀγάπη μακροθυμεῖ, χρηστεύεται ἡ ἀγάπη

which should be viewable on computers with Windows 2000 or XP or Linux, but not in Windows 98.
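When many passages need the same font list, the wrapping can be automated. The following Python sketch only builds the markup string in the form shown above; the helper name with_font is hypothetical, and whether a given board accepts a comma-separated font list is something to verify on that board.

```python
def with_font(ncr_text: str, *fonts: str) -> str:
    # Join the font names into one quoted, comma-separated list
    # and surround the NCR text with opening and closing font tags.
    font_list = ", ".join(fonts)
    return f'[font="{font_list}"]{ncr_text}[/font]'

markup = with_font("&#7977; &#7936;&#947;&#8049;&#960;&#951;",
                   "Palatino Linotype", "Athena", "Arial Unicode MS")
print(markup)
```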

For Hebrew with cantillation marks the situation is harder. Fonts containing it are mostly special downloads; the only free one is Arial Unicode MS, which is present on the system only if Office 2000 or later has been installed. It is therefore best to avoid Hebrew with cantillation marks, as well as Ethiopic and Runic, which have no fonts available even in Windows XP. Indic scripts, Georgian, Armenian, Thai, Chinese and Japanese can be used, but they will be viewed properly only on those Windows XP installations where the user has installed support for them. Linux users can view all those except Indic scripts, for which support on Linux is very difficult.

For special symbols, such as signs of the zodiac, a font such as Lucida Sans Unicode or Arial Unicode MS must be specified. The former font is available on Windows 2000 or XP. Linux users already have those symbols in their system fonts.

Browser support

Proper display of Unicode characters is dependent on the browser. Internet Explorer beginning with version 5.0 can display Unicode, as can Mozilla from version 1.3 upwards. Version 4.0 browsers of Netscape and Microsoft don’t display Unicode properly. On Linux, Konqueror and Mozilla 1.3+ can display Unicode, including Arabic. See Alan Wood’s pages on Unicode to learn about setting up browsers for Unicode display.

Happy Unicoding! Please contribute to this How-To with questions for me to answer.

Here ends the vBulletin Unicode How-To.
 

Heathen Dawn

The source text file for the vBulletin Unicode How-To is on my website, here. Anyone who wishes to post the How-To on another vBulletin board can copy from the text file and paste on the board. Caution: when posting on vBulletin 3.0, use the Standard Toolbar instead of the Enhanced (WYSIWYG) Toolbar.
 