THIS IS DOCUMENTATION FOR AN OBSOLETE PRODUCT.
SEE THE DOCUMENTATION CENTER FOR THE LATEST INFORMATION.

2.8.8 Advanced Topic: Character Codes

ToCharacterCode["string"]          give a list of the character codes for the characters in a string
FromCharacterCode[n]               construct a character from its character code
FromCharacterCode[{n1, n2, ...}]   construct a string of characters from a list of character codes

Converting to and from character codes.

Mathematica assigns every character that can appear in a string a unique character code. This code is used internally as a way to represent the character.

This gives the character codes for the characters in the string.

In[1]:=  ToCharacterCode["ABCD abcd"]

Out[1]= {65, 66, 67, 68, 32, 97, 98, 99, 100}

FromCharacterCode reconstructs the original string.

In[2]:=  FromCharacterCode[%]

Out[2]= ABCD abcd
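
Because character codes are ordinary integers, you can do arithmetic on them before converting back to a string. The following is a minimal sketch rather than part of the original text; it assumes only the ASCII layout, in which each lower-case English letter has a code 32 greater than the corresponding upper-case letter. The symbol name codes is chosen here purely for illustration.

```mathematica
(* Character codes for four lower-case letters *)
codes = ToCharacterCode["abcd"]
(* {97, 98, 99, 100} *)

(* Subtracting 32 from each code gives the codes of the upper-case letters *)
FromCharacterCode[codes - 32]
(* "ABCD" *)
```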

Special characters also have character codes.

In[3]:=  ToCharacterCode["\[Alpha]\[CirclePlus]\[CapitalGamma]\[CircleMinus]\[EmptySet]"]

Out[3]= {945, 8853, 915, 8854, 8709}

CharacterRange["c1", "c2"]   generate a list of characters with successive character codes

Generating sequences of characters.

This gives part of the English alphabet.

In[4]:=  CharacterRange["a", "k"]

Out[4]= {a, b, c, d, e, f, g, h, i, j, k}

Here is the Greek alphabet.

In[5]:=  CharacterRange["\[Alpha]", "\[Omega]"]

Out[5]= {α, β, γ, δ, ε, ζ, η, θ, ι, κ, λ, μ, ν, ξ, ο, π, ρ, ς, σ, τ, υ, φ, χ, ψ, ω}
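
One way to see what CharacterRange does is to reproduce its result with the conversion functions introduced earlier. The sketch below checks this for a single case; it is an illustration, not a statement of how CharacterRange is implemented, and the names lo, hi and chars are chosen here for the example.

```mathematica
(* Codes of the two endpoint characters *)
{lo, hi} = ToCharacterCode["ak"]
(* {97, 107} *)

(* One single-character string for each code in the range *)
chars = FromCharacterCode /@ Range[lo, hi];

(* The result agrees with CharacterRange for these endpoints *)
chars === CharacterRange["a", "k"]
(* True *)
```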

Mathematica assigns names such as \[Alpha] to a large number of special characters. This means that you can always refer to such characters just by giving their names, without ever having to know their character codes.

This generates a string of special characters from their character codes.

In[6]:=  FromCharacterCode[{8706, 8709, 8711, 8712}]

Out[6]= ∂∅∇∈

You can always refer to these characters by their names, without knowing their character codes.

In[7]:=  InputForm[%]

Out[7]//InputForm= "\[PartialD]\[EmptySet]\[Del]\[Element]"

Mathematica has names for all the common characters that are used in mathematical notation and in standard European languages. But for a language such as Japanese, there are more than 3,000 additional characters, and Mathematica does not assign an explicit name to each of them. Instead, it refers to such characters by standardized character codes.

Here is a string containing Japanese characters.

In[8]:=  "数学 "

Out[8]=

In InputForm, these characters are referred to by standardized character codes. The character codes are given in hexadecimal.

In[9]:=  InputForm[%]

Out[9]//InputForm=

The notebook front end for Mathematica is typically set up so that when you enter a character in a particular font, Mathematica will automatically work out the character code for that character.

Sometimes, however, you may find it convenient to be able to enter characters directly using character codes.

\0        null byte (code 0)
\nnn      a character with octal code nnn
\.nn      a character with hexadecimal code nn
\:nnnn    a character with hexadecimal code nnnn

Ways to enter characters directly in terms of character codes.

For characters with character codes below 256, you can use \nnn or \.nn. For characters with character codes of 256 and above, you must use \:nnnn. Note that in all cases you must give a fixed number of octal or hexadecimal digits, padding with leading 0s if necessary.

This gives character codes in hexadecimal for a few characters.

In[10]:=  BaseForm[ToCharacterCode["Aà\[Alpha]\[Aleph]"], 16]

Out[10]//BaseForm=

This enters the characters using their character codes. Note the leading 0 inserted in the character code for α.

In[11]:=  "\.41\.e0\:03b1\:2135"

Out[11]= Aàαℵ
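
For codes below 256 the three escape forms are interchangeable. As a brief sketch, the letter A has code 65, which is 101 in octal and 41 in hexadecimal, so all three escapes below should yield the same character. The second check uses 16^^, the standard input notation for a number in base 16, to confirm that the padded escape \:03b1 corresponds to the unpadded hexadecimal value 3b1.

```mathematica
(* Octal, two-digit hexadecimal and four-digit hexadecimal escapes for "A" *)
ToCharacterCode["\101\.41\:0041"]
(* {65, 65, 65} *)

(* The leading 0 is padding only; the code itself is 16^^3b1, i.e. 945 *)
ToCharacterCode["\:03b1"] === {16^^3b1}
(* True *)
```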

In assigning codes to characters, Mathematica follows three compatible standards: ASCII, ISO Latin-1, and Unicode. ASCII covers the characters on a normal American English keyboard. ISO Latin-1 covers characters in many European languages. Unicode is a more general standard which defines character codes for several tens of thousands of characters used in languages and notations around the world.

0 - 127           (\000 - \177)       ASCII characters
  1 - 31          (\001 - \037)       ASCII control characters
  32 - 126        (\040 - \176)       printable ASCII characters
  97 - 122        (\141 - \172)       lower-case English letters
129 - 255         (\201 - \377)       ISO Latin-1 characters
  192 - 255       (\300 - \377)       letters in European languages
0 - 59391         (\:0000 - \:e7ff)   Unicode standard public characters
  913 - 1009      (\:0391 - \:03f1)   Greek letters
  12288 - 35839   (\:3000 - \:8bff)   Chinese, Japanese and Korean characters
  8450 - 8504     (\:2102 - \:2138)   modified letters used in mathematical notation
  8592 - 8677     (\:2190 - \:21e5)   arrows
  8704 - 8945     (\:2200 - \:22f1)   mathematical symbols and operators
64256 - 64300     (\:fb00 - \:fb2c)   Unicode private characters defined specially by Mathematica

A few ranges of character codes used by Mathematica.

Here are all the printable ASCII characters.

In[12]:=  FromCharacterCode[Range[32, 126]]

Out[12]=  !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
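
The ranges in the table can be turned into simple tests on character codes. The sketch below defines a hypothetical helper, printableASCIIQ, which is not a built-in function; it assumes only the printable ASCII range 32 - 126 given in the table and checks that every character code of a string lies inside it.

```mathematica
(* Hypothetical helper, not a built-in: gives True if every character
   code of s lies in the printable ASCII range 32 - 126 *)
printableASCIIQ[s_String] := And @@ (32 <= # <= 126 & /@ ToCharacterCode[s])

printableASCIIQ["ABCD abcd"]
(* True *)

(* \[Alpha] has code 945, which is outside the range *)
printableASCIIQ["\[Alpha]BCD"]
(* False *)
```

The same pattern works for any other row of the table by changing the two bounds.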

Here are some ISO Latin-1 letters.

In[13]:=  FromCharacterCode[Range[192, 255]]

Out[13]= ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ

Here are some special characters used in mathematical notation. The black blobs correspond to characters not available in the current font.

In[14]:=  FromCharacterCode[Range[8704, 8750]]

Out[14]=

Here are a few Japanese characters.

In[15]:=  FromCharacterCode[Range[30000, 30030]]

Out[15]=





© 2009 Wolfram Research, Inc.