Unicode Blocks (3.1)


The Unicode character set is divided into a number of blocks of characters. The blocks are defined by the Unicode Consortium, and the definition can be found in the file Blocks.txt.
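As a rough illustration of how that file can be read, here is a minimal Python sketch. It assumes each data line has the form `0000..007F; Basic Latin` and that `#` starts a comment; check this against the copy of Blocks.txt you actually have, since the layout is an assumption here. The function name `parse_blocks` is made up for this example.

```python
import re

# Assumed line layout: "0000..007F; Basic Latin"; "#" starts a comment.
_BLOCK_LINE = re.compile(r"^([0-9A-Fa-f]+)\.\.([0-9A-Fa-f]+);\s*(.+?)\s*$")

def parse_blocks(text):
    """Return a list of (start, end, name) tuples from Blocks.txt-style text."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        match = _BLOCK_LINE.match(line)
        if match:  # lines in any other layout are silently skipped
            start, end, name = match.groups()
            blocks.append((int(start, 16), int(end, 16), name))
    return blocks

# Two sample lines only, not the real file.
sample = """\
# Blocks (sample lines only)
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
"""
print(parse_blocks(sample))
```

Start and end are kept as integers so that a code point obtained with `ord()` can be compared against them directly.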

Among those blocks, there are some which I personally don't think are safe. On this page I have marked those with "No" in the first column. Note that, for deployment, it is easy to add blocks later on, but not possible to remove blocks once they have been introduced.

Safe?  Start   End     Block

       0000    007F    Basic Latin
       0080    00FF    Latin-1 Supplement
       0100    017F    Latin Extended-A
       0180    024F    Latin Extended-B
       0250    02AF    IPA Extensions
No     02B0    02FF    Spacing Modifier Letters
       0300    036F    Combining Diacritical Marks
       0370    03FF    Greek
       0400    04FF    Cyrillic
       0530    058F    Armenian
       0590    05FF    Hebrew
       0600    06FF    Arabic
       0700    074F    Syriac
       0780    07BF    Thaana
       0900    097F    Devanagari
       0980    09FF    Bengali
       0A00    0A7F    Gurmukhi
       0A80    0AFF    Gujarati
       0B00    0B7F    Oriya
       0B80    0BFF    Tamil
       0C00    0C7F    Telugu
       0C80    0CFF    Kannada
       0D00    0D7F    Malayalam
       0D80    0DFF    Sinhala
       0E00    0E7F    Thai
       0E80    0EFF    Lao
       0F00    0FFF    Tibetan
       1000    109F    Myanmar
       10A0    10FF    Georgian
       1100    11FF    Hangul Jamo
       1200    137F    Ethiopic
       13A0    13FF    Cherokee
       1400    167F    Unified Canadian Aboriginal Syllabics
No     1680    169F    Ogham
No     16A0    16FF    Runic
       1780    17FF    Khmer
       1800    18AF    Mongolian
       1E00    1EFF    Latin Extended Additional
       1F00    1FFF    Greek Extended
No     2000    206F    General Punctuation
No     2070    209F    Superscripts and Subscripts
       20A0    20CF    Currency Symbols
No     20D0    20FF    Combining Marks for Symbols
No     2100    214F    Letterlike Symbols
No     2150    218F    Number Forms
No     2190    21FF    Arrows
No     2200    22FF    Mathematical Operators
No     2300    23FF    Miscellaneous Technical
No     2400    243F    Control Pictures
No     2440    245F    Optical Character Recognition
No     2460    24FF    Enclosed Alphanumerics
No     2500    257F    Box Drawing
No     2580    259F    Block Elements
No     25A0    25FF    Geometric Shapes
No     2600    26FF    Miscellaneous Symbols
No     2700    27BF    Dingbats
No     2800    28FF    Braille Patterns
       2E80    2EFF    CJK Radicals Supplement
       2F00    2FDF    Kangxi Radicals
No     2FF0    2FFF    Ideographic Description Characters
No     3000    303F    CJK Symbols and Punctuation
       3040    309F    Hiragana
       30A0    30FF    Katakana
       3100    312F    Bopomofo
       3130    318F    Hangul Compatibility Jamo
       3190    319F    Kanbun
       31A0    31BF    Bopomofo Extended
No     3200    32FF    Enclosed CJK Letters and Months
No     3300    33FF    CJK Compatibility
       3400    4DB5    CJK Unified Ideographs Extension A
       4E00    9FFF    CJK Unified Ideographs
       A000    A48F    Yi Syllables
       A490    A4CF    Yi Radicals
       AC00    D7A3    Hangul Syllables
No     D800    DB7F    High Surrogates
No     DB80    DBFF    High Private Use Surrogates
No     DC00    DFFF    Low Surrogates
No     E000    F8FF    Private Use
No     F900    FAFF    CJK Compatibility Ideographs
No     FB00    FB4F    Alphabetic Presentation Forms
No     FB50    FDFF    Arabic Presentation Forms-A
No     FE20    FE2F    Combining Half Marks
No     FE30    FE4F    CJK Compatibility Forms
No     FE50    FE6F    Small Form Variants
No     FE70    FEFE    Arabic Presentation Forms-B
No     FEFF    FEFF    Specials
No     FF00    FFEF    Halfwidth and Fullwidth Forms
No     FFF0    FFFD    Specials
No     10300   1032F   Old Italic
No     10330   1034F   Gothic
No     10400   1044F   Deseret
No     1D000   1D0FF   Byzantine Musical Symbols
No     1D100   1D1FF   Musical Symbols
No     1D400   1D7FF   Mathematical Alphanumeric Symbols
       20000   2A6D6   CJK Unified Ideographs Extension B
No     2F800   2FA1F   CJK Compatibility Ideographs Supplement
No     E0000   E007F   Tags
No     F0000   FFFFD   Private Use
No     100000  10FFFD  Private Use
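
One way the table above might be applied is sketched below in Python: a check that every character of a string falls in a block not marked "No". Only a handful of rows are copied in, and the names `BLOCKS`, `block_of`, and `is_safe` are invented for this example. Treating code points outside every listed block as unsafe is an assumption of the sketch, not something this page states.

```python
# (safe, start, end, name): a few rows copied from the table; extend as needed.
BLOCKS = [
    (True,  0x0000, 0x007F, "Basic Latin"),
    (True,  0x0080, 0x00FF, "Latin-1 Supplement"),
    (False, 0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (True,  0x0370, 0x03FF, "Greek"),
    (False, 0x2000, 0x206F, "General Punctuation"),
    (True,  0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (False, 0xE000, 0xF8FF, "Private Use"),
]

def block_of(ch):
    """Return the (safe, start, end, name) row containing ch, or None."""
    cp = ord(ch)
    for row in BLOCKS:
        if row[1] <= cp <= row[2]:
            return row
    return None

def is_safe(text):
    """True if every character lies in a listed block not marked "No"."""
    for ch in text:
        row = block_of(ch)
        # Assumption: a code point in no listed block counts as unsafe.
        if row is None or not row[0]:
            return False
    return True

print(is_safe("caf\u00e9"))   # U+00E9 is in Latin-1 Supplement
print(is_safe("a\u2003b"))    # U+2003 EM SPACE is in General Punctuation
```

Rejecting unlisted code points by default fits the note above that blocks are easy to add later but cannot be taken away once introduced.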