Unicode Code Chart and Filtering Approach

Published: 2023/12/14
This article collects a Unicode block reference table and a character filtering approach for reference.
Legend:
Unicode 1.0 | Unicode 1.1 | Unicode 2.0 | Unicode 2.1 | Unicode 3.0 | Unicode 3.1 | Unicode 3.2 | Unicode 4.0 | Unicode 4.1 | Unused | Not encoded
Unicode code chart index
0000-0fff    8000-8fff    10000-10fff    20000-20fff    28000-28fff
1000-1fff    9000-9fff                   21000-21fff    29000-29fff
2000-2fff    a000-afff                   22000-22fff    2a000-2afff
3000-3fff    b000-bfff                   23000-23fff
4000-4fff    c000-cfff    1d000-1dfff    24000-24fff    2f000-2ffff
5000-5fff    d000-dfff                   25000-25fff
6000-6fff    e000-efff                   26000-26fff
7000-7fff    f000-ffff                   27000-27fff    e0000-e0fff
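The index above groups code points into pages of 0x1000 values. As a minimal sketch (the class and method names `ChartPage`, `pageLabel`, and `plane` are hypothetical and not part of the original article), the page that contains a given code point can be computed with two bit masks:

```java
// Hypothetical helper for illustration; not taken from the original article.
class ChartPage {
    /** Returns the 0x1000-wide page containing the code point, e.g. "4000-4fff". */
    static String pageLabel(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        int start = codePoint & ~0xFFF; // clear the low 12 bits
        int end = codePoint | 0xFFF;    // set the low 12 bits
        return String.format("%04x-%04x", start, end);
    }

    /** Plane number: 0 is the Basic Multilingual Plane, 1 and above are supplementary. */
    static int plane(int codePoint) {
        return codePoint >>> 16;
    }

    public static void main(String[] args) {
        System.out.println(pageLabel(0x4E2D) + " plane " + plane(0x4E2D));
        System.out.println(pageLabel(0x1D11E) + " plane " + plane(0x1D11E));
    }
}
```

Pages in the first two columns of the index belong to plane 0; the remaining columns belong to planes 1, 2, and 14.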

 

[Unicode block table]

0000-007f: C0 Controls and Basic Latin
0080-00ff: C1 Controls and Latin-1 Supplement
0100-017f: Latin Extended-A
0180-024f: Latin Extended-B
0250-02af: IPA Extensions
02b0-02ff: Spacing Modifier Letters
0300-036f: Combining Diacritical Marks
0370-03ff: Greek and Coptic
0400-04ff: Cyrillic
0500-052f: Cyrillic Supplement
0530-058f: Armenian
0590-05ff: Hebrew
0600-06ff: Arabic
0700-074f: Syriac
0750-077f: Arabic Supplement
0780-07bf: Thaana
07c0-07ff: N'Ko
0800-085f: Avestan and Pahlavi
0860-087f: Mandaic
0880-08af: Samaritan
0900-097f: Devanagari
0980-09ff: Bengali
0a00-0a7f: Gurmukhi
0a80-0aff: Gujarati
0b00-0b7f: Oriya
0b80-0bff: Tamil
0c00-0c7f: Telugu
0c80-0cff: Kannada
0d00-0d7f: Malayalam
0d80-0dff: Sinhala
0e00-0e7f: Thai
0e80-0eff: Lao
0f00-0fff: Tibetan
1000-109f: Myanmar
10a0-10ff: Georgian
1100-11ff: Hangul Jamo
1200-137f: Ethiopic
1380-139f: Ethiopic Supplement
13a0-13ff: Cherokee
1400-167f: Unified Canadian Aboriginal Syllabics
1680-169f: Ogham
16a0-16ff: Runic
1700-171f: Tagalog
1720-173f: Hanunóo
1740-175f: Buhid
1760-177f: Tagbanwa
1780-17ff: Khmer
1800-18af: Mongolian
18b0-18ff: Cham
1900-194f: Limbu
1950-197f: Tai Le
1980-19df: New Tai Lue
19e0-19ff: Khmer Symbols
1a00-1a1f: Buginese
1a20-1a5f: Batak
1a80-1aef: Lanna
1b00-1b7f: Balinese
1b80-1bb0: Sundanese
1bc0-1bff: Pahawh Hmong
1c00-1c4f: Lepcha
1c50-1c7f: Ol Chiki
1c80-1cdf: Meithei/Manipuri
1d00-1d7f: Phonetic Extensions
1d80-1dbf: Phonetic Extensions Supplement
1dc0-1dff: Combining Diacritical Marks Supplement
1e00-1eff: Latin Extended Additional
1f00-1fff: Greek Extended
2000-206f: General Punctuation
2070-209f: Superscripts and Subscripts
20a0-20cf: Currency Symbols
20d0-20ff: Combining Diacritical Marks for Symbols
2100-214f: Letterlike Symbols
2150-218f: Number Forms
2190-21ff: Arrows
2200-22ff: Mathematical Operators
2300-23ff: Miscellaneous Technical
2400-243f: Control Pictures
2440-245f: Optical Character Recognition
2460-24ff: Enclosed Alphanumerics
2500-257f: Box Drawing
2580-259f: Block Elements
25a0-25ff: Geometric Shapes
2600-26ff: Miscellaneous Symbols
2700-27bf: Dingbats
27c0-27ef: Miscellaneous Mathematical Symbols-A
27f0-27ff: Supplemental Arrows-A
2800-28ff: Braille Patterns
2900-297f: Supplemental Arrows-B
2980-29ff: Miscellaneous Mathematical Symbols-B
2a00-2aff: Supplemental Mathematical Operators
2b00-2bff: Miscellaneous Symbols and Arrows
2c00-2c5f: Glagolitic
2c60-2c7f: Latin Extended-C
2c80-2cff: Coptic
2d00-2d2f: Georgian Supplement
2d30-2d7f: Tifinagh
2d80-2ddf: Ethiopic Extended
2e00-2e7f: Supplemental Punctuation
2e80-2eff: CJK Radicals Supplement
2f00-2fdf: Kangxi Radicals
2ff0-2fff: Ideographic Description Characters
3000-303f: CJK Symbols and Punctuation
3040-309f: Hiragana
30a0-30ff: Katakana
3100-312f: Bopomofo
3130-318f: Hangul Compatibility Jamo
3190-319f: Kanbun
31a0-31bf: Bopomofo Extended
31c0-31ef: CJK Strokes
31f0-31ff: Katakana Phonetic Extensions
3200-32ff: Enclosed CJK Letters and Months
3300-33ff: CJK Compatibility
3400-4dbf: CJK Unified Ideographs Extension A
4dc0-4dff: Yijing Hexagram Symbols
4e00-9fbf: CJK Unified Ideographs
a000-a48f: Yi Syllables
a490-a4cf: Yi Radicals
a500-a61f: Vai
a660-a6ff: Unified Canadian Aboriginal Syllabics Supplement
a700-a71f: Modifier Tone Letters
a720-a7ff: Latin Extended-D
a800-a82f: Syloti Nagri
a840-a87f: Phags-pa
a880-a8df: Saurashtra
a900-a97f: Javanese
a980-a9df: Chakma
aa00-aa3f: Varang Kshiti
aa40-aa6f: Sorang Sompeng
aa80-aadf: Newari
ab00-ab5f: Việt Thái
ab80-aba0: Kayah Li
ac00-d7af: Hangul Syllables
d800-dbff: High-half zone of UTF-16 (high surrogates)
dc00-dfff: Low-half zone of UTF-16 (low surrogates)
e000-f8ff: Private Use Area
f900-faff: CJK Compatibility Ideographs
fb00-fb4f: Alphabetic Presentation Forms
fb50-fdff: Arabic Presentation Forms-A
fe00-fe0f: Variation Selectors
fe10-fe1f: Vertical Forms
fe20-fe2f: Combining Half Marks
fe30-fe4f: CJK Compatibility Forms
fe50-fe6f: Small Form Variants
fe70-feff: Arabic Presentation Forms-B
ff00-ffef: Halfwidth and Fullwidth Forms
fff0-ffff: Specials
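For looking up the block of a single code point at run time, Java ships its own block table. The sketch below uses the standard `Character.UnicodeBlock.of(int)` call; the wrapper class `BlockLookup` is a hypothetical name. The JDK's table follows the Unicode version supported by the running JDK, so its answers may differ from the ranges listed above.

```java
// Hypothetical wrapper for illustration; Character.UnicodeBlock is part of the JDK.
class BlockLookup {
    /** Returns the JDK's block for a code point, or null if it lies in no defined block. */
    static Character.UnicodeBlock blockOf(int codePoint) {
        return Character.UnicodeBlock.of(codePoint);
    }

    public static void main(String[] args) {
        int[] samples = {0x0041, 0x0416, 0x4E2D, 0xAC00};
        for (int cp : samples) {
            // Print the code point in U+XXXX form next to the block the JDK reports.
            System.out.println(String.format("U+%04X", cp) + " -> " + blockOf(cp));
        }
    }
}
```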

 

 

 

 

Character chart for U+0000 to U+0FFF. Each row lists 16 consecutive code points (columns 0 to f); the rows start at 0000, 0010, 0020, and so on up to 0ff0. Control characters are shown by their abbreviations.
NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~ DEL
PAD HOP BPH NBH IND NEL SSA ESA HTS HTJ VTS PLD PLU RI SS2 SS3
DCS PU1 PU2 STS CCH MW SPA EPA SOS SGCI SCI CSI ST OSC PM APC
NBSP ¡¢£¤¥¦§¨©ª«¬ SHY ®¯
°±²³´µ¶·¸¹º»¼½¾¿
ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ
ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß
àáâãäåæçèéêëìíîï
ðñòóôõö÷øùúûüýþÿ
ĀāĂ㥹ĆćĈĉĊċČčĎď
ĐđĒēĔĕĖėĘęĚěĜĝĞğ
ĠġĢģĤĥĦħĨĩĪīĬĭĮį
İıĲĳĴĵĶķĸĹĺĻļĽľĿ
ŀŁłŃńŅņŇňŉŊŋŌōŎŏ
ŐőŒœŔŕŖŗŘřŚśŜŝŞş
ŠšŢţŤťŦŧŨũŪūŬŭŮů
ŰűŲųŴŵŶŷŸŹźŻżŽžſ
ƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏ
ƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟ
ƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯ
ưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿ
ǀǁǂǃǄǅǆǇǈǉǊǋǌǍǎǏ
ǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟ
ǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯ
ǰǱǲǳǴǵǶǷǸǹǺǻǼǽǾǿ
ȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏ
ȐȑȒȓȔȕȖȗȘșȚțȜȝȞȟ
ȠȡȢȣȤȥȦȧȨȩȪȫȬȭȮȯ
ȰȱȲȳȴȵȶȷȸȹȺȻȼȽȾȿ
ɀɁ
ɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟ
ɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯ
ɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿ
ʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏ
ʐʑʒʓʔʕʖʗʘʙʚʛʜʝʞʟ
ʠʡʢʣʤʥʦʧʨʩʪʫʬʭʮʯ
ʰʱʲʳʴʵʶʷʸʹʺʻʼʽʾʿ
ˀˁ˂˃˄˅ˆˇˈˉˊˋˌˍˎˏ
ːˑ˒˓˔˕˖˗˘˙˚˛˜˝˞˟
ˠˡˢˣˤ˥˦˧˨˩˪˫ˬ˭ˮ˯
˰˱˲˳˴˵˶˷˸˹˺˻˼˽˾˿
 ̀ ́ ̂ ̃ ̄ ̅ ̆ ̇ ̈ ̉ ̊ ̋ ̌ ̍ ̎ ̏
 ̐ ̑ ̒ ̓ ̔ ̕ ̖ ̗ ̘ ̙ ̚ ̛ ̜ ̝ ̞ ̟
 ̠ ̡ ̢ ̣ ̤ ̥ ̦ ̧ ̨ ̩ ̪ ̫ ̬ ̭ ̮ ̯
 ̰ ̱ ̲ ̳ ̴ ̵ ̶ ̷ ̸ ̹ ̺ ̻ ̼ ̽ ̾ ̿
 ̀ ́ ͂ ̓ ̈́ ͅ ͆ ͇ ͈ ͉ ͊ ͋ ͌ ͍ ͎cgj
 ͐ ͑ ͒ ͓ ͔ ͕ ͖ ͗ ͘ ͙ ͚ ͛ ͜ ͝ ͞ ͟
 ͠ ͡ ͢ ͣ ͤ ͥ ͦ ͧ ͨ ͩ ͪ ͫ ͬ ͭ ͮ ͯ
    ʹ͵    ͺ   ; 
    ΄΅Ά·ΈΉΊ Ό ΎΏ
ΐΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟ
ΠΡ ΣΤΥΦΧΨΩΪΫάέήί
ΰαβγδεζηθικλμνξο
πρςστυφχψωϊϋόύώ 
ϐϑϒϓϔϕϖϗϘϙϚϛϜϝϞϟ
ϠϡϢϣϤϥϦϧϨϩϪϫϬϭϮϯ
ϰϱϲϳϴϵ϶ϷϸϹϺϻϼϽϾϿ
ЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏ
АБВГДЕЖЗИЙКЛМНОП
РСТУФХЦЧШЩЪЫЬЭЮЯ
абвгдежзийклмноп
рстуфхцчшщъыьэюя
ѐёђѓєѕіїјљњћќѝўџ
ѠѡѢѣѤѥѦѧѨѩѪѫѬѭѮѯ
ѰѱѲѳѴѵѶѷѸѹѺѻѼѽѾѿ
Ҁҁ҂ ҃ ҄ ҅ ҆  ҈ ҉ҊҋҌҍҎҏ
ҐґҒғҔҕҖҗҘҙҚқҜҝҞҟ
ҠҡҢңҤҥҦҧҨҩҪҫҬҭҮү
ҰұҲҳҴҵҶҷҸҹҺһҼҽҾҿ
ӀӁӂӃӄӅӆӇӈӉӊӋӌӍӎ 
ӐӑӒӓӔӕӖӗӘәӚӛӜӝӞӟ
ӠӡӢӣӤӥӦӧӨөӪӫӬӭӮӯ
ӰӱӲӳӴӵӶӷӸӹ      
ԀԁԂԃԄԅԆԇԈԉԊԋԌԍԎԏ
                
                
 ԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿ
ՀՁՂՃՄՅՆՇՈՉՊՋՌՍՎՏ
ՐՑՒՓՔՕՖ  ՙ՚՛՜՝՞՟
 աբգդեզէըթժիլխծկ
հձղճմյնշոչպջռսվտ
րցւփքօֆև ։֊     
 ֑֖֛֚֒֓֔֕֗֘֙֜֝֞֟
֢֣֤֥֦֧֪֭֮֠֡֨֩֫֬֯
ְֱֲֳִֵֶַָֹ ֻּֽ־ֿ
׀ׁׂ׃ׅׄ׆ׇ        
אבגדהוזחטיךכלםמן
נסעףפץצקרשת     
װױײ׳״           
           ؋،؍؎؏
ؐؑؒؓؔؕ     ؛  ؞؟
 ءآأؤإئابةتثجحخد
ذرزسشصضطظعغ     
ـفقكلمنهوىيًٌٍَُ
ِّْٕٖٜٓٔٗ٘ٙٚٛٝٞ 
٠١٢٣٤٥٦٧٨٩٪٫٬٭ٮٯ
ٰٱٲٳٴٵٶٷٸٹٺٻټٽپٿ
ڀځڂڃڄڅچڇڈډڊڋڌڍڎڏ
ڐڑڒړڔڕږڗژڙښڛڜڝڞڟ
ڠڡڢڣڤڥڦڧڨکڪګڬڭڮگ
ڰڱڲڳڴڵڶڷڸڹںڻڼڽھڿ
ۀہۂۃۄۅۆۇۈۉۊۋیۍێۏ
ېۑےۓ۔ەۖۗۘۙۚۛۜ۝۞۟
ۣ۠ۡۢۤۥۦۧۨ۩۪ۭ۫۬ۮۯ
۰۱۲۳۴۵۶۷۸۹ۺۻۼ۽۾ۿ
܀܁܂܃܄܅܆܇܈܉܊܋܌܍  
ܐܑܒܓܔܕܖܗܘܙܚܛܜܝܞܟ
ܠܡܢܣܤܥܦܧܨܩܪܫܬܭܮܯ
ܱܴܷܸܹܻܼܾܰܲܳܵܶܺܽܿ
݂݄݆݈݀݁݃݅݇݉݊  ݍݎݏ
ݐݑݒݓݔݕݖݗݘݙݚݛݜݝݞݟ
ݠݡݢݣݤݥݦݧݨݩݪݫݬݭ  
                
ހށނރބޅކއވމފދތލގޏ
ސޑޒޓޔޕޖޗޘޙޚޛޜޝޞޟ
ޠޡޢޣޤޥަާިީުޫެޭޮޯ
ްޱ
[Chart rows 07c0 to 0ff0: the glyph cells for these rows are not preserved in this copy.]
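Rows in the layout of the chart above can be regenerated for any range. The following is a minimal sketch under stated assumptions: the class `ChartRow` and its `row` method are hypothetical names, a `.` stands in for control, surrogate, and unassigned code points, and which code points count as assigned depends on the Unicode version of the running JDK.

```java
// Hypothetical helper for illustration; not taken from the original article.
class ChartRow {
    /** Formats one chart row: the row label followed by 16 consecutive code points. */
    static String row(int start) {
        StringBuilder sb = new StringBuilder(String.format("%04x ", start));
        for (int cp = start; cp < start + 16; cp++) {
            boolean surrogate = cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE;
            if (Character.isDefined(cp) && !Character.isISOControl(cp) && !surrogate) {
                sb.appendCodePoint(cp);
            } else {
                sb.append('.'); // placeholder for control, surrogate, or unassigned cells
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print the four rows that cover U+0040 through U+007F.
        for (int start = 0x0040; start <= 0x0070; start += 0x10) {
            System.out.println(row(start));
        }
    }
}
```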

 


--------------------------------------------------

Filtering invisible characters

 

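import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Removes C0 control characters, DEL, C1 control characters, and the no-break space.
// Note that tab, line feed, and carriage return fall inside 0000-001f and are removed too.
public static String replaceUnicode(String sourceStr)
{
    String regex = "["
        + "\u0000-\u001f" // C0 control characters
        + "\u007f-\u00a0" // DEL, C1 control characters, and no-break space (U+00A0)
        + "]";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(sourceStr);
    return matcher.replaceAll("");
}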
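The function above enumerates code point ranges by hand. A different technique, not the one used in the original article, is to match Unicode general categories, which Java's regex engine supports through `\p{...}`. The sketch below (the class name `InvisibleFilter` and method `strip` are hypothetical) removes control characters (`Cc`) and format characters (`Cf`, which includes U+200B ZERO WIDTH SPACE and U+FEFF). Unlike the range-based version, it leaves the no-break space U+00A0 in place, because that character is a space separator rather than a control or format character.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; a category-based alternative to explicit ranges.
class InvisibleFilter {
    // \p{Cc}: control characters; \p{Cf}: format characters such as U+200B and U+FEFF.
    private static final Pattern INVISIBLE = Pattern.compile("[\\p{Cc}\\p{Cf}]");

    /** Returns the input with all control and format characters removed. */
    static String strip(String source) {
        return INVISIBLE.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String dirty = "a\u0000b\u200Bc\uFEFF";
        System.out.println(strip(dirty).length()); // length after stripping
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call, which matters if the filter runs on each incoming string.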

 

To strip characters from every block listed above instead, change the regular expression as follows:

String regex = "["
    + "\u0000-\u007f" // C0 Controls and Basic Latin
    + "\u0080-\u00ff" // C1 Controls and Latin-1 Supplement
    + "\u0100-\u017f" // Latin Extended-A
    + "\u0180-\u024f" // Latin Extended-B
    + "\u0250-\u02af" // IPA Extensions
    + "\u02b0-\u02ff" // Spacing Modifier Letters
    + "\u0300-\u036f" // Combining Diacritical Marks
    + "\u0370-\u03ff" // Greek and Coptic
    + "\u0400-\u04ff" // Cyrillic
    + "\u0500-\u052f" // Cyrillic Supplement
    + "\u0530-\u058f" // Armenian
    + "\u0590-\u05ff" // Hebrew
    + "\u0600-\u06ff" // Arabic
    + "\u0700-\u074f" // Syriac
    + "\u0750-\u077f" // Arabic Supplement
    + "\u0780-\u07bf" // Thaana
    + "\u07c0-\u07ff" // N'Ko (the end point 077f in the source made this range invalid)
    + "\u0800-\u085f" // Avestan and Pahlavi
    + "\u0860-\u087f" // Mandaic
    + "\u0880-\u08af" // Samaritan
    + "\u0900-\u097f" // Devanagari
    + "\u0980-\u09ff" // Bengali
    + "\u0a00-\u0a7f" // Gurmukhi
    + "\u0a80-\u0aff" // Gujarati
    + "\u0b00-\u0b7f" // Oriya
    + "\u0b80-\u0bff" // Tamil
    + "\u0c00-\u0c7f" // Telugu
    + "\u0c80-\u0cff" // Kannada
    + "\u0d00-\u0d7f" // Malayalam
    + "\u0d80-\u0dff" // Sinhala
    + "\u0e00-\u0e7f" // Thai
    + "\u0e80-\u0eff" // Lao
    + "\u0f00-\u0fff" // Tibetan
    + "\u1000-\u109f" // Myanmar
    + "\u10a0-\u10ff" // Georgian
    + "\u1100-\u11ff" // Hangul Jamo
    + "\u1200-\u137f" // Ethiopic
    + "\u1380-\u139f" // Ethiopic Supplement
    + "\u13a0-\u13ff" // Cherokee
    + "\u1400-\u167f" // Unified Canadian Aboriginal Syllabics
    + "\u1680-\u169f" // Ogham
    + "\u16a0-\u16ff" // Runic
    + "\u1700-\u171f" // Tagalog
    + "\u1720-\u173f" // Hanunóo
    + "\u1740-\u175f" // Buhid
    + "\u1760-\u177f" // Tagbanwa
    + "\u1780-\u17ff" // Khmer
    + "\u1800-\u18af" // Mongolian
    + "\u18b0-\u18ff" // Cham
    + "\u1900-\u194f" // Limbu
    + "\u1950-\u197f" // Tai Le
    + "\u1980-\u19df" // New Tai Lue
    + "\u19e0-\u19ff" // Khmer Symbols
    + "\u1a00-\u1a1f" // Buginese
    + "\u1a20-\u1a5f" // Batak
    + "\u1a80-\u1aef" // Lanna
    + "\u1b00-\u1b7f" // Balinese
    + "\u1b80-\u1bb0" // Sundanese
    + "\u1bc0-\u1bff" // Pahawh Hmong
    + "\u1c00-\u1c4f" // Lepcha
    + "\u1c50-\u1c7f" // Ol Chiki
    + "\u1c80-\u1cdf" // Meithei/Manipuri
    + "\u1d00-\u1d7f" // Phonetic Extensions
    + "\u1d80-\u1dbf" // Phonetic Extensions Supplement
    + "\u1dc0-\u1dff" // Combining Diacritical Marks Supplement
    + "\u1e00-\u1eff" // Latin Extended Additional
    + "\u1f00-\u1fff" // Greek Extended
    + "\u2000-\u206f" // General Punctuation
    + "\u2070-\u209f" // Superscripts and Subscripts
    + "\u20a0-\u20cf" // Currency Symbols
    + "\u20d0-\u20ff" // Combining Diacritical Marks for Symbols
    + "\u2100-\u214f" // Letterlike Symbols
    + "\u2150-\u218f" // Number Forms
    + "\u2190-\u21ff" // Arrows
    + "\u2200-\u22ff" // Mathematical Operators
    + "\u2300-\u23ff" // Miscellaneous Technical
    + "\u2400-\u243f" // Control Pictures
    + "\u2440-\u245f" // Optical Character Recognition
    + "\u2460-\u24ff" // Enclosed Alphanumerics
    + "\u2500-\u257f" // Box Drawing
    + "\u2580-\u259f" // Block Elements
    + "\u25a0-\u25ff" // Geometric Shapes
    + "\u2600-\u26ff" // Miscellaneous Symbols
    + "\u2700-\u27bf" // Dingbats
    + "\u27c0-\u27ef" // Miscellaneous Mathematical Symbols-A
    + "\u27f0-\u27ff" // Supplemental Arrows-A
    + "\u2800-\u28ff" // Braille Patterns
    + "\u2900-\u297f" // Supplemental Arrows-B
    + "\u2980-\u29ff" // Miscellaneous Mathematical Symbols-B
    + "\u2a00-\u2aff" // Supplemental Mathematical Operators
    + "\u2b00-\u2bff" // Miscellaneous Symbols and Arrows
    + "\u2c00-\u2c5f" // Glagolitic
    + "\u2c60-\u2c7f" // Latin Extended-C
    + "\u2c80-\u2cff" // Coptic
    + "\u2d00-\u2d2f" // Georgian Supplement
    + "\u2d30-\u2d7f" // Tifinagh
    + "\u2d80-\u2ddf" // Ethiopic Extended
    + "\u2e00-\u2e7f" // Supplemental Punctuation
    + "\u2e80-\u2eff" // CJK Radicals Supplement
    + "\u2f00-\u2fdf" // Kangxi Radicals
    + "\u2ff0-\u2fff" // Ideographic Description Characters
    + "\u3000-\u303f" // CJK Symbols and Punctuation
    + "\u3040-\u309f" // Hiragana
    + "\u30a0-\u30ff" // Katakana
    + "\u3100-\u312f" // Bopomofo
    + "\u3130-\u318f" // Hangul Compatibility Jamo
    + "\u3190-\u319f" // Kanbun
    + "\u31a0-\u31bf" // Bopomofo Extended
    + "\u31c0-\u31ef" // CJK Strokes
    + "\u31f0-\u31ff" // Katakana Phonetic Extensions
    + "\u3200-\u32ff" // Enclosed CJK Letters and Months
    + "\u3300-\u33ff" // CJK Compatibility
    + "\u3400-\u4dbf" // CJK Unified Ideographs Extension A
    + "\u4dc0-\u4dff" // Yijing Hexagram Symbols
    + "\u4e00-\u9fbf" // CJK Unified Ideographs
    + "\ua000-\ua48f" // Yi Syllables
    + "\ua490-\ua4cf" // Yi Radicals
    + "\ua500-\ua61f" // Vai
    + "\ua660-\ua6ff" // Unified Canadian Aboriginal Syllabics Supplement
    + "\ua700-\ua71f" // Modifier Tone Letters
    + "\ua720-\ua7ff" // Latin Extended-D
    + "\ua800-\ua82f" // Syloti Nagri
    + "\ua840-\ua87f" // Phags-pa
    + "\ua880-\ua8df" // Saurashtra
    + "\ua900-\ua97f" // Javanese
    + "\ua980-\ua9df" // Chakma
    + "\uaa00-\uaa3f" // Varang Kshiti
    + "\uaa40-\uaa6f" // Sorang Sompeng
    + "\uaa80-\uaadf" // Newari
    + "\uab00-\uab5f" // Việt Thái
    + "\uab80-\uaba0" // Kayah Li
    + "\uac00-\ud7af" // Hangul Syllables
    //+ "\ud800-\udbff" // High-half zone of UTF-16 (high surrogates)
    //+ "\udc00-\udfff" // Low-half zone of UTF-16 (low surrogates)
    + "\ue000-\uf8ff" // Private Use Area
    + "\uf900-\ufaff" // CJK Compatibility Ideographs
    + "\ufb00-\ufb4f" // Alphabetic Presentation Forms
    + "\ufb50-\ufdff" // Arabic Presentation Forms-A
    + "\ufe00-\ufe0f" // Variation Selectors
    + "\ufe10-\ufe1f" // Vertical Forms
    + "\ufe20-\ufe2f" // Combining Half Marks
    + "\ufe30-\ufe4f" // CJK Compatibility Forms
    + "\ufe50-\ufe6f" // Small Form Variants
    + "\ufe70-\ufeff" // Arabic Presentation Forms-B
    + "\uff00-\uffef" // Halfwidth and Fullwidth Forms
    + "\ufff0-\uffff" // Specials
    + "]";
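A character class built from four-digit escapes only covers the Basic Multilingual Plane, and the surrogate ranges are commented out, so characters above U+FFFF pass through untouched. One way to treat all planes uniformly is to walk the string by code point and test each code point's block. This is a sketch of that alternative, not the article's method; the class `BlockFilter` and method `removeBlocks` are hypothetical names, and block membership comes from the JDK's own table.

```java
import java.util.Set;

// Hypothetical helper for illustration; filters by code point instead of by regex range.
class BlockFilter {
    /** Removes every code point whose Unicode block is in the given set. */
    static String removeBlocks(String source, Set<Character.UnicodeBlock> blocks) {
        StringBuilder sb = new StringBuilder(source.length());
        source.codePoints().forEach(cp -> {
            Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
            // Code points outside any defined block (null) are kept as they are.
            if (block == null || !blocks.contains(block)) {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<Character.UnicodeBlock> blocks = Set.of(
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS,
            Character.UnicodeBlock.EMOTICONS);
        // U+4E2D U+6587 are CJK ideographs; U+1F600 is encoded as a surrogate pair.
        System.out.println(removeBlocks("abc\u4E2D\u6587\uD83D\uDE00", blocks));
    }
}
```

`Set.of` requires Java 9 or later; on older versions a `HashSet` populated by hand serves the same purpose.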

    Reposted from:
    http://www.cnblogs.com/fan-yuan/p/8176886.html

    If the link above no longer works, see:
    https://zh.wikibooks.org/wiki/unicode/字符参考/2000-2fff
