DBCS Character Sets
Source: Internet. Editor: 程序博客网. Time: 2024/06/05 12:44
A "character set" is a mapping of characters to their identifying code values. The character set most commonly used in computers today is Unicode, a global standard for character encoding.
Most applications written today handle character data primarily as Unicode, using the UTF-16 encoding. However, many legacy applications continue to use character sets based on code pages. Even new applications sometimes have to work with code pages, often for one of the following reasons:
- To communicate with legacy applications.
- To communicate with older mail and news servers, which might not always support Unicode.
- To communicate with the Windows Console, whose byte-oriented input and output use code pages.
Code page is another term for character encoding. It consists of a table of values that describes the character set for a particular language. Each code page is identified by a code page identifier, for example 1252, and is handled by the Unicode and character set API functions. For a list of supported code page identifiers, see Code Page Identifiers. For the most consistent results, applications should use a Unicode encoding, such as UTF-8 or UTF-16, instead of a specific code page.
Windows code pages, commonly called "ANSI code pages", are code pages for which non-ASCII values (values greater than 127) represent international characters. These code pages are used natively in Windows Me, and are also available on Windows NT and later.
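As a rough illustration of how a code page assigns meaning to byte values above 127, the following sketch uses Python's built-in codecs rather than the Windows API functions mentioned above. It assumes that the Python codec names "cp1252" and "cp1251" correspond to Windows code pages 1252 and 1251; the sample string is arbitrary.

```python
# Assumption: Python's "cp1252"/"cp1251" codecs stand in for the Windows
# code pages with the same identifiers.
text = "caf\u00e9"  # "cafe" with an acute accent on the final letter

cp1252_bytes = text.encode("cp1252")  # one byte per character
utf8_bytes = text.encode("utf-8")     # the accented letter takes two bytes

print(cp1252_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'

# The same non-ASCII byte value names a different character on each code page.
byte = b"\xe9"
western = byte.decode("cp1252")   # Latin small letter e with acute (U+00E9)
cyrillic = byte.decode("cp1251")  # Cyrillic small letter short i (U+0439)
print(ascii(western), ascii(cyrillic))
```

The bytes alone do not say which code page produced them, which is one reason a Unicode encoding gives more consistent results.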
A single-byte character set (SBCS) is a mapping of 256 individual characters to their identifying code values, implemented as a code page.
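A minimal sketch of the SBCS idea, assuming Python's "cp437" codec as the example code page (chosen here only because every one of its 256 byte values is assigned): each byte value maps to exactly one character, and each character maps back to exactly one byte.

```python
# Decode every possible byte value with a single-byte code page.
all_bytes = bytes(range(256))
table = all_bytes.decode("cp437")

print(len(table))       # 256 characters, one per byte value
print(len(set(table)))  # 256 distinct characters

# Every character in the table encodes back to a single byte.
round_trip = table.encode("cp437")
print(round_trip == all_bytes)  # True
```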
A double-byte character set (DBCS), also known as an "expanded 8-bit character set", is an extended single-byte character set (SBCS), implemented as a code page. DBCSs were originally developed to extend the SBCS design to handle languages such as Japanese and Chinese.
Each DBCS code page supports a different subset of characters, differently encoded, and no page supports the full breadth of characters provided by Unicode. Data converted from one DBCS code page to another is subject to corruption, because the same data value can encode a different character on each code page. Data converted from Unicode to DBCS is subject to data loss, because a given code page might not be able to represent every character used in that particular Unicode data.
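The corruption and data-loss cases can be sketched with Python's codecs, assuming "cp932" (Japanese) and "cp936" (Simplified Chinese) as two representative DBCS code pages. The sample strings and variable names are illustrative only.

```python
# Assumption: Python's "cp932" and "cp936" codecs stand in for the Windows
# DBCS code pages with the same identifiers.
japanese = "\u65e5\u672c\u8a9e"  # the three characters of "Japanese language"

# In a DBCS, ASCII stays one byte while each of these characters takes two.
sjis_bytes = japanese.encode("cp932")
mixed_len = len(("A" + japanese[0]).encode("cp932"))
print(len(sjis_bytes), mixed_len)  # 6 3

# Corruption: the same bytes read under another DBCS code page decode to
# unrelated characters.
misread = sjis_bytes.decode("cp936", errors="replace")
print(misread == japanese)  # False

# Data loss: Korean text has no representation in the Japanese code page,
# so each character is replaced by "?" and cannot be recovered.
korean = "\ud55c\uad6d\uc5b4"
lossy = korean.encode("cp932", errors="replace")
print(lossy)  # b'???'
```

Without `errors="replace"`, the last call raises `UnicodeEncodeError` instead, which is often the safer default because the loss does not go unnoticed.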
http://msdn.microsoft.com/en-us/library/windows/desktop/dd317794(v=vs.85).aspx