Converting Between UTF-8 and GB2312

Source: Internet | Editor: 程序博客网 | Date: 2024/06/04 20:01

If UTF-8, Unicode, and GB2312 are still unfamiliar to you, see http://www.linuxforum.net/books/UTF-8-Unicode.html

This post introduces two WinAPI functions: WideCharToMultiByte and MultiByteToWideChar.

Function prototypes:

int WideCharToMultiByte(
    UINT CodePage,            // code page
    DWORD dwFlags,            // performance and mapping flags
    LPCWSTR lpWideCharStr,    // wide-character string
    int cchWideChar,          // number of chars in string
    LPSTR lpMultiByteStr,     // buffer for new string
    int cbMultiByte,          // size of buffer
    LPCSTR lpDefaultChar,     // default for unmappable chars
    LPBOOL lpUsedDefaultChar  // set when default char used
); // converts a wide-character string to a multibyte (narrow) string

int MultiByteToWideChar(
    UINT CodePage,            // code page
    DWORD dwFlags,            // character-type options
    LPCSTR lpMultiByteStr,    // string to map
    int cbMultiByte,          // number of bytes in string
    LPWSTR lpWideCharStr,     // wide-character buffer
    int cchWideChar           // size of buffer
); // converts a multibyte (narrow) string to a wide-character string
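Converting GB2312 (in practice GBK, which is code page 936 on Windows) to UTF-8 is simple: first use MultiByteToWideChar to convert the GB2312 bytes to Unicode (UTF-16 wide characters on Windows), then use WideCharToMultiByte with CP_UTF8 to convert the Unicode string to UTF-8.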
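
To make the second step (Unicode to UTF-8) concrete, here is a portable, hand-written sketch of the byte layout that a UTF-16 to UTF-8 conversion produces. It does not call WinAPI and is not a description of how WideCharToMultiByte is implemented. The function name `utf16_to_utf8` is hypothetical, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Encode a UTF-16 string as UTF-8 by hand (illustration only).
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // stray low surrogate
        }

        if (cp < 0x80) {                 // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {         // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {       // 3 bytes: most Chinese characters
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                         // 4 bytes: supplementary planes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A character such as U+4E2D, which takes two bytes in GBK, becomes three bytes in UTF-8. This is why the output size cannot be guessed from the input length and should be queried before allocating the buffer.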

Converting UTF-8 to GB2312 is the reverse process: first use MultiByteToWideChar with CP_UTF8 to convert the UTF-8 bytes to Unicode, then use WideCharToMultiByte to convert the Unicode string to GB2312.
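
The first step of this reverse direction (UTF-8 to Unicode) can be illustrated with a portable, hand-written decoder. This is a minimal sketch, not the WinAPI implementation: the name `utf8_to_utf16` is hypothetical, truncated or malformed sequences are replaced with U+FFFD by assumption, and overlong forms and encoded surrogates are not rejected.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into a UTF-16 string by hand (illustration only).
// Assumption: malformed or truncated input yields U+FFFD.
// Limitation: overlong encodings and encoded surrogates are not rejected.
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra; // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; }
        ++i;

        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                cp = 0xFFFD; // truncated or malformed sequence
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }

        if (cp > 0x10FFFF) {
            out += static_cast<char16_t>(0xFFFD);
        } else if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<char16_t>(cp);
        }
    }
    return out;
}
```

The resulting UTF-16 string corresponds to the intermediate "Unicode" form that would then be passed to WideCharToMultiByte to obtain GB2312 bytes.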
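
Here are two small functions for reference. Both perform the second half of the conversion (from Unicode to a multibyte encoding):
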
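// Unicode to GB2312: converts a wide-character string to the system ANSI
// code page (CP_ACP), which is GBK on Simplified Chinese Windows.
// Assumes a Unicode build, where CString holds wide characters.
// The caller must release the returned buffer with free().
char *cstring2char(CString str)
{
    int len = str.GetLength();
    // First call: query the number of bytes required.
    int nByte = WideCharToMultiByte(CP_ACP, 0, str, len, NULL, 0, NULL, NULL);
    char *buf = new char[nByte + 1];
    // Second call: perform the conversion.
    nByte = WideCharToMultiByte(CP_ACP, 0, str, len, buf, nByte, NULL, NULL);
    buf[nByte] = '\0'; // len excludes the terminator, so add it here
    char *pchar = _strdup(buf);
    delete[] buf;      // allocated with new[], so release with delete[]
    return pchar;
}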
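// Unicode to UTF-8: converts a wide-character string to UTF-8 bytes.
// Assumes a Unicode build, where CString holds wide characters.
// The caller must release the returned buffer with free().
char *UniCode2UTF8(CString str)
{
    int len = str.GetLength();
    // First call: query the number of bytes required.
    int nByte = WideCharToMultiByte(CP_UTF8, 0, str, len, NULL, 0, NULL, NULL);
    char *buf = new char[nByte + 1];
    // Second call: perform the conversion.
    nByte = WideCharToMultiByte(CP_UTF8, 0, str, len, buf, nByte, NULL, NULL);
    buf[nByte] = '\0'; // len excludes the terminator, so add it here
    char *pchar = _strdup(buf);
    delete[] buf;      // allocated with new[], so release with delete[]
    return pchar;
}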