Understanding encoding on Android, part 1

Source: Internet  Editor: 程序博客网  Time: 2024/06/04 08:26

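I have recently been studying your code and noticed that in the ffmpeg id3v1.c file you added the function below, which converts the artist name of an MP3 file from GBK to another encoding. My understanding is that GBK ultimately has to be converted to UTF-8, otherwise Chinese text will be garbled. But the conversion below is very simple and is not a GBK-to-UTF-8 conversion. My question is: what format does this function convert GBK into? I would appreciate your help.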
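
For comparison, a true GBK-to-UTF-8 conversion cannot be done with byte arithmetic alone; it needs a table-driven converter. The sketch below uses POSIX iconv and assumes a platform whose iconv recognizes the "GBK" charset name (glibc does; on Android a bundled libiconv or ICU would probably be needed). `gbk_to_utf8` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): convert a GBK byte string
 * to a NUL-terminated UTF-8 string using POSIX iconv.
 * Returns the number of UTF-8 bytes written (excluding the NUL), or -1 if
 * the converter is unavailable, the input is malformed, or the output
 * buffer is too small.
 */
static long gbk_to_utf8(const char *in, size_t in_len,
                        char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", "GBK");  /* arguments are (to, from) */
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;           /* iconv takes char ** even for input */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_size - 1;   /* reserve one byte for the NUL */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *dst = '\0';
    return (long)(dst - out);
}
```

With the GBK bytes D6 D0 CE C4 ("中文") as input, a working converter should produce the six UTF-8 bytes E4 B8 AD E6 96 87.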


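Recently, while playing audio in the media center, I found that Chinese artist names are displayed as garbled text after ffmpeg reads the metadata. The main cause is that the artist name ffmpeg extracts is GBK-encoded and is handed directly to the Java layer through Android's NewStringUTF, which makes it display incorrectly. The patch below converts GBK to UTF-8 before passing it up to the Java layer.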
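
To illustrate why raw GBK bytes are a problem here: NewStringUTF expects (modified) UTF-8, and typical GBK text does not have the UTF-8 lead/continuation byte structure. The checker below is a minimal sketch; `looks_like_utf8` is a hypothetical helper that only checks byte structure and skips overlong and surrogate checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if buf[0..len) has the byte structure of
 * UTF-8 (a valid lead byte followed by the right number of continuation
 * bytes), 0 otherwise. Overlong forms and surrogates are not rejected.
 */
static int looks_like_utf8(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        uint8_t c = buf[i];
        size_t extra;
        if (c < 0x80)                extra = 0;  /* ASCII */
        else if ((c & 0xE0) == 0xC0) extra = 1;  /* 2-byte sequence */
        else if ((c & 0xF0) == 0xE0) extra = 2;  /* 3-byte sequence */
        else if ((c & 0xF8) == 0xF0) extra = 3;  /* 4-byte sequence */
        else return 0;          /* stray continuation or invalid lead byte */

        if (extra > len - i - 1)
            return 0;           /* sequence is truncated */
        for (size_t k = 1; k <= extra; ++k)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;       /* expected a 10xxxxxx continuation byte */
        i += extra + 1;
    }
    return 1;
}
```

For the GBK bytes D6 D0 CE C4 ("中文") the check fails at the second byte, because D0 is not a continuation byte. Some GBK byte pairs can pass a structural check like this by coincidence, so it is a diagnostic aid rather than an encoding detector.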


+static void convert_iso8859_to_string(const uint8_t *data, int size, char *s) {
+    int utf8len = 0;
+    int i;
+       
+    for (i = 0; i < size; ++i) {
+        if (data[i] == '\0') {
+            size = i;
+            break;
+        } else if (data[i] < 0x80) {
+            ++utf8len;
+        } else {
+            utf8len += 2;
+        }
+    }
+
+    if (utf8len == size) {
+        // Only ASCII characters present.
+
+        memcpy(s, data, size);
+        s[size] = '\0';
+        return;
+    }
+
+    char *ptr = s;
+    for (i = 0; i < size; ++i) {
+        if (data[i] == '\0') {
+            break;
+        } else if (data[i] < 0x80) {
+            *ptr++ = data[i];
+        } else if (data[i] < 0xc0) {
+            *ptr++ = 0xc2;
+            *ptr++ = data[i];
+        } else {
+            *ptr++ = 0xc3;
+            *ptr++ = data[i] - 64;
+        }
+    }
+    *ptr = '\0';
+
+}
+
 static void get_string(AVFormatContext *s, const char *key,
                        const uint8_t *buf, int buf_size)
 {
     int i, c;
     char *q, str[512];
 
+    convert_iso8859_to_string(buf, buf_size, str);

+#if 0
     q = str;
     for(i = 0; i < buf_size; i++) {
         c = buf[i];
@@ -191,6 +234,7 @@ static void get_string(AVFormatContext *s, const char *key,
         *q++ = c;
     }
     *q = '\0';
+#endif
 
     if (*str)
         av_dict_set(&s->metadata, key, str, 0);
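
Reading the loop in convert_iso8859_to_string, each input byte is treated as a single ISO-8859-1 code point (U+0000 to U+00FF) and written out as UTF-8: bytes 0x80 to 0xBF become C2 xx, and bytes 0xC0 to 0xFF become C3 (xx - 0x40). The result is valid UTF-8, but for GBK input it encodes Latin-1 characters rather than the original Chinese text. The sketch below restates the same mapping compactly; `latin1_to_utf8` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Same byte mapping as the second loop of the patch, restated compactly:
 * every input byte is taken as one ISO-8859-1 code point and emitted as
 * UTF-8. `out` must hold at least 2 * size + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
static size_t latin1_to_utf8(const uint8_t *data, size_t size, uint8_t *out)
{
    assert(data != NULL && out != NULL);
    uint8_t *p = out;
    for (size_t i = 0; i < size && data[i] != '\0'; ++i) {
        if (data[i] < 0x80) {
            *p++ = data[i];                             /* ASCII: unchanged */
        } else {
            *p++ = (uint8_t)(0xC0 | (data[i] >> 6));    /* 0xC2 or 0xC3 */
            *p++ = (uint8_t)(0x80 | (data[i] & 0x3F));  /* continuation byte */
        }
    }
    *p = '\0';
    return (size_t)(p - out);
}
```

Feeding it the GBK bytes D6 D0 CE C4 ("中文") yields C3 96 C3 90 C3 8E C3 84, which is the UTF-8 encoding of "ÖÐÎÄ". The mapping is reversible, so a higher layer could in principle recover the original bytes and decode them as GBK, but nothing in the patch shown here does that.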



UTF-8 encoding table

http://blog.csdn.net/qiaqia609/article/details/8069678


GBK encoding table

http://blog.csdn.net/qiaqia609/article/details/8069655


UTF-8 Chinese character encoding reference table (Information and Communication, Engineering)

http://cache.baiducontent.com/c?m=9d78d513d98407fb4fece4741a16a671695797143ec0a11568a3e35cd424054e1d20a5f930236319ce802b3b58e85e5c9da06529614437b7ec99d515c0ffc97f6a957332211c864613d51bffcd17259621c45decaf1ce3bba66184aea589990b0d&p=9b3fc64ad4d015b708e29778065594&newp=8e6acc1487d512a05abd9b7d0b1da5231611d73f6590cf512496fe4b98&user=baidu&fm=sc&query=utf8+%B1%E0%C2%EB&qid=dcbcd21c000092cc&p1=5


Unicode code table for full-width characters

http://blog.csdn.net/lvwx369/article/details/39294265


Unicode code table (IT/Computing, Professional materials)

http://cache.baiducontent.com/c?m=9f65cb4a8c8507ed4fece76310478a215915d7743ca080462482d45f93130a1c187ba7e070670d0fd4cf7b6c51ad4f0be0f53570345724bcccc98b41daea963f2fff7d722f42914066934eb8ca30619a77d54eacf259b1b5e743e2b9a5a2c854228d0f5e2bdda6dc4d00659b3ea745&p=8b2a975686cc40ad07f1cf351564&newp=8a769a47999611a059ef8a24565692695c16ed623e9885&user=baidu&fm=sc&query=%CC%EC+unicode+ccec&qid=b21a779500071717&p1=1


Differences between GBK, GB2312, and ISO-8859-1

http://blog.csdn.net/jerry_bj/article/details/5714745

