NewStringUTF in JNI fails with: input is not valid Modified UTF-8: illegal start byte 0xa0 (undefined hex character)

Source: Internet | Published by: java面试线程池回答 | Editor: 程序博客网 | Time: 2024/05/16 07:10

Error:

05-20 10:35:30.702: A/art(32149): art/runtime/check_jni.cc:65] JNI DETECTED ERROR IN APPLICATION: input is not valid Modified UTF-8: illegal start byte 0xa0

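Cause: the string passed to NewStringUTF() was declared as char *.

Solution:

Changing the declaration from char * to const char * resolved the problem here. Keep in mind that the JNI check inspects the bytes of the string rather than the pointer type, so the change presumably helped because the pointer then referred to well-formed, NUL-terminated data.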
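
If the const change alone does not help, one possible cause (an assumption, not something established in this post) is that the bytes are in a single-byte encoding such as ISO-8859-1, where 0xA0 is the non-breaking space. A minimal sketch of re-encoding such bytes as UTF-8 before handing them to NewStringUTF(); latin1ToUtf8 is a hypothetical helper name, not a JNI function:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of JNI): re-encode ISO-8859-1 bytes as UTF-8.
// Assumption: the input really is ISO-8859-1. Other encodings (GBK, etc.)
// need a real converter instead of this sketch.
std::string latin1ToUtf8(const char* latin1) {
    assert(latin1 != nullptr);
    std::string out;
    for (const char* p = latin1; *p != '\0'; ++p) {
        // Read through unsigned char so 0xA0 is 160 rather than a negative value.
        unsigned char c = static_cast<unsigned char>(*p);
        if (c < 0x80) {
            // ASCII: copied as a single byte.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two bytes, 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For code points in this range Modified UTF-8 and standard UTF-8 agree, so the result could then be passed along as env->NewStringUTF(latin1ToUtf8(raw).c_str()), where raw stands for the original buffer.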

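Analysis:

When NewStringUTF() is called, the Dalvik VM validates the format of the string through checkUtfBytes(), which is called from checkUtfString(). The byte 0xa0 has the bit pattern 1010 in its high four bits, which marks a continuation byte, so it is rejected when it appears where a character should start. The log above comes from ART (art/runtime/check_jni.cc), which reports the same kind of error; the Dalvik implementation is shown here.

The source of checkUtfBytes is in <Android-src>/dalvik/vm/CheckJni.cpp, as follows: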

/*
 * Verify that "bytes" points to valid "modified UTF-8" data.
 */
void checkUtfString(const char* bytes, bool nullable) {
    if (bytes == NULL) {
        if (!nullable) {
            ALOGW("JNI WARNING: non-nullable const char* was NULL (%s)", mFunctionName);
            showLocation();
            abortMaybe();
        }
        return;
    }

    const char* errorKind = NULL;
    u1 utf8 = checkUtfBytes(bytes, &errorKind);
    if (errorKind != NULL) {
        ALOGW("JNI WARNING: %s input is not valid Modified UTF-8: illegal %s byte %#x",
              mFunctionName, errorKind, utf8);
        ALOGW(" string: '%s'", bytes);
        showLocation();
        abortMaybe();
    }
}

static u1 checkUtfBytes(const char* bytes, const char** errorKind) {
    while (*bytes != '\0') {
        u1 utf8 = *(bytes++);
        // Switch on the high four bits.
        switch (utf8 >> 4) {
        case 0x00:
        case 0x01:
        case 0x02:
        case 0x03:
        case 0x04:
        case 0x05:
        case 0x06:
        case 0x07:
            // Bit pattern 0xxx. No need for any extra bytes.
            break;
        case 0x08:
        case 0x09:
        case 0x0a:
        case 0x0b:
        case 0x0f:
            /*
             * Bit pattern 10xx or 1111, which are illegal start bytes.
             * Note: 1111 is valid for normal UTF-8, but not the
             * modified UTF-8 used here.
             */
            *errorKind = "start";
            return utf8;
        case 0x0e:
            // Bit pattern 1110, so there are two additional bytes.
            utf8 = *(bytes++);
            if ((utf8 & 0xc0) != 0x80) {
                *errorKind = "continuation";
                return utf8;
            }
            // Fall through to take care of the final byte.
        case 0x0c:
        case 0x0d:
            // Bit pattern 110x, so there is one additional byte.
            utf8 = *(bytes++);
            if ((utf8 & 0xc0) != 0x80) {
                *errorKind = "continuation";
                return utf8;
            }
            break;
        }
    }
    return 0;
}
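
To see the check reject 0xa0 outside of Android, the same logic can be compiled on its own. The following is a sketch under stated assumptions: checkUtfBytesDemo is a standalone copy of the function above, u1 is defined locally as a stand-in for Dalvik's typedef, and utfErrorKind is a hypothetical convenience wrapper that is not in the Android source.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

typedef uint8_t u1;  // local stand-in for Dalvik's u1

// Standalone copy of the checkUtfBytes logic shown above.
// Returns the offending byte and sets *errorKind, or returns 0 if all is well.
static u1 checkUtfBytesDemo(const char* bytes, const char** errorKind) {
    while (*bytes != '\0') {
        u1 utf8 = *(bytes++);
        switch (utf8 >> 4) {
        case 0x08: case 0x09: case 0x0a: case 0x0b: case 0x0f:
            // 10xx or 1111: not allowed at the start of a character.
            *errorKind = "start";
            return utf8;
        case 0x0e:
            // 1110: two more bytes expected; check the first one here.
            utf8 = *(bytes++);
            if ((utf8 & 0xc0) != 0x80) {
                *errorKind = "continuation";
                return utf8;
            }
            // fall through
        case 0x0c: case 0x0d:
            // 110x: one more byte expected.
            utf8 = *(bytes++);
            if ((utf8 & 0xc0) != 0x80) {
                *errorKind = "continuation";
                return utf8;
            }
            break;
        default:
            // 0xxx: plain ASCII, nothing else to read.
            break;
        }
    }
    return 0;
}

// Hypothetical wrapper: "" if valid, otherwise "start" or "continuation".
std::string utfErrorKind(const char* bytes) {
    assert(bytes != nullptr);
    const char* kind = nullptr;
    checkUtfBytesDemo(bytes, &kind);
    return kind ? kind : "";
}
```

With this, utfErrorKind("ab\xa0") reports a "start" error, matching the log message, while the two-byte sequence "\xc2\xa0" (the UTF-8 encoding of U+00A0) passes. A four-byte standard UTF-8 sequence such as an emoji is also rejected as "start", because Modified UTF-8 does not allow the 1111 prefix.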

How char *, const char *, and unsigned char * differ in the way they handle strings in C, and how they relate to one another, still needs further study.
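
As a starting point for that comparison, here is a small sketch of two facts that matter for the check above: const only restricts writes through the pointer and leaves the bytes untouched, and reading a byte through unsigned char (as the u1 in checkUtfBytes does) gives a value in 0..255, whereas the signedness of plain char is implementation-defined. The helper names highNibble and sameBytes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reads the high four bits the way checkUtfBytes does: via an unsigned byte,
// so 0xa0 yields 0x0a regardless of whether plain char is signed.
unsigned highNibble(const char* p) {
    assert(p != nullptr);
    unsigned char b = static_cast<unsigned char>(*p);
    return b >> 4;
}

// A char* and a const char* to the same buffer see exactly the same bytes;
// const changes what the compiler lets you write, not what is stored.
bool sameBytes(char* mutableView, const char* constView, std::size_t n) {
    return std::memcmp(mutableView, constView, n) == 0;
}
```

For a buffer such as char buf[] = "\xa0", both highNibble(buf) and highNibble through a const char * alias give 0x0a, which is the case that produces the "start" error.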
