Reference: http://www.w3.org/International/questions/qa-forms-utf-8.en.php

$result = preg_match('%^(?:
      [\x09\x0A\x0D\x20-\x7E]            # ASCII
    | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
    | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
    | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
    | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
)*$%xs', $str);

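If $result is truthy (preg_match() returned 1), the string is valid UTF-8; otherwise it is assumed to be in an "ANSI" (legacy, non-UTF-8) encoding.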
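The check above can be wrapped in a small helper so the result is always a strict boolean. This is only a sketch: the function name is_utf8 is hypothetical and not part of the original snippet. Keep in mind that the test is a heuristic, because some double-byte legacy sequences (lead byte C2-DF followed by a trail byte 80-BF) are also well-formed UTF-8.

```php
<?php
// Hypothetical helper name; it only wraps the W3C pattern shown above.
function is_utf8(string $str): bool
{
    $pattern = '%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs';

    // preg_match() returns 1 on a match, 0 on no match, and false on error,
    // so compare against 1 explicitly.
    return preg_match($pattern, $str) === 1;
}

var_dump(is_utf8("abc"));           // bool(true): plain ASCII
var_dump(is_utf8("\xE4\xB8\xAD"));  // bool(true): U+4E2D encoded as UTF-8
var_dump(is_utf8("\xD6\xD0"));      // bool(false): the same character in GBK
```

Pure ASCII input passes the check as well, since ASCII is a subset of UTF-8.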

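Based on that result, match the Chinese characters in the string. Note that the UTF-8 branch matches every 3-byte sequence whose lead byte is E1-EC, EE, or EF, so it also picks up non-Chinese characters such as kana and full-width punctuation; the fallback branch simply grabs every byte pair that starts with a high byte.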

if ($result) {
    preg_match_all("/[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}/", $str, $arr);
    print_r($arr[0]);
} else {
    preg_match_all("/[\x80-\xFF]./", $str, $arr);
    print_r($arr[0]);
}
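If only Han characters are wanted, a narrower approach is to normalize the input to UTF-8 first and then match the Unicode Han script property. The following is a minimal sketch under stated assumptions, not the original author's method: the function name extract_han is hypothetical, the mbstring extension must be loaded, and the legacy encoding is assumed to be CP936 (Microsoft's GBK code page), since the original only calls it "ANSI". mb_check_encoding() is used here in place of the regex; unlike the W3C pattern, it also accepts control characters.

```php
<?php
// Hypothetical helper: returns the Han characters in $str as UTF-8 strings.
// $legacyEncoding is an assumption; adjust it to the real source encoding.
function extract_han(string $str, string $legacyEncoding = 'CP936'): array
{
    // Convert non-UTF-8 input so a single pattern can handle both cases.
    if (!mb_check_encoding($str, 'UTF-8')) {
        $str = mb_convert_encoding($str, 'UTF-8', $legacyEncoding);
    }

    // \p{Han} matches characters of the Han script; the u modifier makes
    // PCRE treat both the pattern and the subject as UTF-8.
    preg_match_all('/\p{Han}/u', $str, $matches);

    return $matches[0];
}

// UTF-8 input: "PHP", two Han characters, an ideographic full stop, "ok".
print_r(extract_han("PHP\xE4\xB8\xAD\xE6\x96\x87\xE3\x80\x82ok"));

// The same two Han characters encoded as GBK bytes.
print_r(extract_han("\xD6\xD0\xCE\xC4"));
```

Both calls should yield the same two characters. The ideographic full stop (U+3002) is left out because it belongs to the Common script rather than Han, whereas the byte-range pattern in the original would have matched it.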