Garbled characters when truncating strings in PHP (UTF-8)

Source: Internet   Published by: 北航大数据研究生   Editor: 程序博客网   Time: 2024/05/07 22:57




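Truncating a string with PHP's substr() can leave garbled characters at the end of the result. substr() counts bytes, not characters: in UTF-8 each Chinese character takes 3 bytes, in GB2312 it takes 2 bytes, and an English (ASCII) character takes 1 byte. A byte offset can therefore land in the middle of a character, and the orphaned bytes are then combined with whatever follows them (here the trailing "..") and displayed as one bogus character. Workaround: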

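    function cut_str($string, $sublen, $start = 0, $code = 'UTF-8')
    {
        if ($code == 'UTF-8') {
            // One alternative per UTF-8 sequence length (1 to 4 bytes).
            // Single quotes leave the \x escapes for PCRE to interpret.
            $pa = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]|[\xe1-\xef][\x80-\xbf][\x80-\xbf]|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]/';
            // $t_string[0] holds the string split into whole characters.
            preg_match_all($pa, $string, $t_string);

            // Append ".." only when characters remain after the cut.
            if (count($t_string[0]) - $start > $sublen) {
                return join('', array_slice($t_string[0], $start, $sublen)) . "..";
            }
            return join('', array_slice($t_string[0], $start, $sublen));
        } else {
            // Double-byte encodings (GB2312/GBK): every character is treated
            // as two bytes, so positions are approximate for mixed ASCII text.
            $start = $start * 2;
            $sublen = $sublen * 2;
            $strlen = strlen($string);
            $tmpstr = '';
            for ($i = 0; $i < $strlen; $i++) {
                // A byte above 0x80 is a lead byte; its partner follows it.
                $wide = ord(substr($string, $i, 1)) > 0x80;
                if ($i >= $start && $i < ($start + $sublen)) {
                    $tmpstr .= substr($string, $i, $wide ? 2 : 1);
                }
                // Skip the trail byte so it is never read as a lead byte.
                if ($wide) $i++;
            }
            if (strlen($tmpstr) < $strlen) $tmpstr .= "..";
            return $tmpstr;
        }
    }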
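To make the difference concrete, the following sketch contrasts a byte cut made by substr() with the character split that cut_str relies on. The sample string and variable names are illustrative only, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumes this source file is saved as UTF-8.
$s = "夸父a到此bc";

// substr() counts bytes: 4 bytes = all of 夸 plus the first byte of 父.
$byteCut = substr($s, 0, 4);

// The pattern from cut_str: one alternative per UTF-8 sequence length.
$pa = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]'
    . '|[\xe1-\xef][\x80-\xbf][\x80-\xbf]'
    . '|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]'
    . '|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]/';
preg_match_all($pa, $s, $m);

// Slicing the array of whole characters can never split one of them.
$charCut = join('', array_slice($m[0], 0, 4));

echo strlen($s), " bytes, ", count($m[0]), " characters\n"; // 15 bytes, 7 characters

// preg_match() with the /u modifier returns false on malformed UTF-8.
var_dump(preg_match('//u', $byteCut)); // bool(false)
var_dump($charCut === "夸父a到");       // bool(true)
```

The byte cut ends with a dangling lead byte, which is what a browser renders as a garbled character; the character cut stays valid.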


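    // GB2312-only variant: $start and $len are byte offsets.
    function msubstr($str, $start, $len) {
        $tmpstr = "";
        // Never read past the end of the string.
        $end = min($start + $len, strlen($str));
        // Walk from byte 0 so lead and trail bytes stay paired.
        for ($i = 0; $i < $end; $i++) {
            // A byte above 0xa0 is a GB2312 lead byte: take two bytes.
            $width = (ord($str[$i]) > 0xa0) ? 2 : 1;
            // Copy only from $start onward (the original ignored $start).
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, $width);
            }
            $i += $width - 1;
        }
        return $tmpstr;
    }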
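msubstr rests on one rule: a byte above 0xa0 starts a two-byte GB2312 character, and anything else is a single byte. Below is a minimal sketch of that walk, using a hypothetical helper gb_chars() that is not part of the original code. The GB2312 bytes for 中 and 文 are written as escapes so the encoding of the source file does not matter.

```php
<?php
// Hypothetical helper: split a GB2312 byte string into whole characters.
function gb_chars($str) {
    $chars = array();
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++) {
        // A byte above 0xa0 is a GB2312 lead byte; its partner follows it.
        $width = (ord($str[$i]) > 0xa0) ? 2 : 1;
        $chars[] = substr($str, $i, $width);
        // Step over the trail byte of a two-byte character.
        $i += $width - 1;
    }
    return $chars;
}

// "中a文" in GB2312: 0xd6 0xd0, then 'a', then 0xce 0xc4.
$gb = "\xd6\xd0" . "a" . "\xce\xc4";
$chars = gb_chars($gb);

echo count($chars), " characters in ", strlen($gb), " bytes\n"; // 3 characters in 5 bytes

// Joining whole characters never leaves a dangling lead byte.
echo bin2hex(join('', array_slice($chars, 0, 2))), "\n"; // d6d061
```

Working on an array of characters, as the UTF-8 branch of cut_str does, also makes start and length mean characters rather than bytes.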

 

Example:

    $sql = "夸父a到此bc一游!";
    echo cut_str($sql, 4);

Defaults: UTF-8 encoding, start position 0. Tested on Apache 2 + PHP 5.
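Where PCRE is built with UTF-8 support, the hand-written byte ranges can be replaced by the /u pattern modifier. The sketch below is an alternative under that assumption, not the original author's method; utf8_cut() is a hypothetical name, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical alternative to cut_str(): let PCRE's /u modifier find
// the character boundaries instead of spelling out the byte ranges.
function utf8_cut($string, $sublen, $start = 0) {
    // An empty pattern with /u splits between every UTF-8 character.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return ''; // the input was not valid UTF-8
    }
    $cut = join('', array_slice($chars, $start, $sublen));
    // Append ".." only when characters remain after the cut.
    return (count($chars) - $start > $sublen) ? $cut . '..' : $cut;
}

echo utf8_cut("夸父a到此bc一游!", 4), "\n"; // 夸父a到..
echo utf8_cut("abc", 4), "\n";              // abc
```

Unlike the byte pattern, this variant rejects malformed input outright (preg_split() returns false), so the caller has to decide what an empty result should mean.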

 

 
