Fixing partially garbled Chinese text when fetching web page data with HttpURLConnection

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 10:53

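```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// CommonException and jsoup_jiexi are defined elsewhere in the project.
public void doGet(final String urlStr) throws CommonException {
    new Thread(new Runnable() {

        @Override
        public void run() {
            HttpURLConnection conn = null;
            try {
                URL url = new URL(urlStr);
                conn = (HttpURLConnection) url.openConnection();
                conn.setRequestProperty("Accept-Charset", "UTF-8");
                conn.setRequestMethod("GET");
                conn.setConnectTimeout(5000);
                conn.setDoInput(true);
                // setDoOutput(true) is only for requests with a body, so it is dropped for GET.
                if (conn.getResponseCode() == 200) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    try (InputStream is = conn.getInputStream()) {
                        int len;
                        // The cause is here: I originally used a 1024-byte buffer and decoded
                        // each chunk separately, so a Chinese character that straddled a chunk
                        // boundary came out garbled. I bumped the buffer to 60000, but read()
                        // may still return fewer bytes than the buffer holds, so the bytes are
                        // collected first and decoded once at the end.
                        byte[] buf = new byte[60000];
                        while ((len = is.read(buf)) != -1) {
                            out.write(buf, 0, len);
                        }
                    }
                    jsoup_jiexi(out.toString("UTF-8"));
                } else {
                    throw new CommonException("Network request failed 00");
                }
            } catch (Exception e) {
                // The exception cannot propagate out of the worker thread, so it is only logged.
                e.printStackTrace();
                new CommonException("Network request failed 11").printStackTrace();
            } finally {
                if (conn != null) {
                    conn.disconnect();
                }
            }
        }
    }).start();
}
```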
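The boundary problem described in the code comment can be reproduced without any network access. The following is a minimal sketch, not part of the original post: the class name `Utf8ChunkDemo` and both helper methods are made up for illustration, and a tiny 2-byte buffer stands in for the 1024-byte one so that the split is guaranteed to happen. It compares decoding every chunk separately with collecting all bytes and decoding once.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
public class Utf8ChunkDemo {

    // Decodes each chunk on its own, as the 1024-byte version did.
    // A multi-byte character cut by a chunk boundary is decoded as two
    // invalid fragments and turns into replacement characters.
    static String decodePerChunk(InputStream in, int chunkSize) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[chunkSize];
        int len;
        while ((len = in.read(buf)) != -1) {
            sb.append(new String(buf, 0, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Collects the raw bytes first and decodes them in one step, so the
    // chunk size no longer affects the result.
    static String decodeWhole(InputStream in, int chunkSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[chunkSize];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "a" followed by two Chinese characters, written as escapes so the
        // source file encoding does not matter: 1 + 3 + 3 = 7 bytes in UTF-8.
        String text = "a\u4e2d\u6587";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        String perChunk = decodePerChunk(new ByteArrayInputStream(bytes), 2);
        String whole = decodeWhole(new ByteArrayInputStream(bytes), 2);

        System.out.println("per-chunk decoding intact: " + text.equals(perChunk));
        System.out.println("single decoding intact:    " + text.equals(whole));
    }
}
```

A larger buffer makes a split less likely, but `InputStream.read` is allowed to return fewer bytes than the buffer can hold, so a character can still be cut on a network stream. Wrapping the stream in an `InputStreamReader` with UTF-8 is another common way to avoid the problem, because the reader carries incomplete byte sequences over to the next read.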