Errors when accessing the DBPedia endpoint with SPARQL from Java

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 16:04

1. Problem description

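    Recently, for research purposes, I tried to access an endpoint over the SPARQL protocol. When I used Jena's ARQ component to query the DBPedia endpoint, a Connection Reset exception was thrown. After dropping Jena ARQ and sending the HTTP request directly, the request still failed. The failing query was "Select ?p ?o where { <http://dbpedia.org/resource/Alabama> ?p ?o . }". Oddly, when I replaced it with the endpoint's sample query "select * where { ?s ?p ?o .} LIMIT 100", the program ran without error and printed the results normally. Interestingly, the failing query "Select ?p ?o where { <http://dbpedia.org/resource/Alabama> ?p ?o . }" also ran fine when entered on the endpoint's web page. The DBPedia endpoint is at http://dbpedia.org/sparql.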

The code below sends the request over plain HTTP:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.util.List;
import java.util.Map;

public class HttpTest {
    public static String sendGet(String url, String param) {
        String result = "";
        BufferedReader in = null;
        try {
            String urlNameString = url + "?" + param;
            URL realUrl = new URL(urlNameString);
            // Open a connection to the URL
            URLConnection connection = realUrl.openConnection();
            // Set the common request headers
            connection.setRequestProperty("accept", "*/*");
            connection.setRequestProperty("connection", "Keep-Alive");
            connection.setRequestProperty("user-agent",
                    "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36");
            // Establish the actual connection
            connection.connect();
            // Fetch all response header fields
            Map<String, List<String>> map = connection.getHeaderFields();
            // Print every response header field
            for (String key : map.keySet()) {
                System.out.println(key + "--->" + map.get(key));
            }
            // Read the response body through a BufferedReader
            in = new BufferedReader(new InputStreamReader(
                    connection.getInputStream()));
            String line;
            while ((line = in.readLine()) != null) {
                result += line;
            }
        } catch (Exception e) {
            // The message reads "Exception while sending GET request!"
            System.out.println("发送GET请求出现异常!" + e);
            e.printStackTrace();
        }
        // Close the input stream in a finally block
        finally {
            try {
                if (in != null) {
                    in.close();
                }
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        HttpTest httpTest = new HttpTest();
        // The endpoint's sample query, which works:
        // String query = "query=" + URLEncoder.encode("select * where { ?s ?p ?o .} LIMIT 100", "UTF-8");
        // The failing query:
        String query = "query=" + URLEncoder.encode("select ?p ?o where { <http://dbpedia.org/resource/Alabama> ?p ?o .}", "UTF-8");
        System.out.println(httpTest.sendGet("http://dbpedia.org/sparql", query));
        // System.out.println(URLEncoder.encode("query= select distinct ?Concept where {[] a ?Concept} LIMIT 100", "UTF-8"));
    }
}
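To see exactly what goes over the wire, it can help to print the final GET URL for both the working and the failing query. The following is only a minimal sketch: the class name SparqlGetUrl and its build method are hypothetical helpers made up for illustration, not part of the original program. It uses the same URLEncoder call as the code above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not in the original program): builds the GET URL
// that sendGet() would request for a given SPARQL query.
final class SparqlGetUrl {

    // Returns endpoint + "?query=" + the form-encoded SPARQL text.
    static String build(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        String endpoint = "http://dbpedia.org/sparql";
        // The sample query that worked
        System.out.println(build(endpoint, "select * where { ?s ?p ?o .} LIMIT 100"));
        // The query that failed
        System.out.println(build(endpoint,
                "select ?p ?o where { <http://dbpedia.org/resource/Alabama> ?p ?o .}"));
    }
}
```

Note that the encoder only escapes characters such as ':', '/', '<' and '>', so the encoded form of the failing query still contains "dbpedia.org" and "resource" as readable text in the URL, while the sample query's URL carries that host name only as the request target.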


The exception thrown by the program:
发送GET请求出现异常!java.net.SocketException: Software caused connection abort: recv failed
java.net.SocketException: Software caused connection abort: recv failed
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1890)
    at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1885)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1884)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1457)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
    at name.dxliu.test.HttpTest.sendGet(HttpTest.java:39)
    at name.dxliu.test.HttpTest.main(HttpTest.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Caused by: java.net.SocketException: Software caused connection abort: recv failed
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:188)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:675)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
    at sun.net.www.protocol.http.HttpURLConnection.getHeaderFields(HttpURLConnection.java:2966)
    at name.dxliu.test.HttpTest.sendGet(HttpTest.java:32)
    ... 6 more

Process finished with exit code 0

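2. Solution
  After asking my labmates to no avail, I turned to DBPedia Support, but the problem remained unsolved. Later a labmate
pointed out that the Great Firewall (GFW) might be the cause, so I packaged the program and ran it on Google Cloud Shell
(I won't cover how to use Google Cloud Shell here; roughly, you need to apply for an account on GAE and create a GAE
application before you can use it. A possibly useful link is https://console.cloud.google.com/). There the failing query
ran normally, so I concluded that the firewall was to blame.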
 
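  I then posted this conclusion to the DBPedia Support community, and a community member suggested testing whether a
specific entity in the query triggered the firewall block or whether the DBPedia namespace itself was blocked. So I
changed the query to the following two:
"select ?p ?o where { <http://dbpedia.org/resource/China> ?p ?o .}"
and "select ?s where { ?s a <http://www.w3.org/2002/07/owl#Thing> } LIMIT 100". The first replaces 'Alabama' with
'China'; the second no longer contains the DBPedia namespace `http://dbpedia.org/resource/`.
The first query still threw the Connection Reset exception, while the second ran normally. Since swapping the entity made
no difference and removing the namespace did, it appears that the GFW blocks the DBPedia namespace as a whole.
For reference, here is the question I posted on DBPedia Support: https://dbpedia.atlassian.net/wiki/questions/7441078/problem-occurred-during-making-http-request-to-dbpedia-endpoint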
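The comparison above is a small differential test: run several queries that differ in one aspect and see which ones fail. The following is only a rough sketch of that idea. The class NamespaceProbe, its Sender interface and the probe labels are hypothetical names invented here, and the sender used in main is a stand-in that simulates the behaviour reported above rather than contacting the real endpoint. To probe a real network, one would pass a sender that performs the actual HTTP request.

```java
import java.io.IOException;
import java.net.SocketException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: runs the probe queries from the text and records
// which ones succeed, so the failures can be compared side by side.
final class NamespaceProbe {

    // Assumed abstraction over "send this SPARQL text to the endpoint".
    interface Sender {
        String send(String sparql) throws IOException;
    }

    // The four queries mentioned in the text, keyed by a made-up label.
    static Map<String, String> probesFromPost() {
        Map<String, String> probes = new LinkedHashMap<>();
        probes.put("alabama-entity",
                "select ?p ?o where { <http://dbpedia.org/resource/Alabama> ?p ?o .}");
        probes.put("china-entity",
                "select ?p ?o where { <http://dbpedia.org/resource/China> ?p ?o .}");
        probes.put("owl-thing-class",
                "select ?s where { ?s a <http://www.w3.org/2002/07/owl#Thing> } LIMIT 100");
        probes.put("endpoint-sample",
                "select * where { ?s ?p ?o .} LIMIT 100");
        return probes;
    }

    // Maps each label to true if the sender returned, false if it threw an IOException.
    static Map<String, Boolean> run(Map<String, String> probes, Sender sender) {
        Map<String, Boolean> outcome = new LinkedHashMap<>();
        for (Map.Entry<String, String> probe : probes.entrySet()) {
            try {
                sender.send(probe.getValue());
                outcome.put(probe.getKey(), true);
            } catch (IOException e) {
                outcome.put(probe.getKey(), false);
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        // Stand-in sender: fails whenever the query text contains the
        // DBPedia resource namespace, mimicking the observed behaviour.
        Sender simulated = sparql -> {
            if (sparql.contains("http://dbpedia.org/resource/")) {
                throw new SocketException("simulated connection reset");
            }
            return "ok";
        };
        System.out.println(run(probesFromPost(), simulated));
    }
}
```

If both entity probes fail while the two namespace-free probes succeed, the pattern points at the namespace rather than at a particular entity, which matches the conclusion drawn above.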

                                             