新浪微博python API的使用

来源:互联网 发布:给大一新生的建议知乎 编辑:程序博客网 时间:2024/05/16 05:31

一.注册应用

网上有很多关于使用weibo API的帖子,但是很遗憾的是,大多没有很详细的介绍,还有的就是版本太老了。

关于注册的部分,我就不写了,按照新浪给的步骤走就OK了,需要注意的是,现在申请应用,需要给一个回调页面(callback url),因为新浪新版的API使用oauth2.0进行验证,传递的参数是需要使用callback url的。

但是,我想大多数人都是callback url的,我也没有,如果你有callback url,能正常申请APP的话,那么这篇文章就可以不看了(新浪定义的回调页面,是要求可以再公网访问的,localhost和局域网页面都是不行的)。

如果你没有url,或者像我一样,只是想用API来爬数据的话,那么你随便填写一个url,然后注册就可以了,我这篇文章里面使用的验证方式是oauth1.0的,不需要callback url。


二.SDK下载

我是使用python调用API爬数据的,所以下文的所有内容都是针对python的,其他语言的话,我不是很清楚了。

1.下载官网提供的SDK

http://michaelliao.github.com/sinaweibopy/

这是官方给的SDK下载地址,进入github,下载包michaelliao-sinaweibopy-XXXXX,其中有用的就是那个weibo.py文件了,其他的文件应该是用来生成sample的吧,但是很遗憾的是,里面的代码提供的接口需要使用oauth2.0进行验证,我们这种屌丝,没有callback页面,所以用不了,这条路pass掉。

2.使用老版本的SDK

老版本的新浪weibo SDK是在google code上面维护的,地址在http://code.google.com/p/sinaweibopy/wiki/Download,下载OAuth 1源码,这个包里提供的API是使用oauth 1进行验证的,不需要使用callback页面,然后在下载页面的左边那排标签里面,有个使用文旦,地址是http://code.google.com/p/sinaweibopy/wiki/OAuth1,进去之后,有很详细的使用教程。


三.使用API

虽然文档很详细,但是。。。我当时还是没有看明白,在网上找了个sample,然后改了改,大致明白了怎么使用oauth 1的SDK


#!/usr/bin/env python# -*- coding: utf-8 -*-__version__ = '1.0'__author__ = 'Liao Xuefeng (askxuefeng@gmail.com)''''Python client SDK for sina weibo API using OAuth 1.0'''try:    import jsonexcept ImportError:    import simplejson as jsonimport timeimport hmacimport uuidimport base64import urllibimport urllib2import hashlibimport logging_OAUTH_SIGN_METHOD = 'HMAC-SHA1'_OAUTH_VERSION = '1.0'class OAuthToken(object):    def __init__(self, oauth_token, oauth_token_secret, oauth_verifier=None, **kw):        self.oauth_token = oauth_token        self.oauth_token_secret = oauth_token_secret        self.oauth_verifier = oauth_verifier        for k, v in kw.iteritems():            setattr(self, k, v)    def __str__(self):        attrs = [s for s in dir(self) if not s.startswith('__')]        kvs = ['%s = %s' % (k, getattr(self, k)) for k in attrs]        return ', '.join(kvs)    __repr__ = __str__class APIClient(object):    def __init__(self, app_key, app_secret, token=None, callback=None, domain='api.t.sina.com.cn'):        self.app_key = str(app_key)        self.app_secret = str(app_secret)        if token:            if isinstance(token, OAuthToken):                if token.oauth_token:                    self.oauth_token = token.oauth_token                if token.oauth_token_secret:                    self.oauth_token_secret = token.oauth_token_secret                if token.oauth_verifier:                    self.oauth_verifier = token.oauth_verifier            else:                raise TypeError('token parameter must be instance of OAuthToken.')        self.callback = callback        self.api_url = 'http://%s' % domain        self.get = HttpObject(self, _HTTP_GET)        self.post = HttpObject(self, _HTTP_POST)    def _oauth_request(self, method, url, **kw):        params = dict( \                oauth_consumer_key=self.app_key, \                oauth_nonce=_generate_nonce(), \                oauth_signature_method=_OAUTH_SIGN_METHOD, \                oauth_timestamp=str(int(time.time())), \                oauth_version=_OAUTH_VERSION, \                oauth_token=self.oauth_token)        params.update(kw)        m = 'GET' if method==_HTTP_GET else 'POST'        bs = _generate_base_string(m, url, **params)        key = '%s&%s' % (self.app_secret, self.oauth_token_secret)        oauth_signature = _generate_signature(key, bs)        print 'params:', params        print 'base string:', bs        print 'key:', key, 'sign:', oauth_signature        print 'url:', url        r = _http_call(url, method, self.__build_oauth_header(params, oauth_signature=oauth_signature), **kw)        return r    def get_request_token(self):        '''        Step 1: request oauth token.        Returns:          OAuthToken object contains oauth_token and oauth_token_secret        '''        params = dict(oauth_callback=self.callback, \                oauth_consumer_key=self.app_key, \                oauth_nonce=_generate_nonce(), \                oauth_signature_method=_OAUTH_SIGN_METHOD, \                oauth_timestamp=str(int(time.time())), \                oauth_version=_OAUTH_VERSION)        url = '%s/oauth/request_token' % self.api_url        bs = _generate_base_string('GET', url, **params)        params['oauth_signature'] = base64.b64encode(hmac.new('%s&' % self.app_secret, bs, hashlib.sha1).digest())        r = _http_call(url, _HTTP_GET, return_json=False, **params)        kw = _parse_params(r, False)        return OAuthToken(**kw)    def get_authorize_url(self, oauth_token):        '''        Step 2: get authorize url and redirect to it.        Args:          oauth_token: oauth_token str that returned from request_token:                       oauth_token = client.request_token().oauth_token        Returns:          redirect url, e.g. "http://api.t.sina.com.cn/oauth/authorize?oauth_token=ABCD1234XYZ"        '''        return '%s/oauth/authorize?oauth_token=%s' % (self.api_url, oauth_token)    def get_access_token(self):        '''        get access token from request token:        request_token = OAuthToken(oauth_token, oauth_secret, oauth_verifier)        client = APIClient(appkey, appsecret, request_token)        access_token = client.get_access_token()        '''        params = {            'oauth_consumer_key': self.app_key,            'oauth_timestamp': str(int(time.time())),            'oauth_nonce': _generate_nonce(),            'oauth_version': _OAUTH_VERSION,            'oauth_signature_method': _OAUTH_SIGN_METHOD,            'oauth_token': self.oauth_token,            'oauth_verifier': self.oauth_verifier,        }        url = '%s/oauth/access_token' % self.api_url        bs = _generate_base_string('GET', url, **params)        key = '%s&%s' % (self.app_secret, self.oauth_token_secret)        oauth_signature = _generate_signature(key, bs)        authorization = self.__build_oauth_header(params, oauth_signature=oauth_signature)        r = _http_call(url, _HTTP_GET, authorization, return_json=False)        kw = _parse_params(r, False)        return OAuthToken(**kw)    def __build_oauth_header(self, params, **kw):        '''        build oauth header like: Authorization: OAuth oauth_token="xxx", oauth_nonce="123"        Args:          params: parameter dict.          **kw: any additional key-value parameters.        '''        d = dict(**kw)        d.update(params)        L = [r'%s="%s"' % (k, v) for k, v in d.iteritems() if k.startswith('oauth_')]        return 'OAuth %s' % ', '.join(L)    def __getattr__(self, attr):        ' a shortcut for client.get.funcall() to client.funcall() '        return getattr(self.get, attr)def _obj_hook(pairs):    '''    convert json object to python object.    '''    o = JsonObject()    for k, v in pairs.iteritems():        o[str(k)] = v    return oclass APIError(StandardError):    '''    raise APIError if got failed json message.    '''    def __init__(self, error_code, error, request):        self.error_code = error_code        self.error = error        self.request = request        StandardError.__init__(self, error)    def __str__(self):        return 'APIError: %s: %s, request: %s' % (self.error_code, self.error, self.request)class JsonObject(dict):    '''    general json object that can bind any fields but also act as a dict.    '''    def __getattr__(self, attr):        return self[attr]    def __setattr__(self, attr, value):        self[attr] = valuedef _encode_multipart(**kw):    '''    Build a multipart/form-data body with generated random boundary.    '''    boundary = '----------%s' % hex(int(time.time() * 1000))    data = []    for k, v in kw.iteritems():        data.append('--%s' % boundary)        if hasattr(v, 'read'):            # file-like object:            ext = ''            filename = getattr(v, 'name', '')            n = filename.rfind('.')            if n != (-1):                ext = filename[n:].lower()            content = v.read()            data.append('Content-Disposition: form-data; name="%s"; filename="hidden"' % k)            data.append('Content-Length: %d' % len(content))            data.append('Content-Type: %s\r\n' % _guess_content_type(ext))            data.append(content)        else:            data.append('Content-Disposition: form-data; name="%s"\r\n' % k)            data.append(v.encode('utf-8') if isinstance(v, unicode) else v)    data.append('--%s--\r\n' % boundary)    return '\r\n'.join(data), boundary_CONTENT_TYPES = { '.png': 'image/png', '.gif': 'image/gif', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg', '.jpe': 'image/jpeg' }def _guess_content_type(ext):    return _CONTENT_TYPES.get(ext, 'application/octet-stream')_HTTP_GET = 0_HTTP_POST = 1_HTTP_UPLOAD = 2def _http_call(url, method, authorization=None, return_json=True, **kw):    '''    send an http request and return headers and body if no error.    '''    params = None    boundary = None    if method==_HTTP_UPLOAD:        params, boundary = _encode_multipart(**kw)    else:        params = _encode_params(**kw)    http_url = '%s?%s' % (url, params) if method==_HTTP_GET and params else url    http_body = None if method==_HTTP_GET else params    req = urllib2.Request(http_url, data=http_body)    if authorization:        print 'Authorization:', authorization        req.add_header('Authorization', authorization)    if boundary:        req.add_header('Content-Type', 'multipart/form-data; boundary=%s' % boundary)    print method, http_url, 'BODY:', http_body    resp = urllib2.urlopen(req)    body = resp.read()    if return_json:        r = json.loads(body, object_hook=_obj_hook)        if hasattr(r, 'error_code'):            raise APIError(r.error_code, getattr(r, 'error', ''), getattr(r, 'request', ''))        return r    return bodyclass HttpObject(object):    def __init__(self, client, method):        self.client = client        self.method = method    def __getattr__(self, attr):        def wrap(**kw):            return self.client._oauth_request(self.method, '%s/%s.json' % (self.client.api_url, attr.replace('__', '/')), **kw)        return wrap################################################################################# utility functions################################################################################def _parse_params(params_str, unicode_value=True):    '''    parse a query string as JsonObject (also a dict)    Args:        params_str: query string as str.        unicode_value: return unicode value if True, otherwise str value. default true.    Returns:        JsonObject (inherited from dict)        >>> s = _parse_params('a=123&b=X%26Y&c=%E4%B8%AD%E6%96%87')    >>> s.a    u'123'    >>> s.b    u'X&Y'    >>> s.c==u'\u4e2d\u6587'    True    >>> s = _parse_params('a=123&b=X%26Y&c=%E4%B8%AD%E6%96%87', False)    >>> s.a    '123'    >>> s.b    'X&Y'    >>> s.c=='\xe4\xb8\xad\xe6\x96\x87'    True    >>> s.d #doctest: +IGNORE_EXCEPTION_DETAIL    Traceback (most recent call last):      ...    KeyError:    '''    d = dict()    for s in params_str.split('&'):        n = s.find('=')        if n>0:            key = s[:n]            value = urllib.unquote(s[n+1:])            d[key] = value.decode('utf-8') if unicode_value else value    return JsonObject(**d)def _encode_params(**kw):    '''    Encode parameters.    '''    if kw:        args = []        for k, v in kw.iteritems():            qv = v.encode('utf-8') if isinstance(v, unicode) else str(v)            args.append('%s=%s' % (k, _quote(qv)))        return '&'.join(args)    return ''def _quote(s):    '''    quote everything including /        >>> _quote(123)    '123'    >>> _quote(u'\u4e2d\u6587')    '%E4%B8%AD%E6%96%87'    >>> _quote('/?abc=def& _+%')    '%2F%3Fabc%3Ddef%26%20_%2B%25'    '''    if isinstance(s, unicode):        s = s.encode('utf-8')    return urllib.quote(str(s), safe='')def _generate_nonce():    ' generate random uuid as oauth_nonce '    return uuid.uuid4().hexdef _generate_signature(key, base_string):    '''    generate url-encoded oauth_signature with HMAC-SHA1    '''    return _quote(base64.b64encode(hmac.new(key, base_string, hashlib.sha1).digest()))def _generate_base_string(method, url, **params):    '''    generate base string for signature        >>> method = 'GET'    >>> url = 'http://www.sina.com.cn/news'    >>> params = dict(a=1, b='A&B')    >>> _generate_base_string(method, url, **params)    'GET&http%3A%2F%2Fwww.sina.com.cn%2Fnews&a%3D1%26b%3DA%2526B'    '''    plist = [(_quote(k), _quote(v)) for k, v in params.iteritems()]    plist.sort()    return '%s&%s&%s' % (method, _quote(url), _quote('&'.join(['%s=%s' % (k, v) for k, v in plist])))if __name__=='__main__':    import doctest    doctest.testmod()

上面这段代码,就是你从http://code.google.com/p/sinaweibopy/wiki/Download下载的SDK,实际上就是个文件weibo1.py,如果你没法下载,可以直接从这里复制过去。


下面是我写的一个sample

#!/usr/bin/env python# -*- coding: utf8 -*-from weibo1 import APIClient, OAuthTokenimport MySQLdbimport time#通过提供的账号和密码,返回APIClient对象实例def GetBlogClient(uname, passw):    APP_KEY = u'XXXXXXXXXX' # app key    APP_SECRET = u'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' # app secret    #实例化APIClient    client = APIClient(app_key=APP_KEY, app_secret=APP_SECRET)    #获取OAuth request token    reqToken = client.get_request_token()    #用户授权url    auth_url =  client.get_authorize_url(reqToken)    post_data = urllib.urlencode({            "action": "submit",            "forcelogin": "",            "from": "",            "oauth_callback" : "http://api.weibo.com/oauth2/default.html",            "oauth_token" : reqToken.oauth_token,            "passwd" : passw,            "regCallback": "",            "ssoDoor": "",            "userId" : uname,            "vdCheckflag" : 1,            "vsnval":""        })    mat = re.search(                r'&oauth_verifier=(.+)',                urllib2.urlopen(urllib2.Request(                    "http://api.t.sina.com.cn/oauth/authorize",                    post_data,                    headers={                        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1)',                        'Referer': auth_url                    }                )).url            )    if mat:        client = APIClient(                APP_KEY,                APP_SECRET,                OAuthToken(                    reqToken.oauth_token,                    reqToken.oauth_token_secret,                    mat.group(1)                ))        #返回APIClient        return APIClient(APP_KEY, APP_SECRET, client.get_access_token())    else:        raise Exception()# end of class MyFrameif __name__ == "__main__":    client = GetBlogClient("XXXXXXXXXX","XXXXX")    r = client.get.statuses__user_timeline()    #使用APIstatuses/user/timeline


APP_KEY和 APP_SECRET填入你申请到的APP的key和secret,从新浪的网站上可以查询到。

 client = GetBlogClient("XXXXXXXXXX","XXXXX")
则填入你的微博用户名和密码就可以了。


四.各种问题

1.为什么用我的APP_KEY和 APP_SECRET配合微博账号运行失败?

其实,我申请了APP,然后运行也是失败的,我去找了一下原因,有人说是APP没有通过审核,需要将自己的微博账号添加到测试账号里面(添加工作可以再新浪的网站上完成),我修改了还不行,然后就去网上找了个KEY,然后运行,就没有问题了,我个人觉得应该是APP审核的问题,如果你设置了测试账号还不行,就和我一样去网上找个key吧。


2.为什么有的微博API能用有的用不了

需要注意的是,这个oauth1 SDK代码是配合API V1http://open.weibo.com/wiki/API%E6%96%87%E6%A1%A3使用的,现在最新的是API V2http://open.weibo.com/wiki/API%E6%96%87%E6%A1%A3_V2不支持这套SDK。

如果想使用V2的话,需要将weibo1.py中第44行的

def __init__(self, app_key, app_secret, token=None, callback=None, domain='api.t.sina.com.cn'):

改成

def __init__(self, app_key, app_secret, token=None, callback=None, domain='api.weibo.com'):
改完之后,V2的部分API也可以用了(部分的API可能还是用不了),但是API V1就不支持了。


3.就是不能用

如果这段代码怎么都不行的话,那么我觉得,可能是你使用这段代码的时间和我写这篇文章的时间相距太远了, 新浪又更新了API,所以,一切以新浪的官方文档为准http://open.weibo.com/wiki/%E9%A6%96%E9%A1%B5,当然,如果有什么问题的话,也可以给我留言。

原创粉丝点击