MongoDB Study Notes, Part 3: MongoDB and PyMongo

Source: the Internet. Editor: 程序博客网. Time: 2024/05/16 23:39

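Original English text: http://api.mongodb.org/python/current/tutorial.html

Apologies if the translation is rough. Translating is itself a way of learning: once you have translated something, you basically understand it. Enough preamble; the translation follows.

This article is a brief introduction to using MongoDB through its Python client, PyMongo.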

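1. Prerequisites

  Before starting, make sure that Python and PyMongo are installed correctly. If they are, the following statement runs without raising an exception: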

  >>> import pymongo

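  The rest of this article assumes that a MongoDB instance is running on the default host and port. If MongoDB is installed correctly, you can start the server process as follows:

  $ mongod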

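2. Making a Connection

  The first step when working with PyMongo is to create a connection to the running mongod instance:

  >>> from pymongo import Connection

  >>> connection = Connection()

  The code above connects to the default host and port. The host and port can also be given explicitly. (Connection is the class name used by the PyMongo release this tutorial was written for; PyMongo 3.0 removed it in favor of MongoClient, which takes the same host and port arguments.)

  >>> connection = Connection('localhost', 27017)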

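3. Getting a Database

  A single MongoDB instance can support multiple databases at the same time. With PyMongo, you get a database by using attribute-style access on the connection:

  >>> db = connection.test_database

  If the database name is one for which attribute-style access does not work (such as test-database, which contains a hyphen), use dictionary-style access instead:

  >>> db = connection['test-database']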
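Why the second form is needed can be seen with a small stand-in class. The sketch below is not PyMongo: FakeConnection is a hypothetical class that imitates the two access styles with __getattr__ and __getitem__. A name such as test-database is not a valid Python identifier, so only the bracket form can express it.

```python
class FakeConnection:
    """Hypothetical stand-in for a connection object (not part of PyMongo)."""

    def __getitem__(self, name):
        # Dictionary-style access: any string works as a name.
        return "database:" + name

    def __getattr__(self, name):
        # Attribute-style access: only reachable for names that are valid
        # Python identifiers and are not otherwise defined on the object.
        return self[name]


conn = FakeConnection()
print(conn.test_database)       # attribute style
print(conn["test-database"])    # dictionary style

# conn.test-database would be parsed as (conn.test) - database,
# so a hyphenated name can only be reached with brackets.
print("test_database".isidentifier())
print("test-database".isidentifier())
```

Both styles end up in the same lookup; the bracket form simply accepts a wider range of names.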

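4. Getting a Collection

  A collection is a group of documents stored in MongoDB, roughly the equivalent of a table in a traditional relational database. To get a collection:

  >>> collection = db.test_collection

  or, using dictionary-style access:

  >>> collection = db['test-collection']

  An important point about collections and databases in MongoDB is that they are created lazily: the database and the collection are created only when the first document is inserted.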
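Lazy creation can be illustrated with a toy in-memory store. This is a minimal sketch, not how MongoDB is implemented; LazyStore, LazyCollection, and handle() are hypothetical names invented for the illustration.

```python
class LazyStore:
    """Hypothetical in-memory store; the names are assumptions, not PyMongo API."""

    def __init__(self):
        self.collections = {}  # holds only collections that contain data

    def handle(self, name):
        # Returning a handle does not create anything in the store.
        return LazyCollection(self, name)


class LazyCollection:
    def __init__(self, store, name):
        self.store = store
        self.name = name

    def insert(self, document):
        # The collection comes into existence on the first insert.
        self.store.collections.setdefault(self.name, []).append(document)


store = LazyStore()
posts = store.handle("posts")
print(sorted(store.collections))   # nothing has been created yet
posts.insert({"author": "Mike"})
print(sorted(store.collections))   # "posts" exists after the first insert
```

Getting a handle is cheap and has no side effects; only a write changes what the store contains.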

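5. Documents

  Data in MongoDB is represented and stored as JSON-style documents. In PyMongo, dictionaries are used to represent documents. For example, a blog post could be represented by the following document:

  >>> import datetime
  >>> post = {"author": "Mike",
  ...         "text": "My first blog post!",
  ...         "tags": ["mongodb", "python", "pymongo"],
  ...         "date": datetime.datetime.utcnow()}

  Documents can contain native Python types, such as datetime.datetime instances, which are converted to BSON types automatically.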
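As one concrete case of that conversion, BSON represents a date as a 64-bit count of milliseconds since the Unix epoch. The sketch below shows the arithmetic for a naive UTC datetime using only the standard library; to_millis() and from_millis() are hypothetical helper names, and the real conversion is done inside PyMongo's bson package.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)


def to_millis(dt):
    """Convert a naive UTC datetime to whole milliseconds since the epoch."""
    delta = dt - EPOCH
    return (delta.days * 86400 + delta.seconds) * 1000 + delta.microseconds // 1000


def from_millis(millis):
    """Inverse of to_millis()."""
    return EPOCH + datetime.timedelta(milliseconds=millis)


stamp = datetime.datetime(2009, 11, 12, 11, 14, 0, 123456)
restored = from_millis(to_millis(stamp))
print(stamp)
print(restored)  # precision below one millisecond is dropped
```

A consequence worth knowing is that a datetime read back from the database may differ from the one stored in its last three microsecond digits.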

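6. Inserting a Document

  To insert a document into a collection, use the insert() method:

  >>> posts = db.posts
  >>> posts.insert(post)
  ObjectId('...')

  When a document is inserted, a special key, _id, is automatically added to it if the document does not already contain one. The value of _id must be unique within the collection. You can supply it yourself; if you do not, a unique value is generated automatically.

  Only after a document has been inserted are the collection and the database actually created.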
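The _id rule can be sketched against a plain Python list. This is an illustration under stated assumptions, not PyMongo's implementation: insert() here is a hypothetical function, and uuid4 stands in for ObjectId generation (a real ObjectId is a 12-byte value).

```python
import uuid


def insert(collection, document):
    """Append document to an in-memory list, adding "_id" if it is missing."""
    if "_id" not in document:
        # Stand-in for ObjectId generation.
        document["_id"] = uuid.uuid4().hex
    if any(existing["_id"] == document["_id"] for existing in collection):
        raise ValueError("duplicate _id: %r" % (document["_id"],))
    collection.append(document)
    return document["_id"]


posts = []
post = {"author": "Mike", "text": "My first blog post!"}
new_id = insert(posts, post)
print("_id" in post)                                 # key was added to the caller's dict
print(insert(posts, {"_id": 1, "author": "Eliot"}))  # a supplied _id is kept as-is
```

Note that the dictionary passed in is modified in place, which is why the inserted document can be used later to look up its own _id.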


The remaining sections are fairly simple and can be read directly in the original English, as follows:


Getting a Single Document With find_one()

The most basic type of query that can be performed in MongoDB is find_one(). This method returns a single document matching a query (or None if there are no matches). It is useful when you know there is only one matching document, or are only interested in the first match. Here we use find_one() to get the first document from the posts collection:

>>> posts.find_one()
{u'date': datetime.datetime(...), u'text': u'My first blog post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'mongodb', u'python', u'pymongo']}

The result is a dictionary matching the one that we inserted previously.

Note

The returned document contains an "_id", which was automatically added on insert.

find_one() also supports querying on specific elements that the resulting document must match. To limit our results to a document with author “Mike” we do:

>>> posts.find_one({"author": "Mike"})
{u'date': datetime.datetime(...), u'text': u'My first blog post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'mongodb', u'python', u'pymongo']}

If we try with a different author, like “Eliot”, we’ll get no result:

>>> posts.find_one({"author": "Eliot"})
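The matching rule behind these three calls can be sketched in plain Python. The function below is a hypothetical in-memory imitation covering only equality on top-level fields; it is not PyMongo's find_one() and ignores the richer matching the server supports.

```python
def find_one(documents, query=None):
    """Return the first document whose fields equal every query field, else None."""
    query = query or {}
    for document in documents:
        if all(key in document and document[key] == value
               for key, value in query.items()):
            return document
    return None


posts = [
    {"author": "Mike", "text": "My first blog post!"},
    {"author": "Mike", "text": "Another post!"},
]
print(find_one(posts))                       # empty query: first document
print(find_one(posts, {"author": "Mike"}))   # first document that matches
print(find_one(posts, {"author": "Eliot"}))  # no match: None
```

The last call returning None is why the interactive session above prints nothing for “Eliot”: the Python shell does not echo None.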

A Note On Unicode Strings

You probably noticed that the regular Python strings we stored earlier look different when retrieved from the server (e.g. u'Mike' instead of 'Mike'). A short explanation is in order.

MongoDB stores data in BSON format. BSON strings are UTF-8 encoded so PyMongo must ensure that any strings it stores contain only valid UTF-8 data. Regular strings (<type 'str'>) are validated and stored unaltered. Unicode strings (<type 'unicode'>) are encoded UTF-8 first. The reason our example string is represented in the Python shell as u'Mike' instead of 'Mike' is that PyMongo decodes each BSON string to a Python unicode string, not a regular str.
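The encode and decode steps described above can be tried with the standard library alone. The sketch below assumes Python 3, where the text type is str and the encoded form is bytes (the passage above uses the Python 2 names unicode and str); no PyMongo is involved.

```python
text = "Mike"
encoded = text.encode("utf-8")      # the byte form a BSON string holds
decoded = encoded.decode("utf-8")   # the text form handed back to the program
print(encoded, decoded)

# Non-ASCII text takes more than one byte per character in UTF-8.
name = "Zo\u00eb"
print(len(name), len(name.encode("utf-8")))

# Bytes that are not valid UTF-8 cannot be decoded as a string.
try:
    b"\xff\xfe".decode("utf-8")
except UnicodeDecodeError as error:
    print("invalid UTF-8:", error.reason)
```

The failing decode at the end is the kind of input the validation step mentioned above is meant to reject before it reaches the server.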

You can read more about Python unicode strings here.

Bulk Inserts

In order to make querying a little more interesting, let’s insert a few more documents. In addition to inserting a single document, we can also perform bulk insert operations, by passing an iterable as the first argument to insert(). This will insert each document in the iterable, sending only a single command to the server:

>>> new_posts = [{"author": "Mike",
...               "text": "Another post!",
...               "tags": ["bulk", "insert"],
...               "date": datetime.datetime(2009, 11, 12, 11, 14)},
...              {"author": "Eliot",
...               "title": "MongoDB is fun",
...               "text": "and pretty easy too!",
...               "date": datetime.datetime(2009, 11, 10, 10, 45)}]
>>> posts.insert(new_posts)
[ObjectId('...'), ObjectId('...')]

There are a couple of interesting things to note about this example:

  • The call to insert() now returns two ObjectId instances, one for each inserted document.
  • new_posts[1] has a different “shape” than the other posts - there is no "tags" field and we’ve added a new field, "title". This is what we mean when we say that MongoDB is schema-free.

Querying for More Than One Document

To get more than a single document as the result of a query we use the find() method. find() returns a Cursor instance, which allows us to iterate over all matching documents. For example, we can iterate over every document in the posts collection:

>>> for post in posts.find():
...   post
...
{u'date': datetime.datetime(...), u'text': u'My first blog post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'mongodb', u'python', u'pymongo']}
{u'date': datetime.datetime(2009, 11, 12, 11, 14), u'text': u'Another post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'bulk', u'insert']}
{u'date': datetime.datetime(2009, 11, 10, 10, 45), u'text': u'and pretty easy too!', u'_id': ObjectId('...'), u'author': u'Eliot', u'title': u'MongoDB is fun'}

Just like we did with find_one(), we can pass a document to find() to limit the returned results. Here, we get only those documents whose author is “Mike”:

>>> for post in posts.find({"author": "Mike"}):
...   post
...
{u'date': datetime.datetime(...), u'text': u'My first blog post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'mongodb', u'python', u'pymongo']}
{u'date': datetime.datetime(2009, 11, 12, 11, 14), u'text': u'Another post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'bulk', u'insert']}
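The idea of a cursor, an object that hands out matching documents one at a time as the loop asks for them, maps naturally onto a Python generator. The sketch below is a hypothetical in-memory find() that matches on top-level equality only; a real Cursor fetches results from the server in batches, which this does not model.

```python
def find(documents, query=None):
    """Yield matching documents one at a time, like iterating a cursor."""
    query = query or {}
    for document in documents:
        if all(key in document and document[key] == value
               for key, value in query.items()):
            yield document


posts = [
    {"author": "Mike", "text": "My first blog post!"},
    {"author": "Mike", "text": "Another post!"},
    {"author": "Eliot", "text": "and pretty easy too!"},
]
cursor = find(posts, {"author": "Mike"})  # no document has been examined yet
for post in cursor:
    print(post["text"])
```

Because the generator is lazy, creating it costs nothing; the work happens only as the for loop consumes it.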

Counting

If we just want to know how many documents match a query we can perform a count() operation instead of a full query. We can get a count of all of the documents in a collection:

>>> posts.count()
3

or just of those documents that match a specific query:

>>> posts.find({"author": "Mike"}).count()
2

Range Queries

MongoDB supports many different types of advanced queries. As an example, let’s perform a query where we limit results to posts older than a certain date, but also sort the results by author:

>>> d = datetime.datetime(2009, 11, 12, 12)
>>> for post in posts.find({"date": {"$lt": d}}).sort("author"):
...   post
...
{u'date': datetime.datetime(2009, 11, 10, 10, 45), u'text': u'and pretty easy too!', u'_id': ObjectId('...'), u'author': u'Eliot', u'title': u'MongoDB is fun'}
{u'date': datetime.datetime(2009, 11, 12, 11, 14), u'text': u'Another post!', u'_id': ObjectId('...'), u'author': u'Mike', u'tags': [u'bulk', u'insert']}

Here we use the special "$lt" operator to do a range query, and also call sort() to sort the results by author.
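What the "$lt" condition and the sort amount to can be sketched over an in-memory list. matches() below is a hypothetical helper that understands only plain equality and "$lt" on top-level fields, and the date of the first post is an assumed value, since the tutorial stored it with utcnow().

```python
import datetime


def matches(document, query):
    """Support plain equality and the "$lt" operator on top-level fields."""
    for key, condition in query.items():
        if key not in document:
            return False
        value = document[key]
        if isinstance(condition, dict):
            for operator, operand in condition.items():
                if operator != "$lt":
                    raise ValueError("unsupported operator: " + operator)
                if not value < operand:
                    return False
        elif value != condition:
            return False
    return True


posts = [
    {"author": "Mike", "date": datetime.datetime(2009, 11, 13, 9, 0)},  # assumed date
    {"author": "Mike", "date": datetime.datetime(2009, 11, 12, 11, 14)},
    {"author": "Eliot", "date": datetime.datetime(2009, 11, 10, 10, 45)},
]
d = datetime.datetime(2009, 11, 12, 12)
older = sorted((p for p in posts if matches(p, {"date": {"$lt": d}})),
               key=lambda p: p["author"])
for post in older:
    print(post["author"], post["date"])
```

Filtering happens first and sorting second, so the sort only has to order the documents that survived the range condition.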

Indexing

To make the above query fast we can add a compound index on "date" and "author". To start, let’s use the explain() method to get some information about how the query is being performed without the index:

>>> posts.find({"date": {"$lt": d}}).sort("author").explain()["cursor"]
u'BasicCursor'
>>> posts.find({"date": {"$lt": d}}).sort("author").explain()["nscanned"]
3

We can see that the query is using the BasicCursor and scanning over all 3 documents in the collection. Now let’s add a compound index and look at the same information:

>>> from pymongo import ASCENDING, DESCENDING
>>> posts.create_index([("date", DESCENDING), ("author", ASCENDING)])
u'date_-1_author_1'
>>> posts.find({"date": {"$lt": d}}).sort("author").explain()["cursor"]
u'BtreeCursor date_-1_author_1'
>>> posts.find({"date": {"$lt": d}}).sort("author").explain()["nscanned"]
2

Now the query is using a BtreeCursor (the index) and only scanning over the 2 matching documents.
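The drop from 3 scanned documents to 2 can be imitated with a sorted list and binary search. This is only a rough analogy under stated assumptions: a sorted Python list stands in for the B-tree, only the "date" key is indexed, the first date is an assumed value, and scan_all() and scan_with_index() are hypothetical names.

```python
import bisect
import datetime

dates = [
    datetime.datetime(2009, 11, 13, 9, 0),    # assumed date of the first post
    datetime.datetime(2009, 11, 12, 11, 14),
    datetime.datetime(2009, 11, 10, 10, 45),
]
d = datetime.datetime(2009, 11, 12, 12)


def scan_all(values, limit):
    """No index: every value is examined. Returns (positions, number scanned)."""
    hits = [i for i, value in enumerate(values) if value < limit]
    return hits, len(values)


def scan_with_index(index, limit):
    """Sorted (value, position) index: binary search finds the cut-off point."""
    keys = [value for value, _ in index]
    cut = bisect.bisect_left(keys, limit)
    return [position for _, position in index[:cut]], cut


index = sorted((value, position) for position, value in enumerate(dates))
print(scan_all(dates, d))
print(scan_with_index(index, d))
```

With the index, only the entries below the cut-off are touched, which mirrors the way the indexed query above examines just the matching documents.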

Translating really does take patience... heh.




