Loading MongoDB data into a DataFrame with Python, then processing it with pandas

Source: Internet | Editor: 程序博客网 | Date: 2024/06/08 15:12

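Lately I have been playing with Python. The data comes from MongoDB, and I process it with pandas.

At first I used a rather cumbersome conversion. Straight to the code:

Method 1: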

import pandas as pd
from pymongo import MongoClient

# Connect to MongoDB and fetch every document in the collection
client = MongoClient('192.168.1.5', 10070)
db = client.dbtest
collection = db.data_table
items = collection.find()

# One list per output column
dateId = []
ai_type = []
ai_name = []
quorum = []
priceUSD = []
ai_disageform = []
country = []
continent = []
company = []
ai_cap_tr = []

n = 0
for i in items:
    n = n + 1
    print("Processing record %s" % n)
    keys = i.keys()
    # For each field, append its value, or '' if the document lacks the key
    if 'ai_disageform' in keys:
        ai_disageform.append(i['ai_disageform'])
    else:
        ai_disageform.append('')
    if 'date' in keys:
        # Keep only the first 10 characters of the date (YYYY-MM-DD)
        t = str(i['date'])
        dateId.append(t[:10])
    else:
        dateId.append('')
    if 'ai_type' in keys:
        ai_type.append(i['ai_type'])
    else:
        ai_type.append('')
    if 'continent' in keys:
        continent.append(i['continent'])
    else:
        continent.append('')
    if 'quorum' in keys:
        quorum.append(i['quorum'])
    else:
        quorum.append('')
    if 'priceUSD' in keys:
        priceUSD.append(i['priceUSD'])
    else:
        priceUSD.append('')
    if 'country' in keys:
        country.append(i['country'])
    else:
        country.append('')
    if 'ai_name' in keys:
        ai_name.append(i['ai_name'])
    else:
        ai_name.append('')
    if 'company' in keys:
        company.append(i['company'])
    else:
        company.append('')
    if 'ai_cap_tr' in keys:
        ai_cap_tr.append(i['ai_cap_tr'])
    else:
        ai_cap_tr.append('')

# Build the DataFrame from the column lists and write it to CSV
df = pd.DataFrame({'dateId': dateId,
                   'ai_type': ai_type,
                   'ai_name': ai_name,
                   'quorum': quorum,
                   'priceUSD': priceUSD,
                   'ai_disageform': ai_disageform,
                   'country': country,
                   'continent': continent,
                   'ai_cap_tr': ai_cap_tr,
                   'company': company})
df.to_csv('../ncbdata/b.csv', encoding="utf-8", index=False)

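The idea: testing showed that each record is a dict. The value under each key is appended to its own list (with an empty string when the key is missing), and the lists are then used to create the DataFrame.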
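The same idea can be written more compactly by looping over the field names instead of repeating the if/else for each one. The following is only a sketch, not the original code: the helper name collect_columns and the sample documents are hypothetical stand-ins for the cursor returned by collection.find(), and the field names are copied from Method 1.

```python
from datetime import datetime

# Field names taken from Method 1 (everything except the date column).
FIELDS = ['ai_type', 'ai_name', 'quorum', 'priceUSD', 'ai_disageform',
          'country', 'continent', 'company', 'ai_cap_tr']


def collect_columns(records, fields=FIELDS):
    """Collect one list per field, using '' where a record lacks the key."""
    columns = {'dateId': []}
    for name in fields:
        columns[name] = []
    for record in records:
        # Keep only the first 10 characters of the date (YYYY-MM-DD).
        if 'date' in record:
            columns['dateId'].append(str(record['date'])[:10])
        else:
            columns['dateId'].append('')
        for name in fields:
            columns[name].append(record.get(name, ''))
    return columns


# Hypothetical sample documents standing in for collection.find().
sample = [
    {'_id': 1, 'date': datetime(2017, 5, 1, 8, 30), 'ai_type': 'A', 'country': 'CN'},
    {'_id': 2, 'ai_name': 'foo', 'priceUSD': 12.5},
]
columns = collect_columns(sample)
```

The resulting dict of equal-length lists can be passed to pd.DataFrame(columns), just like the hand-built dict in Method 1.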

Method 2:

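import json
from io import StringIO

import pandas as pd
from pymongo import MongoClient


# Connect to MongoDB and return a cursor over the collection
def connectMongdb():
    client = MongoClient('192.168.1.5', 10070)
    db = client.dbtest
    collection = db.data_table
    items = collection.find()
    return items


# Convert the documents to a DataFrame and write it to CSV
def tran_df():
    items = connectMongdb()
    temp = []
    for doc in items:
        # ObjectId and datetime values are not JSON-serializable,
        # so drop _id and turn the date into a string
        del doc['_id']
        doc['date'] = doc['date'].strftime("%Y-%m-%d")
        temp.append(doc)
    # Wrap the JSON text in StringIO: recent pandas versions deprecate
    # passing a literal JSON string to read_json()
    data_employee = pd.read_json(StringIO(json.dumps(temp)))
    data_employee_ri = data_employee.reindex(columns=['date', 'ai_type', 'ai_name'])
    data_employee_ri.to_csv('data/a.csv')


def main():
    tran_df()


if __name__ == "__main__":
    main()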

The idea: put every dict into a list, then convert the list to a DataFrame with read_json().
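As a possible alternative to the JSON round trip in Method 2, pandas can also build a DataFrame directly from a list of dicts. This is a different technique from the read_json() approach above, shown only as a sketch: the helper name records_to_frame and the sample documents are hypothetical, standing in for the cursor returned by collection.find().

```python
from datetime import datetime

import pandas as pd


def records_to_frame(records, columns):
    """Build a DataFrame straight from a list of dicts, skipping the JSON step."""
    rows = []
    for record in records:
        # Copy the record without MongoDB's _id field.
        row = {k: v for k, v in record.items() if k != '_id'}
        if 'date' in row:
            row['date'] = row['date'].strftime('%Y-%m-%d')
        rows.append(row)
    # Keys missing from a record become NaN; reindex selects and orders columns.
    return pd.DataFrame(rows).reindex(columns=columns)


# Hypothetical sample documents standing in for collection.find().
sample = [
    {'_id': 1, 'date': datetime(2017, 5, 1), 'ai_type': 'A',
     'ai_name': 'foo', 'country': 'CN'},
    {'_id': 2, 'date': datetime(2017, 5, 2), 'ai_type': 'B'},
]
df = records_to_frame(sample, ['date', 'ai_type', 'ai_name'])
```

Unlike Method 1, missing fields show up as NaN here, not as empty strings. If empty strings are preferred, df.fillna('') can be applied before writing the CSV.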
