Python Data Analysis 1


DataFrame and Series in pandas

DataFrame: holds data in tabular form. Here it is used to turn records parsed from JSON strings into a table.
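
A minimal sketch of that conversion, assuming each record arrives as one JSON string. The two sample records and the names `lines` and `records` are made up for illustration; only the field names `tz` and `c` mirror columns that appear in the data shown in this post.

```python
import json

import pandas as pd

# Hypothetical sample input: one JSON object per string.
lines = [
    '{"tz": "America/New_York", "c": "US"}',
    '{"tz": "America/Sao_Paulo", "c": "BR"}',
]

# Parse each JSON string into a dict, then build a table from the list of dicts.
records = [json.loads(line) for line in lines]
frame = pd.DataFrame(records)

print(frame)
```

Each dict becomes one row, and the dict keys become the column names.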

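If frame is a DataFrame, you can think of it as a table.

frame['column name'] returns a single column of data, which is called a Series.

series.value_counts() returns how many times each value occurs.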
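
As a small self-contained sketch of those two points, the snippet below builds a one-column frame by hand. The data is hypothetical and only reuses time zone strings of the kind seen in this post.

```python
import pandas as pd

# Hypothetical data reusing time zone values of the kind shown in this post.
frame = pd.DataFrame(
    {"tz": ["America/New_York", "America/Denver", "America/New_York"]}
)

tz = frame["tz"]            # selecting one column returns a Series
counts = tz.value_counts()  # occurrences per distinct value, most frequent first

print(type(tz).__name__)
print(counts)
```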

 
In [64]: frame
Out[64]:
                                                   a              al   c  \
0  Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKi...  en-US,en;q=0.8  US
1                             GoogleMaps/RochesterNY             NaN  US
2  Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ...           en-US  US
3  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8)...           pt-br  BR
4  Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKi...  en-US,en;q=0.8  US

           cy       g  gr       h          hc         hh         l  \
0     Danvers  A6qOVH  MA  wfLQtf  1331822918  1.usa.gov   orofrog
1       Provo  mwszkS  UT  mwszkS  1308262393       j.mp     bitly
2  Washington  xxr3Qb  DC  xxr3Qb  1331919941  1.usa.gov     bitly
3        Braz  zCaLwp  27  zUtuOu  1331923068  1.usa.gov  alelex88
4  Shrewsbury  9b6kNl  MA  9b6kNl  1273672411     bit.ly     bitly

                         ll  nk  \
0   [42.576698, -70.954903]   1
1  [40.218102, -111.613297]   0
2     [38.9007, -77.043098]   1
3  [-23.549999, -46.616699]   0
4   [42.286499, -71.714699]   0

                                                   r           t  \
0  http://www.facebook.com/l/7AQEFzjSi/1.usa.gov/...  1331923247
1                           http://www.AwareMap.com/  1331923249
2                               http://t.co/03elZC4Q  1331923250
3                                             direct  1331923249
4                http://www.shrewsbury-ma.gov/selco/  1331923251

                  tz                                                  u
0   America/New_York        http://www.ncbi.nlm.nih.gov/pubmed/22415991
1     America/Denver        http://www.monroecounty.gov/etc/911/rss.php
2   America/New_York  http://boxer.senate.gov/en/press/releases/0316...
3  America/Sao_Paulo            http://apod.nasa.gov/apod/ap120312.html
4   America/New_York  http://www.shrewsbury-ma.gov/egov/gallery/1341...

In [65]: frame['tz']
Out[65]:
0     America/New_York
1       America/Denver
2     America/New_York
3    America/Sao_Paulo
4     America/New_York
Name: tz, dtype: object

In [66]: frame['tz'].value_counts()
Out[66]:
America/New_York     3
America/Sao_Paulo    1
America/Denver       1
Name: tz, dtype: int64


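Two ways to fill in unknown values: fillna replaces missing values (NaN), and boolean indexing replaces empty strings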

clean_tz = frame['tz'].fillna("Missing")

clean_tz[clean_tz == ''] = "unknown"
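
The two statements above cover different cases, which the following sketch tries to make visible. It assumes a hypothetical `tz` column containing one NaN and one empty string; the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value (NaN) and one empty string.
frame = pd.DataFrame({"tz": ["America/New_York", np.nan, "", "America/Denver"]})

# Step 1: fillna replaces NaN/None only; the empty string is left alone.
clean_tz = frame["tz"].fillna("Missing")

# Step 2: a boolean mask selects the empty strings and overwrites them.
clean_tz[clean_tz == ""] = "unknown"

print(clean_tz.tolist())
# ['America/New_York', 'Missing', 'unknown', 'America/Denver']
```

Because fillna returns a new Series, the original frame['tz'] column keeps its NaN; only clean_tz is changed.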
