[Learning Pandas with Stack Overflow] How do I get the row count of a Pandas dataframe - Getting the number of rows in a DataFrame
Source: Internet  Editor: 程序博客网  Time: 2024/05/20 05:46
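I have recently been writing a series of blog posts on learning Pandas by following Stack Overflow.
Column address: http://blog.csdn.net/column/details/16726.html
Search Stack Overflow with pandas as the keyword, then sort the results by number of votes: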
https://stackoverflow.com/questions/tagged/pandas?sort=votes&pageSize=15
How do I get the row count of a Pandas dataframe - Getting the number of rows in a DataFrame
Data preparation
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(1000, 3), columns=['col1', 'col2', 'col3'])
df.iloc[::2, 0] = np.nan
Getting the row count
df.shape  # number of rows and columns of df
# (1000, 3)
df['col1'].count()  # counts only the non-NaN values
# 500
len(df.index)
# 1000
len(df)
# 1000
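As a small self-contained check of the calls above, the following sketch rebuilds the same frame and collects each result in a variable. It also shows df.shape[0], which is not in the listing above but is a common way to take just the row count out of the shape tuple. The n_rows_* and n_non_null names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same construction as in the data preparation step.
df = pd.DataFrame(np.random.randn(1000, 3), columns=['col1', 'col2', 'col3'])
df.iloc[::2, 0] = np.nan  # every other row of col1 becomes NaN

n_rows_shape = df.shape[0]       # first element of the (rows, columns) tuple
n_rows_len = len(df)             # the length of a DataFrame is its row count
n_rows_index = len(df.index)     # length of the row index
n_non_null = df['col1'].count()  # non-NaN values in col1, not the row count

print(n_rows_shape, n_rows_len, n_rows_index, n_non_null)
# 1000 1000 1000 500
```

The first three agree because they all measure the row index; count() answers a different question, so it only matches the row count for a column without missing values.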
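Timing
Because of caching effects (note the warnings in the %timeit output), the measured times are not very precise, but they are still reasonably representative.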
%timeit df.shape
# The slowest run took 169.99 times longer than the fastest. This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 947 ns per loop
%timeit df['col1'].count()
# The slowest run took 50.63 times longer than the fastest. This could mean that an intermediate result is being cached.
# 10000 loops, best of 3: 22.6 µs per loop
%timeit len(df.index)
# The slowest run took 14.11 times longer than the fastest. This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 490 ns per loop
%timeit len(df)
# The slowest run took 18.61 times longer than the fastest. This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 653 ns per loop
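We find that the fastest approach is len(df.index) (490 ns), followed by len(df) (653 ns) and then df.shape (947 ns). The slowest is df['col1'].count() (22.6 µs), because it has to exclude NaN values. Its result (500) also differs from the others (1000), so take particular care when using it.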
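The %timeit magic only works inside IPython or Jupyter. As a rough sketch of the same comparison in a plain Python script, the standard-library timeit module can be used instead. The time_call helper and the candidates dictionary are hypothetical names introduced here, and the lambda wrappers add a small constant overhead to every measurement, so the absolute numbers will not match those above.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(1000, 3), columns=['col1', 'col2', 'col3'])
df.iloc[::2, 0] = np.nan


def time_call(func, number=2000, repeat=5):
    """Hypothetical helper: best per-call time in seconds over several repeats."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


# The four expressions compared in the timing section.
candidates = {
    "df.shape": lambda: df.shape,
    "df['col1'].count()": lambda: df['col1'].count(),
    "len(df.index)": lambda: len(df.index),
    "len(df)": lambda: len(df),
}

timings = {name: time_call(func) for name, func in candidates.items()}

# Print from fastest to slowest; the order may vary by machine and pandas version.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:>20}: {seconds * 1e6:.3f} µs per call")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; the ranking, rather than the exact figures, is what should be compared with the results above.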