Data Analysis with pandas (4): Commonly Used Functions
Source: Internet  Editor: 程序博客网  Date: 2024/05/22 02:25
I. Summary statistics
1. pandas.DataFrame.describe
DataFrame.describe(percentiles=None, include=None, exclude=None)
Purpose:
Generates summary statistics, excluding NaN values.
Parameters:
percentiles : array-like, optional
The percentiles to include in the output. Should all be in the interval [0, 1]. By default percentiles is [.25, .5, .75], returning the 25th, 50th, and 75th percentiles.
include, exclude : list-like, ‘all’, or None (default)
Specify the form of the returned result. Either:
None to both (default). The result will include only numeric-typed columns or, if none are, only categorical columns.
A list of dtypes or strings to be included/excluded. To select all numeric types use numpy.number. To select categorical objects use type object. See also the select_dtypes documentation. E.g. df.describe(include=['O'])
If include is the string ‘all’, the output column-set will match the input one.
Returns:
summary: NDFrame of summary statistics
See also DataFrame.select_dtypes
Notes
The output DataFrame index depends on the requested dtypes:
For numeric dtypes, it will include: count, mean, std, min, max, and lower, 50, and upper percentiles.
For object dtypes (e.g. timestamps or strings), the index will include the count, unique, most common, and frequency of the most common. Timestamps also include the first and last items.
For mixed dtypes, the index will be the union of the corresponding output types. Non-applicable entries will be filled with NaN. Note that mixed-dtype outputs can only be returned from mixed-dtype inputs and appropriate use of the include/exclude arguments.
If multiple values have the highest count, then the count and most common pair will be arbitrarily chosen from among those with the highest count.
The include, exclude arguments are ignored for Series.
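The parameters above can be tried on a small frame. The following is a minimal sketch, not taken from the original post: the DataFrame df and its column names price and city are made up for illustration, and the string column is given dtype=object explicitly so that include=['O'] selects it regardless of the pandas version's default string dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one numeric column with a NaN, one object column.
df = pd.DataFrame({
    "price": [1.0, 2.0, 3.0, np.nan],
    "city": pd.Series(["a", "b", "a", "a"], dtype=object),
})

# Default: only numeric columns are summarized; NaN is excluded from count/mean.
num = df.describe()

# Custom percentiles, given as fractions in [0, 1].
custom = df.describe(percentiles=[0.1, 0.9])

# Only object columns: count, unique, top (most common), freq.
obj = df.describe(include=["O"])

# All columns: the index is the union of both outputs,
# and entries that do not apply are filled with NaN.
everything = df.describe(include="all")

print(num)
print(obj)
print(everything)
```

Here count for price is 3 rather than 4 because the NaN row is skipped, and with include="all" the mean row for city is NaN because a mean is not defined for object data.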
III. Plotting
1. pandas.DataFrame.hist
Draws histograms of a DataFrame using matplotlib. One subplot is drawn for each numeric column.
DataFrame.hist(data, column=None, by=None, grid=True, xlabelsize=None, xrot=None, ylabelsize=None, yrot=None, ax=None, sharex=False, sharey=False, figsize=None, layout=None, bins=10, **kwds)
Parameters:
data : DataFrame
column : string or sequence. If passed, only the specified columns are plotted.
by : object, optional
If passed, then used to form histograms for separate groups
grid : boolean, default True. Whether to show axis grid lines.
xlabelsize : int, default None
If specified changes the x-axis label size
xrot : float, default None
rotation of x axis labels
ylabelsize : int, default None
If specified changes the y-axis label size
yrot : float, default None
rotation of y axis labels
ax : matplotlib axes object, default None
sharex : boolean, default True if ax is None else False
In case subplots=True, share x axis and set some x axis labels to invisible; defaults to True if ax is None otherwise False if an ax is passed in; Be aware, that passing in both an ax and sharex=True will alter all x axis labels for all subplots in a figure!
sharey : boolean, default False
In case subplots=True, share y axis and set some y axis labels to invisible
figsize : tuple
The size of the figure to create in inches by default
layout: (optional) a tuple (rows, columns) for the layout of the histograms
bins : integer, default 10. Number of histogram bins to use.
kwds : other plotting keyword arguments
To be passed to hist function
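A short sketch of how these parameters fit together is given below. It is an illustration only: the frame df, its columns height, weight and group, and the output file name hist_example.png are all hypothetical, and the non-interactive Agg backend is selected so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: two numeric columns and one grouping column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height": rng.normal(170, 10, 200),
    "weight": rng.normal(65, 8, 200),
    "group": rng.choice(["a", "b"], 200),
})

# One subplot per selected column, 20 bins each, laid out in 1 row x 2 columns.
axes = df.hist(
    column=["height", "weight"],
    bins=20,
    grid=False,
    figsize=(8, 3),
    layout=(1, 2),
)

# With by=..., one histogram is drawn per group of the "group" column.
by_axes = df.hist(column="height", by="group", bins=20, sharex=True)

plt.savefig("hist_example.png")  # saves the most recently created figure
```

The call returns the matplotlib axes objects, so titles and labels can be adjusted afterwards, for example with axes[0, 0].set_xlabel("cm").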