A collection of python-pandas features

Source: Internet. Editor: 程序博客网. Time: 2024/05/17 08:36

Full reference manual

http://pan.baidu.com/s/1nvNmzkH

Random sampling by a given fraction

Split df into two parts, df_sample (a random 70% of the rows) and df_reset (the remaining rows)

df_sample = df.sample(frac=0.7)
df_reset = df.loc[~df.index.isin(df_sample.index)]
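The split above can be made reproducible by fixing the seed. The following is a minimal sketch, assuming a toy DataFrame with a unique index; the column name "value" and the seed are illustrative and not from the original notes.

```python
import pandas as pd

# Hypothetical toy data; the column name "value" is an assumption.
df = pd.DataFrame({"value": range(10)})

# random_state fixes the seed, so the same rows are drawn on every run.
df_sample = df.sample(frac=0.7, random_state=0)

# Dropping the sampled index labels leaves the rows that were not sampled.
# This relies on the index labels being unique.
df_reset = df.drop(df_sample.index)

print(len(df_sample), len(df_reset))
```

If the index contains duplicate labels, both drop() and the isin() mask remove every row sharing a sampled label, so calling df.reset_index(drop=True) first is a reasonable precaution.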

Counting rows

dia_num = len(df[df['DiagGDM'] == 1])
total_num = len(df)
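An equivalent way to get the same counts is to sum the boolean mask directly, which avoids building a filtered copy of the DataFrame. This is a minimal sketch with made-up data; only the column name 'DiagGDM' comes from the snippet above.

```python
import pandas as pd

# Hypothetical data; only the column name 'DiagGDM' is taken from the notes.
df = pd.DataFrame({"DiagGDM": [1, 0, 0, 1, 1, 0]})

# Each True in the mask counts as 1 when summed.
dia_num = int((df["DiagGDM"] == 1).sum())
total_num = len(df)

# value_counts() tallies every distinct value at once.
counts = df["DiagGDM"].value_counts()

print(dia_num, total_num, dia_num / total_num)
```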

Changing column types

import pandas as pd
a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
df = pd.DataFrame(a, columns=['one', 'two', 'three'])
df[['two', 'three']] = df[['two', 'three']].astype(float)
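astype(float) raises an error if any string cannot be parsed as a number. When the data may contain such values, pd.to_numeric with errors="coerce" is one alternative: it turns unparseable entries into NaN instead. The sketch below reuses the example data, with one value deliberately replaced by the placeholder "n/a" as an assumption for illustration.

```python
import pandas as pd

# Same shape as the example above, but with one unparseable value ("n/a")
# inserted on purpose to show the difference from astype(float).
df = pd.DataFrame(
    [["a", "1.2", "4.2"], ["b", "70", "n/a"], ["x", "5", "0"]],
    columns=["one", "two", "three"],
)

for col in ["two", "three"]:
    # errors="coerce" converts values that are not numbers into NaN.
    df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
print(df["three"].isna().sum())
```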

Shuffling the rows of a NumPy array (in place, along the first axis)

import numpy as np
np.random.shuffle(train_data)
np.random.shuffle(test_data)
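np.random.shuffle reorders a single array in place, so features and labels stored in separate arrays would fall out of step if each were shuffled on its own. A minimal sketch of one way around this, assuming a hypothetical labels array that is not part of the original notes, is to draw one permutation of row indices and apply it to both arrays.

```python
import numpy as np

# A seeded Generator makes the shuffle reproducible.
rng = np.random.default_rng(0)

train_data = np.arange(12).reshape(6, 2)  # 6 rows, 2 columns
labels = np.arange(6)                     # hypothetical per-row labels

# One permutation of the row indices, applied to both arrays,
# keeps every row paired with its label.
perm = rng.permutation(len(train_data))
shuffled_data = train_data[perm]
shuffled_labels = labels[perm]

print(shuffled_data)
print(shuffled_labels)
```

Unlike np.random.shuffle, this leaves train_data and labels untouched and returns shuffled copies.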

Official documentation: 10 Minutes to pandas

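Shuffling and splitting the data into training and test samples

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)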
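To illustrate what the call above returns, here is a small self-contained sketch using scikit-learn's train_test_split on made-up arrays (the X and y values are assumptions for illustration). It shuffles by default, keeps each row of X paired with its entry in y, and gives the same split every time for a fixed random_state.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: row i of X is [2*i, 2*i + 1] and its label is i.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# test_size=0.5 puts half of the rows in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Repeating the call with the same random_state reproduces the split.
X_train2, X_test2, y_train2, y_test2 = train_test_split(
    X, y, test_size=0.5, random_state=0
)

print(X_train.shape, X_test.shape)
print(np.array_equal(X_train, X_train2))
```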