03_10 Pandas: Combining Data with concat

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 03:28

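Note the difference between concat and merge: concat joins multiple objects together along an axis, whereas merge combines objects by matching key columns or index values.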
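To make the distinction concrete, here is a minimal sketch. The frames `left` and `right` and their column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical example frames; the column names are illustrative only.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# concat stacks the objects along an axis; rows are NOT matched on "key".
stacked = pd.concat([left, right])

# merge matches rows whose "key" values are equal (inner join by default).
merged = pd.merge(left, right, on="key")

print(stacked.shape)  # (6, 3)
print(merged.shape)   # (2, 3)
```

concat keeps all six rows and fills the columns that a frame lacks with NaN, while merge keeps only the two rows whose keys appear in both frames.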

Both NumPy and pandas provide functions for concatenation.

import numpy as np
import pandas as pd

concat in NumPy

# Create two ndarrays
arr1 = np.random.randint(0, 10, (3, 4))
arr2 = np.random.randint(0, 10, (3, 4))
print(arr1)
print(arr2)

[[6 5 2 1]
 [9 4 2 0]
 [1 6 0 2]]
[[3 8 1 5]
 [3 2 1 9]
 [2 8 4 8]]

# Call np.concatenate with the two ndarrays in a list; by default they are stacked vertically (axis=0)
print(np.concatenate([arr1, arr2]))

[[6 5 2 1]
 [9 4 2 0]
 [1 6 0 2]
 [3 8 1 5]
 [3 2 1 9]
 [2 8 4 8]]

# Specify the axis: axis=1 joins the arrays horizontally
print(np.concatenate([arr1, arr2], axis=1))

[[6 5 2 1 3 8 1 5]
 [9 4 2 0 3 2 1 9]
 [1 6 0 2 2 8 4 8]]
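The two arrays above have the same shape, so either axis works. As a small sketch of what happens otherwise (the arrays `a` and `b` are made up for this illustration), np.concatenate only requires the dimensions other than the concatenation axis to match:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.arange(8).reshape(2, 4)   # same number of columns, fewer rows

# axis=0 only requires the other dimension (the columns) to match.
rows = np.concatenate([a, b], axis=0)
print(rows.shape)  # (5, 4)

# axis=1 would need equal row counts, so this call raises ValueError.
try:
    np.concatenate([a, b], axis=1)
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```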

concat on Series

When the indexes do not overlap

# The indexes do not overlap
ser_obj1 = pd.Series(np.random.randint(0, 10, 5), index=range(0, 5))
ser_obj2 = pd.Series(np.random.randint(0, 10, 4), index=range(5, 9))
ser_obj3 = pd.Series(np.random.randint(0, 10, 3), index=range(9, 12))
print(ser_obj1)
print(ser_obj2)
print(ser_obj3)

0    9
1    8
2    8
3    0
4    1
dtype: int64
5    7
6    5
7    8
8    0
dtype: int64
9     1
10    7
11    7
dtype: int64

# Call pd.concat with the three Series in a list; by default they are stacked vertically (axis=0)
print(pd.concat([ser_obj1, ser_obj2, ser_obj3]))

0     9
1     8
2     8
3     0
4     1
5     7
6     5
7     8
8     0
9     1
10    7
11    7
dtype: int64

# Join horizontally instead: each Series becomes a column, and positions missing from a Series are filled with NaN
print(pd.concat([ser_obj1, ser_obj2, ser_obj3], axis=1))

      0    1    2
0   9.0  NaN  NaN
1   8.0  NaN  NaN
2   8.0  NaN  NaN
3   0.0  NaN  NaN
4   1.0  NaN  NaN
5   NaN  7.0  NaN
6   NaN  5.0  NaN
7   NaN  8.0  NaN
8   NaN  0.0  NaN
9   NaN  NaN  1.0
10  NaN  NaN  7.0
11  NaN  NaN  7.0

When the indexes overlap

# The indexes overlap
ser_obj1 = pd.Series(np.random.randint(0, 10, 5), index=range(5))
ser_obj2 = pd.Series(np.random.randint(0, 10, 4), index=range(4))
ser_obj3 = pd.Series(np.random.randint(0, 10, 3), index=range(3))
print(ser_obj1)
print(ser_obj2)
print(ser_obj3)

0    7
1    4
2    5
3    1
4    8
dtype: int64
0    3
1    6
2    8
3    2
dtype: int64
0    9
1    1
2    8
dtype: int64

# After concatenation each value keeps its original index label, so the result contains duplicate labels
print(pd.concat([ser_obj1, ser_obj2, ser_obj3]))

0    7
1    4
2    5
3    1
4    8
0    3
1    6
2    8
3    2
0    9
1    1
2    8
dtype: int64
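If duplicate labels are not wanted, pd.concat accepts ignore_index=True, which discards the original labels and numbers the result from 0. A minimal sketch, using two small hand-written Series (`s1` and `s2` are illustrative names, not from the original example):

```python
import pandas as pd

s1 = pd.Series([7, 4, 5], index=range(3))
s2 = pd.Series([3, 6], index=range(2))

kept = pd.concat([s1, s2])                       # original labels are kept
fresh = pd.concat([s1, s2], ignore_index=True)   # labels are renumbered

print(list(kept.index))       # [0, 1, 2, 0, 1]
print(list(fresh.index))      # [0, 1, 2, 3, 4]
print(kept.index.is_unique)   # False
```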
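# With axis=1 and join='inner' this is equivalent to an inner join of the Series on their index: only labels present in all of them are kept
print(pd.concat([ser_obj1, ser_obj2, ser_obj3], axis=1, join='inner'))

   0  1  2
0  7  3  9
1  4  6  1
2  5  8  8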
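For comparison, the sketch below runs the same horizontal concatenation with the default join='outer' and with join='inner'. The Series reuse the values printed above, written out by hand so the result is reproducible; the names `s1`, `s2`, `s3` are only for this illustration.

```python
import pandas as pd

s1 = pd.Series([7, 4, 5, 1, 8], index=range(5))
s2 = pd.Series([3, 6, 8, 2], index=range(4))
s3 = pd.Series([9, 1, 8], index=range(3))

# Default join='outer': union of the indexes, gaps filled with NaN.
outer = pd.concat([s1, s2, s3], axis=1)
# join='inner': intersection of the indexes, so no NaN is introduced.
inner = pd.concat([s1, s2, s3], axis=1, join="inner")

print(outer.shape)                     # (5, 3)
print(inner.shape)                     # (3, 3)
print(int(outer.isna().sum().sum()))   # 3
```

The outer result has three missing cells (label 4 in the second Series, labels 3 and 4 in the third), which is why its columns are converted to float.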

concat on DataFrame

df_obj1 = pd.DataFrame(np.random.randint(0, 10, (3, 2)), index=['a', 'b', 'c'],
                       columns=['A', 'B'])
df_obj2 = pd.DataFrame(np.random.randint(0, 10, (2, 2)), index=['a', 'b'],
                       columns=['C', 'D'])
print(df_obj1)
print(df_obj2)

   A  B
a  2  0
b  5  0
c  4  9
   C  D
a  4  7
b  9  9

# Default axis=0: rows are stacked, and columns missing from a frame are filled with NaN
print(pd.concat([df_obj1, df_obj2]))

     A    B    C    D
a  2.0  0.0  NaN  NaN
b  5.0  0.0  NaN  NaN
c  4.0  9.0  NaN  NaN
a  NaN  NaN  4.0  7.0
b  NaN  NaN  9.0  9.0

# axis=1: columns are placed side by side, aligned on the row index
print(pd.concat([df_obj1, df_obj2], axis=1))

   A  B    C    D
a  2  0  4.0  7.0
b  5  0  9.0  9.0
c  4  9  NaN  NaN
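The join parameter works on DataFrames in the same way as on Series. As a sketch, the frames below reproduce the values printed above by hand (`df1` and `df2` are illustrative names); with axis=1 and join='inner', only the row labels shared by both frames are kept, so no NaN appears and the integer dtype is preserved.

```python
import pandas as pd

df1 = pd.DataFrame([[2, 0], [5, 0], [4, 9]],
                   index=["a", "b", "c"], columns=["A", "B"])
df2 = pd.DataFrame([[4, 7], [9, 9]],
                   index=["a", "b"], columns=["C", "D"])

# Keep only the row labels present in both frames ('a' and 'b').
common = pd.concat([df1, df2], axis=1, join="inner")
print(common)
#    A  B  C  D
# a  2  0  4  7
# b  5  0  9  9
```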

Note: some of the examples are taken from Robin's course at 小象学院.
