Removing duplicate elements with drop_duplicates()

Source: Internet  Editor: 程序博客网  Time: 2024/06/01 16:59
import pandas as pd

# Load the fitment sheet and report how many rows it contains
df = pd.read_excel("合并fitment.xlsx")
print(len(df))

# Unique SKUs, in order of first appearance
skus = df.SKU.drop_duplicates()
result = []
for sku in skus:
    # Compare against the value itself; wrapping it in str() matches nothing when SKU is numeric
    df_sub = df[df.SKU == sku]
    makes = df_sub.Make.drop_duplicates()
    for make in makes:
        df_sub_sub = df_sub[df_sub.Make == make]
        models = df_sub_sub.Model.drop_duplicates()
        for model in models:
            df_sub_sub_sub = df_sub_sub[df_sub_sub.Model == model]
            # Collapse all years for this SKU/make/model into a single range
            year = df_sub_sub_sub.Year
            year_min = year.min()
            year_max = year.max()
            s = str(year_min) + " - " + str(year_max) + " " + str(make) + " " + str(model)
            result.append([sku, s])

# One row per SKU/make/model: "<min year> - <max year> <make> <model>"
df = pd.DataFrame(result, columns=["SKU", "fitment"])
df.to_csv("fitment_combine.csv", index=False)
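
The nested loops above amount to grouping by SKU, Make and Model and taking the minimum and maximum Year of each group. As a rough sketch of an alternative (not the approach used in the original script), the same table could probably be built with `groupby`. The sample rows and the helper name `combine_fitment` are made up for illustration; only the column names come from the script.

```python
import pandas as pd

# Hypothetical sample rows standing in for the spreadsheet (assumption, not real data)
df = pd.DataFrame({
    "SKU":   ["A1", "A1", "A1", "B2", "B2"],
    "Make":  ["Ford", "Ford", "Ford", "Honda", "Honda"],
    "Model": ["F-150", "F-150", "Focus", "Civic", "Civic"],
    "Year":  [2004, 2008, 2012, 2010, 2015],
})

def combine_fitment(frame):
    """Hypothetical helper: one row per SKU/Make/Model with its year range."""
    # sort=False keeps groups in order of first appearance instead of sorting the keys
    ranges = (
        frame.groupby(["SKU", "Make", "Model"], sort=False)["Year"]
        .agg(["min", "max"])
        .reset_index()
    )
    # Build the "<min year> - <max year> <make> <model>" string column-wise
    ranges["fitment"] = (
        ranges["min"].astype(str) + " - " + ranges["max"].astype(str)
        + " " + ranges["Make"].astype(str) + " " + ranges["Model"].astype(str)
    )
    return ranges[["SKU", "fitment"]]

combined = combine_fitment(df)
print(combined)
```

Note that the row order can differ from the loop version when the rows of one SKU are interleaved with another's: the loops order by SKU first, then Make, then Model, while `groupby(..., sort=False)` orders by the first appearance of each full combination. By default `groupby` also drops rows whose key columns contain missing values.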
