df.apply

Source: Internet | Published by: php网页爬虫源码 | Editor: 程序博客网 | Time: 2024/06/08 12:28
Signature: df5.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Docstring:
Applies function along input axis of DataFrame.

Objects passed to functions are Series objects having index
either the DataFrame's index (axis=0) or the columns (axis=1).
Return type depends on whether passed function aggregates, or the
reduce argument if the DataFrame is empty.

Parameters
----------
func : function
    Function to apply to each column/row
axis : {0 or 'index', 1 or 'columns'}, default 0
    * 0 or 'index': apply function to each column
    * 1 or 'columns': apply function to each row
broadcast : boolean, default False
    For aggregation functions, return object of same size with values
    propagated
raw : boolean, default False
    If False, convert each row or column into a Series. If raw=True the
    passed function will receive ndarray objects instead. If you are
    just applying a NumPy reduction function this will achieve much
    better performance
reduce : boolean or None, default None
    Try to apply reduction procedures. If the DataFrame is empty,
    apply will use reduce to determine whether the result should be a
    Series or a DataFrame. If reduce is None (the default), apply's
    return value will be guessed by calling func an empty Series (note:
    while guessing, exceptions raised by func will be ignored). If
    reduce is True a Series will always be returned, and if False a
    DataFrame will always be returned.
args : tuple
    Positional arguments to pass to function in addition to the
    array/series

Additional keyword arguments will be passed as keywords to the function

Notes
-----
In the current implementation apply calls func twice on the
first column/row to decide whether it can take a fast or slow
code path. This can lead to unexpected behavior if func has
side-effects, as they will take effect twice for the first
column/row.

Examples
--------
>>> df.apply(numpy.sqrt) # returns DataFrame
>>> df.apply(numpy.sum, axis=0) # equiv to df.sum(0)
>>> df.apply(numpy.sum, axis=1) # equiv to df.sum(1)
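The three examples in the docstring can be tried on a small frame. The sketch below uses made-up sample data (the column names x and y are not from the original post) and leaves out broadcast and reduce, since those two parameters were deprecated in pandas 0.23 and are no longer accepted by pandas 1.0 and later.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen so the results are easy to check by hand.
df = pd.DataFrame({'x': [1.0, 4.0, 9.0], 'y': [16.0, 25.0, 36.0]})

roots = df.apply(np.sqrt)            # element-wise: a DataFrame of the same shape
col_sums = df.apply(np.sum, axis=0)  # one value per column, like df.sum(axis=0)
row_sums = df.apply(np.sum, axis=1)  # one value per row, like df.sum(axis=1)

# raw=True hands plain ndarrays to the function instead of Series.
col_max = df.apply(np.max, axis=0, raw=True)

print(roots)
print(col_sums)
print(row_sums)
print(col_max)
```

With axis=0 the function sees one column at a time, with axis=1 one row at a time, so the aggregating calls return a Series indexed by column names and by row labels respectively.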
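# Add a column using a function
import numpy as np
from pandas import DataFrame

df = DataFrame({'a': np.random.randn(6),
                'b': ['foo', 'bar'] * 3,
                'c': np.random.randn(6)})
# my_test is a user-defined function that takes two scalar values (not shown here)
df['Value'] = df.apply(lambda row: my_test(row['a'], row['c']), axis=1)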
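The post does not show my_test, so here is a self-contained sketch with a stand-in. The body of my_test below (return the larger of the two values) is purely an assumption for illustration, and the seeded generator is only there to make the run repeatable.

```python
import numpy as np
import pandas as pd

def my_test(a, c):
    # Hypothetical stand-in for the article's my_test: return the larger value.
    return max(a, c)

rng = np.random.default_rng(0)  # seeded so repeated runs give the same frame
df = pd.DataFrame({'a': rng.standard_normal(6),
                   'b': ['foo', 'bar'] * 3,
                   'c': rng.standard_normal(6)})

# axis=1 passes each row as a Series, so row['a'] and row['c'] are scalars.
df['Value'] = df.apply(lambda row: my_test(row['a'], row['c']), axis=1)
print(df)
```

Because the lambda returns one scalar per row, apply returns a Series aligned with the index of df, which is why it can be assigned directly as a new column.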
Adding custom peak, flat, and valley electricity prices
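# Cast DFNY to str and drop the last two characters (presumably a trailing '.0'),
# leaving values like '201601'
df5['DFNY'] = df5['DFNY'].astype('str').str[:-2]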
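What the slice does is easiest to see on a few values. The sketch below assumes DFNY was read in as floats such as 201601.0; the post does not say so, but that would explain why two trailing characters are removed.

```python
import pandas as pd

# Assumption: DFNY arrived as floats, e.g. 201601.0 (not stated in the post).
df5 = pd.DataFrame({'DFNY': [201601.0, 201606.0, 201612.0]})

# '201601.0' -> '201601': astype('str') keeps the '.0', the slice removes it.
df5['DFNY'] = df5['DFNY'].astype('str').str[:-2]
print(df5['DFNY'].tolist())
```

If the column already held integers, the same slice would cut real digits ('201601' would become '2016'), so it is worth checking the dtype first.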
def df_play(row):
    # Called with axis=1, so `row` is a single row of df5 (a Series).
    # 用电类别 = usage category: 大工业用电 (large industrial), 普通工业 (general industrial).
    # 峰电价 / 平电价 / 谷电价 = peak / flat / valley price.
    # 201601 and 201602-201605 carry identical prices, so they share one branch.
    if row['DFNY'] in [str(i) for i in range(201601, 201606)]:
        if row['用电类别'] == '大工业用电':
            row['峰电价'] = 0.9253
            row['平电价'] = 0.5608
            row['谷电价'] = 0.2804
        elif row['用电类别'] == '普通工业':
            row['峰电价'] = 1.2524
            row['平电价'] = 0.759
            row['谷电价'] = 0.3795
    # 201606 through 201612
    elif row['DFNY'] in [str(i) for i in range(201606, 201613)]:
        if row['用电类别'] == '大工业用电':
            row['峰电价'] = 0.8976
            row['平电价'] = 0.544
            row['谷电价'] = 0.272
        elif row['用电类别'] == '普通工业':
            row['峰电价'] = 1.2246
            row['平电价'] = 0.7422
            row['谷电价'] = 0.3711
    # Rows that match no branch are returned unchanged.
    return row

df6 = df5.apply(df_play, axis=1)
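Row-wise apply runs a Python function once per row, which can be slow on large frames. One alternative, not used in the original post, is to keep the prices in a small lookup table and join it on. The sketch below is one way to do that under stated assumptions: the names prices, add_prices and period are hypothetical, the sample df5 is made up, and the price values are copied from df_play.

```python
import pandas as pd

# Price table transcribed from df_play. 'period' is a hypothetical helper key:
# H1 = 201601-201605, H2 = 201606-201612.
prices = pd.DataFrame(
    [('H1', '大工业用电', 0.9253, 0.5608, 0.2804),
     ('H1', '普通工业', 1.2524, 0.759, 0.3795),
     ('H2', '大工业用电', 0.8976, 0.544, 0.272),
     ('H2', '普通工业', 1.2246, 0.7422, 0.3711)],
    columns=['period', '用电类别', '峰电价', '平电价', '谷电价'])

def add_prices(df):
    """Attach peak/flat/valley prices via a left join instead of row-wise apply."""
    h1 = [str(i) for i in range(201601, 201606)]
    h2 = [str(i) for i in range(201606, 201613)]
    out = df.copy()
    out['period'] = None                      # rows outside both ranges stay None
    out.loc[out['DFNY'].isin(h1), 'period'] = 'H1'
    out.loc[out['DFNY'].isin(h2), 'period'] = 'H2'
    # how='left' keeps every input row in order; unmatched rows get NaN prices.
    # Note that merge returns a fresh 0..n-1 index.
    merged = out.merge(prices, on=['period', '用电类别'], how='left')
    return merged.drop(columns='period')

# Made-up sample rows; DFNY is already a string such as '201601'.
df5 = pd.DataFrame({'DFNY': ['201601', '201607', '201501'],
                    '用电类别': ['大工业用电', '普通工业', '大工业用电']})
df6 = add_prices(df5)
print(df6)
```

Keeping the prices in a table also means a new month range or usage category becomes one more row of data rather than another nested if branch.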