python - Improve performance of datetime operations in pandas
I have a big data set that needs a date operation, and it is taking too long. I am wondering if there is another way to boost the speed. The data frame looks like the following:
date, month
2017-01-01, 0
2017-01-01, 1
2017-01-01, 2
I need to create a column that adds the month column to the date column, like the following:
date, month, newdate
2017-01-01, 0, 2017-01-01
2017-01-01, 1, 2017-02-01
2017-01-01, 2, 2017-03-01
My current method uses the apply function with relativedelta, like this:
from dateutil.relativedelta import relativedelta

def newdatecalc(row):
    # add the row's month count to its date, one row at a time
    return row['date'] + relativedelta(months=row['month'])

df['newdate'] = df[['date', 'month']].apply(newdatecalc, axis=1)
Thanks in advance.
Here is a vectorized attempt:
df['newdate'] = (df.date.values.astype('M8[M]') + df.month.values * np.timedelta64(1, 'M')).astype('M8[D]')
Result:
In [106]: df
Out[106]:
        date  month    newdate
0 2017-01-01      0 2017-01-01
1 2017-01-01      1 2017-02-01
2 2017-01-01      2 2017-03-01
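For anyone who wants to try the vectorized approach end to end, here is a minimal self-contained sketch. It assumes the `date` column is already a datetime column and `month` holds integers. The sample frame is rebuilt from the question's data, and the intermediate names `month_start` and `month_offset` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's data.
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-01", "2017-01-01", "2017-01-01"]),
    "month": [0, 1, 2],
})

# Truncate each date to month precision (datetime64[M]).
month_start = df["date"].values.astype("datetime64[M]")

# Turn the integer month counts into month-sized timedeltas.
month_offset = df["month"].values * np.timedelta64(1, "M")

# Add whole months, then go back to day precision (first day of each month).
df["newdate"] = (month_start + month_offset).astype("datetime64[D]")

print(df)
```

One caveat: truncating to month precision discards the day of the month, so this matches the relativedelta result only when every date falls on the first of the month, as in the sample data.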