How to Calculate Timedelta in Months in Pandas
Last Updated: 05 Apr, 2022
The difference between two dates or times is represented as a timedelta object: a duration that describes how far apart two date, datetime, or time values are. With a timedelta you can also compute a date in the future or the past by adding it to, or subtracting it from, a known date. When the difference between two dates is expressed in months, it is called a time delta in months. Because months vary in length, pandas has no month unit for Timedelta, so the month count is derived from monthly periods or from the year and month components. Let's demonstrate a few ways to calculate the time delta in months in pandas.
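As a quick illustration of why months need separate handling, the sketch below subtracts two timestamps directly. The dates are illustrative values (the same pair appears in the last example of this article). The result is a Timedelta expressed in days, not months.

```python
import pandas as pd

# Illustrative dates only
start = pd.Timestamp("2018-12-11")
end = pd.Timestamp("2019-06-12")

# Plain subtraction yields a Timedelta measured in days and smaller units
delta = end - start
print(delta)       # 183 days 00:00:00
print(delta.days)  # 183
```

Turning those 183 days into "6 months" requires a rule for what a month is, which is what the methods below provide.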
Method 1: Calculate Timedelta using pandas.Series.dt.to_period()
Syntax:
Series.dt.to_period(*args, **kwargs)
Converts a datetime array to a period array.
Parameters:
freq: optional; an offset string or offset object
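A minimal sketch of what to_period() does on its own, before any subtraction. The sample dates are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample dates, used only for illustration
dates = pd.Series(pd.to_datetime(["2018-12-11", "2019-06-12"]))

# freq="M" maps each timestamp to the calendar month that contains it
periods = dates.dt.to_period("M")
print(periods)
```

The day of the month is discarded in the conversion, so any two dates in the same calendar month become the same period.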
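In this example, we read time.csv and convert the values in the start_date and end_date columns to datetime. After the conversion, we use pandas.Series.dt.to_period() to turn each date into a monthly period and subtract the two columns to get the time delta in months. The 'M' string passed to to_period() stands for monthly frequency. The subtraction returns MonthEnd offset objects, for example <6 * MonthEnds>, rather than plain integers.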
CSV Used: time.csv, which must contain start_date and end_date columns.
Python3
import pandas as pd

# Read the CSV file
data = pd.read_csv('time.csv')
# Convert both date columns to datetime
data['start_date'] = pd.to_datetime(data['start_date'])
data['end_date'] = pd.to_datetime(data['end_date'])
# Subtract monthly periods; the result holds MonthEnd offsets
data['time_delta_months'] = (data['end_date'].dt.to_period('M')
                             - data['start_date'].dt.to_period('M'))
print(data)
Output: the DataFrame is printed with an added time_delta_months column that holds MonthEnd offsets.
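The printed result depends on the contents of time.csv, which is not reproduced here. The following is a self-contained sketch of the same steps, assuming a hypothetical two-row file with start_date and end_date columns that is supplied inline through io.StringIO.

```python
import io
import pandas as pd

# Hypothetical stand-in for time.csv; the real file's contents are not known
csv_text = """start_date,end_date
2018-12-11,2019-06-12
2018-07-01,2019-07-12
"""

data = pd.read_csv(io.StringIO(csv_text), parse_dates=["start_date", "end_date"])

# Subtracting two monthly period columns yields MonthEnd offset objects
data["time_delta_months"] = (
    data["end_date"].dt.to_period("M") - data["start_date"].dt.to_period("M")
)
print(data)

# Each offset carries its month count in the `n` attribute
month_counts = [offset.n for offset in data["time_delta_months"]]
print(month_counts)  # [6, 12]
```

Reading the n attribute is one way to get at the number inside each offset without converting the periods first.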
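Method 2: Calculate Timedelta in months as an integer
In the previous method, MonthEnd offset objects are returned. To get plain integers instead, convert each period column with astype('int64') or with view(dtype='int64') before subtracting. Note that Series.view() is deprecated in recent pandas releases, so astype() is the safer choice.
Example 1: Using astype('int64') to convert into integers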
Python3
import pandas as pd

# Read the CSV file and convert both date columns to datetime
data = pd.read_csv('time.csv')
data['start_date'] = pd.to_datetime(data['start_date'])
data['end_date'] = pd.to_datetime(data['end_date'])
# Convert each monthly period to its 64-bit integer ordinal, then subtract
data['time_delta_months'] = (data['end_date'].dt.to_period('M').astype('int64')
                             - data['start_date'].dt.to_period('M').astype('int64'))
print(data)
Output: the same DataFrame, with time_delta_months stored as integers.
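The integer conversion works because a monthly period's integer form is its ordinal, the number of months elapsed since January 1970. The sketch below shows this on hypothetical inline data instead of time.csv.

```python
import pandas as pd

# Hypothetical sample dates, used only for illustration
data = pd.DataFrame({
    "start_date": pd.to_datetime(["2018-12-11", "2018-07-01"]),
    "end_date": pd.to_datetime(["2019-06-12", "2019-07-12"]),
})

# For monthly periods the ordinal counts months since 1970-01
print(pd.Period("1970-01", freq="M").ordinal)  # 0
print(pd.Period("2019-06", freq="M").ordinal)  # 593

# Subtracting ordinals therefore gives the month difference as an integer
start_ord = data["start_date"].dt.to_period("M").astype("int64")
end_ord = data["end_date"].dt.to_period("M").astype("int64")
data["time_delta_months"] = end_ord - start_ord
print(data["time_delta_months"].tolist())  # [6, 12]
```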
Example 2: Using .view(dtype='int64') to convert into integers
Python3
import pandas as pd

# Read the CSV file and convert both date columns to datetime
data = pd.read_csv('time.csv')
data['start_date'] = pd.to_datetime(data['start_date'])
data['end_date'] = pd.to_datetime(data['end_date'])
# Reinterpret each monthly period as its int64 ordinal, then subtract
# (Series.view() is deprecated in newer pandas; astype('int64') replaces it)
data['time_delta_months'] = (data['end_date'].dt.to_period('M').view(dtype='int64')
                             - data['start_date'].dt.to_period('M').view(dtype='int64'))
print(data)
Output: the same DataFrame, with time_delta_months stored as integers.
Method 3: Calculate Timedelta using a user-defined function
Instead of relying on built-in period arithmetic, we can write our own function. pd.Timestamp() converts a datetime-like, str, int, or float value to a timestamp. We then extract the year and month from each timestamp. Since each year has 12 months, we multiply the year difference by 12 and add the month difference. This counts the calendar-month boundaries crossed and ignores the day of the month.
Python3
import pandas as pd

# Build a DataFrame of start and end timestamps
data = pd.DataFrame({'startdate': [pd.Timestamp('20181211'),
                                   pd.Timestamp('20180701')],
                     'enddate': [pd.Timestamp('20190612'),
                                 pd.Timestamp('20190712')]})

def time_delta_month(end, start):
    # 12 months per year of difference, plus the month difference
    return (12 * (end.dt.year - start.dt.year)
            + (end.dt.month - start.dt.month))

print(time_delta_month(data['enddate'], data['startdate']))
Output:
0     6
1    12
dtype: int64
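The year-and-month formula ignores the day of the month, so a span from 11 December to 10 June still counts as 6 months. If only fully elapsed months should count, one possible variant is sketched below. The function name completed_months is hypothetical, and the first end date is changed to 10 June 2019 so that the adjustment is visible. Time of day is ignored.

```python
import pandas as pd

# Same layout as above; the first end date is altered for illustration
data = pd.DataFrame({
    "startdate": [pd.Timestamp("20181211"), pd.Timestamp("20180701")],
    "enddate": [pd.Timestamp("20190610"), pd.Timestamp("20190712")],
})

def completed_months(end, start):
    """Hypothetical variant: count only fully elapsed months."""
    months = 12 * (end.dt.year - start.dt.year) + (end.dt.month - start.dt.month)
    # Drop one month when the end day-of-month falls before the start day-of-month
    return months - (end.dt.day < start.dt.day).astype("int64")

result = completed_months(data["enddate"], data["startdate"])
print(result.tolist())  # [5, 12]
```

Which definition is appropriate depends on the use case: the original function suits calendar-based reporting, while this variant is closer to how an age in months is usually counted.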