Pandas to_datetime() Usage Explained [Practical Examples]
This function converts a scalar, array-like, Series, or DataFrame/dict-like object to pandas datetime values. The return type depends on the input: a scalar such as a single string gives a Timestamp, a list, tuple, or Index gives a DatetimeIndex, and a Series gives a Series with a datetime64 dtype.
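pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, unit=None)
- arg: The value to convert: a scalar, list, tuple, Series, or DataFrame/dict-like object.
- errors: Controls what happens when a value cannot be parsed. The default 'raise' raises an exception, while 'coerce' replaces the value with NaT.
- dayfirst: A boolean. When True, ambiguous dates are parsed with the day first, so "10/11/12" becomes 2012-11-10.
- yearfirst: A boolean. When True, ambiguous dates are parsed with the year first, so "10/11/12" becomes 2010-11-12.
- utc: When True, the result is timezone-aware and expressed in UTC.
- format: The strftime pattern used to parse the input, for example "%d/%m/%Y". %d is the day of the month, %m is the month, %y is the two-digit year, and %Y is the four-digit year.
- unit: The unit of a numeric input, for example "s" or "ms", counted from the Unix epoch.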
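The effect of dayfirst, yearfirst, and utc is easiest to see on an ambiguous string. The following is a minimal sketch; the sample string "10/11/12" and the variable names are only illustrative.

```python
import pandas as pd

ambiguous = "10/11/12"

# Default (dayfirst=False): month first -> 11 October 2012
month_first = pd.to_datetime(ambiguous)

# dayfirst=True: day first -> 10 November 2012
day_first = pd.to_datetime(ambiguous, dayfirst=True)

# yearfirst=True: year first -> 12 November 2010
year_first = pd.to_datetime(ambiguous, yearfirst=True)

# utc=True returns a timezone-aware Timestamp in UTC
aware = pd.to_datetime("2021-03-04 11:23", utc=True)

print(month_first, day_first, year_first, aware, sep="\n")
```

When the layout of the input is known in advance, an explicit format is usually a safer choice than relying on these flags.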
Example-1. Convert String to DateTime
We can take a simple date string and convert it to a datetime. In this example I define a date string and convert it with to_datetime(). Because dayfirst defaults to False, the ambiguous date 04/03/2021 is read month first, as April 3rd:
    import pandas as pd

    # Define string
    date = '04/03/2021 11:23'

    # Convert string to datetime format
    date1 = pd.to_datetime(date)

    # print to_datetime output
    print(date1)

    # print day, month and year separately from the to_datetime output
    print("Day: ", date1.day)
    print("Month", date1.month)
    print("Year", date1.year)
Output:
    2021-04-03 11:23:00
    Day:  3
    Month 4
    Year 2021
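The type of object that to_datetime() returns depends on what is passed in. The sketch below checks this for a scalar, a list, and a Series; the variable names are illustrative only.

```python
import pandas as pd

# A single string returns a Timestamp
scalar_result = pd.to_datetime("2021-03-04 11:23")

# A list returns a DatetimeIndex
list_result = pd.to_datetime(["2021-03-04", "2021-03-05"])

# A Series returns a Series with a datetime64 dtype
series_result = pd.to_datetime(pd.Series(["2021-03-04", "2021-03-05"]))

print(type(scalar_result).__name__)
print(type(list_result).__name__)
print(type(series_result).__name__, series_result.dtype)
```

This matters when accessing fields: a Timestamp exposes .day directly, while a Series needs the .dt accessor (for example series_result.dt.day).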
Example-2. Convert Series to DateTime
Here we have a pandas Series, which we will convert to datetime format:
    import pandas as pd

    # Define pandas Series
    times = pd.Series(["2021-01-25", "2021/01/08", "2021", "Jan 4th, 2022"])

    # Print Series
    print("Series: \n", times, "\n")

    # Convert Series to datetime
    print("datetime: \n", pd.to_datetime(times))
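Our Series contains dates in several different formats, and all of them are converted to datetime values. (On pandas 2.0 and later, the format is inferred from the first value and the remaining values are expected to match it, so mixed input like this needs format='mixed'.)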
    Series:
    0       2021-01-25
    1       2021/01/08
    2             2021
    3    Jan 4th, 2022
    dtype: object

    datetime:
    0   2021-01-25
    1   2021-01-08
    2   2021-01-01
    3   2022-01-04
    dtype: datetime64[ns]
Example-3. Handling exceptions during datetime conversion
But what happens if the Series contains ordinary text instead of a date? In that case to_datetime() raises an exception. For example, I have updated my Series to pd.Series(["2021-01-25", "2021/01/08", "2021", "Hello World", "Jan 4th, 2022"]).
When we try to convert this Series with to_datetime(), we get the following exception:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/pandas/core/arrays/datetimes.py", line 2192, in objects_to_datetime64ns
        values, tz_parsed = conversion.datetime_to_datetime64(data.ravel("K"))
      File "pandas/_libs/tslibs/conversion.pyx", line 359, in pandas._libs.tslibs.conversion.datetime_to_datetime64
    TypeError: Unrecognized value type:

    During handling of the above exception, another exception occurred:
To handle this, we can pass errors='coerce', which converts every value that to_datetime() fails to parse into NaT (Not a Time):
    import pandas as pd

    # Define pandas Series
    times = pd.Series(["2021-01-25", "2021/01/08", "2021", "Hello World", "Jan 4th, 2022"])

    # Print Series
    print("Series: \n", times, "\n")

    # Convert Series to datetime, replacing unparseable values with NaT
    print("datetime: \n", pd.to_datetime(times, errors='coerce'))
As you can see, Hello World is now replaced with NaT because to_datetime() was unable to convert that value.
    Series:
    0       2021-01-25
    1       2021/01/08
    2             2021
    3      Hello World
    4    Jan 4th, 2022
    dtype: object

    datetime:
    0   2021-01-25
    1   2021-01-08
    2   2021-01-01
    3          NaT
    4   2022-01-04
    dtype: datetime64[ns]
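Coercing errors hides which inputs failed, so it is often useful to compare the result with the original values. The following is a minimal sketch of one way to do that; the Series contents and the names raw, failed_mask, and bad_values are made up for illustration.

```python
import pandas as pd

raw = pd.Series(["2021-01-25", "2021-01-08", "Hello World"])

# Unparseable values become NaT instead of raising
parsed = pd.to_datetime(raw, errors="coerce")

# Rows where parsing failed: NaT in the result but a real value in the input
failed_mask = parsed.isna() & raw.notna()
bad_values = raw[failed_mask].tolist()
print(bad_values)

# Keep only the rows that parsed successfully
clean = parsed.dropna()
print(clean)
```

Checking raw.notna() as well keeps genuinely missing inputs from being reported as parse failures.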
Example-4. Convert Unix times to DateTime
Unix time is a way to store a point in time as a single number: the number of seconds elapsed since midnight UTC on January 1st, 1970. Because the datetime is stored as a count of seconds, it is easy to convert it into a specific date and time without running into formatting issues with dashes, slashes, and other separators. Pass unit="s" so that to_datetime() interprets the numbers as seconds; without it, integers are treated as nanoseconds.
    import pandas as pd

    # Define pandas Series of Unix timestamps (seconds)
    times = pd.Series([1349720105, 1349806505, 1349979305, 1350065705])

    # Convert Series to datetime
    print("datetime:\n", pd.to_datetime(times, unit="s"))
    datetime:
    0   2012-10-08 18:15:05
    1   2012-10-09 18:15:05
    2   2012-10-11 18:15:05
    3   2012-10-12 18:15:05
    dtype: datetime64[ns]
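Timestamps are not always stored in seconds, and sometimes the conversion is needed in the other direction. The sketch below, using the first value from the Series above, shows the same instant given in milliseconds and one way to get back to epoch seconds; the variable names are illustrative.

```python
import pandas as pd

# The same instant expressed in seconds and in milliseconds
from_seconds = pd.to_datetime(1349720105, unit="s")
from_millis = pd.to_datetime(1349720105000, unit="ms")

# Convert back to epoch seconds: subtract the epoch, then divide by one second
epoch = pd.Timestamp("1970-01-01")
back_to_seconds = (from_seconds - epoch) // pd.Timedelta(seconds=1)

print(from_seconds)
print(from_seconds == from_millis)
print(back_to_seconds)
```

Choosing the wrong unit does not raise an error in most cases; it just produces a date far from the intended one, so it is worth spot-checking a converted value.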
Example-5. Using format with to_datetime
By default, to_datetime() tries to identify the day, month, and year automatically, but there may be situations where the provided input is ambiguous or not in a standard format.
For example, I will define my date string in "%m/%d/%Y" format, i.e. month/day/year. If the input is ambiguous, to_datetime() may not return the value we expect when we access a field such as the month. In such cases we use the format argument to tell to_datetime() the format in which the input has been provided.
    import pandas as pd

    # Define string
    date = '05/03/2021 11:23'

    # Convert string to datetime and define the format
    date1 = pd.to_datetime(date, format='%m/%d/%Y %H:%M')

    # print to_datetime output
    print(date1)

    # print individual fields
    print("Day: ", date1.day)
    print("Month", date1.month)
    print("Year", date1.year)
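To see why the format matters, the same string can be parsed with two different patterns. This is a small sketch rather than part of the original example; the second pattern and the mismatching input "2021-05-03" are assumptions added for comparison.

```python
import pandas as pd

date = "05/03/2021 11:23"

# Month first: 3 May 2021
month_first = pd.to_datetime(date, format="%m/%d/%Y %H:%M")

# Day first: 5 March 2021
day_first = pd.to_datetime(date, format="%d/%m/%Y %H:%M")

print(month_first.month, month_first.day)
print(day_first.month, day_first.day)

# A string that does not match the format raises ValueError
try:
    pd.to_datetime("2021-05-03", format="%m/%d/%Y %H:%M")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

An explicit format therefore both removes the ambiguity and acts as a validation step, since input in an unexpected layout fails loudly instead of being guessed.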