Project 2
Week 13 - Wednesday
In class, we covered the following topics:
1. Convert Dates to Datetime Objects
– ‘pd.to_datetime()’ is used to convert the ‘issued_date’ and ‘expiration_date’ columns in the DataFrame `data` to datetime objects.
– ‘errors='coerce'’ is passed so that parsing errors do not raise an exception; entries that cannot be parsed are converted to NaT (Not a Time).
Code:
import pandas as pd
# Unparseable values become NaT instead of raising an error
data['issued_date'] = pd.to_datetime(data['issued_date'], errors='coerce')
data['expiration_date'] = pd.to_datetime(data['expiration_date'], errors='coerce')
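To see what ‘errors='coerce'’ does in practice, here is a minimal sketch on a tiny made-up DataFrame. The name `sample` and its values are assumptions for illustration only and are not taken from the permit dataset.

```python
import pandas as pd

# Hypothetical sample data; the real permit dataset is not shown in these notes.
sample = pd.DataFrame({
    "issued_date": ["2021-03-15", "not a date", None],
})

# Invalid or missing entries become NaT instead of raising an error.
sample["issued_date"] = pd.to_datetime(sample["issued_date"], errors="coerce")

# Count how many rows could not be parsed.
n_unparsed = int(sample["issued_date"].isna().sum())
print(sample)
print("Unparsed rows:", n_unparsed)
```

Checking the NaT count right after the conversion is a quick way to find out how many rows were silently coerced.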
2. Calculate Duration
– The ‘duration’ column is created, representing the number of days between ‘expiration_date’ and ‘issued_date’.
Code:
# Timedelta between the two dates, expressed in whole days
data['duration'] = (data['expiration_date'] - data['issued_date']).dt.days
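The subtraction can be tried on a couple of made-up rows. This is only a sketch: `sample` and its dates are hypothetical, and the second row is left without an expiration date to show how a missing value carries through.

```python
import pandas as pd

# Hypothetical rows; the second one has a missing expiration date.
sample = pd.DataFrame({
    "issued_date": pd.to_datetime(["2023-01-01", "2023-06-01"]),
    "expiration_date": pd.to_datetime(["2023-01-31", None]),
})

# Subtracting two datetime columns gives a timedelta column;
# .dt.days extracts the whole-day component.
sample["duration"] = (sample["expiration_date"] - sample["issued_date"]).dt.days

print(sample["duration"].tolist())
```

The first row gives 30 days, while the row with the missing date ends up as NaN, so any NaT produced by the earlier conversion shows up as a missing duration.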
3. Visualization – Number of Permits Issued Over Time
– A new column ‘issued_year’ is created, extracting the year from the ‘issued_date’.
– A bar plot (‘sns.countplot()’) is generated using Seaborn to visualize the count of permits issued each year.
– Matplotlib functions are then utilized to add labels, title, and rotate the x-axis ticks for better readability.
– Finally, ‘plt.show()’ displays the resulting bar chart.
Code:
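import matplotlib.pyplot as plt
import seaborn as sns

# Extract the year from each issue date
data['issued_year'] = data['issued_date'].dt.year
plt.figure(figsize=(10, 6))
# One bar per year, with height equal to the number of permits issued
sns.countplot(data=data, x='issued_year')
plt.title('Number of Permits Issued Over Time (by Year)')
plt.xlabel('Year')
plt.ylabel('Number of Permits')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()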
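The numbers behind the bars can also be checked without plotting. The sketch below uses a hypothetical `sample` DataFrame and the pandas `value_counts()` method to get the same per-year counts that the count plot draws.

```python
import pandas as pd

# Hypothetical issue dates, including one missing value.
sample = pd.DataFrame({
    "issued_date": pd.to_datetime(
        ["2020-05-01", "2021-02-10", "2021-11-30", None]
    ),
})

sample["issued_year"] = sample["issued_date"].dt.year

# value_counts() ignores missing years by default;
# sort_index() puts the years in chronological order.
year_counts = sample["issued_year"].value_counts().sort_index()
print(year_counts)
```

When the date column contains NaT, the extracted years are stored as floats, so the year labels may display with a decimal point. Dropping the missing rows and casting the column to int is one possible way to tidy that up.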
In summary, the code converts the date columns to datetime objects, calculates the duration in days between them, and creates a visualization of the number of permits issued in each year. pandas handles the data manipulation, while Seaborn and Matplotlib are used for the visualization.