Project: Python mini-datamart for analytics
Python Datamart
I completed this project in December 2022 as part of the Data Warehousing For Analytics class. The work involved building Python web scrapers to source the data, then designing and implementing a star schema data model. The resulting data was first loaded into a pandas DataFrame and then exported to CSV. Tableau was used to generate the visualizations.
Project objective:
Determine whether K-12 school ratings correlate with house prices in Suffolk and Nassau counties on Long Island, NY.
Data sourcing: Build two Python web scrapers, one for school ranking data and one for real estate inventory.
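As a rough illustration of the scraping step, the sketch below pulls rows out of an HTML table using only the Python standard library. The write-up does not name the scraped sites or the libraries used, so the sample markup, the field names (district, rating) and the helper names (TableRowParser, parse_rankings) are all hypothetical, and the page is an inline string rather than a live request.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the real sites and their markup are not
# named in this write-up.
SAMPLE_HTML = """
<table id="rankings">
  <tr><th>District</th><th>Rating</th></tr>
  <tr><td>Example District A</td><td>9</td></tr>
  <tr><td>Example District B</td><td>6</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th> cells, so skip them
                self.rows.append(self._row)
            self._row = None


def parse_rankings(html):
    """Turn the table rows into dictionaries (assumed two-column layout)."""
    parser = TableRowParser()
    parser.feed(html)
    return [{"district": name, "rating": int(rating)} for name, rating in parser.rows]


rankings = parse_rankings(SAMPLE_HTML)
```

A real scraper would fetch the page over HTTP first and would need to handle pagination, missing values and the site's terms of use; none of that is shown here.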
Data modeling: Apply a star schema to the data. Use the MySQL Workbench ER modeling feature to create the UML diagram.
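To make the star schema idea concrete, here is a minimal sketch with one fact table (listings) surrounded by two dimension tables. It is not the project's actual model: the table and column names are hypothetical, the rows are made-up placeholders, and an in-memory SQLite database stands in for MySQL so the example runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: dimensions describe "who/what",
# the fact table holds the measure (list_price) plus foreign keys.
conn.executescript("""
CREATE TABLE dim_school_district (
    district_key  INTEGER PRIMARY KEY,
    district_name TEXT NOT NULL,
    county        TEXT NOT NULL,
    rating        INTEGER
);
CREATE TABLE dim_property (
    property_key  INTEGER PRIMARY KEY,
    property_type TEXT,
    bedrooms      INTEGER
);
CREATE TABLE fact_listing (
    listing_id   INTEGER PRIMARY KEY,
    district_key INTEGER NOT NULL REFERENCES dim_school_district(district_key),
    property_key INTEGER NOT NULL REFERENCES dim_property(property_key),
    list_price   REAL NOT NULL
);
""")

# Placeholder rows for illustration only; not real data.
conn.executemany(
    "INSERT INTO dim_school_district VALUES (?, ?, ?, ?)",
    [(1, "Example District A", "Nassau", 9), (2, "Example District B", "Suffolk", 6)],
)
conn.execute("INSERT INTO dim_property VALUES (1, 'Single family', 3)")
conn.executemany(
    "INSERT INTO fact_listing VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 700000.0), (2, 1, 1, 800000.0), (3, 2, 1, 500000.0)],
)

# A typical star-schema query: join the fact table to a dimension
# and aggregate the measure by a dimension attribute.
avg_price_by_district = conn.execute("""
SELECT d.district_name, d.rating, AVG(f.list_price)
FROM fact_listing AS f
JOIN dim_school_district AS d ON d.district_key = f.district_key
GROUP BY d.district_key, d.district_name, d.rating
ORDER BY d.district_name
""").fetchall()
```

Keeping ratings in a dimension and prices in the fact table means the rating-versus-price question reduces to a single join and aggregate.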
Data processing: Process and merge the sourced data to produce two CSV files ready for visualization.
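The merge-and-export step might look roughly like the sketch below, which joins listings to school ratings on a shared district name and serializes the result as CSV. It uses only the standard library so it runs anywhere; the project itself used pandas, where the join would typically be a DataFrame.merge call. The records, field names and helper functions are hypothetical.

```python
import csv
import io

# Hypothetical scraped records; field names and values are illustrative only.
school_ratings = [
    {"district": "Example District A", "rating": 9},
    {"district": "Example District B", "rating": 6},
]
listings = [
    {"address": "1 Sample St", "district": "Example District A", "price": 700000},
    {"address": "2 Sample Ave", "district": "Example District B", "price": 500000},
    {"address": "3 Sample Rd", "district": "Unrated District", "price": 450000},
]


def merge_listings_with_ratings(listings, school_ratings):
    """Inner-join listings to ratings on the district name."""
    rating_by_district = {r["district"]: r["rating"] for r in school_ratings}
    return [
        {**listing, "rating": rating_by_district[listing["district"]]}
        for listing in listings
        if listing["district"] in rating_by_district
    ]


def to_csv_text(rows):
    """Serialize a non-empty list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


merged = merge_listings_with_ratings(listings, school_ratings)
csv_text = to_csv_text(merged)
```

The inner join drops listings with no matching rating; whether to keep such rows instead is a judgment call that depends on how the visualizations treat missing ratings.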
Data visualization: Use Tableau to visualize the data and look for trends.