Project: Python mini-datamart for analytics

Python Datamart

I completed this project in December 2022 as part of the Data Warehousing For Analytics class. It involved building Python web scrapers to source the data, then designing and implementing a star schema data model. The resulting data was first loaded into a Pandas DataFrame and then exported to CSV. Tableau was used to generate the visualizations.

Project objective:

Determine whether K-12 school ratings correlate with house prices in Suffolk and Nassau counties on Long Island, NY.
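The project looked for this relationship visually in Tableau. As a complementary numeric check, a Pearson correlation coefficient could be computed along the following lines. This is only a sketch: the function name `pearson_r` and the rating and price values are made up for illustration and are not the project's data.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: one school rating and one median price per district.
ratings = [4, 6, 7, 9]
median_prices = [450_000, 520_000, 610_000, 780_000]

r = pearson_r(ratings, median_prices)
print(f"Pearson r = {r:.3f}")
```

A value near +1 or -1 would suggest a strong linear relationship, while a value near 0 would suggest none. It says nothing about causation.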

Data sourcing:

Build two Python web scrapers: one for school ranking data and one for real estate inventory.
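The parsing half of such a scraper might look like the sketch below. The write-up does not say which sites or libraries the original scrapers used, so this uses only the standard library's `html.parser`, and the HTML snippet, the class `TableRowParser`, and the function `parse_rankings` are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup: a real ranking page will use different tags and classes.
SAMPLE_HTML = """
<table id="rankings">
  <tr><th>District</th><th>Rating</th></tr>
  <tr><td>Example District A</td><td>9</td></tr>
  <tr><td>Example District B</td><td>6</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            # Header rows contain only <th> cells, so they are empty and skipped.
            if self._row:
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def parse_rankings(html):
    """Turn the two-column sample table into a list of dicts."""
    parser = TableRowParser()
    parser.feed(html)
    return [{"district": name, "rating": int(rating)} for name, rating in parser.rows]


rankings = parse_rankings(SAMPLE_HTML)
print(rankings)
```

In a real scraper the HTML would come from an HTTP request rather than a string constant, and the parsing rules would have to match the target page's actual structure.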

Data modeling:

Apply a star schema to the data. Use the MySQL Workbench ER modeling feature to create the UML diagram.
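To illustrate what a star schema for this kind of data could look like, here is a minimal sketch with one fact table and two dimension tables. The table and column names are assumptions, not the project's actual model, and it uses an in-memory SQLite database instead of MySQL so that it runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: the fact table holds one row per listing and
# points at the dimension tables through surrogate keys.
conn.executescript("""
CREATE TABLE dim_school_district (
    district_key  INTEGER PRIMARY KEY,
    district_name TEXT NOT NULL,
    county        TEXT NOT NULL,
    school_rating INTEGER
);
CREATE TABLE dim_property_type (
    property_type_key INTEGER PRIMARY KEY,
    property_type     TEXT NOT NULL
);
CREATE TABLE fact_listing (
    listing_key       INTEGER PRIMARY KEY,
    district_key      INTEGER NOT NULL REFERENCES dim_school_district(district_key),
    property_type_key INTEGER NOT NULL REFERENCES dim_property_type(property_type_key),
    list_price        REAL NOT NULL
);
""")

# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO dim_school_district VALUES (?, ?, ?, ?)",
    [(1, "Example District A", "Nassau", 9), (2, "Example District B", "Suffolk", 6)],
)
conn.executemany("INSERT INTO dim_property_type VALUES (?, ?)", [(1, "Single family")])
conn.executemany(
    "INSERT INTO fact_listing VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 800000.0), (2, 1, 1, 700000.0), (3, 2, 1, 500000.0)],
)

# A typical star-schema query: join the fact table to a dimension and aggregate.
avg_price_by_rating = conn.execute("""
    SELECT d.school_rating, AVG(f.list_price)
    FROM fact_listing AS f
    JOIN dim_school_district AS d ON d.district_key = f.district_key
    GROUP BY d.school_rating
    ORDER BY d.school_rating
""").fetchall()
print(avg_price_by_rating)
```

Keeping the measure (`list_price`) in the fact table and the descriptive attributes in the dimensions is what lets a single join and `GROUP BY` answer questions such as "average price per school rating".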

Data processing:

Process and merge the sourced data to produce two CSV files ready for visualization.
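The merge step could be sketched in Pandas roughly as follows. The column names and values are invented for illustration, and the split into a detail output and a per-district summary output is an assumption, since the write-up does not say what the two files contain.

```python
import io

import pandas as pd

# Made-up stand-ins for the two scraped datasets.
schools = pd.DataFrame({
    "district": ["Example District A", "Example District B"],
    "county": ["Nassau", "Suffolk"],
    "rating": [9, 6],
})
listings = pd.DataFrame({
    "listing_id": [101, 102, 103],
    "district": ["Example District A", "Example District A", "Example District B"],
    "price": [800000, 700000, 500000],
})

# Attach each listing's district attributes. validate= raises an error if a
# district appears more than once in `schools`, which would duplicate listings.
merged = listings.merge(schools, on="district", how="left", validate="many_to_one")

# One row per district with the median price and the number of listings.
summary = merged.groupby(["district", "county", "rating"], as_index=False).agg(
    median_price=("price", "median"),
    listing_count=("price", "count"),
)

# Written to in-memory buffers here; pass a file path to to_csv to write files.
detail_csv = io.StringIO()
summary_csv = io.StringIO()
merged.to_csv(detail_csv, index=False)
summary.to_csv(summary_csv, index=False)

print(summary)
```

A left join keeps every listing even when its district has no rating, which makes gaps in the scraped school data visible as missing values rather than silently dropping rows.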

Data visualization:

Use Tableau to build visualizations and look for trends in the data.