Currently working on some Data Analytics projects to explore ongoing wealth inequality (BigQuery/SQL and Python).
Goal: Analyze a CSV of doctors matched against a JSON source of practices (office locations).
Take data in the form of a health plan roster (CSV), then manipulate and clean it as needed to match the pre-ETL doctor data against the JSON source.
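A minimal sketch of that roster-to-JSON match, using only the standard library so it runs anywhere (the notebook itself may well use pandas instead). The roster columns, the JSON layout, the `npi` join key, and all sample values are assumptions made for illustration; the real files are not shown here.

```python
import csv
import io
import json

# Hypothetical sample data: the real roster columns and JSON layout are not
# shown in this README, so every field name and value below is an assumption.
ROSTER_CSV = """first_name,last_name,npi
 Ana ,Lopez,1234567890
Ben,CHEN,2345678901
Cara,Diaz,
"""

PRACTICES_JSON = """[
  {"practice": "Main St Clinic",
   "doctors": [{"first_name": "ana", "last_name": "lopez", "npi": "1234567890"}]},
  {"practice": "Harbor Health",
   "doctors": [{"first_name": "Dev", "last_name": "Rao", "npi": "3456789012"}]}
]"""


def clean(value):
    """Trim whitespace and lowercase so equal values compare equal."""
    return (value or "").strip().lower()


def load_roster(text):
    """Read the roster CSV into a list of cleaned dict rows."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: clean(val) for key, val in row.items()} for row in reader]


def index_practice_doctors(text):
    """Map each NPI found in the JSON source to the practice that lists it."""
    by_npi = {}
    for practice in json.loads(text):
        for doctor in practice.get("doctors", []):
            npi = clean(doctor.get("npi"))
            if npi:  # skip doctors with no usable key
                by_npi[npi] = practice["practice"]
    return by_npi


def match_roster(roster, by_npi):
    """Split roster rows into matched and unmatched lists."""
    matched, unmatched = [], []
    for row in roster:
        practice = by_npi.get(row["npi"])
        if practice is not None:
            matched.append({**row, "practice": practice})
        else:
            unmatched.append(row)
    return matched, unmatched


roster = load_roster(ROSTER_CSV)
matched, unmatched = match_roster(roster, index_practice_doctors(PRACTICES_JSON))
print(f"matched={len(matched)} unmatched={len(unmatched)}")
```

Cleaning before the lookup matters: the padded " Ana " row still matches, while the row with a blank key lands in the unmatched list instead of matching by accident.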
Standard Data Engineering steps:
Read in data from a variety of formats and explore what you're working with.
Manipulate and clean data.
Build new dataframes from the existing data.
Write print statements that summarize the analysis, and write the results to new files (with a dash of fun!).
Results of data source matching after running the Jupyter notebook written in Python.
Use results to suggest next action steps.
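The summarize-and-write step from the list above could look roughly like this. It is a sketch under stated assumptions, not the notebook's actual code: the sample rows, the file names `matched.csv` and `unmatched.csv`, and the helper names are all hypothetical, and it writes to a temporary directory so nothing real is overwritten.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical match results; in the notebook these would come from the
# matching step rather than being typed in by hand.
matched = [{"npi": "1234567890", "last_name": "lopez", "practice": "Main St Clinic"}]
unmatched = [
    {"npi": "2345678901", "last_name": "chen"},
    {"npi": "", "last_name": "diaz"},
]


def summarize(matched, unmatched):
    """Build a short text summary of the match results."""
    total = len(matched) + len(unmatched)
    rate = len(matched) / total if total else 0.0
    return (
        f"Roster rows: {total}\n"
        f"Matched to a practice: {len(matched)} ({rate:.1%})\n"
        f"Unmatched: {len(unmatched)}"
    )


def write_rows(path, rows):
    """Write a list of dict rows to a CSV file with a sorted header."""
    fieldnames = sorted({key for row in rows for key in row})
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


summary = summarize(matched, unmatched)
print(summary)

# Write each result set to its own file in a scratch directory.
out_dir = Path(tempfile.mkdtemp())
write_rows(out_dir / "matched.csv", matched)
write_rows(out_dir / "unmatched.csv", unmatched)
```

Keeping the unmatched rows in their own file gives a concrete starting point for the next action steps, for example checking whether those doctors are missing from the JSON source or just have a blank key.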