Challenge¶
Just as end of year exams finish, you join Woodcreek College as a data analyst :nerd:. Your new colleagues are eager to receive the results of their students 🧑🎓 as well as any other insights you can share.
Your supervisor provides you with three CSV files 1
These CSV files contain all the data you need to get started with your analysis, but you'll need to prepare your data before you can get stuck in.
Your supervisor gives you a heads up that the data within student_exam_survey.csv has numerically encoded values, and you'll need to use the keys below to decode them.
- 1 = Some High School
- 2 = High School
- 3 = Some College
- 4 = Associate's Degree
- 5 = Bachelor's Degree
- 6 = Master's Degree
- 1 = Free/Reduced
- 2 = Standard
- 0 = None
- 1 = Completed
Data Preparation¶
Ingest the three CSV files into BigQuery, them combine them to produce a unified table named Student_Results_Combined
. Reformat the coded values into their proper names using the provided keys above. For example, convert 1s in the Lunch Plan column of student_exam_survey.csv to "Free/Reduced". For the sake of readability, also transform the gender vales (M/F) to (Male/Female).
Expected Output
Average Student Performance¶
Using your recently created table, produce a high level report which provides the average math, reading and writing scores by gender and year_group. Ensure these metrics remain as integers, meaning zero decimal places in your results.
For readability, order your results by gender and then year_group.
Expected Output
Hints
Solution¶
Footnotes
-
Dataset Source: http://roycekimmons.com/ ↩