Work Hours
Everyday: 北京时间8:00 - 23:59
Lab 3: Advanced SQL
Winter 2023 INFO330
Objectives: To be able to write more complex queries in SQL, focusing on nested
queries, common table expressions, aggregations, arrays, null values, and more
Assignment tools: pgAdmin or another database management interface
What to turn in: A single ‘lab3.sql’ file containing your answers. Please use SQL
comments to denote problem number and any answers that are not SQL commands.
For each problem, submit your SQL query, a SQL comment specifying the number of
rows your query returns, and a SQL comment with 10 rows of your result (if the result
has fewer than 10 rows, copy all the rows). Unless otherwise specified, please include
both your query and result as part of your submission.
Due date: Check Canvas, turn in via Canvas
Electric Vehicles in Washington State
Using the vehicles database, please answer the following:
- (10pt) Write a nested SQL query that returns the top 10 electric vehicles (make,
model, year) by average electric range. The average electric range should be
computed from the electric range numbers reported in the ‘ev_wa’ table. The
reason this is a nested query is because each VIN (vehicle identification number)
may be included multiple times in the table, so you will first need to compute the
average range per car (per VIN) before aggregating over make-model-year. In
your final output table, please include the model_year, make, model, number of
cars of that make-model-year, and the average electric range. Hint: you might
want to start by writing a query computing the average electric range for each
VIN. - (15pt) We want to know which electric vehicles have made the most year-onyear improvement in electric range. Write a query using common table expressions to return the vehicles (make, model) that made the most single year
improvement to their electric range. Output just the top 10 most dramatic yearon-year changes. Hint: This query should build upon your previous query.
1
We want to know whether there is a relationship between median household income
and prevalence of electric cars. We will use census tract as a minimum geographic
unit. Median household income can be found in the ‘income’ table associated with
the census geo ID. This geo ID maps to the census tracts in the ‘ev_wa’ table. - (15pt) Write a query to return the number of unique electric vehicles (by VIN)
per census tract in Washington state along with the median household income
of that census tract, for only tracts where the median income is available. The
query should return only the top 10 tracts by highest median income.
You may notice that the census tract identifier from the ‘ev_wa’ table takes a
different form than the census_geo_id from the ‘income’ table. You will need
a new function SUBSTRING to join the two tables in this database on census
tract. The following example query joins these two columns together by taking
a substring of census_geo_id and matching it against census_tract. A few rows
of the resulting join are provided for reference.
SELECT e.census_tract, i.census_geo_id
FROM ev_wa e
JOIN income i ON SUBSTRING(i.census_geo_id, 8) = e.census_tract
LIMIT 5;
census_tract | census_geo_id
————–+——————–
08005006861 | 14000US08005006861
53029971302 | 14000US53029971302
06031980000 | 14000US06031980000
06073018509 | 14000US06073018509
06001423200 | 14000US06001423200 - (10pt) Retrieve the top 10 and bottom 10 tracts by median household income.
Is there a difference in the number of electric vehicles in these sets of tracts?
What are the cities associated with these tracts? You can use SQL to do this. Do
the results match your expectations?
2