Our project explores Formula 1 car improvements from 2000 to 2020. We look specifically at how the cars have improved over the years in terms of power-to-weight ratio and lap times. This is done through interactive charts, loosely styled after the Formula 1 branding guide, which visualize the changes (both improvements and regressions). Our innovative view is an animated head-to-head lap based on sector times for 2014 and 2019.
Our data comes from three datasets; we discuss each in its own section.
This is a massive tabular dataset, split across several tables, that contains “all information on the Formula 1 races, drivers, constructors, qualifying, circuits, lap times, pit stops, championships from 1950 [to the] 2020 season”, according to its curator. As such, we present it in multiple tables to keep it manageable. The numbers and attributes you see below are not reflective of the final data in the project, since we ran a series of pre-processing steps on these attributes.
Circuits: Circuits are where Formula 1 races are held. There are 9 attributes, but we have chosen 3.
Attribute | Type | Cardinality/Range | Note |
circuitId | Categorical | 76 | id key of circuit, used to cross ref. other tables |
circuitRef | Categorical | 76 | Reference name of circuit (internal code use) |
name | Categorical | 76 | Actual name of circuit (for display purposes) |
Constructors: These are the manufacturers of the cars. There are 5 attributes; we have chosen 3.
Attribute | Type | Cardinality/Range | Note |
constructorId | Categorical | 213 | id key of constructor, used for cross ref. |
constructorRef | Categorical | 211 | Reference name of constructor |
name | Categorical | 211 | Actual name of constructor (for display purposes) |
Qualifying: This is the session in which drivers try to set their best times so they can start further up the field during the actual race. There are 9 attributes; we have chosen 6.
Attribute | Type | Cardinality/Range | Note |
raceId | Categorical | 1047 | id key of race, used for cross ref. |
constructorId | Categorical | 213 | id key of constructor, used for cross ref. |
driverId | Categorical | 851 | id key of driver, used for cross ref. |
q1 | Quantitative | (0:53.904, 2:33.885) | Time in Qualifying Session 1 |
q2 | Quantitative | (0:53.647, 2:12.470) | Time in Qualifying Sess. 2 |
q3 | Quantitative | (0:53.377, 2:09.776) | Time in Qualifying Sess. 3 |
Lap Times: These are the lap times as recorded. Of the 6 attributes, we have chosen the following 3:
Attribute | Type | Cardinality/Range | Note |
raceId | Categorical | 1047 | id key of race, used for cross ref. |
time | Quantitative | (0:50.404, 9:45.712) | Lap time in minutes |
driverId | Categorical | 851 | id key of driver, used for cross ref. |
Races: This is general race information: the year, the circuit, and the race name. There are 8 attributes, but we have chosen the following 4:
Attribute | Type | Cardinality/Range | Note |
raceId | Categorical | 32 | id key of race, used to cross ref. other tables |
year | Ordinal | 71 | Year of race filtered to 2000-2020 (inclusive) |
circuitId | Categorical | 76 | id key of circuit, used to cross ref. other tables |
name | Categorical | 76 | Actual name of race (for display purposes) |
Seasons: These are the years in which F1 takes place (F1 seasons are yearly). Of the 2 attributes, we have chosen 1:
Attribute | Type | Cardinality/Range | Note |
year | Ordinal | 71 | Year of the F1 season |
Drivers: These are the drivers who race in F1. Of the 9 attributes, we have chosen 4 of them.
Attribute | Type | Cardinality/Range | Note |
driverId | Categorical | 851 | id key of driver, used to cross ref. other tables |
driverRef | Categorical | 850 | Reference name of driver (internal code use) |
forename | Categorical | 470 | A driver’s first name |
surname | Categorical | 792 | A driver’s last name |
These tables from Dataset 1 (F1 1950-2020) are completely unused:
This dataset is a collection of every car that has raced in Formula 1 from 2000 to 2020. It is a dataset which we collected (the process is described below in pre-processing) for the purpose of this project. Each car has the following attributes:
This dataset is a very small collection of qualifying sector times. Formula 1 splits its lap timing into sectors, each of which is a portion of the race circuit.
It is surprisingly hard to get hold of this data (due to a series of culture issues within the FIA/Formula 1). It was not readily given out for most of the series’ lifespan, and the Formula 1 website itself only has data from 2014 to the current season, which leaves over 60 years of not-so-easily accessible data.
The data is as follows:
This is something we derived through a great deal of preprocessing on the main F1 1950-2020 dataset. We will discuss it in the pre-processing section 2.3.1, as it is not necessarily a dataset in and of itself, but rather a JSON file created and pre-processed purely for convenience. Nonetheless, its attributes are as follows:
We have 3 main datasets. All processed data can be found on our repo: https://github.students.cs.ubc.ca/cpsc436v-2020w-t2/436v-project_c7s1b_v4x0b_w9d1b/tree/data-scraping
Datasets:
https://www.kaggle.com/rohanrao/formula-1-world-championship-1950-2020
For dataset 1, we loaded it into MongoDB and did an immense amount of filtering on the database. We performed multiple joins on the appropriate keys to create a view with the fastest lap times for each race, per season. We filtered the seasons down to 2000 to 2020, inclusive. We then exported that view and did more pre-processing work in Python to compute averages based on this dataset (described below). We got the total number of rows down to 387. The object structure for the lap time data is as follows (only the new attributes are commented on in the table):
Object = {circuitRef, location, driverId, driverRef, forename, surname, raceId, year, constructorId}
New Attribute | Description |
bestLapTime | The best lap time for a given track, for a given year, that we filtered and queried for |
circuitName | Actual name of circuit, same as “name” in circuit table, but renamed |
laptimeMillis | The best Lap Time converted into milliseconds |
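To illustrate the laptimeMillis attribute, here is a minimal sketch of the conversion, assuming lap times are stored as strings in the M:SS.mmm form seen in the ranges above (e.g. 0:50.404). The function name `lap_time_to_millis` is hypothetical and is not taken from our scripts.

```python
def lap_time_to_millis(lap_time: str) -> int:
    """Convert a lap time string such as '1:23.456' into integer milliseconds.

    Assumes the 'M:SS.mmm' layout; the fractional part is optional.
    """
    minutes, seconds = lap_time.split(":")
    whole_seconds, _, fraction = seconds.partition(".")
    # Pad or trim the fractional part to exactly three digits (milliseconds).
    millis = int((fraction + "000")[:3])
    return (int(minutes) * 60 + int(whole_seconds)) * 1000 + millis


# The two bounds of the lap time range listed in the Lap Times table.
print(lap_time_to_millis("0:50.404"))  # 50404
print(lap_time_to_millis("9:45.712"))  # 585712
```

Working in integer milliseconds avoids floating point rounding when lap times are later compared or averaged.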
While doing the initial filtering, we noticed that the dataset was missing quite a bit of race data for the 2000-2003 seasons. We initially left those entries as-is so that we had data to work with. However, as we want complete data, we did more pre-processing to fill in the missing entries, cross-referencing with Wikipedia. This was done in a spreadsheet, which was then exported to a Python notebook, where we set up a calculation to see how much the cars have improved per year. The calculation is based on a Reddit post whose methodology was convenient enough to apply to our data.
The percent decrease/increase was calculated from the best lap time at each circuit between two subsequent years. From the resulting table of multipliers (1.xx for an increase, i.e. bad; 0.xxx for a decrease, i.e. good), we calculated a theoretical lap time for each year, starting at a theoretical lap time of 90 seconds multiplied by the first year’s (2000) multiplier. Each subsequent multiplier is then applied to the theoretical lap time of the previous year. As such, the theoretical lap time, i.e. a lap time adjusted by the yearly average multipliers, serves as a more accurate overview for our visual that displays the averaged fastest lap time by year (LT0).
This “sub dataset” we produced is an array of the following structure:
[ [Year, Calculated Theoretical Lap Time in Seconds ], [ … ], [ ... ] ]
Where index 0 is the year, and index 1 is the calculated value.
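The multiplier chain described above can be sketched as follows. This is an illustration under stated assumptions, not our notebook code: the function names are hypothetical, "two subsequent years" is read as consecutive calendar years at the same circuit, the per-year multiplier is taken as the plain mean over circuits, and the lap times in the example are made-up toy values.

```python
from collections import defaultdict


def yearly_multipliers(best_laps):
    """best_laps maps (circuit, year) -> best lap time in seconds.

    Returns {year: mean of lap[year] / lap[year - 1]} over the circuits
    that have a best lap in both years (assumed averaging scheme).
    """
    ratios = defaultdict(list)
    for (circuit, year), lap in best_laps.items():
        previous = best_laps.get((circuit, year - 1))
        if previous is not None:
            ratios[year].append(lap / previous)
    return {year: sum(vals) / len(vals) for year, vals in sorted(ratios.items())}


def theoretical_laps(multipliers, start_seconds=90.0):
    """Chain the multipliers, starting from a 90 second theoretical lap."""
    current = start_seconds
    result = []
    for year in sorted(multipliers):
        current *= multipliers[year]  # applied to the previous year's result
        result.append([year, current])
    return result


# Toy data only: two circuits, three seasons.
toy = {
    ("a", 2000): 100.0, ("a", 2001): 99.0, ("a", 2002): 89.1,
    ("b", 2000): 80.0, ("b", 2001): 80.8,
}
print(theoretical_laps(yearly_multipliers(toy)))
```

The output has the same [year, value] layout as the array described above. In this sketch the earliest year has no predecessor and therefore no multiplier, so a first-year multiplier would have to be supplied separately.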
Further pre-processing renamed certain very long circuit names to their alternate, more colloquial names. For example, Autodromo Internazionale Enzo e Dino Ferrari was renamed to Imola, which is what most people call it; the long name is its formal/official one. For the purposes of this viz, and in general, the colloquial names are the right choice, as they are known by every fan and are even used by the F1 live commentators.
The pre-processing was done in Python. The scripts are provided in the data-preprocessing branch linked above. First we get the cars by season, and from there we get the URIs inside the scraped data. The tables are luckily standardized, so we can then scrape each car’s specific page for the constructor, the model name, the amount of power, the weight, and the drivers who drove it. When data was missing, we first filled it with the FIA regulation figures for that year and then searched the internet for the actual number (as Wikipedia did not have complete data, unfortunately). We add our own field called “group” to the data, which was explained in the previous section. We also apply a unique color per group, which is used in the project’s scatterplot.
We initially collected data from Kaggle for this, and then filtered the data so that only the tracks we support are left. Furthermore, the data is merged so that each JSON object has all 3 sectors we need for a given track/year. We manually updated the JSON output to include 2014 data from the FIA website, so that we could better illustrate the difference (mostly the improvement) of the cars when comparing 2014 and 2019.
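The sector merge described above can be sketched roughly as below. The row layout and the field names (`circuit`, `year`, `sector`, `millis`, `lapMillis`) are assumptions made for illustration, and the numbers are toy values, not real sector times.

```python
def merge_sectors(rows):
    """Group per-sector rows into one object per (circuit, year).

    Only objects that have all three sectors are kept; the lap time is
    the sum of the three sector times.
    """
    by_key = {}
    for row in rows:
        key = (row["circuit"], row["year"])
        by_key.setdefault(key, {})[row["sector"]] = row["millis"]

    merged = []
    for (circuit, year), sectors in sorted(by_key.items()):
        if set(sectors) != {1, 2, 3}:
            continue  # incomplete track/year, skip it
        ordered = [sectors[1], sectors[2], sectors[3]]
        merged.append({
            "circuit": circuit,
            "year": year,
            "sectors": ordered,
            "lapMillis": sum(ordered),
        })
    return merged


toy_rows = [
    {"circuit": "suzuka", "year": 2019, "sector": 1, "millis": 30000},
    {"circuit": "suzuka", "year": 2019, "sector": 2, "millis": 40000},
    {"circuit": "suzuka", "year": 2019, "sector": 3, "millis": 20000},
    {"circuit": "monaco", "year": 2019, "sector": 1, "millis": 18000},
]
print(merge_sectors(toy_rows))
```

Keeping the three sector times next to their sum in one object means the stacked bar chart and the animation can both read from the same record.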
The reasons users may want to look at this visualization are multifold:
Because power (and related) figures are always of interest to car enthusiasts, the mechanical changes charts should be of interest, if only to see how cars end up clustering in terms of power, or how power levels go up/down. This would allow the user to further search as to why certain cars have gone down in power, or why all cars have gone down (e.g. it could be an engine supplier, or different regulations, etc.).
The overview to “sub-overview” to detail view unidirectional interactions are explained in the section below.
Note: In our write-up we will mostly refer to this as Mechanical Changes, as per PM1 and PM2, to avoid confusion. We believe that Power and Weight is an apt title and description; however, we realize that throwing around too much similarly worded jargon may be confusing to non-car enthusiasts.
There are many reasons why the user would be interested in this visualization, namely:
Overview: The overview allows users to identify trends in the averaged fastest lap times in F1 over the years. This lets us see how overall car performance is changing (better performing cars have faster lap times). For example, there may be a period where lap times run contrary to the prior trend, which in turn may prompt users to research the potential cause of this increase in time.
Overview: LT0, the overview, allows users to look up the fastest overall seasons through the theoretical lap times that we calculated for this chart.
Small Multiples: For users who are unfamiliar with the tracks and F1, the small multiples provide a way to browse the lap time data and learn which tracks were run in which seasons.
Small Multiples: Compare the lap time trends per circuit by viewing the trend lines in a given circuit’s chart. We have implemented a “disable small multiple points” button that hides the points so that only the trend line is shown, if users so choose. Hopefully this makes general trends even easier to understand.
Small Multiples: A user can find the best/worst times, or they can select whichever year(s) they want and visually compare those.
Small Multiples: Through the use of the interpolated/dashed line, a user can see when F1 does not use a certain track, and then comes back to it.
Small Multiples: With the tool tip, the user can see which driver set the fastest lap and they can potentially see the dominant drivers. See section 6.3 for a note regarding this goal/task.
Beyond it being pretty cool, some reasons why the user would be interested in this visualization are:
Stacked Bar chart: The stacked bar chart has developed into a view for presenting the part-to-whole relationship of sector times to lap time (i.e. the sum of sector times becomes the lap time). Due to limited data availability and time, we only display 2014 and 2019.
Animation: This is meant to provide an innovative way to visualize sector time differences, which users can compare and contrast. Again, due to time and data limitations we were only able to display a total of 2 years (2014 and 2019), and 3 circuits: Suzuka Circuit, Circuit de Monaco, and Marina Bay Street Circuit. Users can watch a “race” between 2014 and 2019 for their selected track and make a comparison grounded in an animated race between the two lines, which visualizes how much faster or slower the sector times and the cars are.
Animation: Users can enjoy the line animation as a simulation of what a head-to-head race between the two fastest cars of the respective years’ grands prix would resemble if it were to occur in real life. Of course, this is abstracted to the sector times from the two years’ worth of data.
The Mechanical Changes section includes three major parts: The overview, sub overview, and detail view.
F1 cars are some of the highest horsepower/lowest weight cars in the world, especially in the single-seater, open-cockpit category. As such, this visualization is meant to give an understanding of how the constructors have progressed their cars (over the 2000-2020 seasons).
We split this into multiple linked views because there is a plethora of data to visualize and so it can get convoluted/difficult to view in one or two views. We will explain this further below.
Marks: Point mark for each year’s averaged power-to-weight ratio. Line mark connecting the points.
Channels:
Marks: Point mark for each car’s power-to-weight ratio.
Channels:
Marks: Point mark for each car’s power over the years. Point mark for each car’s power-to-weight ratio over the years. Line mark connecting the points.
Channels:
Initially we decided that a static chart would be sufficient, but after some discussion we expanded this section in Milestone 2 to include 2 line charts on top of the existing scatterplot, which portray the trends of power-to-weight and power of an F1 car over the years (dual y-axis). For Milestone 3, we expanded it further with an Average Power-to-Weight Ratio over the Years chart that filters by year; we did this because there was a great deal of occlusion. This new overview filters the scatterplot (formerly the overview).
We consider this a filtered, multiform, overview/detail type of visualization, where the overview has data that is aggregated per year and is selectable to show items in its detail view(s). Furthermore, when nothing is selected in the overview, there is nothing to view in its detail view, the “sub-overview”/filtered scatterplot.
The relation between the scatterplot and the two line charts (power/power-to-weight) is still the same as in Milestone 2. The scatterplot has all the appropriate cars, and the detail view shows specific attributes for a subset of the data. The subset aspect applies to the relation between the overview and the sub overview/filtered scatterplot as well.
In the overview, we have a line chart that displays the trend of power-to-weight ratio, averaged by year. The Y-Axis is averaged power-to-weight ratio, while the X-Axis is years, from 2000 to 2020. This view helps the user in a few ways: (1) it shows a trend of the ratio that gives an executive summary of the power-to-weight changes over the years, and (2) it acts as a filter for the scatterplot to include only the subset of years that the user is interested in exploring, which subsequently propagates the filtering to the detail view. By default, 2010 and 2020’s averaged power-to-weight ratios are selected (which also populates the scatterplot).
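As a rough sketch of the aggregation behind the overview, the snippet below averages power-to-weight per year and picks the years with the lowest and highest average as a default selection. The field names (`year`, `power`, `weight`) and function names are assumptions, and the cars are toy values chosen only so the arithmetic is easy to follow.

```python
from collections import defaultdict


def average_power_to_weight(cars):
    """Return {year: mean(power / weight)} over all cars of that year."""
    ratios = defaultdict(list)
    for car in cars:
        ratios[car["year"]].append(car["power"] / car["weight"])
    return {year: sum(vals) / len(vals) for year, vals in sorted(ratios.items())}


def default_selection(averages):
    """Pick the years with the minimum and maximum averaged ratio."""
    lowest = min(averages, key=averages.get)
    highest = max(averages, key=averages.get)
    return sorted({lowest, highest})


toy_cars = [
    {"year": 2010, "power": 600, "weight": 600},
    {"year": 2010, "power": 540, "weight": 600},
    {"year": 2015, "power": 720, "weight": 600},
    {"year": 2020, "power": 900, "weight": 600},
]
averages = average_power_to_weight(toy_cars)
print(default_selection(averages))  # [2010, 2020]
```

The selected years would then be used to filter which cars are drawn in the scatterplot.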
We sometimes call this a sub overview: it is the result of the filtered selection from the (actual) overview, and yet the filtered scatterplot itself acts as an overview for what we call the detail view. In this section, we have a scatterplot that displays all the cars for the selected years, positioned by their power and weight. The Y-Axis is weight, and the X-Axis is power. This view allows the user to see a few things: (1) when combined with the tooltip, where a car lies in terms of actual figures in comparison to all the others, and (2) groups/clusters that might pop up. F1 has regulations on power/weight, but they are limits, and not every constructor can always reach them; many often do not. By default, and by way of the overview’s default, the scatterplot shows cars from 2010 and 2020, as these years have, respectively, the minimum and maximum averaged power-to-weight ratio. None of the points are actually selected, however, which leaves the detail view blank.
In the detail view, we have 2 charts that show a subset of the data; the subset is based on the selected car’s constructor. The Y-Axis of Horsepower over the Years (Chart 1, the upper chart) is the horsepower figure for the selected constructor’s cars, and the Y-Axis of Power-to-Weight over the Years (Chart 2, the lower chart) is the power-to-weight ratio for that constructor’s cars. Both charts share the same X-Axis, which is years (2000 to 2020), and the data displayed is whatever is available, as not every team has competed in every year (e.g. HAAS F1 is a very new team, starting their run in 2016). By default, the detail view is blank, as nothing is selected in the filtered scatterplot.
The line chart acts as an overview/filter for the scatterplot and has a unidirectional interaction. Upon selecting a year (or years) in the overview line chart, the data for the individual cars from the user’s selection is displayed in the scatterplot, color coded by constructor group through hue.
Fig 1. Nothing selected at all (this is not the default, we use this to illustrate our viz).
Fig. 2. A single year is selected in the overview line chart, populating the scatterplot. Note the tooltip.
Fig. 3. More years are selected, and more points are plotted.
The scatterplot acts as a “sub-overview” for the initially empty line graph and has a unidirectional interaction. Upon selecting a car from the scatterplot, the associated line charts display changes over the years for that specific car’s constructor.
Fig. 4. We select the 2015 Mercedes. Notice that its point is now opaque, and the line charts are now populated.
If the user clicks the “Ferrari F2001” point in the scatterplot from the 2002 data, then the detail view will show the horsepower and power-to-weight for all the Ferraris that have competed in F1 from 2000 to 2020. If the user then clicks the “Jordan EJ11” point in the scatterplot, the data shown in the detail view will transition to that of all of that constructor’s cars. As mentioned previously (see the ‘Group’ attribute explanation), the data shown is that of the constructor itself, and names change. This means that when we click a Racing Point car, we will also see Jordan and Midland F1 cars, because the team was bought out and the name was changed.
See the following series of images for a walk through of our example.
Fig. 5. (above) Ferrari F2001 is selected from the scatter plot, as 2002 was selected on the overview and thus its data is displayed.
Fig. 6. (below) Jordan EJ11 is clicked, and the line charts change.
Once a point is clicked in the scatterplot, the appropriate cars within that grouping are highlighted by being drawn opaque, in contrast to the translucent unselected points. The appropriate cars are those explained above: a given constructor’s cars. That is to say, if we click a Ferrari, all the Ferraris will be highlighted. Because the overview filters by year, multiple years’ worth of data must sometimes be selected to find additional data points that belong to the same constructor group, since there may be few data points per year.
Figs. 7 and 8 show the Ferrari constructor group selected; first with only 2015 data (thus, 1 car) and then with 2001 and 2015 data showing, and so both Ferrari points are highlighted automatically.
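The group-based highlighting described above can be sketched as a lookup from constructor name to group, followed by an opacity flag per point. This is a language-agnostic illustration written in Python, not the visualization's front-end code; the group labels, the `opaque` field, and the two placeholder model names are assumptions.

```python
# Hypothetical group labels: renamed teams share one group.
GROUP_BY_CONSTRUCTOR = {
    "Ferrari": "ferrari",
    "Jordan": "jordan_lineage",
    "Midland F1": "jordan_lineage",
    "Racing Point": "jordan_lineage",
}


def highlight_group(cars, clicked_car):
    """Mark every car in the clicked car's group as opaque, others as not."""
    group = GROUP_BY_CONSTRUCTOR[clicked_car["constructor"]]
    return [
        {**car, "opaque": GROUP_BY_CONSTRUCTOR[car["constructor"]] == group}
        for car in cars
    ]


cars = [
    {"constructor": "Ferrari", "name": "F2001"},
    {"constructor": "Jordan", "name": "EJ11"},
    {"constructor": "Midland F1", "name": "placeholder model A"},
    {"constructor": "Racing Point", "name": "placeholder model B"},
]
for car in highlight_group(cars, cars[3]):
    print(car["constructor"], car["opaque"])
```

Clicking the Racing Point car marks the Jordan and Midland F1 cars as opaque too, while the Ferrari stays translucent.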
When switching from one data point to another, the connection line in the detail view is animated for ease of comparison and viewing pleasure.
There are tooltips on the overview, sub overview, and detail view that appear upon hovering over a data point. They display the averaged power-to-weight ratio and year in the overview, the car’s basic data (its constructor, name, year/season, etc.) in the sub overview, and the car’s basic data in the same format (constructor, name, year, weight, power, power-to-weight) in the detail view.
Fig. 9. (Left Top) shows the tool tip on the overview line chart. Fig. 10. (Right Top) shows the tooltip on the “sub overview”/filtered scatterplot. Figs 10 and 11 (bottom left and right) show tooltips for the detail view line charts.
Lap times are one of the most important metrics in any form of racing, and as F1 cars are some of the fastest circuit race cars on the planet, they can produce some ridiculously quick lap times around the circuits they race on. Beyond fast laps being very exciting to watch, lap times in general tell us the overall performance of the car. They are quite possibly the most important thing to have in this project.
We left LT0 the same as in PM2, as a line chart. We chose this because it is a reasonable way to show the trend of the theoretical lap, which in turn allows users to see the progression of improvement quickly. LT1 has been changed to small multiples, which allows users to see the general lap time trends (and their highlighted years as well). Small multiples seemed to be a reasonable way to show multiple trends for multiple race circuits.
A potentially interesting feature we included is the ability to disable the points in the small multiples so that only the lines are shown. This should allow users to see the trends even more easily.
As the axes for this section are a little different, we will explain them here as a design rationale. For LT0, we have a traditional Y-axis label but no X-axis label. This was left out for a cleaner aesthetic, and also because we believe that, given the contextual clue of “Yearly” and the project being based on the years 2000-2020, the x-axis is self-evident.
For LT1 (the small multiples), we have a traditional Y-axis only on the left side (i.e. the row with Monza, Interlagos, etc.); see the image on the left. The reason for this was that having every y-axis label would have (1) caused poor readability, (2) caused higher (passive) cognitive load due to the amount of stuff on screen, and (3) looked visually unappealing. For similar reasons, we display the x-axis only along the bottommost row. We do keep the ticks on every small multiple as a visual cue, which hopefully acts as a sufficient reminder of the y and x axes and their values.
Marks:
Channels:
Marks:
Channels:
The original idea was to have two parts to this view: the overview, which we will refer to as LT0, and the detail view, which we will call LT1. Together, they form a multiform, overview type of visualization. While the general idea was kept consistent, the type of visualization for the detail view (LT1) has been changed from a stacked scatterplot to small multiples of line charts to address excessive occlusion.
LT0 is our overview. It is a line chart that displays averaged fastest lap times over the years. The Y-Axis is averaged best theoretical lap time in minutes, and the X-Axis is years from 2000 to 2020. LT0 serves multiple purposes: (1) users can view how the averaged fastest lap times have changed over the years using theoretical times from derived multipliers, and (2) the chart acts as an overview for the small multiples line charts seen below in LT1. For PM3, the Y-Axis was changed to theoretical lap times for a more accurate, multiplier-based calculation of the yearly lap time increase/decrease (see the section on the Pre-Processing Pipeline for how the calculations were performed).
In the detail view, we have LT1, a small multiples line chart and a detail view/disaggregation of LT0. The Y-Axis is the fastest qualifying lap time (in minutes), and the X-Axis is the years inspected (2000-2020). Each point on each line chart represents the fastest qualifying lap time for a year, and there is one line chart per F1 circuit. LT1 allows users to (1) swiftly compare and contrast fastest lap times over the years by track, (2) see the specific active timelines of different circuits, as well as trends across tracks, and (3) see gaps between usage of tracks through the dashed lines used for interpolated data.
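One way to decide which line segments in a small multiple should be dashed is to look at the gaps between the years a circuit was actually raced. The sketch below is an assumption about how such a split could be computed (the function name is hypothetical); it is not taken from the project code.

```python
def split_segments(years):
    """Split a circuit's race years into solid and dashed line segments.

    Consecutive years form a solid segment; any larger gap becomes a
    dashed (interpolated) segment between the two surrounding points.
    """
    ordered = sorted(years)
    solid, dashed = [], []
    for start, end in zip(ordered, ordered[1:]):
        if end - start == 1:
            solid.append((start, end))
        else:
            dashed.append((start, end))
    return solid, dashed


solid, dashed = split_segments([2000, 2001, 2002, 2005, 2006])
print(solid)   # [(2000, 2001), (2001, 2002), (2005, 2006)]
print(dashed)  # [(2002, 2005)]
```

Each dashed pair marks a period in which F1 did not use the track before coming back to it.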
There is a bidirectional interaction between LT0 and LT1. Upon clicking year points in LT0, each selected year is displayed in LT1 as a red highlighted point for that year across all tracks.
Fig. 12. (above) The view for when nothing is selected.
Fig. 13. (below) shows the year 2012 selected in the LT0 overview. This highlighted change is reflected across all 2012 points in the LT1 detail view.
A user can also click a point that is not currently selected on the small multiples line charts (LT1) and see the corresponding year be highlighted in LT0 and other tracks in LT1.
Fig. 14. (above) When an unselected year (2000) is clicked on in the A1-Ring circuit in LT1, the year’s point is highlighted across all other tracks as well as the overview (LT0).
Users may also toggle already highlighted points on LT0 by deselecting them on the small multiples, or deselect points in LT1 to see the corresponding year deselected from the LT0 overview.
All highlights are made distinguishable by the color red, so that the selected years stay distinct. Upon re-clicking either the highlighted point in LT0 or any of the highlighted points in LT1, the hue is reset to black, which signifies that the year is unselected.
Fig. 15. Clicking on year 2000 when it is already highlighted in LT0 deselects the year from LT0 and all of LT1. Note the change of hue back to black upon deselection.
Fig. 16. Clicking on the 2012 lap time data in Silverstone Circuit results in deselection of 2012 in all other circuits, as well as deselection in the overview.
For example, selecting 2003 in LT0 will result in a highlight on LT1 for all the year 2003 points across all tracks, and deselecting a 2003 point from the A1-Ring circuit would deselect all 2003 points in all the small multiples line charts.
Fig. 17. (above) Selecting 2003 in the LT0 overview highlights all 2003 points in the LT1 small multiples detail view.
Fig. 18. (below) Deselecting from A1-Ring results in deselecting all highlighted points, in both the overview and detail view.
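The bidirectional behaviour can be modelled as a single shared set of selected years that both views read from and toggle. The sketch below is a language-agnostic illustration written in Python with hypothetical function names; it is not the visualization's front-end code.

```python
def toggle_year(selected, year):
    """Return a new selection with the year added if absent, removed if present."""
    updated = set(selected)
    if year in updated:
        updated.discard(year)
    else:
        updated.add(year)
    return updated


def point_color(selected, year):
    """Selected years are drawn red in both LT0 and LT1, others black."""
    return "red" if year in selected else "black"


selection = set()
selection = toggle_year(selection, 2003)   # click 2003 in LT0
print(point_color(selection, 2003))        # red, in every small multiple
selection = toggle_year(selection, 2003)   # click a 2003 point in LT1
print(point_color(selection, 2003))        # black again, in both views
```

Because both views derive their colors from the same set, a click in either view is reflected everywhere without any view-specific bookkeeping.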
Suppose the user wants to compare the best time at Suzuka Circuit in 2003 to the best time at Suzuka in 2019. They can click on both years to highlight the points of interest in the Suzuka Circuit line chart. This also highlights the years 2003 and 2019 across all circuits. This way, if the user wishes to compare the Suzuka times to, say, Circuit de Monaco, no additional highlighting is necessary.
Fig. 19. (above) A comparison of Suzuka Circuit times with emphasis on 2003 and 2019. Without further selection, the comparison can be expanded to other highlighted circuits, such as Circuit de Monaco.
An alternative approach without clicking would be to hunt the points down in LT1 by hovering over each point and looking at the tooltip.
Figs. 20 and 21 (above) The tooltip information for 2003 and 2019 lap times respectively on Suzuka Circuit.
Upon hovering over a data point, LT0 displays the average theoretical lap time for the specified year, in seconds, through the tooltip. LT1’s tooltip shows the year, the lap time at a specific track during that year in (MM:SS.s) time format, and the driver who set that lap time on the specified track.
Fig. 22. (above) Tooltip for the overview, LT0. Includes year and theoretical lap time.
Fig. 23. (below) Tooltip for the detail view, LT1. Includes year, best lap time and driver information.
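For the LT1 tooltip, a lap time stored in milliseconds has to be turned back into a readable string. The sketch below assumes a millisecond input and three decimal places, which is one possible reading of the (MM:SS.s) format mentioned above; the function name is hypothetical.

```python
def format_lap_time(millis: int) -> str:
    """Format integer milliseconds as 'MM:SS.mmm' for display."""
    minutes, remainder = divmod(millis, 60_000)
    seconds, ms = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{ms:03d}"


print(format_lap_time(50404))   # 00:50.404
print(format_lap_time(585712))  # 09:45.712
```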
In addition, you can disable points, or clear selection, or reset with the corresponding buttons. Reset displays the default chart, which has 2014 and 2020 arbitrarily highlighted.
Fig. 24. (above) When the disable small multiple points button is pressed, all points in the small multiples are disabled, which toggles a view suited to trend searching. Note that the button label has changed to allow for enabling the points.
Fig. 25. (above) Clear selection button removes all selected/highlighted points from both views.
Fig. 26. When the reset button is pressed, the default view for LT0 and LT1 is displayed, with 2014 and 2020 highlighted.
F1 cars are constantly evolving. In order to show the lap time difference over the years, we look at Suzuka Circuit, Circuit de Monaco, and Marina Bay Street Circuit and put the best qualifying lap sector times of 2014 and 2019 into a bar chart. We also (and more interestingly) animate said sector times in a head-to-head race. This provides a
Marks: A glyph that is a composite of line marks that encodes lap times for each year/sector
Channels:
Marks:
Channels:
Our original plan was to support selection of the year (from 2000 to 2020) and animate the whole lap in one constant animation. However, the line would then move in a linear manner and would not show the change over different portions of the lap. We were also unable to obtain that much data over 20 years, so we decided to show only the sector times of 2014 and 2019.
The view is a multiform one: both the stacked bar chart and the animated view use the same sector time data, but they visualize it in wildly different manners.
Since we have limited data on sector times, the bar chart displays the best sector times for each track in 2014 and 2019. The Y-Axis is the best sector time, while the X-Axis is the years 2014 and 2019. This view gives the user a better understanding of (1) how the best lap time changes between the years on the same track, and (2) what the change is in each sector.
In the animation, we show all three sectors side by side on a track for both 2014 and 2019. The user gets a real-time feel for how much faster a 2019 F1 car is compared to a 2014 car, sector by sector.
The radio buttons serve as the overall filter by track, that determines the rest of the visualizations displayed in this section.
Fig. 27. (above) Default state for the visualization. Radio button for Suzuka is selected.
While the default state for the animation view automatically selects Suzuka Circuit, upon switching to a different radio button, both the stacked bar chart and the track animation SVG are altered to match the selection. The stacked bar chart provides an overall comparison of the lap times between 2014 and 2019, further separated into sector times. Upon clicking the start race button, two lines are drawn side by side for the two years’ lap time data, at speeds that correspond to the raw sector time data.
Fig. 28. (above) Switching selection from Suzuka Circuit to Circuit de Monaco using the radio buttons; the stacked bar chart and circuit shape are updated accordingly for the new circuit.
Fig. 29. (above) Start race button is pressed. Animation starts with the comparison of two years sector data, displayed side by side in a racing format. Refer to the legend for distinguishing between the years and sector times.
The line animations are drawn on the geometric circuit shape, with the sectors differentiated both by their position on the circuit map and by their color hue. After all 3 sectors are drawn (the overall lap time is the sum of all 3 sector times), the circuit race between the two years is complete, and the lines at full progress remain on the track.
Fig. 30. (above) Circuit animation has reached completion. Geometric shape fully colored.
Selecting a new track will reset the track with no animation playing, while pressing the start race button again will restart the animation.
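The timing logic behind the head-to-head animation can be sketched as follows. This is a minimal sketch, not our rendering code: it assumes each sector's line is drawn over a duration proportional to its sector time, and the `speedup` parameter, the function name `sector_progress`, and the sector times are hypothetical.

```python
def sector_progress(sector_times, elapsed, speedup=1.0):
    """Return the fraction (0.0 to 1.0) of each sector drawn so far.

    sector_times: sector durations in seconds, in lap order.
    elapsed: seconds since the start race button was pressed.
    speedup: hypothetical playback factor (2.0 plays twice as fast).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    race_clock = max(elapsed, 0.0) * speedup
    progress = []
    for duration in sector_times:
        if race_clock >= duration:
            # Sector fully drawn; carry the remaining time into the next one.
            progress.append(1.0)
            race_clock -= duration
        else:
            # Sector partly drawn (or not started); later sectors stay at 0.
            progress.append(race_clock / duration)
            race_clock = 0.0
    return progress


# Made-up sector times in seconds, for illustration only.
lap_2014 = [30.0, 40.0, 20.0]
lap_2019 = [28.0, 36.0, 18.0]

after_50s_2014 = sector_progress(lap_2014, 50.0)  # [1.0, 0.5, 0.0]
after_50s_2019 = sector_progress(lap_2019, 50.0)
```

Because both years share the same clock, the faster lap finishes each sector earlier, and that is what makes the two lines visibly pull apart on the track.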
We have added tooltips to the stacked bar charts, which include time information such as the specific sector time and the overall elapsed lap time for the specified track during that year.
Fig. 31. (above) Tooltip for stacked bar chart. Includes sector time and lap time information.
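A tooltip of this kind needs little more than a time formatter and a sum over the sectors. The sketch below is one possible way to build the text, assuming times are stored in seconds and that the lap time shown is the sum of the three sector times; `format_time`, `tooltip_text`, and the sample values are hypothetical.

```python
def format_time(seconds):
    """Format a duration in seconds as m:ss.mmm (e.g. 91.5 -> '1:31.500')."""
    total_ms = round(seconds * 1000)
    minutes, remainder_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(remainder_ms, 1000)
    return f"{minutes}:{secs:02d}.{ms:03d}"


def tooltip_text(year, sector, sector_times):
    """Build the tooltip line for one stacked bar segment.

    sector is 1-based; sector_times lists the sector durations in lap order.
    """
    sector_time = sector_times[sector - 1]
    lap_time = sum(sector_times)  # the lap is the sum of its sectors
    return (f"{year} sector {sector}: {format_time(sector_time)} "
            f"(lap: {format_time(lap_time)})")


# Made-up sector times in seconds, for illustration only.
example = tooltip_text(2019, 2, [30.0, 40.5, 20.0])
```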
Credits and sources will be listed by visualization, and then general.
Our project evolved quite a bit from milestone 1. We added more charts, more functionality and flip-flopped between LT2 implementations. Overall, there was more scope creep than there should have been.
For example, our initial estimates did not account for how long the data processing portion would take.
Each visualization has gone through changes, some minor and some drastic.
This section of the project was expanded to include one more chart. We went from a scatterplot of the cars and 2 line charts, to a line chart of overall power-to-weight ratios per year that acts as a filter for the scatterplot of cars (which now displays based on selected years).
The scatterplot has also been given a jitter. Both the filtering chart and the jitter were implemented because we have a lot of overplotting: due to the nature of our data for this part of the project (race cars whose specifications can be both secretive and regulation-maxing), many points lie on top of each other. The two changes combine to reduce scatterplot occlusion by putting fewer points onto the scatterplot and by shifting each point's X and Y coordinates slightly for readability.
The rest of the mechanics have not changed.
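The jitter itself is simple to describe in code. The sketch below assumes a small uniform random offset on both coordinates; the `amount` and `seed` parameters, the function name `jitter`, and the sample points are hypothetical, and the offset size used in the project may differ.

```python
import random


def jitter(points, amount=0.5, seed=None):
    """Return a copy of points with each x and y shifted by a small random offset.

    points: iterable of (x, y) pairs.
    amount: maximum absolute offset applied to each coordinate.
    seed: optional seed so the layout is reproducible between redraws.
    """
    rng = random.Random(seed)
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Three cars with identical (made-up) specifications that would overplot.
cars = [(800.0, 1.2), (800.0, 1.2), (800.0, 1.2)]
spread = jitter(cars, amount=0.05, seed=42)
```

Fixing the seed keeps the points from jumping around every time the filter changes. The trade-off is that a jittered position is no longer the exact data value, so `amount` should stay small relative to the axis scale.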
This section’s y-axis calculations were altered significantly. Initially, we had a placeholder method for calculating the average lap time per year; this was then changed to a theoretical lap time in minutes, based on yearly multipliers derived in the data processing step. The choice of visualization did not change, however: it remained a line chart with the same intended axes.
This visualization was changed from a stacked scatter plot with line connections into small multiples, which removed the occlusion issues we had with the stacked scatterplot with line connection (see left image). The result is a set of small multiples that had the labels on the outside left/bottom and is overall much easier to read (see bottom left image). After creating the small multiple, we figured that it would be a good idea to let people know that some data is actually “missing”—not every track is used in every season of F1. We did this with a dashed line and let people know that it is interpolated data (see bottom right image).
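To make the "missing" seasons concrete, here is a minimal sketch of how gaps could be filled and flagged so that the chart knows which segments to draw dashed. It assumes linear interpolation between the nearest seasons on either side, which is one common choice and not necessarily the exact method we used; `interpolate_missing` and the lap times are hypothetical.

```python
def interpolate_missing(times_by_year):
    """Fill gaps between known seasons by linear interpolation.

    times_by_year: {year: lap time in seconds, or None if the track was unused}.
    Returns a list of (year, time, is_interpolated) sorted by year. Years with
    no known season on both sides are left out rather than extrapolated.
    """
    years = sorted(times_by_year)
    known = [y for y in years if times_by_year[y] is not None]
    result = []
    for year in years:
        value = times_by_year[year]
        if value is not None:
            result.append((year, value, False))
            continue
        before = [k for k in known if k < year]
        after = [k for k in known if k > year]
        if not before or not after:
            continue  # nothing to interpolate between
        y0, y1 = before[-1], after[0]
        v0, v1 = times_by_year[y0], times_by_year[y1]
        filled = v0 + (v1 - v0) * (year - y0) / (y1 - y0)
        result.append((year, filled, True))
    return result


# Made-up lap times in seconds; None marks a season the track was not used.
series = interpolate_missing({2010: 90.0, 2011: None, 2012: 88.0, 2013: None})
```

The `is_interpolated` flag is what a renderer would check to switch a segment to a dashed stroke, so the reader can tell real data from filled-in data.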
Our visualization goals reverted to our initial lofty intentions of having sector times animated. We were mistakenly under the impression that we needed a plethora of tracks to be available, not understanding that this was effectively a (polished) tech demo. This misunderstanding caused us to change our goal of comparing sector times to comparing overall lap times — a less interesting visualization. In the latter half of Milestone 3, the teaching staff alerted us to this fact and we got to work changing the static speed lap viewer to a sector time based lap viewer.
For mechanical changes, our technical goal developed alongside the visualization, due to issues that arose in the visualization. After consulting the teaching staff and discussing within our team, we expanded our design to include a filter for the old overview, which became a “sub overview”. That is, the scatterplot is still the overview for the power progress/power-to-weight progression line charts as it was before, but it now has its own overview.
This change helped remove some occlusion of data in the old overview’s (current “sub overview”) scatterplot, where overlap made points very hard to distinguish in some parts of the graph. This was a fairly quick transition: we relied heavily on how the existing detail view and old overview were set up to instantiate a new line chart overview by year. As expected, this additional visualization did fix our initial issue with occlusion for the car data.
In the grand scheme of things, the LT0 visualization has not changed much from our initial concept as an averaged value chart that serves as an overview to the more detailed dive into the specific lap time across different tracks, as covered in LT1.
For LT1, our technical goals evolved along with the visualization—or rather, because of the visualization. Due to the occlusion mentioned in the previous section, we discussed some options with the team, conferred with the teaching staff and decided that small multiples were the way to go. This was a learning experience in both getting the small multiples working (though thankfully some good reference did help), and, then afterwards, getting the circles to draw on the line—this had no reference; after speaking with Steve, he correctly (and very quickly) pointed us in the direction of a data-pass issue which did solve our issue.
For LT2, we replaced the standard bar chart with a stacked bar chart to show the sector times, as this gives a better part-to-whole representation. Since we limit the number of tracks to 3, we use radio buttons for track selection instead of a drop-down menu. We also narrowed the years down to 2014 and 2019 only, so the bar chart interaction that let users choose between years is disabled.
Our proposal was realistic — we implemented our charts as described (for better or for worse). Our changes served to further improve legibility/conform to branding/apply foundational theory.
This is difficult to measure — we figured out what we wanted to do. However, this does not mean that there weren’t some initial hacks being used (such as a counter to make sure axes only appended N amount of times), before realizing “better” ways to do it.
One thing that we did not have the time to do was connect the dataset of cars and lap times. The preprocessing was getting to be too much—it would have been a nice addition for LT1. However, the overall point of LT1 was viewing the general trend and so the viz keeps its usefulness in that aspect, but it would be a great supplement to be able to see which manufacturers are dominant, or have been setting the fastest laps. This idea alone must be its own visualization however, as there is a lot to do for it, and a lot we can do for it.
6.6) If you were to make the project again from scratch (or any other interactive visualization), what would you do differently?
The data preprocessing was a nightmare for what we picked, so the most pertinent thing that we would do differently is pick a good dataset. Everything else went well and overall it was a good learning experience.
We did not give estimates per task, instead we opted for overall hours worked. We will provide estimated hours per task, then tally them up, and compare them to our estimated weekly hours. New items are indicated by (NEW) on the item listing.
Lydia |
|
|
PM1 Estimate for Lydia: {16 hr}. Actual Total from PM3: {30+ hrs}. |
Robert |
|
|
PM1 Estimate for Robert: {16 hr}. Actual Total from PM3: {28 hrs}. |
Shabab |
|
|
PM1 Estimate for Shabab: {16 hr}. Actual Total from PM3: {30+ hrs}. |
Overall, our tasks shifted around from person to person, but we got much of what we planned done when compared against our PM3 work schedules, and we are on track for the most part.
The data was more of a nightmare than we expected. We spent an inordinate amount of time on it… which was… unfortunate, to say the least.
Also, we are evidently not good at estimating the time required, which is consistent with our previous milestone.