Interactive data visualization of the effect of handedness on the performance of baseball players, built with HTML5, CSS3, JS, and the D3 library.

domwon/Baseball-Data

Summary

A CSV dataset of baseball players, with their characteristics and statistics, was parsed and turned into an interactive data visualization using common web development languages (HTML5, CSS3, JavaScript) and the D3 library. The visualization focuses on the effect of a player's handedness on batting average and on the number of home runs (HRs) hit.

Data Cleaning

We cleaned the original dataset of 1157 baseball players (stored in baseball_data.csv) by removing any player who had not hit a home run or had no batting average. To expedite the cleaning process, these players were removed directly in Excel. The cleaned dataset contains 871 players and is stored in baseball_data2.csv.
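The cleaning itself was done by hand in Excel, but the same rule could be expressed in code. The sketch below is only an illustration under assumptions: the column names (`name`, `handedness`, `avg`, `HR`), the sample rows, and the reading of "no batting average" as a value of zero or a missing value are all hypothetical, not taken from the CSV.

```javascript
// Hypothetical column names (avg, HR); the real CSV headers may differ.
function cleanPlayers(rows) {
  return rows.filter((row) => {
    const avg = Number(row.avg);
    const hr = Number(row.HR);
    // Assumed rule: keep a player only with a positive batting average
    // and at least one home run. Empty strings convert to 0 and are dropped.
    return Number.isFinite(avg) && avg > 0 && Number.isFinite(hr) && hr > 0;
  });
}

// Made-up rows in the shape a CSV parser would return (all values as strings).
const sampleRows = [
  { name: "Player A", handedness: "R", avg: "0.252", HR: "20" },
  { name: "Player B", handedness: "L", avg: "0", HR: "0" },
  { name: "Player C", handedness: "B", avg: "0.198", HR: "0" },
];

const cleanedRows = cleanPlayers(sampleRows);
console.log(cleanedRows.map((p) => p.name));
```

Scripting the filter this way would make the step from baseball_data.csv to baseball_data2.csv repeatable, at the cost of a little more setup than a one-off edit in Excel.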

Explanatory Outcomes

While the user can interact with the visualization, there are two key takeaways that we wanted to clearly illustrate through the bar charts:

  • On average, left-handed players have a slightly higher batting average than right-handed and ambidextrous players (0.253 vs. 0.240 and 0.244, respectively).
  • On average, left-handed players hit more HRs than right-handed and ambidextrous players (69.9 vs. 58.8 and 40.3, respectively).
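Group averages like the ones above can be computed with a small helper. This is a minimal sketch, not the project's code: the field names (`handedness`, `HR`), the handedness codes (`L`, `R`, `B`), and the toy rows are assumptions chosen for illustration and do not reproduce the figures quoted above.

```javascript
// Compute the mean of valueKey for each distinct value of groupKey.
function meanByGroup(rows, groupKey, valueKey) {
  const totals = new Map();
  for (const row of rows) {
    const group = row[groupKey];
    const entry = totals.get(group) || { total: 0, count: 0 };
    entry.total += Number(row[valueKey]);
    entry.count += 1;
    totals.set(group, entry);
  }
  const means = {};
  for (const [group, { total, count }] of totals) {
    means[group] = total / count;
  }
  return means;
}

// Toy data with hypothetical field names and handedness codes.
const toyPlayers = [
  { handedness: "L", HR: 30 },
  { handedness: "L", HR: 10 },
  { handedness: "R", HR: 12 },
  { handedness: "R", HR: 8 },
  { handedness: "B", HR: 5 },
];

const hrMeans = meanByGroup(toyPlayers, "handedness", "HR");
console.log(hrMeans); // { L: 20, R: 10, B: 5 }
```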

To help quantify these differences, we used two-sample independent t-tests. At a 90% confidence level, all of these performance differences are statistically significant. More statistical details can be found in the baseball_data.xlsx file.
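For readers who want to see what such a test computes, here is a rough sketch of a two-sample t statistic. It uses Welch's unequal-variance form, which is an assumption: the spreadsheet may have used the pooled-variance form instead. The function names and the sample arrays are hypothetical.

```javascript
function mean(values) {
  return values.reduce((sum, x) => sum + x, 0) / values.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(values) {
  const m = mean(values);
  return values.reduce((sum, x) => sum + (x - m) ** 2, 0) / (values.length - 1);
}

// Welch's t statistic and the Welch-Satterthwaite degrees of freedom.
function welchT(a, b) {
  const va = sampleVariance(a) / a.length;
  const vb = sampleVariance(b) / b.length;
  const t = (mean(a) - mean(b)) / Math.sqrt(va + vb);
  const df = (va + vb) ** 2 / (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return { t, df };
}

// Made-up samples, not values from the dataset.
const groupA = [1, 2, 3, 4, 5];
const groupB = [2, 4, 6, 8, 10];
const welchResult = welchT(groupA, groupB);
console.log(welchResult);
```

At a 90% confidence level, the absolute value of `t` would then be compared against the two-sided critical value of the t distribution for `df` degrees of freedom at a significance level of 0.10.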

Design

Visual Encoding

The most important fields, the two performance indicators (batting average and home runs hit), are encoded by x position and y position, respectively. Since handedness is the variable whose effect we want to show, we encoded it with color for easy comparison. We also made this variable interactive: the user can filter the data by handedness with the buttons on the right side. Clicking one of these buttons highlights the visual elements that pertain to that hand type.
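The encoding described above can be sketched without any library. The hand-rolled `linearScale` below plays the role a D3 linear scale would play in the real page; the domains, pixel ranges, field names, and the color for ambidextrous players are assumptions made for illustration (green for left-handed and red for right-handed follow the feedback notes later in this document).

```javascript
// Map a value from a data domain onto a pixel range, linearly.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical domains and chart size.
const xScale = linearScale([0, 0.4], [0, 800]); // batting average -> x position
const yScale = linearScale([0, 600], [500, 0]); // home runs -> y position (inverted: SVG y grows downward)

// Color per handedness; "B" (ambidextrous) and its color are assumptions.
const colorByHand = { L: "green", R: "red", B: "blue" };

function encodePlayer(player) {
  return {
    cx: xScale(player.avg),
    cy: yScale(player.HR),
    fill: colorByHand[player.handedness],
  };
}

const encoded = encodePlayer({ avg: 0.2, HR: 300, handedness: "L" });
console.log(encoded);
```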

Scatter Plot

We decided to use a large scatter plot to allow the user to see trends in the data and interact with it to find their own data story. The data points on the scatter plot allow for additional user interaction by displaying the player’s information in a statistics-like tooltip upon hovering and clicking.

Bar Charts

After some feedback, we also added a ‘summary statistics’ portion for quick explanatory visualization while retaining the exploratory functionality of the whole dataset. Bar charts were used to allow quick comparison. Handedness is encoded by color, consistent with the scatter plot. No axes were used because the bar widths and the values printed in the bars were sufficient to encode the data, and the colors were sufficient to encode handedness.

Feedback

The visualization was updated based on collected feedback. Version 1.0 represents the visualization before the feedback and Version 2.0 represents the visualization after the feedback.

Please note, only Version 2.0 can be seen in the browser via the link: https://domwon.github.io/Baseball-Data/. Version 1.0 can be viewed by downloading it separately and viewing it locally.

Below is the feedback that was collected:

  • Brother noticed that two circles would show up green (left-handed) even though the red (right-handed) button was selected. Refer to Version1.0_Issue.jpg.
      • To resolve this, we went back to the CSV file and found duplicates in the name field.
      • Initially, we had assumed that the ‘name’ field was unique and used it in the ‘key’ function to bind the filtered data to the data circles. However, upon further analysis of the data in Excel, we found two data points that shared a name with another data point.
      • To rectify this, we added an ‘id’ field with a unique value for each record.
  • Mother mentioned that the selected toggle button for handedness should be emphasized when the data is changed.
      • Therefore, we highlighted the selected button by increasing its opacity and fading the unselected buttons.
  • Father noted that while the visualization was brilliant and clean, it was difficult to determine relationships when the whole dataset was shown.
      • Taking this feedback, we added a ‘summary statistics’ segment to compare player characteristics on performance statistics.
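The duplicate-name problem from the first piece of feedback can be illustrated with a plain-JavaScript analogue of a key function. This is not D3's join algorithm, only a sketch of why a non-unique key makes two records compete for the same slot; the player names and the `indexByKey` helper are hypothetical.

```javascript
// Index records by a key; a later record overwrites an earlier one with the same key.
function indexByKey(records, keyFn) {
  const index = new Map();
  for (const record of records) {
    index.set(keyFn(record), record);
  }
  return index;
}

// Made-up records: two different players share a name.
const roster = [
  { id: 1, name: "Sam Example", handedness: "L" },
  { id: 2, name: "Sam Example", handedness: "R" },
  { id: 3, name: "Lee Sample", handedness: "R" },
];

const byName = indexByKey(roster, (p) => p.name); // name is not unique
const byId = indexByKey(roster, (p) => p.id); // id is unique

console.log(byName.size); // 2: the two "Sam Example" records collide
console.log(byId.size); // 3: every record keeps its own slot
```

With a unique ‘id’ as the key, each circle stays tied to exactly one record, so filtering by handedness cannot leave a circle bound to the wrong player.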
