Developing an R Shiny Application with Statcast Data

Now that Sam's walked you all through some of the basics of coding, let's start to tie this stuff together. In today's post, Sam will show you the basic code needed to bring your team's data to life with an interactive dashboard.



Within the R programming language is a package called ‘shiny’ that allows users to develop interactive web applications directly from the RStudio interface. If you’ve read some of my work on Twitter, LinkedIn, or my Medium blog, you might be familiar with my application, HawkDashboard. This application was built entirely from scratch with the ‘shiny’ library in R, as well as supporting packages that I will briefly touch on later.


Shiny applications can act as a one-stop dashboard for personal or public viewing. An application can vary in complexity; you can build a one-dimensional dashboard to automate plot or table generation, or set out to build a full-scale internal information system. At Iowa, HawkDashboard has proven to be a central location for all the data our staff collects, as well as a home for interactive player profiles, post-action reports, and statistics leaderboards.


In this article, I will take you through the steps of building a simple R Shiny application. As in my previous blog posts on Simple Sabermetrics, we will be using Statcast data from the 2020 MLB season, specifically for the three National League Cy Young finalists: Trevor Bauer, Yu Darvish, and Jacob deGrom. I covered how to filter data frames in this article, but I have provided the 'NL_CY' CSV file on Github (link below). The application we will create uses visualizations from the ‘ggplot2’ library, which I wrote about here.



If you’d like to download the entirety of the R code for this article’s Shiny application, I have provided that here in this Github folder. Finally, you can view this application publicly on my shinyapps account here.


Creating a Script

Before we dive right into the code syntax, we will need to create a script by clicking the dropdown menu seen below and selecting “Shiny Web App”.



This will bring up the dialog box you see below. The “Application name” box is the name that will be given to the folder that houses the “app.R” script you create. You will choose a desired home for that folder by clicking the “Browse…” button. For now, we will only cover single file application types.



Once you have completed these two steps, you should have a new R script that looks like a template for a Shiny web application. Next, let’s check the folder where the script was created. For me, this is where I created the “app.R” script for this application. The 'NL_CY' CSV file accompanies the script, as well as two folders - “rsconnect” and “www”. For now, ignore the former - this is an R generated folder that allows the user to publish applications to the web. The “www” folder was manually created and is where images (PNG or JPG) are stored to display in the application.



Necessary First Steps

Now that you’ve got your Shiny script and folder directory squared away, it’s time to dive right into the code. Before we talk about the two main components of a Shiny script - ui and server - we must first load the required libraries and read in our data.


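library(shiny)
library(dplyr)
library(ggplot2)

NL_CY <- read.csv("NL_CY.csv")
# read.csv() loads game_date as text; convert it so the date inputs and
# date filter compare real dates (assumes the file stores yyyy-mm-dd)
NL_CY$game_date <- as.Date(NL_CY$game_date)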

Write the code that loads your libraries and dataset(s) at the top of your Shiny script, before the ui and server are defined. The ui code in this app refers to NL_CY directly, so the app will not work correctly if the data has not been read in first. Any transformations or additions to your dataset should be written at the top as well. You can consider this section your “global code”.
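As a rough illustration of the kind of transformation that belongs in the global code, here is a minimal sketch using a small made-up data frame (the `toy` object and its values are hypothetical stand-ins, not the NL_CY file). It assumes dates are stored as yyyy-mm-dd text, which is how read.csv() would load them.

```r
# Hypothetical stand-in for a freshly loaded CSV: dates arrive as text.
toy <- data.frame(
  game_date = c("2020-07-24", "2020-09-27", "2020-08-15"),
  stringsAsFactors = FALSE
)

class_before <- class(toy$game_date)   # text at this point

# Global-code transformation: convert once, before ui and server use it.
toy$game_date <- as.Date(toy$game_date)

class_after <- class(toy$game_date)
first_game  <- min(toy$game_date)      # earliest date, now a real Date
last_game   <- max(toy$game_date)      # latest date
```

Doing the conversion once at the top means every input and plot further down works with the same date type.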


Before we continue any further, I should mention that the code for this article and its supporting application is executed differently from the previous two articles I’ve written on R code. Instead of following along and executing code line-by-line, you will need to have the entire code written in one script and click the “Run App” button, as seen below.



Building the User Interface

The user interface, referred to as the “ui”, is the section of your script where you specify the layout of your application. If you are new to coding in Shiny, these functions will probably not be familiar to you. Rather than writing dplyr or ggplot2 code, we are using functions like fluidPage(), sidebarPanel(), and mainPanel() to piece together the look of a web page.


ui <- fluidPage(
  titlePanel("App Title"),  # placeholder; titlePanel() needs a title
  sidebarLayout(
    sidebarPanel(),
    mainPanel()
  )
)

This is close to the bare minimum needed to build a user interface. As you will see in the app.R script in the Github folder, I have the sidebarPanel() code filled in with the following…


sidebarPanel(
  selectInput(
    inputId = "PitcherInput",
    label = "Select Pitcher",
    choices = sort(unique(NL_CY$player_name))),
  dateRangeInput(
    inputId = "DateRangeInput",
    label = "Select Date Range",
    start = min(NL_CY$game_date),
    end = max(NL_CY$game_date)),
  img(src = "ss_logo.png",
      style = "display: block; margin-left: auto; margin-right: auto;",
      height = 150,
      width = 150)
)

I have just introduced selectInput(), dateRangeInput(), and img(). These three functions create a drop-down selection, a date range selection, and an image, respectively. The first two are inputs, and each requires an input ID and a label, as well as arguments that connect the input to the data frame. The pitcher input needs to know the list of pitchers, so those are determined in the “choices” argument. The date range input needs a default start and end date, so those are set with the “start” and “end” arguments. In the end, this is what the sidebar panel looks like.
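To see how the “choices”, “start”, and “end” arguments are derived from a data frame, here is a minimal sketch that builds the two inputs outside of an app. The `toy` data frame and its values are made up for illustration; in the real app these arguments come from NL_CY.

```r
library(shiny)

# Hypothetical stand-in for NL_CY with only the two columns the inputs need.
toy <- data.frame(
  player_name = c("Pitcher B", "Pitcher A", "Pitcher B"),
  game_date   = as.Date(c("2020-08-07", "2020-07-24", "2020-09-20")),
  stringsAsFactors = FALSE
)

# One entry per pitcher, in alphabetical order.
pitcher_choices <- sort(unique(toy$player_name))

# The input functions return HTML tag objects that Shiny places on the page.
pitcher_select <- selectInput(
  inputId = "PitcherInput",
  label   = "Select Pitcher",
  choices = pitcher_choices
)

date_select <- dateRangeInput(
  inputId = "DateRangeInput",
  label   = "Select Date Range",
  start   = min(toy$game_date),   # earliest date in the data
  end     = max(toy$game_date)    # latest date in the data
)
```

Because the defaults are computed from the data, the drop-down and date range adjust on their own if you swap in a different set of pitchers or games.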



Off to the right of the sidebar panel is what is known as the main panel. As you will see in the app.R script in the Github folder, I have the mainPanel() code filled in with the following…


mainPanel(
  tabsetPanel(
    tabPanel("Pitch Usage - Bar Chart", br(),
             plotOutput("barchart")),
    tabPanel("Pitch Velocity - Box Plot", br(),
             plotOutput("boxplot")),
    tabPanel("Pitch Velocity Trend - Line Plot", br(),
             plotOutput("lineplot"))
  )
)

Within the panel I have included a tabsetPanel(), which is a string of tabs that divide output into a series of viewable sections (see below). You can create as many tabPanel() sections as you’d like, but within this application I limited myself to three. A br() denotes a line break, which was included to provide white space between the tab titles and plot outputs, which you will see all together here shortly. Finally, each plotOutput() function displays the plot that the server portion of the script renders under the matching output ID.
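The link between a plotOutput() in the ui and its counterpart in the server is simply a shared ID. Here is a minimal sketch of that pairing; the ID "exampleplot", the tab title, and the placeholder base-R plot are hypothetical and not part of the article's app.

```r
library(shiny)

ui <- fluidPage(
  mainPanel(
    tabsetPanel(
      # The string passed to plotOutput() is the output ID.
      tabPanel("Example Tab", br(), plotOutput("exampleplot"))
    )
  )
)

server <- function(input, output) {
  # The same ID appears after output$, which is how Shiny pairs the two.
  output$exampleplot <- renderPlot({
    plot(1:10, main = "Placeholder plot")
  })
}

# Build the app object without launching it.
app <- shinyApp(ui = ui, server = server)
```

Calling runApp(app) would launch this small app. If the two IDs do not match, the tab simply stays empty, which is worth checking first when a plot fails to appear.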



Writing Server Code

Now that the user interface has been set up, it’s time to piece together the other part of the puzzle in our server code. As I mentioned earlier, the server is where we will write the code for our visualizations, or any outputs for that matter. Below is close to the bare minimum needed to write server code.


server <- function(input, output) {
  output$plot1 <- renderPlot({
    dataFilter <- reactive({})
    ggplot()
  })
}

The ID for this hypothetical output is “plot1”. Inside renderPlot({}) is where code is written for the plot, as well as the reactive({}) function that allows the plot output to update with input from the user interface. As you will see in the app.R script in the Github folder, each output is filled in with something of this nature...


output$barchart <- renderPlot({
  dataFilter <- reactive({
    NL_CY %>%
      filter(player_name == input$PitcherInput,
             between(game_date, input$DateRangeInput[1],
                     input$DateRangeInput[2])) %>%
      group_by(pitch_name) %>%
      summarize('count' = n())
  })
  ggplot(dataFilter(),
         aes(x = reorder(pitch_name, -count),
             y = count,
             fill = pitch_name)) +
    geom_bar(stat = "identity") +
    labs(x = "Pitch Type",
         y = "Count",
         title = "Pitch Usage") +
    theme_bw() +
    theme(legend.position = "none",
          plot.title = element_text(hjust = 0.5,
                                    face = "bold",
                                    size = 16)) +
    theme(axis.title = element_text(size = 14),
          axis.text = element_text(size = 12))
}, width = 850, height = 450)

The ID for this output is “barchart”. I created a reactive data frame called “dataFilter”. Again, a reactive data frame is one that updates with each change to the user inputs. If the data frame were not reactive, it would not change from the initial input selections. In addition to making the data frame reactive, I used some dplyr functions - similar to my data manipulation article - to filter the data to the selected pitcher and date range and count the pitches of each type.
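To get a feel for how a reactive data frame behaves, the filtering logic can be tried outside of an app. The sketch below is an illustration under stated assumptions: the `toy` rows are made up (not real Statcast data), and `pitcher_input` and `daterange_input` are hypothetical reactiveVal() stand-ins for input$PitcherInput and input$DateRangeInput. isolate() is used only to read reactive values from a plain R session.

```r
library(shiny)
library(dplyr)

# Made-up rows for illustration only.
toy <- data.frame(
  player_name = c("Pitcher A", "Pitcher A", "Pitcher A", "Pitcher A", "Pitcher B"),
  game_date   = as.Date(c("2020-08-01", "2020-08-01", "2020-08-07",
                          "2020-09-15", "2020-08-01")),
  pitch_name  = c("Slider", "4-Seam Fastball", "Slider", "Slider", "Curveball"),
  stringsAsFactors = FALSE
)

# Stand-ins for the values a user would pick in the sidebar.
pitcher_input   <- reactiveVal("Pitcher A")
daterange_input <- reactiveVal(as.Date(c("2020-08-01", "2020-08-31")))

# Same shape as the article's dataFilter: filter, then count by pitch type.
dataFilter <- reactive({
  toy %>%
    filter(player_name == pitcher_input(),
           between(game_date, daterange_input()[1], daterange_input()[2])) %>%
    group_by(pitch_name) %>%
    summarize(count = n())
})

# Read the current result (note the parentheses after dataFilter).
first_result <- isolate(dataFilter())

# Simulate the user choosing a different pitcher; the reactive recomputes.
pitcher_input("Pitcher B")
second_result <- isolate(dataFilter())
```

Here `first_result` holds the August pitch counts for the first pitcher, and `second_result` reflects the new selection without any of the filtering code being rerun by hand. A reactive like this could also be defined once at the top of the server function and shared by several outputs, which is a common alternative to defining it inside each renderPlot().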


Next, I take “dataFilter” and place it in the ggplot() code below, but still within the renderPlot({}) curly brackets. It’s important to note that there is a pair of parentheses placed immediately after “dataFilter”. A reactive is a function, so it has to be called to return its current data frame. If you forget to include the parentheses, a common error message you’ll get is “Error: object of type 'closure' is not subsettable.” Finally, between the final closing curly bracket and final closing parenthesis are definitions for the width and height of the plot. If done correctly, the following bar chart will populate in the Shiny app.
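The reason for that error is easier to see with an ordinary function, since a reactive is called the same way. In this minimal sketch, `make_data` is a hypothetical plain function standing in for the reactive, and the values are made up.

```r
# A plain function standing in for a reactive: it returns a data frame
# only when it is called with parentheses.
make_data <- function() {
  data.frame(pitch_name = c("Slider", "Curveball"), count = c(2, 1))
}

# With parentheses: the function runs and the column can be read.
counts <- make_data()$count

# Without parentheses: R tries to subset the function itself and fails.
# tryCatch() captures the error message instead of stopping the script.
error_message <- tryCatch(
  make_data$count,
  error = function(e) conditionMessage(e)
)
```

Printing `error_message` shows the message R produced when the parentheses were left off, which is a useful clue to look for a missing “()” after a reactive.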



Tying It All Together

The very last component of a Shiny app is the shinyApp() function, which combines the ui and server to generate the resulting application.


shinyApp(ui = ui, server = server)

And there you have it. Those are the few main steps to creating a Shiny web application. While there are an endless number of potential additions, this article provides you the tools to create your own version of a dashboard, whether that’s with Statcast data or not. As I always like to do at the end of my Simple Sabermetrics articles, I have provided several links to helpful resources to grow your knowledge from here.


Additional Resources

https://rstudio.github.io/shiny/tutorial/

https://shiny.rstudio.com/tutorial/

https://shiny.rstudio.com/tutorial/written-tutorial/lesson1/

https://www.r-bloggers.com/2019/12/r-shiny-for-beginners-annotated-starter-code/

https://towardsdatascience.com/beginners-guide-to-creating-an-r-shiny-app-1664387d95b3

https://www.edureka.co/blog/r-shiny-tutorial/
