To run this you will need to accomplish the following:

Picking an extremely large dataset can lead us down many paths; we are, after all, in the era of big data. For this article, we have opted to go in a slightly off-kilter direction: sports. Using the hoopR library, we can ingest play-by-play data from all NBA seasons starting from 2002 until the most recent season. In this case, ingesting all of the seasons did not seem necessary, so we opted to simply ingest the latest season and conduct our analysis from there.

To ingest our dataset, we first need to connect to our running GridDB server.

Connecting to GridDB via JDBC

As mentioned before, we will utilize JDBC to connect to our server. Luckily, there is an R package called RJDBC which allows R to connect directly via JDBC. Using this package, we can simply enter our JDBC credentials and create a connection with GridDB. Once that connection is made, we can use the DBI connection to make SQL queries to our GridDB instance.

To make the connection, we must of course import the appropriate library and then enter our credentials, including the GridDB JDBC file.

drv <- JDBC(".sql.Driver",

If all of your details are correct, the conn variable will now be a DBI connection to GridDB. With this done, we can move on to ingesting our dataset.

To accomplish ingesting the play-by-play data, we will look to hoopR's built-in functions, which attempt to do all the work for us. The library's function load_nba_pbp looks to be exactly what is needed: it accepts a DBI connection as one of its parameters and will load a specific year of data into our DB connection. From looking at the source code, we can see that the data is available to us in .rds file format directly from one of hoopR's publicly available GitHub repositories. So to ingest, we will load in the file directly from GitHub and ingest the data, line by line until it is finished.

Using the RJDBC API allows us to simply call dbWriteTable, and it will handle creating our SQL statements for us, including creating the table.

start_quarter_seconds_remaining INTEGER

To show this information we used the GridDB CLI's showcontainer command: showcontainer nba_pbp_2022.

Though there are of course many directions in which we can take our analysis, one of the most visually pleasing datapoints to chart is shot makes and misses. It would be even better if we could somehow plot the results onto a plot which resembled an NBA court for proper context. Luckily for us, we can see some columns which can help us with this endeavor, namely: coordinate_x, coordinate_y, score_value, shooting_play, participants_0_athlete_id, and type_id. Using those columns we can grab the coordinates of a variety of different plays from specific players, from specific games, or from specific teams.
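The shot-chart idea in this article (filter on shooting_play and participants_0_athlete_id, then split makes from misses using score_value) can be sketched in base R. This is a minimal sketch, not the article's actual code: the data frame holds a few hand-made illustrative rows rather than real play-by-play data, the athlete ids are made up, `player_shots` is a hypothetical helper, and it assumes a missed shot is stored with score_value equal to 0. The type_id column is left out because the sketch does not need it.

```r
# Illustrative rows only: a tiny hand-made stand-in for the nba_pbp_2022 container.
# Column names come from the article; every value here is made up.
pbp <- data.frame(
  coordinate_x = c(25, 10, 40, 25, 5),
  coordinate_y = c(5, 20, 22, 0, 12),
  score_value = c(2L, 0L, 3L, 0L, 0L),
  shooting_play = c(TRUE, TRUE, TRUE, FALSE, TRUE),
  participants_0_athlete_id = c(101L, 101L, 101L, 101L, 202L)
)

# Hypothetical helper: keep one player's shot attempts and label makes and misses.
player_shots <- function(plays, athlete_id) {
  keep <- plays$shooting_play & plays$participants_0_athlete_id == athlete_id
  shots <- plays[keep, c("coordinate_x", "coordinate_y", "score_value")]
  # Assumption: a missed shot is recorded with score_value == 0.
  shots$made <- shots$score_value > 0
  shots
}

shots <- player_shots(pbp, 101L)

# Scatter the shots: filled green dots for makes, red crosses for misses.
plot(shots$coordinate_x, shots$coordinate_y,
     pch = ifelse(shots$made, 19, 4),
     col = ifelse(shots$made, "darkgreen", "red"),
     xlab = "coordinate_x", ylab = "coordinate_y",
     main = "Shot makes and misses (illustrative data)")
legend("topright", legend = c("make", "miss"),
       pch = c(19, 4), col = c("darkgreen", "red"))
```

With a live connection, the toy data frame would presumably be replaced by the result of a query against the nba_pbp_2022 container through the DBI connection; drawing the court outline underneath the points is a separate step not shown here.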