Riding a motorcycle can be one of the most thrilling and freeing experiences in the world. The rushing wind connects you with your surroundings, and you feel at one with your bike. You get a sense of adventure and exploration, even on a boring daily commute.
But it does come with its drawbacks. I'm sure you've heard someone say, "Don't ride a motorcycle, it's too dangerous!" Well, that's why I made Why You Should Ride: to explore the real reasons why motorcyclists are dying, and what we can do about it.
My name is Saatvik and I am an aspiring rider. I am a full-time student at Stony Brook University, majoring in Computer Science and Applied Mathematics and Statistics. I have loved motorcycles since I was a child (all because I watched Dhoom, if you know you know). I do not currently have a motorcycle, but that will change soon. I've had my license for over a year, though, so it's just a matter of buying one!
My analysis of motorcycle deaths was done using a variety of technologies.
I used the FARS (Fatality Analysis Reporting System) dataset from the National Highway Traffic Safety Administration (NHTSA). I used the data from 2019, 2020, and 2021, as these were the latest available datasets.
After downloading the CSV files for the FARS dataset, I used Python in a Jupyter notebook environment to extract all the necessary information. I then cleaned the data and prepared it for database insertion.
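To give a feel for what a cleaning step like this can look like, here is a minimal sketch using only Python's standard library. It is not the notebook code from the project. The sample rows are made up, and the column names (`ST_CASE`, `AGE`, `INJ_SEV`) and the "unknown age" codes are assumptions used purely for illustration.

```python
import csv
import io

# Made-up sample standing in for a downloaded CSV file (illustrative only).
SAMPLE_CSV = """ST_CASE,AGE,INJ_SEV
10001,34,4
10002,999,2
10003,,3
10004,27,0
"""

# Assumed sentinel codes meaning "age unknown"; check the dataset's manual.
UNKNOWN_AGE_CODES = {998, 999}


def clean_rows(csv_file):
    """Yield rows with integer fields, skipping missing or unknown ages."""
    for row in csv.DictReader(csv_file):
        age_text = row["AGE"].strip()
        if not age_text:
            continue  # age left blank
        age = int(age_text)
        if age in UNKNOWN_AGE_CODES:
            continue  # age coded as unknown
        yield {
            "case_id": int(row["ST_CASE"]),
            "age": age,
            "injury_severity": int(row["INJ_SEV"]),
        }


cleaned = list(clean_rows(io.StringIO(SAMPLE_CSV)))
print(cleaned)
```

Rows shaped like this (plain dictionaries with consistent types) are straightforward to hand to a database driver's bulk insert afterwards.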
I uploaded the data to a database hosted by PlanetScale, a MySQL-compatible database platform.
Using a combination of SQL and Python, I performed exploratory data analysis (EDA) on the dataset, with the help of various plotting libraries. I also took my findings and cross-referenced them with some studies, which you can check out in the EDA section.
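The SQL-plus-Python pattern can be sketched as follows. This example uses an in-memory SQLite database as a stand-in for the hosted MySQL-compatible database, so it runs without any credentials. The table name, column names, and rows are all hypothetical and made up for illustration; they are not results from the analysis.

```python
import sqlite3

# In-memory SQLite stands in for the hosted database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (case_id INTEGER, crash_year INTEGER, injury_severity TEXT)"
)

# Made-up rows, purely illustrative.
rows = [
    (1, 2019, "Fatal"),
    (2, 2019, "Severe"),
    (3, 2020, "Fatal"),
    (4, 2020, "Fatal"),
    (5, 2021, "Fatal"),
    (6, 2021, "Mild"),
]
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)

# Let SQL do the aggregation, then hand the small result to Python.
query = """
SELECT crash_year, COUNT(*) AS fatalities
FROM person
WHERE injury_severity = 'Fatal'
GROUP BY crash_year
ORDER BY crash_year
"""
fatalities_by_year = conn.execute(query).fetchall()
conn.close()

for year, count in fatalities_by_year:
    print(year, count)
```

Aggregating in SQL keeps the data pulled into Python small, and the resulting list of tuples can be passed directly to a plotting library.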
I used a Random Forest to create a model that predicts crash severity (fatal, severe, mild, possible injury, or no apparent injury). It attained an 81% accuracy rate!
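For readers unfamiliar with the technique, here is a minimal sketch of training and scoring a Random Forest classifier with scikit-learn. It assumes scikit-learn is installed and uses synthetic data with five classes standing in for the five severity levels. The features, hyperparameters, and resulting accuracy are illustrative assumptions and do not reproduce the project's model or its 81% figure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical label names for the five synthetic classes.
SEVERITY_LABELS = ["No apparent injury", "Possible injury", "Mild", "Severe", "Fatal"]

# Synthetic stand-in data: 1000 samples, 10 features, 5 classes.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=6,
    n_classes=len(SEVERITY_LABELS),
    random_state=0,
)

# Hold out 20% of the rows for testing, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit an ensemble of decision trees; settings here are arbitrary choices.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Accuracy is the share of held-out rows whose class is predicted correctly.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy on held-out synthetic data: {accuracy:.2f}")
```

Measuring accuracy on a held-out test set, rather than on the training data, is what makes a figure like this meaningful.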