
Analyzing Fitness Data (Python)

  • Writer: Brandon Hopkins
  • Sep 13, 2022
  • 3 min read

Updated: Jan 20, 2023

Background


The idea behind this project was to dive deeper into fitness data and use it to help accomplish fitness goals. To make a change and increase our fitness activity, it helps to know what our activity level has been in the past. That is what I tried to accomplish here. Let's dive in and take a look at the process!


Data Collection


Unfortunately, I realized that I did not have a great amount of data to analyze! Luckily, I was able to use fitness data from my friend, Avery Smith. He has done a great job tracking his fitness data over the past few years and has set himself a goal of 10,000 steps per day. Avery collected his data by frequently wearing an Apple Watch, then exported it from Apple Health via a third-party app called QS Access. I've included a screenshot of the resulting .csv below. As you can see, the data consists of two date/time columns plus columns for calories burned, distance walked, flights of stairs climbed, and steps taken.



Data Analysis


For this project I wanted to use Python for quick, automated analysis. I used Google Colab, a cloud-based notebook, as my IDE. You can see the notebook at the link here, or you can check out my entire code below!

I started by importing pandas, uploading the .csv file, and assigning the data to a variable. I then used the df.dtypes attribute and the df.describe() function to view the data types and get some quick descriptive statistics.
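As a minimal sketch of that first step, the snippet below loads a tiny made-up sample in place of the real export. The column names (Start, Finish, Calories, Distance, Flights, Steps) and every value are assumptions for illustration only, not the actual headers or numbers in Avery's file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the exported .csv: made-up rows, assumed column names.
csv_text = """Start,Finish,Calories,Distance,Flights,Steps
2022-01-01 00:00,2022-01-02 00:00,610.5,4.2,12,9100
2022-01-02 00:00,2022-01-03 00:00,702.0,5.1,15,11250
2022-01-03 00:00,2022-01-04 00:00,480.3,3.0,6,6400
"""

# With a real file you would pass its path to read_csv instead of a StringIO.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Start", "Finish"])

# dtypes is an attribute (no parentheses); describe() is a method.
print(df.dtypes)
summary = df.describe()
print(summary)
```

Parsing the two date/time columns up front means later filtering or grouping by day does not need a separate conversion step.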


From this alone I could see that Avery is doing a great job! He's averaging 9,420 steps per day, just under his goal, and burning 626 calories a day on average.
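To make the comparison against the goal explicit, here is a small sketch that computes the average, the gap to the goal, and the share of days that hit it. The step counts are made up for illustration, and the variable names are my own rather than anything from the notebook.

```python
import pandas as pd

GOAL = 10_000  # Avery's daily step goal, as stated above

# Made-up daily step counts, for illustration only.
steps = pd.Series([8000, 10500, 9500, 12000, 7000], name="Steps")

mean_steps = steps.mean()                       # average steps per day
gap_to_goal = GOAL - mean_steps                 # extra steps per day needed on average
share_of_days_at_goal = (steps >= GOAL).mean()  # fraction of days meeting the goal

print(mean_steps, gap_to_goal, share_of_days_at_goal)
```

Taking the mean of a Boolean Series works because True counts as 1 and False as 0, so the result is the fraction of days at or above the goal.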


One thing I noticed was that Avery's maximum distance covered in one day was 42 miles, which (according to him) did not seem right. This brought me to the next step of my analysis: using Boolean masks to tease out some of the outlier data.


By using Boolean masks, I was able to identify which days had very high mileage (more than 20 miles travelled). I checked back in with Avery. Because there were only a handful of days with mileage this high, he could remember exactly which days they were, and he concluded that the 42-mile day must have been a time when he forgot to turn off the tracker in the car.
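A Boolean mask for this check might look roughly like the sketch below. The rows and column names are hypothetical; only the 20-mile threshold comes from the analysis above.

```python
import pandas as pd

# Made-up rows with assumed column names, for illustration only.
df = pd.DataFrame({
    "Start": pd.to_datetime(["2022-03-01", "2022-03-02", "2022-03-03", "2022-03-04"]),
    "Distance": [4.8, 42.0, 21.5, 6.1],
    "Steps": [9800, 12000, 30500, 11900],
})

# The comparison yields one True/False value per row: that Series is the mask.
high_mileage = df["Distance"] > 20

# Indexing with the mask keeps only the rows where it is True.
outliers = df[high_mileage]
print(outliers)

# The ~ operator inverts the mask, which gives the remaining days.
typical_days = df[~high_mileage]
```

Whether to drop the flagged days or just review them is a judgment call; here they are only separated out so they can be checked against what Avery remembers.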


At this point, I made several plots to visualize the data. I've included a few of them below.


The two plots above give a visual representation of some of the aggregated data we saw earlier: Avery averages just under 10,000 steps per day and about 626 calories burned per day.


I really like the above plot, as it gives you visual confirmation of something you may already know intuitively: the more steps taken, the more calories burned.
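One way to back up the visual impression with a number is a correlation coefficient. This is a sketch of an extra check rather than something from the original notebook, and the values and column names are made up.

```python
import pandas as pd

# Made-up values with assumed column names, for illustration only.
df = pd.DataFrame({
    "Steps": [4000, 6500, 9000, 11000, 14000],
    "Calories": [350.0, 480.0, 600.0, 690.0, 850.0],
})

# Pearson correlation: values near 1 mean calories rise almost linearly with steps.
r = df["Steps"].corr(df["Calories"])
print(round(r, 3))
```

A value close to 1 would match the upward trend in the scatter plot, while a much lower value would suggest that something other than step count is driving calories burned.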


The above plot shows a trend you would expect, similar to that of the Calories Burned by Steps plot. However, it is interesting to note that there are plenty of data points where the distance walked is lower but the calories burned are higher. This brings up a discussion for a future project: workout intensity is certainly a driver for burning calories. Maybe on these days, Avery travelled a shorter distance but with greater intensity. Tracking heart rate data and performing a similar analysis would be very interesting!
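Those short-but-intense days could be pulled out by combining two Boolean masks, as in the rough sketch below. Using the median as the cutoff is my own assumption, as are the column names and the made-up values.

```python
import pandas as pd

# Made-up values with assumed column names, for illustration only.
df = pd.DataFrame({
    "Distance": [5.0, 2.0, 6.0, 1.5],
    "Calories": [600.0, 700.0, 650.0, 300.0],
})

# Calories per mile as a crude stand-in for intensity (not a real intensity measure).
df["CaloriesPerMile"] = df["Calories"] / df["Distance"]

# Combine masks with &; each condition needs its own parentheses.
below_median_distance = df["Distance"] < df["Distance"].median()
above_median_calories = df["Calories"] > df["Calories"].median()
short_but_intense = df[below_median_distance & above_median_calories]
print(short_but_intense)
```

Heart rate data would be a much better intensity signal than calories per mile, which is why it is suggested above as the next thing to track.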


Conclusion


This data set was fun to analyze and offered a great opportunity to practice Python skills! Based on my analysis, I would recommend that Avery try to push for roughly 580 more steps a day to bump his average up to his goal. Doing this should also raise his average calories burned per day and hopefully help him reach his fitness goals!
