4-Nov-2020: SkateAnalytics

I have somewhat resumed street skating, and I've been riding my cruiser board on off-days to keep moving. I haven't made much progress with the SkateAnalytics project in a while because the air quality hasn't improved a whole lot in my area of California, which has discouraged me from going outside and skating much. (+ I've now held off from skating for so long that I'm decently out of shape and need to get back to performing as well as I did before, which already wasn't great.)

I'd like to make some simple, fun, and impressive progress with this project by making decent graphs of the data and posting them online. This should let me get more comfortable with working on the project while I continue to build out my statistics background so I can do more in-depth analysis of the data and come up with new research ideas.

(I have actually been feeling in over my head with the statistics books that I've been reading: Statistical Rethinking and Regression and Other Stories. I've noticed that I put a lot of pressure on myself to understand the material ASAP so that I feel like less of an impostor in the world of statistics. That pressure further discourages me from sitting down and going through the exercises in the former text, so I'm not absorbing the material well.)

Some good visualization goals would be: (1) making ridgeline plots like Figure 8.17 of this Data Visualization book to display the period-to-period changes in each trick's probability distribution, and (2) making GIFs that display each distribution in turn.

Note that the above speaks mostly about the direct probability distributions and not charts of the raw data or transformations thereof (like aggregating the count of trick attempts or successes over periods of time). I haven't had a lot of interest in that so far, but that class of things might be interesting to look at, too.

In the last chapter that I read of Statistical Rethinking (possibly chapter 4), there was an example involving a sequence of nine actions that could each be one of two types. The author counted the number of times that a single type of action repeated before the other type occurred, and he said that the distribution of these counts is itself an interesting reflection of the probability of one type of action occurring versus the other. I still don't quite understand how to think about this, but the example seems to deal with a rawer picture of the data than the pure probability distributions that underlie it. I suspect that thinking about the pictures given by the raw data can prompt better questions about what's going on behind the scenes. ... After writing this out, I suspect that if/when I come to understand how that example links the raw data back to the underlying probability distributions, I'll have a deeper and more explicit appreciation for the links between the concrete (raw-ish numbers) and abstract (probability) realms.
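
One way to build some intuition for that example might be to simulate it. The sketch below is my own stand-in, not the book's code. It assumes the statistic of interest is the longest unbroken run of one action type within nine actions, and that the actions are independent with a fixed probability p. The function names and the values of p are made up for illustration.

```python
import random
from collections import Counter
from itertools import groupby


def longest_run(sequence, value=True):
    """Length of the longest unbroken run of `value` in `sequence` (0 if absent)."""
    return max(
        (sum(1 for _ in group) for key, group in groupby(sequence) if key == value),
        default=0,
    )


def longest_run_distribution(p, n_actions=9, n_sims=10_000, seed=0):
    """Simulate sequences of independent actions and tabulate the longest run.

    Assumes each action is of the tracked type with fixed probability `p`.
    Returns {run_length: share of simulations} for lengths 0..n_actions.
    """
    rng = random.Random(seed)
    counts = Counter(
        longest_run([rng.random() < p for _ in range(n_actions)])
        for _ in range(n_sims)
    )
    return {length: counts[length] / n_sims for length in range(n_actions + 1)}


# Compare how the raw-ish statistic shifts as the underlying probability changes.
for p in (0.3, 0.7):
    print(p, longest_run_distribution(p))
```

The point of the comparison is that a higher underlying probability pushes the whole distribution of longest runs toward longer lengths, so the run-length picture carries information about p even though it is computed from nothing but the raw sequence.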

To apply this to SkateAnalytics: consider the lengths of sequences of successful trick attempts before a failed trick attempt occurs (and maybe the lengths of sequences of failed trick attempts, too?). Success is one action type, and failure is the other.
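
The run-length idea can be sketched in a few lines of Python. The session data here is invented for illustration (True means the trick was landed), and `run_lengths` is a hypothetical helper, not something that exists in the project yet.

```python
from collections import Counter
from itertools import groupby


def run_lengths(attempts):
    """Return (outcome, length) for each maximal run of identical outcomes."""
    return [(outcome, sum(1 for _ in group)) for outcome, group in groupby(attempts)]


# Hypothetical session, in the order the attempts happened: True = landed.
attempts = [True, True, False, True, True, True, False, False, True]

runs = run_lengths(attempts)

# How often each streak length occurred, split by outcome type.
success_runs = Counter(length for outcome, length in runs if outcome)
failure_runs = Counter(length for outcome, length in runs if not outcome)

print(runs)
print("success streaks:", dict(success_runs))
print("failure streaks:", dict(failure_runs))
```

One caveat with this sketch: the final streak in a session is cut off by the session ending rather than by the other outcome occurring, so it might need to be treated separately from the streaks that ended naturally.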


Some throwaway thoughts: Use existing data to create a general probability distribution from all recorded trick attempts. Show why this isn't that great and how it can be better/different, such as by splitting it into distributions for each individual trick and comparing each with the "general" distribution; splitting it into different time periods to show how much variation is hidden within the "general" distribution; or both.
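
A rough sketch of that comparison, using plain success proportions as a simple stand-in for full probability distributions. The records, the field layout (period, trick, landed), and the `success_rates` helper are all made up for illustration and are not the project's actual data format.

```python
from collections import defaultdict

# Hypothetical records: (period, trick, landed)
attempts = [
    ("2020-09", "ollie", True),
    ("2020-09", "ollie", True),
    ("2020-09", "kickflip", False),
    ("2020-09", "kickflip", False),
    ("2020-10", "ollie", True),
    ("2020-10", "kickflip", True),
    ("2020-10", "kickflip", False),
    ("2020-10", "ollie", False),
]


def success_rates(records, key):
    """Group records by `key(record)` and return the share landed per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, attempts]
    for record in records:
        group = key(record)
        counts[group][0] += record[2]  # True counts as 1, False as 0
        counts[group][1] += 1
    return {group: landed / total for group, (landed, total) in counts.items()}


pooled = success_rates(attempts, key=lambda r: "all")
by_trick = success_rates(attempts, key=lambda r: r[1])
by_period = success_rates(attempts, key=lambda r: r[0])
by_both = success_rates(attempts, key=lambda r: (r[0], r[1]))

print("pooled:   ", pooled)
print("by trick: ", by_trick)
print("by period:", by_period)
print("by both:  ", by_both)
```

In this toy data the pooled rate and both per-period rates come out the same, while the per-trick and per-trick-per-period splits differ a lot, which is the kind of variation the "general" number hides.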