Analysis of one month of subway data – part 2: correlation between the total number of trains and the probability of delays

When I first decided to track trains in the NYC subway system, I was sure I would find a correlation between the number of trains in the system and the percentage of trains running late. I was also sure the relationship would be positive: the more trains in the system, the more traffic jams, and the more delays one should find. Simple, right? Surprisingly, as we will see in the…
