The geography

That was our methodology for computing the coordinate acceleration between any two points in time from accelerometer data. But because we use the accelerometer while riding a bike, that difference in time is also a difference in space. Our final goal is to use this metric to describe the fluidity of any given bike-lane, so we need to take the geography of the bike-lane into account and decide how to relate our measurements to it.

When we use cellphone sensors, what we get at the end is a series of points around a bike-lane. But in open data for cities, the standard is to represent bike-lanes and streets as lines, not polygons. So, instead of a cloud of points with GPS coordinates around a line, we need points on the line. We need to snap the points that our sensors measured to the bike-lane line.
But why do we get a cloud of points around a bike-lane instead of a set of points perfectly aligned with the bike-lane line? Here we run into the problems of measurement error and scale in spatial analysis. First, we have to account for the measurement error of the GPS. Even if we bike in a perfectly straight line, the recorded points won't fall exactly on that line. More importantly, we have to be mindful of the scale problem. The bike-lane is obviously wider than the bike, and within the lane we have to move sideways to avoid potholes. We have already acknowledged this by saying that shocks measured on the X axis account for that sideways movement. Because the real bike-lane has a width, the GPS points can form a cloud that lies entirely inside it and still not sit on the line that represents it. At the same time, some bike-lanes are just a part of the street, and we can be forced to bike outside the bike-lane (for example, to avoid an obstruction).
Shapely supports linear referencing through two methods, project() and interpolate(), which together let us snap a point to the nearest point on a line: project() returns the distance along the line to the location closest to our point, and interpolate() returns the point on the line at that distance. But in order to do so, we need to provide a point and a line. The problem is that for every point with a recorded GPS coordinate, we have several competing lines (i.e. all of the NYC bike-lanes). Evidently, some are closer than others, and one of them is the closest and the true bike-lane we were biking on. So, first we need to choose a candidate line to use as input for the snapping step.
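As a minimal sketch of the snapping step for a single point and a single line: interpolate() takes a distance along the line, so the sketch first obtains that distance with Shapely's project(). The coordinates and the helper name snap_to_line are made up for illustration. Shapely works in a flat Cartesian plane, so this assumes the GPS coordinates have already been projected to planar units such as meters.

```python
from shapely.geometry import LineString, Point


def snap_to_line(point: Point, line: LineString) -> Point:
    """Return the point on `line` closest to `point` (hypothetical helper)."""
    # project() gives the distance along the line to the location nearest to `point`.
    distance_along = line.project(point)
    # interpolate() converts that distance back into a point lying on the line.
    return line.interpolate(distance_along)


# Illustrative data: a straight bike-lane segment and a GPS fix 1.5 units off it.
lane = LineString([(0.0, 0.0), (10.0, 0.0)])
gps_fix = Point(4.0, 1.5)

snapped = snap_to_line(gps_fix, lane)
print(snapped.x, snapped.y)
```

The distance returned by project() is also useful on its own, since it places each measurement at a position along the segment.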
One approach would be to simply choose the closest line. But that means calculating the distance from every point to every line, which is computationally very expensive. Instead, we chose to generate a buffer around each bike-lane segment and then spatially join the points to those buffers. Besides efficiency, this method has another advantage: the join produces a mapping from every point to a bike-lane segment. Since we want to compute a quality measurement for every bike-lane segment from the points we recorded, we need to map those points to the corresponding segment anyway.
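The buffer-and-join idea can be sketched with Shapely alone. The lane ids, coordinates, and BUFFER_WIDTH below are invented for illustration. The plain loop checks each point against each buffer, so it only shows the mapping the join produces, not the efficient version. At scale, a spatial join backed by a spatial index (geopandas.sjoin is one option, named here as an assumption about tooling) avoids the all-pairs comparison.

```python
from shapely.geometry import LineString, Point

# Illustrative bike-lane segments meeting at a corner (planar coordinates assumed).
lanes = {
    "lane_a": LineString([(0, 0), (10, 0)]),
    "lane_b": LineString([(10, 0), (10, 10)]),
}

# Hypothetical tolerance, in the same units as the coordinates.
BUFFER_WIDTH = 2.0

# One polygon per segment: every location within BUFFER_WIDTH of the line.
buffers = {lane_id: line.buffer(BUFFER_WIDTH) for lane_id, line in lanes.items()}


def candidate_lanes(point: Point) -> list:
    """Return the ids of all segments whose buffer contains or touches `point`."""
    return [lane_id for lane_id, poly in buffers.items() if poly.intersects(point)]


points = [Point(4, 1), Point(9.5, 0.5), Point(30, 30)]
matches = [candidate_lanes(p) for p in points]
print(matches)
```

In this toy data the first point has one candidate, the second sits near the corner and falls inside both buffers, and the third is far from every segment and gets no candidate.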
The disadvantage is that a point may be joined to more than one bike-lane segment. For example, at a street corner where two bike-lanes meet, a point "belongs" to two different bike-lanes. What we want is to assign the point to the bike-lane we were actually biking on. So, for points with competing candidate bike-lanes, we choose the bike-lane of the most recent previous point that had only one candidate.
Also, the join may return no bike-lane segment at all for a point that is far away from every bike-lane. This can happen because we were riding on a street with no bike-lane, or because the measurement error was too large at those points.
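The two cases above (several candidates, or none) can be handled in one pass over the points in recording order. This is a sketch under stated assumptions, not the exact procedure used: the function name and the list-of-lists input are hypothetical, and the choice to leave a point unassigned when the previous unambiguous lane is not among its candidates is an added assumption.

```python
from typing import List, Optional


def resolve_candidates(candidates: List[List[str]]) -> List[Optional[str]]:
    """Assign one lane id (or None) to each point, in recording order."""
    resolved: List[Optional[str]] = []
    last_unambiguous: Optional[str] = None
    for options in candidates:
        if len(options) == 1:
            # Exactly one candidate: accept it and remember it for later ties.
            last_unambiguous = options[0]
            resolved.append(options[0])
        elif len(options) > 1 and last_unambiguous in options:
            # Several candidates: reuse the lane of the last unambiguous point.
            resolved.append(last_unambiguous)
        else:
            # No candidate, or no usable history (assumption): leave unassigned.
            resolved.append(None)
    return resolved


ride = [["lane_a"], ["lane_a", "lane_b"], [], ["lane_b"], ["lane_a", "lane_b"]]
print(resolve_candidates(ride))
```

Points left as None can then be excluded before computing per-segment measurements.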