Origin-Destination Matrix and Travel Time Distribution Estimation, Without Survey Data

Bob Butler nuclear engineer, musician (www.pleasantonband.org), former city councilman and Mayor of Pleasanton, California died October fifth https://www.pleasantonweekly.com/news/2021/10/14/what-a-week-remembering-bob-butler-former-pleasanton-mayor-and-councilman. He helped me get traffic counts data from the Pleasanton Traffic Department.

The objective is to estimate origin-destination (O-D) matrix proportions and travel time distributions, without individual data, and use the matrix and distributions to help manage traffic. The traffic model is a network of k-dimension M/G/infinitynodes and edges connecting them. Data are transient arrivals and departure counts, without individual vehicle identification. Within nodesindividualsturn or go straight randomly. Reconstruct route information by product and convolution.

Pleasanton is in the armpit of the intersection of two major freeways: in the southeast corner of the highways 580 and 680 intersection. Freeway drivers cut through Pleasanton to avoid the congested freeway interchange. In year 2000, Pleasanton assigned city cars to follow cars getting off freeways and driving into the city to see where they went and whether they went through. Drivers complained about being followed. Popular sentiment and complaints stopped city cars from tracking people driving through town.

Pleasanton changed traffic signal timing near freeway exits to discourage drivers from driving through town. Stopped cars backed up onto the freeways and caused accidents. I offered to help.

Pleasanton traffic department gave me traffic counts in 15 minute intervals from major intersections around edges of city. The intersection data told how many cars go straight, turn left, or turn right, minute by minute. I produced nonparametric least squares estimates of vehicle O-D matrixes and travel time distributions. [George 2000]

The Pleasanton O-D and travel-time traffic model is a network of M(t)/G/Infinity self-service queuing systems [Keilson and Servi] with non-stationary Poisson arrivals. Analyses of this kind of model have been called “network tomography,” https://en.wikipedia.org/wiki/Network_tomography [Castro et al.]. Leonard Kleinrock, a fellow UC Berkeley grad student, made a career analyzing the packet-switching Internet as networks of queues, https://nationalmedals.org/laureate/leonard-kleinrock/.

Table 1. Typical Pleasanton data, vehicle counts starting at beginning of rush hours. I combined several intersections into N, S, E, and W.

	Input				Output
Time	N	S	E	W	N	S	E	W
Start	15	104	54	16	4	47	46	8
0:15	12	73	37	14	3	60	43	10
0:30	17	74	50	13	3	42	18	9
0:45	8	56	41	11	3	31	29	10
1:00	9	36	34	13	2	21	27	6
1:15	5	39	22	6	0	21	28	6
1:30	8	31	23	1	2	27	21	5

Table 2. Simple O-D matrix. Body is proportion of inputs “From…” that go out “To…” “From 0” and “To 0” are those that originate in Pleasanton and those whose destination is within Pleasanton.

O from\D to->	From 0	From N	From S	From E	From W
To 0	0.0000	0.8640	0.0000	0.0000	0.7801
To N	0.2136	0.0000	0.0135	0.0000	0.0721
To S	0.1801	0.0285	0.0000	1.0000	0.0000
To E	0.1755	0.0177	0.2679	0.0000	0.1479
To W	0.4308	0.0899	0.7186	0.0000	0.0000
Sum	1	1	1	1	1

Table 3. Travel-time distribution estimates. Data counts were in 15-minute intervals, so could not quantify discrete probability density more finely.

Minutes	Pleasanton origin	Through Pleasanton
0-15	1	0.979148
15-30	0	0
30-45	0	0.020852

The Pleasanton traffic manager didn’t understand O-D matrixes, probability distributions, or their use for managing traffic. In year 2009, Pleasanton stationed cheap labor at major intersections around edges of city to record license numbers of cars entering or leaving Pleasanton. That gave some sample data to estimate travel times through Pleasanton by license number. Some tollway operators are doing that using license-plate readers, to cite speeders or suspend their toll passes.

Sample or population?

It’s human nature to doubt statistically significant conclusions based on a sample that is a small fraction of the population.

Air travel surveys ~10%, on the day surveyors choose. Commuter surveys have non-response, bias. People asked, “Why are they surveying me?” I want Origin-Destination matrix population proportions and travel time distributions, without sample uncertainty.

The M/G/infinity self-service system service-time distribution G(t) can be estimated, without customer identification! Imagine the supermarket analog of estimating the probability distribution of time in store. Observe supermarket arrivals and departures, starting from the time the doors open but without identifying individual shoppers. The reliability analog is to regard products or parts as entering self-service when sold or installed, departing when they fail (complaints, failures, return, repair, replacements, spares sales, etc.): data required by GAAP! The traffic analog is to regard traffic as going into and coming out of a network of M(t)/G/infinity self-service systems plus some originating or terminating in Pleasanton. That uses population data!

Assume Poisson input at rate λ, service time distribution is G(t), and self-service. Output is Poisson at rate λG(t). Arrival and departure times or counts, without identification, are statistically sufficient to make nonparametric estimates of service-time distribution G(t). Same is true for non-stationary Poisson input M(t)/G/infinity systems. [https://accendoreliability.com/estimation-of-a-hidden-service-time-distribution-of-an-mt-g-%e2%88%9e-self-service-system/#more-450442]

Tandem M/G/infinity Self-Service Systems

Imagine two M/G/infinity self-service systems with the output of one being the input to the second. Input Poisson at rate λG1(t) to a second M(t)/G2/infinity self-service system. Output is Poisson at rate

∫λG1(u)G2(t-u)dG1(u), (INTEGRATE[λG1(u)G2(t-u)dG1(u)]),

where the integral is from 0 to t.

Estimate distributions G1(t) and G2(t) from input, intermediate, and output observation counts. Refer to diffusion approximation to tandem or parallel queues, [George, 1973], for generalizations or approximations to non-Poisson inputs.

Simple Traffic M/G²/∞²/p Model

Imagine two crossing M/G/infinity self-service systems, one North-South and the other East-West. Customers from N-S may switch to E-W and vice-versa with probability p (Figure 1). Assume p and travel-time distribution G(t) are the same regardless of directions. Assuming symmetry gives you twice as much data to estimate p and G(t)

Figure 1. Simple intersection modeled as two queuing systems

Observe 2 inputs and 2 outputs. Estimate average arrival rates λ1 and λ2 from input counts. Estimate p and G(t) from output counts t=1,2,….

r1(t) = ((1-p)λ1 +pλ2)G(t)

r2(t) = ((1-p)λ2 +pλ1)G(t)

M/G²/∞²/p with one observation at time t

Suppose you observe output counts r1(t) and r2(t) at time t, then

p = (λ₂*r1(t) – λ1*r2(t)) / ((-λ1 + λ2)*(r1(t) + r2(t))) and

G(t) = (r1(t) + r2(t))/(λ1 + λ2).

Solve the two linear equations in two unknowns obtained by equating observed and expected output rates. Estimate lambdas from input rates:

r1(t) = 4, r2(t) = 3, λ1 = 10, λ2 = 20 gives

p = 5/7 and G(t) = 7/30.

Solutions for several observations

Use least squares estimation when more than one observation gives multiple solutions. If underdetermined, minimize sums of squared errors, SUM[(observed-expected)²]. For example:

r1(1) = 3, r2(1)= 4, r1(2) = 6, r2(2)= 8, l1 = 10, l2 = 20 gives

p = 0.285714, G(1) = 0.233333, G(2) = 0.466667.

With multiple observations and different turn rates and travel times, p1, p2, G11(t), G12 (t), G21(t), G22(t) depend on whether you want answers at time t or for all (i,j) pairs. If the system of equations is overdetermined, take more observations. Solutions must be constrained to have 0<p<1 and G(t) increasing in t. I used Excel Solver

M(t)/G^k/∞^k network with switching

Imagine a network of intersection nodes and connecting routes. The whole thing could be modeled as a (filtration of) Markov process using states as node arrival and departure counts at arrival and departure times. Observe node inputs and outputs; inputs can have nonstationary rates. Assume stationary switching and travel time distributions. Route frequency is product of switch probabilities (assuming independence). Travel time is convolution of intra-node travel time distributions.

Summary

Estimation of O-D matrix proportions and travel time distributions doesn’t require customer or vehicle identification. Population data beats sample data. Least squares method works for transient processes: e.g. commutes, BART, other daily cycles and traffic light cycles.

References

Sharminda Bera and K. V. Krishna Rao , “Estimation of origin-destination matrix from traffic counts: the state of the art,” European Transport no. 49 (2011): 3-23

George, L. L., “Origin-Destination and Travel-Time Distribution, Without Surveys,” INFORMS, Salt Lake City, May 2000

George, L. L., “Diffusion Approximation for Two Channel, Poisson-Exponential Service Systems with Dependence,” Ph. D. thesis, University of California, Berkeley; Advisor, Richard E. Barlow, 1973

J. Keilson and L. D. Servi, “A Distributional Form of Little’s Law”, 1965, also Operations Research Letters, Volume 7, Issue 5, 1988, Pages 223-227, ISSN 0167-6377, https://doi.org/10.1016/0167-6377(88)90035-1.

Rui Castro, Mark Coates, Gang Liang, Robert Nowak, and Bin Yu, “Network Tomography: Recent Developments,” 2003

Sample or population?

Tandem M/G/infinity Self-Service Systems

Simple Traffic M/G2/∞2/p Model

Solutions for several observations

M(t)/Gk/∞k network with switching

Summary

References

About Larry George

Leave a Reply Cancel reply

Simple Traffic M/G²/∞²/p Model

M(t)/G^k/∞^k network with switching