Chapter 10

Aggregation Method

Using a standard GIS technique, data is collected at various geographies from various sources and aggregated to the 1/2 or 1/4 mile buffer around each transit node (station area). Depending on the type of variable examined, the “Proportional Sum,” “Proportional Average,” or “Proportional Weighted Average” is used. The concepts are simple and are best shown by example.

As a simple case, consider a census tract that has 1000 people in it. If the 1/4 mile buffer is completely contained within the tract's boundary and covers 25% of the tract’s area, the “Proportional Sum” of population in the 1/4 mile buffer is 250 people, or 25% of the total. As a slightly more complicated example, consider a 1/2 mile buffer that intersects two census block groups, each having 1000 people and each with 25% of its area inside the buffer. The proportional sum for this buffer is 500, or 2 × 0.25 × 1000. Similarly, the average and weighted average are calculated against the intersection of the base geography and the transit buffer.
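The two worked examples above can be reproduced with a short sketch. The function names and input format here are illustrative assumptions, not part of the TOD Database itself, and the weighting scheme shown for the weighted average is one plausible reading of “calculated against the intersection,” since the text does not spell out the formula.

```python
# Illustrative sketch only; names and input format are assumptions.

def proportional_sum(pieces):
    """Sum each base-geography value scaled by its share of area inside the buffer.

    pieces: iterable of (value, overlap_fraction) pairs, where overlap_fraction
    is intersection area / total area of the base geography (0.0 to 1.0).
    """
    return sum(value * fraction for value, fraction in pieces)


def proportional_weighted_average(pieces):
    """Average a value across geographies, weighting by the apportioned weight.

    pieces: iterable of (value, weight, overlap_fraction) triples, where weight
    is, for example, the number of households the value describes.
    This weighting scheme is an assumption, not taken from the source.
    """
    pieces = list(pieces)
    total_weight = sum(weight * fraction for _, weight, fraction in pieces)
    if total_weight == 0:
        return None  # no apportioned weight falls inside the buffer
    weighted = sum(value * weight * fraction for value, weight, fraction in pieces)
    return weighted / total_weight


# First example from the text: one tract, 1000 people, 25% inside the buffer.
single_tract = proportional_sum([(1000, 0.25)])

# Second example: two block groups of 1000 people, each 25% inside the buffer.
two_block_groups = proportional_sum([(1000, 0.25), (1000, 0.25)])

# Hypothetical incomes weighted by household counts (made-up numbers).
income = proportional_weighted_average([(40000, 400, 0.25), (60000, 400, 0.75)])
```

Here `single_tract` and `two_block_groups` correspond to the 250 and 500 person results described in the text. In practice the overlap fractions would come from a GIS intersection of the buffer with each base geography.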

For the purpose of the TOD database the finest base geography possible is used. Therefore, where the data is collected at the census block level, the data is aggregated to the transit buffer using the census block, even if it was also available from the US Census at the block group and tract level. The following table shows these base geographies for the various datasets.

Table 1: Base Geographies for Datasets

| Dataset | Description | Base Geography | Exceptions |
|---|---|---|---|
| American Community Survey 2005 - 2009 | 5 Year Estimates | Census Block Groups (2009) | Census Tracts (select variables) |
| Census 2000 Summary File 1 | Consists of counts and information on age, sex, race and ethnicity, household relationship and housing tenure. The SF1 data set is synonymous with the Census short form. | Census Blocks (2000) | Census Tracts (HCT and PCT variables) |
| Census 2000 Summary File 3 | Consists of information on place of birth, education, employment status, income, as well as detailed information on housing units such as value of housing, mortgage and rental payments, and year the structure was built. SF3 data is the result of the Census long form, which was last used in 2000. | Census Block Groups (2000) | Census Tracts (HCT and PCT variables) |
| Census 2010 Summary File 1 | Consists of counts and information on age, sex, race and ethnicity, household relationship and housing tenure. The SF1 data set is synonymous with the Census short form. | Census Blocks (2010) | Census Tracts (HCT and PCT variables) |
| Census Transportation Planning Package (CTPP) 2000 Part 1 (Place of Residence) | CTPP is a special data product synthesized from the Decennial Census long form specifically for transportation planners. The TOD Database uses CTPP data products that were released in 2000. Part 1 provides information on where workers live, as well as worker and household characteristics. | Census Tracts (2000) | |
| CTPP 2000 Part 2 (Place of Work) | Part 2 provides information on where workers work, as well as worker characteristics. | Census Tracts (2000) | |
| CTPP 2000 Part 3 (Origin/Destination) | Part 3 provides journey-to-work flow data. These data include an origin Census Tract, a destination Census Tract, and a count of workers (all workers and several cohorts of workers) that commute between those tracts. These data are not included in their origin/destination structure, but median distances are derived for all Census Tracts. | Census Tracts (2000) | |
| Housing + Transportation Affordability Index | The Housing + Transportation Affordability Index (H+T® Index) is a proprietary model developed by the Center for Neighborhood Technology that models household travel demand. The H+T Index estimates household transportation expenditures at the block group level based on neighborhood variables and household variables for 337 (2000) and 887 (2009) metropolitan statistical areas. | Census Block Groups (2000 and 2009) | |
| Local Employment Dynamics (LED) Origin/Destination 2002 - 2009 | Origin/Destination provides journey-to-work flow data. As with CTPP Part 3 data, these data are not included in their origin/destination structure, but median distances are derived for all Census Blocks. | Census Blocks (2009) | |
| Local Employment Dynamics (LED) Residential Area Characteristics 2002 - 2009 | Residential Area Characteristics provides information on where workers live, as well as worker characteristics. | Census Blocks (2009) | |
| Local Employment Dynamics (LED) Work Area Characteristics 2002 - 2008 | Workplace Area Characteristics provides information on where workers work, as well as worker characteristics. | Census Blocks (2009) | |

This map shows the 1/2 mile buffers relative to some census block groups in a fairly urban location, with the blocks outlined in white.


Figure 1: Generic map of census geography over 1/2 mile buffers.

Although keeping the aggregation to the smallest possible geography is the best way to sum and average the underlying data, there can be times when the results are confusing. When comparing similar statistics from a different dataset that aggregates from a different base geography, the results are not necessarily comparable.

Census blocks can be very small and are not based on population, but rather on geography; they are simply areas enclosed by a set of intersecting lines from the Census TIGER/Line data. Block groups are built from these blocks, combining enough of them that each block group has approximately the same number of people (although there are exceptions; for a fuller explanation of Census geography see, for instance, http://factfinder2.census.gov/help/en/glossary/c/census_tract.htm). Because blocks can be very small relative to the transit buffers, while block groups and tracts are not, especially in lower density places, the proportional aggregates computed from these different base geographies will differ.

As an example, consider a station such as the Rosemont stop near Chicago, which is in the middle of an expressway, with many hotels and little housing adjacent to it. It is, however, near housing, and its buffer overlaps census block groups that have housing, and thus population, in them. For this example, a report of the SF1 population (aggregated from blocks) and the SF3 population (aggregated from block groups) gives very different numbers:


Figure 2: TOD Database report on two measures of population for a station in an inhomogeneous location.

However, one can see from the map that the buffer mostly contains highway right of way, a large forest preserve and a group of hotels (this is the last stop before one gets to O’Hare Airport). In contrast, the California stop, in a more urban neighborhood, shows that the two aggregates give similar, more comparable numbers:


Figure 3: TOD Database report on two measures of population for a station in a more homogeneous location.
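The contrast between the two stations can be illustrated with a toy calculation. The sketch below is a hypothetical example with invented numbers (not actual Rosemont or California data): it apportions population once from individual blocks and once from the block group that those blocks form, assuming in both cases that population is spread evenly over the area of the base geography.

```python
# Hypothetical data: each block is (population, total area, area inside buffer).
# All numbers are invented for illustration; areas are in arbitrary units.
uneven_blocks = [
    (0,    50.0, 50.0),  # highway right of way, fully inside the buffer
    (0,    30.0, 30.0),  # hotel district, fully inside the buffer
    (1200, 20.0,  2.0),  # residential block, only 10% inside the buffer
]

even_blocks = [
    (400, 10.0, 5.0),    # three similar residential blocks,
    (400, 10.0, 5.0),    # each half inside the buffer
    (400, 10.0, 5.0),
]


def from_blocks(blocks):
    """Proportional sum using each block as the base geography."""
    return sum(pop * inside / area for pop, area, inside in blocks)


def from_block_group(blocks):
    """Proportional sum using the enclosing block group as the base geography."""
    total_pop = sum(pop for pop, _, _ in blocks)
    total_area = sum(area for _, area, _ in blocks)
    inside_area = sum(inside for _, _, inside in blocks)
    return total_pop * inside_area / total_area


uneven_by_block = from_blocks(uneven_blocks)
uneven_by_group = from_block_group(uneven_blocks)
even_by_block = from_blocks(even_blocks)
even_by_group = from_block_group(even_blocks)
```

With the uneven blocks, the block-based estimate is far smaller than the block-group-based one, because the block group spreads the residential population across the highway and hotel areas that fall inside the buffer. With the evenly populated blocks, the two estimates agree.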

It is important to realize that this is how the aggregation is accomplished in the TOD database, and it is incumbent on the user to make sure that the comparisons are appropriate.