Characteristics Of Regression Analysis

Multiple-regression techniques are commonly used to transfer flood characteristics from gaged to ungaged watersheds. In the arid or semiarid areas of the United States, the standard errors of these regression relations often are quite large. One way to reduce the standard error is to identify basin characteristics that are significant for predicting T-year flood discharges such as 50- or 100-year floods. Recent investigations have identified new characteristics that appear to be promising. Examples of these are main channel sinuosity, hydraulic radius, bank-full channel conveyance, basin shape, time-to-peak of the flood hydrograph, effective drainage area, and percent of the basin in a given hydrologic soil group. The appropriateness of their use and the application of these basin characteristics are discussed. In addition, a few new basin characteristics are suggested that have not yet been investigated. Examples include channel infiltration losses, ratio of main channel width to flood-plain width, stream-network magnitude, channel storage indices, and drainage density.
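Regional flood regressions of this kind are commonly fitted in logarithmic space, so that the T-year discharge becomes a power function of the basin characteristics. The form below is a generic illustration and not necessarily the equation used in the paper. All symbols are assumptions introduced here: $Q_T$ is the T-year peak discharge, $A$ the drainage area, $X_2, \dots, X_k$ further basin characteristics, $b_0, \dots, b_k$ the fitted coefficients, and $\varepsilon$ the residual.

```latex
\log_{10} Q_T = b_0 + b_1 \log_{10} A + \sum_{j=2}^{k} b_j \log_{10} X_j + \varepsilon
\qquad\Longleftrightarrow\qquad
Q_T = 10^{b_0} \, A^{b_1} \prod_{j=2}^{k} X_j^{b_j} \, 10^{\varepsilon}
```

In this form, the standard error of the regression is the standard deviation of $\varepsilon$ in log units. Adding a basin characteristic helps only if it reduces that residual scatter.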

Availability:
- Find a library where document is available. Order URL: http://worldcat.org/isbn/0309047625
Supplemental Notes:
- This paper appears in Transportation Research Record No. 1201, Arid Lands: Hydrology, Scour, and Water Quality.
Authors:
- Thomas Jr, W O
Publication Date: 1988

Media Info

Characteristics Of Multiple Regression Analysis

This approach is similar to using regression without a propensity score to adjust for differences in baseline characteristics. Instead of entering multiple predictors, however, all baseline characteristics are combined into a single index, the propensity score, which makes the model assumptions simpler to check. Applications of regression analysis exist in almost every field. In economics, for example, the dependent variable might be a family's consumption expenditure, and the independent variables might be the family's income, the number of children in the family, and other factors that affect the family's consumption patterns.

Features: Figures; References; Tables;
Pagination: p. 37-42
Monograph Title: Arid lands: hydrology, scour and water quality
Serial:
- Issue Number: 1201
- Publisher: Transportation Research Board
- ISSN: 0361-1981

Subject/Index Terms

TRT Terms: Arid land; Floods; Multiple regression analysis; Reduction (Chemistry); Regression analysis; Standard error; Watersheds
Uncontrolled Terms: Flood peaks
Old TRIS Terms: Arid region; Basin characteristics; Discharge; Multiple regression; Reduction; Semiarid land
Subject Areas: Data and Information Technology; Highways; Hydraulics and Hydrology; I26: Water Run-off - Freeze-thaw;

Filing Info

Accession Number: 00489703
Record Type: Publication
ISBN: 0309047625
Files: TRIS, TRB, ATRI
Created Date: Nov 30 1989 12:00AM

Investigation into bus ridership changes using regression analysis.

Like many US transit agencies, the MBTA has seen a slight overall ridership decline in the past couple of years. As discussed in multiple presentations to the FMCB, we are monitoring these changes and analyzing ridership data to better understand the reasons for the decline. We noticed that not all services, days, or routes dropped at the same rate; some services have mostly steady or even increasing ridership.

One of our analyses focused on explaining the variance in the change among bus routes. This post describes the regression model we created to try to tease out some of the correlations between certain characteristics of bus routes and their gains or losses in ridership using the change in ridership between Fiscal Year 2016 and Fiscal Year 2017.

Exploratory Analysis

The first thing that stood out to us in examining the data was the difference between types of day. Ridership on buses fell the most (on a percentage basis) on weekends and was closer to steady on weekdays.

The scatter plot below shows the distribution of ridership change among both key and local buses. There does not appear to be a pattern based on ridership (high ridership routes are just as distributed along the ridership change axis as lower-ridership routes) nor based on key bus classification (some key buses lost ridership while others gained it, as did local buses).

Mapping the routes by percentage change, we did notice some spatial clustering. In particular, routes in Roxbury, Dorchester, and Mattapan lost ridership. To investigate these patterns further, we will create a spatial dataset in future research.

Our research question for this analysis is: are there either service quality or rider characteristics of MBTA buses that help explain how ridership on each route changed from FY16 to FY17?

Data

We selected bus routes with reliable automated fare collection (AFC) data that had at least 1,000 average weekday riders in FY2017, resulting in 92 routes. To precisely identify the route being operated, these data were crosswalked with vehicle location data. This process excluded routes like the SL1 to the airport, as many passengers board there without interacting with fareboxes. The mean ridership on these routes was 2,900 on an average weekday in FY2017, with a maximum of 10,500 average weekday riders on the #66. Of these routes, 86 had reliable Saturday ridership data and 77 had reliable Sunday ridership data.

The table below summarizes the ridership changes on the included bus routes. Weekday ridership changes vary more among bus routes, while weekend days show both more consistent and proportionally larger ridership losses.

Measure	Number of Routes	Minimum	Maximum	Mean
Avg. Weekday Ridership Change FY17 over FY16	92	-14%	+11%	-3%
Avg. Saturday Ridership Change FY17 over FY16	86	-22%	+1%	-9%
Avg. Sunday Ridership Change FY17 over FY16	77	-23%	+7%	-7%
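As a minimal sketch of how a "FY17 over FY16" figure can be computed, the helper below assumes the change is expressed relative to the FY16 baseline. The function name and the ridership values are hypothetical and are not taken from the MBTA data.

```python
def percent_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    if before == 0:
        raise ValueError("baseline ridership must be non-zero")
    return (after - before) / before * 100.0


# Hypothetical route: 3,000 average weekday riders in FY16 and 2,910 in FY17.
change = percent_change(3000, 2910)
print(f"{change:+.1f}%")  # prints "-3.0%"
```

Computing the change per route first, and only then summarizing, is what makes the minimum, maximum, and mean columns comparable across routes of very different sizes.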

Service quality was measured by the metrics set in the Service Delivery Policy, specifically by each route’s cost effectiveness rank and its crowding, reliability, span of service, and frequency metrics.

Ridership and route characteristics were measured by each route’s proportion of riders paying a reduced fare (senior/student/TAP), from AFC data; its proportion of journeys involving a transfer to or from another MBTA service, calculated from our ODX model; and its proportion of minority riders and proportion of weekday trips to or from work, both collected by a System-wide Passenger Survey. An indicator of whether the route was a key bus was also included.

Not all the information was available for all routes, so the final analyses only included 81 routes, which limits the number of variables we can include in the analyses and the ability of analyses to find significant effects. Only weekday ridership change could be analyzed, since even fewer routes had all the information for Saturdays and Sundays.
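Dropping routes with missing measures is often called a complete-case filter. The sketch below shows one way to do it, assuming missing values are stored as `None`. The field names and records are made up for illustration.

```python
# Hypothetical route records; None marks a measure that is unavailable.
routes = [
    {"route": "A", "reliability": 0.71, "reduced_fare_share": 0.22, "change": -0.03},
    {"route": "B", "reliability": None, "reduced_fare_share": 0.31, "change": -0.05},
    {"route": "C", "reliability": 0.64, "reduced_fare_share": None, "change": 0.01},
    {"route": "D", "reliability": 0.80, "reduced_fare_share": 0.18, "change": 0.02},
]


def complete_cases(records, required):
    """Keep only the records that have a value for every required field."""
    return [r for r in records if all(r.get(name) is not None for name in required)]


usable = complete_cases(routes, ["reliability", "reduced_fare_share", "change"])
print([r["route"] for r in usable])  # prints ['A', 'D']
```

Every variable added to the `required` list can only shrink the usable set, which is why a small sample limits how many predictors a model can carry.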

Service Quality Regression Model

A model estimating the Average Weekday Ridership Change with only service quality measures is not very predictive. It only explains about 7% of the variation among bus routes in ridership change.

The only significant predictor is reliability, in the expected direction (higher reliability corresponds with an increase in percent ridership change). The size of the effect is fairly minor: a 10% increase in reliability corresponds with a 1.5% increase in ridership change. The scatter plot below shows the relationship between the reliability of a route and its ridership change between FY16 and FY17.
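To make "share of variation explained" and "effect size" concrete, here is a simplified single-predictor least-squares fit in plain Python. It is a sketch only: the model described above uses several service quality measures at once, and the data below are invented rather than drawn from the MBTA routes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y that the fitted line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Invented values: reliability (%) and ridership change (%) for six routes.
reliability = [55.0, 60.0, 65.0, 70.0, 75.0, 80.0]
ridership_change = [-6.0, -2.0, -5.0, -1.0, -4.0, 0.0]

slope, intercept, r_squared = fit_line(reliability, ridership_change)
print(f"R-squared: {r_squared:.2f}")
print(f"Change associated with +10 points of reliability: {slope * 10:+.2f}")
```

Read this way, a slope of 0.15 corresponds to the effect reported above: ten more points of reliability go with 1.5 more points of ridership change.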

Route and Ridership Characteristics Regression Model

This model is more predictive, explaining 16% of the variance between routes in ridership change. The only significant variable is percent of riders paying a reduced fare, in the opposite direction (higher proportion of riders paying a reduced fare corresponds with decreases in ridership change). The effect size is moderate: a 10% increase in percent of riders paying a reduced fare corresponds with a 2.3% decrease in ridership change.

Percent minority was excluded from this model after analysis because it was highly correlated with the included variables. Excluding it did not reduce the explanatory power of the model.
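One common way to screen for this kind of overlap between predictors is a pairwise correlation, optionally converted to a variance inflation factor. The post does not say which diagnostic was used, so the sketch below is an assumption, and its data are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_from_r(r):
    """Variance inflation factor for a predictor explained by one other predictor."""
    return 1.0 / (1.0 - r ** 2)


# Invented shares (%) for six routes.
percent_minority = [20.0, 35.0, 50.0, 60.0, 75.0, 90.0]
percent_reduced_fare = [12.0, 18.0, 22.0, 30.0, 33.0, 41.0]

r = pearson_r(percent_minority, percent_reduced_fare)
print(f"r = {r:.2f}, VIF = {vif_from_r(r):.1f}")
```

When two predictors move together this closely, the model cannot separate their effects, so dropping one costs little explanatory power.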

The scatter plot below shows the relationship between proportion of a route’s riders paying reduced fare in FY2016 and the ridership change between FY16 and FY17.

Combined Model

The combined model (with variables from both service quality and rider characteristics) explains 21% of the variance between routes in ridership change and maintains the two independently-predictive aspects: reliability and percentage of riders paying reduced fare. Percentage of riders paying reduced fare has a somewhat bigger effect than reliability on predicting the route’s ridership change.
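Predictors measured on different scales can be compared through standardized coefficients. The post does not state how its two effects were compared, so treat the sketch below as one possible approach. It uses the closed-form solution for a two-predictor regression, with invented data.

```python
import math


def _corr(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def standardized_betas(x1, x2, y):
    """Standardized coefficients and R-squared for y regressed on x1 and x2."""
    r_y1, r_y2, r_12 = _corr(x1, y), _corr(x2, y), _corr(x1, x2)
    denom = 1.0 - r_12 ** 2  # zero only if x1 and x2 are perfectly collinear
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2, beta1 * r_y1 + beta2 * r_y2


# Invented values for six routes (all in percent).
reliability = [55.0, 60.0, 65.0, 70.0, 75.0, 80.0]
reduced_fare = [30.0, 22.0, 35.0, 18.0, 25.0, 15.0]
ridership_change = [-6.0, -2.0, -7.0, -1.0, -4.0, 0.0]

b_rel, b_fare, r2 = standardized_betas(reliability, reduced_fare, ridership_change)
print(f"reliability: {b_rel:+.2f}, reduced fare: {b_fare:+.2f}, R-squared: {r2:.2f}")
```

Each coefficient is the change in the outcome, in standard deviations, per one standard deviation of the predictor. The predictor with the larger absolute value has the bigger effect on that scale.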

The significance of the proportion of reduced fares does not necessarily mean that reduced-fare trips specifically were lost. Likely, the proportion of reduced fares reflects some other aspect of bus service (perhaps the spatial distribution of bus routes or the number of discretionary trips) that better explains the route’s ridership change.

Conclusion

This investigation only explained a small portion of the variance in ridership change between bus routes. We think that this can be improved by measuring the service quality and rider characteristics over a longer period of time and with more nuanced measures. In addition, we think there are likely to be variables that are still missing entirely from this analysis, in particular spatial variables like land use and demographic shifts, along with trip-level variables that will contribute to explaining the variance in ridership beyond the service quality and rider characteristics.

This analysis is part of a larger effort the MBTA is undertaking to explain ridership changes. This larger effort will consider elements that have been found to have predictive effects in other regions including: fare and pass multiple pricing; land use and demographic changes; shifts in high-transit ridership populations (immigrants, zero-vehicle households, etc.) in the area; recent changes in visitor patterns to the Boston region; ride-hailing (Uber, Lyft) usage and other mode shifts; service quality and capacity; and changes in types and lengths of trips, among other variables. We will post results to sections of this research project as we finish them.
