Projecting Student Enrollment

The Pennsylvania State System of Higher Education (PASSHE) asked Civilytics to build an enrollment forecasting tool for the fourteen universities in the system.

Enrollment Projection and Planning System (EPPS)

  • Produces highly accurate projections of fall freshman enrollment for every system campus.
  • Uses PASSHE’s strength, data collection across campuses, to train and update a flexible statistical algorithm (a generalized additive model) that adjusts to the unique conditions at each campus without overfitting.
  • Produces predictions that are useful for different planning scenarios, providing administrators with access to a “conservative” prediction representing the minimum size of the fall class, a “likely” prediction representing the most likely fall class size, and a “ceiling” prediction representing the highest likely size of the fall class.
  • Runs on free software, is simple to maintain, requires little data, and is integrated into the existing AOD report.
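
One way to picture the three planning numbers is as summaries of a set of simulated fall class sizes. The sketch below is illustrative only: the function name, the 10th and 90th percentile cutoffs, and the simulated draws are all assumptions made for the example, not EPPS settings or data.

```python
import random
import statistics

def scenario_predictions(simulated_sizes, lower_pct=10, upper_pct=90):
    """Summarize simulated fall class sizes as three planning numbers.

    lower_pct and upper_pct are illustrative assumptions, not EPPS settings.
    """
    # With n=100, quantiles() returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(simulated_sizes, n=100, method="inclusive")
    return {
        "conservative": round(cuts[lower_pct - 1]),
        "likely": round(statistics.median(simulated_sizes)),
        "ceiling": round(cuts[upper_pct - 1]),
    }

# Hypothetical simulated outcomes for one campus (made-up numbers).
rng = random.Random(42)
draws = [rng.gauss(1200, 60) for _ in range(5000)]
scenarios = scenario_predictions(draws)
print(scenarios)
```

Widening or narrowing the percentile cutoffs trades off how cautious the “conservative” and “ceiling” figures are.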

EPPS outperforms what any individual campus could achieve by forecasting on its own. Its statistical methodology learns from application data trends across the system and borrows information across campuses to improve the predictions for each one. The EPPS projections are also flexible enough to serve different planning purposes. Knowing both the floor and the ceiling of new freshman enrollment lets planners gauge both the potential revenue from the incoming cohort and the capacity needed to serve it.
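
The idea of borrowing information across campuses can be illustrated with a simple shrinkage estimate. This is a deliberately simplified stand-in for what a multilevel model does, not the EPPS method: the function name, the `prior_strength` value, and all counts below are hypothetical.

```python
def pooled_yield(campus_enrolled, campus_apps, system_rate, prior_strength=200):
    """Shrink a campus's observed yield rate toward the system-wide rate.

    prior_strength is a hypothetical tuning value: the number of
    "pseudo-applications" of weight given to the system-wide rate.
    """
    # Campuses with more applications get a weight closer to 1,
    # so their own data dominates the estimate.
    weight = campus_apps / (campus_apps + prior_strength)
    campus_rate = campus_enrolled / campus_apps
    return weight * campus_rate + (1 - weight) * system_rate

# Made-up counts: (enrolled, applications) per campus.
campuses = {"small": (30, 100), "large": (1500, 5000)}
system_rate = 0.25
pooled = {
    name: pooled_yield(enrolled, apps, system_rate)
    for name, (enrolled, apps) in campuses.items()
}
print(pooled)
```

Both campuses have the same raw yield rate here, but the small campus is pulled further toward the system-wide rate because its own data carries less evidence.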

EPPS builds on campuses’ existing familiarity with PASSHE reporting systems by delivering predictions in the same place where campuses already submit their data. This reduces the need for EPPS-specific training.

How does EPPS work?

The core of EPPS is a statistical framework known as generalized additive models (GAMs). GAMs are like standard linear regression models, except that the relationship between each predictor and the outcome can be modeled by a flexible smooth function instead of a single linear coefficient. In EPPS, the GAMs are extended to generalized additive mixed models (GAMMs), which add the flexibility of modeling campus-specific differences within a multilevel regression framework.
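
The step from linear regression to GAM to GAMM can be written out in standard textbook notation. The symbols here are generic and are not taken from the EPPS specification: y_i is the outcome for observation i, x_{ij} its j-th predictor, g a link function, f_j a smooth function, and c(i) the campus that observation i belongs to. The GAMM is shown with a campus-level random intercept, the simplest multilevel form; the random-effects structure EPPS actually uses is not stated in this article.

```latex
\begin{align*}
\text{Linear model:} \quad
  & \mathbb{E}[y_i] = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} \\
\text{GAM:} \quad
  & g\big(\mathbb{E}[y_i]\big) = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) \\
\text{GAMM:} \quad
  & g\big(\mathbb{E}[y_i \mid b]\big) = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + b_{c(i)},
  \qquad b_c \sim \mathcal{N}(0, \sigma_b^2)
\end{align*}
```

Each smooth f_j replaces a single slope \beta_j, and the campus term b_{c(i)} lets every campus sit above or below the system-wide pattern.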

To accommodate the wide range of sizes of PASSHE campuses, EPPS blends multiple models and uses simulation and cross-validation to determine which prediction is best for each campus at each time period.
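
Choosing a model per campus by validation error can be sketched as follows. This is a minimal illustration under stated assumptions, not the EPPS procedure: the two toy candidate models, the rolling-origin (time-ordered) validation scheme, the mean absolute error metric, and the enrollment histories are all invented for the example.

```python
from statistics import mean

# Two toy candidate "models": each maps a training history to one prediction.
def carry_forward(history):
    return history[-1]

def historical_mean(history):
    return mean(history)

CANDIDATES = {"carry_forward": carry_forward, "historical_mean": historical_mean}

def rolling_cv_error(model, series, min_train=2):
    """Mean absolute error from predicting each year using only earlier years."""
    errors = [
        abs(model(series[:t]) - series[t])
        for t in range(min_train, len(series))
    ]
    return mean(errors)

def best_model_per_campus(enrollment_by_campus):
    """Return the candidate with the lowest validation error for each campus."""
    return {
        campus: min(
            CANDIDATES,
            key=lambda name: rolling_cv_error(CANDIDATES[name], series),
        )
        for campus, series in enrollment_by_campus.items()
    }

# Made-up histories: one trending campus, one noisy but stable campus.
histories = {
    "trending": [1000, 1050, 1100, 1150, 1200],
    "stable": [400, 440, 390, 430, 400],
}
choices = best_model_per_campus(histories)
print(choices)
```

In this toy data the trending campus is best served by carrying the latest value forward, while the stable campus is best served by its historical average, which shows why a single model need not win everywhere.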

Using the GAMM framework to model the relationship between predictors and final fall freshman enrollment yields a model that balances two risks: overfitting the pattern at any one campus in any one year, and imposing an inflexible linear pattern when the relationship across campuses is more complex.

Learn More

To learn how you can use GAMMs or other models to improve your strategic planning through enrollment forecasts, please get in touch.