Population-Household Submodule of SMARTPLANS

The population submodule of SMARTPLANS predicts the population (or number of households) in each zone of the study area by simulating two processes: the mobility of people and residential location choice behavior. The predictions can be made with either of two alternative approaches:

  1. MNL-Based Location Choice Model
  2. Constrained Logistic-Regression Model

1. MNL-Based Location Choice Model

Under this approach, the submodule starts with the zonal population KiH(t) of population segment H in the base year t and predicts the number of people who will stay in zone i by time t + 1. People can either stay in the zone or leave it. This population mobility process is modeled with a logistic regression, which predicts the probability PSiH that population segment H stays in zone i through time t + 1 as follows:
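Assuming the standard binary logit form, with an intercept \beta_0 and a covariate index k that are notational assumptions rather than symbols taken from the text, the stay probability can be written as:

```latex
P^{S}_{iH} = \frac{\exp\left(\beta_0 + \sum_{k} \beta_k X_{ik}\right)}
                  {1 + \exp\left(\beta_0 + \sum_{k} \beta_k X_{ik}\right)}
```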

where the betas are the estimated parameters and the X’s are the significant covariates characterizing zone i. The total number of people from segment H who stay in zone i by time t + 1 is then calculated as follows:
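Writing the number of stayers as KSiH(t+1), the symbol used for this quantity in the second approach, the natural reading of this step is the base-year population scaled by the stay probability. This is a reconstruction from the surrounding definitions:

```latex
KS_{iH}(t+1) = K_{iH}(t) \, P^{S}_{iH}
```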

Let KH(t+1) be the exogenously provided region-wide population of segment H at time t + 1 (either determined by the Rogers Demographic Model or obtained from publicly available official forecasts). The total region-wide number of people seeking a zone to move into at time t + 1 can be calculated as follows:
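Consistent with how KMH(t+1) is described for the second approach (the exogenous regional total less the sum over zones of those who did not move), this quantity can be sketched as:

```latex
KM_{H}(t+1) = K_{H}(t+1) - \sum_{i} KS_{iH}(t+1)
```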

Given the calculated KMH(t+1) value, the number of people in segment H who will move to zone i at time t + 1 can then be predicted as follows:
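One form consistent with the definitions that follow is given here. It assumes that regional movers are split across classes by PC|H and across zones by PMiH|C, and it uses KMiH(t+1) as an assumed symbol for the zonal movers:

```latex
KM_{iH}(t+1) = KM_{H}(t+1) \sum_{C} P_{C|H} \, PM_{iH|C}
```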

where PC|H is an exogenous value representing the share of class C population within segment H, and PMiH|C is a multinomial logit probability calculated using the MNL-based Location Choice Model. The latter probability is formulated as follows:
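In the standard multinomial logit form, where j runs over all candidate destination zones (the index j is a notational assumption), this probability reads:

```latex
PM_{iH|C} = \frac{\exp\left(V_{iH|C}\right)}{\sum_{j} \exp\left(V_{jH|C}\right)}
```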

where ViH|C is a linear-in-parameters systematic utility function of the characteristics of class C population in segment H and the attributes of destination zone i.

The total zonal population pertaining to segment H in zone i at time t + 1 is calculated as follows:
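Reading the zonal total as stayers plus movers, the steps of this approach can be sketched end to end for a single segment H. The function and argument names below are hypothetical, the intercept `beta0` is an assumption, and the stayer and mover formulas are reconstructions from the definitions in this section rather than the documented SMARTPLANS implementation.

```python
import math


def logistic(z):
    """Binary logit probability for a linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))


def mnl_probabilities(utilities):
    """Multinomial logit shares over zones; shifted by the max for stability."""
    m = max(utilities)
    exps = [math.exp(v - m) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def update_population_mnl(K_t, X, beta0, beta, K_total_next, class_share, V):
    """Hypothetical sketch of the MNL-based approach for one segment H.

    K_t          -- list of K_iH(t), one entry per zone
    X            -- list of covariate lists, one per zone
    beta0, beta  -- assumed intercept and coefficients of the stay model
    K_total_next -- exogenous regional total K_H(t+1)
    class_share  -- dict class -> P_{C|H}, assumed to sum to 1
    V            -- dict class -> list of utilities V_{iH|C} per zone
    """
    # Probability of staying in each zone (binary logit).
    p_stay = [logistic(beta0 + sum(b * x for b, x in zip(beta, x_i))) for x_i in X]
    # Stayers: base-year population times stay probability (assumed form).
    stayers = [k * p for k, p in zip(K_t, p_stay)]
    # Regional movers: exogenous total minus all stayers.
    movers_total = K_total_next - sum(stayers)
    # Destination choice probabilities per class.
    p_move = {c: mnl_probabilities(V[c]) for c in class_share}
    # Allocate movers to zones, weighting classes by their shares.
    movers = [
        movers_total * sum(class_share[c] * p_move[c][i] for c in class_share)
        for i in range(len(K_t))
    ]
    # Zonal total at t + 1: stayers plus movers.
    return [s + m for s, m in zip(stayers, movers)]
```

Because the class shares and the logit probabilities each sum to one, the zonal totals returned by this sketch add up to the exogenous regional total K_H(t+1).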

KMH(t+1), on the other hand, is a region-wide value that represents the population of segment H seeking a location to move into. This value is calculated as follows:

2. Constrained Logistic-Regression Model

Under the constrained logistic regression modeling approach, the user needs to provide the following inputs:

  • HSi(t+1): average number of people per household in zone i
  • ADiH(t): number of vacant dwellings available at time t that could be occupied at time t + 1
  • NDiH(t+1): the number of newly constructed dwellings in zone i at time t + 1
  • KiH(t): number of households of type H in zone i at time t

The model starts by calculating the number of dwellings that become vacant at time t + 1 for household type H. As in the previous approach, a dwelling in zone i can either remain occupied or become vacant at time t + 1 as a result of the mobility process. Therefore, the number of dwellings that become vacant in zone i at time t + 1 for population segment H can be calculated as follows:
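Reading this step as the base-year households that do not stay, and assuming one household per dwelling, one consistent form is shown below. VNiH(t+1) is a hypothetical symbol for the newly vacated dwellings, since the text does not name this quantity:

```latex
VN_{iH}(t+1) = K_{iH}(t) \left(1 - P^{S}_{iH}\right)
```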

Here, PSiH is the probability of remaining in the dwelling in zone i. This probability is based on a logistic regression model of the choice between staying and moving out. Next, the total number of vacant dwellings VDiH(t+1) that can be occupied in zone i at time t + 1 is calculated as follows:
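Assuming the vacant stock is the sum of the dwellings carried over from time t, the newly constructed dwellings, and the dwellings vacated by households that move out (one household per dwelling), a plausible reconstruction is:

```latex
VD_{iH}(t+1) = AD_{iH}(t) + ND_{iH}(t+1) + K_{iH}(t) \left(1 - P^{S}_{iH}\right)
```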

Let PVOiH|C be the probability that the available vacant dwellings VDiH(t+1) in zone i will be occupied by population class C of segment H (i.e., by households moving in) at time t + 1. Then:

where PVOiH|C is a logistic regression probability that is formulated as follows:
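Assuming the standard binary logit form, with an intercept \beta_0 and a covariate index k that are notational assumptions rather than symbols from the text, this probability can be written as:

```latex
PVO_{iH|C} = \frac{\exp\left(\beta_0 + \sum_{k} \beta_k X_{k,iH|C}\right)}
                  {1 + \exp\left(\beta_0 + \sum_{k} \beta_k X_{k,iH|C}\right)}
```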

where the betas are the estimated parameters and the X’s are the significant covariates describing the socio-economic characteristics of the population segment that moved into the dwellings available in destination zone i, as well as the location attributes of zone i. If PC|H is the exogenous proportion of class C within household type H, then:

The region-wide total number of households of type H that moved into new locations in the study area at time t + 1 can be calculated as follows:
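Following the verbal definition of its two terms, the regional total of moving households can be written as:

```latex
KM_{H}(t+1) = K_{H}(t+1) - \sum_{i} KS_{iH}(t+1)
```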

where KH(t+1) is the total number of households that exist in the study area at time t + 1, and the summation over i of KSiH(t+1) is the total number of households that did not change dwelling by t + 1. The estimated total number of households of type H that moved into zone i is further adjusted using KMH(t+1) to ensure consistency in the predictions. This is done as follows:
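One common way to impose this kind of consistency is proportional scaling, shown here as an assumption rather than as the documented formula. The unconstrained zonal estimates, written with the hypothetical symbol \widetilde{KM}_{iH}(t+1), are rescaled so that they sum to KMH(t+1):

```latex
KM_{iH}(t+1) = KM_{H}(t+1) \,
               \frac{\widetilde{KM}_{iH}(t+1)}{\sum_{j} \widetilde{KM}_{jH}(t+1)}
```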

Based on the above calculations, the total number of households in zone i at time t + 1 for population segment H is calculated as follows:

Households are then translated into total population in zone i using the exogenous average household size (i.e., HSi(t+1)), that is:

Finally, the number of vacant dwellings available in zone i at time t + 1, which is needed for the subsequent simulation period, is calculated as follows:
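The steps of the constrained approach can be sketched as a single update for one household type H. Everything below is a hedged reconstruction: the function and argument names are hypothetical, the stay and occupancy probabilities are taken as precomputed inputs, one household per dwelling is assumed, and the proportional scaling and the remaining-vacancy formula (vacant stock minus households that moved in) are assumptions consistent with the text, not the documented SMARTPLANS formulas.

```python
def update_households_constrained(K_t, p_stay, AD_t, ND_next, K_total_next,
                                  class_share, p_occupy, HS_next):
    """Hypothetical sketch of the constrained logistic-regression approach.

    K_t          -- list of K_iH(t), households per zone at time t
    p_stay       -- list of stay probabilities PS_iH per zone
    AD_t         -- list of AD_iH(t), vacant dwellings carried over
    ND_next      -- list of ND_iH(t+1), newly constructed dwellings
    K_total_next -- exogenous regional household total K_H(t+1)
    class_share  -- dict class -> P_{C|H}
    p_occupy     -- dict class -> list of PVO_{iH|C} per zone
    HS_next      -- list of HS_i(t+1), average household size per zone
    """
    n = len(K_t)
    # Households that keep their dwelling, and dwellings they vacate.
    stayers = [K_t[i] * p_stay[i] for i in range(n)]
    vacated = [K_t[i] - stayers[i] for i in range(n)]
    # Vacant stock: carried over + newly built + newly vacated (assumed sum).
    vacant = [AD_t[i] + ND_next[i] + vacated[i] for i in range(n)]
    # Unconstrained movers: vacant stock times class-weighted occupancy.
    raw = [
        vacant[i] * sum(class_share[c] * p_occupy[c][i] for c in class_share)
        for i in range(n)
    ]
    # Regional movers implied by the exogenous control total.
    movers_total = K_total_next - sum(stayers)
    # Proportional scaling so zonal movers sum to the regional total (assumed).
    raw_total = sum(raw)
    movers = [movers_total * r / raw_total for r in raw]
    # Households, population, and vacancies carried to the next period.
    households = [stayers[i] + movers[i] for i in range(n)]
    population = [HS_next[i] * households[i] for i in range(n)]
    # Note: not clamped at zero; scaled movers could exceed the vacant stock.
    vacant_next = [vacant[i] - movers[i] for i in range(n)]
    return households, population, vacant_next
```

The sketch handles a single household type, so the population it returns is the contribution of type H only; a zonal total would sum households over all types before applying the household size.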