# Population-Household Submodule of SMARTPLANS

The population submodule of SMARTPLANS predicts the population (or number of households) in each zone of the study area by simulating two processes: the mobility of people and their residential location choice. Predictions can follow either of two alternative approaches:

- MNL-Based Location Choice Model
- Constrained Logistic-Regression Model

### 1. MNL-Based Location Choice Model

Under this approach, the submodule starts with the zonal population K_{iH}(t) of population segment H in the base year t and predicts the number of people who will stay in zone i by time t + 1. People can either stay in the zone or leave it. This mobility process is modeled with logistic regression: the model predicts the probability PS_{iH} that population segment H stays in zone i through time t + 1 as follows:
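A standard binary logit form consistent with this description is given below. It is a reconstruction, not the source's own equation: the intercept \beta_0 and the covariate index k are notational assumptions.

$$
PS_{iH} = \frac{\exp\left(\beta_0 + \sum_{k} \beta_k X_{ik}\right)}{1 + \exp\left(\beta_0 + \sum_{k} \beta_k X_{ik}\right)}
$$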

where the betas are the estimated parameters and the X’s are the significant covariates characterizing zone i. The total number of people from segment H who stay in zone i by time t + 1 is then calculated as follows:
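Writing KS_{iH}(t+1) for the stayers (the symbol used for this quantity later in this document), the calculation would take the form:

$$
KS_{iH}(t+1) = PS_{iH} \cdot K_{iH}(t)
$$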

Let K_{H}(t+1) be the exogenously provided region-wide population of segment H at time t + 1 (either determined by the Rogers Demographic Model or obtained from publicly available official forecasts). The total region-wide number of people seeking a zone to move into at time t + 1 can be calculated as follows:
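Since everyone in the regional total who is not a stayer must be looking for a zone, the calculation is presumably the residual:

$$
KM_{H}(t+1) = K_{H}(t+1) - \sum_{i} KS_{iH}(t+1)
$$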

Given the calculated KM_{H}(t+1) value, the number of people pertaining to segment H who will move to zone i at time t + 1 can then be predicted as follows:
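One allocation consistent with the definitions that follow is sketched here. It is a reconstruction under the assumption that movers are first split by class share and then distributed over zones by the choice probability.

$$
KM_{iH}(t+1) = KM_{H}(t+1) \sum_{C} P_{C|H} \, PM_{iH|C}
$$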

where P_{C|H} is an exogenous share of class C population within segment H, and PM_{iH|C} is a multinomial logit probability from the MNL-based location choice model. The latter probability is formulated as follows:
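The standard multinomial logit form, with the index j running over all candidate destination zones (an assumed notation), would be:

$$
PM_{iH|C} = \frac{\exp\left(V_{iH|C}\right)}{\sum_{j} \exp\left(V_{jH|C}\right)}
$$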

where V_{iH|C} is a linear-in-parameters systematic utility function of the characteristics of class C population in segment H and the attributes of destination zone i.

The total population of segment H in zone i at time t + 1 is calculated as follows:
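Given the two components described above (stayers and in-movers), the total is their sum:

$$
K_{iH}(t+1) = KS_{iH}(t+1) + KM_{iH}(t+1)
$$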

Recall that KM_{H}(t+1) is the region-wide value representing the population of segment H seeking a location to move into. This value is calculated as follows:
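Consistent with the earlier definition, this is the exogenous regional total K_{H}(t+1) minus the stayers KS_{iH}(t+1) summed over all zones. The following Python sketch chains the steps of this approach for a single segment H. It is a minimal illustration under stated assumptions, not the SMARTPLANS implementation: the function names, array shapes, coefficient values, and input numbers are all hypothetical, and the allocation of movers by class share times MNL probability is one reading of the text.

```python
import numpy as np

def logistic(z):
    """Binary logit probability 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def mnl_probabilities(V):
    """Multinomial logit over zones; V has shape (n_classes, n_zones)."""
    expV = np.exp(V - V.max(axis=1, keepdims=True))  # shift for numerical stability
    return expV / expV.sum(axis=1, keepdims=True)

def mnl_population_update(K_t, X, beta, K_region_next, class_share, V):
    """One simulation step for a single segment H (hypothetical helper).

    K_t           : (n_zones,) zonal population at time t
    X             : (n_zones, n_cov) zonal covariates, first column = intercept
    beta          : (n_cov,) stay-model coefficients
    K_region_next : scalar, exogenous regional total at t + 1
    class_share   : (n_classes,) shares P_{C|H}, assumed to sum to 1
    V             : (n_classes, n_zones) systematic utilities V_{iH|C}
    """
    PS = logistic(X @ beta)               # probability of staying, per zone
    KS = PS * K_t                         # stayers per zone
    KM_region = K_region_next - KS.sum()  # region-wide movers seeking a zone
    PM = mnl_probabilities(V)             # location choice probabilities
    KM = KM_region * (class_share @ PM)   # in-movers per zone
    return KS, KM, KS + KM

# Made-up inputs for three zones and two classes.
K_t = np.array([100.0, 200.0, 300.0])
X = np.array([[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]])
beta = np.array([1.0, 0.5])
class_share = np.array([0.4, 0.6])
V = np.array([[0.1, 0.3, 0.2], [0.5, 0.0, 0.4]])

KS, KM, K_next = mnl_population_update(K_t, X, beta, 650.0, class_share, V)
print(K_next.round(1), K_next.sum())
```

Because the class shares and the MNL probabilities each sum to one, the zonal totals in this sketch add up to the exogenous regional total by construction.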

### 2. Constrained Logistic-Regression Model

Under the constrained logistic regression modeling approach, the user needs to provide the following inputs:

- HS_{i}(t+1): average number of people per household in zone i
- AD_{iH}(t): number of vacant dwellings available from time t that could be occupied at time t + 1
- ND_{iH}(t+1): number of newly constructed dwellings in zone i at time t + 1
- K_{iH}(t): number of households of type H in zone i at time t

The model starts by calculating the number of dwellings that become vacant at time t + 1 for household type H. As in the previous approach, dwellings in zone i either remain occupied or become vacant at time t + 1 as a result of the mobility process. The number of dwellings that become vacant in zone i at time t + 1 for population segment H can therefore be calculated as follows:
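Assuming one household per dwelling, the vacated dwellings are the complement of the stayers. The symbol VAC_{iH}(t+1) is introduced here for illustration and does not come from the source.

$$
VAC_{iH}(t+1) = \left(1 - PS_{iH}\right) \cdot K_{iH}(t)
$$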

PS_{iH} is the probability of remaining in the dwelling in zone i. This probability is based on a logistic regression model which predicts the probability of staying or moving out. Next, the total number of vacant dwellings VD_{iH}(t+1) that can be occupied in zone i at time t + 1 is calculated as follows:
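A plausible reconstruction, assuming the vacant stock is the sum of the three sources named in this section (carried-over vacancies, new construction, and dwellings vacated by out-movers), is:

$$
VD_{iH}(t+1) = AD_{iH}(t) + ND_{iH}(t+1) + \left(1 - PS_{iH}\right) \cdot K_{iH}(t)
$$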

Let PVO_{iH|C} be the probability that the available vacant dwellings VD_{iH}(t+1) in zone i will be occupied by (i.e., moved into by) population class C of segment H at time t + 1; equivalently, the expected share of those dwellings occupied by that class. Then:
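The expected number of vacant dwellings occupied by class C would be the product of that probability and the vacant stock. The symbol OD_{iH|C}(t+1) is a hypothetical name used only for this sketch.

$$
OD_{iH|C}(t+1) = PVO_{iH|C} \cdot VD_{iH}(t+1)
$$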

where PVO_{iH|C} is a logistic regression probability that is formulated as follows:
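A standard binary logit form consistent with this description is shown below. It is a reconstruction: the intercept \beta_0 and the covariate indexing are notational assumptions.

$$
PVO_{iH|C} = \frac{\exp\left(\beta_0 + \sum_{k} \beta_k X_{iHC,k}\right)}{1 + \exp\left(\beta_0 + \sum_{k} \beta_k X_{iHC,k}\right)}
$$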

where the betas are the estimated parameters and the X’s are the significant covariates describing the socio-economic characteristics of the population segment that moved into the dwellings available in destination zone i, as well as the location attributes of zone i. If P_{C|H} is the exogenous proportion of class C within household type H, then:
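Weighting the class-specific occupancy probabilities by the class shares gives an unadjusted estimate of the households of type H moving into zone i. The hat notation is an assumption introduced here to distinguish this estimate from the adjusted value KM_{iH}(t+1).

$$
\widehat{KM}_{iH}(t+1) = VD_{iH}(t+1) \sum_{C} P_{C|H} \, PVO_{iH|C}
$$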

The region-wide total number of households of type H that moved into new locations in the study area at time t + 1 can be calculated as follows:
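Based on the definitions given in the next paragraph, the calculation is the regional total minus the households that stayed, with KS_{iH}(t+1) = PS_{iH} \cdot K_{iH}(t):

$$
KM_{H}(t+1) = K_{H}(t+1) - \sum_{i} KS_{iH}(t+1)
$$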

where K_{H}(t+1) is the total number of households in the study area at time t + 1, and the sum of KS_{iH}(t+1) over i is the total number of households that did not change dwelling by t + 1. The estimated number of households of type H that moved into zone i is then adjusted using KM_{H}(t+1) to ensure consistency in the predictions. This is done as follows:
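One common way to enforce such consistency is proportional scaling, so that the zonal in-movers sum to the regional control total. This is an assumed form, not confirmed by the source; \widehat{KM}_{iH}(t+1) denotes the unadjusted estimate of households moving into zone i (a symbol introduced here for illustration).

$$
KM_{iH}(t+1) = KM_{H}(t+1) \cdot \frac{\widehat{KM}_{iH}(t+1)}{\sum_{j} \widehat{KM}_{jH}(t+1)}
$$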

Based on the above calculations, the total number of households in zone i at time t + 1 for population segment H is calculated as follows:
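With stayers and adjusted in-movers both available, the total is their sum:

$$
K_{iH}(t+1) = KS_{iH}(t+1) + KM_{iH}(t+1)
$$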

Households are then translated into total population in zone i using the exogenous average household size (i.e., HS_{i}(t+1)), that is:
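Assuming the household size applies uniformly to all household types in the zone, the conversion would be as follows. The symbol POP_{i}(t+1) is a hypothetical name for the zonal population.

$$
POP_{i}(t+1) = HS_{i}(t+1) \sum_{H} K_{iH}(t+1)
$$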

Finally, the number of vacant dwellings available in zone i at time t + 1, which is needed for the subsequent simulation period, is calculated as follows:
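A natural reading is that the remaining vacancies equal the vacant stock VD_{iH}(t+1) minus the dwellings taken by in-movers KM_{iH}(t+1). The Python sketch below strings the steps of this approach together for one household type H. It is a minimal illustration under stated assumptions, not the SMARTPLANS implementation: the function name, the proportional scaling step, the vacancy formula, and all input numbers are hypothetical.

```python
import numpy as np

def constrained_update(K_t, PS, AD_t, ND_next, PVO, class_share,
                       K_region_next, HS_next):
    """One simulation step for a single household type H (hypothetical helper).

    K_t, PS, AD_t, ND_next, HS_next : (n_zones,) arrays
    PVO                             : (n_classes, n_zones) occupancy probabilities
    class_share                     : (n_classes,) shares P_{C|H}
    K_region_next                   : scalar, exogenous regional households at t + 1
    """
    KS = PS * K_t                            # households that stay
    vacated = (1.0 - PS) * K_t               # dwellings vacated by out-movers
    VD = AD_t + ND_next + vacated            # assumed vacant stock at t + 1
    KM_raw = (class_share @ PVO) * VD        # unadjusted in-movers per zone
    KM_region = K_region_next - KS.sum()     # regional control total of movers
    KM = KM_region * KM_raw / KM_raw.sum()   # assumed proportional scaling
    K_next = KS + KM                         # households per zone at t + 1
    population = HS_next * K_next            # people, for this household type only
    AD_next = VD - KM                        # assumed vacancies carried forward
    return K_next, population, AD_next

# Made-up inputs for two zones and two classes.
K_next, population, AD_next = constrained_update(
    K_t=np.array([100.0, 200.0]),
    PS=np.array([0.8, 0.9]),
    AD_t=np.array([10.0, 20.0]),
    ND_next=np.array([5.0, 5.0]),
    PVO=np.array([[0.5, 0.4], [0.3, 0.6]]),
    class_share=np.array([0.5, 0.5]),
    K_region_next=270.0,
    HS_next=np.array([2.5, 3.0]),
)
print(K_next.round(2), AD_next.round(2))
```

The sketch does not guard against a zero sum of unadjusted in-movers or against scaled in-movers exceeding the vacant stock in a zone; a production model would need an explicit rule for both cases.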