Synthetic population generation without a sample

                by J. Barthelemy and Ph. L. Toint

              Report NAXYS-12-2010  23 December 2010

Abstract.
The advent of microsimulation in the transportation sector has created the
need for extensive disaggregate data concerning the population whose behaviour
is modelled. Due to the cost of collecting this data and the existing privacy
regulations, this need is often met by the creation of a synthetic population
on the basis of aggregate data.  While several techniques for generating such
a population are known, they suffer from a number of limitations.  The first
is the need for a sample of the population for which fully disaggregated data
must be collected, although such samples may not exist or may not be
financially feasible.  The second is the assumption that the aggregate
data used must be consistent, a situation which is most unusual because this
data often comes from different sources and is collected, possibly at
different moments, using different protocols.
 
The paper presents a new synthetic population generator in the class of the
Synthetic Reconstruction methods, whose objective is to obviate these
limitations. It proceeds in three main successive steps: generation of
individuals, generation of the joint distributions of household types, and generation
of households proper.  The main idea in these generation steps is to use data
at the most disaggregate level possible to define joint distributions, from
which individuals and households are randomly drawn. The method also makes
explicit use of both continuous and discrete optimization and uses the
$\chi^2$ metric to estimate distances between estimated and generated
distributions.
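
As an illustration only, the two ingredients named above (random draws from a
joint distribution and a $\chi^2$ distance between target and generated
distributions) can be sketched in a few lines of Python. This is a minimal
sketch under stated assumptions, not the generator described in the report:
the attribute classes, the probabilities in JOINT, and the helper names
draw_individuals and chi2_distance are all hypothetical, and the Pearson form
of the $\chi^2$ statistic is assumed.

```python
import random
from collections import Counter

# Hypothetical joint distribution over (gender, age class).
# The probabilities are illustrative only and are not taken from the report.
JOINT = {
    ("F", "0-17"): 0.10, ("F", "18-64"): 0.31, ("F", "65+"): 0.10,
    ("M", "0-17"): 0.11, ("M", "18-64"): 0.31, ("M", "65+"): 0.07,
}

def draw_individuals(joint, n, seed=0):
    """Draw n individuals (attribute tuples) at random from a joint distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cells = list(joint)
    weights = [joint[c] for c in cells]
    return rng.choices(cells, weights=weights, k=n)

def chi2_distance(expected, observed):
    """Pearson chi-square distance between expected and observed cell counts."""
    return sum(
        (observed.get(cell, 0) - e) ** 2 / e
        for cell, e in expected.items()
        if e > 0  # cells with zero expected count are skipped
    )

n = 10_000
population = draw_individuals(JOINT, n)
generated = Counter(population)                    # generated cell counts
expected = {c: p * n for c, p in JOINT.items()}    # target cell counts
distance = chi2_distance(expected, generated)
print(f"chi-square distance: {distance:.3f}")
```

A small distance indicates that the generated counts are close to the target
joint distribution; identical distributions give a distance of exactly zero.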

The new generator is applied to construct a synthetic population of
approximately 10,000,000 individuals and 4,350,000 households located in the
589 municipalities of Belgium. The statistical quality of the generated
population is discussed using criteria extracted from the literature, and it
is shown that the new population generator produces excellent results.