Conventional population synthesis methods rely on household travel survey (HTS) data. Because they generate sociodemographic and spatial attributes sequentially, they produce many infeasible attribute values, and the low sampling rate of HTS data leaves the synthetic population with low spatial heterogeneity. Passively collected mobility (PCM) data (e.g., cellular traces) provide extensive spatial coverage but are difficult to integrate with HTS data because the two sources differ in spatial resolution and attributes. This study introduces a novel cluster-based data fusion method that addresses these limitations and simultaneously generates synthetic populations with accurate sociodemographics and home–work locations at high spatial heterogeneity. Spatial clustering aligns the spatial resolution of the HTS and PCM data so that the two can be integrated. The data fusion process is reformulated into cluster-specific, low-dimensional optimization subproblems to ensure computational tractability. Analytical properties are derived to show that the fused distribution retains essential distributional characteristics of both datasets. The spatial clustering process is optimized to preserve this distributional consistency while balancing the feasibility and heterogeneity of the synthetic population. The data fusion properties are validated using HTS data and LTE/5G cellular signaling data from Seoul, South Korea. Validation against census data confirms that the method maintains distributional consistency while increasing spatial heterogeneity, with 97% of the generated population being unobserved in the HTS data. This research advances population synthesis by leveraging the complementary strengths of HTS and PCM data, providing a robust framework for generating the spatially diverse synthetic populations essential for urban planning.
This work was partially supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00246523 and No. RS-2024-00337956). This research was conducted at the Future Cities Lab Global at Singapore-ETH Centre. Future Cities Lab Global is supported and funded by the National Research Foundation, Prime Minister’s Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme and ETH Zurich (ETHZ), with additional contributions from the National University of Singapore (NUS), Nanyang Technological University (NTU), Singapore and the Singapore University of Technology and Design (SUTD).