Abstract
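Human mobility behavior is closely tied to real-world applications such as transportation, infectious disease control, and public safety and emergency response. Although modern information and communication technologies have made it easy to collect large-scale individual positioning data, raw trajectory data involve personal privacy and suffer from redundancy, missing values, and noise, so their availability and usability in practice remain severely limited. Generating individual trajectory data and collective mobility data through modeling, such that the generated data approximate real data statistically and can substitute for them in applications, is a solution worth pursuing. This article addresses two research themes, individual trajectory data generation and collective mobility data generation, divides generation methods into mechanistic-model-based and machine-learning-based methods, systematically summarizes research progress on them, and discusses development trends and open challenges. It proposes that future research on human mobility data generation should: explore the underlying mechanisms of human mobility behavior through deep multidisciplinary collaboration; attend to coupled modeling of mechanistic models and machine learning; draw on frontier generative artificial intelligence and large language model technologies; balance the utility of generated data against privacy protection; emphasize spatial generalization and transfer learning capabilities; and control the costs of model training and use. The article argues that human mobility is a typical human-environment interaction process, and that geographic information science, while absorbing theories and methods from computer science, statistical physics, complexity science, and other disciplines, should give full play to its own disciplinary strengths by explicitly incorporating geospatial effects such as spatial dependence, distance decay, spatial heterogeneity, and scale into the modeling process, thereby improving model performance and plausibility.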
Human mobility data play a crucial role in many real-world applications such as infectious disease control, transportation, and public safety. The development of modern Information and Communication Technologies (ICT) has made it easier to collect large-scale individual-level human mobility data; however, the availability and usability of the raw data are still significantly limited by privacy concerns and by redundancy, missing values, and noise. Generating synthetic human mobility data through modeling approaches that statistically approximate the real data is a promising solution. From the data perspective, the generated human mobility data can serve as a substitute for real data, mitigating concerns about personal privacy and data security, and can enhance low-quality real data. From the modeling perspective, the models constructed for human mobility data generation can be used for scenario simulation and mechanism exploration. Human mobility data generation tasks include individual trajectory data generation and collective mobility data generation, and the research methods consist primarily of mechanistic models and machine learning models. This article first provides a systematic review of research progress in human mobility data generation and then summarizes its development trends and challenges. Mechanistic-model-based methods are studied predominantly in statistical physics, while machine-learning-based methods are studied primarily in computer science. Although the two types of models have complementary advantages, they are still developing independently. The article suggests that future research in human mobility data generation should focus on: 1) exploring and revealing the underlying mechanisms of human mobility behavior from a multidisciplinary perspective; 2) designing hybrid approaches that couple machine learning and mechanistic models; 3) leveraging cutting-edge generative Artificial Intelligence (AI) and Large Language Model (LLM) technologies; 4) improving the models' spatial generalization and transfer-learning capabilities; 5) controlling the costs of model training and implementation; and 6) designing reasonable evaluation metrics and balancing data utility with privacy-preserving effectiveness. The article asserts that human mobility processes are a typical phenomenon of human-environment interaction. On the one hand, research in the Geographic Information Science (GIS) field should integrate theories and technologies from other disciplines such as computer science, statistical physics, complexity science, and transportation. On the other hand, it should harness the unique characteristics of GIS by explicitly incorporating geographic spatial effects, including spatial dependency, distance decay, spatial heterogeneity, and scale, into the modeling process to enhance the rationality and performance of human mobility data generation models.
Author
刘康
LIU Kang (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China)
Source
《地球信息科学学报》
EI
CSCD
Peking University Core Journals (北大核心)
2024, Issue 4, pp. 831-847 (17 pages)
Journal of Geo-information Science
Funding
National Key Research and Development Program of China (2022YFB3904203)
National Natural Science Foundation of China (42271474, 41901391).
Keywords
人类移动数据
合成数据
轨迹生成
机器学习
机理模型
生成式人工智能
隐私保护
地理空间人工智能
human mobility data
synthetic data
trajectory generation
machine learning
mechanistic model
generative AI
privacy-preserving
Geospatial Artificial Intelligence (GeoAI)