Abstract: Forecasting travel demand requires a grasp of individual decision-making behavior. However, transport mode choice (TMC) is determined by personal and contextual factors that vary from person to person. Numerous characteristics have a substantial impact on travel behavior (TB), which makes it important to take them into account when studying transport options. Traditional statistical techniques frequently presume linear relationships, but real-world data rarely satisfy these assumptions, which can make complex interactions harder to capture. To overcome such challenges, a thorough systematic review was conducted to examine how machine learning (ML) approaches might capture nonlinear relationships that conventional methods may miss. The present study carries out an in-depth analysis of discrete choice models (DCM) and of the ML algorithms, datasets, model validation strategies, and tuning techniques employed in previous research. The review also summarizes the DCM and ML models used to predict TMC and to identify the determinants of TB in urban areas for different transport modes. The two primary goals of our study are to establish the present conceptual frameworks for the factors influencing TMC for daily activities and to pinpoint methodological issues and limitations in previous research. Based on a total of 39 studies, our findings shed light on the significance of considering the factors that influence TMC. Adjusted-kernel algorithms and hyperparameter-optimized ML algorithms outperform typical ML algorithms. RF (random forest), SVM (support vector machine), ANN (artificial neural network), and interpretable ML algorithms are the most widely used ML algorithms for predicting TMC. RF achieved an R2 of 0.95 and SVM achieved an accuracy of 93.18%; however, the adjusted kernel raised the accuracy of SVM to 99.81%, which shows that the interpretable algorithms outperformed the typical algorithms. The sensitivity analysis indicates that the most significant parameters influencing TMC are age, total trip time, and the number of drivers.
Funding: Supported by the Science & Technology Pillar Project (No. 0556) of Guangzhou.
Abstract: The discrete choice model is one of the most important tools for studies involving mode split in the context of transport demand forecasting. Because different types of discrete choice models have different merits and restrictions, selecting the appropriate type for a realistic application remains a difficult problem. This article discusses five typical discrete choice models for transport mode split: the multinomial logit model, the nested logit model (NL), the heteroscedastic extreme value model, the multinomial probit model, and the mixed multinomial logit model (MMNL). The theoretical basis and application attributes of these five models are analysed in detail, and the models are applied to a realistic intercity mode split forecasting case. The results indicate that the NL model does well in accommodating similarity and heterogeneity across alternatives, while the MMNL model is the most effective method for mode choice prediction: it shows the highest reliability with the smallest prediction errors and outperforms the other four models in handling the heterogeneity and similarity problems. This study indicates that conclusions derived from a single discrete choice model are not reliable, and that it is better to choose the proper model based on its characteristics.
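As a rough illustration of the simplest of the five model families named in this abstract, the multinomial logit, the sketch below computes choice probabilities from systematic utilities. The function name, the three modes, and the utility values are hypothetical and are not taken from the article; the nested, probit, and mixed variants relax the equal-substitution pattern that this basic form imposes.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j).

    `utilities` maps each alternative to its systematic utility V_i.
    """
    # Subtract the maximum utility for numerical stability; the
    # probabilities are unchanged because the factor cancels.
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}

# Hypothetical utilities for three intercity modes (illustrative only).
utilities = {"car": -1.2, "rail": -0.8, "air": -1.5}
probs = mnl_probabilities(utilities)

# Dropping one alternative leaves the odds between the others unchanged,
# which is the independence-of-irrelevant-alternatives property.
probs_without_air = mnl_probabilities({"car": -1.2, "rail": -0.8})
```

The ratio of the car and rail probabilities is the same in both calls, which is the restriction that the nested logit and mixed logit models are designed to relax.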
Abstract: This paper investigates the effectiveness of online reviews in addressing the price endogeneity issue, in an application to consumer demand for smartphones. We consider review variables as substitutes for unobserved product quality, which previous methods represent as a scalar variable. An aspect-based sentiment classification technique is designed to construct feature-related review variables from millions of reviews. We discuss the performance of review variables in both a hedonic pricing model and a conditional logit discrete choice model. Our results demonstrate that review variables perform well either as instruments for price or as explicit control variables in demand models. In detail, the pricing prediction accuracy increases by 3.4%, which is considered a significant improvement in forecasting practice. In the discrete choice model, the estimated price coefficient is biased in the positive direction without endogeneity correction, and it is adjusted in the expected way after review variables are included. The findings indicate that online reviews provide an alternative source of information for dealing with endogeneity in discrete choice models. We also analyze the differences in the preferences and needs of individual consumers to provide some practical implications for marketing.
Abstract: Social networks like Facebook, X (Twitter), and LinkedIn provide an interaction and communication environment for users to generate and share content, allowing for the observation of social behaviours in the digital world. These networks can be viewed as a collection of nodes and edges, where users and their interactions are represented as nodes and the connections between them as edges. Understanding the factors that contribute to the formation of these edges is important for studying network structure and processes. This knowledge can be applied to various areas such as identifying communities, recommending friends, and targeting online advertisements. Several factors, including node popularity and friends-of-friends relationships, influence edge formation and network growth. This research focuses on the temporal activity of nodes and its impact on edge formation. Specifically, the study examines how the minimum age of friends-of-friends edges and the average age of all edges connected to potential target nodes influence the formation of network edges. Discrete choice analysis is used to analyse the combined effect of these temporal factors and other well-known attributes like node degree (i.e., the number of connections a node has) and network distance between nodes. The findings reveal that temporal properties have an impact similar to that of network proximity in predicting the creation of links. By incorporating temporal features into the models, the accuracy of link prediction can be further improved.
Funding: This work was partially supported by the EU Horizon 2020 project "INCIT-EV", grant agreement ID 875683.
Abstract: The electrification of vehicles is considered one of the most important strategies for addressing the issues related to energy dependence and climate change. To meet user needs, electric vehicle (EV) management for charging operations is essential. This study uses modelling and simulation of EV user behaviour to forecast possible scenarios for electric charging in cities and to identify potential management problems and opportunities for improvement of EVs and EV charging infrastructures. The conurbation of Turin was selected as a case study to reproduce realistic scenarios by applying discrete choice modelling based on socio-economic and transport system data. One of the objectives of the study was to describe user charging behaviour from a geographic perspective, modelling where users prefer to charge in the study area according to the variables that may affect their decisions. Another objective was to estimate the number of electric vehicles in Turin and the characteristics of their users, both of which are helpful in understanding electric mobility within a city. Analysing these behavioural issues in a modelling framework can provide a set of tools to compare and evaluate a variety of possible modifications, indicating an adequate network of charging infrastructure to facilitate the diffusion of electric vehicles.
Funding: The National Natural Science Foundation of China (No. 50738001, 51078086).
Abstract: In order to find the main factors that influence the urban traffic structure, a relational model between travelers' characteristics and trip mode choice is built. The data on urban residents' characteristics are obtained from statistical data, while the trip mode split data are collected through a trip survey in Bengbu. A discrete choice model is adopted to build the functional relationship between mode choice and travelers' personal, family, and trip characteristics. The model shows that the relationship between the mode split and the personal, family, and trip characteristics is stable and changes little over time. The mode split derived from the discrete choice model is relatively accurate and can feasibly be used for trip mode structure forecasts. Furthermore, the proposed model can help to find the key factors influencing trip mode choice and to restructure or optimize the urban trip mode structure.
Abstract: Success or failure of an e-commerce platform is often reduced to its ability to maximize the conversion rate of its visitors. This is commonly regarded as the capacity to induce a purchase from a visitor. Visitors possess individual characteristics, histories, and objectives, which complicate the choice of platform features that maximize the conversion rate. Modern web technology has made clickstream data accessible, allowing a complete record of a visitor's actions on a website to be analyzed. What remains poorly constrained is which parts of the clickstream data are meaningful information and which parts are accidental for the problem of platform design. In this research, clickstream data from an online retailer was examined to demonstrate how statistical modeling can improve clickstream information usage. A conceptual model was developed that conjectured relationships between visitor and platform variables, visitors' platform exit rate, bounce rate, and decision to purchase. Several hypotheses on the nature of the clickstream relationships were posited and tested with the models. A discrete choice logit model showed that the content of a website, the history of website use, and the exit rate of pages visited had marginal effects on derived utility for the visitor. Exit rate and bounce rate were modeled as beta-distributed random variables. It was found that exit rate and its variability for pages visited were associated with site content, site quality, prior visitor history on the site, and technological preferences of the visitor. Bounce rate was also found to be influenced by the same factors, but in a direction opposite to the registered hypotheses. Most findings supported the view that clickstream data is amenable to statistical modeling with interpretable and comprehensible models.
Abstract: Purpose: This paper aims to analyze the factors that influence information inequality in the suburban areas of Shanghai in an effort to better understand information inequality and find ways to reduce it. Design/methodology/approach: A survey was conducted to gather data from the rural people who received the Shanghai information and communication technology (ICT) training courses, and data analysis was based on the 1,200 valid questionnaires retrieved. Using a discrete choice model, we studied the impacts on information inequality in suburban Shanghai of individual attributes, such as gender, age, educational level, and occupation, and of factors of information inequality, such as information skill and the purpose of using information technology (IT). Findings: The most critical factors affecting information inequality among Shanghai suburban residents are educational level and information skill, followed by age and the purpose of using IT. The results show that the purpose of using IT and information skill are the two main aspects of information inequality among Shanghai suburban residents. Differences between individuals, especially in educational level and age, are identified as the underlying causes of the information inequality. Research limitations: Subjects in the sample were limited to those who received training in the Shanghai rural ICT training project. Such a sample limits the generality of the study findings. Practical implications: The study will help enhance our understanding of information inequality and find ways to reduce it. Originality/value: Most previous studies on information inequality focused on theoretical discussions. This study adds to the limited empirical research on information inequality and also provides some insights into ways to reduce it.
Funding: The National Natural Science Foundation of China (No. 51777065).
Abstract: The grid load attributable to electric vehicles (EVs) is affected by the choice behaviors of EV users. To analyze the effects of factors such as travel demand and electricity prices on user behavior, a logit discrete choice model is introduced to simulate the users' decisions to charge or travel. Based on a quasi-steady-state traffic network, a model for cluster electric vehicles that considers user behavior is designed to obtain the probability distribution of user behavior and the charge and discharge curves of cluster EVs under various scenarios. The validity of the proposed model is verified using an IEEE 9-node traffic network case and an urban traffic network case. Furthermore, the impact of the electricity price, traffic conditions, and other factors on the load curves of urban EVs is analyzed.
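To make the charge-or-travel decision in this abstract concrete, a minimal binary logit sketch follows. It is not the model from the paper: the function name, the linear utility form, and every coefficient value are assumptions chosen only to show how a higher electricity price lowers the probability of charging.

```python
import math

def charge_probability(price, travel_demand,
                       const=1.5, beta_price=-2.0, beta_travel=-1.0):
    """Binary logit probability that an EV user charges rather than travels.

    All coefficients are hypothetical. `price` is the electricity price
    and `travel_demand` a score for how strongly the user needs to travel;
    both enter the utility difference V = V_charge - V_travel linearly.
    """
    v = const + beta_price * price + beta_travel * travel_demand
    # Logistic transform of the utility difference.
    return 1.0 / (1.0 + math.exp(-v))

# Probability of charging across a few illustrative price levels.
prices = [0.25, 0.75, 1.25]
curve = [charge_probability(p, travel_demand=0.0) for p in prices]
```

Summing such probabilities over a fleet, hour by hour, is one simple way to turn individual choices into an aggregate load curve, although the paper's traffic-network model is considerably richer than this sketch.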