Frontiers of Data and Computing ›› 2023, Vol. 5 ›› Issue (4): 112-126.

CSTR: 32002.14.jfdc.CN10-1649/TP.2023.04.010

doi: 10.11871/jfdc.issn.2096-742X.2023.04.010

• Technology and Application •

Taxi Demand Prediction Model Based on Spark and Improved BP Neural Network

MENG Zhe1, YU Su2,*

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  2. Graphic Information Center, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2022-05-02 Online: 2023-08-20 Published: 2023-08-23

Abstract:

[Objective] Taxi dispatching is an important issue affecting the development of transportation in China and has attracted wide attention from scholars. To address problems such as prolonged idle cruising, inefficient matching between taxis and passengers, and taxi shortages, a taxi demand prediction model based on Spark and an improved BP neural network is proposed. The model predicts the total one-day demand for taxis in a given area of a city; its core prediction algorithm is the improved BP neural network. [Methods] Because the traditional BP neural network converges slowly and trains poorly on large data sets, grey relational analysis and a genetic algorithm are used to optimize the model. The data set is first processed by grey relational analysis, and the results are applied inside the BP neural network to improve convergence speed and training performance. The genetic algorithm is then used to further optimize the model's parameters. Finally, the model is implemented on Spark to accelerate training. [Results] Training and prediction experiments on the taxi data set from the TLC (New York City Taxi and Limousine Commission) show that, compared with the traditional BP neural network, the BP neural network improved by a genetic algorithm, the BP neural network improved by simulated annealing combined with a genetic algorithm, and the BP neural network improved by particle swarm optimization, the proposed model increases prediction accuracy by 25%, 11.1%, 6.9%, and 12.4%, respectively, shortens training time by 32.9 h, 30.1 h, 36.2 h, and 33.5 h, respectively, and converges significantly faster.
The model is also trained and evaluated on Chengdu taxi data, which demonstrates its generality and its effectiveness in forecasting urban taxi demand in China. [Conclusions] The improved model performs the taxi demand prediction task well and provides an effective reference for decision-makers in taxi dispatching, helping to alleviate existing dispatching problems. [Limitations] The model still has room for improvement, for example in its prediction range and parameter selection.

Key words: taxi dispatching, taxi demand, prediction model, BP neural network, grey relational analysis, genetic algorithm, Spark
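
Illustrative sketch (not taken from the article): the abstract names grey relational analysis as the preprocessing step but does not say which variant is used or how its results enter the BP neural network. The Python below shows only the standard grey relational grade computation, assuming min-max normalization and a distinguishing coefficient rho = 0.5. The function names and the sample factor names are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max(seq: Sequence[float]) -> List[float]:
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def grey_relational_grades(
    reference: Sequence[float],
    factors: Dict[str, Sequence[float]],
    rho: float = 0.5,  # distinguishing coefficient (assumed value)
) -> Dict[str, float]:
    """Return the grey relational grade of each factor against the reference."""
    for name, seq in factors.items():
        if len(seq) != len(reference):
            raise ValueError(f"factor {name!r} has a different length")

    ref = min_max(reference)
    # Absolute differences between the scaled reference and each scaled factor.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, min_max(seq))]
        for name, seq in factors.items()
    }
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        # Every factor coincides with the reference after scaling.
        return {name: 1.0 for name in deltas}

    grades = {}
    for name, row in deltas.items():
        # Grey relational coefficient at each point, then its mean (the grade).
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


if __name__ == "__main__":
    # Hypothetical daily demand and two hypothetical candidate factors.
    demand = [10, 20, 30, 40]
    candidates = {"scaled_copy": [1, 2, 3, 4], "reversed": [4, 3, 2, 1]}
    for name, grade in grey_relational_grades(demand, candidates).items():
        print(f"{name}: {grade:.4f}")
```

A higher grade means the factor's shape tracks the reference sequence more closely. One plausible use is ranking or weighting input features before training, though whether the article does this is not stated in the abstract.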