Methodology:
I applied a random forest to make the prediction. The algorithm works in four steps: 1) Draw random samples from the given dataset. 2) Construct a decision tree for each sample and obtain a prediction from each tree. 3) Combine the predictions of all trees: by majority vote for classification, or by averaging for regression. 4) Take the combined result as the final prediction. In this case, I use scikit-learn to create and train the model. More specifically, I import the random forest regression model from scikit-learn, instantiate the model, and fit it on the training data. Finally, to examine the accuracy of the prediction model, I make predictions on the test set and calculate accuracy as 100% minus the mean absolute percentage error (MAPE).
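The workflow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual project code: the data is synthetic, the category counts, the target formula, the 75/25 split, and the choice of 100 trees are all hypothetical, and only the scikit-learn calls (RandomForestRegressor, train_test_split) are real library names.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption: 5 boroughs,
# 4 incident types, hour of day 0-23, and a strictly positive target).
rng = np.random.default_rng(0)
n = 500
borough = rng.integers(0, 5, size=n)
incident = rng.integers(0, 4, size=n)
hour = rng.integers(0, 24, size=n)

# One-hot encode the two categorical features; keep hour as a number.
X = np.column_stack([np.eye(5)[borough], np.eye(4)[incident], hour])

# Hypothetical target; it stays well above zero so MAPE is well defined.
y = 10 + 2.0 * borough + 1.5 * incident + 0.2 * hour + rng.normal(0, 1, size=n)

# Hold out a test set for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Instantiate the model and fit it on the training data.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predict on the test set; the forest averages the trees' outputs.
predictions = model.predict(X_test)

# Accuracy defined as 100% minus the mean absolute percentage error.
mape = 100 * np.mean(np.abs((predictions - y_test) / y_test))
accuracy = 100 - mape
print(f"MAPE: {mape:.2f}%  Accuracy: {accuracy:.2f}%")
```

Note that MAPE divides by the true value, so this accuracy measure is only meaningful when the target is never zero or close to zero.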
A random forest model is constructed by combining several independently trained base learners (decision trees), and using multiple trees helps to prevent overfitting. The features I chose from the dataset (borough, incident type, and time of day) are independent of each other, and the dataset is relatively small, so training should not cause performance issues. I therefore believe a random forest model is a good fit for my use case.