preprocess
Numeric features often exist in a variety of data scale. Standardization of feature values not only can prevent the objective function of algorithms hindered by features with a board range of values, but also facilitate the algorithms using gradient descent as an optimizer as the converging speed is much faster. Features with missing values are first filled with the mean values before standardization.