3. Machine learning: A data-driven approach
Fermentation is a multivariate system in which many parameters
can influence the process outcome [24]. As outlined in
the previous section, mechanistic models (e.g., CBM in this review) can
help fine-tune some fermentation parameters, such as medium
composition. Nevertheless, mechanistic approaches cannot capture the
effect of every fermentation parameter on productivity.
On the other hand, purely experimental trial-and-error methods are
time-consuming and often costly. Despite the difficulties of
such traditional techniques, the large amount of data generated by
previous fermentation studies provides a suitable basis
for data-driven modeling approaches to find the optimal sets of
fermentation parameters. Moreover, a rational analysis of large and
complex datasets generated from experiments, measurements, and
simulations can significantly contribute to an in-depth understanding of
the system of interest [74].
Machine learning (ML) is a data-driven approach that uses statistics and
probability theory to analyze a dataset, discover hidden
relationships within the data that explain a phenomenon, and build a
predictive model from the patterns it has learned. In the past,
researchers did not distinguish between ML and artificial intelligence
(AI), but ML is now recognized as a subfield of AI [75].
In this view, AI is concerned with developing the tools and techniques
for ML, while ML applies these tools in various fields such as engineering and
science [76]. In the ML process, a problem is first defined on a
dataset. Then, a set of preprocessing operations is performed on the
dataset based on the defined problem. In the next step, the ML model is
created using a user-defined estimator. Finally, the model is validated and
evaluated using standard techniques. Figure 2 shows the general
scheme of the machine learning workflow.
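
The four workflow steps described above (problem definition, preprocessing, model creation, and validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the reviewed studies: the data are synthetic, the variable names (temperature, titer) are hypothetical, ordinary least squares stands in for the user-defined estimator, and the coefficient of determination on a held-out subset stands in for the validation technique. Only the Python standard library is used.

```python
import random
import statistics

# Step 1: define the problem on a dataset.
# Hypothetical, synthetic data (illustrative only): predict a product
# titer from the cultivation temperature.
random.seed(0)
temperature = [random.uniform(25.0, 40.0) for _ in range(60)]
titer = [0.8 * t + 2.0 + random.gauss(0.0, 0.5) for t in temperature]

# Step 2: preprocessing. Split into training and held-out subsets, then
# standardize the input using statistics of the training subset only.
indices = list(range(len(temperature)))
random.shuffle(indices)
split = int(0.75 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

x_train = [temperature[i] for i in train_idx]
y_train = [titer[i] for i in train_idx]
x_test = [temperature[i] for i in test_idx]
y_test = [titer[i] for i in test_idx]

mean_x = statistics.fmean(x_train)
std_x = statistics.stdev(x_train)


def standardize(values):
    """Scale values with the training mean and standard deviation."""
    return [(v - mean_x) / std_x for v in values]


z_train = standardize(x_train)
z_test = standardize(x_test)


# Step 3: create the model with a chosen estimator.
# Assumption: ordinary least squares with a single input variable.
def fit_linear(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = fit_linear(z_train, y_train)


# Step 4: validate and evaluate the model on data not used for fitting.
def r_squared(y_true, y_pred):
    """Coefficient of determination of predictions against observations."""
    my = statistics.fmean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - my) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


predictions = [slope * z + intercept for z in z_test]
test_r2 = r_squared(y_test, predictions)
print(f"held-out R^2: {test_r2:.3f}")
```

In this sketch the scaling statistics are computed from the training subset alone, so no information from the held-out subset enters the model before evaluation. In practice, the estimator and the validation scheme (for example, cross-validation instead of a single split) would be chosen to suit the dataset and the defined problem.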