1. Introduction
Fermentation technology is enjoying a significant moment, due to the
potential of metabolic engineering, systems biology, and synthetic
biology [1]. Various economically important compounds, such as
chemicals, fuels, and biopharmaceuticals, can be obtained
through fermentation processes. To commercialize
any fermentation-based product, the amount of product obtained must
meet market demand [2]. Therefore, optimization of
fermentation parameters (e.g., temperature, pH, medium composition,
and feeding strategy) is critical to the
overall yield and productivity of the bioprocess. Furthermore,
fermentation optimization can reduce the overall cost of the bioprocess
through its impact on downstream processes and purification [3].
Various strategies have been used so far to find the optimal values of
fermentation parameters. Modeling has always been one of the most
popular because it can replace expensive laboratory
experiments, or at least reduce their number. In this approach,
the output is calculated as a function of the given inputs according to
a specified algorithm [4]. For instance, the inputs can be media
composition, temperature, and pH, and the values of these
parameters are then optimized to obtain the desired output. Generally, three
types of models are applied to such problems: purely
mechanistic (knowledge-driven), purely data-driven, or a combination of
the two [5]. Each of these approaches has its own advantages and
disadvantages. For example, data-driven models are black-box models,
which do not provide adequate information on the underlying mechanism
[6]. Nevertheless, large datasets may not be incorporated smoothly into a
model framework. On the other hand,
hypothesis-driven (mechanistic) approaches use basic knowledge to extract
deeper information from datasets and provide valuable information on the
underlying mechanism. Nonetheless, the construction of these models is
challenging due to the rapid growth of data. Therefore, to
construct the most powerful model, it is crucial for the researcher to
understand the strengths and weaknesses of these approaches. Moreover,
hybridizing the two approaches might yield the most powerful
model [7].
Fermentation parameters have a significant effect on cellular
metabolism and thus on productivity. Mechanistic analysis of the
interaction between environmental conditions and metabolic pathways
can therefore guide comprehensive fine-tuning of fermentation parameters
[8]. There are several mechanistic models for simulating metabolism
in the field of systems biology [9]. Among them, constraint-based
modeling (CBM) of metabolism is one of the most common approaches
[10]. These models are built from a genome-scale metabolic network
reconstruction to predict metabolic flux values through optimization
techniques such as flux balance analysis (FBA) [11, 12]. To date,
genome-scale metabolic models (GEMs) for diverse eukaryotic and
prokaryotic organisms and cells have been reconstructed [13] and
applied in biotechnology and human health [14, 15]. Such models have
been extensively utilized for qualitative mapping of cellular
metabolism, predicting metabolic functions, and guiding metabolic
engineering designs and bioprocess optimizations toward the desired
phenotype [16].
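
To illustrate how FBA turns a stoichiometric reconstruction into predicted fluxes, the following minimal sketch maximizes a biomass flux subject to steady-state mass balance (S v = 0) and flux bounds, using a generic linear-programming solver. The four-reaction network, the reaction names, and the bounds are hypothetical and chosen only for illustration; they are not taken from any published GEM, and a genome-scale model would normally be handled with a dedicated constraint-based modeling package.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only, not a published model):
#   R_uptake:     -> A      (substrate uptake, limited by the medium)
#   R_convert:    A -> B
#   R_biomass:    B ->      (objective: stands in for growth)
#   R_byproduct:  A ->      (overflow product)
reactions = ["R_uptake", "R_convert", "R_biomass", "R_byproduct"]

# Stoichiometric matrix S: rows are metabolites (A, B), columns are reactions.
S = np.array([
    [1.0, -1.0,  0.0, -1.0],  # metabolite A
    [0.0,  1.0, -1.0,  0.0],  # metabolite B
])

# Assumed flux bounds (arbitrary units); the uptake bound mimics a medium limit.
bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]


def run_fba(S, bounds, objective_index):
    """Maximize one flux subject to S v = 0 and the given bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate to maximize
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


fluxes = run_fba(S, bounds, reactions.index("R_biomass"))
for name, value in zip(reactions, fluxes):
    print(f"{name}: {value:.2f}")
```

In this sketch, changing the upper bound on `R_uptake` plays the role of changing a fermentation condition such as substrate availability: the predicted biomass flux scales with it because uptake is the only limiting constraint in the toy network.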
In parallel, machine learning (ML) is a purely data-driven approach based on
the creation and evolution of algorithms that identify patterns and
generate hypotheses or models by learning from existing data [17,
18]. Because of the rapid increase in omics datasets, many researchers
prefer to use machine learning independently to interpret systems
biology and metabolic engineering datasets. For instance, genome
annotation, host strain selection, pathway discovery, metabolic pathway
reconstruction, metabolic flux optimization, multi-omic data
integration, and protein modeling can be addressed with machine
learning methods [3, 19]. Besides, owing to the availability of
large amounts of fermentation parameter values from empirical studies,
machine learning algorithms can be applied directly to this
multivariate system to fine-tune the fermentation conditions [20,
21].
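
As a rough illustration of the data-driven route, the sketch below fits a quadratic response surface to titer measurements as a function of temperature and pH by ordinary least squares, then solves for the predicted optimum. Quadratic regression is used here only as the simplest stand-in for the ML regressors discussed in the literature; it is not the method of any cited study. The data are synthetic, generated from a hypothetical function (`true_titer`) with an assumed optimum at 32 °C and pH 6.5, and all names and values are assumptions made for the example.

```python
import numpy as np


def true_titer(temp_c, ph):
    """Hypothetical ground truth, used only to generate synthetic data."""
    return 5.0 - 0.05 * (temp_c - 32.0) ** 2 - 1.5 * (ph - 6.5) ** 2


def quadratic_features(temp_c, ph):
    """Design matrix with columns 1, T, pH, T^2, pH^2, T*pH."""
    temp_c = np.asarray(temp_c, dtype=float)
    ph = np.asarray(ph, dtype=float)
    return np.column_stack([
        np.ones_like(temp_c), temp_c, ph, temp_c ** 2, ph ** 2, temp_c * ph,
    ])


# Synthetic "experiments" on a 5 x 5 grid of conditions (noise-free for clarity).
temp_grid, ph_grid = np.meshgrid(np.linspace(26.0, 38.0, 5),
                                 np.linspace(5.5, 7.5, 5))
temps, phs = temp_grid.ravel(), ph_grid.ravel()
titers = true_titer(temps, phs)

# Fit the surrogate model by least squares.
coef, *_ = np.linalg.lstsq(quadratic_features(temps, phs), titers, rcond=None)
b0, b_t, b_p, b_tt, b_pp, b_tp = coef

# The stationary point of the fitted surface solves gradient = 0.
hessian = np.array([[2.0 * b_tt, b_tp],
                    [b_tp, 2.0 * b_pp]])
best_temp, best_ph = np.linalg.solve(hessian, np.array([-b_t, -b_p]))

# It is a maximum only if the Hessian is negative definite.
is_maximum = bool(np.all(np.linalg.eigvalsh(hessian) < 0.0))

print(f"Predicted optimum: T = {best_temp:.2f} C, pH = {best_ph:.2f}, "
      f"maximum = {is_maximum}")
```

With real measurements the fit would not be exact, so the predicted optimum should be treated as a candidate for experimental confirmation; more flexible learners can replace the quadratic model when the response is not well approximated by a second-order surface.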
Although applications of each of the two methods on its own are
constantly increasing, the unique capabilities of each have motivated their
integration into models with greater predictive power and accuracy. Recently,
comprehensive reviews of the integration of machine learning algorithms
and mechanistic models have been published, indicating a promising
outlook for this field of knowledge [22-26]. However, it is
worthwhile to review the capabilities of these two methods individually
or in combination for fermentation parameter optimization. The basic
idea here is that machine learning is a powerful computational tool for
analyzing omics data individually or inferring multi-omic relationships.
Moreover, CBM generates an additional layer of omics data, called
fluxomics, which can be analyzed by machine learning methods
separately or in combination with other omics data [26].
In the present review, first, we highlight the latest efforts in the
literature that utilize CBM as a mechanistic approach for fermentation
optimization. Next, we introduce ML as a data-driven method and
highlight its recent applications in tuning the fermentation parameters.
Finally, we present studies in which CBM and ML are combined to improve
model accuracy for analyzing fermentation conditions.