DEVELOPING CHEMOMETRIC MODELS
Our current research aims to evaluate this type of signal using
chemometric models [3]. The main goal is to predict more
than one parameter using a single model, since water and fat
content are linked to several properties of food.
Developing such models generally involves an exploratory
analysis of the original data to verify the correlation between
the variables and their influence on the data set and, additionally, to assess the tendency of samples to cluster.
Principal component analysis (PCA) is the most widely used
chemometric tool for this kind of exploration, as shown
in Fig. 1. The partial least squares (PLS) regression method
builds on PCA. An algorithm commonly used to calculate PLS is
Nonlinear Iterative Partial Least Squares (NIPALS).
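To make the exploratory step concrete, the following is a minimal sketch of PCA computed with a NIPALS-style iteration in NumPy. The function name `nipals_pca`, the tolerance, the starting column and the synthetic matrix `X_demo` are illustrative assumptions, not part of the original work; real TD-NMR signals would take the place of the random data.

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=1000):
    """Return (scores, loadings) of mean-centered X, one column per component."""
    E = X - X.mean(axis=0)                  # residual matrix, deflated per component
    scores, loadings = [], []
    for _ in range(n_components):
        # Assumption: start from the column with the largest variance.
        t = E[:, np.argmax(E.var(axis=0))].copy()
        for _ in range(max_iter):
            p = E.T @ t / (t @ t)           # project the data onto the scores
            p /= np.linalg.norm(p)          # unit-length loading vector
            t_new = E @ p                   # updated scores
            converged = np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        E = E - np.outer(t, p)              # remove the extracted component
        scores.append(t)
        loadings.append(p)
    return np.array(scores).T, np.array(loadings).T

# Synthetic stand-in for instrumental data: 30 samples, 4 variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 4)) * np.array([10.0, 3.0, 1.0, 0.3])
scores, loadings = nipals_pca(X_demo, n_components=2)
```

Plotting the first column of `scores` against the second is the usual way to look for clusters among samples, while `loadings` shows how strongly each variable contributes to each component.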
When using PLS, it is necessary to choose a group
of samples that make up a training set. With this set, a
calibration model is built to relate the instrumental response
or independent variables (matrix x) with the property of
interest or dependent variable (y, vector or matrix) of a given
sample type by means of a vector (b) containing the regression
coefficients. The instrumental responses (x) will be the TD-NMR
signals. The properties of the samples (for example, fat content,
moisture and other parameters) form the y matrix.
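Written compactly, and assuming the usual linear form with a residual term e (a symbol the text does not name), the calibration model can be sketched as:

```latex
\mathbf{y} = \mathbf{X}\,\mathbf{b} + \mathbf{e}
```

Here X stands for the matrix x of TD-NMR signals (one row per sample), y for the measured property and b for the vector of regression coefficients.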
For PLS, two PCA-like decompositions are calculated: the
first for matrix x and the second for y. The main goal
of PLS is to find a correlation between x and y. An internal relationship is obtained by relating the scores of y
(u) to the scores of x (t) through the linear relationship in Equation 1, in
which $b_h$ are the regression coefficients.
The regression coefficients may be used to predict different
properties, such as the fat content of meat.
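The inner relationship between the scores can be illustrated with a small single-response PLS (often called PLS1) fitted by a NIPALS-style loop. This is a sketch under stated assumptions, not the authors' implementation: the names `pls1_nipals` and `pls1_predict` are hypothetical, only one property is predicted (so the y scores reduce to the y residuals), and the data are synthetic and noise-free.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Fit single-response PLS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E = X - x_mean                      # X residuals, deflated each round
    f = y - y_mean                      # y residuals; with one response, u_h is f
    W, P, b = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)          # weight vector
        t = E @ w                       # X scores t_h
        tt = t @ t
        p = E.T @ t / tt                # X loadings
        b_h = (f @ t) / tt              # inner relation: u_hat_h = b_h * t_h
        E = E - np.outer(t, p)          # deflate X
        f = f - b_h * t                 # deflate y
        W.append(w)
        P.append(p)
        b.append(b_h)
    W, P, b = np.array(W).T, np.array(P).T, np.array(b)
    coef = W @ np.linalg.solve(P.T @ W, b)   # coefficients in the original X space
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

# Synthetic example: 20 samples, 3 variables, exact linear response.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0]) + 5.0
coef, x_mean, y_mean = pls1_nipals(X_demo, y_demo, n_components=3)
y_hat = pls1_predict(X_demo, coef, x_mean, y_mean)
```

With as many latent variables as columns in X and no noise, the model reproduces the response exactly; with real signals, fewer latent variables are normally kept.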
Another aspect to be pointed out about PLS is that an
external set of samples must be used in addition to the training set. These external samples are not used to construct the
model, but are instead used to validate the suggested model.
When choosing the number of latent variables to use
in PLS, two quantities should be considered: 1. the variance
explained by each latent variable, and 2. the predictive residual
error sum of squares (PRESS). An appropriate number of latent
variables is one for which the value of PRESS is low. Other values
to be observed are the root mean square errors of calibration
(RMSEC) and validation (RMSEV).
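The error measures above can be sketched in a few lines of Python. The helper names (`press`, `rmse`, `pick_n_latent`) and the numbers are purely illustrative assumptions; the selection rule used here simply takes the lowest PRESS, and the RMSE divides by the number of samples without any degrees-of-freedom correction.

```python
import math

def press(y_true, y_pred):
    """Sum of squared prediction residuals."""
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))

def rmse(y_true, y_pred):
    """Root mean square error: RMSEC on training data, RMSEV on validation data."""
    return math.sqrt(press(y_true, y_pred) / len(y_true))

def pick_n_latent(press_by_n):
    """Return the number of latent variables with the lowest PRESS."""
    return min(press_by_n, key=press_by_n.get)

# Hypothetical validation values and predictions from models with 1, 2 and 3
# latent variables (illustrative numbers, not measured data).
y_val = [10.0, 12.0, 15.0]
predictions = {
    1: [11.0, 11.0, 16.0],
    2: [10.5, 12.0, 14.5],
    3: [10.0, 13.0, 15.0],
}
press_by_n = {n: press(y_val, y_hat) for n, y_hat in predictions.items()}
best_n = pick_n_latent(press_by_n)
rmsev_best = rmse(y_val, predictions[best_n])
```

In practice the PRESS curve is usually inspected together with the explained variance, since adding latent variables beyond the point where PRESS levels off tends to fit noise.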
Fig. 1: Principal component analysis.
$\hat{u}_h = b_h t_h$ (Equation 1)