Abstract:
Near-infrared (NIR) spectroscopy was used to quantitatively analyze the nutritional components of blueberries stored for different lengths of time, in order to determine the correlation between their chemical composition and the NIR spectral data and to enable nondestructive detection of these components. Two machine learning algorithms, Partial Least Squares Regression (PLSR) and Support Vector Regression (SVR), were applied to the spectra to predict the soluble solids content (SSC) and vitamin C (VC) content of the blueberries. To improve prediction accuracy, the spectral data were preprocessed with one or more of the following methods: First Derivative (1-DER), Second Derivative (2-DER), Standard Normal Variate (SNV) transformation, Multiplicative Scatter Correction (MSC) and Savitzky-Golay (S-G) smoothing, and the best-performing methods were identified by comparison. Competitive Adaptive Reweighted Sampling (CARS) and Random Frog (RF) were then applied, separately and in combination, to reduce the number of spectral wavelength variables. After dimension reduction, the wavelength variables for SSC were reduced to 1.7%, 4.3% and 5.6% of the full-spectrum variables, and those for VC to 2.5%, 2.9% and 4.8%, respectively. PLSR models for predicting the SSC and VC contents of blueberries were then built on the selected wavelength variables. The comparison showed that the wavelengths selected by CARS combined with RF gave better predictions: for SSC and VC, respectively, the calibration correlation coefficients were 0.9001 and 0.8707, the calibration root mean square errors were 0.8234 and 2.9429, the prediction correlation coefficients were 0.8424 and 0.8350, and the prediction root mean square errors were 0.9613 and 2.9482. To check that this finding did not depend on the choice of regression model, SVR models were built for comparison. With SVR, CARS combined with RF again gave better predictions: for SSC and VC, respectively, the calibration correlation coefficients were 0.8702 and 0.8503, the calibration root mean square errors were 0.9549 and 3.2431, the prediction correlation coefficients were 0.8269 and 0.8183, and the prediction root mean square errors were 0.8769 and 2.8818. In summary, this study provides a modeling basis for monitoring the nutritional quality of blueberries, and the proposed characteristic wavelength selection method offers a reference for building prediction models of nutrients in other fruits and vegetables.
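
The abstract names SNV preprocessing and reports correlation coefficients and root mean square errors without spelling out how they are computed. A minimal sketch of these three quantities is given below, using only the Python standard library. The function names `snv`, `rmse` and `pearson_r` and the toy numbers are assumptions made for illustration; they are not taken from the study, whose implementation is not described here.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - mu) / sd for x in spectrum]


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(mean(squared_errors))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical toy values, NOT data from the study.
toy_spectrum = [1.0, 2.0, 3.0]
measured = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 10.5, 12.5, 12.5]

print(snv(toy_spectrum))
print(rmse(measured, predicted))
print(pearson_r(measured, predicted))
```

Applied to a calibration set, `pearson_r` and `rmse` would give the calibration correlation coefficient and calibration root mean square error. Applied to a held-out prediction set, they would give the corresponding prediction statistics.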