Prediction of the Phytochemical Properties of Luffa Cylindrica Seed Oil Using Artificial Neural Network

This research used an artificial neural network (ANN) to examine the optimum extraction conditions and phytochemical contents of Luffa cylindrica seed oil. The oil yield was predicted using the ANN, and the performance of the ANN and response surface methodology (RSM) models was compared. The optimum extraction gave 7.567% oil yield, 185.676 mg/l phenol, and 45.087 mg/l terpineol at an extraction temperature of 75.57 °C, an extraction time of 5.77 h, and an n-hexane concentration of 10.68 g/mol. These data show that the oil yield is low but that the oil has a significant phenol and terpenoid content that may be exploited in the pharmaceutical sector. The FT-IR analysis of Luffa cylindrica seed oil revealed a high level of unsaturated hydrocarbons and esters, making the oil suitable for use in the paint industry and in cosmetics.


INTRODUCTION
Various plants have been used to produce oil. Although many different parts of plants may provide oil in commercial practice, oil is typically obtained from the seeds (endosperm) of plants that grow all over the globe. The properties of oils from various sources are determined mainly by their composition; no oil from a single source can be used for all applications [1]. The world is becoming more environmentally aware, with synthetic items increasingly being replaced by organically produced ones. Consequently, demand for seed oils as raw materials in the chemical industry is rising. Some of these oils are ingested directly or indirectly as dietary ingredients or are used as components of many industrial items (e.g. soaps, perfumes, personal care and skin products, candles, and cosmetic products). Seed oils are also employed in biodiesel manufacture. Several oils, including moringa oil, sunflower oil, rapeseed oil, palm oil, soya bean oil, corn oil, baobab oil, and pumpkin oil [2], are expensive. Thus, there is a need for new low-cost oil seed crops for the production of inexpensive oils suitable for food, pharmaceutical, and other industrial uses. Luffa cylindrica seed oil is one product that may provide a good outcome in terms of cost, renewability, biodegradability, and non-edibility. With the current industrial attention to renewability and environmental friendliness, luffa oil derivatives may find larger markets globally, increasing the amount of research focused on harnessing the seed oil for diverse purposes. In terms of availability and renewability, Luffa cylindrica seed oil has been found to be a sustainable resource for biodiesel and alkyd resin production. Luffa cylindrica is a rapidly growing annual vine that spreads widely and matures in four months. The luffa plant is a cucurbit, a member of the Cucurbitaceae family, which also includes gourds, pumpkins, and cucumbers.
Luffa is known by various names, including smooth loofah, loofah, loofah sponge, sponge gourd, vegetable sponge, dishrag gourd, and Chinese okra. The luffa species are Luffa cylindrica and Luffa aegyptiaca [3]. In Nigeria, it is known as 'soso' in Hausa, 'kankan' in Yoruba, and 'asisa' in Ibo [4]. Most oil extraction processes are expensive owing to the inability to control specific intrinsic characteristics. Many studies have been conducted to discover alternative methods of producing oil for the process industries and the food business. It has been found that practically all seeds contain oil, which opens the door for other researchers to look for further applications of the oil-producing materials prevalent in people's daily lives [5].
There are various methods for extracting oil from oilseeds. However, solvent extraction has been reported to be the most efficient [6], implying the need for process industries to optimise current extraction methods, thereby improving production profitability and ensuring a sufficient oil supply. Bioactive molecules in vegetables, fruits, cereal grains, and plant-based drinks such as tea and wine are known as phytochemicals. Because of their antioxidant and free radical scavenging properties, phytochemical ingestion is linked to a lower risk of various chronic illnesses [7]. Recent studies have also shown that they may have a role in better endothelial function and higher vascular blood flow [8]. About 10,000 phytochemicals have been identified, and many remain unknown [7]. Based on their chemical structure, phytochemicals can be broken into groups [9], as shown below in Figure 1. Response Surface Methodology (RSM) and Artificial Neural Networks (ANN) are mathematical and statistical approaches that may help assess observational evidence, determine the best scenario, and anticipate outcomes.
Depending on the degree of non-linearity and the initial guess, most classic optimisation strategies based on gradient methods can become trapped in local optima. As a result, they do not guarantee a global optimum and have restricted use. Non-traditional optimisation and search techniques based on natural phenomena, such as neural networks and evolutionary computing (simulated annealing, genetic algorithms, and differential evolution), have therefore been developed [11]. An artificial neural network (ANN) is a simplified representation of the structure of a biological neural network [12]. An artificial neuron (or simply a neuron) is the core processing element of an ANN. A biological neuron receives information from various sources, combines it, applies a non-linear operation, and outputs the final result [13]. The primary benefit of an ANN is that it does not need a mathematical model, since it learns from examples and detects patterns in a series of input and output data without making any assumptions about their nature or interrelationships [12]. An ANN is an excellent substitute for traditional empirical modelling based on polynomial and linear regression [14]. More information is needed on the phytochemical composition of Luffa cylindrica seed oil. This research therefore aims to use an artificial neural network to predict the phytochemical properties of Luffa cylindrica seed oil.

Statement of the problem
Alternative materials for manufacturing lubricating oil, paints, varnishes, medicines, transformer oils, cosmetics, etc. are needed more than ever. Conventional fuels, such as coal, natural gas, and other fossil fuels, are rapidly depleting; nonetheless, the world's reliance on these fuels is increasing. Many of these materials are byproducts of petroleum, which is non-renewable, non-biodegradable, and polluting; over-dependence on it has also resulted in shortages and the production of inferior goods. The use of an ANN was motivated by the need for an efficient model, since RSM may not be able to correctly evaluate the large data set necessary to achieve accurate and optimal results. An ANN was used in this study to predict the outcome of the oil extraction process and to characterise the phytochemical properties of this plant's seed oil.
The specific objectives of this research are as follows:
1. To extract, characterise, and evaluate the phytochemical characteristics of Luffa cylindrica seed oil.
2. To examine the influence of processing parameters on the extraction of Luffa seed oil and its phytochemical characteristics.
3. To use an ANN to predict the phytochemical characteristics of Luffa cylindrica seed oil.
This research aims to reduce reliance on imported oil by producing oil locally from Luffa seed, which can be used as a raw material in industrial applications, and to develop a new route from potential oil-producing sources. Over the years, researchers have struggled to create a model that can effectively predict the behaviour of the phytochemical characteristics of Luffa cylindrica seed oil; such models may drastically decrease time and operating costs in many technical areas. There is therefore a need to model Luffa seed oil extraction using an artificial neural network (ANN).
This research aims to extract Luffa cylindrica seed oil, analyse its phytochemical characteristics, investigate and optimise process factors, and forecast the phytochemical characteristics of Luffa seed oil using an Artificial Neural Network.

MATERIALS AND METHOD
Sample Collection. The Luffa samples were acquired from the National Root Crops Research Institute, Umudike, Abia State, in the South-East of Nigeria, and its surroundings. Both ripened and dried fruits of the plant were collected in large amounts. The samples were sorted, and the Luffa seeds were extracted from the gourds by hand. The samples were kept in an oven for a few hours to attain an equilibrium temperature with the environment before use. The seeds were winnowed to remove husks and dirt, sun-dried for easy removal of the shell, and then oven-dried at 60 °C to constant weight before being ground to increase the surface area for oil extraction.
The materials used for the experiment include the following: Oven, Grinder, Soxhlet Extractor, Reflux Condenser, Heating Mantle, and Round bottom flask.
Experimental design. The experiment was designed using Design Expert version 6.0.8, with a Box-Behnken experimental design employed to optimise oil extraction from Luffa cylindrica. The design used three factors at three levels, giving 17 experimental runs. The three independent factors were extraction time, extraction temperature and solvent ratio.
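As an illustration of how a three-factor Box-Behnken design arrives at 17 runs, the following minimal Python sketch builds the design in coded levels (-1, 0, +1). It is not the Design Expert output used in this study: the function names are hypothetical, and the use of five replicated centre points is an assumption, chosen because 12 edge points plus 5 centre points gives the 17 runs stated above. The temperature (60-80 °C) and time (4-6 h) ranges are those given in the extraction procedure; the solvent ratio is left in coded form because its levels are not stated.

```python
from itertools import combinations, product


def box_behnken(n_factors=3, n_center=5):
    """Return Box-Behnken runs in coded levels (-1, 0, +1)."""
    runs = []
    # For each pair of factors, take the four (+/-1, +/-1) combinations
    # while every other factor stays at its centre level (0).
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0] * n_factors
            row[i], row[j] = a, b
            runs.append(tuple(row))
    # Replicated centre points (the count of 5 is an assumption).
    runs.extend([(0,) * n_factors] * n_center)
    return runs


def decode(coded, low, high):
    """Convert a coded level to its actual value within [low, high]."""
    centre = (low + high) / 2
    half_range = (high - low) / 2
    return centre + coded * half_range


design = box_behnken()
# Columns: temperature (deg C), time (h), solvent ratio (coded level).
table = [(decode(t, 60, 80), decode(h, 4, 6), s) for t, h, s in design]

for run_number, row in enumerate(table, start=1):
    print(run_number, row)
```

In practice the run order would normally be randomised before the experiments are carried out.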
Extraction of oil from Luffa cylindrica using the solvent extraction method. The extraction of oil was carried out in the laboratory of the Department of Chemical Engineering, Michael Okpara University of Agriculture, Umudike, Abia State, Nigeria, using the technique given by [15]. The extraction was carried out in a Soxhlet apparatus of 250 cm³ capacity with analytical-grade n-hexane as the solvent. A prepared sample of 40 g of ground luffa seed was used; the powdered oil seed was put into the thimble, and the thimble was placed in the Soxhlet apparatus (Figure 2).

Figure 2 - Experimental apparatus for oil extraction
A round bottom flask containing a known volume of solvent (n-hexane) was placed on a heating mantle, which delivered heat at a temperature just below the solvent's boiling point. The Soxhlet apparatus was mounted on top of the flask, and the water inlet and outlet were connected to the condenser. The extraction time ranged from 4 to 6 hours and the extraction temperature from 60 to 80 °C. The experimental runs were carried out according to the design produced by Design Expert.
The solvent was collected after every experimental run by a distillation procedure, and the natural oil produced was weighed. The experiment was repeated for additional settings, and the % yield was determined.
Determination of oil yield. The powdered seed was weighed and put into a thimble. Soxhlet extraction was run for the time required by each experimental run, between 4 and 6 hours, using n-hexane as the solvent. The hexane extract was filtered and evaporated under vacuum to give a thick mass of oil; the oil was placed in a beaker of known weight and kept in a Griffin temperature-adjustable oven at 60-70 °C to evaporate the excess solvent. The Luffa seed oil thus obtained was kept in an air-tight container with no air gap and labelled appropriately. The percentage oil yield was measured as the ratio of the weight of oil recovered to the weight of the luffa seed sample before extraction. Oil yield was calculated using the formula employed by [16,17]:
$$\text{Oil yield (\%)} = \frac{\text{weight of oil recovered}}{\text{weight of seed sample before extraction}} \times 100$$
Screening of phytochemicals. Phenol Concentration Determination: The total phenol concentration measurement was adapted from the Folin-Ciocalteu technique [18]. About 0.1 g of the oil extract was weighed into a test tube, 1 ml of methanol was added, and the tube was placed in a shaking water bath, where it was shaken for 30 minutes at 40 °C.
The sample was withdrawn, and 1 ml of Folin-Ciocalteu reagent and 2 ml of 20% Na2CO3 were added. The mixture was allowed to rest for 10 minutes before being spun in a centrifuge for 20 minutes at 400 vpm. The absorbance was measured using a UV spectrophotometer at 625 nm. The standard curve was prepared from a series of gallic acid concentrations starting at 10 mg/l.
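The way a gallic acid standard curve is typically used to convert an absorbance reading into a concentration can be sketched as follows. This is an illustrative example under stated assumptions, not the calculation performed in this study: a linear calibration is assumed, the function names are hypothetical, and every concentration and absorbance value below is invented for the example (only the 10 mg/l starting level comes from the text).

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the calibration line to estimate concentration."""
    return (absorbance - intercept) / slope


# Hypothetical gallic acid standards (mg/l) and absorbances at 625 nm.
standards_mg_per_l = [10, 20, 30, 40, 50]
absorbances = [0.11, 0.21, 0.31, 0.41, 0.51]

slope, intercept = fit_line(standards_mg_per_l, absorbances)

# Hypothetical sample reading converted to gallic acid equivalents (mg/l).
sample_mg_per_l = concentration_from_absorbance(0.26, slope, intercept)
print(round(sample_mg_per_l, 2))
```

A sample whose absorbance falls outside the range of the standards would normally be diluted and measured again rather than extrapolated.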

ANN model development.
The artificial neural network (ANN) architecture was constructed in the MATLAB 8.4 (R2015b) software environment, where the training, validation and testing of the ANN model were carried out. A three-layer ANN was used, with a tangent sigmoid transfer function (tansig) at the hidden layer, a linear transfer function (purelin) at the output layer, and the Levenberg-Marquardt backpropagation method with 1000 iterations. The input layer corresponds to the three experimental parameters: temperature (°C), solvent ratio and time (minutes). The output layer gives the oil yield, terpineol concentration and phenol concentration (Figure 3).
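The structure described above (three inputs, a tangent sigmoid hidden layer and a linear output layer) can be sketched as a forward pass in plain Python. This is a minimal illustration and not the MATLAB model used in this study: the function names, the random weights and the scaling of inputs to [-1, 1] are all assumptions, time is expressed in hours for the example, and the untrained network produces meaningless predictions. It only shows how the layers are connected.

```python
import math
import random


def tansig(x):
    """Tangent sigmoid: 2 / (1 + exp(-2x)) - 1, which equals tanh(x)."""
    return math.tanh(x)


def scale(value, low, high):
    """Map a raw value from [low, high] onto [-1, 1] (assumed preprocessing)."""
    return 2.0 * (value - low) / (high - low) - 1.0


def init_network(n_in=3, n_hidden=10, n_out=3, seed=0):
    """Create small random weights and zero biases (illustrative values only)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
             for _ in range(n_out)]
    b_out = [0.0] * n_out
    return w_hidden, b_hidden, w_out, b_out


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Hidden layer uses tansig; output layer is linear (purelin)."""
    hidden = [tansig(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


# Temperature (60-80 deg C) and time (4-6 h) ranges are from the text;
# the solvent ratio is assumed to be already scaled to 0.0.
x = [scale(70.0, 60.0, 80.0), scale(5.0, 4.0, 6.0), 0.0]
network = init_network()
prediction = forward(x, *network)  # [yield, terpineol, phenol], untrained
print(prediction)
```

Setting n_out=1 gives the single-output variant, in which a separate network is trained for each response.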

Figure 3 - Proposed ANN structure
All the data from the extraction of Luffa cylindrica seed oil were randomly separated into three groups (training, validation and testing) in a ratio of 70, 15 and 15%, respectively. In this investigation, ten hidden neurons were employed as the default when establishing the best training method for the prediction. Networks with 1-15 neurons in the hidden layer and one neuron in the output layer were then applied, with data from the multiple factors supplied simultaneously.
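A random 70/15/15 division of the 17 experimental runs can be sketched as below. This is an illustrative example, not the routine used by the MATLAB toolbox: the function name, the fixed seed and the rounding rule (round the training and validation counts, give the remainder to testing) are assumptions, and a different rule would give slightly different group sizes for such a small data set.

```python
import random


def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    """Randomly split sample indices into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train * n_samples)
    n_val = round(val * n_samples)
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]  # remainder goes to testing
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(17)
print(len(train_idx), len(val_idx), len(test_idx))
```

With only 17 runs the validation and test groups contain very few points, so the reported errors for those groups can vary noticeably from one random split to another.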

RESULTS AND DISCUSSION
Oil yield analysis. The backpropagation algorithm was selected after comparing eleven algorithms. For all backpropagation (BP) algorithms, a three-layer ANN with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer was employed. Ten neurons were used in the hidden layer for all BP algorithms.
The benchmark comparison demonstrated that the Levenberg-Marquardt algorithm (LMA) could give a lower MSE than the other BP algorithms. As indicated in Table 1, the minimum MSE, approximately 4.66180 × 10⁻⁷, was achieved using the trainlm function. However, trainrp and conjugate gradient algorithms such as traincgf, traincgp and traincgb showed larger errors than the LMA, with the largest error, 0.4217, obtained with traingdx.
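For reference, the weight update usually associated with Levenberg-Marquardt training, together with the mean squared error it minimises, can be written as below. This is the general textbook form and is not taken from this study; the symbols are notation assumed here: w is the vector of network weights, e the vector of errors between targets and network outputs, J the Jacobian of e with respect to w, μ the damping parameter, I the identity matrix and N the number of training samples.

```latex
\mathbf{w}_{k+1} = \mathbf{w}_{k}
  - \left(\mathbf{J}^{\top}\mathbf{J} + \mu\,\mathbf{I}\right)^{-1}\mathbf{J}^{\top}\mathbf{e},
\qquad
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} e_{i}^{2}
```

When μ is small the step approaches a Gauss-Newton step, and when μ is large it approaches a short gradient-descent step, which is one reason the method tends to converge quickly on small data sets such as this one.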
The loss of optimality in the results produced by some BP training algorithms may be attributed to the combinatorial nature and non-linear structure of the experimental data. The complexity of the problem was thus confirmed by the results of the different training algorithms used in the benchmark comparison. The optimal architecture of the ANN model and its parameters were established based on the lowest MSE of the training and prediction sets. In the optimisation of the network, two neurons were employed in the hidden layer as an initial estimate. As the number of neurons increased, the network produced several local minima, and different MSE values were obtained for the training set. The criteria for selecting the optimal ANN structure are the MSE of the training data and the correlation coefficient (R²). Table 2 shows the relationship between the number of neurons, R² and MSE for the ANN. Figure 4 displays the MSE versus the number of epochs for the optimal ANN models.
Phenol analysis. The best-suited backpropagation algorithm was selected by comparing eleven backpropagation algorithms. For all backpropagation (BP) algorithms, a three-layer ANN with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer was used. All BP algorithms used ten neurons in the hidden layer. The comparison indicated that the Levenberg-Marquardt algorithm (LMA) could give a lower MSE than the other BP algorithms. As shown in Table 3, the lowest MSE, 2.56353 × 10⁻⁹, was obtained with the trainlm function. However, trainrp and conjugate gradient algorithms such as traincgf, traincgp and traincgb exhibited larger errors than the LMA. The loss of optimality in the results produced by some BP training algorithms may be attributed to the combinatorial nature and non-linear structure of the experimental data. Hence, the complexity of the problem was confirmed by the results of the different training algorithms used in the benchmark comparison.
The optimal architecture of the ANN model and its parameters were established based on the lowest MSE of the training and prediction sets. In the optimisation of the network, two neurons were employed in the hidden layer as an initial estimate. As the number of neurons increased, the network produced several local minima, and different MSE values were obtained for the training set. The criteria for selecting the optimal ANN structure are the MSE of the training data and the correlation coefficient R².
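The two selection criteria can be sketched in Python as follows. This is an illustration under stated assumptions rather than the MATLAB calculation used in this study: the function names are hypothetical, R² is computed here as the coefficient of determination (the text calls it the correlation coefficient), and the observed and predicted values are invented for the example. The dictionary of MSE values per hidden-layer size uses the phenol figures reported in this section.

```python
def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def best_hidden_size(mse_by_neurons):
    """Return the number of hidden neurons with the lowest MSE."""
    return min(mse_by_neurons, key=mse_by_neurons.get)


# Hypothetical observed and predicted responses, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(mse(observed, predicted), r_squared(observed, predicted))

# Phenol MSE values reported in this section for selected hidden-layer sizes.
phenol_mse = {2: 0.514682, 3: 169.61173, 4: 22.95281,
              6: 5.94709e-6, 11: 3.74065e-17}
print(best_hidden_size(phenol_mse))
```

Selecting on training MSE alone favours larger networks, so the validation and test errors are worth checking alongside it before an architecture is adopted.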
Table 4 shows the relationship between the number of neurons, R² and MSE for the ANN. The results indicate that the MSE decreases as the number of neurons increases.
MSE values of 0.514682, 169.61173 and 22.95281 were found with 2, 3 and 4 neurons, respectively, indicating a poor fit to the experimental results. The MSE fell dramatically to 5.94709 × 10⁻⁶ with 6 neurons, while the lowest MSE and best correlation coefficient (R²), 3.74065 × 10⁻¹⁷ and 0.999999, were found when the number of hidden neurons was raised to 11. Therefore, the neural network with 11 hidden neurons was the best case. Training with the LMA (trainlm) was stopped after 15 iterations because the gap between the training error and the validation error began to widen.
Terpenol analysis. The backpropagation algorithm was selected by comparing eleven algorithms. For all backpropagation (BP) algorithms, a three-layer ANN with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer was used. All BP algorithms used ten neurons in the hidden layer.
The benchmark comparison study showed that the Levenberg-Marquardt algorithm (LMA) could provide a smaller MSE than the other BP algorithms. As shown in Table 5, the smallest MSE, about 4.97667 × 10⁻⁶, was obtained with the trainlm function. However, trainrp and conjugate gradient algorithms such as traincgf, traincgp and traincgb produced larger errors than the LMA. The loss of optimality in the results produced by some BP training algorithms can be attributed to the combinatorial nature and non-linear structure of the experimental data. Hence, the complexity of the problem was confirmed by the results of the various training algorithms used in the benchmark comparison. Figure 6 shows the MSE versus the number of epochs for the optimal ANN models.

Figure 6 - Training, validation, and test mean squared errors using the Levenberg-Marquardt algorithm for terpenol

The comparison (Table 7) of the response surface methodology and the artificial neural network showed that the correlation coefficient for the oil yield was 0.8395 for RSM and 0.99999 for the ANN. The FT-IR result for the oil from Luffa cylindrica, as shown in Figure 7, showed a sharp peak at 3008.0, indicating the alkene group, which is an unsaturated hydrocarbon.

CONCLUSIONS
A three-layer ANN with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer was proposed to estimate the oil yield, phenol and terpenol content of Luffa cylindrica seed oil.
The benchmark comparisons conducted with ten hidden neurons showed that the Levenberg-Marquardt algorithm was the best of the eleven backpropagation algorithms used for the oil yield, phenol and terpenol content, because its R² was closest to 1 and its mean squared error closest to zero. The comparative study of the response surface methodology and the artificial neural network indicated that the correlation coefficient for the oil yield was 0.8395 for RSM and 0.99999 for the ANN. The RSM correlation coefficients for phenol and terpenoid were 0.9942 and 0.9868, respectively, while the ANN correlation coefficient was 0.99999 for both. The study showed that the ANN model predictions correlate more closely with the experimental results than those of the RSM model. The FT-IR result for the Luffa cylindrica seed oil indicated a high degree of unsaturated hydrocarbons and esters. This makes the oil well suited for use in the paint industry as a drying agent, in cosmetics manufacturing and in soap production; it may also be suitable for animal feed.
The phytochemical properties of an oil are essential in determining its medicinal effects as well as its uses in various fields; hence it is recommended that further studies be carried out on predicting the phytochemical properties of Luffa cylindrica seed oil using another black-box model such as the Adaptive Neuro-Fuzzy Inference System (ANFIS).
In addition, a simulation based on the ANN model may contribute to a better understanding of the dynamic behaviour of the process.