Investigating new sources of vegetable oils is particularly relevant for countries like Morocco, where drought and heat limit the production of traditional oilseed crops, so that domestic production meets only 20% of national demand. This dissertation characterizes balanites (Balanites aegyptiaca), cactus (Opuntia ficus-indica), and date palm (Phoenix dactylifera) as unconventional sources of vegetable oil in the Moroccan desert, focusing on their chemical composition and authenticity assessment. The first objective was to define the chemical identity of these oils by analysing their main lipid groups: fatty acids (FAs), triacylglycerols (TAGs), tocochromanols, and phytosterols. In balanites kernel oil (BKO) and cactus seed oil (CO), linoleic acid was the dominant fatty acid, followed by oleic, palmitic, and stearic acids. A distinctive feature of CO was its relatively high content (~5%) of vaccenic acid, an isomer of oleic acid. Date seed oil (DSO), by contrast, exhibited a more diverse FA profile, including oleic, lauric, myristic, palmitic, linoleic, and stearic acids, which partly explains its high oxidative stability. TAG analysis supported the FA results, with DSO showing the most diverse profile (34 TAGs). Regarding tocochromanols, BKO was rich in α-tocopherol and CO in γ-tocopherol, while DSO was characterized by a high tocotrienol content, especially α-tocotrienol. All three oils shared a similar phytosterol composition, dominated by β-sitosterol, campesterol, and Δ5-avenasterol, with CO showing the highest total content. To explore the effect of geographical origin, it was hypothesized that oil composition varies by collection area and that some of this variation could be linked to climate conditions such as water deficit and/or temperature. Despite limitations in regional sample size, multivariate statistical analysis was used to identify origin-related trends. The degree and nature of these variations depended on the region and the oil type.
For instance, Moroccan BKO samples differed clearly from those collected in Sudan and Mauritania, mainly due to major compounds from all four chemical classes (FAs, TAGs, tocochromanols, and phytosterols). In contrast, for DSO at a smaller geographical scale (three Moroccan palm groves), minor TAGs in particular were key to distinguishing origin. These findings were discussed in relation to known biosynthetic pathways and stress-related enzymatic responses. Still within the scope of geographical origin, an untargeted metabolomic approach was applied to DSO as a case study to determine whether the polar metabolite profile could reflect geographical variation by revealing clustering trends and potential origin-specific markers not captured by classical lipid analysis. Samples from three Moroccan palm groves were analysed by UHPLC-ESI-QTOF-MS in both positive and negative ionisation modes. PCA results showed a clustering trend similar to that observed with lipid composition, with samples from Allougoum forming a distinct group compared to Alnif and Errachidia. Based on these results, an OPLS-DA model was used to identify the discriminative features. Among the top 50 discriminative features, 25 metabolites from various chemical classes were tentatively identified, with hydroxy fatty acids being the most represented class. These compounds, reported for the first time in DSO, expand the current understanding of its chemical profile. The data processing workflow was based entirely on open-source tools and can be readily applied to other oils such as BKO and CO. For the second aspect of authenticity, adulteration detection, the study aimed to build a machine learning-based model that does not rely on large sets of physical mixtures. Instead, simulation methods were used to generate synthetic data for model training and testing. The hypothesis was that integrating analytical data with simulation and machine learning could enable reliable adulteration detection.
CO was used as a test case, with refined sunflower oil (SO) as the adulterant. After analysing the FA, TAG, and tocochromanol profiles of pure CO and SO, two simulation methods were tested: Monte Carlo (MC) and Conditional Tabular Generative Adversarial Network (CTGAN). MC performed consistently well, even with small datasets, whereas CTGAN was less effective. The simulated oils were then combined using a weighted sum formula to create multiple levels of adulteration. These simulated mixtures were used to train Random Forest (RF) and Neural Network (NN) models. RF outperformed NN, achieving 94% accuracy on simulated data and 90% on real test samples, with better interpretability and lower computational demand. Thus, combining MC simulation with RF is proposed as a robust approach for oil adulteration detection. The methodology, implemented in Python and shared as open-source code, can be easily adapted to other oils with minimal retraining. Overall, this dissertation presents a multi-approach framework for assessing the authenticity of Moroccan oils, integrating lipid profiling, untargeted metabolomics, and machine learning. It advances understanding of chemical composition, demonstrates traceability of origin, and offers a robust strategy for detecting adulteration.
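As an illustration of the simulation step described for the adulteration model, the following minimal sketch draws synthetic pure-oil profiles by Monte Carlo sampling and blends them with a weighted sum. It is not the dissertation's published code. The profile means and standard deviations, the assumption of independent normal distributions, and the names `PURE_CO`, `PURE_SO`, `simulate_oil`, `mix`, and `simulate_mixtures` are all hypothetical placeholders chosen for this example.

```python
import random

# Hypothetical (mean, standard deviation) pairs in % of total fatty acids.
# Placeholder numbers for illustration only, not measured values.
PURE_CO = {"linoleic": (60.0, 2.0), "oleic": (20.0, 1.5), "palmitic": (12.0, 1.0)}
PURE_SO = {"linoleic": (55.0, 3.0), "oleic": (30.0, 2.5), "palmitic": (6.0, 0.5)}


def simulate_oil(profile, rng):
    """Draw one synthetic sample, assuming independent normal variables clipped at zero."""
    return {name: max(0.0, rng.gauss(mean, sd)) for name, (mean, sd) in profile.items()}


def mix(authentic, adulterant, fraction):
    """Weighted sum: (1 - fraction) * authentic + fraction * adulterant, per variable."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return {
        name: (1.0 - fraction) * authentic[name] + fraction * adulterant[name]
        for name in authentic
    }


def simulate_mixtures(levels, n_per_level, seed=0):
    """Return (adulteration level, simulated profile) pairs for every requested level."""
    rng = random.Random(seed)  # fixed seed keeps the synthetic dataset reproducible
    rows = []
    for level in levels:
        for _ in range(n_per_level):
            co = simulate_oil(PURE_CO, rng)
            so = simulate_oil(PURE_SO, rng)
            rows.append((level, mix(co, so, level)))
    return rows


if __name__ == "__main__":
    for level, sample in simulate_mixtures([0.0, 0.1, 0.3], n_per_level=2):
        print(level, {name: round(value, 2) for name, value in sample.items()})
```

The labelled rows produced this way could then serve as training data for a classifier such as a Random Forest. Sampling each variable independently ignores correlations between compounds, so a fuller treatment would need to account for them.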
Said El Harkaoui studied this question.