Selections of donors depending on agronomic traits, seed yield components, and fatty acid profile for genetic improvement of Carthamus using stepwise multiple regression

Abstract – Safflower (Carthamus tinctorius L.) is of potential interest to agriculture mainly because of the variability in the fatty acid composition of its seed oil. The purpose of this study was to evaluate various exotic safflower genotypes for agronomic traits, seed yield components and fatty acid content. For this purpose, plant height (cm), number of first, second and third branches/plant, seed yield/plant (g), thousand-seed weight (g), oil content (%), and fatty acid composition were investigated. Stepwise multiple regression analysis was used to develop a fitted equation to predict seed yield/plant. Analysis of variance of agronomic traits showed large differences among genotypes. Although genotype K2 had the highest seed oil content (42.8%), K13 had the highest percentage of monounsaturated fatty acids (MUFA). The high-oleic type of safflower oil was found in K13 and K26, which can be used as a source of oil quality in plant breeding. Broad-sense heritability was high and ranged from 82% for number of secondary branches (NSB) and number of third branches (NTHB) to 99% for seed index (1000-seed weight) and oil content. High genetic advance was found for plant height (PH), seed yield/plant (SYP) and 1000-seed weight, estimated at 43.41 cm, 21.34 g and 17.62 g, respectively. Stepwise multiple regression analysis indicated that 99.2% of the total variation in seed yield/plant could be explained by variation in yield of secondary branches (YSB), yield of first branches (YFB), yield of third branches (YTHB), plant height (PH) and spiny as a dummy variable, while 23.56% of the total variation in seed oil percent could be explained by variation in yield of first branches (YFB), seed index and spiny as a dummy variable. The information reported here may be a useful tool for selecting parents in safflower breeding programs.

Keywords: safflower / heritability / fatty acid profile / stepwise regression analysis / spiny

1 Introduction
Safflower (Carthamus tinctorius L.) is an important oilseed crop cultivated in semi-arid regions, including Egypt. Egypt suffers from a severe shortage of edible oils for human consumption. In 2017, total local production of plant oils was about 250,000 t, whereas consumption was about 1,700,000 t. This large gap (85.3%) between production and consumption has to be covered by imports to meet the requirements of the local market (FAS, 2017). Safflower flowers have been used in Egypt as a source of textile dyes as well as for food and medicinal purposes (Chapman and Burke 2007). Safflower is a valuable oilseed crop because its oil contains a high proportion of oleic and linoleic acids and is of high quality (Khan et al., 2009). Safflower breeding programs can improve diversity within species and cultivars (Kemal and Hailu, 2019). The efficiency of plant breeding strategies depends on information about the inheritance of agronomic traits. Genetic improvement of safflower relies on the accumulation of desirable alleles from the parents (Golkar et al., 2012; Kemal and Hailu, 2019). Selection is useful for improving yield components and for reducing morphological and physiological constraints on yield, especially when the traits are highly heritable (Bleidere et al., 2012; Tahernezhad et al., 2018).
Heritability and variance components for seed weight, seed yield, oil content and plant height have been estimated in safflower (Ramachandram and Goud 1981). The very high linoleic acid content of safflower has been confirmed to be heritable (Hamdan et al., 2008). Seed yield and oil content are the most important criteria for improving safflower, so the relationship between seed yield and oil yield should be studied before any breeding program is initiated (Hamrouni et al., 2004; Liu et al., 2016).
In recent years, safflower oil has been used for a variety of edible and industrial purposes. Nutritional studies suggest that safflower oil is promising in both its high-linoleic and high-oleic forms and can be blended with other oils (Arslan and Culpan, 2018). Plant selection and genetic improvement depend on adequate heritable variation among the parents used in breeding programs and on the nature of the gene action involved in the expression of the quantitative characters concerned (Acquaah, 2012; Golkar et al., 2017).
Most safflower genotypes are spiny, with many sharp spines on the bracts and leaves (Bradley et al., 1999). Singh (2007) reported that safflower varieties that are almost totally free of spines have been developed for hand harvesting in certain geographic zones. Golkar et al. (2010) showed that the oil content of spiny genotypes is higher than that of spineless genotypes. The severe shortage of edible oils in Egypt and worldwide calls for improvement in the production and quality of oil crops. The aim of this study was to evaluate various exotic safflower genotypes for agronomic traits, components of seed yield/plant, genetic parameters and fatty acid content. Stepwise regression was performed to evaluate and predict seed yield/plant and oil percent using quantitative traits (plant height, yield of first, second and third branches/plant, seed yield/plant, 1000-seed weight, oil content) and a qualitative trait (spiny, as a dummy variable).

Plant material and experimental design
The exotic safflower genotypes used here had been cultivated in earlier experiments conducted at the Demo Farm (Southeast Fayoum; 29°17'N; 30°53'E), Faculty of Agriculture, Fayoum University, Egypt. Eighteen safflower genotypes were kindly provided by the gene bank of the United States Department of Agriculture (USDA) for use in this study (Tab. 1). The experiments were conducted during the 2016-2017 and 2017-2018 growing seasons. The genotypes were planted in a randomized complete block design with three replications. Sowing was done by hand in 6 m² plots, each consisting of five rows, 3 m long and 30 cm apart; plants were 25 cm apart within rows. Only the three middle rows were harvested (area 3.6 m²).
Analysis of variance was performed separately for each evaluation trial, and the results were then combined over the two years. Data on plant height (cm), number of first, second and third branches per plant, seed yield/plant (g), 1000-seed weight (g), and oil content (%) were obtained for each trial. These data were used to derive genetic parameters such as genotypic (σ²g) and phenotypic (σ²p) variances, phenotypic coefficient of variation (PCV), genotypic coefficient of variation (GCV), broad-sense heritability (h²b) and genetic advance (GA), as suggested earlier (Burton, 1952; Johnson et al., 1955).
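As an illustration of how such parameters are commonly derived from ANOVA mean squares, the following is a minimal Python sketch. It assumes a single trial in a randomized complete block design, variances expressed on a plot basis, and the formulas usually attributed to Burton (1952) and Johnson et al. (1955). The function name, the input values and the selection differential k = 2.06 (5% selection intensity) are illustrative assumptions, not values from this study; a combined analysis over years would also need the genotype × year term.

```python
import math

def genetic_parameters(ms_genotype, ms_error, replications, grand_mean, k=2.06):
    """Estimate genetic parameters from single-trial RCBD mean squares.

    k is the standardized selection differential (2.06 at 5% selection
    intensity); this default is an assumption, not taken from the study.
    """
    var_g = (ms_genotype - ms_error) / replications  # genotypic variance
    var_e = ms_error                                 # environmental (error) variance
    var_p = var_g + var_e                            # phenotypic variance
    h2_b = var_g / var_p                             # broad-sense heritability (0-1)
    gcv = math.sqrt(var_g) / grand_mean * 100        # genotypic CV (%)
    pcv = math.sqrt(var_p) / grand_mean * 100        # phenotypic CV (%)
    ga = k * math.sqrt(var_p) * h2_b                 # genetic advance (trait units)
    return {"var_g": var_g, "var_p": var_p, "h2_b": h2_b,
            "gcv": gcv, "pcv": pcv, "ga": ga}

# Hypothetical mean squares for one trait (not data from this study).
params = genetic_parameters(ms_genotype=130.0, ms_error=10.0,
                            replications=3, grand_mean=100.0)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

Because the phenotypic variance is the genotypic variance plus a non-negative error term, PCV computed this way can never be smaller than GCV.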

Extraction of crude oil from seeds
Safflower seeds were carefully cleaned to remove any foreign matter, dried to an appropriate moisture content and then crushed with a blender (type IKA A 11 basic). The oil was extracted by a conventional method. The crushed seeds were soaked in purified n-hexane for 24 hours at room temperature. The miscella was separated from the cake by filtration through Whatman No. 1 filter paper. This process was repeated 3 times. The filtered miscella fractions were combined and the oil was recovered by evaporating the n-hexane under vacuum at 50°C using a rotary evaporator (type Buchi R-114). The oil obtained was dried over anhydrous sodium sulfate, its mass was determined gravimetrically, and it was then stored in dark brown bottles at −20°C until analysis (Velioglu et al., 2017).
The experimental assays were completely randomized and performed in triplicate.

Gas chromatography analysis of oils
The fatty acid profile of the oil samples was determined by gas chromatography (GC) after derivatization to fatty acid methyl esters (FAME). The methyl esters were prepared according to the method of Mohdaly et al. (2015). Briefly, 1 mL of BF3/methanol (14%) and 1 mL of hexane were added to a 100 mg sample. The tube was then vortexed and kept under nitrogen for 60 min at 100°C. The fatty acid esters were extracted by adding 1 mL of hexane and washing with 2 mL of distilled water. After centrifugation (4500 rpm, 10 min, 20°C), the supernatant was recovered in vials and injected into the GC column as described by Mohdaly et al. (2015). The GC instrument (GC-2010 Plus, Shimadzu) was equipped with a flame ionization detector and a capillary column (60 m length, 0.25 mm internal diameter, 0.20 µm film thickness). The samples were separated on the column using helium as the carrier gas at a flow rate of 0.8 mL/min. The sample was injected in split mode (50:1). The oven temperature was held at 120°C for 2 min, raised to 180°C and held for 2 min, and then kept at 220°C for 25 min. Peak integration was performed with the GC software, and peaks were identified by comparing their retention times with those of standard fatty acids. The fatty acid composition was expressed as a percentage (%, w/w) of all the fatty acids detected.

Statistical analysis
The experimental design was a randomized complete block design with three replications. Statistical analyses were carried out using IBM® SPSS® Statistics Version 25 (2017) for Windows (SPSS Inc., IBM Corporation, NY, USA). Data were tested for normality with the Shapiro-Wilk test (Shapiro and Wilk, 1965; Razali and Wah, 2011). Bartlett's test was used to check the homogeneity of variances across the two years before the combined analysis. Data were then subjected to combined analysis of variance (ANOVA), and a P-value of <0.05 was considered statistically significant. The Bonferroni post-hoc correction was used to compare genotype means (Abdi, 2007).
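To make the Bonferroni adjustment concrete, the short Python sketch below computes the per-comparison significance threshold. It assumes that all pairwise comparisons among the 18 genotypes are tested, which is an assumption about how the post-hoc test was applied; the helper name `is_significant` is hypothetical.

```python
from math import comb

n_genotypes = 18   # number of genotypes compared in this study
alpha = 0.05       # family-wise significance level used in the study

# Assumption: every pair of genotypes is compared once.
n_pairs = comb(n_genotypes, 2)
alpha_adjusted = alpha / n_pairs   # per-comparison threshold after Bonferroni

def is_significant(p_value, family_alpha=alpha, n_tests=n_pairs):
    """Return True if a raw p-value survives the Bonferroni correction."""
    return p_value < family_alpha / n_tests

print(n_pairs, alpha_adjusted)
```

With many genotypes the adjusted threshold becomes very strict, which keeps the family-wise error rate at the nominal level at the cost of statistical power.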

Multiple regression analysis
The primary selection criteria in safflower breeding are seed yield/plant and oil content. To obtain a prediction model with seed yield/plant as the dependent variable and seven traits as independent variables, stepwise multiple regression analysis (Kutner et al., 2005) was employed. Six of the seven independent variables were quantitative traits (YSB, YFB, YTHB, PH, oil percent and 1000-seed weight), while leaf spininess was used as a categorical predictor. Stepwise regression was also used to determine which variables contributed most significantly to oil percent as the dependent variable. In this case the six quantitative independent variables were YSB, YFB, YTHB, PH, seed yield/plant and 1000-seed weight, again with leaf spininess as a categorical predictor. For both seed yield/plant and oil percent, leaf spininess was entered in the regression as a dummy variable (spiny = 1 and spineless = 0) to reflect group membership.
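The general idea of selecting predictors step by step, with spininess coded as a 0/1 dummy, can be sketched in Python as follows. This is a simplified forward-selection illustration, not the SPSS procedure used in the study: statistical packages typically apply F-test entry and removal criteria, whereas this sketch simply adds the predictor that raises R² the most and stops when the gain falls below a hypothetical threshold (`min_gain`). The data are invented so that the response depends only on YSB, YFB and spiny.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Fit y ~ intercept + X; return (coefficients, residual sum of squares)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss

def forward_stepwise(data, y, min_gain=0.01):
    """Greedy forward selection: add the predictor that most raises R^2."""
    mean_y = sum(y) / len(y)
    tss = sum((v - mean_y) ** 2 for v in y)
    chosen, r2 = [], 0.0
    while len(chosen) < len(data):
        best = None
        for name in data:
            if name in chosen:
                continue
            X = list(zip(*(data[c] for c in chosen + [name])))
            _, rss = ols(X, y)
            cand = 1 - rss / tss
            if best is None or cand > best[1]:
                best = (name, cand)
        if best[1] - r2 < min_gain:
            break   # no remaining predictor improves the fit enough
        chosen.append(best[0])
        r2 = best[1]
    return chosen, r2

# Hypothetical data for eight plants (not from this study).
data = {
    "YSB": [5, 8, 6, 10, 7, 12, 9, 11],
    "YFB": [4, 2, 6, 3, 8, 5, 7, 1],
    "PH": [100, 100, 80, 80, 80, 80, 100, 100],
    "spiny": [1, 0, 1, 0, 0, 1, 1, 0],   # dummy: spiny = 1, spineless = 0
}
y = [2 + a + 0.5 * b + c
     for a, b, c in zip(data["YSB"], data["YFB"], data["spiny"])]

chosen, r2 = forward_stepwise(data, y)
print(chosen, round(r2, 4))
```

In this toy example the coefficient of the dummy variable is the expected difference in the response between spiny and spineless plants when the other predictors are held constant.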

Agro-morphological traits
Analysis of variance for plant height (PH) showed large differences among genotypes. PH ranged from 67.5 cm in K27 to 121.8 cm in K5. Notable variation in PH has also been reported by Baye and Becker (2005). The analysis of variance revealed significant differences among genotypes for all traits. The highest values of NSB, SYP and YSB were obtained from genotype K26, the highest values of NFB and YFB from genotype K6, the highest values of NTHB and YTHB from genotype K8, the highest 1000-seed weight from genotype K4, and the highest oil percent from genotype K2 (Tab. 2). In this regard, Babaoglu and Guel (2015), Erbas and Baydar (2017) and Kose et al. (2018) reported high variation in agronomic traits and seed yield among selected Carthamus tinctorius lines.
The oil content of the genotypes tested varied from 20.1 to 42.8% (Tab. 2). Genotype K2, which is a good source for oil production, recorded the highest oil content (42.8%), while the lowest oil content was recorded in genotype K5 (20.1%). The safflower samples in the current study showed higher oil contents than those reported by Arslan and Culpan (2018), who found that the oil content of Turkish safflower genotypes ranged between 15.58 and 37.42%. In another study by Liu et al. (2016), seed oil content ranged from 20% to 45%, and between 28 and 32% for the C. tinctorius varieties. This variability could be due to environmental, varietal, genetic or agronomic effects.

Safflower oil fatty acids changes
The quality of the oil is a major concern for consumers, and the fatty acid profile is of special importance for oil quality. The fatty acid composition of an oil has notable effects on its physical and chemical characteristics and on its frying performance. In this study, we therefore analysed the oils to determine their fatty acid content. The fatty acid composition of the oil of the various safflower genotypes is given in Table 3. The findings show that, on average, palmitic, stearic, oleic and linoleic acids made up over 98.6% of total lipids, and that oleic and linoleic acids, which are proven to be healthy sources of lipids for the human body, made up more than 90% of total fatty acids. Knowing the oleic and linoleic acid content of the seed oil is therefore extremely important for characterizing a safflower genotype, from both an agronomic and an economic point of view. Differences in oleic and linoleic acid content among safflower genotypes were more pronounced than those in palmitic and stearic acids. Table 3 shows that oleic acid (18:1n-9) predominated in the K13 and K26 samples (∼70%), followed by linoleic acid (18:2n-6; ∼21%), and that there was a significant difference between the two genotypes for these fatty acids. They had slightly lower palmitic and stearic acid contents, but a higher oleic and linoleic acid content and a higher proportion of monounsaturated fatty acids (MUFA). In K13 and K26, these four fatty acids made up 99.18 and 99.27% of the seed oil, respectively. The other fatty acids, including myristic (C14:0), palmitoleic (C16:1), arachidic (C20:0), paullinic (C20:1) and behenic (C22:0), ranged from 0.06 to 0.29% in total. In the other genotypes, by contrast, the main fatty acid in the seed oil was linoleic acid (75.33 to 84.68%), followed by oleic acid (6.76 to 15.87%) and the saturated fatty acids palmitic (5.10 to 7.63%) and stearic (1.81 to 6.82%), and the differences among them were significant.
La Bella et al. (2019) demonstrated that the fatty acid composition of safflower is not constant but depends on cultivation practices and genotype, and varies mainly from year to year. Arslan and Culpan (2018) reported high variability in the seed oil fatty acids of safflower, with an average composition of 44.4% oleic acid (from 13.97 to 74.74%) and 41.0% linoleic acid (from 12.21 to 69.83%). The fatty acid profile of the safflower genotypes in our study was in agreement with that reported by Camas et al. (2007), who found high linoleic (75-80%) and low oleic acid (10-15%) contents in some safflower genotypes, and low linoleic (12-30%) and high oleic acid (64-83%) contents in others.
Linoleic acid is considered essential, as it cannot be synthesized by humans and must be obtained from food. Consumption of linoleic acid (18:2n-6) is widely thought to reduce LDL and total cholesterol levels. The content of the highly unsaturated linolenic acid (C18:3) in the seed oil of the samples and of the commercial oil was very low (less than 1.06%).
The oil stability index (18:1/18:2) of the samples varied from 0.08 to 0.21 in the linoleic types and from 3.25 to 3.35 in the oleic types. The oxidative stability of the oils was inversely related to linoleic acid content, so the high-oleic types (K13 and K26) should be relatively stable and oxidation-resistant. Based on the results of this study, it can be hypothesized that the K13 and K26 genotypes are heterozygous dominant, while the other genotypes are homozygous dominant, with respect to fatty acid composition. The oils of genotypes K1 and K9 were high in erucic acid, with values of 3.24 and 2.90%, respectively. According to US law, the permitted level of this fatty acid is 2% of total fatty acids, since high levels of erucic acid are harmful to human health and can damage the heart (Hossain et al., 2019).
The highest total proportion of monounsaturated fatty acids (MUFA) was found in the oil of genotype K13, while the highest percentage of polyunsaturated fatty acids (PUFA) was found in the oil of genotype K17. In the 18 genotypes, the proportion of unsaturated fatty acids was 87% (Tab. 4). The ratio of polyunsaturated to saturated fatty acids (PUFA/SFA) also varied among the genotypes. The ratio of saturated to unsaturated fatty acids (S/U), a widely used criterion for describing the nutritional value of a fat, was low (from 0.08 to 0.16), suggesting the potential of these oils as edible oil sources, in line with Zahran et al. (2020), who stated that UFA have a more favourable effect on health than SFA. The fatty acid profiles of the linoleic-type samples and of the commercial oil were almost the same, suggesting that the seed oil of these genotypes is suitable for human use and for industrial purposes. In addition, the high-oleic safflower oil samples (K13 and K26) are considered very advantageous for health, as this class of fatty acids is suitable for cholesterol-lowering diets and for frying and frozen food preparation. The quality and value of an oil are determined by its fatty acid composition; a good oil is rich in linoleic and oleic acids. In the current study, the high variability among safflower genotypes indicated the presence of high genetic variation, which is a good basis for safflower selection and improvement. Selection for a single principal trait has the greatest impact on the genetic improvement of crops.
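The indices discussed in this section (MUFA, PUFA, SFA, S/U, PUFA/SFA and the 18:1/18:2 ratio) follow directly from a fatty acid profile. The Python sketch below shows one way to compute them; the profile values are hypothetical and only loosely resemble a high-oleic oil, they are not taken from Tables 3 or 4.

```python
# Hypothetical fatty acid profile (% of total fatty acids); illustrative only.
profile = {
    "C16:0": 5.0,    # palmitic (saturated)
    "C18:0": 2.0,    # stearic (saturated)
    "C18:1": 70.0,   # oleic (monounsaturated)
    "C18:2": 22.0,   # linoleic (polyunsaturated)
    "C18:3": 1.0,    # linolenic (polyunsaturated)
}

def double_bonds(name):
    """Number of double bonds encoded after the colon, e.g. 'C18:2' -> 2."""
    return int(name.split(":")[1])

sfa = sum(v for k, v in profile.items() if double_bonds(k) == 0)
mufa = sum(v for k, v in profile.items() if double_bonds(k) == 1)
pufa = sum(v for k, v in profile.items() if double_bonds(k) >= 2)

s_u_ratio = sfa / (mufa + pufa)                       # saturated / unsaturated
pufa_sfa = pufa / sfa                                 # PUFA / SFA
oleic_linoleic = profile["C18:1"] / profile["C18:2"]  # 18:1 / 18:2 stability index

print(f"SFA={sfa:.1f}% MUFA={mufa:.1f}% PUFA={pufa:.1f}%")
print(f"S/U={s_u_ratio:.3f} PUFA/SFA={pufa_sfa:.2f} 18:1/18:2={oleic_linoleic:.2f}")
```

A high 18:1/18:2 ratio together with a low S/U ratio is the pattern the text associates with the high-oleic genotypes.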

3.3 Genetic variability, heritability and genetic advance of traits
Table 5 presents broad-sense heritability, genetic advance, genotypic coefficient of variation (GCV), and phenotypic coefficient of variation (PCV). Broad-sense heritability ranged from 82% in NSB and NTHB to 99% in 1000-seed weight and oil content (Tab. 5). High estimates of broad-sense heritability for the traits under study indicate that dominance and/or epistatic variance may contribute to the control of these traits (Mather and Jinks, 1982).
Selection is likely to be more successful for the genetic improvement of traits when heritability is high (Singh and Pawar, 2005; Mohammadi and Pourdad, 2009).
The high heritability values for plant height and 1000-seed weight suggest that these traits are under strong genetic control and are least influenced by the environment (Adhikari et al., 2018; Kose et al., 2018; Tahernezhad et al., 2018).
For the majority of traits, the genotypic coefficient of variation (GCV) ranged from 15.54 in oil content to 149.15 in NTHB, and the phenotypic coefficient of variation (PCV) ranged from 15.55 in oil content to 166.96 in NTHB. A PCV higher than the GCV indicates the influence of the environment on the trait. The remaining traits recorded moderate to low GCV values.
The difference between PCV and GCV reflects the environmental influence on the traits. High genetic advance combined with high heritability suggests a role of additive genes in regulating the inheritance of these traits, which could therefore be improved by selection (Yassein, 2013; Minnie et al., 2018). Although coefficients of variation measure the magnitude of variability among genotypes, estimates of heritability and genetic advance are important in plant breeding, as they provide necessary information for designing the most successful breeding program.

Estimates of expected genetic advance
Genetic advance ranged between 4.1 (NTHB) and 43.41 (PH), in agreement with the findings of Baye and Becker (2005). Table 5 provides the predicted genetic advance for the characters evaluated. High genetic advance was found in PH, SYP and 1000-seed weight (43.41, 21.34 and 17.62, respectively). High heritability combined with low genetic advance, as in NTHB and YTHB, suggests non-additive gene action, so selection in early segregating generations may not be effective, as verified by Chand et al. (2008).
Genetic advance together with heritability (Johnson et al., 1955) is one of the most important criteria used in the selection process. Genetic advance is useful for explaining the type of gene action involved in the control of a trait: a high genetic advance is indicative of additive gene effects, whereas low values are indicative of non-additive gene effects. The broad-sense heritability of a trait indicates the share of phenotypic variation that is attributable to genetic effects rather than to the environmental impact on the genotypes (Eshghi et al., 2012; Johnson et al., 1955).

Stepwise analysis
The fitted equations, coefficient of determination (R²), adjusted coefficient of determination (R²-adj), P-value, standard error of estimate (SEE), Durbin-Watson statistic (d), Mallows' Cp and variance inflation factor (VIF) used to predict seed yield/plant and oil percentage of all genotypes are presented in Table 6. Stepwise regression produced several candidate models for predicting seed yield/plant and oil percentage. Model 5 was the best model for seed yield/plant and model 3 the best for oil percentage. The coefficients of the spiny variable (0.97 and 3.31, respectively) indicated that, for a spiny genotype (equal to one) relative to the spineless baseline (equal to zero), seed yield/plant and oil percentage increased by 0.97 g and 3.31%, respectively, with all other independent variables held constant. Model 5 and model 3 were positively correlated with seed yield/plant and oil percent, respectively, with adjusted R² values of 99.66 and 23.56%, indicating that 99.66% of the variance in seed yield/plant was predictable from the five independent variables, while 23.56% of the variance in oil percent was predictable from the three independent variables. The standard errors of estimate for model 5 (seed yield/plant) and model 3 (oil percentage) were low (0.96 and 4.1, respectively). These findings are consistent with those of Choulwar et al. (2005) and Golkar et al. (2010), who reported that plant height, first branch height, number of branches/plant, head diameter, seeds per head and 1000-seed weight are the most important morphological characteristics related to seed yield. The indices used for testing both models are shown in Table 6. The variance inflation factor (VIF) was below 10 for both models, so there was no multicollinearity problem among the independent variables for seed yield/plant or oil percentage.
These results allow the actual contribution of each independent variable in each model to be identified and interpreted with negligible confounding effects. The Durbin-Watson statistics for seed yield/plant and oil percentage were d = 2.36 and 1.7, respectively, both within the range 1.5 < d < 2.5. Therefore, there is no first-order linear autocorrelation in either model. Lastly, Mallows' Cp suggests that model 5 and model 3 were relatively precise and unbiased for predicting seed yield/plant (Cp = 5.2) and oil percentage (Cp = 4), respectively. These two Cp values were closest to the number of predictors plus the constant (6 and 4 for models 5 and 3, respectively).
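For readers who want to reproduce these diagnostics, the following Python sketch gives textbook definitions of the Durbin-Watson statistic, the variance inflation factor and Mallows' Cp. The function names and all numeric inputs are hypothetical and are not values from Table 6; the thresholds in the final checks are the rules of thumb quoted in the text.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

def vif(r_squared_j):
    """VIF of predictor j, given R^2 from regressing it on the other predictors."""
    return 1.0 / (1.0 - r_squared_j)

def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Mallows' Cp for a subset model with n_params coefficients (incl. intercept)."""
    return sse_subset / mse_full - (n_obs - 2 * n_params)

# Hypothetical inputs for illustration only.
d = durbin_watson([0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2])
v = vif(0.5)
cp = mallows_cp(sse_subset=48.0, mse_full=1.0, n_obs=54, n_params=6)

print(f"d={d:.2f}, no first-order autocorrelation by rule of thumb: {1.5 < d < 2.5}")
print(f"VIF={v:.1f}, no multicollinearity by rule of thumb: {v < 10}")
print(f"Cp={cp:.1f} versus number of parameters 6")
```

A subset model is usually regarded as having little bias when its Cp is close to its number of estimated parameters, which is the comparison made in the text.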

Conclusion
High variation was confirmed among the genotypes. Genotype K26 had high NSB, SYP and YSB, genotype K6 had high NFB and YFB, genotype K8 had high NTHB and YTHB, K17 had high HFB, genotype K4 had high 1000-seed weight and K2 had high oil percentage. Genotypes K26 and K13 were of the high-oleic type and can be used in breeding programs. The oils of genotypes K1 and K9 were high in erucic acid, with values of 3.24 and 2.90%, respectively. Most traits showed high heritability, indicating that they are under strong genetic control. High genetic advance was recorded for PH, SYP and 1000-seed weight. High heritability coupled with high genetic advance indicates that additive gene effects control the inheritance of these traits. Stepwise multiple regression analysis showed that 99.22% of the total variation in seed yield/plant could be explained by yield of secondary branches (YSB), yield of first branches (YFB), yield of third branches (YTHB), plant height (PH) and spiny as a dummy variable. Yield of first branches (YFB), 1000-seed weight and spiny accounted for about 23.56% of the total variation in oil percent. The results suggest that yield of first branches (YFB) and spiny as a dummy variable are primary selection criteria for improving seed yield/plant and oil percent.