Correlation and sequential path analysis of oil yield and related characteristics in camelina under seasonal variations

The objectives of the current study were to determine the usefulness of sequential path analysis in camelina for obtaining information about the relationship between yield and yield components, and to evaluate the relative importance of these components for camelina oil yield under summer and winter cultivation. A split-plot experiment, with two varieties as the main plots and four sowing times as the subplots, was carried out over two growing seasons (2017-2019) in Samsun, Turkey. Sequential path analysis revealed that, as first-order predictors, grain yield and oil content displayed the most significant and positive direct effects on oil yield in both summer and winter cultivation. The sequential path analysis of second-order variables over the first-order variable revealed that seed number per pod and pod number explained approximately 90% of the variation in grain yield in summer cultivation, whereas branch number explained approximately 67% of the variation in grain yield in winter cultivation. These results indicated that grain yield, as the main predictor of oil yield, affected oil yield through different pathways in the summer and winter seasons. The larger effect of seed number per pod compared with pod number indicated that selection for higher grain yield can be done indirectly, using plants with lower pod number and higher seed number per pod in the summer season. Moreover, branch number was the only trait that had a direct negative effect on grain yield in the winter season, indicating that plants with lower branch number should be selected for higher grain yield. Environmental factors, including the cultivation season examined in this study, were found to be key to improving oil yield and should therefore be considered as criteria in future camelina breeding programs.


Introduction
Camelina is a member of the Brassicaceae family, and Camelina sativa is the only species of economic importance among the seven species (C. sativa, C. laxa, C. rumelica, C. microcarpa, C. hispida, C. anomala and C. alpkoyensis) belonging to the genus Camelina (Davis, 1970; Göre, 2015). Camelina is native to Northern Europe and Central Asia, and remains from archaeological excavations have revealed that it has been cultivated in Europe for at least 3000 years (Zubr, 1997). Camelina was grown as an economically important oil crop in Europe until the 1930s and in North America until the 1950s, after which canola was preferred because it is a higher-yielding oil crop (Berti et al., 2016). However, interest in camelina has increased again due to its superior fatty acid profile compared to other industrial crops, especially its Omega-3 content, its suitability for low-input agricultural systems and its potential to be used as a biofuel (Bujnovský et al., 2020; Lohaus et al., 2020). More than 90% of the fatty acids in camelina oil are unsaturated, and the concentration of saturated fatty acids is around 8-10%. Polyunsaturated fatty acids constitute an important part (about 60%) of the unsaturated fatty acids in camelina (Murphy, 2016). They consist of 35-45% linolenic acid (C18:3; n-3; Omega-3) and 15-20% linoleic acid (C18:2; n-6; Omega-6). The concentration of monounsaturated fatty acids is approximately 36%, and these fatty acids are primarily composed of oleic acid (C18:1n-9) and eicosenoic acid (C20:1n-9).
Camelina oil contains more monounsaturated fatty acids than soybean, sunflower and cotton oils. On the other hand, it contains polyunsaturated fatty acids at levels close to those of canola, soybean and sunflower (Günç Ergönül and Aksoylu Özbek, 2020). The combination of this superior oil profile with desirable agronomic properties such as a short life cycle (Kagale et al., 2014), high adaptation to different environmental conditions (Singh et al., 2015) and low input requirement (Manca et al., 2013) has made camelina a preferred resource. Since the main goal of a breeding program is to obtain high yields of good quality, it is important to know the relationships between the various characteristics that have direct and indirect effects on yield.
Physiological events affect the development of plant characters. Yield in plant production results from physiological events occurring in plants. However, it is affected by many factors such as genetic potential, environmental conditions and cultivation technique applications such as fertilization, irrigation, and sowing time (Gill and Narang, 1993). Understanding the relationship between yield and other factors affecting yield is highly likely to lead to higher yields.
The relationship between yield and its components determines which traits are appropriate for use in yield improvement studies. Simple correlation and path analysis can be used to reveal this relationship. Correlation and path analysis also inform researchers about the genetic relationship between yield and the characters that contribute to it, which is useful in developing plant breeding strategies (Mohammadi et al., 2003; Asghari-Zakaria et al., 2007; Feyzian et al., 2009; Maleki et al., 2011; Abdolinasab et al., 2020). Correlation simply measures the association between yield and other traits and does not establish cause-and-effect relationships between dependent and independent variables. Therefore, path analysis is commonly employed in plant breeding to partition the correlations between yield and yield components into direct and indirect effects (Tuncturk and Ciftci, 2005).
Multiple regression for path analysis assumes that each character used as a predictor variable is unrelated to the others in the dataset. In reality, yield-related characteristics are closely interconnected, frequently resulting in substantial multicollinearity. Because of this, it is difficult to determine and evaluate the actual contribution of a given character. A new way of grouping variables into different order paths was first used in crop plants by Samonte et al. (1998).
Sequential path analysis aims to help researchers understand the relationships between traits that develop sequentially during the life cycle of plants. Del Moral et al. (2005) reported that the relationships between yield and its components are not accurately reflected by simple correlation analyses and that path analyses are very helpful in clarifying relationships and effects. Vollmann et al. (2007) showed that camelina yield is a complex polygenic character and that there are strong relationships between yield and yield components. Simple correlation analysis between the yield and yield components of camelina has been performed by various researchers (Katar et al., 2012; Guy et al., 2014; Jiang et al., 2016; Zanetti et al., 2017; Leclère et al., 2021). Nevertheless, no literature was found that used "path analysis" or "sequential path analysis" to reveal the relationships between yield and yield components in camelina, and the contribution of yield components to seed and oil yields in different seasons has remained unknown. The objectives of the current study were: (i) to determine the usefulness of sequential path analysis in camelina, (ii) to obtain information about the relationship between yield and yield components and (iii) to evaluate their relative importance for camelina yield under summer and winter cultivation. A further aim was to identify the important agronomic characteristics with direct and indirect effects on yield that can be used as selection criteria in camelina breeding programs.

Material and method

Plant material
In this research, camelina genotypes of Swedish origin PI-304269 and Danish origin PI-650142, obtained from USDA and found to have the best performance in adaptation studies in the research area, were used as plant material.

Soil and climate information
The soil of the experimental area is moderate in organic matter and phosphorus content and rich in potassium. The soil pH is 7.71, which is slightly alkaline (Tab. 1).
The average temperature, relative humidity and day length of the planting seasons were higher than the long-term averages, whereas total precipitation was lower than the long-term average (Tab. 2). Precipitation during the flowering period of the first summer planting (July 2017) was negligible (0.4 mm), and precipitation in the second summer planting season was less than half of the long-term average. In addition, apart from total precipitation, the climate data for the 2018 summer planting season were above both the 2017 values and the long-term averages, while relative humidity and day length were higher in the winter planting season. Accordingly, the climatic conditions in both growing seasons can be considered limiting for plant cultivation.

Experimentation
The experiments were arranged in a split-plot design with three replicates, in which varieties were assigned to the main plots and sowing times to the sub-plots. Sowing dates in the summer and winter seasons were spaced at 10-day intervals: 1, 11, 21 and 31 May in the 2017 and 2018 summer seasons, and 24 October, 3 November, 13 November and 23 November in the 2017 and 2018 winter seasons. The experimental area had been left fallow in the previous year; the field was plowed twice in preparation for camelina cultivation, and sowing rows were opened at suitable intervals. The plot size was 3 × 1 m, with 20 cm between rows and 5 cm between plants within each row. Camelina genotypes were sown by hand. In both years of the experiment, 40 kg ha⁻¹ of ammonium nitrate fertilizer was applied before flowering. Weed control was carried out twice, at the pre-flowering and full-flowering stages. Irrigation to field capacity was carried out on 1 July in the first year and on 12 June and 11 July in the second year for the summer crop. At harvest, 0.5 m at the beginning and end of each plot and one row on each side were discarded, and the remaining area was harvested by hand when the plants were 90% mature. Yield and yield components were determined on ten plants taken at random from the harvest area of each plot.

Biological yield (BIO) (g/plant)
Ten plants taken randomly from each plot were placed in a separate paper bag and labeled, and then dried in an oven at 82°C for 48 hours. At the end of the period, dried plant samples were weighed on a precision scale (g) and recorded as biological weight.

Stem ratio (SR) (g g⁻¹)
The ratio of stem dry weight to biological weight in ten plants taken randomly from each plot was recorded as the stem ratio.

Dry matter (DM) (g/day)
Ten plant samples taken randomly from each plot were dried in an oven at 83°C for 48 hours and then weighed and their biological weights were calculated. The following equation is used to calculate the amount of dry matter per unit time. The time expression in the relation refers to the number of days from sowing to sampling.

Grain yield (GY) (g/plant)
In 10 plant samples taken randomly from each plot during the harvest period, seeds obtained from all capsules on the plant were weighed on a precision scale (g) and recorded as grain yield per plant.

Oil content (OC) (%)
Oil analysis was performed using an Ankom XT15 semiautomatic Soxhlet device in accordance with AOAC (Cunniff and Washington, 1997). The XT4 filter bags were tared, and 1-1.5 g of finely ground seed sample was placed in each bag and weighed. The sample number was written on the bags with a special pen that is not affected by chemicals and does not leave any residue. The XT4 filter bags were sealed using a heated press device. The samples were dried in an oven at 105°C for 150 minutes to remove their moisture, then cooled to room temperature in a desiccator and weighed. After this process, the XT4 bags filled with samples were extracted in the Ankom XT15 device at 90°C for 70 minutes. At the end of the period, the XT4 bags were kept in an oven at 105°C for 30 minutes, cooled in the desiccator and then weighed again on a precision balance. At the end of all these processes, the oil rate was calculated using the following equation: Oil ratio (%)

After removing the marginal effect from each plot, the plants in the middle rows were harvested and weighed. The harvest index was calculated using the following equation:

$$\text{Harvest index} = \frac{\text{Grain yield}}{\text{Biological yield}} \times 100$$
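The derived traits in this section are simple ratios, and a minimal Python sketch of them is given here. It assumes that oil yield is grain yield multiplied by oil content expressed as a fraction (the Results state only that oil yield was calculated by multiplying grain yield by oil content) and that the dry matter rate is biological weight divided by the number of days from sowing to sampling. The function names and input values are hypothetical and serve only as illustration.

```python
def harvest_index(grain_yield_g, biological_yield_g):
    """Harvest index (%) = grain yield / biological yield * 100."""
    return grain_yield_g / biological_yield_g * 100.0


def oil_yield(grain_yield_g, oil_content_pct):
    """Oil yield (g/plant); assumes oil content is given in percent."""
    return grain_yield_g * oil_content_pct / 100.0


def dry_matter_rate(biological_weight_g, days_from_sowing):
    """Dry matter per day (g/day); assumed form of the omitted equation."""
    return biological_weight_g / days_from_sowing


# Hypothetical per-plant values, not data from the study.
hi = harvest_index(grain_yield_g=2.0, biological_yield_g=8.0)
oy = oil_yield(grain_yield_g=2.0, oil_content_pct=35.0)
dm = dry_matter_rate(biological_weight_g=8.0, days_from_sowing=100)
print(hi, oy, dm)
```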

Statistical analysis
Before the analysis of variance, outliers and the normality of the data were tested with the Grubbs test and the Anderson-Darling test, respectively. SPSS 24.0 statistical software was used to calculate Pearson correlation coefficients between pairs of traits. Data averaged over planting dates were used to perform the correlation and path analyses. Therefore, two data sets, one for summer and one for winter cultivation, were subjected to the respective analyses.
In the current study, two distinct types of path analysis (simple and complex models) were compared. Complex models (sequential path analysis) set up traits at different ontological levels, with relations (that is, co-relations or cause-and-effect relationships) between them meant to reflect possible biological relations, while simple models (usual path analysis) set up all traits except the dependent one at the same ontological level, so that they are treated as co-related (Kozak and Azevedo, 2014).
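As an illustration of the correlation step, the following Python sketch computes a Pearson coefficient and the corresponding t statistic for two traits. It is not the SPSS procedure used in the study; the function names and the eight data points per trait are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical trait means, not data from the study.
grain_yield = [1.8, 2.1, 2.4, 2.0, 2.6, 2.9, 3.1, 2.7]
oil_content = [34.0, 35.5, 33.8, 36.1, 34.9, 35.2, 33.5, 36.0]

r = pearson_r(grain_yield, oil_content)
print(f"r = {r:.3f}, t = {t_statistic(r, len(grain_yield)):.3f}")
```

The t statistic would then be compared with the critical value for n − 2 degrees of freedom to assign the significance levels marked with * and ** in the correlation table.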
Initially, conventional path analysis was employed, and all traits were treated as first-order predictors of oil yield (OY). Then, sequential stepwise multiple regressions were carried out to organize the predictor variables into first- and second-order paths on the basis of their respective contributions to the total variance of the response variable (OY) and minimal collinearity. The "variance inflation factor" (Hair et al., 1984), which is the inverse of the "tolerance" value, was used to quantify the extent of multicollinearity in each component path using SPSS 24.0 statistical software. After constructing a sequential path diagram, path analysis was done using AMOS 24.0. First-order predictors described the associations with OY, followed by second-order predictors explaining the relationships with the first-order predictors.
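The collinearity diagnostics can be sketched as follows. This is a minimal Python illustration, not the SPSS routine used in the study: it relies on the standard result that the variance inflation factor of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix, and takes tolerance as its reciprocal. The helper names and the example matrix are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tolerance(predictor_corr):
    """Return a list of (VIF, tolerance) pairs, one per predictor."""
    inverse = invert(predictor_corr)
    return [(inverse[j][j], 1.0 / inverse[j][j]) for j in range(len(inverse))]


# Hypothetical correlation matrix of two predictors with r = 0.9.
example = [[1.0, 0.9], [0.9, 1.0]]
for vif, tol in vif_and_tolerance(example):
    print(f"VIF = {vif:.3f}, tolerance = {tol:.3f}")
```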

Results and discussion

Phenotypic correlations
Except for SR and HI, all traits measured in the present study were higher in winter than in summer cultivation (unpublished data), owing to the longer vegetative growth period of camelina. Climate data in the current study showed that, in winter cultivation of camelina, monthly average temperature, total rainfall and monthly average relative humidity were higher and average day length was lower than in summer cultivation (Tab. 2). Thus, earlier autumn sowing and winter cultivation not only allowed the plants to intercept more solar radiation over a longer growing period and provided more water and higher relative humidity for the crop, but also avoided end-of-season drought and heat stress during the seed-filling period. These results confirmed previous findings on camelina that weather conditions have a significant impact on seed production and that milder temperatures during the growing season result in higher seed yields (Zubr, 1997; Zanetti et al., 2017; Zanetti et al., 2021).
Phenotypic correlation coefficients of different traits in summer and winter cultivation of camelina are presented in Table 3. OY closely followed GY in both summer and winter cultivation. The strong positive correlation between grain and oil yield (0.977**) arose because oil yield was calculated by multiplying grain yield by oil content. A strong positive correlation between OY and GY was also reported by Katar et al. (2012). According to Guy et al. (2014), spring planting in the Pacific Northwest (PNW) of the United States yielded better seed performance than fall planting. In contrast, under Turkey's ecological conditions, Katar et al. (2012) and Kinay et al. (2019) found that winter sowings outperformed summer sowings in yield and yield components, which is similar to the findings of this study.
Camelina seed oil content rose when the flowering and seed-filling stages occurred at lower temperatures (i.e., autumn sowing), according to Righini et al. (2019). Zanetti et al. (2020) stated that camelina's longer period of vegetative development during the winter cultivation is likely the explanation for taller and usually bigger plants. Carbohydrates and nitrogen are often transferred from vegetative tissues to reproductive organs to boost seed yield when crops have a prolonged vegetative phase (Bouchet et al., 2016). It has been reported that seed yield and oil content are higher when camelina is grown at colder growing season temperatures (especially during seed development) (Obour et al., 2017). Obeng et al. (2019) revealed that GY was negatively associated with heat stress index, suggesting that heat stress during flowering and seed development reduced camelina's seed production.
Correlations between camelina seed yield and yield components, used to better understand the relationships among yield-related traits, have been reported by several researchers (Katar et al., 2012; Guy et al., 2014; Hossain et al., 2019; Angelini et al., 2020). PN had negative, non-significant correlations with OY (−0.229) and GY (−0.142) in summer cultivation, whereas it had negative, significant correlations with OY (−0.841**) and GY (−0.769*) in winter cultivation (Tab. 3). A similar result was reported by Jewett (2013), who found that camelina yields are unaffected by PN. Hossain et al. (2019) stated that the contribution of one pod per plant to seed yield in camelina was low. This may be because the camelina plant produces about three to four times more pods than the canola plant, so that photosynthetic accumulation is lower in camelina pods than in canola pods.
BN showed a positive, non-significant correlation with PN (0.589) in summer cultivation and a significant positive correlation (0.865**) in winter cultivation (Tab. 3). This result was in agreement with the findings of Angelini et al. (2020). More branches are produced to compensate for lower plant density, as shown by Hossain et al. (2019). Thus, higher BN was associated with lower plant density and increased PN. This was further corroborated by the absence of a clear relationship between camelina yield and the number of plants per unit area (Angelini et al., 2020). In contrast, Jewett (2013) reported that camelina yield was most strongly influenced by plant density per hectare. He indicated that increasing the planting density of the field is the quickest and most straightforward approach to boosting camelina productivity. However, in a dryland agricultural system, a higher density may carry a negative tradeoff in the form of increased crop water demand. He concluded that breeders interested in improving seed output would be prudent to select varieties on the basis of thousand-seed weight, as this trait is highly heritable and genotype-dependent.
OC showed a non-significant positive correlation with GY (0.284) in summer cultivation, whereas it had a non-significant negative correlation with GY (−0.377) in winter cultivation (Tab. 3). This might be attributed to camelina's substantially longer vegetative development period in winter cultivation, as previously reported by Zanetti et al. (2020). HI in winter-grown camelina is relatively low compared to other Brassicaceae, such as oilseed rape (Fan et al., 2017). Guy et al. (2014) concluded that OC was shown to be inversely associated with seed yield. They stated that OC, the larger of the two components of oil yield in their study, did not appear to be necessarily negatively connected with high seed yield, suggesting that the environment might impact seed oil content more than genotype selection. As a matter of fact, it has been reported that the yield performance of camelina is mainly [...] studies, which suggest that TSW may be of little utility in predicting seed yields in camelina. This is also consistent with the findings of Vollmann et al. (2007), who discovered a negative association between camelina seed yield and 1000-seed weight in research conducted over three growing seasons. Therefore, TSW may not be a good predictor of grain yield in camelina.
The negative correlation of BN with GY observed in both growing seasons in this study was also previously reported in Hossain et al. (2019). Katar et al. (2012) reported a significant and positive correlation of BN with GY and OY. They concluded that this difference was related to yearly variations that significantly affected the branch number per plant.
PH had non-significant positive correlations with GY, OC and OY in summer (0.515, 0.691 and 0.613, respectively) and winter (0.075, 0.438 and 0.262, respectively) cultivation (Tab. 3). A positive correlation between PH and GY was also reported by Gehringer et al. (2006) and Neupane et al. (2020). Guy et al. (2014) observed a strong positive relationship between HI and seed yield (SY) in their study.
There was a high correlation coefficient between yield components in the current study, which meant that it was impossible to determine the exact contribution of each component to total oil production because of mixed or muddled effects. Path analysis is used to understand the magnitude and direction of component characteristics' direct and indirect contributions to yield since correlation coefficients do not convey the entire picture when the causative factors are interconnected and interdependent.

Conventional and sequential path analysis
The conventional path analysis revealed collinearity among the independent traits, showing that this first model was inadequate to reflect each trait's actual contribution in both summer and winter cultivation of camelina (Tab. 4). Table 4 summarizes the direct effects of predictor traits on camelina oil yield obtained by conventional path analysis, together with indicators of collinearity, for the two growing seasons. Although collinearity was acceptable for some traits (VIF < 10 and tolerance > 0.1) in both summer and winter cultivation, it was considerable for several others, notably those with high direct effects on oil yield. For example, the VIF was 11.798 for SW in summer cultivation and 2515.267 for RW in winter cultivation (Tab. 4). This is because agronomic traits are highly correlated (Tab. 4). Multicollinearity is considered severe when VIF is greater than 10 and TOL is lower than 0.1 (Mansfield and Helms, 1982). In this regard, SW in summer cultivation and all traits except RL, OC and TSW in winter cultivation showed high collinearity (Tab. 4).
In this study, sequential path analysis was found to be effective in reducing the collinearity among traits (Tab. 5). Compared to conventional path analysis, sequential path analysis simplifies the relationships between traits and their contributions to oil yield. The findings revealed considerably lower variance inflation factor values in the second model (sequential path analysis) than in the first model (conventional path analysis) in both summer and winter cultivation (Tab. 5). Moreover, the VIF values lower than 10 and TOL values greater than 0.1 displayed by the first- and second-order predictors in the sequential path analysis made it possible to elucidate the interrelationships between the OY-related characteristics.
Using stepwise regression analysis, variable collinearity was reduced and, by lowering the confounding of effects, each variable's actual contribution could be determined more accurately in the different paths. Studies have shown that sequential path analysis is more effective than conventional path analysis (Mohammadi et al., 2003; Asghari-Zakaria et al., 2007; Dalkani et al., 2011). GY and OC were chosen as first-order predictors to explain OY variance in both summer and winter cultivation (Tab. 5, Figs. 1 and 2). Together, these two variables accounted for about 98% of the variance in OY, with GY as the main determinant.
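To make the thresholds concrete, the short Python sketch below converts the two VIF values quoted above into tolerance and into the implied R² of each trait regressed on the remaining predictors, using the standard identities TOL = 1/VIF and R² = 1 − TOL. The function name and dictionary keys are hypothetical.

```python
# VIF values quoted in the text (Tab. 4).
reported_vif = {"SW, summer": 11.798, "RW, winter": 2515.267}


def collinearity_summary(vif, vif_limit=10.0):
    """Tolerance, implied R^2 on the other predictors, and a severity flag."""
    tolerance = 1.0 / vif
    return {
        "tolerance": tolerance,
        "r_squared": 1.0 - tolerance,
        "severe": vif > vif_limit,  # equivalent to tolerance < 0.1
    }


for trait, vif in reported_vif.items():
    s = collinearity_summary(vif)
    print(
        f"{trait}: TOL = {s['tolerance']:.4f}, "
        f"R^2 = {s['r_squared']:.4f}, severe = {s['severe']}"
    )
```

For these two values the tolerances come out near 0.085 and 0.0004, meaning that roughly 92% and more than 99.9% of the variance of the respective trait is explained by the other predictors.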
GY and OC displayed the most significant and positive direct effects on OY (0.922 and 0.195, respectively) (Tab. 5). The magnitudes of these effects differed between seasons, but their direction remained the same. Their indirect effects on OY were positive, non-significant and negligible in summer cultivation. In contrast, the indirect effect of OC through GY was negative (−0.404), non-significant and moderate in winter cultivation. This was because of a low, negative and non-significant correlation (−0.377) between OC and GY in winter cultivation (Tab. 3), indicating that indirect selection for higher GY is generally accompanied by lower OC in winter cultivation (Tab. 5).

According to the stepwise regression of second-order variables over the first-order variables, SNP and PN accounted for about 90% of the variation in GY in summer cultivation, and BN explained about 67% of the variation in GY in winter cultivation (Tab. 5). These results indicated that GY, as the main predictor of OY, had different pathways in the summer and winter seasons (Figs. 1 and 2). Mohammadi et al. (2003) concluded that character relationships revealed by path analysis might be influenced by a variety of factors, including the germplasm utilized, the attributes evaluated for analysis, the environment(s) used for assessment, and the statistical approaches used to resolve the correlations.
In the summer season, SNP had the largest positive direct effect on GY (1.076**), with a non-significant, low and negative indirect effect via PN (−0.295). PN had a negative direct effect on GY (−0.639**) and a non-significant, moderate and positive indirect effect via SNP (0.497). Selection for higher GY can therefore be done indirectly using plants with lower PN and higher SNP in the summer season (Tab. 5). Hossain et al. (2019) obtained similar results using multiple regression analysis and concluded that SNP, rather than PN, significantly affected seed yield. In the winter season, BN was the only trait that had a direct negative effect on GY (−0.847**), indicating that plants with lower BN should be selected for higher GY. Darapuneni et al. (2014) reported that seed yield in flax was negatively correlated with pods per tiller. They found that the positive direct effect of tiller number on yield was large enough to outweigh the negative indirect effect of pods per tiller, resulting in a positive effect of tiller number on seed yield.
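The decomposition of a correlation into direct and indirect effects can be sketched for the two summer predictors of GY. In the Python example below, the PN-GY correlation (−0.142) is the reported value, whereas the SNP-GY correlation (0.781) and the SNP-PN correlation (0.462) are assumptions back-calculated from the reported effects (1.076 − 0.295 and −0.295 / −0.639, respectively). The function name is hypothetical, and the closed-form solution is used here in place of the AMOS procedure of the study.

```python
def two_predictor_paths(r1y, r2y, r12):
    """Standardized path coefficients for two correlated predictors.

    Solves r1y = p1 + r12 * p2 and r2y = p2 + r12 * p1 for the direct
    effects p1 and p2; the indirect effect of one predictor via the other
    is r12 times the other predictor's direct effect.
    """
    det = 1.0 - r12 ** 2
    p1 = (r1y - r12 * r2y) / det
    p2 = (r2y - r12 * r1y) / det
    return {
        "direct_1": p1,
        "direct_2": p2,
        "indirect_1_via_2": r12 * p2,
        "indirect_2_via_1": r12 * p1,
    }


# Predictor 1 = SNP, predictor 2 = PN, response = GY (summer cultivation).
r_snp_gy = 0.781   # assumption: reported direct + indirect effect of SNP
r_pn_gy = -0.142   # reported correlation between PN and GY
r_snp_pn = 0.462   # assumption: back-calculated from the reported effects

paths = two_predictor_paths(r_snp_gy, r_pn_gy, r_snp_pn)
for name, value in paths.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the sketch returns direct effects close to 1.076 and −0.639 and indirect effects close to −0.295 and 0.497. The agreement with the reported values is expected, since two of the three input correlations were derived from them; the point of the example is that each total correlation is the sum of a direct effect and an indirect effect.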
The Chi-square approach was used to validate the sequential path model by comparing the expected and observed correlation coefficients. The Chi-square values were low and non-significant (0.087 ns and 0.247 ns) for the two datasets (summer and winter cultivation) used to construct the path models.
The diagrams of sequential path analysis for both summer and winter cultivation are shown in Figures 1 and 2. Categorizing the predictors into first- and second-order predictors provided improved insight into the interdependencies between oil yield-related characteristics in camelina.

Conclusion
Climatic and environmental factors such as the cultivation season can influence camelina seed and oil yield. In the present study, stepwise regression analysis was used to reduce the collinearity measures of all variables, allowing the actual contribution of each predictor variable to be determined in distinct path components with minimal confounding effects and interference. The sequential path analysis of second-order variables over the first-order variable by stepwise regression revealed that SNP and PN explained approximately 90% of the variation in GY in summer cultivation and that BN explained approximately 67% of the variation in GY in winter cultivation. These results indicated that GY, as the main predictor of OY, affected oil yield through different pathways in the summer and winter seasons. The larger effect of SNP compared with PN indicated that selection for higher GY can be done indirectly using plants with lower PN and higher SNP in the summer season. Moreover, BN was the only trait that had a direct negative effect on GY in the winter season, indicating that plants with lower BN should be selected for higher GY. Environmental factors, including the cultivation season examined in this study, were found to be key to improving oil yield and should therefore be considered as criteria in future camelina breeding programs.