Variability of the structure of correlations between the morphological and commercial traits of soybeans with different growth habit and branching characters

High yields of seeds, green pods and green biomass are the main goal of soybean breeding in many countries. An assessment of the relationships between productivity traits and their effect on yield may be useful in developing effective crop cultivation programs. In soybean, the stem growth habit and the branching character are interrelated with plant productivity and in most cases determine it. Therefore, the aim of the present work was to study the variability of the level (strength) and the structure of correlations between 92 morphological, phenological, biochemical and agronomic traits of soybean accessions with different growth habits and branching characters in different weather conditions. A total of 270 soybean accessions of different ecological and geographical origin from the VIR collection were grown in the Krasnodar region over 3 years. Field studies of the traits and biochemical analysis were carried out according to VIR guidelines. The variability of correlation matrices with regard to the strength and structure of relationships was analyzed using correlation and factor analysis (the principal component method), as well as the method developed by N.S. Rostova. A comparison of the level (R², coefficient of determination) and structure of correlations in different years has shown that the deterioration of external conditions is followed by an increase in the strength of relationships (R²) between the traits and in the difference between the structures of the correlation matrices. Soybean adaptation to changing conditions occurs through rearrangements of the relationship systems, whereas the degree and direction of these changes are determined by the growing conditions and the specific response of the accessions. Under favorable conditions, the structure of correlations in soybeans with different growth habits and branching characters is more similar than under conditions critical for development.
The highest level of relationships (R²) between the traits was observed in the year that was unfavorable for the growth of the semi-cultivated accessions (with the indeterminate growth habit and a large number of branches of the 1st and 2nd order). The green biomass productivity of accessions with the determinate growth habit and more than two branches is most strongly associated with the branch weight, while in accessions with the indeterminate growth habit and with (or without) 1–2 branches it depends on the growing season duration, the weight of one leaf and the number of leaves per plant. In the semi-cultivated accessions (with the indeterminate growth habit and numerous branches of the 1st and 2nd order), it correlates, besides the listed traits, with the number of nodes, the internode length, the main stem diameter, the weight of leaves, and the morphometric parameters and quality of seeds.


Introduction
Soybean is one of the most economically important leguminous crops, ranking first among them in the world in terms of cultivated area (http://www.fao.org/faostat). The numerous varieties created to date display a great diversity of forms and are adapted to various climatic conditions. Along with the specialized varieties, semi-cultivated forms are also grown industrially. The latter are commonly used for producing green fodder and green manure, as well as for the development of modern varieties. Soybeans are characterized by several types of main stem growth: varieties with the indeterminate and the determinate growth habit are distinguished. Since the determinate growth habit very rarely occurs in wild-growing soybeans, this type of stem is associated with the domestication of the species (Liu et al., 2007; Tian et al., 2010). Previous studies have shown that the type of stem growth in soybean is mainly controlled by the Dt1 locus; the indeterminate growth habit is dominant or incompletely dominant with respect to the determinate dt1 (Woodworth, 1932). A second locus that controls stem growth is also known and is designated Dt2. The Dt2 allele is almost dominant with respect to dt2. The Dt2/Dt2 genotypes determine semideterminate phenotypes in the Dt1/Dt1 genetic background, while the dt2/dt2 genotypes determine the indeterminate ones. However, the phenotype is determinate in the dt1/dt1 genetic background, because dt1 is epistatic to Dt2 and dt2 (Bernard, 1972). A third allele at the Dt1 locus (dt1-t) has also been identified; it produces a phenotype with some characteristics of both dt1 and Dt2 (Thompson et al., 1997). The Dt1 gene (=GmTfl1) is homologous to the terminal flower 1 (TFL1) gene of Arabidopsis, i.e., the regulatory gene encoding the apical meristem signaling protein.
The transition from the indeterminate to the determinate type occurred through four independent single-nucleotide substitutions, each of which resulted in an amino acid replacement (Tian et al., 2010).
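The epistatic relationship between Dt1 and Dt2 summarized above can be written down as a small lookup. The sketch below is illustrative only: the function name `stem_growth_habit` and the genotype strings are assumptions introduced here, and only the homozygous combinations explicitly described in the text are covered (heterozygotes and the dt1-t allele are deliberately left out).

```python
def stem_growth_habit(dt1_genotype: str, dt2_genotype: str) -> str:
    """Map homozygous Dt1/Dt2 genotypes to the stem growth habit.

    Hypothetical helper: it encodes only the cases stated in the text
    (Bernard, 1972) and raises an error for anything else.
    """
    # dt1 is epistatic to Dt2 and dt2: the Dt2 locus is ignored here
    if dt1_genotype == "dt1/dt1":
        return "determinate"
    if dt1_genotype == "Dt1/Dt1":
        if dt2_genotype == "Dt2/Dt2":
            return "semideterminate"
        if dt2_genotype == "dt2/dt2":
            return "indeterminate"
    raise ValueError("genotype combination not described in the text")


habit = stem_growth_habit("Dt1/Dt1", "dt2/dt2")  # indeterminate
```

Checking the dt1/dt1 case first mirrors the epistasis: once the Dt1 locus is homozygous recessive, the Dt2 genotype no longer affects the outcome.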
The stem growth habit in soybean is an agronomically important trait, which is interconnected with many economically important characters. However, it is often difficult to distinguish between the indeterminate and determinate growth habit in field conditions, since their manifestation is influenced by the day length and by unfavorable growth conditions (Bernard, 1972). The branching character has no less influence on the grain and feed productivity of plants. The semi-cultivated varieties are distinguished by a large number of branches; the modern ones have no 2nd order branches, or form only the main stem.
Considering the importance of soybean as a food and forage crop, a high yield of seed, green biomass and green pods is the main goal of soybean breeding in many countries. An assessment of the relationships between the productivity component traits and their effect on yield may be useful in developing effective crop cultivation programs. In this regard, much research is devoted to the study of correlations between plant characters, as well as to the search for indicator traits that can be used for selecting accessions with the necessary economically important traits. In the works reviewed by the authors, seed productivity was connected with phenological and morphological characters, such as the 'days before ripening' and 'grain filling period' (Ferrari et al., 2018), plant height and number of branches (Aditya et al., 2011; Hakim, Suyamto, 2017), number of pods per plant (Board et al., 2003; Nagarajan et al., 2015; Rodrigues et al., 2015; Machado et al., 2017), number of nodes and pods with 2-3 grains (Machado et al., 2017), number of seeds per plant (Rozhanskaya et al., 2016), number of pods per plant and nodes on the main stem (Silva et al., 2015), as well as the number of seeds per plant and the 1000 seed weight (Vu et al., 2019). A number of works have noted a close relationship between seed productivity and green biomass (Leshchenko et al., 1987) or the above-ground mass of the plant (Huang et al., 2009).
There may be several reasons for such different, sometimes even opposite, results concerning the relations between the traits of seed and biomass yield obtained by different authors. On the one hand, they can be explained by the nature of quantitative traits, which are characterized by continuous variability determined by the influence of a large number of genes, or by the redistribution of the number and range of genes due to a change of the limiting environmental factors. On the other hand, many authors studied correlations using a limited number of genotypes, evaluated a different number of traits employing different statistical methods, and therefore obtained different results. Besides, the studies did not take the diversity of the studied forms into account. It is known that a change in the sample size alone can reveal previously inconspicuous relationships (Rostova, 2002). For instance, a simultaneous study of hay, silage, and green fodder soybean accessions may yield one set of results, whereas their separate analysis may yield a different set of data (Burlyaeva, Rostova, 2014).
Though the number of correlation studies performed in soybeans to date is large, there is no common point of view on the relationships between the quantitative traits that determine seed and green biomass productivity. In spite of the obvious differences in morphological and economically important traits between accessions with different growth habit and branching characters, we failed to find any studies that differentiate varieties according to these characters and analyze the links between productivity elements with the morphotype taken into account. The information about correlations is fragmented and limited to a statement of facts; meanwhile, the study of the variability of correlations allows a judgment to be made on the interrelationships between traits and on inherited variability, and helps to choose the right breeding strategy.
The purpose of the present work was to study the variability of the level (strength) and structure of correlations between morphological, phenological, biochemical, economic traits in soybean accessions with different growth habit and branching characters in different weather conditions.

Materials and methods
To study the variability of the structure of correlations between morphometric, biochemical and economically important soybean traits, 270 accessions representing domestic and foreign varieties of different ecological and geographical origin were selected from the global VIR collection (the N.I. Vavilov All-Russian Institute of Plant Genetic Resources). Ninety-two traits of soybean varieties were investigated (Supplement). The soybean plants included in the study differed significantly from each other. They were divided into three groups according to the growth habit and branching characters. The first group included accessions with the determinate growth habit and a large number of branches (more than two), the second group united the semi-cultivated accessions with the indeterminate growth habit and numerous branches of the 1st and 2nd order, while the third group consisted of the accessions with the indeterminate growth habit and 1-2 branches or without them.
The data selected for the analysis resulted from the field experiments, which were carried out in 1989, 1992 and 1994 at the Kuban Experiment Station of VIR (KOS VIR) located in the steppe area of the Prikubanskaya Plain. The trial years were characterized by contrasting meteorological conditions. The growing degree days above 10 °C amounted to 3590 °C in 1989, 3156 °C in 1992, and 3578 °C in 1994. The amount of rainfall during the growing season was 394.7 mm in 1989, 334.3 mm in 1992, and 177.1 mm in 1994. In 1989 and 1992, the rainfall exceeded the average long-term norm, whereas in 1994 it was significantly below the norm. The high moisture availability in 1992 was observed only in the first half of the growing season, while the second half of the summer was characterized by insignificant rainfall.
The accessions were sown according to the collection nursery pattern. Each variety was sown in a four-meter single-row plot, with an inter-row spacing of 70 cm and 10 cm between plants in a row. Phenological observations and botanical and morphological descriptions of the accessions were carried out in accordance with The International COMECON List of Descriptors for the Genus Glycine Willd. (1990). The green biomass yield was estimated in the mowing ripeness phase (the onset of pod filling). The weight of branches, leaves and pods was determined at the same time for 10 plants of each accession. After ripening, the analysis was carried out on 10 soybean plants selected from the middle of a row.
The content of dry matter, fiber, protein in green biomass, oil and protein in the seeds was determined in the biochemistry laboratory of the Kuban Experiment Station of VIR, while the content of trypsin, chymotrypsin in the seeds was analyzed in the Biochemistry Department of VIR according to the Methods of biochemical research in plants (Ermakov et al., 1987). The green biomass biochemical composition was analyzed when assessing the green biomass yield in the mowing ripeness phase.
Statistical data processing was used to reveal regularities in the variability and degree of correlation of the 92 economic and biological traits under different environmental conditions in soybeans with different growth habits and branching characters, to determine the information value of the traits, and to adjust the initial set of traits by discarding redundant and secondary characters. The processing included correlation analysis and factor analysis of the correlation system using the principal component method. The groups of the most interconnected traits (pleiades) were identified by analyzing the systems of correlations when constructing the correlation circles (matrix images in the form of the correlation cylinder sections) (Terentiev, 1959). The analysis was performed for nine correlation matrices calculated for each sample (for three groups of accessions, composed according to the growth habit and branching characters, for three years of research). The correlations were compared with respect to the strength (level) of relationships (R², the coefficient of determination) and their structure (trait rearrangement in the correlation pleiades). The differences between the correlation matrices in the strength of relationships were determined by comparing the average values of the determination coefficients (R²) (Wright, 1920). The similarity of the trait relationship systems (correlation matrices) was calculated between matrices after z-transformation (Rostova, 2002). R. Fisher's z-transformation was introduced to convert the distribution of the correlation coefficients (r) to the normal one according to the formula $z = 0.5 \ln\left(\frac{1 + r}{1 - r}\right)$. After the z-transformation, each of the compared correlation matrices (diagonal elements excluded) was rearranged into a vector. The obtained 9 vectors were used to form a new data array; in it, each matrix was regarded as a trait, and the individual coefficients in this matrix as values of the trait.
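The preparation steps described above (average R² per matrix, z-transformation, rearrangement of a matrix into a vector) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' Statistica workflow: the function names and the 3-trait matrix are hypothetical, and taking only the upper triangle is an assumption, since the text says only that diagonal elements were excluded.

```python
import numpy as np


def offdiag_vector(corr):
    """Flatten a correlation matrix into a vector of distinct trait pairs.

    Assumption: each pair is taken once (upper triangle, diagonal excluded).
    """
    corr = np.asarray(corr, dtype=float)
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    return corr[rows, cols]


def fisher_z(r):
    """Fisher z-transformation, z = 0.5 * ln((1 + r) / (1 - r))."""
    # np.arctanh(r) is mathematically identical to the formula above
    return np.arctanh(np.asarray(r, dtype=float))


def mean_determination(corr):
    """Average coefficient of determination (R^2) over all trait pairs."""
    return float(np.mean(offdiag_vector(corr) ** 2))


# Hypothetical 3-trait correlation matrix, for illustration only
example = np.array([[1.0, 0.8, 0.3],
                    [0.8, 1.0, 0.5],
                    [0.3, 0.5, 1.0]])

z_vector = fisher_z(offdiag_vector(example))   # one column of the new data array
strength = mean_determination(example)          # level of relationships
```

Stacking nine such `z_vector` columns side by side gives the array in which each matrix plays the role of a trait and each coefficient the role of an observation.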
Table. Strength of relationships in correlation matrices of the accessions with different growth habit and branching characters in different years of the study (1989, 1992, 1994). Columns: Group; Year.
The compared matrices were ordinated using the principal component method. The first principal component was regarded as the factor of matrix similarity, and the proportion of variance corresponding to this component (FD1%) was used as an indicator of the degree of similarity of all the compared matrices. The second principal component was interpreted as an indicator of differences in the structure of the matrices (Rostova, 2002). After grouping the 9 correlation matrices by the principal component method and combining them with the correlation circles (matrix images in the form of the correlation cylinder sections), the regularities in the distribution of the matrices and the changes in the structure of relationships in them were determined, i.e., the variability of the system of correlations in soybean accessions with different growth habit and branching character in different weather conditions was revealed. Data analysis was performed using Statistica 7 and Excel 7.0 for Windows.
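The ordination step can be illustrated as follows. The sketch assumes that the principal components are extracted from the correlation matrix between the z-transformed vectors (the text does not say whether correlations or covariances were used), and the data are simulated; `matrix_similarity` is a hypothetical name, not a function of any package mentioned in the paper.

```python
import numpy as np


def matrix_similarity(z_vectors):
    """Ordinate correlation matrices by principal components.

    z_vectors has shape (n_pairs, n_matrices): each column is one
    z-transformed correlation matrix flattened into a vector.
    Returns the share of variance (%) of every component, largest first,
    and the corresponding loadings of the matrices.
    """
    data = np.asarray(z_vectors, dtype=float)
    between = np.corrcoef(data, rowvar=False)      # matrices act as "traits"
    eigvals, eigvecs = np.linalg.eigh(between)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    share = 100.0 * eigvals / eigvals.sum()
    return share, eigvecs


# Simulated input: 20 traits give 190 distinct pairs; 9 matrices that
# share a common structure plus matrix-specific noise (illustration only)
rng = np.random.default_rng(0)
common = rng.normal(size=190)
columns = [common + rng.normal(scale=0.5, size=190) for _ in range(9)]

share, loadings = matrix_similarity(np.column_stack(columns))
fd1 = share[0]   # analogue of FD1%: similarity of all compared matrices
```

The more the nine matrices share a common structure, the larger the first component's share; the loadings on the second component then separate matrices whose structure deviates from the common pattern.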

Results
At the initial stage of the study, the entire set of traits was analyzed in order to assess their relative informativeness. The factor analysis of all traits was performed using the data combined for three years of research and for each year separately. It showed that the variability of the studied traits is associated with ten main factors. The studied traits were distributed among the factors of growth habit, seed weight and size, growing season duration, plant height, seed biochemical composition, leaf size and shape, green biomass biochemical composition, plant color (anthocyanin content in organs), the content of anti-nutrients in seeds, inflorescence parameters, and green biomass yield. A more detailed description of this analysis is given in a previously published paper (Burlyaeva, Malyshev, 2013). As a result, 20 traits were selected as the most important ones for studying the accessions with regard to their growth habit and green biomass productivity indicators. Also, the characters most strongly associated with the coordinated variability of plants in changing environmental conditions were identified. These included the weight of plants, branches and leaves; the number of leaves, branches and nodes; the average weight of one leaf and branch; stem diameter; plant length; internode length; middle leaf length and width; percentage of leaves from the total plant weight; growing season duration; the content of protein, fiber and dry matter in the green biomass; the content of protein and oil in seed; 1000 seed weight; and the seed hilum width. Further studies of the variability of the level and structure of correlations were carried out using the adjusted set of traits. The variability of the relationship structure in the matrices (judged by the rearrangement of traits in the correlation pleiades) was evaluated using the principal component analysis according to the method of N.S. Rostova (2002) described above in the Materials and Methods section.
The first main component was interpreted as the factor of matrices similarity, while the second component reflected differences in the structure of matrices connection. The proportion of the main component variance was taken as the indicator of the degree of the compared matrices similarity.
A comparison of nine z-transformed correlation matrices (three groups for three years of study) has shown that the similarity of the structure of correlations between accessions from different groups is lower than within each group (52.2 % for all groups, 70.9 % for group 1, 52.4 % for group 2, and 65.5 % for group 3). When studying the variability of correlation matrices for each year, the largest differences between them were observed in 1994 with correlations similarity of 67.0 %, while in 1989 and 1992 variability of the trait correlation structure was approximately the same (73.4 % and 73.6 %, respectively) (Fig. 1).
The factorial variance of the correlation matrices of all accessions over the years of the study was 83.4 % and exceeded the factorial variance of the matrices calculated for groups (69.7 %). It follows that the variability of the correlation structure in these accessions is largely affected by the genotypic properties of the variety. In 1994, the strongest differences were noted between the structures of correlations in the matrices calculated for all groups; evidently, the conditions critical for growth caused serious and diverse changes in the structure of correlations in different accessions. The highest variability in the structure of correlations was noted for the semi-cultivated accessions. As the growing conditions changed, an instability of the structure of relationships between traits was also noted in this group, though the degree of trait determination (R², the strength of relationships) was the strongest (Table).
Fig. 1. F1, matrix similarity factor; F2, matrix specificity factor. 1, group of accessions with the determinate growth habit and more than 2 branches; 2, group of accessions with the indeterminate growth habit and numerous branches of the 1st and 2nd order; 3, group of accessions with the indeterminate growth habit and 1-2 branches (or without them).
When comparing the level and structure of correlations in different years, one can notice that the strength of relationships between the traits and the difference in the correlation matrices structure increase with the deterioration of external conditions. Under favorable conditions, the structure of correlations in soybean varieties with different growth habit and branching characters is more similar than in the conditions critical for vegetation. The adaptation of soybeans to the changing conditions in different groups occurs due to restructuring of the relationship system specific to a particular group.
To reveal more specific differences in the structure of trait relationships in accessions from different groups in different years of the study, matrix images in the form of the correlation cylinder sections were used (Fig. 2). The correlation relationships in the green biomass productivity pleiad were the strongest and most stable in all accessions. In 1989, a year that was favorable for growth, the plant weight in accessions of the first group was associated with the weight of leaves, branches and the number of leaves. A strong and permanent correlation was also observed between the growing season duration and the dry matter content in the vegetative mass. In 1992, in the conditions of a cold and wet year, the correlation relationships in the plant mass productivity pleiad strengthened, and the stem diameter entered the pleiad. A significant increase was also noted for the correlation between the growing season, the weight of 1000 seeds and the protein content in them. Under severe drought conditions in 1994, the influence of the growing season on other traits increased. The plant length was determined by the growing season duration. The 1000 seed weight correlated with the high protein content in seeds both in 1992 and in 1994.
Fig. 2. Trait designations: U, plant weight in pod filling phase; L_u, leaves weight; L_ux, one leaf average weight; L%, leaves percentage of the total plant weight; L_w, middle leaf width; L_n, number of leaves per plant; V_u, branches weight; V_ux, one branch average weight; V, number of branches per plant; H, plant length; h_1, average internode length; Di, stem diameter; N, number of nodes per plant; W1, 1000 seed weight; R_w, seed hilum width; T, sprouting-to-maturity period duration; DR, dry matter content in green biomass; Pro, protein content in green biomass; Pro_s, protein content in seeds; OL_s, oil content in seeds.
The second group, composed of semi-cultivated accessions, was distinguished by strong relationships between almost all traits, and the strongest ones were observed in 1994, a year that was dry and critical for soybean growth. In contrast to accessions from the first group, in 1989, a year that was favorable for vegetation, the green biomass productivity depended not only on the weight and number of leaves, but also on the number of nodes and the internode length. Also, a very strong correlation was observed between the 1000 seed weight and the oil content in seeds. The growing season duration correlated with the plant length and with the content of dry matter in the green biomass. At low temperatures in 1992, the role of plant weight in the total variability significantly increased due to the increased strength of correlations with the weight of leaves and stem diameter. The leaf weight correlated with the 1000 seed weight, stem diameter and oil content in seeds. In contrast to 1989, the 1000 seed weight strongly correlated with the internode length. Similar to the accessions with the determinate growth habit, the relationship between the growing season duration and dry matter remained stable in all years of the study. Under the conditions of 1994, an increase in all correlations was accompanied by a weakened influence of the number of nodes on the variability of plant structures. The green biomass productivity, on the contrary, correlated with the majority of the studied parameters. A relationship formed between the stem diameter and leaf characteristics, the percentage of leaves and the number of branches per plant. It is interesting to note that, in contrast to 1992, there was a negative correlation between the leaf width and the protein and oil content in seeds.
In all the years of the study, the most stable and strong correlations in the third group of accessions (with the indeterminate growth habit and with 1-2 branches, or without them) were those between the green matter productivity, the weight of leaves, the number of leaves, and the growing season duration. In 1989 and 1994, there was a stronger negative relationship between the percentage of leaves per plant and the growing season duration than in the accessions from the first and second groups. The correlations between the traits of green biomass productivity were similar to the relationships revealed in the accessions with the determinate growth habit.
The strongest relationships between the traits were observed in the conditions of 1992. That year, the stem diameter and the number of nodes per plant played a more significant role in the total variability of traits. The growing season duration correlated with the protein content in seeds. The trait of plant length had a greater significance than for the accessions from other groups. This trait was associated with the internode length, the growing season duration, dry matter content in green biomass and protein content in seeds. The drought of 1994 increased the strength of relationships between traits, though to a lesser extent than the growing conditions did in 1992. The structure of correlations in 1994 was close to the structure of relationships between traits in the accessions with the determinate growth habit.

Discussion
A comparison of the average level of trait determination (R²) (Fig. 3) and the stability of the structure of their relationships (correlations) (FD1%), performed for both different years and different groups, has shown that the strongest relationships and the greatest similarity in the structure of correlations were observed for the vegetative mass productivity, growing season duration, and the number of branches per plant. A comparison of the R²/FD1% ratios for nine correlation matrices (all years and groups) displays a noticeable decrease in the similarity in the structure of relationships between such traits as the growing season duration, seed hilum width and the number of branches per plant.
These traits are the main ones in the pleiades of the growing season duration (T), seed and pod parameters (R_w), and stem characters, i.e., growth habit and branching characters (V); they have a high level of relationships with other plant traits (not included in their own pleiades) and are characterized by strong variability of these correlations, which depends on both conditions and genotype. It means that the same traits in different groups form relationships with different traits under the changing conditions. The lability of the trait correlations described above apparently plays a role in the plant adaptation to various growth conditions. An analysis of the variability of the determination coefficients revealed the highest level (strength) of relationships between the traits in the pleiades of seed productivity, growing season, and plant characters (growth habit and branching characters) (Fig. 4). A lower level of determination and its relative stability were observed for such traits as stem diameter, leaf length, protein content in seeds, and dry matter content in green biomass. The traits in the pleiad of green biomass productivity, shoot, and leaf width were distinguished by a lower variability of the determination coefficients, i.e., displayed a stable level of relationships (correlations).
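The two quantities compared in this analysis, the average determination of a trait and the variability of that determination, can be computed as sketched below. The sketch rests on an assumption about how the values were obtained: for every trait, R² is averaged over its correlations with all other traits within a matrix, and the mean and sample standard deviation are then taken across matrices. The function names and the two small matrices are hypothetical.

```python
import numpy as np


def trait_determination(corr):
    """Mean R^2 of each trait with all other traits in one matrix."""
    r2 = np.asarray(corr, dtype=float) ** 2
    n_traits = r2.shape[0]
    # subtract the diagonal (r = 1) before averaging over the other traits
    return (r2.sum(axis=1) - np.diag(r2)) / (n_traits - 1)


def determination_profile(matrices):
    """Per-trait mean and sample SD of R^2 across correlation matrices."""
    per_matrix = np.array([trait_determination(m) for m in matrices])
    return per_matrix.mean(axis=0), per_matrix.std(axis=0, ddof=1)


# Two hypothetical 3-trait matrices standing in for different year/group samples
favorable = np.array([[1.0, 0.8, 0.3],
                      [0.8, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])
critical = np.eye(3)   # no relationships at all, an extreme illustration

mean_r2, sd_r2 = determination_profile([favorable, critical])
```

Traits with a high `mean_r2` and a low `sd_r2` would correspond to a strong and stable level of relationships, while a high `sd_r2` indicates correlations that change with conditions or genotype.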
A detailed study of the correlation structure variability has shown that the relationships of green biomass productivity with the weight of branches and leaves, and with the number of leaves, are stable, display the highest level of relationships and are characteristic of all accessions. The green biomass increase in the varieties with the determinate growth habit and a large number of branches is influenced more by the average weight of a branch, while the growing season duration, the weight of one leaf, and the number of leaves per plant have a stronger impact on the accessions with the indeterminate growth habit and 1-2 branches or without them. In addition to the above-mentioned traits, the green biomass productivity in the semi-cultivated accessions (with the indeterminate growth habit and numerous branches of the 1st and 2nd order) depends on the number of nodes and the main stem internode length. Changes in the environmental conditions cause both general rearrangements in the structure of correlation relationships, and individual ones, which are characteristic of plants with a certain growth habit and branching.
Fig. 4. Trait determination and variability of the strength of relationships between traits. X-axis, average determination (R²); Y-axis, standard deviation of the average determination coefficient (SD R²). For trait designations see Supplement and Fig. 2.
The studies of correlation relationships between the traits in soybean varieties with different growth habit and branching characters have revealed a regularity in these relationships. A minor deterioration in the growing conditions entails a slight decrease in the degree of correlation of all the traits. Harsher conditions change the behavior of the traits of the generative and vegetative spheres. Fig. 4 shows the separation of these traits into two groups. R² values either diminish or increase for the vegetative organ traits. Determination coefficients sharply increase for the traits associated with seed productivity. In varieties of the first and second groups (with numerous branches), the separation of the green biomass and seed productivity traits in terms of the degree of correlation was observed when they developed under drought. The accessions from the third group (with few branches) displayed such a separation of traits during vegetation in a year with excessive moisture in the initial phases of growth and insufficient moisture in the second half of summer. The difference between the varieties during critical periods was associated with the branching characters in these groups. In a dry year, the accessions with numerous branches experienced a greater moisture deficiency from the first phases of vegetation (from the branching period). The varieties characterized by a small number of branches did not experience such a severe influence of drought at that time. For them, the most unfavorable year was the one with a long period of vegetative organ growth (the phase before flowering) and insufficient moisture during the periods of mass flowering and pod formation.
Thus, seed productivity and green biomass productivity do not always show a direct and strong correlation with each other. The relationship between these traits is influenced by both the conditions of the year and the features of the variety. The variation of R² values for green biomass and seed productivity has its specificity in each group, which should be taken into account in breeding for the main economic traits.

Conclusion
A comparison of the correlation coefficient values calculated for all accessions, groups distinguished by the growth habit and branching characters, and for the years of study, yields results that are noticeably different. The data calculated for all accessions regardless of the growth habit and branching characters, characterize only the species specificity of the coordinated variability of soybean traits and do not reveal the features of varieties that are important for understanding their behavior in changing environmental conditions. This was not taken into account in most works that dealt with determining correlations between the traits and identifying indirect traits for selection for important economic characters. That is why the selection using the indicator traits identified by other researchers often did not yield a proper result.