Chromosome number evolution was analysed in 6 major angiosperm clades (Monocots, Fabids, Malvids, Caryophyllales, Lamiids, Campanulids; APG IV, 2016) including a minimum of 67 and a maximum of 200 taxa per clade. The whole dataset is listed in the Supporting Information Table S2.
The topology of the phylogentic tree is based on the backbone phylogeny of European flora (Durka & Michalski 2012). Subsequently, lower level nodes and species were resolved and assembled onto the backbone tree, using more than 90 phylogenetic and systematic studies (see Supporting information Table S1 and Figure S1).
Mean diploid (2n) and basic (x, see Peruzzi, 2013) chromosome numbers for each taxon were visualized on the phylogenetic tree (Fig. 1) with the plotsimmap function in the package phytools (Revell, 2012) of R (R Core Team, 2015).
Climatic, ecological and morphological data
Climatic data associated with the sampling sites were acquired from the Worldclim database at 2.5 min scale (http://www.worldclim.org; Hijmans et al., 2005) The considered climate data are: mean annual temperature (”C), temperature seasonality (SD ” 100), temperature continentality (”C), mean annual precipitation (mm) and precipitation seasonality (coefficient of variation). Means and standard deviations were estimated for each species within a buffer of 10 min over georeferenced sites using Qgis v. 2.18 (Quantum Gis, http://www.qgis.org).
Data about morphological traits and habitat characteristics, for each of the considered species, were retrieved from literature (Pignatti, 1982 and Table S1). To characterise the relevant features of vegetative morphology and reproductive strategies, we considered the growth form (annual herb, geophytes, perennial herb and woody), the flower size (large, small, inconspicuous) and whether the flowers are clustered in inflorescence or not. Habitats were classified using categorical variables ecological meaningful in terms of stability vs. instability of the habitats and reliable vs. unreliable resource availability. For this purpose, we classified the species into three categories according to soil moisture (dry; moist; wet), habitat light (open; semi-shaded; forest) and soil nutrient contents (oligo-, meso-, eu-trophic).
Phylogenetic comparative analysis
We investigated whether the evolution towards optima of diploid (2n) and basic (x) chromosome numbers is influenced by climatic variables (continuous predictors), habitat characteristics or plant traits (categorical predictors) within different angiosperm clades.
To this end, we used the phylogenetic comparative approach implemented in the R program SLOUCH, designed to study adaptive evolution of a trait along a phylogenetic tree (Hansen et al., 2008). With this approach, the response trait is modelled as an Ornstein’Uhlenbeck model assuming that the trait has a tendency to evolve towards a ‘primary’ optimum ”, defined as the average optimal state that species will reach in a given environment when ancestral constraints have disappeared (Hansen, 1997).
To facilitate interpretation of parameter estimates, the tree is scaled to 1.0 total length (from the root to the tip in the ultrametric tree). The two main parameters returned by the model are the phylogenetic half-life (t1/2) and the stationary variance (vy). Phylogenetic half-life can be interpreted as the average time it takes a species to evolve half the distance from the ancestral phenotype towards the predicted optimal phenotype, and thus is a quantification of the phylogenetic inertia (Hansen, 1997). A half-life above zero indicates that adaptation is not immediate, while a half-life = zero means that there is no evolutionary lag. The stationary variance is the stochastic component of the model, which can be interpreted as evolutionary changes in the response trait to unmeasured selective forces and genetic drift.
Models that include a predictor variable are referred to as adaptation models because these models test whether the response traits evolve towards optima influenced by a predictor. The adaptation models are contrasted with an intercept-only model in which the half-life is a measure of the phylogenetic effect in the response variable. A half-life = zero in such a model means that the response variable is not phylogenetically structured, while a half-life’>’0 indicates that there is an influence of phylogeny on the data, when the half-life is very large, this can attributed to an underlying continuous Brownian motion process. Indeed, a phylogenetic effect can be due to lag of adaptation, adaptation towards phylogenetically structured optima, or a combination of both. Hence, when a trait exhibits phylogenetic signal, it is not immediately clear whether this is due to the trait itself having strong inertia, or if the trait is evolving in response to a phylogenetically structured variable. By comparing a model with and without the inclusion of predictor variables, it is possible to determine how much of the phylogenetic signal can be attributed to phylogenetic inertia. No reduction in the t1/2 suggests that phylogenetic signal can be entirely attributed to phylogenetic inertia; on the contrary when a trait evolves in response to a variable, a reduction in the half-life for the response trait should be observed.
The adaptation models which include continuous predictors (climatic variables) are fitted using generalized least squares for estimation of the regression parameters and maximum likelihood for estimation of t1/2 and vy in an iterative procedure (Hansen et al., 2008). In addition, the SLOUCH model assumes that the predictors have a longer phylogenetic half-life than the model residuals and this is well supported by the variables involved in our study. The model returns parameters of an optimal regression and of a phylogenetic regression. The former is the relationship between the response and predictor variable that is predicted to evolve free of ancestral influence (absence of inertia). Therefore, the slope of this regression must be steeper than that of the phylogenetic regression.
To evaluate the effect of the categorical predictors on the evolution of chromosome number, the ANOVA and ANCOVA extensions implemented in SLOUCH were used. Categorical character-states were mapped onto the phylogeny using parsimony reconstruction.
Adaptation models are compared to the intercept-only models using the Akaike information criterion corrected for small sample-size (AICc); models with AICc scoring zero-two units lower are considered substantially better while if more than two units lower than the intercept-only models are considered significantly better (Burnham & Anderson, 2004). Finally, model interpretations were based on comparisons of t1/2 and vy of the adaptation models with the intercept-only model together with the amount of variation in chromosome number that the models explain. All statistical analyses have been carried out in R v3.2.3.