Evaluating revised biomass equations: are some forest types more equivalent than others?

Background In 2014, Chojnacky et al. published a revised set of biomass equations for trees of temperate US forests, expanding on an existing equation set (published in 2003 by Jenkins et al.), both of which were developed from published equations using a meta-analytical approach. Given the similarities in the approach to developing the equations, an examination of similarities or differences in carbon stock estimates generated with both sets of equations benefits investigators using the Jenkins et al. (For Sci 49:12–34, 2003) equations or the software tools into which they are incorporated. We provide a roadmap for applying the newer set to the tree species of the US, present results of equivalence testing for carbon stock estimates, and provide some general guidance on circumstances when equation choice is likely to have an effect on the carbon stock estimate. Results Total carbon stocks in live trees, as predicted by the two sets, differed by less than one percent at a national level. Greater differences, sometimes exceeding 10–15 %, were found for individual regions or forest type groups. Differences varied in magnitude and direction; one equation set did not consistently produce a higher or lower estimate than the other. Conclusions Biomass estimates for a few forest type groups are clearly not equivalent between the two equation sets—southern pines, northern spruce-fir, and lower productivity arid western forests—while estimates for the majority of forest type groups are generally equivalent at the scales presented. Overall, the possibility of very different results between the Chojnacky and Jenkins sets decreases with aggregate summaries of those ‘equivalent’ type groups.


Background
Nationally consistent biomass equations can be important to forest carbon research and reporting activities. In general, the consistency is based on an assumption that allometric relationships within forest species do not vary by region. Essentially, nearly identical trees even in distant locations should have nearly identical carbon mass. In 2003, Jenkins et al. published a set of 10 equations for estimating live tree biomass, developed from existing equations using a meta-analytical approach, which were intended to be applicable over temperate forests of the United States [1]. These equations were developed to support US forest carbon inventory and reporting, and had several key elements: (1) a national scale, so that regional variation in biomass estimates due to the use of local biomass equations was eliminated, (2) the exclusion of height as a predictor variable, and (3) in addition to equations to estimate aboveground biomass, a set of component equations allowing the separate estimation of biomass in coarse roots, stem bark, stem wood, and foliage. Since their introduction, these equations have been incorporated into the Fire and Fuels Extension of the Forest Vegetation Simulator as a calculation option [2], utilized in NED-2 [3], and have provided the basis for calculating the forest carbon contribution to the US annual greenhouse gas inventories for submission years 2004-2011 (e.g., see [4]). Researchers in Canada [5,6] and the US (e.g. [7,8,9]) have also employed the equations, while other investigators have adopted the component ratios to estimate biomass in coarse roots or other components (e.g. [10,11]).
Open Access. *Correspondence: choover@fs.fed.us; USDA Forest Service, Northern Research Station, Durham, NH, USA
In 2014, Chojnacky et al. [12] introduced a revised set of generalized biomass equations for estimating aboveground biomass. These equations were developed using the same underlying data compilations and general approaches to developing the individual tree biomass estimates as for Jenkins et al. [1], but with greater differentiation among species groups, resulting in a set of 35 generalized equations: 13 for conifers, 18 for hardwoods, and 4 for woodland species. Important distinctions are: the database used to generate the revised equations was updated to include an additional 838 equations that appeared in the literature since the publication of the 2003 work or were not included at that time, taxonomic groupings were employed to account for differences in allometry, and taxa were further subdivided in cases where wood density varied considerably within a taxon. The only component equation revised by Chojnacky et al. [12] was for roots; equations were fitted for fine and coarse roots, in contrast to Jenkins et al. [1] where fine roots were not considered separately.
Based on the similarity of the equation development approach, it is likely that applications using the Jenkins et al. [1] set would have essentially the same basis for employing the revised equations. Since the primary objective of Chojnacky et al. [12] was to present the updated equations and describe the nature of the changes, only a brief discussion of the behavior of the updated equations vs. the Jenkins et al. [1] equation set was included. The authors noted that at a national level results were similar, while differences occurred in some species groups, for example, western pines, spruce/fir types, and woodland species. Given the limited information provided in Chojnacky et al. [12] we felt that a more thorough investigation of the differences in carbon stock estimates as generated with both sets of equations was needed.
One potentially practical result from a comparison of the two approaches is to identify where one set effectively substitutes for the other, which then suggests that revising or updating estimates would change little from previous analyses. For this reason we applied equivalence tests to determine the effective difference of the Chojnacky-based estimates relative to the Jenkins values. Note that hereafter we label the respective equations and species groups as Chojnacky and Jenkins (i.e., in reference to their products, not the publications per se).
In this paper, we: (1) provide a roadmap for applying the Chojnacky equations to the tree species of the US Forest Service's forest inventory [13], (2) present results of equivalence testing for carbon stock estimates computed using both sets of equations, and (3) provide general guidance on the circumstances when the choice of equation is likely to have an important effect on the carbon stock estimate. Note that we do not attempt any evaluation of relative accuracy or the relative merit of one approach relative to the other.

Results and discussion
We conducted multiple equivalence tests on data aggregated at various levels of resolution. As noted by Chojnacky et al. [12], at a national level the carbon density predicted by both sets of equations was the same when grouped by just hardwoods and softwoods, while some type groups showed differences (though no statistical comparisons were conducted). Relative differences emerged when total carbon stocks in the aboveground portion of live trees were summarized for four regions (Fig. 1) rather than for the entire United States, as shown in Fig. 2. Totals for the US, as well as separate summaries for softwood and hardwood forest type groups (not shown), differ by about 1 %. This similarity in aggregate values between the two approaches holds for the Rocky Mountain and North regions, where there is less than a 1 % difference between the two. There are more sizeable differences in the Pacific Coast and South regions, notably differing in direction and magnitude. The largest difference is in the South. Note that our results are presented in terms of carbon mass rather than biomass.
To examine the drivers of those differences, we carried out equivalence tests by forest type group at both the national and regional levels on the mean density of carbon in aboveground live trees; a summary of the results is given in Table 1. The quantity tested is the mean difference (Chojnacky − Jenkins) in plot-level tonnes carbon per hectare; the test for equivalence was based on the percentage difference relative to the Jenkins-based estimate (i.e. 100 × ((Chojnacky − Jenkins)/Jenkins)). The equivalence interval, set at 5 (or 10) % of the Jenkins-based estimate, was expressed in tonnes per hectare for comparison with the 95 % confidence interval for the α = 0.05 (or α = 0.1) two one-sided tests (TOST) of equivalence. Of the 26 forest type groups included in the analysis, 20 are equivalent (at 5 or 10 %) at the national level, with most equivalent at 5 %. The exceptions are: spruce/fir, longleaf/slash pine, loblolly/shortleaf pine, pinyon/juniper, other western softwoods, and woodland hardwoods. At a regional level, differences emerge; in the North, only spruce/fir and loblolly/shortleaf pine are not equivalent (too few plots were available in pinyon/juniper for a reliable test statistic), while in the South, the pine types lacked equivalence, as did pinyon/juniper. This is very likely a reflection of the fact that the Chojnacky equations divide some taxa by specific gravity, while the Jenkins equations do not; softwoods generally display a larger range of specific gravity values within a species group than do hardwoods [14]. Researchers have noted considerable variability in the estimates produced by different southern pine biomass equations [15], even between different sets of local equations. Specific gravity, as mentioned above, is one factor (southern pines exhibit considerable variability in specific gravity); stand origin and the mathematical form of the equation itself are others. Melson et al. [16], in their investigation of the effects of model selection on carbon stock estimates in northwest Oregon, noted that the national-level Jenkins [1] equations produced biomass estimates for Picea that were consistently lower than those from approaches developed by the investigators, and hypothesized that differences in form between Picea species introduced bias into the generalized equation.
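The equivalence check described above can be illustrated with a short Python sketch. It is not the authors' code: the function name `equivalence_check`, the paired plot-level inputs, and the normal approximation to the confidence interval are all assumptions made for illustration. The sketch treats two estimates as equivalent when the confidence interval of the mean paired difference (Chojnacky − Jenkins) lies entirely within ± the chosen percentage of the Jenkins mean. The default confidence level follows the 95 % interval stated in the text; a TOST at level α is conventionally paired with a 1 − 2α interval, so the level is left as a parameter.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def equivalence_check(chojnacky, jenkins, pct=5.0, conf=0.95):
    """Confidence-interval form of an equivalence test on paired plot values.

    chojnacky, jenkins: plot-level carbon densities (tonnes C per hectare),
    one pair per plot. pct: equivalence margin as a percentage of the
    Jenkins mean. conf: confidence level of the interval (assumed here).
    """
    if len(chojnacky) != len(jenkins) or len(jenkins) < 2:
        raise ValueError("need at least two paired plot values")

    diffs = [c - j for c, j in zip(chojnacky, jenkins)]
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))

    # Normal approximation (an assumption); with few plots a t quantile
    # would give a wider interval.
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    lower, upper = mean_diff - z * std_err, mean_diff + z * std_err

    # Equivalence margin converted from percent to tonnes per hectare.
    jenkins_mean = mean(jenkins)
    margin = pct / 100.0 * jenkins_mean

    return {
        "mean_diff": mean_diff,
        "pct_diff": 100.0 * mean_diff / jenkins_mean,
        "ci": (lower, upper),
        "margin": margin,
        "equivalent": -margin < lower and upper < margin,
    }


if __name__ == "__main__":
    # Made-up plot values, for illustration only.
    jenkins_plots = [100.0, 120.0, 80.0, 90.0, 110.0]
    chojnacky_plots = [101.0, 121.0, 82.0, 91.0, 110.0]
    print(equivalence_check(chojnacky_plots, jenkins_plots, pct=5.0))
```

Running the same function with `pct=10.0` corresponds to the wider equivalence interval; a group that fails at 5 % may still pass at 10 %.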
Pinyon/juniper was not equivalent in any region in which it was tested. While fir/spruce/mountain hemlock was not equivalent in the Rocky Mountains, the stock estimates were equivalent to 5 % in the Pacific Coast region, likely a function of the species and size classes that dominate the groups in each of these regions. The elm/ash/cottonwood category is represented in each region, and was equivalent to 5 % in all areas except the Pacific Coast. The woodland class has been less well studied than the others, and so less data and fewer equations are available to construct generalized equations like those in Jenkins et al. [1] and Chojnacky et al. [12]. Consequently, the woodland equations are not equivalent at the national level or in any region.
We also explored the effect of size class on equation performance, testing each combination of forest type group and stand size class, and found notable differences among size classes, though no evidence of a systematic pattern. A summary of the results is given in Fig. 3a, b; the error bars represent the 95 % confidence interval transformed to percentage. Not every combination is shown; groups with results similar to another or comprising a very small proportion of plots are not included. While some groups such as ponderosa pine, oak/hickory, lodgepole pine, and white/red/jack pine show small differences between size classes and are equivalent (or nearly so), others such as loblolly/shortleaf pine, longleaf/slash pine (data not shown), woodland hardwoods, and spruce/fir show a strong pattern of increasing differences with increasing stand size, with a lack of equivalence between the small and large sawtimber classes. Note that both the direction and magnitude of the differences were variable across the forest type groups. Hemlock/Sitka spruce displayed a strong trend in the opposite direction, with large differences between the two approaches for the small and medium size classes, and a very small difference in the large sawtimber class. The difference between the two sets of estimates for the woodland group that is shown in Table 1 is readily apparent in Fig. 3a, with a large increase in the percent difference as the stand size class increases. This may be due to the lack of woodland biomass equations based on diameter at root collar (drc) and the difficulty of obtaining accurate drc measurements. Bragg [17] and Bragg and McElligott [15] have discussed the importance of diameter at breast height (dbh) in some detail, comparing the performance of local, regional, and national equations for southern pines across a range of diameters. While most equations returned fairly similar estimates for trees up to 50 cm dbh, equation behavior diverged at larger diameters, in some cases returning estimates that were considerably different. In these examples, the national-level Jenkins equations [1] did not produce extreme estimates; they were intermediate to those returned by local and regional equations. Melson et al. [16] also noted that considerable error could be introduced when applying equations to trees with a dbh value outside the range on which the equations were developed.
Equivalence was not tested at the level of the individual tree, though a random subset of individual tree estimates was plotted for each species group to compare tree-level biomass estimates. These plots reflect the patterns demonstrated above, with one method producing values consistently higher or lower than the other, the differences becoming more apparent at larger diameters. Tree data were also classified by east and west to further explore equation behavior within species groups where there are considerable differences in the range of tree diameters, east versus west. In many cases, no trends were revealed, but there are some key differences; a notable example is shown in Fig. 4a, b, which show the results of tree-level carbon estimates by each set of equations, categorized as east and west. In Fig. 4a, the eastern US, the Jenkins estimates are larger than those produced from the Chojnacky equations, while in Fig. 4b, the western US, the Jenkins estimates are generally somewhat lower, with the exception of the "Abies; LoSG" group. Figure 5 shows similar data for the woodland taxa; again, there is a considerable difference between the estimates computed with the two methods, with the Jenkins equations producing consistently lower estimates than the Chojnacky equations. In this case, we see no obvious differences between the predictions in the East or West.
Table 1 Mean stock of carbon in aboveground live tree biomass as computed using the equations from Jenkins et al. [1] and Chojnacky et al. [12]. Values followed by a double asterisk (**) are equivalent at 5 %; values followed by a single asterisk (*) are equivalent at 10 %. Regions are as shown in Fig. 1. A diamond preceding a value indicates that the sample size was too small for a reliable test of equivalence. Data not shown for categories represented by fewer than 10 plots. a As shown in Fig
As mentioned above, the belowground component equations were also revised in the 2014 publication, and while not divided according to hardwood and softwood, the revised root component equations are subdivided by coarse and fine roots. There are important differences in the shape of the root component curve between the two approaches (Fig. 6), and the Jenkins hardwood equation yields a consistently lower proportion than the Chojnacky equation. This suggests that adopting the Chojnacky estimates for the full above- and belowground tree would add up to an additional 2 to 3 % of biomass for hardwoods but would also affect some softwood estimates. A preliminary analysis did show an effect on the test for the 5 % equivalence for some categories. However, our emphasis here is on the various species groups and equations, not the components.

Conclusions
The revised approach to developing these biomass equations has the effect of providing better regional differentiation and representation in plot- and stand-level summaries by allowing for separation within the taxonomic classes according to wood properties or growth habit. The emergence of southern pines as distinctly different under the Chojnacky groups is one example. It is challenging to provide specific criteria for choosing one set of equations over the other, since validating any biomass equation requires the destructive sampling of multiple stems across a range of diameters. The Chojnacky groups appear to provide greater resolution across forest types and regions. From this, investigators working in southern pine, northern spruce-fir, pinyon-juniper, and woodland types may be advised to use the updated equations [12], which provide more taxonomic resolution. It should also be noted that estimates of change over time are somewhat less sensitive to equation choice than stock estimates, so if change is the primary variable of interest, the user can select either equation set, based on personal preference.
Fig. 3 Estimates are classified by forest type group and stand size class. The error bar represents the confidence interval used in the equivalence tests. In general, small stands have at least 50 % of stocking in small diameter trees, large stands have at least 50 % of stocking in large and medium diameter trees, with large tree stocking ≥ medium tree. The 12 forest type groups included here are: loblolly/shortleaf pine, pinyon/juniper, ponderosa pine, oak/pine, oak/hickory, and woodland hardwoods in panel a, and white/red/jack pine, spruce/fir, Douglas-fir, lodgepole pine, hemlock/Sitka spruce, and maple/beech/birch in panel b
Fig. 4 Examples of the Chojnacky-based and Jenkins-based estimates for aboveground carbon mass (kg) of individual live trees (plotted by diameter at breast height, dbh). Separate panels show the East (a, North and South) and the West (b, Pacific Coast and Rocky Mountain). This example includes trees within the fir species group of Jenkins (black) and their mapping to Chojnacky (red) species groups, which are identified in Table 2. Data points include applicable live trees in the FIADB tree data table up to the 99th percentile of diameters in the east and west, respectively
Estimates for individual large-diameter trees can differ greatly between Chojnacky and Jenkins, given the general trends of the tree-level estimates (Figs. 4 and 5 in this manuscript as well as Figs. 2, 3, and 4 in Chojnacky et al. [12]). One or a very few larger trees can therefore produce very different estimates even in an "equivalent" forest type group, and this potential for larger differences is reflected in plot-level data. For example, in some eastern hardwood type groups, which were consistently identified as equivalent, up to one-third of the plots were individually more than 5 % different. In the oak/gum/cypress type group in the South, 8 % of the plots had carbon density more than 5 % greater with the Jenkins estimates, while 27 % of plots had carbon density more than 5 % greater with the Chojnacky estimates. The remaining 65 % of the individual plots are within the 5 % bounds (data not shown here). This is consistent with our observation about similarities between the two sets and scale (Fig. 2): the sometimes obvious and large differences for some forest type groups (all scales) become obscured when summed to total live tree carbon for the US. Singling out the correct or most accurate equations is beyond the scope here; however, caution is always warranted when applying equations to trees that are considerably outside the range of diameters used to construct the equations [16].
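The plot-level breakdown above (shares of plots more than 5 % apart in each direction, and the share within bounds) can be sketched as follows. This is an illustrative sketch only: the function name `plot_difference_shares` is hypothetical, and expressing each plot's difference as a percentage of its Jenkins estimate is an assumption carried over from the percentage-difference definition used for the equivalence tests.

```python
def plot_difference_shares(chojnacky, jenkins, pct=5.0):
    """Percentage of plots in each direction of disagreement.

    chojnacky, jenkins: paired plot-level carbon densities (tonnes C/ha).
    A plot counts as different when the two estimates are more than
    pct percent apart, relative to the Jenkins value (an assumption).
    """
    chojnacky_higher = jenkins_higher = within = 0
    for c, j in zip(chojnacky, jenkins):
        if j <= 0:
            # Skip plots without a positive Jenkins estimate; the
            # relative difference is undefined there.
            continue
        rel = 100.0 * (c - j) / j
        if rel > pct:
            chojnacky_higher += 1
        elif rel < -pct:
            jenkins_higher += 1
        else:
            within += 1

    n = chojnacky_higher + jenkins_higher + within
    if n == 0:
        raise ValueError("no plots with a positive Jenkins estimate")
    return {
        "chojnacky_higher_pct": 100.0 * chojnacky_higher / n,
        "jenkins_higher_pct": 100.0 * jenkins_higher / n,
        "within_pct": 100.0 * within / n,
    }


if __name__ == "__main__":
    # Made-up plot values, for illustration only.
    print(plot_difference_shares([110.0, 90.0, 102.0, 100.0], [100.0] * 4))
```

A group can pass the test on the mean difference while a sizeable share of its individual plots falls outside the bounds, which is the pattern reported for the eastern hardwood groups.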
Our results point to a few forest type groups that are clearly not equivalent (southern pines, northern spruce-fir, and lower productivity arid western forests), while the majority of forest type groups are generally equivalent at the scales presented. Overall, the possibility of very different results between the Chojnacky and Jenkins sets decreases with aggregate summaries of those 'equivalent' type groups.

Tree data source
In order to implement the revised biomass equations and identify applications where they are effectively interchangeable, or equivalent, we used the Forest Inventory and Analysis Data Base (FIADB) compiled by the Forest Inventory and Analysis (FIA) Program of the US Forest Service [13]. The data are based on continuous systematic annualized sampling of US forest lands and are compiled and made available by the FIA program [18]; the specific data in use here were downloaded from http://apps.fs.fed.us/fiadb-downloads/datamart.html on 02 June 2015. Surveys are organized and conducted on a large system of permanent plots over all land within individual states so that a portion of the survey data is collected each year on a continuous cycle, with remeasurement at 5 or 10 years depending on the state. The portion of the data used here includes the conterminous United States (i.e., 48 states) and the portion of southern coastal Alaska that has established permanent annual survey plots (the gray areas in Fig. 1).
Our focus here is on the tree data of the FIADB, and for this analysis we present the Chojnacky and Jenkins estimates in terms of carbon mass (i.e., kg carbon per tree or tonnes per hectare per plot). We use the entire tree data table to ensure that all applicable species (the gray areas in Fig. 1) are represented. All other summaries are based on the most recent set of tree and plot data available per state, with the Chojnacky and Jenkins estimates expressed as tonnes of carbon per hectare in live trees on forest inventory plots. These plot-level values are expanded to population totals, that is, total carbon stock per state, as provided within the FIADB, as the basis for the result presented in Fig. 2. A subset of the current forest plot-level summaries where the entire plot is identified as forested (i.e., single condition forest plots) is the basis for the results provided in Table 1 and Fig. 3.

Application of Chojnacky et al. [12] to the FIADB
Chojnacky et al. [12] provided a revised and expanded set of biomass equations following an approach similar to that of Jenkins et al. [1] but drawing on an expanded database of published biomass equations; see Chojnacky et al. [12] for details. The new set of 35 Chojnacky species groups is based on taxon (family or genus), growth habit, or average wood density. See Table 2 for the links between species in the FIADB and the Jenkins and Chojnacky classifications. This allocation to the newer categories is not a simple mapping of the 10 Jenkins groups to Chojnacky groups. That is, while Jenkins groups are split among Chojnacky groups, the Chojnacky groups are also in some cases composed of species from different Jenkins groups. While Chojnacky et al. [12] developed the set of new groups based on the FIADB, similar to Jenkins et al. [1], a very small percentage of hardwood species were not explicitly named (i.e., families were not listed [12]). We assigned these to the "Cor/Eri/Lau/Etc" group (Table 2).
In order to systematically assign all the biomass estimates presented in Chojnacky et al. [12] to trees in the FIADB (as in this analysis), we present a short set of steps to make this link. Note that these include our interpretation of some of the assignments of species to groups that are not explicit, such as some assignments to the woodland groups or allocation to deciduous versus evergreen. These seven steps, which also include application of the revised root component, are the basis for the biomass equation group assignments in Table 2. Note that tables and figures referenced in this list refer to those in Chojnacky et al. [12]:
1. Overall, follow the placement of taxa as suggested within the manuscript (i.e., as in Tables 2, 3, 4, and Figs. 2, 3, and 4).
2. If a tree record is one of the five families (of Table 4) and the tree diameter is measured as diameter at root collar, then one of the Table 4 woodland equations applies. Otherwise, if the tree is one of the five (Table 4) families and the diameter is dbh, then use the appropriate equation from Tables 2 or 3. If the tree is not one of the five Table 4 families but the diameter is provided as a root collar measurement, then convert drc to dbh following information provided in Fig. 1 before applying a Table 2 or 3 equation.
3. The calculations for the woodland (Table 4) Cupressaceae ("Cupre; WL") use the "2nd juniper" equation from footnote #2 in Table 5.
4. The Fabaceae/Juglandaceae split into the two groups, "Fab/Jug/Carya" and "Fab/Jug", is according to the genus Carya versus all others (i.e., not-Carya).
5. Fagaceae's deciduous/evergreen split, "Faga; Decid" and "Faga; Evergrn", sets deciduous as the default. The Fagaceae allocated to evergreen are those five species explicitly listed as evergreen in Table 3 and those identified as evergreen from the USDA PLANTS database [19], which currently includes the addition of three live oak species.
6. The 6-family general equation at the middle of page 136 (in Table 3 of Chojnacky et al. [12]), "Cor/Eri/Lau/Etc", is assigned trees by family from 3 sources: (a) the six families listed in Table 3; (b) the five additional families noted in the Fig. 3 caption; and (c) any additional formerly unassigned hardwood species.
7. Roots: the Chojnacky estimates use both of the belowground root equations of Table 6 (the sum of the two is generally equivalent to the original Jenkins root component). Note that these are dbh-based, so for a tree measured at drc, first convert drc to dbh according to Fig. 1. Also note that all components of the original Jenkins et al. [1] other than roots remain applicable here.
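The branching in step 2 of the list above can be sketched in Python. The function and type names are hypothetical, and the set of woodland families is passed in by the caller because the five families are defined in Table 4 of Chojnacky et al. [12], not in this text. The two families used in the example (Cupressaceae and Fagaceae) are the ones this paper names as having woodland coefficients; the set is deliberately incomplete.

```python
from typing import NamedTuple


class EquationChoice(NamedTuple):
    source: str               # which Chojnacky et al. table supplies the equation
    convert_drc_to_dbh: bool  # True if drc must be converted to dbh first


def choose_equation_source(family, diameter_type, woodland_families):
    """Apply step 2 of the roadmap to a single tree record.

    family: taxonomic family of the tree.
    diameter_type: "dbh" (breast height) or "drc" (root collar).
    woodland_families: the five Table 4 families, supplied by the caller.
    """
    if diameter_type not in ("dbh", "drc"):
        raise ValueError("diameter_type must be 'dbh' or 'drc'")

    # Woodland family measured at root collar: woodland equation applies.
    if family in woodland_families and diameter_type == "drc":
        return EquationChoice("Table 4 (woodland)", False)

    # Every other tree uses a Table 2 or 3 equation; a drc measurement
    # on a non-woodland family is converted to dbh beforehand.
    return EquationChoice("Tables 2 or 3", diameter_type == "drc")


if __name__ == "__main__":
    # Illustrative, incomplete set; see Table 4 of Chojnacky et al. [12].
    example_woodland = {"Cupressaceae", "Fagaceae"}
    for fam, dia in [("Cupressaceae", "drc"), ("Fagaceae", "dbh"), ("Betulaceae", "drc")]:
        print(fam, dia, choose_equation_source(fam, dia, example_woodland))
```

Steps 3 to 7 (the juniper equation, the Carya and evergreen splits, the catch-all hardwood group, and the root equations) would be handled after this choice, once the table and diameter basis are settled.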

Identifying equivalence between the alternate biomass estimates
Tests of equivalence of the plot level (tonnes carbon per hectare) representation of the Jenkins and Chojnacky groups are included principally as guidance as to where the choice of biomass equations may matter. The analysis does not address relative accuracy of the two alternatives. Specifically, we focused on equivalence tests of the mean difference between the two estimates at the plot, or stand, level according to region and forest type groups. While these are species (group) level equations, any practical effect (of interest) is at plot to landscape to national (carbon reporting) levels. Equivalence tests are appropriate where the questions are more directly "are the groups similar, or effectively the same?" and not so much "are they different?" [20,21]. This distinction follows from the idea that failure to reject a null hypothesis of no difference between populations does not necessarily indicate that the null hypothesis is true. The essential characteristic of an equivalence test is that the null hypothesis is stated such that the two populations are different [22,23] which can be viewed as the reverse of the more common approach to hypothesis testing. The specific measure, or threshold, of where two populations can be considered
The first part of the Chojnacky parameter designator is the species group; text after a semicolon indicates the relevant category when more than one set of coefficients is given for a group. HiSG: the coefficients given for the highest specific gravity in the designated species group. LoSG: the lowest specific gravity given for a species group. MedSG: select the coefficients given for the mid-range specific gravity. WL: select the set of coefficients given for the woodland type. For example, Fagac; WL indicates that the second to the last line of Table 5, Woodland, Fagaceae should be used rather than the coefficients provided for Hardwood; Fagaceae