The northern (N) region encompassed 289,766 ha of core study area within 659,592 ha of total area (Figure 1). The core study area was defined by an ongoing REDD+ project, whereas the total area was defined by our satellite coverage. The N region contained forests broadly classified as mid-elevation (600-2900 m a.s.l.) evergreen humid tropical forest, underlain by basement rock formations [19]. Above 2100 m elevation, forest vegetation transitions to montane Philippia shrubland [19]. Below 900 m, much of the region is deforested, persisting as a spatially complex mosaic of grassland, bare substrate (rock and soil) and patches of woody vegetation protected by local-scale terrain variation (e.g., gullies, steep slopes, etc.). The southern (S) region spanned 495,767 ha of core study area within 1,713,088 ha of total area. This area crosses a strong orographic climate boundary incorporating low- to mid-elevation evergreen humid tropical forests on windward slopes in the east, to deciduous, seasonally dry, or "spiny" forests and shrublands to the west (Figure 1). Geologic substrates vary from basement rock formations in the humid sub-montane forests to basalt and sandstone formations underlying the drier spiny forests. Elevation in the S region reaches a maximum of 1968 m a.s.l., and areas below about 500 m have undergone substantial clearing in the past.
Field Calibration of Airborne LiDAR
Histograms of the plot-based estimates of ACD are shown in Figure S1 (Additional file 1). The mean (± s.d.) ACD of dry forests was 14.5 (8.1) Mg C ha-1, with a range of 1.4 to 28.9 Mg C ha-1. The mean (± s.d.) ACD humid forest plots was 99.5 (58.5) Mg C ha-1, with a range of 9.3 to 257.4 Mg C ha-1. Following the processing of the LiDAR data to MCH, we predicted ACD within and across all study regions with high precision (r2 = 0.88) and accuracy (RMSE = 21.1 Mg C ha-1) (Figure 2). The power-function scaling coefficient of 1.3 is slightly lower than those derived for forests in the Neotropics, resulting in a nearly linear MCH-to-ACD relationship [discussed in [27]]. An average pixel-level error of 21 Mg C ha-1 compares favorably with past studies showing uncertainties of 40 Mg C ha-1 or more in temperate and tropical forests [11, 12, 17, 28, 29].
Landscape Controls over LiDAR-derived ACD
Our airborne LiDAR mapping results revealed the influence of human activity, as well as underlying climatic and physiographic controls, in shaping ACD patterns throughout northern and southern Madagascar. High levels of deforestation have reduced standing carbon stocks in many lowland areas, as evidenced by the sharp decline in LiDAR-derived canopy height in zones of human activity (Figure 3). Yet this signal of human-mediated ACD suppression did not obscure natural ACD gradients controlled by climate and elevation. Forests found at higher elevations were less likely to be deforested or degraded, and were also more likely to contain the tallest forest canopies harboring higher carbon stocks.
To further quantify the controls over ACD, we spatially integrated the LiDAR-based carbon mapping results in both N and S regions, and found that ACD peaked at intermediate elevations (Figure 4). Moreover, we found major differences in ACD distributions and medians along elevation gradients, and these patterns differed in the N and S regions. In the north, the lowlands (< 1050 m) harbored a highly skewed distribution of carbon storage, resulting from human-driven forest losses leading to low carbon levels; yet with a sufficiently large, high-resolution regional sample size, we also found that ACD can reach very high values in surviving lowland forests. In effect, the right side of the distribution indicates maximum potential carbon stocks in the lowlands, which exceeded 200 Mg C ha-1. In contrast to N lowland carbon values, the S region harbored very suppressed ACD levels (Figure 4), and this was due to the drier conditions that support lower-biomass spiny forests as well as extremely high forest loss rates.
In both the N and S regions, aboveground carbon stocks reached their highest median levels at mid-elevation (Figure 4). In the north, we found that ACD peaked in the 1350-1650 m zone, whereas highest median values were closer to 700 m in the south (Appendix 3, Additional file 1). Above these zones of highest carbon storage, both ACD median and variance decreased steadily to much lower and narrower ranges in montane conditions. The sharpest high-elevation decline in ACD occurred above 1950 m in the N region, where forests give way to short-statured Philippia shrublands [19]. In the S region, the decrease in ACD at higher elevations was less pronounced. We note, however, that ACD levels in the S sub-montane (e.g., 1100-1500 m) and montane (> 1500 m) zones were lower and less variable than in the N forests at comparable elevations. Importantly, most ACD distributions were non-normal, suggesting that without very large sample sizes, or carefully distributed sampling, field plots alone may not accurately resolve carbon stock variation.
Deforestation, disturbance and regrowth all imparted differences in standing carbon stocks (Appendix 3, Additional file 1). In the N region, secondary regrowth from deforestation of at least 5 and 10 years of age resulted in mean (± s.d.) ACD of 39.4 (29.4) and 52.9 (49.5) Mg C ha-1, respectively. Forested areas subjected to disturbance (also known as degradation), as mapped with CLASlite, contained 70.9 (± 56.6) Mg C ha-1. In the S region, carbon stocks in forest regrowth varied slightly by elevation (Appendix 3, Additional file 1), but were generally between 16 and 49 Mg C ha-1, depending upon age and whether they were deforested or disturbed prior to assessment. In sum, aboveground carbon stocks in forest regrowth were extremely variable, but consistently much lower than in intact forest.
Statistical analyses indicated that SRTM elevation and CLASlite PV cover fraction were the factors most closely associated with landscape-scale, LiDAR-based ACD variation (Figure 5). Although all variables were significantly correlated with ACD, soil fractional cover and aspect generally explained less ACD variation (< 4%). Variables explaining more than 4% of the variation, such as slope and REM in the S region, were also strongly correlated with elevation and PV cover. Based on this finding, we conducted multiple regression analyses to determine the total ACD variation that could be explained by a combination of the variables. Models including elevation and PV explained 27 and 67% of the overall variation (adjusted r2) in the N and S regions, respectively (Appendix 4, Additional file 1). In both regions, other factors including slope, REM, aspect and bare soil fraction did not contribute to explanatory power of the multiple regression beyond that explained by elevation and PV. As a result, these additional factors were excluded from the final regional stratification. Although the regression analyses suggested that directly modeling ACD using input terrain variables may be tractable, second-order polynomials did not provide sufficient fidelity to account for the non-linear influence of the variables (e.g., elevation and PV in the S region; Figure 5b), suggesting that a highly complicated model would be required. A stratified approach, therefore, maintained greater parsimony and potentially higher accuracy. In addition, any potential class-specific influences over ACD can be captured using a stratification approach, whereas a continuous model often smoothes them artificially.
Regional Sources of Carbon Stock Variation
Following stratification and application of the LiDAR-based ACD estimates to the full regional study extent, a comparison of LiDAR-derived ACD and the regional-scale stratified mapping of median ACD revealed the effectiveness of the stratification approach (Figure 6). Stratification by CLASlite forest cover, forest change, PV cover fraction, and SRTM elevation yielded high-resolution ACD variation that closely matched LiDAR-derived ACD. Repeated comparisons, such as that in Figure 6, suggested that scaling LiDAR data regionally in this manner is tractable if the satellite data capture the detailed variation in topographic, environmental and land-use effects at appropriate spatial scales. Our experience suggests that 1 ha or higher resolution is required to support this scaling step.
The relative error between high-resolution LiDAR-scale ACD maps and regional ACD maps was calculated as the percent difference between the square root of the median squared residual value (predicted ACD - observed ACD) and mean observed ACD. We used the median rather than mean squared residual value to compensate for highly right-skewed distributions of pixel level error. Thus, our pixel-level uncertainties reflect median uncertainties. We estimated a pixel-level uncertainty of 35% and 10% in the N and S regions, respectively, at 1 ha resolution. Greater uncertainty in the N region is caused by generally weaker relationships between environmental variables and measured ACD (Figure 5). Nonetheless, a 35% uncertainty at the pixel level is on par with errors inherent to field inventory plots [30, 31], and when averaged over much larger jurisdictional scales of thousands to millions of hectares, the overall uncertainty drops to extremely low values [3, 17].
Additional comparisons between regionally-extrapolated ACD and high-resolution forest cover were made by overlaying regional ACD maps on Google Earth imagery (Figure S2, Additional file 1). In one southern landscape, the largest tree crowns were clearly aligned with areas of high ACD (reds), whereas areas containing small-statured vegetation (naturally or from land use) aligned with low-to-moderate ACD (yellow-green), and no ACD (blues) estimates. In one portion of the southern montane landscape, the abrupt orographically-mediated transition from high biomass windward forests (red) to lower biomass leeward forests (orange) was evident (Figure S2, Additional file 1). This break was so distinct, it was expressed over just a few hundred meters of distance. Similar extreme gradients can be found on other tall oceanic islands where orographic effects are common [32]. Close inspection against Google imagery also revealed that suppressed ACD levels along slopes (greens-blues; Figure S2, Additional file 1) were caused either by natural landslides or human activities.
The combined satellite and LiDAR-based results provided a synoptic view of aboveground carbon stocks throughout the much larger N and S mapping regions (Figure 7). In the N region, the highest ACD levels were found in upland zones farther from human activities, particularly in sub-montane hillslope positions where substrate fertility and climate harbor tall forests [19, 33, 34]. Sub-montane to montane forests in the S region harbored as much carbon in aboveground tissues as found in N forests, but the orographic effect confined the highest biomass areas to east-facing slopes (Figure 7b). Across both study regions, very little lowland intact forest remains, and in the south, much of the spiny forests either harbor naturally low carbon levels or have been lost to deforestation. Together these two regions are geographically important examples of the carbon landscape persisting throughout Madagascar today.
Error Analyses
Simulations of both reduced numbers of field plots as well as reduced LiDAR coverage demonstrated that our sampling effort for both field plots and airborne data was effective at capturing the true variation in ACD. When considering the number of field plots needed to produce a consistent LiDAR-to-ACD calibration model, we found that regression standard error (SEE) increased rapidly up to ~ 20 plots, but stabilized within ~ 2 Mg C ha-1 of the full regression SEE (21.1 Mg C ha-1) at just 24 of the 83 plots (Figure 8). Additional plots therefore had little effect on the regression SEE. Likewise, the ability of the simulated calibrations to accurately predict a set of 4 randomized field plots (not included in the calibration) was initially poor, but stabilized within ~ 2 Mg C ha-1 of the full regression RMSE with just 24 of the 83 plots. The stabilization of the LiDAR-to-plot calibration at ~ 24 field plots is comparable to that found in a previous Amazonian study [17].
As LiDAR coverage was statistically reduced, we found that median ACD remained stable for most vegetation classes - within 10% of the true median value - until the coverage declined below ~ 300 ha (Figure S3, Additional file 1). However, for some classes, at least 2000 ha of LiDAR sampling was required to be within 10% of the true median. Conservatively, median ACD values reported for classes with > 500 ha of LiDAR coverage can be considered to have very high confidence. These large classes account for 90% and 88% of the core study areas in the N and S regions, respectively (Appendix 3, Additional file 1). These results point to the importance of obtaining thousands of samples in order to derive the true median and distribution of ACD throughout the landscape.