Effects of field plot size on prediction accuracy of aboveground biomass in airborne laser scanning-assisted inventories in tropical rain forests of Tanzania
Carbon Balance and Management volume 10, Article number: 10 (2015)
Abstract
Background
Airborne laser scanning (ALS) has recently emerged as a promising tool to acquire auxiliary information for improving aboveground biomass (AGB) estimation in sample-based forest inventories. Under design-based and model-assisted inferential frameworks, the estimation relies on a model that relates the auxiliary ALS metrics to AGB estimated on ground plots. The size of the field plots has been identified as one source of model uncertainty because of the so-called boundary effects, which increase with decreasing plot size. Recent research in tropical forests has aimed to quantify the boundary effects on model prediction accuracy, but evidence of the consequences for the final AGB estimates is lacking. In this study we analyzed the effect of field plot size on model prediction accuracy and its implication when used in a model-assisted inferential framework.
Results
The results showed that the prediction accuracy of the model improved as the plot size increased. The adjusted R^{2} increased from 0.35 to 0.74 while the relative root mean square error decreased from 63.6 to 29.2%. Indicators of boundary effects were identified and confirmed to have significant effects on the model residuals. Variance estimates of model-assisted mean AGB, relative to corresponding variance estimates of pure field-based AGB, decreased with increasing plot size in the range from 200 to 3000 m^{2}. The variance ratio of field-based estimates relative to model-assisted variance ranged from 1.7 to 7.7.
Conclusions
This study showed that the relative improvement in precision of AGB estimation when increasing field plot size was greater for an ALS-assisted inventory than for a pure field-based inventory.
Background
Tropical forests play an important role in the global carbon cycle as they store about 40% of the global terrestrial carbon, and absorb larger amounts of CO_{2} from the atmosphere than any other vegetation type [1]. Despite their potential, tropical forests continue to be exploited at alarming rates, by being converted into secondary forest and many other forms of land use. In an effort to conserve tropical forests, the United Nations Framework Convention on Climate Change (UNFCCC) has developed the mechanism called Reducing Emissions from Deforestation and Forest Degradation in tropical countries (REDD+). There is high interest in seeing such initiatives take form, but a key limitation for successful implementation of REDD+ is the lack of reliable methods for quantifying forest aboveground biomass (AGB) [2,3]. Such methods are important because payments for carbon offsets under REDD+ are based on estimates of carbon stock and stock changes over time. Moreover, AGB information is also useful for understanding the contribution of tropical forests to the global carbon cycle and ecosystem processes [4].
Airborne laser scanning (ALS) has emerged as one of the most promising remote sensing technologies to support AGB forest inventories in boreal, temperate, and tropical forests [5]. A particular strength of ALS for forest applications is its ability to accurately characterize the three-dimensional (3D) structure of the forest canopy [6]. Such information is more useful for forest inventories than the information from other remote sensing techniques (see e.g. [7]). Height and density metrics derived from the ALS data have been reported to be highly correlated with AGB (see e.g. [8,9]). Furthermore, ALS has been shown to be superior to other remote sensing data sources because the relationship between AGB and the remotely sensed information has a much higher saturation level for ALS than for other types of remote sensing. Because of this, ALS is a highly appropriate choice of technique in high-biomass forests. Based on its potential, ALS has recently been recommended for Monitoring, Reporting and Verification (MRV) systems under REDD+ initiatives [10].
Estimation of AGB using ALS is often carried out according to the area-based approach (ABA) [11]. In ABA, empirical models between various metrics derived from the ALS data and AGB values obtained in georeferenced field sample plots are fitted. The area of interest is then tessellated into grid cells [12] with the same size as the plots [13,14] and the developed models are used to provide cell-wise predictions of AGB. Finally, estimates for the particular area of interest (forest stand, forest property, village, district, or nation) are provided by summing the individual cell predictions. For some estimation approaches, adjustment of model prediction bias [15] is also carried out.
As indicated above, the modeled relationship between ALS metrics and ground-based values is of fundamental importance for the outcome of the ALS-assisted estimation. The use of field plot data for model development requires co-registration of field plot location with the ALS data [16,17]. In an ALS-assisted inventory, the point cloud is extracted only within the plot perimeter. However, in field measurements trees are treated as being inside a plot if the center point of the stem is inside the plot. This is a challenge in ALS-assisted forest inventory, since the crowns of trees just outside the plot border partly extend into the plot area, which means that the ALS data will be affected by trees that are not registered in the field. Conversely, trees just inside the plot extend their crowns beyond the plot boundary. This means that there may be a mismatch between the data captured in the field and from the air.
In order to reduce these boundary effects, a number of studies have suggested using larger plots in ALS-assisted forest inventory (see e.g. [18,19]). This is because, as plot size increases, the perimeter-to-area ratio decreases and thus the plots include a lower proportion of boundary-related elements. Similarly, the relative negative influence of a given plot positioning error is reduced because the relative overlap between the field and ALS data becomes larger as plot size increases. Reductions in model errors are also expected with increasing plot size due to so-called spatial averaging of the errors [20], because both the field observations and the ALS data capture more of the spatial variation as the plots increase in size. Thus, as plot sizes increase, the variances of field-based and ALS-assisted estimates are expected to be reduced, which means that fewer plots are needed to reach a certain precision of an AGB estimate. However, large plots also have disadvantages: they are more complicated to measure, which may increase the time needed to collect field measurements [21]. This makes it challenging to select the “optimal” plot size that balances the trade-off between plot size, sample size (number of plots), on-plot costs, traveling costs and precision of ALS-assisted AGB estimates in different forest types.
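The geometric part of this argument can be illustrated with a few lines of arithmetic. The following is a minimal sketch, assuming circular plots (as used later in this study); the function name is hypothetical and the plot areas are simply sizes within the 200 to 3000 m^{2} range analyzed here.

```python
import math

def perimeter_area_ratio(area_m2: float) -> float:
    """Perimeter-to-area ratio (m per m^2) of a circular plot with the given area."""
    radius = math.sqrt(area_m2 / math.pi)
    # Perimeter 2*pi*r divided by area pi*r^2 simplifies to 2 / r.
    return (2.0 * math.pi * radius) / area_m2

# Illustrative plot areas within the range analyzed in the study.
for area in (200, 900, 1900, 3000):
    print(f"{area:>5} m^2: {perimeter_area_ratio(area):.3f} m of boundary per m^2")
```

For a circle the ratio equals 2/r, so going from 200 to 3000 m^{2} (a 15-fold increase in area) reduces the boundary per unit area by a factor of the square root of 15, roughly 3.9.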
As indicated above, plot size has a profound effect on the precision of ALS-assisted AGB estimates for several reasons. Likewise, the plot size has an impact on the precision of pure field-based estimates for reasons mentioned above; larger plots capture more of the variability in the area of interest and thus precision will tend to improve as long as the sample size is kept constant. A key question is therefore whether larger plots will favor ALS-assisted estimation precision to the same extent as they favor field-based estimation precision. Different responses to plot size should have a direct impact on how tropical ALS-assisted field sample surveys should be designed, as their designs currently are “optimized” for pure field-based estimation.
Forest sample surveys are often designed according to design-based (probability-based) principles. Simple random sampling is one such design, and analytical, so-called design-unbiased estimators and corresponding variance estimators exist for a great number of such designs. When auxiliary data such as those acquired by ALS are at hand for the entire area of interest, or at least with partial coverage of the area of interest, use of these data can greatly improve the precision over a pure field-based estimate assuming the same design. The inferential framework applied under probability sampling when a model is used to predict AGB using the ALS data is known as design-based model-assisted (MA) estimation. In the MA framework, the model is used to predict AGB for grid cells and then AGB is summed over all grid cells as indicated in the ABA; in addition, the model predictions for the ground samples are used to provide an estimate of bias in the model predictions, which corrects the pure model-based estimate. Several studies (see e.g. [22–24]) have indicated the potential of MA estimation in reducing the variance of AGB estimates in boreal forests, but apart from some indications provided by [23], none of them has analyzed how the variance of the estimates is affected by changes in field plot sizes. In tropical forests, where the current study was conducted, there is even less knowledge regarding the performance of MA estimation using ALS with varying plot sizes. Several tropical studies have examined the effects of plot size on model prediction accuracy (see e.g. [25–27]), but none of them have assessed the effects on the precision of AGB estimates and compared such precision estimates with the corresponding precision of field-based AGB estimates using the same sampling design, which is of fundamental importance for designing future sample surveys serving multiple purposes and estimation approaches.
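The MA idea described above (a sum or mean of cell predictions, corrected by the mean residual observed on the field plots) can be sketched as follows. This is a hedged illustration, not the estimator code used in the study: it assumes simple random sampling and the common difference-type form of the MA estimator, the function name is hypothetical, and all numbers are made up for the example.

```python
from statistics import mean, variance

def model_assisted_mean(cell_predictions, plot_observed, plot_predicted):
    """Model-assisted estimate of mean AGB and its standard error.

    Assumes simple random sampling of plots (an assumption of this sketch).
    cell_predictions: model predictions for every grid cell in the area.
    plot_observed / plot_predicted: field reference and predicted AGB on the plots.
    """
    residuals = [y - y_hat for y, y_hat in zip(plot_observed, plot_predicted)]
    synthetic_mean = mean(cell_predictions)   # pure model-based part
    bias_correction = mean(residuals)         # estimated model bias from the sample
    estimate = synthetic_mean + bias_correction
    # Variance of the estimator is driven by the residual variance, not by the
    # variance of AGB itself, which is why a good model improves precision.
    standard_error = (variance(residuals) / len(residuals)) ** 0.5
    return estimate, standard_error

# Made-up values in Mg/ha, for illustration only.
estimate, se = model_assisted_mean(
    cell_predictions=[300.0, 350.0, 400.0, 450.0],
    plot_observed=[320.0, 410.0, 380.0],
    plot_predicted=[310.0, 400.0, 390.0],
)
print(round(estimate, 1), round(se, 1))
```

Under the same design, the field-based standard error would be computed from the variance of the observed plot values instead of the residuals, which makes the two estimators directly comparable.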
The objectives of this study were to (1) examine the effects of field plot size on AGB regression model quality, (2) assess plot boundary effect and its impact on model quality based on the field data, and (3) quantify the precision of ALS-assisted estimates of AGB relative to field-based estimates of AGB assuming the same design for different plot sizes. The study was conducted in tropical rain forest in Tanzania with high AGB densities, which was expected to represent a particular challenge in terms of large boundary effects.
Results
Effects of field plot size on ALS AGB predictions
To assess the effect of plot size on ALS-assisted forest inventory, we first fitted the regression models for each of the plot sizes. The independent variables selected varied between the models developed for the different plot sizes (Table 1). The number of variables varied between two and three. For all models, the parameter estimates were significantly different from zero (p < 0.05) and the VIF values were <10, indicating acceptable levels of multicollinearity. The variability explained by separate models (i.e. adjusted R^{2}) improved as the plot size increased, with few exceptions (Figure 1a). The adjusted R^{2} ranged from 0.35 for the plot size of 200 m^{2} to 0.74 for the plot size of 3000 m^{2}. The RMSE% values for LOOCV decreased nonlinearly with increasing plot size, from 63.8 to 29.2% (Figure 1b). The MPE% values (Figure 1b) and the pattern of underprediction for plots with high AGB were relatively lower for larger plots than for smaller plots (Figure 2). However, it should be noted that the number of the larger plots was relatively small.
Boundary effects
Boundary effects were studied by analyzing how the relative residual errors of the models were affected by the ground reference AGB of the trees in an outer buffer zone for different field plot sizes. Our results showed that SAGB_{buffer} and MAGB_{buffer} contributed to explaining the variation in the relative residual errors (Table 2). Relating the absolute value of the relative residual to plot size using a simple linear regression model indicated that there was a highly significant effect of plot size (p < 0.0001). Furthermore, the parameter estimate for plot size was negative, showing that the relative residual is larger in absolute terms for small plots than for larger plots (Table 3).
Efficiency of ALS-assisted AGB estimation
The SE estimates for the field-based AGB estimates were larger than the corresponding model-assisted SE estimates (Figure 3). For the plot sizes that allowed consistent analysis for all 30 plots, i.e. from 200 to 1900 m^{2}, the field-based SE estimates decreased from 58.0 Mg ha^{−1} to 28.7 Mg ha^{−1}, while the model-assisted SE estimates decreased from 44.3 Mg ha^{−1} to 15.5 Mg ha^{−1}. Relative to the mean of field reference AGB for the plot sizes from 200 to 1900 m^{2}, the field-based SE estimates decreased from 14.1% to 8.2%, while the model-assisted SE estimates decreased from 10.8% to 4.4%. Similarly, for the larger plots (up to 3000 m^{2}), for which 22 observations were available for consistent analysis, the model-assisted SE estimates were much smaller than those of the field-based inventory. In both cases the SE was higher for smaller plots than for larger plots. Generally, the ALS-assisted estimates improved more with increasing plot size than the field-based estimates. This indicates that larger plots are relatively more favorable for ALS-assisted estimation than for pure field-based estimation. The RE values were >1 with a maximum value of 3.4 (Figure 4) for the plot sizes ranging from 200–1900 m^{2}, for which we have a complete dataset of 30 plots. For the other set with plot sizes up to 3000 m^{2} the maximum RE value was 7.7. It should be noted that the peak in relative efficiency for the smaller dataset (22 plots) in Figure 4 was caused by a considerable change in the observed AGB for a single plot when increasing the plot size beyond 2000 m^{2}. The increase in AGB was due to a large tree that was included in the plot measurements once the plot radius exceeded 25 m. This illustrates that in a small dataset the results can be sensitive to individual observations and even to the presence of individual trees.
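As a quick consistency check on these figures, RE can be recomputed from the reported SE values. The sketch below assumes, as stated in the Abstract, that RE is the ratio of the field-based variance estimate to the model-assisted variance estimate, and uses the fact that a variance is the square of the corresponding SE; the function name is hypothetical.

```python
def relative_efficiency(se_field: float, se_model_assisted: float) -> float:
    """Ratio of field-based to model-assisted variance, computed from SEs."""
    return (se_field / se_model_assisted) ** 2

# SE values (Mg/ha) reported above for the 30-plot dataset.
re_200 = relative_efficiency(58.0, 44.3)    # 200 m^2 plots
re_1900 = relative_efficiency(28.7, 15.5)   # 1900 m^2 plots
print(round(re_200, 1), round(re_1900, 1))  # prints: 1.7 3.4
```

The two values agree with the lower end of the range given in the Abstract (1.7) and with the maximum RE of 3.4 reported for the 200–1900 m^{2} plot sizes.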
Discussion
The findings of this study demonstrated the importance of choosing appropriate field plot sizes in ALS-assisted forest inventories in tropical forests. This is particularly important given that field campaigns are expensive and time consuming, and linking field measurements with remotely sensed data in the most effective manner would benefit both REDD+ implementation and other studies related to the forest carbon cycle. The current study extends previous research conducted in tropical forests by having a dataset with a wide range of plot sizes. Furthermore, most of the previous studies have used rectangular plots (see e.g. [18,26]), whereas in this case circular plots have been used. Circular plots are more convenient for remote sensing studies than square or rectangular plots because only a single coordinate together with a plot radius is needed to match the two data sources geographically [19,28,29]. Circular plots are also, within certain sizes, easier to establish in the field because they have one dimension (i.e. radius) that defines the plot boundary. The use of circular plots minimizes the plot boundary effects because a circle has a smaller circumference-to-area ratio than any other plot shape. However, the visibility from the plot center to the perimeter on a circular plot is increasingly hampered as the plots get larger, which increases the per-tree measurement time for the border trees. An increase of the area of a rectangular plot would not necessarily mean increased marginal cost (cost of including one more tree) if the width of the plot is kept constant and inclusion of trees is made with reference to the long side. However, rectangular plots are in general more difficult to establish. For example, in rugged terrain it can be difficult to keep the sides parallel.
Our findings demonstrated empirically the positive effects of increasing plot sizes on the predictive power of the AGB models. The model fit (adjusted R^{2}) of the regression models improved as plot size increased. Reduced circumference-to-area ratio, spatial averaging, and less effect of positioning errors are probably the main reasons. The fit of our models is in line with previous ALS-based studies in both tropical forests and temperate forests. For example, [30] reported R^{2} of 0.78 in the tropical rainforest of the Hawaiian Islands while [31] reported R^{2} of 0.64 in a tropical rainforest of West Africa. Furthermore, results from the cross-validation showed smaller RMSE% and MPE% (Figure 1b) for larger plots than for smaller plots. Similar trends have been reported and discussed by other authors in both temperate and tropical forests (see e.g. [32]).
Plot boundary effects have been discussed in previous studies (see e.g. [16,33]) as one among the sources of model error in ALS-assisted inventories, particularly when relying on small plots. We demonstrated this in two steps: first by relating relative residuals to the sum of AGB per hectare for all trees in the buffer (SAGB_{buffer}) and the maximum AGB per hectare for the largest tree in the buffer (MAGB_{buffer}), where we noted that their importance depended on the size of the buffer. The buffer conditions, as expressed by both MAGB_{buffer} and SAGB_{buffer}, seemed to have more impact on the residual error with decreasing distance to the plot, judged by the AIC values (Table 2), which is logical. Furthermore, when comparing the two variables, SAGB_{buffer} seemed to lose less explanatory power than MAGB_{buffer} when going from a 3 m to a 6 m buffer. This result was also expected because the representation of the whole buffer by SAGB_{buffer} is less prone to change with increasing buffer size than MAGB_{buffer}, which is calculated from a single tree. Second, the decrease in ALS model residuals (Table 3) with increasing plot sizes is a clear indication that smaller plots are more prone to boundary effects than larger plots.
The contribution of ALS data to improving the precision of AGB estimates was also demonstrated across varying ranges of plot sizes. The RE values were > 1, indicating that ALS-assisted estimation is more efficient than pure field-based estimation. To achieve similar precision with a pure field-based estimate relying on simple random sampling, the sample size for the field-based inventory would have to be increased by a factor equivalent to the value of RE, which would have a substantial effect on field inventory costs. In general, the gain in relative efficiency was more pronounced as plot size increased, suggesting that larger plots are more favorable when ALS data are used to assist in the estimation. Even though we did not undertake any analysis of cost-efficiency, the trend would be toward larger and fewer plots as one introduces ALS to support the estimation. This finding can also be attributed to the effects discussed above, namely reduced boundary effects and co-registration errors.
Despite the potential of improving the efficiency of ALS-assisted inventories by use of larger plots, the choice of an “optimal” plot size must be seen in a broader context by considering a number of factors including sample sizes, on-plot costs, traveling costs and overall field inventory design. Several authors (see e.g. [20,23,30]) have indicated that selection of the plot size will also depend on forest types, available resources and the needed precision. Based on our findings, there is a larger potential for gaining efficiency from using ALS data in this type of forest when the field plot size is larger than 1200 m^{2}. Finally, even though our study was limited to the tropical rainforests of Tanzania, the major findings are of interest and efforts should be made to upscale to other tropical forests by considering more factors that would lead to selection of an “optimal” plot size.
Conclusions
To conclude, our study has demonstrated that field plot size affects the prediction accuracy of ALS-assisted AGB estimation in tropical forests. Generally, there was substantial improvement in prediction accuracy for larger plots compared with smaller plots. Indicators of boundary effects were also identified and confirmed to have significant effects on the model quality. From a purely technical point of view, our results suggested that it is relatively more favorable to increase the plot size when ALS is used to enhance the estimates. This study showed that there is a relative improvement in precision of ALS-assisted AGB estimation, compared to pure field-based estimation, up to around 3000 m^{2} in this type of forest. However, the maximum plot size of 3000 m^{2} in the current study leaves an open question as to whether there are any additional gains in relative precision beyond this size. Future studies should be conducted to quantify the contribution of ALS to improving estimation precision for even larger plots as the basis for design of future inventories in tropical rainforests. Similar studies should also be conducted in other types of tropical forests.
Methods
Site description
The study was conducted in Amani Nature Reserve (ANR), which is situated in the southern part of the East Usambara Mountains in northern Tanzania (Figure 5). It was gazetted in 1997 with a protected area of 8,380 ha. ANR lies between 5°14'–5°04' S and 38°30'–38°40' E, with an altitudinal range of 190 to 1130 m above sea level [34]. Rainfall is heavy at higher altitudes and in the southeast of the mountain, with an average of 1900 mm annually. The dry seasons are from June to August and January to March, but rainfall is frequent throughout the year. The mean annual temperature is 20.6°C [35].
Data collection
Sampling design
An initial probability sample of 173 field plots with an average size of 900 m^{2} was established across ANR according to a systematic design (450 m × 900 m distance between plots) in 1999–2000 by a non-governmental conservation and development organization, Frontier Tanzania [34] (Figure 5). The plots were revisited and remeasured in 2008–2012. In order to analyse plot size effects on AGB estimates, a small subsample of 30 large plots was established. Measurements on the 30 plots were acquired in a separate campaign after completion of measurements of the large sample. Due to high travel costs and long walking distances in the very steep and rough terrain, establishing a probability sample of 30 large plots across the entire study area was cost-prohibitive. Instead we developed a sampling strategy by which we took advantage of the a priori knowledge of the distribution of AGB in the large probability sample and purposefully selected three subregions within the study area in which the initial plots were revisited. There is a strong altitude-dependent AGB gradient in the study area. It was therefore important to capture the altitude gradient in each of the three subregions in order to resemble the AGB distribution in the initial probability sample.
In the sampled subregions, we first selected 16 of the plots in the initial probability sample for measurement. We also established 14 new and additional plots along the gridlines of the probability sample and located them exactly midway between two existing plots. Thus, the distance between our plots was 225 m rather than 450 m.
Although the resulting sample of 30 large plots was not selected according to probabilistic principles, it closely resembled essential properties of the large probability sample. First of all, the AGB distributions of the two samples were similar (Figure 6). The mean AGB of the 30 plots with an area of 900 m^{2} was 366.0 Mg ha^{−1} (Table 4, Figure 6), while it was 461.9 Mg ha^{−1} for the large probability sample (Figure 6). The AGB range was 69.4–908.3 Mg ha^{−1} (standard deviation of 216.3 Mg ha^{−1}) while it was 43.2–1147.1 Mg ha^{−1} (standard deviation of 214.7 Mg ha^{−1}) for the large sample. Furthermore, the 30 plots covered an elevation range of 200 to 1000 m above sea level (Figure 7a) so that both the lowland forests (<800 m above sea level) and the submontane forests (>800 m above sea level) were represented. The 30 plots also covered a wide range of tree sizes (Figure 7b).
Field data
Field data were collected during November 2012, about six months after completion of the field work on the large probability sample. On each of the 30 plots, we registered all trees within a radius limited by the maximum distance measuring range of a Vertex hypsometer [36], which was used to measure the horizontal distance from the plot centre to each tree. The maximum measuring range of the hypsometer varied among the plots due to differences in terrain ruggedness and forest density. The radius distribution among the 30 plots was as follows: 31 m (22 plots), 28 m (2 plots), 26 m (1 plot) and 25 m (5 plots). For each tree with diameter at breast height (dbh) larger than 5 cm, scientific name, local name, distance to plot centre and dbh were registered. A diameter tape, rather than a calliper, was used to gauge diameters since tree trunks in this forest type tend to be both oval and large in size. The distance was measured from plot centre to the front of each tree, and half of the tree diameter was added to get the total horizontal distance. The distance measures enabled us to generate any plot size within the limit of the maximum radius. For this study, we decided to select radii between 7.98 m (200 m^{2}) and 30.90 m (3000 m^{2}) (Table 4) for further analysis. Three trees (largest, medium and smallest in terms of diameter) per plot were measured for height (h) using a Vertex hypsometer.
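The way nested plot sizes can be generated from the recorded distances is sketched below. This is an illustration under stated assumptions rather than the authors' procedure: the function names and the tree tuples are hypothetical, and a tree whose stem centre lies exactly on the boundary is counted as inside.

```python
import math

def radius_for_area(area_m2: float) -> float:
    """Radius (m) of a circular plot with the given area (m^2)."""
    return math.sqrt(area_m2 / math.pi)

def trees_in_plot(trees, area_m2: float):
    """Return the trees whose stem centre falls within a plot of the given area.

    trees: iterable of (distance_to_tree_front_m, dbh_cm) tuples (hypothetical layout).
    Half the diameter is added to the measured distance, as described above.
    """
    radius = radius_for_area(area_m2)
    selected = []
    for distance_front_m, dbh_cm in trees:
        distance_centre_m = distance_front_m + (dbh_cm / 100.0) / 2.0
        if distance_centre_m <= radius:  # boundary handling is an assumption
            selected.append((distance_front_m, dbh_cm))
    return selected

# The radii quoted in the text follow directly from the plot areas.
print(round(radius_for_area(200), 2), round(radius_for_area(3000), 2))

# Made-up trees: the 40 cm tree is excluded from the 200 m^2 plot only
# because of the half-diameter correction.
example_trees = [(7.9, 40.0), (7.9, 10.0), (12.0, 60.0)]
print(trees_in_plot(example_trees, 200))
```

Calling `trees_in_plot` repeatedly with increasing areas yields the nested sets of trees from which plot-level AGB for each plot size can be summed.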
Precise field coordinates were determined in the centre of each plot by means of differential Global Navigation Satellite Systems (dGNSS). Topcon Legacy 40-channel dual-frequency receivers, observing both pseudorange and carrier phase of the Global Positioning System (GPS) and the Global Navigation Satellite System (GLONASS), were used as rover and base station. The post-processing reports from Pinnacle version 1.0 software [37] indicated an average error of 19 cm for the planimetric coordinates. The error was computed as two times the standard deviation of the corrected single observations reported in the Pinnacle output [38].
Field estimates of AGB
For each plot, AGB was estimated by using the local allometric AGB model developed by [39] with both dbh and h as predictor variables (Eq. 2). Using models with both dbh and h is reported to moderate the effect of large dbh values on AGB estimates as compared to models with dbh only [40–42]. Before calculating AGB, a height model (Eq. 1) was developed using the observations of tree height and corresponding diameters from each plot. A number of model forms for the diameter–height relationship [43–48] were tested using a nonlinear mixed-effects approach. Best model fit, judged by the Akaike information criterion (AIC), was obtained using the model form by [46]
This model was used to predict height for trees without height measurements. AGB was calculated for individual trees within each plot according to [39] i.e.,
and then summed to obtain total AGB for the respective plot. The AGB values were finally scaled to per ha values for the different plot sizes (Table 4). The calculated AGB values are henceforth denoted field reference AGB.
Laser scanner data
ALS data were collected during the period from 19 January to 18 February 2012 using a Leica ALS70 sensor (Leica Geosystems AG, Switzerland) carried by a Cessna 404 fixed-wing aircraft. Mean flying altitude was 800 m above ground covering the entire area of ANR (i.e. wall-to-wall) at a ground speed of 75 m s^{−1}. The scanning rate was 58.6 Hz and the instrument operated at a pulse repetition frequency of 339 kHz with a resulting average pulse density of 10.6 points m^{−2}.
Processing of the ALS data started with classification of each ALS echo as ground or vegetation using the progressive irregular triangular network densification method [49] implemented in the TerraScan software [50]. A Triangular Irregular Network (TIN) was created using the ALS echoes classified as ground echoes. The heights above the ground surface were calculated for all echoes by subtracting the respective TIN heights from the height values of all echoes recorded. Up to five echoes were registered per pulse and we used the three echo categories classified as “single”, “first of many”, and “last of many”. The “single” and “first of many” echoes were pooled into one dataset denoted as “first” echoes, and correspondingly, the “single” and “last of many” echoes were pooled into a dataset denoted as “last” echoes.
Several variables were extracted from the ALS data for each of the field plot sizes as described by [51]. For each plot size, height distributions of both first and last echoes were first created. A height threshold of 2.0 m was applied in order to remove the effect of low vegetation and echoes from ground features falsely classified as vegetation. Then, heights at nine percentiles (10^{th}, 20^{th}, …, 90^{th}) of both the first and last echo distributions were computed to represent canopy height and labeled H_{10}.F, H_{20}.F, …, H_{90}.F (first echoes) and H_{10}.L, H_{20}.L, …, H_{90}.L (last echoes), respectively. Measures of canopy density were also derived for first and last echoes of each plot size. The range between the lowest ALS canopy height (>2 m) and the 95^{th} percentile height was divided into 10 vertical fractions of equal height. Canopy densities were then computed as the proportion of ALS echoes above each fraction to total number of first echoes and labeled D_{0}.F (>2 m), D_{1}.F, …, D_{9}.F. Density variables for the last echo distribution were calculated the same way (relative to total number of last echoes) and labeled D_{0}.L, D_{1}.L, …, D_{9}.L. Furthermore, for both first and last echo height distributions on each plot, the maximum height (H_{max}.F and H_{max}.L), mean values (H_{mean}.F and H_{mean}.L), standard deviation (H_{sd}.F and H_{sd}.L), coefficient of variation (H_{cv}.F and H_{cv}.L), and skewness (H_{skewness}.F and H_{skewness}.L) were computed.
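The echo pooling and a subset of the metrics described above can be sketched as follows. This is a simplified illustration, not the processing chain used in the study: the function names, the linear-interpolation percentile rule, the use of the 2.0 m threshold as the lower end of the density range, and the example echoes are all assumptions, and at least one echo above the threshold is required.

```python
def pool_echoes(echoes):
    """Split (height_m, category) echoes into the 'first' and 'last' datasets."""
    first = [h for h, cat in echoes if cat in ("single", "first of many")]
    last = [h for h, cat in echoes if cat in ("single", "last of many")]
    return first, last

def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in 0..100) of an ascending list."""
    k = (len(sorted_values) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)

def als_metrics(heights, suffix, threshold=2.0):
    """Height percentiles, canopy densities, max and mean for one echo dataset."""
    canopy = sorted(h for h in heights if h > threshold)  # drop low vegetation
    metrics = {}
    for p in range(10, 100, 10):
        metrics[f"H{p}.{suffix}"] = percentile(canopy, p)
    # Ten equal fractions between the threshold and the 95th percentile height.
    step = (percentile(canopy, 95) - threshold) / 10.0
    for k in range(10):
        lower = threshold + k * step
        # Proportion of echoes above the fraction, relative to ALL echoes.
        metrics[f"D{k}.{suffix}"] = sum(h > lower for h in heights) / len(heights)
    metrics[f"Hmax.{suffix}"] = canopy[-1]
    metrics[f"Hmean.{suffix}"] = sum(canopy) / len(canopy)
    return metrics

# Made-up first-echo heights (m) for one plot.
example = als_metrics([0.5, 1.0, 3.0, 5.0, 10.0, 12.0, 15.0, 20.0, 22.0, 25.0], "F")
print(example["H50.F"], example["D0.F"], example["Hmax.F"])
```

The same `als_metrics` call with the pooled last echoes and the suffix `"L"` would give the corresponding last-echo variables.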
Data analyses
Model development
Multiple linear regression analysis with ordinary least squares (OLS) was used to develop the statistical models relating the field reference AGB and the predictor variables from the ALS data. To ensure that our modelling approaches met the basic assumptions of OLS, the response variable was transformed to logarithmic scale [11,52], while for the predictors both log-transformed and non-transformed variables were used. Separate models with log-transformed response and a combination of log-transformed and non-transformed predictor variables were fitted for each of the plot sizes. We decided to fit separate models (unique variable combinations) for each of the plot sizes, because we wanted the model for each plot size to be the “best” and not be constrained by forcing specific variables into the model.
Variable selection was conducted by using regsubsets in the leaps package in R [53]. The selection of the variables was limited to the best combinations of three or fewer variables in order to avoid multicollinearity among candidate predictors. The preferred models were chosen based on the Bayesian information criterion (BIC) [54]. Adjusted R^{2} was also used for assessing the model fit while multicollinearity was assessed by computing the variance inflation factors (VIF). The VIF values were determined for the individual β parameters. VIF values greater than 10 were regarded as an indication of multicollinearity problems [55].
Log-transformation of the response variable introduces a bias when back-transforming to the arithmetic scale. The model for AGB was therefore adjusted for logarithmic bias according to [56] by adding half of the model mean square error to the constant term before transformation to arithmetic scale.
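The correction described above amounts to a one-line change in the back-transformation. The sketch below is illustrative only: the function name, the coefficients and the mean square error value are hypothetical, and the predictors are assumed to be already on whatever scale (log or arithmetic) the fitted model uses.

```python
import math

def back_transform(intercept, coefficients, predictors, mse=0.0):
    """Predict AGB on the arithmetic scale from a model fitted to ln(AGB).

    Half of the model mean square error is added to the constant term
    before exponentiating, as described in the text.
    """
    linear_predictor = intercept + mse / 2.0
    linear_predictor += sum(b * x for b, x in zip(coefficients, predictors))
    return math.exp(linear_predictor)

# Hypothetical model and plot: the corrected prediction is larger than the
# naive one by the factor exp(mse / 2).
naive = back_transform(1.0, [0.5, 1.2], [2.0, 3.0])
corrected = back_transform(1.0, [0.5, 1.2], [2.0, 3.0], mse=0.2)
print(round(corrected / naive, 3))
```

Because the correction is multiplicative on the arithmetic scale, it raises every prediction by the same proportion regardless of the predictor values.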
Model validation and accuracy assessment
In order to assess the performance of the models for each plot size, leave-one-out cross-validation (LOOCV) was performed. One field plot at a time was excluded from the dataset, and the model was fitted to the remaining n − 1 plots and used to predict the AGB of the left-out plot, where n denotes the number of field plots and i = 1, …, n indexes them. Relative root mean square error (RMSE%) and mean prediction error (MPE%) were used as the measures of reliability.
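Written in their usual form, and taking each prediction error as reference minus predicted (the sign convention is an assumption here), the two measures are:

\[ \mathrm{RMSE}\% = \frac{100}{\overline{y}} \sqrt{\frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{n}} \]

\[ \mathrm{MPE}\% = \frac{100}{\overline{y}} \cdot \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)}{n} \]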
In these measures, \( y_i \) and \( \hat{y}_i \) denote the field reference AGB and the predicted AGB for plot i, respectively, and \( \overline{y} \) denotes the mean field reference AGB for all plots. RMSE% measures how accurately the model predicts the response and is the most important criterion for fit if the main purpose of the model is prediction [57].
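The cross-validation loop can be sketched as follows. For brevity the sketch fits a straight line with a single untransformed predictor, whereas the study used log-scale models with up to three predictors; the function names and the reference-minus-predicted sign convention are assumptions.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def loocv_errors(x, y):
    """Leave-one-out RMSE% and MPE% for the model y = a + b x.

    x, y: lists of predictor values and field reference values, one entry per plot.
    Errors are reference minus predicted (assumed sign convention).
    """
    n = len(y)
    residuals = []
    for i in range(n):
        # Refit on the n - 1 remaining plots, then predict the left-out plot.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        residuals.append(y[i] - (a + b * x[i]))
    mean_y = sum(y) / n
    rmse = math.sqrt(sum(e ** 2 for e in residuals) / n)
    mpe = sum(residuals) / n
    return 100.0 * rmse / mean_y, 100.0 * mpe / mean_y
```

Because every prediction comes from a model that never saw the plot in question, the resulting RMSE% is a more honest indication of predictive accuracy than the residual error of a model fitted to all plots.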
Analysis of boundary effects
To analyze the boundary effects, we studied how the residual errors of the models were related to the field reference AGB of the trees in an outer buffer zone for different field plot sizes. To achieve this, we extracted field reference AGB values for 3 m and 6 m buffers outside the field plots for the plot sizes of 200–1500 m^{2} and 200–1100 m^{2}, respectively. We selected the trees with dbh > 10 cm and computed the AGB per hectare for the largest tree in the buffer and the total AGB per hectare for all trees in the buffer. To obtain the model residual error, we first subtracted the ground reference AGB from the predicted AGB. Then we calculated the ratio between the residual and the total field reference AGB for the respective plot (i.e., the relative residual). Similar ratios between (1) the sum of AGB per hectare for all trees in the buffer (SAGB_{buffer}) and the field reference AGB for the plot and (2) the maximum AGB per hectare for the largest tree in the buffer (MAGB_{buffer}) and the field reference AGB for the plot were also computed. Two empirical models explaining the variation in the relative residual values, using either SAGB_{buffer} or MAGB_{buffer} as the explanatory variable, were developed. Linear mixed effects (LME) regression using the nlme add-on package [58] in R was used for model fitting. LME models are linear regression models in which the parameters are the sum of fixed and random effects. In this case the fixed effects were either SAGB_{buffer} or MAGB_{buffer}, while plot identity was treated as the random effect. We assumed that each plot has its own random error structure and that the AGB values within a plot are not independent of one another. To test the effect of plot size on the relative residual, we also fitted a linear regression model relating the absolute value of the relative residual to plot size. The absolute value was used because we were interested in the magnitude of the residual regardless of its sign.
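The three ratios that enter the boundary-effect models can be sketched for a single plot as below. The function name and the example numbers in the test are hypothetical, and the mixed-effects fitting itself (done with nlme in R) is not reproduced here.

```python
def boundary_indicators(predicted, reference, buffer_tree_agb):
    """Relative residual and buffer ratios for one plot at one plot size.

    predicted, reference: predicted and field reference AGB per hectare for the plot.
    buffer_tree_agb:      per-hectare AGB of each buffer tree with dbh > 10 cm.
    """
    # Residual is predicted minus reference, as described in the text.
    relative_residual = (predicted - reference) / reference
    # Ratio of summed buffer AGB to plot AGB (the SAGB_buffer indicator).
    sagb_ratio = sum(buffer_tree_agb) / reference
    # Ratio of the largest buffer tree's AGB to plot AGB (the MAGB_buffer indicator).
    magb_ratio = max(buffer_tree_agb, default=0.0) / reference
    return relative_residual, sagb_ratio, magb_ratio
```

A plot with no qualifying buffer trees simply gets zero for both buffer ratios, so it still contributes an observation to the regression on relative residuals.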
Efficiency of ALS-assisted AGB estimation
ALS-assisted estimation of AGB within the design-based and model-assisted inferential framework can greatly improve the precision compared to pure field-based estimation. The purpose of this analysis was to quantify the gain in estimated precision from using ALS data relative to a pure field-based estimate for increasing plot sizes.
A basic requirement for validity of design-based inference is the availability of a probability sample [59]. As stated above, the current sample of 30 plots was obtained as a subsample of a probability sample, but the subsampling was not conducted according to strict probabilistic principles. However, the subsample was selected to resemble important properties of the large probability sample as closely as practically feasible. Thus, a comparison of variances using the current data and assuming a probabilistic design will most likely introduce a bias of unknown magnitude in the estimators. Likewise, when a systematic sample is obtained, it is common to adopt design-based estimators assuming e.g. simple random sampling (SRS), although it is well known that SRS variance estimators usually are positively biased under systematic sampling. The magnitude of the bias is always unknown for a particular sample because bias is a property of an estimator and not of a particular sample. The current analysis was conducted under the assumption that the sample at hand would give a meaningful quantification of the effect of plot size on relative variance estimates. Thus, in the current study we adopted design-based variance estimators assuming simple random sampling and complete cover of ALS data.
Assuming SRS and ignoring the finite population correction, the variance estimator for the field-based AGB estimate is given by [60].
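Written out for n plots, this is the standard SRS estimator of the variance of a sample mean; \( \hat{\mu}_{\mathrm{field}} \) is notation assumed here for the field-based mean AGB, and \( \overline{y} \) is the mean field reference AGB:

\[ \widehat{\mathrm{var}}\left( \hat{\mu}_{\mathrm{field}} \right) = \frac{\sum_{i=1}^{n} \left( y_i - \overline{y} \right)^2}{n(n-1)} \]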
For model-assisted estimation, the variance estimator of the so-called generalized regression estimator is given by [60].
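Under SRS this estimator has the same form as the variance of a sample mean, applied to the model residuals \( \hat{e}_i \) and their mean \( \overline{e} \) rather than to the observations; \( \hat{\mu}_{\mathrm{MA}} \) is notation assumed here for the model-assisted mean AGB:

\[ \widehat{\mathrm{var}}\left( \hat{\mu}_{\mathrm{MA}} \right) = \frac{\sum_{i=1}^{n} \left( \hat{e}_i - \overline{e} \right)^2}{n(n-1)} \]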
Here \( \hat{e}_i = y_i - \hat{y}_i \) is the model prediction residual for plot i and \( \overline{e} = \frac{\sum_{i=1}^{n} \hat{e}_i}{n} \) is the mean residual for all plots. Standard error (SE) was computed as the square root of the variance estimate. Finally, the relative efficiency (RE) of the ALS-assisted inventory relative to the field-based inventory was calculated for the different plot sizes as the ratio of the field-based variance estimate to the model-assisted variance estimate.
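In symbols, with \( \widehat{\mathrm{var}}(\hat{\mu}_{\mathrm{field}}) \) and \( \widehat{\mathrm{var}}(\hat{\mu}_{\mathrm{MA}}) \) as assumed notation for the field-based and model-assisted variance estimates, the ratio can be written as:

\[ \mathrm{RE} = \frac{\widehat{\mathrm{var}}\left( \hat{\mu}_{\mathrm{field}} \right)}{\widehat{\mathrm{var}}\left( \hat{\mu}_{\mathrm{MA}} \right)} \]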
Values of RE greater than 1.0 indicate higher efficiency of ALS-assisted estimates than of field-based estimates for a given plot size. To achieve consistency in the analysis across different plot sizes, the dataset was divided into two major groups. The first group comprised all 30 plots and allowed consistent analysis of plot sizes ranging from 200 to 1900 m^{2}. The second group consisted of 22 of the plots and allowed analysis of plot sizes from 200 to 3000 m^{2}.
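The efficiency comparison can be sketched in a few lines of Python, assuming SRS without finite population correction as in the text. The function names are hypothetical, and the residual is taken as reference minus predicted.

```python
import math

def variance_of_mean(values):
    """SRS variance estimator of a sample mean, without finite population correction."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n * (n - 1))

def relative_efficiency(reference, predicted):
    """Field-based variance, model-assisted variance, and their ratio (RE).

    reference: field reference AGB per plot.
    predicted: model-predicted AGB for the same plots.
    """
    residuals = [y - p for y, p in zip(reference, predicted)]
    var_field = variance_of_mean(reference)
    # The model-assisted estimator applies the same formula to the residuals.
    var_ma = variance_of_mean(residuals)
    return var_field, var_ma, var_field / var_ma

def standard_error(variance):
    """Standard error as the square root of a variance estimate."""
    return math.sqrt(variance)
```

The better the model predicts plot-level AGB, the smaller the spread of the residuals and the larger RE becomes, which is why the effect of plot size on model accuracy carries over to the precision of the final estimate.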
Abbreviations
ALS: Airborne laser scanning
ABA: Area-based approach
AGB: Aboveground biomass
AIC: Akaike information criterion
ANR: Amani Nature Reserve
BIC: Bayesian information criterion
dGNSS: Differential Global Navigation Satellite Systems
GLONASS: Global Navigation Satellite System
GPS: Global Positioning System
LME: Linear mixed effects
LOOCV: Leave-one-out cross-validation
MA: Model-assisted
MAGB_{buffer}: Maximum AGB per hectare for the largest tree in the buffer
MPE: Mean prediction error
MRV: Monitoring, Reporting and Verification
OLS: Ordinary least squares regression
REDD+: Reducing Emissions from Deforestation and Forest Degradation in tropical countries
RMSE: Root mean square error
SAGB_{buffer}: Sum of AGB per hectare for all trees in the buffer
SAR: Synthetic aperture radar
SE: Standard error
SRS: Simple random sampling
TIN: Triangular Irregular Network
UNFCCC: United Nations Framework Convention on Climate Change
RE: Relative efficiency
References
1. Lewis SL, Lopez-Gonzalez G, Sonké B, Affum-Baffoe K, Baker TR, Ojo LO, et al. Increasing carbon storage in intact African tropical forests. Nature. 2009;457:1003–6.
2. Joseph S, Herold M, Sunderlin WD, Verchot LV. REDD+ readiness: early insights on monitoring, reporting and verification systems of project developers. Environ Res Lett. 2013;8:034038.
3. Herold M, Skutsch M. Monitoring, reporting and verification for national REDD plus programmes: two proposals. Environ Res Lett. 2011;6:014002.
4. Keith H, Mackey BG, Lindenmayer DB. Re-evaluation of forest biomass carbon stocks and lessons from the world's most carbon-dense forests. Proc Natl Acad Sci. 2009;106:11635–40.
5. Hyyppä J, Hyyppä H, Leckie D, Gougeon F, Yu X, Maltamo M. Review of methods of small-footprint airborne laser scanning for extracting forest inventory data in boreal forests. Int J Remote Sens. 2008;29:1339–66.
6. Vauhkonen J, Maltamo M, McRoberts RE, Næsset E. Introduction to Forestry Applications of Airborne Laser Scanning. In: Maltamo M, Næsset E, Vauhkonen J, editors. Forestry applications of airborne laser scanning – concepts and case studies. Dordrecht, Netherlands: Springer; 2014. p. 1–16.
7. Coops NC, Wulder MA, Culvenor DS, St-Onge B. Comparison of forest attributes extracted from fine spatial resolution multispectral and lidar data. Can J Remote Sens. 2004;30:855–66.
8. Hansen EH, Gobakken T, Bollandsås OM, Zahabu E, Næsset E. Modeling Aboveground Biomass in Dense Tropical Submontane Rainforest Using Airborne Laser Scanner Data. Remote Sens. 2015;7:788–807.
9. Ioki K, Tsuyuki S, Hirata Y, Phua MH, Wong WVC, Ling ZY, et al. Estimating aboveground biomass of tropical rainforest of different degradation levels in Northern Borneo using airborne LiDAR. For Ecol Manage. 2014;328:335–41.
10. Gautam B, Peuhkurinen J, Kauranne T, Gunia K, Tegel K, Latva-Käyrä P, et al. Estimation of Forest Carbon Using LiDAR-Assisted Multi-Source Programme (LAMP) in Nepal. In: Proceedings of the International Conference on Advanced Geospatial Technologies for Sustainable Environment and Culture, Pokhara, Nepal. 2013. p. 12–3.
11. Næsset E. Predicting forest stand characteristics with airborne scanning laser using a practical two-stage procedure and field data. Remote Sens Environ. 2002;80:88–99.
12. Næsset E. Estimating timber volume of forest stands using airborne laser scanner data. Remote Sens Environ. 1997;61:246–53.
13. Næsset E, Bjerknes KO. Estimating tree heights and number of stems in young forest stands using airborne laser scanner data. Remote Sens Environ. 2001;78:328–40.
14. Næsset E. Area-Based Inventory in Norway – From Innovation to an Operational Reality. In: Maltamo M, Næsset E, Vauhkonen J, editors. Forestry applications of airborne laser scanning – concepts and case studies. Dordrecht, Netherlands: Springer; 2014. p. 215–40.
15. McRoberts RE, Cohen WB, Næsset E, Stehman SV, Tomppo EO. Using remotely sensed data to construct and assess forest attribute maps and related spatial products. Scand J For Res. 2010;25:340–67.
16. Frazer GW, Magnussen S, Wulder MA, Niemann KO. Simulated impact of sample plot size and co-registration error on the accuracy and uncertainty of LiDAR-derived estimates of forest stand biomass. Remote Sens Environ. 2011;115:636–49.
17. Gobakken T, Næsset E. Assessing effects of positioning errors and sample plot size on biophysical stand properties derived from airborne laser scanner data. Can J Forest Res. 2009;39:1036–52.
18. Mascaro J, Detto M, Asner GP, Muller-Landau HC. Evaluating uncertainty in mapping forest carbon with airborne LiDAR. Remote Sens Environ. 2011;115:3770–4.
19. Næsset E, Bollandsås OM, Gobakken T, Gregoire TG, Ståhl G. Model-assisted estimation of change in forest biomass over an 11 year period in a sample survey supported by airborne LiDAR: A case study with post-stratification to provide “activity data”. Remote Sens Environ. 2013;128:299–314.
20. Zolkos S, Goetz S, Dubayah R. A meta-analysis of terrestrial aboveground biomass estimation using lidar remote sensing. Remote Sens Environ. 2013;128:289–98.
21. Asner GP, Mascaro J, Muller-Landau HC, Vieilledent G, Vaudry R, Rasamoelina M, et al. A universal airborne LiDAR approach for tropical forest carbon mapping. Oecologia. 2012;168:1147–60.
22. Gregoire TG, Ståhl G, Næsset E, Gobakken T, Nelson R, Holm S. Model-assisted estimation of biomass in a LiDAR sample survey in Hedmark County, Norway. Can J Forest Res. 2010;41:83–95.
23. Næsset E, Gobakken T, Solberg S, Gregoire TG, Nelson R, Ståhl G, et al. Model-assisted regional forest biomass estimation using LiDAR and InSAR as auxiliary data: A case study from a boreal forest area. Remote Sens Environ. 2011;115:3599–614.
24. Ene LT, Næsset E, Gobakken T, Gregoire TG, Ståhl G, Nelson R. Assessing the accuracy of regional LiDAR-based biomass estimation using a simulation approach. Remote Sens Environ. 2012;123:579–92.
25. Asner GP, Clark JK, Mascaro J, Vaudry R, Chadwick KD, Vieilledent G, et al. Human and environmental controls over aboveground carbon storage in Madagascar. Carbon Balance Manage. 2012;7:2.
26. Asner GP, Mascaro J. Mapping tropical forest carbon: Calibrating plot estimates to a simple LiDAR metric. Remote Sens Environ. 2014;140:614–24.
27. Mascaro J, Asner GP, Dent DH, DeWalt SJ, Denslow JS. Scale-dependence of aboveground carbon accumulation in secondary forests of Panama: A test of the intermediate peak hypothesis. For Ecol Manage. 2012;276:62–70.
28. Adams T, Brack C, Farrier T, Pont D, Brownlie R. So you want to use LiDAR? A guide on how to use LiDAR in forestry. N Z J For. 2011;55:19–23.
29. White JC, Wulder MA, Varhola A, Vastaranta M, Coops NC, Cook BD, et al. A best practices guide for generating forest inventory attributes from airborne laser scanning data using an area-based approach. For Chron. 2013;89:722–3.
30. Asner GP. Tropical forest carbon assessment: integrating satellite and airborne mapping approaches. Environ Res Lett. 2009;4:034009.
31. Chen Q, Vaglio Laurin G, Battles JJ, Saah D. Integration of airborne lidar and vegetation types derived from aerial photography for mapping aboveground live biomass. Remote Sens Environ. 2012;121:108–17.
32. Gobakken T, Næsset E. Assessing effects of laser point density, ground sampling intensity, and field sample plot size on biophysical stand properties derived from airborne laser scanner data. Can J Forest Res. 2008;38:1095–109.
33. Wulder MA, White JC, Nelson RF, Næsset E, Ørka HO, Coops NC, et al. Lidar sampling for large-area forest characterization: A review. Remote Sens Environ. 2012;121:196–209.
34. Doody K, Howell K, Fanning E. Amani Nature Reserve: A biodiversity survey. East Usambara Conservation Area Management Programme, Technical Paper 52. In: Ministry of Natural Resources and Tourism Tanzania and Frontier-Tanzania. Tanga; 2001.
35. Hamilton AC, Bensted-Smith R. Forest conservation in the East Usambara mountains, Tanzania. Gland, Switzerland: IUCN; 1989.
36. Haglöf A. Users guide Vertex III and Transponder T3. Långsele, Sweden: Haglöf Sweden, AB; 2002.
37. Anon. Pinnacle User’s Manual. San Jose, CA, USA: Javad Positioning Systems; 1999.
38. Næsset E. Effects of differential single- and dual-frequency GPS and GLONASS observations on point accuracy under forest canopies. Photogramm Eng Remote Sens. 2001;67:1021–6.
39. Masota A. Tree allometric models for predicting above- and belowground biomass of tropical rainforests in Tanzania. In press.
40. Feldpausch T, Banin L, Phillips O, Baker T, Lewis S, Quesada C, et al. Height-diameter allometry of tropical forest trees. Biogeosciences. 2011;8:1081–106.
41. Banin L, Feldpausch TR, Phillips OL, Baker TR, Lloyd J, Affum-Baffoe K, et al. What controls tropical forest architecture? Testing environmental, structural and floristic drivers. Glob Ecol Biogeogr. 2012;21:1179–90.
42. Mugasha WA, Bollandsås OM, Eid T. Relationships between diameter and height of trees in natural tropical forest in Tanzania. Southern Forests: J For Sci. 2013;75:221–37.
43. Nilsson U, Agestam E, Ekö PM, Elfving B, Fahlvik N, Johansson U, et al. Thinning of Scots pine and Norway spruce monocultures in Sweden. 2010.
44. Ratkowsky DA, Giles DE. Handbook of nonlinear regression models. New York: Marcel Dekker; 1990.
45. Richards F. A flexible growth function for empirical use. J Exp Bot. 1959;10:290–301.
46. Winsor CP. The Gompertz curve as a growth curve. Proc Natl Acad Sci U S A. 1932;18:1.
47. Wykoff WR, Crookston NL, Stage AR. User's guide to the stand prognosis model. In: US Department of Agriculture, Forest Service, Intermountain Forest and Range Experiment Station. 1982.
48. Yang RC, Kozak A, Smith JHG. The potential of Weibull-type functions as flexible growth curves. Can J Forest Res. 1978;8:424–31.
49. Axelsson P. Processing of laser scanner data – algorithms and applications. ISPRS J Photogramm Remote Sens. 1999;54:138–47.
50. Axelsson P. DEM generation from laser scanner data using adaptive TIN models. Int Arch Photo Remote Sensing. 2000;33:111–8.
51. Næsset E. Practical large-scale forest stand inventory using a small-footprint airborne scanning laser. Scand J For Res. 2004;19:164–79.
52. Hudak AT, Crookston NL, Evans JS, Falkowski MJ, Smith AM, Gessler PE, et al. Regression modeling and mapping of coniferous forest basal area and tree density from discrete-return lidar and multispectral satellite data. Can J Remote Sens. 2006;32:126–38.
53. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013.
54. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–4.
55. Fox J, Weisberg S. An R companion to applied regression. United Kingdom: Sage; 2011.
56. Goldberger AS. The interpretation and estimation of Cobb-Douglas functions. Econometrica. 1968:464–72.
57. Yoo S, Im J, Wagner JE. Variable selection for hedonic model using machine learning approaches: A case study in Onondaga County, NY. Landsc Urban Plan. 2012;107:293–306.
58. Pinheiro J, Bates D, DebRoy S, Sarkar D, R Development Core Team. nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-103; 2013.
59. McRoberts RE, Næsset E, Gobakken T. Inference for lidar-assisted estimation of forest growing stock volume. Remote Sens Environ. 2013;128:268–75.
60. Särndal CE, Swensson B, Wretman J. Model assisted survey sampling. New York: Springer-Verlag; 1992.
Acknowledgements
Financial support for this research was provided by the Government of Norway through the two projects entitled “Climate Change Impacts, Adaptation and Mitigation (CCIAM) in Tanzania” and “Enhancing the Measuring, Reporting and Verification (MRV) of forests in Tanzania through the application of advanced remote sensing techniques”. We gratefully acknowledge our field team in Tanzania, and Terratec Norway for collecting and processing the ALS data. We are also grateful to the administration of ANR for all support, and especially for the provision of office space for establishment of the GPS base station.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All the authors have made substantial contributions towards the successful completion of this manuscript. EWM and OMB were involved in designing the study, drafting the manuscript, data analysis and write-up. EHH was involved in data analysis and quality control of both raw data and results. EN and TG were responsible for designing the ALS acquisition and were involved in revising the manuscript. REM was involved in critical discussion of the field inventory design and all logistics related to the field data acquisition. All authors read and approved the final manuscript.
Authors’ information
EWM and EHH are PhD students in forest inventory at the Norwegian University of Life Sciences (NMBU). They are both associated with the forest mensuration group at the university. OMB is a researcher in the same group, specialized in the application of ALS in forestry. EN and TG are senior scientists and professors in ALS and forest sampling at NMBU. Both EN and TG are resource persons for the forest mensuration group at NMBU. REM is professor in forest inventory and mensuration at Sokoine University of Agriculture, Tanzania.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Mauya, E.W., Hansen, E.H., Gobakken, T. et al. Effects of field plot size on prediction accuracy of aboveground biomass in airborne laser scanning-assisted inventories in tropical rain forests of Tanzania. Carbon Balance Manage 10, 10 (2015). https://doi.org/10.1186/s13021-015-0021-x
Keywords
 Airborne laser scanning
 Model-assisted estimation
 Plot size
 Aboveground biomass