Standard error or standard deviation for the uncertainty of a mean of raster values within a polygon?

I am trying to measure uncertainty associated with a mean value for spatial data. I have:

  • A polygon representing a spatial district.
  • A raster dataset that was produced by a Random Forests model, with an associated RMSE.

I want to calculate the average value of the raster data within the district. My understanding is that I should calculate the mean and then the standard error of the mean, which represents how far the sample mean is likely to be from the true mean.

My issue is that I don’t know whether to treat the raster data as a population or a sample. Using every raster pixel covers all values within the district, and the resulting large n drives the standard error close to 0. However, taking only a sample of the raster values from within the district seems incorrect, because omitting available data would be an arbitrary decision.
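A minimal sketch of this effect, using synthetic values in place of the real raster. The pixel values, their mean and spread, and the helper name `standard_error` are all hypothetical, and the calculation treats pixels as independent draws, which spatially autocorrelated raster cells generally are not:

```python
import math
import random
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


random.seed(0)
# Hypothetical pixel values standing in for the raster cells in the district.
pixels = [random.gauss(50.0, 10.0) for _ in range(10_000)]

# The standard error shrinks roughly as 1 / sqrt(n) as more pixels are included.
for n in (100, 1_000, 10_000):
    print(n, round(standard_error(pixels[:n]), 3))
```

The spread of the pixel values stays about the same at every n; only the denominator grows, which is why the standard error becomes so small once all pixels are used.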

I am currently propagating uncertainty by taking the root sum of squares of the standard error and the model RMSE. I think this is correct, but the standard error contributes almost no uncertainty because there are so many raster values (>10,000).
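The combination described above can be sketched as follows. The function name `combined_uncertainty` and the numbers are hypothetical, chosen only to show the scale of the effect, and the root sum of squares assumes the two components are independent:

```python
import math


def combined_uncertainty(standard_error, model_rmse):
    """Root sum of squares of two uncertainty components (assumed independent)."""
    return math.sqrt(standard_error**2 + model_rmse**2)


# Hypothetical values for illustration only, not taken from the real data.
se = 0.1    # standard error of the mean over >10,000 pixels
rmse = 5.0  # RMSE reported for the Random Forests model

total = combined_uncertainty(se, rmse)
print(total, total - rmse)
```

With a standard error this much smaller than the RMSE, the combined value is almost identical to the RMSE alone, which is the behaviour described in the paragraph above.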

Can anyone clarify how to think about this problem? I have not been able to find much material on summarizing raster data from a traditional sampling perspective. References or additional reading would be greatly appreciated.

Cross Validated Asked by jbukoski on December 28, 2020

