Standard Error or Standard Deviation for error associated with averaging raster values within a polygon?

Question

I am trying to measure uncertainty associated with a mean value for spatial data. I have:

A polygon representing a spatial district.
A raster dataset that was produced by a Random Forests model, with an associated RMSE.

I want to calculate the average value of the raster data within the district. My understanding is that I should calculate the mean and then the standard error of the mean, which represents how far the sample mean is likely to be from the true mean.
My issue is that I don't know whether to treat the raster data as a population or a sample. Including all raster pixels accounts for all values within the district, which drives the standard error close to 0. However, taking only a sample of the raster values from within the district seems incorrect as it is an arbitrary decision to omit available data.
I am currently propagating uncertainty by taking the root sum of squares of the standard error and the model RMSE. I think this is correct, but the standard error contributes almost no uncertainty because there are so many raster values (>10,000).
Can anyone clarify how to think about this problem? I have not been able to find much material on summarizing raster data from a traditional sampling perspective. References or additional reading would be greatly appreciated.
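To make the calculation described above concrete, here is a minimal sketch in Python with NumPy. The pixel values are synthetic and the RMSE of 5.0 is made up; both are assumptions for illustration, not my real data, and the function name `mean_with_uncertainty` is hypothetical.

```python
import numpy as np


def mean_with_uncertainty(pixel_values, model_rmse):
    """Return mean, standard deviation, standard error, and combined uncertainty.

    The combined value is the root sum of squares of the standard error
    and the model RMSE, as described in the question.
    """
    values = np.asarray(pixel_values, dtype=float)
    values = values[~np.isnan(values)]       # drop NoData pixels stored as NaN
    n = values.size
    mean = values.mean()
    sd = values.std(ddof=1)                  # sample standard deviation
    se = sd / np.sqrt(n)                     # standard error of the mean
    combined = np.sqrt(se**2 + model_rmse**2)
    return mean, sd, se, combined


# Synthetic stand-in for the raster pixels inside the district (assumption).
rng = np.random.default_rng(0)
pixels = rng.normal(loc=50.0, scale=10.0, size=10_000)

mean, sd, se, combined = mean_with_uncertainty(pixels, model_rmse=5.0)
print(f"mean={mean:.3f} sd={sd:.3f} se={se:.4f} combined={combined:.4f}")
```

With 10,000 pixels the standard error is about one hundredth of the standard deviation, so the combined value is almost entirely the RMSE, which is exactly the behaviour I am unsure about.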
