Archive for category Analysis

Calculating Variable Width Buffers on a Stream Segment Based on Elevation

GOAL: variable width buffer based on elevation
RESULT: What we’ve got is mostly correct, but with the following issue.

In the example below, the stream line is shown in blue and some retention ponds are visible in the aerial photo near that stream:

Next, we take that stream segment and make it into a grid with the same extent as our elevation grid. It looks like this, where each cell has a slightly different elevation value (streams generally go downhill!):

Then we create a Euclidean allocation grid from those stream elevations. It simply assigns the elevation of the nearest source stream pixel to every other cell. For example, the yellow area is all assigned a pixel value of 148817 (meters × 100, since I multiplied the elevation grid by 100 to make it an integer grid), which is the value of the stream pixel that the yellow area emanates from.
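To make the allocation step concrete, here is a minimal pure-Python sketch of the idea, not the ArcMap tool itself. The function name `euclidean_allocation`, the use of `None` for non-stream cells, and the 3 x 3 toy grid are all assumptions for illustration; the elevation values are borrowed from this post but arranged artificially.

```python
from math import hypot


def euclidean_allocation(source):
    """Give every cell the value of its nearest source cell.

    `source` is a list of rows. Stream cells hold an integer elevation
    (meters x 100); every other cell holds None. (Hypothetical layout.)
    """
    # Collect (row, col, value) for every stream cell.
    seeds = [(r, c, v)
             for r, row in enumerate(source)
             for c, v in enumerate(row)
             if v is not None]
    if not seeds:
        raise ValueError("no source cells in grid")

    allocated = []
    for r, row in enumerate(source):
        out_row = []
        for c in range(len(row)):
            # Brute-force nearest-seed search; fine for a toy grid only.
            nearest = min(seeds, key=lambda s: hypot(s[0] - r, s[1] - c))
            out_row.append(nearest[2])
        allocated.append(out_row)
    return allocated


# Toy stream running down the middle column, dropping in elevation.
stream = [
    [None, 148987, None],
    [None, 148817, None],
    [None, 148801, None],
]
allocation = euclidean_allocation(stream)
```

In this toy grid each row simply inherits the elevation of the stream cell in its own row, which is the same "emanating out" behavior described for the yellow area.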

The next step is to subtract the Euclidean grid from the true elevation grid so that, for each pixel, we determine how much higher that pixel is than the closest stream pixel. Ideally, this would produce a variable width buffer once you set a threshold for how high you want to go. For example, if I set the threshold at 1/2′ (which is 15 units in my data; remember, the values are meters multiplied by 100), then the resulting “buffer” looks like this, in green:

The area shaded in green models the locations around the stream that are within a half-foot elevation rise of the stream. Areas not shaded green are more than half a foot higher than the stream. Note that I’ve left out some of the details: subtracting the Euc grid from the elevation grid, then testing (with a Con statement in ArcMap in this case) whether the result is above or below 15. Where it is below 15, we assign the pixel a 1, and where it is above 15, we assign a NoData value.
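The subtract-and-test step can be sketched in a few lines of plain Python. This is only an illustration of the logic, not the ArcMap Con syntax. The names `threshold_units` and `elevation_buffer` are hypothetical, `None` stands in for NoData, and treating a rise of exactly 15 as inside the buffer is my assumption, since the description above only covers "above" and "below".

```python
FEET_TO_METERS = 0.3048


def threshold_units(rise_feet, scale=100):
    """Convert a rise in feet to integer grid units (meters x scale)."""
    return round(rise_feet * FEET_TO_METERS * scale)


def elevation_buffer(elevation, allocation, threshold):
    """Return 1 where a cell is within `threshold` of its nearest stream
    elevation, and None (standing in for NoData) everywhere else."""
    result = []
    for elev_row, alloc_row in zip(elevation, allocation):
        result.append([
            1 if (elev - stream_elev) <= threshold else None
            for elev, stream_elev in zip(elev_row, alloc_row)
        ])
    return result


half_foot = threshold_units(0.5)  # 0.5 ft = 0.1524 m, so 15 units

# One toy row: true elevations and the allocated stream elevation.
elevation = [[148990, 148820, 148830]]
allocated = [[148817, 148817, 148817]]
buffer_row = elevation_buffer(elevation, allocated, half_foot)
```

Here the first cell rises 173 units above the stream and drops out, while the other two rise 3 and 13 units and stay in the buffer.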

The model works pretty well in many areas. As can be expected, it looks even better with coarser increments. If we calculate the 7′ elevation rise, for example, the resulting buffer does not have a lot of strange “jogs” out from the stream line. I’ve chosen this particular example location because the retention ponds make the problem very clear. One would expect that the entire retention pond, which has a completely even elevation surface, would be in the buffer if part of it is. However, in this case, the anomaly lies with the Euclidean allocation raster. Because it assigns the closest stream elevation to each pixel, a large enough change in stream elevation from one source pixel to a neighboring one will put one part of the pond “in” the buffer and another part “out” once the grids are subtracted.

In this case, the pink part of the Euclidean allocation grid, in the upper left quadrant, comes from a source pixel having elevation equal to 148,987.
The purple area of the Euclidean grid just below that pink area has an elevation value equal to 148,801.
The retention pond has an elevation of 148,990 throughout its area (well, at least until you get to the slight rise at its edges).

For the pink part then, 148,990 – 148,987 = 3. Clearly, 3 is within our threshold of half a foot (which is 15 units).
For the purple part, 148,990 – 148,801 = 189. Clearly, 189 is not within our half-foot threshold. Thus, it doesn’t show up, even though the retention pond has an even elevation throughout!
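The same arithmetic as a tiny Python check, using the pond and stream values quoted above. The variable names are made up for illustration, and counting a rise of exactly 15 as "in" is an assumption.

```python
POND_ELEVATION = 148990   # flat pond surface, meters x 100
THRESHOLD = 15            # half a foot in the same units

# Stream elevation allocated to each part of the pond.
nearest_stream = {"pink": 148987, "purple": 148801}

# Rise of the pond surface above the allocated stream elevation.
rise = {zone: POND_ELEVATION - elev for zone, elev in nearest_stream.items()}

# Same pond elevation, different answer depending on the allocation zone.
in_buffer = {zone: r <= THRESHOLD for zone, r in rise.items()}
```

The pond surface is identical in both zones; only the allocated stream elevation differs, and that alone flips the result.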

At this stage in the analysis we must quantify the effect of this anomaly on the overall model’s accuracy.

(I noted at the end of How To Create an Inside Buffer Based on Elevation that a cost distance raster could be used to do what I describe above. However, cost distance works only for local areas where the stream elevation remains constant. In this case, the stream elevations change from mountain elevations on down to plains elevations.)


Global Patterns, Local Exceptions

Yesterday a graduate student asked about my farmers’ market analysis, because she is TA-ing a university course on data collection methods and research. Her questions reminded me to alert her to the fact that data patterns are not necessarily constant across scales. For example, farmers’ market correlations may be seen in a global or national dataset but may not be prevalent at the local city level. Conversely, patterns seen at the local city level may not be seen in the national or global map.

Furthermore, focusing on local exceptions instead of global or national regularities may be more meaningful, especially if the data are of high enough resolution to provide adequate insight. While I’m not sure whether the USDA farmers’ market dataset will show patterns at a local level, I’m sure that it is worth trying. This approach also allows more thorough data quality assurance: with fewer datapoints (fewer than 20 per city), they can easily be verified and added to as needed if one is looking at just a single city.

This discussion reminds all of us analysts that “it might be incorrect to assume that the results obtained from the whole data set represent the situation in all parts of the study area.”* I’d be happy to hear your thoughts on this.

*See Quantitative Geography by A. Stewart Fotheringham, Chris Brunsdon, and Martin Charlton for further reading (p11 especially).
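As a small illustration of how a pattern can reverse between scales, here is a sketch with entirely made-up numbers (they are not farmers’ market data). The city names, values, and the hand-rolled `pearson` helper are all hypothetical. Within each invented city the two variables move in opposite directions, yet the pooled data show a strong positive correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Synthetic (x, y) observations for two invented cities.
cities = {
    "city_a": ([1, 2, 3, 4], [10, 9, 8, 7]),
    "city_b": ([11, 12, 13, 14], [20, 19, 18, 17]),
}

# Correlation inside each city.
local_r = {name: pearson(xs, ys) for name, (xs, ys) in cities.items()}

# Correlation when both cities are pooled into one dataset.
all_x = [x for xs, _ in cities.values() for x in xs]
all_y = [y for _, ys in cities.values() for y in ys]
pooled_r = pearson(all_x, all_y)
```

With these numbers each city has a perfectly negative local correlation, while the pooled correlation is about 0.9, so the "global" result would misdescribe both places.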


Farmers’ Markets, More Analysis

I continue to explore the USDA Farmers Market database. See previous posts on this dataset here. Taking a look at just the Colorado portion of the data, we see there are 119 market points, shown as purple dots on the interactive map below:


It is interesting to see what happens when we quantify farmers’ markets, population density, and obesity rates. The number of farmers’ markets by population actually remains fairly constant (and fairly small) within all four categories: 0.001735%, 0.001975%, 0.001929%, 0.001875%* for categories 1-4 respectively where the categories are as follows:

1: 12-25% obesity
2: 25-30% obesity
3: 30-35% obesity
4: 35-45% obesity

However, the population density alone is very telling within the four categories (again from 1 through 4): 150/sq mi, 78/sq mi, 65/sq mi, 45/sq mi.*
*These figures are for the whole country, not just Colorado.
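For anyone who wants to reproduce this kind of summary, here is a rough sketch of how the two figures per category could be computed from county records. The records below are invented placeholders, not values from the USDA data, and the field names and `summarize` function are hypothetical.

```python
# Hypothetical county records; NOT values from the USDA dataset.
counties = [
    {"category": 1, "markets": 12, "population": 600_000, "area_sq_mi": 4_000},
    {"category": 1, "markets": 3, "population": 150_000, "area_sq_mi": 1_000},
    {"category": 4, "markets": 2, "population": 90_000, "area_sq_mi": 2_000},
]


def summarize(records):
    """Total markets, population, and area per obesity category, then
    derive markets as a percent of population and people per sq mi."""
    totals = {}
    for rec in records:
        t = totals.setdefault(
            rec["category"],
            {"markets": 0, "population": 0, "area_sq_mi": 0},
        )
        for key in ("markets", "population", "area_sq_mi"):
            t[key] += rec[key]
    return {
        cat: {
            "markets_pct_of_population": 100 * t["markets"] / t["population"],
            "density_per_sq_mi": t["population"] / t["area_sq_mi"],
        }
        for cat, t in totals.items()
    }


summary = summarize(counties)
```

Summing within each category before dividing keeps large counties from being weighted the same as small ones, which is one reasonable choice among several.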

The correlation between population density alone and obesity is rather strong (also see: ). Distances to farmers’ markets are likely to be much shorter in more densely populated places, which may be a factor. The opposing viewpoint is that fast food and other likely anti-correlates are also closer. Another point to make could be that any effect of farmers’ markets on obesity might be a much more local scale issue, and if so, this data is not fine enough to capture that.

Here’s a micromap I put together with the market counts and the population density, by county.

Population per square mile:

Proportion of markets by population:

These three graphics just show population and farmers’ markets. Next I need to add in the obesity information so that we can have a visual of that with all the rest.


The Power of Maps to Save a Species

The Power of Maps to Save a Species is the title of a short introductory video on the topic that I put together a few weeks ago. If you’ve been wondering what it is that I do in my daily GIS analysis life–aside from this blog, book, and article writing–this will give you an idea. I mostly focus on salmon. A while back I even made these pens to give out (they say GIS FOR SALMON on them).

Here’s the video, created via webcam. As someone mentioned, the sound quality could be better. I have a nice recording microphone that I should hook up next time I try this. But for now…


Big Data (Articles) Everywhere

A little while ago I wrote about how demand for GIS analysts may be on the rise (Demand for GIS Analysts on the Rise?). That post referenced a report from the McKinsey Global Institute called Big Data: The next frontier for innovation, competition, and productivity. Here are a few more articles that reference that report:

The Age of Big Data – New York Times
Big data is likened to a microscope. We can now see things that we couldn’t before – such as how political beliefs spread.

Data scientist: the hot new gig in tech – CNN Money
This article notes that there’s a course at Stanford on data mining that five years ago had 20 students but that most recently had 120 students. It also notes that there are now blogs about big data as well as a data scientist summit.

Big Data prep: 5 things IT should do now – PC Advisor
This article points out that big data refers to more than just the structured data that organizations collect for specific purposes. Indeed, big data is also about the data that is now available via social networks and other means, which the organization isn’t collecting itself, but which could have implications for how the organization does business if people know how to harness it and what to use it for.


Newish Open Source Extensions to ArcGIS

If you do GIS analysis with ArcGIS you might be interested in a couple of open source extensions that add toolbox tools to ArcGIS. Karsten Vennemann is an expert on open source GIS solutions. He recently gave a presentation to the King County GIS Center User Group and posted his slides for everyone to benefit from. This is a great way to get a good overview of what these powerful extensions can do.
