Archive for category Analysis

Creating an Aspect-slope Map in GeoServer

An aspect-slope map (sometimes called a slope-aspect map) is an overlay of a semi-transparent slope map on top of an aspect map, with the aspect map styled in a unique hue for every 45 degrees of aspect (the compass direction the slope faces). To put it visually:

[Figure: aspect_slope_visual]

To do this in GeoServer, you need slope and aspect raster datasets in degrees. Create an SLD for each with the following syntax (I’m not including the whole SLDs here for brevity).

SLOPE SLD

<sld:ColorMapEntry color="#999999" opacity="1" quantity="10"/>
<sld:ColorMapEntry color="#999999" opacity="0.66" quantity="20"/>
<sld:ColorMapEntry color="#999999" opacity="0.33" quantity="30"/>
<sld:ColorMapEntry color="#999999" opacity="0" quantity="90"/>

Here we have a gray color being used to denote very flat areas where the slope is less than 10 degrees. The gray becomes increasingly transparent as the slope gets steeper, revealing the underlying aspect layer with more brightness in the steepest locations. These parameters could be tweaked to allow more or less brightness to show through.

ASPECT SLD

<ColorMapEntry color="#9AFB0C" quantity="22.5" />
<ColorMapEntry color="#00AD43" quantity="67.5" />
<ColorMapEntry color="#0068C0" quantity="112.5" />
<ColorMapEntry color="#6C00A3" quantity="157.5" />
<ColorMapEntry color="#CA009C" quantity="202.5" />
<ColorMapEntry color="#FF5568" quantity="247.5" />
<ColorMapEntry color="#FFAB47" quantity="292.5" />
<ColorMapEntry color="#F4FA00" quantity="337.5" />
<ColorMapEntry color="#9AFB0C" quantity="360" />

The colors and class breaks are from this Esri/Buckley blog entry, which in turn references the color scheme from Moellering and Kimerling’s MKS-ASPECT (GIS World, 1991). Each entry corresponds to a 45-degree range of compass direction; the first and last use the same color because north wraps from 337.5° through 0° to 22.5°. The colors, in order, are shown here:

[Figure: aspect_slope_palette, showing the eight aspect hues in order]

PUTTING THE LAYERS TOGETHER

Requesting both layers in a single WMS GetMap call mashes the two together. List the aspect layer first and the slope layer second; GeoServer draws the last layer in the list on top, so the semi-transparent slope sits over the aspect colors. I had the order switched around on my first try, and it took me a bit of time to realize that the second referenced layer draws on top. And no, the finished map does not have a hillshade underneath. The combination of aspect and slope creates that “hillshade” effect.
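Here’s a minimal sketch of what such a request can look like from Python; the GeoServer endpoint, workspace, layer names, style names, and bounding box are all hypothetical placeholders, so substitute your own.

import requests

# Hypothetical GeoServer endpoint and layer/style names; substitute your own.
WMS_URL = "http://localhost:8080/geoserver/wms"

params = {
    "service": "WMS",
    "version": "1.1.1",
    "request": "GetMap",
    # Aspect first, slope second: the last listed layer is drawn on top,
    # so the semi-transparent slope overlays the aspect colors.
    "layers": "topo:aspect,topo:slope",
    "styles": "aspect_style,slope_style",
    "bbox": "-123.2,47.0,-122.2,48.0",
    "srs": "EPSG:4326",
    "width": 800,
    "height": 800,
    "format": "image/png",
}

response = requests.get(WMS_URL, params=params)
with open("aspect_slope.png", "wb") as f:
    f.write(response.content)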


In Our Defense

I bill myself as a data scientist. After all, roughly 50% of any GIS or cartography project involves data wrangling, and knowledge of statistics and geo-specific analytics is imperative to getting complex maps right. Of course, as with many tech fields, the tools are always changing and there always seems to be something new to learn.

However, I take issue with this little snippet in Sunday’s NY Times from David J. Hand. Speaking about geographic clusters*, he wags his finger at us and pontificates, “…if you do see such a cluster, then you should work out the chance that you would see such a cluster purely randomly, purely by chance, and if it’s very low odds, then you should investigate it carefully.” See the short article here.

Granted, he’s probably reacting to the surfeit of maps circulating on the internet claiming to prove this, that, or the other when in fact they are mostly bogus. For example, Kenneth Fields tweeted one such abomination just this morning.

Jonah Atkins has created a GitHub repository called Amazing-Er-Maps for sharing remedies to bad maps like that (the name is itself a reaction to “Amazing Maps,” a Twitter account that at times showcases maps of questionable quality).

Amazing-Er-Maps, as I understand it, is a place where you upload a folder containing a link to a bad map along with a new map that covers similar ground but does a better job, including the data and any code that goes with it. It’s a fabulous idea: don’t just complain about bad maps, make them better in a way the whole community can learn from and be inspired by. Check it out; Jonah’s already got it going with several fun examples. Super warm-fuzzies.

Circling back to Mr. Hand, he has a point: we need to apply sound statistical and mathematical reasoning to our datasets and the maps we make from them. For example, when I was helping the Hood Canal Coordinating Council map septic system points, I didn’t just provide maps for them to visually inspect for clusters of too-old septics; I produced a map of statistically significant clusters of too-old septics using hierarchical nearest neighbor clustering, which provides a confidence level for the chance that a given cluster could be random.

The point is, those of us who already follow sound data-mapping practices don’t like to be lumped in with the creators of maps that are produced, let’s face it, as sensational products. Our little map community is challenging the bad maps out there, creating great ones for our clients and bosses, and continuing to learn how to make them better. Give us a bit more credit here and check out some of the really amazing things we’ve done.

*On an exciting note, “geographic clusters” makes the mainstream news media!


Pairwise Primer

Note: This is a really basic primer that leaves out all the math behind the technique. I built a spreadsheet several years ago that does all the calculations, which you could modify. Ask me if you want it…

Creating a GIS decision model often involves weighting criteria to reflect their relative contributions to the model, or their effect on the variable being measured. To do this, we usually start by ranking the inputs to the model in order of importance and then set weights according to that ranking. The ranking and weights can be chosen by one person or by a group, but either way the process often winds up being a “whoever shouts the loudest wins” kind of thing rather than a disciplined, scientific ranking based on facts.

For example, let’s say you have a simple erosion model with some inputs: aspect, slope, and soil type. It’s tempting to use your subjective reasoning, based on intuition and prior experience, to give a weight of, say, 40% to slope, 10% to aspect, and 50% to soil type (I’m completely making these up). But do we know for sure that slope should be 40% and not perhaps 45%? While it isn’t possible to get around some subjectivity, it is possible to do this in a more rigorous manner.
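Just for context, here’s a minimal sketch of how weights like these are typically applied once you have them: a weighted linear combination of reclassified criterion rasters. The tiny arrays and the 40/10/50 split are stand-ins based on the made-up example above, not a real erosion model.

import numpy as np

# Stand-ins for reclassified criterion rasters, each scored on a common 0-10 scale.
slope_score = np.array([[2, 5], [8, 9]], dtype=float)
aspect_score = np.array([[4, 4], [6, 7]], dtype=float)
soil_score = np.array([[3, 6], [7, 8]], dtype=float)

# The made-up weights from the text; they should sum to 1.
weights = {"slope": 0.40, "aspect": 0.10, "soil": 0.50}

erosion_risk = (weights["slope"] * slope_score
                + weights["aspect"] * aspect_score
                + weights["soil"] * soil_score)
print(erosion_risk)

The rest of this post is about arriving at those weights in a less arbitrary way.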

I’ll go over the basics of the pairwise comparison method here (for a very thorough discussion and implementation plan, you must read GIS and Multicriteria Decision Analysis by Malczewski). Essentially, you take all of the model’s criteria that you want to weight, put them in a matrix where they are repeated on both the horizontal and vertical axes, then fill in the cells where they meet with numbers representing their relative importance. The brilliance of this is that you only ever compare two criteria at a time instead of trying to rank the whole list at once.

For each comparison you ask yourself or your team: is criterion X much more important than criterion Y, somewhat more important, or of equal importance? That’s the essence of it, though when you run an actual pairwise comparison you use nine gradations of importance running the gamut from equal to extreme, with 1 meaning the two are of equal importance and 9 meaning the horizontal criterion is of extreme importance relative to the vertical criterion.

Once all of the numbers are filled in to the matrix, you run a bunch of calculations on them and at the end you get an ordered list of criteria with a weight for each. But wait, there’s more! You also get a consistency check on the numbers, which tells you whether your pairwise judgments hang together (if, for example, you said A matters more than B and B more than C, but also C more than A, the check will fail and you should revisit the comparisons). And if every pair comes out as essentially equal, the resulting weights will all be nearly the same; in that case there is little point in weighting, and the criteria should just go into the model without being weighted at all.
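To make that concrete, here’s a minimal sketch of the weight derivation and the consistency check, using the common column-normalization approximation and Saaty’s consistency ratio rather than Malczewski’s exact presentation; the comparison matrix is made up for the erosion example.

import numpy as np

# Hypothetical 3x3 pairwise comparison matrix for slope, aspect, and soil type.
# A[i, j] is how much more important criterion i is than criterion j on the
# 1-9 scale; the lower triangle holds the reciprocals.
criteria = ["slope", "aspect", "soil"]
A = np.array([
    [1.0, 5.0, 1/2],
    [1/5, 1.0, 1/7],
    [2.0, 7.0, 1.0],
])

# Approximate the priority vector: normalize each column, then average the rows.
weights = (A / A.sum(axis=0)).mean(axis=1)

# Consistency check (Saaty): CI = (lambda_max - n) / (n - 1), CR = CI / RI.
n = A.shape[0]
lambda_max = (A @ weights / weights).mean()
CI = (lambda_max - n) / (n - 1)
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}[n]
CR = CI / RI if RI else 0.0

for name, w in zip(criteria, weights):
    print(f"{name}: {w:.2f}")
print(f"consistency ratio: {CR:.3f}  (under ~0.10 is usually considered acceptable)")

With these invented judgments the weights come out with soil type highest, then slope, then aspect, and the consistency ratio lands well under 0.10.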

If this sounds like the path you want to follow for your next modeling endeavor, check out the Malczewski book mentioned above and follow his calculations. I recommend building a spreadsheet with all the calculations in it so that you can go back and change your comparisons if you find you need to.


Friday Resources

  • You can download SF 1 Census data, with all the states compiled into a single geodatabase, from a relatively new “blog + extras” site called gisnuts.com. They get bonus points for the GIS quips in the upper-right corner of the homepage.

  • If anyone needs a continental U.S. polygon representing the landward borders plus a 5-nautical-mile buffer extending seaward from the coastlines, let me know; I created it today out of several datasets.

  • If you’ve never used the ArcGIS extension ET GeoWizards, be sure to check it out. I find the polygon tools, like advanced merge (where you can intersect datasets in several useful ways) and fill holes, very useful for getting data creation tasks done; a rough open-source equivalent of those two operations is sketched below. You can do everything with the free version as long as the input datasets have fewer than 100 features.
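For anyone working without the extension, here’s a minimal sketch of roughly equivalent operations using the open-source Shapely library; this is not ET GeoWizards’ API, just a stand-in with made-up geometries.

from shapely.geometry import Polygon

# A square with a hole in it, standing in for a real polygon feature.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
poly_with_hole = Polygon(outer, holes=[hole])

# "Fill holes": rebuild the polygon from its exterior ring only.
filled = Polygon(poly_with_hole.exterior)
print(filled.area)  # 100.0 -- the hole is gone

# An intersection in the spirit of the advanced merge tools.
other = Polygon([(5, 5), (15, 5), (15, 15), (5, 15)])
print(filled.intersection(other).area)  # 25.0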


Tobler’s Law: Critical Questions to Ask

Tobler’s first law of geography (TFL), that near things are more related to each other than distant things*, is employed in spatial models used to demonstrate, explain, and predict phenomena. The law underlies models of migration and trip behavior, population-center growth, meme spread, and the causes of disease, to name a few.

In its essence, TFL is the basis of most spatial analytical procedures. While it is not claimed to be immutable fact, it is often considered close to it, and it is certainly a concept with wide applicability and usefulness.
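As one small illustration of the law at work, here’s a minimal sketch of inverse distance weighting, one of the simplest interpolation procedures built directly on the assumption that nearer observations matter more; the points, values, and power parameter are arbitrary.

import numpy as np

# Known observations: arbitrary sample locations and measured values.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
values = np.array([10.0, 12.0, 9.0, 20.0])

def idw_estimate(target, points, values, power=2.0):
    """Inverse-distance-weighted estimate at a target location:
    nearer points get larger weights, a direct expression of TFL."""
    d = np.linalg.norm(points - target, axis=1)
    if np.any(d == 0):               # target coincides with an observation
        return values[d == 0][0]
    w = 1.0 / d**power
    return np.sum(w * values) / np.sum(w)

print(idw_estimate(np.array([0.5, 0.5]), points, values))

Those 1/d² weights are the “near things are more related” assumption made explicit, and where the assumption breaks down, as in the shoreline example further on, the estimate can be badly wrong.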

However, little has been said in dispute of what seems to be a unanimously agreed-upon viewpoint, and some important questions arise if we take the stance of devil’s advocate. When we accept a model universally, missing pieces go undiscovered and even downright wrong implications can be held as true.

To draw a parallel with the financial markets: options are occasionally “underpriced” because the standard pricing model presupposes that prices for commodities, stocks, or currencies hover around their most recent levels in bell-curve-like fashion. Extreme price changes are not priced into the model** underlying options prices, which creates opportunities for someone who anticipates an extreme change in the future (such as when a firm comes under investigation or a key product is recalled). Some hedge funds have made large sums of money taking advantage of this fact.

Getting back to Tobler’s first law, it would be prudent to ask certain questions of it so that we better understand the risks we take on when we classify the world this way, just as it is good to have a thorough understanding of the risks and opportunities in the options market before betting the farm.

Using an example from ecology, one might apply TFL to a model and incorrectly assume that nearby ecosystems are more related to one another than far-away ecosystems. In a hyper-local model, that assumption could be mostly right. At some point, though, trouble is encountered, such as where the geography changes abruptly at a mountain range, a stream corridor, or a seashore. Nearshore ecology, for example, is more likely to resemble nearshore ecology elsewhere in the country than the backshore ecology right next to it.

Of course, Tobler himself has stated that TFL is not necessarily true in every instance.*** The questions we should be asking are:

1. How can TFL fail us?
2. What could TFL miss?
3. If TFL fails us or obscures an important truth, what would be the implications?
4. Would these implications be inconsequential or very large?

* There’s another part to the law that states that “everything is related to everything else”, which is not addressed in this post.
** See the Black-Scholes model, which is still used, but which does not allow for extreme changes in price that can actually occur.
*** Sui, D. (2004). “Tobler’s First Law of Geography: A Big Idea for a Small World?” http://www.geographdy.com/blog/wp-content/uploads/2009/09/sui_2004.pdf


Big Data in the News Again

I’ve talked about big data on this blog before (Demand for GIS Analysts on the Rise? and Big Data Articles Everywhere) and also mentioned it at the end of my recent talk with James Fee. So it was with much interest that I read Harvard Business Review’s headline article this month titled “Big Data: The Management Revolution” by Andrew McAfee and Erik Brynjolfsson.

In it, the authors discuss how Big Data is different from regular data and report the results of their study of 330 public North American companies, which sought to find out whether businesses that use Big Data analytics actually perform better financially than businesses that don’t. I highly recommend getting the article to see the results for yourself; you could read it at your local library if you don’t have a subscription, though I find the Kindle version to be quite worth the cost.

Now, I realize that those of us who specialize in spatial analytics and mapping are interested in making a difference in all kinds of ways, not just in big-business financials. Still, we can assume the outcomes of this research are applicable to improving performance in whatever field of expertise you currently work in, whether it’s local government, natural resources, utilities, or something else.

In natural resources, the field I work in, I’ve thought about it in terms of one of the longest-standing analytical subjects I’ve been involved with: salmon habitat in the Pacific Northwest. For over 10 years the approach has been to work with salmon scientists to determine what factors are important in the salmon habitat equation, where those resources are available on the ground, how they might fare in the future, and where the potential risks are, to name a few of the relevant questions.

And yes, those datasets can be quite large (think individual tree ages based on LiDAR; more on that here and here), but they are not real time. One of the differences between regular data and Big Data is the real-time nature of Big Data. What I foresee making a big impact along these lines is seeing data on fecal coliform levels or septic system failures in real time, for example, so that measures can be taken immediately to ameliorate their impacts on salmon. This is only one way Big Data could affect my work; I’m sure there are many other things that could be done once the concept takes hold that I haven’t even begun to be cognizant of. The possibilities!
