Archive for category Workflow
Huge Increase in Sharability by Combining Git and QGIS
After tweeting today about the Unmitigated Amazingness that is a QGIS + Git workflow, someone suggested that I write a blog about my experiences in this regard. Unfortunately today is a deadline day for a portion of what will become my next book* so I can’t put a lot of time into a full-blown explanation of how this workflow will CHANGE your life. But I can give you a taste.
To that end, here it is in a nutshell. I may be leaving out some important bits of information, and because I suspect a lot of people out there have never used this workflow, I’m deliberately not using the technical Git terms (pull, push, etc.), just to keep it simple:
- You begin by installing Git on your machine
- Unless you want to use command-line Git, the two combinations that I’m familiar with are: Bitbucket for your online stuff** and SourceTree to manage updating that stuff, OR GitHub for your online stuff and GitHub Desktop to manage updating that stuff
- You create a project (aka “repo”) on Bitbucket or GitHub
- Copy it locally via SourceTree or GitHub Desktop
- (alternatively you can create it locally and then create it in the cloud)
- I suggest that all the geodata you’ll use goes in one folder within the repo while all the QGIS projects you create go in another. Any images or other odd things that you need in your QGIS projects could go in a Misc folder
- You do your work normally: create a QGIS project, add data, but do it all within that repo folder on your machine
- Open SourceTree or GitHub Desktop on your machine and it’ll tell you that you made changes, for example that you added data and created a project. You can choose whether you want all of that to be put in your Bitbucket or GitHub cloud. If you do want it up in the cloud, you use one of those programs to sync it up with your cloud repo
- Your collaborators simply use their own SourceTree or GitHub Desktop programs to put that project and its data on their machine exactly as you uploaded it. If they make changes that they want you to see, they can also sync those up, and your SourceTree or GitHub Desktop alerts you about the changes
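For anyone who does want the command-line version, here is a rough sketch of how the steps above map onto Git commands. It only builds and prints the commands; it doesn’t run them. The repo URL, folder name, and commit message are made-up placeholders, and the helper name `sync_plan` is just for illustration.

```python
from __future__ import annotations

from pathlib import Path


def sync_plan(repo_url: str, local_dir: Path, message: str) -> list[list[str]]:
    """Return command-line Git equivalents of the desktop-app steps (dry run only)."""
    repo = str(local_dir)
    return [
        ["git", "clone", repo_url, repo],               # copy the cloud repo to your machine
        ["git", "-C", repo, "add", "--all"],            # stage the data and projects you added
        ["git", "-C", repo, "commit", "-m", message],   # record those changes locally
        ["git", "-C", repo, "push"],                    # sync your changes up to the cloud repo
        ["git", "-C", repo, "pull"],                    # later: bring down collaborators' changes
    ]


# Hypothetical URL and names, for illustration only.
plan = sync_plan(
    "https://example.com/you/map-project.git",
    Path("map-project"),
    "Add first QGIS project and geodata",
)
for cmd in plan:
    print(" ".join(cmd))
```

SourceTree and GitHub Desktop wrap these same operations behind buttons, which is why you can get by without ever typing them.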
And guess what?! In this way your QGIS project and all the files it uses are easily synced with other people. You don’t have to zip anything up. You don’t have to locate all the places where you put your data because you’ve already put it all in that repo/geodata folder. There is NO repairing of data source paths on your collaborator’s end! Think of the possibilities! It is truly a wonderful thing.
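Here is a small sketch of why the paths survive the trip. It assumes the QGIS project is set to save relative paths (check the project properties if you aren’t sure) and that the repo uses the folder layout suggested above. The folder names, `seattle.qgs`, and `parks.shp` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def make_repo_layout(root: Path) -> dict:
    """Create the suggested folders inside a repo: geodata, projects, misc."""
    folders = {name: root / name for name in ("geodata", "projects", "misc")}
    for folder in folders.values():
        folder.mkdir(parents=True, exist_ok=True)
    return folders


def relative_source(project_file: Path, data_file: Path) -> str:
    """Path from the project's folder to a data file, as a relative path would store it."""
    return Path(os.path.relpath(data_file, start=project_file.parent)).as_posix()


def demo(clone_root: Path) -> str:
    folders = make_repo_layout(clone_root)
    project = folders["projects"] / "seattle.qgs"   # hypothetical project file
    data = folders["geodata"] / "parks.shp"         # hypothetical data file
    return relative_source(project, data)


with tempfile.TemporaryDirectory() as tmp:
    # Two clones of the same repo in completely different places on disk.
    mine = demo(Path(tmp) / "me" / "map-repo")
    theirs = demo(Path(tmp) / "collaborator" / "work" / "map-repo")
    print(mine)            # ../geodata/parks.shp
    print(mine == theirs)  # True
```

Because everything lives inside the repo, the path from project to data is the same no matter where each person puts their copy.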
Now, I really am sure that I’ve left a whole lot of info out while trying to create this simple bird’s-eye view of the process but hopefully this provides a taste of the possibilities so that you can go learn more. After using both the Bitbucket/SourceTree workflow and the GitHub/GitHub Desktop workflow I personally find the GitHub/GitHub Desktop workflow to be a bit easier. Its desktop program is a little more streamlined as it “exposes” less of the advanced capabilities.
————Edited 9/1/2015 to add: Soon after posting this a reader pointed out that James Fee and I had coincidentally written about similar topics on our blogs yesterday. His topic was spatial DATA versioning while mine was spatial PROJECT versioning. To be clear, the project-sharing that I’m talking about in this post doesn’t really involve changing data at all. In fact, what I’ve been doing is collaborating with someone else on cartography designs using QGIS, and we needed a way to see each other’s designs (i.e., QGIS projects) and tweak them and send them back and forth. So yes, while we do store spatial data in our git repos, we aren’t concerned about that data changing, just really the styling of the data within the QGIS projects themselves. Fee explains much better in his follow-up post GIS and Git. ————
*First public hint about my next book: it will be about cartography! 😉
**Highly technical here
Very simple overview of how to use a QGIS-Git workflow to dramatically increase sharability: http://t.co/WIC9hkshSl
— Gretchen Peterson (@PetersonGIS) August 31, 2015
@PetersonGIS it’s a great workflow. How do you deal with changes in binary data? Potentially huge repo if you commit many changes to .shp’s
— Kristian Evers (@kbevers) August 31, 2015
@kbevers @NickBearmanUK @PetersonGIS Put your map data in a “geogig” repo: https://t.co/GxAgrioqX3
— Barry Rowlingson (@geospacedman) August 31, 2015
.@erikfriesen @PetersonGIS being doing this for almost a year now, no regrets. easier than geogig albeit the granularity of diffs missed
— Antonio Locandro (@antoniolocandro) August 31, 2015
Simple version control for #QGIS with Git: http://t.co/vBt7YdZWWT /by @PetersonGIS
— zanols (@zanols) August 31, 2015
@PetersonGIS how large of files do you keep under version control?
— Nick Swanson-Hysell (@polarwander) August 31, 2015
@antoniolocandro @PetersonGIS I'm wondering if maybe a hybrid of the two would be manageable. Git for map project files, geogig for data.
— Erik Friesen (@erikfriesen) August 31, 2015
@erikfriesen if you aren't using a database for example using no shapefiles using GML or JSON in GitHub could be enough @PetersonGIS
— Antonio Locandro (@antoniolocandro) August 31, 2015
@erikfriesen we actually manage map doc in git and version control in postgis through custom solution @PetersonGIS
— Antonio Locandro (@antoniolocandro) September 1, 2015
@antoniolocandro @erikfriesen @PetersonGIS Stupid question: what if collaborator changes dir path in a repo, does it break everything?
— Christopher Rice (@colocarto) September 1, 2015
@polarwander The repo I'm working with right now has 1.2 GB in it and 200 files (there's an additional 400 small git files too).
— Gretchen Peterson (@PetersonGIS) September 1, 2015
@colocarto no stupid question, hmm it might for map project not sure for data itself @erikfriesen @PetersonGIS
— Antonio Locandro (@antoniolocandro) September 1, 2015
@PetersonGIS @antoniolocandro @erikfriesen Slick, can it do rollbacks too? Postgres support also?
— Christopher Rice (@colocarto) September 1, 2015
@antoniolocandro @PetersonGIS absolutely. I was just thinking of the case of using postgis for a datastore
— Erik Friesen (@erikfriesen) September 1, 2015
@erikfriesen for that you need geogig but project left boundless to eclipse nothing new in a while @colocarto @PetersonGIS
— Antonio Locandro (@antoniolocandro) September 1, 2015
@antoniolocandro @colocarto @PetersonGIS damn
— Erik Friesen (@erikfriesen) September 1, 2015
@erikfriesen @antoniolocandro @PetersonGIS Agreed. If I'm using QGIS it feels strange not utilizing PostGIS
— Christopher Rice (@colocarto) September 1, 2015
@PetersonGIS and it's smooth sailing when there are changes to big files?
— Nick Swanson-Hysell (@polarwander) September 1, 2015
@PetersonGIS I ask as I was just at a workshop where they advised against using git for quite large files.
— Nick Swanson-Hysell (@polarwander) September 1, 2015
@PetersonGIS but as you say it is great to just have everything up on Github.
— Nick Swanson-Hysell (@polarwander) September 1, 2015
@PetersonGIS I was working on project where the .git directory of the repo was ballooning in size due to changes to graphic files
— Nick Swanson-Hysell (@polarwander) September 1, 2015
@polarwander @PetersonGIS Did they say why?
— Cian Dawson (@cbdawson) September 1, 2015
@cbdawson @PetersonGIS but it seems to work pretty well to have large files in repos and it is nice to treat all file types the same way
— Nick Swanson-Hysell (@polarwander) September 1, 2015
@spatialadjusted @cageyjames It sounds like @PetersonGIS is thinking along the same lines http://t.co/OffokmVufL
— Phil Knight (@PhilipWhere) September 1, 2015
@PhilipWhere @spatialadjusted GeoGit = versioning for data, whereas I was speaking more to versioning a project.
— Gretchen Peterson (@PetersonGIS) September 1, 2015
@PetersonGIS great blog [again] Gretchen, what format data are you using though? Is it all shapefile?
— Nicholas Duggan (@Dragons8mycat) September 1, 2015
@Dragons8mycat In our case we have shapefiles and a SpatialLite osm db of the Seattle area.
— Gretchen Peterson (@PetersonGIS) September 1, 2015
@PetersonGIS Just curious if your tried any branching/merging of your binary data files.
— Bill Dollins (@billdollins) September 1, 2015
@PetersonGIS Thx. Specifically, I want to see if this works: http://t.co/e37RfyMXcX
— Bill Dollins (@billdollins) September 2, 2015
How to Win Friends and Use QGIS
I had the opportunity to speak at a great conference last week: the Free and Open Source Software for Geo North America (FOSS4G NA 2015) conference. With over 400 enthusiastic people getting together to talk about geo software in the open source realm, it was a big success.
My talk was on using QGIS and had a somewhat unusual format in that I switched back and forth between quotes from Dale Carnegie (of How to Win Friends and Influence People fame) and a live demo of the making of a complete map using QGIS 2.8.
Check it out here and when you’re done head over to the FOSS4G NA 2015 YouTube channel to see all the other fantastic talks. The content taught in this talk will be expanded and made into a tutorial for the upcoming Maptime Boulder (sign up here) and also posted online sometime in April for everyone.
2015 GeoHipster Calendar
At Boundless, we put together a nice and subtle world-wide basemap for our new product: Versio. It’s meant to be a basemap that shows you where your data is but doesn’t get in the way, thus the quiet color scheme coupled with ample data from OpenStreetMap.
A stitched together series of screenshots at about zoom level 14 in the San Francisco region provided a good entry for the 2015 GeoHipster Calendar and I’m pleased to announce that it has made the cover.
While I was the main designer for the map, we all know that cartography is only as good as its underlying data, and in the case of dynamic maps, as good as its underlying infrastructure. That’s why the map was really a team effort by all of the Versio team at Boundless.
A short background on the map in case you’re interested: we used imposm3 to obtain a world-wide OpenStreetMap dataset with a customized mapping.json file that allowed us to get some generalized data for roads and things for the lower zoom levels while still getting the non-generalized data for the higher zooms. We also used quite a bit of NaturalEarth data for the lower zooms, including a raster hillshade for the ocean overlaid with a semi-transparent ocean layer to make it more subdued. Most of the labels are not cached, they are dynamic so that we don’t have any issues with double labels or labels cut off at tile edges. Because we aren’t using too many labels in the dynamic label layer, this doesn’t seem to affect performance. The map was made with most of the OpenGeoSuite components, including–yes, I’ll say it–SLDs that I basically edited by hand. GeoServer serves up the data + SLDs, PostGIS holds the OSM data, the NaturalEarth data are kept in shapefile format, geowebcache cuts the tiles, and OpenLayers shows them off on the webmap.
Making a basemap with OSM data, some notes
Creating a zoomable map with OpenStreetMap data that covers the entire world is doable even if you’re a single-person entity without your own server infrastructure. You’ll have to use a provider like AWS to scale up to the level of performance that you’ll need, of course. As with any cartography project, minding the data becomes about 50% of the work and multi-zoom basemaps of the world are certainly no exception to that rule.
A tool like imposm or osm2pgsql is usually required to parse data from a source like geofabrik (pbf). At Boundless we’ve had some great success with imposm3. It is still in the experimental phase, but so far it has proven to be much faster than imposm2. The benefit of imposm is that you don’t dump everything into 3 tables like osm2pgsql; instead you dump it into any number of tables based on data type, usually around 25 tables (you customize the import via a json file with whatever parameters you want to specify). This makes querying faster. You also get “diff support,” which means that you can easily incorporate updates from OSM diff files, thus enabling easy updates however often you want to update. (We haven’t tested that in imposm3 yet.)
Going along with the “data is 50% of the project” maxim, you need to become familiar with PostGIS in order to deal with this data. You need to be able to use a viewer like pgAdmin and/or get familiar with the command-line tools both for viewing the tables and their contents so that you can actually use and style the data but also to manipulate the data if needed.
For example, performance is often enhanced if you create attribute indexes on any attributes that you are using for styling. Let’s say I’m showing parks at a high zoom level. My GeoServer SLD that contains the styling rules for my map might specify that type=parks in table landusages needs to be green with a dark green outline. The table landusages may benefit from an attribute index being created on the type field.
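Here is a minimal sketch of that idea. To keep it runnable anywhere, it uses Python’s built-in SQLite as a stand-in for PostGIS; that swap, the sample rows, and the index name `idx_landusages_type` are all assumptions for illustration. On a real PostGIS database you would run the same `CREATE INDEX` statement through pgAdmin or psql.

```python
import sqlite3

# In-memory SQLite stand-in for the PostGIS database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE landusages (id INTEGER PRIMARY KEY, name TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO landusages (name, type) VALUES (?, ?)",
    [("Green Lake", "parks"), ("Dockside", "industrial"), ("Discovery", "parks")],
)

# The attribute index on the field that the styling rule filters on.
conn.execute("CREATE INDEX idx_landusages_type ON landusages (type)")

# Ask the database how it would run the filter that the SLD rule implies.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM landusages WHERE type = 'parks'"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)

parks = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM landusages WHERE type = 'parks' ORDER BY name"
    )
]
print(parks)
```

The query plan should mention the index by name, which is the sign that the filter no longer has to scan the whole table. On a table with millions of OSM features, that difference is what you feel when the tiles render.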
Personally, I believe the very best way to get familiar with all the tools that I’ve mentioned above is to play with them yourself in a real-world environment. For this my favorite two options are 1) learning on your own (I love Boundless’s new online training courses) and 2) attending a maptime near you. And on a final note, don’t try to download and parse the entire world’s worth of OSM data your first time. 😉 Start with a city, county, or small state of interest and scale up from there.
Spending even a few nights beginning to fool around with OpenStreetMap, imposm2 or imposm3, PostGIS, and/or GeoServer will make you more marketable. Investing in this knowledge acquisition will certainly pay dividends.
Geogit Bit
Geogit is a tool made just for me. Despite my best intentions, there are times when data file names get the best of me. These times are lamentable because when the inevitable moment arrives when I have to comb through a series of files in order to find the one that is the most correct or the most current, I’ll find something like this:
Krige_1
Krige_2
Krige_3
Krige_3_final
Krige_3_final_final
Or worse. Maybe later on down the line, during that complex interpolation project last year, I decided that this analytical method was sub-optimum and went with natural neighbor instead. So then there’s a series of “natural_neighbor” prefixed datasets in the same vein to sort through. After a year goes by it’s hard to know which one was the end result. Which one had the fewest errors. Which one is the most authoritative.
And with multi-person teams things get more complex. Maybe the intern added 5,000 septic system points to the wrong database and you don’t have an easy way to undo it.
Enter geogit.
It has more benefits than solving the above problems but these are the ones that hit home to me the most at this, the beginning stages of my geogit learning journey.
Hey, guess what? I started a new position this week at Boundless. I’m absolutely thrilled to be a part of such a great team. Learning about geogit is one of the first things I get to do. I can’t complain.
Thumbs Up for Graphics Programs
Inkscape is a free open source graphics editor. I’ve been using it quite a bit this week and, while it’s just basic graphics work, it’s a lot of fun. Those of us who came to cartography via the traditional way, GIS, have a particular fear of graphics software. Getting over that, though, opens up a whole new world of possibilities. Ever wish it were easier to move labels around and experiment with different fonts? So much easier in Inkscape or Illustrator! Ever wish you could just quickly re-color an entire map so all the colors were lighter or darker, or apply a weird effect, or change individual colors without having to click in a symbology editor? So much easier in Inkscape or Illustrator! (Though you can also do some great color effects in TileMill without much effort too.) Have an old map that needs a few tweaks but don’t want to open up the GIS project and re-connect all the datasources? Sometimes you can vectorize an old map like that and simply change it in a graphics program. There are so many uses for a good image editor that we should all be using one regularly.
(Flipside: cartographers who got into maps from the graphic design field can improve their analytical visualizations, ensure projection perfection, and a host of other things by checking out a GIS once in a while.)
Here I’ve used Inkscape’s Filter called “Moonerize” to change the colors of one of my existing LULC maps to something a bit more…out of this world. Yep, I had to say that.
More realistic is this little exercise in updating one of the maps from the 1st edition of GIS Cartography. Here’s the original map, warts and all:
What is going on there? Yes, it’s meant to be a thumbnail but anyone can see that whatever those coastal features are don’t line up with the coasts at all. Almost all the 200+ maps in the book are saved as project files in a neatly organized spot so they’re easy to update. This one, though, was mysteriously missing. Thinking I could quickly re-create the thing using public data had me running into two issues: (1) most of the government sites are down right now and (2) sea surface current data is pretty hard to find anyway. And, like I said, the data is nowhere to be found on my local machines either.
Inkscape to the rescue! I made the new basemap in ArcMap and exported it as an svg. Imported that into Inkscape. Then I imported the original image (sized the same as the basemap I just made) into Inkscape and converted the sea surface current data, those black triangles, into their own bitmapped layer (Path>Trace Bitmap), deleted that bad background, and simply superimposed the current triangles onto the new basemap. So much better now:
Figure 6.51 you are now good to go!