Collaboration in the Open with Geospatial Data

This spring, I will continue to work on an issue that has stymied me for years: finding a mechanism for sharing and collaborating openly on geospatial data.

Collaboration is key to many types of research, and more research is becoming spatially oriented. Archaeology has always been spatial and frequently collaborative. Yet there are no good tools for collaboration with geospatial data. I propose to test some new tools, but first, a little background.

For the past three years, my students and I have been involved in the identification and reconstruction of a historic landscape in the Lehigh Valley. In the 19th century, two small iron furnaces (the East Penn Furnace in East Penn Township, Carbon County, and the Lehigh Furnace in Washington Township, Lehigh County) required tons of charcoal to fuel the conversion of iron ore into metallic, or “pig,” iron. Charcoal was made along the Blue (or Kittatinny) Mountain between the two furnaces. Using LiDAR data, historic maps, and field-collected data, we have begun to reconstruct how colliers (charcoal burners) used the mountain to produce charcoal, transport it to the furnaces (and elsewhere), and use it in the furnaces. We have also begun to reconstruct how colliers inhabited the mountain, where they resided for eight months out of the year.

In order to do this, we have shared geospatial data. Even simply sharing multifaceted spatial data with students is terribly old-fashioned, complex, and error-prone. This is caused by three interrelated factors. First, some geospatial data are large: for example, the digital elevation model (or DEM) for our area of interest is 300 MB, and about a third of the data that I share with students (I usually don’t share it all at once) comes to approximately 4 GB. Second, most vector data is shared in the shapefile format, each layer of which is actually five separate files. While this shouldn’t matter, it rapidly increases the number of items that have to be managed, and misplacing a single file can “break” the layer. Third, tracking changes in these files is almost impossible because there is no effective versioning system for geospatial data. If two students collaborate on modifying a layer, it is incredibly cumbersome to bring that data back together. Some of these concerns are similar to those raised by the long-term preservation of geospatial data (see Clark 2016).

The best solution at this point is to copy the data to flash drives (which most students have) and use it in QGIS (which students can install on their own computers or use in the Soc/Anth classroom). However, if there are any problems, the student has to come back to me and get a new copy of the files. Even sharing via GSuite is very difficult unless each student has the “Google Drive Sync” app installed on their computer, which downloads and synchronizes files (and takes up space).

While this messy system has worked, any collaborative project needs to minimize the amount of time (and frustration) that collaborators spend managing data. Currently, the process is far too frustrating. We need a way to share and modify data that preserves versions.
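To make the shapefile problem concrete, here is a minimal sketch of a check that could be run on a folder of layers before copying it to a flash drive. It assumes the five sidecar files are .shp, .shx, .dbf, .prj, and .cpg (the first three are required by the format; the other two are commonly written alongside them), and the function names are my own, not part of any GIS package.

```python
from pathlib import Path

# Assumption: these are the five files that make up one layer.
# .shp, .shx and .dbf are mandatory; .prj and .cpg are commonly present.
REQUIRED = (".shp", ".shx", ".dbf")
RECOMMENDED = (".prj", ".cpg")


def missing_parts(shp_path, extensions=REQUIRED + RECOMMENDED):
    """Return the sidecar extensions that are absent for one layer."""
    base = Path(shp_path)
    return [ext for ext in extensions if not base.with_suffix(ext).exists()]


def audit_folder(folder):
    """Map each .shp file in a folder to its list of missing sidecar files.

    Note: the match is on lowercase extensions only, so a layer saved as
    LAYER.SHP would be skipped on a case-sensitive file system.
    """
    return {p.name: missing_parts(p) for p in sorted(Path(folder).glob("*.shp"))}
```

Calling audit_folder on a shared folder returns a dictionary in which any layer with a non-empty list is incomplete and likely to appear “broken” when a student opens it.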

The answer MAY be Harvard’s WorldMap. First, all aspects of WorldMap are open, including its code, and the project contributes to open source communities such as GeoNode, Django, ExtJS, GeoServer, OpenLayers, PostGIS, GEOS, GDAL, and OGR. WorldMap is based upon GeoNode, which is a “web-based application and platform for developing geospatial information systems (GIS) and for deploying spatial data infrastructures (SDI).” Layers (both vector and raster) can be uploaded to WorldMap, and the owner can choose to make them public or give individual users the ability to see and/or modify them. New vector layers can also be created within WorldMap. Layers can be combined into “maps” that visualize and style particular features or relationships. While maps can be versioned, the documentation is not clear as to whether layers can be versioned. Layers could easily be shared with students, who could construct visualizations (i.e., maps) of the data through the WorldMap portal. These digital maps could then be used in webpages (though using them for paper-based maps is not ideal). Additionally, the documentation suggests that these same layers can be served up via WMS to desktop software, which is much more feature-rich than the online WorldMap. Importantly, when uploading layers, WorldMap requests a wide variety of metadata that often do not live with the data (not that they can’t, but they rarely do). In this sense, it supplies a component of geospatial data that is often lacking.
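For readers unfamiliar with WMS, the sketch below shows roughly what a desktop client asks a server for: a GetMap request for a rendered image of one layer over a bounding box. The parameter names are the standard WMS 1.1.1 ones, but the endpoint URL, the layer name, and the bounding box in the usage note are placeholders I made up for illustration, not WorldMap's actual addresses. In practice QGIS builds these requests itself once it is given the service address.

```python
from urllib.parse import urlencode


def getmap_url(endpoint, layer, bbox, width=800, height=600):
    """Build a WMS 1.1.1 GetMap URL for a single layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in WGS84 degrees.
    The endpoint and layer name are supplied by the caller; nothing
    here is specific to WorldMap.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty string asks for the default style
        "SRS": "EPSG:4326",      # WGS84 longitude/latitude
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)
```

For example, getmap_url("https://example.org/wms", "geonode:charcoal_hearths", (-75.8, 40.7, -75.6, 40.8)) produces a URL for an 800 by 600 PNG. Note that WMS returns a picture of the layer rather than editable features, which matters when judging how far it supports collaboration.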

The project, therefore, is to test this setup. I have run it through some basic tests: uploading layers (vector and raster), making maps, editing layers, etc. I even had a student worker, Marguerite Runyon, upload a variety of layers. This took many hours, and uploads failed too frequently. While sometimes this was our fault (files need to be in the correct projection, WGS84), frequently there was no explanation for the failure. She simply had to try again and, normally, it succeeded. Now, I need to see how effective it is for collaboration. During the Spring semester, I will be teaching the Anthropology CUE. While the course is about the application of skills, theories, etc. learned through the Anthropology major, we will be using charcoal production on the mountain as an example. This means that participants will need access to the many, many different types of data that my students and I have collected over the years, and will use this data to produce both new maps and new anthropological understandings of the people and the landscape. This would be a real-life test of the system. Part of the project, therefore, is setting up a more formal testing structure to determine what works and what does not.
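Since some failed uploads came down to layers not being in WGS84, a more formal testing structure could include a pre-upload check along the following lines. This is only a rough heuristic that reads the layer's .prj file as text and looks for an unprojected WGS84 definition; it is my own illustration, not something WorldMap provides, and a dedicated library such as pyproj would give a more reliable answer.

```python
from pathlib import Path


def looks_like_wgs84(prj_text):
    """Heuristic: True if WKT text describes unprojected WGS84 coordinates.

    A projected system (for example UTM) starts with PROJCS even when it
    is built on the WGS84 datum, so it is deliberately rejected here.
    """
    wkt = prj_text.strip().upper().replace(" ", "_")
    is_geographic = wkt.startswith(("GEOGCS[", "GEOGCRS["))
    return is_geographic and ("WGS_1984" in wkt or "WGS_84" in wkt)


def check_layer(shp_path):
    """Return 'ok', 'missing .prj', or 'not WGS84' for one shapefile."""
    prj = Path(shp_path).with_suffix(".prj")
    if not prj.exists():
        return "missing .prj"
    return "ok" if looks_like_wgs84(prj.read_text()) else "not WGS84"
```

Running check_layer over every layer before an upload session would at least separate the failures that are our fault from the unexplained ones.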

 
