Summary: This vignette explores spatial hotspots and seasonality of Flickr images in the northeastern United States, using the spatstat package.
# load some core packages
library(maptools)
library(spatstat)
library(ggmap)
library(rgdal)
Social media data contain a trove of rich information about human activity and behavior. Since popular social media websites first appeared, researchers have been keen to develop ways to use these Internet data archives for a variety of academic and commercial purposes.
Image sharing platforms that host geotagged photographs, such as Instagram and Flickr, are unique in that they present opportunities to examine spatial and thematic patterns in human activity. Every public image posted to these websites contains (1) coordinates and (2) content, which combined provide numerous opportunities, for better or worse, to explore human interactions in the modern world. For a while, these data were freely available through the platforms' public APIs. This has changed in the last few years, with more and more platforms charging fees for data downloads.
Over the last decade, a new wave of research by ecologists and social scientists has examined the capacity of social media images for understanding human-environmental interactions. From estimating park visitation in protected areas (Kuehn et al. 2020; Wood et al. 2013) to monitoring insect populations (Medhat et al. 2017), these data offer a wide variety of applications for conservation. For a recent in-depth review, see Ghermandi and Sinclair (2019): https://doi.org/10.1016/j.gloenvcha.2019.02.003.
In this document, I showcase some basic spatial and temporal data summary and visualization techniques with Flickr images from the Northern Forest, a mostly rural region extending from New York to Maine. This project is part of an ongoing study with the State University of New York College of Environmental Science and Forestry.
More information is available at https://www.esf.edu/socialmediastudy/.
These Flickr data for the Northern Forest Region (NFR) were prepared in advance, so I won’t be demonstrating how they were mined from the Flickr API.
Preparing data for spatstat is straightforward. If you already have a shapefile, you simply load the file(s) via readOGR or one of the tidier sf methods.
# load rural Flickr point and gridded shapefiles
flickr.rural <- readOGR("flickr_rural.shp", verbose=FALSE)
flickr.grid <- readOGR("flickr_grid_counts_042518.shp", verbose=FALSE)
NFR.shp <- readOGR("NFR_boundary.shp", verbose=FALSE)
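The sf route mentioned above might look like the following minimal sketch. Because the Flickr shapefiles are not distributed with this document, the runnable part reads the North Carolina demo shapefile that ships with the sf package as a stand-in. The object names nc_path and nc, and the commented-out Flickr lines, are illustrative assumptions rather than code used in the study.

```r
library(sf)

# Stand-in data: demo shapefile bundled with sf (not the Flickr data)
nc_path <- system.file("shape/nc.shp", package = "sf")
nc <- st_read(nc_path, quiet = TRUE)

class(nc)          # an "sf" data frame with a geometry column
st_crs(nc)$input   # coordinate reference system read from the file

# Assumed equivalent for the files used in this vignette:
# flickr.rural <- st_read("flickr_rural.shp", quiet = TRUE)
# NFR.shp      <- st_read("NFR_boundary.shp", quiet = TRUE)
```

Keep in mind that the as(x, "ppp") coercion used in this document is defined for sp objects, so a layer read with sf would first need converting, for example with as(x, "Spatial").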
Then you reclassify your point files as ppp objects, which is what spatstat works with.
# create a spatstat spatial planar point pattern (ppp) object
flickr.pts.ppp <- as(flickr.rural, "ppp")
flickr.grid.ppp <- as(flickr.grid, "ppp")
The last step is creating the window, or owin object, that your points will be assigned to.
NFR <- as.owin(NFR.shp)
class(NFR)
## [1] "owin"
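Note that as.owin() only creates the window; attaching it to a point pattern is a separate step. A minimal sketch with toy data follows, using spatstat's Window<- replacement function. The coordinates and the names pts and tri are made up for illustration. The commented-out last line is an assumed equivalent for the Flickr objects, and it would require the points and the boundary to share a coordinate system.

```r
library(spatstat)

# Toy pattern: three points in the unit square (made-up coordinates)
pts <- ppp(x = c(0.2, 0.3, 0.9),
           y = c(0.2, 0.5, 0.9),
           window = owin(c(0, 1), c(0, 1)))

# Toy polygonal window: a triangle, vertices listed anticlockwise
tri <- owin(poly = list(x = c(0, 1, 0), y = c(0, 0, 1)))

# Replace the window; points falling outside the new window are dropped
Window(pts) <- tri
npoints(pts)
class(Window(pts))

# Assumed equivalent for the Flickr data:
# Window(flickr.pts.ppp) <- NFR
```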
Prior to analyzing the image data and generating maps, duplicate images (i.e., pictures taken by the same photographer on the same day and at the same coordinates) were omitted to reduce redundancies.
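The deduplication code itself is not shown in this document. One way to apply the rule described above in base R is sketched below with made-up records. The column names (owner, date, latitude, longitude) match fields in the Flickr data, while the data frame photos and its values are hypothetical.

```r
# Made-up records: rows 1 and 2 share owner, date, and coordinates
photos <- data.frame(
  owner     = c("A", "A", "A", "B"),
  date      = as.Date(c("2017-07-04", "2017-07-04", "2017-07-05", "2017-07-04")),
  latitude  = c(44.10, 44.10, 44.10, 44.10),
  longitude = c(-73.90, -73.90, -73.90, -73.90)
)

# Flag rows that repeat an earlier owner/date/location combination
key <- photos[, c("owner", "date", "latitude", "longitude")]
is_dup <- duplicated(key)

# Keep the first image from each combination
photos_unique <- photos[!is_dup, ]
nrow(photos_unique)
```

Matching on exact coordinates is a strict criterion. Rounding the coordinates before building the key would treat nearby photographs as duplicates as well, which is a separate design choice.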
Ultimately, 116,360 images were used to identify regional hotspots. Of these, 89,231 (77%) images were taken in rural areas (i.e., outside of US census designated urban areas). We focus on the spatial and temporal patterns of these rural images in this document, using the following data fields.
colnames(flickr.pts.ppp$marks)
## [1] "OBJECTID" "caption" "url" "isUrban" "latitude"
## [6] "longitude" "id" "owner" "hometown" "datetaken"
## [11] "dateupload" "tags1" "tags2" "tags3" "tags4"
## [16] "tags5" "tags6" "tags7" "tags8" "tags9"
## [21] "tags10" "uniqueID" "STUSPS" "month" "date"
Quick note: the tags were generated from neural networks run in the Clarifai image classification software to identify image content for all the photographs in our database. You can find more information on the project website for this study.
Below are all of the original image locations, clustered by their density in the landscape.
# reproject rural points and boundary to geographic coordinates (WGS84 longitude/latitude), as leaflet expects
flickr.rural.NFR <- spTransform(flickr.rural, CRS("+proj=longlat +datum=WGS84"))
NFR.shp <- spTransform(NFR.shp, CRS("+proj=longlat +datum=WGS84"))
library(leaflet)
flickr.view <-
  leaflet(data = flickr.rural.NFR) %>%
  addTiles() %>%
  addMarkers(clusterOptions = markerClusterOptions(),
             popup = paste0("<a href='", flickr.rural.NFR$url,
                            "' target='_blank'>",
                            "Click Here to View Image</a>")) %>%
  addPolygons(data = NFR.shp, color = "#444444", weight = 1, smoothFactor = 0.5,
              opacity = 1, fillOpacity = 0.2) %>%
  addTiles(group = "OSM (default)") %>%
  addProviderTiles(providers$CartoDB.Positron, group = "B/W") %>%
  addProviderTiles(providers$Esri.WorldTopoMap, group = "Topo") %>%
  addLayersControl(baseGroups = c("OSM (default)", "B/W", "Topo"),
                   options = layersControlOptions(collapsed = FALSE))
flickr.view