Article

Crowdsourcing, Citizen Science or Volunteered Geographic Information? The Current State of Crowdsourced Geographic Information

1 International Institute for Applied Systems Analysis (IIASA), Schlossplatz 1, Laxenburg A2361, Austria
2 Department of Computer Science, Maynooth University, Maynooth W23 F2H6, Ireland
3 School of Geography, University of Nottingham, Nottingham NG7 2RD, UK
4 School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK
5 School of Geography, University of Leeds, Leeds LS2 9JT, UK
6 NOVA IMS, Universidade Nova de Lisboa (UNL), 1070-312 Lisboa, Portugal
7 Department of Earth Systems Analysis, ITC/University of Twente, Enschede 7500 AE, The Netherlands
8 Faculty of Engineering and Sustainable Development, Division of GIScience, University of Gävle, Gävle 80176, Sweden
9 Finnish Geospatial Research Institute, Kirkkonummi 02430, Finland
10 Norwegian Institute for Air Research (NILU), Kjeller 2027, Norway
11 Sinergise Ltd., Cvetkova ulica 29, Ljubljana SI-1000, Slovenia
12 Urban Planning Institute of the Republic of Slovenia, Ljubljana SI-1000, Slovenia
13 Institute of Geoinformatics, Óbuda University Alba Regia Technical Faculty, Székesfehérvár 8000, Hungary
14 Université Paris-Est, IGN-France, COGIT Laboratory, Saint-Mandé, Paris 94165, France
15 Institute for Interdisciplinary Mountain Research, Austrian Academy of Sciences, Technikerstr. 21a, Innsbruck A6020, Austria
* Author to whom correspondence should be addressed.
Submission received: 24 January 2016 / Revised: 5 April 2016 / Accepted: 18 April 2016 / Published: 27 April 2016
(This article belongs to the Special Issue Volunteered Geographic Information)

Abstract

Citizens are increasingly becoming an important source of geographic information, sometimes entering domains that had until recently been the exclusive realm of authoritative agencies. This activity has a very diverse character as it can, amongst other things, be active or passive, involve spatial or aspatial data and the data provided can be variable in terms of key attributes such as format, description and quality. Unsurprisingly, therefore, there are a variety of terms used to describe data arising from citizens. In this article, the expressions used to describe citizen sensing of geographic information are reviewed and their use over time explored, prior to categorizing them and highlighting key issues in the current state of the subject. The latter involved a review of ~100 Internet sites with particular focus on their thematic topic, the nature of the data and issues such as incentives for contributors. This review suggests that most sites involve active rather than passive contribution, with citizens typically motivated by the desire to aid a worthy cause, often receiving little training. As such, this article provides a snapshot of the role of citizens in crowdsourcing geographic information and a guide to the current status of this rapidly emerging and evolving subject.

1. Introduction

Mapping and spatial data collection are two activities that have radically changed from primarily professional domains to increased involvement of the public. This shift in activity patterns has occurred as a result of significant technological advances during the last decade. This includes the ability to create content online more easily through Web 2.0, the proliferation of mobile devices that can record the location of features, and open access to satellite imagery and online maps. The literature describes this phenomenon using a multitude of terms, which have emerged from different disciplines [1]; some are focused on the spatial nature of the data such as volunteered geographic information (VGI) [2] and neogeography [3], while others have much broader applicability, e.g., crowdsourcing [4], citizen science [5] and user-generated content [6], to name but a few. Despite their differences, these terms are often used interchangeably to capture the same basic idea of citizen involvement in carrying out various activities relating to geographic information science. These activities can be driven by the needs of a second party such as a commercial company needing to outsource micro-tasks or by researchers who need large datasets collected that would otherwise not be possible using their own resources. Citizens may be motivated to contribute for a diverse set of reasons. The participants involved might, for instance, feel compelled to contribute by a collective cause such as contributing to an open map of the world through OpenStreetMap [7]. Another motivation is simply the desire to share information more widely, by, for example, placing georeferenced photographs online via a site like Panoramio or georeferenced commentary captured via Twitter. 
Whatever the motivation for collecting and sharing the data, these systems have become important sources of geographical data and are now being used by others for applications that may be unforeseen by the contributors, such as scientific research [8,9]. This adds a new dimension to this rapidly changing field and has led to new terms appearing, such as ambient [8] and contributed [10] geographic information to name but a few. Jiang and Thill [11] would even argue that geographic data contributed by citizens represents a new paradigm for socio-spatial research.
Although some papers have clearly acknowledged the existence of different terms in the literature (see, for example, [12,13]), there has been little attempt to collate these in a single place, examine how they relate to one another or analyze their appearance over time. The first objective of this paper is, therefore, to present a compilation of terms, providing some basic definitions and their primary attributions. The terms are then categorized according to active and passive contributions, and spatial examples of user-generated content are separated from non-spatial ones (Section 2.1). This is followed by an analysis of the appearance of these terms in the literature (Section 2.2) and of profiles extracted from Google Trends (Section 2.3) to examine their emergence over time in both the academic literature and more popular science outlets.
The second objective of this paper is to better understand the current state of mapping and spatial data collection by citizens through a systematic review of different online initiatives (Section 3). A similar approach was undertaken within the VGI-net project [14], which was a collaborative undertaking between the University of California, Santa Barbara, Ohio State University and the University of Washington, to classify sites related to the collection of VGI in order to study VGI quality and develop methods for analyzing VGI. The results were reported in Reference [15] and showed that most of the sites were local in extent, appearing after 2005 when Google released its application programming interface, which facilitated online mapping. Moreover, more than 60% of sites were developed in the private sector and purposes ranged from geovisualization to sharing of geographic content. Using the VGI-net inventory of sites as the starting point, a similar review was undertaken here. The sites that were currently still active were retained and new online initiatives were added, which were then evaluated using a much broader set of criteria than used in Reference [15]. These include: the thematic area in which the initiative fell (Section 3.1); the nature of the spatial data collected (Section 3.2); the level of expertise and training needed (Section 3.3); access to the data and metadata (Section 3.4); measures of quality assurance and use of the data in research (Section 3.5); information about the participants (Section 3.6) and what incentives there were for participation (Section 3.7). Based on these findings, Section 4 provides a discussion of the main issues raised in Section 2 and Section 3 and provides some suggestions for areas where further research is needed.

2. A Review of the Terminology

2.1. Definitions

Table 1 is a compilation of the different terms that have appeared in the literature to represent the general subject of citizen-derived geographical information, along with definitions and attributions. Underneath each term is the year in which it appeared. These terms were then divided into types as indicated in the third column of the table. If a term refers to data and/or information collected, then it is labelled with the letter “I” for information. For example, ambient geographic information is actual data collected from users and so the type is “I”. The second type reflects whether a term refers to a process or mechanism that can result in the generation of information—for example, a citizen science initiative. If so, this is denoted by the letter “P” for process in the last column. Figure 1 is an attempt to place all these terms into a single representation that separates out the different terminology for information from the process that can be used to generate it. It must be stressed that Figure 1 is a simplification, aiming to provide a summary of the broad general nature of the topic. Thus, while some of the content extracted from Wikipedia and data from social media can, for example, be georeferenced, the majority of this user-generated content is aspatial. However, some data from social media have been used as passive crowdsourced geographical information.
Examples include, for instance, the use of photographs from Flickr for assessing the accuracy of Corine land cover [9] or the use of Twitter for determining whether earthquakes were felt [8]. We refer to the righthand side of Figure 1 as “crowdsourced geographic information” as this term covers both active and passive contributions while explicitly retaining the spatial dimension of the information. It also encompasses much of the diverse terminology into a single umbrella term.

2.2. Temporal Analysis of the Literature

The abstracts of 25,338 scientific papers, published between 1990 and 2015, which contained any of the terms listed in Table 1 in their title, keywords or abstract were downloaded from Scopus. The data were cleaned to remove English stopwords (conjunctions, pronouns, etc.), numbers, punctuation, excess whitespace and any words shorter than three characters. The words were then stemmed, that is, reduced to a common root form. For example, propose and proposal have the same stem of propos. The cleaned and stemmed abstracts were then organized into a corpus of 24 documents based on the year of publication. Figure 2 summarizes the frequency of their use, updating an initial analysis of such trends in Reference [48]. As expected, terms that describe more general crowdsourcing activities are used more frequently than GI-specific ones, but a number of specific temporal trends are evident: the steady rise of User-generated Content and Citizen Science, the long-term, steady increase of Swarm Intelligence, the rise and perhaps fall of Mashups and the recent and intense rise of Crowdsourcing.
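The cleaning, stemming and per-year grouping described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the article does not name its stopword list or stemmer, so the tiny stopword set, the crude suffix-stripping rule and all function names here are assumptions.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword set (assumption); a real analysis would use a full English list.
STOPWORDS = {"the", "and", "for", "are", "this", "that", "with", "from", "was", "were"}

# Crude suffix stripping for illustration only; the article does not name its stemmer.
SUFFIXES = ("ing", "al", "ed", "es", "e", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean(abstract):
    """Lowercase, keep alphabetic tokens only, drop short words and stopwords, then stem."""
    # Matching alphabetic runs discards numbers, punctuation and whitespace in one step.
    tokens = re.findall(r"[a-z]+", abstract.lower())
    return [toy_stem(t) for t in tokens if len(t) >= 3 and t not in STOPWORDS]


def corpus_by_year(records):
    """Pool (year, abstract) pairs into one stem-frequency 'document' per year."""
    corpus = defaultdict(Counter)
    for year, abstract in records:
        corpus[year].update(clean(abstract))
    return corpus
```

With this toy rule, `toy_stem("propose")` and `toy_stem("proposal")` both give `propos`, matching the example in the text. Counting a term's stem in each yearly document then yields a frequency-over-time profile of the kind summarized in Figure 2.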
Here, the analysis of temporal trends in the use of key terms is extended with a focus on their relative search volumes with Google Trends.

2.3. Google Trends Analysis

The Google Trends website [49] allows for the examination of relative search volumes of terms over time. This analysis serves to illustrate trends in popularity of terms that are more mainstream than academic and is an indicator of movements from the academic literature to more popular outlets, e.g., through media and into popular science. Figure 3 shows the trends for the terms crowdsourcing and citizen science together. Both terms were first searched with sufficient volume using Google’s search engine during 2006, and both show an increase in search volume over time, reflecting an increasing interest in these subjects. Crowdsourcing has much larger search volumes than citizen science, but this is unsurprising given the commercial interest in crowdsourcing as a business model. The large peak in the term crowdsourcing coincides with large-scale efforts by citizens to search for the missing Malaysia Airlines aircraft (flight MH370); over two million people helped search for the missing aircraft by analyzing satellite images [50]. The rest of the search terms from Table 1 were then put into the Google Trends application. Some search terms do not register sufficient search volume to generate a graph, including neogeography and terms that are generally more restricted to the academic literature. The term mashup(s) shows considerable search volume but is not displayed here, since mashup is a generic term for integrating data from different sources and can apply to non-geographic applications such as those connected with music or video and, therefore, goes beyond the realm of just spatial mashups that are relevant to this article.
Terms such as GeoWeb and web mapping pre-date 2005, which is around the time when Google Trends started. The search volume for the term GeoWeb was much higher than that for crowdsourcing until the last few years, when the search volumes became similar (Figure 4). Web mapping shows a steady decline over time and since 2010 has been searched much less frequently than the other two terms.
The terms VGI, collaborative mapping and participatory sensing show very small search volumes with minor peaks of activity. These peaks might be linked to times of year when students have searched for references to complete course work or the occurrence of conferences and workshops at specific times of the year. However, when compared with terms such as citizen science, the search volumes of these terms are an order of magnitude lower (Figure 5). This is similar to what was found in the analysis of the literature (Section 2.2), with low frequencies registered for VGI, collaborative mapping and participatory sensing.
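To illustrate what relative search volume means in the comparisons above, the following minimal sketch rescales several series jointly so that the overall peak equals 100, which is the scale on which Google Trends presents its results. The function name and the raw counts are hypothetical, and the internal computation used by Google Trends is not reproduced here.

```python
def relative_volumes(series):
    """Rescale several raw series jointly so that the overall peak equals 100.

    series: dict mapping a search term to a list of non-negative raw values,
    one value per time step (hypothetical numbers, not Google data).
    """
    peak = max(max(values) for values in series.values())
    if peak <= 0:
        raise ValueError("need at least one positive value")
    return {
        term: [round(100 * v / peak) for v in values]
        for term, values in series.items()
    }


# Hypothetical raw counts: the second term is an order of magnitude lower.
scaled = relative_volumes({"citizen science": [20, 40], "VGI": [2, 4]})
```

Because the terms are scaled against a shared peak, a term with much lower volume stays close to zero on the common axis, which is why low-volume terms such as VGI appear almost flat when plotted together with citizen science.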

3. The Current State of Crowdsourced Geographic Information

To evaluate the current state of crowdsourced geographic information (which we use here as an umbrella term to include the different terms available), a review was undertaken of existing websites and mobile applications that involve the collection of any type of georeferenced information. The starting point for this review was VGI-Net [14], which was compiled by researchers at the University of California, Santa Barbara, the Ohio State University and the University of Washington in 2011. VGI-Net has not been maintained regularly, so the first task was to eliminate sites from the inventory that were no longer in operation (which was roughly half of the original sites on VGI-Net), keep those sites that were still operating, and then add new sites that have emerged since 2011. This resulted in approximately 100 sites and/or applications that were reviewed. This review is not intended to be comprehensive, since sites and applications are changing all the time. Rather, it is intended to provide a large enough sample from which to draw general conclusions about the current state of crowdsourced geographic information. These sites were then evaluated based on a series of criteria, as described below.

3.1. Theme

At the highest level, the sites and applications can be divided into three main types: (i) those that allow users to create and share a map; (ii) those that collect georeferenced data; and (iii) high level data sharing websites contributed by experts but which may include citizen-collected data. Of the roughly 100 sites reviewed, 12 were of the first type and four were of the third type. Therefore, the majority of sites/applications were focused on data collection, and these were further categorized by subject as outlined in Table 2.
The most frequent category of website was in the area of ecology (e.g., species identification), even though the websites and applications reviewed here represent only a small proportion of all the citizen science and crowdsourcing projects that are currently within the field of ecology, biology and nature conservation. Meta-sites maintained by Cornell University and SciStarter list many more that have not been reviewed here. This large number is unsurprising given the very long history of citizen science in these fields, stretching back decades and even centuries before the advent of the Internet [51,52].
Other categories with multiple sites (i.e., greater than 5) include environmental monitoring; location-based social media, where location plays a pivotal role in the social networking function such as sites for connecting people based on proximity; sites of interest/travel with sharing of geo-tagged photographs, videos and travel stories; transport including sites like OpenStreetMap for digitizing roads; and weather data, which covers amateur weather stations, snow depth and avalanche reporting. Disaster mapping is another category that is probably under-represented in this review, since sites tend to appear during events and then disappear post-event, or because contributors are often recruited on the ground and mapping takes place internal to organizations. However, there are at least three permanent sites that are noteworthy, i.e., Ushahidi [53], which is a platform to allow people in affected areas to upload and view georeferenced information online, Tomnod [54] for crowdsourced damage mapping, and the humanitarian arm of OpenStreetMap [55].
Although it is not possible to readily characterize sites by data volumes or number of transactions, there is a relationship between the way that the data are subsequently provided to the public (if access is open) and the amount of data collected. For example, the largest data volumes are often served using APIs (Application Programming Interfaces) as evidenced by sites such as OpenStreetMap, Geograph, Flickr and Twitter. Data volumes also tend to be higher in passively collected geographic information, notably, for instance, in relation to communications, location-based social media, or where sensors are used to collect data such as with transport, weather, hiking, or any site where there is a mobile application to facilitate data collection using mobile-phones or tablets, which is common in many ecology related applications.

3.2. Nature and Types of Crowdsourced Geographic Information

If crowdsourced geographic information is taken to mean any data contributed by the crowd with a geographical reference that could potentially be mapped, the nature of the data can be characterized based on whether it falls into the territory of mapping agencies (or framework data) in the first dimension or axis as shown in Figure 6. Framework data are typically data that are collected by government agencies, which can be organized into the following themes: geodetic control, orthoimagery, elevation, transportation, hydrography, governmental units and cadaster, and which comprise the basic components of a spatial data infrastructure (SDI) [15]. Depending on the country, these datasets may vary (e.g., some countries do not have cadasters, while others may include a gazetteer as part of their SDI). In the second dimension, crowdsourced geographic information can be classified according to whether the data are contributed actively as part of a crowdsourcing system/campaign (hereafter referred to as active crowdsourced geographic information), or whether the data were collected for another purpose and were then mapped (hereafter referred to as passive crowdsourced geographic information).
Figure 6 summarizes the current types of crowdsourced geographic information from the review by category, based on where they fall within the quadrants of these two dimensions. Types of crowdsourced geographic information that were not encountered in the review but which come from other sources such as from academic publications were also added in blue. Figure 6 aims only to provide a simple generalization of the situation. There are, for example, a wide variety of weather related citizen science projects which could occupy different locations in the space depicted in Figure 6; see for example, [56].
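The two dimensions described above can be expressed as a small data structure. The sketch below is an illustration only: the class and function names are hypothetical, and the example placements are assumptions made for demonstration rather than the positions shown in Figure 6. The set of framework themes is taken from the list given in the text.

```python
from dataclasses import dataclass

# Framework data themes as listed in the text.
FRAMEWORK_THEMES = {
    "geodetic control", "orthoimagery", "elevation", "transportation",
    "hydrography", "governmental units", "cadaster",
}


@dataclass(frozen=True)
class DataType:
    name: str
    theme: str    # thematic content of the data
    active: bool  # True if contributed as part of a crowdsourcing system/campaign


def quadrant(data_type):
    """Return (contribution mode, domain) for a type of crowdsourced data."""
    mode = "active" if data_type.active else "passive"
    domain = "framework" if data_type.theme in FRAMEWORK_THEMES else "non-framework"
    return (mode, domain)


# Illustrative placements (assumptions, not read from Figure 6).
roads = DataType("digitized roads", "transportation", active=True)
photos = DataType("geo-tagged photographs", "land cover", active=False)
```

Here `quadrant(roads)` gives `("active", "framework")` and `quadrant(photos)` gives `("passive", "non-framework")`. Since the content of framework data varies by country, the theme set would need to be adjusted for a specific national SDI.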

3.3. Expertise and Training

The sites that were reviewed were then evaluated based on the amount of expertise required of the participants and the amount of training available. As this applies primarily to active crowdsourced geographic information, only those sites belonging to the categories on the right-hand side of Figure 6 were considered.
In general, it was found that most sites require very little expertise in order to participate, except for Internet and mobile phone literacy. Many sites involve filling in a simple online form, where location is indicated on a map interface or latitude longitude coordinates are input manually. Note that even though the form may be easy to fill out, the collection of the actual information may not be. Other sites involve capturing the information using a mobile application, and uploading photographs and comments, so that spatial coordinates are automatically captured. These sites are at the most basic level and, therefore, often provide little in the way of training materials.
At the next level are sites where users must become familiar with how to characterize different phenomena (e.g., different types of weather, recognizing different features in satellite sensor imagery, etc.) and these sites tended to have some form of training material such as online instructions, videos and/or FAQs, which users could consult. Although the expertise required is minimal, involvement still requires a small learning effort on the part of the participants. Some sites supported this learning more effectively than others.
At the highest level are some of the sites in the hiking/trails, ecology and weather categories. For the hiking/trails category, familiarity with the use of a global positioning system (GPS) is required. While one site had good training materials others did not. In the case of ecological sites, greater expertise is required for those sites that follow strict protocols in data collection, and in a few cases, these require physical attendance at a training session. In the case of the weather category, amateur weather stations require knowledge about installation, which must be in accordance with certain principles in order to satisfy quality concerns. The availability of training materials was, therefore, generally found to be a function of the difficulty of the task and/or whether the data collected were used subsequently for research or other authoritative purposes such as assimilation of weather data into a numerical weather prediction model. Training materials for these higher level sites were either extensive or designed to ensure minimum standards in data quality.

3.4. Crowdsourced Geographic Information Availability and Metadata

Data availability varied across the sites from unavailable (used only internally), only available to those people who contributed, only available to those who have registered and logged in, or more broadly open to everyone. Within these different levels of access, data were available for viewing on a map interface, available for downloading in a variety of formats (notably CSV, KML, KMZ, XML, Atom, GPX), and available via an API. For some sites, the data available were the raw data contributed by individuals, while in other examples this was only the aggregated data from multiple contributors. Some of the sites in the communications, feature mapping, geocaching, location-based social media, sites of interest/travel and transport categories were available via APIs, which reflects those sites with considerable data volumes and demand for the data.
Metadata, in the sense of standards such as those associated with the European Infrastructure for Spatial Information in the European Community (INSPIRE) directive (which requires that member states of the European Union comply with implementing specific rules for metadata), were not mentioned in any of the sites with the exception of one map creation and sharing site called Geocommons. The latter provides the option of sharing the contributed data with metadata that are compliant with ISO19115, a metadata standard for describing geographic information and services.
Metadata, in the sense of documentation of the data, are provided to some degree by all sites that offer access via an API, and to various degrees for other sites that offer the data in other downloadable formats. Some of the downloaded data files were well documented, while others expected users to interpret the headers of the data or the data themselves. Moreover, sites with strong data collection protocols were well documented in terms of metadata, and higher level data sharing sites require detailed metadata with each data set shared via the site.

3.5. Quality and Use of the Data for Research

Given that citizens may vary greatly in expertise and often collect data without regard to established protocols or standards, there is often considerable concern about the quality and usability of the data. The quality of citizen-derived data can be viewed in a variety of ways [57]. Many comparative studies have shown that crowdsourced geographic information can be as good as, if not better than, data from authoritative sources [58,59]. A comprehensive literature overview of the latest developments in crowdsourced geographic information research is presented in Reference [60], with a focus on trends related to OpenStreetMap, while many others have discussed the quality of this volunteer data source [61,62,63]. Of the topics selected for future research, the authors of [60] emphasize intrinsic data quality assessment, conflation methods which combine crowdsourced geographic information and other data sources, and the development of credibility, reputation, and trust methodologies for crowdsourced geographic information. Data quality remains a topic of great interest and importance in this domain. In their work, the authors of [64] conclude that there is a trade-off between potentially improved data quality of crowdsourced geographic information and the requirement of facilitation and oversight, which is resource intensive. Introducing overly burdensome structures to ensure quality could damage the potential contributions from related socially-conscious and citizen-focused data collection and mapping efforts. A review is provided in Reference [65] of the distinct types of citizen science projects and the expectations on the quality of the information they deal with, and in particular the quality of crowdsourced geographic information in those projects. The authors of [65] go on to propose an innovative model based on linguistic decision making for assessing the quality of a crowdsourced geographic information database created in citizen science projects.
The authors build this model from the understanding that quality depends on several factors, both extrinsic and intrinsic, but also pragmatic, depending on the intended purpose and user needs, and so a flexible quality assessment method is necessary.
For the majority of sites, it is difficult to establish whether there is any quality control. In other words, quality control may be occurring in the background but may not be apparent from viewing the site alone. Thus, based on a review of what was apparent from the sites alone, most would appear to have no quality control. For those sites where some quality control is in place, this included one or more of the following: automated methods of checking (e.g., answers that fall outside an acceptable range); peer review, which could include comments, actual involvement in the validation process or ranking of the participant (see next item); ranking of participants, whether through an automated procedure or by other users, which may then influence the level of confidence in the contributions provided by the users; use of multiple observations at the same site as a cross-checking mechanism; and review by experts.
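Two of the quality control mechanisms listed above, automated range checking and cross-checking of multiple observations at the same site, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the 60% agreement threshold and the example labels are hypothetical and are not taken from any of the sites reviewed.

```python
from collections import Counter


def in_range(value, low, high):
    """Automated check: flag answers that fall outside an acceptable range."""
    return low <= value <= high


def valid_coordinates(lat, lon):
    """Range check for manually entered latitude/longitude in decimal degrees."""
    return in_range(lat, -90.0, 90.0) and in_range(lon, -180.0, 180.0)


def consensus(labels, min_agreement=0.6):
    """Cross-check multiple observations of the same site.

    Returns the most frequent label if its share of all observations reaches
    min_agreement (an assumed threshold), otherwise None, in which case the
    site could be passed on for peer or expert review.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_agreement else None
```

For example, `consensus(["forest", "forest", "cropland"])` returns `"forest"` because two of three observers agree, whereas `consensus(["forest", "cropland"])` returns `None`. Ranking of participants could be layered on top by weighting each label by a contributor score, which is not shown here.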
There are examples where some minimal qualifications are required (e.g., in some disaster mapping sites such as GEOCAN, a minimum number of years of remote sensing experience is required), which are checked in the registration process. However, assigning a reliability score to a user based on his/her experience, or double-checking any submission by a relative novice, does not seem to be commonly undertaken [66].
The greatest evidence of quality control, however, was in sites related to ecology and weather, although map creation sites such as OpenStreetMap and Google MapMaker have a range of quality assurance measures including automated checking, peer review and use of multiple observations. Greater attention to quality was apparent for sites where the data are used for scientific research, with evidence of publications. However, publications using the data were also listed on websites where no quality control was explicitly mentioned.
It is important to note that data quality is traditionally constrained to precise and accurate locations. For some applications and even scientific studies, the data quality issue may not be a problem at all; in other words, the fitness-for-use of the data will depend on the context, which must be well-defined by the potential user. Data quality is an issue when the data are scarce, but some authors argue that it will become less of an issue in the era of big data [67]. For example, the authors of [68] took entire country street networks of France, Germany and the UK and found that while the street networks are incomplete, especially in rural regions, this constituted only a minor problem for their particular study, which aimed to identify scaling patterns in street blocks, because the available data offered millions of street blocks for the countries under study.
Data quality is likely to remain an important topic in crowdsourced geographic information research for some time. A diverse range of approaches exist for quality assessment and control, e.g., [69,70,71], and guidelines for some applications are emerging [72].

3.6. Information about Participants

The sites can be categorized into three types based on the information they obtain about contributors: (i) no registration required, so observations cannot be linked to an individual; (ii) registration required but only name and email entered; and (iii) registration required with additional information collected such as address, organization, age, level of expertise, motivation behind participation or registration via a social networking site such as Facebook, which implies additional information is retrievable about participants. The majority of sites reviewed fell into the first two categories, which implies that very little analysis of the crowdsourced geographic information can be undertaken in relation to the background of individuals. Some exceptions include research on contributors to OpenStreetMap [73] and Geo-Wiki [58].

3.7. Incentives for Participation

Understanding the motivations of citizen participants in the crowdsourcing of geospatial data remains one of the principal topics in current and future research. What are the ingredients for a successful crowdsourcing project and how are they achieved and maintained? As some research results demonstrate, crowdsourcing of geospatial data is sometimes best seen as complementary to professional approaches rather than as a direct competitor to, or replacement for, these traditional approaches. Hence, the motivation may be to enhance authoritative data sets rather than replace them, although crowdsourced data have the potential to compete with established public and commercial sources of geographic information [74].
Looking at the sites of active crowdsourced geographic information, two generic incentives for participation can be identified: (i) being part of a good cause or contributing to the greater good, which often involves a one-way information flow (e.g., damage mapping) and (ii) gaining something tangible from the site such as information about traffic problems, evidence of response to reporting of waste/environmental problems, different kinds of advice, access to data, or geocached treasures, which often involves a two-way information flow. In both cases, although evident in only a much smaller number of the sites reviewed, additional incentives are integrated into the contribution process, such as social elements like discussion forums, gamification (e.g., through leaderboards and prizes), recognition of effort through achievement levels and interaction with experts. Sites that appear to be less successful, as evidenced by a lack of recent contributions, are those that offer only the first type of incentive. Obvious exceptions to this generalization are OpenStreetMap and Google MapMaker, both of which are very successful, but where motivations for participation are not so easily explained. More studies into participant motivation are needed, as suggested by the authors of [75], so that we can understand which crowdsourcing managerial control features, such as reward systems, different levels of collaboration, voting and commenting or trust-building systems, are required to deliver innovative, problem-solving types of crowdsourcing.

4. Discussion and Conclusions

A range of terms describing the general subject area of citizen-derived geographic information exists, and these terms have been used variably over time. Similarly, there is a wide range of Internet sites that, in one way or another, use citizen-derived geographic information. Based on the review of sites, it is clear that most of the crowdsourced geographic information is actively contributed, which implies that motivation, incentives and community building are important considerations in terms of sustainability; a bias towards active involvement would be expected given the set of sites selected, and passive activity is less apparent. The majority of sites rely on participation motivated by the desire to aid a greater cause or a worthy reason as the overarching incentive, rather than by more tangible incentives. This may have implications for successful and sustainable crowdsourced geographic information collection. Some sites operate for a finite period of time, such as the site on fracking which completed the work for a given state in the USA, or some disaster related sites where the focus is on campaign(s) associated with specific events that eventually end.
The majority of sites do not collect very much information about participants. This may make participation easier but it means that very little research can be undertaken on the relationships between participation, data quality and demographics, or on the understanding of motivational factors. The lack of information on participants may also make it difficult to develop and target training activity. Training provision varied from site to site but the amount of training material provided is a function of the difficulty of the task and the end use of the data (e.g., if for research or other authoritative uses). The lack of information on participants may also hamper some approaches to quality assurance as the background and expertise of the contributors and hence inferred quality of their data is unknown.
Very few sites are focused on the collection of framework data, which is of relevance to national mapping agencies, yet the latter have a strong and growing interest in the use of crowdsourced geographic information [76]. In addition, metadata standards are only mentioned as an option on one site. Research into metadata for crowdsourced geographic information is required, which could build on work such as that undertaken in Reference [77]. Sites that use the data for research or assimilation into models, on the other hand, are strongly driven by data collection protocols. The literature review also indicates that the crowdsourcing of geospatial data is often most suited to complementing professional approaches and that research into conflation methods should be a key area of future research. Minimum data collection protocols that would facilitate the use of crowdsourced geographic information in a way that could complement authoritative data should be developed. Although the free tagging of OpenStreetMap could be viewed as an inclusive approach, it also means that there are inconsistencies in the way that the data are tagged, which limits their use for some applications (e.g., land cover and land use mapping). Further work should also consider what conflationary methods have already been tested and where further developments need to take place. Data collection protocols could be built into the incentivisation strategy of crowdsourced geographic information sites, which is one further area of potential research. The literature also emphasizes the idea of using crowdsourced geographic information for change detection and/or low specification mapping tasks, leaving the more static baseline data to professional data collection.
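The tagging inconsistency mentioned above can be illustrated with a small harmonization step of the kind a minimum data collection protocol might specify. The Python sketch below is an assumption-laden illustration rather than a method taken from the literature reviewed: the lookup table `TAG_TO_CLASS`, the land cover class names and the function `harmonize` are hypothetical, although the OpenStreetMap key/value pairs shown (e.g., `landuse=forest` and `natural=wood`) are tags in common use.

```python
# Hypothetical lookup from OpenStreetMap key/value pairs to a small set of
# land cover classes; a real protocol would need a far more complete table.
TAG_TO_CLASS = {
    ("landuse", "forest"): "tree cover",
    ("natural", "wood"): "tree cover",
    ("landuse", "farmland"): "cropland",
    ("landuse", "meadow"): "grassland",
    ("natural", "grassland"): "grassland",
    ("natural", "water"): "water",
}


def harmonize(tags):
    """Return the land cover class for a feature's tags, or None if unmapped."""
    for key, value in tags.items():
        # Free tagging allows variation in case and spacing, so normalize first.
        normalized = (key.strip().lower(), value.strip().lower())
        if normalized in TAG_TO_CLASS:
            return TAG_TO_CLASS[normalized]
    return None


# Two differently tagged woodland features end up in the same class.
print(harmonize({"landuse": "forest"}), harmonize({"natural": "Wood"}))
```

Features that return `None` are those for which free tagging gives no usable land cover information, which is one simple way of quantifying the limitation described above.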
Information about data quality is lacking on the majority of sites reviewed, although quality control may still be taking place behind the scenes. Some data quality measures were evident when the data were used for further scientific research, although this was not always the case. The literature highlights quality assurance as a key area for research, e.g., [23,78,79,80], as well as the development of credibility, reputation and trust methodologies for crowdsourced geographic information such as those discussed in Reference [81]. Fortunately, a variety of broad approaches for quality assessment and control exist [70], although further work is required if the full potential of citizen-derived data is to be achieved.
Most citizen science and crowdsourcing projects are focused on growing the number of users and the volume of the data since these factors can significantly influence the value of the service. This may bring with it challenges, not least associated with the size and heterogeneity of the data obtained. Indeed, the size, volume, and quantity of data from citizen-orientated approaches to geospatial data generation and collection pose major problems for the extraction of knowledge from these data streams and their subsequent storage, visualization and interpretation [82]. Aspects such as visualization could, however, also play a key role in helping to make sense of increasing data streams of crowdsourced geographic information. The actual usefulness of the geographic data collected was rarely addressed explicitly; one aspect of the research agenda set out in Reference [83] calls for a greater understanding of the different use cases for crowdsourced geographic information, particularly to better understand those applications/domains to which it could contribute the most.
Most of the sites focus only on collecting the data, which are then most often available through their website. It is rare that the projects also provide tools to easily and indiscriminately share the data. Even in those examples where services for sharing the data are provided, they are mostly in the form of “widgets” or “snippets” to integrate the data in a predefined form into another website, severely limiting the possibilities of data reuse (and thus the real power of crowdsourcing). The most obvious example of “real” sharing is OpenStreetMap. Examples of predefined and limited sharing are Google MapMaker (only possible with the integration of Google Maps, which has server limitations) and Panoramio (only possible to include specific images). For those sites with large data volumes and demands for data, APIs are used to provide access. This represents the ideal solution from a database point of view, yet the majority of sites serve the data in flat file formats (or not at all).
This review has not addressed all of the issues relevant to citizen sensing in the geographical information domain. Below is a list of areas that require further research beyond those already mentioned in the discussion above:
  • There is a need to gain a better understanding of the currency of the data. This issue is critical for integration of crowdsourced geographic information with authoritative sources, particularly if crowdsourced geographic information is to be used for change detection. Crowdsourced geographic information is often assumed to be more current than the framework equivalent, but it is not always clear whether this is the case and requires further study.
  • Investigation of how the interrelationships between terms in Table 1 have changed and evolved over time could be undertaken, since phrases come in and out of fashion or become synonyms for related but different activities. This would require an extension of this research into the domain of semiotics, for example, to develop a semantic or text mining analysis of the similarity of the changing contexts within which terms are used, and is a research topic in its own right.
  • More research into incentives for participation and citizen motivations is required. Greater use of online surveys (see, for example, [84,85]) may help to better meet the needs of citizens in the future. For example, how can citizens be encouraged to map an area that has already been mapped in the last few years or be more actively engaged in change detection mapping?
  • Issues of copyright, ownership, data privacy and licensing will become much more prevalent in the future as data contributed by citizens are integrated with base layers that are created by third parties. Saunders et al. [86] consider the licensing and copyright issues from a Canadian legal perspective when using a range of current online mapping tools. Data privacy laws vary from country to country but generally require the protection of personal information, i.e., information that could allow people to be identified [87]. However, location-based information can reveal personal information that could be disclosed without consent if the users of the data are not careful in how the data are subsequently employed [88]. Ethical issues surrounding the use of crowdsourced geographic information with respect to health and disease surveillance have been raised by Blatt [89], so this is a growing area where further research is needed.
  • Data interoperability was not considered in the above review of websites but if the data are to be used in future projects or for different purposes than those for which the data were originally collected, more research into data standards for crowdsourced geographic information is required. Work is ongoing in this area within the COBWEB citizen observatory project [90] while the authors of [91] have presented a unified model for semantic interoperability of sensor data and VGI.
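The question of currency raised in the first item of this list could be examined with a very simple indicator: the share of crowdsourced features last edited after the most recent revision of the corresponding authoritative data set. The Python sketch below is illustrative only; the function name `share_more_current` and all dates are hypothetical, and it assumes that a last-edit date is available for every feature.

```python
from datetime import date


def share_more_current(last_edit_dates, authoritative_revision):
    """Fraction of features last edited after the authoritative revision date."""
    dates = list(last_edit_dates)
    if not dates:
        raise ValueError("at least one feature is required")
    newer = sum(1 for d in dates if d > authoritative_revision)
    return newer / len(dates)


# Hypothetical last-edit dates for four crowdsourced features.
edits = [date(2015, 3, 1), date(2012, 7, 15), date(2016, 1, 10), date(2010, 5, 2)]

# Hypothetical revision date of the framework data set for the same area.
print(share_more_current(edits, date(2014, 1, 1)))  # 0.5
```

An indicator of this kind says nothing about whether the more recent edits are correct, so it would complement rather than replace the quality assessment approaches discussed earlier.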
In summary, this article has clarified the meaning of various terms used in discussion of what we have collectively referred to as crowdsourced geographic information. We have shown the variation in the usage of terms and provided a snapshot of some of the issues connected with contemporary geographical information sites on the Internet. The subject is expected to grow and evolve in the future. Developments in data mining and knowledge discovery may increase the role of passive contributions. Related to this, the volume and diversity of data sets may grow, requiring developments in relation to issues such as data quality assessment, visualization, data harmonization and metadata. The citizens themselves may also feature more prominently, with increased attention paid to their motivation and training, and with their general involvement in tasks increased by activities such as feedback on contributions and on how the data are used. Linked to this, a suite of legal and ethical issues such as those connected to data ownership, responsibility and privacy will require increasing attention.

Acknowledgments

The authors would like to acknowledge the support and contribution of COST Action TD1202 “Mapping and Citizen Sensor” (http://www.citizen-sensor-cost.eu) and partial support from the EU-funded ERC CrowdLand project (No. 617754). Bin Jiang's work is partially supported by a special fund of the Key Laboratory of Eco Planning & Green Building, Ministry of Education (Tsinghua University), China. The authors would also like to acknowledge COST Action IC1203 “ENERGIC”, in particular Cristina Capineri and Sofia Basiouka, for development of a joint glossary to which the terms in Table 1 were originally contributed. A much condensed and modified version of Table 1 will be published in [92].

Author Contributions

Linda See, Peter Mooney and Giles Foody wrote the paper and worked together with Lucy Bastin, Jacinto Estima, Steffen Fritz, Norman Kerle, Mari Laakso, Hai-Ying Liu, Grega Milčinski, Matej Nikšič, Marco Painho, Andrea Pődör and Ana-Maria Olteanu-Raimond to undertake the review of websites. Alexis Comber did the semantic analysis while Bin Jiang and Martin Rutzinger provided useful comments on the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AGI: Ambient Geographic Information
CCGI: Citizen-contributed Geographic Information OR Collaboratively Contributed Geographic Information
CGI: Contributed Geographic Information
PPGIS: Public Participation in Geographic Information Systems
PPSR: Public Participation in Scientific Research
iVGI: Involuntary Volunteered Geographic Information
UCC: User Created Content
UGC: User Generated Content
VGI: Volunteered Geographic Information

References

  1. McConchie, A. Hacker cartography: Crowdsourced geography, OpenStreetMap, and the hacker political imaginary. ACME Int. E-J. Crit. Geogr. 2015, 14, 874–898.
  2. Goodchild, M.F. Citizens as sensors: The world of volunteered geography. GeoJournal 2007, 69, 211–221.
  3. Turner, A. Introduction to Neogeography; O’Reilly: Sebastopol, CA, USA, 2006.
  4. Howe, J. The rise of crowdsourcing. Wired Mag. 2006, 14, 1–4.
  5. Bonney, R.; Cooper, C.B.; Dickinson, J.; Kelling, S.; Phillips, T.; Rosenberg, K.V.; Shirk, J. Citizen science: A developing tool for expanding science knowledge and scientific literacy. BioScience 2009, 59, 977–984.
  6. Krumm, J.; Davies, N.; Narayanaswami, C. User-Generated Content. IEEE Pervasive Comput. 2008, 7, 10–11.
  7. Jokar Arsanjani, J.; Zipf, A.; Mooney, P.; Helbich, M. OpenStreetMap in GIScience; Lecture Notes in Geoinformation and Cartography; Springer International Publishing: Cham, Switzerland, 2015.
  8. Crooks, A.; Croitoru, A.; Stefanidis, A.; Radzikowski, J. #Earthquake: Twitter as a distributed sensor system. Trans. GIS 2013, 17, 124–147.
  9. Estima, J.; Painho, M. Flickr geotagged and publicly available photos: Preliminary study of its adequacy for helping quality control of Corine land cover. In Computational Science and Its Applications—ICCSA 2013; Murgante, B., Misra, S., Carlini, M., Torre, C.M., Nguyen, H.-Q., Taniar, D., Apduhan, B.O., Gervasi, O., Eds.; Lecture Notes in Computer Science; Springer Berlin Heidelberg: Cham, Switzerland, 2013; pp. 205–220.
  10. Harvey, F. To volunteer or to contribute locational information? Towards truth in labeling for crowdsourced geographic information. In Crowdsourcing Geographic Knowledge; Sui, D., Elwood, S., Goodchild, M., Eds.; Springer Netherlands: Dordrecht, The Netherlands, 2013; pp. 31–42.
  11. Jiang, B.; Thill, J.-C. Volunteered Geographic Information: Towards the establishment of a new paradigm. Comput. Environ. Urban Syst. 2015, 53, 1–3.
  12. Heipke, C. Crowdsourcing geospatial data. ISPRS J. Photogramm. Remote Sens. 2010, 65, 550–557.
  13. Spyratos, S.; Lutz, M.; Pantisano, F. Characteristics of citizen-contributed geographic information. In Proceedings of the AGILE’2014 International Conference on Geographic Information Science, Castellón, Spain, 3–6 June 2014.
  14. Elwood, S.; Goodchild, M.; Sui, D.Z. Vgi-Net. Available online: http://vgi.spatial.ucsb.edu/ (accessed on 6 December 2013).
  15. Elwood, S.; Goodchild, M.F.; Sui, D.Z. Researching volunteered geographic information: Spatial data, geographic research, and new social practice. Ann. Assoc. Am. Geogr. 2012, 102, 571–590.
  16. Stefanidis, A.; Crooks, A.; Radzikowski, J. Harvesting ambient geospatial information from social media feeds. GeoJournal 2013, 78, 319–338.
  17. Science Communication Unit. Science for Environment Policy Indepth Report: Environmental Citizen Science; University of the West of England: Bristol, UK, 2013.
  18. Bonney, R. Citizen science: A lab tradition. Living Bird 1996, 15, 7–15.
  19. Miller-Rushing, A.; Primack, R.; Bonney, R. The history of public participation in ecological research. Front. Ecol. Environ. 2012, 10, 285–290.
  20. SOCIENTIZE. White Paper on Citizen Science for Europe; Socientize Consortium: Zaragoza, Spain, 2014.
  21. Haklay, M. Neogeography and the delusion of democratisation. Environ. Plan. A 2013, 45, 55–69.
  22. MacGillavry, E. Collaborative Mapping; Webmapper: Utrecht, The Netherlands, 2003.
  23. Bishr, M.; Kuhn, W. Geospatial information bottom-up: A matter of trust and semantics. In The European Information Society; Fabrikant, S.I., Wachowicz, M., Eds.; Springer Berlin Heidelberg: Berlin, Germany, 2007; pp. 365–387.
  24. Keßler, C.; Maué, P.; Heuer, J.T.; Bartoschek, T. Bottom-up gazetteers: Learning from the implicit semantics of geotags. In GeoSpatial Semantics; Janowicz, K., Raubal, M., Levashkin, S., Eds.; Springer Berlin Heidelberg: Berlin, Germany, 2009; pp. 83–102.
  25. Buhrmester, M.; Kwang, T.; Gosling, S.D. Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspect. Psychol. Sci. 2011, 6, 3–5.
  26. Estellés-Arolas, E.; González-Ladrón-de-Guevara, F. Towards an integrated crowdsourcing definition. J. Inf. Sci. 2012, 38, 189–200.
  27. Haklay, M. Citizen science and volunteered geographic information: Overview and typology of participation. In Crowdsourcing Geographic Knowledge; Sui, D., Elwood, S., Goodchild, M., Eds.; Springer Netherlands: Dordrecht, The Netherlands, 2013; pp. 105–122.
  28. MacEachren, A.M.; Brewer, I. Developing a conceptual framework for visually-enabled geocollaboration. Int. J. Geogr. Inf. Sci. 2004, 18, 1–34.
  29. Tomaszewski, B. Geocollaboration. In Encyclopedia of Geography; SAGE Publications, Inc.: Thousand Oaks, CA, USA, 2010; pp. 1209–1211.
  30. Herring, C. An Architecture of Cyberspace: Spatialization of the Internet; U.S. Army Construction Engineering Research Laboratory: Champaign, IL, USA, 1994.
  31. MacGuire, D. GeoWeb 2.0: Implications for ESDI. In Proceedings of the 12th EC-GI&GIS Workshop, Innsbruck, Austria, 21–23 June 2006.
  32. Fischer, F. VGI as big data: A new but delicate geographic data source. GeoInformatics 2012, 3, 46–47.
  33. Snook, T. Hacking is a Mindset, not a Skillset: Why Civic Hacking is Key for Contemporary Creativity. Available online: http://blogs.lse.ac.uk/impactofsocialsciences/2014/01/16/hacking-is-a-mindset-not-a-skillset/ (accessed on 16 July 2014).
  34. Sfetcu, N. Game Preview; Nicolae Sfetcu: Bucharest, Romania, 2014.
  35. Sui, D. Mashup and the spirit of GIS and geography. GeoWorld 2009, 12, 15–17.
  36. Szott, R. Neogeography Defined. Available online: http://placekraft.blogspot.co.uk/2006/04/neogeography-defined.html (accessed on 6 December 2013).
  37. Szott, R. Psychogeography vs. Neogeography. Available online: http://placekraft.blogspot.co.uk/2006/04/psychogeography-vs-neogeography.html (accessed on 6 December 2013).
  38. Burke, J.A.; Estrin, D.; Hansen, M.; Parker, A.; Ramanathan, N.; Reddy, S.; Srivastava, M.B. Participatory Sensing; Center for Embedded Network Sensing: Los Angeles, CA, USA, 2006.
  39. Karatzas, K.D. Participatory environmental sensing for quality of life information services. In Information Technologies in Environmental Engineering; Golinska, P., Fertsch, M., Marx-Gómez, J., Eds.; Springer Berlin Heidelberg: Berlin, Germany; Heidelberg, Germany, 2011; pp. 123–133.
  40. Bonney, R.; Ballard, H.; Jordan, R.; McCallie, E.; Phillips, T.; Shirk, J.; Wilderman, C.C. Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education; Center for Advancement of Informal Science Education (CAISE): Washington, DC, USA, 2009.
  41. Sieber, R. Public participation geographic information systems: A literature review and framework. Ann. Assoc. Am. Geogr. 2006, 96, 491–507.
  42. Shneiderman, B. Science 2.0. Science 2008, 319, 1349–1350.
  43. Bücheler, T.; Sieg, J.H. Understanding Science 2.0: Crowdsourcing and open innovation in the scientific method. Procedia Comput. Sci. 2011, 7, 327–329.
  44. Gartner, G.; Bennett, D.A.; Morita, T. Towards ubiquitous cartography. Cartogr. Geogr. Inf. Sci. 2007, 34, 247–257.
  45. OECD. Participative Web: User-Created Content; OECD: Paris, France, 2007.
  46. Tsou, M.-H. Revisiting web cartography in the United States: The rise of user-centered design. Cartogr. Geogr. Inf. Sci. 2011, 38, 250–257.
  47. Tapscott, D.; Williams, A.D. Wikinomics: How Mass Collaboration Changes Everything; Portfolio: New York, NY, USA, 2006.
  48. Comber, A.; Schade, S.; See, L.; Mooney, P.; Foody, G. Semantic analysis of citizen sensing, crowdsourcing and VGI. In Proceedings of the AGILE’2014 International Conference on Geographic Information Science, Castellón, Spain, 3–6 June 2014.
  49. Google. Google Trends. Available online: https://www.google.com/trends/ (accessed on 31 October 2015).
  50. Whittaker, J.; McLennan, B.; Handmer, J. A review of informal volunteerism in emergencies and disasters: Definition, opportunities and challenges. Int. J. Disaster Risk Reduct. 2015, 13, 358–368.
  51. Walford, R. The 1996 geographical association land use-UK survey: A geographical commitment. Int. Res. Geogr. Environ. Educ. 1999, 8, 291–294.
  52. Silvertown, J. A new dawn for citizen science. Trends Ecol. Evol. 2009, 24, 467–471.
  53. Ushahidi. Ushahidi. Available online: http://www.ushahidi.com (accessed on 6 December 2014).
  54. Tomnod. Tomnod. Available online: http://www.tomnod.com (accessed on 6 December 2014).
  55. Humanitarian OpenStreetMap Team. Humanitarian OpenStreetMap. Available online: http://hotosm.org (accessed on 6 December 2014).
  56. Muller, C.L.; Chapman, L.; Johnston, S.; Kidd, C.; Illingworth, S.; Foody, G.; Overeem, A.; Leigh, R.R. Crowdsourcing for climate and atmospheric sciences: Current status and future potential. Int. J. Climatol. 2015, 35, 3185–3203.
  57. Antoniou, V.; Skopeliti, A. Measures and indicators of VGI quality: An overview. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 1, 345–351.
  58. See, L.; Comber, A.; Salk, C.; Fritz, S.; van der Velde, M.; Perger, C.; Schill, C.; McCallum, I.; Kraxner, F.; Obersteiner, M. Comparing the quality of crowdsourced data contributed by expert and non-experts. PLoS ONE 2013, 8, e69958.
  59. Dorn, H.; Törnros, T.; Zipf, A. Quality evaluation of VGI using authoritative data—A comparison with land use data in Southern Germany. ISPRS Int. J. Geo-Inf. 2015, 4, 1657–1671.
  60. Neis, P.; Zielstra, D. Recent developments and future trends in volunteered geographic information research: The case of OpenStreetMap. Future Internet 2014, 6, 76–106.
  61. Jokar Arsanjani, J.; Mooney, P.; Zipf, A.; Schauss, A. Quality assessment of the contributed land use information from OpenStreetMap versus authoritative datasets. In OpenStreetMap in GIScience; Lecture Notes in Geoinformation and Cartography; Jokar Arsanjani, J., Zipf, A., Mooney, P., Helbich, M., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 37–58.
  62. Barron, C.; Neis, P.; Zipf, A. A comprehensive framework for intrinsic OpenStreetMap quality analysis. Trans. GIS 2014, 18, 877–895.
  63. Yang, A.; Fan, H.; Jing, N.; Sun, Y.; Zipf, A. Temporal analysis on contribution inequality in OpenStreetMap: A comparative study for four countries. ISPRS Int. J. Geo-Inf. 2016, 5.
  64. Cinnamon, J.; Schuurman, N. Confronting the data-divide in a time of spatial turns and volunteered geographic information. GeoJournal 2012, 78, 657–674.
  65. Bordogna, G.; Carrara, P.; Criscuolo, L.; Pepe, M.; Rampini, A. On predicting and improving the quality of Volunteer Geographic Information projects. Int. J. Digit. Earth 2014.
  66. Kerle, N.; Hoffman, R.R. Collaborative damage mapping for emergency response: The role of cognitive systems engineering. Nat. Hazards Earth Syst. Sci. 2013, 13, 97–113.
  67. Mayer-Schönberger, V.; Cukier, K. Big Data: A Revolution That Will Transform How We Live, Work, and Think; Houghton Mifflin Harcourt: Boston, MA, USA, 2013.
  68. Jiang, B.; Liu, X. Scaling of geographic space from the perspective of city and field blocks and using volunteered geographic information. Int. J. Geogr. Inf. Sci. 2012, 26, 215–229.
  69. Haklay, M.; Basiouka, S.; Antoniou, V.; Ather, A. How many volunteers does it take to map an area well? The validity of Linus’ Law to volunteered geographic information. Cartogr. J. 2010, 47, 315–322.
  70. Goodchild, M.F.; Li, L. Assuring the quality of volunteered geographic information. Spat. Stat. 2012, 1, 110–120.
  71. Foody, G.M.; See, L.; Fritz, S.; van der Velde, M.; Perger, C.; Schill, C.; Boyd, D.S.; Comber, A. Accurate attribute mapping from Volunteered Geographic Information: Issues of volunteer quantity and quality. Cartogr. J. 2015, 52, 336–344.
  72. Fonte, C.C.; Bastin, L.; See, L.; Foody, G.; Lupia, F. Usability of VGI for validation of land cover maps. Int. J. Geogr. Inf. Sci. 2015, 29, 1269–1291.
  73. Yang, A.; Fan, H.; Jing, N. Amateur or professional: Assessing the expertise of major contributors in OpenStreetMap based on contributing behaviors. ISPRS Int. J. Geo-Inf. 2016, 5.
  74. Neis, P.; Zielstra, D.; Zipf, A. Comparison of volunteered geographic information data contributions and community development for selected world regions. Future Internet 2013, 5, 282–300.
  75. Saxton, G.D.; Oh, O.; Kishore, R. Rules of crowdsourcing: Models, issues, and systems of control. Inf. Syst. Manag. 2013, 30, 2–20.
  76. Olteanu-Raimond, A.-M.; Hart, G.; Foody, G.M.; Touya, G.; Kellenberger, T.; Demetriou, D. The scale of VGI in map production: A perspective of European National Mapping Agencies. Trans. GIS 2015.
  77. Kalantari, M.; Rajabifard, A.; Olfat, H.; Williamson, I. Geospatial Metadata 2.0—An approach for Volunteered Geographic Information. Comput. Environ. Urban Syst. 2014, 48, 35–48.
  78. Allahbakhsh, M.; Benatallah, B.; Ignjatovic, A.; Motahari-Nezhad, H.R.; Bertino, E.; Dustdar, S. Quality control in crowdsourcing systems: Issues and directions. IEEE Internet Comput. 2013, 17, 76–81.
  79. Esmaili, R.; Naseri, F.; Esmaili, A. Quality assessment of volunteered geographic information. Am. J. Geogr. Inf. Syst. 2013, 2, 19–26.
  80. Coleman, D. Potential contributions and challenges of VGI for conventional topographic base-mapping programs. In Crowdsourcing Geographic Knowledge; Sui, D., Elwood, S., Goodchild, M., Eds.; Springer Netherlands: Dordrecht, The Netherlands, 2013; pp. 245–263.
  81. Bishr, M.; Mantelas, L. A trust and reputation model for filtering and classifying knowledge about urban growth. GeoJournal 2008, 72, 229–237.
  82. Herrera, F.; Sosa, R.; Delgado, T. GeoBI and big VGI for crime analysis and report. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud), Rome, Italy, 24–26 August 2015; pp. 481–488.
  83. Mooney, P.; Rehrl, K.; Hochmair, H. Action and interaction in volunteered geographic information: A workshop review. J. Locat. Based Serv. 2013, 7, 291–311.
  84. Land-Zandstra, A.M.; Devilee, J.L.A.; Snik, F.; Buurmeijer, F.; van den Broek, J.M. Citizen science on a smartphone: Participants motivations and learning. Public Underst. Sci. 2016, 25, 45–60.
  85. Reed, J.; Raddick, M.J.; Lardner, A.; Carney, K. An exploratory factor analysis of motivations for participating in Zooniverse, a collection of virtual citizen science projects. In Proceedings of the 2013 46th Hawaii International Conference on System Sciences (HICSS), Wailea, HI, USA, 7–10 January 2013; pp. 610–619.
  86. Saunders, A.; Scassa, T.; Lauriault, T.P. Legal issues in maps built on third party base layers. Geomatica 2012, 66, 279–290.
  87. Scassa, T. Legal issues with volunteered geographic information. Can. Geogr. 2013, 57, 1–10.
  88. Scassa, T. Geographic information as personal information. Oxf. Univ. Commonw. Law J. 2010, 10, 185–214.
  89. Blatt, A.J. Data privacy and ethical uses of Volunteered Geographic Information. In Health, Science, and Place; Springer International Publishing: Cham, Switzerland, 2015; pp. 49–59.
  90. Cobweb. Cobweb. Available online: http://cobwebproject.eu (accessed on 6 November 2015).
  91. Bakillah, M.; Liang, S.; Zipf, A.; Arsanjani, J. Semantic interoperability of sensor data with Volunteered Geographic Information: A unified model. ISPRS Int. J. Geo-Inf. 2013, 2, 766–796.
  92. Capineri, C. European Handbook of Crowdsourced Geographic Information; Ubiquity Press: London, UK, 2016.
Figure 1. Placing crowdsourced geographic information in the context of the terminology found in the literature and the media. AGI: Ambient Geographic Information; CCGI: Citizen-contributed Geographic Information OR Collaboratively Contributed Geographic Information; CGI: Contributed Geographic Information; PPGIS: Public Participation in Geographic Information Systems; PPSR: Public Participation in Scientific Research; iVGI: Involuntary Volunteered Geographic Information; VGI: Volunteered Geographic Information.
Figure 2. The frequency of occurrence of different terms found in the literature relating to Crowdsourced Geographic Information.
Figure 3. Trend in the term “crowdsourcing” (in blue) and the phrase “citizen science” (in red) over time. The y-axis is a relative volume expressed between 0 and 100 where the maximum search volume is set to 100.
Figure 4. Trend in the term “GeoWeb” (in blue) compared with “crowdsourcing” (in red) and “web mapping” (in yellow) over time. The y-axis is a relative volume expressed between 0 and 100 where the maximum search volume is set to 100.
Figure 5. Trend in the phrase “citizen science” (in blue) compared with “collaborative mapping” (in red), “VGI” (in yellow) and “participatory sensing” (in green) over time. The y-axis is a relative volume expressed between 0 and 100 where the maximum search volume is set to 100.
Figure 6. Types of crowdsourced geographic information from the review characterized by framework/non-framework and active/passive. Crowdsourced geographic information in blue comes from other sources, e.g., academic publications.
Table 1. Terminology and definitions found in the literature arranged alphabetically. Type I refers to information generated, while P refers to a process-based term.
| Terminology | Definition | Type |
|---|---|---|
| Ambient geographic information (AGI) (2013) | This term first appeared in Stefanidis et al. [16] in relation to the analysis of Twitter data. AGI, in contrast to VGI, is passively contributed data in which the people themselves may be seen as the observable phenomena, rather than only as sensors. These observations can therefore help us to better understand human behavior and patterns in social systems. However, the focus can also be on the content of the data. | I |
| Citizen-contributed geographic information (CCGI) (2014) | CCGI was introduced in Spyratos et al. [13], where the definition is based on the purpose of the data collection exercise. CCGI therefore has two main components: information generated through scientifically oriented voluntary activities, i.e., VGI, and information from social media, which the authors refer to as social geographic data (SGD). | I |
| Citizen Cyberscience (2009) | Citizen Cyberscience is the provision and application of inexpensive distributed computing power, e.g., the Large Hadron Collider LHC@home project developed by the European Organization for Nuclear Research (CERN) and SETI@Home. | P |
| Citizen science (Mid-1990s) | Citizen Science was the title of a book written by Alan Irwin in 1995, which discussed the complementary nature of knowledge from citizens with that of science [17]. Rick Bonney of Cornell’s Laboratory of Ornithology first referred to citizen science in the mid-1990s [18] as an alternative term for public participation in scientific research, although citizens have had a long history of involvement in science [19]. A more recent definition from the Green Paper on Citizen Science for Europe [20] reads as follows: “the general public engagement in scientific research activities when citizens actively contribute to science either with their intellectual effort or surrounding knowledge or with their tools and resources. Participants provide experimental data and facilities for researchers, raise new questions and co-create a new scientific culture. While adding value, volunteers acquire new learning and skills, and deeper understanding of the scientific work in an appealing way. As a result of this open, networked and trans-disciplinary scenario, science-society-policy interactions are improved leading to a more democratic research, based on evidence-informed decision making as is scientific research conducted, in whole or in part, by amateur or non-professional scientists.” The idea of more “democratic research” and the democratization of GIS and geographic knowledge has recently been challenged in Reference [21], which argues that neogeography (see below for a definition) has opened up access to geographic information to only a small part of society (technologically literate, educated, etc.). | P |
| Collaborative mapping (2003) | Collaborative mapping is the collective creation of online maps (as representations of real-world phenomena) that can be accessed, modified and annotated online by multiple contributors, as outlined in MacGillavry [22]. | P |
| Collaboratively contributed geospatial information (CCGI) (2007) | CCGI is a precursor to the term VGI, meaning user-contributed geospatial information, which appeared in Bishr and Kuhn [23] and again in Keßler et al. [24]. CCGI implies collaboration between individuals, while VGI has more of an individual component based on the views of Goodchild (see the definition of VGI below). | I |
| Contributed Geographic Information (CGI) (2013) | Harvey [10] distinguishes between CGI and VGI, where CGI refers to geographic information “that has been collected without the immediate knowledge and explicit decision of a person using mobile technology that records location” whereas VGI refers to geographic information collected with the knowledge and explicit decision of a person. In VGI, data are collected using an “opt-in” agreement (e.g., OpenStreetMap and Geocaching, where users choose to actively participate), in contrast to CGI, where data are collected via an “opt-out” agreement (e.g., cell phone tracking, RFID-enabled transport cards, other sensor data). Since opt-out agreements are more open-ended and offer few possibilities to control the data collection, this has implications for quality, bias assessment and fitness-for-use of the data in later analyses or in visualization. Harvey [10] raises issues such as data provenance, potential reuse of the data, privacy (both of the data and the location of the individual) and liability as key concerns for CGI. | I |
| Crowdsourcing (2006) | Crowdsourcing first appeared in Howe [4], where it was defined as a business practice in which an activity is outsourced to the crowd. The word crowdsourcing also implies a low-cost solution, the involvement of large numbers of people and the fact that it has value as a business model. A classic example of a business-oriented crowdsourcing site is Amazon Mechanical Turk, which provides micro-payments to participants for undertaking small tasks, e.g., classification and transcription tasks [25]. More recently, Estellés-Arolas and González-Ladrón-de-Guevara [26] examined 32 definitions of crowdsourcing in the literature to produce a single definition as follows: “Crowdsourcing is a type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task. The undertaking of the task, of variable complexity and modularity, and in which the crowd should participate bringing their work, money, knowledge and/or experience, always entails mutual benefit. The user will receive the satisfaction of a given type of need, be it economic, social recognition, self-esteem, or the development of individual skills, while the crowdsourcer will obtain and utilize to their advantage what the user has brought to the venture, whose form will depend on the type of activity undertaken.” This definition emphasizes the online nature of the activity, which makes it narrower than other definitions in this table. By contrast, data collection in citizen science projects can be undertaken in the field or using paper forms. Moreover, not all crowdsourcing need be open to all but could be restricted geographically or to groups with certain expertise. Digital and educational divides also impose barriers on participation. Finally, crowdsourcing may not always entail mutual benefit if the data collected are then used for a purpose that differs from the one for which they were originally intended. | P |
| Extreme citizen science (2011) | Extreme citizen science can be attributed to Muki Haklay and his team at UCL (ExCiteS). Extreme citizen science is at level 4 (the highest level) of participation in the typology presented in Haklay [27]. Level 4 refers to collaborative science where the citizens participate heavily in, or lead on, problem definition, data collection and analysis. It conveys the idea of a “completely integrated activity … where professional and non-professional scientists are involved in deciding on which scientific problems to work and on the nature of the data collection so that it is valid and answers the needs of scientific protocols while matching the motivations and interests of the participants. The participants can choose their level of engagement and can be potentially involved in the analysis and publication or utilisation of results.” Scientists have more of a role as facilitators, or the project could be entirely driven and run by citizens. | P |
| Geocollaboration (2004) | First defined by MacEachren and Brewer [28] as “visually-enabled collaboration with geospatial information through geospatial technologies.” Geocollaboration involves two or more people solving a problem or undertaking a task together using geographic information and a computer-supported environment. Tomaszewski [29] emphasizes that geocollaboration is multidisciplinary in nature, drawing upon human-computer interaction, computer science and psychology, and that it is a subset of the more general computer-supported collaborative work. | P |
| Geographic citizen science (2013) | Citizen science with a geographic or spatial context. The term appears in Haklay’s [27] chapter on the typology of participation in citizen science and VGI. | P |
| GeoWeb (or GeoSpatialWeb) (or Geographic World Wide Web) (1994/2006) | The GeoWeb is the merging of spatial information with non-spatial attribute data on the web, which allows for spatial searching of the Internet. The concept (but not the actual term) was first outlined by Herring [30]. MacGuire [31] describes the GeoWeb 2.0 as the next step in the publishing, discovery and use of geographic data. It is a system of systems (GIS clients and servers, service providers, GIS portals, standards, collaboration agreements, etc.), which is very much in line with the idea of GEOSS (Global Earth Observation System of Systems). | P |
| Involuntary geographic information (iVGI) (2012) | This term first appeared in a paper by Fischer [32]. iVGI is defined as georeferenced data that have not been voluntarily provided by the individual and could be used for many purposes, including mapping but also more commercial applications such as geodemographic profiling. These types of data are usually generated in real time from various kinds of social media. | I |
| Map Hacking/Map Hacks/Hackathons (and Appathons) (1999) | The term “hacker” has been used to refer to someone who tries to break into a computer system. A more positive use of the term is someone who can devise a clever solution to a programming problem; someone who generally enjoys programming; or someone who can appreciate good “hacks” [33]. The term “Map hacking” has been used quite specifically in relation to computer/video games in which a player executes a program that allows them to bypass obstacles or see more than they should be allowed to see, which is essentially a type of cheating [34]. However, a positive usage relates to devising creative and useful solutions with digital maps, e.g., see the books “Hacking Google Maps and Google Earth”, “Google Maps Hack” or “Mapping Hacks: Tips & Tools for Electronic Cartography”. Hackathons such as “Random Hacks of Kindness” have resulted in geospatial solutions in the area of post-disaster response. Appathons are now appearing with a particular emphasis on developing mobile applications. | P |
| Mashup (1999 or around time of Web 2.0) | The term mashup was borrowed from the music industry, where it originally denoted a piece of music that had been created by blending two or more songs. In a geographic context, a mashup is the integration of geographic information from sources that are distributed across the Internet to create a new application or service [35]. Mashup can also refer to a digital media file that contains a combination of elements including text, maps, audio, video and animation, effectively creating a new, derivative work from the existing pieces. | P |
| Neogeography (2006) | Neogeography has been defined by Turner [3] as the making and sharing of maps by individuals, using the increasing number of tools and resources that are freely available. Implicit in this definition is the movement away from traditional map making by professionals. The definition of neogeography by Szott [36,37] encompasses broader practices than GIS and cartography and includes everything that falls outside of the professional domain of geographic practices. | P |
| Participatory sensing (2006) | Participatory sensing was introduced by Burke et al. [38] as the use of mobile devices deployed as part of an interactive participatory sensor network, which can be used to collect data and share knowledge. The data and knowledge can then be analyzed and used by the public or by more professional users. Examples include noise levels collected by built-in microphones and photos taken by mobile devices, which can be used to gather environmental data. The term is often used in the context of environmental monitoring and was recently developed further by Karatzas [39]. | P |
| Public participation in scientific research (PPSR) (2009 for the Bonney et al. review [40] but most likely older) | PPSR was reviewed by Bonney et al. [40] in relation to informal science education. PPSR is defined as “public involvement in science including choosing or defining questions for the study; gathering information and resources; developing hypotheses; designing data collection and methodologies; collecting data; analyzing data; interpreting data and drawing conclusions; disseminating results; discussing results and asking new questions.” Bonney et al. [40] categorize PPSR projects into three main types: contributory (mostly data collection); collaborative (data collection and refining project design, analyzing data, disseminating results); and co-created (designed together by scientists and the general public, where the public inputs to most or all of the steps in the scientific process). PPSR appears to be equivalent to citizen science, with the typology defined by Bonney et al. [40] mapping fairly closely onto that of Haklay [27]. | P |
| Public Participation Geographic Information Systems (PPGIS) (1996) | The term PPGIS has its origins in a workshop organized by the National Center for Geographic Information and Analysis (NCGIA) in Orono, Maine, USA, on 10–13 July 1996. PPGIS are a set of GIS applications that facilitate wider public involvement in planning and decision-making processes [41]. PPGIS has been identified as relevant in processes of urban planning, nature conservation and rural development, among others. | P |
| Science 2.0 (2008) | Coined by Shneiderman [42], the term Science 2.0 refers to the next generation of collaborative science enabled through IT, the Internet and mobile devices, which is needed to solve complex, global interdisciplinary problems. Citizens are one component of Science 2.0. | P |
| Swarm Intelligence (2011 but may be older) | Appears in Bücheler and Sieg [43] as a “buzzword” for paradigms like citizen science, crowdsourcing, open innovation, etc. From an Artificial Intelligence (AI) perspective, however, swarm intelligence refers to a set of algorithms that use agents (or boids) and simple rules to generate what appears to be intelligent behavior. These algorithms are often used for optimization tasks, and rules for success in various contexts are often derived from the emergent behaviors observed. | P |
| Ubiquitous cartography (2007) | Defined in Gartner et al. [44] as “… the study of how maps can be created and used anywhere and at any time.” This term emphasizes the idea of real-time, in situ map production versus more traditional cartography and covers other domains such as location-based services and mobile cartography. | P |
| User-created content (UCC)/User-generated content (UGC) (2007 but likely older) | UCC/UGC arose from web publishing and digital media circles. It refers to content that users publish themselves in digital form (e.g., data, videos, blogs, discussion forum postings, images and photos, maps, audio files, public art, etc.) [45]. Other synonyms for UCC/UGC are peer production and consumer-generated media. More recently, Krumm et al. (2008) refer to “pervasive UGC”, where UGC moves from the desktop into people’s lives, e.g., through mobile devices. | I |
| Volunteered Geographic Information (VGI) (2007) | First coined by Goodchild (2007), VGI is defined as “the harnessing of tools to create, assemble, and disseminate geographic data provided voluntarily by individuals”. In Schuurman (2009), Goodchild argues that crowdsourcing implies a kind of consensus-producing process and the assumption that several people will provide information about the same thing, so it will be more accurate than VGI. VGI, on the other hand, is produced by individuals without any such opportunity for convergence. Elwood et al. (2012) define VGI as spatial information that is voluntarily made available, with an aim to provide information about the world. | I |
| Web mapping (Mid-1990s) | A term used in parallel with the development of web-based GIS solutions, which has recently evolved to mean “the study of cartographic representation using the web as the medium, with an emphasis on user-centered design (including user interfaces, dynamic map contents, and mapping functions), user-generated content, and ubiquitous access”, as it appears in Tsou [46]. | P |
| Wikinomics (2006) | The name of a book by Tapscott and Williams [47], wikinomics embodies the idea of mass collaboration in a business environment. It is based on four principles: (a) openness; (b) peering (or a collaborative approach); (c) sharing; and (d) acting globally. The book itself is meant to be a collaborative and living document that everyone can contribute to. | P |
Table 2. Subject area of crowdsourced geographic information sites in the review.
| Subject | Description |
|---|---|
| Communications | Providing IP addresses, mobile cell IDs, wireless networks |
| Crime/Public Safety | Maps showing reported crimes |
| Disasters (natural and man-made) | Mapping after a natural or man-made disaster |
| Ecology | Species identification, reporting of roadkill, species counts |
| Education | Environmental monitoring in schools, e.g., through the GLOBE (Global Learning and Observations to Benefit the Environment) program, where the primary focus is education |
| Environmental monitoring | Water levels and quality |
| Feature mapping | Mapping of buildings, other features of interest |
| Fishing | Fishing hotspots, stories, community building |
| Gazetteer | Place name site |
| Geocaching | Geocaching is an outdoor location-based treasure hunting game (http://www.geocaching.com). |
| Hiking/Trails | Trail guides, GPS trails plotted on a map/mobile device |
| Land cover | Satellite and photograph classification by volunteers, e.g., Geo-Wiki and Picture Pile |
| Location-based social media | Sites that bring together people in close proximity, photo sharing sites, georeferenced check-in data, which has been used for mapping natural cities, etc. |
| Mobile data/Behavior | Used to target customers by location |
| Search engine data | Google Trends, e.g., Google applications for monitoring trends in flu and dengue fever using an archive of search data |
| Sky/Stars | Identification of stars, condition of the sky |
| Places of interest/Travel | Stories (text and video) and photos of places of interest; travel guides; travel advice |
| Transport | Navigation, real-time traffic, cycle routes, speed traps, mapping of roads |
| Weather | Weather data collection, snow depths, avalanches |

See, L.; Mooney, P.; Foody, G.; Bastin, L.; Comber, A.; Estima, J.; Fritz, S.; Kerle, N.; Jiang, B.; Laakso, M.; et al. Crowdsourcing, Citizen Science or Volunteered Geographic Information? The Current State of Crowdsourced Geographic Information. ISPRS Int. J. Geo-Inf. 2016, 5, 55. https://doi.org/10.3390/ijgi5050055