Introduction

In the current polarised political and media environment (), with public access to a vast choice of information sources (), there is an increasing need for effective public engagement and science communication. There is, therefore, an argument for the democratisation of science, to make information accessible to everyone, to engage the public in scientific issues, and to involve them in scientific research endeavours (; ). Democratising science in ecology and conservation has the potential to increase understanding of environmental issues and scientific research methods, catalysing bottom-up action, greater environmental stewardship, and ecological conservation. Furthermore, scientists can involve the public in the research process through gaining insight into local knowledge and value systems, and through volunteer contributions to data collection and interpretation (). Involving the public in research can be a highly effective means of public engagement and science communication, as it fosters sustained, longer-term engagement. There is also often a two-way dialogue in which both the public and researchers can provide input and feedback, consulting and collaborating on the research (; ). One way that public engagement is increasingly embedded in ecological research is through data collection by members of the public. In ecology and conservation specifically, the public can contribute to species monitoring and biological recording, documenting species’ occurrences to track species’ distribution, abundance, and/or phenology ().

Volunteers play a key role in biological recording and have been contributing to ecological datasets for centuries (; ; ; ). This process falls under the overarching term citizen science, which broadly encompasses any volunteer involvement in science (). The term was coined in the 1990s as a strategy for improving public trust in, and understanding of, science (). More recently, the term has been adopted to describe a range of initiatives and research endeavours across disciplines (), and citizen science now features more prominently in the published literature (). Within the field of ecology, in addition to biological recording, citizen science schemes can also include tasks such as identifying species from photographic records or digitising data associated with specimen collections ().

Citizen science recording schemes have collected some of the longest-running time-series datasets of species populations (). Such datasets play a key role in assessments of species’ changes in relation to pervasive anthropogenic pressures such as climate change, pollution, invasive species, and urbanisation (). Biological recording benefits from volunteer contributions because they increase the geographical range and temporal span over which species can be recorded, providing long-term species-distribution datasets that can be used to assess and compare ecological trends (). These recording schemes typically rely on ad hoc, opportunistic records, although there are examples of hypothesis-led citizen science schemes, as well as schemes that have established standardised monitoring protocols (; ).

Data quality is a concern with citizen science data, as generally unstructured sampling protocols can introduce bias and noise (; ; ). This can present challenges when analysing citizen science datasets and can limit the scientific questions that can be addressed (). The accuracy of citizen science data has also been questioned, owing to issues surrounding validation and verification (). Validation is a process through which records are checked to ensure the data have been submitted correctly. Verification is the process of checking records for correctness; within ecological citizen science schemes, this generally means confirming species identity (). Verification is a critical process for ensuring data quality of, and trust in, citizen science datasets (), enabling those datasets to be used in environmental research, management, and policy development ().

In this review, we explore the different approaches that published citizen science schemes use to verify their data, the breadth of information they use to verify each record, and the citizen science scheme attributes that may influence choice of verification approach. Our aims are to identify the options available for verification of citizen science data and to examine whether citizen science schemes are using the most suitable verification approach to maximise confidence in, and validity of, the data, whilst also ensuring efficient verification of records.

Systematic Review Method

To survey the verification approaches across existing citizen science schemes, we conducted this review based on the systematic review protocol developed by the Collaboration for Environmental Evidence (). The search terms we used were replicated from a review of the diversity and evolution of citizen science programmes carried out by Pocock et al. (). These terms were “citizen science,” “take part AND (nature OR environment),” “volunteer based monitoring,” “public participation in scientific research,” and “participatory science.” We also used the search term “volunteer.” Searches were carried out in October and November 2019 using Web of Science, and were filtered by “ecology,” “zoology,” “entomology,” and “ornithology.” To ensure that our keyword searches in Web of Science were not missing large components of the literature that might be found elsewhere, additional searches for the terms “ecology AND (volunteer OR citizen science)” were carried out using Google Scholar, and the first 100 results were reviewed.

We excluded papers if there were no mentions of a specific citizen science scheme, or if volunteers had been recruited to assist with the research but the contributions did not continue beyond the study and were not linked to a particular scheme. For example, Flaherty and Lawton () requested, using various media outlets, information on grey squirrel, red squirrel, and pine marten sightings by the general public; public sightings were used alongside hair tube and live trapping surveys to assess species distributions. In another example, data were collected from recreational anglers to combine with mark-recapture data to estimate populations of fish species (). These volunteer contributions were only for the duration of the study and were not linked to any particular scheme. We also excluded reviews and papers that discussed citizen science from a purely theoretical point of view. Finally, we excluded papers if the citizen science scheme focused on collecting data solely on the abiotic environment. These schemes included those collecting data on water quality () or on soil quality (). Where papers had used data from multiple schemes, we recorded all of the schemes included in the research. Citizen science schemes nested within a larger citizen science initiative or repository were considered separately if the paper identified the specific scheme. For example, Snapshot Serengeti (), Penguin Watch (), and Season Spotter () were referenced specifically, even though they all fall under the Zooniverse citizen science community, and therefore we recorded them as separate schemes. By contrast, Torney et al. () referenced only the Zooniverse, and therefore the Zooniverse was also recorded. The search yielded 434 papers (see Supplemental File 1 for full reference list), which drew on 259 citizen science schemes (see Supplemental File 2 for full list of schemes).

The search strategy aimed to encompass a broad range of citizen science programmes, including recording schemes that do not identify as citizen science schemes but do fit the definition of citizen science. It is, of course, likely that some schemes were overlooked by the searches, most notably schemes that have not led to published outputs. The term citizen science has been widely used only in recent decades, although volunteers have been contributing to ecological datasets for centuries (; ; ; ); such volunteer contributions may therefore not be linked to a specific volunteer recording scheme and may not be referenced in the literature. Furthermore, schemes may not provide information on their attributes or verification approach publicly, and therefore would not be included in the results of this literature review. Although these searches did identify some schemes from non-English-speaking communities and regions, the search strategy is inherently biased towards schemes that operated in English (). These biases in the search methodology are, however, unlikely to systematically affect the conclusions of the review.

Identifying verification approaches and citizen science scheme attributes

Verification approaches used by citizen science schemes were not always documented in the paper itself. Therefore, we carried out searches to obtain information on verification approaches and the information used to verify records, as well as citizen science scheme attributes, in both academic and non-academic search engines. We obtained this information from either the published literature in which the scheme featured or the scheme’s public online platform, which may be a website specifically for a scheme, or a web page embedded within a larger website (see Supplemental File 2 for full list of schemes, attributes, and sources).

For each citizen science scheme, we identified the following attributes: number of species recorded through the scheme, number of occurrence records collected through the scheme, data type, number of participants, geographical extent, and duration in years. Data type refers to the amount of information or evidence needed to submit an occurrence record to a scheme. For example, some schemes require photos, recordings, or physical specimens to be submitted before an occurrence can be confirmed. Other schemes allow indirect or direct sightings to be submitted without further evidence. Indirect sightings include observations such as mammal tracks or dung at a given location. Direct sightings refer to a species being observed directly; in this case, the minimum information required for an occurrence to be submitted is species name, location, and date.

Data analysis

We performed simple analyses to investigate two questions. First, we asked what attributes of schemes influence whether we were able to find information on their approaches to verification. Second, using those schemes for which we were able to find information on approaches to verification, we asked which attributes of the schemes influenced the approaches that were used.

Some attribute categories included very few schemes. Therefore, we aggregated some categories for our analysis. Specifically, we classified numbers of participants as either ≤ 1,000 or > 1,000; numbers of records as either ≤ 1 million or > 1 million; and data type as either “No evidence” (for reports of direct or indirect sightings without physical evidence) or “Evidence available” (for those data points associated with specimens, photographs, or recordings).
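
As a minimal illustration of this recoding, the thresholds described above can be applied as in the following R sketch; the data are invented and the column names are hypothetical.

```r
# Invented scheme-level data purely for illustration; column names are hypothetical.
schemes <- data.frame(
  participants = c(250, 15000, 800),
  records      = c(5e3, 2e6, 4e5),
  data_type    = c("direct sighting", "photo", "indirect sighting")
)

# Aggregate attributes into the categories used in the analysis
schemes$participant_class <- ifelse(schemes$participants <= 1000, "<= 1,000", "> 1,000")
schemes$record_class      <- ifelse(schemes$records <= 1e6, "<= 1 million", "> 1 million")
schemes$evidence <- ifelse(schemes$data_type %in% c("specimen", "photo", "recording"),
                           "Evidence available", "No evidence")
```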

To assess whether scheme attributes influence whether or not we were able to find information on verification approaches, we focused on schemes for which all scheme attributes were available. Inevitably, this biased the data towards schemes with more complete and accessible information. However, this was necessary for a complete investigation of which scheme attributes seemed most predictive of whether verification information could be identified, and still resulted in reasonable sample sizes of schemes with and without verification information. Using this focused dataset, we ran a binary logistic regression including the main effects of geographic scale, participant numbers, record numbers, species numbers, data type (all categorical), and scheme duration (continuous). We used the dredge function from package MuMIn () to determine the most informative models nested within this global model.
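
The following sketch illustrates this model-selection step in R, using simulated data and hypothetical column names; it is not the original analysis code.

```r
library(MuMIn)

# Simulated scheme-level data purely for illustration; column names are
# hypothetical stand-ins for the attributes described in the text.
set.seed(1)
n <- 103
schemes <- data.frame(
  verification_found = rbinom(n, 1, 0.7),
  geographic_scale   = sample(c("regional", "national", "continental", "global"), n, TRUE),
  participant_class  = sample(c("<=1,000", ">1,000"), n, TRUE),
  record_class       = sample(c("<=1M", ">1M"), n, TRUE),
  species_class      = sample(c("1-10", "11-100", "101-1,000", ">1,000"), n, TRUE),
  evidence           = sample(c("No evidence", "Evidence available"), n, TRUE),
  duration_years     = runif(n, 1, 60)
)

options(na.action = "na.fail")  # required by dredge()

# Global binary logistic regression: was verification information found?
global_model <- glm(
  verification_found ~ geographic_scale + participant_class + record_class +
    species_class + evidence + duration_years,
  family = binomial, data = schemes
)

model_set  <- dredge(global_model)          # all nested main-effects models, ranked by AICc
best_model <- get.models(model_set, 1)[[1]] # refit the top-ranked model
summary(best_model)
```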

To assess which scheme attributes appear to influence verification approach, we used multinomial regression (function multinom from package nnet; ). Specifically, we modelled the probability that expert, automated, community consensus, or other verification approaches would be used as a function of the same scheme attributes included in the saturated binary logistic regression. Once again, we focused on only those schemes for which all attributes were available. Some schemes used more than one approach, in which case those schemes appeared in our data set once for each approach used. The dredge function was used again for model selection, considering main effects only.
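
A comparable sketch for the multinomial model, again with simulated data and hypothetical names, might look as follows; schemes using several approaches would appear once per approach in the data.

```r
library(nnet)
library(MuMIn)

# Simulated data for illustration only: one row per scheme-approach combination.
set.seed(2)
n <- 88
verif <- data.frame(
  approach          = factor(sample(c("expert", "automated", "community", "other"), n, TRUE)),
  geographic_scale  = sample(c("regional", "national", "continental", "global"), n, TRUE),
  participant_class = sample(c("<=1,000", ">1,000"), n, TRUE),
  record_class      = sample(c("<=1M", ">1M"), n, TRUE),
  species_class     = sample(c("1-10", "11-100", "101-1,000", ">1,000"), n, TRUE),
  evidence          = sample(c("No evidence", "Evidence available"), n, TRUE),
  duration_years    = runif(n, 1, 60)
)

options(na.action = "na.fail")

# Multinomial model of verification approach as a function of scheme attributes
global_multinom <- multinom(
  approach ~ geographic_scale + participant_class + record_class +
    species_class + evidence + duration_years,
  data = verif, trace = FALSE
)

approach_set <- dredge(global_multinom)  # main effects only
head(approach_set)
```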

Results

Summary of citizen science recording schemes

Of the 259 citizen science schemes, the focal taxa were birds (N = 97), invertebrates (N = 67), mammals (N = 24), plants and fungi (N = 17), and amphibians and reptiles (N = 8). In addition, 27 schemes allowed any taxa to be recorded, 9 focused on marine taxa, 6 recorded invasive species, and 4 recorded roadkill. There was substantial variation in the number of species recorded through the schemes. Where this information was available (N = 203), 68 schemes had recorded 1–10 species, 50 schemes had recorded 11–100 species, 59 had recorded 101–1,000 species, 15 had recorded 1,001–10,000 species, and 11 had recorded more than 10,000 species.

Of the schemes for which record number was available (N = 140), 12 schemes had fewer than 1,000 records, 95 schemes had between 1,000 and 1 million records, and 33 had more than 1 million records. The data type submitted with each record varied across schemes: 18 allowed indirect sightings, 165 required direct sightings, 51 required photos, 10 required recordings, and 15 required specimens to be submitted.

To determine the number of citizen scientists involved in each scheme, we included both those who collected data and registered users who may verify data. Of the schemes for which this information was available (N = 165), 76 had between 1 and 1,000 participants, 86 had between 1,000 and 1 million participants, and 3 had more than 1 million participants.

In terms of geographical extent, 17 schemes collected data at a global, cross-continental scale. Across the remaining schemes, 34 operated across multiple countries within the same continent, 125 schemes collected data at a country level, and 83 schemes operated at a regional level (i.e., the level of a region within a country). There were schemes operating on every continent besides Antarctica, with 106 in Europe, 96 in North America, 17 in Oceania, 10 in Asia, 8 in Africa, and 5 in South America.

The schemes we reviewed spanned a wide range of ages. Of schemes where duration was available (N = 225), 90 schemes had been running for less than 10 years, 64 had been running for between 10 and 20 years, 34 had been running between 20 and 30 years, and 37 schemes had been running for longer than 30 years.

Approaches to data verification in citizen science schemes

Across the 259 citizen science schemes, no information was found on verification approach for 117 of the schemes. Within the schemes for which verification information was found, 118 schemes relied on expert verification, 24 verified data through community consensus, and 14 used automated approaches, which encompassed algorithmic approaches without human classification. Several of the schemes used multiple verification approaches, and all of the schemes that used automation to verify data used at least one other method of verification on a subset of the data. Most commonly, automation was used alongside expert verification. Other verification approaches included using existing independent () or expert () datasets to confirm the likely accuracy of citizen-submitted records, and carrying out follow-up surveys in a subset of locations ().

The information used to verify citizen science data refers to the record-level information that is used by citizen science schemes when carrying out data verification of species occurrences. This was categorised as species, environmental context, and recorder expertise. Species information is based on ease of identification (), confusion with other species (), rarity, and co-occurrence with other species. Environmental context takes into account the time, date, and location of the observation and, therefore, whether the species’ occurrence was likely given the time of day, season (), habitat (), documented range of the species (), and phenology (). Recorder attributes of interest include the experience and expertise of the individual submitting the record. This can be considered qualitatively at the time of submission, by asking the recorder to state their confidence in the identification () or their experience with biological recording (). Recorder expertise can also be quantified after record submission, using metrics such as how long the individual has been participating in the scheme, the volume of records submitted, and the accuracy of previously submitted records (; ). Schemes can also use novel approaches to account for recorder expertise. One example is iSpot, in which recorders develop a taxon-specific reputation via points earned once records they have submitted are verified as correct by other participants ().
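
As an illustration of how such post hoc recorder metrics might be computed, the following R sketch uses an invented record table and made-up field names; it is not taken from any particular scheme.

```r
# Invented record-level data purely to illustrate post hoc recorder metrics.
records <- data.frame(
  recorder_id = c("A", "A", "A", "B", "B", "C"),
  date        = as.Date(c("2015-04-02", "2017-06-10", "2020-09-01",
                          "2019-05-20", "2020-07-14", "2021-03-03")),
  verified_ok = c(TRUE, TRUE, FALSE, TRUE, TRUE, NA)  # NA = not yet verified
)

# Per recorder: time participating, volume of records, and accuracy of verified records
recorder_metrics <- do.call(rbind, lapply(split(records, records$recorder_id), function(d) {
  data.frame(
    recorder_id  = d$recorder_id[1],
    years_active = as.numeric(max(d$date) - min(d$date)) / 365.25,
    n_records    = nrow(d),
    prop_correct = mean(d$verified_ok, na.rm = TRUE)
  )
}))
recorder_metrics
```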

Schemes were allocated to one or more of these categories based on information provided by the scheme on its verification approach. For many schemes, these details were not publicly available. Furthermore, individual expert verifiers may take into account all, or a combination, of these factors on a record-by-record basis, using their regional and taxonomic expertise as well as their personal knowledge of individual contributors’ abilities to identify species correctly. Therefore, it is unlikely that we were able to catalogue for our analysis all of the information considered by schemes and verifiers. Across the schemes for which the required information was available, 105 used information on the species itself, 86 considered the environmental context, and 13 used information on recorder expertise. The majority of schemes used species information and environmental information together.

Citizen science scheme attributes and verification approach

We restricted our analysis to 103 schemes with complete information on scheme attributes. As expected, this biased schemes towards those with available verification information (all data: schemes with verification information = 142, schemes without = 117; complete attribute data: schemes with verification information = 73, schemes without = 30; Fisher’s test, p = 0.006). Nevertheless, we were still able to model the propensity for verification information to be found. The best-performing model (based on Akaike information criterion) included data type, number of records, and scheme duration (Figure 1). Only more complex versions of the same model had ΔAICc < 6, and ΔAICc for the null model was > 8.
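
For concreteness, this comparison could be set up as in the sketch below. We are assuming the contingency table contrasts the complete-attribute subset against the full set of schemes; the exact construction used in the analysis may differ.

```r
# Counts taken from the text: schemes with and without verification information,
# in the complete-attribute subset and across all schemes.
# The table construction is our assumption, not necessarily the one used.
tab <- matrix(c(73, 30,      # complete attribute data: with / without
                142, 117),   # all schemes: with / without
              nrow = 2, byrow = TRUE,
              dimnames = list(c("complete attributes", "all schemes"),
                              c("verification found", "not found")))
fisher.test(tab)
```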

Figure 1 

The probability of verification information being found given the numbers of records (left panel, ≤ 1 million [M]; right panel, > 1 million [M]), duration of schemes, and data type. Fitted probabilities (lines) and standard errors (filled polygons) are estimated using the best-performing binary logistic regression model.

Using the 73 schemes for which scheme attributes and verification approach were found, we modelled the factors that best predicted the verification approaches used. Among the schemes we considered, 61 used expert approaches, 7 used automated approaches, 12 used community consensus approaches, and 8 used other approaches. Given the low sample sizes, there was limited evidence of clear predictive effects of scheme attributes. Among the models examined, only those including number of participants, data type, or both, performed better than the null (ΔAICc for the null model was 1.9). Recognising that these are weakly supported effects, we nonetheless note that a model including both number of participants and data type suggests that: (i) automated approaches are used only for schemes with more participants and are slightly more common for schemes without physical evidence; (ii) community consensus approaches are more common for schemes with more participants and for which evidence is available; (iii) expert approaches are more common in schemes with fewer participants, but for schemes with more participants, they are more common when no physical evidence is available; and (iv) other approaches are most common for schemes with a smaller number of participants and for which no tangible evidence is available (Figure 2).

Figure 2 

The probability of each verification approach (see panel headings) being used for schemes with different numbers of participants and different data types. Fitted probabilities (filled columns) are estimated using the best-performing parameters in multinomial regressions.

Discussion

With data quality as a key concern across citizen science datasets, there is a need to ensure the validity of, and increase trust in, these data through verification. This review identifies patterns in approaches to data verification among citizen science schemes. By identifying the range of approaches available and by considering scheme attributes that appear to contribute to choices in verification approach, we demonstrate the options available to both new and existing schemes. Here, we also present an idealised system for data verification, identifying where and how such a system could be implemented within citizen science schemes.

Existing patterns in verification of citizen science data

No information on data verification was found for over 40% of the schemes we reviewed. Our analyses suggest that information on verification was less likely to be found for older schemes, schemes with fewer participants, and schemes that do not require the contribution of physical evidence (specimens, photos, or recordings). Lack of available verification information does not mean that no verification is carried out; for schemes that lack a web presence and do not report verification methods in publications, verification methods are simply not publicly available and are therefore hard to identify. There may, however, be schemes that do not consider verification, trusting the recorders’ abilities to report species correctly (). This may be justifiable if schemes specifically recruit knowledgeable volunteers () or provide training to volunteers before surveying (). Some citizen science schemes focus recording effort on selected days annually (). In these cases, volunteers may be joined and led by an expert () and therefore errors could be identified and corrected, in person, during the data collection. Smaller-scale citizen science schemes may focus on collaborative, community-based approaches with small numbers of participants (). In these cases, there may be an established trust amongst members, or verification may happen more informally between participants. Acknowledging this, there is still an imperative to report on verification methods to increase trust in the dataset and to benefit end users of the data. Arguably, this imperative is even more pronounced for those schemes that do not require physical evidence, for which verification information is currently harder to find. If there is transparency in verification approach, then the data quality can be better understood, and potential error or bias can be quantified and accounted for in analyses of the data ().

Where verification information was available, expert verification was the most common approach. Verification by experts, although not flawless (), has a high accuracy (), and therefore may be a more suitable approach to obtain the level of data quality required for published research outputs (). Furthermore, schemes that monitor rare () or invasive species, for which accuracy of individual records is crucial to guide management actions (; ), require expert verification to pinpoint occurrences and ensure high-quality data. Expert verification can be time consuming for large datasets (; ), and schemes that operate at a large geographic scale rely on extensive networks of taxonomic and regional experts (). A lack of verifiers in certain regions or with particular specialisms can lead to gaps in verified data (). As a result, there can be a significant time lag between submission and verification of records ().

Community consensus was the second most common verification approach. It was more common among schemes with a larger number of participants and for schemes that required evidence to be submitted with each record. Community consensus may be preferable for schemes with sufficient participants, as crowdsourcing the assessment of physical evidence spreads the task of verification across a greater number of individuals, and can be particularly useful when verifying camera trap datasets, which can rapidly grow to very large sizes (; ). Community consensus approaches can also be used alongside automated approaches in a hierarchical verification system (). Once multiple users have classified a record, consensus algorithms can be applied to analyse classifications and to categorise confidence in a record (; ). Community consensus approaches also have the potential to enhance public engagement and community development. Diversifying the tasks in which citizen scientists can be involved can make the scheme more accessible to those who do not have the access or mobility to go to areas where they can record species (). When using community consensus approaches, expert verification may still be required if datasets contain species that are less straightforward to identify, such as commonly-confused species pairs (). This approach relies on a large number of citizen scientists investing time in the scheme (; ), and therefore may not be suitable for schemes with smaller numbers of users. Furthermore, if community consensus approaches are used for schemes that operate on a global scale and record many species, the community may not have the local knowledge required to verify records for species that are less straightforward to identify or are less well known amongst the general public (). As a result, the verified data in these schemes may be skewed toward widely recognised, charismatic species.
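
As a generic illustration of a consensus step (not any particular platform’s algorithm), the following R sketch tallies volunteer classifications per record and flags records whose agreement falls below an arbitrary threshold; the data and threshold are invented.

```r
# Invented classification data: several volunteers label each record.
classifications <- data.frame(
  record_id = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3),
  label     = c("red fox", "red fox", "red fox", "badger",
                "otter", "mink", "otter",
                "roe deer", "roe deer", "roe deer", "roe deer", "roe deer")
)

# For each record, take the most frequent label and the proportion of agreement
consensus <- do.call(rbind, lapply(split(classifications, classifications$record_id), function(d) {
  tallies <- sort(table(d$label), decreasing = TRUE)
  data.frame(
    record_id     = d$record_id[1],
    consensus     = names(tallies)[1],
    agreement     = as.numeric(tallies[1]) / nrow(d),
    n_classifiers = nrow(d)
  )
}))

# Records below the agreement threshold are flagged for further verification
consensus$flag_for_expert <- consensus$agreement < 0.8
consensus
```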

Perhaps unsurprisingly, owing to their recent emergence, automated approaches were not widely used among the subset of citizen science schemes reviewed. Schemes that used automation did so in conjunction with other methods, most frequently expert verification. Automation is typically the first step in the verification process, with records being checked for a range of attributes. These include whether the record falls within the expected geographical and temporal range, whether the species is particularly rare, and, for schemes that ask for the number of individuals recorded, whether that number is unusually high (; ; ; ). Any records that do not meet set criteria are flagged and then sent to expert verifiers (; ; ). Automation reduces the burden on expert verifiers by decreasing the volume of records that require verification. Automated approaches are widely applicable across citizen science schemes and can be applied to records for a huge diversity of taxa (). Automation is the most time-efficient way of verifying citizen science data, allowing data to be reviewed in real time as records are submitted, as well as, potentially, providing citizen scientists with immediate feedback on their submissions (; ; ; ). From the perspective of participant involvement, having rapid feedback on submitted records has the potential to strengthen engagement and to increase motivation to continue recording (). Although automation can reduce the number of records that require expert review, careful consideration of the verification rules is required to reduce the burden on experts without leading to classification errors ().
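
The sketch below illustrates rule-based checks of this kind; the reference ranges and thresholds are invented, whereas a real scheme would derive them from historical records and expert knowledge.

```r
# Invented reference information for two species: expected months,
# expected latitude range, and a plausible maximum count per record.
reference <- data.frame(
  species   = c("swallow", "swift"),
  month_min = c(3, 4), month_max = c(10, 9),
  lat_min   = c(35, 36), lat_max = c(60, 62),
  max_count = c(500, 300)
)

# Invented submissions to be screened on arrival.
submissions <- data.frame(
  species = c("swallow", "swift", "swallow"),
  month   = c(6, 12, 4),
  lat     = c(52.3, 51.1, 70.2),
  count   = c(12, 3, 20)
)

screened <- merge(submissions, reference, by = "species")
screened$flag <- with(screened,
  month < month_min | month > month_max |  # outside expected season
  lat   < lat_min   | lat   > lat_max   |  # outside expected geographical range
  count > max_count)                        # unusually high count
screened[, c("species", "month", "lat", "count", "flag")]  # flagged records go to experts
```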

With the distributions and abundances of many species changing rapidly in response to persistent anthropogenic environmental change, timely and accurate verification is important to ensure the availability of up-to-date biodiversity information (). Verification by experts has perhaps been the default approach for citizen science schemes in the past (; ). With the growing volume of citizen science data that has been and will continue to be collected, there is an argument for schemes to explore and implement other verification approaches that allow large quantities of data to be verified more efficiently. The most appropriate verification approach may vary from scheme to scheme, and research may be required to assess the risks or rewards of alternative approaches. Expert verification is likely always to be required for a subset of the data, but given the emergence of community consensus and automated verification in recent decades (), these approaches should be carefully considered for schemes moving forward. As the position of citizen science in ecological research evolves, with new schemes continually being established, verification approaches must evolve to suit the needs of schemes whilst also ensuring data quality and accuracy of records.

Recommendations for verification of citizen science data

Our review highlights the range of verification methods used by different citizen science schemes. In some cases, this variation might reflect deliberate and informed choices based on what works best given the attributes of different schemes. In others, it is likely that choices reflect historical contingency, or cost and ease of implementation. Some schemes may be limited to a certain approach due to available resources, time, or personnel. Others may feel bound to a verification approach in order to maintain consistency over time. In those situations, retrospective application of new methods, or calibration by running two systems in tandem, might provide reassurance to enable the implementation of new approaches.

Whilst a range of factors may influence the choice of, or lack of, verification approach, transparent documentation of verification approaches is required to increase confidence in citizen science as a means of collecting reliable data. Therefore, we recommend that citizen science schemes publicly report their verification approach. Schemes that lack a platform on which this information can be made readily available should ensure that published research clearly identifies whether and how the data were verified.

An idealised system for verification

Considering the options available for verification and the attributes that may contribute to the choice of verification approach, we have outlined a hierarchical system for verification (summarised in Figure 3). This approach considers the data that can be used to verify records, where automated and community consensus approaches can be implemented, and when expert verification may still be required.

Figure 3 

Summary of recommendations for an idealised system for verification of ecological citizen science data. Considerations for verification highlight some of the questions that can be answered using the record-level information and secondary metadata. If the answer to any of these questions is yes, we propose that further levels of verification may be required. First-level verification indicates the attributes of schemes that could use community consensus and automated approaches. Additional verification highlights the kinds of records that may be flagged and therefore will need to be reviewed by experts.

When verifying records, schemes should consider the breadth of information available to improve verification, making use of all of the data that accompany each record (see Figure 3). Ideally, recorders should submit the maximum available evidence with each record, such as photos or recordings, assuming the user interface through which volunteers submit records is fit for purpose. Submitting photos or other evidence may not be possible for every scheme, particularly those centred on annual count events, such as the Batumi Raptor Count () or Christmas Bird Count (), where large numbers of species are recorded during a constrained period. Furthermore, requiring more information to be submitted with every species record may discourage volunteers from taking part, creating a trade-off between data completeness and data volume. For many schemes, the minimum amount of information required is date, location, and species name. Other schemes, particularly those recording mammal species, which are often less abundant, frequently nocturnal, and less likely to be observed directly, allow indirect sightings to be submitted. Verification approaches need to be developed and applied in view of the minimum amount of information that typically comes with each record. Even with the limited record-level information that may accompany each record, verification approaches can still take into account information on the species, the environmental context, and the recorder (see Figure 3). This can be done through input from expert verifiers, or by using secondary metadata, such as historical data recorded through the scheme or external datasets, which can be cross-referenced against each record (). If schemes have large volumes of data across many species, and records with varying amounts of information, a hierarchy of approaches could be implemented. This allows the bulk of records to be verified by automated and community approaches, with flagged records then undergoing additional levels of verification (see Figure 3).
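
To make the hierarchy concrete, a schematic sketch of how records might be routed through the levels summarised in Figure 3 is given below; the checks and threshold are placeholders for whatever automated rules, consensus criteria, and expert review a scheme actually uses.

```r
# Schematic routing of a record through the hierarchy in Figure 3.
# All arguments are placeholders; real schemes would compute them from
# record-level information, secondary metadata, and volunteer input.
route_record <- function(record_id,
                         passes_automated_checks,
                         has_evidence,
                         community_agreement = NA) {
  if (!passes_automated_checks) {
    return(paste(record_id, "-> flagged for expert verification"))
  }
  if (has_evidence && !is.na(community_agreement)) {
    if (community_agreement >= 0.8) {
      return(paste(record_id, "-> verified by community consensus"))
    }
    return(paste(record_id, "-> flagged for expert verification"))
  }
  paste(record_id, "-> verified by automated checks")
}

route_record("rec001", passes_automated_checks = TRUE,  has_evidence = TRUE, community_agreement = 0.9)
route_record("rec002", passes_automated_checks = FALSE, has_evidence = TRUE, community_agreement = 0.9)
route_record("rec003", passes_automated_checks = TRUE,  has_evidence = FALSE)
```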

Automated verification approaches are flexible and, where resources for implementation permit, could be used more widely across citizen science schemes to verify large quantities of data efficiently. Automation can be implemented within schemes that already have large quantities of historic data, as these can be used to inform algorithms and develop filters for the datasets (). To account for verification metrics for the species, environmental context, and recorder expertise (see Figure 3), automated approaches can incorporate record-level information and secondary metadata (), as well as expert knowledge (). For automated approaches to account for environmental factors, location, date, and time are required, as well as prior knowledge of the species’ geographical and temporal range (). Using contextual information is most useful for schemes that focus on monitoring species’ phenology, or when no photos or recordings are submitted with a record. However, it carries the risk that sightings could be rejected if the species displays novel activity patterns or range shifts. To account for recorder expertise, individual recorders require a unique ID. It is important to consider that, as individuals submit more records, their accuracy in identifying species may improve. When accounting for environmental context or recorder expertise in automated verification approaches, it is essential to retain flexibility, with rules being dynamically updated as unexpected sightings accumulate or as recorder expertise improves.
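
One way of keeping such rules flexible is to recompute the expected window from recently verified records at intervals, so that the rules track genuine range or phenology shifts. The following sketch illustrates the idea with invented data and an arbitrary lookback period and quantile choice.

```r
# Invented verified records for one species across years.
verified <- data.frame(
  year = c(rep(2015, 4), rep(2020, 4)),
  doy  = c(95, 100, 110, 120,  80, 85, 92, 115)  # day of year of each verified record
)

# Recompute an expected seasonal window from recently verified records,
# using (for illustration) the 5th and 95th percentiles of the last 5 years.
update_window <- function(verified, current_year, lookback = 5, probs = c(0.05, 0.95)) {
  recent <- verified[verified$year > current_year - lookback &
                     verified$year <= current_year, ]
  quantile(recent$doy, probs = probs, names = FALSE)
}

update_window(verified, current_year = 2016)  # window based on 2012-2016 records
update_window(verified, current_year = 2021)  # window shifts as new records are verified
```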

Another approach that can be used as the first level of verification is community consensus (see Figure 3). This approach is less widely applicable than automated verification and typically requires an online platform that connects recorders and verifiers, and large enough numbers of volunteers to verify the volume of records (; ; ; ). Community consensus approaches are more suitable for species that are more widely recognised by the public and where there is photographic evidence with each record (), as this means that the record can be verified based on visual attributes of the species, and no prior knowledge of the environmental context is required.

If automated and community approaches cannot verify records with an appropriate level of certainty, experts can provide additional levels of verification (see Figure 3). It is important, therefore, for schemes to decide on their required level of certainty, which may vary depending on the species and the purpose for which the data will be used. For most schemes, a proportion of the data will ultimately need to be referred to experts for verification. A key aim of automated approaches is to minimise the proportion of the data that require expert verification. This additional verification is likely to be required for species that have not been recorded before through the scheme, for rarer species, for invasive species for which pinpointing the exact location of individuals is necessary (), and for species that are recorded beyond their typical range or habitat. If a scheme is focusing exclusively on these kinds of species, expert verification may be the most appropriate approach. Expert insight can also be used to inform automated verification approaches, by providing information on the species and environmental context that can be accounted for in data filters. Furthermore, if a scheme is considering recorder expertise when verifying data, expert insight could also be beneficial to identify trusted recorders, allowing their submissions to be used in place of a gold standard when verifying and analysing data.

Conclusions

We reviewed approaches to data verification across ecological citizen science datasets, and assessed factors that appear to influence the choice of verification approach. Alongside this, we highlighted that the verification approaches of many citizen science schemes are not readily available to the public. We recommend how citizen science schemes can approach verification and make appropriate choices to ensure data quality. Citizen science plays an important role in data collection at a geographical and temporal scale unmatched by other data collection methods, and is a valuable means of engaging the public in scientific endeavours. By developing improved verification approaches and using the full range of information available, issues of data quality within citizen science datasets can be addressed, thereby increasing trust in citizen science approaches and strengthening the place of citizen science within ecological research.

Supplementary Files

The supplementary files for this article can be found as follows:

Supplemental File 1

Literature Search References. DOI: https://doi.org/10.5334/cstp.351.s1

Supplemental File 2

Citizen Science Scheme Data. DOI: https://doi.org/10.5334/cstp.351.s2