Introduction

Citizen science is a powerful tool for garnering interest in science from the nonscientific community, as well as for allowing researchers to collect data at greater volumes and at a larger scale than would be feasible with a more limited number of professional scientists (; ; ). Even without any formal scientific background, citizen scientists have contributed to ecological research by successfully identifying millions of camera trap images (), by quantifying species diversity (), and by contributing to global biodiversity datasets such as eBird () and iNaturalist (). However, large-scale citizen science projects and incorporation of these datasets into research are not as common as they could be because many researchers are skeptical of the accuracy of data produced by non-experts (; ; ). In fact, there can be considerable variation in accuracy of data among citizen scientists and even among expert scientists (; ; ). Identifying the factors that contribute to the consistent collection of highly accurate data by volunteers is necessary to help researchers design better volunteer training and data collection protocols, thus making citizen science more useful for biological and other scientific research.

Of the few studies that have investigated the accuracy of citizen science data, most have focused on how accuracy varies with the amount of training volunteers received, suggesting that training improves accuracy (; ; ; ; ). Other studies have assessed accuracy as it relates to the difficulty of the task, with greater accuracy associated with easier tasks (e.g., identifying familiar rather than rare species) (; ; ; ; ; ), increased experience performing a task (; ; ), and increased background experience in the related scientific field (). The majority of these studies have focused on longer-term training programs (e.g., 2–3 days of training before beginning a project or a lifetime of birding experience). Unfortunately, longer-term training can be a significant commitment for volunteers, deterring citizen scientists from participating in research and, consequently, discouraging researchers from attempting to attract volunteers. As a result, some researchers have begun to use just-in-time training (JITT) for their projects (; ; ), training volunteers on the spot, that is, in conjunction with the research they are performing (). This approach provides the necessary resources for participants to use at their discretion. The term JITT originated within industry and manufacturing as a way to provide on-the-job training by making resources available to employees as needed (). The same approach can be applied to other tasks, such as subject identification. Studies that have assessed subject identification accuracy in the absence of any training have found accuracy to be low or inconsistent across individuals, both for experts and non-experts (; ). These studies demonstrate the need for tools and training to assist citizen scientists performing identification tasks. However, it remains unclear whether JITT is sufficient for training citizen scientists.

Although there are multiple applications for JITT for citizen scientists completing subject identification tasks, this training may be particularly useful in the analysis of camera trap images. Camera traps, also known as trail cameras, are motion-sensitive cameras used to take photos of wildlife. These cameras are helpful tools for researchers who are monitoring wildlife in areas where they do not want to interfere with the animals; this is especially useful when dealing with elusive creatures (). However, camera traps can produce hundreds or even millions of images (), making it difficult to process the resulting data. Citizen scientists can assist by identifying organisms in photos through what is referred to as human computation, in which humans carry out tasks that computers are not yet able to perform, thereby allowing researchers to more rapidly process the datasets (). Online platforms, such as Zooniverse (www.zooniverse.org), provide a point of access for citizen scientists to find and participate in research. In a recent study, citizen scientists on Zooniverse contributed to identifying more than 1.5 million photos of wildlife in Tanzania (). Equally important, camera trap datasets provide a great opportunity for volunteers to get involved in a citizen science project because they can participate in research wherever and whenever they have internet access. JITTs have been used with camera trap identification projects (e.g., Snapshot Serengeti; ), but the impact of these trainings on the accuracy of the data collected has not been explored.

We developed an experiment comparing the impact of online JITT on the data accuracy of citizen scientists with varying levels of biology experience, using groups that received no online JITT as a baseline. Participants were asked to identify wildlife photos from camera traps set on an urban college campus. We hypothesized that if JITT improves accuracy, then citizen scientists with little to no background in biology who receive training will identify wildlife as accurately as participants with a more extensive background in biology. Alternatively, if training does not improve accuracy, then we expected volunteers with a background in biology to maintain significantly higher accuracy than volunteers without one, even when the latter received training. Further, we explored the different ways in which accuracy was affected, comparing how frequently participants selected the wrong species, did not spot the organism in the photo, or could not decide which species was present. Finally, we assessed differences in accuracy across the different species observed at our study site. Here, by quantifying the accuracy of identifications of wildlife images from camera traps, we investigate the impact that JITT has on the quality of data collected by citizen scientists.

Methods

To test our hypothesis, we collected photos of wildlife from camera traps set up on the Occidental College campus in Los Angeles, California. Using the Zooniverse platform, participants viewed and identified the species appearing in each photo. We grouped participants based on their biology experience and whether they received training, and then assessed the accuracy of their identifications.

Camera trap photos

Reconyx HC500 camera traps (Reconyx, Holmen, WI) were used to capture the wildlife photos. The cameras were set to high-sensitivity motion activation and were adjusted to capture either 5 or 10 pictures after motion was detected. All the camera traps were secured to trees on the Occidental College campus, approximately 1 ft (30.48 cm) off the ground. There were three camera stations: the Station 1 camera was set up on February 9, 2017; the Station 2 camera on March 28, 2017 (this camera was removed on May 13, 2017 because arborist work was blocking it); and the Station 3 camera on June 5, 2017. The cameras were checked once per week. We went through each photo, removed the images with people, identified the wildlife species present, and then haphazardly selected 966 photos to upload to Zooniverse. Though photos were taken by the camera traps in bursts of either 5 or 10, they were displayed to participants individually rather than as a consecutive series. Approximately 89% of the selected photos had an organism visible, while the remainder contained no wildlife.

Participants

The experiment ran from June through July 2017, in March and October 2018, and from April through June 2019 to increase our sample size. To attract participants, we advertised through email, on social media, and on SurveyCircle (www.surveycircle.com), a site specifically designed for recruiting survey participants. Participants had a chance to win an Amazon gift card, either for making the most identifications on Zooniverse or through a raffle. Volunteers were required to specify whether they had no background in biology, some background (e.g., some high school or college biology), or an extensive background (a degree and/or career in biology), in which case they were considered professional biologists.

Accuracy experiment

To quantify the accuracy of photo identification by citizen scientists with varying backgrounds in biology, we either provided or did not provide JITT during the identification process. We used the citizen science website Zooniverse to create two separate conditions under which participants would identify images: the JITT treatment, which offered resources to the volunteers, and the control (no JITT), which did not. Participants were randomly assigned to one of these conditions and were required to classify a minimum of 5 images. In both treatments, participants were asked to determine whether an animal was present, to identify the species, and to indicate the total number of individuals visible in the photo. The species options included: bird, bobcat, cat, coyote, dog, mouse, possum, raccoon, rat, skunk, and fox squirrel. The remaining options were “Other,” “Don’t Know,” and “Nothing Here.” The identification process was repeated as many times as the participant desired.

In the “No JITT” control treatment, participants were directed to a Zooniverse interface in which a camera trap image appeared along with a multiple-choice list of the possible species (see the previous paragraph). However, participants received no images, descriptions, or other resources to help them identify the image (Figure 1i). In contrast, participants who received the “JITT” treatment were presented with a different Zooniverse interface that provided images, descriptions, and additional identification resources for all of the potential species they might be asked to identify (Figure 1ii). On this interface, participants were first presented with instructions on how to use the interface and the resources available to them. For each photo needing identification, each of the possible species on the multiple-choice list was accompanied by a small thumbnail image. After selecting an animal, participants were shown, via pop-ups, 2–3 additional example images and a short description before being asked to verify their choice. The pop-up images were from the same camera, location, and time period as the photo being identified to ensure that examples were similar but not identical. In addition, a filter was available to narrow the potential options based on shape, color, and pattern; if a participant was unsure about an animal, they had the option to use this on-demand resource. For instance, the “Like” category displayed multiple silhouettes of wildlife, all with varying morphologies. The participant could select the morphology that they believed most accurately represented the animal in the image, and the choices would be narrowed to the animals that fit into that morphological category. The same system was offered for the animal’s coat pattern and color, though the “Color” tab was relevant only for photos taken in daylight. Multiple tabs could be used at once, allowing participants to narrow their choices based on multiple factors. While the filter was not required, the example images and descriptions were presented for every photo being identified.

Figure 1 

Zooniverse treatments for identifying wildlife images from camera traps. (i) The “No JITT” treatment includes the choices available on the right for identifying the animal, but no further assistance is provided. (ii) The “JITT” treatment includes tutorials to assist the user in identifications. Shown is what the participant would see if they selected the “Like” button, which displays the morphology choices. The “Color” and “Pattern” filters are also available to the participant with the “JITT” treatment, displaying the animals’ possible colors and coat patterns, respectively. In addition to these three categories, each animal choice has a photo associated with it, as well as a short description once that animal is selected.

Analyses

A JSON parsing R script (provided by Alexandra Swanson) was used to compile the raw data from Zooniverse and extract the participants’ identifications. The participants’ responses to the survey and their classifications from Zooniverse were then combined. Only participants who completed both the survey and 5 or more identifications were included in the analysis. For each identification by each participant, we calculated accuracy by comparing the participant’s identification to our official identification. To increase confidence in the accuracy of the official identifications, each identification was determined using the photos in bursts of 5 to verify the observation and was corroborated by each of the three authors prior to uploading the images to Zooniverse (). When calculating participant accuracy, “Don’t Know” and “Other” responses were categorized as incorrect.
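For illustration, the following R sketch shows roughly how the classifications, survey responses, and official identifications could be merged and per-participant accuracy calculated. The file names, column names, and join keys are hypothetical; the actual compilation used the JSON parsing script noted above.

```r
library(dplyr)

# Hypothetical inputs:
#   classifications.csv - one row per identification (participant_id, photo_id, answer)
#   survey.csv          - one row per participant (participant_id, background, treatment)
#   official_ids.csv    - one row per photo (photo_id, official_answer)
classifications <- read.csv("classifications.csv", stringsAsFactors = FALSE)
survey          <- read.csv("survey.csv", stringsAsFactors = FALSE)
official        <- read.csv("official_ids.csv", stringsAsFactors = FALSE)

accuracy_by_participant <- classifications %>%
  inner_join(official, by = "photo_id") %>%
  # "Don't Know" and "Other" responses are scored as incorrect
  mutate(correct = answer == official_answer &
                   !answer %in% c("Don't Know", "Other")) %>%
  group_by(participant_id) %>%
  summarise(n_ids = n(), accuracy = mean(correct)) %>%
  # keep only participants who completed the survey and made >= 5 identifications
  inner_join(survey, by = "participant_id") %>%
  filter(n_ids >= 5)
```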

To evaluate whether there were significant differences in mean accuracy among the treatment groups, we used an ANOVA. The proportion of correctly identified images for each participant was the response variable, and biology background (none, some, and professional biologist), training treatment, and the interaction between biology background and training treatment were the explanatory variables. A Levene’s test was used to assess equality of variances among treatment groups prior to the ANOVA (Test Statistic = 2.63, p = 0.03). Because the Levene’s test indicated unequal variances, we conducted an arcsine square root transformation on the proportion of correctly identified images per participant, which resulted in equal variances among treatment groups (Test Statistic = 1.22, p = 0.31). Because the ANOVA results were consistent with and without the transformation, we report results from the untransformed data for ease of interpretation. Finally, a Tukey-Kramer test was conducted to determine which treatment groups differed significantly in mean accuracy while accounting for multiple comparisons.
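A minimal sketch of this analysis in R, assuming the per-participant data frame from the sketch above with columns accuracy, background, and treatment (names are illustrative). Levene’s test is taken here from the car package, and the Tukey-Kramer comparisons from TukeyHSD() applied to the fitted model.

```r
library(car)  # provides leveneTest()

dat <- accuracy_by_participant
dat$background <- factor(dat$background)  # assumed levels: None / Some / Biologist
dat$treatment  <- factor(dat$treatment)   # assumed levels: JITT / No JITT

# equality of variances among the six background x treatment groups
leveneTest(accuracy ~ background * treatment, data = dat)

# arcsine square root transformation of the proportion correct
dat$acc_asin <- asin(sqrt(dat$accuracy))
leveneTest(acc_asin ~ background * treatment, data = dat)

# two-way ANOVA with interaction (results reported on the untransformed proportions)
fit <- aov(accuracy ~ background * treatment, data = dat)
summary(fit)

# Tukey-Kramer pairwise comparisons among treatment groups
TukeyHSD(fit)
```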

Because there are multiple ways in which an identification could be incorrect, we also assessed differences in incorrect answers across the treatment groups. The three types of incorrect responses were 1) “Don’t Know,” selected when participants were not confident in the animal’s identity or in whether an animal was present; 2) “Nothing Here,” selected even though an animal was present; and 3) the wrong animal, selected by choosing either the incorrect species or “Other.” To determine which of these options was responsible for a difference in accuracy (e.g., whether participants were selecting “Don’t Know” less frequently or identifying the correct species more often), we compared the percentages of incorrect identifications in each of these categories out of all identifications for participants from each background, with and without training. Finally, we calculated the proportion of correct and incorrect identifications for each image category to assess which species and which photo types were most frequently identified incorrectly. All data analyses were conducted in the R programming language (R Core Team, 2017).
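Continuing the sketch above, the per-identification outcomes could be categorized and summarized roughly as follows (column names remain hypothetical):

```r
scored <- classifications %>%
  inner_join(official, by = "photo_id") %>%
  inner_join(survey, by = "participant_id") %>%
  mutate(outcome = case_when(
    answer == official_answer ~ "Correct",
    answer == "Don't Know"    ~ "Don't Know",
    answer == "Nothing Here"  ~ "Nothing Here",  # an animal was present but missed
    TRUE                      ~ "Wrong Species"  # incorrect species or "Other"
  ))

# percentage of all identifications falling into each outcome,
# for each biology background and training treatment
outcome_summary <- scored %>%
  count(background, treatment, outcome) %>%
  group_by(background, treatment) %>%
  mutate(percent = 100 * n / sum(n)) %>%
  ungroup()
```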

Results

Participants

A total of 94 participants volunteered for the study (23 had no biology background, 37 had some background, and 34 had at least a degree and/or profession in biology). Three participants (one with some biology background and two professional biologists) were excluded from the analysis because they did not meet the minimum requirement of five image identifications, resulting in 91 participants. A total of 3,164 classifications were made; the number of identifications made by each participant ranged from 5 to 451, with an average of 35 identifications per participant.

Accuracy experiment

Accuracy of identifications was associated with both the background of the participants and whether they received the training treatment, with a significant interaction between background and training treatment (Background: F-ratio = 5.76, df = 2, p = 0.0045; Treatment: F-ratio = 16.87, df = 1, p = 9.00e-5; Interaction: F-ratio = 7.61, df = 2, p = 0.00091) (Table 1; Figure 2). When the participants did not receive any training, the volunteers with biology backgrounds identified with higher accuracy than the volunteers with no background in biology (no background: mean = 51.8%, SE = 6.0%; some background: mean = 74.7%, SE = 2.6%; professional biologist: mean = 77.6%, SE = 2.1%). However, when training was provided, the disparity between volunteers with and without biology backgrounds dissipated and they achieved similar levels of accuracy (no background: mean = 81.9%, SE = 3.6%; some background: mean = 76.3%, SE = 3.2%; professional biologist: mean = 85.1%, SE = 2.5%). As such, only the group of participants with no biology background and no training had significantly lower mean accuracy than the remaining groups (Tukey: p ≤ 0.01) (Table 2). The remaining five groups did not differ significantly in mean accuracy from one another (Tukey: p > 0.05) (Table 2). We therefore reject our null hypothesis that JITT is not associated with increased accuracy.

Figure 2 

Accuracy of identifications (proportion of photos correctly identified) based on biology background of participants and training received. This boxplot displays the median and interquartile range for each category of biology background and treatment type (n = 91). Volunteers with no biology background were able to provide identifications that were as accurate as volunteers with biology backgrounds when training was provided but were less accurate when no training was provided (ANOVA Background by Treatment Interaction: F-ratio = 7.61, df = 2, p = 0.00091). Letters denote significance (Tukey-Kramer: p ≤ 0.01).

Table 1

ANOVA results comparing mean accuracy of photo identifications across participants based on biology background of participants and amount of training received.

Term                 DF  SS    F-ratio  P-value   η2    95%CI Lwr  95%CI Upr
Background           2   0.29  5.76     0.0045    0.09  –0.03      0.25
Training             1   0.43  16.87    9.00E-05  0.13  0.02       0.24
Background*Training  2   0.39  7.61     0.00091   0.12  0.01       0.24
Residuals            85  2.15

Participants self-identified their background in biology as either “No Background,” “Some Background,” or “Professional Biologist.” Participants received either the treatment with no training or were provided with just-in-time training (JITT). Significant values are italicized. For each term in the model, the following are reported: degrees of freedom (DF), sum of squares (SS), F-ratio, P-value, eta-squared (η2), and the lower and upper 95% confidence intervals (CI).

Table 2

Tukey-Kramer Honestly Significant Difference results comparing differences in mean accuracy between treatment groups.

Comparison                        Difference  95%CI Lwr  95%CI Upr  Adj P-value
Some–None                         0.09        0.01       0.18       0.021519
Biologist–None                    0.14        0.05       0.22       0.000558
Biologist–Some                    0.04        –0.03      0.12       0.344381
JITT–No JITT                      0.11        0.05       0.16       0.00019
Some*No JITT–None*No JITT         0.23        0.09       0.37       0.000144
Biologist*No JITT–None*No JITT    0.26        0.12       0.39       4.00E-06
None*JITT–None*No JITT            0.30        0.14       0.46       4.00E-06
Some*JITT–None*No JITT            0.25        0.11       0.38       2.40E-05
Biologist*JITT–None*No JITT       0.33        0.17       0.49       1.00E-06
Biologist*No JITT–Some*No JITT    0.03        –0.09      0.15       0.980451
None*JITT–Some*No JITT            0.07        –0.07      0.22       0.702743
Some*JITT–Some*No JITT            0.02        –0.11      0.14       0.998839
Biologist*JITT–Some*No JITT       0.10        –0.05      0.25       0.339966
None*JITT–Biologist*No JITT       0.04        –0.10      0.18       0.948098
Some*JITT–Biologist*No JITT       –0.01       –0.13      0.11       0.999552
Biologist*JITT–Biologist*No JITT  0.07        –0.07      0.22       0.658372
Some*JITT–None*JITT               –0.06       –0.20      0.09       0.867683
Biologist*JITT–None*JITT          0.03        –0.13      0.20       0.992882
Biologist*JITT–Some*JITT          0.09        –0.06      0.23       0.516319

Participants self-identified their background in biology as either “No Background,” “Some Background,” or “Professional Biologist.” For each comparison, mean difference is shown with the lower and upper 95% confidence intervals (CI) and the adjusted P-value. Significant values are italicized.

Incorrect identification responses

For participants with no background in biology, the proportions of incorrect identifications due to selecting the wrong species or “Don’t Know” were significantly lower for those with JITT than for those without (χ2 = 173.42, df = 3, p < 2.2e-16). In contrast, the proportion of incorrect observations marked “Nothing Here” remained constant (χ2 = 0.21, df = 1, p = 0.64). This latter pattern held regardless of background or training, with the number of incorrect observations marked as “Nothing Here” showing no significant difference across any of the treatment groups (χ2 = 6.81, df = 5, p = 0.24; Figure 3). When considering individual species, possums were misidentified most frequently (39.6% accuracy overall; Figure 4), whereas dogs were misidentified least frequently (97.1% accuracy; Figure 4).
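Comparisons of this kind can be run as chi-squared tests on contingency tables of outcome counts. The sketch below continues from the hypothetical scored data frame in the Methods; the exact contingency tables behind the reported statistics are assumed for illustration, not taken from the original analysis.

```r
# Outcome frequencies for participants with no biology background, JITT vs. no JITT
no_bg <- subset(scored, background == "None")
chisq.test(table(no_bg$treatment, no_bg$outcome))  # 2 x 4 table

# "Nothing Here" errors alone for the no-background participants, JITT vs. no JITT
nh <- subset(no_bg, outcome %in% c("Correct", "Nothing Here"))
chisq.test(table(nh$treatment, nh$outcome))        # 2 x 2 table

# "Nothing Here" errors across all six background x treatment groups
nh_all <- subset(scored, outcome %in% c("Correct", "Nothing Here"))
chisq.test(table(interaction(nh_all$background, nh_all$treatment), nh_all$outcome))
```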

Figure 3 

Proportion of correct and incorrect identifications for participants with and without training. Incorrect identifications were split into three categories: 1) “Don’t Know” was assigned to pictures identified as having an organism but the participant was unsure of the species, 2) “Nothing Here” was assigned to pictures identified as having no organisms in them when in fact there were organisms present, and 3) “Wrong Species” was assigned to pictures identified with the wrong species. Results are shown for participants with varying backgrounds in biology (none, some, and professional) and for both treatments (just-in-time training [JITT] and no training).

Figure 4 

Proportion of correct and incorrect identifications of wildlife photos for each species for participants with and without training. For each of the official identification categories, the proportion of correct and incorrect identifications are shown. Incorrect identifications were split into three categories: 1) “Don’t Know” was assigned to pictures identified as having an organism but the participant was unsure of the species; 2) “Nothing Here” was assigned to pictures identified as having no organisms in them when in fact there were organisms present; and 3) “Wrong Species” was assigned to pictures identified with the wrong species.

Discussion

Citizen scientists are able to contribute large quantities of data to important biological research (; ; ), but these contributions depend on the accuracy of the data produced by the volunteers (). Using camera trap images, we assessed the accuracy of wildlife identifications made by volunteers with little to no biology background versus volunteers with professional biology backgrounds, with or without the added assistance of JITT. Our results demonstrate that when provided with relatively modest training materials, volunteers with no biology background can improve the accuracy of their identifications (Table 1; Figure 2). Based on these results, we conclude that citizen scientists can produce accurate data for scientific research when provided with JITT materials.

Our results complement previous studies demonstrating that thorough training of volunteers improves accuracy, from tree identification () to visual surveys of fishes (). Not only did the training improve the wildlife photo identification accuracy of citizen scientists with no biology background, but the accuracy of trained citizen scientists in this study was also on par with that of previous studies involving more intensive training programs (the range of mean accuracy of species identification in our study was 76.3–85.1%, compared with 70–95% accuracy in other studies [; ]). Importantly, this finding demonstrates that minimal training, such as JITT, not only improves identification accuracy but can do so as effectively as other training methods, including longer-term training. Our results add to the existing research by indicating that minimal training can offer large dividends in improving identification accuracy for volunteers with limited backgrounds in biology. Thus, extensive training may not be necessary for some types of studies, particularly those involving subject identification tasks.

Improvements in accuracy, however, may also vary with the difficulty of the task (; ). Even with training, both just-in-time and long-term, participant accuracy in more difficult tasks may not reach the level required for inclusion in professional research, especially for short-term volunteers. However, for projects that rely on volunteers who are engaged for short periods or at irregular intervals, training conducted in parallel with task completion may be the only feasible option. For these reasons, data quality controls, such as obtaining multiple identifications for each photo, expert validation, and the use of standardized equipment, should continue to be applied to account for inaccuracy in data collected by citizen scientists and to further improve data quality (). In addition, assessing the factors that may influence the accuracy of data collected by citizen scientists (e.g., age, level of education) and applying eligibility requirements can help ensure that researchers understand the reasons for potential variation in accuracy and that citizen scientist participants are able to provide sufficiently accurate data ().

JITT may increase accuracy because participants misidentify fewer organisms overall, because participants become more confident in their responses and select “Don’t Know” less frequently, or because participants less frequently select “Nothing Here” since they are better equipped to discern organisms in the photos. Our results suggest that the observed post-training increase in accuracy of participants with no background in biology is due to a combination of the first two mechanisms (Figure 3). Participants with no background in biology who received JITT did not show any improvement in accurately identifying photos of hard-to-notice organisms; the proportion of inaccurate selection of “Nothing Here” remained consistent across participants with no background in biology with or without JITT. This was also true for participants with a background in biology (Figure 3). In fact, despite the additional training resource, mean accuracy of participants with professional biology backgrounds did not exceed 85%, and the majority of incorrect observations were photos that were incorrectly identified as “Nothing Here.” This implies that additional factors, such as photo quality, may be influencing identification accuracy. After reviewing photos with incorrect identifications, we noted that many of these photos were in fact low quality with difficult-to-distinguish wildlife (e.g., a photo in which an animal is moving out of the field of view). Further, species that are more difficult to distinguish (e.g., fox squirrels) were more likely to be incorrectly identified with the “Nothing Here” option than species that stand out clearly (e.g., dogs; Figure 4). Perhaps additional training materials to help participants identify wildlife from non-ideal images would likewise help with improving accuracy overall. Future research should investigate how to improve accuracy for more difficult images.

While our results demonstrate an added benefit of JITT for subject identification tasks, the scale of this benefit may differ depending on the type and difficulty of the tasks required for various citizen science projects (). For example, while text and image training resources closed the gap in mean accuracy of wildlife photo identifications for participants in this study, advanced participants still outperformed novice participants when identifying invasive plants with the assistance of text and image training materials (). In this case, plant identification may present a greater degree of difficulty than the wildlife identification in our study. As a result, it may be ideal for those managing citizen science projects to test different methods of training to determine the best approach for the specific tasks involved in each study (). Our results suggest that including an analysis of the effects of training methods on data accuracy in citizen science-based research can help assure data quality. Including these quality control data in studies that involve citizen science data may thus improve perceptions of citizen science-based research. Further, this approach can help shed light on which aspects of research citizen scientists can be most helpful with and which aspects may be best left to experts (e.g., ). Future research should therefore focus on determining the success of JITTs for other types of citizen science tasks and at various levels of difficulty for each task.

Although the data suggest a significant boost to citizen science photo-identification accuracy with minimal training, there are some caveats to consider. One of the primary limitations was that participants self-identified their level of biology experience. Participants were not asked to specify the field of their biology background, though that may have been important in assessing their qualifications (for instance, a background in organismal biology would likely have been more useful in this study than a background in botany or biochemistry). In fact, there was extensive variation in accuracy scores even among participants self-identifying as professional biologists. Despite this limitation, we still see a significant improvement in scores for participants with no biology background who received training. Future research may be needed to determine how a general background in biology translates to accuracy in citizen science projects. Another possible caveat is that the number of images that each participant identified in Zooniverse varied (e.g., one participant identified 451 photos, whereas another identified 5), which could bias the data if accuracy improves with experience performing the task, as has been shown by others (). However, we found no significant difference in the mean ln-transformed number of photos identified regardless of background or training treatment (ANOVA: Background: F-ratio = 0.72, df = 2, p = 0.50; Training: F-ratio = 0.12, df = 1, p = 0.73; Interaction: F-ratio = 3.0, df = 2, p = 0.06).

Some of the most important benefits of citizen science-based research are found in education and outreach, which can be achieved through training. Citizen science not only provides data that can be used in scientific studies, but also helps to educate the public about scientific research (; ). By improving training materials for citizen scientists, we can improve the quality of the education volunteers are receiving. In this particular study, JITT was used to educate participants on wildlife identification, which includes having an understanding of the animals’ morphologies, colors, coat patterns, and sizes. As an added benefit, citizen scientists also learned about the types of wildlife living in urban Los Angeles, including organisms rarely seen during the day. Thus, even studies that require little training should consider including JITTs to provide a reciprocal service to citizen scientists. Future research should focus on how JITTs contribute to education and outreach, which provide alternate metrics of success for citizen science research ().

Conclusion

Citizen science is a growing and developing field that makes it possible for researchers to collect large sets of data. Citizen science projects, however, are often considered inferior because the nature of these studies requires the involvement of people who have typically not had a significant amount of formal training in a particular scientific field (). This study challenges that mentality by demonstrating that when citizen scientists with little to no scientific background are provided with JITT, they can identify wildlife images from camera traps with as much accuracy as citizen scientists with a professional background in biology. Thus, these results suggest that citizen scientists with no background in the field can contribute accurate and meaningful subject-identification data to scientific research even when provided with only limited JITT and on-demand resources. Future research should focus on how JITTs benefit other types of subject-identification tasks as well as other citizen science-based research.

Data Accessibility Statement

Data can be accessed in the online supplemental material.

Supplementary File

The supplementary file for this article can be found as follows:

Supplementary File 1

Project Data. DOI: https://doi.org/10.5334/cstp.219.s1