Contrasting the Views and Actions of Data Collectors and Data Consumers in a Volunteer Water Quality Monitoring Project: Implications for Project Design and Management

Caren B. Cooper; Lincoln R. Larson; Kathleen Krafte Holland; Rebecca A. Gibson; David J. Farnham; Diana Y. Hsueh; Patricia J. Culligan; Wade R. McGillis

Introduction

Citizen science is newly recognized as a distinct discipline with a rapidly emerging theoretical base (; ; ; ). New studies provide insights on the structure, processes, function, and outcomes of citizen science. The primary products of citizen science activities are typically scientific publications or datasets, but additional outcomes are possible for participants. For example, citizen science participation can increase understanding of the scientific method (; Price and Lee 2012); public understanding of science (); and skills of analytical thinking (). Civic outcomes are typically in the context of environmental monitoring; for example, citizen science can lead to greater personal agency and political participation (; ) as well as provide communities with clout and greater legitimacy in legal venues or regulatory contexts (). Citizen science also can lead to participant advocacy in promoting environmental action () that may foster greater accountability and industrial compliance with regulatory agencies (). With regard to outcomes for conservation of natural resources, findings have been equivocal: some studies reported minimal or no changes in conservation behaviors () while others suggested that citizen scientists may be more likely than other individuals to undertake conservation actions (; ; ; ; ).

To better understand these outcomes and how they might be derived, it is important to explore the many different ways that citizen scientists engage with research projects. Typically they take on the role of volunteer data collectors, but often they engage by using the data as well. In most projects, contributions are heavily unbalanced with only a small percentage of highly active participants influencing resulting datasets, but many other individuals potentially using them (). This means that a best practice in citizen science is the provision of data-out or report-back capacity (), which provides participants with opportunities to interact with project data. Data-sharing strategies, ranging from raw data downloads to elaborate data visualizations, are thus employed by citizen science projects across a variety of disciplines.

Individuals associated with citizen science projects can be characterized based on their level of engagement in both data collection and data use. Potential categories might include data collectors (individuals who collect data and may or may not interact with the information later) and data consumers (individuals who engage with a project by viewing/using citizen science data without collecting any data themselves). Because data consumers view and use data but do not advance a project’s scientific data collection, project managers may not consider them as project participants. Hollow et al. (2014) referred to people who enroll in projects but don’t collect data as onlookers. Others have referred to these individuals as “freeloaders” (). While few studies have examined the onlooker phenomenon in citizen science, evidence suggests that it may be important with respect to broader project outcomes. We hypothesize that, when broader outcomes beyond scientific data generation are considered, onlookers who solely consume data might play a critical role. To explore this possibility, we studied a community-based water monitoring project originally created by kayakers and later conducted in partnership with researchers in the New York City area (Figure 1). This project, though small in scale, represented an ideal system in which to examine the prevalence and potential impact of onlookers because many individuals who did not actively sample regularly interacted with project data.

Figure 1

Participants in the Citizens’ Water Quality Testing Program often use kayaks to collect water-quality data.

Our study focused on both the data collectors and consumers in this project for two primary reasons. First, both groups have the potential to utilize citizen science data. Second, both have the potential to transfer what they learned from the project into decisions that affect other aspects of their lives (e.g., about whether to kayak or swim in a given area on a given day based on bacteria levels in the water; advocating for stronger water quality). Building on a framework for public participation in scientific research introduced by Shirk et al () – a logic model that focuses on inputs, activities, outputs, outcomes, and broader project impacts across multiple scales – our study compared personal characteristics of engagement (inputs such as demographics, motivations, barriers, self-efficacy) and examined the potential association, or lack thereof, between the nature of project engagement (activities that define the roles of collector vs. consumer) and various outputs and outcomes (such as civic and conservation behaviors) (Figure 2).

Figure 2

Logic model of hypothesized relationships among project inputs, data collection and consumption, and project outcomes, synthesized and extended from Shirk et al. () and McKinley et al. (). Variables in orange boxes represent those examined in this study. Arrows represent hypothesized causality, which our study could not address. Activities of data collecting lead to scientific outcomes as well as social learning and conservation outcomes, whereas activities of data consuming lead only to social learning and conservation outcomes. This framework highlights the potentially important role of data consumers as a type of project participant (i.e., active onlookers) with respect to larger project outcomes and impacts.

Demographic attributes (e.g., gender, age, education level, political orientation) are associated with voluntary participation in a variety of activities, ranging from political advocacy () to nature-based recreation (). Whether some of these variables influence participation in citizen science projects is not yet clear. For some, citizen science might be viewed as a form of outdoor recreation (). Individuals’ recreation preferences might therefore influence project engagement and associated outcomes. For example, studies have demonstrated links between nature-based recreation and conservation behaviors ().

Motivations represent another important factor that influences different types and levels of participation in a variety of leisure activities (; ), including volunteer citizen science projects (; ; ; ; ; ). Perceived structural barriers (e.g., time, money) or personal ones (e.g., lack of confidence or knowledge) also might affect participation rates and contribute to differences between data collectors and consumers (; ). Self-efficacy, in particular, has been hypothesized to be an important predictor (or outcome) of citizen science engagement in previous studies (e.g., ).

In addition to these participation characteristics (inputs), we compared data collectors and data consumers in terms of potential outputs and outcomes (Figure 2). The broader outcomes of citizen science have received increasing attention as the field evolves (; ; ; ; ). Some of these outcomes include project-related variables that have been associated with citizen science participation in other contexts, such as participants’ perceptions of data quality and accuracy and trust in data sources (). We wanted to understand how different types of project engagement (activities) might relate to different types of project outcomes.

Shirk et al. () describe outcomes pertaining to three broad categories: Science, social-ecological systems, and individual learning, growth, and development. McKinley et al. () distinguish two pathways (one via acquisition of scientific knowledge, the other via public input and engagement) through which citizen science can influence management and policy outcomes. Our analysis focused on outcomes similar to these: Scientific knowledge, social learning and exchange, and conservation behavior (Figure 2). The first outcome highlights an important aspect of citizen science since it emerged centuries ago (). The second outcome, fostering social learning and sense of community within social-ecological systems, has historically received less attention but may be equally important, especially when considering connections between beliefs about data quality/accuracy, trust in data sources, and project participation (; ; ). The third outcome is also receiving increasing attention as the field progresses, with more researchers and practitioners recognizing the contributions that the civic and conservation-oriented actions of citizen scientists can make with respect to environmental management (; ; ; ).

Whether or how different levels of project engagement might influence these outcomes is not yet clear. Effects of citizen science on conservation behavior, in particular, remain hazy (; ). For example, Larson et al. () described different domains of pro-environmental behavior including conservation lifestyle behaviors (consumer actions such as recycling and water conservation), social environmentalism (social interactions leading to action), environmental citizenship (political activism and policy support), and land stewardship (actions that improve the ecological features of a particular place). Different levels of participation in citizen science projects, and water quality monitoring projects specifically, could impact all of these behavioral dimensions to different degrees (). We wanted to understand if and how increased knowledge of an environment derived from the consumption and use of citizen science data might influence individuals’ sense of community and personal recreation choices, which could in turn impact conservation behavior ().

In summary, our primary goal was to test the null hypothesis that individuals who were engaged as data collectors would not differ significantly from individuals engaged only as data consumers in terms of personal characteristics of engagement (inputs such as demographics, motivations, barriers, self-efficacy) or outcomes such as impacts on scientific knowledge, social learning and exchanges, and civic and conservation behaviors.

Methods

Study population

We studied people affiliated with the Citizen’s Water Quality Testing (CWQT) Program in New York City (NYC) (http://www.nycwatertrail.org/water_quality.html), a monitoring effort initiated in 2011 by residents in the NYC watershed. The CWQT program engages recreationists, most of whom are kayakers affiliated with the NYC Water Trail Association, to collect water samples at various locations throughout the summer and to deliver samples to labs to test for the presence of Enterococci microbes. Scientists, including researchers at Columbia University, process water samples at locations throughout the city (Figure 3). Weekly results are published on the CWQT project website, making all water quality test results available to the public. The CWQT engages about 50 regular data collectors and an additional 100–150 individuals affiliated with the project’s communication network, the CWQT listserv. This listserv represented the sample frame for our study.

Figure 3

Map of New York City area showing Citizen Water Quality Testing Project sampling sites (blue pins) and labs where water quality testing occurs (red stars). (Source: New York City Water Trail Association).

Data collection

We conducted a web survey in October 2015 of individuals on the CWQT email listserv (approximately 200 members including regular data collectors and other project affiliates). We contacted listserv members three separate times over a three-week period, encouraging anyone affiliated with the project in any capacity (including individuals who consumed data but had never collected any water samples) to participate, generating an overall 32% response rate. We grouped respondents into two categories based on their type of engagement: (1) individuals who collected water samples and accessed data (“Data collectors,” n = 40, response rate = 80%), and (2) individuals who accessed data but did not collect water samples (“Data consumers,” n = 24, response rate = 15%). This work was in compliance with Columbia University’s Institutional Review Board, protocol IRB-AAAO8012.
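The response rates above follow from simple proportions. The sketch below reproduces that arithmetic under the assumption that the listserv held roughly 50 regular collectors and 150 other affiliates; those denominators are approximations drawn from the program description, not exact counts, so the consumer rate it yields (16%) differs by a point from the reported 15%. The function name `response_rate` is illustrative only.

```python
def response_rate(respondents: int, invited: int) -> float:
    """Return the response rate as a percentage of those invited."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return 100.0 * respondents / invited


# ASSUMPTION: approximate denominators ("about 50" collectors,
# "100-150" other affiliates, "approximately 200" listserv members).
collectors = response_rate(40, 50)
consumers = response_rate(24, 150)
overall = response_rate(40 + 24, 200)

print(f"collectors {collectors:.0f}%, consumers {consumers:.0f}%, overall {overall:.0f}%")
```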

Discerning collectors from consumers

For the survey we crafted a set of questions about current and anticipated project participation including CWQT-related roles (four choices: primary sampler, standby sampler, sample processor, data viewer/user), years participating, frequency and intensity of participation during 2015 season (e.g., number of weeks sampling, amount of time checking results, hours spent on CWQT activities), interactions with other people in the CWQT network (face-to-face and virtual), and likelihood of future participation in various CWQT roles (rated on a scale from 1 = Very unlikely to 5 = Very likely).

Measuring project inputs (participant characteristics)

We included demographic questions about participants’ age, education, employment status, and political orientation. We also asked respondents about their water-based recreation participation in the NYC area, including questions about swimming and paddling experience (e.g., number of trips in past year) and skill level (rated from novice to expert).

To assess motivations for participating in the CWQT network, we developed 13 items (rated from 1 = Not at all important to 4 = Extremely important) to represent a range of potential categories based on existing motivation scales and constructs used in previous volunteering and citizen science research (; ; ). We used principal components analysis (PCA), a multivariate statistical technique designed to reduce the number of variables in a dataset into a smaller number of meaningful dimensions or categories (), and Cronbach’s alpha (α), a statistic used to measure the internal consistency for scales with two or more items (), to reduce items into the following five motivation categories (Appendix A, supplemental documents): improve environmental health (2 items, α = 0.814), scientific discovery (4 items, α = 0.774), get outdoors and enjoy nature (1 item), social interaction and sharing (4 items, α = 0.821), and personal accomplishment and recognition (2 items, α = 0.552).
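As an illustration of the internal-consistency statistic used here, the following minimal sketch computes Cronbach’s α from raw item ratings using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of summed scores). The ratings shown are hypothetical, not survey data, and the function name `cronbach_alpha` is an assumption for this example; the analysis itself was run in SPSS.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha; items[i][j] is respondent j's rating on item i."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    n = len(items[0])
    if any(len(ratings) != n for ratings in items):
        raise ValueError("every item needs one rating per respondent")
    # Summed scale score for each respondent.
    totals = [sum(ratings[j] for ratings in items) for j in range(n)]
    item_variance_sum = sum(variance(ratings) for ratings in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# HYPOTHETICAL ratings on a two-item scale (1-4) from five respondents.
item_a = [4, 3, 4, 2, 3]
item_b = [4, 3, 3, 2, 4]
alpha = cronbach_alpha([item_a, item_b])
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together and can reasonably be averaged into a single scale score.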

We assessed barriers to participation using 11 items (rated from 1 = Not a barrier at all to 4 = Major barrier) adapted from hypothesized barriers to citizen science participation discussed in other studies (e.g., ; ; ). In this study, many of these items barely registered as barriers. Only six items had means above 1.04, and we used these barriers in subsequent analysis. Because each item on this list described a unique barrier, we did not search for underlying categories using data reduction techniques (PCA or α).

We also included a series of questions about participants’ self-efficacy, or beliefs in one’s ability to succeed in specific situations or to accomplish a task (). We used three items to assess environmental efficacy (α = 0.866), or beliefs about individuals’ ability to address environmental problems (e.g., “I can make a difference when it comes to solving environmental problems”), and three additional items to measure science efficacy (α = 0.839), or beliefs about one’s ability to interpret and/or conduct scientific inquiries (e.g., “I think non-scientists can play a very important role in research”) (Appendix B, supplemental documents).

Measuring project outcomes

We assessed participants’ beliefs about perceived outcomes linked to the CWQT project (“do the following represent outcomes of the CWQT program?”) using 10 items (rated from 1 = Strongly disagree to 5 = Strongly agree). These items were based on stated project goals and potential positive outcomes of citizen science noted by other authors (e.g., ; ; ; ). Using PCA and α, we were able to reduce the set of 10 items into three categories (Appendix C, supplemental documents): Improve environmental health and safety (4 items, mean α = 0.672), generate data to inform management (3 items, mean α = 0.810), and foster sense of community (3 items, mean α = 0.810).

We also asked questions about the likelihood of consulting CWQT data (rated from 1 = Very unlikely to 5 = Very likely) prior to deciding if (or how) individuals might engage in different types of conservation and recreation activities. We used nine items that reduced to three categories following data reduction (Appendix D, supplemental documents): Communicating about or advocating on behalf of water quality issues in NYC (6 items, α = 0.884), participating in water-based recreation in NYC (2 items, α = 0.714), and donating money to address environmental issues affecting water quality (1 item).

Finally, we included several items to measure perceptions regarding CWQT data quality/accuracy and trust in various data sources, factors that have impacted citizen scientists’ perceptions and participation in other contexts (). We asked four questions about level of confidence (rated from 1 = Not at all confident to 4 = Very confident) in different aspects of the CWQT data (e.g., “samplers consistently follow standard protocols for data collection,” “sampling locations accurately reflect water conditions in immediate surroundings”). Based on principal components analysis, we condensed these four items into a single indicator (Cronbach’s α = 0.821) (Appendix E, supplemental documents). We used a single item to ask people in the CWQT network to compare their level of confidence regarding CWQT data to similar data collected by government regulatory agencies such as the NYC Department of Environmental Protection (rated on a scale from –2 = Much less confident in CWQT to 2 = Much more confident in CWQT, with 0 as a neutral point).

Data analysis

Prior to analysis, scales with multiple items were reduced using various statistical techniques into a smaller set of core, interpretable constructs. In all principal components analyses (PCA), we followed generally suggested cutoff criteria () and retained factors with eigenvalues ≥ 0.8 (and variance explained ≥ 5%) and items with factor loadings of ≥ 0.400 following orthogonal (Varimax) rotation. When assessing internal consistency with Cronbach’s alpha (α), we followed conservative recommendations () and selected 0.7 as our cutoff point. We made exceptions for single-item indicators or multi-item indicators where inclusion/retention was logical based on item content (e.g., personal accomplishment and recognition on the motivations scale). See Appendices for more details.
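The retention rules described above can be sketched as two simple filters applied to PCA output. This is an illustrative outline only: the eigenvalues and loadings below are hypothetical, the function names are assumptions, the thresholds are passed in exactly as stated in the text, and comparing the absolute value of each loading is an assumption about how negative loadings would be treated.

```python
def retain_components(eigenvalues, min_eigenvalue, min_variance_share):
    """Return (index, eigenvalue, variance share) for components meeting both cutoffs."""
    total = sum(eigenvalues)
    kept = []
    for index, value in enumerate(eigenvalues):
        share = value / total  # proportion of total variance explained
        if value >= min_eigenvalue and share >= min_variance_share:
            kept.append((index, value, share))
    return kept


def salient_items(loadings, min_loading):
    """Return item names whose rotated loading meets the cutoff (absolute value)."""
    return [name for name, loading in loadings.items() if abs(loading) >= min_loading]


# HYPOTHETICAL eigenvalues for a five-item scale (they sum to the item count).
eigenvalues = [2.4, 1.3, 0.7, 0.4, 0.2]
kept = retain_components(eigenvalues, min_eigenvalue=0.8, min_variance_share=0.05)

# HYPOTHETICAL rotated loadings on the first retained component.
loadings = {"item_1": 0.82, "item_2": 0.61, "item_3": 0.12, "item_4": -0.45}
items = salient_items(loadings, min_loading=0.400)

print([index for index, _, _ in kept], items)
```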

We compared responses for data collectors (primary and secondary samplers and sample processors) and data consumers (data viewers/users who did not collect or process samples) across all outcome variables of interest using chi-square tests (for categorical variables) and independent samples t-tests (for continuous variables). We used Welch’s t-tests for these comparisons, which are more reliable than standard t-tests when two samples have unequal sample sizes and unequal variances (). We applied Holm-Bonferroni corrections to adjust the familywise error rate for all tests involving multiple comparisons of related outcome variables (). We report significant relationships (based on error rates of α = 0.05 and 0.10). We conducted all analyses using the IBM SPSS Statistical Package (Version 23.0).
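For readers unfamiliar with these procedures, the sketch below shows the Welch test statistic with its Welch–Satterthwaite degrees of freedom, and the Holm-Bonferroni step-down rule. It is a minimal illustration with hypothetical inputs, not the SPSS procedure used in the study; the function names are assumptions, and converting t and df into a p-value (which needs the t distribution) is left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    va = variance(a) / len(a)  # squared standard error, group a
    vb = variance(b) / len(b)  # squared standard error, group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def holm_bonferroni(p_values, alpha):
    """Return reject decisions (in the original order) under Holm's step-down rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The smallest p-value is tested at alpha/m, the next at alpha/(m-1), ...
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# HYPOTHETICAL ratings for two unequal-sized groups.
t, df = welch_t([2, 3, 3, 4, 3], [1, 1, 2, 1])
print(f"t = {t:.3f}, df = {df:.2f}")

# HYPOTHETICAL p-values from three related comparisons.
p_values = [0.01, 0.04, 0.03]
print(holm_bonferroni(p_values, alpha=0.05))
print(holm_bonferroni(p_values, alpha=0.10))
```

Running the correction at both α = 0.05 and α = 0.10 mirrors the two significance levels flagged in the tables.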

Results

Discerning collectors from consumers

Self-reported participation rates confirmed differences in project engagement between data consumers and data collectors. Of the 40 self-identified CWQT data collectors, 32 actively sampled in 2015, either as primary (53%) and/or secondary samplers (35%). About 35% of the self-identified data collectors had collected water samples in all four years since the project had been initiated. All of the data collectors viewed CWQT data at least once in 2015. None of the 24 self-identified data consumers had engaged in active sampling at any point in the project’s history, yet all of them had accessed project data at least once since the program’s inception. Overall, 93% of individuals in this group viewed CWQT data in 2015.

On average, both data collectors and consumers viewed project data over 15 times during the summer, but data collectors spent an average of 1.5 more hours per week on project-related activities than data consumers (Table 1). Data collectors were also more likely to interact face-to-face with other participants, though both groups reported comparable levels of online interactions. Face-to-face interactions were roughly three times more common among data collectors than data consumers, with only 10% of collectors rating this level of interaction as inadequate. On the other hand, despite similar levels of virtual interactions, 42% of data consumers said face-to-face interactions during the project were inadequate (Table 1).

Table 1

Variables describing participation in CWQT Program: Data Collectors (n = 40) vs. Data Consumers (n = 24).

Participation Variables	Data Collectors	Data Consumers	Overall Project

Participation Frequency
Number of times checking CWQT results this summer	20.3	15.1	18.3
Hours per week spent on CWQT activities (avg. including sampling and data viewing)	2.46**	0.96	1.89
Participant Interactions
Number of times interacting “face-to-face” during season	9.2**	3.3	7.0
Number of times interacting “virtually” (online) during season	11.2	12.1	11.5
Percentage rating project interactions as inadequate	10%	42%	22%
Likely or very likely to participate in future years of CWQT as …
… data generator? (primary sample collector)	56%**	13%	40%
… data user? (viewer)	89%	96%	92%

*, ** Denote significance of Welch’s t-test or Chi-square test at Holm-Bonferroni adjusted α = 0.10 and 0.05, respectively.

About 56% of data collectors said they were likely to continue serving in that capacity in future years. Anticipated retention in the data-use role was higher: about 90% of individuals in both groups (89% of collectors, 96% of consumers) intended to continue participating in the project by accessing CWQT data in future years (Table 1). We observed limited interest in transitioning from data consumer to data collector, with only 13% of data consumers indicating that they were likely to take on a role as an active sampler in the future (Table 1).

Socio-demographic characteristics were largely similar between the data collector and consumer groups. Overall, CWQT participants across the entire project engagement spectrum tended to be male (56%), liberal (83%), well-educated (57% had a graduate degree), and advanced/expert paddlers (60%). Most were regular swimmers and kayakers in NYC waters (Table 2). The exception was age: 43% of data collectors were younger than 40 (median = 45.5 years) compared with only 11% of data consumers (median = 50.8 years), a marginally significant difference. Motivations for project participation were consistent across both groups. Participants across the project engagement spectrum ranked improving environmental health and scientific discovery as their top reasons for participating, followed by getting outdoors and enjoying nature (Table 3). Both data collectors and consumers ranked social interaction and sharing as low in importance, though personal accomplishment and recognition was more important to data collectors than to data consumers.

Table 2

Demographic Distribution, Recreation Participation, and Self-Efficacy of CWQT Participants: Data Collectors (n = 40) vs. Data Consumers (n = 24).

Socio-Demographic Variables	Data Collectors	Data Consumers	Overall Project

Demographics
Gender (male)	59%	50%	56%
Median Age (in years)	45.5	50.8	48
Age (% under 40 years)	43%*	11%	30%
Education (grad degree)	50%	67%	57%
Political Orientation (liberal)	80%	88%	83%
Water Recreation
Paddling in NYC (past year, with avg. number of trips)	87% (17.2)	96% (19.9)	91% (18.2)
Paddling skill (advanced/expert)	59%	61%	60%
Self-Efficacy^a
Environmental Efficacy	4.32	4.18	4.27
Science Efficacy	4.44**	4.01	4.28

*, ** Denote significance of Welch’s t-test or Chi-square test at Holm-Bonferroni adjusted α = 0.10 and 0.05, respectively.

^a Self-efficacy scale ranged from 1 = Strongly disagree (very low efficacy) to 5 = Strongly agree (very high efficacy) (mean values are presented); Environmental efficacy = 3 items, Science Efficacy = 3 items.

Table 3

Mean Ratings for Motivations and Barriers to Participation in the CWQT Project: Data Collectors (n = 40) vs. Data Consumers (n = 24).

Motivations and Barriers	Data Collectors	Data Consumers	Overall Project

Motivations^a
Improve environmental health	3.75	3.73	3.74
Scientific discovery	3.68	3.51	3.62
Get outdoors and enjoy nature	3.16	3.58	3.32
Social Interaction and sharing	2.66	2.52	2.61
Personal accomplishment and recognition	2.19*	1.67	1.99
Barriers^b
I don’t have free time	3.00	3.54*	3.20
I feel that others can collect data better than me	1.36	1.88	1.56
I don’t understand project data collection and analysis protocols	1.08	1.46*	1.23
I don’t understand the goals of the project	1.13	1.04	1.10
I don’t have anyone to help or teach me how to participate	1.05	1.13	1.08
I am not interested in water quality monitoring	1.05	1.04	1.05

*, ** Denote significance of Welch’s t-test at Holm-Bonferroni adjusted α = 0.10 and 0.05, respectively.

^a Motivation Scale ranged from 1 = Not at all important to 4 = Very important (mean values are presented); Improve environmental health scale = 2 items, Scientific discovery scale = 4 items, Get outdoors and enjoy nature scale = 1 item, Social interaction and sharing scale = 4 items, Personal accomplishment and recognition scale = 2 items.

^b Barriers Scale ranged from 1 = Not a barrier to 4 = Major barrier (mean values are presented); All single-item indicators.

We found marginally significant differences between data collectors and data consumers in their perceptions of barriers to participation. Data consumers were more likely than data collectors to report lack of free time as the most significant barrier to participation. Data consumers were also more likely than collectors to rate “I don’t understand project data collection and analysis protocols” as a barrier (Table 3). Individuals in the two groups were similar in having high levels of environmental efficacy, but marginally differed in that only the data collectors reported equally high levels of science efficacy (Table 2).

Project outcomes

We found that, for the most part, data collectors and consumers perceived similar positive outcomes associated with the project (Table 4). Both groups thought the project effectively generated data to inform management and helped to foster a sense of community among participants. However, we observed slightly weaker beliefs among collectors for one of the project’s primary objectives: improving environmental health and safety. In fact, when compared to collectors, consumers believed the project more effectively accomplished this goal.

Table 4

Perceived Outcomes Associated with the CWQT Program: Data Collectors (n = 40) vs. Data Consumers (n = 24).

Outcomes^a	Data Collectors	Data Consumers	Overall Project

Generate data to inform management	4.55	4.63	4.59
Foster sense of community	4.09	4.35	4.17
Improve environmental health and safety	3.79	4.14*	3.92

* Denotes significance of Welch’s t-test at Holm-Bonferroni adjusted α = 0.10.

^a Outcome scales ranged from 1 = Strongly disagree to 5 = Strongly agree (mean values are presented); Improve environmental health and safety scale = 3 items, Generate data to inform management scale = 4 items, Foster sense of community scale = 3 items.

We found that data collectors and consumers reported similar rates of engaging in conservation behaviors (Table 5), although these rates may not have been influenced by participation in the project itself. For example, more than 90% of individuals in both groups routinely educated others about water quality concerns, about 70% volunteered to clean up waterways, and at least half communicated with local leaders about water quality and advocated for improved water quality management and enforcement. Data collectors were marginally more likely than consumers to make predictions about water quality trends (Table 5).

Table 5

Current and Future Conservation Behaviors of CWQT Participants: Participation Rates of Data Collectors (n = 40) vs. Data Consumers (n = 24) During Past 12 Months and Likelihood of Consulting CWQT Data Before Participating.

	Data Collectors	Data Consumers

Current Behaviors
Educate others about WQ concerns	92%	92%
Volunteer to clean up waterways	72%	67%
Communicate with local leaders about WQ	58%	54%
Advocate or lobby for improved WQ management or enforcement	58%	50%
Make predictions about WQ trends	66%*	33%
Recruit others to participate in CWQT	58%	33%
Donate money to address environmental issues affecting WQ	34%	38%
Likelihood of Consulting CWQT Data Before Engaging in Future Behaviors^a
Communicating about or advocating on behalf of WQ issues in NYC	4.14	3.67
Water-based recreation in NYC	3.83	4.15
Donating money to address environmental issues affecting WQ	2.92	2.96

*, ** denote significance of Chi-square test at Holm-Bonferroni adjusted α = 0.10 and 0.05, respectively.

^a Future behavior scales ranged from 1 = Very unlikely to 5 = Very likely (mean values are presented); Communication and advocacy = 6 items, Water-based recreation = 2 items, Donating money = 1 item.

The CWQT data appeared to influence the behaviors of both data collectors and consumers (Table 5). For example, many individuals in both groups (74% of collectors, 50% of consumers) were likely to consult CWQT data before communicating about or advocating on behalf of water quality issues in NYC. A majority of individuals in both groups (64% of collectors, 71% of consumers) were also likely to consult CWQT data before deciding whether or not to engage in water-based recreation in NYC. CWQT data appeared to have a weaker influence on donations that address environmental issues (Table 5).

Data collectors and data consumers were equally confident in the validity/quality of the CWQT data (Table 6). Both groups also indicated a higher level of confidence in water quality data collected by CWQT’s citizen science monitors than in data collected by government regulatory agencies such as the NYC Department of Environmental Protection. We found no significant differences between the confidence ratings of CWQT data collectors and consumers (Table 6).

Table 6

Confidence in Data Generated from CWQT Participants: Data Collectors (n = 40) vs. Data Consumers (n = 24).

Confidence Variables	Data Collectors	Data Consumers	Overall Project

Overall confidence in CWQT data^a	3.03	3.18	3.09
Relative confidence in CWQT data vs. data from government regulatory agencies^b	+0.46	+0.70	+0.55

*, ** Denote significance of Welch’s t-test or Chi-square test at Holm-Bonferroni adjusted α = 0.10 and 0.05, respectively.

^a Overall confidence scale consisted of four items ranging from 1 = Not at all confident to 4 = Very confident (mean scores are presented).

^b Relative confidence scale rated from –2 = Much less confident in CWQT to +2 = Much more confident in CWQT (mean scores are presented).

Discussion

Our focus was to compare individuals engaged as data collectors and consumers to those engaged only as data consumers in terms of their participant characteristics (inputs such as demographic attributes, motivations, barriers, self-efficacy) and outcomes (impacts such as scientific knowledge, social learning and exchange, and civic and conservation behaviors). Despite marked divisions in levels of project engagement, we discovered few differences between CWQT data collectors and data consumers for most inputs and outcomes of interest, ranging from demographic characteristics and motivations to conservation behaviors. Both data collectors and consumers were generally engaged in the project for similar reasons, and both groups reported similar goals, outcomes, and behaviors linked to project participation. Collectors and consumers believed that the project was successfully achieving the desired goals of generating data to inform management and fostering a sense of community among participants. In fact, consumers were generally more likely to recognize these benefits than collectors. These results suggest a potential need to expand the way that project managers recognize citizen science participation, making a concerted effort to acknowledge the important role that data consumers may play with respect to the generation of project outcomes. In other words, there could be substantial conservation benefits derived from managing public engagement beyond an exclusive focus on data collectors, which has dominated the citizen science discourse so far.

We suggest that data consumers, or active onlookers, be considered a type of project participant. This consideration may be particularly important given the prevalence of skew in most citizen science projects (; ; ; ). Though skew is typically unplanned, citizen science contributions are often heavily unbalanced, with a small percentage of highly active participants having the greatest influence on the resulting dataset. Furthermore, skew appears to be a common characteristic of citizen science projects irrespective of whether participation is primarily online or on-the-ground (). This phenomenon is known as the Pareto principle and, although not unique to citizen science, it is a common feature of most participatory science projects (). The phenomenon has been acknowledged by participants as well, who also noted that even among the more active contributors to a project, there was still a smaller set of core participants ().
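One simple way to quantify this kind of skew is to compute the share of all records contributed by the most active fraction of participants. The sketch below is illustrative only: the function name `top_share`, the 20% cutoff, and the per-participant counts in the usage note are assumptions, not values from the CWQT project.

```python
import math


def top_share(contributions, top_fraction=0.2):
    """Fraction of all records contributed by the most active
    `top_fraction` of participants (hypothetical helper)."""
    if not contributions:
        raise ValueError("contributions must be non-empty")
    total = sum(contributions)
    if total == 0:
        return 0.0
    # Rank participants from most to least active.
    ranked = sorted(contributions, reverse=True)
    # Round up so that at least one participant is always included.
    k = max(1, math.ceil(top_fraction * len(ranked)))
    return sum(ranked[:k]) / total
```

For a made-up project in which ten participants submitted `[120, 40, 10, 8, 6, 5, 4, 3, 2, 2]` records, `top_share` returns 0.8: the two most active participants account for 160 of the 200 records, the classic 80/20 pattern associated with the Pareto principle.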

Skew can affect the scientific products of citizen science in several ways. Data suggest that a small percentage of highly skilled birders contribute the most data to eBird, a global citizen science project of bird observations (Sullivan et al. 2011). This unevenness of contributions can inflate spatial bias and distort results, e.g., certain locations are overrepresented based on heavily skewed contributions of a small percentage of participants (). Similarly, skew might affect the broader byproducts of citizen science. For example, highly involved citizen science volunteers might be more likely to engage in conservation or civic-oriented actions than their less-involved peers, or conversely, a few high profile and conspicuous volunteers might alienate or diminish potential contributions of other project participants. Given our findings that data collectors and data consumers differed in their level of project engagement but were both contributing to civic and conservation outcomes, we suggest that a sole focus on skewed participation in data collection might miss important engagement such as we observed with data use or consumption.

Before examining other potential management implications, we note several caveats. First, our project was conducted with a relatively small population of participants in a single citizen science project, and the extent to which these results might translate to other contexts is not clear. Second, citizen science volunteers, such as those in the CWQT program, typically self-select to their role in the program. Our study design did not allow us to distinguish between traits that people had when they began their affiliation with the program and traits they may have gained as a result of their program activities. For example, the lack of differences in conservation behaviors between data collectors and consumers could mean that conservation behaviors are not an outcome, per se, but simply an input; that is, those who undertake conservation behaviors are predisposed to self-select for affiliation with the project, regardless of their level of project engagement. Longitudinal research could determine whether variables such as conservation behaviors are indeed causal outcomes of participation or a characteristic of those who self-select for citizen science projects. However, even without causality between varying levels of project engagement and key outcome variables, our findings highlight some important considerations for future citizen science project management.

One possible response to our findings is to maintain a flexible and broader definition of citizen science () than the definition that the Oxford English Dictionary printed in 2014, which placed centrality on volunteer activities of data collection and analysis (processing data into information). Most researchers treat data collection as the common denominator or defining characteristic of citizen science projects. For example, the typologies of Shirk et al. () and Cooper et al. () place data contributions, either through the collection and sharing of observations or the processing or analysis of data into information, as the key feature common to all types of citizen science projects. The collection of data is undoubtedly a central component of carrying out science, and thus highly relevant to examining scientific outcomes of citizen science. Yet there are multiple ways to participate in citizen science, and activities other than data collection may have explanatory power for examining learning, civic, and conservation outcomes of citizen science.

Consequently, project managers could plan more explicitly for skew or active onlooker effects. That is, they could actively cultivate the learning, civic, or conservation byproducts of citizen science by managing those affiliated with the project in ways other than collecting data, valuing their connection to the project (). For CWQT participants, response rates were much lower for data consumers than data collectors, even though the initial email and survey invitation asked everyone affiliated with the project in any capacity (including simply accessing and viewing data) to respond. This suggests that data consumers might be somewhat reluctant to view themselves as participants and might characterize themselves as passive onlookers instead. Lower reported levels of science efficacy among data consumers lend support to this assertion. Considering the similarities between collectors and consumers and the potential contributions of both in the conservation policy arena, failure to communicate more directly with data consumers, either as participants or onlookers, represents a lost learning opportunity ().

CWQT is a community-driven project, and thus it is not surprising that public use of CWQT data extends beyond those collecting the data. Our findings are nevertheless broadly applicable to scientist-driven, or top-down, projects, as all have data users. The array of approaches to report-backs or “data out” mechanisms of data sharing demonstrates how citizen science projects serve data users. Some projects invest heavily in data access and visualizations to help public consumers make sense of both individual and aggregate data, and many projects provide downloads of raw data. Thus, the aggregate impact of onlooker effects across the entire citizen science landscape of projects could be substantial. By strategically leveraging these online interactions, project managers can enhance broader learning, civic, and conservation outcomes and perhaps widen the scope of participation. Future research should examine project outcomes relative to the degree of participation in data collection and the degree of data use.

While few studies have examined the onlooker phenomenon in citizen science, evidence suggests that it may be important. In an examination of the policy implications of citizen science, Hollow et al. () compared wildlife management preferences and priorities of data collectors, onlookers (which they defined as people who signed up for a project but did not collect data), and people unfamiliar with the project. While data collectors had different attitudes toward koala management policies than the general public, onlookers typically fell in the middle of the spectrum between data contributors and the general public. Thus, onlookers generated potential conservation benefits (in the form of political support) even without collecting data. Our study builds upon these differences in attitudes by highlighting similar results related to grassroots conservation behaviors.

In our study, both collectors and consumers trusted volunteer-collected data more than government-collected data, highlighting the heightened sense of community and trust that developed among all types of project participants relative to external management authorities. Government sampling sites are often at lower densities than volunteer environmental monitoring sites, and the government sites are not necessarily located in places most relevant to community members or sampled within the relevant time frame. For example, Inuit knowledge and observations provided different insights than data from Environment Canada weather stations (), and the EPA’s stationary air monitoring sites in Louisiana did not align with citizen science reports of the locations of problems and odors (). In the case of the CWQT program, one might speculate that volunteer water monitoring by kayakers was temporally and spatially more relevant for other kayakers and recreationists than government sampling sites. More research is needed to understand how beliefs about data quality/accuracy and trust influence citizen science participation and consumption of citizen science-generated data.

While data collectors and consumers recognized similar beneficial project outcomes with respect to data generation to inform management, consumers in this study had little potential to influence scientific outcomes because of their reluctance to become active data collectors. This is due, in part, to several participation barriers. Our findings suggest that those who collected data had more free time, greater science efficacy, and a firmer grasp on potentially complex field sampling protocols than those who only used data. Although we cannot draw conclusions about causality, these observed differences could mean that (a) science efficacy is a barrier to serving as a CWQT data collector or (b) data collection leads to increased science efficacy. Additional research is needed to explore both possibilities and to understand if or how the act of viewing or consuming data might serve as a gateway to future data contributions.

Finally, it is important to note that people associated with citizen science projects who do not collect data, whether considered participants or onlookers, may represent a captive audience with outstanding opportunities for synergy and social learning. Despite a relatively small sample and limited focus on a single project, our findings support this assertion. Future research with more citizen scientists across diverse contexts is needed to make inferences and examine how to recruit such participants/onlookers, how to manage them, and whether there are drawbacks to their association with a project. Furthermore, skew in participation is a useful feature for research about citizen scientists. When pre- and post-studies are not feasible, or when participants enter projects at the top of evaluation scales for science learning and conservation (which is common because citizen science participants most commonly self-select), skew in participation can be used for studying associations between levels of project engagement and key project outcomes and impacts. Based on our results, we could consider both data collectors and data consumers to be citizen scientists varying in their modes and intensity of participation, contributing unequally to scientific data generation but equally to broader civic and conservation outcomes.

Additional Files

The additional files for this article can be found as follows:

Appendix A

Principal Components Analysis Depicting Five-factor Structure of Items^a Describing Motivations to Engage in the CWQT Project (n = 64). DOI: https://doi.org/10.5334/cstp.82.s1

Appendix B

Principal Components Analysis Depicting Two-factor Structure of Environmental and Science Efficacy Items^a (n = 64). DOI: https://doi.org/10.5334/cstp.82.s1

Appendix C

Principal Components Analysis Depicting Three-factor Structure of Items^a Describing Perceived Outcomes of CWQT Project (n = 64). DOI: https://doi.org/10.5334/cstp.82.s1

Appendix D

Principal Components Analysis Depicting Three-factor Structure of Actions^a Potentially Influenced by the Use of CWQT Data (n = 64). DOI: https://doi.org/10.5334/cstp.82.s1

Appendix E

Items (and Scales) Used to Assess Confidence and Trust in CWQT Data (n = 64). DOI: https://doi.org/10.5334/cstp.82.s1

Citizen Science: Theory and Practice
