Introduction

Thousands of programs around the world create opportunities for the public to be involved in scientific research, which we will refer to as public participation in scientific research or PPSR (). The types of activities in PPSR programs vary widely, as do the expectations of institutions and scientists for how involved the public will be in the scientific process (; ). In addition to research goals, PPSR programs often have educational goals (), and the participants themselves have personal goals ().

Evidence is mounting that participation in these programs is associated with individual learning outcomes (; ). Previous studies have found linkages between participation and increases in participants’ content and science knowledge (; ). Beyond knowledge, there is evidence that participation is related to increases in participants’ interests, self-efficacy, motivation, science inquiry skills, and even certain behaviors (; ; ; ; ).

Much of what we know about linkages between PPSR and participant outcomes is from programs focused on the data collection process. There is a growing call by practitioners and funding agencies to create opportunities for participants to contribute more often and in deeper ways (; ; ). This type of full participation is often referred to as “co-created,” whereby participants engage in all facets of the scientific process. This co-created approach is purported to result in stronger learning outcomes than other types of participation, such as “contributory,” whereby participants engage in only data collection (; ).

Little is known, however, about how participant outcomes relate to the degree of participation in the scientific process (; ). Recent research has found that the degree of participation within data collection can influence participant outcomes (; ), but there remains a gap in our understanding of how the degree of participation, defined by Shirk et al. () as the “extent to which individuals are involved in the process of scientific research,” relates to individual outcomes. Furthermore, most examples of successful co-creation are at the local or community level (), and it is unclear if co-creation is possible for projects that seek to involve participants at large spatial scales, entirely online.

To address these gaps, we sought to examine how providing opportunities for participants to engage in multiple stages of the scientific process relates to individual learning outcomes through Bird Cams Lab, a virtual space for online cam watchers to work with scientists to design and implement co-created scientific investigations. We created Bird Cams Lab to 1) provide opportunities for the public to engage in all parts of the scientific process, 2) advance understanding of effective project design for co-created PPSR programs at a large spatial scale, and 3) investigate how the degree of participation was associated with participant learning outcomes.

We focused on one Bird Cams Lab investigation to test the hypothesis that participant outcomes would be more robust as the degree of participation in the scientific process increased (). We focused on five outcomes based on the Framework for Articulating and Measuring Individual Learning Outcomes from Participation in Citizen Science (), and sought to answer the following questions: 1) Is a greater degree of participation in the scientific process associated with increases in participants’ content knowledge, self-efficacy in engaging in the investigation, interest in birds, science inquiry skills, and behaviors related to birding, science, and conservation? 2) If participation is correlated with learning outcomes, which phases of the scientific process are associated with the greatest increases?

Methodology

Study design

From 2018 to 2020, Bird Cams Lab created six investigations that engaged approximately 16,000 people, of which 2,014 took part in the focal investigation called “Battling Birds: Panama” (). “Battling Birds: Panama” ran December 2020–June 2021 and used the Cornell Lab of Ornithology’s Panama Fruit Feeder cam, a live-streaming wildlife camera focused on a bird feeding platform located at the Canopy Lodge in El Valle de Antón, Panama.

We employed a pre-post survey design with people who were invited to engage in optional activities spanning the entire scientific process, which were categorized into four phases: question design, data collection, data exploration, and sharing findings. For this study, we constrained our analyses to the first three phases because we sent out the post-survey before the “sharing findings” phase as a result of time constraints (Table 1). We gathered participation data using a variety of sources, including online discussion board comments and votes, webinar registration and attendance data, website login information, and survey questions.

Table 1

For each of the scientific phases included in the analyses, participants were given opportunities to engage with each other and scientists in a variety of ways (see Supplemental File 1: Activities for more details and screenshots of each activity).


| PHASE | ACTIVITIES |
| --- | --- |
| Question design | 1. Propose and/or comment on possible research questions. |
| | 2. Vote for the preferred question on the discussion board. |
| | 3. Attend a webinar on study design. |
| | 4. Vote for what type of data would be collected. |
| Data collection | 1. Test the data collection protocol and give feedback. |
| | 2. Collect data from video clips recorded by the cam. |
| | 3. Post on the discussion forums. |
| | 4. Take a quiz about species identification and behavior. |
| Data exploration | 1. View the interactive visualization pages. |
| | 2. Comment and/or vote on the discussion boards. |
| | 3. Attend a webinar on data interpretation. |
| | 4. Request and/or analyze data. |

This study was conducted under the guidance and approval of the Institutional Review Board for Human Participants (IRB) at Cornell University under protocol #1804007970.

Survey instrument

We administered the optional pre- and post-surveys via the Qualtrics platform. Based on participant feedback and performance during a separate study (), we edited survey questions and dropped any that did not provide meaningful data. Both surveys included questions about respondents’ contribution to the investigation, participation in other Bird Cams Lab investigations, demographics, interest in birds, self-efficacy in engaging in the investigation, content knowledge specific to the Panama Fruit Feeder cam, science inquiry skills, and behaviors related to birding, science, and conservation. The post-survey also included self-report questions regarding respondents’ improvement in science inquiry skills. The pre- and post-survey questions relevant to this study are available in the supplemental materials (Supplemental File 2: Pre-survey Questions and Supplemental File 3: Post-survey Questions).

Independent variables

We quantified the degree of participation with two variables: 1) the “number of phases” in which a respondent engaged in the investigation, and 2) the “specific phase(s)” in which the respondent engaged (e.g., data collection and data exploration). We created these variables using respondents’ self-reported data from the post-survey and participation data. We treated both variables as categorical, with “number of phases” having four categories (0, 1, 2, 3) and “specific phase(s)” having eight categories (no phases, question design only, data collection only, data exploration only, question design and data collection, question design and data exploration, data collection and data exploration, and all three phases). We considered “number of phases” = 0 and “specific phase(s)” = “no phases” as the baseline and reference level for analyses.
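
As an illustration of how the two participation variables relate, the following minimal Python sketch derives both from the set of phases in which a respondent engaged. It is not the code used in the study (the analyses were run in R); the function name and the input format are assumptions made for this example, and only the category labels come from the text above.

```python
from itertools import combinations

# Phase names as listed in the text; the input format is an assumption.
PHASES = ("question design", "data collection", "data exploration")


def degree_of_participation(engaged):
    """Return (number_of_phases, specific_phases) for one respondent.

    `engaged` is any collection of phase names the respondent took part in.
    """
    present = [phase for phase in PHASES if phase in engaged]
    number_of_phases = len(present)
    if number_of_phases == 0:
        specific = "no phases"  # baseline and reference level
    elif number_of_phases == 1:
        specific = present[0] + " only"
    elif number_of_phases == 2:
        specific = " and ".join(present)
    else:
        specific = "all three phases"
    return number_of_phases, specific


# Enumerate every possible combination to confirm there are eight categories.
all_labels = {
    degree_of_participation(set(combo))[1]
    for size in range(len(PHASES) + 1)
    for combo in combinations(PHASES, size)
}
```

Enumerating all subsets of the three phases yields the eight “specific phase(s)” categories, and the length of each subset gives the corresponding “number of phases” value.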

The other independent variables included were age, gender, education level, science training, and participation in other Bird Cams Lab investigations. We were unable to account for any potential differences due to race and ethnicity because data were limited (Table 2). Owing to small sample sizes, we reduced gender to female/male and education to five levels, excluding “grade school” in the main model analyses (although we were able to include all levels for both variables in the dropout analysis, which is explained below). For science training, there were four possible levels, which we collapsed to two (yes/no) to indicate whether respondents had any training at all (Supplemental File 2: Pre-survey Questions). For participation in other Bird Cams Lab investigations, we created a binary variable (yes/no) based on self-reported and actual participation data.

Table 2

Characteristics of all those who completed the pre-survey, completed the pre-survey only, and completed both the pre- and post-surveys.


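| VARIABLE | LEVEL | TOTAL (ALL PRE-SURVEY RESPONDENTS): N | % OF TOTAL | PRE-SURVEY ONLY: N | % OF TOTAL | PRE- AND POST-SURVEY: N | % OF TOTAL |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Total number of respondents | | 1801 | 100% | 1445 | 80.2% | 356 | 19.8% |
| | | N | % OF SAMPLE | N | % OF SAMPLE | N | % OF SAMPLE |
| Number of phases * | None | 1265 | 70.2% | 1124 | 77.8% | 141 | 39.6% |
| | 1 | 272 | 15.1% | 189 | 13.1% | 83 | 23.3% |
| | 2 | 109 | 6.1% | 32 | 2.2% | 77 | 21.6% |
| | 3 | 58 | 3.2% | 3 | 0.2% | 55 | 15.4% |
| | Missing | 97 | 5.4% | 97 | 6.7% | 0 | 0.0% |
| Specific phase(s) * | Question design | 269 | 14.9% | 150 | 10.4% | 119 | 33.4% |
| | Data collection | 211 | 11.7% | 46 | 3.2% | 165 | 46.3% |
| | Data exploration | 185 | 10.3% | 66 | 4.6% | 118 | 33.1% |
| Participated in other Bird Cams Lab investigations * | Yes | 582 | 32.3% | 375 | 26.0% | 149 | 41.9% |
| | No | 1219 | 67.7% | 1070 | 74.0% | 207 | 58.1% |
| Pre-survey campaign * | First | 1098 | 61.0% | 857 | 59.3% | 241 | 67.7% |
| | Second | 703 | 39.0% | 588 | 40.7% | 114 | 32.0% |
| Highest level of formal education | Grade school | 8 | 0.4% | 7 | 0.5% | 1 | 0.3% |
| | High school | 199 | 11.0% | 167 | 11.6% | 32 | 9.0% |
| | Associate degree | 153 | 8.5% | 120 | 8.3% | 33 | 9.3% |
| | Bachelor degree | 527 | 29.3% | 415 | 28.7% | 112 | 31.5% |
| | Master’s degree | 458 | 25.4% | 352 | 24.4% | 106 | 29.8% |
| | Doctorate degree | 191 | 10.6% | 147 | 10.2% | 44 | 12.4% |
| | Prefer not to answer | 57 | 3.2% | 49 | 3.4% | 8 | 2.2% |
| | Missing | 208 | 11.5% | 188 | 13.0% | 20 | 5.6% |
| Science training | Yes | 616 | 34.2% | 485 | 33.6% | 131 | 36.8% |
| | No | 970 | 53.9% | 767 | 53.1% | 203 | 57.0% |
| | Missing | 215 | 11.9% | 193 | 13.4% | 22 | 6.2% |
| Gender identity * | Female | 1167 | 64.8% | 904 | 62.6% | 263 | 73.9% |
| | Male | 352 | 19.5% | 290 | 20.1% | 62 | 17.4% |
| | Non-binary | 24 | 1.3% | 22 | 1.5% | 2 | 0.6% |
| | Prefer to self-describe | 12 | 0.7% | 7 | 0.5% | 5 | 1.4% |
| | Prefer not to disclose | 37 | 2.1% | 32 | 2.2% | 5 | 1.4% |
| | Missing | 209 | 11.6% | 190 | 13.1% | 19 | 5.3% |
| Race/ethnicity | American Indian or Alaska Native | 1 | 0.1% | 1 | 0.1% | 0 | 0.0% |
| | Asian | 24 | 1.3% | 22 | 1.5% | 2 | 0.6% |
| | Black or African American | 2 | 0.1% | 2 | 0.1% | 0 | 0.0% |
| | Hispanic | 22 | 1.2% | 21 | 1.5% | 1 | 0.3% |
| | Native Hawaiian or Other Pacific Islander | 2 | 0.1% | 2 | 0.1% | 0 | 0.0% |
| | White | 531 | 29.5% | 433 | 30.0% | 98 | 27.5% |
| | Other | 16 | 0.9% | 15 | 1.0% | 1 | 0.3% |
| | Prefer not to answer | 32 | 1.8% | 25 | 1.7% | 7 | 2.0% |
| | Missing | 1188 | 66.0% | 941 | 65.1% | 247 | 69.4% |
| | | Mean (SE) | N | Mean (SE) | N | Mean (SE) | N |
| Age * | | 54.60 (0.43) | 1523 | 53.80 (0.50) | 1202 | 57.40 (0.85) | 321 |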

Notes: For age, standard errors are reported for means and the sample sizes (N) do not include missing values. Asterisks next to variables indicate a statistically significant difference (p < 0.05) between respondents who completed the pre-survey only and those who completed both surveys.

Dependent variables

Interest in birds: We calculated the difference between pre- and post-surveys (hereafter post-pre difference) in a composite score created from three statements (Supplemental File 4: Dependent Variable Details). Scale reliability for the score using Cronbach’s alpha was 0.813 for the pre-survey, suggesting good internal consistency ().

Self-efficacy: We calculated the post-pre difference in a composite score created from six statements measuring self-efficacy in engaging in the investigation (hereafter “self-efficacy”) (Supplemental File 4: Dependent Variable Details). Scale reliability for the score using Cronbach’s alpha was 0.766, suggesting acceptable consistency ().
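
For readers unfamiliar with the reliability statistic reported for the composite scores, the sketch below shows the standard Cronbach’s alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), in plain Python. The function name and the example responses are invented for illustration; they are not the study’s code or data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of k lists, one per survey statement, each holding
    that statement's scores for the same n respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(column) != n for column in items):
        raise ValueError("every item needs the same number (>= 2) of responses")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up responses from five respondents to three statements (1-5 scale).
example_items = [
    [5, 4, 5, 3, 4],
    [5, 4, 4, 3, 5],
    [4, 4, 5, 2, 4],
]
example_alpha = cronbach_alpha(example_items)
```

Values closer to 1 indicate that the statements vary together, which is the sense in which the reported coefficients are described as acceptable, good, or excellent.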

Content knowledge: We calculated the post-pre difference in a score based on correct answers for a nine-question multiple-choice quiz about the birds and food seen on the Panama Fruit Feeder cam as well as relevant scientific terms (Supplemental File 4: Dependent Variable Details).

Observed science inquiry skills (identifying answerable research questions): We created the post-pre difference of a score for a question that presented five research questions and asked respondents to identify which were answerable (Supplemental File 4: Dependent Variable Details).

Observed science inquiry skills (interpreting data): We created the post-pre difference of a score for a question that asked respondents to interpret a stacked bar chart (Supplemental File 4: Dependent Variable Details).

Self-reported improvement in science inquiry skills: We modified the Skills of Science Inquiry scale from the Technical Brief Series () such that it began with the statement, “This project improved my ability to,” we reduced the number of statements from twelve to eight, and we customized the wording of each statement to the investigation experience (Supplemental File 4: Dependent Variable Details). We calculated the mean score of the eight statements presented on the post-survey to create a composite “science inquiry improvement” score. Scale reliability for the score using Cronbach’s alpha was 0.908, suggesting excellent internal consistency ().

Behavior: We created the post-pre difference in the number of behaviors respondents indicated that they had done in the past year (hereafter “behavior score”). The research team created a list of ten options, which included behaviors related to birds, science, and conservation (Supplemental File 4: Dependent Variable Details).

Survey administration, participation, and cleaning

We used a convenience sampling method () to recruit people to take the optional pre-survey in two campaigns. Before the investigation began, we distributed the survey to potential participants and the Bird Cams Lab community in November and December 2020. Then, in January 2021, we distributed the survey again to recruit additional participants before the data collection phase. Although this second distribution allowed us to reach more people, it took place after the investigation had begun, so participants who completed the pre-survey in the second campaign may have already engaged in the question design phase. On May 17, 2021, after the data exploration phase was complete, we distributed the optional post-survey to 8,479 individuals via an email list that included those who took the pre-survey as well as anyone who opened at least one Bird Cams Lab email, subscribed to the Bird Cams Lab email list, or contributed to any Bird Cams Lab investigation. For more information about survey administration, see the supplemental materials (Supplemental File 5: Survey Administration and Cleaning).

The pre-survey was opened 2,060 times. We could not calculate a true response rate because we distributed the survey via social media channels and other platforms. There were 1,800 useful responses after we removed 260 responses (Supplemental File 5: Survey Administration and Cleaning). The post-survey was opened by 1,490 people (17.57% of the 8,479), and we had 999 useful responses after removing 491 responses (Supplemental File 5: Survey Administration and Cleaning).

Analyses

Creating sample for analyses

Of the 1,800 pre-survey responses and 999 post-survey responses, we matched 356 respondents’ surveys by email address and first and last names (19.8% of pre-survey responses; Table 2). We also matched participation data to the 356 responses using email addresses, usernames, and first and last names. Then, we created the variables for analyses and checked survey responses for straightlining (i.e., when a respondent selects the same answer for blocks of questions resulting in zero variance) as a check for bots and low-quality responses, finding no evidence of low quality or bots ().
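
The matching and straightlining steps described above can be sketched as follows. This is a simplified illustration rather than the procedure actually used: the record fields (`email`, `first`, `last`), the function names, and the rule of requiring an exact match on all three normalized fields are assumptions, since the text does not specify how partial matches were handled.

```python
def match_key(record):
    """Normalize email and names so differences in case or spacing still match."""
    return (
        record["email"].strip().lower(),
        record["first"].strip().lower(),
        record["last"].strip().lower(),
    )


def match_surveys(pre, post):
    """Pair each pre-survey record with the post-survey record sharing its key."""
    post_by_key = {match_key(record): record for record in post}
    return [
        (record, post_by_key[match_key(record)])
        for record in pre
        if match_key(record) in post_by_key
    ]


def is_straightlined(block):
    """True when every answer in a block of questions is identical (zero variance)."""
    return len(block) > 1 and len(set(block)) == 1


# Hypothetical records for illustration only.
pre_records = [
    {"email": " A@example.org", "first": "Ana", "last": "Diaz"},
    {"email": "b@example.org", "first": "Ben", "last": "Okoro"},
]
post_records = [
    {"email": "a@example.org ", "first": "ana", "last": "DIAZ"},
]
matched = match_surveys(pre_records, post_records)
```

In this toy example only the first pre-survey record finds a post-survey partner; a block of answers such as `[3, 3, 3, 3]` would be flagged as straightlined, whereas `[3, 4, 3, 3]` would not.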

Dropout analysis

Because participants self-selected the phases in which they engaged, we tested for potential sample bias in who completed the post-survey responses. Using t-tests, Chi-square tests, and Fisher’s Exact tests, we compared participants who took only the pre-survey with those who took both the pre- and post-surveys in terms of the number of phases in which they engaged, whether or not they engaged in each phase, their demographics, the survey distribution campaign, and whether they contributed to other investigations.
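
To make the dropout comparison concrete, the sketch below computes a Pearson chi-square test for one 2 × 2 comparison using the counts reported in Table 2 for participation in other Bird Cams Lab investigations. It is an illustration in plain Python, not the study’s R code. It applies no continuity correction, and the text does not state which of the three tests was applied to this particular variable, so both choices are assumptions.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 df) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Rows: pre-survey only, pre- and post-survey; columns: yes, no (Table 2).
other_investigations = [[375, 1070], [149, 207]]
statistic, p_value = chi_square_2x2(other_investigations)
```

A p-value below 0.05 for this table is in line with the asterisk next to this variable in Table 2.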

Model analyses

We ran linear regressions in R 4.1.3 () to determine if participants’ self-reported improvement in science inquiry skills and changes in interest in birds, self-efficacy, content knowledge, observed science inquiry skills, and behavior were associated with 1) increased involvement in the scientific process and 2) certain phases of the scientific process.

We modeled each dependent variable twice as a function of: 1) the “number of phases,” and 2) the “specific phase(s).” The following variables were included in the models to control for possible confounding effects: age (centered at its mean), gender, highest education level, science training, participation in another Bird Cams Lab investigation, and survey distribution campaign.
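
The role of the reference level can be illustrated with a deliberately simplified case. With a single categorical predictor and no covariates, a linear model under treatment (dummy) coding has an intercept equal to the reference-level mean, and each coefficient equals that level’s mean minus the reference mean. The Python sketch below shows this; the data are invented, the function name is hypothetical, and the covariates included in the actual models are omitted.

```python
from statistics import mean


def treatment_contrasts(outcome, group, reference):
    """Intercept and coefficients of a one-factor model under treatment coding.

    Valid only for a single categorical predictor with no covariates, where
    least-squares estimates reduce to differences between group means.
    """
    by_level = {}
    for y, level in zip(outcome, group):
        by_level.setdefault(level, []).append(y)
    intercept = mean(by_level[reference])
    effects = {
        level: mean(values) - intercept
        for level, values in by_level.items()
        if level != reference
    }
    return intercept, effects


# Made-up post-pre differences and "number of phases" values.
differences = [0.0, 0.5, -0.5, 0.5, 1.0, 1.0, 2.0, 1.5, 2.5]
number_of_phases = [0, 0, 0, 1, 1, 2, 2, 3, 3]
intercept, effects = treatment_contrasts(differences, number_of_phases, reference=0)
```

Each entry in `effects` is read as the expected change in the outcome relative to respondents who engaged in no phases, which is how the coefficients for “number of phases” and “specific phase(s)” are interpreted once covariates are held constant.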

We assessed multicollinearity of each predictor variable by calculating the generalized variance inflation factor (GVIF, ) using the car package (). The GVIF values of all predictor variables in the various models were less than three, indicating there was no problematic multicollinearity ().

For each model, we assessed the statistical importance of each predictor variable with Type II Wald F tests using the car package (; ). To understand the effect size of the statistically significant predictors, we used the emmeans package () to calculate the estimated marginal means (EMMs). For the main predictors, we visualized the EMMs using the ggplot2 package (), and we used the multcomp package () to compare the different categorical levels via t-tests, adjusting the p-values using the Tukey method. The confidence intervals in the graphs are for visualization purposes only, and do not reflect the t-tests performed in the pairwise comparisons between the EMMs.

For all models and comparisons, we assessed statistical significance at α = 0.05. We confirmed model assumptions were met by visually inspecting diagnostic plots using the performance package ().

Results

Participant demographics and participation

Respondents who completed both the pre- and post-surveys (N = 356) were more likely than those who completed only the pre-survey to contribute to the investigation (as measured by “number of phases” or “specific phase[s]”), to participate in another Bird Cams Lab investigation, to take the pre-survey during the first distribution campaign, to identify as female, and to be older (p ≤ 0.031; Table 2). There was no statistically significant difference in science training or in highest education level between those who completed both surveys and those who completed the pre-survey only (p ≥ 0.082).

Of the respondents who completed both surveys, most contributed to at least one phase, with the greatest percentage engaging in data collection (Table 2). When considering all the possible combinations of engagement, the largest group of participants engaged in all three phases, followed by those who engaged in data collection and data exploration (Figure 1).

Figure 1 

The number of respondents who completed both surveys (total N = 356) and contributed to one, two, or three phases of the scientific investigation.

Interest in birds

The average “interest in birds” score on the pre-survey was 4.84 (standard deviation [SD] = 0.40, N = 356, Range = 1–5; Figure 2), and the average post-pre difference in scores was 0.04 (SD = 0.34, N = 355; Figure 2). Most respondents (77%) had the maximum “interest in birds” score on the pre-survey and had a post-pre difference of 0. In the model analyses, the full models did not fit the data better than the intercept-only models (“number of phases” model: R² = 0.02, F(12, 295) = 0.58, p = 0.862; “specific phase(s)” model: R² = 0.06, F(16, 291) = 1.18, p = 0.285). For “interest in birds” and all following dependent variables, model output tables are available in the supplemental materials (Supplemental File 6: Model Output).
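
The comparison between a full model and an intercept-only model rests on the overall F statistic, which can be written in terms of the coefficient of determination as F = (R² / df_model) / ((1 − R²) / df_residual). The short Python sketch below evaluates this relationship; the function name is hypothetical, and because the R² values reported above are rounded to two decimals, the result only approximates the reported F values.

```python
def overall_f(r_squared, df_model, df_residual):
    """F statistic comparing a linear model with an intercept-only model."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    if df_model <= 0 or df_residual <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (r_squared / df_model) / ((1.0 - r_squared) / df_residual)


# Rounded R-squared and degrees of freedom from the "number of phases" model.
approximate_f = overall_f(0.02, 12, 295)
```

An F statistic near or below 1 indicates that the predictors together explain little more variance than would be expected by chance, which is why these models were not interpreted further.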

Figure 2 

For each dependent variable with a post-pre difference (1–6), distributions of (a) pre-survey scores and (b) post-pre differences in scores with the mean shown as a dashed line. Note that the ranges of the horizontal and vertical axes differ, and that this figure shows raw data without controlling for other variables included in the statistical analyses.

Self-efficacy

The average self-efficacy score on the pre-survey was 3.55 (SD = 0.67, N = 356, Range = 1–5; Figure 2), and the average post-pre difference in scores was 0.08 (SD = 0.51, N = 313; Figure 2). The more phases a respondent contributed to, the greater the change in their self-efficacy score from pre- to post-survey (F(3, 259) = 11.71, p < 0.001; Figure 3a). However, the increase in self-efficacy scores did not differ statistically significantly from that of respondents who did not contribute until respondents had contributed to at least two phases (Figure 3a). No covariates were statistically significantly associated with the post-pre difference in self-efficacy scores in this model (p ≥ 0.063; Supplemental File 7: Type II Wald F Test Results).

Figure 3 

The predicted mean post-pre difference in self-efficacy scores (a) increased with the number of phases in which a respondent participated, and (b) showed the greatest increases for combinations that included data collection. Each point is the Estimated Marginal Mean (EMM) post-pre difference in self-efficacy scores with 95% confidence intervals (CI). Means not sharing any letter are statistically significantly different from the other means by the t-test adjusted using the Tukey method. The horizontal dashed line represents the score for those who did not participate at all, and black squares indicate which values are statistically significantly different from the baseline (“no phases”). The range on the y-axis reflects the range of values in the matched pre- and post-survey data.

In the second model, the “specific phase(s)” in which the participants engaged was associated with the post-pre difference in self-efficacy score (F(7, 255) = 7.77, p < 0.001), with the greatest increase in self-efficacy scores seen for respondents who engaged in the data collection phase (Figure 3b). Education was the only other predictor variable associated with the post-pre difference in self-efficacy score (F(4, 255) = 2.61, p = 0.036). Those who had a doctorate degree had a post-pre difference in self-efficacy score that was 0.37 points higher than that of those who had an associate degree (p = 0.028). The other covariates were not statistically significantly associated with the post-pre difference in self-efficacy scores in this model (p ≥ 0.095; Supplemental File 7: Type II Wald F Test Results).

Content knowledge

The average content knowledge score on the pre-survey was 3.91 (SD = 2.50, N = 308, Range = 0–9; Figure 2), with scores increasing by 1.33 points, on average, from pre- to post-survey (SD = 2.14, N = 290; Figure 2). The more phases a respondent contributed to, the greater the change in their content knowledge score (F(3, 251) = 20.51, p < 0.001; Figure 4a). The increase in content knowledge scores leveled off at two phases, such that there was not a statistically significant difference between two and three phases (Figure 4a). Additionally, the post-pre difference in content knowledge score was associated with the survey distribution campaign: respondents who completed the pre-survey in the second campaign increased their score pre- to post-survey, on average, by 0.52 more points than those who completed the pre-survey in the first campaign (F(1, 251) = 3.97, p = 0.047). The post-pre difference in content knowledge score was not associated with the other covariates (p ≥ 0.176; Supplemental File 7: Type II Wald F Test Results).

Figure 4 

The predicted mean post-pre difference in content knowledge scores (a) increased with the number of phases in which a respondent participated, and (b) showed the greatest increases for combinations that included data collection. Each point is the EMM post-pre difference in content knowledge scores with 95% CI. Means not sharing any letter are statistically significantly different from the other means by the t-test adjusted using the Tukey method. The horizontal dashed line represents the score for those who did not participate at all, and black squares indicate which values are statistically significantly different from the baseline (“no phases”). The range on the y-axis reflects the range of values in the matched pre- and post-survey data.

In the second model, “specific phase(s)” was associated with the post-pre difference in content knowledge scores (F(7, 247) = 12.10, p < 0.001), and participants who engaged in the data collection phase showed the greatest improvement in scores (Figure 4b). None of the covariates were statistically significantly related to the post-pre difference in content knowledge scores (p ≥ 0.135; Supplemental File 7: Type II Wald F Test Results).

Science inquiry

For the two questions measuring science inquiry skills, the average scores on the pre-survey were 4.40 (SD = 1.05, N = 308, Range = 0–5) for “identifying answerable research questions” and 1.72 (SD = 0.51, N = 347, Range = 0–2) for “interpreting data,” and the average post-pre differences in scores were 0.16 (SD = 0.97, N = 299) and 0.02 (SD = 0.63, N = 334), respectively (Figure 2). Most respondents scored perfectly on each question on the pre-survey (identifying answerable research questions: 66%, interpreting data: 74%). Additionally, most respondents had a post-pre difference of 0 for both questions (62% and 69%, respectively). In the model analyses, the full models did not fit the data better than the intercept-only models (identifying answerable research questions: “number of phases” model: R² = 0.04, F(12, 260) = 0.95, p = 0.500; “specific phase(s)” model: R² = 0.06, F(16, 256) = 1.05, p = 0.403; interpreting data: “number of phases” model: R² = 0.06, F(12, 284) = 1.51, p = 0.120; “specific phase(s)” model: R² = 0.06, F(16, 280) = 1.19, p = 0.273).

The average self-reported improvement in science inquiry skills score was 3.47 (SD = 0.73, N = 280, Range = 1–5; Figure 5). In the first model, the more phases a respondent contributed to, the greater their score (F(3, 236) = 11.53, p < 0.001; Figure 6a). However, respondents who contributed to one phase did not have a score statistically significantly different from those who did not contribute; statistically significant differences were seen for respondents who contributed to at least two phases (Figure 6a). Additionally, respondents who participated in another Bird Cams Lab investigation had science inquiry scores that were 0.32 higher (or 9.47% higher) than those who did not (F(1, 236) = 11.30, p < 0.001). None of the other covariates were statistically significantly associated with the score (p ≥ 0.181; Supplemental File 7: Type II Wald F Test Results).

Figure 5 

For the self-reported improvement in science inquiry skills score, the distribution of scores with the mean shown as a dashed line.

In the second model, “specific phase(s)” was associated with the improvement in science inquiry skills score (F(7, 232) = 5.24, p < 0.001; Figure 6b). Respondents who contributed to all three phases had the greatest improvement in science inquiry skills compared with those who did not contribute (p < 0.001); no other combinations of phases were statistically significantly different compared with those who did not contribute at all (p ≥ 0.354; Figure 6b). Similar to the first model with “number of phases” as the main predictor, respondents who participated in another Bird Cams Lab investigation had science inquiry scores that were 0.29 higher (or 8.90% higher) compared with those who did not (F(1, 232) = 8.82, p = 0.003). None of the other covariates were statistically significantly related to the score (p ≥ 0.155; Supplemental File 7: Type II Wald F Test Results).

Figure 6 

The predicted mean self-reported science inquiry improvement scores (a) increased with the number of phases in which a respondent participated, and (b) were not associated with greater increases for any one phase. Each point is the EMM score with 95% CI. Means not sharing any letter are statistically significantly different from other means by the t-test adjusted using the Tukey method. The horizontal dashed line represents the score for those who did not participate at all, and black squares indicate which values are statistically significantly different from the baseline (“no phases”). The range on the y-axis reflects the range of values in the post-survey data.

Behavior

The average behavior score on the pre-survey was 7.40 (SD = 1.64, N = 356, Range = 1–10; Figure 2), and the average post-pre difference in scores was 0.27 (SD = 1.32, N = 356; Figure 2). Most respondents (75%) selected at least seven of the 10 possible behaviors on the pre-survey. In the model analyses, the full models did not fit the data better than the intercept-only models (“number of phases” model: R² = 0.05, F(12, 296) = 1.50, p = 0.248; “specific phase(s)” model: R² = 0.05, F(16, 292) = 1.03, p = 0.430).

Discussion

Using a pre-post survey design, we found that as the degree of participation in the scientific process increased, so did changes in three learning outcomes: self-efficacy, content knowledge, and self-perceived improvement in science inquiry skills. Our findings build on what Shirk et al. () found: that individual-level outcomes were related to the degree of involvement in the scientific process. However, there was not a linear relationship between number of phases and amount of change or self-perceived improvement. When we considered the specific phases in which participants engaged, not just the number, we found that the impact of each phase was not the same.

Of the three phases participants could engage in, data collection was associated with the greatest increases in content knowledge regardless of engagement in question design and/or data exploration. This result builds on several studies that found a link between engagement in PPSR programs and increases in content knowledge (; ; ; ; ). Interestingly, we found this relationship between data collection and content knowledge even though additional pathways for engagement with the scientific process existed, suggesting that observing and recording observations through data collection is key to increasing knowledge. Such findings align with experiential learning theory, which suggests that “knowledge is created through the transformation of experience. Knowledge results from the combination of grasping and transforming experience” (). Indeed, Dickinson et al. () proposed that learning could be the greatest during data collection because when participants make observations, they start to form questions. Additionally, our findings build on an experimental study that found no difference in content knowledge gains between participants who engaged in data collection and those who engaged in data collection and data analysis ().

Similar to the content knowledge results, we found that data collection was associated with the greatest increases in self-efficacy, and that this relationship existed even with additional pathways for engaging in the scientific process. There was no difference in how much self-efficacy scores increased among those who engaged in data collection only, data collection and data exploration, or all three phases. Our findings support previous work that assessed overall participation in PPSR programs and participants’ self-efficacy (; ; ; ), but contrast with other work. Lynch et al. () found that participants’ self-efficacy was maintained as a result of participating in a contributory project, and Price and Lee () found self-efficacy decreased as a result of participation in a co-created project. In our study, data collection may have been associated with the greatest changes in self-efficacy because data collection skills were easier to gain compared with skills that were needed in question design or data analysis (). Perhaps the easier it is to gain skills, the greater a participant’s increase in self-efficacy in engaging in the investigation.

Interestingly, no single phase was associated with the highest scores reflecting participants’ self-reported improvement in science inquiry skills. This supports the hypothesis linking a greater degree of participation to more robust participant outcomes because only engagement in all three phases was associated with a science inquiry skill score that was statistically significantly different from that of the baseline group (no engagement) (; ). Additionally, our findings provide answers to questions proposed by Bonney et al. (): “To what extent do participants gain from projects because they help shape them?…other questions relate to the overall impacts of PPSR participation, including participation in areas of inquiry that have not been well studied such as…data visualization…” Participants who helped shape the investigation, participated in data visualization, and contributed to data collection had the highest self-perceived improvement in science inquiry skills. Our findings also support previous work that linked PPSR to gains in science inquiry skills (; ; ).

For “interest in birds,” observed science inquiry skills, and behavior, we found little to no change, and consequently no relationship with degree of participation. Our results contrast with previous work that found an increase in interest in the study topic through engagement in PPSR (; ). Phillips et al. () suggested pre-existing interest could be high, and as a result, it would not change via participation, which is exactly what we found for the “interest in birds” score. We focused our advertising efforts on the Cornell Lab’s existing audiences, a group generally already interested in birds and engaged in behaviors related to birds, science, and conservation before the investigation was underway, so there may have been little opportunity for increasing interest or changing behaviors within our sample population. Additionally, the lack of change in observed science inquiry skills may have resulted from questions that were too easy, given that the majority of respondents’ answers on the pre-survey were correct (Figure 2).

There was also no evidence for any relationship between learning gains and the other predictor variables in most models, with three exceptions. First, respondents whose highest education level was a doctoral degree increased their self-efficacy score more than those who had an associate degree. However, education was only related to change in self-efficacy in one of the two models, and there were no other statistically significant differences between the other education levels. Second, we found respondents who completed the pre-survey in the second survey distribution increased their content knowledge more than those who completed the pre-survey in the first distribution. Respondents who completed the surveys during the different campaigns may have been exposed to or understood the project differently, ultimately influencing how much they learned about the subject matter. However, again, survey distribution was only statistically significantly related to change in content knowledge in one of the two models. Third, respondents who participated in another Bird Cams Lab investigation had self-reported improvement in science inquiry skills scores that were higher than those who had not. That participation in other investigations was related to self-perceived improvement in science inquiry skills supports the claim by Bonney et al. () that gains in science inquiry skills require the opportunity for reflection on one’s role within the project and scientific process. If a participant was involved in another investigation, their exposure to the scientific process was greater.

Limitations and further considerations

While we successfully used a pre-post survey design to assess change in participant-level outcomes, and included a baseline group to compare against, there are still important limitations. First, our sample population was self-selected, biasing our sample toward those who are already interested in the research topic and/or scientific investigations, and who are willing to complete surveys. While having a baseline group helped mitigate this problem, we recommend that future studies randomly assign participants to treatment groups (e.g., ) and analyze actual participation data (e.g., log files, ) to establish causation as opposed to correlation. Second, we relied on quantitative survey measures, and potentially missed perceived changes detectable only via qualitative methods () or unintended outcomes. Third, we created new questions and customized standardized scales, which means that our results are not directly comparable to other studies. We encourage future work to use validated, unmodified scales (e.g., ). Fourth, we assessed outcomes for participants engaged in one investigation. There were multiple Bird Cams Lab investigations, and 42% of survey respondents who completed both surveys engaged in another investigation (Table 2). Recent work suggests that multi-project participation is the norm (), and project design, including the research topic, can influence learning (). Finally, we recognize that the degree to which participants are engaged in the scientific process is only one aspect of participation. Quality is another key aspect of participation (), as are the other dimensions of engagement that exist in PPSR programs (; ).

With regard to our finding that data collection was associated with the greatest increases in participants’ self-efficacy and content knowledge, there are two important considerations: 1) data collection was the phase in which most respondents who completed both surveys engaged (Table 2), and 2) some participants may not understand that they can engage in phases outside of data collection, or may lack the desire to do so. Regarding the first consideration, Bruckermann et al. () also found that when participants were given the opportunity to engage in data collection and data analysis, data collection was the phase most participants engaged in. We do want to note, however, that data collection was not the most popular phase in all Bird Cams Lab investigations. Regarding the second consideration, when we invited the public to engage in the question design phase, several participants were confused about their role and asked where and when they could start collecting data. Participants’ confusion as to their role in the scientific investigation is not new () and could have influenced their participation. Alternatively, participants may be satisfied with their role as data collectors and the role of professional scientists in leading the research process (), or their perceived role as data collectors could prevent them from taking on roles beyond data collector ().

Because our findings are based on an investigation that was entirely online and centered on basic research, we recommend that future research consider how the degree of participation relates to learning outcomes in investigations conducted in different settings and on different research topics. We expect participant motivations to differ depending on the context, and motivations can influence learning (; ). Additionally, we recommend that future research consider other types of analyses, such as path analysis (), because some of the outcomes we measured may be influencing each other and participants’ learning potential. We hope that future work builds on this study’s findings so that practitioners have a clearer picture of how best to involve the public in scientific research in order to meet their educational goals.

Conclusion

Our research provides insights into how the degree of participation in the scientific process relates to individuals’ learning outcomes. In support of the hypothesis proposed by Bonney et al. () and evidence found by Shirk et al. (), we found that the greater the degree of participation in the scientific process, the greater the changes in learning outcomes. However, perhaps more importantly, we found that even when we created opportunities for participants to engage in other stages of the scientific process, engagement in data collection was associated with greater gains in participants’ self-efficacy and content knowledge compared with other phases.

For programs with limited funding and resources that seek to influence participant-level outcomes, focusing efforts on data collection may have the greatest impact on content knowledge and self-efficacy. Additionally, data collection may be the phase in which the public prefers to participate; of those who completed both surveys in our study and contributed to the investigation, most engaged in data collection. However, if programs seek to increase participants’ science inquiry skills, we recommend investing in a co-created or collaborative process in which participants can be a part of other activities in addition to data collection.

While our study was not experimental, we were able to measure changes in learning outcomes with a pre-post survey design and compare these changes across varying degrees of participation, including no participation. Depending on which participant-level outcomes a PPSR program seeks to focus on and the resources available, creating opportunities for participants to engage in the scientific process outside of data collection may or may not be in the program’s best interest.

Data Accessibility Statement

The data and analyses supporting the findings of this study are openly available in Mendeley Data at https://doi.org/10.17632/9jnxkkcxxd.1.

Additional Files

The additional files for this article can be found as follows:

Supplemental File 1

Activities. DOI: https://doi.org/10.5334/cstp.594.s1

Supplemental File 2

Pre-Survey Questions. DOI: https://doi.org/10.5334/cstp.594.s2

Supplemental File 3

Post-Survey Questions. DOI: https://doi.org/10.5334/cstp.594.s3

Supplemental File 4

Dependent Variable Details. DOI: https://doi.org/10.5334/cstp.594.s4

Supplemental File 5

Survey Administration and Cleaning. DOI: https://doi.org/10.5334/cstp.594.s5

Supplemental File 6

Model Output. DOI: https://doi.org/10.5334/cstp.594.s6

Supplemental File 7

Type II Wald F Test Results. DOI: https://doi.org/10.5334/cstp.594.s7