Learning from the Trees: Using Project Budburst to Enhance Data Literacy and Scientific Writing Skills in an Introductory Biology Laboratory During Remote Learning

Citizen science projects can be used in college laboratory settings to allow students to gain hands-on experience in research during emergency remote learning. During the 2020 spring semester, we used the citizen science project, Budburst, in our introductory biology laboratory during the COVID-19-induced emergency remote learning period. The instructors were able to quickly adapt the project for emergency remote teaching because of the versatility of citizen science projects. The goals of this paper are to describe the project the students completed and to determine which data literacy and scientific writing skills were gained through the process. The students were provided with the research question: “How does temperature affect the phenophases of your trees?” Students collected their own data and downloaded Budburst data sets from the website to compare between years and to connect their results to long-term temperature data sets. The final project was a scientific paper based on their findings from both data sets. After the semester, a subset of papers was scored by two researchers using a previously validated rubric designed to evaluate students’ research skills. We evaluated students’ higher-order thinking by investigating their ability to develop a prediction statement, and to improve their qualitative skills by developing graphs, statements on the limitations for methods and results, and alternative explanations for their findings. We saw that using citizen science during remote teaching enabled the students to gain authentic research experiences and continue to improve their skill set even if they could not be in the laboratory.

CORRESPONDING AUTHOR: Deborah Lichti, University of Delaware, US


INTRODUCTION
One of the major goals of college introductory laboratories is to foster research experiences for the students (Gormally et al. 2009; Brownell et al. 2012; Spell et al. 2014; Brownell and Kloser 2015; Bakshi et al. 2016; Dolan 2016; Indorf et al. 2019; Lansverk et al. 2020). This concept has been around for some time and was brought to the forefront by the American Association for the Advancement of Science (AAAS) Vision and Change in Undergraduate Biology Education (Woodin et al. 2010; AAAS 2011). The activities included in authentic research laboratory experiences, including hypothesis development, interpretation of results, and encountering unknown outcomes, allow students to participate in science and develop important science practice skills (Weaver et al. 2008; Gormally et al. 2009; Brownell et al. 2012; Brownell and Kloser 2015; Clemmons et al. 2020). One way to incorporate science practice skills into introductory laboratories is to include citizen science projects as the foundation for authentic research.
Citizen science enables non-scientists to participate in the scientific process and has been incorporated into school curriculum (Miller-Rushing et al. 2012;Kobori et al. 2016). Citizen science projects allow students to be included in science research occurring around the world and to contribute to growing scientific knowledge (Miller-Rushing et al. 2012). As a result, participants in citizen science projects have seen a growth in their science literacy (Bonney et al. 2009;Vitone et al. 2016;Mitchell et al. 2017;Aristeidou and Herodotou 2020). Moreover, citizen science projects allow students to collect their own data while learning to investigate and incorporate large long-term data sets. These projects also give students opportunities to develop a variety of different scientific skills (e.g., collecting data, and analyzing and interpreting data). Student participation in citizen science data collection, especially that which occurs around their homes, allows students to make connections between their findings and scientific topics.
Citizen science projects have many positive aspects, such as fieldwork experiences, that make them beneficial components in introductory biology laboratory courses (Shah and Martinez 2016; Mitchell et al. 2017). Fieldwork can provide hands-on experiences for students to collect their own data outdoors and learn content from biological components (Easton and Gilburn 2012; Morales et al. 2020; Race et al. 2021; Barton 2020; Bacon and Peacock 2021). From research to fieldwork, the students learn the process of science and incorporate their knowledge gathered through their different experiences intertwined with citizen science projects.
The learning outcomes that can be attributed to citizen science not only meet the process skills of labs, including data collection and analysis, but also provide an opportunity for students to combine their data with previously collected data into a bigger data set, because the projects are designed so that any person can participate in data collection (Bonney et al. 2009;Shah and Martinez 2016;Mitchell et al. 2017). Many citizen science projects allow participants to search databases and to use the data collected over time and across locations, allowing students to gain experience with big data sets (Bonney et al. 2009;Mitchell et al. 2017). Authentic data is defined as quantitative or qualitative data that was gathered from real-life phenomena (Magnusson et al. 2004;Kastens et al. 2015;Kjelvik and Schultheis 2019). These types of data are found throughout the citizen science projects. Therefore, student use of authentic data in citizen science projects may allow them to improve their data literacy skills. Data literacy is demonstrated when students successfully work to analyze the data, interpret the information gathered from authentic data, and then communicate these findings (Gibson and Mourad 2018; Kjelvik and Schultheis 2019). Through incorporating all these skills into a project, students are able to expand on data literacy.

EMERGENCY REMOTE LEARNING AND LABORATORIES
Once the World Health Organization (WHO) declared COVID-19 a pandemic, higher education institutions worked quickly to move from in-person courses to emergency remote learning (temporary shift of instructional delivery to an alternative mode due to a crisis; Hodges et al. 2020). Once the COVID-19 pandemic started, instructors who were not already teaching online struggled to make laboratories accessible remotely and to continue to give students authentic, hands-on experiences (Race et al. 2021; Barton 2020; Bacon and Peacock 2021). Laboratory course instructors found it difficult to continue inquiry-based, hands-on, and field experiences when the students had to move off campus and everyone was restricted by stay-at-home orders (Race et al. 2021; Barton 2020; Bacon and Peacock 2021). One way we overcame these limitations of remote learning was to incorporate citizen science, specifically Budburst (Budburst 2020), into the remote teaching version of our introductory biology laboratory courses.
Budburst was founded in 2007 by climate science researchers who wanted to incorporate volunteers to collect data on plant phenology (Budburst 2020). The program was sponsored by the National Science Foundation and run by the National Ecological Observatory Network with the Chicago Botanic Garden (Budburst 2020). One overarching goal of Budburst is to collect data on plant phenology throughout all the seasons to determine how climate change affects these plants (Budburst 2020). The program has allowed participants to not only collect data but also engage with a long-term data set that is free to download from their website.
The purposes of this paper are to describe how we integrated citizen science into a remote teaching laboratory module and to identify the scientific skills students gained as demonstrated by an individual write-up of the project. First, we describe how we converted the citizen science project Budburst into a remote learning experience for an introductory laboratory. Then we assess the skills that the students developed by using long-term data sets from Budburst and students' data to communicate their findings on phenology. We investigated these skills by scoring an anonymous subset of individual research papers using a validated empirical and representational skills rubric. We predicted that students would develop scientific skills such as experimental design, quantitative methods, acknowledgment of method limitations, and data interpretation, despite the emergency remote nature of the course in spring 2020.

STUDY LOCATION AND DEMOGRAPHICS
This study consisted of students in the integrated introductory biology and chemistry course at an R1 institution in the mid-Atlantic region. The course is the second semester of a two-part introductory course that covers plants, evolution, ecology, and physiology. The students in the course are a majority freshmen (>90%), and are made up of biology majors (40%) and other life science majors including medical diagnostics, exercise science, biomedical engineers, and wildlife and ecology majors.

REMOTE LEARNING BUDBURST PROJECT
This remote-learning version of the Budburst project was adapted from the in-person version (Lansverk et al. 2020) and implemented during spring 2020, when the university moved from in-person teaching to remote learning on March 12, 2020, because of the COVID-19 pandemic. Students could complete this modified version while studying at home and following all the safety guidelines put forth by individual states and communities. Students were instructed to follow all local safety precautions. If the students could not collect their own data due to safety restrictions or were in areas that did not have trees, then they used a data set that was collected near the university by the course teaching team. The project continued to be aligned with AAAS Vision and Change competencies (Table 1) (Clemmons et al. 2020). This project was incorporated into the remote learning curriculum to allow students a hands-on experience with collecting field data, using large data sets, and analyzing and interpreting results during a time when all other hands-on activities were limited (Table 1).
The project was completed over a five-week period during remote learning ( week for the students to complete their individual paper assignment (Table 1; see also Supplemental Table 1). The students were provided the research question, "How does temperature affect the phenology of trees?" at the beginning of the project and were introduced to the Budburst website to learn more about phenology and the different stages that occurred during the spring season. Notable changes from the in-person project were the elimination of the physical collaborations of teamwork; instead, students completed the data collection individually by measuring the trees that were found at or near their homes.
The first week in remote learning, students were introduced to Budburst and learned about phenology and what stages could be seen during the spring for deciduous trees. The methods on Budburst instruct participants to identify a phenophase and determine whether a tree is in the early, middle, or late stages by percent. Students recorded percentages weekly to allow them to have a continuous data set and to visualize the changes in phenophase and stage for each of their trees. We worked with the students to develop an Excel data sheet to collect their individual and long-term data. Budburst now has a data sheet available on their website, which did not exist during this course. During the first week, the students took time to investigate the trees in their yard or local neighborhood, and pick three trees of identical species or three trees of different species to study. Students used Seek (www.inaturalist.org/pages/seek_app) from iNaturalist to help identify the trees, contacted a senior laboratory technician in the program that is trained as a botanist (titled: Botanist on Call) for assistance, or spoke with family members who planted the trees in their yard and knew the identity of the trees to determine the focal trees for their study. Students collected data on those specific trees over four weeks. Students were required to take GPS coordinates and photos of the focal trees throughout the four weeks to have visuals and to be able to ask any questions if they needed guidance on phenophase. GPS coordinates were used only during data collection to allow students authentic field research methods, and those coordinates were not shared with other students or included in the paper. Students also could document any issues that arose with the tree or phenophase as a result of weather; for example, there were major storm events in the region that resulted in all the flowers being blown away or even loss of the tree. 
All of these observations and data collections allowed the students to participate in field sampling, which prompted the students to observe the differences between field and laboratory data. These experiences allowed the students to reflect on the limitations of the data used for this project. Students could submit their data to Budburst but were not required to because of constraints in the species found in their surrounding areas and their collection location (e.g., students outside the USA). Students who could not collect their own tree data included those who lived in areas with especially strict local lockdown regulations.
During the third week of the project, the students were introduced to long-term data sets and were shown how to search, download, and work with the data in Budburst and Weather Underground (www.wunderground.com/). The students then used the Budburst website to find the closest geographic location and tree species to their own and then downloaded the data. The students had to work through the large data sets to determine what data was needed and if they could find the appropriate phenophase. For temperature data, students used Weather Underground's long-term data sets to determine how temperatures differed over time. During synchronous online laboratory time, students were taught how to graph the different data types using Excel and to interpret their results (Table 1). Each student wrote up their findings in an individual paper.
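The pairing of weekly phenophase observations with temperature records described above can be sketched in code. The following Python example is a minimal illustration under stated assumptions, not the workflow used in the course (students worked in Excel). The `Observation` record, the `pair_with_temperature` helper, the tree label, and every value shown are hypothetical and are not taken from student, Budburst, or Weather Underground data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One weekly record for one focal tree (hypothetical layout)."""
    day: date
    tree: str
    phenophase: str
    percent: float  # share of the tree in the phenophase, 0-100


def pair_with_temperature(observations, daily_mean_temp_c):
    """Attach the mean temperature recorded on each observation date.

    Observations with no matching temperature record are skipped, which
    mirrors the missing-data problem students had to work around.
    """
    rows = []
    for obs in sorted(observations, key=lambda o: (o.tree, o.day)):
        temp = daily_mean_temp_c.get(obs.day)
        if temp is None:
            continue
        rows.append((obs.tree, obs.day.isoformat(), obs.phenophase, obs.percent, temp))
    return rows


# Illustrative values only; they are invented for this sketch.
observations = [
    Observation(date(2020, 4, 6), "tree_1", "flowering", 10.0),
    Observation(date(2020, 4, 13), "tree_1", "flowering", 40.0),
    Observation(date(2020, 4, 20), "tree_1", "flowering", 75.0),
    Observation(date(2020, 4, 27), "tree_1", "flowering", 90.0),
]
daily_mean_temp_c = {
    date(2020, 4, 6): 11.0,
    date(2020, 4, 13): 13.5,
    date(2020, 4, 20): 12.0,
    # 27 April is deliberately missing to show how gaps are handled.
}

rows = pair_with_temperature(observations, daily_mean_temp_c)
for row in rows:
    print(row)
```

Each resulting row holds a tree, a date, a phenophase, a percentage, and a temperature, which is the shape needed to graph phenophase progress against temperature. Dropping unmatched dates is only one possible choice; a student could equally keep those rows and report the gap as a limitation.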
The individual paper allowed the students to not just participate in a citizen science project, but also discover how scientists might use the data sets from these projects in their own research, and it allowed students to refine the skill of communicating scientific findings (Supplemental Table 2). The students were able to describe phenophases for their trees and describe their findings for both their individual data set and the data set collected from the Budburst website. The students discussed the limitations of their own field data and the data set from Budburst in the methods and discussion sections. In the discussion, students identified what factors might be affecting phenology besides temperature, and any other limitations that were not directly related to the methods section. Finally, students were supposed to connect their findings to the bigger picture and reflect on how, in general, collecting these data could help study climate change.

QUANTITATIVE CODING
The students completed a consent form at the beginning of the semester that encompassed all assignments and surveys completed during their time in the integrated biology and chemistry course, which fulfilled the university's IRB requirements. The students' final individual writing assignment was used as the summative assessment for the ecology module in the biology course and for qualitative research assessment. All personal identification was removed and replaced with a random identification number. The researchers selected a random subset (n = 60) of the individual papers to code using an adapted and validated rubric based on the Assessment of Scientific Argumentation in the Classroom (ASAC) to evaluate laboratory argumentation skills (Walker et al. 2018;Walker et al. 2019). The original rubric consisted of 23 skills that we believed second semester biology majors would be able to obtain in the introductory biology laboratory course (Supplemental Table 3). These skills were determined prior to the start of the project.
The researchers were two of the four instructors in the laboratory component who oversaw the graduate teaching assistants but were not involved in the original grading of the individual papers during the course for course credit. The rubric was placed in Qualtrics and set up as a survey to gather all coding responses. We used the rubric that was validated for written argumentation or skills (Lansverk et al. 2020). We revalidated the rubric based on the remote learning paper assignment. We had an outside instructor help us validate the survey for the paper assignment to make sure all coders interpreted the description in the same manner and were not biased based on our knowledge of the project. The two coders completed two more rounds of validation by individually scoring three papers using the descriptions for each rubric item. The inter-rater reliability was then calculated, and if any item did not result in an 85% or greater score, those areas were reviewed and the description for that rubric item was updated until we came to a consensus for all items on the rubric. Once we were confident in the rubric and in the rubric descriptions, and the inter-rater reliability was at or greater than 85%, we each scored 60 random papers. We had a mean of 93% (range: 89% to 100%) for inter-rater reliability for all the papers. The coding was completed by presence or absence of the skill and if students completed those skills to a specific standard. For example, the figures had to have correct axis labels and be the accurate graph type for the data set. Once the scoring was complete, the data were totaled and percentages were calculated to determine how well the students were able to complete each skill.
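The agreement check between the two coders can be illustrated with a short calculation. The paper does not name the statistic behind the reported inter-rater reliability, so the Python sketch below assumes simple percent agreement on presence/absence codes (not a chance-corrected measure such as Cohen's kappa). The function names, rubric item names, and codes are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of papers on which two coders gave the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must score the same, non-empty set of papers.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def items_below_threshold(scores_a, scores_b, threshold=85.0):
    """Return rubric items whose agreement falls below the threshold.

    scores_a and scores_b map each rubric item to a list of 0/1 codes,
    one code per paper, in the same paper order for both coders.
    """
    return sorted(
        item
        for item in scores_a
        if percent_agreement(scores_a[item], scores_b[item]) < threshold
    )


# Hypothetical validation round with three papers and two rubric items.
coder_1 = {"hypothesis": [1, 1, 0], "figure_labels": [1, 0, 1]}
coder_2 = {"hypothesis": [1, 1, 0], "figure_labels": [1, 1, 1]}

flagged = items_below_threshold(coder_1, coder_2)
print(flagged)
```

Under this assumption, any flagged item would be the one whose description is reviewed and revised before the next validation round, as described above.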

DATA AND LIMITATIONS
When we scored for data collection and visual representation skills, we found most students were able to work both on their own data and on the long-term data set from Budburst. The students were able to determine the limitations of the experiment and the data sets used (100%) and locate relevant information (95%) from Weather Underground and Budburst to help expand the information used to answer their research question and support or reject their hypothesis/prediction (Figure 1). Students created appropriate figures (88.3%) and tables (83.3%) based on the data that needed to be included to discuss their findings (Figure 1). We found that students had a grasp on developing effective figure labels (71.7%) but struggled with the table labels (38.3%) (Figure 1). Finally, we defined troubleshooting technical issues to include weather events that resulted in either the students not being able to collect their data or the students describing how these events affected their trees and their study results. Few students discussed troubleshooting technical issues (18.3%) because few experienced the severe weather events: students were located throughout the mid-Atlantic region of the USA, and a few were international.

EXPERIMENTAL DEVELOPMENT AND PAPER CONCLUSION
Students were able to gather and use the literature in their introductions including in-text citations (93.3%; Figure 2). The experimental development skills included the creation of the hypothesis or prediction statement, the experimental design, interpretation of results, and conclusions (Supplemental Table 4). All the students included in the coding were able to design and incorporate key elements to their experimental design (100%; Figure 2). Even though all students were given a research question at the beginning of the project, only 63.3% of them included the actual question in the paper (Figure 2). Students were able to generate a hypothesis/prediction (95%), create a claim supporting or rejecting the statement (96.7%), and then use the evidence to support their claim (90%) (Figure 2). Some students did not include the interpretations of the figures (76.7%) and tables (66.7%) (Figure 2). In their conclusions, 73.3% of the students were able to discuss possible alternative explanations for the data and possible differences seen between trees of similar species (Figure 2). Students struggled to connect their findings about trees with changes in temperature and to identify additional information needed to support their results from individual and long-term data collection (23.3%; Figure 2).
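Because 60 papers were coded, each reported percentage should correspond to a whole number of papers. The Python sketch below converts the percentages reported in this section back into counts and checks that they round-trip. The counts are inferred arithmetic, not figures stated in the paper, and the shortened skill labels and helper names are our own.

```python
N_PAPERS = 60  # size of the coded subset reported in the paper

# Percentages as reported for the experimental development skills.
reported_percent = {
    "in-text citations": 93.3,
    "experimental design": 100.0,
    "research question stated": 63.3,
    "hypothesis/prediction": 95.0,
    "claim": 96.7,
    "evidence supports claim": 90.0,
    "figure interpretation": 76.7,
    "table interpretation": 66.7,
    "alternative explanations": 73.3,
    "additional information": 23.3,
}


def implied_count(percent, n=N_PAPERS):
    """Nearest whole number of papers matching a reported percentage."""
    return round(percent * n / 100)


def is_consistent(percent, n=N_PAPERS):
    """True if the implied count reproduces the percentage to one decimal."""
    count = implied_count(percent, n)
    return abs(100 * count / n - percent) < 0.05


implied = {skill: implied_count(p) for skill, p in reported_percent.items()}
for skill, count in implied.items():
    print(f"{skill}: {count} of {N_PAPERS} papers")
```

Reading the results as counts makes the gaps easier to compare than percentages alone, for example between students who made a claim and students who identified additional information.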

DISCUSSION
During the pandemic, most higher education laboratory courses transitioned to remote teaching, resulting in instructors investigating ways to incorporate projects for hands-on learning. Educators suggested using citizen science or backyard science to help continue these experiences during remote learning (Bacon and Peacock 2021; Richter et al. 2021). Though it is known that citizen science has been used in the classroom, there is very little in the literature about these experiences in higher education (Mitchell et al. 2017; Bacon and Peacock 2021). For example, one group used urban ecology and the students' backyards to help continue connecting the students to the ecology content and material and to allow the students to collect authentic data from their local areas during emergency remote teaching (Richter et al. 2021). By using Budburst in the laboratory course, we have helped to fill the gap in research about using citizen science in higher education by showcasing how citizen science was incorporated into an online introductory laboratory and by highlighting the skills developed by the students during emergency remote teaching.
We found that while completing this project remotely, students developed skills such as writing a prediction/hypothesis; developing and writing the methods, including the limitations of the methods; collecting individual data and incorporating data from databases; analyzing data to support research conclusions; and discussing alternative explanations for their findings. Students gained experience collecting data in the field and discovered how unexpected events could affect their data set. These practical skills were difficult to foster in remote learning because many laboratory courses lost the ability to have students collect field data (Barton 2020; Bacon and Peacock 2021). Students made positive comments to instructors in passing about the project, but these were not formally collected in any survey from the course. Students stated that the project allowed them to get outside when most of their time was spent indoors during the early days of the pandemic. One student told an instructor that they never really thought about the trees in their backyard and how they transition between phases in the spring, and this project opened their eyes to it. Overall, there were a few areas that expanded the students' abilities, and we discuss them in depth below. These gains might not have occurred in other remote learning activities.
Authentic data sets are an important component of laboratories that allow students to expand their data literacy skills. We found that students enhanced their data literacy by working with their individual data set and the larger data set from Budburst. Students were able to construct figures and tables and analyze and interpret the data sets. The students were able to construct their own figures (88%) and tables (83%) while remote. We found that students were able to incorporate a few features from authentic data sets. One feature of using citizen science projects is that students were able to work through the data-selection process from the website and then curate the data by determining what data was needed. Students were also able to describe the messiness of their original data collection and of the secondary data sets from Budburst. These processes were described in the features of using authentic data sets (Kjelvik and Schultheis 2019). Messiness in data is important to foster critical thinking skills in students and to help them recognize that data variability is common (Kjelvik and Schultheis 2019). Data messiness also allowed students to work through issues when data were missing (Kjelvik and Schultheis 2019). In their papers, students described the messiness of the data sets, and how missing data and the variation could have affected their results. For example, one student wrote, "Also, I was limited with the amount of data I was able to find on the Budburst website. I tried to use data that was from the state I live in, but it was difficult to find a wide range of data from Pennsylvania. Some of my data points were from an unknown location in New York… Overall, the lack of data available from Budburst may have affected the results."
The differences we saw in the limitations were based on the locations of the students' study sites, which allowed the students to drive the explanation of the results, and allowed the students to own their data and give more in-depth discussions of possible alternative explanations for their data sets.
Students struggled to support their claims with evidence and additional information, with only 23% of students using scientific evidence to support their findings. Even though these findings might be discouraging, we felt that they fell under the category of "desirable difficulties." Desirable difficulties are found throughout STEM, and allow students to fail and then overcome these challenges (Bjork 1994a, 1994b; Kapur and Bielaczyc 2012). Since students struggled with some components of the discussion, we should investigate further to determine how we could improve the module or assignment to help them through these difficulties. One aspect of the discussion in which students succeeded was describing alternative explanations and assessing how other parameters could affect the different phenophases, which is a step in the right direction and allows students to incorporate their findings in the conclusions.
Another skill that students were able to demonstrate in their papers was the ability to describe the limitations of the methods and the project. Students were able to evaluate the methods and how those limitations would affect how data was collected and interpreted. Being able to describe limitations in science is a key component of science literacy. All of our students (100%) described some limitations during the project. The major method limitations students discussed were the length of the experiment, the fact that data was not always available on Budburst, and the students' ability to identify the trees and then determine the percent for each phenophase. Students also described limitations of their data sets, including the short collection period, the large variation even among similar trees, and the availability of data on Budburst for their trees or their locations. When describing the limitations on data interpretation, students would describe weather events (e.g., high winds or severe thunderstorms) that could affect collecting data or affect the data that was collected. For example, a student commented in their paper, "One factor that could have affected my data is the large amount of storms my area has experienced this month. The first day I took observations for my trees there was a big storm the night before, which caused the majority of the flowers to fly off the cherry tree. If this storm did not occur, there may have been a higher percent of the tree in the flower phenophase than recorded." Being part of the process allowed the students to observe limitations firsthand and determine how far they could interpret the data collected and selected from the database.
One challenge seen when using citizen science data sets was the ability for scientists and policy makers to agree that the data was rigorous enough to use. They question using citizen science because of credibility, completeness, noncomparability, and possible bias of the data sets (Conrad and Hilchey 2011;Golumbic et al. 2017). We saw a similar phenomenon with some students questioning if the data from the database was accurate and valid. One student wrote in their paper, "Also, the data received from Budburst also affect my results because even though it is a great website, one can never know the accuracy of it." Another student stated, "A limitation of this study is that there is no way to know the accuracy of the budburst data since it is given from everyday citizens." Another student commented in their paper, "Lastly, with citizen science there is going to be bias and opinion that play a role in the observations made. Everyone sees pieces differently which limits the reliability of data." Mitchell et al. (2017) found a similar response from freshmen students participating in ClimateWatch, with 31% agreeing that the data was reliable by the end of the project. We want to further investigate this view that students have on volunteers collecting data sets and what would need to be included to allow students to feel comfortable using the data sets.

LESSONS LEARNED AND RECOMMENDATIONS
The first lesson learned was that the project needed more scaffolding and structure to help guide the students through the five-week module, and more time during online synchronous labs should be dedicated to student discussion, enabling them to work through the difficulties that arose. In future semesters, we would include some smaller formative assignments (weekly activities) to help students through the challenges they faced. For example, to help students connect their findings to a greater science concept, we would ask them to describe climate change and how it relates to phenology, budgeting time for instructors to check figures, tables, and their corresponding captions. Finally, we would set aside time, during the laboratory and studio (the 30 minutes allotted for discussion prior to laboratory) times, to encourage students to work through how phenology can fit into the larger picture of climate change, and how data collected by the public can be trusted as part of rigorous data sets. On the basis of the semester length and the timing of phenophases, we would try to extend the time for students to collect data from just four weeks to the whole semester.

CONCLUSION
In conclusion, we recommend integrating citizen science projects into introductory laboratories to give students the opportunity to continue to expand their knowledge and skills as scientists. Incorporating citizen science can be done for a versatile classroom experience. Further follow up will be needed to fully understand the students' perception of the validity of citizen science data sets. Hopefully, as instructors start to incorporate more citizen science into their courses, the data sets will expand. We feel that citizen science projects enable students to interact with the environment around them, and hope students will continue to participate in them even after the course is over.

DATA ACCESSIBILITY STATEMENT
Data not available owing to confidentiality of student information. The individual papers were not shared because others might want to complete this assignment, and we do not want well-meaning students to find the examples. Please reach out to the corresponding author for example papers. The researchers' coding of the papers can be found at https://drive.google.com/drive/folders/1Ee8Dh7wj84a3Qr45IKa-rckoPco-F-4b?usp=sharing.

SUPPLEMENTARY FILES
The supplementary files for this article can be found as follows: • Supplemental

ETHICS AND CONSENT
We obtained IRB approval and student consent through the University of Delaware (#1413165-4). The student consent was for any assignment or survey conducted during the course.