Data Quality from a Community-Based, Water-Quality Monitoring Project in the Yukon River Basin

This paper examines the quality of data collected by the Indigenous Observation Network, a community-based water-quality project in the Yukon River Basin of Alaska and Canada. The Indigenous Observation Network relies on community technicians to collect surface-water samples from as many as fifty locations to achieve its goals of monitoring the quality of the Yukon River and major tributaries in the basin and maintaining a long-term record of baseline data against which future changes can be measured. This paper addresses concerns about the accuracy, precision, and reliability of data collected by non-professionals. The Indigenous Observation Network data are examined in the context of a standard data life cycle (plan, collect, assure, and describe) and compared with professional scientific activities. Field and laboratory protocols and procedures of the Indigenous Observation Network are compared to those used by professional scientists. The data of the Indigenous Observation Network are statistically compared to those collected by professional scientists through a retrospective analysis of a set of water-quality parameters reported by all three projects over a number of years. No statistical differences were found among the three projects for pH, Calcium, Magnesium, or Alkalinity, although statistically significant differences were found for Sodium, Chloride, Sulfate, and Potassium concentrations. The statistical differences found were small and likely not significant in terms of interpreting the data for a variety of uses. Our results suggest that Indigenous Observation Network data are of high quality, and that with consistent protocols and participant training, community-based monitoring projects can collect data that are accurate, precise, and reliable.


Introduction
When using Community-based Monitoring (CBM) data for research or making decisions, scientists and policy makers are understandably concerned about data accuracy and reliability. Several studies have demonstrated that CBM data are similar in accuracy and reliability to those collected by professional scientists (Mellanby 1974; Reynoldson et al. 1986; Au et al. 2000; Fore et al. 2001; Shelton 2013; Danielsen et al. 2014). A study by Mattson et al. (1994: 10) compared data from surface-water samples collected by volunteers and professionals taken on the same day at the same location and found only minor differences, concluding, "… that volunteers can collect samples and meet high quality control standards." Jollymore et al. (2017) studied the comparability of samples collected by citizens with those collected by researchers through a resampling of citizen sample sites and a statistical comparison of two laboratory-analyzed parameters. They found no significant difference in dissolved organic carbon, although nitrate was significantly greater in the citizen-collected samples (Jollymore et al. 2017). Because researcher and citizen sampling locations did not exactly overlap, these researchers concluded that greater nitrate concentrations reflected not inaccurate data but a citizen bias toward sampling more impacted sites (Jollymore et al. 2017). Side-by-side field measurements made by trained citizens and a water professional were compared by Shelton (2013), who found no significant differences in all but one measurement. Shelton (2013) concluded that further training in instrument calibration and field methods may be necessary for some parameters and more complex field instruments. Nicholson et al. (2002) statistically compared data collected by a CBM network with parallel data collected by scientists at the same streams over a number of years. The results were similar to those described for other studies, with significant differences found in some, but not all, parameters.
Despite evidence that CBM data can equal professional data in quality, continued skepticism implies that scientists and policy makers remain less likely to trust conclusions based on CBM (Riesch & Potter 2013). At a time when environmental monitoring is of growing importance for establishing baselines against which to measure future changes brought about by changing climate regimes, CBM programs have been growing, while a bias against the quality of the data persists (Bird et al. 2014; Buckland-Nicks, Castledon, and Conrad 2016; Freitag et al. 2016).
This paper presents a case study examining the data collection activities and resulting data of a CBM project, the Indigenous Observation Network (ION), in the Yukon River Basin (YRB) of Alaska and Canada. Data collection protocols and data quality are examined in the context of a standard data life cycle of planning, collecting, assuring, and describing data (Ball 2012) that supports accurate, precise, and reliable data collection. We begin with background information about ION to provide context for our study.

Indigenous Observation Network
The YRB is the fourth largest drainage basin in North America, encompassing 855,000 square kilometers in northwestern Canada and central Alaska (Brabets et al. 2000), and is home to 76 Tribes and First Nations. Originating in British Columbia, Canada, the Yukon River (YR) flows 3,701 kilometers through one of the largest and most diverse ecosystems in North America to its outlet into the Bering Sea in southwestern Alaska (Brabets et al. 2000). The YR is the largest river in Alaska and the longest free flowing river in the world (Nilsson et al. 2005; Walvoord and Striegl 2007). Apart from a small dam located in the headwaters, its flow is unimpeded by locks, dams, and levees. It is fundamental to the ecosystems of the Bering and Chukchi Seas as it provides most of the freshwater runoff, sediments, and dissolved solutes to these systems (Lisitsysn 1969; ACIA 2005). Similar to other Arctic rivers, the YR has two distinct flow regimes: a winter period dominated by low flows when the river is frozen and the rest of the year, which is characterized by the spring runoff season (Gordeev et al. 1996; Walvoord and Striegl 2007; Brabets and Walvoord 2009; Holmes et al. 2012). Ion concentrations, which are relatively low in the YR, are also similar to those of other large Arctic rivers due to slow weathering processes and low vegetation growth (Gordeev et al. 1996; Holmes et al. 2011). These characteristics of Arctic rivers, combined with a low population density, mean that the YR is relatively pristine compared to other rivers of its size. The remoteness of this region has made it historically difficult to collect scientific data, which, until recently, has led to a lack of information characterizing this system.
US Geological Survey (USGS) hydrologists began working with the Yukon River Inter-Tribal Watershed Council (YRITWC) in 2005 to develop and design a community-based water-quality project, including a Quality Assurance Project Plan (QAPP), which today is known as ION (YRITWC 2017). The ION is collaboratively managed by the YRITWC and the USGS. The YRITWC is an Indigenous non-profit organization with a mission of protecting and preserving the YR for future generations (YRITWC 2016). ION surface-water samples are collected across the YRB by staff from the Tribal Environmental Program or First Nation Lands and Resources departments who reside in the YRB. Over 35 Alaska Native and Canadian First Nation communities have contributed to the data collection efforts of ION.
As ION surface-water samples are collected by the staff of Tribal Environmental Program or First Nation Lands and Resources departments, they are not really collected by "volunteers," because the staff are paid for the time that they spend sampling. However, ION community technicians are not paid by the USGS or the YRITWC for sample collection and are not professional scientists. Still, people who participate in CBM or citizen science initiatives generally care about the environment, are comfortable outdoors, and have some awareness of the scientific process (Cohn 2008), and this is true of ION community technicians. Alaskan community technicians are funded by the US Environmental Protection Agency - Indian General Assistance Program (EPA-IGAP) (EPA 2017), and are eligible to receive regular environmental monitoring and science training. Furthermore, qualitative research showed that the majority of ION community technicians had attended college (Wilson 2017). Thus, ION community technicians are likely to be more aware of scientific processes and more engaged in environmental issues before participation in the project than the average citizen.
The YR at Pilot Station (Figure 1) receives drainage from the entire YRB above the Yukon Delta and is the farthest downstream location where the river is contained in a single channel and unaffected by tidal fluctuation. These qualities have made this location important for both water-quality and water-quantity monitoring and research. The USGS was engaged in intensive studies in the YRB from 2000 to 2005 and today continues to collect samples and maintain a streamgaging station (USGS station number 15565447) at the YR at Pilot Station as part of the National Water Quality Network (NWQN) (USGS 2015). Pilot Station is also a water quality and quantity sampling location for the Arctic Great Rivers Observatory (Arctic GRO) project at Woods Hole Research Center (WHRC) (Arctic GRO 2016). The operation of two professional monitoring and research projects and one CBM project at a single location presents the opportunity to compare sample collection and field measurement protocols, laboratory procedures, and data, and thus to assess how well ION, as a community-based project, compares with professional scientific projects.

Methods
We compared the procedures used by the ION project and professional water-quality monitoring and research activities of the NWQN and Arctic GRO to analyze the appropriateness of the protocols used by ION to meet its goals and to collect high-quality data. Community members have collected samples at more than 50 locations across the YRB over the decade that ION has been operating (Figure 1). This study examined only previously collected data from Pilot Station, as this is the only ION sampling location that has a long-term data record collected by professional scientists in addition to the long-term data record collected by ION community technicians.

Field and Laboratory Protocols
Sample collection protocols were obtained from the YRITWC "Water Quality Monitoring Field Manual" (YRITWC 2016) provided to community technicians and available on the YRITWC website; the NWQN "Fiscal Year 2017 Work Plan Guidance" provided to USGS hydrologic technicians; and the Arctic GRO "Field collection and analytical details" (Arctic GRO 2016) document provided to samplers for that program. All documented protocols were verified with technicians and project coordinators from each project. Each project uses a different laboratory to analyze its samples. While analyzing the quality assurance and quality control (QA/QC) data from each laboratory to examine laboratory quality is beyond the scope of this study, laboratory protocols were examined to compare standard operating procedures (SOP). The protocols and procedures were obtained from the NWQL website (USGS 2016), communication with the USGS Project Laboratory managers (Uhle, 2017 personal communication), and Arctic GRO documentation (Arctic GRO 2016a). This study assumes that each laboratory is producing data of sufficient scientific quality. Samples were collected by the various groups at the location of the USGS streamgage. The NWQN dataset consisted of 54 samples; that of Arctic GRO had 28 samples; and the ION dataset had 42 samples. Though most samples were collected on different days and at different times, there were instances of collection by different projects on the same day or within two days of one another. Specifically, 17 NWQN and Arctic GRO samples were collected within two days of one another, and 15 ION samples were collected within two days of either an NWQN or an Arctic GRO (professional) sample. In all cases analyzed here, samples collected within two days of each other represent a change in river discharge of less than 5%.
This amount is within the USGS error when measuring discharge (Turnipseed and Sauer 2010), meaning that discharge is essentially identical, and samples could be treated as replicates (See Supplementary Materials for analyses).
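The replicate criterion described above (collection within two days and a discharge change under 5%) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function name, the tuple layout, the choice to divide by the smaller discharge, and all example values are assumptions made for the sketch.

```python
from datetime import date

def pair_replicates(samples_a, samples_b, max_days=2, max_q_change=0.05):
    """Pair samples from two projects that can be treated as replicates.

    Each sample is a (collection_date, discharge) tuple. A pair is kept
    when the dates differ by at most max_days and the relative change in
    discharge is below max_q_change. Dividing by the smaller discharge
    is an assumption of this sketch.
    """
    pairs = []
    for day_a, q_a in samples_a:
        for day_b, q_b in samples_b:
            gap_days = abs((day_a - day_b).days)
            rel_change = abs(q_a - q_b) / min(q_a, q_b)
            if gap_days <= max_days and rel_change < max_q_change:
                pairs.append(((day_a, q_a), (day_b, q_b)))
    return pairs

# Invented example values; these are not ION, NWQN, or Arctic GRO data.
project_a = [(date(2010, 6, 1), 10000.0), (date(2010, 8, 15), 8000.0)]
project_b = [(date(2010, 6, 3), 10300.0), (date(2010, 9, 20), 6000.0)]

replicate_pairs = pair_replicates(project_a, project_b)
print(len(replicate_pairs))
```

Only the first sample from each invented list qualifies: the dates are two days apart and the discharge differs by 3%.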
Statistical analysis applied to the whole dataset, regardless of sample collection date, included assessment of frequency of abnormal values by examining box-plots, stem and leaf diagrams, and concentration vs. date plots. Analysis of variance (ANOVA) and its non-parametric counterpart, the Kruskal-Wallis technique (Gibbons 1976), were also applied to the data to further determine differences. Each variable was examined for statistical normality using the Anderson-Darling (AD) test (Anderson and Darling 1952, 1954) as well as probability plots to assess the suitability of using ANOVA. In cases for which it was deemed that ANOVA could give misleading results, Kruskal-Wallis was used to determine the equality of group medians. The possible confounding factor of time was considered by examining the year that the samples were collected, divided into four two-year categories: 2007-08 ("Pre-GRO"); 2009-10 ("Early"); 2011-12 ("Mid"); and 2013-14 ("Late") (USGS 2017; Herman-Mercer 2016; Arctic GRO 2016b). Because Arctic GRO did not collect samples during the first two-year category (2007-08), an ANOVA was done without this group. A paired t-test was applied to those samples considered replicates (detailed results of all statistical analyses completed, but not presented here due to the constraints of space, are provided in Supplemental Materials).
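To make the non-parametric step concrete, the following is a minimal standard-library sketch of the Kruskal-Wallis statistic for three groups, as would apply to the three projects. It is an illustration under stated assumptions rather than the software used in the study: it omits the tie correction, relies on the chi-square approximation for the p-value, and does not include the Anderson-Darling normality check. The function names are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_three_groups(g1, g2, g3):
    """Return (H, p) for three groups, without a tie correction."""
    groups = [g1, g2, g3]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # With three groups, H is compared to a chi-square distribution with
    # 2 degrees of freedom, whose upper-tail probability is exp(-H / 2).
    p = math.exp(-h / 2)
    return h, p

# Invented values chosen so the groups do not overlap at all.
h_stat, p_value = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h_stat, p_value)
```

A small p-value indicates that at least one group tends to rank higher or lower than the others; it does not say which one, which is why pairwise follow-up comparisons are still needed.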

Results
Results are organized following the steps of a standard data lifecycle: Planning, collecting, assuring, and describing (Ball 2012). The results of planning and collecting are given through a comparison of ION protocols with standard scientific field and laboratory protocols. The lifecycle step assuring is presented by describing the QA/QC and community technician training and scientific oversight practices of ION. Data accuracy is analyzed through the retrospective statistical comparison of data reported by all three projects. Finally, describing is presented through an account of ION dissemination practices.

Planning: Field and Laboratory Procedures
Data accuracy, precision, and reliability can be achieved by following rigorous, standard protocols in the field and laboratory. The NWQN follows the protocols of the USGS field manual (USGS variously dated). ION was designed using USGS protocols for sample collection, processing, and laboratory analysis. The locations of ION sampling sites are also based on USGS procedures and are chosen in collaboration between the community and YRITWC staff facilitated by a site visit. This ensures that samples are collected upstream of a community and removes the potential of community technicians choosing a biased sampling location, thereby increasing the accuracy of representing the chemical constituents of the stream. The Arctic GRO project also follows a modified USGS sample collection protocol.

Collecting: Surface-Water Samples
Surface-water sample collection is accomplished by ION community technicians through the collection of what is referred to as a grab sample, which is collected from a single point in the river. The sample is collected from a boat mid-river in the main flow of the river, and a sample rod is used to facilitate sampling roughly 32 centimeters below the water surface to ensure a representative sample. This differs from sample collection procedures described in the USGS national field manual (variously dated) and followed by USGS hydrologic technicians, who collect an equal-discharge-increment (EDI) sample, which consists of five depth-integrated samples across the river channel. The five samples are then combined to form a composite sample for analysis. Samplers for the Arctic GRO project collect three grab samples at a depth similar to that used by ION technicians: 1 L from the left bank of the river channel, 2 L mid-river, and 1 L from the right bank of the river channel. The total volume of the three grab samples is combined for analytical purposes. Arctic GRO and ION projects supply samplers with kits containing all required sampling supplies prepared and cleaned at their respective facilities, decreasing the possibility of contamination introduced in the field by technicians, which would result in a less accurate sample. Community technicians ship the samples the same day that they are collected via planes that arrive in the communities on a regular schedule. All samples are preserved and kept chilled (USGS variously dated) and required to meet a two-week hold time, accomplished by shipping a cooler overnight from Anchorage, AK to Boulder, CO.
Instantaneous field measurements are collected by all three projects. The pH is measured directly from the river upstream of the boat by ION technicians and at the mid-river sampling location directly from the river by Arctic GRO samplers. USGS hydrologic technicians measure pH from each of the five EDI locations and report the median value. The NWQN project has samples analyzed at the USGS National Water Quality Laboratory (NWQL) in Denver, CO; ION samples are analyzed at USGS Project Laboratories in Boulder, CO; and samples collected for the Arctic GRO project are sent to specialized laboratories for analysis. Anions are analyzed via Ion Chromatography (IC) by all three laboratories, whereas cations are analyzed via Inductively Coupled Plasma-Atomic Emission Spectroscopy (ICP-AES) at NWQL and USGS Project Laboratories and via Inductively Coupled Plasma-Mass Spectrometry (ICP-MS) at WHRC. Results of examining field protocol and analytical similarities and differences between the programs are highlighted in Table 1. QA/QC procedures followed in the field as part of ION include the collection of two separate field replicate samples and one field blank sample throughout the season. This allows for the detection of any contamination or cross-contamination that may be introduced during sampling (Herman-Mercer 2016) that would affect the accuracy of the sample. The examination of laboratory SOP revealed QA/QC measures taken by each laboratory. The USGS Project Laboratories QA/QC procedures include running 10% of samples as a laboratory replicate to assess instrument precision. Any replicate samples found to differ by more than 10% are re-analyzed. Additionally, Standard Reference Samples (SRS) provided through participation in the USGS Branch of Quality Systems (Long et al. 1998) are also analyzed with environmental samples.
The SRS contain known amounts of various constituents measured in environmental samples and provide further assessment of the precision of analytical instruments and laboratory procedures, and thus the precision of the resulting data. These QA/QC procedures are similar, and in some cases identical, to those followed by the NWQL and the analyzing laboratories for the Arctic GRO project. More detailed information concerning specific laboratory methods can be found at the NWQL and Arctic GRO project websites (USGS 2016 and Arctic GRO 2016, respectively). Beyond the laboratory QA/QC SOP, data quality is further assured through regular training of community technicians. The ION project begins each spring with a community technician training required annually for project participation. The training consists of teaching: (1) proper surface-water sampling procedures following USGS protocols (USGS variously dated); (2) proper calibration and utilization of instruments to collect instantaneous field measurements; and (3) recording calibration readings, measurements, and other pertinent information and observations on a standard field sheet. A training manual and video with step-by-step instructions in sample collection, instrument calibration, taking field measurements, sample preservation, and shipping requirements is also available on the YRITWC website for community technicians to refer to throughout the sampling season (YRITWC 2016).
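The laboratory replicate rule described above (replicates that differ by more than 10% are re-analyzed) can be illustrated with a short sketch. The text does not state how the percent difference is computed; the relative percent difference used here, which divides by the mean of the pair, is an assumption, as are the function names and example concentrations.

```python
def relative_percent_difference(original, replicate):
    """Absolute difference as a percentage of the mean of the two results.

    This formula is an assumption; the laboratory SOP may define the
    percent difference differently.
    """
    mean = (original + replicate) / 2
    return abs(original - replicate) / mean * 100

def needs_reanalysis(original, replicate, limit_percent=10.0):
    """Flag a replicate pair that differs by more than limit_percent."""
    return relative_percent_difference(original, replicate) > limit_percent

# Invented concentrations in mg/L, for illustration only.
print(needs_reanalysis(10.0, 11.0))
print(needs_reanalysis(10.0, 12.0))
```

The first pair differs by about 9.5% and passes; the second differs by about 18% and would be flagged for re-analysis.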
Field sheets are thoroughly checked by YRITWC and USGS project personnel to ensure that all measurements taken in the field were completed and all sample bottles were filtered and filled. Any issues found when samples arrive at the YRITWC offices are noted on the field sheet to alert USGS project personnel before sample analysis. If problems with the field sheet or the sample itself are found, the community technician is contacted by YRITWC to explain the issue and correct any problems in protocol before the next sample is collected, reinforcing the importance of consistent protocol.
Consistency in protocol can depend to a certain extent on consistent project personnel. As of 2015, community technicians had been involved with the project for an average of four years (range 1-9 years) and had attended trainings at least once each year since becoming involved in the project (Wilson 2017). Descriptive statistics for sample collection by community technicians for the years 2006-2014 for all ION sampling locations are shown in Table 2; the same statistics specifically for Pilot Station are highlighted in Table 3. Note that two community technicians are typically involved in sample collection at each location.

Assuring: Data Accuracy and Reliability
Reliability has been linked to how well data ranks on accepted characteristics (Agmon and Ahituv 1987).
Here we treat data collected by professional scientists as accepted data and statistically compare data collected by ION community technicians to assess accuracy and reliability. Discharge data from the USGS streamgage located on the Yukon River at Pilot Station and values reported by each project for the parameters analyzed here are shown in Figure 2. Parameter values are graphed by the month the data were collected to highlight how concentrations change with the flow of the river at different times of year. Data consistency was statistically assessed for each project using box-plots, stem and leaf diagrams, and concentration vs. date plots (see Supplemental Materials for all results). Box-plots were used to examine each data set for outliers, defined as observations that are at least 1.5 times the interquartile range (Q3-Q1) from the edge of the box. The NWQN dataset contained only one outlier out of 395 reported observations, resulting in a frequency of 0.3%; the Arctic GRO dataset had seven outliers out of 160 reported observations, for a frequency of 4.4%; and the ION dataset contained three outliers out of 310 reported observations, for a frequency of 1.0%. Box-plots of each variable examined comparing the data reported by each project are shown in Figure 3. For analytical purposes, samples collected by different projects within two days of each other were treated as replicates. Using this metric allowed us to perform a series of paired t-tests. Results presented in Table 4 show no statistically significant differences between the NWQN and Arctic GRO data for any of the constituents. This allows us to combine NWQN and Arctic GRO as professional data and collectively compare them to ION data (Table 5). Here, we see statistically significant differences between the professionally collected data and those collected by ION for Cl and SO4, but not for any of the other constituents.
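The box-plot outlier rule and the reported outlier frequencies can be checked with a brief sketch. The quartile method is not specified in the text, so the use of `statistics.quantiles` with `method="inclusive"`, the strict comparison at the fence, and the example values are all assumptions; only the outlier and observation counts come from the paragraph above.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the box edges."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def outlier_frequency(n_outliers, n_observations):
    """Outlier frequency as a percentage of reported observations."""
    return 100 * n_outliers / n_observations

# Invented values for illustration; 50 lies far above the upper fence.
example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(boxplot_outliers(example))

# Counts reported in the text for each project.
for name, n_out, n_obs in [("NWQN", 1, 395), ("Arctic GRO", 7, 160), ("ION", 3, 310)]:
    print(name, round(outlier_frequency(n_out, n_obs), 1))
```

Rounded to one decimal place, the computed frequencies match the 0.3%, 4.4%, and 1.0% given in the text.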
Further statistical investigation into differences between ION and professional data included ANOVA and Kruskal-Wallis analysis presented in Table 6. This analysis revealed no statistical differences among the three groups for pH, Ca, Mg, and Alk. However, for both Na and Cl, the ION values were significantly higher than either the Arctic GRO or NWQN values. The median overall value for all Cl samples collected by NWQN and Arctic GRO was 0.71 mg/L; in contrast, the median value for the ION samples was nearly 90% higher at 1.33 mg/L. For Na, the median value among the NWQN and Arctic GRO samples was 2.85 mg/L, whereas the median value for the ION samples was 3.73 mg/L, or 31% higher. For both Na and Cl, the standard deviation and mean absolute deviation (Rousseeuw & Van Zomeron 1990) were also greater in the ION samples. The Arctic GRO dataset was not statistically less than the NWQN dataset for either ion. We also tested whether the differences among projects varied over time. This analysis found that for Na and Cl the ION samples were always higher than the samples collected by NWQN, while for K there was a time-dependence to the results as shown in Table 7. During the middle years (2009-12, "Early" and "Mid" categories), ION samples were statistically less than the NWQN samples, but during the beginning and ending sampling times (2007-08 "Pre-GRO" and 2013-14 "Late"), the ION and NWQN K values were statistically equivalent.
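The percentage differences quoted for the Cl and Na medians follow directly from the reported values, as the short calculation below shows. The function name is hypothetical; the four median concentrations are those given in the text.

```python
def percent_higher(value, reference):
    """Percentage by which value exceeds reference."""
    return (value - reference) / reference * 100

# Median concentrations in mg/L as reported in the text.
cl_difference = percent_higher(1.33, 0.71)  # ION vs. professional, Cl
na_difference = percent_higher(3.73, 2.85)  # ION vs. professional, Na

print(round(cl_difference, 1))  # 87.3, i.e. "nearly 90%"
print(round(na_difference, 1))  # 30.9, i.e. about 31%
```

In absolute terms the gaps are 0.62 mg/L for Cl and 0.88 mg/L for Na, which is the scale on which the paper judges them to be small.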

Describing: Dissemination
Part of collecting high-quality data is ensuring that the data get into the hands of the user in a frame wherein the current state of the system is described (Wand and Wang 1996). If receipt of data is delayed, then the data may be useless, both in terms of understanding the system as well as taking actions based on the data, thereby diminishing its quality. Additionally, data must be disseminated in a way that is understandable and relevant to the questions they are meant to answer. To produce data that are timely and hold currency, USGS personnel assist in data analysis and interpretation and communicate directly to the community technicians through presentations given at the YRITWC's biennial summit meetings and other venues as the opportunities arise. The ION data have been published in different formats to reach the community technicians themselves, the public, and the scientific community. In 2010, a USGS Open-File Report (Schuster et al. 2010) was published that listed all the data collected from the beginning of sample collection in 2006 through 2008 and includes references that describe sample collection methods and laboratory analytical methods. ION data also have been used to inform a long-term trend analysis of water quality in the YR (Toohey et al. 2016). Effort is also made to get data to the communities as quickly as possible by creating community reports containing preliminary data with a USGS disclaimer notifying the user that the data have not yet been approved for publication, are not citable, and are subject to change. This allows the communities to see the results of their hard work without waiting for the USGS to approve the release of the data, which also supports the currency of the data. In addition to hard copy reports, a USGS data release was recently published with downloadable, machine readable spreadsheets of the data through 2014 (Herman-Mercer 2016).

Table 6: Summary of one-way analysis of variance (ANOVA) and Kruskal-Wallis comparing the groups of collectors. If significant differences were found, the differences column indicates which specific differences were significant.
[F-value, F-statistic associated with analysis of variance; p, Probability; KW p, Kruskal-Wallis probability; ION, Indigenous Observation Network; NWQN, National Water Quality Network; Arctic-GRO, Arctic Great Rivers Observatory; <, less than; >, greater than].

The community reports given to the communities themselves (Tribal or First Nation governments) provide opportunity to use the data in management decisions. Of communities participating in the program interviewed by Wilson (2017), 72% reported that they were using the data, or intending to use the data, for various internal and external educational, planning, and decision-making processes, which indicates that the data are both timely and hold currency in the communities. Additionally, research conducted by Wilson et al. (2018) highlighted the trust that participating communities place in the data as opposed to data collected by government or industry sources. This trust is based on the YRITWC as an Indigenous-led organization and the relationships between staff from both the YRITWC and USGS with the community technicians throughout the YRB. Although the parameters measured by ION do not always address site-specific contamination concerns expressed by some communities, ION monitoring is still considered valuable as a source of baseline data (Wilson et al. 2018). Further, in Alaska, where technicians are supported by EPA-IGAP, several years of baseline monitoring are necessary before one can apply for new funding to address site-specific concerns. Thus, one of the ways that ION holds currency for communities is in the support of achieving other monitoring goals.

Discussion
Comparing ION data to professional scientific data through the lens of a data lifecycle finds that ION protocols used in the field and laboratory, as well as QA/QC procedures, are very similar to those used by scientific professionals, which supports accurate, precise, and reliable data. Community participation in required training sessions is high, and checking of data by professional scientists supports consistent data collection. Directly comparing ION, NWQN, and Arctic GRO data shows that ION data track the environmental trends found in the professional data. Retrospective statistical analysis showed no statistical differences for four parameters, and the differences that were found for Na, Cl, SO4, and K were small and likely not significant in terms of interpreting the data.
The QAPPs for all three projects are very similar. The largest difference found in program protocols was in the sample collection methods, which reflect each project's different objectives. The NWQN is a nationwide project with the objective of determining the status and trends of loads and concentrations of contaminants, nutrients, and sediment in 22 large river coastal sites. Arctic GRO is a coordinated, international effort to collect and analyze a time-series of water samples from the six largest Arctic rivers using identical sampling and analytical protocols. ION is guided by the fifty-year vision of the YRITWC, "to be able to drink water directly from the Yukon River" (YRITWC 2017).
The sampling protocols ranged from the collection of one grab sample, as used in ION, to the collection of three grab samples by Arctic GRO and EDI sampling conducted by NWQN. We find, due to the overwhelming consistency in the resulting data, that the program protocols used by ION are sufficient for collecting samples that would detect if particular parameters are elevated, indicating possible contamination, while simple enough to be followed accurately by someone with minimal training. While a grab sample collected from a heterogeneous stream could give different results than an EDI sample, the YR at Pilot Station is considered to be well mixed (personal communication Solin, USGS 2017). The results of the paired t-test conducted in our analysis, which showed that three grab samples across the river did not give different results than an EDI sample, support this statement. Therefore, a grab sample should be sufficient for collecting data that meet the needs of the ION.
Effects of the analyzing laboratories were not investigated in this study, and it was assumed that each laboratory produced data of sufficient quality. However, despite laboratory QA/QC procedures that make it unlikely there is a large bias in the data produced by each laboratory, there is always the possibility that the analyzing laboratory or instrumentation has affected the results. Just as the three projects have different objectives, the three laboratories have slightly different procedures. While we would expect to see differences in more of the analytes (i.e., in Ca, Mg, and Alk) if laboratory analysis or instrumentation variation were the cause of the statistical differences found in Na, Cl, SO4, and K, we cannot entirely rule out a laboratory bias based on the data that we examined. To determine this, QA/QC data from each laboratory associated with the data examined here would need to be analyzed, an onerous, though potentially worthwhile task.
The potential influence of sample collection year was examined and showed no time dependence associated with statistically significant differences with the exception of K. This time dependence does not appear to be related to changes in personnel, as community technicians in Pilot Station have stayed fairly consistent over the years of the project. Further, no outliers were found when examining the ION K data, indicating that the data are internally consistent. The influence of time on the data does not appear to be related to hydrologic flows, as the paired t-tests, which removed the question of hydrologic variability, did not find a statistically significant difference in K data when comparing ION and professional data. Additionally, the ION data track the seasonal trends found in the professional data.
No statistical differences were found among the three projects for pH, Ca, Mg, or Alk. While statistically significant differences were found when comparing ION to professional data for Na, Cl, SO4, and K concentrations, the differences in the mean values are small and likely not significant for interpreting the data in many uses beyond the purposes of ION. The results comparing pH are particularly important, as these are in situ measurements made on a water quality meter. These meters must be calibrated by the technician prior to taking measurements, and the agreement between ION data and the professionally collected data shows that the community technicians in Pilot Station have followed protocol for field measurements.
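One way to make the distinction between statistical and practical significance concrete is to express the difference in means relative to a reference mean. The sketch below is an illustration only: the function name `relative_mean_difference`, the choice of the professional mean as the reference, and the values are assumptions, not the method or data used in this study.

```python
import statistics


def relative_mean_difference(community, professional):
    """Return the difference in means as a percentage of the professional mean."""
    reference_mean = statistics.fmean(professional)
    if reference_mean == 0:
        raise ValueError("reference mean is zero; relative difference is undefined")
    difference = statistics.fmean(community) - reference_mean
    return 100.0 * difference / reference_mean


# Hypothetical concentrations (mg/L), NOT values from the study.
community_values = [2.0, 2.2, 2.4]
professional_values = [2.0, 2.0, 2.0]

percent = relative_mean_difference(community_values, professional_values)
print(f"difference in means: {percent:.1f}% of the professional mean")
```

Whether a given percentage matters depends on the intended use of the data; no particular threshold is implied here.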
While other studies (Mattson et al. 1994; Shelton 2013) have analyzed data collected simultaneously by professional scientists and non-professionals, this type of analysis was not possible with the available funding and data for Pilot Station. Expanding this analysis to include a side-by-side comparison as well as including more ION sampling locations would strengthen the results found here. Nevertheless, this analysis agrees with previous findings (Mellanby 1974; Reynoldson et al. 1986; Au et al. 2000; Fore et al. 2001; Shelton 2013; Danielsen et al. 2014) that non-professionals can, and do, collect high-quality water-quality data, and supports the conclusion that program design is an enabling condition for data quality in CBM programs: "With appropriate protocols, training, and oversight, [non-professionals] can collect data of quality equal to those collected by experts" (Bonney et al. 2014: 1436).

Conclusion
Data quality ultimately lies in the eyes of the data user. The use of data by external parties, such as scientific researchers or government agencies, is not always the primary goal of CBM programs. While data quality, including accuracy, comparability, completeness, and timeliness, is important in motivating use of CBM data by outside parties (Conrad & Hilchey 2011), the level of rigor required ultimately depends on the intended use of the data. Thus, if the methods, such as sample collection, processing, preservation, and data quality objectives, match the program objectives, having the same level of rigor as professional environmental monitoring is not necessary (Bliss et al. 2001).
The statistical comparison methods in this case study allowed us to observe where ION data differ from professional data and highlight the importance of project and protocol design for achieving high-quality data collected by non-professionals. In the case of CBM in the YRB, ION's program design, which includes annual training and the use of USGS SOP and quality assurance protocols, provides ION with a high level of support for the collection of high-quality data. Future work should coordinate sample collection activities between the three projects so that simultaneously collected samples can be compared.
The ION, with distributed community-based technicians collecting samples from 50 locations across the YRB, has been pivotal in addressing sub-Arctic and Arctic data gaps and in sustaining sample collection at key locations when the USGS switched its focus to other large watersheds. The monitoring data collected by ION community technicians are valuable both for detecting whether the system is deviating from typical values (Legg & Nagy 2006) and for providing a baseline for shorter-term, process-based studies (National Research Council 2014). Although statistically significant differences were found between the professional and non-professional data in our analyses, we do not believe these differences equate to a significant difference in the quality of ION data when compared to professionally collected data. ION community technicians follow rigorous protocols in the field when collecting surface-water samples, producing complete, consistent, and timely data, with the currency to inform the objectives of the project.

Note
1 These data are preliminary or provisional and are subject to revision. They are being provided to meet the need for timely best science. The data have not received final approval by the US Geological Survey (USGS) and are provided on the condition that neither the USGS nor the US Government shall be held liable for any damages resulting from the authorized or unauthorized use of the data.

Additional File
The Additional File for this article can be found as follows:
• Supplemental Information. Detailed results of all statistical analyses completed. DOI: https://doi.org/10.5334/cstp.123.s1