Smartphone Sensors for Citizen Science Applications: Light and Sound

Studies have been conducted to evaluate the accuracy and precision of smartphones as substitutes for expensive professional light-metering systems, which can cost upwards of $500. DIAL (2016) investigated the iPhone 5, iPhone 6, Samsung Galaxy S5, and Sony Xperia Z1 and Z2, and Gutierrez-Martinez et al. (2017) studied the LG

Odenwald, S. 2020. Smartphone Sensors for Citizen Science Applications: Light and Sound. Citizen Science: Theory and Practice, 5(1): 13, pp. 1–16. DOI: https://doi.org/10.5334/cstp.254


Introduction
Given that designing mobile platforms for specific citizen science project needs can cost tens of thousands of dollars (Yarmosh 2017), the ubiquity of smartphones makes them a highly desirable platform (Dehnen-Schmutz 2016). In recent years, citizen science projects have begun to adapt smartphones to provide novel crowdsourcing opportunities (Stoop 2017) in fields ranging from bird watching (eBird) and aurora sightings (Aurorasaurus) to precipitation monitoring (mPing) and meteor spotting (Meteor Counter). Notably, the current ensemble of more than 1,600 citizen science projects catalogued by SciStarter (2019) includes 50 that generally employ smartphone cameras and text-based data, but they do not actually involve the use of smartphone sensors themselves. The only exceptions appeared to be CrowdMag, which uses the smartphone magnetometer to make geomagnetic field measurements, and an informal project, Earth Rotation Detector, that uses the accelerometer to detect local acceleration differences caused by Earth's latitude-dependent centrifugal acceleration.
Previously, Odenwald (2019) examined how modern smartphone magnetism and radiation sensors available since 2015 performed against professionally-calibrated systems. This knowledge was then applied to using smartphones for the detection of geomagnetic storms (Odenwald 2018) and radiation doses at airline altitudes (Odenwald 2019). Similar radiation and magnetism measurements form the basis for citizen science projects such as Our Radioactive Ocean, and the previously mentioned project CrowdMag. This paper extends the work by Odenwald (2019) and investigates light and sound sensors to assess their accuracy and precision in comparison with professional instrumentation. The available literature on the rigorous calibration of these sensors is modest to nonexistent and in all cases is many years out of date as newer smartphones have entered the market.

RESEARCH PAPER
Smartphone Sensors for Citizen Science Applications: Light and Sound
Sten Odenwald

In recent years, the continued improvement and deployment of smartphone technology has resulted in the growth of smartphone sensor systems. Citizen science projects that require on-the-spot measurement of a variety of parameters such as illuminance and sound level may benefit from this new technology. As yet, there have been few attempts at critically assessing these sensor systems and their accuracy.
I calibrated the light and sound sensors on a variety of apps and on three platforms using professional-grade instruments. Provided that modest adjustments to the recorded data are made using a small set of calibration curves, the usefulness of smartphone light and sound sensor systems for conducting citizen science experiments can be dramatically improved. Light intensity measurements for illuminances above 5,000 lx can be made with ±12% accuracy compared with calibrated values, and sound intensity measurements from 35 to 90 dB can be made to within ±1.5 dB. A significant failing among the light-metering apps is that none of them save the measurements in exportable file formats for later analysis. This greatly limits the usefulness of these apps for non-photographic applications.
Nexus 5. Among the non-photographic uses, Cerqueira, Carvalho, and Melo (2018) investigated the iPhone 5 for use in the occupational health industry as general illuminance monitors to check workplace compliance with safety regulations. The available literature on smartphone sound sensors is equally modest. Most published discussions are either informal or based upon smartphone and sensor technology that is significantly out of date. Moreover, discussions do not cover methodologies for properly calibrating smartphone sensors but merely address the single-measurement discrepancies between measured and calibrated standards.
Detailed acoustic studies (Brown and Evans 2011) of an iPhone 3GS provide some insight into how well older smartphones can be used for making precision acoustic measurements. Kardous and Shaw (2014) selected nine different smartphone platforms available by January 2013 and examined ten iOS apps and four Android apps that purportedly measured sound intensity. Robinson and Tingay (2014) tested Galaxy S2 and Nexus 7 smartphones and concluded that under real-world conditions, these specific smartphones were generally unreliable in measuring sound intensity.
A limitation of the studies cited above is that, while they assess the accuracy and precision of the sensors, they do not provide a calibration for them that can overcome some of the measurement inaccuracy. These earlier light and sound sensor studies often take the form of blog-style, informal discourse rather than articles in which data are systematically presented in graphical or tabular form and are formally analyzed. Nor is it common to find discussions about how to calibrate light and sound sensors to improve their accuracy (but see Cerqueira, Carvalho, and Melo 2018). Moreover, older smartphone models (in some instances more than six years out of date) do not reflect more recent sensor improvements. For example, sound measurements use the output from smartphone microphones that are based on micro-electro-mechanical systems (MEMS). These devices have more than doubled in their sensitivity (signal-to-noise) in the past ten years, owing to increasing consumer demand, and they now rival electret-based microphones (Widder and Morcelli 2014; Kardous and Shaw 2016). Changes in smartphone sensitivity can appear when comparing earlier and later smartphones by the same manufacturer (e.g., iPhone 5s versus iPhone 11), or when comparing manufacturers. Because MEMS are mechanical systems, they are subject to wear, so there is also the potential for them to become less sensitive over time. There are, however, no current studies that have investigated smartphone microphone aging.
In the following sections, I detail the methodology I used to compare light and sound readings of two Android models, the Samsung Note 5 and Galaxy S8, and an iPhone 6s. Specifically, I describe the laboratory instruments used as standards, the apps I selected for light and sound reading on each phone, and the procedures to quantify the noise inherent in the sensors, the variability in readings among copies of the same phone model running identical operating systems and apps, and the overall accuracy and precision of the phones as sensors. For both light and sound, I derive calibration curves that, when applied to phone sensor readings, could improve their accuracy significantly.

Methodology

Selection and characteristics of smartphones for analysis
Owing to availability and world-wide popularity, I used the Samsung Note 5 (2015: Android) and the Samsung Galaxy S8 (2017: Android) smartphones, as well as the iPhone 6s (2015: iOS). Specific information about the popularity of individual models is hard to assess because of the nonuniformity of publicly available sales reports. By the middle of 2016, the Samsung Note 5 was among the Top-10 smartphones in use with a 7.8% share worldwide (Ehsan 2016). The Samsung S8 model was the world's best-selling Android smartphone and achieved a 2.8% market share by the second quarter of 2017 (Mawston 2017). Similarly, by 2017, 728 million iPhones were in use, of which 47% were the iPhone 6/6s model (van der Wielen 2017).
In practice, calibration of smartphone sensors must regard these platforms as essentially black boxes. The exact sensor system found in a particular hardware configuration is generally very difficult to determine from public data. An extensive Google search was performed to uncover the specific sensor models being used for the iPhone 6s (Table 1). Samsung specifications were found by downloading the Spec Device app on each phone.

General calibration approach
The investigation of smartphone light and sound measurements and their calibration follows a common methodology. First an assessment is made of the measurement noise contributed by the sensor itself, usually a result of the sensor's analog-to-digital converter and its digitization noise. This feature can be the result of a variety of internal electronic noise sources that influence the performance of the analog-to-digital converters as discussed in Engineer Zone (2012,2018) and Maxim Integrated (2017). Because digitization noise is non-Gaussian and does not follow a square-root law, it represents the minimum noise level that can be achieved by the sensor even after averaging a significant number of measurements. This is particularly important for crowdsourced measurements because it sets a limit to the maximum accuracy obtained by averaging multiple measurements, which would normally improve measurement accuracy by reducing the dispersion about the mean value by √N.
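The contrast between random noise and a digitization floor can be illustrated with a minimal sketch. It assumes an idealized sensor that rounds a perfectly constant signal to the nearest step (1 LSB) with no other noise; the function names and the example values are hypothetical and are not taken from the measurements reported here.

```python
import math
import statistics


def standard_error(sigma, n):
    """Dispersion of the mean of n readings with Gaussian noise sigma."""
    return sigma / math.sqrt(n)


def quantize(value, step):
    """Round a value to the nearest multiple of the digitization step (1 LSB)."""
    return round(value / step) * step


def quantized_mean_error(true_value, step, n):
    """Error of the mean of n readings of a constant signal seen through
    an idealized quantizer (assumption: no noise other than digitization)."""
    readings = [quantize(true_value, step) for _ in range(n)]
    return abs(statistics.fmean(readings) - true_value)


# Gaussian noise averages down as 1/sqrt(N) ...
gaussian_errors = [standard_error(2.0, n) for n in (1, 4, 100)]
# ... but the digitization offset of a constant signal does not.
floor_errors = [quantized_mean_error(10.3, 1.0, n) for n in (1, 4, 100)]
```

In this idealized case every reading of the constant source is identical, so the error of the average stays at the quantization offset however many readings are combined.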

Selection of calibration instrumentation
For light measurements, I used the Extech Instruments LT-300 Light Meter ($115; hereafter LT-300), which utilizes a remote light sensor equipped with a hemispherical light-diffusing dome. The nominal light-metering accuracy of this instrument is ±5%. A comparison of this light-metering instrument with a Sekonic C-7000 ($2,500) metering system yields a similar ±5% accuracy, and both comply with JIS C1609-1:2006 general qualifications for an A-class illuminometer. For the sound measurements, I used the B&K Precision Model 732A Sound Level Meter ($310; hereafter BK-Meter), which provides 30 to 130 dB measurement capability at ±1.5 dB. The BK-Meter meets the IEC 60651 Type II sound level meter standard, which is the general-purpose grade for field work. Systems that meet the more restrictive Type I certification at σ = ±0.7 dB, by comparison, would be considered precision grade for lab work. Given the expected quality of the measurements with these smartphone systems, it is unnecessary to obtain instruments of any higher quality, which come at considerably greater expense.

Light intensity
Unlike other physical properties such as magnetism, sound, and acceleration that use a single sensor system across different types of phones, illuminance measurements are made using different approaches. A dedicated light sensor near the front camera is used by virtually all Android apps, while iOS apps use data provided by the back camera and written into the exchangeable image file format (EXIF) data stream for the scenery being imaged. EXIF data include the camera model, f/stop, exposure speed, ISO number, and scene brightness value, as well as date, time, and location. When an image is taken, these data are written into the header of the image file. The practical difficulty in using the Android light sensor is that to read the illuminance value, the user has to be close enough to the screen to see the display, but this immediately interferes with the illuminance measurement itself. Since none of the apps allow the data to be recorded in a file for later use, there is no way to avoid the user-interference issue. Only the back-camera EXIF data allow measurement without user interference. Meanwhile, although the light sensor produces a continuous range of illuminance values at about 1 lx resolution over a range from 0 to 60,000 lx, the back-camera metering system combines information from the exposure speed and f/stop to determine the illuminance of the field of view via the spot-metering areas. This leads to sudden jumps in the calculated illuminance, so it is not a smoothly varying quantity.
For investigations with the Samsung Note 5 and Galaxy S8, the Lux Light Meter app by Doggo Apps (Lux) uses the dedicated light sensor on the front of the smartphone but does not provide an image to guide the location of the field of view. Light Meter by WBPhoto (LM) uses the back camera to measure the reflected light illuminance from the EXIF data. It also uses the front light sensor to directly measure the incident light illuminance. For the iPhone 6s, tests were conducted on two iOS apps: Galactica ($1.99) by Flint Soft Ltd. that uses the back camera and provides an image. Other camera-based apps such as Light Meter by Vlad Polyansky and Light Meter by Elena Polyanskaya were also examined but the data were identical to those provided by Galactica. Galactica uses the EXIF data stream to compute the reflected light illuminance. For both the Android and iOS phones, no diffusing dome was used, but only the smartphone in a normal measurement mode, which would be commonly employed by the average user.
For the methodologies used by, for example, DIAL (2016) and Kardous and Shaw (2014), a professional-grade instrument was used to compare the smartphone sensor readings against a calibrated metering system. Previous studies focus on deviations in single point measurements between a calibration system and the smartphone output, rather than availing themselves of the improved accuracy by combining multiple measurements to define a calibration curve that reduces the overall measurement uncertainty. Calibration is the process of comparing a carefully measured input signal to the output provided by a detector. This results in a functional relationship that can be modeled by a regression curve that minimizes the overall variance in the ensemble of measurements. Previous studies may note that the calibrated input and resulting output differ by large factors, but do not take the next step in the calibration process and create a calibration function that can be used to correct the output values and place them on a proper calibration scale. I have taken the next step and also use these measurements to establish the smartphone sensor calibration curve, which is defined by performing a regression analysis on multiple measurements.
For the light calibration, the total post-calibrated uncertainty in the measurements, $\sigma_T$, is the quadrature sum of the uncertainty in the calibrator, $\sigma_C$, and in the smartphone measurement, $\sigma_S$, according to $\sigma_T^2 = \sigma_S^2 + \sigma_C^2$. Even with perfect smartphone measurements such that $\sigma_S = 0$, the calibration process is limited by how well the calibrator itself has been calibrated. The specifications for the LT-300 indicate a $\sigma_C = \pm 5\%$ accuracy in light metering across its full scale (10 to 100,000 lx), which implies $\sigma_C = \pm 0.5$ lx at the low end and $\sigma_C = \pm 5{,}000$ lx at the high end of the scale.
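The quadrature sum and the scaling of the calibrator uncertainty can be checked with a short sketch. The function names are hypothetical, and the 10% smartphone uncertainty used in the last line is purely illustrative, not a measured value.

```python
import math


def total_uncertainty(sigma_s, sigma_c):
    """Quadrature sum: sigma_T = sqrt(sigma_S**2 + sigma_C**2)."""
    return math.hypot(sigma_s, sigma_c)


def calibrator_uncertainty_lx(illuminance_lx, fractional_accuracy=0.05):
    """Absolute uncertainty of a meter with a fixed fractional accuracy
    (5% is the LT-300 specification quoted in the text)."""
    return fractional_accuracy * illuminance_lx


low_end = calibrator_uncertainty_lx(10)        # about 0.5 lx
high_end = calibrator_uncertainty_lx(100_000)  # about 5,000 lx

# A perfect smartphone (sigma_S = 0) is still limited by the calibrator.
floor_percent = total_uncertainty(0.0, 5.0)

# Illustrative only: a hypothetical 10% smartphone uncertainty.
example_percent = total_uncertainty(10.0, 5.0)
```

Because the terms add in quadrature, the larger of the two uncertainties dominates the total.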
We see that the measurement accuracies of the light and sound meters are generally higher than the measurement uncertainties themselves, and so these instruments are appropriate calibrators within the available price ranges of the investigations.
An extensive discussion of light-metering theory and techniques is provided by Hiscocks (2014). The proper method for calibrating a light sensor is to use light sources with the same color temperature at various brightness levels; however, this approach is expensive to implement. Instead, a convenient light source is the sun (5,770 K), which varies from 10,000 to 500,000 lx over the course of a day and over several seasons. Provided that the sun angle is high enough in the hours before local noon, the color temperature of the solar spectrum will not change appreciably as a result of atmospheric absorption, which otherwise tends to scatter the blue light and make the color temperature progressively colder (i.e., redder) as the sun sets.
Light levels were measured outdoors by each app in direct sunlight by placing a sheet of white foamboard on a leveled surface and using the back camera to measure the reflected light. Without casting shadows over the surface, the back camera was aimed at the foamboard so that its image filled the entire field of view. The calibrated LT-300 meter was used to measure the direct sunlight striking the foamboard by placing the sensor dome on the foamboard facing skywards. The derived albedo of the foamboard, defined as the ratio of the reflected to incident light, was 0.7.
The light measurements span a dynamic range from 1 to 500,000 lx depending on which app is used. It is virtually impossible for calibrations using a single function to work well over such large dynamic ranges. Most solid-state sensors have a combination of linear and nonlinear responses, which defeats obtaining a single linearization between the input illumination and the output measurement. Consequently, I considered three separate ranges that correspond to distinct environmental conditions: high (5,000 to 500,000 lx) for bright sunlight, mid-range (100 to 5,000 lx) for shade and indoors, and low (0 to 100 lx) for dawn/twilight illuminance levels.

Sources of noise among the phone models
The level of the digitization noise will be determined for each sensor by making a series of measurements of a fixed source reference, and re-scaling the display of the sensor readings to reveal the step-wise signature of the digitization. Smartphone models do not generally use the same sensor type, and slight manufacturing differences in placement of the sensor inside the smartphone can also create a copy-to-copy change in measurement accuracy. To investigate these variations, measurements with four Samsung Note 5 devices, four Samsung Galaxy S8 devices, and two iPhone 6s devices will be compared.
One of the most basic features of a light-metering system is its response to a constant source of light. A system that suggests a light source is varying in intensity when physically it is not can be a problematic instrument for any number of applications. For many apps that return measures of magnetic field strength, barometric pressure, or acceleration, it is common to see a rapidly changing display in part caused by the digitization level at 1 LSB of the sensor output. By comparison, light-metering displays typically present data as integer values of lx that remain fixed in value for a constant-intensity light source. The Galactica app does not produce a continuous reading of illuminance as light levels are smoothly increased using a variable-intensity lamp. Instead, the readings jump by discrete steps that are as much as 50% of the illuminance for dim light (ca. 10 lx) and decrease to <5% at >10,000 lx. However, other iOS apps such as Light Meter (Polyanskaya 2017) use the EXIF data and a different processing algorithm to provide smooth, continuous measures at 1 lx intervals that remain stable for a fixed light source, suggesting that the equivalent digitization noise is <1 lx, corresponding to <1% measurement error above 100 lx. A similar response is provided by the Android Light Meter app by WBPhoto that uses the light sensor. The result is that the light-sensor (Android) metering system is superior to the camera-based (iOS) metering systems in terms of measurement accuracy.
In Figure 1, a selection of measurements using the iOS and Android illuminance-metering apps is shown, scaled in such a way as to magnify the step changes between light-level measurements as the ambient light intensity increases. The percentage of the step interval to the applied illuminance shows how steps of about 10 lx in absolute terms represent a larger percentage of uncertainty at low illuminances than at higher levels. The iOS Galactica readings are significantly noisier throughout the range, especially below 1,000 lx; the noise drops significantly above 1,000 lx. This occurs because of the designer's choice to use the f/stop, exposure speed, and ISO numbers of the camera setting in the EXIF image information to calculate equivalent lx. These parameters are discrete, which leads to nonuniform steps in lx across the range of the metering. The Light Meter app by Polyanskaya (2017) also uses the EXIF data but supplements this with a proprietary mathematical algorithm that smooths out the final readings into a semi-continuous series of values. In contrast, the Android app Light Meter (LM) has a dedicated Sensor Meter mode using the front light sensor, which provides a continuous variation in output measurements in steps of 1 lx. We see in Figure 1 that for camera-based EXIF apps such as Galactica, the step change between consecutive illuminance levels is about 5% for higher illuminance measurements, but for lower illuminance levels this change can amount to more than 25%. This implies that the app cannot discriminate between two illuminance levels to better than 25% at the low end of the scale. This behavior is analogous to digitization noise, which limits the ability to measure faint illuminance changes accurately when they are comparable to the 1 LSB level.
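The step-size percentages discussed above can be computed from any list of distinct readings with a sketch such as the following. The function name and the example levels are hypothetical, and the step is expressed relative to the lower of the two adjacent levels, which is an assumption about how the percentage is defined.

```python
def step_percentages(levels_lx):
    """Return (lower, upper, percent_step) for each pair of adjacent
    distinct readings, with the step expressed as a percentage of the
    lower level (assumed convention)."""
    unique = sorted(set(levels_lx))
    return [
        (lo, hi, 100.0 * (hi - lo) / lo)
        for lo, hi in zip(unique, unique[1:])
        if lo > 0
    ]


# Hypothetical readings logged while a lamp is brightened smoothly.
camera_like = step_percentages([40, 40, 60, 60, 60, 100])
sensor_like = step_percentages([1000, 1001, 1002])
```

Coarse jumps at low illuminance produce steps of tens of percent, while 1 lx steps near 1,000 lx amount to about 0.1%.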

Comparison between multiple platform copies
The measurements from four Samsung Note 5 and four Samsung Galaxy S8 phones were compared at the same light levels to gauge the copy-to-copy variations within the two phone models. Since smartphone models often use different light sensor and imaging array technologies as shown in Table 1, this test is important for determining consistency across platform copies.
I also performed the measurement on the two Android light meter apps to assess whether the apps themselves were contributing any measurement error. Once again, Light Meter (LM) uses the camera meter (reflected light) and Lux Light Meter (Lux) uses the light sensor (incident light) to make the metering measurements. The resulting measurements of a white foamboard are shown in Table 2.
For the Extech meter, the ratio of the reflected to incident light (albedo) is 0.5, 0.71, and 0.71 respectively, while the smartphone metering systems are generally inconsistent with the Extech expected value near 0.7 for the white foamboard. Because the smartphone measurements are not made with the same sensor system, it is possible that the albedo errors occur because of differing lx calibrations between the systems (back camera for reflected light versus dedicated light sensor for incident light). Nevertheless, the same app operating on the two different platforms yields similar results for high illuminance levels (e.g., 16,300 lx) but offers very discrepant measures under low illumination (e.g., 114 lx). The Note 5 performs less accurately in general.
High illuminance levels (3,000 to 500,000 lx)
Each point in Figure 2 represents an individual measurement at the given calibration level for the reflected light measured with the Extech meter using an albedo of 0.7. The regression curve has been forced to a y-intercept of 0 to reflect the natural condition that zero illumination corresponds to a zero measurement. The $R^2$ value exceeds 0.9, indicating that the calibration regression accounts for most of the observed variance. For purposes of calibration, the reflected light measurement value is known (y-axis), so it is the range of values along the x-axis that determines how well the measurement leads to a unique calibrated value. Once the smartphone has been calibrated using the regression line to obtain the equivalent calibrated, reflected illuminance, the residual post-calibration illuminance error is $\sigma_c = \pm 12\%$. Other iOS-based apps such as Light Meter by Guidicelli and Lux Light Meter by Butta yield nearly identical data values and regression curves and are not shown in Figure 2.

Figure 2: Comparison of reflected illuminance levels between calibration (LT-300) and iPhone 6s camera-based Galactica app (triangle).
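A regression forced through the origin has a closed-form slope, which the following minimal sketch computes. The function names and the sample data are hypothetical, and the way the percentage residual is defined here (root-mean-square fractional deviation from the reference meter) is an assumption rather than the procedure used for the figures.

```python
import math


def fit_through_origin(app_lx, reference_lx):
    """Least-squares slope of y = slope * x with the intercept fixed at 0:
    slope = sum(x*y) / sum(x*x)."""
    sxy = sum(x * y for x, y in zip(app_lx, reference_lx))
    sxx = sum(x * x for x in app_lx)
    return sxy / sxx


def percent_residual(app_lx, reference_lx, slope):
    """RMS fractional deviation (in %) of the calibrated app readings
    from the reference values (assumed definition of the residual)."""
    fractions = [(slope * x - y) / y for x, y in zip(app_lx, reference_lx)]
    return 100.0 * math.sqrt(sum(f * f for f in fractions) / len(fractions))


# Hypothetical paired readings: app output (x) and reference meter (y).
app = [1000.0, 2000.0, 4000.0, 8000.0]
reference = [3100.0, 5900.0, 12600.0, 24300.0]

slope = fit_through_origin(app, reference)
residual = percent_residual(app, reference, slope)
```

Fixing the intercept at zero leaves a single free parameter, so a calibrated value is obtained by multiplying the app reading by the fitted slope.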

Medium illuminance levels (100 to 3,000 lx)
Although high levels of illumination are of interest to outdoor daylight studies, other applications of light metering can emerge for indoor conditions (100 to 600 lx), full illumination in well-lit rooms, and outdoor shady conditions (3,000 lx). In Figure 3, there is little difference between the regression constrained to a 0 lx intercept and one free to determine this parameter; however, only the former is physically reasonable since illuminance is a positive quantity. The resulting dispersion of the measured values about the y = 3.1x regression is $\sigma_c = \pm 16\%$, owing primarily to the deviations between 1,500 lx and 600 lx.
Low illuminance levels (0 to 100 lx)
Intrinsically faint sources such as moonlight (0.1 to 2 lx), aurora (1 to 50 lx), or the twilight sky (10 to 200 lx) occupy this illuminance zone, so the behavior of smartphone apps under these conditions is of interest. Once again, using natural unfiltered sunlight under sunset and twilight conditions ensured that the spectral distribution remained that of a pure black body, though at a bluer effective color temperature of 12,000 K. Auroras are reported to have a color temperature between 3,000 and 5,000 K (Sammtleben 2018), while the full moon is approximately 4,000 K. No correction for this color effect was made in the testing process. A variety of measurements were made using indirect, shaded sunlight, with the results shown in Figure 4. Note that at about 20 lx it is not possible to read newspaper typeface smaller than 8-point font, and by 3 lx one cannot easily read headlines written in 12-point font; these are circumstances often reported by observers of very bright auroral displays. For the measurements, the camera-based metering clearly shows jumps at 30, 40, 60, and 100 lx.
By applying the y = 2.5x calibration, the residual error becomes $\sigma_c = \pm 18\%$, largely because of the influence of the jumps in smartphone measuring over this range. Extending the y = 3.1x regression for the medium-illuminance range (dashed line in Figure 4) yields a distinctly different calibration, so low-illuminance measurements require a separate calibration curve.
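The range-dependent calibration can be sketched as a small piecewise function using the y = 2.5x and y = 3.1x regressions quoted for the Galactica data. The function name is hypothetical, the rule for choosing a range (try the low-range curve first and accept it if the calibrated value stays within that range) is an assumption, and no slope is quoted for the high range, so it is left as a caller-supplied parameter.

```python
LOW_SLOPE = 2.5   # y = 2.5x, low range (0 to 100 lx)
MID_SLOPE = 3.1   # y = 3.1x, medium range (100 to 3,000 lx)
LOW_LIMIT_LX = 100.0
MID_LIMIT_LX = 3000.0


def calibrate_lux(app_reading_lx, high_slope=None):
    """Convert an app reading (x) to a calibrated illuminance (y).

    Assumed selection rule: use the first curve whose calibrated value
    falls inside that curve's own illuminance range.
    """
    if app_reading_lx < 0:
        raise ValueError("illuminance cannot be negative")
    low = LOW_SLOPE * app_reading_lx
    if low <= LOW_LIMIT_LX:
        return low
    mid = MID_SLOPE * app_reading_lx
    if mid <= MID_LIMIT_LX:
        return mid
    if high_slope is None:
        raise ValueError("a high-range slope must be supplied by the caller")
    return high_slope * app_reading_lx


dim = calibrate_lux(20.0)      # low-range curve
indoor = calibrate_lux(500.0)  # medium-range curve
```

Because the two slopes differ, readings near the 100 lx boundary map to different calibrated values depending on which curve is chosen, which is one reason the boundary rule here should be treated as a placeholder.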

Sound intensity
For the sound analysis we used DecibelMeter by Byhunghun Yang, which provides average, peak, and current levels and a simple bar graph display. Decibel Level by Qi Chen is a free app that offers a bare-bones display of the peak, average, and current dB measures, but is populated by frequent popup advertisements. Decibel Sound Meter by Lee Pyoung Lo also offers no graphical information but only a single number that changes so rapidly that it is nearly impossible to record sound levels. The app Decibel 10th (or Decibel X) by SkyPaw Co. Ltd. has a superior display including a real-time plot, an analog meter dial, and a digital display of the average, current, and peak sound levels. It allows for variable sampling rates from 4 to 20 Hz, and is also one of very few sound-meter apps that is claimed to be pre-calibrated. The data can also be recorded and exported as a .csv file.
Sound measurements were made using a variety of environmental sources that were then measured with the BK-Meter to establish their calibrated levels, and they were simultaneously measured with the smartphones to create the calibration curve for each smartphone model. An additional convenient sound source was created by tuning an AM-band radio between stations so that only the random noise was present. The three smartphone models together with the BK-Meter were placed on a table at the same distance from the speaker, and the volume of the radio was adjusted over a BK-Meter range from 30 to 75 dB. The average sound level for each of the smartphones with the Decibel 10th app running continuously was then noted.

Figure 3: Data for Galactica (triangle) with linear regression forced to a 0 lx y-intercept to avoid an unphysical, negative illuminance as the y-intercept.

For purposes of statistical analysis involving the calculation of averages and standard deviations, sound intensity has to be considered differently. Sound levels expressed in dB may follow a normal (Gaussian) distribution characterized by a mean and standard deviation, in which case the corresponding physical units of power follow a lognormal distribution. The choice of which system to use depends on the purpose being served, and either representation is valid (see, for example, the discussions in Science Direct 2019). For example, consider a series of five measurements: 47, 43, 42, 43, and 45 dB. Method 1, which works directly with the normal distribution of the dB values, gives $\langle I \rangle = 44.0 \pm 2.0$ dB. For Method 2, converting the dB values into their linear power units gives $5.01 \times 10^{-8}$, $2.0 \times 10^{-8}$, $1.58 \times 10^{-8}$, $2.0 \times 10^{-8}$, and $3.16 \times 10^{-8}$ watts, for which the average is $(2.75 \pm 1.39) \times 10^{-8}$ watts, and the equivalent dB value is $\langle I \rangle = 44.4 \pm 2.4$ dB. Because of the nonlinear lognormal scale, measurements about the mean value $\langle I \rangle$ will depart from a Gaussian random distribution as the dB variance increases beyond about ±2 dB (i.e., the derived standard deviation differs by 20% or more between Methods 1 and 2). It is therefore not useful to characterize the power distribution by an average or a standard deviation, which are accurate only for distributions of measurements that are close to normal.
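The two averaging methods can be reproduced with a short sketch using the five example readings. The function names are hypothetical; the reference power of $10^{-12}$ W is an assumption inferred from the watt values quoted in the example, and the sample standard deviation (n − 1 in the denominator) is used because it reproduces the quoted ±2.0 dB.

```python
import math
import statistics

REFERENCE_POWER_W = 1e-12  # assumed reference; consistent with the quoted watt values


def db_to_power(level_db):
    """Convert a level in dB to linear power in watts."""
    return REFERENCE_POWER_W * 10 ** (level_db / 10)


def power_to_db(power_w):
    """Convert linear power in watts back to a level in dB."""
    return 10 * math.log10(power_w / REFERENCE_POWER_W)


levels_db = [47, 43, 42, 43, 45]

# Method 1: statistics taken directly on the dB values.
mean_db = statistics.mean(levels_db)
std_db = statistics.stdev(levels_db)

# Method 2: statistics taken on the linear power values.
powers_w = [db_to_power(v) for v in levels_db]
mean_w = statistics.mean(powers_w)
std_w = statistics.stdev(powers_w)
mean_w_as_db = power_to_db(mean_w)
```

The power-domain mean converts back to a slightly higher level than the dB-domain mean because the loudest readings carry more weight on the linear scale.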
Because for our noise measurements we are working with dB values that are very close together in magnitude, and these measured values are plotted directly to show their dispersion in dB units, we can calculate a mean and standard deviation directly from the dB values and their distribution. Since most sound-level meters are calibrated in the normally distributed units of dB and not in the lognormal units of watts, we will not concern ourselves with the actual delivered noise power and its distribution properties. This also simplifies the direct plotting of the data on a linearized coordinate axis.

Figure 4 (caption, partially recovered): … to avoid an unphysical, negative illuminance as the y-intercept. Also shown is the regression used for the medium illuminance levels in Figure 3 (dashed).

Measurement noise
To determine the measurement noise, the Decibel 10th app was activated on three smartphones: Samsung Galaxy S8, Note 5, and iPhone 6s, and data was taken at a cadence of five samples/sec for several minutes and stored in .xls spreadsheets for analysis. An example of this data is shown in Figure 5 for the iPhone 6s. The digitization level at 1 LSB for the iPhone 6s, as for the Samsung Galaxy S8, is clearly seen as 0.1 dB.

Comparison with multiple Samsung phones
All eight smartphones were placed in three different, constant-intensity sound environments and were allowed to record data with the Decibel 10th app. Table 3 shows that there is considerable copy-to-copy variation within the same model type. The Galaxy S8 seemed to perform better overall, especially for the quieter sound levels. One of the Note 5 phones (shaded gray) gave dramatically different measurements than the other three copies, and in fact its measurements were found to be closest to the calibrated values. The total variation of the measurement consists of the contribution from the calibration residual (σ_c) and the copy-to-copy dispersion (σ_n). Within each model line, the sound measurements had an internal precision of approximately σ_n = ±1.5 dB for the Galaxy S8 and ±0.4 dB for the Note 5. The corresponding calibration residuals are σ_c = ±3.3 dB for the Note 5, and ±1.6 dB for the Galaxy S8. When these are added in quadrature, σ_t² = σ_c² + σ_n², we get a total accuracy of σ_t = ±3.3 dB for the Note 5, and ±2.1 dB for the Galaxy S8. Although the Galaxy S8 appears to perform marginally better than the Note 5, such large residual uncertainties after calibration would not meet the requirements of an IEC 60651 Type II metering system (σ = ±1.0 dB), which is considered a general-purpose sound-metering system adequate for field work. Nevertheless, there may be some applications for which this level of post-calibration accuracy is adequate for less formal investigations.
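The quadrature sum of the calibration residual and the copy-to-copy dispersion can be sketched as follows. This is an illustrative calculation only; the function name and the dictionary layout are assumptions, and the inputs are the rounded values quoted above, so the last digit of a result can differ slightly from a figure computed from unrounded data.

```python
import math

# Type II tolerance quoted in the text (dB).
TYPE_II_TOLERANCE_DB = 1.0


def total_uncertainty(sigma_c, sigma_n):
    """Combine calibration residual and copy-to-copy dispersion in quadrature."""
    return math.hypot(sigma_c, sigma_n)


# (sigma_c, sigma_n) in dB, using the rounded values quoted in the text.
phones = {
    "Note 5": (3.3, 0.4),
    "Galaxy S8": (1.6, 1.5),
}

for name, (sigma_c, sigma_n) in phones.items():
    sigma_t = total_uncertainty(sigma_c, sigma_n)
    meets_type_ii = sigma_t <= TYPE_II_TOLERANCE_DB
    print(f"{name}: sigma_t = {sigma_t:.2f} dB, meets Type II: {meets_type_ii}")
```

Because the larger term dominates a quadrature sum, the Note 5 total is set almost entirely by its calibration residual, whereas both terms contribute comparably for the Galaxy S8.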
Smartphone platforms use a variety of microphone technologies, and so we need to compare sound sensitivity in each phone and also determine their zero-points. A Samsung Note 5 and iPhone 6s were placed side-by-side in a quiet room with the same Decibel 10th app in operation. Initially, the Samsung phone registered the quiet conditions as being near 65 dB. The calibration adjustment feature in the app was used to shift the Samsung data by −13 dB so it matched the iPhone values, which were believed to be more typical of the quiet room sound level near 40 dB.
To examine sound sensor performance at the quietest levels, the website Online Tone Generator (https://www.szynalski.com/tone-generator/) (Szynalski 2019) was used; it provides a tunable (0 to 20 kHz) pure tone whose sound level can be adjusted with the computer speaker volume slider. A tone at 1,000 Hz was selected at a BK-Meter level of 40 dB and played in a quiet room where the ambient sound level was 35 dB as measured by the BK-Meter. The Decibel 10th app was activated on three smartphones: Samsung Galaxy S8, Note 5 and iPhone 6s, and data was taken at a cadence of five samples/sec for several minutes and stored in .xls spreadsheets for analysis. Figure 6 shows three minutes of data sampled at the app's fast cadence of 0.2 seconds. The standard deviation of the data shows that for the iPhone system, its measurement noise is ±1.5 dB, while the Samsung Note 5 is much quieter at ±0.5 dB. The Samsung smartphone, with an offset of −10.5 dB added, seems to be a more sensitive system (lower σ) than the iPhone 6s.

Relative performance tests
An iPhone 6s was placed on a flat surface face up with the microphone exposed. The results for various apps and environmental circumstances are shown in Table 4. Given that the dB scale is logarithmic, the variation in the measured values from app to app under the same environmental conditions is significant when compared with the ±5 dB for the individual app measurements.

Absolute performance tests
Starting at 75 dB, the radio volume was lowered in steps of 5 dB on the BK-Meter and the smartphones' sound level values were noted until the ambient basement sound level was reached. The results of these measurements are shown in Figure 7. The linear regression calibration curve has a slope of 1.07, after the measured smartphone values were adjusted by applying an offset of −12 dB (Note 5), +5 dB (Galaxy S8) and −8 dB (iPhone) to reduce the dispersion of the measured values relative to the calibrated values.

Discussion
A previous investigation of smartphones by Odenwald (2018, 2019) for accurate detection of radiation and magnetism, together with a recent investigation of smartphone surface gravity measurements by Odenwald (2018), suggested that these sensors and apps were able to return measurements adequate for several formal and informal citizen science investigations. The current study of smartphone-based sound and light measurements reveals that for some applications this technology may also find a niche suitable for formal citizen science and informal crowdsourcing projects. What the preceding analysis shows is that although sound measurements can lead to consistent results, the diversity of the light measurement systems used by smartphones leads to a far more complex challenge.

Light
Providing a simple calibration for light meters is a challenge because the apps themselves are based on different operating principles. The majority of the iOS apps use the camera image EXIF data to calculate an illuminance, while Android-based apps predominantly use the dedicated light sensor. Also, the dynamic range from 1 to 500,000 lx is a challenge to simulate under controlled conditions with constant color temperature.
We have already seen that illuminance measurements with smartphones involve a wide range of apps and sensors. The iOS phones typically use the back camera and extract illuminance information from the EXIF data stream, but this creates discrete jumps in the scene illuminance value (Figure 1) as the f-stop, exposure time, and ISO settings are incrementally changed. For very low light levels below 100 lx, these jumps can amount to more than 50% of the illuminance value, whereas for very bright scenes above 1,000 lx, the uncertainty falls to below 5%. Android phones routinely use a dedicated light sensor on the front face of the phone adjacent to the front camera. Although this data is reported as a continuous stream of values with a digitization of approximately 1 lx, resulting in a measurement error of <1% above 100 lx, these light sensor-based apps require that the phone face the user, which means that shadowing of the sensor will invariably corrupt the data. Moreover, unlike the other systems studied (accelerometer, magnetometer, sound level), none of the common light-measuring apps allow for the data to be stored in an exportable data file (e.g., .xls). Consequently, the measurement process requires constant human intervention, and a reliance on the clarity of the smartphone app displays, some of which cannot be easily read under high-illuminance conditions. DIAL (2016), as in the current study, also used the smartphones with no diffusing dome over the camera lens. In that study, it was reported that the Galactica app was 180% above the reference value at 10 lx and 50% below the reference value at 10,000 lx using a low-voltage halogen lamp, a compact fluorescent lamp (2,700 K), and an LED lamp (3,000 K). Our results for a 5,770 K solar illuminance also found similar over- and underestimates. A more detailed investigation also turned up other issues that have significant consequences for light measurement accuracy and precision.
Figure 7: iPhone 6s (dot), Samsung Note 5 (square) and Galaxy S8 (triangle). The smartphone data obtained by the Decibel 10th app has been shifted by: Note 5 (−12 dB), Galaxy S8 (+5 dB), and iPhone (−8 dB) to place them as close as possible to the same linear regression calibration curve (solid line).
Detailed measurements made with one platform (iPhone 6s) and one camera-based app (Galactica) yield a uniform scaling (Figure 2) that is very close to 3.4 from 3,000 to 50,000 lx, and 3.1 from 100 to 3,000 lx (Figure 3). A significant reason this is not exactly 1.0 is that the Extech measurement was made with a diffusing dome whereas the smartphone measurements were made with no diffusing dome covering the camera lens.
By applying the appropriate calibration curves to the high, medium, and low-illuminance conditions displayed in Figures 2, 3 and 4, the large errors in illuminance measurement exceeding 150% reported by DIAL (2016) can be considerably reduced. The post-calibration errors for these three ranges are σ = ±12% (high range), σ = ±16% (medium range), and σ = ±18% (low range). The growth in error toward lower illuminance largely follows from the jumps in estimated illuminance calculated from the EXIF data.
Because of the differing regression slope scalings between apps that use camera-based or dedicated sensors, this calibration needs to be repeated for each app and smartphone combination. Table 5 is a list of common light sources and their illuminance level as gauged by the LT-300 light meter. Compare your light-metering app and platform with these values to establish your own calibrated scale relative to the professional metering system.
For example, if you use the GE Daylight LED bulb (5,000 K) at a distance of 1 meter, you should measure a direct (not reflected) illuminance of 263 ± 3 lx. The soft-white (2,700 K) bulb should give a slightly lower direct illuminance of 235 ± 3 lx at 1 meter.
Calibration can proceed by using the Galactica app to measure the reflected illuminance from a white foamboard or other white paper (albedo = 0.7). Multiply the incident illuminance from the light source in Table 5 by the albedo factor to get the expected reflected illuminance from the white surface. The scale factor for the app and smartphone combination is then the ratio of the calibrated reflected illuminance (lx) to the measured reflected illuminance (lx). For subsequent reflected illuminance measurements using Galactica, divide the app's lx values by the appropriate scale factor derived from the regression curves above to get the calibrated, reflected illuminance.
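The reflected-light calibration steps just described can be sketched in a few lines. This is a hedged illustration, not a procedure taken from any app: the function names and the 600 lx app reading are hypothetical, while the 263 lx incident value and the albedo of 0.7 are the figures quoted in the text. The factor is taken here as calibrated divided by measured, so applying it is a multiplication; if the factor is instead read from a regression slope expressed as measured lx per calibrated lx, the same correction is a division.

```python
def expected_reflected_lx(incident_lx, albedo=0.7):
    """Expected reflected illuminance from a white surface of the given albedo."""
    return incident_lx * albedo


def scale_factor(calibrated_lx, measured_lx):
    """Ratio of calibrated to app-measured reflected illuminance."""
    if measured_lx <= 0:
        raise ValueError("measured illuminance must be positive")
    return calibrated_lx / measured_lx


def apply_scale(app_lx, factor):
    """Bring an app reading onto the calibrated scale (factor = calibrated / measured)."""
    return app_lx * factor


# GE Daylight LED bulb (5,000 K) at 1 m, direct illuminance quoted in the text.
incident = 263.0
expected = expected_reflected_lx(incident)

# Hypothetical app reading of the white foamboard under that bulb.
app_reading = 600.0
factor = scale_factor(expected, app_reading)

print(f"expected reflected illuminance: {expected:.1f} lx")
print(f"scale factor (calibrated/measured): {factor:.3f}")
print(f"corrected reading: {apply_scale(app_reading, factor):.1f} lx")
```

A single factor of this kind is only meaningful within one illuminance range, since the regression slopes reported for the high and medium ranges differ.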
As a final note, for some applications involving solar power electrical systems, a conversion between lx units of incident illuminance and W/m² is needed. Although this is not a feature provided by the light-measuring apps considered in this study, I used a DT-1307 solar power meter manufactured by CEM, Ltd. ($90), which has a 1 W/m² resolution and an accuracy of ±10 W/m². Both the LT-300 light meter and DT-1307 solar power meter use what at least superficially appear to be identical, remote, silicon photodiode light sensors under a hemispherical diffusing dome at the end of a coiled connecting cable. Simultaneous measurements under the same light conditions spanning a factor of 10,000 in lx show a linear scaling such that the constant of proportionality between the lx and power (flux) scales is 115 ± 10 lx per W/m².
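The lx-to-power conversion implied by this proportionality constant can be sketched as below. The function name is illustrative, and the first-order error propagation is an assumption about how the ±10 uncertainty in the constant should be carried through; the constant itself applies only to the meters and light conditions described above.

```python
# Empirical constant from the text: lx per (W/m^2), with its quoted uncertainty.
LX_PER_W_M2 = 115.0
LX_PER_W_M2_ERR = 10.0


def lux_to_irradiance(lux):
    """Convert incident illuminance (lx) to irradiance (W/m^2).

    Returns (value, uncertainty), where the uncertainty reflects only the
    quoted error in the proportionality constant (first-order propagation).
    """
    value = lux / LX_PER_W_M2
    uncertainty = value * (LX_PER_W_M2_ERR / LX_PER_W_M2)
    return value, uncertainty


# Example: a bright daylight reading of 100,000 lx.
irradiance, err = lux_to_irradiance(100_000)
print(f"{irradiance:.0f} +/- {err:.0f} W/m^2")
```

The relative uncertainty of the result is fixed at about 9% by the constant, so any uncertainty in the lx reading itself would need to be added on top of this.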

Sound
Detailed acoustic studies (Brown and Evans 2011) of an iPhone 3GS provide some insight into how well older smartphones can be used for making precision acoustic measurements yielding accuracies of approximately ±4.0 dB using acoustic sources of known intensity and frequency; however, they do not indicate how to calibrate the sensor data to place them on a standardized measurement scale. Kardous and Shaw (2014) selected nine different smartphone platforms available by January 2013 and examined ten iOS apps and four Android apps that purportedly measured sound intensity. Their tests using a calibrated acoustic generator found that between 65 and 95 dB, the smartphones and apps were able to reproduce the calibrated sound intensities to about ±2 dB. Robinson and Tingay (2014) tested smartphones under more realistic real-world conditions of usage. A comparison of two platforms (Galaxy S2 and Nexus 7) available by 2013 with five different apps yielded an average ±11.8 dB difference between reported and calibrated sound levels. In contrast to the controlled studies, they concluded that under real-world conditions, smartphones were generally unreliable in measuring sound intensity. The current investigation involving the more recent iPhone 6s, Samsung Galaxy S8, and Note 5 phones, along with a variety of apps for each platform, reveals considerable variability. When tested under quiet conditions using the same app (Decibel 10th), the Samsung Note 5 offers a lower noise level (±0.5 dB) than the iPhone 6s (±1.5 dB); however, this lower measurement noise was offset by the fact that compared with the calibration measurement of 30 dB, the two phones gave very different measures of 42 dB (iPhone) and 52 dB (Note 5). In fact, a comparison of the sound measurements for several different apps operating on the iPhone 6s (Table 4) revealed very discordant measurements simply due to the particular app selected.
Of the apps tested, Decibel 10th gave the most accurate sound-level values compared with the BK-Meter calibration instrument. The range of measurements for a quiet room spanned nearly 30 dB, which is a factor of 1,000 in acoustic power. For the other end of the acoustic range, a rock concert measurement at the same location in the arena varied by 20 dB, corresponding to a factor of 100 in power.
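The power factors quoted here follow directly from the definition of the decibel, in which a difference of ΔL dB corresponds to a power ratio of 10^(ΔL/10). A minimal sketch, with an illustrative function name:

```python
def power_ratio(delta_db):
    """Acoustic power ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 10)


# Spread between apps in a quiet room and at a rock concert, as quoted in the text.
for label, spread_db in [("quiet room", 30), ("rock concert", 20)]:
    print(f"{label}: {spread_db} dB spread = factor of {power_ratio(spread_db):,.0f} in power")
```

Each additional 10 dB of disagreement between apps therefore corresponds to another factor of 10 in inferred acoustic power.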
Calibration of these measurements poses a challenge because, although the measurements obtained with a given app and smartphone are linear over the range from 40 dB to 80 dB (Figure 7), below 40 dB the measurement dispersion dramatically increases down to the nominal digitization floor of 30 dB. After calibration, the residual dispersion in the measurements for the three phone models is σ = ±3.3 dB for the Note 5, ±1.6 dB for the Galaxy S8, and ±1.6 dB for the iPhone 6s. These dispersions appear to be quite small and match the results of Brown and Evans (2011). However, without properly comparing the measured values for each app/phone combination against a calibration standard, the data could not be corrected for the offsets of −12 dB (Note 5), +5 dB (Galaxy S8) and −8 dB (iPhone 6s). This would lead to a large systematic error component for each measurement of approximately ±15 dB, which matches the results of Robinson and Tingay (2014).
The accuracy of the calibrated measurements between ±1.6 and ±3.3 dB is clearly a desirable goal for using smartphones to make high-quality measurements; however, the challenge is that, when combining data from multiple observers, you must keep track of each model and app being used, and apply a calibration to them based on a procedure such as the one described above. After applying a model-dependent offset, the three models tested (Figure 7) gave very nearly the same linear correlation with an average slope of 1.07 in the regression, so assuming that this is a common feature of all smartphone models, we need to establish the offset at only one fiducial point to fix the calibration curve according to y = 1.07x + C. The value for C can be established by using the smartphone to measure a calibrated sound source (x) so that the measured value (y) gives C = y − 1.07x. To within an accuracy of about ±5 dB, a convenient calibration level could include a very quiet basement room (35 dB) or some other convenient reference point that the project developer establishes before recruiting participants.
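The single-point calibration described above can be sketched as follows, under the stated assumption that the slope of 1.07 is common to all phone models. The function names and the example phone readings (49.5 dB and 60.0 dB) are hypothetical; only the slope and the 35 dB reference level come from the text.

```python
# Average regression slope reported for the three phone models.
SLOPE = 1.07


def offset_from_reference(measured_db, reference_db, slope=SLOPE):
    """Solve y = slope * x + C for C using one calibrated reference level x."""
    return measured_db - slope * reference_db


def calibrate(measured_db, offset, slope=SLOPE):
    """Invert y = slope * x + C to recover the calibrated level x."""
    return (measured_db - offset) / slope


# Hypothetical example: the phone reads 49.5 dB in a quiet basement room
# whose calibrated level is taken to be 35 dB.
C = offset_from_reference(measured_db=49.5, reference_db=35.0)

# Apply the same offset to a later reading from the same phone and app.
later_reading = 60.0
print(f"offset C = {C:.2f} dB")
print(f"calibrated level = {calibrate(later_reading, C):.1f} dB")
```

In a multi-observer project, one offset would be stored per phone model and app combination and applied on the back end, so that participants only need to report raw readings together with their model and app.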

Citizen Science Applications
Smartphone sensor utilization follows a common bias that favors smartphones as data-gathering camera systems or social messaging text-based applications. For example, the project Bugs in our Backyards (https://www.bugsinourbackyard.org/) has participants photograph unusual insects and upload the images, while Fish Watchers (https://www.fishbase.us/FishWatcher/menu.php) asks participants to record the type of fish, size and location. Very few programs actually use smartphone sensors to make measurements, although a small number of sensor-based citizen science projects have appeared. One straightforward application is in the design of student-based crowdsourced projects such as those found at Anecdata.org, specifically Silent Earth and Earth Rotation Detector.
A previous study of smartphones used as radiation dosimeters and magnetometers (Odenwald 2019) demonstrated their utility for citizen science projects, and in the case of the magnetometers, identified a citizen science project, CrowdMag, in which these sensors are being used to map Earth's magnetic field. A prospective citizen science project is also under development to detect geomagnetic storms (Odenwald 2018). In general, the use of smartphones to gather quantitative data is still in its infancy. A survey of more than 1,600 citizen science programs cataloged by SciStarter finds fewer than 50 that use smartphones as mobile data-gathering platforms, and the majority of these use only the camera and texting capabilities to provide project data. With the exception of CrowdMag, none of these projects require the calibration of their data. CrowdMag collects data on the smartphone model being used and accesses the smartphone magnetometer via its own app software, so that a common app is used that can be directly calibrated given the smartphone model. The other apps use smartphone camera image data and need not be calibrated.
Generally, it is expensive to build a dedicated project app with controllable properties, so the use of commercially available, low-cost apps is an intriguing cost-cutting opportunity for future citizen science projects. Odenwald (2019) described how common magnetometer apps might be adapted for more rigorous project applications. The advantage of the magnetometer apps is that they all use the same 3-axis magnetometer in the smartphone, and the differences occur only in the manner of displaying and storing the data. The quality of the data remains uniformly high when allowance is made for temperature variations and the appearance of glitches, which can be eliminated by simple adjustments to measurement protocols. As described by Odenwald (2019) and implemented by CrowdMag, smartphone magnetometers can be used successfully in citizen science applications.
For the current light- and sound-level measurements the situation is more complicated. When compared with calibrated sources and measurement instruments, this investigation has found that the sound and illuminance values can be considerably discrepant despite the enormous promise of offering millions of mobile sensors from which to design future citizen science projects. Although sound measurements, when calibrated for the particular app being used, can yield accuracies of ±1.5 dB, the diversity of light-metering approaches and apps leads to a more challenging calibration process that may be prohibitive and yield accuracies of ±12% at illuminances above 5,000 lx corresponding to full-daylight conditions. The most significant difficulty is that the dedicated light sensors used by Android phones are front-mounted and corrupted by shadowing by the observer, while back-camera metering leads to large jumps in measured illuminance below 3,000 lx. Moreover, there is no simple scaling relationship between the reflected and incident illuminance that applies over the dynamic range of the metering from 0 to 500,000 lx. Assuming that the observer can follow a protocol to compute and report a calibrated illuminance on the front end, or that the project developer can track the smartphone model and app used and then perform the calibration on the back end, what can we do with smartphone light and sound meters that have been calibrated?
There have been many studies of ambient sound levels. For example, several projects (Record the Earth 2018; Noise Planet 2018; Sound Around You 2018) have participants use their smartphones to upload recordings of local sounds. The National Park Service has also created a mathematical model of the sound levels across the United States to study ambient acoustic conditions primarily in the national park system (NPS 2017). With smartphone sound sensors and apps such as Decibel 10th, capable of making measurements at approximately ±1.5 dB, the opportunity exists for creating citizen science projects in which the average person lacking an expensive precision sound meter can use their own smartphone to measure sound levels with an accuracy comparable to the IEC 60651 Type II sound level meter standard, which is the general-purpose grade for field work. Light-metering applications are far more problematic.
It is not expected that there are many citizen science applications for which ambient light measurements at ±12% would be suitable; however, these sensors might be useful in informal education and crowdsourcing applications.
The Sunlight Tracker project at Anecdata.org is a simple application of daylight photometry to measure the seasonal and latitudinal change in noontime sunlight levels. Participants use one of the light-meter apps described in this paper to measure the clear-sky, noontime light levels under direct sunlight conditions on a series of days during the year. These measurements collectively create a global map of the seasonal change in insolation in absolute lx units due to the change in the elevation of the sun above the horizon at different latitudes throughout the year.
In 2019, the Silent Earth project was developed on the Anecdata.org platform and uses smartphone technology to make spot measurements of the minimum sound volume in a participant's environment. This project is unique because, although current projects make sound recordings, none of them attempt to measure an average sound volume in decibels. For instance, no one has measured the sound levels from the Sahara Desert, in the tundra above the Arctic Circle, or from the top of the Eiffel Tower.
The long-term goal of Silent Earth is to create an interesting geographic database of ambient sound levels in the nominally quietest places to which participants have access through their travels. There may be some longer-term scientific value in this database that would elevate it to a traditional citizen science project, such as comparing it with the NPS national sound map. Silent Earth is currently partnering with the NPS to map the sound levels in the national park system. Meanwhile, Silent Earth will stimulate personal inquiry into a question that many participants and communities might like an answer to: Where can I find some peace and quiet?

Conclusions
Based on the calibrated results from a variety of apps and platforms, the direct use of smartphone light and sound sensor systems for conducting some types of citizen science experiments is warranted if adjustments to the recorded data are made using a small set of calibration curves. Properly calibrating the data can significantly reduce the measurement uncertainty, in some instances by an order of magnitude.
Light intensity can be measured after calibration to an accuracy of ±12% under high-illuminance conditions but there is significantly worse performance below 3,000 lx. Sound measurements have a random noise component of approximately ±1.5 dB, but systematic errors that are platform dependent can be as high as ±15 dB. These systematic shifts can be removed on a model-to-model basis and preserve the apparently linear relationship between the smartphone and professional sound scales from 40 to 90 dB.
Overall, given the various formal and informal studies by previous investigators, my results were better than expected. When proper measurement and calibration protocols are applied, smartphone sensors can generate good-quality data that compare reasonably well with professional-grade systems, but at far lower cost. This may open the door for citizen science and crowdsourced applications, but the quality of the calibrated measurements appears to be generally more suitable for informal K-12 educational activities.

Additional Files
The additional files for this article can be found as follows: