Smartphone Sensors for Citizen Science Applications: Radioactivity and Magnetism

Odenwald, S. 2019. Smartphone Sensors for Citizen Science Applications: Radioactivity and Magnetism. Citizen Science: Theory and Practice, 4(1): 18, pp. 1–15. DOI: https://doi.org/10.5334/cstp.158


Introduction
In recent years, citizen science projects have begun to adapt smartphones to provide novel crowdsourcing opportunities (Stoop 2017) for collecting data across fields as diverse as bird watching (eBird) and aurora sightings (Aurorasaurus) to precipitation monitoring (mPing) and meteor-spotting (Meteor Counter). Virtually all citizen science project developers agree that the enormous popularity of smartphones (Dehnen-Schmutz 2016) makes them a highly desirable platform to incorporate into citizen science projects. However, the overhead cost to smartphone-enabling a prospective citizen science project can be a daunting expense that may in some instances dissuade scientists from developing a project. A recent survey by Yarmosh (2017) found that large industrial developers require upwards of US $500,000 to $5 million to develop a smartphone application program (app). However, shops with development teams smaller than a few people can build apps at much less expense depending on the level of interactivity and advanced features required. For example, Sung (2017) used a $100,000 grant from the US National Science Foundation (NSF) to create an app and 3-D printed lens system that converts a smartphone into a microscope, and the RINSE project (Reducing the Impact of Invasive Non-Native Species in Europe) developed a smartphone app for iOS and Android at a cost of $24,000 (Adriaens et al. 2015). Those who are building very simple apps and can do the coding themselves can bring the cost down to $10,000 (CodeWithChris 2018).
A second potential approach to developing citizen science projects featuring smartphone interactivity is to adapt freely available software already developed to project needs. Very little has been done to explore this low-cost approach to citizen science, however, possibly because project developers prefer a platform that prominently features their own "logo" or may not believe that third-party apps can meet their technical needs for data accuracy and reporting.
Since the introduction of the iPhone 3GS in 2009, smartphones have routinely come equipped with a suite of sensors to determine their orientation in space and to provide light-metering data for the operation of digital cameras, among other functions. Literally thousands of apps have been written to access various sensor outputs and numerically display their values, but assessing whether smartphone sensor measurements provide scientifically accurate information poses a severe challenge.
Normally, science-grade measurements are made with instruments whose properties and construction are well understood. Many instruments are designed and fabricated by the scientists themselves, so their inner workings and operating principles are also well understood. This also applies to the software used to calibrate and analyze the data. Smartphones, however, represent a challenge because not only are their detailed designs and functions hidden under the shroud of corporate secrecy, but the designers who create the apps that access the raw data are usually not willing to discuss the equally proprietary details of how their apps function. Consequently, an assessment of how well smartphone sensors perform must regard these platforms as essentially "black boxes." In this paper I will assess how well a variety of basic smartphone sensors perform and function by analyzing data streams provided by the sensors and comparing them against professional-grade and accurately calibrated sensor systems.

Platforms
There are two major operating systems for smartphones: Android and iOS. Android runs on more than 24,000 different Android-compatible platforms (Morani 2015), and more than 50 models are available for iOS (Wikipedia 2018). With so many different hardware configurations, comparison testing is challenging. Fortunately, the sensor systems are far less variable and constitute only a small number of unique designs. The exact sensor system model found in a particular hardware configuration is generally very difficult to glean from public data on these systems. Consequently, I have decided to work backwards, starting from commonly available smartphones and researching these systems. For this research, I used the iPhone 6s (iOS) and the Samsung Note 5 (Android). An important caveat is that normally one would like to examine multiple copies of identical platforms to assess smartphone-to-smartphone variability, which is an issue for validating how consistently these presumably identical copies make the same measurement. The practical problem in this approach is that purchasing several smartphones is expensive, with each requiring its own access plan with a carrier to be able to download apps from an online store or to email data. One might ask to borrow smartphones from family members or colleagues, but few people will cooperate because smartphones are in constant use and are considered private devices. Nevertheless, through the NASA education grant that funded this research, four Samsung Note 5 and four Samsung Galaxy 8s were temporarily acquired along with the author's own iPhone 6s so that some comparison testing could be performed. A comparison of the relevant, though publicly incomplete, sensor information is shown in Table 1. The Samsung specifications were found by downloading the Spec Device app. The physical areas of the camera arrays are estimated in square millimeters for square arrays using A = (# Megapixels) × (pixel size in microns)².
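The area estimate above can be sketched in a few lines of Python. This is only an illustration: the function name and the example camera values are assumptions and are not taken from Table 1. The factor of 10⁶ pixels per megapixel cancels against the 10⁶ square microns per square millimeter, which is why the product comes out directly in mm².

```python
def sensor_area_mm2(megapixels: float, pixel_size_um: float) -> float:
    """Estimate the camera array area in square millimeters.

    N pixels of side p microns cover N * p**2 square microns. With N given in
    megapixels (1e6 pixels) and 1 mm^2 = 1e6 um^2, the two factors of 1e6
    cancel, so the product is already in mm^2.
    """
    return megapixels * pixel_size_um ** 2


# Hypothetical camera: 12 megapixels with 1.22 micron pixels.
area = sensor_area_mm2(12.0, 1.22)
print(f"Estimated array area: {area:.2f} mm^2")
```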

Apps
The basic operation of an app involves three components: a sensor interface, an application code, and a user interface. Development tools such as Xcode, Swift, and iOS SDK (Klosowski 2015) for Apple systems (iPhones, iPads) and Android SDK, Eclipse, and Android Studio (Ravenscraft 2014) for Android platforms provide the necessary resources to bring app design and development into the hands of even a suitably motivated high school student. The quality of the data reported by an app is limited by four considerations. First is the raw output of information in the sensor bus. This can be limited by the range and accuracy of the analog-to-digital converter model that is used, as well as the quality of the analog sensor and its susceptibility to environmental factors such as temperature or vibration. The second consideration is the manner in which the app developer chooses to display these data. Data with decimal accuracy may be truncated or rounded to an integer format for reasons of display artistry, or may be represented in some less-than-ideal format that reduces the user's ability to discern numeric values.

For example, some apps use digital displays for magnetic field strength, while others use a less-precise analog "dial" display. Third, there is the issue of data logging, which is crucial for many citizen science applications, but is not a regular feature of most apps, especially those that are available at no cost. Finally, apps developed for one platform may not work the same way on another platform. Part of this platform performance variability is due to technology improvements, but other factors intrinsic to the software/smartphone interface may also lead to data quality changes. In this section, I will investigate a small number of apps that report basic physical data and provide permanent logs, usually .csv files, which may be used offline for subsequent analysis. This is not meant to be an exhaustive survey, but it will minimally provide data points useful to scientists in assessing whether existing apps may be used for their specific applications. Citizen science project developers should undertake their own calibration studies to confirm platform/app suitability under their specific observing and measurement requirements.
In the discussions to follow, averages for N measurement values A_i have been calculated in the usual way according to Equation 1, and the dispersion of the measurements around the average value <A> has been calculated according to Equation 2. The terms "dispersion," "sigma," "standard deviation," and "error" all refer to values computed from Equation 2. For measurements of quantities limited only by Gaussian random noise, the value of σ for the averaged measurement should be reduced by a factor of 1/√N as more observations are averaged. If there are uncorrected systematic effects in the data, this progressive reduction will not continue, and σ will plateau at some irreducible minimum value that may be significantly above the expected random noise level.
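A minimal sketch of these statistics in Python follows. The function names and the sample readings are invented for illustration, and the N − 1 normalization in the dispersion is an assumption, since the exact form of Equation 2 is not reproduced in this text.

```python
import math


def mean(values):
    """Average <A> of the N measurements."""
    return sum(values) / len(values)


def dispersion(values):
    """Standard deviation about the mean (assumes N - 1 normalization)."""
    avg = mean(values)
    return math.sqrt(sum((v - avg) ** 2 for v in values) / (len(values) - 1))


def error_of_mean(values):
    """Expected uncertainty of the average for purely Gaussian noise."""
    return dispersion(values) / math.sqrt(len(values))


# Made-up readings, for illustration only.
readings = [0.12, 0.14, 0.13, 0.15, 0.11]
print(f"mean = {mean(readings):.3f}")
print(f"sigma = {dispersion(readings):.4f}")
print(f"sigma/sqrt(N) = {error_of_mean(readings):.4f}")
```

If the measured scatter of repeated averages stops shrinking at this 1/√N rate, that is the plateau attributed above to uncorrected systematic effects.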

Radiation
Radiation comes in two forms: particle and electromagnetic. Particle radiation includes protons, neutrons, electrons, and the nuclei of many kinds of atoms such as helium and iron. Electromagnetic radiation includes all forms of light, which travel at 300,000 km/sec and include gamma rays, x-rays, ultraviolet, infrared, and radio waves. Radiation in all its forms can be measured accurately in terms of units called Grays and Sieverts. To understand the basics of radiation dosimetry, some terms and concepts need to be defined, following the definitions provided by Odenwald (2013).

Radiation dosimetry basics
Almost as complex as the units for light illumination and flux are the adopted units and formats for reporting radiation intensity. Radiation dose is a measure of the amount of total energy that is absorbed by matter over a period of time. This matter can be human tissue or sensitive computer circuitry. The unit for dose is the Gray (1 Gy = 1 Joule of energy deposited in 1 kilogram of matter). The term "Dose Equivalent" compares the amount of absorbed energy in Grays to the amount of tissue damage it produces and is measured in Sieverts (Sv). Each type of radiation, for the same exposure level in Grays, produces a different amount of damage. Mathematically, this is represented by the equation: Dose Equivalent (in Sieverts) = Dose (in Grays) × Q. X-rays and gamma-rays produce "one unit" of tissue damage, so for this kind of radiation Q = 1, and this is also the case for beta radiation. For alpha particles, Q = 15 to 20, and for neutrons, Q = 10. Typically, our total annual radiation dose is about 3.7 milliSieverts, or alternatively in terms of dose rate this is about 0.4 µSv/hr. Under certain circumstances individuals can be exposed to significantly higher dose rates. For example, during the 2011 Fukushima reactor meltdown in Japan, residents in Tokyo some 240 km away temporarily experienced levels of 0.8 µSv/hr. By comparison, if you are traveling in a commercial jet at an altitude of 33,000 feet, you can expect a dose rate of about 2 µSv/hr for equatorial and mid-latitudes and about 7.0 µSv/hr for polar latitudes for a few hours.
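The two conversions used in this paragraph can be sketched as follows. The function and dictionary names are illustrative assumptions; the Q values are those quoted above, with alpha particles left out because the text gives a range rather than a single value.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours

# Quality factors quoted in the text (alpha omitted: quoted as a range, 15 to 20).
Q_FACTORS = {"x-ray": 1.0, "gamma": 1.0, "beta": 1.0, "neutron": 10.0}


def dose_equivalent_sv(dose_gy: float, radiation: str) -> float:
    """Dose Equivalent (Sv) = Dose (Gy) x Q."""
    return dose_gy * Q_FACTORS[radiation]


def annual_msv_to_usv_per_hr(annual_msv: float) -> float:
    """Convert an annual dose in mSv to an average dose rate in uSv/hr."""
    return annual_msv * 1000.0 / HOURS_PER_YEAR


print(f"{annual_msv_to_usv_per_hr(3.7):.2f} uSv/hr")          # average rate for 3.7 mSv/yr
print(f"{dose_equivalent_sv(0.001, 'neutron'):.3f} Sv")       # 1 mGy of neutrons
```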
The detection of radiation in order to measure dose rates depends on the type of material in the detector, the energy of the particles, and the type of particle involved, so there is no single detection system that works for all possibilities. High energy gamma-rays and neutrons can penetrate matter relatively easily, but the heavier, charged alpha particles can be easily shielded before they reach the detector. There are two ways to measure particle radiation and gamma rays using smartphones. You can obtain a plug-in module that converts your smartphone into a Geiger counter, or you can use the smartphone camera as a track (actually a flash) detector. Each method has its advantages and disadvantages.
Camera methods involve closing off the front or back camera apertures so that the camera chip is in fully dark mode. High-energy particles such as gamma rays and neutrons will collide with one or more pixels in the camera array chip and cause them to "light up" with excess charge. Once the data have been corrected for the unavoidable "dark noise" from the pixels themselves, the result is a count of the number of hits (flashes) per sampling interval, usually in counts per minute (CPM). This can be related to the level of radiation measured in µSieverts/hour in your environment after calibration. External sensors usually plug into the audio jack of your smartphone. They are small-volume, solid-state devices that react to energetic particles by producing a voltage or current spike that is picked up through the smartphone headphone jack and counted.

Radiation apps
Two radiation-counting apps were tested on an iPhone: Radioactivity Counter and Smart Geiger Radiation Counter. The former system uses the iPhone camera and can measure gamma radiation and some beta radiation that is energetic enough to pass through the camera case. It cannot measure alpha rays, which are almost entirely blocked by the case. The Smart Geiger uses a plug-in sensor, which costs US $24.00 and installs in the iPhone's headphone jack. Both apps allow data to be logged and exported via email so that they can be downloaded into an Excel spreadsheet. The Radioactivity Counter app has extensive literature provided online by Klein (2018) for how the app was professionally calibrated for a variety of platforms and camera arrays. A comprehensive discussion is also available under the "I" key within the app. Successful operation requires that the user calibrate the app/smartphone to detect and set the background level. For Smart Geiger, there is no accessible literature, especially in English, and the only adjustable parameter is the integration time for the observation, which can be 3, 5, or 10 minutes.
The performance of these apps on the same iPhone platform was compared against a professional-grade Geiger Mueller dosimeter obtained from Mazur Instruments. The PRM-9000 (US $595.00) is a compact, hand-held device with a single display window, but featuring a variety of data summaries including CPM; µSv/hr dosage rates for average, maximum, and minimum; and current conditions. A port is also available to export the data to a laptop or PC via a USB connector. The instrument is suitable for regulatory inspections and for the detection, measurement, and monitoring of broad spectrum, low energy radionuclides, including naturally occurring radioactive material.
Unlike other environmental situations in which a wide range of measurement possibilities can be found, radiation dosimeters are limited to only a few accessible possibilities. You can make measurements on the local background dose rate or in planes at a variety of altitudes. Other possibilities include measuring the radioactivity of granite kitchen counter tops, or samples of minerals known to have some activity. Under these circumstances you will not generally exceed about 5 or 10 µSv/hr, nor should you actively pursue finding conditions where the prolonged ambient radiation is much higher for obvious safety reasons. This restriction means that the smartphone systems will be tested at their lowest operating ranges rather than "mid-scale" where the random root-N sampling noise per measurement would be lowest.

Comparison of dosimetry systems
The iPhone radiation apps and the Mazur dosimeter were compared in a variety of accessible environments to establish their consistency. Because the count rates were very low, the measurements shown in Table 2 were carried out for an hour, and the count rates were averaged to obtain a measurement precision of approximately ±10%. An important caveat is that the count rates in CPM between systems with differing sensors cannot be directly compared. The number of counts or interactions between the radiation and the sensor depends upon such factors as the surface area or volume of the detector, the composition of the detector and the surrounding shielding, and the method of the interaction. The Mazur dosimeter is triggered by conducting, ionized tracks appearing between two high-voltage plates as the particle passes through the detector, while Radioactivity Counter and Smart Geiger rely on direct charge/energy deposition within the sensor volume. The resulting CPMs therefore cannot be directly compared if the camera array areas are different. However, each system is calibrated by its developer by comparing the system's CPM against a set of test sources that deliver a calculated dose rate in µSv/hr, so the dose rates reported by each system can be directly inter-compared.
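To illustrate why low count rates call for long integrations, the sketch below assumes pure Poisson counting statistics, in which the relative uncertainty on N accumulated counts is 1/√N. The function names and the example count rate are hypothetical, and dark-noise subtraction is ignored.

```python
import math


def relative_poisson_error(total_counts: float) -> float:
    """Fractional uncertainty on a Poisson count of N events: 1/sqrt(N)."""
    return 1.0 / math.sqrt(total_counts)


def counts_needed(relative_error: float) -> int:
    """Smallest number of counts giving at most the requested fractional error."""
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(1.0 / relative_error ** 2, 9))


def minutes_needed(rate_cpm: float, relative_error: float) -> float:
    """Integration time in minutes at a steady count rate."""
    return counts_needed(relative_error) / rate_cpm


print(counts_needed(0.10))         # counts required for +/-10%
print(minutes_needed(2.0, 0.10))   # hypothetical detector counting at 2 CPM
```

Under these assumptions a ±10% measurement needs about 100 counts, so a detector registering only a few counts per minute needs an integration approaching an hour.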
We see in Table 2 that the first three ground-level dose rates for each system report quite different values for the background rate: Mazur (Average: 0.13 ± 0.01 µSv/hr), Radioactivity Counter (Average: 0.5 ± 0.5 µSv/hr), and Smart Geiger (Average: 0.05 µSv/hr). The Mazur sensor is able to easily detect the background rate, but the two smartphone systems yield conflicting values and low detection significance. At 26,000 feet and above, all three systems are easily able to detect the increased ambient radiation at aviation altitudes. However, the smartphone systems disagree about the exact level at 26,000 feet, where the dose rate is near 1.0 µSv/hr. They are in greater concordance at 30,000 feet, where the level is only slightly higher at 2.4 µSv/hr as indicated by the Mazur sensor. It appears that there is a detection threshold for the two different smartphone sensor systems at about 0.5 µSv/hr, with higher dose rates being more consistently and accurately detected. Consequently, these systems respond to naturally occurring background conditions only above an altitude of 26,000 feet. It is possible, however, that repeated measures by these systems over much longer time periods of hours to days may obtain better dose detection through data-averaging, so long as the radiation process behaves in a Gaussian manner such that the variance (σ²) of the measurement decreases inversely with the number of samples combined.
Additional detailed measurements were made with the iPhone and Samsung platforms using the common Radioactivity Counter app during a flight from San Francisco (SFO) to Chicago (ORD) at a cruising altitude of 31,000 feet, as well as ground measurements before and after the flight. The results are presented in Table 3.
As expected, the professional-grade Mazur system yields the most consistent measurements at each altitude and across repeated measurements. The Samsung platform does the least well, with significantly different readings at each altitude and in comparison to the Mazur system. Only the iPhone platform yields consistent measures at ground level, and relatively consistent measures at flight altitude compared to the calibrated Mazur system.
Another issue worth exploring is the stability of these measurements with smartphone battery charge level and temperature. Long-duration radiation measurements of an hour, made to improve signal-to-noise in the radiation estimates, place some stress upon a smartphone. To explore this effect, ground-based measurements for the iPhone and Samsung platforms were compared for variations in the battery charge and ambient temperature. The smartphones displayed no changes related to battery charging over a range from 100% to 20% during the course of a 2-hour measurement session. However, as shown in Figure 1, the Samsung phone did show consistent changes in the recorded radiation levels that varied slightly with battery temperature at rates between -0.7 and -1.5 µSv/hr per °C. No similar variations were seen for the iPhone.
Although at the time that this research was conducted only one iPhone 6s and one Samsung Note 5 were available, additional copies of the Samsung Note 5 and the newer Samsung Galaxy 8s were also made available for testing while this paper was undergoing review. However, no flights were available on which to measure ambient backgrounds at these flight altitudes, which are known to have a strength of about 2 µSv/hr, some 10 times higher than at ground level. To check whether multiple copies of the same smartphone model gave consistent results at ground level, the available eight Samsung phones were independently operated for 20 minutes using the Radioactivity Counter app after it was internally calibrated as per the developer's recommendation. The results are shown in Table 4.
In terms of the grand averages between the two phone models, the four Galaxy 8s measurements yield C(front) = 10.2 ± 3.4 CPM and C(back) = 37.9 ± 14.4 CPM, while the four Note 5s yield C(front) = 8.1 ± 3.2 CPM and C(back) = 7.8 ± 7.2 CPM. There is no statistical difference between the front cameras of the Galaxy 8s and Note 5. However, the back cameras do show a very slight difference between these two models at about the 2-sigma level. The variation between copies for the Galaxy 8s is σ = ±15 CPM versus the Note 5 with σ = ±7 CPM after 20 minutes. According to Table 1, the cameras have nearly identical areas, so this cannot be a factor in differences between the Samsung CPM values. Because these measurements are made at the lowest radiation levels with virtually no signal, it is not clear that the counts correspond to external influences; they may be related to the internal "dark noise" of the arrays themselves.
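The significance quoted for the front and back cameras can be checked with a short calculation. The sketch assumes that the quoted ± values are independent 1-sigma uncertainties that combine in quadrature; whether this is exactly the procedure used in the analysis is an assumption, and the function name is illustrative.

```python
import math


def sigma_separation(a: float, sigma_a: float, b: float, sigma_b: float) -> float:
    """Difference between two means in units of their combined 1-sigma error."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)


# Grand averages in CPM quoted above: Galaxy 8s first, then Note 5.
front = sigma_separation(10.2, 3.4, 8.1, 3.2)
back = sigma_separation(37.9, 14.4, 7.8, 7.2)
print(f"front cameras: {front:.2f} sigma")
print(f"back cameras:  {back:.2f} sigma")
```

With these inputs the front cameras differ by well under 1 sigma and the back cameras by a little under 2 sigma, consistent with the statements in the text.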

Calibration
Calibration of the smartphones can be performed by the following mathematical analysis (RSSC 2011), employing the properties of gamma-rays, because they follow the inverse square law. For example, if a dose rate of 100 µSv/hr is detected at 1 meter from a point source of gamma-rays, the dose rate at 10 meters will be (100 µSv/hr)/10² = 1 µSv/hr.
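The inverse square scaling in this example can be written as a one-line function. The function and parameter names are illustrative, and the sketch assumes an unshielded point source with no absorption along the path.

```python
def dose_rate_at(distance: float, ref_dose_rate: float, ref_distance: float = 1.0) -> float:
    """Scale a dose rate measured at ref_distance to another distance.

    Assumes an unshielded point source, so the rate falls as 1/distance**2.
    Both distances must be in the same unit.
    """
    return ref_dose_rate * (ref_distance / distance) ** 2


# 100 uSv/hr at 1 meter, evaluated at 10 meters.
print(f"{dose_rate_at(10.0, 100.0):.2f} uSv/hr")
```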
A method of calculating radiation dosage involves considering the energy of the gamma-rays and the gamma-ray yield of the source through Equation 3, D = 6CEf/d², where D is the dose rate in milliRad/hr, C is the activity in milliCuries, E is the energy of the radiation in MeV, f is the fraction of disintegrations that produce the specific radiation, and d is the distance to the source in feet (RSSC 2011). For the 1 µCurie Cs-137 sample described below at a distance of 1 foot, D = 0.0036 milliRads/hr, which for gamma-rays with Q = 1 and the unit conversion 1 Rad = 0.01 Sievert equals 3.6 × 10⁻³ milliRads/hr × 0.01 (Sieverts/Rad) = 3.6 × 10⁻⁵ milliSieverts/hr = 0.036 µSv/hr. The above dose rates are only upper limits, because they do not include the interaction of the radiation with the material surrounding the sensor, or the sensor itself. To account for the reduction from shielding and interaction with the sensor requires complex calculations beyond the scope of this paper but which are part of the calibration process by the developers for each sensor system.
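The dose-rate estimate can be sketched numerically under some stated assumptions. Equation 3 is taken here to be the widely quoted point-source rule of thumb D = 6CEf/d², and the Cs-137 gamma-ray energy (0.662 MeV) and yield (f of roughly 0.9) are assumed values chosen because they reproduce the quoted 0.0036 milliRads/hr; neither the coefficient nor these inputs appear explicitly in the surviving text. Function names are illustrative.

```python
def gamma_dose_rate_mrad_per_hr(activity_mci: float, energy_mev: float,
                                fraction: float, distance_ft: float) -> float:
    """Rule-of-thumb point-source dose rate, D = 6 C E f / d**2 (assumed form)."""
    return 6.0 * activity_mci * energy_mev * fraction / distance_ft ** 2


def mrad_to_usv(dose_mrad: float, q: float = 1.0) -> float:
    """Convert milliRad to microSievert using 1 Rad = 0.01 Sv for Q = 1."""
    # 1 mRad = 1e-3 Rad = 1e-5 Sv = 10 uSv (times the quality factor Q).
    return dose_mrad * 10.0 * q


# Assumed inputs: 1 uCi (0.001 mCi) of Cs-137, E = 0.662 MeV, f = 0.9, d = 1 ft.
d_mrad = gamma_dose_rate_mrad_per_hr(0.001, 0.662, 0.9, 1.0)
print(f"{d_mrad:.4f} mRad/hr")
print(f"{mrad_to_usv(d_mrad):.3f} uSv/hr")
```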
Although radiation samples in the millicurie range would be preferable to provide a strong signal for testing the sensors, such samples are not only prohibitively expensive, but must be handled with proper techniques to avoid a safety hazard. A 1 µCurie disk sample of Cs-137 was obtained from United Nuclear Scientific Instruments and Supplies (http://unitednuclear.com/) for US $89.00. These samples are still capable of registering on most radiation detectors, but are entirely harmless and may be shipped from the supplier using conventional mail services. The predicted dose rate based upon the above calculation at 1 foot is 0.036 µSv/hr, which is undetectable by the smartphones, so the disk sample was taped directly to the back lens aperture on each smartphone. According to the Mazur dosimeter, this generated a dose rate near 3 µSv/hr, not unlike what might be expected at flight altitudes. In Table 5 the measured counts provided by the Radioactivity Counter app are displayed for each phone along with the calibrated CPM and dose rate provided by the Mazur detector. The measurements were all made under similar smartphone room-temperature conditions. The disk sample was taped to the back of the Mazur detector over the sensing region as described for sample testing in the instructions for the unit. This resulted in a calibrated level of 1085 ± 20 CPM and a dose rate of 3.2 ± 0.05 µSv/hr. The variations in CPM between the four devices are undoubtedly an artifact of the differing effective interaction volumes between the devices and the amount of case shielding, so a direct comparison of the mean CPMs to gauge device sensitivity is not warranted. However, within the two Samsung models, the Note 5 has a significantly smaller measurement uncertainty (σ = ±9 CPM) than the Galaxy 8s, suggesting that the Note 5 offers much better phone-to-phone consistency-of-measure for the back camera as a radiation sensor. Conversion of these CPM values to actual dose rates is problematic.
The difficulty is that the extensive list of smartphone calibrations provided by Klein (2018) does not include the Samsung Note 5 and Galaxy 8s models, so one has to essentially guess which of the platforms can be used to give the CPM-to-Dose rate conversion. Typical values obtained in this way using other Samsung models as calibrations yield dose rates close to a few µSv/hr, which are similar to the values determined by the Mazur device.
To examine how the two apps performed on the same platform, the Smart Geiger and Radioactivity Counter apps were operated on one phone selected from among each of the three model classes. The results for a 3 minute recording period are shown in Table 6. The Smart Geiger sensor was placed directly in contact with the sample disk. The Cs-137 sample was also taped directly to the covered back camera aperture for the Radioactivity Counter measurements.
The Smart Geiger app also calculates the dosage rate using an undocumented internal calibration algorithm, which yields dose rate measures of (iPhone) 0.62 µSv/hr, (Note 5) 3.4 µSv/hr, and (Galaxy 8s) 1.5 µSv/hr. Compared with the Mazur value of 3.2 µSv/hr, the Note 5 rate agrees to within about 10%, the Galaxy 8s rate is about half as large, and the iPhone rate is about one-fifth as large. However, unlike the extensive, publicly available information on the Radioactivity Counter app, there is no literature on how Smart Geiger functions or how it performs its calibration on each smartphone platform. Nevertheless, it does provide (accidentally or otherwise) dose rates comparable to values obtained from professional-grade dosimeters such as Mazur.

A crowdsourcing project
Unlike sources of sound and light, few naturally occurring sources of radiation would be easily detected by smartphone systems. That means there is an opportunity for you to measure and report on any unusual sources that you may find in your environment or that you encounter as artificial sources. To be detected by your smartphone, the radiation levels have to exceed about 1.0 µSv/hr. Some geographical locations naturally have elevated background radiation at or exceeding this level. For instance, although the US average background level is 0.3 µSv/hour, on the thorium-rich sands on the Kerala Coast of India, 1.0 µSv/hr is common, and in Guarapari, Brazil, levels as high as 10-20 µSv/hr have been found in selected beach areas (Fuginami, Noya, and Morishima 1999). You also may want to do some urban "prospecting" by checking out kitchen granite counter tops or your local museum's rock collections. A number of companies also sell small, safe, and inexpensive radiation sources that have been calibrated so that students can experiment with dose/distance/shielding and other relationships.

Magnetism
Some attempts have been made to evaluate smartphone magnetic sensors for biometric applications (Mourcou et al. 2015), augmented reality (Blum, Greencorn, and Cooperstock 2015), and even geomagnetic storm detection (Senior 2014). By far the most commonly tested feature of smartphone magnetometers is the "compass" application. There are many anecdotal discussions about the "notorious" inaccuracy of compass bearings. For example, King (2013) reports errors of up to 9°. A systematic study of a variety of iPhone platforms was conducted by Galbraith and Rodriguez (2013), who found that errors from 9° to 18° were not uncommon even following a recommended calibration procedure.
Smartphones come equipped with a magnetometer so that your phone can sense its orientation in space, and they use basic apps like the compass app to determine your heading with respect to Magnetic North (or South!). This is done through an internal chip that contains a 3-axis magnetometer, which consists of three separate modules internally aligned along the x, y, and z axes of the smartphone. Each of these three sensors measures the intensity of Earth's magnetic field (or a local source of magnetism like a bar magnet) along only one axis using a Hall Effect sensor. These sensors (see Electronics Tutorials 2018) create a changing output voltage as the magnetic field that passes through them changes in strength. Various magneto-resistive materials are used to confine the response of each sensor to only one dimension of the applied field. It is generally recommended that you stay far away from sources of really strong magnetic fields like bar magnets or medical scanners because these can cause the magnetometer to overload, and it can take up to 30 minutes for it to re-acquire Earth's much weaker magnetic field and resume the orientation calculations used in other apps and features.
An example of such a magnetometer sensor is used in the Moto G smartphone and is provided by the AK8963, 3-axis Electronic Compass IC (AsahiKasei 2013). This is a silicon monolithic Hall-effect magnetic sensor with magnetic concentrator, which creates a 3-axis magnetometer on a silicon chip. The AK8963 chip's measurement range is from -4912 µT to +4912 µT (-49 Gauss to +49 Gauss). The voltage generated is in analog form, so an on-chip analog-to-digital converter converts the output to either a 14-bit or a 16-bit data word. Output data resolution is 150 nT/bit for the 16-bit model. The theoretical minimum ADC "noise" is defined as $\sigma^2 = (1\ \mathrm{LSB})^2/12$, so that for 1 LSB = 150 nT, the digitization noise limit is ±43 nT.
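The resolution and digitization-noise figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the full ±4912 µT range maps evenly onto 2^16 output codes (a simplification made here, not a statement from the datasheet); the function names are illustrative only.

```python
import math


def adc_resolution_nT(full_scale_uT: float, bits: int) -> float:
    """Size of one least-significant bit (LSB) in nT for a symmetric
    +/- full-scale range, assuming the range maps evenly onto 2**bits codes."""
    span_nT = 2.0 * full_scale_uT * 1000.0  # total span, converted from uT to nT
    return span_nT / (2 ** bits)


def quantization_noise_nT(lsb_nT: float) -> float:
    """RMS quantization noise of an ideal ADC: sigma = LSB / sqrt(12)."""
    return lsb_nT / math.sqrt(12.0)


lsb = adc_resolution_nT(4912.0, 16)      # close to the quoted 150 nT/bit
sigma = quantization_noise_nT(150.0)     # close to the quoted +/-43 nT
print(f"1 LSB ~ {lsb:.1f} nT, quantization noise ~ {sigma:.1f} nT")
```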
Two apps were examined that were available on both the Android and iOS platforms. Both allow for the exporting of data in .csv files. Teslameter 11th, developed by SkyPaw Co. LTD, conveniently and clearly displays the Bx, By, and Bz magnetometer digital values. Tesla Field Recorder by Exelerus displays only the total field magnitude |B| and plots in crude form the separate Bx, By, Bz components, though the scale values are impossible to read. The .csv files are easy to email to a laptop and open in Microsoft Excel for further processing; however, the text files have to be converted into column format using the Excel "text to columns" function.
For both platforms, the displayed X, Y, Z axes are based upon the orientation of the chip in the smartphone and align with the so-called Body Frame coordinate system. Generally, when a smartphone is held in normal position with the back/front cameras located at the top, the Body Frame system has +Z-axis perpendicular to the front face of the smartphone and increasing upwards. The X-axis is along the short length and increases to the right, and the Y-axis along the long length of the rectangular smartphone case increases from bottom to top.

Measurement issues and accuracy
Smartphone magnetometer systems were tested extensively in an earlier study. In that study, the Tesla Field Recorder app was used to make repeated Bx, By, Bz, and |B| measurements over a period of ten minutes to assess the rms noise and noise spectrum properties of a "typical" iPhone magnetometer. Under most measurement conditions, one expects that as more measurements are made and combined in a running average, the rms measurement error will decline as 1/√N if the noise is a Gaussian random process. This behavior was not found in the earlier study. This implies that the magnetic sensors are white-noise dominated at a level of ±200 nT, comparable to the ADC noise. This, unfortunately, means that even lower values for σ cannot be reached simply by continuing to combine more samples.
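For comparison, the ideal behavior expected of purely Gaussian noise can be simulated. The sketch below is illustrative only: it uses synthetic random numbers rather than smartphone data, and the function names, trial count, and seed are assumptions made for the example. It shows the error of an N-sample average falling as σ/√N, which is the trend that the measured data did not follow.

```python
import math
import random
import statistics


def standard_error(sigma: float, n: int) -> float:
    """Expected rms error of the mean of n independent Gaussian samples."""
    return sigma / math.sqrt(n)


def simulated_error_of_mean(sigma: float, n: int,
                            trials: int = 1000, seed: int = 1) -> float:
    """Empirical scatter of n-sample averages of synthetic Gaussian noise."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


# Synthetic noise with sigma = 200 nT, averaged over increasing sample counts.
for n in (1, 10, 100):
    print(n, round(standard_error(200.0, n), 1),
          round(simulated_error_of_mean(200.0, n), 1))
```

If real magnetometer data followed the same trend, the last two columns would agree for every N; a measured error that stops shrinking indicates a noise floor that averaging cannot remove.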
If this noise limit were the only problem with smartphone-derived magnetometer data, there might still be some interesting applications for this sensor technology, such as detecting geomagnetic storms, which produce changes at the 1000 nT level. In principle, these changes can be detected at about 5 times the ADC noise level (i.e., 5σ), which is statistically significant. Unfortunately, this level of detectability also may not be attainable due to non-randomness in the magnetometer data.
In the earlier study, the Tesla Field Recorder was allowed to take data uninterrupted for four hours, revealing two significant kinds of artifacts: sudden "glitches" and longer-term DC shifts, shown in Figure 2, that can last for hours at a time. Glitches typically have small amplitudes of order <A> = ±1 µT and are sufficiently few in number that they will rapidly "average out." The DC shifts are far more troubling, with amplitudes of 2 to 5 µT, and are a significant contributor to elevating the data noise on the Samsung platform well above the nominal ADC level of ±200 nT. As discussed in that study, similar jumps and glitches are known to be a problem with some ADCs in which the input analog data are not properly filtered (Engineer Zone 2012, 2018; Maxim Integrated 2017). To check whether these glitches differ between multiple copies of the same phone model, four Samsung Note 5 and four Samsung Galaxy 8s phones were monitored under identical conditions. The Teslameter 11th app was set up so that it recorded for four hours, with low data compression and high sensitivity. The results for all eight phones are shown in Figure 3.
Although the newer Samsung Galaxy 8s phones show a relatively smooth trace that is free of significant artifacts, the older Samsung Note 5 phones each show the periodic glitches reported in the earlier study and shown in Figure 3. Relative to the start of the data logging for each phone, the glitches occur every 30 minutes (340-350 samples). The amplitudes of these glitches are about 1.5 to 2.0 µT in Bz, and their strengths do seem to be phone-dependent. However, these glitches are not seen in the Bx or By magnetic component data taken at the same time. This strongly suggests that these periodic glitches are a feature of the Bz device electronics and are not a product of the environment outside the smartphone. If enough data points are averaged together, these glitches will have no significant effect on the resulting averages or noise values. However, for the Samsung Note 5 phones, and under the measuring conditions employed for Teslameter 11th, it is best to wait 30 minutes (400 samples) before using the measurements to allow the values to stabilize. A similar aspect of the data was identified for iPhones in the earlier study.
For the iPhone 6s in Figure 2 and the Samsung Note 5 platform in Figure 3, the magnetometer readout following power-up does not instantaneously record the value of each field component but requires a prolonged period before the values stabilize. For the Bx and By components, this happens comparatively quickly, within 30 samples (15 minutes), but for the Bz component, this relaxation process takes up to one hour before the reported value is within ±200 nT of its final mean value. This long time constant in the component measurement process is problematic because a "quick look" measurement after a few samples can differ by as much as 5 to 10 µT from the value obtained after the process reaches its final, stable value. This asymptotic behavior of the iPhone 6s and Samsung Note 5 is repeatable and is a feature of all measurements made with these two platforms. The Samsung Galaxy 8s phone, in rather stark contrast, shows no such effect and reaches its asymptotic, stable value within a few minutes.
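One way to decide how many initial samples to discard is to find the point after which a logged component stays within the ±200 nT band around its final mean. The sketch below is one possible approach, not the procedure used in this study; the function name, the choice of the last quarter of the record as the "final" value, and the sample values in the usage example are all assumptions.

```python
def settling_index(samples, tolerance=0.2, tail_fraction=0.25):
    """Return (index, final_mean), where index is the first sample after which
    every later value stays within `tolerance` (same units as the data, e.g. uT)
    of the final mean. The final mean is estimated from the last `tail_fraction`
    of the record, which is assumed to be fully settled."""
    n = len(samples)
    if n == 0:
        raise ValueError("no samples supplied")
    tail = samples[int(n * (1 - tail_fraction)):]
    final_mean = sum(tail) / len(tail)
    last_bad = -1  # index of the last sample outside the tolerance band
    for i, value in enumerate(samples):
        if abs(value - final_mean) > tolerance:
            last_bad = i
    return last_bad + 1, final_mean


# Hypothetical Bz log (uT) relaxing toward a stable value.
bz_log = [-40.0, -44.0, -47.0, -48.5, -48.7, -48.7, -48.8, -48.7]
start, mean_bz = settling_index(bz_log)
print(start, round(mean_bz, 2))
```

Samples before the returned index would be dropped before averaging, which mirrors the recommendation to wait for the readout to stabilize.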
The earlier study also revealed that iPhone and Samsung platforms have different data sensitivities to changes in external temperature. Both the iPhone 6s and Samsung Note 5 phones were placed outdoors in a shaded location with no metallic objects nearby. Each phone was operated for one hour to log the magnetometer data. Over the temperature range from 48°F to 73°F, the Samsung phone remained relatively stable with Bz = -48.7 ± 0.3 µT. The iPhone, however, produced significant measurement variations with temperature change, from Bz = -50 µT at 47°F to -41.5 µT at 75°F. For normal, indoor operating temperatures near 68°F, the Samsung's much narrower range (±200 nT) and faster response time are superior to the larger range (±500 nT) and slower response time of the iPhone platform.

Variations among copies of the same model
The eight Samsung phones were individually placed on the same table location with the same orientation, and Teslameter 11th was used to measure the magnetic field components. In Figure 4, representative samples of these data for each phone show that the recorded field values can be quite discordant even when made under identical conditions. For example, Samsung Note 5 (G) yields Bz = -26.7 µT at the same time that another copy (F) reads Bz = -47.7 µT, a difference of 21 µT in measuring the same field component. The Samsung Galaxy 8s phones perform considerably better than this sample of Note 5s, with only a difference of 1.3 µT between phones B, C, and D with an outlier Phone A measuring 4.0 µT lower compared to the other three copies.

Measuring magnetic orientation
A basic test of smartphone magnetometer systems is in the "compass" application, which is important for many apps and for determining the orientation of the smartphone in space. The earlier study tested this function in a local park, 1 km from the nearest power lines or other obvious aboveground metallic objects (e.g., cars, bikes), with the iPhone 6s placed on a non-metallic table-top. The SeeLevel app was used in its "level" mode to check the 2-dimensional level of the surface to better than ±0.1° relative to the local horizontal Earth plane. A carpenter's square was used to draw a 90° angle on a piece of paper in each of four "cardinal" directions. The paper was then used to provide the 90° reference. The grid was oriented to magnetic north using an analog compass. The magnetic field components were measured with the Sensor Kinetics app, and three additional quantities were calculated: the horizontal magnetic component $|B_h| = (B_x^2 + B_y^2)^{1/2}$, the total field magnitude $|B| = (B_h^2 + B_z^2)^{1/2}$, and the horizontal orientation angle $\theta_{xy} = \tan^{-1}(B_y/B_x)$. The measured magnetic bearings in column 8 were obtained from the GPSCompass app by Tigran Mkhitaryan (iOS). The difference between the true bearing and the measured bearing also was calculated. The finding was that magnetometer-based angle measures can be discordant by several degrees or more in a single measurement, despite the precision of measuring the field components to ±1.0 µT or better.
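The three derived quantities can be computed directly from the logged components. The following is a minimal sketch, taking the orientation angle as the inverse tangent of By/Bx; `atan2` is used in place of a plain arctangent so that the quadrant is preserved, and `field_summary` is a hypothetical helper name. Converting the angle into a compass bearing depends on the axis conventions of the particular app and is not attempted here.

```python
import math


def field_summary(bx: float, by: float, bz: float):
    """Return (|Bh|, |B|, theta_xy in degrees) from body-frame components."""
    bh = math.hypot(bx, by)                    # horizontal component magnitude
    b_total = math.hypot(bh, bz)               # total field magnitude
    theta = math.degrees(math.atan2(by, bx))   # angle in the x-y plane
    return bh, b_total, theta


# Simple check values (not measured data): a 3-4-5 and 5-12-13 triangle.
bh, b_total, theta = field_summary(3.0, 4.0, 12.0)
print(bh, b_total, round(theta, 2))
```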
This study was repeated for the eight Samsung phones to explore model-to-model (Note 5 and Galaxy 8s) and copy-to-copy (four of each model) variations, resulting in Table 7. The table entries are the computed averages and standard deviations of the ensemble of four Galaxy 8s and Note 5s. In this instance, the Compass app by MelonSoft was used to provide magnetic bearing measurements in column 8 and the Kinetic Systems app to measure the magnetic components in columns 2, 3, and 4.
For the corresponding measurements between the two Samsung models in Table 7, the differences are comparable to or smaller than the intra-model differences measured between the four smartphone copies. For example, for the 270° West measurement, the difference between the Compass app and the analog magnetic compass readings of -5° for the Note 5 and +8° for the Galaxy 8s are both consistent with a measurement of 0° given the intra-model standard deviations of ±3.8° for the Note 5 and ±4.7° for the Galaxy 8s. Also, the magnetic bearing measured by the Compass app in column 8 is indeed consistent with the independently computed bearing in column 7 based on the tangent of the actual magnetic component measured in columns 2 and 3, as a check on the methodology used by the app developer. Regardless of the magnetic bearing, the total field magnitude in column 6 should be identical. The measured average value of these entries is 44.3 ± 3.8 µT for the Note 5 and 49.4 ± 1.8 µT for the Galaxy 8s, suggesting that the Galaxy 8s offers a slightly better measure (lower σ) for |B| than the Note 5. This is also reflected in the measures for Bx, By, and Bz, for which the Galaxy 8s provides a consistently more positive estimate than the Note 5 and also a slightly lower σ (±1.8 µT versus ±2.5 µT).
The bottom line from this comparison of multiple copies of the same smartphone is that the Samsung Galaxy 8s may be slightly better at measuring the magnetic components of the geomagnetic field than the Samsung Note 5 in terms of accuracy (σ). In terms of compass applications, however, this difference leads to similar estimates for the magnetic bearings, which can be discordant from the true magnetic bearing registered on an analog compass by up to several degrees for single measurements, but nevertheless should average to zero error with multiple repeated measurements. The reason for this can be seen in the variability of the magnetic field components Bx and By, by as much as σ = ±2.5 µT, which directly limits the accuracy of a single-point measure of the magnetic bearing from $\theta_{xy} = \tan^{-1}(B_y/B_x)$.
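The link between component scatter and bearing error can be made explicit with first-order error propagation. If Bx and By carry independent errors of equal size σ, the uncertainty in the angle formed from their ratio is approximately σ/|Bh| radians. The sketch below evaluates this for σ = 2.5 µT; the horizontal field strength of 20 µT is an illustrative assumption, not a value taken from the tables, and the function name is hypothetical.

```python
import math


def bearing_sigma_deg(sigma_uT: float, bh_uT: float) -> float:
    """First-order 1-sigma bearing uncertainty in degrees, assuming
    independent, equal errors sigma_uT in Bx and By and a horizontal
    field magnitude bh_uT (sigma_theta ~ sigma / |Bh| in radians)."""
    return math.degrees(sigma_uT / bh_uT)


single = bearing_sigma_deg(2.5, 20.0)   # one measurement
averaged = single / math.sqrt(10)       # ideal average of 10 independent measurements
print(round(single, 1), round(averaged, 1))
```

Under these assumptions a single reading is uncertain by several degrees, consistent with the discordance described above, while averaging repeated independent readings reduces the error.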

Absolute magnetometry
Unlike other physical characteristics such as temperature, acceleration, or light intensity, magnetism is far more complex and difficult to quantify in absolute terms. Typical vector magnetometers that are calibrated to ±1 nT in absolute terms are prohibitively expensive, and require very careful environmental filtering and background correction to provide measurements at their highest accuracy. The Fredericksburg Magnetic Observatory (FRD 2018) station provides a professional-grade absolute reference for the expected values for |B| and the geomagnetic components (bx, by, bz) that correspond to the smartphone magnetometer coordinates (By, Bx, -Bz) when oriented in the same manner. So long as the smartphone is placed on a level, non-magnetic
Initially, only one set of smartphones was available: a single iPhone 6s and a Samsung Note 5. These were placed outdoors on a shaded, wooden table located 50 meters from the FRD magnetometer shed. The table was leveled with a smartphone app (SeeLevel) 2-D bubble level to within ±0.1° of horizontal in each direction. At the time of the measurements, February 27, 2018 at 2:00 pm EDT, the daytime temperature was 60°F. The Teslameter 11th app was used to digitally display the smartphone (Bx, By, Bz) components.
To check how well various magnetometer apps performed on the same platforms, we compared the apps used in the previous calibration studies, Tesla Field Recorder (iPhone and Samsung) and Teslameter 11th (iPhone and Samsung), against other popular magnetometer apps. For the iPhone 6s, these were Sensor Kinetics, Magnitude, Teslameter, and Magnetscape; for the Samsung Note 5, they were Sensor Kinetics, MagLog, Rici, and Advance. The smartphone values are shown in Table 8. All values are in µT. Table 8 shows that neither platform measures the actual value of the geomagnetic field; however, the iPhone measurements are significantly closer to the FRD values. The averages across the apps for both phones have a σ = ±150 nT (iPhone) and σ = ±250 nT (Samsung). Given the ±200 nT digitization error, these offsets from the FRD weak field measurements for the Samsung platform are statistically significant, but for most applications are not likely to be important so long as this represents a simple zero-point shift and not a non-linear rescaling of the magnetometer scales. The predicted magnetic field values also were determined based upon the International Geomagnetic Reference Field (IGRF) model (NOAA 2018) for the location of FRD (38.20°N, 77.37°W) and the date of the measurements on February 27, 2018, which is also presented in the table. The IGRF model differs from the actual FRD measurements by less than 600 nT, so for this geographic region it is unnecessary to travel to FRD to make on-the-spot calibration measurements. It is more convenient to make comparison measurements at a nearby location and simply use the IGRF values as the absolute reference, given the measurement accuracy of smartphone magnetic sensors.
To compare how multiple copies of the same platform perform in making accurate absolute magnetic field measurements, the geomagnetic field was measured in Kensington (39.0°N, 77.1°W) using, in this instance, four copies of the Samsung Note 5 and four copies of the Samsung Galaxy 8s. The result of this comparison is shown in Table 9, where all values are in µT. In each instance, 100 measurements for each copy were averaged. The average dispersion of the measurements within each copy was σ = ±200 nT.
The systematic difference between the predicted IGRF value at FRD and the actual value shown in Table 8 is (51.0 µT - 50.8 µT) = +200 nT, so one might expect that this same offset exists for the nearby area of Kensington (100 km away), in which case the predicted actual field intensity should be |B| = 51.3 µT. Table 9 shows that there are significant, systematic offsets between the actual and measured field values for multiple copies within each platform, such that if the values for the phones of like models were simply averaged together, the dispersions of the values within each model would be of the order of 3,000 to 10,000 nT, which is substantially worse than the ADC noise of ±200 nT. Moreover, the Samsung Note 5 significantly underperforms in providing a mean value similar to the IGRF prediction. The Samsung Galaxy 8s provides an average measurement across copies that is within about 1,000 nT of the estimated actual field values.
So, in terms of absolute magnetometry of the geomagnetic field, the strategy might be to eliminate all platforms that return values that are substantially discrepant from the IGRF values, using a median-filtering technique to eliminate outliers, and then to average the remaining measurements. Clearly, some platforms (e.g., Samsung Galaxy 8s) appear to perform better than others (e.g., Samsung Note 5), so the investigator would be encouraged to standardize the measurements on specific platforms that lead to fewer outliers among their multiple copies.
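The outlier-rejection strategy can be sketched in a few lines. This is one simple reading of "median filtering", assumed here for illustration: readings farther than a fixed distance from the group median are dropped and the remainder averaged. The 3 µT cut, the function name, and the readings in the usage example are hypothetical and would need to be chosen for each project.

```python
import statistics


def robust_mean(readings_uT, max_dev_uT=3.0):
    """Drop readings farther than max_dev_uT from the median, then average.
    Returns (mean_of_kept, kept_readings)."""
    median = statistics.median(readings_uT)
    kept = [r for r in readings_uT if abs(r - median) <= max_dev_uT]
    if not kept:
        raise ValueError("all readings rejected; threshold is too tight")
    return statistics.fmean(kept), kept


# Hypothetical |B| readings (uT) from five phones; one is far from the rest.
mean_b, kept = robust_mean([49.1, 50.2, 51.0, 30.4, 50.6])
print(round(mean_b, 3), kept)
```

A fixed cut is used for simplicity; a threshold scaled to the spread of the data (for example, a multiple of the median absolute deviation) would be a reasonable alternative.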

A crowdsourcing project
What does your invisible magnetic environment look like? Your smartphone magnetometer can be used to discover hidden metallic objects in your walls or buried below ground, or to measure the intensity of any strong electromagnetic fields produced by electric motors. Smartphone magnetometers can be subject to many environmental forms of noise at levels of a few µT, so when making your measurements, the smartphone needs to be in the same orientation to within a degree or less in each of the three axis directions. You also need to make your measurements with and without the subject of your measurement present and then subtract the "background" geomagnetic field measurements along each corresponding axis, which can amount to several tens of µT depending on orientation.
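The background subtraction described above is a per-axis difference between two readings taken in the same orientation. A minimal sketch follows; the function names and the example readings are hypothetical.

```python
import math


def subtract_background(with_source, background):
    """Per-axis difference (Bx, By, Bz) between a reading taken with the
    object present and a background reading taken in the same orientation."""
    return tuple(s - b for s, b in zip(with_source, background))


def magnitude(vector):
    """Length of a 3-component field vector."""
    return math.sqrt(sum(c * c for c in vector))


# Hypothetical readings in uT: object present, then object removed.
residual = subtract_background((25.0, -10.0, -40.0), (20.0, -12.0, -45.0))
print(residual, round(magnitude(residual), 2))
```

Because the geomagnetic background amounts to tens of µT, a small change in phone orientation between the two readings can produce a spurious residual of the same size as the signal, which is why the orientation must be held fixed.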
In addition, there is a Citizen Science project CrowdMag (https://www.ngdc.noaa.gov/geomag/crowdmag.shtml), developed by NOAA scientist Dr. Manoj Nair, which is attempting to evaluate the accuracy of the IGRF by analyzing tens of millions of smartphone measurements around the world contributed by more than 15,000 participants. The downloadable app keeps track of the geographic coordinates of the measurement and also the model of the smartphone being used.

Conclusions
Based on the calibrated results from a variety of apps and platforms, the direct use of smartphone sensor systems for conducting citizen science experiments is warranted, especially if modest adjustments to the recorded data are made using a small set of calibration curves. Table 10 summarizes the various inter-comparisons and the resulting measurement accuracies. The copy variability is shown in parentheses. Also shown are the apps used and the app-to-app (A-to-A) variations across the listed apps on a single smartphone.
The Radioactivity Counter and Smart Geiger apps are both available on iOS and Android platforms, but only Radioactivity Counter and its developers offer a detailed and extensive calibration process on a model-by-model basis. Radiation dosimetry variations near ±0.05 µSv/hr are dominated by noise; however, at altitudes of 26,000 feet or higher, a clear cosmic ray signal can be easily detected above 1.0 µSv/hr at relatively high statistical confidence. The Samsung Note 5 and iPhone 6s phones appear to have a similar response to the applied radiation both for the Cs-137 sample (column 5) and at flight altitudes (column 4); however, the Samsung Galaxy 8s performs significantly worse, with consistently higher σ and average CPMs. Smart Geiger performs in a similar manner (column 7) on all three platforms, as does Radioactivity Counter on the iPhone and Note 5; however, on the Galaxy 8s the recorded CPMs are significantly higher on all four copies when the Radioactivity Counter app is used. For the two Samsung models, the copy-to-copy variation (in parentheses in column 3) was consistently higher than the σ from repeated measurements on a single copy. Aside from measuring radiation backgrounds at flight altitudes and detecting geographic locations with anomalously high radiation levels from radioactive clays and sands (e.g., India and Brazil), there are few real-world applications of this smartphone radiation dosimetry that do not include a direct hazard to the user; hence the developers see this as an application more suited for radiation workers as an alert for hazardous conditions. Magnetic fields can be detected near the random noise limit set by the digitization process at about ±200 nT, but a variety of instrument and unknown influences cause systematic errors as high as ±1500 nT, which may be reduced somewhat by following appropriate measurement protocols.
A comparison between the Samsung and iPhone platforms suggests that the Samsung smartphones are more reliable (lower measurement error) and suffer from fewer measurement artifacts (glitches and sudden DC offsets) than the iPhone tested. However, the absolute calibration employed in this work suggests that between the Samsung Galaxy 8s and Note 5, the Samsung Galaxy 8s registers field values that are slightly closer to the calibrated values based on the IGRF geomagnetic field model. When the same measurements are performed on multiple copies of the same Samsung phone model, the copy-to-copy variation (column 3, in parentheses) is significantly higher than the variation among repeated measurements on the same copy, in line with the results seen in the previously discussed radiation measurements. Also, measuring the same magnetic conditions on three different models running a variety of apps (column 7) leads to a range of measurement outcomes within each model group and across models.
Although some model-to-model and app-to-app measurement variations were identified, these results were better than expected. When proper measurement and calibration protocols are applied, smartphone sensors can indeed generate relatively high-quality data for radiation and magnetism properties that compare well with professional-grade, calibrated systems, but at far lower cost. This may open the door for a new generation of citizen science and crowdsourced applications involving the monitoring of these physical parameters for innovative research. Although this research demonstrates the kinds of accuracies possible with smartphone technology, when contemplating using these sensors in citizen science applications, it is imperative that researchers perform their own calibrations over the ranges, environmental circumstances, and smartphone models relevant to each project.

Supplementary Files
The supplementary files for this article can be found as follows: • Data 1. The raw data and the analysis of these data to create the various figures in this paper are available in three supplemental Excel files. The specific figure data are indicated by the names of the tabs provided with each data file. The data for the radiation analysis is provided in file.