A weighting system to build physical layer measurements maps by crowdsourcing data from smartphones

Received May 1, 2020; Revised May 31, 2020; Accepted June 6, 2020

Mobile devices can sense different types of radio signals, for example broadcast signals. These broadcast signals allow a device to establish a connection to the access point transmitting them. Moreover, mobile devices can record different physical layer measurements, which indicate the service quality at the point where they were collected. These measurements can be aggregated to form physical layer measurement maps. Such maps are useful for several applications, including location fixing, navigation, access control, and evaluating network coverage and performance. Crowdsourcing can be an efficient way to create such maps. However, users in a crowdsourcing application tend to have different devices with different capabilities, which might impact the overall accuracy of the generated maps. In this paper, we propose a method to build physical layer measurements maps by crowdsourcing physical layer measurements and GPS locations from participating mobile devices. The proposed system gives a different weight to each data point provided by the participating devices based on the trustworthiness of the data source. Our tests showed that different models of mobile devices return GPS locations with different accuracies. Consequently, when building the physical layer measurements maps, our algorithm assigns a higher weight to data points coming from devices with higher GPS location accuracy. This makes it possible to accommodate a wide range of mobile devices with different capabilities in crowdsourcing applications. An experiment and a simulation were performed to test the proposed method. The results showed an improvement in crowdsourced map accuracy when the proposed method is implemented.


INTRODUCTION
Today we are witnessing a rapid expansion in the use of mobile devices, especially smartphones. In the past two decades, the number of mobile devices in use has grown rapidly. Smartphones took over the wider consumer market and, according to the Pew Research Center, 77% of US adults used smartphones as of 2018 [1]. Today, most wireless network services are delivered through points where mobile devices can access the network; such points are known as access points. This network architecture is used in most data network technologies, such as Wi-Fi, 5G, and LTE. Several physical layer measurements are utilized in these technologies. For example, the received signal strength (RSS) measurement indicates the received power of the signal transmitted by an access point. The signal to interference and noise ratio (SINR) measurement indicates the ratio of the wanted signal power to the power of the unwanted signal (interference and noise). Other physical layer measurements are utilized by different technologies. Physical layer measurements maps contain the values of the measurements for the signals coming from the different access points operating in an area. These maps are useful for different applications, including:
a. Location and navigation: In this application, they can be used to calculate a device's location; for example, through pattern matching (e.g., RSS patterns at the mobile device are matched with the RSS patterns on the RSS map to determine the location of a device). Furthermore, they are used to aid in navigating indoors; for example, robots navigating through a maze or an underground tunnel.
b. Access control: In this application, they can be used as proof of location. RSSs and SINRs can be used to derive a location proof used for authentication and for proximity detection. Moreover, some noise and interference measurements can be used to indicate the presence of other devices, which is useful in some access control applications.
c. Network performance: In this application, they can be used to monitor network coverage and performance, since the physical layer measurements indicate the access point's footprint at different locations. For example, higher RSS values mean closer proximity, which correlates with better coverage. A higher SINR value indicates good channel conditions, which entails better performance.
Typically, physical layer measurements maps for evaluating network coverage and performance are built by dedicating a number of devices to go around an area of interest and gather the physical layer measurements for the access points in that area. Another method to build such maps is to crowdsource the physical layer measurements from users who are already in the area of interest. This method can be feasible when network coverage changes over time, as in high altitude systems [2]. Crowdsourcing is a process through which a problem is solved or a project is completed by a group of diverse participants. It is a joint process and a problem-solving technique that requires participation from a network of elements. Take, for example, the Defense Advanced Research Projects Agency (DARPA) Network Challenge that was launched in 2009 [3]. In this challenge, teams competed to locate ten red balloons placed around the United States and report the balloons' coordinates to DARPA. The winning team crowdsourced the problem by recruiting participants via social media (a portion of the prize money was used as an incentive). The challenge showed the general effectiveness of using crowdsourcing techniques to solve problems. The DARPA program managers were surprised by how quickly the challenge was completed.
The concept of crowdsourcing data from mobile devices developed with the growth of smartphones and the mobile Internet. Now it is used to solve more complicated problems (e.g., Google Maps' traffic conditions feature).
Crowdsourcing is an effective approach to collecting data, especially given the huge growth in the usage of smartphones and mobile networks. However, one obstacle in crowdsourcing arises when devices of different types and capabilities take part in the same application. Take GPS for example: GPS location accuracy varies between mobile devices. GPS location accuracy has improved in the past few years. Most smartphones today have a GPS accuracy of 10 meters in open areas (e.g., suburban settings), and some of the more recent smartphone models are anticipated to be accurate to within one foot [4]. Meanwhile, older devices tend to provide poorer GPS accuracy. Therefore, in a crowdsourcing application, different devices may report location data with different accuracies. One solution to this obstacle is to have all participants in the crowdsourcing application use the same type or model of device, which might not be feasible in practical applications. Another solution, which is proposed in this paper, is to assign weights to data points based on the trustworthiness of the data point's source. In this paper, we propose a method to build physical layer measurements maps by crowdsourcing measurement data from a group of participating mobile devices operating normally. Physical layer measurements are crowdsourced from mobile devices with a specialized application installed on them. This application automatically captures the physical layer measurements of access points in an area together with the device's GPS location. In the proposed system, the collected physical layer measurements are correlated to a location on a map through the GPS location provided by the device, and each data point provided by a participating device is given a weight based on the trustworthiness of the location provided.
There have been several research projects in this area. Ganti et al. [5] discussed the concept of crowdsourcing measurements from mobile devices, offering several examples of its applications. Several research efforts utilize RSS measurements for localization purposes [6][7][8][9] and for access control [10][11][12]. The authors in [13], for example, propose an indoor positioning system that uses matching algorithms. RSS measurements are also used for navigation [14][15][16] and road traffic information [17]. The RADAR system [6] uses RSS measurements and propagation models to construct radio maps that are used for locating and tracking users inside buildings. The authors in [18] proposed combining a collection of available sources of RF signals crowdsourced from participating smartphones to build signal maps that can be used for localization. The authors use GPS signals, along with other sources such as NFC and QR tags, to determine the location of a device. However, their system does not take into consideration the variety of capabilities of the participating phones. Furthermore, it requires auxiliary infrastructure (e.g., NFC and QR tags). Similarly, the system proposed in [19] constructs radio maps using foot-mounted inertial measurement units (IMUs) and GPS position. However, this system requires extra hardware (i.e., IMUs). Kovalev [20] proposed an indoor positioning system using Wi-Fi and GPS signals. The author's method uses the Naive Bayes classifier to calculate room-level positioning. However, it requires extended user interaction, which takes time and effort from the user and negatively affects user convenience. The authors in [21] propose using crowdsourcing to collect GPS location along with RSS and use it for positioning based on clustering with the k-means algorithm.
However, they do not consider the diversity of the phones' capabilities, even though practical crowdsourcing systems usually include a wide range of users with devices of different capabilities. Supporting a wide range of devices allows more users to participate and consequently increases the amount of data collected. The authors in [22] present five methods for the generation of WLAN maps for indoor positioning using crowdsourced fingerprints. A fingerprint is assumed to contain identifiers that take into consideration the variety of capabilities of the participating phones. However, their system assumes that the GPS position is exact, which is not typically the case in a practical, real-world application. The authors in [20] propose a system for crowdsourcing measurements during large events. However, their system uses Bluetooth beacons, which requires obtaining and installing additional hardware. Moreover, it likewise does not consider the diversity of the phones' capabilities.
Our proposed system does not require any additional infrastructure and takes into consideration the capabilities of the different devices used by assigning a weight to each data point provided by these devices. This is what sets the proposed work apart from existing methods. Assigning weights to data points increases the robustness of the crowdsourcing system because it decreases the impact of less accurate data on the system. Furthermore, it enables more users with different types of devices to participate in the crowdsourcing application. As shown later in this text, device weaknesses can be accounted for by using weighting factors.

RESEARCH METHOD
The proposed system architecture consists of the participating devices (GPS-capable devices loaded with a custom mobile application) and an application server, to which participating devices upload their unique ID, GPS location, and the scanned access points along with their corresponding physical layer measurement values. The application server arranges the received data and stores it in a database. It then pre-processes the data and feeds the results to the map building engine. This engine uses the data received from the participating devices to build the physical layer measurements maps. As previously mentioned, a special application is installed on each participating device. As shown in Figure 1, the mobile application uploads the physical layer measurements from different access points in an area and their corresponding access point unique identifier (UI). A unique identifier must distinctly identify an access point from other access points. For example, in Wi-Fi the MAC address can serve as a unique identifier for access points. This information is uploaded along with the GPS location observed by the smartphone to the application server using a regular network connection.
Figure 1. System overview

In the proposed system, the physical layer measurements map is divided into a grid consisting of square-shaped clusters. A cluster is the smallest unit in the map, and it carries the physical layer measurements information for that location on the map. The map building engine updates clusters based on the data received from participating devices. This data is provided automatically by the participating devices, and each data point includes the GPS location and physical layer measurements data. The proposed system operates as follows.
a. The application installed on the participating smartphones uploads the GPS location along with the UIs of nearby access points (including their corresponding physical layer measurement values). The upload is repeated at a defined period. The user can tune this period based on factors including processing power requirements, mobile data requirements, and battery life.
b. The application server is pre-configured with the following parameters for each device ID:
- Effective radius: The radius within which a data point updates the physical layer measurements map. This value depends on the accuracy of the location source (i.e., GPS). The higher the accuracy, the smaller the effective radius. Therefore, data points originating from more accurate location sources update fewer clusters on the physical layer measurements map than data points originating from less accurate participating devices.
- Weighting factor: This factor depends on the trustworthiness of the location data source. The higher the location accuracy of a device, the larger the weighting factor. In general, data provided by newer smartphones has a greater weighting factor than data provided by older ones.
c. The map building engine uses the device ID to fetch the pre-configured values of the effective radius and weighting factor for a device.
d. The map building engine calculates the distance between the center of each cluster on the physical layer measurements map and the location of a data point. This is done using Vincenty's formula, an iterative method that calculates the distance between two points on the surface of a spheroid. If this distance is less than or equal to the effective radius, the map engine updates the physical layer measurement values for the access points in the cluster by employing the weighting factor as follows:

$$M_m^{new} = \frac{M_m W_m + M_u W_u}{W_m + W_u} \quad (1)$$

where:
$M_m$ is the current physical layer measurement value stored in the map for a specific access point,
$M_u$ is the physical layer measurement value provided by a user for the specific access point,
$W_u$ is the weighting factor of the data point,
$W_m$ is the sum of all weighting factors that updated this cluster in the past.
e. The map engine updates $W_m$ as follows:

$$W_m^{new} = W_m + W_u \quad (2)$$

Employing the weighting factors allows more accurate data to have a larger effect on the physical layer measurements map than less accurate data. This allows a more diverse range of devices to participate in the crowdsourcing application, which is very useful in practical cases where different users use a variety of smartphone models.
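The update procedure in steps b through e can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several assumptions: the device IDs, effective radii, and weighting factors in `DEVICE_PARAMS` are hypothetical, the haversine great-circle distance is used as a simpler stand-in for Vincenty's formula, and the accumulated weight is tracked per access point within each cluster so that an access point first seen later starts from its own value.

```python
import math
from dataclasses import dataclass, field

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

# Hypothetical per-device configuration:
# device ID -> (effective radius in meters, weighting factor W_u).
DEVICE_PARAMS = {
    "newer-phone": (5.0, 3.0),
    "older-phone": (10.0, 1.0),
}


@dataclass
class Cluster:
    lat: float  # latitude of the cluster center, degrees
    lon: float  # longitude of the cluster center, degrees
    values: dict = field(default_factory=dict)   # access point UI -> stored value
    weights: dict = field(default_factory=dict)  # access point UI -> accumulated W_m


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (assumed stand-in for Vincenty)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def apply_data_point(clusters, device_id, lat, lon, measurements):
    """Update every cluster whose center lies within the device's effective radius."""
    radius_m, w_u = DEVICE_PARAMS[device_id]
    for cluster in clusters:
        if haversine_m(cluster.lat, cluster.lon, lat, lon) > radius_m:
            continue  # data point is too far away to affect this cluster
        for ap, m_u in measurements.items():
            w_m = cluster.weights.get(ap, 0.0)
            m_m = cluster.values.get(ap, 0.0)
            # Weighted running average of the stored and reported values.
            cluster.values[ap] = (m_m * w_m + m_u * w_u) / (w_m + w_u)
            # Accumulate the weight that has updated this cluster so far.
            cluster.weights[ap] = w_m + w_u
```

With these hypothetical weights, a report of -60 dBm from the newer phone followed by a report of -80 dBm from the older phone at the same location leaves the cluster at (3 × (-60) + 1 × (-80)) / 4 = -65 dBm, so the more trusted device pulls the stored value toward its own reading.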

RESULTS AND ANALYSIS
In this research, we tested a number of smartphones manufactured between 2010 and 2018. The test was conducted at six different open-to-sky outdoor locations and ten different indoor locations (inside a two-story building). The location service settings in the phones were set to "device only mode", which relies solely on the GPS radio signal received by the GPS chip built into the phone, unlike the high accuracy mode, which uses a combination of GPS, Wi-Fi, Bluetooth, and/or cellular networks. The GPS readings were compared to the ground truths (the test points with known coordinates). The location error was calculated as the average of the distances between the GPS reading and the ground truth for each of the devices. Results, as shown in

Experiment 1 (Wi-Fi access points)
In our experiments, the six Android-based smartphones shown above were used. Location service settings were set to device-only mode so that the location is generated solely by the GPS integrated circuit (i.e., without correlating it with Wi-Fi, Bluetooth, etc.). An Android application was designed and installed on these devices as shown in Figure 2. This application performs the simple function of uploading the GPS data and the RSS data for scanned Wi-Fi access points at a predetermined period. In this experiment, the period was set to five seconds to accelerate data collection and gather a large amount of data for analysis. In non-experimental applications, however, the period can be set to a minute to avoid overwhelming the resources of participating smartphones. Tests were conducted in a two-story building, where data was collected from the ground floor only (altitude was ignored in this experiment). The RSS map size was 100 meters by 100 meters, which covers one campus building and its surroundings. The cluster size in this experiment was set to one square meter. This size was chosen according to the GPS accuracy of the participating devices (the most accurate device in the experiment has an average GPS location error of 0.19 meters). The application was used to collect approximately 6,000 data points divided equally between the participating devices. Approximately 800 unique access points were scanned. The collected data points were processed by the map building engine on the application server to build the RSS map using the procedure described above.
In order to test the accuracy of the generated weighted RSS map, 100 additional data points were collected at locations with known coordinates. Then the average of the mean absolute percentage errors of the weighted RSS map (different devices were assigned different weights when building the map) was calculated at the test data points' locations as follows:

$$E = \frac{100\%}{I} \sum_{i=1}^{I} \frac{1}{N_i} \sum_{n=1}^{N_i} \left| \frac{RSSI_{t,i}(n) - RSSI_{m,i}(n)}{RSSI_{t,i}(n)} \right| \quad (3)$$

where:
n is the index of an access point, and $N_i$ is the number of access points recorded at test point i,
i is the index of a test point, and I is the number of test points,
$RSSI_{t,i}(n)$ is the RSSI recorded by test point i for access point n,
$RSSI_{m,i}(n)$ is the RSSI value stored in the RSSI map cluster for access point n.

We built the RSS map again, but this time without employing weighting factors (the weighting factors were set to 1 for all of the data points regardless of the model of the device). Then, using the same test points, we calculated the average of the mean absolute percentage errors for the unweighted RSS map (all devices have the same weighting factor) using (3). The results are shown in Table 2.
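As an illustration of the error metric described above, the following Python sketch computes the average of the per-test-point mean absolute percentage errors. The function names and the dictionary layout (access point UI mapped to RSSI in dBm) are assumptions made for this example, as is the choice to compare only access points present in both the test point and the matching map cluster.

```python
def mean_abs_pct_error(measured, stored):
    """Mean absolute percentage error over the access points that appear
    in both the test point and the map cluster; None if none are shared.
    RSSI values are assumed to be non-zero (e.g., negative dBm values)."""
    shared = [ap for ap in measured if ap in stored]
    if not shared:
        return None
    total = sum(abs((measured[ap] - stored[ap]) / measured[ap]) for ap in shared)
    return 100.0 * total / len(shared)


def average_mape(test_points, map_clusters):
    """Average the per-test-point errors; test_points[i] is compared with
    map_clusters[i], the cluster at the test point's known location."""
    errors = []
    for measured, stored in zip(test_points, map_clusters):
        err = mean_abs_pct_error(measured, stored)
        if err is not None:
            errors.append(err)
    if not errors:
        raise ValueError("no test point shares an access point with the map")
    return sum(errors) / len(errors)
```

Running the same function once against the weighted map and once against the unweighted map, with identical test points, gives the two figures being compared.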
As can be seen from Table 2, there is less error when weighting factors are employed than when they are not. In this experiment, the weighted map has 6.71% less average mean absolute error than the unweighted map. We also calculated the location error these RSS maps would produce if they were used for location fixing. First, we correlated the RSS data of each test point against the clusters in the RSS map to find the cluster carrying the most similar RSS data, using the minimum mean square error. Then, we calculated the location error, which is the distance between the true coordinates of a test point and the coordinates of the center of the cluster to which the test point's RSS data was matched. This was done for all of the test points, and the average location error was calculated for both RSS maps (i.e., weighted and unweighted). The results are shown in Table 3. The results indicate that the weighted RSS map produces less location error than the unweighted one. Furthermore, the results show that location errors are smaller, and large location errors less likely, for clusters with a high Wm value. This shows how data from less accurate devices can be balanced with data from more capable ones. This allows a diverse range of devices to be used in crowdsourcing for practical cases and allows a wide range of users to participate in crowdsourcing applications.
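The location-fixing evaluation described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code used in the experiment: cluster centers are given as planar (x, y) coordinates in meters, the mean square error is taken over the access points shared by the test point and the cluster, and the names `match_cluster` and `location_error` are hypothetical.

```python
import math


def match_cluster(test_rss, clusters):
    """Return the center of the cluster whose stored RSS values have the
    minimum mean square error against the test point, or None if no
    cluster shares an access point with it.
    clusters: list of (center_xy, {access point UI: stored RSS})."""
    best_center, best_mse = None, math.inf
    for center, stored in clusters:
        shared = test_rss.keys() & stored.keys()
        if not shared:
            continue
        mse = sum((test_rss[ap] - stored[ap]) ** 2 for ap in shared) / len(shared)
        if mse < best_mse:
            best_center, best_mse = center, mse
    return best_center


def location_error(true_xy, test_rss, clusters):
    """Distance in meters between the true position of a test point and
    the center of the cluster its RSS data was matched to."""
    center = match_cluster(test_rss, clusters)
    if center is None:
        return None
    return math.dist(true_xy, center)
```

Averaging `location_error` over all test points, once per map, yields the kind of comparison between weighted and unweighted maps that the text describes.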

Experiment 2 (LTE cells)
In order to test the accuracy of the generated weighted map, we simulated the system using MATLAB. Three LTE base stations with three cells each were placed on a map, as can be seen in Figure 3. These cells were assigned physical IDs from 1 to 9, which serve as the access point UIs in this case. The MATLAB Antenna Toolbox was used to simulate the coverage of these antennas, using the 1700 MHz band with an effective radiated power of 52 dBm. We assigned the effective radius and the weighting factor for the six devices above based on their GPS location accuracy. We configured the simulation so that these devices report RSSI values at random locations on the map, with a random location error chosen such that the average location error matches the values in Table 1. In total, 10,000 data points were generated.
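One way to generate a random location error whose average matches a target value is sketched below in Python (the simulation itself was done in MATLAB, and the paper does not state which error distribution was used). The sketch assumes independent Gaussian offsets on each planar axis, which makes the error magnitude Rayleigh distributed with mean sigma × sqrt(pi / 2); the function name and the 4-meter target in the usage note are hypothetical.

```python
import math
import random


def noisy_location(true_x, true_y, mean_error_m, rng):
    """Return a reported (x, y) position in meters whose distance from the
    true position averages mean_error_m over many draws.

    Assumption: independent zero-mean Gaussian offsets per axis, so the
    error magnitude is Rayleigh distributed with mean sigma * sqrt(pi / 2).
    """
    sigma = mean_error_m / math.sqrt(math.pi / 2)
    return true_x + rng.gauss(0.0, sigma), true_y + rng.gauss(0.0, sigma)
```

For example, calling `noisy_location(10.0, 20.0, 4.0, random.Random(0))` many times produces reported positions whose average distance from (10, 20) approaches 4 meters, so each simulated device can be given the average error measured for it.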
The generated data points were used to build an RSSI map using the steps discussed in the previous section. In order to test the accuracy of the generated weighted RSSI map, the average of the mean absolute percentage errors of the weighted RSSI map (different devices were assigned different weights when building the map) was calculated at the test data points' locations as follows:

$$E = \frac{100\%}{I} \sum_{i=1}^{I} \frac{1}{N_i} \sum_{n=1}^{N_i} \left| \frac{RSSI_{s,i}(n) - RSSI_{m,i}(n)}{RSSI_{s,i}(n)} \right| \quad (4)$$

where:
n is the index of an LTE cell, and $N_i$ is the number of LTE cells observed at test point i,
i is the index of a test point, and I is the number of test points,
$RSSI_{s,i}(n)$ is the RSSI generated by the antenna coverage simulation for LTE cell n at test point i,
$RSSI_{m,i}(n)$ is the RSSI value stored in the RSSI map cluster for LTE cell n.

Figure 3. Simulation map settings
We built the RSSI map again, but this time without employing weighting factors (the weighting factors were set to 1 for all of the data points regardless of the model of the device). Then, using the same test points, we calculated the average of the mean absolute percentage errors for the unweighted RSSI map (all devices have the same weighting factor) using (4). The results are shown in Table 4. As can be seen from Table 4, the accuracy of the map improves when weighting factors are employed compared to when they are not. In this simulation, the weighted map has 8.79% less average mean absolute error than the unweighted map. This shows how data from less accurate devices can be balanced with data from more capable ones. This allows a diverse range of devices to be used in map crowdsourcing for practical cases and allows a wide range of users to participate in crowdsourcing applications. In this experiment, due to simulation limitations, we only built the map for the RSSI measurement. However, the average mean absolute error is expected to be the same for maps of the other physical layer measurements.

CONCLUSION
Physical layer measurement maps are used in different applications including navigation, security, and network performance evaluation. The proposed methodology is an effective approach to generating physical layer measurements maps because it does not require dedicated devices and personnel to collect the measurements manually. As discussed previously, the use of crowdsourcing saves many resources and offers clear performance advantages. The proposed system takes into consideration the diversity of mobile devices used by users and does not require any additional equipment or infrastructure. This work presented a method to build crowdsourced physical layer measurements maps in a way that accommodates a wide range of devices, does not require additional hardware, and takes into consideration the capabilities of different devices. This work differs from current work in that it supports a wide range of devices. This fits well in real-life applications, where there is a wide range of device models with different manufacturers and capabilities. The proposed system addresses this by allowing a wide range of users to participate in the crowdsourcing application without affecting the performance of the system. This is achieved by employing weighting factors for each of the data points based on the trustworthiness of the source. The most trustworthy devices have a larger effect on the RSS map than less trustworthy ones. The system was built and tested for two applications, and the test and simulation results showed a visible improvement in the accuracy of the generated physical layer measurement maps when weighting factors were employed compared to when they were not. This paper showed how employing weighting factors when building maps crowdsourced from a range of devices can improve the accuracy of the generated maps.
This was seen from the test, in which the average mean absolute error of the weighted map decreased by 6.71%, and from the simulation, in which the decrease in error was 8.79%.