How to run the application
Accessing the web application online
Just click on the provided link to access the web application hosted on the EVL server from your browser. It is recommended to use a recent version of Google Chrome.
Hosting the web application locally
Clone the project repository and open app.R with RStudio (install R and RStudio first if you don't have them on your machine). Install every R library used in the project by running install.packages("library") in the RStudio console, once per library. The application queries data in real time from Array of Things, Dark Sky, Open AQ and Chicago Traffic Tracker to build its visualizations. Before running the application for the first time, or to regenerate the data, complete the following preprocessing step.
- One preprocessing step is required to generate the data file for the openAQ part of the application. The script generates latitude and longitude pairs for all the openAQ locations in the city of Chicago and stores them as an fst file, which the app loads when it runs.
- The file is generated by running preprocess_aqi.R, which creates an fst file at the following location: fst/openaq.fst
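A minimal sketch of what a preprocessing step of this kind might look like. The column names (location, latitude, longitude) and the sample rows are assumptions for illustration, not the actual contents of preprocess_aqi.R; only the output path fst/openaq.fst comes from the steps above.

```r
# Hypothetical input: one row per OpenAQ measurement, with repeated locations.
measurements <- data.frame(
  location  = c("Site A", "Site A", "Site B"),
  latitude  = c(41.90, 41.90, 41.75),
  longitude = c(-87.65, -87.65, -87.60),
  stringsAsFactors = FALSE
)

# Keep one latitude/longitude pair per location.
unique_locations <- function(df) {
  coords <- df[, c("location", "latitude", "longitude")]
  coords <- coords[!duplicated(coords$location), ]
  rownames(coords) <- NULL
  coords
}

locations <- unique_locations(measurements)

# Write the result where the app expects it (only if the fst package is installed).
if (requireNamespace("fst", quietly = TRUE)) {
  dir.create("fst", showWarnings = FALSE)
  fst::write_fst(locations, "fst/openaq.fst")
}
```

The app could then load the file back with fst::read_fst("fst/openaq.fst").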
Once all the necessary data/fst files are in place, the application can be run.
Run the application via RStudio by clicking Run App at the top right of the main RStudio panel. Access it from a browser using the local machine address and the port on which the application was started (http://127.0.0.1:port/).
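As an alternative to installing libraries one at a time, the setup can be scripted. This is a sketch only: the package names in required are placeholders (assumptions), to be replaced with the libraries the project actually uses, and the port number in the final comment is arbitrary.

```r
# Hypothetical package list: replace with the libraries the project actually uses.
required <- c("shiny", "leaflet", "fst")

# Return the packages from `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)

# Only install when run interactively (e.g. from the RStudio console).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Then start the app from the project directory, for example:
# shiny::runApp(".", port = 3838)
```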
Getting started and Sidebar
The application starts in full screen. You can open the menu items for the different categories in the sidebar to see more options. The inputs item lets the user switch between metric and imperial units for the whole application, enable or disable the heatmap visualization, and show or hide the table of sensor nodes.
This application has only one main page, a full screen map centered in the Chicago area. On the map there are sensor nodes that can be clicked by the user to get real-time, as well as historical data from it. There is also real-time traffic data that can also be clicked on to get more information.
Like the rest of the application, this map is responsive. In particular, it was created for the large touch-screen display wall in the classroom (11520 by 3240 pixels), and the UI has been designed to work best in that configuration, which is the one applied to this version of the application:
Nodes are groups of various hardware sensors located in the Chicago area. Each node has multiple sensors in it.
The nodes come from 2 different sources: Array of Things and Open Air Quality. The AoT nodes are blue, the OpenAQ nodes green. The red nodes are the inactive AoT nodes: nodes that are deployed but, when queried, return no observations for any of their sensors.
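The coloring rule described above can be sketched as a small function. The data frame layout (id, source, observations) and the sample rows are assumptions made for illustration, not the application's actual data structures.

```r
# Hypothetical node summary: source and number of observations returned
# across all sensors of the node in the latest query.
nodes <- data.frame(
  id           = c("n1", "n2", "n3"),
  source       = c("AoT", "AoT", "OpenAQ"),
  observations = c(12, 0, 5),
  stringsAsFactors = FALSE
)

# OpenAQ nodes are green; AoT nodes are blue when active, red when inactive.
node_color <- function(source, observations) {
  ifelse(source == "OpenAQ", "green",
         ifelse(observations > 0, "blue", "red"))
}

nodes$color <- node_color(nodes$source, nodes$observations)
```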
Clicking on any node opens a popup showing more information about it. The node is also queried, and after a few seconds the data it provides can be visualized in the plot panel. At the same time we query the Dark Sky APIs, which return weather data and other measures for the node's location; these are displayed in the plots panel as well. Keep in mind that the data can take a while to load because it is fetched in real time. Gathering and preprocessing the data for the last 24 hours and the last 7 days is especially slow, because multiple requests are performed and the AoT APIs in particular have a very slow response time.
The Array of Things nodes, along with their locations, are shown in tabular form on the left. The table can be shown or hidden according to the user's preference. It shows, for each AoT node, which measures it is reporting and its overall status (active or inactive). A node that reports a measure has "True" in the column for that measure. The checkboxes below the table let the user show or hide sites based on the selected measures. The user can select any node from the table and visualize the measures for that site in graphical and tabular form. Two sites can be compared by selecting one row at a time: their visualizations then appear in the current and previous sections of the graphical and tabular panel.
The map background can be changed by clicking on the select boxes in the layers menu. There are 4 different map backgrounds. The default is a drawn map with train stations, streets and highways, parks and icons for the most important locations.
The Dark Matter background makes good use of the different colors (traffic and nodes), which pop out against the black background.
The satellite background is a regular satellite image of Chicago.
The terrain background is a mix of the previous ones: colors are easier to see because the predominant color is white/grey, terrain features are visible (see the green areas and water), and streets and highways are highlighted.
Traffic data come from Chicago Traffic Tracker. This data is real-time (at most 20 minutes old) and describes the congestion status of the streets of Chicago.
In particular, the dataset contains the current estimated speed for about 1250 segments covering 300 miles of arterial roads.
We decided to integrate traffic data because it could help explain and give a cause to the real-time variations of pollutants such as CO, PM2.5, PM10, NO2 that are also produced by vehicles.
The roads are colored following the usual traffic colors of common GPS navigation applications (e.g. Google Maps). The color conventions are: grey: no traffic (no cars or no data provided), blue: normal/high speed and above, yellow: medium speed, orange: low speed, red: very low or zero speed.
Each road is clickable, and more information is provided in a popup.
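The color convention can be expressed as a mapping from estimated speed to color. The sketch below uses cut() with thresholds in mph that are purely illustrative assumptions (the actual cut-offs used by the application are not stated here), and it assumes that a missing or negative speed means no data.

```r
# Map estimated segment speeds to the traffic colors described above.
# Thresholds (10, 15, 20 mph) are illustrative assumptions.
traffic_color <- function(speed_mph) {
  color <- as.character(cut(
    speed_mph,
    breaks = c(0, 10, 15, 20, Inf),
    labels = c("red", "orange", "yellow", "blue"),
    right  = FALSE            # intervals are [0, 10), [10, 15), ...
  ))
  color[is.na(color)] <- "grey"  # missing or negative speed: no data
  color
}

traffic_color(c(-1, 0, 12, 18, 30, NA))
```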
The Heatmap can be used to visualize the intensity of the various measures across the city of Chicago. It has three inputs: type of measure, value type (min, max, average) and time range (current, last 24 hours and last 7 days). The user can select any measure from the dropdown for the three data sources and visualize it for the different value types and time ranges. The Heatmap is created by interpolating the data of all the nodes of a given data source, so that values can be estimated for the entire city of Chicago. Interpolation is not possible for some measures when too few nodes report them; in that case a message is shown to the user on the screen. A legend explains the values shown on the heatmap.
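To illustrate the idea of interpolating node values over the city, here is a minimal sketch using inverse distance weighting (IDW) in base R. IDW is chosen here only as a simple example technique; the interpolation method the application actually uses is not stated above. The node coordinates, values and the min_nodes threshold are assumptions, and distances are computed directly in degrees for simplicity.

```r
# Inverse distance weighting: estimate a value at (x, y) from known nodes.
idw <- function(x, y, node_x, node_y, node_value, power = 2) {
  d <- sqrt((node_x - x)^2 + (node_y - y)^2)
  if (any(d == 0)) return(mean(node_value[d == 0]))  # exactly on a node
  w <- 1 / d^power
  sum(w * node_value) / sum(w)
}

# Hypothetical node readings (longitude, latitude, value).
nodes <- data.frame(
  lon   = c(-87.70, -87.60, -87.65),
  lat   = c(41.80, 41.90, 41.95),
  value = c(10, 20, 30)
)

# Refuse to interpolate with too few reporting nodes (threshold is an assumption).
min_nodes <- 3
if (nrow(nodes) >= min_nodes) {
  grid <- expand.grid(
    lon = seq(min(nodes$lon), max(nodes$lon), length.out = 5),
    lat = seq(min(nodes$lat), max(nodes$lat), length.out = 5)
  )
  grid$value <- mapply(
    idw, grid$lon, grid$lat,
    MoreArgs = list(node_x = nodes$lon, node_y = nodes$lat,
                    node_value = nodes$value)
  )
} else {
  message("Not enough nodes to interpolate this measure")
}
```

Each grid cell gets a weighted average of the node values, with closer nodes weighing more; the resulting grid is what a heatmap layer would color.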
Panels and menus
This is the most important panel of the application. It shows the data queried from the various nodes as line charts or in tabular form.
This panel has two inputs. The first is the time range, which can be current (the most recent data), last 24 hours or last 7 days. The second is a set of checkboxes that let users select only the pollutants they want to visualize.
The panel is divided into two tabs: the first lists the pollutant metrics and the second lists weather/climate metrics.
Each tab consists of a graphical and a tabular view. The graphical view displays the metrics for the currently selected node and also shows the graph for the previously selected node. Similarly, the tabular view shows the metrics for the currently selected node in a data table, along with the data table for the previously selected node.
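The current/previous behavior can be modeled as a tiny piece of state that shifts on each new selection. This is a sketch under assumptions: the function name update_selection, the node identifiers, and the choice to ignore a repeated click on the same node are all hypothetical.

```r
# Keep the current and previous selections; a new click shifts current to previous.
update_selection <- function(state, node_id) {
  if (identical(state$current, node_id)) return(state)  # same node clicked again
  list(current = node_id, previous = state$current)
}

state <- list(current = NULL, previous = NULL)
state <- update_selection(state, "node_A")
state <- update_selection(state, "node_B")
# state$current is now "node_B" and state$previous is "node_A"
```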
Tab 1 - Pollutants
The following are the metrics which can be displayed in the first tab:
1) CO, 2) H2S, 3) NO2, 4) O3, 5) SO2, 6) PM2.5, 7) PM10, 8) BC
Metrics can be removed from the graph using the checkboxes below the second graph output. The following screenshot shows a sample graph from the first tab:
The legend shows what colors are used for each metric along with the units (note that imperial units can be triggered from the main panel on the left). Distinct colors are used for each metric. Similarly, the same data can be viewed in tabular format:
Tab 2 - Weather measures
Similar to tab 1, we have graphs and tables for tab 2. The following are the measures which can be displayed for tab 2:
1) Temperature, 2) Humidity, 3) Intensity, 4) Wind Speed, 5) Cloud coverage, 6) Visibility, 7) Pressure, 8) Ozone
Note that temperature, humidity and intensity are available in both darksky and Array of Things data, hence the distinction used in the legend is that AoT measures have the trailing text "(AOT)" in their name while darksky measures do not have any trailing text. This can be seen in the example graph below:
As usual, you can convert the units to imperial, and this is shown in the legend text. The data can also be viewed in tabular format, as in the example table below. Because there are many weather metrics, this tab is best viewed by filtering out variables using the checkboxes at the end of the panel.
Note that all these graphs and tables show real-time data, refreshed every minute. No data is stored locally. Even the graph/table for the previously selected node is fetched in real time.
The layers menu can be opened by hovering over or clicking on the corresponding menu on the top right corner.
From there you can select one of the 4 different map backgrounds, as well as toggle on or off the sensors belonging to the nodes on the map based on the type of measure they give, or if they are active or inactive.
The last toggle controls the traffic data: it shows or hides the real-time road traffic visualization layer.
Units of measure
To make the data meaningful to users both in the United States and in the rest of the world, we provide a switch in the sidebar, under the inputs item, that converts the data from metric to imperial and vice versa.
Pollutants measured in ppm (parts per million) have no equivalent in the imperial system and hence remain the same. The conversion details for other pollutants are listed below:
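A minimal sketch of a metric-to-imperial conversion using standard conversion factors. Which measures and units the application actually converts is an assumption here; the function name to_imperial and the unit labels are hypothetical. Units without an imperial equivalent, such as ppm, pass through unchanged.

```r
# Convert a value to imperial units using standard factors.
# The set of units handled is an illustrative assumption.
to_imperial <- function(value, unit) {
  switch(unit,
    "C"   = list(value = value * 9 / 5 + 32, unit = "F"),    # Celsius to Fahrenheit
    "km"  = list(value = value * 0.621371,   unit = "mi"),   # kilometers to miles
    "m/s" = list(value = value * 2.236936,   unit = "mph"),  # meters/second to miles/hour
    list(value = value, unit = unit)  # e.g. ppm has no imperial equivalent
  )
}

to_imperial(100, "C")
to_imperial(5, "ppm")
```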
Used R libraries
This is the list of R libraries used for this project:
Using traffic data
As explained in the documentation, traffic data could help explain the real-time variations of pollutants such as CO, PM2.5, PM10 and NO2, which are also produced by vehicles. One strategy for this kind of visual analytics is to use the layers menu to select the nodes that report those pollutants, then select nodes in areas with different traffic levels and compare the current pollutant values. Unfortunately, the data provided by the nodes is not very good: the CO measurement is negative most of the time (the APIs don't explain what negative values mean; they may stand for 0), NO2 is 0 everywhere in Chicago, and PM2.5 and PM10 are reported by almost no sensor. For the sake of explaining the application's functionality, I provide a screenshot of what the application looks like when interactions of this type are performed.
Last 7 days, downtown vs uptown
I tried to compare 2 different locations in terms of traffic, buildings, amount of people, closeness to the lake.
The upper plot shows the data for the last 7 days for a node located in uptown, close to the lake. The bottom plot shows data for the last 7 days for a node located in downtown, a little more west (further from the lake).
What I noticed: downtown, the humidity level is more constant and overall lower than uptown; H2S is only present downtown (it is 0 in the 08F node); and the temperature is higher uptown.
Heatmap for Particulate Matter 2.5 over the week
I compared the average PM2.5 measurements from the OpenAQ source for the last week, and it looks like the south side of Chicago has more PM2.5 particles than the north side. The reason might be the presence of scrap yards, distribution warehouses and low-income neighborhoods in the southern part of Chicago.
Heatmap for Light Intensity over the last week
I compared the light intensity across the Chicago area for the last week. A few areas have a comparatively very high light intensity relative to the rest of the city. This may be because some locations have better exposure to light than others.
Temperature over the last week using Darksky
I compared the temperature values for the last week from the Dark Sky API using the Heatmap, and Dark Sky sometimes shows absurd temperature values (-60 F), as shown in the figure. The reason might be faulty sensors: from further reading, I learned that Dark Sky sensors are prone to errors and can sometimes give wrong measurements.
Data obtained from openAQ
OpenAQ reports more or less the same level of values for the weather measures. This is possibly because the weather is constant throughout Chicago or it is possible that the sensors report inaccurate values. The following table shows two openAQ nodes which have almost same values for many of the metrics.
Comparing data from AoT and openAQ
For two locations that are geographically close, the values obtained for the pollutant ozone are quite different, as seen by comparing the data from openAQ (top graph) with the data from AoT (bottom graph). The measure is reported in ppm, so even a small difference is significant.
Temperature obtained from Darksky and AoT
For a given AoT node, the Dark Sky temperature is clearly wrong: it shows -56 degrees Celsius, as seen in the screenshot. The AoT sensor, on the other hand, reports plausible temperatures for the past 7 days.
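Readings like these could be screened before plotting. The sketch below replaces temperatures outside a plausible range with NA; the function name flag_implausible, the bounds (-40 to 45 degrees Celsius) and the sample readings are assumptions for illustration, not part of the application.

```r
# Replace temperatures outside a plausible range with NA.
# The bounds (degrees Celsius) are an assumption for Chicago.
flag_implausible <- function(temp_c, lower = -40, upper = 45) {
  temp_c[!is.na(temp_c) & (temp_c < lower | temp_c > upper)] <- NA
  temp_c
}

darksky_temp <- c(3.5, -56, 4.1)  # hypothetical readings
flag_implausible(darksky_temp)
```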