Terry's GIS Studies and Transition to a New Career

Showing posts with label Quantile. Show all posts

Saturday, April 4, 2020

Module 5--Choropleth Mapping

For this module, I produced a choropleth map that showed the population density of European countries along with wine consumption per capita. The task was to show both variables clearly on one map: population density as the choropleth layer and wine consumption using either proportional or graduated symbols. For extra points, I chose to use pictures of wine bottles instead of the default circle symbols.

For the data display, I chose graduated symbols and the quantile classification method with four classes. I did not need to normalize the wine data because it was already normalized in the attribute table (wine consumption per capita). To me, quantile classification with four classes provided the best granularity for perceiving and understanding the differences in each variable. I also excluded four countries from the data set because they were outliers: high population density, tiny area, and insignificant wine consumption. Finally, the graduated symbols were easier to manipulate than proportional symbols. With proportional symbols, I could only set the lower limit and the remaining sizes were scaled automatically. With graduated symbols, I could set the size for each class individually to ensure the viewer could perceive the differences.
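As a rough illustration of what quantile classification with four classes does, here is a small Python sketch. The consumption figures are made up for the example (they are not the values from my attribute table), the function names are my own, and ArcGIS Pro may compute its breaks slightly differently.

```python
import bisect
import statistics

def quantile_breaks(values, classes=4):
    """Return the interior cut points that split values into equal-count classes."""
    return statistics.quantiles(values, n=classes, method="inclusive")

def classify(value, breaks):
    """Return the 0-based class index of a value, given ascending interior breaks."""
    return bisect.bisect_left(breaks, value)

# Hypothetical wine consumption per capita (litres per person), not the real data.
consumption = [1.2, 3.5, 8.0, 12.4, 20.1, 25.3, 33.0, 41.7]

breaks = quantile_breaks(consumption)          # three cut points for four classes
classes = [classify(v, breaks) for v in consumption]

for value, cls in zip(consumption, classes):
    print(f"{value:5.1f} litres -> class {cls + 1}")
```

Each of the four classes ends up with the same number of countries, which is why quantile maps tend to look balanced even when the values themselves are unevenly spread.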

For the actual map, I used the Albers Equal Area Conic Projection because it is important to preserve area across the map when comparing data that is tied to area, such as population density. I chose a purple color ramp to display population density because it reinforced and complemented the overall theme of the map--wine consumption. I also added an inset map of the Balkans, as that area was too crowded on the main map to be usable.
Population Density and Wine Consumption (per capita) in Europe
I prepared most of the map in ArcGIS Pro. The biggest challenge was importing the wine bottles. To do this, I found free clip art in .svg format that did not require attribution. I then imported the bottle and adjusted its size and other properties so that it behaved just like any other graduated symbol. To reposition the wine bottles, I converted the symbols to graphics, which allowed me to move them individually. I did the same with the text to ensure there was no overlap and everything was sized appropriately.

For my base map, I used the Ocean base map in ArcGIS Pro. I then added bold italic labels for each main body of water, scaling the text size to the size of the body of water. I also curved or bent the text to convey water flow.

Once all my essential map elements were complete, I saved the map as a .pdf because ArcGIS Pro no longer has the functionality to export to Adobe Illustrator (AI). I then opened a new project in AI from the .pdf file and used AI to touch up the map and add higher-level graphics: I moved wine bottles and names, added drop shadows to the Atlantic Ocean, and added an inner halo to the legend and other text boxes. I wanted to add an inner halo to the countries as well; however, this would have caused issues with interpreting the population density. If this were just a regular map (not graduated), I would have added an inner halo.

The biggest issue with manipulating features in AI from a .pdf is that there are hundreds of components, nested many levels deep, that you must manipulate. You must also Ctrl-click each component of a feature to ensure they are all changed together. For instance, I had a halo around the country names on the main map. To resize or move a country's name, I had to click on numerous components so that they were all changed in the same way.

My advice to everyone is to be meticulous and systematic, especially when manipulating so many different features. I picked a group of features, toggled its visibility on and off to locate it on the map, and then manipulated it as needed. Once I finished, I went to the next feature in the right pane. Otherwise, it would have been very frustrating to find the features by clicking on the map.

Overall, though very tedious, this was a fun exercise and I learned quite a bit. I can definitely tell that my competence and confidence have improved.



Wednesday, March 25, 2020

Module 4--Data Classification

In this module, I did a refresher on the four levels of measurement (nominal, ordinal, interval, and ratio) along with four common data classification methods for displaying data on a map: equal interval, quantile, standard deviation, and natural breaks.

The exercise was very straightforward and built on skills previously learned. The task was to map persons aged 65 or over in the census tracts of Miami-Dade County. I created two series of four maps that look at the data differently. The first series displayed the four classification methods applied to the percentage of the population aged 65 or over in each census tract. The second series displayed the same methods applied to the number of persons aged 65 or over normalized by area (per square mile). As a reminder, choropleth maps should use normalized data.

The first set of maps is displayed below and shows the percentage of the population aged 65 or over, not normalized by area:

Percentage of Persons Age 65+, Non-Normalized

As you can see, each map shows the percentage of the population aged 65 or over by census tract, using one of the four classification methods. In my opinion, the least useful is the equal interval method, as it imposes artificial class breaks and generalizes the data too much. The quantile method is more effective and shows much more detail, in line with the natural breaks and standard deviation maps, but it also imposes artificial breaks. The standard deviation method assumes that the data is normally distributed. Because the data may be skewed rather than normally distributed, this might not be the best method. Additionally, the data is placed into six intervals spanning +/- 3 standard deviations, so most tracts will cluster in the classes around the mean, and the mean and standard deviation themselves can be influenced by outliers. I personally like the natural breaks (Jenks) method, as it uses an algorithm to establish intervals that minimize variance within classes while maximizing variance between classes. I believe that this provides the best representation of the data and takes natural clusters into account.
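To make the difference between equal interval and natural breaks concrete, here is a minimal Python sketch. The percentages are invented for the example (they are not the Miami-Dade values), the function names are my own, and the natural breaks function is a brute-force search that only works on small lists. It follows the same idea as Jenks (minimize the variance within classes) but is not the algorithm ArcGIS Pro actually runs.

```python
import itertools
import statistics

def equal_interval_breaks(values, classes):
    """Interior breaks that split the data range into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / classes
    return [lo + width * i for i in range(1, classes)]

def natural_breaks(values, classes):
    """Brute-force natural breaks: try every way of cutting the sorted values
    into `classes` groups and keep the grouping with the smallest total
    within-class sum of squared deviations. Only practical for small lists."""
    data = sorted(values)

    def ssd(group):
        mean = statistics.fmean(group)
        return sum((x - mean) ** 2 for x in group)

    best_cost, best_groups = None, None
    # Each combination of cut positions splits the sorted list into groups.
    for cuts in itertools.combinations(range(1, len(data)), classes - 1):
        edges = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(ssd(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups

# Hypothetical percentages aged 65+: two clusters plus one outlier tract.
pct_65_plus = [4.0, 5.0, 6.0, 14.0, 15.0, 16.0, 17.0, 48.0]

ei_breaks = equal_interval_breaks(pct_65_plus, 3)
nb_groups = natural_breaks(pct_65_plus, 3)

print("Equal interval breaks:", [round(b, 1) for b in ei_breaks])
print("Natural breaks groups:", nb_groups)
```

With these made-up numbers, equal interval puts seven of the eight tracts into the lowest class and leaves the middle class empty, while the natural breaks search separates the two clusters and isolates the outlier.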

The second set of maps is displayed below and shows the number (not the percentage) of persons aged 65 or over, normalized by square miles:

Number of Persons Aged 65+, Normalized for Sq. Miles

The normalized data shows the number of persons aged 65 or over per square mile. I will not repeat my assessment of the classification methods, as it did not change. I acknowledge that choropleth maps should generally use normalized data. In this exercise, however, I do not believe normalizing by area is necessary, though the purpose of the map would ultimately determine this. Normalizing by area shows how densely people are spread across each tract. If a large number of people are spread over a large tract, they can appear far less significant than a smaller number of people packed into a small tract. The map will therefore favor the smaller areas and may not give an accurate picture. For instance, if a commission wanted to allocate services to the elderly population, the number of people would be important, but not necessarily how densely they are distributed. To illustrate further, if a commission were setting up medical centers, a large tract with a larger population may require more providers than a small tract with a smaller population, even though normalization will display a higher density in the smaller tract.
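A quick worked example shows how the ranking can flip. The two tracts and their numbers below are hypothetical, chosen only to illustrate the point, and the field names are my own.

```python
# Two hypothetical census tracts (not real Miami-Dade figures).
tracts = [
    {"name": "Large tract", "seniors": 3000, "sq_miles": 10.0},
    {"name": "Small tract", "seniors": 800, "sq_miles": 0.5},
]

# Normalize the count by area: persons aged 65+ per square mile.
for tract in tracts:
    tract["density"] = tract["seniors"] / tract["sq_miles"]

by_count = max(tracts, key=lambda t: t["seniors"])["name"]
by_density = max(tracts, key=lambda t: t["density"])["name"]

for tract in tracts:
    print(f'{tract["name"]}: {tract["seniors"]} seniors, '
          f'{tract["density"]:.0f} per sq. mile')
print("Most seniors:", by_count)
print("Highest density:", by_density)
```

The large tract has almost four times as many seniors, yet the small tract would be shaded darker on the normalized map because its density is more than five times higher.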

Again, this was an enjoyable exercise, and it gave me more to consider when designing maps.