Tags: r, ggplot2, heatmap, ggmap

Density Heatmap - odd scale and display issues, I think?


I am attempting to do some mapping of arrest-related data in Los Angeles (using this dataset: https://data.lacity.org/A-Safe-City/Arrest-Data-from-2010-to-Present/yru6-6re4).

When I run the code as shown below, I get the following warning:

Warning message: Removed 11,578 rows containing non-finite values (stat_density2d). 

So, that means out of 11,808 data points, only 230 are showing on the map. This seems reasonable, considering I'm zooming in on the two or three-block radius around the LA Coliseum only. This means, in 2017, there were 230 arrests in this area. OK.
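That count can be sanity-checked without plotting. The sketch below is only an illustration: the points are simulated rather than taken from location2017.df, and `lon_lim` and `lat_lim` are hypothetical map limits, not the ones used for the actual plot. It counts how many rows fall inside the window (and would be drawn) versus outside it (and would be reported as removed).

```r
# Simulated stand-in for location2017.df (the real data is not reproduced here)
set.seed(1)
pts <- data.frame(
  lon = runif(1000, -118.6, -118.1),
  lat = runif(1000, 33.7, 34.3)
)

# Hypothetical map limits around the area of interest
lon_lim <- c(-118.30, -118.27)
lat_lim <- c(34.00, 34.03)

# TRUE for rows that fall inside the plotting window
inside <- pts$lon >= lon_lim[1] & pts$lon <= lon_lim[2] &
  pts$lat >= lat_lim[1] & pts$lat <= lat_lim[2]

sum(inside)   # rows that would be drawn
sum(!inside)  # rows that would be dropped with the "non-finite values" warning
```

With the real data, the two counts should add up to 11,808, and the second should match the number in the warning.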

But, when I map it, I get a density scale running from 500 all the way up to 2,500 (as seen below).

Here is a tibble of the location2017.df:

> as_tibble(location2017.df)
# A tibble: 11,808 x 3
     lon   lat Frequency
   <dbl> <dbl>     <int>
 1 -118.  33.7         5
 2 -118.  33.7         2
 3 -118.  33.7         1
 4 -118.  33.7         1
 5 -118.  33.7         4
 6 -118.  33.7         2
 7 -118.  33.7         2
 8 -118.  33.7         1
 9 -118.  33.7         1
10 -118.  33.7         4
# … with 11,798 more rows

Here is the code I'm using to set everything up.


And here is the plot output:

[image: density heatmap over a map of the area around the LA Coliseum]

As you can see, it is quite "washed out". It seems odd to me that there is a purple hue over the whole map based on an assumed 230 total arrests. The density scale is also strange: why does it run from 500 to 2,500 when there is a limited number of arrests?

So, in the end, two questions:

1. Does it seem correct that there is a purple hue over the entire map?

2. Why does the density scale beside the map show the values it does with only 230 arrests plotted?

Any thoughts, suggestions, or corrections on how to make this plot look and read better are greatly appreciated.

EDIT

Decided to quickly output a geom_point of the same information as above. Here is the plot:

[image: geom_point plot of the same arrest data]

As you can see, the "purple hue" from the first image makes sense. There is a limited number of arrests throughout the area, with a large mass where the yellow part of the heatmap is.

So, is there a way to make a stronger contrast between areas with fewer arrests and areas with more arrests, in order to limit the hue that currently covers the map?


Solution

  • The answer to your second question helps answer your first:

    ggplot2 calculates the scale range before it chooses which elements of the dataset to plot, so it sees a range of values from 500 to 2,500. Why? Because of your data. Your longitude and latitude values are very coarse (e.g. 33.7), while you are zooming in on a very specific area. Coordinates can be recorded with varying degrees of precision; if, for example, you had some arrests at 33.72515 and others at 33.71235, the numbers would more accurately describe the actual number of crimes inside your zoom perimeter.

    This also explains the purple tint over the entire plot: your data suggests that over 500 crimes were committed in those areas, even though in reality they may be concentrated in specific streets or alleys.
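To probe where legend values in the hundreds or thousands can come from, here is a minimal sketch. It assumes the density layer relies on `MASS::kde2d()` (which ggplot2's `stat_density_2d` uses), and it runs on simulated points inside a hypothetical window rather than on the arrest data. A kernel density estimate is expressed per unit area (here, per square degree) rather than as a count, so a few hundred points confined to a very small window can produce large values.

```r
library(MASS)  # recommended package shipped with R; provides kde2d()

set.seed(42)

# Simulated stand-in for 230 arrests in a small window (hypothetical limits)
n   <- 230
lon <- runif(n, -118.295, -118.280)
lat <- runif(n, 34.008, 34.020)

# 2D kernel density estimate on a 100 x 100 grid with default bandwidths
est <- kde2d(lon, lat, n = 100)

# Density values are per square degree, so they can be far larger than n
range(est$z)

# Density times cell area, summed over the grid, approximates the probability
# mass inside the grid: at most about 1, since some mass falls outside it
cell_area <- diff(est$x[1:2]) * diff(est$y[1:2])
mass <- sum(est$z) * cell_area
mass
```

If the values printed by `range(est$z)` are of the same order as the legend on the map, the scale is reporting density per unit area, not a number of arrests.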

    What can you do?

    1. If you preprocessed the coordinates and dropped some of the digits after the decimal point, try using the original coordinate data.
    2. Look through your dataset for a description of the place where each crime was committed (government databases usually carry a lot of detail per entry). You could then look up the exact coordinates of the street given for each crime and get more precise coordinates that way. It will take some work, and it depends on whether you have the street name for every row in the dataset.
    3. Plot a more zoomed-out view of the map.
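Before chasing more precise coordinates, it may help to check how many decimal places the stored values actually carry. Note that a tibble prints only about three significant digits by default, so a displayed 33.7 does not by itself prove the data were rounded. Below is a minimal base R sketch using made-up latitudes in place of `location2017.df$lat`; `count_decimals` is a hypothetical helper, not part of any package.

```r
# Made-up latitudes; with the real data, use location2017.df$lat instead
lat <- c(33.7, 33.72515, 33.71235, 33.7, 34)

# Hypothetical helper: number of digits after the decimal point,
# based on R's default character representation of each value
count_decimals <- function(x) {
  txt <- as.character(x)
  frac <- ifelse(grepl(".", txt, fixed = TRUE), sub("^[^.]*\\.", "", txt), "")
  nchar(frac)
}

# How many values carry 0, 1, 2, ... decimal places
table(count_decimals(lat))

# Few distinct values relative to the number of rows suggests that
# many points are stacked on the same coarse coordinate
length(unique(lat))
length(lat)
```

If most values turn out to have four or more decimal places, the coordinates were not truncated, and the first two suggestions are unlikely to change the plot.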

    Here is an example from a project I worked on in the past, where I plotted parking citations in Washington, DC. My coordinates were considerably more precise than yours (four digits after the decimal point), and you can see how that is reflected in the density plot:

    [image: density plot of parking citations in Washington, DC]