Click here to read Part 1 of the mini-series.
I continued working on San Francisco’s Parks and Recreation Department public-facing data on green space properties managed by the authority. In the first article, I cleaned two data sets provided by the Department to exclude properties that cannot qualify as parks, such as playgrounds, recreation centers, golf and tennis courts, libraries, and the zoo.
The guiding question for this article is, “Are there differences between the cleaned and original park data shown in the 10-minute walking distances around green spaces?”
Data Viz Purpose
This article aims to create two primary data visualizations, one based on the cleaned data and one on the original data, that show the audience the 10-minute walking distance (a half-mile radius) from each mapped green space. The purpose of the data visualizations is to:
- Showcase the walking distance radii around each park;
- Visualize the places on SF’s map that are not within a half-mile radius of any park since these gaps might be the green space deserts that urban planners and local governments must address;
- Invite the audience to compare and contrast the two primary visualizations and think about the implications this data might have in reality.
Some guiding principles I kept in mind throughout the process are to create data visualizations that inspire creative thinking and to make them accessible and easy to understand.
Technical Process
For this project, I worked mainly with the geopandas and matplotlib packages in Python and with data in CSV and shapefile formats. I visualized the whole process in Figure 1.
Throughout the project, I had two main challenges: first, figuring out how to plot the multipolygon spaces, and second, understanding how to create the half-a-mile radii via estimation.
The format of the cleaned datasets caused the first challenge. Since they are CSV files, the longitude and latitude values were stored as strings; thus, I couldn’t plot them. I converted these values from strings to multipolygons, which capture the diverse shapes of green spaces instead of reducing them to simple points on a map and which the geopandas package can plot.
The second challenge was more difficult. Plotting a radius around a simple point with just two coordinate values is easy with the buffer() function in geopandas, yet green spaces have complex shapes with multiple values for latitude and longitude. Additionally, I needed to estimate the buffer distance myself because buffer() works in the units of the data’s coordinates (here, degrees of latitude and longitude), not in meters. Image 1 represents my thought process regarding this estimation. Overall, I wanted to calculate the change in latitude (the angle that a north-south distance makes at the center of the Earth) that corresponds to half a mile, specifically for San Francisco. Using the formula for a circle’s circumference, the approximate radius of the Earth, and the conversion from radians to degrees, I found that one degree of latitude spans about 111.7 km. After converting the half-a-mile distance to kilometers (about 0.8 km), I divided it by the 111.7 km per degree constant. I got 0.00716204118 degrees, which I further used for radii estimation.
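The arithmetic behind the 111.7 km per degree constant and the final buffer value can be reproduced in a few lines. This is a minimal sketch, not the exact code used for the figures: the rounded Earth radius of 6,400 km, the rounding of half a mile to 0.8 km, and the latitude of 37.77° for San Francisco are assumptions chosen because they reproduce the numbers above, and the function names are illustrative.

```python
import math

# Assumptions (not taken from the original notebook):
EARTH_RADIUS_KM = 6400.0  # rounded radius that yields ~111.7 km per degree
HALF_MILE_KM = 0.8        # half a mile (0.804672 km) rounded to 0.8 km
SF_LATITUDE_DEG = 37.77   # approximate latitude of San Francisco


def km_per_degree_latitude(radius_km: float = EARTH_RADIUS_KM) -> float:
    """Length of one degree of latitude: Earth's circumference divided by 360."""
    return 2 * math.pi * radius_km / 360


def km_to_degrees(distance_km: float, km_per_degree: float) -> float:
    """Convert a ground distance in km to an angle in degrees."""
    return distance_km / km_per_degree


km_per_deg = km_per_degree_latitude()

# Use the rounded constant from the text to reproduce the buffer value.
buffer_deg = km_to_degrees(HALF_MILE_KM, 111.7)

# One degree of longitude shrinks with the cosine of the latitude.
km_per_deg_lon = km_per_deg * math.cos(math.radians(SF_LATITUDE_DEG))

print(f"km per degree of latitude:  {km_per_deg:.1f}")
print(f"buffer in degrees:          {buffer_deg:.11f}")
print(f"km per degree of longitude: {km_per_deg_lon:.1f}")
```

One caveat this sketch makes visible: at San Francisco's latitude, a degree of longitude spans roughly 88 km rather than 111.7 km, so a buffer expressed in degrees covers less ground east-west than north-south.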
Since this approach is an approximate estimation, I cannot claim that the mapped radii are exact for two reasons. Firstly, I do not account for the grid system or cul-de-sacs and how those might impact the half-a-mile walking distance (e.g., they could make the trip longer). Secondly, I do not account for the elevation of specific sites. For instance, while the Panhandle area east of Golden Gate Park is relatively flat (about 200-300 feet above sea level), the area surrounding Twin Peaks is extremely hilly (about 600 feet above sea level). Thus, I plotted the upper bound of these radii, which I specify in all the visualizations’ titles. I assume that the real walkable radii are likely to be smaller than the plotted ones due to the two reasons discussed above.
Accessibility Improvements
I have also considered practical accessibility improvements in creating the visualizations, specifically relating to the color selection, the title selection, and the addition of legends where possible. Firstly, I selected a colorblind-friendly palette (shades of orange and blue) based on online research. I decided to use light grey as the base map color and remove any details to increase the legibility of the smallest green spaces. Secondly, I chose descriptive titles for all four visualizations and specified crucial information within them, such as:
- The fact that the radii show the upper bound;
- Where the data comes from;
- The year when SF Parks and Recreation Department created or updated the data.
Lastly, I included legends for the first two visualizations to clarify where the plotted data came from and to make my approach transparent. While the axes represent longitude and latitude, I decided not to label them since the labels could have drawn the audience’s attention away from the visualization or other important text, such as titles or legends.
Data Visualizations
Discussion
Figures 2 and 3 showcase polygons that represent San Francisco’s green spaces. Figure 2 is based on the two datasets I pre-processed in Part 1, and Figure 3 is based on both this cleaned data and the original datasets by SF Parks and Recreation Department. I created Figure 3 to contrast the data and point to how many polygons I’ve excluded from the analysis since they cannot qualify as green spaces. The color selection worked effectively here since even the smallest parks are visible as brightly-colored spots on the map.
Figures 4 and 5 showcase the green spaces and the 10-minute walking distance around the parks. My rationale for creating not one but two visualizations was to highlight the differences in empty spaces between the cleaned and original data. I intentionally made the radii transparent so that the locations with a higher green space concentration would visually stand out and contrast with the base map layer. Compared to Figure 5, Figure 4 has more gaps that no radius covers, pointing to the lack of green space in those areas. Additionally, we should remember that the radii are estimated upper bounds, meaning that in practice the walkable distances are likely smaller than what I mapped.
The lack of green space in San Francisco and globally disproportionately affects underserved and marginalized communities. Thus, urban planners, community leaders, and local authorities should focus their efforts on equity in green space access since only then could these spaces produce socially equitable effects. Leinen (2020) points out three central problems facing locations with little green space. Firstly, proximity to parks improves public health, both physical and mental. Those who are deprived of it are at risk of developing health problems caused by pollution, such as asthma. Additionally, those lacking quality green space might have fewer opportunities for physical exercise, cycling, or walking, and unhoused people might be deprived of their places of rest and living. Secondly, green spaces provide crucial habitats for wildlife, especially in high-density urban areas, help decrease air pollution, and assist with water filtration. The green cover is critical to urban settings since it helps reduce the harmful effects of heat islands. In locations where green spaces cannot provide these functions, local governments should develop other systems to satisfy these needs. Still, since these areas are often already underserved, there is even more pressure on these communities, both environmentally and financially. Lastly, green spaces have economic benefits, such as cooling down areas that might otherwise need artificial shading and providing trails and cycling routes to decrease car dependency. Additionally, green spaces could offer educational environments to learn about local plant and animal species, provide areas for community gatherings, and much more.
The Trust for Public Land has found that communities of color in the US have, on average, 44% less green space than predominantly white communities. After studying 108 urban areas in the United States, a 2020 study by Hoffman et al. found consistently higher land surface temperatures in formerly redlined areas compared to non-redlined neighborhoods. This historical continuity contributes to the further marginalization of underserved communities and worsens the quality of life in these areas.
In 2017, San Francisco became the first city in the US in which 100% of residents live within half a mile of a park. Given that claim, the data visualizations created as part of this paper should make us wonder how this data was compiled and for what purposes. While San Francisco seems to have green space deserts in some neighborhoods, we should understand that cities with less funding, smaller Parks and Recreation departments, and more social inequality are likely to face even more severe green space deserts.
Next Steps
I suggest building on Figures 4 and 5 to improve the robustness of the half-a-mile radii. Specifically, a researcher could incorporate data on San Francisco’s elevation and map out 10-minute walking distances that reflect how hilly or flat the area is. Another improvement could be using the average block size in San Francisco to calculate how the radii would change. My current estimation assumes straight-line (diagonal) distances, and in many cases, a person wouldn’t be able to complete such a route due to the grid structure of the city or existing cul-de-sacs.
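To get a rough sense of how much a street grid could shrink the radii, one can compare the straight-line distance with the distance walked along perpendicular streets. The sketch below assumes an idealized, perfectly rectangular grid with no cul-de-sacs or hills, and `straight_line_reach` is a hypothetical helper written for illustration, not part of any library.

```python
import math

HALF_MILE = 0.5  # walking budget in miles (the 10-minute walk)


def straight_line_reach(budget: float, bearing_deg: float) -> float:
    """Straight-line distance reachable on an idealized rectangular grid.

    bearing_deg is the direction of travel relative to one street axis.
    Walking a straight-line distance d at angle theta along the grid costs
    d * (|cos(theta)| + |sin(theta)|), so the reach is the budget divided
    by that factor.
    """
    theta = math.radians(bearing_deg)
    return budget / (abs(math.cos(theta)) + abs(math.sin(theta)))


# Reach along a street (0 degrees) versus increasingly diagonal directions.
for bearing in (0, 15, 30, 45):
    reach = straight_line_reach(HALF_MILE, bearing)
    print(f"bearing {bearing:>2} deg: reach = {reach:.3f} miles")
```

Under these assumptions, a half-mile walk covers the full 0.5 miles only along a street axis and about 0.35 miles when the destination lies diagonally across the grid, which suggests how far the plotted upper bound could overstate the walkable radius.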
Data Sharing
Credits:
Photo by Jeffrey Eisen from Unsplash; edited in Adobe Fresco