The aim of this project was to evaluate whether or not geotagged social media data can be useful in providing insight into a region’s “Sense of Place” using Santa Barbara as a case study.
How and where people experience and value coastal and ocean areas can reveal places we deem special. Sense of place can be defined as the connection people feel to their geographic surroundings, including both the natural and built environment. Locations with a strong sense of place often have a strong identity felt by both locals and visitors. Sense of place is important for the well-being of both people and the places we value because we are likely to take better care of places that are most important to us.
Sense of place has been qualitatively studied over time, but more quantitative studies have been lacking due to limited data. Thanks to location-based social media data we now have unprecedented amounts of location and sentiment data, allowing more quantitative exploration of the shared meaning of place.
This project used geotagged Twitter data from Santa Barbara, California to see whether Sense of Place can be measured with social media data. Specifically, I used the data to:
- look at how people use natural spaces
- understand spatial patterns of different user-groups (tourists and locals)
- apply a sentiment analysis to learn how positive or negative nature-based tweets are over time
within Santa Barbara.
Why Santa Barbara?
The easy answer - I live here! Since I know the city and surrounding areas well, I could quickly look at spatial patterns and know what is happening in different locations. The total number of tweets coming from Santa Barbara is also more manageable than it would be for a much larger city. Additionally, this project was done to look specifically at coastal Sense of Place, which requires a location along the coast.
Also, Santa Barbara is known for being a tourist town with beautiful natural and built landscapes (ok - I might be a bit biased here), and it therefore provides a unique opportunity to look at two distinct “user-groups” (tourists and locals).
Not surprisingly, tourists and locals both tweet about nature. Tourists tweet about nature more - nearly 42% of all tourist tweets were nature-based, compared to 30% of local tweets. Spatial patterns reveal that tourists tend to stick to popular tourist sites in town, including the wharf, waterfront, zoo, Santa Barbara Bowl and more. Santa Barbara locals are also found at these sites, just not in as high a proportion. Overall there is substantial overlap in tourist and local patterns within the downtown area, indicating that tourists and locals alike share a fondness for the same areas and things.
This project shows that geotagged Twitter data gives you the opportunity to examine how people move within a region, what they care about in certain areas, and how user-groups align. Since “Sense of Place” is such a difficult concept to quantify, I think the power of an analysis like this lies in comparison to other regions. If we see that Santa Barbara has a higher than normal rate of visitation to natural areas, or more positive sentiment around nature-based tweets than other similar regions, then maybe we can feel more confident in saying that Santa Barbara has a strong nature-based Sense of Place.
Getting twitter data
When I started working on this, I thought that Twitter data would be easily accessible, given the number of projects I had seen that used Twitter data and related R packages. But I quickly learned that this was not the case: Twitter only allows free public access to the past 9 days of tweets. This was a problem since we wanted all tweets from January 1, 2015 through December 31, 2019.
Twitter data was obtained freely through an established partnership between UCSB Library and Crimson Hexagon. Before downloading, the data was queried to meet the following conditions:
- Tweet came from the Santa Barbara area
- Only original tweets (no retweets)
- Date was marked between January 1, 2015 and December 31, 2019
Crimson Hexagon only allows 10,000 randomly selected tweets to be exported, manually, at a time in .xls format. Due to this restriction, data was manually downloaded for every 2 days in order to capture all tweets (😓). This took a significant amount of point and click time as you can imagine!
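For anyone repeating this kind of export, it can help to list the two-day windows up front so that no date range gets skipped. Here is a minimal sketch in Python, assuming inclusive windows over the study period; the function name `export_windows` is my own and is not part of any Crimson Hexagon tooling.

```python
from datetime import date, timedelta


def export_windows(start, end, days=2):
    """Return inclusive (first_day, last_day) windows covering start..end."""
    windows = []
    current = start
    while current <= end:
        # Clip the final window so it never runs past the end date.
        last = min(current + timedelta(days=days - 1), end)
        windows.append((current, last))
        current = last + timedelta(days=1)
    return windows


# Study period from this post; 2-day windows match the manual export.
windows = export_windows(date(2015, 1, 1), date(2019, 12, 31))
print(len(windows), windows[0], windows[-1])
```

Each tuple is one manual export, which makes it easy to tick off windows as they are downloaded.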
Once downloaded, the Twitter data did not contain all desired information, including whether or not the tweet was geotagged, which was vital to this project. To get this information I stepped outside of my R comfort zone and used the Python twarc library. This library can be used to “rehydrate” Twitter data using individual tweet IDs, and then store all associated tweet information as .json files. From here I was able to remove all tweets that did not have a geotag, giving a total of 79,981 tweets.
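The geotag filter can be sketched roughly as follows. This is not the exact script used for the project. It assumes the rehydrated tweets are stored one JSON object per line (to my understanding, the layout written by the `twarc hydrate` command in twarc v1) and that a geotagged tweet is one with a non-null `coordinates` or `place` field in Twitter's v1.1 tweet JSON. The helper names and the sample records are made up for illustration.

```python
import json


def is_geotagged(tweet):
    """Assumption: a tweet counts as geotagged if it carries exact
    coordinates or a tagged place (Twitter v1.1 tweet JSON fields)."""
    return bool(tweet.get("coordinates")) or bool(tweet.get("place"))


def filter_geotagged(lines):
    """Yield parsed tweets that have a geotag, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        if is_geotagged(tweet):
            yield tweet


# Tiny made-up sample standing in for a rehydrated line-delimited file.
sample = [
    '{"id_str": "1", "coordinates": {"type": "Point", "coordinates": [-119.69, 34.41]}, "place": null}',
    '{"id_str": "2", "coordinates": null, "place": null}',
    '{"id_str": "3", "coordinates": null, "place": {"full_name": "Santa Barbara, CA"}}',
]
kept = [t["id_str"] for t in filter_geotagged(sample)]
print(kept)
```

With real data, an open file handle can be passed in place of `sample`, since the function only needs an iterable of lines.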
Some recent good news! Twitter changed its policy for academics looking to use Twitter data in their research 🙌🏻! This is great news for anyone who wants to use historical Twitter data in their research without the funds to purchase access.
The dataset contained 21,811 tweets from tourists and 45,420 tweets from locals (32% and 68%, respectively). There are 12,460 unique tourist users and just 1,893 unique local users.
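The percentages follow directly from these counts, and dividing tweets by unique users shows how differently the two groups tweet: fewer than 2 tweets per tourist on average, versus roughly 24 per local. A quick check in Python, using only the numbers reported above (the variable names are my own):

```python
# Counts reported in this post.
tourist_tweets, local_tweets = 21811, 45420
tourist_users, local_users = 12460, 1893

# Share of all classified tweets coming from each group.
total = tourist_tweets + local_tweets
tourist_share = 100 * tourist_tweets / total
local_share = 100 * local_tweets / total

# Average number of tweets per unique user in each group.
tourist_rate = tourist_tweets / tourist_users
local_rate = local_tweets / local_users

print(f"tourists: {tourist_share:.1f}% of tweets, {tourist_rate:.2f} per user")
print(f"locals:   {local_share:.1f}% of tweets, {local_rate:.2f} per user")
```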
Here is a sample of the tweet data:
Tweets over time
The total number of geotagged tweets is going down over time and, most noticeably, there is a significant drop in tweets at the end of April 2015. It seems this is due to “a change in Twitter’s ‘post Tweet’ user-interface design results in fewer Tweets being geo-tagged” (source). The first 4 months of 2015 have 15,720 tweets, or roughly 19% of all tweets. To reduce skew in the data and remove tweets from those months that may have been geotagged without the user's knowledge, I moved forward with all tweets from May 1, 2015 through the end of 2019.
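The date cutoff can be applied with a short filter like the one below. This is a sketch, not the code used for the project: it assumes each tweet keeps the `created_at` string in Twitter's v1.1 format and that the cutoff is evaluated in UTC, since the time zone used for the cut is not stated here. The sample records are made up.

```python
from datetime import datetime, timezone

# Twitter v1.1 "created_at" format, e.g. "Fri May 01 00:00:05 +0000 2015".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Assumption: the May 1, 2015 cutoff is applied in UTC.
CUTOFF = datetime(2015, 5, 1, tzinfo=timezone.utc)


def on_or_after_cutoff(tweet, cutoff=CUTOFF):
    """Return True if the tweet was created at or after the cutoff."""
    created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
    return created >= cutoff


# Made-up records on either side of the cutoff.
sample = [
    {"id_str": "1", "created_at": "Thu Apr 30 23:59:59 +0000 2015"},
    {"id_str": "2", "created_at": "Fri May 01 00:00:05 +0000 2015"},
    {"id_str": "3", "created_at": "Tue Dec 31 18:30:00 +0000 2019"},
]
kept = [t["id_str"] for t in sample if on_or_after_cutoff(t)]
print(kept)
```

Using local Pacific time instead of UTC would shift a handful of tweets near midnight across the boundary, so the choice is worth stating explicitly in any rerun.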
The majority of tweets align with regional centers of Santa Barbara, Isla Vista (home to UCSB), Santa Ynez Valley and the unincorporated areas of Montecito, Summerland and Carpinteria. As you zoom in on the map, clusters will disaggregate. You can click on blue points to see individual tweet text.