Prev Med. 2014 Feb 7;
OBJECTIVE: The recent availability of "big data" makes it possible to study whether and how sexual risk behaviors are communicated on real-time social networking sites and how these data might inform HIV prevention and detection. This study seeks to establish methods of using real-time social networking data for HIV prevention by assessing 1) whether geolocated conversations about HIV risk behaviors can be extracted from social networking data, 2) the prevalence and content of these conversations, and 3) the feasibility of using HIV risk-related real-time social media conversations as a method to detect HIV outcomes.
METHODS: In 2012, tweets (N=553,186,061) were collected online and filtered to retain those containing HIV risk-related keywords (e.g., sexual behaviors and drug use). These data were merged with AIDSVu data on HIV cases. Negative binomial regressions assessed the county-level relationship between HIV risk-related tweeting and HIV prevalence, controlling for socioeconomic status measures.
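The keyword-filtering step in METHODS can be illustrated with a minimal sketch. It is not the study's pipeline: the `Tweet` record, the function names, and the entries in `RISK_KEYWORDS` are hypothetical placeholders (the actual keyword list is not given here), and the sketch assumes a tweet counts as geolocated when it carries a latitude/longitude pair.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

# Placeholder keywords for illustration only; NOT the study's keyword list.
RISK_KEYWORDS = ("unprotected sex", "needle sharing", "get high")


@dataclass(frozen=True)
class Tweet:
    """Hypothetical tweet record used only for this sketch."""
    text: str
    # (latitude, longitude), or None when the tweet is not geolocated.
    coordinates: Optional[Tuple[float, float]]


def build_pattern(keywords: Iterable[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive pattern matching any keyword as a whole phrase."""
    escaped = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + escaped + r")\b", re.IGNORECASE)


def filter_risk_tweets(tweets: Iterable[Tweet],
                       pattern: "re.Pattern[str]") -> Iterator[Tweet]:
    """Yield tweets that are geolocated and contain at least one keyword."""
    for tweet in tweets:
        if tweet.coordinates is not None and pattern.search(tweet.text):
            yield tweet


sample = [
    Tweet("Going to get high tonight", (34.05, -118.24)),  # keyword + location
    Tweet("Lovely weather today", (40.71, -74.01)),        # location, no keyword
    Tweet("needle sharing is risky", None),                # keyword, no location
]
matches = list(filter_risk_tweets(sample, build_pattern(RISK_KEYWORDS)))
print(len(matches))
```

Matching on whole phrases with word boundaries avoids counting substrings inside unrelated words; a real keyword list would still need manual review for slang and false positives before county-level counts are aggregated.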
RESULTS: Over 9,800 geolocated tweets were extracted and used to create a map displaying the geographical location of HIV-related tweets. There was a significant positive relationship (p<.01) between HIV-related tweets and HIV cases.
CONCLUSION: Results suggest the feasibility of using social networking data as a method for evaluating and detecting HIV risk behaviors and outcomes.