Thesis
Text-based approach to inferring the home city of Twitter users
Washington State University
Master of Science (MS), Washington State University
05/2016
Handle:
https://hdl.handle.net/2376/102176
Abstract
Twitter has been used extensively as a real-time sensor for detecting earthquakes, recent events, and trending topics. Unfortunately, fewer than 1% of tweets are geotagged. This lack of geotagged information limits the coverage and accuracy of geo-enabled applications. Assuming that a user posts the majority of their tweets from their home city, we can mitigate the problem by determining each Twitter user's primary location instead of localizing every tweet. In this thesis, we tackle the problem of predicting a Twitter user's home city using only the content of their tweets. We build language models of cities and select the city from which the user's tweets are most likely to have originated. To alleviate the data sparsity problem inherent to the Twitter domain, we use transformed weight-normalized complement Naive Bayes for estimation. Previous studies have shown that selecting location-indicative word features improves prediction accuracy. Thus, we propose and evaluate two simple unsupervised parametric methods for selecting location-indicative words. Experimental results show that our approach significantly improves upon baseline performance. Additionally, our method of predicting a user's home city is conceptually simpler than the best-known methods, while achieving performance competitive with that of the best single statistical classifiers.
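To make the estimation step named in the abstract concrete, the following is a minimal sketch of transformed weight-normalized complement Naive Bayes in its commonly described form (log term-frequency, inverse document frequency, length normalization, complement-class estimates, weight normalization). It is not the thesis's implementation. The function names, the smoothing value `alpha`, the toy corpus, and the choice to treat each user's tweets as one tokenized document are all assumptions made for illustration, and the selection of location-indicative words is omitted.

```python
import math
from collections import Counter, defaultdict


def transform(docs):
    """Log-scale term counts, weight by IDF, then L2-normalize each document."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {w: math.log(1 + c) * math.log(n_docs / doc_freq[w])
               for w, c in counts.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm > 0 else vec)
    return vectors


def train_twcnb(docs, labels, alpha=1.0):
    """Return per-city word weights estimated from every *other* city's documents."""
    vectors = transform(docs)
    vocab = sorted({w for vec in vectors for w in vec})
    total = defaultdict(float)
    per_city = {city: defaultdict(float) for city in set(labels)}
    for vec, city in zip(vectors, labels):
        for w, v in vec.items():
            total[w] += v
            per_city[city][w] += v
    weights = {}
    for city, own in per_city.items():
        # Complement mass: how much each word occurs outside this city.
        comp = {w: total[w] - own[w] for w in vocab}
        denom = sum(comp.values()) + alpha * len(vocab)
        log_theta = {w: math.log((comp[w] + alpha) / denom) for w in vocab}
        # Weight normalization keeps any one city's weights from dominating.
        scale = sum(abs(v) for v in log_theta.values())
        weights[city] = {w: v / scale for w, v in log_theta.items()}
    return weights


def predict_city(weights, tokens):
    """Pick the city whose complement model fits the tokens worst."""
    counts = Counter(tokens)

    def complement_score(city):
        return sum(n * weights[city].get(w, 0.0) for w, n in counts.items())

    return min(weights, key=complement_score)


# Hypothetical toy corpus: one token list per user, labeled with a home city.
docs = [
    ["rain", "coffee", "seahawks"], ["seahawks", "rain", "ferry"],
    ["redsox", "chowder", "snow"], ["redsox", "harbor", "snow"],
    ["bbq", "sxsw", "heat"], ["bbq", "heat", "tacos"],
]
labels = ["seattle", "seattle", "boston", "boston", "austin", "austin"]
weights = train_twcnb(docs, labels)
print(predict_city(weights, ["seahawks", "rain"]))
```

Estimating each city's parameters from the documents of all other cities is what makes the classifier a complement model: a city with few tweets still gets estimates drawn from a large pool, which is one reason the approach is often suggested for sparse or imbalanced text data.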
Details
- Title
- Text-based approach to inferring the home city of Twitter users
- Creators
- Solongo Munkhjargal
- Contributors
- Scott Andrew Wallace (Chair)
- Xinghui Zhao (Committee Member) - Washington State University, Engineering and Computer Science (VANC), School of
- Sarah Mocas (Committee Member)
- Awarding Institution
- Washington State University
- Academic Unit
- Electrical Engineering and Computer Science, School of
- Theses and Dissertations
- Master of Science (MS), Washington State University
- Publisher
- Washington State University; [Pullman, Washington]
- Number of pages
- 100
- Identifiers
- 99900525081001842
- Language
- English
- Resource Type
- Thesis