A Review of Sentiment Analysis in Twitter Data Using Hadoop


  • L. Jaba Sheela, Panimalar Engineering College, Chennai, Tamil Nadu, India




Keywords: Twitter, Sentiment Analysis, Hadoop, MapReduce, HDFS


Twitter is an online social networking site that holds a large volume of data, which may be structured, semi-structured, or unstructured. This work discusses a method for classifying the sentiment of tweets. To improve scalability and efficiency, the method is proposed for implementation on the Hadoop ecosystem, a widely adopted distributed processing platform that uses the MapReduce parallel processing paradigm. Finally, extensive experiments will be conducted on real-world data sets, with the expectation of achieving accuracy comparable to or greater than that of techniques proposed in the literature.
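
As a rough illustration of how tweet sentiment classification can be phrased in MapReduce terms, the sketch below runs a map step and a reduce step in a single Python process. It is not the method evaluated in this work: the toy lexicon `LEXICON` and the function names `map_tweet`, `reduce_counts`, and `run_job` are hypothetical, and a simple word-score rule stands in for whatever classifier a real Hadoop job would use.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only).
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def map_tweet(tweet):
    """Map step: emit one (sentiment_label, 1) pair for a single tweet."""
    score = sum(LEXICON.get(word, 0) for word in tweet.lower().split())
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    yield label, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each sentiment label."""
    totals = defaultdict(int)
    for label, count in pairs:
        totals[label] += count
    return dict(totals)


def run_job(tweets):
    """Run map then reduce in-process, standing in for a distributed job."""
    pairs = (pair for tweet in tweets for pair in map_tweet(tweet))
    return reduce_counts(pairs)


if __name__ == "__main__":
    sample = ["I love this phone", "awful service today", "just landed"]
    print(run_job(sample))
```

On a real cluster the map calls would run in parallel over blocks of tweets stored in HDFS, and the framework would group the emitted pairs by label before the reduce step; here both stages are plain function calls so the data flow is easy to follow.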






How to Cite

Jaba Sheela, L. (2015). A Review of Sentiment Analysis in Twitter Data Using Hadoop. Asian Review of Social Sciences, 4(2), 31–37. https://doi.org/10.51983/arss-2015.4.2.2770