CSL: A Combined Spanish Lexicon – Resource for Polarity Classification and Sentiment Analysis

Opinion mining and sentiment analysis of texts from social networks such as Twitter have gained great importance during the last decade. Quality lexicons for the sentiment analysis task are readily available for languages such as English; however, this is not the case for Spanish. For this reason, we propose CSL, a Combined Spanish Lexicon approach for sentiment analysis that uses an ensemble of six Spanish lexicons and a weighted bag-of-words strategy. To build CSL, we used 68,019 tweets previously classified by researchers at the Spanish Society of Natural Language Processing (SEPLN), obtaining a precision of 62.05 and a recall of 60.75 on the validation set, showing improvements in both measures.

Additionally, we compare the results of CSL with those of a well-known commercial software package for sentiment analysis in Spanish, finding an improvement of 10 points in precision and 15 points in recall.
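
To illustrate the general idea of combining several lexicons with a weighted bag-of-words score, the following is a minimal sketch in Python. It is not the CSL implementation: the three toy lexicons, their polarity values, the per-lexicon weights, the threshold, and the function names (`tokenize`, `combined_score`, `classify`) are all hypothetical and chosen only for illustration. The abstract states that CSL uses six lexicons; how they are weighted is not specified here.

```python
import re
from collections import Counter

# Hypothetical toy lexicons (word -> polarity in [-1, 1]).
# CSL combines six Spanish lexicons; three tiny ones are assumed here.
LEXICONS = [
    {"bueno": 1.0, "malo": -1.0, "feliz": 1.0},
    {"bueno": 0.5, "terrible": -1.0},
    {"feliz": 0.8, "malo": -0.6, "triste": -0.9},
]

# Hypothetical per-lexicon weights (assumed, not taken from the paper).
LEXICON_WEIGHTS = [0.5, 0.2, 0.3]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def combined_score(text):
    """Sum weight * count * polarity over all lexicons and all tokens."""
    counts = Counter(tokenize(text))  # bag of words: token -> frequency
    score = 0.0
    for lexicon, weight in zip(LEXICONS, LEXICON_WEIGHTS):
        for word, n in counts.items():
            # Words missing from a lexicon contribute nothing.
            score += weight * n * lexicon.get(word, 0.0)
    return score


def classify(text, threshold=0.1):
    """Map the combined score to a polarity label (threshold is assumed)."""
    score = combined_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["Qué día tan bueno y feliz",
                  "Un servicio malo y terrible",
                  "Hoy es lunes"]:
        print(tweet, "->", classify(tweet))
```

In a setting like the one described, the lexicon weights would presumably be tuned on labeled tweets rather than fixed by hand as they are in this sketch.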

