Abstract:
In this article, we explore the task of sentiment analysis for Ukrainian and Russian news and analyze different approaches and linguistic resources for this task. We developed a corpus of Ukrainian and Russian news and annotated each text with one of three categories: positive, negative, or neutral. Each text was labeled by at least three independent annotators via a web interface, and only the texts that all three annotators assigned to the same category were used in further experiments. We experimented with automatic classification of these texts using four machine learning methods: Naïve Bayes, DMNBtext, NB Multinomial, and SVM. Feature selection methods were used to identify the best feature set in each case. Our experimental results show an average F1-score of 0.82 for news in Ukrainian and Russian.
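
The corpus filtering and the evaluation metric described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input layout (a mapping from text id to a list of annotator labels), and the reading of "average F1-score" as an unweighted (macro) average over the three categories are all assumptions made for illustration.

```python
# Hypothetical sketch: names and data layout are assumptions, not from the paper.
LABELS = ("positive", "negative", "neutral")


def unanimous_subset(annotations, min_annotators=3):
    """Keep only texts whose annotators all chose the same label.

    annotations: dict mapping text id -> list of labels, one per annotator.
    Texts with fewer than `min_annotators` labels are dropped.
    Returns a dict mapping text id -> the agreed label.
    """
    agreed = {}
    for text_id, labels in annotations.items():
        if len(labels) >= min_annotators and len(set(labels)) == 1:
            agreed[text_id] = labels[0]
    return agreed


def f1_for_label(gold, predicted, label):
    """F1 for a single category, computed from true/false positives and negatives."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this label
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, predicted, labels=LABELS):
    """Unweighted mean of per-category F1 (assumed meaning of 'average F1')."""
    return sum(f1_for_label(gold, predicted, l) for l in labels) / len(labels)


if __name__ == "__main__":
    annotations = {
        "a": ["positive", "positive", "positive"],
        "b": ["positive", "negative", "neutral"],
        "c": ["neutral", "neutral"],
    }
    print(unanimous_subset(annotations))
    gold = ["positive", "negative", "neutral", "neutral"]
    pred = ["positive", "negative", "neutral", "negative"]
    print(round(macro_f1(gold, pred), 4))
```

In this sketch, text "b" is discarded because its annotators disagree, and text "c" is discarded because it has fewer than three labels.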