Text sentiment analysis methods (1) - Lexicon-based sentiment analysis methods
Introduction to text sentiment analysis
Given an input text, a system automatically reports what sentiment orientation the
text has, positive or negative. This is text sentiment analysis, also known as
opinion mining. It refers to the process of collecting, processing, analyzing,
summarizing and reasoning about subjective text that carries emotion, and it draws
on research fields such as artificial intelligence, machine learning, data mining
and natural language processing.
Text sentiment analysis is an important branch of natural language processing. It is
widely used in public opinion analysis, content recommendation and similar
applications, and it has been an active research topic in recent years. According to
the method used, approaches are classified into sentiment analysis based on
sentiment lexicons, sentiment analysis based on traditional machine learning, and
sentiment analysis based on deep learning.
Introduction to lexicon-based sentiment analysis methods
The lexicon-based method determines sentiment polarity at different levels of
granularity, using the polarity of the sentiment words provided by one or more
sentiment lexicons.
First, the input text is pre-processed (denoising, removing invalid characters,
etc.) and then segmented into words. Next, the words of different types and degrees
from the sentiment lexicons are fed into the model for training. Finally, the
sentiment type is output according to the sentiment judgment rules.
Most existing sentiment lexicons are constructed manually. Depending on the
granularity of the analysis, existing sentiment analysis tasks can be classified
into word, phrase, attribute, sentence, document and other levels.
Manual construction of sentiment lexicons is costly. It requires reading a large
amount of relevant material and existing lexicons, collecting the words that carry a
sentiment tendency, and labeling them with different levels of sentiment polarity
and intensity.
Advantages and disadvantages:
The sentiment lexicon-based approach can accurately reflect the unstructured
features of the text and is easy to analyze and understand. When the lexicon covers
the sentiment words in the text well and labels them accurately, the classification
results are correspondingly more accurate. However, the method still has some
defects.
The lexicon-based classification method depends mainly on how the sentiment lexicon
is constructed. Because the Internet is developing rapidly and information is
updated quickly, new words keep appearing online. Existing lexicons recognize these
new words poorly and need to be expanded continuously to keep up.
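A small sketch makes the new-word problem concrete. The lexicon, the example review
and the helper names below are hypothetical; the point is only that a word missing
from the lexicon contributes nothing to the score until someone adds it.

```python
def polarity(tokens, lexicon):
    """Sum the lexicon scores of the tokens; unknown words contribute 0."""
    return sum(lexicon.get(t, 0) for t in tokens)


def unknown_words(tokens, lexicon):
    """Tokens without a lexicon entry. Most are neutral, but new sentiment words hide here."""
    return [t for t in tokens if t not in lexicon]


lexicon = {"good": 1, "bad": -1}          # hypothetical toy lexicon
review = "this phone is lit".split()      # "lit" is slang for "excellent"

score_before = polarity(review, lexicon)  # the slang word is silently ignored
candidates = unknown_words(review, lexicon)

expanded = dict(lexicon, lit=1)           # hypothetical manual lexicon update
score_after = polarity(review, expanded)

print(score_before, candidates, score_after)
```

Listing the unknown tokens is one cheap way to find candidates for the next lexicon
update, although each candidate still has to be reviewed and labeled by hand.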
The same sentiment word may express different meanings at different times, in
different languages or in different domains, so lexicon-based methods are not very
effective in cross-domain and cross-language settings.
In addition, classification with sentiment lexicons often does not consider the
semantic relationships between a word and its surrounding context.
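Negation is the simplest case of context being ignored. In the sketch below, a plain
word-by-word scorer rates "not good" as positive. The second function applies a
negation-flipping heuristic, which is a common workaround and not something taken
from the method described here; the lexicon, the negator list and both function
names are assumptions for illustration.

```python
LEXICON = {"good": 1, "bad": -1}   # hypothetical toy lexicon
NEGATORS = {"not", "never", "no"}  # hypothetical negation words


def naive_score(tokens):
    """Context-free scoring: every word is looked up on its own."""
    return sum(LEXICON.get(t, 0) for t in tokens)


def negation_aware_score(tokens):
    """Flip the polarity of the first sentiment word that follows a negator."""
    total, negate = 0, False
    for t in tokens:
        if t in NEGATORS:
            negate = True  # stays set until the next sentiment word is seen
            continue
        value = LEXICON.get(t, 0)
        if value != 0:
            total += -value if negate else value
            negate = False
    return total


tokens = "this is not good".split()
print(naive_score(tokens), negation_aware_score(tokens))
```

The heuristic only covers one narrow pattern. Sarcasm, comparisons and
long-distance dependencies remain out of reach for rules of this kind.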
More research on sentiment lexicon-based methods is therefore still needed.