ML-Ask: Open Source Affect Analysis Software for Textual Input in Japanese

We present ML-Ask, the first Open Source Affect Analysis system for textual input in Japanese. ML-Ask analyses the contents of an input (e.g., a sentence) and annotates it with information regarding the contained general emotive expressions, specific emotional words, valence-activation dimensions of the overall expressed affect, and the particular emotion types expressed, together with their respective expressions. ML-Ask also incorporates the Contextual Valence Shifters model for handling negation in sentences, to deal with grammatically expressible shifts in the conveyed valence. The system, designed to work mainly under Linux and macOS, can be used for research on, or for applying the techniques of, Affect Analysis within the framework of the Japanese language. It can also be used as an experimental baseline for specific research in Affect Analysis, and as a practical tool for annotation of written contents.


Introduction
Automatic analysis of user behavior and intentions has gained increasing interest in recent years. A large number of applications have been proposed in various sub-fields, including robotics, artificial intelligence (AI) and natural language processing (NLP). One of the most important tasks in such research and its applications is to properly recognize the current state of the user. Depending on the application, the focus could be on different states of the user, such as user engagement in conversation (e.g., with a dialog agent [1]), user intention (e.g., to buy a certain product, or choose a specific migration route [2]), user attitude (e.g., toward a specific object, or the agent itself [3]), or user emotions (e.g., to choose a different conversation strategy if the user is sad or happy, etc. [4]). In many of those tasks, techniques for Affect Analysis have proved to be effective. Affect Analysis refers to recognizing user affective states (emotions, moods, attitudes, etc.).
Several affect analysis systems have been proposed to date [7,19,9,10,14,16,41,21,28]. However, none of them has yet been released as Open Source software. This paper presents ML-Ask, the first Open Source system for text-based affect analysis of input in Japanese. The system has been developed for several years and has matured enough to be released to the public. It has already proved useful in multiple tasks and can be used for Affect Analysis in various research, as an experimental baseline for specific research in affect analysis, and as a practical tool for annotation of written contents (such as user-generated contents on the Internet).

Background

Affect Analysis: Problem Definition
Text-based Affect Analysis (AA) has been defined as a field focused on developing natural language processing (NLP) techniques for estimating the emotive aspect of text [5]. For example, Elliott [6] proposed a keyword-based Affect Analysis system applying an affect lexicon (including words like "happy", or "sad") with modifiers (words such as "extremely", or "somewhat"). Liu et al. [7] presented a model of text-based affect sensing based on OMCS (Open-Mind Common Sense), a generic common sense database, with an application to e-mail interpretation. Alm et al. [8] proposed a machine learning method for Affect Analysis of fairy tales. Aman and Szpakowicz also applied machine learning techniques to analyze emotions expressed on blogs [19].
There have also been several attempts to achieve this goal for the Japanese language. For example, Tsuchiya et al. [9] tried to estimate the emotive aspect of utterances with the use of an association mechanism. On the other hand, Tokuhisa et al. [10] as well as Shi et al. [11] used a large number of examples gathered from the Web to estimate user emotions.
Unfortunately, until now there have been no Open Source Affect Analysis systems. Although there exist several online demos, such as "Sentiment Analysis with Python NLTK Text Classification", 1 or "Sentiment Analysis and Text Analytics Demo" by Lexalytics, 2 these refer to Sentiment Analysis, not Affect Analysis. In Sentiment Analysis the focus is usually put on determining emotion valence, that is, whether the input (sentence, paragraph, product review, etc.) is of positive or negative valence. Affect Analysis is a task of much broader scope, focusing not only on the polarity of the input, but also on the particular emotion classes that are expressed by the input (joy, anger, fear, etc.).
To develop our software for Affect Analysis, we first needed to understand the phenomenon of how emotions are expressed in language. This phenomenon can be explained with the notion of the emotive function of language.

The Emotive Function of Language
Linguistic means used in conversation to inform interlocutors of emotional states are described by the emotive function of language (Jakobson, 1960) [39]. Ptaszynski (2006) [46] distinguished two kinds of its realizations in Japanese. The first are emotive elements (or emotemes), which indicate that emotions have been conveyed, but do not detail their specificity. This group is linguistically realized by interjections, exclamations, mimetic expressions, or vulgar language. The second are emotive expressions: parts of speech like nouns, verbs and adjectives, as well as phrases or metaphors describing affective states. Nakamura (1993) [13] classified emotive expressions in Japanese into 10 emotion types said to be the most appropriate for the Japanese language. They can be translated as: joy, anger, gloom/sadness, fear, shame/shyness, fondness, dislike, excitement, relief and surprise (see Table 2).
Examples of sentences containing emotemes and/or emotive expressions are shown in Table 1. Examples (1) and (2) represent emotive sentences. (1) is an exclamative sentence, which is determined by the use of the exclamative constructions nante (how/such a) and nanda! (exclamative sentence ending), and contains an emotive expression kimochi ii (feeling good/pleasant). (2) is also an exclamative. It is easily recognizable by the use of an interjection iyaa, an adjective in the function of interjection sugoi (great), and by the emphatic particle -ne. However, it does not contain any emotive expressions and therefore it is ambiguous whether the emotions conveyed by the speaker are positive or negative (in other words, such a sentence can be used in both positive and negative contexts). Examples (3) and (4) show non-emotive sentences. Example (3), although containing a verb describing an emotional state, aishiteiru (to love), is a generic statement and, if not put in a specific context, does not convey any emotions. Finally, (4) is a simple declarative sentence without any emotive value.

Emotemes
Into the group of emotemes, structurally visualizable as textual representations of speech, Ptaszynski (2006) includes the following lexical and syntactical structures.
Casual Speech. Casual speech is not an emoteme per se; however, many structures of casual speech are used when expressing emotions. Examples of casual language use include modifications of adjective and verb endings -ai to -ee, as in the example: Ha ga itee! (My tooth hurts!), or abbreviations of the form -noda into -nda, as in the example: Nani yattenda yo!? (What the hell are you doing!?).
Gitaigo. Baba (2003) [37] distinguishes gitaigo (mimetic expressions) as emotemes specific to the Japanese language. Not all mimetics are emotive; rather, they can be classified into emotive mimetics (describing one's emotions), and sensation/state mimetics (describing manner and appearance). Examples of emotive gitaigo are: iraira (be/feel irritated), or hiyahiya (be in fear, nervous), as in the sentence: Juugeki demo sareru n janai ka to omotte, hiyahiya shita ze. (I thought he was gonna shoot me; I was petrified).
Emotive markers. This group contains punctuation marks used as textual representations of emotive intonation features. The most obvious example is the exclamation mark "!". In Japanese, marks like the ellipsis "...", or prolongation marks like "-" or "∼", are also used to inform interlocutors that emotions have been conveyed (see Table 1).

Hypocoristics (endearments) in Japanese express emotions and attitudes towards an object by the use of diminutive forms of a name or status of the object (Ai [girl's name] vs Ai-chan [/endearment/]; o-nee-san [older sister] vs o-nee-chan [sis]). Example: Saikin Oo-chan to Mit-chan ga bokura to karamu youni nattekita!! (Oo-chan and Mit-chan have been palling around with us lately!!).
Vulgarities. The use of vulgarities usually accompanies the expression of emotions. However, despite a general belief that vulgarities express only negative meaning, Ptaszynski (2006) noticed that they can also be used as expressions of strong positive feelings, and Sjöbergh (2006) [48] showed that they can also be funny when used in jokes, as in the example: Mono wa mono dakedo, fuete komarimasu mono wa nanda-? Bakamono. (A thing (mono) is a thing, but what kind of thing is bothersome if they increase? Idiots (bakamono)).
Emoticons. Emoticons have been used in online communication as generally perceived "emotion icons" (icons, or annotation markers, which inform readers of the writer's emotional state) for many years. Their numbers have developed depending on the language of use, letter input system, the kind of community they are used in, etc. Popular emoticons include such examples as ":-)" (smiling face), or ":-D" (laughing face). These are, however, not used by Japanese users. Emoticons which are popular in Japanese communities, in contrast to the Western ones, are usually unrotated and present faces, gestures, or postures from a point of view easily comprehensible to the reader. Some examples are: "(^o^)" (laughing face), "(^_^)" (smiling face), and "(ToT)" (crying face). They arose in Japan, where they were called kaomoji, in the 1980s and since then have been developed in a number of online communities.

ML-Ask: Overview of the Software
Based on the linguistic approach towards emotions and the above-mentioned definition, we constructed the ML-Ask (eMotive eLement and Expression Analysis system) software for automatic analysis and annotation of emotive information in written digital contents. The emoteme databases for the system were gathered manually from linguistic literature and grouped into five types (code, reference research and number of gathered items in square, round and curly brackets, respectively): 1.
[EMOT] Emoticons. For the detection and extraction of emoticons, ML-Ask applies part of the algorithm of CAO, a system for emotiCon Analysis and decOding of affective information developed by Ptaszynski et al. [27], which uses a refined set of the 149 symbols that statistically appear most frequently in emoticons.
These databases were used as a core for ML-Ask.
A textual input utterance/sentence is thus matched against the emoteme databases and the emotive information is annotated. The software first determines whether an utterance is emotive (i.e., whether at least one emotive feature appears in it), extracts all emotive features from the sentence and describes the structure of the emotive utterance. The number of emotemes also expresses the emotive value, or the intensity of the emotional load of the input. This is the software's main procedure for emotive information annotation of text collections. Next, in all utterances determined as emotive, the system searches for emotive expressions from the databases. The conceptual flow of the software procedures is shown in Figure 1.
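The two-step procedure described above can be illustrated with a minimal Python sketch. It is not the ML-Ask implementation (which is written in Perl): the tiny romanized lexicons, the emoticon pattern and the function name `annotate` are assumptions made purely for illustration, and simple substring matching stands in for the actual database lookup.

```python
import re

# Hypothetical miniature databases in romanized form; the real ML-Ask
# lexicons are far larger and operate on Japanese script.
EMOTEMES = {
    "INT": ["aa", "nante"],  # interjections
    "EXC": ["!"],            # exclamation marks
}
# Crude kaomoji pattern (an assumption, not the CAO algorithm).
EMOTICON = re.compile(r"\([^()\s]{1,8}\)/?")
# Emotive expressions grouped by emotion type.
EXPRESSIONS = {"joy": ["kimochi ii"]}


def annotate(sentence):
    """Step 1: find emotemes; step 2: look up emotion types if emotive."""
    text = sentence.lower()
    found = {tag: [e for e in items if e in text]
             for tag, items in EMOTEMES.items()}
    found["EMO"] = EMOTICON.findall(sentence)
    found = {tag: hits for tag, hits in found.items() if hits}

    # The emotive value is the number of emotemes found.
    emotive_value = sum(len(hits) for hits in found.values())
    result = {
        "emotive": emotive_value > 0,
        "emotive_value": emotive_value,
        "emotemes": found,
        "emotions": {},
    }
    # Emotive expressions are searched only in emotive sentences.
    if result["emotive"]:
        for emotion, expressions in EXPRESSIONS.items():
            hits = [x for x in expressions if x in text]
            if hits:
                result["emotions"][emotion] = hits
    return result


if __name__ == "__main__":
    print(annotate("Aaa-, kyō wa nante kimochi ii hi nanda !(^o^)/"))
```

In this sketch a sentence without any emoteme is never checked for emotion types, which mirrors the order of the procedure described above.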

Contextual Valence Shifters
To improve the system performance we also implemented Contextual Valence Shifters (CVS). The idea of CVS was first proposed by Polanyi and Zaenen [17,45]. They distinguished two kinds of CVS: negations and intensifiers. The group of negations contains words and phrases like "not", "never", and "not quite", which change the valence polarity of the semantic orientation of the evaluative word they are attached to. The group of intensifiers contains words like "very", "very much", and "deeply", which intensify the semantic orientation of an evaluative word. ML-Ask fully incorporates the negation type of CVS with 108 syntactic negation structures. Examples of CVS negations in Japanese are structures such as: amari -nai (not quite-), -to wa ienai (cannot say it is-), or -te wa ikenai (cannot [verb]-). As for intensifiers, although ML-Ask does not include them as a separate database, most Japanese intensifiers are included in the emoteme database. The system also calculates the emotive value, or emotional intensity, of a sentence on the basis of the number of emotemes in the sentence; thus the intensification is expressed through the emotive value. Two examples of valence shifting using Contextual Valence Shifters are shown in Figure 2.
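As a rough illustration of the negation type of CVS, the following Python sketch flips a valence label when one of the three romanized negation structures quoted above is found. The regular expressions, the `POS`/`NEG` labels and the function name `shift_valence` are assumptions for this example; ML-Ask itself uses 108 structures and attaches them to specific expressions, whereas this simplification treats the whole sentence as negated.

```python
import re

# Romanized versions of the three negation structures quoted in the text.
# These patterns are illustrative assumptions, not ML-Ask's actual rules.
NEGATION_PATTERNS = [
    re.compile(r"amari\s+.*nai\b"),   # amari -nai (not quite-)
    re.compile(r"to wa ienai\b"),     # -to wa ienai (cannot say it is-)
    re.compile(r"te wa ikenai\b"),    # -te wa ikenai (cannot [verb]-)
]

FLIP = {"POS": "NEG", "NEG": "POS"}


def shift_valence(sentence, valence):
    """Return the valence after applying negation-type valence shifting."""
    text = sentence.lower()
    negated = any(p.search(text) for p in NEGATION_PATTERNS)
    return FLIP.get(valence, valence) if negated else valence


if __name__ == "__main__":
    print(shift_valence("Amari ureshikunai", "POS"))
    print(shift_valence("Ureshii", "POS"))
```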

Russell's 2-dimensional Model of Affect
Finally, the last distinctive feature of ML-Ask is its implementation of Russell's two-dimensional affect space [18]. It assumes that all emotions can be represented in two dimensions: the emotion's valence or polarity (positive/negative) and activation (activated/deactivated).
An example of a negative-activated emotion is "anger"; a positive-deactivated emotion is, e.g., "relief". The mapping of Nakamura's emotion types onto Russell's two dimensions proposed by Ptaszynski et al. [21] has been shown to be reliable in several studies [21,22,27]. The mapping is shown in Figure 3. An example of ML-Ask output is shown in Figure 4.
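A lookup of this kind can be sketched in Python as a small table. Only the three placements that are stated in the text of this paper are included (anger, relief, and joy as reported in the demo output); the remaining seven emotion types are deliberately left out because their positions are given only in Figure 3. The label strings and the function name `to_russell` are assumptions for illustration.

```python
# Partial mapping of Nakamura's emotion types onto Russell's 2D space.
# Only placements stated in the text are listed; the rest are in Figure 3.
RUSSELL_MAP = {
    "anger": ("NEG", "ACT"),          # negative-activated
    "relief": ("POS", "PAS"),         # positive-deactivated
    "joy": ("POS", "ACT_or_PAS"),     # as shown in the demo output
}


def to_russell(emotion_type):
    """Return (valence, activation) or None if the type is not listed."""
    return RUSSELL_MAP.get(emotion_type)


if __name__ == "__main__":
    for emotion in ("anger", "relief", "joy", "surprise"):
        print(emotion, to_russell(emotion))
```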

Applications
ML-Ask has been applied to different tasks. Most commonly, the system was used to analyze user input in human-agent interaction [1,3,4,15,23,24,25,26]. In particular, the analysis of user input was utilized in decision making support about which conversation strategy to choose (normal conversation or joke) [1], and in an automatic evaluation method for dialog agents [3]. ML-Ask was also used to help determine features specific to harmful entries in a task of cyberbullying detection [31]. ML-Ask supported with CAO was also applied in the annotation of a large-scale corpus (YACIS, an Ameba blog corpus containing 5.6 billion words), and, together with a supporting Web-mining procedure, in the creation of a robust emotion object database [30]. In recent research, ML-Ask has been used to detect emotions in a mobile environment to help develop an accurate and user-adaptive emoticon recommendation system [32]. It has also been applied as a supporting procedure in an automated ethical reasoning system [36].

Implementation and architecture
ML-Ask was written in the Perl programming language. 3 It works under Linux and macOS environments. Basic functionalities of ML-Ask can be launched in a Windows environment as well, however, due to the differences in how some dependencies (especially MeCab: Yet Another Part-of-Speech and Morphological Analyzer) 4
In case of problems with the installation of the RE2 engine, it is possible to run ML-Ask by deleting or commenting out, in the main system file (mlask[version_number].pl), the line responsible for calling the engine, namely:
use re::engine::RE2 -max_mem => 8<<23; #64MiB
To comment out a line in Perl, put a hash symbol (#) at the beginning of the line, like below:
# use re::engine::RE2 -max_mem => 8<<23; #64MiB
Commenting out the RE2 engine will not influence the results, only the processing speed.
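For users who prefer to apply this change automatically, a small helper along the following lines could be used. It is only a sketch: the function name `comment_out_re2` is hypothetical, and the helper works on the text of the script, so reading and writing the actual file is left to the caller.

```python
def comment_out_re2(source):
    """Prefix the RE2 'use' line with '#', leaving other lines untouched."""
    out = []
    for line in source.splitlines(keepends=True):
        # Lines that are already commented out start with '#' and are skipped.
        if line.lstrip().startswith("use re::engine::RE2"):
            line = "# " + line
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    perl = "use strict;\nuse re::engine::RE2 -max_mem => 8<<23; #64MiB\n"
    print(comment_out_re2(perl), end="")
```

Running the helper twice has no further effect, since a line that already begins with a hash symbol no longer matches.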
The system was designed to work in the command line, with no additional GUI. This decision was made to reduce ML-Ask processing time to a minimum, so that the software can process large files faster and more memory-efficiently, and thus be applicable in Big Data research.
The software can be launched in three modes: (1) Demo mode; (2) File processing mode; and (3) File processing mode with output to separate file.Below we describe those three modes.
Demo mode. In demo mode the user launches the software from the command line and observes the output appearing immediately on the screen (terminal). This mode is launched by typing perl mlask[version_number].pl in the command line and pressing the [Enter] key, like below:
$ perl mlask[version_number].pl
This command initializes the software and the user can input contents of their choice for further processing. An actual example of this input method and the following output is shown in Figure 5. The output interface is explained in detail in Figure 6. The sentence in the example is pronounced:
Aaa-, kyō wa nante kimochi ii hi nanda! (^o^)/
which translates into: "Aah..., what pleasant weather it is today! (^o^)/"
The input sentence is considered emotive, and the estimated emotive value (emo_val) is calculated as 4. The emotemes found include interjections (INT), namely nante ("what a") and aa ("Aah"), an exclamation mark (EXC) "!", and an emoticon (EMO) (^o^)/. One emotion type is found in the sentence, namely joy, or yorokobi (YOR), with its representative emotive expression kimochi ii ("pleasant"). This emotion type is marked on Russell's 2D affect space as being positive (POS) and either active or passive (ACT_or_PAS).
File processing mode with STDOUT. In this mode the user, apart from typing the initial command, specifies in the command line the file they choose to process and presses the [Enter] key, like in the following:

$ perl mlask[version_number].pl input_file.txt
The input file may contain multiple entries/sentences. ML-Ask processes and outputs each line separately. The output (all annotated sentences) appears on the screen (terminal). Since ML-Ask does not perform sentence segmentation, if one line contains multiple sentences, they will all be annotated as one document. This way the users can decide for themselves on what level they want to analyze their data (document level, sentence level, chunk level, or phrase level).
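Because segmentation is left to the user, an input file for sentence-level analysis has to be prepared beforehand. The following Python sketch shows one simple way to put each sentence on its own line; the set of sentence-final characters and the function name `one_sentence_per_line` are assumptions, and a naive rule like this will, for instance, detach an emoticon that follows an exclamation mark.

```python
import re

# A run of non-terminator characters followed by any sentence-final marks.
# The choice of terminators is an assumption; adjust it to the data at hand.
SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")


def one_sentence_per_line(document):
    """Split a document into sentences, one list item per sentence."""
    pieces = (m.strip() for m in SENTENCE.findall(document))
    return [p for p in pieces if p]


if __name__ == "__main__":
    text = "今日は晴れ。明日は雨！！"
    print("\n".join(one_sentence_per_line(text)))
```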
File processing mode with output to file. Finally, the user can send the output to an output file by typing the following in the command line: This mode is useful and efficient for further processing and analysis, especially of very large files.
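When ML-Ask is called from another program rather than typed by hand, the same effect can be obtained by capturing the standard output of the file processing mode. The sketch below assumes that Perl is on the PATH and that the annotations are written to standard output, as described for the STDOUT mode; the script path is passed in by the caller because the version number in the file name is not fixed, and both function names are hypothetical.

```python
import subprocess


def build_command(script_path, input_path):
    """Command line for the file processing mode described above."""
    return ["perl", script_path, input_path]


def annotate_to_file(script_path, input_path, output_path):
    """Run ML-Ask on input_path and store its standard output in a file."""
    with open(output_path, "w", encoding="utf-8") as out:
        # check=True raises CalledProcessError if Perl exits with an error.
        subprocess.run(build_command(script_path, input_path),
                       stdout=out, check=True)


if __name__ == "__main__":
    # "mlask.pl" is a placeholder for the actual versioned script name.
    print(" ".join(build_command("mlask.pl", "input_file.txt")))
```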

Software variant: ML-Ask-simple
ML-Ask was originally designed to analyze mostly conversation-like contents. In the first step of ML-Ask analysis the system specifies whether a sentence is emotive or non-emotive. Analysis of particular emotion types is performed only on emotive sentences. A sentence is emotive if it contains at least one emoteme, or marker of emotive context. Emotemes are typical of conversations (in particular spontaneous conversations). Generally perceived narratives (blogs, fairytales, etc., often used in the evaluation of affect analysis systems) contain at least two main types of sentences: 1. descriptive sentences for introduction of the main storyline, and 2. dialogs between characters of the narrative.
ML-Ask can be expected to deal with the second type of sentence. However, since emotemes rarely appear in descriptive sentences, the system would not proceed to the recognition of particular emotion types for such sentences. Therefore, to allow ML-Ask to deal with descriptive sentences as well, we compiled a version of the system which excludes emotemes from the analysis and focuses primarily on the analysis of emotion types. However, we retained the analysis of CVS and Russell's emotion space. Since in this version of the system we simplified the analysis, we called it ML-Ask-simple. This variant of the software is launched in the same way as the original ML-Ask version.

Quality control
Quality control for this software has been done on two levels. One is the evaluation of the software as a concept, or a system. This is performed from a scientific viewpoint and the results are presented in scientific publications. It answers the question of whether the system works and how close it comes to fulfilling its goal, namely, how well ML-Ask as a system detects and annotates affective states expressed in sentences. This part has been implemented from the start and is described in the first following subsection, "Evaluations." The second dimension of quality control refers to preparing ML-Ask as working software. This includes regular revisions and improvements of the software code, performance improvements and, when required, rewriting significantly large portions of code. Releasing the software as Open Source and preparing the Open Source License also come within this quality control spectrum, as do testing, benchmarking, and releasing every new update. These actions are described in the second subsection, "Release Related Actions."

Evaluations
ML-Ask has been evaluated a number of times on different datasets and frameworks. In the first evaluations, Ptaszynski et al. [12,20,21] focused on evaluating the system on separate sentences. For example, in [20], there were 90 sentences (45 emotive and 45 non-emotive) annotated by the authors of the sentences (first-person standpoint annotations). On this dataset ML-Ask achieved a balanced F-score of 83% for determining whether a sentence is emotive, 63% of the human level of unanimity score for determining emotive value, and a balanced F-score of 45% for detecting particular emotion types. In [12] Ptaszynski et al. added annotations of third-party annotators and performed an additional evaluation from the third-person standpoint. The evaluation showed that ML-Ask achieves better performance when supported by an additional Web-mining procedure (not included in the Open Source version) for extracting emotive associations from the Internet. This evaluation also showed that people are not ideal at determining the emotions of other people. Additionally, in [21] Ptaszynski et al. performed an annotation of the Japanese BBS forum 2channel with the use of ML-Ask. The dataset consisted of 1,840 sentences. The evaluation showed that there were two (out of ten) dominant emotion types ("dislike" and "excitement") which were often expressed by sophisticated emoticons (multi-line ASCII-Art type), which the system could not detect. Without these two emotion types the system extracted other emotive tokens similarly to human annotators (90% of agreements).
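For reference, the balanced F-score quoted in these evaluations is, under its standard definition, the harmonic mean of precision and recall. The symbols P, R, TP, FP and FN below are notation introduced here for illustration and are not taken from the cited evaluations.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 P R}{P + R}
```

Here TP, FP and FN denote the numbers of true positives, false positives and false negatives for the class being detected (e.g., "emotive").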
After the above initial evaluations Ptaszynski et al. continued evaluation of ML-Ask on different datasets.
The system was most often evaluated on conversations, both between humans [1] and between human users and conversational agents [4,3,15,23,26]. In [1] Dybala et al. showed that ML-Ask gives answers comparable to those of human annotators when annotating conversations between people of different age and status (in particular young students vs. middle-aged businessmen). In other evaluations Ptaszynski et al. showed that the system performs comparably to humans when annotating human-agent dialogs. This was evaluated using only ML-Ask [3,4], and ML-Ask compared with the Web-mining procedure [23,26]. Recently Ptaszynski et al. also added the emoticon analysis system CAO to this evaluation [35].
Apart from the above evaluations, ML-Ask was also evaluated on blog contents. Firstly, in [25], using Yahoo! blogs (blogs.yahoo.co.jp) instead of the whole Web contents showed increased performance of the Web-mining procedure. Secondly, ML-Ask (alone and supported with the emoticon analysis system CAO) was evaluated on YACIS, a corpus of blogs extracted from Ameba blogs (ameblo.jp). Finally, ML-Ask-simple was also recently evaluated using fairytales [33]. The evaluation showed an accuracy of about 60.6%, which indicates that the system performs better on conversation-like contents than on contents containing descriptive sentences. References to all evaluations of ML-Ask are listed in Table 3. The results of each evaluation are summarized in Table 4.

Release Related Actions
Quality assurance is also controlled for the software in the following ways.
Firstly, we perform regular code revisions and consider further improvements to the software code.
When required, we rewrite significant portions of code to improve performance and get rid of bugs.

Apart from generally analyzing user input in human-agent interaction, the information provided by ML-Ask was used to determine which conversation strategy to choose (normal conversation or humorous response).
The information on affective states expressed by the user was also used as information on how the user feels about the dialog system they interact with. As support for decision making systems, ML-Ask has been used to detect emotions in sentences (email, messages, etc.) entered by the user in a mobile application, to help recommend emoticons that would fit the emotional atmosphere of the message.
Affect analysis of Internet entries with ML-Ask proved effective in determining features specific to harmful entries in a task of cyberbullying detection. Since the affective information provided by ML-Ask is very rich, this could indicate that such information could be useful in determining distinguishing features for other tasks related to affect and sentiment analysis, or even generally perceived binary text classification.
Since the performance of ML-Ask is sufficient, as indicated by its application to the annotation of a large-scale collection of blogs, corpus annotation with affective information is also one of the potential reuse purposes.

Figure 1: Conceptual flow of the ML-Ask software procedures.

Figure 3: Mapping of Nakamura's classification of emotions on Russell's 2D space.

For the details of installation of MeCab and the MeCab Perl binding, refer to the above-mentioned Web pages. Installation of the RE2 regex engine can be easily done with the use of the CPAN Perl module, or the cpan shell, the CPAN exploration and module installation software (http://www.cpan.org/).

Table 1: Examples of sentences containing emotemes (underlined) and/or emotive expressions (bold type font). English translations were prepared to reflect both types if possible.
group refers to a lexicon of expressions describing emotional states. Some examples include: adjectives: ureshii (happy), sabishii (sad); nouns: aijō (love), kyōfu (fear); verbs: yorokobu (to feel happy), aisuru (to love); fixed phrases/idioms: mushizu ga hashiru (give one the creeps [of hate]), kokoro ga odoru (one's heart is dancing [of joy]); proverbs: dohatsuten wo tsuku (be in a towering rage), ashi wo fumu tokoro wo shirazu (be with one's heart up the sky [of happiness]); metaphors/similes: itai hodo kanashii (sadness like a physical pain), aijou wa eien no honoo da (love is an eternal flame). Such a lexicon can be used to express emotions, like in the first example in Table 1; however, it can also be used to formulate non-emotive declarative sentences (third example in Table

Table 3: References describing evaluations and applications of ML-Ask.

Table 5: Example of benchmarking made for the current version of ML-Ask.