CARMA: Software for Continuous Affect Rating and Media Annotation

Jeffrey M. Girard

(1) Overview

Introduction

Researchers in affective computing, human-computer interaction, and the social sciences have long been interested in perceived and experienced emotion. Measurement of these affective constructs is frequently accomplished by collecting holistic ratings at the end of a task, often using discrete emotion categories. However, both subjective experience and observable behavior unfold over time and across numerous affective dimensions. In order to capture all the shifting nuance of such unfolding, affective measurement must be continuous.

Continuous measurement of affect was first popularized by Gottman and Levenson [] in their studies on marital interactions using the “affect rating dial.” This tool consists of a circular plastic knob mounted to the arm of a chair. Raters are instructed to turn the knob clockwise or counter-clockwise to report their affective experience. A faceplate with a numerical 9-point scale is attached around the knob to indicate to the rater that 1 (i.e., 0 degrees) corresponds to “very negative” and 9 (i.e., 180 degrees) corresponds to “very positive.” Throughout the experiment, an electronic potentiometer captures the dial’s rotation 30 times per second and saves the second-by-second averages to a computer.

The affect rating dial has been used to study dyadic interactions, empathic accuracy, the impact of alcohol on anxiety, and emotional responses to film and music (see [] for a review). Many of these applications, especially those using time-series analysis, would not have been possible without a continuous representation of affect.

The affect rating dial has demonstrated high validity in a number of ways. For instance, Gottman and Levenson [] found that participants’ affect ratings during a conversation were consistent with their spouses’ ratings and also with observers’ objective coding. There was also consistency between participants’ physiological measures when they engaged in a conversation and when they later watched a video of that same conversation and used the affect rating dial.

Over time, numerous improvements have been made to the original affect rating dial (see [] for an example involving colored lights). However, despite such refinements, many researchers would prefer an alternative that does not require them to purchase electronic components and assemble a custom-built device.

Numerous computer programs have been adapted or designed to collect continuous ratings. Many, such as ELAN [] and ANVIL [], are large packages capable of much more than continuous rating; as such, they are powerful in the hands of the experienced user but confusing and unwieldy to the uninitiated and impractical for use by study participants. Others, such as the Continuous Measurement System [], EmuJoy [], FeelTRACE [],and Gtrace [], are more focused on continuous rating. However, most have fallen out of repair and are difficult for non-programmers to implement and customize. EmuJoy and FeelTRACE also require raters to respond on a two-dimensional scale, which is difficult for users and increases the cognitive load of the task [].

Given increasing interest in continuous affective measurement [], [], a new tool is required for the collection of such data. The current study presents CARMA, which fills this need by providing a focused media annotation solution that can be easily customized and used by researchers or participants on a variety of multimedia types.

CARMA was first developed as a computerized version of the affect rating dial to collect observer’s ratings of positive and negative affect in audiovisual recordings of social interactions. We have used previous versions of the software to explore the nonverbal and dyadic aspects of domestic violence [], psychopathology [], and other interesting psychological phenomena. By using these ratings as “ground truth” for supervised learning algorithms, we have also trained computerized systems to automatically analyse affect and facial expression intensity from data [].

Implementation and architecture

CARMA is written in the MATLAB [] programming language and compiled into a Windows application using the MATLAB Compiler. MathWork’s MATLAB is a commercially- available software package and programming language that is commonly used in psychology, humancomputer interaction, and related fields. Whilst the code for CARMA is released under an open license and can be executed using the free MATLAB Compiler Runtime(MCR) [] software from MathWorks, any modifications to the code will require the developer to have the MATLAB Compiler to allow the revised code to be compiled and redistributed for use with the MCR. MCR is a stand-alone set of libraries that enables the execution of MATLAB files on computers without MATLAB. The appropriate version of MCR is automatically downloaded and installed with CARMA; the license terms for MCR are available as part of the download.

Graphical interface elements are implemented using MATLAB GUIDE [], and multimedia playback is implemented using the Windows Media Player [] ActiveX controller. It can load any multimedia file formats compatible with the version of Windows Media Player installed on the computer, including AVI, MPG, and WMV for video and AIF, CDA, MP3, WAV, and WMA for audio. Additional file types (e.g., M4A, MOV, and MP4) may be made compatible by upgrading Windows Media Player and/or installing additional codecs. Figure 1 shows the dependencies and interactions between CARMA’s components.

Fig. 1

Diagram of component interactions (* pre-installed on Windows OS).

CARMA allows users to easily collect continuous measurements on a variety of multimedia file types. When a video file is selected, its soundtrack is played through the speakers and its images are displayed in the multimedia window. When an audio file is selected, its soundtrack is played though the speakers and customizable music visualization is displayed in the multimedia window. Figure 2 shows the main CARMA window during multimedia playback/rating.

Fig. 2

Screenshot of CARMA’s main window with default settings.

Using a slider, whichcan be controlled with a computer mouse or keyboard via the arrow keys, users rate the selected multimedia file in real-time as it plays. Similar to the original affect rating dial, CARMA samples the position of the slider 10 times per second and saves the second-by-second means.

Adjacent to the slider is a customizable rating scale that provides a visual cue for the different slider positions. Users can easily specify the numerical range of this scale, provide labels for its upper and lower bounds, and specify the number of unique positions (or steps) it is composed of; users can also customize the gradient displayed on the scale by selecting any three colors. These settings can optionally be saved as defaults and loaded automatically. Figure 3 shows a screenshot of the CARMA Settings window.

Fig. 3

Screenshot of CARMA Settings window with the default configuration.

At the conclusion of the multimedia file, the Annotation Viewer window is opened and the collected ratings are displayed (Figure 4). From here, the user may playback the multimedia file and view the collected ratings, or export the ratings to an annotation file. Previously exported annotation files can also be loaded into the Annotation Viewer window for playback or re-exporting.

Fig. 4

Screenshot of CARMA’s Annotation Viewer window.

When exporting an annotation file, the user can select an export location and file format for his or her ratings; export files can be either Comma-Separated Values (CSV) files or a Microsoft Excel Spreadsheet (XLS/XLSX) files (Figure 5). In addition to mean ratings for each second of the multimedia file and corresponding timestamps, export files contain metadata summarizing how CARMA was configured when the measurements were collected.

Fig. 5

Screenshot of a CARMA annotation file opened in Microsoft Excel.

Sample multimedia and export files are provided in the code repository and step-by-step instructions for its use can be found in the documentation at http://carma.codeplex. com/documentation. Additional support is available at http://carma.codeplex.com/discussions or by emailing jmg174@pitt.edu .

Quality control

Functional testing has been carried out on Microsoft Windows XP (SP3), Microsoft Windows Vista, Microsoft Windows 7, and Microsoft Windows 8. CARMA behaved as expected on all platforms. Usability testing is currently being undertaken with researchers at the University of Pittsburgh and the University of Miami.

(2) Availability

Operating system

This software has been successfully tested on Microsoft Windows XP (SP3), Microsoft Windows Vista, Microsoft Windows 7, and Microsoft Windows 8.

It was compiled to can run on both 32-bit and 64-bit machines. Due to its dependency on ActiveX, CARMA does not run on Linux or Mac.

Programming language

MATLAB 8.3 (R2014a)

Dependencies

MATLAB Compiler Runtime 8.3 (R2014a) 32-bit Windows version
Windows Media Player (≥9.0.0.0) via ActiveX

List of contributors

Jeffrey M Girard – coding, testing

Code repository

Name

CodePlex

Identifier

http://carma.codeplex.com

License

GNU General Public License version 3 (GPLv3)

Date published

04/08/2014

Language

English

(3) Reuse potential

As a media annotation tool, CARMA provides researches with the ability to collect continuous affective ratings of audio and video files. Possibilities include emotional stimuli, movies, music, commercials, political debates, lectures in classrooms, or even audio and video of the raters themselves.

Raters can be instructed to report on their own subjective experiences while viewing the multimedia file, what they were experiencing during the original task (if re-viewing audio or video of themselves), or what they imagine specific others in the audio and video files are experiencing.

Previous research has tended to focus on broad affective dimensions such as valence, arousal, and power. However, due to its customizable rating scale, CARMA can be adapted to nearly any project for which continuous ratings are desired. Specific emotional and cognitive states can be used in place of affective dimensions, allowing raters to report on experiences of anxiety, frustration, engagement, agreement, confusion, pain, etc. CARMA can even be used to annotate non-affective information, such as the quality of teaching in the video of a lecture.

In the social sciences, the affective measurements collected with CARMA are likely to be used as the dependent variables of studies, allowing researchers to compare mean ratings across conditions or predict ratings given other factors. However, in affective computing and human-computer interaction, these measurements may also be used as training data for automated systems attempting to detect and analyse affective states from audio and video. Collecting responses from a variety of raters is critical for such a task [], and an easy-to-use program like CARMA can greatly facilitate this data collection.

The source code for CARMA has reuse potential for researchers looking to extend the program to be entirely open source. Open source media players may be added in place of Windows Media Player using ActiveX controllers. Intrepid users may rewrite the program in open source MATLAB alternatives or adapt the system to collect ratings over the web. Researchers looking to adapt CARMA to specific studies may add self-report questionnaires or experimental manipulations before or after the collection of affect ratings by adding additional windows to the baseline program. The spreadsheet format of CARMA’s output is ideally suited to incorporating additional data fields.

[B1] Gottman, J M and Levenson, R W (1985). A valid procedure for obtaining self-report of affect in marital interaction J. Consult. Clin. Psychol. 53(2): 151–160, DOI: https://doi.org/10.1037/0022-006X.53.2.151

[B2] Ruef, A M and Levenson, R W (2007). Continuous measurement of emotion: The affect rating dial In: Coan, J A and Allen, J J B eds. Handbook of emotion elicitation and assessment. New York, NY, US: Oxford University Press.

[B3] Fredrickson, B L and Kahneman, D (1993). Duration Neglect in Retrospective Evaluations of Affective Episodes J. Pers. Soc. Psychol. 65(1): 45–55, DOI: https://doi.org/10.1037/0022-3514.65.1.45

[B4] Brugman, H and Russel, A (2004). Annotating Multimedia/ Multi-modal Resources with ELAN Int. Conf. Lang. Resour. Eval. : 2065–2068.

[B5] Kipp, M (2001). ANVIL: A Generic Annotation Tool for Multimodal Dialogue Eur. Conf. Speech Commun. Technol. : 1367–1370.

[B6] Messinger, D S, Cassel, T D, Acosta, S I, Ambadar, A and Cohn, J F (2008). Infant Smiling Dynamics and Perceived Positive Emotion J. Nonverbal Behav. 32(3): 133–155, DOI: https://doi.org/10.1007/s10919-008-0048-8

[B7] Nagel, F, Kopiez, R, Grewe, O and Altenmuller, E (2007). EMuJoy : Software for Continuous Measurement Behav. Res. Methods. 39(2): 283–290, DOI: https://doi.org/10.3758/BF03193159

[B8] Cowie, R, Douglas-Cowie, E, Savvidou, S, McMahon, E, Sawey, M and Schröder, M (2000). FEELTRACE: An Instrument for Recording Perceived Emotion in Real Time ISCA Tutor. Res. Work. Speech Emot.. : 19–24.

[B9] Cowie, R, McKeown, G and Douglas-Cowie, E (2012). Tracing Emotion: An Overview Int. J. Synth. Emot.. : 1–17, DOI: https://doi.org/10.4018/jse.2012010101

[B10] Luck, G, Toiviainen, P, Erkkila, J, Lartillot, O, Riikkila, K, Makela, A, Pyhaluoto, K, Raine, H, Varkila, L and Varri, J (2008). Modelling the Relationships Between Emotional Responses to, and Musical Content of, Music Therapy Improvisations Psychol. Music 36(1): 25–45, DOI: https://doi.org/10.1177/0305735607079714

[B11] Gunes, H and Schuller, B (2013). Introduction to the Special Issue on Affect Analysis in Continuous Input Image Vis. Comput 31(2): 118–119, DOI: https://doi.org/10.1016/j.imavis.2012.12.002

[B12] Gunes, H and Schuller, B (2013). Categorical and Dimensional Affect Analysis in Continuous Input: Current trends and Future Directions Image Vis. Comput 31(2): 120–136, DOI: https://doi.org/10.1016/j.imavis.2012.06.016

[B13] Hammal, Z, Bailie, T E, Cohn, J F, George, D T and Lucey, S (2013). Temporal Coordination of Head Motion in Couples with History of Interpersonal Violence International Conference on Automatic Face and Gesture Recognition.

[B14] Girard, J M, Cohn, J F, Mahoor, M H, Mavadati, S M, Hammal, Z and Rosenwald, D P (2014). Nonverbal Social Withdrawal in Depression: Evidence from Manual and Automatic Analyses Image Vis. Comput.,

[B15] Jeni, L A, Girard, J M, Cohn, J F and De la Torre, F (2013). Continuous AU Intensity Estimation Using Localized, Sparse Facial Feature Space International Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space.

[B16] The MathWorks Inc (2014). MATLAB version 8.3.0.532 (R2014a) Available at http://www.mathworks.com/ matlabcentral/answers/124402-please-mail-a-2014adisk.

[B17] The MathWorks Inc (2014). MATLAB Compiler Runtime (MCR) Available at http://www.mathworks.co.uk/ products/compiler/mcr/..

[B18] The MathWorks Inc (2014). MATLAB GUI Development Environment (GUIDE) Available at http://www.mathworks. com/discovery/matlab-gui.html.

[B19] Microsoft Corporation (2009). Windows Media Player Available at http://windows.microsoft.com/en-us/ windows/download-windows-media-player..

[B20] Sneddon, I, McKeown, G, McRorie, M and Vukicevic, T (2011). Cross-cultural patterns in dynamic ratings of positive and negative natural emotional behaviour PLoS One 6(2): e14679.

Journal of Open Research Software

Software Metapapers

CARMA: Software for Continuous Affect Rating and Media Annotation

Abstract

(1) Overview

Introduction

Implementation and architecture

Quality control

(2) Availability

Operating system

Programming language

Dependencies

List of contributors

Archive

Name

Persistent identifier

License

Publisher

Date published

Code repository

Name

Identifier

License

Date published

Language

(3) Reuse potential

Funding Statement

Acknowledgements

References