Walking in another’s virtual shoes: Do 360-degree video news stories generate empathy in viewers?

Executive Summary

Does immersion yield empathy? The aim of this report is to address the question of whether news media—in this instance, short-form journalistic stories—presented in a 360-degree video format affects a user’s empathetic response to the material. If so, what might the advantage of such a response be, and could it include improvement in a viewer’s ability to recall the content over time or a resulting behavioral change? Naturally, part of the study will deconstruct what we mean by such a nebulous term as “empathetic” and how exactly it can be measured.

The study both investigates whether particular audiences are likely to respond empathetically to certain narratives and analyzes the component parts of immersive experiences (comfort level, interactivity, and perceived amount of user agency) that contribute to producing an empathetic response. It also aims to answer whether the virtual reality (VR) format is better suited to particular stories or audiences, and to assess the potential for unintended, antithetical effects in this embryonic medium, such as a user’s perception of personal space invasion or the feeling that they are not looking in the right direction.¹

Our study of 180 people viewing five-minute treatments composed of either 360-degree video or text articles monitored user reactions and their sense of immersion on the day of their first exposure to the narrative treatment, as well as two and five weeks later.

Key findings:

Our research found that VR formats prompted a higher empathetic response than static photo/text treatments and a higher likelihood that participants would take “political or social action” after viewing.
Users who experienced the VR treatments reported higher levels of immersion and were more likely to report a desire to take action or find out more about the topic as a result.
In the longer term (both two and five weeks after viewing initial treatments), those who registered a higher empathetic response upon first viewing were more likely to recall the stories they had seen.
There is a negligible difference in perceived levels of interactivity between the head-mounted and desktop-based virtual reality treatments, suggesting head-mounted displays (HMDs) aren’t a deal-breaker.
Viewers using HMDs reported higher levels of immersion, but also some user discomfort.
Trust in the narrator is essential to building empathy, inspiring immersion, and heightening engagement in the narrative. This is best achieved by ensuring that the narrator maintain a consistently visible presence on-screen.
Research showed that stories with one clear protagonist serving as a guide through the VR experience are consistently more enjoyable for users.
Stories that are at least in part enjoyable are most likely to have impact on viewers.
Audience over-familiarity with a story can negatively impact the level of immersion or enjoyment of a story.
The lower a user’s news consumption habits or familiarity with the technology or story, the more likely they are to be positively impacted and respond empathetically to cinematic VR.
Immersion and presence in VR are key, but still can’t outweigh a user’s lack of interest or over-familiarity with a subject.

Recommendations:

When implementing VR in news-related storytelling, avoid overburdening or overwhelming the user with complicated, lengthy experiences or interfaces.
Remember that the choice of narrative and content influences and outweighs the actual interactive affordances of a VR experience.
The highest empathetic response was registered in users who were unfamiliar with the stories they viewed, suggesting the medium’s effectiveness in introducing a new topic or VR’s suitability for targeting infrequent news consumers. Focus on less well-known topics, as production value or storytelling alone cannot compensate for lack of user interest.
Audiences that find stories pleasant are far more likely to remember them in the long term; note the interesting correlation between palatability and impact.
Be cautious about showing too many scenes that can cause viewer discomfort; scenes of harsh conditions and suffering can drastically affect user comfort and enjoyment, triggering disengagement from the material and contributing to a drop in long-term memory recall and engagement. We recommend interspersing such scenes with less charged and more neutral material to counterbalance the effect.
Establish a trusting relationship between audiences and the same, single protagonist by including them in every scene.
Choose a narrator in your 360-degree videos that users trust, or include a consistent voice throughout.
Provide clear guidance on sharing VR stories, as audiences are still unfamiliar with how to do so once they have finished viewing the experience.
Remember that VR is by no means a catch-all solution for instilling empathy in all users. Like any other storytelling medium, its power lies not only in journalists’ flair for narrative, but also in audiences’ dispositional and contextual affinity for particular topics, which is as vulnerable to over-saturation as any other medium.

Introduction

A growing number of newsrooms, both in the United States and worldwide, are incorporating 360-degree video into their production processes,² often dedicating significant manpower and resources to producing this new form of content.³ The high penetration rates of smartphones, coupled with the relatively low barrier to entry for mobile VR (e.g., the Google Cardboard or Samsung Gear VR headsets), have meant that many newsrooms have integrated cinematic VR (that is, live-action, 360-degree video) into existing news stories.

Beyond an examination of the analytics for views and visits of 360-degree videos uploaded through social media channels such as YouTube and Facebook, the latter of which owns the VR company Oculus, little information is available about the effect that these 360-degree videos are having on audiences. It remains to be seen whether a new form of metrics is indeed necessary to truly investigate this effect,⁴ or which best practices should be followed to create the most compelling content.

Evangelists of the mediumⁱ argue that VR journalism experiences have “yielded deeper, more immersive stories that people enjoy and stay with longer than a traditional video or article. Feedback is characterized by more visceral and emotional reactions. People say that VR brings them closer to the events and breaks down barriers inherently raised by a reporter or correspondent.”⁵

However, the news consumption habits of users when experiencing VR journalism are often overlooked, as are the respective differences between viewing an experience with an HMDⁱⁱ versus a traditional, two-dimensional desktop monitor, versus a traditional text article.

The first goal of this study is to measure the effectiveness of 360-degree video, in both its immersive and non-immersive formats, in creating a sense of presence, or engendering a sense of connection or emotional impact between news audiences and the subjects of their stories. The second is to deduce whether this in turn prompts viewers to take action or change their views as a result. To reach these goals, we specified a set of research questions, detailed in Appendix I.

Hypothesis

Research has demonstrated that in order to foster changes in behavior, an individual must engage with and actively process the content of a message beyond simply reading the message in textual form.⁶ In many instances, direct experiences with multi-sensory stimuli have proven to be more influential on user behavior than indirect equivalents, such as text-based descriptions.⁷ Our hypothesis is that virtual reality treatments, be they immersive or non-immersive, will have greater impact over a longer period of time than text, and lead to a higher likelihood of behavioral change and impact on the user as a result.

Virtual Reality, 360-Degree Video, and Empathy

Virtual Reality versus 360-Degree Video

The term “virtual reality” was initially coined by Jaron Lanier, founder of VPL Research, in 1989. Originally, the term referred to “immersive virtual reality,” where the user becomes fully immersed in an artificial, three-dimensional world that is generated by computer graphics. The user can explore a scene in all directions, including depth. Fundamentally, one of the chief distinctions between computer-generated (CG) VR and 360-degree video (i.e., “cinematic VR”) is the former’s use of a real-time game engine to adapt the environment to the user’s interactions. The latter is inherently pre-rendered: although a user can influence their viewing angle within the scene, their actions have no effect on the progression of the 360-degree video they are watching, which is therefore unresponsive. Furthermore, cinematic video suffers from the same disadvantages as traditional photography, in the sense that objects further from the camera will lack detail or potentially even be out of focus.⁸ ⁱⁱⁱ

VR Definitions

360-degree video: Equirectangular video composed of multiple feeds shot simultaneously on multiple cameras that are then stitched and wrapped around a spherical viewer. The user is placed in the center of the sphere and is able to use head movements to change their viewing angle in the scene; the user is stuck to the spot from which the original camera was recording and cannot explore the scene beyond turning their head. Also referred to as “cinematic virtual reality.”

Game engine: A software program originally designed for the production of video games in which users can write scripts in code that enable interactions between different media assets. Game engines have since become the leading method for developing computer-generated virtual reality experiences. The two most popular game engines are Unity and Unreal.

Immersive virtual reality: A virtual reality experience that requires the user to put on a head-mounted display (HMD). Typically involves a form of head tracking and headphones.

Non-immersive virtual reality: A virtual reality experience of a three-dimensional world that the user explores on a two-dimensional desktop computer, rotating their viewing angle via a mouse.

Presence: The perception that a virtual reality environment is real, and that the user feels part of that virtual world, through a combination of interactivity, physics (the world updating to their point of view via head tracking), and responsiveness. A core requisite for immersion.

Defining and Measuring an Empathetic Response

“Empathy is a multi-faceted emotional and mental faculty that is often found to be affected in a great number of psychopathologies, such as schizophrenia, yet it remains very difficult to measure in an ecological context.” –Philip Jackson⁹

The word empathy first appeared in English in Edward Bradford Titchener’s translation of the German word Einfühlung, a term from aesthetics meaning “to project yourself into what you observe.”¹⁰ This aspect of state transferal is conveyed by the primary definition for empathy in The Merriam-Webster Dictionary, namely “the imaginative projection of a subjective state into an object so that the object appears to be infused with it.”¹¹

Yet it is the secondary, interpersonal aspect that is most pertinent to this study, namely “the action of understanding, being aware of, being sensitive to, and vicariously experiencing the feelings, thoughts, and experience of another of either the past or present without having the feelings, thoughts, and experience fully communicated in an objectively explicit manner.”¹²

Mark Davis simplifies this process, emphasizing the “reactions of one individual to the observed experiences of another,”¹³ which in turn echoes Robert Hogan: “the intellectual or imaginative apprehension of another’s condition or state of mind.”¹⁴ This rational, imagined aspect of empathy is typically referred to as cognitive empathy.

Of particular significance is the ability to conceive of “another’s condition” while simultaneously retaining a distinct feeling of self: “The state of empathy, or being empathic, is to perceive the internal frame of reference of another with accuracy and with the emotional components and meanings which pertain thereto as if one were the person, but without ever losing the ‘as if’ condition.”¹⁵

This focus on the intellectual or observed reaction to another’s feelings is counterbalanced by others’ insistence on the emotional core of this observational and transferential process:

Caring, individual concern, and imagination are emotional components of empathy. An individual can place himself in the mental states of others, producing thoughts or feelings that are supportive of others. Sharing emotions, engaging in an emotional exchange with others, stimulating conduct beneficial to others, assisting others, and establishing positive interpersonal relationships are also the components of empathy.¹⁶

This emotional branch of empathy, often referred to as affective empathy in the literature, must be appropriate to the observed mental state and can be classified either as “parallel”—in which the observer matches the target’s mental state—or “reactive”—in which the observer goes beyond a simple matching of affect.¹⁷

Beyond intellectual apprehension of another’s condition and the requisite ability to retain a sense of self yet remain emotionally sensitive to another’s needs, we must also ask ourselves the motivating purpose behind such an investigation, which Martin Hoffman alludes to:

An individual interprets the meaning of information transmitted by others and anticipates the justification and perception of this information. The motivation component of empathy is sufficient to elicit responses beneficial to others, producing empathy with the feelings of others when misfortune falls upon someone else, not oneself. The object of such conduct is to help others.¹⁸

Component Parts of Empathy

Thus we have a reductive yet functional three-fold structure for the process of empathizing with another, which can be summarized by a three-component framework attributable to Jorris Janssen:¹⁹

Table 1: Types of Empathetic Responses

An understanding of empathy can thus be broken down into three constituent components, namely, perception, emotion, and motivation. It is worth noting how permeable the boundaries between each of the three respective states are, problematizing easy distinctions between them. Figure 1 was designed to show the links between the respective states, beginning on the left at the start of the empathic process: the subject and the object are clearly separated, and any appreciation of the object’s internal state is purely intellectual and cognitive. Perspective-taking is what shifts the process from the separate circles on the left to the central diagram with the overlapping circles, as the subject and the object begin a process of emotional convergence.

This in turn facilitates empathic responding, as the subject is still able to retain enough distance to operationalize their response to the object’s situation. Perspective-taking, or the mental simulation of “putting oneself in another’s shoes” via one’s imagination,²⁰ is often held to be a cornerstone of an empathetic response. Perspective-taking has been proven by psychologists to foster a number of positive qualities,²¹ such as the reduction of stereotypes,²² improved communication, and construction of favorable attitudes and helping behaviors²³ toward those in adverse situations that are alien to our own individual experience, as is often the case for news consumers.

In the illustration on the right-hand side of Figure 1, where subject and object have become one, the subject runs the risk of losing Rogers’ aforementioned “as-if” condition, feeling that they have become the object, and are therefore less able to conceive of a response to the situation. This outcome is otherwise known as “personal distress”²⁴ and can have a contradictory effect to empathy, inducing a feeling of aversion in the subject that consequently emphasizes the goal of alleviating their own discomfort.

It is worth noting that this theory, attributable to Davis, has its detractors, who argue that “empathy associated helping can no longer be presumed to be altruistic because as empathy increases, so does the presence of the self in the other.”²⁵

Figure 1: Component parts of the empathetic spectrum.

Problematizing an empathetic baseline

There are many difficulties with capturing empathic performance, hence questionnaires are often used as proxies for actual behavior. In such cases, empathy is often considered a trait or inherited characteristic, though it can differ greatly between different situations and interaction partners. Individual differences in expressivity and reactivity should also be taken into account, as should the strong inter-individual differences in emotional expressivity and baseline levels of physiological signals.²⁶

Paralinguistic empathetic responses

Tracking nonverbal behavior, such as the position of the head or of facial features and eyebrows relative to the two subjects involved, is a commonly used approach to gauge the extent to which individuals share the same emotional state.

At a less conspicuous level, our experience of sympathy (prosocial behavior) or personal distress as an empathic response to suffering is based on our ability to self-regulate emotions: a low ability to regulate a response will likely lead to over-arousal, in turn triggering a self-focused response of personal distress with the immediate goal of swiftly alleviating it.²⁷ Conversely, individuals with a high ability to regulate their reaction are more likely to respond with sympathy. A certain amount of arousal is required for any empathic response at all. Setting a threshold for emotional convergence is critical to ascertaining whether a response is sympathy or personal distress.

Researchers have linked therapist empathy to physiological synchronization between therapist and client.²⁸ Janssen also used physiological signals as intimate cues: communicating a heartbeat signal transforms our experience of a social situation.²⁹ Madeline Balaam showed that feedback on interaction behavior can enhance interactional synchrony and rapport, but once again emphasized the difference between human-to-human interaction and human-machine interaction.³⁰

At this juncture, it is essential to highlight the respective differences between dispositional (individual differences between people in their susceptibility to empathy processes) and situational (experienced empathy at specific moments or during specific interactions) factors in an empathetic response. Our survey was designed to highlight the distinction between these states, focusing first on which demographic factors demonstrated a more favorable tendency toward an empathetic response. It then progressed to examine the specific situational stimuli within a narrative treatment that trigger any of the responses associated with empathy such as perspective-taking, emotional impact, or emotional convergence.

360-Degree Video and Empathy

Journalistic practitioners of 360-degree video have referred to empathy in VR as “the killer metric”³¹ for impacting viewers and sharing the perspective of another individual. Using the perspective-swapping experience of “The Machine to be Another,” Maarte Roel, one of the project’s creators, emphasized that the focus of the experiment was on the relationship between the respective participants, who, thanks to two mirrored VR headsets and head-mounted cameras, were shown the simultaneous feeds of their partner as they looked down at their own body. Some argue that this performative aspect of relational empathy is of more value than the individual, isolated framework that many have misapplied to empathy-driven experiences³²—yet it is only achievable in live spaces with human participants, as opposed to filmed characters or digitized avatars.

Limitations within 360-degree video

This brings up the inherent problem of how to induce emotional mirroring between a viewer and subject within a closed, unresponsive system such as 360-degree video. Face-to-face contact and emotional mirroring are key components of building an empathetic response within a social context,³³ since it depends on real-time awareness of another individual’s emotional state. The challenge with assessing empathy through a VR headset in this way is that the technology is not yet at the level where the filmed characters in cinematic VR can acknowledge a user’s presence in a scene, although developments have been made in CG VR using animated pedagogical agents³⁴ to simulate these emotions using digital avatars within an educational context.

Recognition in a CG environment

Instances where these have been implemented have produced beneficial results with regard to encouraging student performance under test conditions,³⁵ yet all studies, including those with three-dimensional avatars, were undertaken in a non-immersive, desktop environment. Furthermore, the incorporation of a test-based system greatly facilitates the feasibility of presenting suitable emotional responses. Hardware that monitors gesture, facial expression, and conversational cues has also been tested,³⁶ although its design and implementation require extensive resources and dedicated staff.³⁷

Not only is this beyond the reach of today’s newsrooms, but it’s also further problematized by the HMD, which obscures a large portion of a user’s facial gestures. Workarounds in the VR HMD space are being built by companies such as Fove, which incorporate infrared eye tracking to detect user gaze direction and heat maps of the most observed areas within 360-degree videos, but the specific content necessary for these platforms is still limited.³⁸ To that extent, the medium is still at a more passive stage, which can exacerbate feelings of distance or detachment on the part of the user. Our study investigates this phenomenon through the analysis of stories with and without first-person characters acting as a guide.

Operationalizing an Empathetic Response

Among the leading exponents of the form, the lack of interactivity beyond the ability to control the camera with one’s head movements is still outweighed by an overriding sense of presence, which in turn creates a sense of proximity to the heart of the story that is harder to achieve in other media.³⁹ WITNESS program director Sam Gregory argues that empathy does not necessarily motivate people to take action, instead suggesting VR’s potential as a tool for activism. Gregory believes this can happen only if the focus is shifted from empathy to solidarity, based on the power of live witnessing and co-presence, or “the sense of being somewhere together with other people,” over a sense of presence alone, or “the sense of being somewhere.”⁴⁰ He argues that by allowing users to interact with experiences in real time, co-presence could help people move beyond denial and disengagement, and serve as an effective route to user mobilization—for example, as in the case of frontline activists broadcasting via live 360-degree video.⁴¹

Similarly, others warn of the dangers of this heightened sense of proximity to the focal point of a story without any of the typical sensitivities one would normally be mindful of, which can have an unintended, oppositional effect: viewers are immersed, but the characters whose perspective they are supposed to be sharing are excluded.⁴² This is further exacerbated by the tendency of many 360-degree videos to utilize post-production techniques that remove any trace of the original camera rig that shot the footage, further obfuscating the transparency between the subject and viewer⁴³ and calling into question journalistic ethical standards.

Methodology

Empathetic Response

First, we began with the deconstruction of the umbrella term “empathy” into a series of qualities that could be quantitatively measured and compared with user responses to each of the three different treatments across three different platforms: immersive VR, non-immersive VR, and text. While various factors were taken in isolation, including presence, information retention, user control, and emotional convergence, a combination of corresponding questions was chosen to represent a more generalized response to the story stimuli.

Each question represented an aspect of affective and cognitive empathy, as well as measurements of the level of immersion. The following post-treatment (PT) questions were selected and correlated for internal consistency, as detailed in Table 2. Given the composite nature of the empathetic response criteria, some overlap among the five factors targeted by the respective questions was expected. Hence a high immersion rating (question PT2.12) correlates directly with a high empathy rating, given that immersion accounts for one-fifth of the questions that make up the empathetic response metric.

Table 2: Questions Comprising the Empathetic Response Metric
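The overlap between the immersion question and the composite metric can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ratings are invented and are not data from this study, the four non-immersion items are unnamed placeholders, the composite is taken to be an unweighted mean of five items, and `pearson` is a hypothetical helper written for this example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Invented 1-5 ratings from four hypothetical participants (not study data).
immersion = [1, 2, 3, 4]

# Four placeholder items, each deliberately chosen to be uncorrelated
# with the immersion item.
other_items = [
    [4, 2, 2, 4],
    [2, 3, 3, 2],
    [5, 4, 4, 5],
    [3, 1, 1, 3],
]

# Assumed composite: unweighted mean of the five items per participant.
composite = [
    (immersion[i] + sum(item[i] for item in other_items)) / 5
    for i in range(len(immersion))
]

r_single_item = pearson(immersion, other_items[0])  # item vs. item
r_composite = pearson(immersion, composite)         # item vs. composite
```

In this toy data no individual item tracks immersion, yet the composite still correlates positively with it, simply because immersion is one of the composite's five inputs.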

Second, we analyzed participants’ results based on their response to the treatments to see whether certain demographic features such as gender, race, education, familiarity with the technology, and news consumption demonstrate a proclivity toward a higher empathetic response, and examined the commonalities between the user and their story’s subject. Are audience members and subjects of the same age and race more likely to empathize with each other, for example?

Third, we examined the operationalization of an empathetic response: namely, whether those who reported high levels of immersion, emotional engagement, and perspective-taking with the subjects of their story were more likely to change their behavior and take action as a result. These actions were listed on a spectrum ranging from wanting to find out more about a specific topic, through sharing the story with a friend or family member and donating to a related non-governmental organization (NGO), to volunteering for a related NGO.

Limitations

There are several limitations to acknowledge in this study, encompassing the scope of study, methodology, and data collection methods. No research institution was involved in the recruitment phase of the survey, meaning that the number of subjects was not evenly distributed across gender-based lines. Our sample was instead based on the availability of members of the public at three locations around New York and their willingness to take part. This likely led to a bias toward younger participants who were more interested in experimenting with the new technology on offer. Given that the younger adult demographic is among the primary target audiences for virtual reality and related computer-generated experiences,⁴⁴ and similarly constitutes a large proportion of the target audience that newsrooms are aiming at, this bias was actually seen as potentially advantageous.

For the follow-up questions that constitute phases two and three of the survey, respondents were sent two emails at the address they had provided on their consent form. Of the 180 respondents, three declined to provide an email address, two did not have their own email address, and fifteen addresses were inactive or bounced the follow-up request. Panel attrition, in the case of those who took part in the first round but neglected to fill out the follow-up responses, is a common problem with longitudinal studies of this nature.⁴⁵

However, the advantages of comparing the data of within-subject change in order to track the evolution of individual opinion provided enough evidence and insight to merit the practice, regardless of the lower numbers. While this approach has a somewhat weaker claim to causality, the temporal ordering of the measurement renders it preferable to purely observational study on a one-off basis.⁴⁶

Within the scope of the longitudinal study, participants were only asked about their likelihood of taking further action, changing their attitude toward certain topics, or looking further into related subjects. The results are therefore ratings of behavior, not observations of the behavior itself.

Individual differences in expressivity and reactivity should also be taken into account. There are strong inter-individual differences in emotional expressivity and baseline levels of physiological signals that will affect the range of responses from participants.

Treatments

In order to preserve consistency between treatments, all three of the 360-degree video treatments were produced by HuffPost RYOT, a video production agency specializing in 360-degree content. The goal was to select three contrasting treatments of 360-degree video from its archive: one without a visible main protagonist, one with, and one with several interchangeable protagonists. Male and female protagonists were deliberately chosen (in Seeking Home and Growing Up Girl, respectively) to examine differences in audiences of contrasting genders, and a third, Act in Paris, was included for its lack of an on-camera narrator despite accompanying narration from Jared Leto.

All of the 360-degree videos were shot on location and shared a similar duration: 4:20 minutes (Growing Up Girl), 5:35 minutes (Seeking Home), and 3:55 minutes (Act in Paris). All of the videos were in English, with minimal background music and no computer-generated special effects.

Despite the commonalities in production, differences in the way that scenes were shot, voiceovers recorded, and narratives designed were inevitable. Similarly, the text treatments were originally conceived as direct transcriptions of the VR treatments, but after piloting were deemed too dry and descriptive in relation to the style of the other treatments. In response, the text treatments were revised to adopt a more similar tone to the videos, focusing more on the narrative and establishing dramatic interest.

The first and most important quality was the structure of the video: all three videos were led by a narrator who acted as a guide to the visuals that appeared in front of the user.

Table 3: 360-Degree VR Treatment Stories Used in This Study

Text article (control): Participants were given a transcript of the video in question, adapted to suit the text-only format, which was also accompanied by screenshots from the respective video.

Non-immersive, desktop experience: Following a brief explanation of the controls, participants were given a laptop and set of headphones and told to use either the mouse or cursor keys to control their viewing angle within the video.

Immersive gear VR experience: Following a brief explanation, participants were seated in a swivel chair and an HMD (the Samsung Gear VR headset) was placed on their head, along with headphones. They were encouraged to use their head and rotation in the chair to control their viewing angle within the video.

Procedures

Having established a baseline to indicate the moderating effect of individual trait differences, we measured the mediating effect at three separate points in time.

Table 4: Phases of Data Collection

Phase one of data collection took place on November 17 and 18, 2016, within private conference rooms at the Made in NY Media Center by IFP at 30 John Street, Brooklyn, New York 11201, a communal work space in the heart of one of New York’s busiest media production centers. Phase two of the study was conducted on November 28 and 29, 2016, on the sixth floor of Pulitzer Hall on the Columbia University campus grounds on the Upper West Side of Manhattan, which accounts for the high proportion of student participation. This was deliberately countered by situating the other two data collection sites in commercial spaces. Phase three was conducted on December 9 and 10, 2016, in a conference room in the New York offices of the Harmony Institute in lower Manhattan at 54 West 21st Street, Suite 310, New York, New York 10010.

Data Collection

However, this in turn necessitated the physical presence of potential participants in one of the three recruitment spaces in New York: Dumbo, Brooklyn; Chelsea, Manhattan; and Columbia University, Manhattan. Flyers were posted around these areas to encourage participants to come forward, but additional on-the-street canvassing was required to garner a sufficient number of participants.

Initial demographic data was collected from each participant, along with information on their experience with virtual reality. This data was used as a comparative baseline for the remainder of our research. Subjects were then exposed to one of nine treatments: watching one of three videos, either in two-dimensional screen format or in a three-dimensional virtual reality headset, or reading one of three transcripts derived from the same stories. After viewing the film or reading the transcript, participants were given a post-treatment questionnaire asking them specific questions about the characters and events they had viewed or read about, and their reactions to them.

The goal of the data collected from these questionnaires was to discern the reaction to the events participants had seen or read about, and to explicitly understand whether the subject remembered details of characters and events presented to them. A final part of the survey was designed to measure whether and to what degree participants might take a broad range of actions based on what they had seen. It also questioned whether they would have taken action on the topics in the film/transcript if they had not viewed the experience.
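The nine treatments described above are simply every pairing of the three stories with the three formats. As a minimal sketch of that design (the story titles come from this report, while the short format keys and the function name are hypothetical labels chosen for illustration), the pairs can be enumerated as follows:

```python
from itertools import product

# Story titles are taken from the report. The format keys are
# hypothetical shorthand for the text article, the non-immersive
# desktop experience, and the immersive Gear VR experience.
STORIES = ["Growing Up Girl", "Act in Paris", "Seeking Home"]
FORMATS = ["text", "desktop", "hmd"]


def build_treatments(stories=STORIES, formats=FORMATS):
    """Return every (story, format) pair, one treatment per pair."""
    return list(product(stories, formats))


treatments = build_treatments()
print(len(treatments))  # 9
```

How participants were allocated to these nine cells is not specified here, so the sketch stops at listing them.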

Data Analysis Software and Process

The data analysis primarily utilized SPSS and Excel. The raw data was collected in SurveyMonkey and exported as an Excel file, which was then imported into SPSS and recoded.

Cronbach’s alpha was used to cluster the questionnaire items that answer each sub-question (see Appendix I). Cronbach’s alpha is an internal consistency measure that quantifies how closely related a set of items is as a group and is a standardized measure of scale reliability.⁴⁷ The data was tested with t-tests, chi-squared tests, one-way ANOVA, Pearson’s correlation tests between various factors, and multiple regressions. Additionally, descriptive statistics, graphs, tables, and histograms were produced in SPSS and used to explore the data further.
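For readers unfamiliar with the measure, the standard formula for Cronbach’s alpha is k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The study itself computed alpha in SPSS; the following is only a minimal Python sketch of the same formula, and the function name and the example responses are hypothetical, not study data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Compute Cronbach's alpha.

    items: list of k lists, one per questionnaire item, each holding
    one score per participant (same participant order in every list).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("every item needs one score per participant")
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of each participant's total score across items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses from five participants to three 7-point items.
example = [
    [5, 6, 4, 7, 3],
    [5, 5, 4, 6, 3],
    [6, 6, 5, 7, 2],
]
alpha = cronbach_alpha(example)
```

Values closer to one indicate that the items move together, which is the justification for averaging them into a single scale.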

Participant Demographic Breakdown

The first round of the survey was completed by 182 participants who fit the sample criteria. Two participants were disqualified due to language issues, leaving 180 participants.

Gender

Of those 180 participants, sixty-nine were women and 111 men, representing a gender balance of 38.3 to 61.7 percent. U.S. Census data from 2015⁴⁸ estimated the total female population at 51 percent, a discrepancy that may reflect a stronger interest in virtual reality technology among men than among women.
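The percentages above follow directly from the participant counts. As a quick arithmetic check (the counts and the 51 percent census figure are from this section, while the helper name is a hypothetical choice for illustration):

```python
def share_percent(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


TOTAL = 180
WOMEN = 69
MEN = 111
CENSUS_FEMALE_PCT = 51.0  # 2015 U.S. Census estimate cited above

women_pct = share_percent(WOMEN, TOTAL)
men_pct = share_percent(MEN, TOTAL)
# Gap between the census share of women and the sample share,
# in percentage points.
gap_points = round(CENSUS_FEMALE_PCT - 100 * WOMEN / TOTAL, 1)

print(women_pct, men_pct, gap_points)  # 38.3 61.7 12.7
```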

Age

The age range of participants skewed noticeably younger, with older generations under-represented: 71.7 percent of the total responses came from millennials aged eighteen to thirty-four. We feel this corresponds to the heightened amount of interest in emerging technology among younger populations. This study does not prove that all younger audiences are interested in virtual reality; it only suggests a link between these younger audiences and VR consumption trends, as evidenced by consumer research reports conducted by Greenlight VR.⁴⁹ For further tests, the age groups were combined into four major groups: college-aged students, eighteen to twenty-four; young adults, twenty-five to thirty-four; middle-aged adults, thirty-five to forty-four; and those aged forty-five years and above.

Figure 2: Bar chart of age distribution (sample size 180).
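The four combined age groups described above amount to a simple binning rule. A minimal sketch follows, assuming that ages are recorded in whole years and that no participant was under eighteen (an assumption, since the minimum age is not stated here); the function name and sample ages are hypothetical.

```python
def age_group(age):
    """Map an age in years to one of the four combined groups."""
    if age < 18:
        # Assumption: the sample contains adults only.
        raise ValueError("assumed minimum participant age is 18")
    if age <= 24:
        return "college-aged (18-24)"
    if age <= 34:
        return "young adult (25-34)"
    if age <= 44:
        return "middle-aged adult (35-44)"
    return "45 and above"


# Hypothetical ages, not study data.
ages = [19, 23, 31, 40, 58]
counts = {}
for a in ages:
    group = age_group(a)
    counts[group] = counts.get(group, 0) + 1
```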

Education

Of our participants, 42.8 percent reported they were currently enrolled full-time in college or at a university. Furthermore, the distribution of education shows that while 21.1 percent reached high school graduation as their highest level of education, 25.6 percent had attended college, 28.3 percent had graduated from a bachelor’s program, and 22.8 percent had post-graduate qualifications. Only four percent had not graduated from high school, reflecting a bias toward an educated sample. This is attributable to the three different locations at which the survey was conducted, one of which was Columbia University, as well as the ten-dollar incentive to take part in the study.

Ethnicity/nationality

The baseline language was English, with sixty-nine people speaking a second language (twenty-seven speaking Spanish), twelve people speaking two foreign languages, and six people speaking three or more foreign languages. This was not surprising, due to the diverse distribution of nationalities in New York.⁵⁰ The majority of respondents were Caucasian (forty-four percent), followed by African-American (nineteen percent), Asian or Asian-American (fifteen percent), and Hispanic (twelve percent), with the remaining ten percent listed as Other or preferring not to answer.

Household income levels

Corresponding to the younger age groups, the average household income was under the New York five-borough average of 66,175 dollars,⁵¹ with 52.8 percent having an annual household income under 50,000 dollars.

Figure 3: Bar chart of household income distribution (sample size 180).

Political views

As expected from young, educated participants,⁵² 57.8 percent were politically liberal or very liberal, followed by 22.2 percent who identified as moderate, and only one percent as conservative. This reflects the emphasis on a younger demographic in the survey, as well as the higher number of respondents still in education. Of our respondents, thirty-six percent saw themselves as neither, or not interested, and five percent declined to answer the question.

Results

Empathetic Response

The immersive format produced a marginally higher empathetic score (5.2 out of seven on the Likert Scale) than the non-immersive format (5.06), and both were above the 4.3 that the mean text treatment received. This corroborates the hypothesis that the immersive VR format is more effective at producing an empathetic response.

Comparing the means of the three stories in a sample t-test shows that, with a reported significance of 0.000 (that is, p < 0.001), Growing Up Girl produced a slightly more empathetic response, with a mean of 5.25, than the other two experiences, Act in Paris and Seeking Home.

Level of Immersion

Tests showed no significant difference between the VR immersive and non-immersive treatments; only the text format was significantly less powerful than either in terms of its level of immersion. This does not support the hypothesis that an immersive format would prove more immersive than a non-immersive format, and emphasizes the similarity in terms of perceived user experience between head-mounted and desktop-based, two-dimensional virtual reality. It also highlights the disparity between both virtual reality formats and the text control treatments, as is clear from the graph in Figure 4.

Figure 4: Level of immersion across treatments (sample size 180).

Participants’ level of immersion varied significantly (α = 0.001) with their sense of closeness to the narrator. People who trusted the narrator a moderately strong amount (mean = 5.06) or very strongly (mean = 5.73) felt more immersed in the experience compared to those whose level of trust was moderately weak (mean = 2.5) to a little weak (mean = 3.56). An ANOVA test using the same immersion mean as above shows highly significant differences between different categories (α = 0.001), allowing researchers to ascertain which stories in which formats created the highest sense of immersion. As expected from Figure 5, the text treatments produced a significantly lower level of immersion than the virtual reality equivalents.

Figure 5: Average immersion levels associated with various treatments (sample size 180).

Researchers found no significant difference in the perceived level of interactivity between the stories or treatments. However, after comparing means, results show that the immersive Seeking Home treatment was an outlier compared to the other immersive treatments. This might be the result of the refugee crisis’s high profile in mainstream media coverage, which led users to feel uninterested in the story and therefore less likely to be affected by it.

Emotional Impact

The level of emotional impact was measured using questions PT6.1, PT6.2, and PT6.3 to test the hypothesis that immersive treatments have more emotional impact on the user than non-immersive treatments. Performing a Cronbach’s alpha test to measure internal consistency generated a satisfactory result of α = 0.719. These variables were then combined to create an “emotional impact mean.” In analyzing the participants’ evaluations of immersive and non-immersive treatments, our research assumes that the intervals between the scale items are equal. Performing an ANOVA test, researchers found that immersive treatments do not create significantly more emotional impact than non-immersive treatments. However, both the immersive and non-immersive treatments caused a significantly stronger emotional reaction among users than the text treatment, as is clear from Figure 6.

Figure 6: Emotional impact across treatments (sample size 180).

This was also reflected across the range of stories, with one outlier in the form of the non-immersive Growing Up Girl story, which was the only treatment to outperform all other formats.

Figure 7: Emotional impact across stories and treatments (sample size 180).
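The ANOVA comparisons reported in this section were run in SPSS. For illustration only, the F-statistic behind a one-way ANOVA across the three formats can be sketched as the ratio of between-group to within-group mean squares. The function name and the scores below are hypothetical, and the sketch deliberately stops short of a p-value, which would come from the F distribution with the returned degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical 7-point scores for the three formats, not study data.
text_scores = [4, 3, 5, 4]
desktop_scores = [5, 6, 5, 6]
hmd_scores = [6, 5, 6, 5]
f_stat, df_b, df_w = one_way_anova_f(
    [text_scores, desktop_scores, hmd_scores]
)
```

A large F indicates that the format means differ by more than the spread within each format would suggest, which matches the pattern reported here of text trailing both video formats.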

Cognitive Absorption

The cognitive absorption level was measured according to the process described in Appendix I with the questions PT2.8, PT2.9, PT2.10, PT2.11, PT2.13, PT2.14, and PT2.15 to test the hypothesis that the virtual reality treatments would score higher in cognitive absorption than the text treatments, given their higher levels of immersion and emotional impact. Performing a Cronbach’s alpha test to measure internal consistency generated a good result of α = 0.830, allowing researchers to combine these variables into the “cognitive absorption mean.” The overall mean of 4.5012 corresponded to “a little strong” when rounded up to a five out of seven on the Likert Scale. With a significance of α = 0.001, there was a marked difference between the text treatment and the non-immersive and immersive treatments, corroborating the hypothesis that users were more absorbed during the virtual reality treatments.

Figure 8: Cognitive absorption levels in relation to story format (sample size 180).
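The step from item scores to a labeled scale point, as with the cognitive absorption mean of 4.5012 reading as “a little strong,” can be sketched in two small functions. This is an illustration under stated assumptions: the mapping of five to “a little strong” and four to “neither” follows this report, but the remaining label wordings, the round-half-up rule, and the function names are assumptions.

```python
# Assumed wording for the seven scale points. Only some of these
# labels appear in the report; the rest are placeholders.
LIKERT_LABELS = {
    1: "very weak",
    2: "moderately weak",
    3: "a little weak",
    4: "neither",
    5: "a little strong",
    6: "moderately strong",
    7: "very strong",
}


def composite_mean(item_scores):
    """Average one participant's answers across the items of a scale."""
    if not item_scores:
        raise ValueError("need at least one item score")
    return sum(item_scores) / len(item_scores)


def likert_label(mean_score):
    """Round a mean to the nearest scale point (halves round up)."""
    point = int(mean_score + 0.5)
    if point not in LIKERT_LABELS:
        raise ValueError("mean is outside the seven-point scale")
    return LIKERT_LABELS[point]


print(likert_label(4.5012))  # a little strong
```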

Interaction

In terms of reported levels of interaction, there was a marked difference between the text treatment and the non-immersive and immersive treatments, corroborating the hypothesis that users were more absorbed during the virtual reality treatments. In the case of the HMD immersive treatment, this could be attributed to the nature of the device itself, since the Samsung Gear VR headset obscures any other visual or auditory stimuli, forcing users to focus on the view through the stereoscopic lenses. We were surprised by the similarity in levels of cognitive absorption between immersive and non-immersive formats, given that the view of the latter was via an unobstructed 2D laptop screen, leaving the potential for distraction significantly higher than in the immersive format.

Enjoyment Levels

The contrast between both the virtual reality treatments and the text treatment is clear in Figure 9, corroborating the hypothesis that participants found the former more enjoyable than the latter. The text received a four out of seven (neither enjoyable nor unenjoyable), while the immersive and non-immersive registered as “slightly enjoyable.”

Figure 9: Level of enjoyment across story formats (sample size 180).

When mapped against the corresponding stories, the immersive treatment of Act in Paris, covering climate change, was found to be the most enjoyable, and the text treatment of Seeking Home the least. As is clear from the graph in Figure 10, Seeking Home consistently scores lower than the two other stories across all formats. Interestingly, the immersive treatment for Seeking Home underperforms against the non-immersive treatment, the only treatment to do so, which suggests that users prefer the distance of an interactive, two-dimensional version on a desktop computer to the fully immersive, head-mounted alternative when it comes to viewing scenes that aren’t objectively enjoyable.

Although its subject matter contains a grave warning about the dangers of climate change, Act in Paris features footage of a boat ride through glaciers in Alaska, which clearly appealed more to audiences than the squalor of the Calais refugee camp featured in Seeking Home.

Figure 10: Level of enjoyment across story formats and treatments (sample size 180).

User Comfort Level

Questions relating to the user’s level of comfort were posed only to the participants of the immersive and non-immersive treatments, to test the hypothesis that the immersive treatments created a more comfortable experience for the user than did the non-immersive. No significant difference was found between the treatments in the level of comfort (α = 0.077). However, the ANOVA data suggests that the content, and not the format of the viewing experience, is what accounts for the difference between the immersive Act in Paris and immersive Seeking Home (α = 0.057). This reinforces the point made in the previous section pertaining to enjoyment levels: When placed inside scenes featuring poor conditions and poverty, users feel uncomfortable, almost in direct synchronicity with their level of enjoyment.

This was manifested across the Growing Up Girl and Seeking Home treatments, while Act in Paris scored consistently higher. The data challenged the original hypothesis, suggesting that the content of the story affects users’ comfort more directly than the format in which it is presented. As is clear from the graph in Figure 10, there are two instances where the non-immersive treatments are rated more comfortable than their immersive equivalents.

Narrator Trust

The level of trust in the narrator during the treatments was measured as laid out in the Methodology section, using question PT4.5 to test the hypothesis that participants were more likely to trust the narrator in the immersive treatments, given their heightened sense of proximity to them through the HMD. Analysis shows that there are significant differences between treatments in the participants’ level of trust in the narrator. The narrators in the immersive and non-immersive treatments were found to be significantly more trustworthy than the authors of the text treatments (α = 0.003).

Data suggests that participants experiencing Growing Up Girl across any of the three formats trusted the narrator the most, as the graph in Figure 11 demonstrates. Another important finding is the low performance of Act in Paris, which scores consistently low on narrator trust. Researchers attribute this to the lack of a visible narrator during the treatment: the narrator, celebrity Jared Leto, is heard only in voiceover and does not make a physical appearance in the scene, unlike the characters who are introduced in Growing Up Girl and Seeking Home.

Figure 11: Level of narrator trust correlated to story formats/treatments (sample size 180).

Conclusions

Links between Treatment and Empathetic Response

The immersive VR format produced a marginally higher empathetic score (5.2 out of seven on the Likert Scale) than the non-immersive format (5.06), and both were above the 4.3 that the mean text treatment received. This corroborates the hypothesis that the virtual reality formats are more effective at producing an empathetic response than the text, albeit by a small margin (about one point on a seven-point scale).

There was a noticeably higher score in the measurement of immersion for the immersive and non-immersive treatments, both of which produced a positive sense of presence, versus the text treatment, which received a mean of neither positive nor negative (3.9 out of seven on the Likert Scale).

A statistically significant relationship was found to exist between the participant’s level of immersion and their desire to take action. Interestingly, those with a reported immersion level of five out of seven on the Likert Scale responded “very strongly” in terms of their motivation to take action, while immersion levels higher than that saw the motivation level drop a full point. This supports the thesis that over-immersion in stories can actually hinder an audience’s desire to remain motivated after the end of the narrative.

Another important distinction noted between the VR treatments and the text control was the level of user motivation to find out more about the subject, which also highlighted its correlation with emotional impact. Those who received the text treatment were thirteen to eighteen percent less likely to find out more about the subject than those who received the immersive or non-immersive treatments.

Data was inconclusive as to which treatment or format was most effective at prompting users to question their previously held attitudes and change their opinion on a topic. Thirty-three percent of users were unlikely to change their minds, thirty-one percent would stay the same, and thirty-six percent were likely to change their minds.

However, it is worth noting that given the inclination toward a predominantly liberal survey group, with a fifty-eight percent majority identifying as liberal, the pieces already presented a narrative that conformed with the worldview of the majority.

Advantages of Immersive VR Treatments

As the data analysis shows, both the immersive and non-immersive treatments generated a higher combined empathy score than the text-only control treatment. Data suggests that the immersive treatment outperforms the non-immersive treatment, although the difference between the two was not found to be statistically significant. One other important statistic was the difference in the level of perspective-taking across each of the three story formats: immersive and non-immersive respondents rated their treatments a full point higher (five and seven, respectively) than their text-assigned counterparts.

No significant difference was found between audience response to the head-mounted immersive treatments and the desktop-based, non-immersive treatments when analyzed on their perceived level of interactivity. However, a comparison of means showed that the immersive format of Seeking Home was an outlier, with a lower-than-average perception of interactivity, albeit one that still registered as four (neither positive nor negative) out of a possible seven on the Likert Scale.

This demonstrates a link between the narrative of a story and the perceived interactivity of a story: participants saturated by stories of refugees actually had a less immersive experience in the HMD than those experiencing the story via the desktop.

This was also echoed in the reported level of participant enjoyment of the respective treatments: on average, participants agreed a little (5.5 out of seven on the Likert Scale) that immersive treatments were enjoyable, just ahead of the non-immersive versions, but registered only ambivalence (four) for the text treatments.

When taken separately, the level of immersion correlated directly with the level of enjoyment (reported significance of 0.000), which in turn correlated with the level of empathetic response. However, the large number of outliers (see Figure 10) reflects the tension between participants’ enjoyment of the experience and their witnessing of suffering within it, which in itself points to an empathetic response.

One such related example was the mean for the enjoyment of immersive experiences, which was heavily affected by the low ratings of the Seeking Home refugee experience: an ambivalent (four) rating, compared to the two other immersive treatments, which each scored six out of seven. Seeking Home performed equally badly in the non-immersive and text treatments, scoring lowest across all formats and suggesting a bias against the subject matter, which again might be attributed to high media coverage of the topic.

On a related note, the emotional response to some of the pieces had a strong, direct correlation to levels of empathy, particularly in terms of registering an upset or negative emotional reaction. This calls into question the inherently dispositional and ambiguous nature of a user’s relationship with the content, and the difficulty of classifying an immersive depiction of a news event as troubling yet compelling, or well produced but upsetting and unenjoyable.

Investigating further, it was found that the level of comfort was consistently lower in Seeking Home across the immersive and non-immersive treatments, suggesting a link between the subject matter and the viewer’s discomfort, compared to both the climate change and sub-Saharan Africa narratives. In particular, the ANOVA data’s suggestion of a significant difference between the immersive treatment of Act in Paris, which scored highest in comfort level at over six out of seven, versus Seeking Home at 4.5 out of seven brings up the question of whether being privy to a visible narrator’s conspicuous plight causes discomfort.

The relationship to the narrator is another key factor in fostering a sense of empathy and engagement with the narrative, and one that is affected by the format in which the narrative is presented. Another significant contributing factor is the narrator’s visibility on screen and their consistent presence throughout the scenes: even the use of a voiceover as opposed to including the narrator in a shot can cause a significant decrease in narrator trust. The data also suggests that the narrators in both the immersive and non-immersive treatments were seen as more trustworthy than the writer of the text control, although that too may have to do with the use of audio over the written word.

In terms of cognitive absorption, a similar pattern to the enjoyment metrics mentioned above emerged (see Figure 8) with regard to the relative success of each story in holding the user’s attention. Growing Up Girl and Act in Paris scored similarly positively (five out of seven on the scale) in the immersive treatment, with Seeking Home generating a more ambiguous four out of seven. This difference is more marked in comparison with the text treatment, which saw its lowest performer, the text treatment of Seeking Home, generating a mean of 3.5. We suggest that this might be due to the large number of characters and different scenes that audiences are presented with, lacking the consistency of a single, unifying character or voice, as was the case with the other two narratives.

The difference between the HMD immersive version and the non-immersive format’s capacity for producing a sense of presence was shown to not be statistically significant. Both VR treatments, however, were stronger than the text control treatment. Personal relevance was also not shown to be conclusive, demonstrated by the lack of correlation between interest in subject matter and the empathetic response, or in demographic commonalities such as age, education level, or gender.

Another significant factor to take into account when analyzing participants’ level of immersion is their sense of closeness to a narrator. Respondents who trusted the narrator moderately strongly (mean = 5.06) or very strongly (mean = 5.73) felt more immersed compared to those whose trust was registered as moderately weak (mean = 2.5) to a little weak (mean = 3.56). There is also a significant difference between participants who trusted the narrator a little strongly (mean = 4.24) and those who trusted the narrator very strongly, confirming once more the importance of the narrator.

The level of trust in the narrator was also significantly diminished when the narrator was not included in the frame of the shot, as was the case in Act in Paris, which featured a voiceover track but no visual representation of the narrator, Jared Leto. Growing Up Girl, which conversely featured the same consistent protagonist in every scene, clearly outperformed the two other stories. This was also the case in the category of perspective-taking: Growing Up Girl scored the highest overall mean in both the immersive and non-immersive treatments, as well as the highest in its category.

In terms of demographics in relation to self-reported levels of empathy, there were significant discrepancies between self-perceptions of empathetic qualities on the original baseline tests versus actual responses to the treatments, which were often not nearly as high on the spectrum as participants anticipated.

This is further corroborated by the evidence suggesting that those who consume news less frequently (as little as one to two times a month) are more likely to respond empathetically to stories than those who consume news on a daily basis. A similar correlation between the frequency of technology usage and empathetic response was also observed: The data suggests that the less exposure a user has had to technology, and the less familiar they are with a story, the more emotional impact the story will have and the more likely their response to it will be empathetic.

One example of this correlation between familiarity with a topic and the level of empathetic response was observed in the climate change piece, Act in Paris, in which those who responded as least interested in the topic actually registered the most empathetic response. Conversely, those who reported a general interest in climate change did not respond empathetically to the story, which suggests that these stories are best directed at newcomers to particular topics, or utilized as introductory pieces to new stories.

Data suggests that middle-income audiences are especially suited to 360-degree videos, while higher income populations might be more resistant. It also supports the hypothesis that women are more likely to change their behavior following an empathetic response to a treatment than men. Regarding individual trait differences in the self-reported baseline empathetic response (phase one), no significant correlation was found between gender, age, or education in terms of predisposition to an empathetic response.

Short- and Long-Term Results

In the second phase of data collection, two weeks after the initial survey, there was a strong correlation between those who had registered an empathetic response and their ability to remember the story. The third phase of data collection, five weeks after the initial survey, showed a significant drop-off in the ability of participants to recall the story of the text treatment versus the immersive and non-immersive treatments, which performed identically with a mean of 4.7 out of seven on the Likert Scale.

This might be attributed to the novelty of the technology, the emotional impact of the virtual reality experience, or the contrast the VR formats afforded the user compared to the bulk of other text-based information they are accustomed to reading. Suggestive patterns from the data in terms of long-term behavioral changes were problematized by the attrition in responses over phases two and three.

The difference between the immersive and non-immersive treatments, at only five percent, was not statistically significant, though interestingly the non-immersive format proved to be the biggest driver of motivation for taking political or social action. The VR treatments nonetheless clearly outperformed the text control treatment, by thirteen percent and eighteen percent, respectively, in terms of participants’ desire to take further action after experiencing the story.

Data analysis concludes that while there is no significant difference between immersive and non-immersive VR in relation to taking political or social action, both formats surpassed text in their ability to motivate audiences to change their behavior. None of the treatments saw a statistically significant rise in participants’ desire to volunteer or donate money to a cause.

There was only a minor difference in the reported levels of cognitive absorption between the immersive and non-immersive treatments. Immersive treatments recorded a higher consistent level of absorption, and text treatments scored lowest, with most respondents admitting an ambivalent (neither positive nor negative) response to the text control.

Takeaways

Beginning with a reflection on the importance of the immersive versus non-immersive format, it is clearly demonstrated that the immersive and non-immersive treatments outperformed the text control. As predicted in the hypothesis, immersion, sense of control, and interactivity are critical to empathy, although they aren’t objectively quantifiable factors: they are heavily influenced by the user’s perceived relationship to the choice of topic and their level of familiarity with the source material.

Similarly, presence is best established through immersive, head-mounted displays, but immersion isn’t the only route to producing empathy or facilitating perspective-taking. One of the risks with the HMD immersive treatment is the added chance of causing discomfort in the viewer, which significantly reduces an empathetic response. Considerations such as the level of interaction required, as well as scene duration, are critical in ensuring a smooth, intelligible experience that does not overwhelm the user.

In terms of story, content is still king. The choice of narrative and content influences and outweighs the actual interactive affordances within an experience. The protagonist’s inclusion is also highly significant and greatly facilitates memory retention of the story over time, in addition to being a proven means of fostering perspective-taking and emotional impact. Data suggests the more people are interested in a topic, the more likely they are to remember it two and five weeks later, although this level of interest is dependent on a variety of external factors such as their preexisting level of exposure to the topic and their personal or political beliefs.

The highest empathetic response was registered by users who were unfamiliar with the stories, suggesting the medium’s effectiveness at introducing a new topic or VR’s suitability for targeting infrequent news consumers. Audiences that find stories pleasant are far more likely to remember them in the long term, which also suggests an interesting correlation between palatability and impact. This in turn raises a number of ethical questions around user-aversion levels and the extent to which users should be shielded from potentially traumatic or upsetting content when experiencing it in this new, immersive format. Last but not least, user levels of empathy do lead to a higher likelihood of audiences taking action in the long term, but the difference, while statistically significant, is minimal.

In terms of the implications for newsrooms, choose a narrator in your 360-degree videos whom users trust, or include a consistent voice throughout. Establish a trusting relationship between audiences and the same single protagonist by including them in every scene. Be cautious about showing too many scenes that can cause viewer discomfort; an overabundance of scenes showing harsh conditions and suffering can drastically affect both user comfort and enjoyment, causing a drop in long-term memory recall and engagement. Research recommends interspersing such scenes with less charged and more neutral material to counterbalance the effect. Focus on less well-known topics that audiences aren’t as familiar with, as lack of user interest in a topic cannot be compensated for in terms of the production value or storytelling alone.

Focus on the emotional aspect of pieces, but be careful not to overwhelm the viewer, which can trigger a panic response and disengagement from the material. Provide clear guidance on sharing VR stories, as audiences are still unfamiliar with how to do so once they have finished viewing the experience. And remember that VR is by no means a catch-all solution for instilling empathy in users. Like any medium employed in the nuanced craft of storytelling, its power lies not only in journalists’ flair for storytelling, but also in audiences’ dispositional and contextual affinity for particular topics, which is as vulnerable to over-saturation as any other medium.

Outlook

This study touched on a number of factors that merit further investigation. Principal among them is a comparison of CG VR and cinematic VR treatments, contrasting levels of immersion, presence, and emotional impact in real-time, responsively rendered virtual environments with those in pre-rendered spherical video. Comparing interactions with characters produced using volumetrically generated, three-dimensional video inside CG environments versus characters represented using 360-degree cameras would afford researchers a deeper level of metrics in relation to immersion, narrator trust, user control, and emotional impact. It would also allow for greater flexibility in terms of constraining the level of user agency required, which would be a key metric for isolating the optimum amount of interactivity required for maximum emotional impact and presence.

Crucial to this aspect of cross-platform comparison would be securing the means to co-produce the respective treatments in parallel, to minimize discrepancies between narratives. This would mean producing, recording, and editing the treatments on a similar timeline and ensuring they met a number of consistent qualities, including duration, number of scenes, number of characters, and so forth.

The component conditions that constitute the empathetic response also require further elaboration, as does the incorporation of biofeedback measures such as blood volume pulse, heart rate variability, electrodermal activity (which tracks stress response), and skin temperature. In addition, a more granular approach to measuring participant memory recall of the narrative would provide further detail on the kind of information that is best retained, and on which particular aspects are most tied to emotional impact, interactions, or sense of presence. Similarly, a framework for allowing users to donate to a cause or sign up for a newsletter could be integrated into the longitudinal responses, in place of a question gauging their willingness to do so.

Lastly, tighter controls during the recruitment process would ensure a representative sample that meets prescribed gender, age, and ethnicity quotas, as well as designating specific subsets of users whose familiarity with specific news stories and technology can be more accurately mapped. This would be more beneficial if conducted in multiple locations around the United States to ensure as diverse a survey sample as possible.

Appendix I: Research Questions

Do more immersive HMD VR treatments generate a heightened empathetic response by establishing a mediating effect, whereby VR induces presence and personal relevance to a topic area, thereby generating empathy? Is this response reduced in a non-immersive VR treatment?

X –> M –> Y (mediating effect, where X is the user, M is the mediator, here presence and personal relevance, and Y is an empathetic response)

Or, do VR treatments always produce an empathetic response, albeit influenced by an individual user’s dispositional trait differences?^iv
X –> Y/M (moderating effect, where X is the user, and Y is the empathetic response, moderated by M, demographic differences)

Or, is the effect of the treatment influenced by both the user’s personal traits and the environment in which they view the experience?

X –> Y/M + S (where X is the user, Y is the empathetic response, moderated by demographic differences and situational factors)
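One conventional way to make the three candidate models above concrete is to write them as linear regressions. The sketch below is an assumption rather than the specification used in this study: it reads X as the treatment a user receives (for example, coded 1 for HMD VR and 0 for a non-immersive version), keeps Y, M, and S as defined above, and introduces hypothetical intercepts (i), coefficients (a, b, c′, β, γ), and error terms (ε) purely for illustration.

```latex
\begin{align}
% Mediation (X -> M -> Y): M is presence and personal relevance
M &= i_1 + aX + \varepsilon_1 \\
Y &= i_2 + c'X + bM + \varepsilon_2 \\
% Moderation (X -> Y/M): M is a dispositional or demographic trait
Y &= i_3 + \beta_1 X + \beta_2 M + \beta_3 XM + \varepsilon_3 \\
% Moderation by traits and situation (X -> Y/M + S)
Y &= i_4 + \gamma_1 X + \gamma_2 M + \gamma_3 S + \gamma_4 XM + \gamma_5 XS + \varepsilon_4
\end{align}
```

Under this reading, a mediating effect corresponds to a non-zero indirect path ab, while a moderating effect shows up as a non-zero interaction coefficient (β₃, or γ₄ and γ₅ once situational factors are included).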

In this experiment, the dependent variable is empathy, while the independent variables are level of engagement, demographic profile, sense of presence, emotional engagement, news consumption habits, familiarity with the technology and story, and relationship to the narrator.

Sub-questions

Question 1: Is the level of empathetic response subject to a moderating effect that is influenced by individual trait differences, such as user demographic and familiarity with the technology involved?
Question 2: In the case of users fulfilling the criteria for empathy, does an empathetic response lead to higher levels of engagement and greater likelihood of a positive behavioral change in the user as a result? How long does this behavioral change last?
Question 3: What are the differences in audience responses to immersive versus non-immersive VR treatments?
Question 4: Is it possible to quantify the effectiveness of immersive HMD VR stories versus non-immersive VR stories for generating empathetic responses in audiences?
Question 5: What are the specific demographic traits of users most interested in and influenced by virtual reality experiences?

Survey Data Collected

Except where designated below, responses were collected on a seven-point Likert scale, where one was the smallest or weakest factor and seven the largest or strongest.
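As a rough illustration of how seven-point responses like these might be combined into a single scale score and checked for internal consistency, the Python sketch below averages a participant's item responses and computes Cronbach's alpha, a reliability statistic referenced in this report's citations. The function names, the validation rules, and the sample responses are all hypothetical; the report does not state how its scales were actually scored.

```python
from statistics import variance

LIKERT_MIN, LIKERT_MAX = 1, 7  # seven-point scale, as described above


def _check_items(item_responses):
    """Reject empty rows and values outside the 1-7 range."""
    if not item_responses:
        raise ValueError("a participant needs at least one item response")
    for value in item_responses:
        if not LIKERT_MIN <= value <= LIKERT_MAX:
            raise ValueError(f"response {value} is outside the 1-7 range")


def scale_score(item_responses):
    """Mean of one participant's item responses for a single scale."""
    _check_items(item_responses)
    return sum(item_responses) / len(item_responses)


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of responses (one row per participant)."""
    if len(rows) < 2:
        raise ValueError("need at least two participants")
    n_items = len(rows[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    for row in rows:
        if len(row) != n_items:
            raise ValueError("every participant must answer every item")
        _check_items(row)
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four participants, three items on one scale.
hypothetical_sample = [[6, 7, 6], [2, 3, 2], [5, 5, 4], [3, 2, 3]]
print([round(scale_score(row), 2) for row in hypothetical_sample])
print(round(cronbach_alpha(hypothetical_sample), 3))
```

Values of alpha closer to one indicate that the items in a scale move together, which is the usual justification for averaging them into one score.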

Demographics

Empathy Quotient Test

Davis Empathy Test

Interactivity: Post-Treatment

Cognitive Absorption

Attitudes toward Content

Immersion

Narrator Trust

Additional Features

Behavioral Change

Follow-Up Survey: 1–2 weeks after initial treatment

Follow-Up Survey: 2–5 weeks after initial treatment

Appendix II: Acronyms and Field Experts

Acronyms

CAVE: Cave automatic virtual environment. A dedicated room-sized, cube-like space where stereoscopic projection and advanced three-dimensional computer graphics create an immersive experience for multiple users.
CG: Computer-generated. Refers to assets or experiences that have been constructed in a computer modeling program such as Maya or Blender. Typically used for VR developed inside game engines such as Unity and Unreal. Commonly seen as the alternative to cinematic VR.
HMD: Head-mounted display. A head-mounted, stereoscopic display that can be either tethered to a desktop computer or powered by a smartphone. Typically combined with a head-tracking device or gyroscope to monitor the user’s head movement and update the viewing angle of the scene accordingly.
IVET: Immersive virtual environment technology. A synonym for immersive VR experiences.
VR: Virtual reality. An artificial environment, generated by a computer, in which the user’s actions control how that environment responds to the user. An offshoot, cinematic VR, consists of immersive experiences made from footage shot on video cameras, not from computer-generated graphics.

Leading Practitioners in the Field

Robert Hernandez, USC Annenberg
Sarah Hill, Story Up
The New York Times

Acknowledgments

Many thanks to Maxwell Foxman, the research assistant on this study, whose support was crucial to the design and implementation of the data collection phase of the surveys. Thanks also to Professor Barbara Tversky at Teachers College, who provided a watchful eye over proceedings; Elizabeth Hansen and Claire Wardle for their initial suggestions; and Jonathan Albright, who gave us helpful final feedback. University of Florida Professor Sri Kalyanaram’s input was also invaluable as a sounding board for the project design, as was Sun Joo (Grace) Ahn’s and Professor Jeremy Bailenson’s expertise in the field. Final thanks to the supportive team at the Tow Center who helped get this project off the ground in the first place: Emily Bell, Susan McGregor, Pete Brown, and Kathy Zhang.

Footnotes

See Appendix II for a list of leading practitioners.
See Appendix II for a list of defined acronyms used in this report.
Some argue that 360-degree video does not qualify as VR because it offers freedom only in the field of view and not in actual physical movement or tactile interaction, as is common in room-scale, CG VR experiences. This distinction notwithstanding, for the sake of brevity we refer to the 360-degree videos as cinematic VR throughout this report.
A moderator is a qualitative variable (a demographic quality such as race or gender) that affects the relationship between two variables. A mediator variable accounts for the relationship between the independent or predictor variable and the dependent or criterion variable.

Citations

Caroline Scott, “Disrupting the Narrative: Telling Stories with 360-Degree Video,” journalism.co.uk, February 11, 2016, https://www.journalism.co.uk/news/disrupting-the-narrative-telling-stories-with-360-degree-video/s2/a609976/.
Gurman Bhatia, “Virtual Reality News Is Becoming a Reality in Many Newsrooms,” Poynter, September 30, 2015, https://www.poynter.org/news/virtual-reality-news-becoming-reality-many-newsrooms.
Patrick Doyle, Mitch Gelman, and Sam Gill, “Viewing the Future? Virtual Reality in Journalism,” Knight Foundation, March 13, 2016, https://knightfoundation.org/reports/vrjournalism.
Joseph Lichterman, “Report: 2016 Will Be Critical for Growth of VR in Journalism,” Nieman Journalism Lab, March 13, 2016, https://www.niemanlab.org/2016/03/report-2016-will-be-critical-for-growth-of-vr-in-journalism/.
Doyle, Gelman, and Gill, “Viewing the Future? Virtual Reality in Journalism.”
Albert Bandura, “Self-Efficacy: Toward a Unifying Theory of Behavioral Change,” Psychological Review, no. 2 (1977): 191–215, https://psycnet.apa.org/doiLanding?doi=10.1037%2F0033-295X.84.2.191.
Ralph Hertwig et al., “Decisions from Experience and the Effect of Rare Events in Risky Choice,” Psychological Science, no. 8 (August 1, 2004): 534–539, https://journals.sagepub.com/doi/10.1111/j.0956-7976.2004.00715.x.
Jesse Damiani, “The Great Semantic Divide: Virtual Reality versus 360-Degree Video,” Upload, August 29, 2016, https://uploadvr.com/virtual-reality-vs-360-degree-video-semantic-divide/.
Philip L. Jackson et al., “EEVEE: The Empathy-Enhancing Virtual Evolving Environment,” Frontiers in Human Neuroscience, March 10, 2015, https://www.frontiersin.org/articles/10.3389/fnhum.2015.00112/full.
Rae Greiner, “1909: The Introduction of the Word ‘Empathy’ into English,” Branch, https://www.branchcollective.org/?ps_articles=rae-greiner-1909-the-introduction-of-the-word-empathy-into-english.
Merriam-Webster, definition of empathy, https://www.merriam-webster.com/dictionary/empathy.
Ibid.
Mark H. Davis, “A Multidimensional Approach to Individual Differences in Empathy,” JSAS Catalog of Selected Documents in Psychology (1980): 85, https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.462.7754&rep=rep1&type=pdf.
Robert Hogan, “Development of an Empathy Scale,” Journal of Consulting and Clinical Psychology, no. 3 (1969): 307–316, https://dx.doi.org/10.1037/h0027580.
Carl Ransom Rogers, A Theory of Therapy, Personality, and Interpersonal Relationships (New York: McGraw-Hill, 1959).
M.D. Rutherford, Simon Baron-Cohen, and Sally Wheelwright, “Reading the Mind in the Voice: A Study with Normal Adults and Adults with Asperger Syndrome and High Functioning Autism,” Journal of Autism and Developmental Disorders, no. 3 (June 2002): 189–194, https://link.springer.com/article/10.1023/A.
Mark H. Davis, Empathy: A Social Psychological Approach (Dubuque, IA: Brown and Benchmark, 1994).
Martin L. Hoffman, “Interaction of Affect and Cognition in Empathy,” in Emotions, Cognitions and Behavior, eds. Carroll E. Izard, Jerome Kagan, and Robert B. Zajonc (New York: Cambridge University Press, 1984).
Joris H. Hanssen, “A Three-Component Framework for Empathetic Technologies to Augment Human Interaction,” Journal on Multimodal User Interfaces, nos. 3–4 (2012): 143, https://link.springer.com/article/10.1007%2Fs12193-012-0097-5.
C. Daniel Batson et al., “Empathy, Attitudes, and Action: Can Feeling for a Member of a Stigmatized Group Motivate One to Help the Group?” Journal of Personality and Social Psychology (1997): 105–118, https://journals.sagepub.com/doi/abs/10.1177/014616702237647.
Adam D. Galinsky, Gillian Ku, and Cynthia S. Wang, “Perspective-Taking and Self-Other Overlap: Fostering Social Bonds and Facilitating Social Coordination,” Group Processes & Intergroup Relations, no. 2 (April 1, 2005), https://journals.sagepub.com/doi/abs/10.1177/1368430205051060.
Batson et al., “Empathy, Attitudes, and Action: Can Feeling for a Member of a Stigmatized Group Motivate One to Help the Group?”
Sun Joo (Grace) Ahn, Amanda Ming Tran Le, and Jeremy Bailenson, “The Effect of Embodied Experiences on Self-Other Merging, Attitude, and Helping Behavior,” Media Psychology (2013): 7–38, https://vhil.stanford.edu/mm/2013/ahn-mp-embodied-experiences.pdf.
Davis, “A Multidimensional Approach to Individual Differences in Empathy.”
Steven L. Neuberg, “Does Empathy Lead to Anything More Than Superficial Helping? Comment on Batson et al.,” Journal of Personality and Social Psychology, 1997.
Hanssen, “A Three-Component Framework for Empathetic Technologies to Augment Human Interaction.”
Ibid.
Carl D. Marci et al., “Physiologic Correlates of Perceived Therapist Empathy and Social-Emotional Process During Psychotherapy,” Journal of Nervous and Mental Disease, no. 2 (2007): 103–111, https://www.ncbi.nlm.nih.gov/pubmed/17299296.
Hanssen, “A Three-Component Framework for Empathetic Technologies to Augment Human Interaction.”
Madeline Balaam et al., “Enhancing Interactional Synchrony with an Ambient Display,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, May 7, 2011, https://dl.acm.org/citation.cfm?id=1978942.
Sarah Hill, “360 Video versus Regular Video: A Case Study,” MediaShift, May 12, 2016, https://mediashift.org/2016/05/360-video-vs-regular-video-our-case-study/.
Ainsley Sutherland, “The Limits of Virtual Reality: Debugging the Empathy Machine,” MIT, 2016, https://docubase.mit.edu/lab/case-studies/the-limits-of-virtual-reality-debugging-the-empathy-machine/.
“Analyzing Empathetic Interactions Based on the Probabilistic Modeling of the Co-occurrence Patterns of Facial Expressions in Group Meetings,” IEEE Xplore Digital Library, May 19, 2011, https://ieeexplore.ieee.org/document/5771440/?reload=true.
Gwo-Dong Chen et al., “An Empathetic Avatar in a Computer-Aided Learning Program to Encourage and Persuade Learners,” no. 2 (2011): 62–72, https://www.ifets.info/journals/15_2/7.pdf.
Yanghee Kim and Amy L. Baylor, “Pedagogical Agents as Learning Companions: The Role of Agent Competency and Type of Interaction,” Journal of Computer Assisted Learning, no. 3 (2007): 220–234, https://link.springer.com/article/10.1007/s11423-006-8805-z.
Sidney D’Mello et al., “AutoTutor Detects and Responds to Learners Affective and Cognitive States,” in Proceedings of the Workshop on Emotional and Cognitive Issues in ITS in Conjunction with the Ninth International Conference on Intelligent Tutoring Systems, January 2008, https://www.researchgate.net/publication/228673992_AutoTutor_detects_and_responds_to_learners_affective_and_cognitive_states.
James C. Lester et al., “Animated Pedagogical Agents and Problem-Solving Effectiveness: A Large-Scale Empirical Evaluation,” in Proceedings of the Eighth World Conference on Artificial Intelligence in Education, 1997, https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.7114&rep=rep1&type=pdf.
Adi Robertson, “You Can Buy This Eye-Tracking VR Headset Soon, but Should You?” The Verge, September 16, 2016, https://www.theverge.com/circuitbreaker/2016/9/16/12940528/fove-0-eye-tracking-vr-headset-design-preorder.
Marty Swant, “How Virtual Reality Is Inspiring Donors to Dig Deep for Charitable Causes,” Adweek, May 31, 2016, https://www.adweek.com/digital/how-virtual-reality-inspiring-donors-dig-deep-charitable-causes-171641/.
MIT Open Documentary Lab, “Virtually There: Documentary Meets Virtual Reality,” MIT et al., May 6 and 7, 2016, https://opendoclab.mit.edu/wp/wp-content/uploads/2016/11/MIT_OpenDocLab_VirtuallyThereConference.pdf.
Sam Gregory, “Immersive Witnessing: From Empathy and Outrage to Action,” Witness, August 2016, https://blog.witness.org/2016/08/immersive-witnessing-from-empathy-and-outrage-to-action/.
Bryan Bello, “Of VR and Vérité: Reality Under Construction,” International Digital Media and Arts Association, June 2016, https://www.idmaajournal.org/2016/06/of-vr-and-verite-reality-under-construction/.
MIT Open Documentary Lab, “Virtually There: Documentary Meets Virtual Reality.”
Nick Yee, “The Demographics, Motivations and Derived Experiences of Users of Massively-Multiuser Online Graphical Environments,” PRESENCE: Teleoperators and Virtual Environments (2006): 309–329, https://www.nickyee.com/pubs/Yee%20-%20MMORPG%20Demographics%202006.pdf.
D. Sunshine Hillygus and Steven Snell, “Longitudinal Surveys: Issues and Opportunities,” Duke University, August 8, 2015, https://sites.duke.edu/ssnell/files/2015/07/hillygus_snell_8-8-15.pdf.
Ibid.
University of California at Los Angeles’s Statistical Consulting Group, “What Does Cronbach’s Alpha Mean?” UCLA Institute for Digital Research and Education, 2016, https://stats.idre.ucla.edu/spss/faq/what-does-cronbachs-alpha-mean/.
United States Census Bureau, QuickFacts: New York, n.d., https://www.census.gov/quickfacts/fact/table/NY/PST045216.
Staff, “Virtual Reality Interest Highest Among Gen Z,” eMarketer, December 3, 2015, https://www.emarketer.com/Article/Virtual-Reality-Interest-Highest-Among-Gen-Z/1013295.
United States Census Bureau, QuickFacts: New York.
New York Office, “Covered Employment and Wages in the United States, New York State, and Five Counties of New York City, First Quarter,” Bureau of Labor Statistics, November 9, 2016, https://www.bls.gov/regions/new-york-new-jersey/news-release/countyemploymentandwages_newyorkcity.htm#chart3.
Drew Desilver, “The Politics of American Generations: How Age Affects Attitudes and Voting Behavior,” Pew Research Center, July 9, 2014, https://www.pewresearch.org/fact-tank/2014/07/09/the-politics-of-american-generations-how-age-affects-attitudes-and-voting-behavior/.

Dan Archer and Katharina Finger are fellows at the Tow Center for Digital Journalism. Dan Archer founded the virtual reality company Empathetic Media in 2015. Katharina Finger, who works alongside Dan at the company, has a proven track record in international project management at companies like Siemens AG and Aston Martin Lagonda.