
Does the dubbing effect apply to voice-over? A conceptual replication study on visual attention and immersion
Gabriela Flis, Adam Sikorski and Agnieszka Szarkowska, University of Warsaw
ABSTRACT
In an eye-tracking study, Romero-Fresco (2016) discovered that when watching a dubbed film, Spanish viewers hardly looked at characters’ mouths and focussed instead on their eyes – a phenomenon he termed ‘the dubbing effect’. Our study is a conceptual replication of Romero-Fresco’s study, aimed at answering the question of whether a similar effect also takes place in voice-over: do viewers avoid looking at characters’ mouths to stay immersed in the film story? With this question in mind, we tested 35 Polish native speakers watching a 6-minute voiced-over excerpt from Casablanca while their eyes were monitored with an eye tracker. We also measured viewers’ immersion levels as well as their enjoyment and comprehension. In this paper, we present two experiments. In Experiment 1, by analysing viewers’ gaze behaviour and immersion levels, we found that Polish viewers did not avoid looking at characters’ mouths. In Experiment 2, we compared our results with those obtained in the original study with Spanish and English viewers. We found that visual attention distribution in Polish voice-over resembled that observed in English viewers, who watched the film with the original soundtrack. Both Polish and English viewers spent more time looking at characters’ eyes in scenes with no dialogue compared to scenes with dialogue, as opposed to the Spanish viewers, for whom the tendency was reversed.
Keywords
Dubbing effect, voice-over, immersion, visual attention, eye tracking, audiovisual translation.
1. Introduction
In an eye-tracking study, Romero-Fresco (2016) discovered that when watching a dubbed film, Spanish viewers avoided looking at characters’ mouths and focussed instead on their eyes. This viewing pattern was significantly different to that of English viewers, who watched the same film in the English original version and did not avoid looking at characters’ mouths. At the same time, while watching a comparable scene from a Spanish film, Spanish participants displayed a similar gaze pattern to the English participants watching the original English film. This phenomenon, whereby viewers avoid looking at characters’ mouths in a dubbed film, was termed ‘the dubbing effect’ (Romero-Fresco 2016, 2020).
In this paper, we undertake a conceptual replication of the study on the dubbing effect, originally conducted on Casablanca (Michael Curtiz 1942) by Romero-Fresco, reported in 2016 as a conference paper and published as a research article in this volume¹. Unlike the original study with Spanish and English viewers, our study takes place in Poland. As opposed to Spain, where the predominant audiovisual translation (AVT) mode is dubbing, and the UK, where the vast majority of audiovisual content is available in the original English version, Poland is generally considered a stronghold of voice-over (VO) (Gottlieb 1998). Casablanca has never been dubbed into Polish and only the voiced-over and subtitled versions exist.
In contrast to dubbing, where every attempt is made to synchronise the translation with the lip movements of the original actors (Chaume 2014), in voice-over there is no requirement for lip synchrony (Sepielak and Matamala 1999). Neither does the translation need to be of the same duration as the original – a requirement known as isochrony (Chaume 2014). In VO, the original soundtrack remains audible but its volume is lowered, and the translation tends to be shorter than the original, typically allowing viewers to hear the beginning and end of the original utterances. The translation is read by one voice talent, usually male.
Romero-Fresco (2016) hypothesises that viewers may be confused by the asynchrony between the dubbed text and the lip movements of the original actors. As a result, they may adopt the unconscious strategy of not looking at the actors’ mouths so that they can then suspend their disbelief and “thus achieve the ultimate goal of being engaged with the fictional story” (Romero-Fresco 2020: 35). The effect is aided by the fact that dubbing viewers are accustomed to this form of AVT thanks to “an early acquired and subconsciously internalised dubbing viewing habit” (ibid.: 24).
Assuming that a lack of synchrony between the characters’ lip movements and the translation may lead to viewers avoiding looking at the mouth, we wondered whether a similar effect may take place when watching Polish VO, where the lack of synchrony between the original utterance and its translation is part and parcel of this AVT mode. Have Polish viewers also developed similar strategies in their process of habituation to VO? Given the fact that all the translated utterances, whether pronounced by female or male actors, are read out by a single male voice talent, we thought that the viewers’ potential avoidance of looking at characters’ mouths may be particularly discernible in scenes with female characters speaking.
In what follows we first discuss the notion and types of replication and provide a brief overview of the original study on the dubbing effect. We then explore two major theoretical concepts underpinning our research: visual attention and immersion. Finally, we report on the results of our study and discuss its implications.
2. Replication
A study may be considered a replication “when researchers repeat all the relevant aspects of an original study” (Koole and Lakens 2012: 608). The essence of replication is to reproduce the findings of a particular experiment with a different group of participants in order to confirm, or disconfirm, the effect found in the original study. Replication provides the opportunity to assess the reproducibility of research results and increases their certainty.
Modern science is facing a replication crisis (Pashler and Wagenmakers 2012; Earp and Trafimow 2015; Stevens 2017). Indeed, according to Ioannidis (2005: online), “there is increasing concern that most current published research findings are false.” The Reproducibility Project carried out by the Open Science Collaboration (2015) demonstrated that – as opposed to the original studies reporting significant results in 97% of cases – only 36% of replication studies produced results which were statistically significant (Stevens 2017).
The high rate of non-replication does not necessarily mean that the original study was faulty. Replication may not produce the same result as the original study for a number of reasons. First, the results of the original study could be a false positive, i.e. they could show a finding that is not really there, known as Type I error (Field 2009). Second, the replication study could be a false negative, i.e. it does not find a result when the result is genuine (Type II error). As noted by the Open Science Collaboration (2015: 943), “even research of exemplary quality may have irreproducible empirical findings because of random or systematic error.” Finally, fundamental differences between the original and the replication study can make replication impossible (Koole and Lakens 2012; Pashler and Harris 2012).
There are two main types of replication: direct and conceptual. In direct replication, an attempt is made to recreate the original study using the same conditions and materials. Conceptual replication aims to test the hypothesis put forward in the original study, but using a different design (Koole and Lakens 2012). It involves “deliberately changing the operationalization of the key elements of the design such as the independent variable, dependent variable, or both” (Nosek et al. 2012: 619). While the goal of direct replication is to validate a finding, a conceptual replication:
seeks to validate the underlying theory or phenomenon – i.e., the theory that has been proposed to “predict” the effect that was obtained by the initial experiment – as well (sic) to establish the boundary conditions within which the theory holds true (Earp and Trafimow 2015: 5).
The study described in this paper is a conceptual replication. Given that Casablanca is only available in Polish in a voiced-over version, we could not undertake a direct replication. Our assumption was that if, indeed, viewers avoid looking at the characters’ mouths in translated films to better guarantee their immersion, then this effect should also be discernible in VO, where lack of synchronisation between the original and the translation is much more prominent than in dubbing.
3. The dubbing effect
The original study by Romero-Fresco (2016, 2020) was conducted on a group of Spanish viewers watching a 6-minute fragment from Casablanca dubbed into Spanish and a control group of English viewers watching the same fragment in the original English version. Their gaze was monitored with an eye tracker. The analysis of participants’ eye movements showed that when watching the dubbed version, Spanish participants gazed at the characters’ mouths in dialogue exchanges much less than English participants did. While the English group spent about 76% of the time on characters’ eyes in dialogues and 24% on the mouths, the Spanish group spent as much as 95% on the eyes and only 5% on the mouths. When they were shown a comparable, originally Spanish clip², Spanish participants displayed similar viewing patterns to the English (76% on the eyes and the rest on the mouth). Interestingly, the observed effect was found only in scenes with dialogue, i.e. when the characters were speaking, but not in moments when the characters did not speak (‘silent close-ups’).
Romero-Fresco (2020) attributes the dubbing effect to viewers’ attempt to suspend their disbelief on two levels: linguistic and visual. The suspension of linguistic disbelief is “the process that allows the dubbing audience to turn a deaf ear to the possible unnaturalness of the dubbed script while enjoying the cinematic experience” (Romero-Fresco 2009: 68-69). This unnaturalness stems from using unidiomatic expressions that are too close to the source language, both in terms of lexis and syntax: a phenomenon known as ‘dubbese’ (Antonini 2008; Leszczyńska and Szarkowska 2018). Suspension of disbelief is also performed at the visual level, for it is the lack of perfect lip synchrony that makes viewers focus on characters’ eyes rather than mouths when watching dubbed films. Suspension of disbelief is not a directly observable concept, but it can be related to immersion (see Section 5). In this sense, it can be argued that the more viewers become immersed into the story world created by the film, the higher their suspension of disbelief.
To explore the visual dimension, Romero-Fresco (2020) relied on eye tracking to measure visual attention distribution on eyes and mouth in dialogue and non-dialogue scenes. To gauge immersion, he used the self-reported immersion test, ITC Sense of Presence Inventory (ITC-SOPI) (Lessiter et al. 2001). Comprehension was also tested. Finally, participants were asked to rank the perception of their eye movements in order to determine whether they were aware of how much time they had spent on eyes and mouths in close-ups. They did it using a 1-5 scale, where 1 stood for ‘no time spent’ and 5 for ‘all the time spent’. The results – recalculated by the author into percentages – showed that the participants were largely unaware of how much time they allocated to the mouth and the eyes; for instance, Spanish participants thought that when watching Casablanca, they had spent 67% on the mouths (3.35 on a 5-point scale), when in fact they only spent 5%. This prompted the author to conclude that the mouth avoidance strategy is unconscious.
For the scholar, the dubbing effect may be a result of the habituation of Spanish viewers to this AVT practice: the more exposed you are to dubbing from an early age, the more likely you are at a later stage to suspend your linguistic and visual disbelief, and become immersed in the film diegesis of a dubbed film.
4. Visual attention
When creating films, cineastes strive to influence viewers’ behaviour and visual attention by a skilful manipulation of mise en scène, editing and sound design (Bordwell and Thompson 2010). Empirical research into viewers’ behaviour and film cognition began only recently, thanks to new technological developments such as eye tracking (Smith 2013). Research on the reception of various AVT types can greatly benefit from eye tracking as this technology allows us to experimentally verify a number of assertions that have been made in the literature.
Although film viewing may seem like a passive activity, when watching films viewers are, in fact, busy processing the sequences of images and sounds, understanding the action, and construing the narrative. From previous research we know that viewer gaze behaviour shows certain commonalities (Smith 2013). In static images, viewers tend to focus on faces (Thomas et al. 2007) and, in particular, on the eyes (Birmingham et al. 2009a, 2009b), which can be attributed to visual saliency, social interest or information acquisition strategy (Vo et al. 2012). In dynamic scenes such as films, viewers’ gaze has been found to cluster, resulting in a phenomenon known as attentional synchrony (Mital et al. 2010). Attentional synchrony, or “the tendency for observers to be looking in the same place at the same time” (Foulsham and Sanderson 2013: 926), is greater when sound is present than during moments of silence (ibid.: 939). Figure 1 offers a visual representation of high and low attentional synchrony with eye-tracking data from our experiment:
[Figure 1: high and low attentional synchrony; image not reproduced]
Note: each coloured dot represents the gaze of one viewer.
A viewer’s gaze is motivated by two types of factors: endogenous, i.e. goal-driven or top-down, and exogenous, i.e. stimulus-driven or bottom-up (Smith and Mital 2013). The former result from the viewer’s internal motivations and the characteristics of the task (Birmingham et al. 2008); for example, a viewer’s gaze is influenced by the instructions they are given before watching and by what they are thinking about when looking at something (Yarbus 1967). The latter factors relate to the nature of the stimulus, the most influential ones being motion (including mouth movements) and changes in luminance (Smith 2013). It has been demonstrated that the deployment of visual attention depends on the task, i.e. what participants are asked to do, and not only on the nature of the stimulus (Buchan et al. 2007; Vo et al. 2012). For instance, if a person is trying to understand what is being said, they may focus not on the eyes but on the mouth instead. Buchan et al. (2007) found that participants looked more at the eyes in an emotion judgement task but, in a speech recognition task, they directed their gaze at the mouth. This points to the important role of endogenous factors in visual attention.
Do these findings also apply to AVT? The dubbing effect is believed to stem from potential incongruities between sound and image in dubbing (Romero-Fresco 2020). In other words, it is the nature of the stimulus – where the motion (i.e. lip movements) is incongruous with the sound – that determines the viewers’ gaze, pointing to the importance of exogenous factors in film watching. It needs to be noted that in the original study, there was high variability of data across English subjects, shown by high standard deviation values, which may be an indication of low attentional synchrony. While many viewers avoided looking at the mouth, others may have followed different gaze paths. This may point to the possibility that a viewer’s gaze is also highly influenced by endogenous factors, internal to individuals, and independent of the condition, i.e. unrelated to the quality of dubbing or characteristics of the audiovisual material.
5. Immersion
Immersion, often defined as a subjective sensation of “being there” (Kim and Biocca 1997), is a rather complex concept, encompassing manifold types of experiences. It was first discussed in research on video games and virtual environments and, in this paper, it is understood as “a subjective response to narrative contents” (Nilsson et al. 2016: 108), made up of different dimensions.
The way in which one experiences immersion depends on factors such as one’s personality traits and immersive tendency (Weibel et al. 2010). For Witmer and Singer (1998), immersive tendency is an “individual’s ability to become involved in mediated environment”, with some people more prone to becoming immersed than others. Weibel et al. (2010) believe that the big five personality traits – i.e. openness to experience, conscientiousness, extraversion, agreeableness and neuroticism – as well as one’s immersive tendency, play an important role in how different people experience immersion.
Immersion is normally measured through self-reports. In an attempt at designing a single, encompassing tool, a group of researchers developed the ITC Sense of Presence Inventory, known as the ITC-SOPI (Lessiter et al. 2001), which includes four factors: (1) sense of physical space, (2) engagement, (3) ecological validity and (4) negative effects. As ITC-SOPI was initially developed with virtual reality in mind, it includes the measurement of negative effects in such environments through questions like ‘I felt dizzy’, ‘I had eyestrain’ or ‘I felt nauseous’. The negative effects factor has been deemed irrelevant for our study and has therefore not been included. In addition to ITC-SOPI, other immersion questionnaires have been used, focussing on aspects such as transportation, perceived realism/perceptual quality, and identification with characters, as explained below.
5.1. Transportation
An important aspect of immersion is the feeling of being transported into the fictional world. Transportation may be defined as “the experience of cognitive, affective and imagery involvement in a narrative” (Green et al. 2004). When transported into the narrative depicted by a book or film, people tend to suspend their knowledge of the real world and engage in the presented story instead (Green 2004). Transportation means that for a brief moment we forget about our environment, surroundings, values and social role, instead focusing on the storyline that is being presented to us, which is a crucial aspect of immersion.
5.2. Perceived realism and perceptual quality
An important aspect affecting viewers’ immersion is perceived realism, i.e. the degree to which viewers identify the world depicted in a film as real (Cho et al. 2012). Lack of realism and internal consistency may negatively impact on viewers’ immersion (Hall 2003). Being a multidimensional concept in itself, perceived realism comprises aspects such as plausibility, narrative consistency and perceptual quality (Cho et al. 2012). Particularly important to our study is the concept of perceptual quality, which “refers to the degree to which the audio, visual, and other manufactured elements of a media narrative comprise a convincing and compelling portrayal of the reality” (Hall 2003). One might think that perceptual quality would be extremely low in cases of, for example, science-fiction movies or animated short stories; however, this is not the case. In Hall’s (ibid.) experiment, participants who watched Jurassic Park (Steven Spielberg 1993) still perceived dinosaurs as real, even though they had become extinct millions of years ago, because they felt real in the context of the film. Similarly to transportation, perceptual quality is not something that viewers judge from the perspective of their reality, but rather in the context of the world depicted to them. It is more about whether different elements of the narrative itself are internally consistent. In the context of AVT, it would be interesting to investigate whether diverse practices, such as subtitling, dubbing and voice-over, trigger different degrees of perceived realism and perceptual quality in viewers.
5.3. Identification with characters
When watching films, viewers often become absorbed in the fictional story world and identify with the characters. Identification is “a mechanism through which audience members experience reception and interpretation of the text from the inside, as if the events were happening to them” (Cohen 2001: 245). Even though it might sound similar to transportation, character identification is limited to particular characters depicted in a movie, whereas transportation “is a more general experience created by the narrative as a whole” (Tal-Or and Cohen 2010: 404).
An important part of the effects that media exert on audiences, identification is an elusive experience, which not only varies in intensity depending on the individual, but is also cyclic. According to Cohen (2001), when identifying with a character, viewers suspend, for a brief moment, their own thoughts, identity and social role and instead adopt the perspective of the protagonist, which happens repeatedly throughout the movie. Generally, viewers who identify strongly with characters are more likely to be engaged in the narrative as a whole and to care more about how protagonists’ stories will be resolved (Tal-Or and Cohen 2010).
6. Overview of the current study
The goal of this eye-tracking study was to test whether an effect similar to the dubbing effect can be detected in viewers watching a Polish voiced-over film. Viewers’ gaze behaviour was monitored with the help of an eye tracker and their visual attention distribution on the eyes and mouths of the characters was measured. Their immersion, enjoyment and comprehension levels were also investigated. Similarly to Romero-Fresco (2020), participants were also asked to assess how much time they had spent looking at eyes and mouths. Additionally, we investigated the potential impact of endogenous factors on visual attention, such as participants’ immersive tendency and English proficiency.
We hypothesised that Polish viewers may avoid looking at characters’ mouths, particularly in the case of female actors, due to a potential cognitive dissonance when the translation is read out by a man. Such a dissonance may also lead to viewers’ low immersion in the fictional story. We also explored whether gaze distribution may have an effect on immersion, comprehension and enjoyment.
This research is divided into two experiments. Experiment 1 reports on the results of the eye-tracking study conducted on VO with Polish viewers, using the same 6-minute excerpt from Casablanca as Romero-Fresco (2016, 2020). We used a mixed study design with the area of the face (eyes/mouth) as an independent within-subject variable, and participants’ immersive tendency and English proficiency as factors. The dependent variables were the percentage of gaze distribution, immersion levels, comprehension and enjoyment. In Experiment 2, we compared our results with those obtained by Romero-Fresco (2016, 2020). To this end, as was done in the original study, we used one-way ANOVA analyses with gaze distribution on eyes and mouth, both in dialogue and non-dialogue sequences, as dependent variables, and participant group as the independent variable.
To the best of our knowledge, no work on the dubbing effect in voiced-over films, and especially Polish VO, has been done before. Our experiment may contribute to enhancing our understanding of the cognitive processing involved in the perception of AVT in general and VO in particular.
7. Experiment 1
7.1. Method
The experiment was carried out at the University of Warsaw with full approval of the Research Ethics Committee. Data was recorded with an SMI eye tracker with a sampling rate of 250 Hz and analysed in SMI BeGaze and SPSS Statistics 24.
7.2. Participants
A total of 35 participants took part in the study (24 women and 11 men). Their mean age was 23.26 (SD = 3.28), ranging from 20 to 36. Based on the immersive tendency questionnaire (Weibel et al. 2010), participants were assigned to either a low- or a high-immersive tendency group, using the median split method (Table 1):
| Gender | Low immersive tendency | High immersive tendency | Total |
|---|---|---|---|
| Male | 6 | 5 | 11 |
| Female | 13 | 11 | 24 |
| Total | 19 | 16 | 35 |
Table 1. Participants by gender and immersion tendency
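The median split mentioned above can be illustrated with a minimal sketch. The scores below are hypothetical, and the rule for participants whose score equals the median (here they go to the low group) is an assumption, since the tie-handling rule is not stated in this paper.

```python
from statistics import median

def median_split(scores, ties_to="low"):
    """Label each score 'low' or 'high' relative to the sample median.

    ties_to decides where scores equal to the median go; this rule is an
    assumption made for illustration only.
    """
    cut = median(scores)
    labels = []
    for score in scores:
        if score < cut:
            labels.append("low")
        elif score > cut:
            labels.append("high")
        else:
            labels.append(ties_to)
    return labels

# Hypothetical immersive tendency totals, not data from the study.
scores = [38, 52, 45, 61, 45, 70, 33]
groups = median_split(scores)
print(groups)
```

With tied scores the two groups need not be of equal size, which is one way an uneven split can arise.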
Participants were asked to state their English proficiency using the CEFR sheet (Flis et al. 2019), and no-one declared a level lower than B1 (Table 2):
| English proficiency | Number of people |
|---|---|
| Intermediate (B1 & B2) | 5 |
| Advanced (C1) | 15 |
| Proficiency (C2) | 15 |
| Total | 35 |
Table 2. Participants’ proficiency in English
In general, our sample consisted of young adults whose proficiency in English was relatively high, which may be important as they could understand the original English audio in the background of the Polish voiced-over version.
7.3. Materials
The material used in our experiment was the same 6-minute excerpt of Casablanca used previously by Romero-Fresco (2020). Similarly to the original study, only the last two minutes – containing close-up shots – were analysed.
7.4. Procedure
Before starting the experiment, all participants signed an informed consent form and provided their basic demographic data, such as age, gender and level of English proficiency. Testing was done individually. Participants were seated 60 to 70 cm away from the computer, which was connected to an SMI eye tracker, and were given instructions on how to complete the study. We calibrated the eye tracker and only accepted results below 0.5.
The experiment began with a training session during which participants watched a 1-minute clip from the beginning of Casablanca, different from the one used in the actual experiment, and answered several questions on immersion and comprehension similar to those that would appear later. In the actual experiment, participants watched the 6-minute clip and answered questions on immersion, enjoyment and perception as well as 12 comprehension questions, displayed on the monitor. Once they completed the study, we debriefed them and revealed the purpose of the experiment.
7.5. Design and variables
We drew areas of interest (AOIs) on characters’ eyes and mouths, separately for dialogue and non-dialogue scenes (Fig. 2). We had no access to the AOIs used in the original study:
[Fig. 2: areas of interest on characters’ eyes and mouths; image not reproduced]
The following dependent variables were used: gaze distribution, immersion, enjoyment and comprehension.
Gaze distribution on eyes and mouth was calculated as a percentage of dwell time, i.e. the sum of all fixations and saccades in an area of interest, starting from the first fixation, relative to the AOI visible time. For instance, participant P01 spent 22,430.43 ms in the AOI on eyes in dialogue scenes, which were displayed for 35,040 ms in total. Therefore, the gaze distribution on eyes in dialogue scenes for this participant was 64.01%, calculated as (22,430.43/35,040)*100.
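The calculation for participant P01 can be reproduced with a short sketch. The function name and its argument names are illustrative and do not come from SMI BeGaze.

```python
def dwell_percentage(dwell_ms, aoi_visible_ms):
    """Dwell time in an AOI as a percentage of the time the AOI was visible."""
    if aoi_visible_ms <= 0:
        raise ValueError("AOI visible time must be positive")
    return dwell_ms / aoi_visible_ms * 100

# Values reported in the text for participant P01, eyes AOI, dialogue scenes.
p01_eyes_dialogue = dwell_percentage(22430.43, 35040)
print(f"{p01_eyes_dialogue:.2f}%")  # 64.01%
```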
Gaze distribution was measured separately for scenes where characters were speaking (dialogue) and for those where characters were visible on screen but were not speaking (non-dialogue). Unlike in the original study, we took the total display time of all AOIs as 100%. Consequently, our percentages on eyes and mouth did not add up to 100%, as they did in the original study, because we also took into account other areas on the screen where people looked, including the nose, hat, hair, background, etc. Only in Experiment 2 did we recalculate our results to be able to compare them with the original study.
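One plausible way to make such percentages comparable with a study in which eyes and mouth sum to 100% is to rescale the two values by their sum. The sketch below shows only this assumed procedure with hypothetical input values; the exact recalculation used in Experiment 2 is not specified in this section.

```python
def renormalise_eyes_mouth(eyes_pct, mouth_pct):
    """Rescale eyes and mouth percentages so that together they sum to 100.

    Assumed procedure for illustration; it discards gaze on all other areas.
    """
    total = eyes_pct + mouth_pct
    if total <= 0:
        raise ValueError("eyes and mouth percentages must sum to a positive value")
    return eyes_pct / total * 100, mouth_pct / total * 100

# Hypothetical values: 40% on eyes, 10% on mouth, 50% elsewhere on screen.
eyes_share, mouth_share = renormalise_eyes_mouth(40.0, 10.0)
print(round(eyes_share, 1), round(mouth_share, 1))
```

Rescaling group means and rescaling each participant’s values before averaging can give different results, so the level at which the step is applied matters.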
Immersion. First, we measured immersive tendency based on the works by Weibel et al. (2010) and Witmer and Singer (1998). Then, we used 9 items from the ITC-SOPI questionnaire to test sense of physical space, engagement and ecological validity (but not negative effects), on a 1-5 Likert scale (1 = strongly disagree, 5 = strongly agree). Unlike in the original study, we also tested transportation, perceptual quality and identification with characters. Transportation was measured with an 8-item questionnaire, adapted from Sestir and Green (2010), while perceptual quality was tested using a 5-item questionnaire based on Cho et al. (2012). Character identification was measured with a 4-item questionnaire taken from Tal-Or and Cohen (2010). For the non-ITC-SOPI questions, participants had to state their degree of agreement with the statements using a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree). All questions were in Polish.
Enjoyment was measured using the first item from the Intrinsic Motivation Inventory (Ryan 1982), which was modified to reflect the nature of the task. Participants were asked to respond to the statement ‘I enjoyed watching the film’ on a 1-7 scale, where 1 meant ‘not at all’ and 7 ‘very much’.
Comprehension was calculated as a percentage of correct answers to a set of 12 multiple choice questions related to the content of the clip.
All the materials used in the study, including all the questionnaires, are available in an open access repository (Flis et al. 2019).
7.6. Results
Similarly to the original experiment, all data reported here come from the final scene of the film, featuring close-ups of the main characters: Rick, played by Humphrey Bogart, and Ilsa, played by Ingrid Bergman. We begin by discussing the gaze distribution on the face in dialogue and non-dialogue scenes, followed by immersion, enjoyment and comprehension.
7.6.1. Gaze distribution on eyes vs mouth in dialogue and silent scenes
In order to investigate how much time viewers spent looking at characters’ eyes and mouths in dialogue scenes (i.e. moments when the actors were speaking) and non-dialogue scenes (when the actors were visible on screen but did not speak), we conducted a 2 x 2 repeated-measures two-way ANOVA with area of the face (eyes vs mouth) and dialogue (dialogue vs non-dialogue) as within-subject independent variables. The dependent variable was the percentage of gaze distribution. Table 3 shows descriptive statistics for this analysis:
| Gaze distribution | Eyes: Mean | Eyes: SD | Mouth: Mean | Mouth: SD |
|---|---|---|---|---|
| Dialogue scenes | 32.88 | 24.86 | 16.71 | 14.11 |
| Non-dialogue scenes | 43.41 | 28.08 | 10.27 | 8.68 |
Table 3. Percentage of time spent on eyes and mouth
Following Romero-Fresco (2020) and drawing on the well-known eye bias (Birmingham and Kingstone 2009), we predicted that participants would spend more time looking at characters’ eyes than their mouths. Indeed, we found a main effect of gaze distribution on different parts of the face, F(1, 30) = 16.111, p < .001, partial η² = .354, showing that people spent twice as much time on the eyes as they did on the mouth (see Fig. 3). It must be noted, however, that there was a very high standard deviation, showing high variability of the data. There was no significant main effect of dialogue, F(1, 30) = 3.215, p = .083, partial η² = .097.
Fig. 3. Gaze distribution on eyes and mouth in scenes with and without dialogue
Importantly, we found a significant interaction between gaze distribution on different parts of the face (eyes/mouth) and the presence of dialogue (dialogue/non-dialogue), F(1, 30) = 51.053, p < .001, partial η² = .630. This means that when the actors were speaking, participants looked at the eyes twice as much as at the mouth but, when the characters were not speaking, participants looked at the eyes four times more than at the mouth.
Given the specificity of Polish VO, we were interested in exploring whether people looked differently at female and male characters in dialogue scenes only. We hypothesised that given the gender incongruity, viewers may look less at the mouths of female actors compared with the males’. We therefore compared the percentage of gaze distribution on Ilsa’s mouth with that on Rick’s mouth. Indeed, a statistically significant main effect of actors’ gender was found on gaze distribution on the mouth in dialogue scenes, F(1, 17) = 4.516, p = .049, partial η² = .21. Contrary to our predictions, however, viewers looked more at Ilsa’s mouth (M = 24.94, SD = 12.36) than Rick’s (M = 17.94, SD = 13.34).
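For readers who wish to relate the reported F values to the effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the interaction and the actor gender effect reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared derived from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)

# Interaction between face area and dialogue, F(1, 30) = 51.053.
interaction = partial_eta_squared(51.053, 1, 30)
# Effect of actors' gender on mouth gaze, F(1, 17) = 4.516.
gender = partial_eta_squared(4.516, 1, 17)

print(round(interaction, 3), round(gender, 2))
```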
We were also interested in finding out whether gaze distribution was in any way related to the participant’s immersive tendency and English proficiency, but neither of these factors was found to be significant.
7.6.2. Immersion
Given our hypothesis that viewers’ gaze behaviour may be related to immersion, we measured a number of immersion dimensions, as illustrated in Table 4:
| Immersion dimension | Mean | SD |
|---|---|---|
| ITC-SOPI Sense of physical space | 3.00 | .88 |
| ITC-SOPI Engagement | 3.48 | .93 |
| ITC-SOPI Ecological validity | 3.57 | 1.01 |
| Transportation | 4.33 | 1.04 |
| Perceptual quality | 4.89 | 1.42 |
| Identification with Rick | 3.62 | 1.35 |
| Identification with Ilsa | 3.89 | 1.36 |
Note: 1-7 scale, apart from ITC-SOPI, which uses 1-5 scale
Table 4. Descriptive results of various immersion dimensions
All immersion dimensions were moderately high, with the highest scores being obtained in the ITC-SOPI ecological validity and the perceptual quality dimensions.
To explore the differences in immersion depending on participants’ English proficiency, we conducted ANOVA analyses with various immersion parameters as dependent variables and English proficiency as a factor. We thought this may be important because in VO the original voices of the actors are still audible and thus the ability to understand the original language may be an important factor in how a film is perceived. In each immersion dimension, participants with the highest English proficiency reported higher immersion levels (Table 5). However, the differences did not reach statistical significance, merely showing some trends, possibly due to the small number of participants in the lowest proficiency group.
| Immersion dimension | Intermediate M | Intermediate SD | Advanced M | Advanced SD | Proficient M | Proficient SD | df | F | p | η2p |
|---|---|---|---|---|---|---|---|---|---|---|
| ITC-SOPI Sense of physical space | 2.53 | .50 | 2.75 | .95 | 3.40 | .77 | 2, 32 | 3.145 | .057 | .164 |
| ITC-SOPI Engagement | 3.06 | .54 | 3.33 | 1.11 | 3.77 | .77 | 2, 32 | 1.482 | .242 | .085 |
| ITC-SOPI Ecological validity | 2.86 | .69 | 3.53 | 1.07 | 3.84 | .96 | 2, 32 | 1.865 | .171 | .104 |
| Transportation | 3.95 | .50 | 4.07 | 1.16 | 4.71 | .96 | 2, 32 | 1.859 | .172 | .104 |
| Perceptual quality | 3.65 | 1.39 | 4.95 | 1.48 | 5.26 | 1.20 | 2, 32 | 2.660 | .085 | .143 |
| Identification with Rick | 3.10 | .80 | 3.71 | 1.56 | 3.70 | 1.31 | 2, 32 | .417 | .663 | .025 |
| Identification with Ilsa | 3.65 | 1.52 | 3.88 | 1.33 | 3.98 | 1.36 | 2, 32 | .107 | .899 | .007 |
Table 5. Immersion results by proficiency in English
Using an analogous ANOVA test, we also explored the differences in immersion by immersive tendency, as shown in Table 6:
| Immersion dimension | Low immersion M | Low immersion SD | High immersion M | High immersion SD | df | F | p | η2p |
|---|---|---|---|---|---|---|---|---|
| ITC-SOPI Sense of physical space | 2.52 | .69 | 3.56 | .75 | 1, 33 | 17.759 | .000* | .350 |
| ITC-SOPI Engagement | 3.08 | .80 | 3.95 | .86 | 1, 33 | 9.520 | .004* | .224 |
| ITC-SOPI Ecological validity | 3.43 | .98 | 3.73 | 1.05 | 1, 33 | .716 | .404 | .021 |
| Transportation | 3.79 | .92 | 5.00 | .91 | 1, 33 | 12.994 | .001* | .309 |
| Perceptual quality | 5.04 | 1.21 | 5.09 | 1.60 | 1, 33 | .010 | .921 | .000 |
| Identification with Rick | 3.30 | 1.37 | 4.21 | 1.28 | 1, 33 | 3.460 | .073 | .107 |
| Identification with Ilsa | 3.54 | 1.33 | 4.59 | 1.35 | 1, 33 | 4.672 | .039* | .139 |
Table 6. Immersion results by immersive tendency
In line with our expectations, people who are more prone to immersion generally declared higher immersion compared to those who are not easily immersed. However, immersive tendency did not affect ecological validity or perceptual quality, which were high for both groups.
Finally, we wanted to ascertain whether the time spent on the characters’ eyes and mouths was related to participants’ immersion levels. Following Romero-Fresco (2020), we predicted that participants who looked more at the mouth would report lower immersion. Using Pearson’s correlation, we correlated the gaze distribution on eyes and mouth with the participants’ immersion levels. We found a moderate negative correlation between time spent on the mouth and perceptual quality, r = -.442, p = .011, and ITC-SOPI ecological validity, r = -.401, p = .023. The more time people spent looking at the characters’ mouths, the lower they scored on these two immersion dimensions. On the other hand, there was no significant correlation between the time spent on the eyes and any immersion indicator.
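The correlation analysis described above can be illustrated with a minimal sketch of Pearson’s r. The two lists hold hypothetical values invented for illustration and are not the study’s data; the function and variable names are likewise assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: percentage of time on the mouth and a 1-7 immersion rating.
mouth_time = [5.0, 12.0, 18.0, 25.0, 31.0]
perceptual_quality = [6.0, 5.5, 5.0, 4.0, 3.5]

r = pearson_r(mouth_time, perceptual_quality)
print(f"r = {r:.3f}")
```

A negative r, as in the reported results, indicates that higher values of one variable tend to go with lower values of the other.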
7.6.3. Enjoyment
We know from previous research that immersion tends to be related to enjoyment (Green 2004) and, to verify this phenomenon in the context of AVT, we also tested enjoyment. The results show that the overall enjoyment level was quite high, M = 5.05, SD = 1.43, ranging from 2 to 7.
To explore potential differences across participants, based on their English proficiency, we conducted an ANOVA test with proficiency as a factor. We found a tendency showing that participants with higher English proficiency reported slightly higher enjoyment levels than those with lower proficiency (Table 7). However, the difference did not reach significance, F(2, 32) = 2.678, p = .084, partial eta2 = .143.
| Proficiency in English | M | SD |
|---|---|---|
| Intermediate | 4.40 | 1.14 |
| Advanced | 4.66 | 1.58 |
| Proficient | 5.66 | 1.17 |
| Total | 5.05 | 1.43 |
Table 7. Enjoyment results by proficiency in English
We also explored whether participants from the high immersive tendency group reported higher enjoyment than those from the low immersive tendency group. Indeed, we found a main effect of immersive tendency on enjoyment, F(1, 33) = 5.194, p = .029, partial eta2 = .136. People who are more prone to become immersed reported higher levels of enjoyment (M = 5.62, SD = 1.40) than those who are not easily immersed (M = 4.57, SD = 1.30). This mirrors the pattern found for immersion.
Enjoyment correlated positively with ITC-SOPI sense of physical space, r = .471, p = .004; ITC-SOPI engagement, r = .611, p < .001; transportation, r = .531, p = .001; and identification with Rick’s character, r = .367, p = .030.
7.6.4. Comprehension
The overall comprehension, calculated as a percentage of correct answers, was 76.42% (SD = 15.97), ranging from 41.67% to 100%. Comprehension was not affected by participants’ proficiency in English, F(2, 32) = .354, p = .705, partial eta2 = .022, or by their immersive tendency, F(1, 33) = .230, p = .634, partial eta2 = .007. There were no significant correlations with comprehension.
8. Experiment 2
In this experiment, to allow more direct comparison with the original study by Romero-Fresco (2020), we recalculated the results presented in Experiment 1 so that the gaze distribution on eyes and mouth would add up to 100%, disregarding gaze on any other areas of the screen.
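The recalculation described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the rescaling is applied to each participant separately before averaging, which the text does not specify, and the participant values are hypothetical.

```python
def renormalise(eyes_pct: float, mouth_pct: float) -> tuple[float, float]:
    """Rescale eyes and mouth percentages so that together they sum to 100.

    Gaze on any other screen area is disregarded, as described in the text.
    """
    total = eyes_pct + mouth_pct
    if total <= 0:
        raise ValueError("participant has no gaze on either eyes or mouth")
    return 100 * eyes_pct / total, 100 * mouth_pct / total


# Hypothetical participants: (% of time on eyes, % of time on mouth).
participants = [(40.0, 10.0), (20.0, 20.0), (30.0, 5.0)]

# Assumption: rescale per participant first, then average across participants.
shares = [renormalise(eyes, mouth) for eyes, mouth in participants]
mean_eyes = sum(eyes for eyes, _ in shares) / len(shares)
mean_mouth = sum(mouth for _, mouth in shares) / len(shares)

print(f"eyes {mean_eyes:.2f}%, mouth {mean_mouth:.2f}%")
```

Note that averaging per-participant shares generally gives a different result from rescaling the group means directly, so the order of operations matters when comparing figures across studies.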
Following the original experiment, we also conducted one-way ANOVA analyses for the dependent variables of gaze distribution on eyes with dialogue, mouth with dialogue, eyes with no dialogue, and mouth with no dialogue (Table 8). The independent variable was participant group (Polish, English and Spanish). The data for the English and Spanish participants come from the original experiment (Romero-Fresco 2016, 2020):
| Dependent variable | Polish M | Polish SD | English M | English SD | Spanish M | Spanish SD | df | F | p |
|---|---|---|---|---|---|---|---|---|---|
| Eyes with dialogue | 61.42 | 34.97 | 76.18 | 20.07 | 95.00 | 3.54 | 2, 67 | 10.183 | .000 |
| Mouth with dialogue | 38.57 | 34.97 | 23.81 | 20.07 | 4.99 | 3.54 | 2, 67 | 10.183 | .000 |
| Eyes with no dialogue | 76.24 | 26.03 | 82.65 | 14.63 | 85.84 | 6.72 | 2, 67 | 1.585 | .212 |
| Mouth with no dialogue | 23.75 | 26.03 | 17.34 | 14.63 | 14.15 | 20.20 | 2, 67 | 1.585 | .212 |
* Spanish and English data come from the experiment conducted by Romero-Fresco (2016)
Table 8. ANOVA results for gaze distribution on eyes and mouth by participant group
When it comes to the time spent on eyes in scenes with dialogue, we found a significant main effect of group. Spanish participants spent significantly more time on eyes compared to Polish (p < .001, 95% CI [18.65, 48.50]) and English people (p = .008, 95% CI [4.78, 32.84]). The difference between the Polish and English participants was not significant, p = .186, 95% CI [-34.25, 4.72]. Similarly, there was a significant effect in the time spent on mouth with dialogue. Spanish participants spent significantly less time on mouth compared to Polish (p < .001, 95% CI [-48.50, -18.65]) and English people (p = .008, 95% CI [-32.84, -4.78]). The difference between the Polish and English participants was not significant, p = .186, 95% CI [-4.72, 34.25].
As regards the time spent on eyes and mouth in scenes with no dialogue, there were no significant differences between the groups (Fig. 4), showing that dialogue plays an important role in gaze distribution:
Figure 4. Gaze distribution on eyes and mouth in dialogue and non-dialogue scenes between groups
When it comes to the declarative time spent on eyes and mouth in dialogue scenes, in both the original and our experiment, participants were asked to state on a scale from 1 to 5 the amount of time they thought they had spent looking at the actors’ eyes and mouths. In order to compare the results with the actual time expressed as percentages, as was done in the original study, we recalculated the answers from the 1-5 scale to percentages (i.e. 5–100%, 4–80%, 3–60%, 2–40%, 1–20%).
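The conversion from the 1-5 scale to percentages amounts to multiplying each rating by 20. A minimal sketch follows; the function name is an illustrative assumption and not part of either study.

```python
def rating_to_percent(rating: int) -> int:
    """Map a 1-5 self-report rating to a percentage (1 -> 20%, ..., 5 -> 100%)."""
    if rating not in range(1, 6):
        raise ValueError("rating must be an integer from 1 to 5")
    return rating * 20


# Full mapping as given in the text.
mapping = {rating: rating_to_percent(rating) for rating in range(1, 6)}
print(mapping)
```

Because the lowest rating maps to 20% rather than 0%, declared percentages from this conversion can never fall below 20%, which is worth bearing in mind when comparing them with the actual gaze percentages.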
All groups of participants thought they had spent more time looking at the eyes than at the mouth, which was consistent with the actual time spent on those areas (Fig. 5):
Figure 5. Actual vs declarative time spent on eyes and mouth in dialogues
What differed, however, was participants’ perception of how much time they had actually spent on each area. The largest discrepancy between the declared and the actual time was found for the Spanish participants looking at the mouth, which may indicate that the dubbing effect is largely unconscious.
It needs to be noted that asking people to report on a 1-5 scale the time they think they spent on eyes and mouth is problematic for a number of reasons, including the fact that while watching they were unaware of the nature of the experiment and were not focussed on their gaze behaviour and its distribution. Some participants declared they had spent 100% of the time (‘5’ on the scale) looking at characters’ eyes, and then in the next question declared spending some time on the mouth too.
Finally, when it comes to immersion, in the original study, an 11-item ITC-SOPI questionnaire was used, including 10 items on sense of physical space, engagement and ecological validity, as well as 1 item on negative effects, which was not used in the replication study. Romero-Fresco (2020) reported that the grand average of all immersion indicators for all participants amounted to 3.6 for English participants and 3.75 for the Spanish (on a 1-5 scale). The Polish average for the three ITC-SOPI indicators is 3.35, showing that immersion for all groups of participants was relatively similar.
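The Polish average quoted above can be checked against the three ITC-SOPI means reported in Table 4. The sketch below assumes an unweighted mean of the three dimension means; the variable names are illustrative.

```python
from statistics import mean

# ITC-SOPI dimension means for the Polish participants, taken from Table 4.
itc_sopi_means = {
    "Sense of physical space": 3.00,
    "Engagement": 3.48,
    "Ecological validity": 3.57,
}

# Assumption: the grand average is the unweighted mean of the three dimensions.
polish_average = mean(list(itc_sopi_means.values()))
print(round(polish_average, 2))
```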
9. Discussion
By conducting this study, we were interested in gauging whether Polish viewers avoid looking at actors’ mouths when watching a voiced-over film in the same manner as the Spanish viewers did when watching a dubbed film in the original study. Given the gender incongruity in VO, we also wanted to know whether viewers look less at female characters’ mouths. We also sought to explore the role of endogenous and exogenous factors in gaze behaviour in film viewing. And lastly, we wondered if gaze distribution is indeed related to immersion. In what follows, we discuss the results of our study and compare them to those obtained in the original study. We begin with visual attention, followed by immersion and enjoyment. Finally, we discuss various issues related to replication.
9.1. Visual attention
In answer to our main research question, we found that when watching the voiced-over fragment of Casablanca, Polish viewers did not avoid looking at the characters’ mouths. Our participants spent – proportionally – about 60% of the time looking at the eyes and about 40% at the mouth in scenes with dialogue, while for the English this proportion was about 75% and 25% and for the Spanish 95% and 5%. This means that we did not find what could be potentially called ‘the voice-over effect’.
Given that Polish participants allocated more visual attention to the eyes than to the mouth, our results generally confirm the previously found eye bias (Birmingham et al. 2009b; Birmingham and Kingstone 2009). However, even though the participants generally focussed more on the eyes than on the mouth, they still spent more time on the mouth than either of the other two groups. Interestingly, the percentage gaze distribution of Polish viewers was closer to that of the English viewers watching the original clip than to that of the Spanish group watching the dubbed version. Statistically, there were no differences in gaze distribution between Polish and English viewers: both groups spent more time looking at the eyes in scenes with no dialogue than in dialogue scenes and, analogously, more time on the mouth in dialogue scenes than in scenes where the characters remained silent. For the Spanish viewers, the trend was reversed. Such results make us wonder whether voice-over may in fact provide an experience more similar to the one we may have while watching a film originally recorded in our native language, an aspect that could be investigated in further studies.
Apart from the similarities in visual attention patterns of English viewers watching the original and Polish viewers watching the voiced-over version, we also observed differences between the Polish VO and the Spanish dubbing, which may be attributed to dubbing and voice-over being two distinct types of translation: covert and overt, respectively. A covert translation “enjoys the status of an original text in the receiving lingua-culture” (House 2010: 246) and is not perceived as a translation per se. As it functions in the target culture “without co-activating the original’s discourse world” (ibid.), it is therefore more ‘deceptive’ and viewers may be largely unaware of its secondary nature. On the other hand, overt translation does not pretend to be ‘a second original’. On the contrary, in an overt translation, the original is “shining through” (ibid.: 245). In the case of AVT, it can be argued that lip-synching leads viewers to believe that the actors speak Spanish, thus maintaining the illusion that they are watching the original. In contrast, given the presence of the original soundtrack, VO is immediately recognisable as a translation and Polish viewers do not need to avoid looking at the actors’ mouths to maintain the fictional illusion and be immersed in the film.
Our results may also be taken to mean that the eyes and the mouth play different roles in face perception. As shown by previous studies, people focus more on the mouth in a speech perception task and more on the eyes in an emotion judgement task (Buchan et al. 2007). We acknowledge that the highly emotional and romantic nature of the scene from Casablanca – arguably “one of the most recognizable scenes in American film” (Jackson 2000: 34) – may have had an impact on viewers’ gaze distribution by triggering more visual attention to the characters’ eyes than would be the case in a different, less emotional scene.
Our findings also indicate that gaze behaviour is dependent on dialogue. In our study, the nature of the scene determined gaze distribution on the characters’ eyes or mouths: viewers looked more at the eyes in scenes with no dialogue compared to scenes with dialogue, but focused more on the mouth in dialogue scenes. This is in line with the results obtained by Vo et al. (2012), who also observed the important role played by dialogue. In their study, the participants supported their speech perception in dialogue scenes by looking more at the mouth. The focus on the characters’ mouths when they speak shows that gaze control is, at least partially, function-oriented. When viewers are able to infer meaningful information from the mouth, they tend to look there to support their speech perception.
We believe that Polish viewers, particularly those with a high level of English proficiency, may have benefitted from looking at the actors’ mouths to support their speech perception. In contrast, Spanish viewers watching the dubbed version could not infer any meaningful information from the mouth, so they focused largely on the eyes. We also know from previous research that watching a speaker’s lip movements can “dramatically enhance our ability to comprehend words, especially in noisy environments” (Ma et al. 2009: 1). In the presence of noise, where speech is less intelligible, the significance of visual speech information increases. If we consider VO as a sort of ‘noise’, making the perception of the original more difficult by the co-presence of the VO translation, then it may explain why Polish viewers focussed so much on the mouth compared to the other two groups.
One of our initial hypotheses was that owing to the gender incongruity, Polish VO viewers would spend less time gazing at the mouth of the main female character, Ilsa. However, contrary to this hypothesis, they spent more time looking at Ilsa’s mouth than Rick’s. Coincidentally, the immersion levels reported by our participants were higher in terms of their identification with Ilsa than with Rick. We cannot be sure if this effect is related to the actors’ gender, as we only tested one scene in one film with one couple. The result could be attributed to this particular actor, as Ingrid Bergman, who played Ilsa, is “a film star who has been put upon a pedestal as an icon of beauty. [...] Her movie audience would be captivated by the natural and untouched beauty that she projected on screen” (Sabine 2015: 64-66). As noted by Gelley (2008: 28), Bergman “embodied the contradictory qualities of, on the one hand, voluptuousness, assertiveness, and sexuality, and on the other, spirituality, passivity [...], and ‘niceness’”. In Bergman’s film career, she was often shown with “an intensive focus on the face in close-up” (ibid.: 34). Indeed, when directing films starring Bergman, Alfred Hitchcock increased the use of close-ups “to concentrate expression in the micromovements” of Bergman’s face (ibid.: 33). In the scene used in the study, Bergman is also framed in a close-up, placing her face and full mouth in a particularly prominent position, which may explain the larger focus on Ilsa’s face and mouth than on Rick’s.
Finally, it needs to be noted that in our study and in the original one very high standard deviation values in eye-tracking data could be found, particularly in the case of Polish and English viewers. High standard deviation means high data variability, which, in the case of our experiment, may be taken to mean that there were substantial differences in how viewers attended to the images. In other words, viewers’ gaze did not always converge, suggesting that there was low attentional synchrony. While we are not sure how to interpret this result, we believe it may point to the importance of endogenous gaze control, i.e. factors related to the participants and the viewing task rather than the sole characteristics of the stimulus, such as asynchrony between lip movements and utterances.
9.2. Immersion
The dubbing effect draws on the assumption that, to suspend their disbelief and enhance their sense of immersion into the fictional world, dubbing viewers have developed an unconscious strategy of not gazing at the actors’ mouths (Romero-Fresco 2020). Such a hypothesis can be tested by conducting experimental research with viewers and the help of eye tracking and immersion questionnaires, to verify if gazing at the mouth and the eyes is indeed related to immersion. However, it needs to be stressed that both the original and the replication study used relatively short clips, which is an important limitation in immersion research.
Similarly to the original study, where viewers declared a medium-high level of immersion, Polish viewers also reported being relatively highly immersed in the film. However, when correlating eye-tracking data with immersion levels, we found no evidence that gaze distribution on the eyes was related to the viewers’ immersion levels. The only significant result related to gaze distribution and immersion levels was a moderate negative correlation between the time spent on the mouth and two immersion dimensions: perceptual quality and ecological validity. This means that the more time viewers spent on the mouth, the lower perceptual quality and ecological validity they reported.
The same two immersion dimensions received the highest scores out of all immersion indicators tested in our study. Perceptual quality and ecological validity were high in both low- and high-immersive tendency groups. Perceptual quality – an important aspect of perceived realism defined as “the degree to which the audio, visual, and other manufactured elements of a media narrative comprise a convincing and compelling portrayal of the reality” (Cho et al. 2012: 832) – may have been high partially due to the fact that the film was black and white. It is possible that the VO version of Casablanca offered a sufficiently convincing portrayal of reality during WWII, making participants perceive the world it featured as real, believable and internally consistent. It cannot be ruled out that the nature of voice-over, which enables viewers to hear the original actors’ voices, also contributed to the high score on this immersion dimension.
In line with previous research (Green et al. 2004), our study also confirms that enjoyment is positively related to immersion. Participants with higher proficiency in English reported higher enjoyment than those who were less proficient, suggesting that the former group may have relied more on the English original and the latter on the voiced-over translation. This opens up an interesting question as to whether various AVT types trigger different enjoyment levels, be it between one another or in relation to the original.
9.3. Replication
Although it is generally acknowledged that replication is important for the validity of scientific claims (Coyne et al. 2016), it has received very little attention, particularly in AVT, where replication studies are extremely rare. It has been suggested that the reasons for the scarcity of replication in modern science include the negative perception of replication as research that is unoriginal and lacking in novelty; the unfavourable attitude of some editors and the consequent difficulty in publishing such studies; the potential hostility towards the original researchers and the fact that replications may be associated with controversy (Koole and Lakens 2012; Nosek et al. 2012; Coyne et al. 2016).
When discussing replication, it is common to think about direct replication, even though direct replication is very often impossible given the difficulties involved in exactly duplicating all the variables of the original study (Coyne et al. 2016). For this reason, conceptual replication is more common. In our study, direct replication was not possible since dubbing is rare in Poland and the clip used in the original study has never been dubbed into Polish. Furthermore, as we were operating within the institutional confines of our university lab, we had to work with a different eye tracker (SMI) from that used in the original study (Tobii).
Unlike direct replications, however, conceptual replications do not assess the validity of previous studies and “cannot disconfirm an original set of findings” (Nosek et al. 2012: 609). Given the departures from the original study, conceptual replications “do not constitute an unequivocal test of the validity of prior findings” (Coyne et al. 2016: 245) and can be used “only to confirm […] the original result, not to disconfirm it” (Nosek et al. 2012: 619). Therefore, the fact that the dubbing effect has not been found in the Polish context does not necessarily disconfirm its existence in a typical dubbing country such as Spain. Last but not least, as stated by Earp and Trafimow (2015: 9), “even carefully-designed replications, carried out in good faith by expert investigators, will never be conclusive on their own.” What is needed is a series of replications, conducted independently of one another by different research teams and labs.
Replicating a study may be in some ways more challenging than conducting an original study from scratch. The replication team needs to make sure that they follow exactly the same protocol as the original team did. Yet, current reporting practices are sometimes insufficient for the replicating team to be able to follow the experimental protocol to the letter. This relates to, for instance, using identical areas of interest, identical pre-processing of eye-tracking data in terms of minimum and maximum fixation duration as cut-off points, or using exactly the same eye-tracking measures, such as fixation time or dwell time. Furthermore, the differences between the original and replication study can make them difficult to compare. For instance, unlike in the original study, in our Experiment 1, the percentages in our data do not add up to 100% (Table 3). This is because viewers do not concentrate solely on actors’ eyes and mouths. From previous research we know that some viewers tend to look at actors’ noses, especially when the character is moving, using them as a kind of spatial anchor (Vo et al. 2012). It is also only natural that other elements of the depicted scenes, such as objects or background elements, receive some visual attention. Even in the case of human faces, eyes and mouth do not hold 100% of one’s gaze. However, in order to be able to compare our results with those of the original study, we had to convert our data to reflect the original study. The results are reported here as Experiment 2 (Table 8).
Finally, we believe that current research and publication practices can be improved thanks to open science, i.e. sharing research data in open repositories, along with experimental protocols and supplementary materials. This would greatly facilitate the replication and reproducibility of research findings. In the spirit of promoting research transparency, the data from our experiment are available in the Repository of Open Data hosted by the University of Warsaw (Flis et al. 2019).
10. Conclusion
By conducting this conceptual replication study, we examined whether the dubbing effect can also be observed in Polish voice-over. We found that unlike Spanish viewers in the original study, Polish viewers did not avoid looking at the characters’ mouths. Furthermore, we found no evidence that visual attention distribution is related to immersion. Yet, as discussed, the non-replication of the results does not, in any way, invalidate the results obtained in the original study itself.
Our study has shown that the visual attention distribution of Polish participants was similar to that of English people watching the film in the original, which suggests that for viewers accustomed to VO, watching a voiced-over film may be an experience comparable with watching the original, at least in terms of visual attention distribution. This may come as a surprise, since VO is often considered “the worst possible method [which can] in no sense maintain or do justice to the quality of the original version” (Dries 1995: 6). Our study may open new avenues to engage with the possible benefits of this AVT practice.
Further research is necessary to fully understand the impact on viewers of the potential incongruity between actors’ lip movements and the dubbed or voiced-over translation. We also need to disentangle the role of endogenous and exogenous factors in gaze control when watching translated films. Other issues to be addressed include the impact of various translation solutions on different types of immersion, the role of immersive tendency and personality traits in the processing of translated films as well as the need for further replication studies in order to deepen our understanding of how viewers process and engage with dubbed and voiced-over films.
Acknowledgements
We wish to thank Pablo Romero-Fresco for his assistance and continued support in conducting this study, for his great patience in answering a myriad of questions related to the original study and for sharing his results with us.
References
- Antonini, Rachele (2008). “The perception of dubbese: An Italian study.” Delia Chiaro, Christiane Heiss and Chiara Bucaria (eds). Between Text and Image: Updating Research in Screen Translation. Amsterdam: John Benjamins, 135-147.
- Birmingham, Elina and Alan Kingstone (2009). “Human social attention: A new look at past, present, and future investigations.” Annals of the New York Academy of Sciences 1156, 118-140.
- Birmingham, Elina, Walter F. Bischof and Alan Kingstone (2008). “Gaze selection in complex social scenes.” Visual Cognition 16(2-3), 341-355.
- Birmingham, Elina, Walter F. Bischof and Alan Kingstone (2009a). “Get real! Resolving the debate about equivalent social stimuli.” Visual Cognition 17(6-7), 904-924.
- Birmingham, Elina, Walter F. Bischof and Alan Kingstone (2009b). “Saliency does not account for fixations to eyes within social scenes.” Vision Research 49(24), 2992-3000.
- Bordwell, David and Kristin Thompson (2010). Film Art: An Introduction. 9th ed. New York: McGraw-Hill Higher Education.
- Buchan, Julie N., Martin Paré and Kevin G. Munhall (2007). “Spatial statistics of gaze fixations during dynamic face processing.” Social Neuroscience 2(1), 1-13.
- Chaume, Frederic (2014). Audiovisual Translation: Dubbing. London: Routledge.
- Cho, Hyunyi, Lijiang Shen and Kari Wilson (2012). “Perceived realism.” Communication Research 41(6), 828-851.
- Cohen, Jonathan (2001). “Defining identification: A theoretical look at the identification of audiences with media characters.” Mass Communication and Society 4(3), 245-264.
- Coyne, Michael D., Bryan G. Cook and William J. Therrien (2016). “Recommendations for replication research in special education: A framework of systematic, conceptual replications.” Remedial and Special Education 37(4), 244-253.
- Di Giovanni, Elena and Pablo Romero-Fresco (2019). “Are we all together across languages? An eye tracking study of original and dubbed films.” Irene Ranzato and Serenella Zanotti (eds). Reassessing Dubbing: Historical Approaches and Current Trends. Amsterdam: John Benjamins, 126-144.
- Dries, Josephine (1995). “Breaking Eastern European barriers.” Sequentia II(4), 6.
- Earp, Brian D. and David Trafimow (2015). “Replication, falsification, and the crisis of confidence in social psychology.” Frontiers in Psychology 6, 1-11.
- Field, Andy (2009). Discovering Statistics Using SPSS (and Sex, Drugs and Rock 'n' Roll). 3rd ed. London: SAGE.
- Flis, Gabriela, Adam Sikorski and Agnieszka Szarkowska (2019). Dubbing Effect in Voice-over. Repository of Open Data. http://dx.doi.org/10.18150/repod.2455071
- Foulsham, Tom and Lucy A. Sanderson (2013). “Look who's talking? Sound changes gaze behaviour in a dynamic social scene.” Visual Cognition 21(7), 922-944.
- Gelley, Ora (2008). “Ingrid Bergman's star persona and the alien space of ‘Stromboli’.” Cinema Journal 47(2), 26-51.
- Gottlieb, Henrik (1998). “Subtitling.” Mona Baker (ed.). Routledge Encyclopedia of Translation Studies. London: Routledge, 244-248.
- Green, Melanie C. (2004). “Transportation into narrative worlds: The role of prior knowledge and perceived realism.” Discourse Processes 38(2), 247-266.
- Green, Melanie C., Timothy C. Brock and Geoff F. Kaufman (2004). “Understanding media enjoyment: The role of transportation into narrative worlds.” Communication Theory 14(4), 311-327.
- Hall, Alice (2003). “Reading realism: Audiences' evaluations of the reality of media texts.” Journal of Communication 53(4), 624-641.
- House, Julianne (2010). “Overt and covert translation.” Yves Gambier and Luc van Doorslaer (eds). Handbook of Translation Studies. Vol. 1. Amsterdam: John Benjamins, 245-246.
- Ioannidis, John P. (2005). “Why most published research findings are false.” PLoS Medicine 2(8), e124.
- Jackson, Kathy M. (2000). “Playing it again and again: Casablanca's impact on American mass media and popular culture.” Journal of Popular Film and Television 27(4), 33-41.
- Kim, Taeyong and Frank Biocca (1997). “Telepresence via television: Two dimensions of telepresence may have different connections to memory and persuasion.” Journal of Computer-Mediated Communication 3(2). https://onlinelibrary.wiley.com/doi/full/10.1111/j.1083-6101.1997.tb00073.x (consulted 2.12.2019)
- Koole, Sander L. and Daniel Lakens (2012). “Rewarding replications: A sure and simple way to improve psychological science.” Perspectives on Psychological Science 7(6), 608-614.
- Lessiter, Jane, Jonathan Freeman, Edmund Keogh and Jules Davidoff (2001). “A cross-media presence questionnaire: The ITC-Sense of Presence Inventory.” Presence: Teleoperators & Virtual Environments 10(3), 282-297.
- Leszczyńska, Urszula and Agnieszka Szarkowska (2018). “‘I don't understand, but it makes me laugh.’ Domestication in contemporary Polish dubbing.” The Journal of Specialised Translation 30, 203-231.
- Ma, Wei J., Xiang Zhou, Lars A. Ross, John J. Foxe and Lucas C. Parra (2009). “Lip-reading aids word recognition most in moderate noise: A Bayesian explanation using high-dimensional feature space.” PLoS ONE 4(3), e4638.
- Mital, Parag K., Tim J. Smith, Robin L. Hill and John M. Henderson (2010). “Clustering of gaze during dynamic scene viewing is predicted by motion.” Cognitive Computation 3(1), 5-24.
- Nilsson, Niels C., Rolf Nordahl and Stefania Serafin (2016). “Immersion revisited: A review of existing definitions of immersion and their relation to different theories of presence.” Human Technology 12(2), 108-134.
- Nosek, Brian A., Jeffrey R. Spies and Matt Motyl (2012). “Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability.” Perspectives on Psychological Science 7(6), 615-631.
- Open Science Collaboration (2015). “Estimating the reproducibility of psychological science.” Science 349(6251), 943.
- Pashler, Harold and Christine R. Harris (2012). “Is the replicability crisis overblown? Three arguments examined.” Perspectives on Psychological Science 7(6), 531-536.
- Pashler, Harold and Eric-Jan Wagenmakers (2012). “Editors’ introduction to the special section on replicability in psychological science: A crisis of confidence?” Perspectives on Psychological Science 7(6), 528-530.
- Romero-Fresco, Pablo (2009). “Naturalness in the Spanish dubbing language: A case of not-so-close Friends.” Meta 54(1), 49-72.
- Romero-Fresco, Pablo (2016). “The dubbing effect: An eye-tracking study comparing the reception of original and dubbed films.” Paper presented at Linguistic and Cultural Representation in Audiovisual Translation (Sapienza Università di Roma and Università degli Studi di Roma Tre, 11-13 February).
- Romero-Fresco, Pablo (2020). “The dubbing effect: An eye-tracking study on how viewers make dubbing work.” The Journal of Specialised Translation 33, 17-40.
- Ryan, Richard (1982). “Control and information in the intrapersonal sphere: An extension of cognitive evaluation theory.” Journal of Personality & Social Psychology 43(3), 450-461.
- Sabine, Maureen (2015). “Ingrid Bergman—A modern Magdalene: ‘Saint to whore and back again’ in Casablanca, The Bells of St. Mary's and The Inn of the Sixth Happiness.” Theology & Sexuality 13(1), 63-78.
- Sepielak, Katarzyna and Anna Matamala (2014). “Synchrony in the voice-over of Polish fiction genres.” Babel 60(2), 145-163.
- Sestir, Marc and Melanie C. Green (2010). “You are who you watch: Identification and transportation effects on temporary self-concept.” Social Influence 5(4), 272-288.
- Smith, Tim J. (2013). “Watching you watch movies: Using eye tracking to inform cognitive film theory.” Arthur P. Shimamura (ed.). Psychocinematics: Exploring Cognition at the Movies. Oxford: Oxford University Press, 1-38.
- Smith, Tim J. and Parag K. Mital (2013). “Attentional synchrony and the influence of viewing task on gaze behavior in static and dynamic scenes.” Journal of Vision 13(8), 1-24.
- Stevens, Jeffrey R. (2017). “Replicability and reproducibility in comparative psychology.” Frontiers in Psychology 8, 862.
- Tal-Or, Nurit and Jonathan Cohen (2010). “Understanding audience involvement: Conceptualizing and manipulating identification and transportation.” Poetics 38(4), 402-418.
- Thomas, Laura A., Michael D. De Bellis, Reiko Graham and Kevin S. Labar (2007). “Development of emotional facial recognition in late childhood and adolescence.” Developmental Science 10(5), 547-558.
- Vo, Melissa L., Tim J. Smith, Parag K. Mital and John M. Henderson (2012). “Do the eyes really have it? Dynamic allocation of attention when viewing moving faces.” Journal of Vision 12(13), 1-14.
- Weibel, David, Bartholomaus Wissmath and Fred W. Mast (2010). “Immersion in mediated environments: The role of personality traits.” CyberPsychology, Behavior & Social Networking 13(3), 251-256.
- Witmer, Bob G. and Michael J. Singer (1998). “Measuring presence in virtual environments: A presence questionnaire.” Presence 7(3), 225-240.
- Yarbus, Alfred L. (1967). Eye Movements and Vision (tr. B. Haigh). New York: Plenum Press.
Biographies
Gabriela Flis graduated with a Bachelor’s degree from the Institute of Applied Linguistics at the University of Warsaw, where she has also undertaken the MA Program in translation and interpreting. A member of the AVT Lab, she focuses her research on audiovisual translation and media accessibility. Professionally, she translates from English and French into Polish and from Polish into English, and is interested in speech recognition development.
E-mail: gk.flis@student.uw.edu.pl
Adam Sikorski holds a master’s degree in Applied Linguistics from the University of Warsaw, where he is currently a PhD candidate and a member of the AVT Lab. For his doctoral dissertation, he is examining the acoustic properties of the language of dubbing. He also works as a Spanish and English to Polish subtitler and conference interpreter.
E-mail: adam.sikorski@student.uw.edu.pl
Agnieszka Szarkowska is Associate Professor in the Institute of Applied Linguistics, University of Warsaw. She is the head of AVT Lab, one of the first research groups on audiovisual translation. Agnieszka is a researcher, academic teacher, ex-translator, and translator trainer. Her research projects include eye tracking studies on subtitling, audio description, multilingualism in subtitling for the deaf and the hard of hearing, and respeaking.
E-mail: a.szarkowska@uw.edu.pl
Notes
Note 1:
The original study by Romero-Fresco (2016) was replicated by Di Giovanni and Romero-Fresco (2019) on a group of Italian and English viewers watching a fragment of The Grand Budapest Hotel (Wes Anderson 2014). While the original study focused on close-ups, Di Giovanni and Romero-Fresco (2019) examined a different language combination (English to Italian dubbing) and different types of shots in the film.
Note 2:
Todo sobre mi madre, Pedro Almodóvar 1999.