Machine translation and post-editing training as part of a master’s programme
Ana Guerberof Arenas, ADAPT Centre/Dublin City University
Joss Moorkens, ADAPT Centre/Dublin City University
ABSTRACT
This article describes a machine translation (MT) and post-editing (PE) course, along with an MT project management module, that were introduced in the Localisation Master’s programme at Universitat Autònoma de Barcelona in 2009 and in 2017 respectively. It covers the objectives and structure of the modules, as well as the theoretical and practical components. Additionally, it describes the project-based learning approach implemented in one of the modules, which seeks to foster creative and independent thinking, teamwork, and problem solving in unfamiliar situations, with a view to acquiring transferable skills that are likely to be in demand, regardless of the technological advances taking place in the translation industry.
KEYWORDS
Machine translation, post-editing, training, project-based learning, localisation, project management, translation technology, NMT, SMT, RBMT.
1. Introduction
Because of the increase in demand for MT post-editing (MTPE) services from translation clients in the last ten years, localisation agencies have communicated their desire to hire translators and interns with at least some basic technical knowledge in this area. In view of this feedback, some universities have implemented MTPE training as part of their curricula. The Master’s Degree in Localisation from the Universitat Autònoma de Barcelona (UAB) introduced the first MT and PE modules in 2009.
This article presents a description of both the theoretical and practical contents of the PE training module, followed by a discussion of the challenges presented by teaching PE in a classroom. In addition, it describes a project-oriented module in which students actively participated, acquiring knowledge by managing an MT project guided by the trainer and thus learning by doing. Lastly, it details the lessons learned in reflective practice based on the experience of teaching these modules.
This article does not intend to present a theoretical discussion or framework for translation competence, as this topic has been sufficiently explored by other researchers and practitioners in the past. Its objective is instead to share practices with other trainers who might wish to create an MTPE module and with those who have already implemented one, to establish a dialogue regarding current practices and to explore new paths in MT training for future generations of translators.
2. Context
It is necessary to look back a few decades to see how the translation profession has changed with the implementation of increasingly advanced technology, and how universities and translators have adapted to keep pace with these changes.
Thirty years ago, translation and interpreting faculties in Spain included practical exercises that were intended to replicate what publishing houses were doing at the time; that is, most translators were preparing themselves to translate books, manuals, or brochures using an electronic typewriter. The invention of the typewriter in the last part of the 19th century had been the last major change in the way translators worked, and the electronic typewriter represented a minor improvement when compared with the technological revolution that was to come in the following decade.
By the 1990s, computers had been introduced within most translation agencies, meaning that translators with Spanish translation degrees had to independently learn the basic features of word processors and operating systems on the job. There was a considerable gap between expectations in the professional world and the training received at university.
Trados Translator’s Workbench was launched in 1992, and by 1995 it was commonly used in translation agencies and among freelancers when dealing with technical documentation or software components1. This meant that tools were being developed in the industry to ‘recycle’ the work done by translators as the content was highly repetitive, and although they were not well-standardised, they were to become the most widespread translation technology in the 2000s.
Fortunately, most translation and interpreting faculties (in Spain and other countries) began to include technological modules in the Degree and Master’s courses where computer-aided translation (CAT) tools were taught. Nowadays, for example, there is a technological module in the first and third year of the Translation and Interpreting Degree in UAB that covers CAT tools such as MemSource, MemoQ, SDL Trados, as well as the use of macros, desktop publishing tools, project management tools, and the very basic concepts of MT and MTPE.
With regard to MT, although it had been introduced long before in commercial environments (Vasconcellos and León 1985; Wagner 1985), it was not until the late 1990s that the use of MT in combination with PE became commonplace in translation agency jobs, even though the quality of the raw output could be very poor and freelance translators often struggled to perform the task. By the mid-2000s, many major software developers such as Microsoft, IBM, Autodesk, and SAP had implemented MT and were requesting trained post-editors from their language service providers (LSPs) to work on this ‘new’ material. This implementation created a need for agencies to find translators who could, or wanted to, post-edit. Customers and agencies were concerned because they were unsure whether existing translators could perform the PE task or if they needed special training that, in turn, the companies were not sure that they could provide.
To some extent, these concerns have now been resolved, or at least much ground has been covered through industrial practice and research in Translation Studies and Natural Language Processing. As a result, MTPE may increasingly be found as part of the standard translation workflow in most localisation agencies worldwide (Lommel and DePalma 2016). The MT domain is advancing rapidly (as are many areas in which big data has been applied to artificial intelligence and neural networks), increasing the quality of MT output, and it appears that MT will continue to be a useful aid for translators and general users for the foreseeable future.
In response, many universities have seen the necessity, as with CAT tools in the 1990s, of introducing MT and MTPE as part of their curricula for new generations of translation graduates.
3. Previous research on PE training
Despite being a relatively new area in Translation Studies, academia has already focused on MT and MTPE training for translators, and this work has served as the basis for courses and modules in universities as well as in the private sector (such as in localisation agencies).
One of the first articles to describe the need to train translators on PE and the associated skills required was published by O’Brien in 2002. In this article, she argues that a potential module would need to have two components: on the one hand, a theoretical component covering knowledge of MT, along with terminology management, pre-editing and controlled language skills, and programming skills; and on the other hand, a practical component covering PE practice with at least two MT engines, terminology management and coding, controlled language exposition, corpus analysis, and programming (particularly to apply macros).
Since then, other academics involved with training have published related articles. DePraetere (2010) explores the type of PE guidelines that students might need and how these might be different to those received by professional translators. She analyses students’ PE data, observing that stylistic and phraseological changes are kept to a minimum, and that most errors occur in the field of calque or translation loss. Based on this, she suggests that authors of PE guidelines might be well-advised to concentrate on MT error analyses, as students might place too much trust in MT proposals. Rico and Torrejón (2012) explore the skills and role of post-editors, and group these into three different competences: core, linguistic and instrumental competences. Apart from the core competence that would ideally be honed in any translation training effort, the researchers suggest the addition of the ‘right’ attitude towards PE (in that they are open to experimenting with MT as an assistive technology), as well as MT knowledge, term management, MT dictionary maintenance, and basic programming skills.
O’Brien (2012) also suggests broadening the range of translation technology teaching, viewing the “increasing technologization of the profession” not as a threat, but rather “an opportunity to expand skill sets and take on new roles” (118). Pym (2013) argues for the need to rethink training based on identifying the skills needed to work with translation memories (TMs) and statistical MT (SMT) in a professional environment, rather than following a model of competences. He classifies these skills under three headings: learning to learn, learning to trust and mistrust data, and learning to revise with enhanced attention to detail. Doherty et al. (2012) describe a syllabus developed for Master’s students at Dublin City University (DCU) that includes: MT knowledge with an emphasis on SMT, MT evaluation using human and automatic metrics, the use of SMT in translation workflows, and the roles of humans in SMT workflows, complemented with labs where students use the SMT package SmartMATE (Way et al. 2011). The researchers also find that students show increased levels of self-efficacy after taking the course, demonstrating that this type of training ‘empowers’ students to have a confident interaction with this technology.
Doherty and Moorkens (2013) describe their experience of teaching TM and MT lab sessions as part of a translation technology module in DCU. The MT lab sessions cover: MT evaluation, controlled language, PE of MT output from online MT systems, and using Google Translator Toolkit to combine lessons learned from TM and MT classes. They report on the positive outcomes from these lab sessions as reported by the students themselves. Kenny and Doherty (2014) explain the need to engage students in SMT during their training. They give an overview of the way SMT works and how students can be helped to understand this technology. In a parallel article, Doherty and Kenny (2014) report on how an SMT module can be implemented in the classroom. The module covers TMs, but also a brief history of MT, Rule-Based MT (RBMT) and SMT, MT evaluation (both human and automatic), controlled language and PE, as well as ethics, payment, collaboration, the role of the human translator, and translation workflows. Finally, students train an SMT engine themselves, in order to apply the knowledge acquired. Flanagan and Christensen (2014) report on a qualitative study that investigates how trainee translators on an MA course interpret TAUS PE guidelines for producing work of a publishable quality. The students are given two PE assignments to complete using the Google Translator Toolkit, after which a complementary report is to be written. The findings suggest that trainees have difficulties interpreting the guidelines, possibly due to competency gaps, but also due to the wording of those guidelines. The researchers offer an alternative set of PE guidelines, and suggest increased MT training, further focus on translation instructions (i.e. a more detailed translation brief), and more emphasis on language skills. Koponen (2015) reports on her experience teaching an MT and PE course at The University of Helsinki. 
This article is particularly interesting since Finnish is not a language traditionally considered suitable for MT due to its morphological complexity. The course covers the theory and history of MT and PE, practical use of MT and PE, controlled language and pre-editing for MT, PE without source text, PE process research, PE quality levels and guidelines, MT quality evaluation and PE effort, and PE competences. Koponen reports that students were able, at the end of the course, to understand the basic principles of MT and PE, and that they had a positive attitude towards tools while at the same time being critical during evaluation. Moreover, the course seemed to initiate a change in the students’ attitude towards a more positive opinion of MT, while acknowledging its limitations. There will naturally be a requirement for university courses to incorporate Neural MT (NMT) as it becomes the dominant paradigm2. We discuss NMT further in the penultimate section of this article.
It is worthwhile mentioning that private companies and organisations such as TAUS offer online PE courses in multiple languages to acquire PE skills with a combination of theoretical and practical exercises3. The TAUS course covers the following topics: types of MT systems, evaluation of MT systems, controlled language and pre-editing, PE practice, and setting up an MT project. TAUS also offers a PE course for project managers that aims to assist participants in “understanding the implications of using machine translation, improve productivity and increase speed and ease of translation and learn to setup, run and manage PE projects” as stated in the course webpage4. TAUS published post-editing guidelines (2010) that set expectations for two levels of post-editing (light and full). These levels are also part of the ISO post-editing standard 18587, in which process standards “from qualifications and competencies of post-editors to quality requirements to activities before and after post-editing production” are defined (Muegge 2016).
In summary, researchers suggest that a focus on MTPE enhances overall translation training, preparing students to join the workforce as professionals and giving them the knowledge necessary to make decisions with regard to MT by involving them with the technology at all possible stages. They also indicate that this training should be eminently practical and include: information on MT history, analysis of several types of engines with special emphasis on SMT (the dominant MT paradigm at the time the above-mentioned works were published), pre-editing and controlled language, understanding levels of PE, MT output evaluation, output error identification, and a considerable amount of PE practice.
4. Localisation Master’s Programme
UAB offers a Master’s degree in localisation (Master de Tradumàtica)5, which began in 2001 with the objective of training students in translation technologies. It offers a combination of classes and internships in companies based locally in Barcelona for 30 students of mixed nationalities. Initially, the programme dealt mainly with TM tools, quality control, desktop publishing, project management, and other aspects of the localisation industry, but it has now evolved to mirror the changes in technologies and processes in the marketplace.
Students in the cohort have undergraduate qualifications in translation and interpreting or a similar field, or are professional translators who would like to acquire technical translation skills. Students should be able to work with English (C1), Spanish (C2.2), and Catalan (C2.2).
Recently, the programme has been offering training in areas such as terminology databases, SDL Trados, MemoQ, WordFast, Catalyst, localisation engineering, software localisation, video game localisation, content management systems, standard formats, open source applications, image localisation, macros, quality control (including QA tools such as Xbench), error analysis and regular expressions, and project management6. The students work daily in company placements and attend afternoon lectures. All classes are conducted in a computer laboratory.
4.1. PE module
As mentioned in the introduction, the programme includes a PE module. When designing the module in 2009, work from O’Brien (2002), pioneering at the time, was used as a reference. The 8-hour PE module is complemented by another 8-hour section on MT, covering both statistical and rule-based engines, as well as new developments such as NMT. In this MT module, students learn:
- The basic principles of MT technology
- The types of engines that exist in the market: RBMT, SMT, hybrids, NMT
- Quality metrics for evaluating raw MT output
- MT output and frequently-occurring errors
- MT engine training and implementation in the localisation workflow
The PE module was devised with the following objectives in mind:
- Acquire basic knowledge of PE to give students a realistic view of the task.
- Expand on different concepts of quality in localisation, not as a universal value but as a ‘granular’ concept depending on customer requirements.
- Acquire knowledge on quality evaluation, and PE evaluation in particular.
- Identify diverse types of PE levels: full and/or light PE depending on customer requests.
- Identify common MT errors in the output of a given engine to learn expected error patterns in order to speed up the PE process.
- Acquire basic knowledge of PE guidelines; how to interpret and create them.
- Acquire basic knowledge of existing rates in the market to be able to negotiate with potential buyers.
- Acquire knowledge of productivity metrics to plan the activity and assess its profitability.
- Get hands-on practice of the PE task.
With these objectives in mind, the module covers the following content:
- Basic definitions of PE
  - Concepts
  - PE vs. revision
  - Post-editor profile
- Controlled language and pre-editing
- Quality
  - Quality of the raw output: evaluation metrics and tools
  - Expected quality in a PE assignment: human vs. good enough
  - Quality of post-edited material: evaluation metrics
- Types of PE
  - Light PE
  - Full PE
- General rules for PE
  - Guidelines for light PE
  - Guidelines for full PE
- Common MT errors
  - Examples of terminological errors
  - Examples of grammatical and spelling errors
  - Examples of syntactical errors
  - Examples of punctuation and style errors
- PE effort and productivity
  - Temporal PE effort
  - Cognitive PE effort
  - Technical PE effort
- PE and pricing
  - Different payment methods and negotiation
- Practical exercises throughout the course
  - Translation of a technical text without any aid.
  - Comparison of the translated text with outputs from different engines to find error patterns.
  - Controlled language exercises using a set of controlled language rules.
  - Monolingual PE (as in Koponen 2015).
  - PE using an online tool (MateCat) following PE guidelines7.
  - PE using a standalone tool (SDL Trados) following PE guidelines.
The most important aspect of the training is for students to feel comfortable with the task prior to their internship and subsequent professional life. This approach – that is, to have a realistic and open-minded outlook on any translation or PE task – will provide them with sufficient information to decide if this activity can be integrated as part of their future professional skill-set or if they prefer to focus on other skills or profiles.
The students learn how MT engines work in order to avoid high levels of frustration when confronted with the same output errors repeatedly (as opposed to translation with TM, whereby an error fixed once may be considered solved for future leverage). They learn how to spot error patterns (for example issues with gender, tenses, cases, word order, product names, wrong words in context, active and passive voices, and forms of address) and to anticipate the errors produced by a given engine just by looking at the source text, thereby allowing them to work faster. To this end, controlled language and pre-editing exercises are included, in which the students change the source text using a set of controlled language rules to see how the engine behaves with each change.
Another important aspect of PE training is for students to understand that productivity increases are expected for a PE task when compared to unassisted translation, and that this is primarily related to the quality of the raw MT output. Over-editing should be avoided, even when publishable quality is required. Unfortunately, there are very few revision assignments as part of translation degrees, so students tend to think that any edit is valid as they are accustomed to editing their own work. There is also the idea that more edits improve the quality of a translation, so the course helps students to identify the point beyond which editing is not necessary. This is often true of technical translation, in which other aspects, such as terminology or accuracy, are of more importance than style.
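One way to make the amount of editing visible to students is to count how many words changed between the raw MT output and the post-edited text. The following Python snippet is a minimal sketch of such a count, using a plain word-level edit distance; it is an illustration only, not a tool used in the module, and the function names and example sentences are hypothetical.

```python
def word_edit_distance(raw_mt: str, post_edited: str) -> int:
    """Minimum number of word insertions, deletions, and substitutions."""
    source = raw_mt.split()
    target = post_edited.split()
    # previous[j] is the distance between the raw MT words handled so far
    # and the first j words of the post-edited text.
    previous = list(range(len(target) + 1))
    for i, source_word in enumerate(source, start=1):
        current = [i]
        for j, target_word in enumerate(target, start=1):
            cost = 0 if source_word == target_word else 1
            current.append(min(
                previous[j] + 1,         # delete a raw MT word
                current[j - 1] + 1,      # insert a post-edited word
                previous[j - 1] + cost,  # keep or substitute a word
            ))
        previous = current
    return previous[-1]


def edit_rate(raw_mt: str, post_edited: str) -> float:
    """Edits per word of the post-edited text (0.0 means untouched)."""
    length = len(post_edited.split())
    if length == 0:
        return 0.0
    return word_edit_distance(raw_mt, post_edited) / length


# Hypothetical segment: one terminological correction.
raw = "Haga clic en el botón para salvar el archivo"
edited = "Haga clic en el botón para guardar el archivo"
print(f"Edit rate: {edit_rate(raw, edited):.2f}")
```

A consistently high edit rate on output that reviewers judged acceptable may point to over-editing, although the figure says nothing about whether each individual edit was justified.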
Students are also familiarised with the fact that productivity in PE tends to improve as the knowledge about an engine – and its common errors – increases. They are encouraged to measure their own productivity to ascertain whether PE is profitable. Every translator (and student) is different, and therefore it is important to know what sort of assignments are suitable for each profile. Ultimately, TM or MT can be useful tools, but the price paid for the activity needs to compensate the effort. At the same time, trainers need to make students aware that translation agencies might be very optimistic regarding the quality of the raw MT output provided. It is advisable to ask for a sample to confirm that quality is as promised before committing to a deadline or to a set discount in a PE task.
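The self-measurement described above can be sketched in a few lines of Python. The snippet compares throughput for unassisted translation and for PE, and derives the largest discount on the per-word rate at which PE still earns the same amount per hour. The function names, word counts, and timings are hypothetical and serve only to illustrate the arithmetic.

```python
def words_per_hour(words: int, minutes: float) -> float:
    """Throughput for one timed task."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return words * 60.0 / minutes


def break_even_discount(translation_wph: float, pe_wph: float) -> float:
    """Largest rate discount at which PE earns the same per hour.

    Hourly earnings are rate * throughput, so PE breaks even when
    (1 - discount) * pe_wph == translation_wph.
    """
    if pe_wph <= 0:
        raise ValueError("pe_wph must be positive")
    return 1.0 - translation_wph / pe_wph


# Hypothetical timings for the same student on comparable texts.
translation_wph = words_per_hour(words=500, minutes=60)
pe_wph = words_per_hour(words=500, minutes=40)
max_discount = break_even_discount(translation_wph, pe_wph)
print(f"PE pays off for discounts below {max_discount:.0%}")
```

A negative result means that PE was slower than unassisted translation for that sample, so no discount would compensate for the effort.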
During the module, students try different tools: SDL Trados Studio with human TM and MT-produced TMs, as well as online CAT tools like Google Translator Toolkit or MateCat, following specific guidelines and glossaries provided for the assignment.
In brief, the course allows students to become accustomed to PE assignments, to judge when or when not to edit, to ask the right questions of their potential clients and/or employer, to understand that there are various levels of quality expected by clients, and that these levels can determine the effort and payment rate to be expected8. At the end of the module, the student should understand that although PE is different from translation or human revision, it can be part of their skill-set. The idea that translators should be able to “learn to revise translation texts” (Pym 2013: 11) is very present in the module design.
4.1.1. PE assessment
Module assessments vary depending on the language combinations present in the classroom. Although the ideal situation would be to assign PE tasks and assess them (either individually or in a group), the reality is that Master’s programmes often have students with varied and unpredictable language combinations. One option is to conduct the assessment through a PE assignment accompanied by a brief report (as seen in Doherty and Moorkens 2013 and Koponen 2015). Even if different languages are used, the report will show if concepts have been assimilated. Another alternative is to assess through a series of open questions to gauge students' understanding of theoretical and practical components. Finally, and depending on the amount of time available (within the classroom and without), assessing through closed questions using a questionnaire may be the optimal route.
4.1.2. Challenges and evolution
Teaching PE in a classroom presents certain challenges, particularly where students have mixed language combinations or very little or no localisation experience. Teaching PE might be premature for students who have not reached an expert level in translation or localisation or for those who have no revision or editing experience, and hence no concrete knowledge of error classification. Finally, students might have different professional expectations and a minimal interest in MTPE or MT technology.
It is important to introduce basic and clear concepts while practising short ‘real-life’ exercises; to alternate individual, pair, and group tasks and to present genuine business situations (MT evaluation, controlled language exercises, translation revision, error identification with a QA form, etc.). It is also important to adapt the structure of the class, for example, to present students first with a practical exercise, so they can arrive at their own conclusions and work on problem solving and independent thinking, and only then, if necessary, explain the underlying theory. For example, it is usually best not to explain error classifications and then to hand out a practical exercise, but rather to let students find and classify errors and then draw the common findings into a general classification.
In the years since the first MT and PE modules were introduced in 2009, students’ attitudes have changed. In the beginning, and even though the programme dealt with technology, students were generally suspicious of the PE task and reluctant to engage with it. In recent years, however, most students have become accustomed to using free online MT engines in their daily personal and academic life, and they tend to see MT and PE as tools they can avail themselves of to work faster or to look for alternative translations. Some students are already using MT as freelance translators or as interns; others are positively surprised by the results when MT is applied in a simulated working environment (they might previously have used MT only to find mistakes rather than considering how it could be useful). Others are sceptical about the use of MT in a human translation workflow, assuming that such a workflow could never offer the same level of quality as a human translator.
Over the years, the use of MT in translation training has been normalised (even if it is not the case in all institutions). However, quality is always a controversial topic and interesting points of view often emerge in the classroom; there are those who believe that quality can only mean perfect linguistic quality, those who are more focused on what the client wants, and even those who say that they cannot refrain from making corrections. These are understandable attitudes, especially when no payment is involved. However, most students understand that it is necessary to learn about MT and PE even if they would prefer to translate without it. After all, this core technical competence is highlighted by the EMT Expert Group (2017), and is a major reason why many enrol in translation programmes, from our experience in the classroom.
A challenge we may now face is that students or novice translators might tend to accept errors (calques) because they have become used to MT and to social media (as reported by DePraetere 2010), or they might accept NMT proposals because the text reads fluently, even where there are mistranslations. Therefore, training should place emphasis on basic quality principles, the types of errors considered acceptable depending on the quality level requested by the client, and the basic writing and grammar knowledge necessary to be a proficient translator, skills that are at the core of traditional translation training.
4.2. Machine translation project management module
In 2016, it was decided to include a 16-hour MT project management section as part of the existing Project Management module, as students would be increasingly exposed to projects that incorporate MT in their professional lives after graduation. The Project Management module already covered the basic theory and practice of managing a localisation project, and, by the end of the school year (when the module takes place), students have usually acquired a considerable amount of knowledge through their internships and classes. Moreover, precisely because the module happens at the end of the academic year, students tend to be extremely busy preparing their final dissertation assignment while working on internships (and often applying for their next jobs). Therefore, inspired by the work on project-based learning by Kiraly (2005) that looks to “empower” students through “the collaborative undertaking of complete translation projects for real clients” (1102), it was decided that the best option was to carry out a real-life project so that the students could put together the theory and practice learned in other modules and combine the short PE exercises into one single real project. In other words, the students would have to consult previous material, resolve new situations, collaborate with each other, be creative and think for themselves, and this would help them assimilate all the information given during the programme, while using their experience in the internships. Similar ideas have more recently been proposed by others such as Buysschaert et al. (2017). As Pym argues when referring to the new role of translators: “What they need is great target-language skills and highly developed teamwork skills” (Pym 2013: 6).
This type of training requires intensive initial preparation, so that the assignment is clear, complete, instructive, and that students can finish the project within the time allocated (again, 16 hours in this case). The trainer acts as a project manager when preparing classes, and as a coach or mentor during the classes. The content of the course is divided into three phases as is common in localisation projects: Analysis or Pre-Production, Execution or Production, and Completion or Post-production. Each phase covers the following topics.
- Initiation and analysis of the project (approximately 6 hours)
  - Introduction to project management and the peculiarities of MT projects: MT evaluation, productivity calculation, fluency and adequacy evaluations, error typology.
  - Use an online MT provider to analyse the productivity and quality of an MT engine.
  - Use an online MT provider to analyse the productivity and quality of the MT engine for the selected texts.
  - Compare with other free online engines if necessary.
- Project execution (approximately 7 hours)
  - Distribute the translation.
  - Perform word counts on files.
  - Devise initial questions for the client, if necessary.
  - Spot-check or quality review.
- Project completion (approximately 3 hours)
  - Final QA (using ApSIC Xbench)
  - Invoices (invoices for the client and the translators, calculation of the internal budget).
  - Productivity analysis
  - Post-mortem: what went well, what went wrong in the project
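The fluency and adequacy evaluations named in the analysis phase can be summarised very simply once ratings have been collected. The Python sketch below averages per-segment ratings for two engines. The engine names, the 1 to 5 scale, and the ratings themselves are assumptions made for illustration and do not come from the project described here.

```python
from statistics import mean

# Hypothetical ratings: one (adequacy, fluency) pair per evaluated
# segment, on an assumed 1-5 scale.
ratings = {
    "engine_a": [(4, 5), (3, 4), (5, 5)],
    "engine_b": [(2, 4), (3, 3), (4, 4)],
}


def summarise(scores):
    """Return the mean adequacy and mean fluency for one engine."""
    adequacy = mean(a for a, _ in scores)
    fluency = mean(f for _, f in scores)
    return adequacy, fluency


for engine, scores in ratings.items():
    adequacy, fluency = summarise(scores)
    print(f"{engine}: adequacy {adequacy:.2f}, fluency {fluency:.2f}")
```

Averages of this kind only support a comparison between engines when the same segments and evaluators are used for each engine.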
The source texts that were used for the projects were the User Guide for an online CAT and MT provider (MateCat User's Guide)9, and the book Post-editing of Machine Translation for Project Managers by Luigi Muzii (2017)10. Both texts were chosen because they were technical texts that also dealt with the two main topics of the module: MT and Project Management. A contact at MateCat agreed to send all students a certificate of accomplishment. The tool is free of charge, as is its user guide. MateCat also uses Google Translate as its default MT engine (using NMT for the English to Spanish language combination).
The online MT provider contracted is KantanMT because it offers a user-friendly interface for engine customisation, quality evaluation, and quality improvement (by customising the engine with terminology or by creating regular expressions). Kantan, based in Dublin, also offers online training. However, the plan is for students to choose the default engine for the CAT tool selected (MemSource or MateCat, for example), meaning that Microsoft Translator and Google Translate could potentially be used if their quality was deemed to be better than that offered by the Kantan engine, much as described by Pym in his “learn to learn” approach (2013: 9).
Since the number of students in the first cohort was 30, five teams of six people were created. The language combination was English to Spanish, as this was the language combination shared by most students in the group. However, because there were different activities, students from other countries (such as those from Austria, Italy, and Brazil) could participate, and even post-edit, as the work would subsequently be reviewed. Each team had to divide the tasks according to the volume of words and deliverables. This was left to them to organise; the idea was, on the one hand, to mimic a real-world task, and on the other hand, to encourage creativity and independent thinking, although the trainer could give advice if the group was uncertain about the project organisation. Every project is a unique endeavour, and as such the students learned that although there are phases and theories of project management, every project has a life of its own. The original User Guide was divided into three sections, and the book into two sections. Therefore, each group had a volume of approximately 5,000 words, which was divided among the members of the group. This volume was chosen because of the time available to complete the project.
The five teams were required to complete the following tasks:
- Analysis
  - Familiarisation with Kantan: engines, quality evaluation model.
  - Connect the Kantan Application Programming Interface (API) to the chosen CAT tool.
  - Pre-translate the document in the chosen CAT tool using the selected engine.
  - Export the document to a Word or text file.
  - Choose a sample from the document.
  - Evaluate the sample with Language Quality Review (LQR) in Kantan (Quality Evaluation and/or AB Test).
  - From the analysed data, decide on a price (MT discount) and calendar (schedule) for the ‘client’.
- Production
  - Create the project PE guidelines (using the PE guidelines provided in the PE module or other guidelines).
  - Include error samples using the data from the LQR analysis.
  - Add glossaries (from online resources, such as Microsoft online glossaries) to the project.
  - Decide on the tool on the basis of their experience, MT quality and availability (MemSource, Trados, MemoQ or MateCat).
  - Post-edit the texts.
  - Review the texts.
- Completion
  - QA the texts (internal tools) using Xbench with regular expressions (there was a previous class on this tool, so most students had a check-list for the English-to-Spanish language combination).
  - Spellcheck the texts.
  - Visual QA of the final documents.
  - Export the memories.
  - Create a zip file with the deliverables and send to the customer (trainer).
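The regular-expression QA step in the completion phase can be illustrated with a minimal Python sketch. It is not the Xbench check-list used in class, whose contents are not reproduced in this article; the three checks below (double spaces, numbers that differ between source and target, and a Spanish question without its opening mark) and the names `qa_segment` and `normalise_number` are assumptions chosen only to show the kind of rule such a check-list might contain.

```python
import re

# Hypothetical checks for illustration; not taken from the course
# check-list or from Xbench itself.
DOUBLE_SPACE = re.compile(r" {2,}")
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def normalise_number(token: str) -> str:
    """Drop separators so that 1,000.5 and 1.000,5 compare as equal."""
    return re.sub(r"[.,]", "", token)


def qa_segment(source: str, target: str) -> list[str]:
    """Return the issue labels found in one source/target segment pair."""
    issues = []
    if DOUBLE_SPACE.search(target):
        issues.append("double space")
    src_numbers = sorted(normalise_number(n) for n in NUMBER.findall(source))
    tgt_numbers = sorted(normalise_number(n) for n in NUMBER.findall(target))
    if src_numbers != tgt_numbers:
        issues.append("number mismatch")
    # Spanish questions open with an inverted question mark.
    if "?" in target and "¿" not in target:
        issues.append("missing opening question mark")
    return issues
```

Run over the exported bilingual segments, a function of this kind would return an empty list for a clean pair and one label per rule that fires, which a reviewer could then inspect by hand.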
Initially the tasks seemed daunting for the students. They were unsure whether they could complete the project in the 16 hours available. It was necessary to reassure them that it was possible and that if, for whatever reason, the project could not be completed in the available time, the teams could ask for an extension, as happens in real projects.
From the moment the project started, that is, after a brief introduction to management of MT projects and the detailed explanation of the project brief, students divided themselves into teams and followed the assigned steps carefully. Emphasis was placed on the schedule so that the students could communicate amongst themselves and find solutions appropriate to the time they had left. The teams that were working on the same projects discussed how to solve certain issues, for example how to approach cross-references. On other occasions, solutions were very different for each team even though the result (i.e. the target text) was of a similar nature. Unforeseeable events also occurred that caused stress within the teams. For example, a couple of students were sick, leaving one post-editor fewer to complete the PE task, which had to be reassigned. For another team, the chosen engine did not give the expected results, which meant preparing another MT evaluation report with an alternative engine. One group forgot to manage a common TM and had to retrospectively create a common memory. At all times, the trainer asked students not to worry about the mistakes initially, but to focus on the solution, as in real-life projects. Creative solutions were found via discussions with all the team members, with other teams, or with the trainer. It was necessary throughout to remind the teams of tasks that were to be completed at the end of each day, so that they could plan the pending tasks for the following class.
It was important to show the students that the project was doable, and that working as a team would increase efficiency. For example, two team members could create the PE guidelines while some post-edited and others finalised the pricing structure. The students were reminded always to think creatively, and not to worry about the way things should be, but the way things were in the project. The trainer moved around the lab helping with planning, pricing, MT, translation, and PE issues. The students were regularly reminded by the trainer of the work still at hand. If, for example, a team was spending too much time on a task or if additional information was needed to understand the task, the trainer addressed these issues. Alternatives were discussed so that there was a lot of interaction in the class towards a common goal. The class was lively and, at the same time, very focused on completing the project.
At times, it seemed that the projects would not be successful, but the five teams delivered all the material by the agreed deadline (aside from one team, which delivered later that same day). All files were post-edited, reviewed, and quality-assured by the various team members.
Unfortunately, there was no time left to carry out a thorough survey or retrospective analysis to review the students’ experiences. Several participants made comments expressing their satisfaction, and during the class it was evident that students utilised notes and reused material from other classes in the programme. It seemed a perfect way to finish the year by putting all knowledge, theoretical and practical, into real use.
4.2.1. Machine translation project management evaluation
Table 1 shows the evaluation criteria presented to the students at the beginning of the project. All the students in a team received the same grade. In Spain, students are rated from 0 to 10. In the table below, each task is assigned points; by completing all the tasks correctly, a team could obtain a mark of 10.
Additionally, the students could deliver the internal project budget (the relationship between benefits and costs, and gross margin) if they wanted to gain bonus points.
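The arithmetic behind a sales proposal and an internal budget can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the pricing model the teams used: the function names `quote`, `gross_margin` and `working_days` are hypothetical, and the article does not report the rates, discounts or throughput figures that the students actually applied.

```python
def quote(words: int, full_rate: float, mt_discount: float) -> float:
    """Client price: full per-word rate reduced by the MT discount (0 to 1)."""
    return round(words * full_rate * (1 - mt_discount), 2)


def gross_margin(revenue: float, costs: float) -> float:
    """Gross margin expressed as a fraction of revenue."""
    return (revenue - costs) / revenue


def working_days(words: int, words_per_day: int, post_editors: int) -> int:
    """Whole days needed when the volume is shared equally (ceiling division)."""
    daily_capacity = words_per_day * post_editors
    return -(-words // daily_capacity)
```

With purely illustrative figures, `quote(5000, 0.10, 0.30)` prices 5,000 words at a rate of 0.10 per word with a 30% MT discount, `gross_margin` relates that revenue to the internal post-editing and review costs, and `working_days` turns an assumed daily post-editing throughput into a schedule for the ‘client’.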
| Task | Points |
| --- | --- |
| MT quality evaluation report | 2 |
| Sales proposal in Excel specifying MT discount | 2 |
| PE Guidelines in Word | 1.5 |
| Final translation in Word | 2 |
| Quality evaluation report (from the tools used) | 1 |
| Word-counts and memories (.csv and .tmx) | 1.5 |
| Total | 10 |
Table 1. Evaluation criteria.
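As a quick check of how the criteria in Table 1 add up, the following Python sketch tallies a team's mark from its deliverables. The point values are copied from the table; the function name `team_mark` and the all-or-nothing scoring per deliverable are assumptions made for illustration, and the optional bonus for the internal budget is left out because its value is not specified.

```python
# Points per deliverable, copied from Table 1.
CRITERIA = {
    "MT quality evaluation report": 2.0,
    "Sales proposal in Excel specifying MT discount": 2.0,
    "PE Guidelines in Word": 1.5,
    "Final translation in Word": 2.0,
    "Quality evaluation report (from the tools used)": 1.0,
    "Word-counts and memories (.csv and .tmx)": 1.5,
}


def team_mark(delivered: set[str]) -> float:
    """Sum the points of the deliverables a team completed correctly.

    Assumes all-or-nothing scoring per deliverable, which the article
    does not state explicitly.
    """
    unknown = delivered - set(CRITERIA)
    if unknown:
        raise ValueError(f"Not in Table 1: {sorted(unknown)}")
    return sum(CRITERIA[task] for task in delivered)
```

A team that delivers everything reaches the maximum of 10, and the same mark would then be recorded for each of its members.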
5. What about neural?
For those of us who experienced the transition between (and often combined uses of) RBMT and SMT from a translator’s perspective, the new paradigm of NMT is familiar. We face it with optimism for a reduction in post-editing effort but also scepticism, as translation is a difficult computational problem to solve.
The initial results of NMT quality evaluation using automatic metrics are indeed encouraging (Bahdanau et al. 2014; Bojar et al. 2016; Burchardt et al. 2017; Sennrich et al. 2016). However, studies that analyse the post-editing effort (Bentivogli et al. 2016; Castilho et al. 2017; Popović 2017; Toral et al. 2018) are more cautious and conservative. After all, translation is a complex activity that involves many languages and domains. Generalisations in translation or in natural language processing are invariably wrong. This means that the hype around NMT, fuelled partly by the media, which claims that translators will soon become redundant, has not found the same resonance in academic circles. Moreover, NMT is at an early stage of development.
Will PE training change completely because NMT delivers more fluent sentences? We do not think so, at least at present. Even if fluency is better in NMT output than in SMT output, this fluency can be deceptive: sentences may be grammatically correct while the content is wrong. This is a known weakness in NMT systems. The output is far from perfect and still presents errors such as omissions, additions, terminology mismatches and mistranslations, not to mention the usual difficulties in translating long sentences (over 25 words). In fact, the project presented here was carried out using NMT provided by free online engines (Google Translate and Microsoft Translator), and, as explained above, students had to correct many errors in order to achieve a high-quality, publishable translation. It could be that interaction with NMT will develop over the coming years, but until that point, the framework presented here is perfectly suited to NMT.
Naturally, as the technology evolves, PE courses need to adapt to the characteristics of the technology available. CAT tools have changed significantly since their inception; however, this does not mean that the framework for CAT tool training is no longer valid. It might not be necessary to train an SMT engine, although at present the technology is still widely used in the localisation industry, and it helps students to better understand how an engine works. It could be that pre-editing or controlled language exercises should be revised, although by analysing the source text and its effect on MT output, students appreciate the importance of source text quality, and the difficulties presented by abstract language. If we want to spark curiosity and creativity in the students, then yes, all these activities appear to foster these qualities (as reported in the PE training literature).
What if PE becomes unnecessary because NMT output is perfect? This question is almost a tautology. If PE becomes unnecessary, the authors of this article would recommend not including it in any training, as common sense dictates. However, insofar as the output quality requires human interaction to offer clients a high-quality product, PE training will be necessary, and increasingly so. We have not yet seen evidence demonstrating that NMT offers perfect output in any language pair.
6. Conclusions
Translation is, above all, a practical task. Therefore, universities offering translation and interpreting as a degree need to constantly adapt to the industry and modify their syllabi so that students are well-prepared to face the professional world. The faster that universities implement these changes the better, preferably at a later stage of translator training, as students need to acquire a set of core skills prior to acquiring new or more ‘sophisticated’ skills.
Universities are increasingly including PE training in the final years of their translation degrees or as part of technical translation or localisation Master’s programmes. Such training helps improve students’ flexibility regarding this task and change misconceptions about MT. At the same time, and more importantly, such training provides students with the tools necessary to make their own informed decisions about the type of work they can and want to do in the future, and to negotiate rates or deadlines with possible clients or employers.
Therefore, translator training should ensure that students have sufficient core skills, knowledge, and self-confidence to be involved in MT activities and PE cycles, so that decisions in this domain are not solely made based on considerations of profitability or engineering abilities, but also the experience of the people ultimately using this technology.
As MT evolves, syllabi need to adapt to innovations without disregarding attention to the core skills that any translator needs to succeed in a language-related activity. Especially as technology improves, as NMT has in the last couple of years, it is also important to focus on those skills that differentiate humans from machines: to increase the creative aspects of the training, not only with creative writing, which should be compulsory in translation training, now more than ever, but also with project-focused activities that help students to think innovatively rather than follow a set of given instructions to perform a set of given tasks (Massey 2017). Moreover, the type of project-based activity described here helps students to really understand the concepts taught within a variety of modules, and apply them with their own criteria. It is also important that they think critically about concepts that might be obsolete or no longer needed, thus creating concepts and processes of their own, especially at the later stages of their training, following the simple-complicated-complex progression described by Kiraly, Massey and Hofmann (2018). The trainer also benefits from this exchange. Trainers cannot be expected to learn all of the existing technology at a fast pace and at a proficient level; it is therefore advisable to replace the trainer's more central role with one that seeks to guide and to learn with and from the students, while also sharing their experience with them. To this end, the trainer will spend more time preparing and creating the course initially, so that during the class, they can take a step back and help students to resolve unexpected issues, organise teams, and cope with unfamiliar situations.
Acknowledgements
We would like to thank Pilar Sánchez Gijón and Adrià Martí Mor at Universitat Autònoma de Barcelona for their support in implementing these ideas for the creation of the Master’s module.
We also thank Laura Casanellas and Carlos Collantes from Kantan, Alessandro Cattelan and Soraya Tikarli from MateCat, and Luigi Muzii for their assistance with this article.
Funding
This article was written under the Edge Research Fellowship programme that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 713567, and by the ADAPT Centre for Digital Content Technology, funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund.
References
Biographies
Ana Guerberof-Arenas is an Edge/Marie Sklodowska-Curie Research fellow at Dublin City University and ADAPT Centre working on the impact of language in human-computer interaction. She has worked as a translator, project manager, vendor manager and operations manager in the localization industry, as well as technical and scientific translation, and translation technologies lecturer. She has authored articles and chapters on MT post-editing productivity, quality and experience; pre-editing and post-editing; and reading comprehension of MT output.
E-mail: ana.guerberof@dcu.ie
Joss Moorkens is an Assistant Professor at the School of Applied Language and Intercultural Studies at Dublin City University and a researcher affiliated with the ADAPT Centre and the Centre for Translation and Textual Studies. He has authored articles and chapters on translation technology, MT post-editing, user evaluation of MT, translator precarity, and translation technology standards. He coedited the book ‘Translation Quality Assessment: From Principles to Practice’, published by Springer.
E-mail: joss.moorkens@dcu.ie
Note 1:
Brief TRADOS history http://www.sdltrados.com/about/history.html
Note 2:
The addition of NMT-related exercises has begun, as demonstrated in Moorkens (2018).
Note 3:
TAUS post-editing course https://www.taus.net/academy/taus-post-editing-course
Note 4:
TAUS post-editing for project managers course https://elearning.taus.net/course/index.php
Note 5:
Master programme (in Spanish) http://www.uab.cat/web/estudiar/la-oferta-de-masteres-oficiales/plan-de-estudios/plan-de-estudios/x-1096480309783.html?param1=1345695508762
Note 6:
Curricula for the Master de Tradumàtica https://www.uab.cat/web/estudiar/la-oferta-de-masteres-oficiales/plan-de-estudios/guias-docentes-1345657362859.html?param1=1345695508762
Note 7:
These exercises can be performed using SMT or NMT as current tools offer both options.
Note 8:
The efficacy of a slight variation on this module was tested by Blagodarna (2018) using qualitative and quantitative measures. She found that post-editing productivity increased after training for all participants, and that a positive predisposition towards MT positively influenced PE performance.
Note 9:
MateCAT User Guide https://www.matecat.com/support/
Note 10:
Luigi Muzii agreed to share the proceeds of the Spanish translation of his manual, and that students’ participation in the translation would be recognised. The manual in Spanish was published from June 2017 until May 2018.