By Estelle Maryline Delpech
Computer-assisted translation (CAT) has always relied on translation memories, which require the translator to have a corpus of previous translations that the CAT software can use to generate bilingual lexicons. This is a problem when the translator has no such corpus, for example when the text belongs to an emerging field. To solve this issue, CAT research has looked into leveraging comparable corpora, i.e. sets of texts, in two or more languages, which deal with the same topic but are not translations of one another.
This work had two main goals. The first is to evaluate the contribution of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second is to identify the bilingual-lexicon-extraction methods that best match translators' needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the handling of multiple morphological structures, and the ranking of candidate translations.
The experiments are carried out on two language pairs (English–French and English–German) and on specialized texts dealing with breast cancer. This research places strong emphasis on applicability: methodological choices are guided by the needs of the end users. The book is organized in two parts: the first presents the applicative and scientific context of the research, and the second is devoted to efforts to improve compositional translation.
The research work presented in this book received the 2014 PhD Thesis award from the French association for natural language processing (ATALA).
Extra resources for Comparable Corpora and Computer-assisted Translation
In this section, we have presented the state of the art in comparable-corpus alignment techniques. In section [...]4, we will describe how we created a CAT prototype that relies on the distributional method to extract bilingual lexicons from comparable corpora.
4. CAT software prototype for comparable corpora processing
In the industrial context of Lingua et Machina, comparable-corpus extraction is meant to drive the creation of linguistic resources in emerging fields, or in fields for which the company has very little translation memory.
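The excerpt names the distributional method without detailing it. As a rough illustration only, the sketch below follows the approach as it is commonly described in the literature: build a context vector for each word, project the source vector into the target language through a seed dictionary, and rank target words by cosine similarity. All function names, the window size, and the use of raw co-occurrence counts (rather than an association measure) are assumptions made for this example, not the prototype's actual design.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """For each word, count the words that co-occur within `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def translate_vector(vector, seed_dict):
    """Project a source context vector into the target language.

    Context words missing from the (hypothetical) seed dictionary are dropped.
    """
    projected = Counter()
    for word, weight in vector.items():
        for translation in seed_dict.get(word, ()):
            projected[translation] += weight
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(term, src_vectors, tgt_vectors, seed_dict):
    """Return target words sorted by similarity to the projected source term."""
    projected = translate_vector(src_vectors.get(term, Counter()), seed_dict)
    scores = [(cand, cosine(projected, vec)) for cand, vec in tgt_vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy comparable corpora (invented data, for illustration only).
english = [["breast", "cancer", "treatment"], ["cancer", "treatment", "works"]]
french = [["traitement", "cancer", "sein"], ["chat", "mange", "souris"]]
seed = {"breast": ["sein"], "treatment": ["traitement"]}

ranking = rank_candidates(
    "cancer", context_vectors(english), context_vectors(french), seed
)
```

The first entry of `ranking` is the best-scoring candidate translation. A realistic system would need much larger corpora and would typically weight co-occurrences before comparing vectors.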
Generalist lexicon: 1,842 English–French pairs extracted from our bilingual dictionary. 45 entries overlap between the two lexica. In line with other research work, we ensured that each term to be translated appeared at least 5 times in the corpus. Translation was carried out from English to French. [...]
2. Extraction of the terms to be aligned
The terms to be aligned are extracted from the source and target corpora. [...] (e.g. simple words) belonging to the grammatical categories of noun, adjective, adverb and verb, occurring more than five times.
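The filtering criterion described above (content words in four grammatical categories, occurring more than five times) can be sketched as follows. This is a minimal illustration that assumes the corpus has already been tokenized and part-of-speech tagged; the tag names follow the Universal Dependencies convention, and the function name and input format are hypothetical rather than taken from the book.

```python
from collections import Counter

# Grammatical categories kept, per the excerpt: noun, adjective, adverb, verb.
# The tag labels themselves are an assumption (Universal Dependencies style).
CONTENT_POS = {"NOUN", "ADJ", "ADV", "VERB"}


def extract_terms(tagged_tokens, more_than=5):
    """Return {(word, pos): count} for content words seen more than `more_than` times.

    `tagged_tokens` is an iterable of (word, pos) pairs from any tagger.
    """
    counts = Counter(
        (word.lower(), pos) for word, pos in tagged_tokens if pos in CONTENT_POS
    )
    return {term: n for term, n in counts.items() if n > more_than}


# Toy tagged corpus (invented data, for illustration only).
corpus = (
    [("tumor", "NOUN")] * 6      # kept: content word, 6 occurrences
    + [("biopsy", "NOUN")] * 5   # dropped: only 5 occurrences
    + [("the", "DET")] * 10      # dropped: not a content category
)
terms = extract_terms(corpus)
```

The same function would be run once on the source corpus and once on the target corpus to obtain the two term lists to align.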
Bilingual syntactic patterns are acquired in three steps from an English–Spanish parallel corpus: 1) Acquisition of the English syntactic patterns on the source part of the corpus, for example: