Comparable Corpora and Computer-assisted Translation by Estelle Maryline Delpech

By Estelle Maryline Delpech

Computer-assisted translation (CAT) has always relied on translation memories, which require the translator to have a corpus of previous translations that the CAT software can use to generate bilingual lexicons. This is problematic when the translator does not have such a corpus, for instance when the text belongs to an emerging field. To solve this issue, CAT research has looked into leveraging comparable corpora, i.e. a set of texts, in two or more languages, which deal with the same topic but are not translations of one another.

This work had two primary objectives. The first is to evaluate the input of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second objective is to identify the bilingual-lexicon-extraction methods which best match the translators' needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the management of multiple morphological structures, and the ranking of candidate translations.

The experiments are carried out on two language pairs (English–French and English–German) and on specialized texts dealing with breast cancer. This research places significant emphasis on applicability: methodological choices are guided by the needs of the final users. This book is organized in two parts: the first part presents the applicative and scientific context of the research, and the second part is given over to efforts to improve compositional translation.

The research work presented in this book received the PhD Thesis award 2014 from the French association for natural language processing (ATALA).

Best software development books

Peopleware: Productive Projects and Teams (2nd Edition)

Two of the computer industry's best-selling authors and lecturers return with a new edition of the software management book that started a revolution.

With humor and wisdom drawn from years of management and consulting experience, DeMarco and Lister demonstrate that the major issues of software development are human, not technical, and that managers ignore them at their peril.

Beginning App Development with Parse and PhoneGap

Beginning App Development with Parse and PhoneGap teaches you how to start app development with Parse and PhoneGap: free and open source software. Using the building block languages of the web (HTML, JavaScript, and CSS), you'll be on your way to creating a fully working product with minimal effort as fast as possible.

Stand Back and Deliver: Accelerating Business Agility

Enhance fundamental value and establish competitive advantage with leadership agility. Whether you're leading an organization, a team, or a project, Stand Back and Deliver gives you the agile leadership tools you'll need to achieve breakthrough levels of performance. This book brings together immediately usable frameworks and step-by-step processes that help you focus all your efforts where they matter most: delivering business value and building competitive advantage.

Software in 30 days: how agile managers beat the odds, delight their customers, and leave competitors in the dust

A radical approach to getting IT projects done faster and cheaper than anyone thinks possible

Software in 30 Days summarizes the Agile and Scrum software development method, which allows the creation of game-changing software in just 30 days. Projects that use it are three times more successful than those that don't. Software in 30 Days is for the business manager, the entrepreneur, the product development manager, or the IT manager who wants to develop software better and faster than they now believe possible. Learn how this unorthodox process works, how to get started, and how to succeed. Control risk, manage projects, and have your people succeed with simple but profound shifts in thinking.
The authors explain powerful concepts such as the art of the possible, bottom-up intelligence, and why it's good to fail early, all with no risk greater than thirty days.

* The productivity gain vs traditional "waterfall" methods has been over 100% on many projects
* Author Ken Schwaber is a co-founder of the Agile software movement, and co-creator, with Jeff Sutherland, of the "Scrum" technique for building software in 30 days
* Coauthor Jeff Sutherland was a cosigner of the Agile Manifesto, which marked the start of the Agile movement

Software in 30 Days is a must-read for all managers and business owners who use software in their organizations or in their products and want to stop the cycle of slow, expensive software development. Programmers will want to buy copies for their managers and their customers so that they will know how to collaborate to get the best work possible.

Extra resources for Comparable Corpora and Computer-assisted Translation

Example text

In this section, we have presented the state of the art in comparable-corpus alignment techniques. […] 4, we will describe the way in which we have created a CAT prototype that relies on the distributional method to extract bilingual lexicons from comparable corpora.

4. CAT software prototype for comparable corpora processing

In the industrial context of Lingua et Machina, comparable-corpus extraction is meant to provide impetus for the generation of linguistic resources in emerging fields or fields in which the company has very little translation memory.
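The distributional method mentioned above is not detailed in this excerpt. As a rough illustration only, the following is a minimal sketch of the standard idea: build a context vector for each word, project source vectors into the target language through a seed lexicon, and rank target words by cosine similarity. The function names, the window size, the raw co-occurrence counts and the toy corpora are all assumptions made for this example, not the prototype's actual design; real systems typically weight contexts with an association measure rather than raw counts.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=2):
    """Map each word to a Counter of words co-occurring within `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def translate_vector(vector, seed_lexicon):
    """Project a source context vector into the target language via a seed lexicon."""
    translated = Counter()
    for word, weight in vector.items():
        for target in seed_lexicon.get(word, ()):
            translated[target] += weight
    return translated

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0

def rank_candidates(term, src_vectors, tgt_vectors, seed_lexicon):
    """Rank every target word as a candidate translation of `term`."""
    projected = translate_vector(src_vectors[term], seed_lexicon)
    scores = [(cand, cosine(projected, vec)) for cand, vec in tgt_vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy comparable corpora (hypothetical data, not taken from the book).
english = [["breast", "tumour", "treatment"], ["tumour", "treatment", "surgery"]]
french = [["sein", "tumeur", "traitement"], ["tumeur", "traitement", "chirurgie"]]
seed = {"breast": ["sein"], "treatment": ["traitement"], "surgery": ["chirurgie"]}

ranked = rank_candidates(
    "tumour", context_vectors(english), context_vectors(french), seed
)
print(ranked[0][0])  # best-ranked candidate for "tumour"
```

In this toy setup the two corpora mirror each other, so the correct candidate gets a perfect score; on real comparable corpora the ranking is far noisier, which is why the ranking of candidate translations is treated as a problem in its own right.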

Generalist lexicon: 1,842 English–French pairs extracted from our bilingual dictionary. 45 entries overlap between the two lexica. In accordance with other research works, we ensured that each term to be translated appeared at least 5 times in the corpus. Translation was carried out from English to French.

2. Extraction of the terms to be aligned

The terms to be aligned are extracted from the source and target corpora. […] (e.g. simple words) belonging to the grammatical categories of noun, adjective, adverb and verb, occurring more than five times.
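The term-extraction filter described above (content words from four grammatical categories, occurring more than five times) can be sketched as follows. This is an illustration under assumptions: the input is taken to be already part-of-speech tagged, and the tag names, the function name `extract_terms` and the sample data are hypothetical, since the excerpt does not name a tagger or a tagset.

```python
from collections import Counter

# Assumed coarse tag names; the excerpt lists the categories but not a tagset.
CONTENT_POS = {"NOUN", "ADJ", "ADV", "VERB"}

def extract_terms(tagged_tokens, min_count=5):
    """Return content words occurring more than `min_count` times.

    tagged_tokens: iterable of (word, pos) pairs, assumed to be already tagged.
    """
    counts = Counter(
        word.lower() for word, pos in tagged_tokens if pos in CONTENT_POS
    )
    return {word: n for word, n in counts.items() if n > min_count}

# Hypothetical tagged corpus: "benign" occurs exactly 5 times and is dropped,
# "the" is frequent but is not a content word.
sample = [("tumour", "NOUN")] * 6 + [("the", "DET")] * 20 + [("benign", "ADJ")] * 5
terms = extract_terms(sample)
print(terms)  # {'tumour': 6}
```

The strict comparison follows the "more than five times" wording; the "at least 5 times" threshold quoted for the terms to be translated would correspond to `min_count=4` here.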

Bilingual syntactic patterns are acquired in three steps from an English–Spanish parallel corpus:

1) Acquisition of the English syntactic patterns on the source part of the corpus, for example: […], […], <[NOUN] against fraud>
2) Acquisition of the Spanish syntactic patterns on the target part of the corpus, for example: […], […], <[NOUN] contra fraude>
3) Alignment of the English and Spanish patterns: […]

Average of the precisions obtained for a recall level varying between 0 and 1.

About the Author