Executive Summary
Global corporations must overcome the "language barrier" in order to sell their products to overseas markets. Providing up-to-date documentation in multiple foreign languages is a challenging task and companies have to deal with a number of complex issues, including definition and maintenance of industry-specific jargon, scarcity of competent human translation resources, time-to-market of translated publications, quality/consistency issues, and translation/localization costs. Thus, many companies have committed sizeable resources to implement Controlled English writing, in order to improve the readability and intelligibility of their technical publications.
Based on their respective experiences, BridgeTerm and its partners have found that the effort spent converting English to Controlled English also covers a substantial part of the translation process, because it yields an "unambiguous understanding" of the source text. The benefit goes beyond speeding up translation: one major weakness of machine translation systems is their inadequate comprehension of ordinary, uncontrolled language. Thus, starting from a source text in Controlled English, existing translation engines can provide improved results. BridgeTerm's concept goes even further by including a translation feedback loop that lets an English-speaking editor assess the "translatability" of the source sentence and tweak it until the retranslated sentence is close enough to the original to indicate that the foreign-language rendering is correct. This radically new paradigm, by which translation becomes a by-product of the authoring/writing process, is the essence of a new product called Trampolino.
On a practical level, a custom solution has to be developed for every specific environment to take into account various input/output formats, interface terminology and translation memory databases, adapt software tools, etc. However, this development is straightforward, since most of the required components are available. The Trampolino concept can be compared to the modular customization approach SAP has been promoting with worldwide success in the field of Enterprise Resource Planning (ERP).
A Trampolino project should consolidate all your current efforts in the fields of language standardization and documentation cycle streamlining. BridgeTerm is prepared to assist, provide direction, and/or manage your resources involved in Controlled English implementation and terminology standardization. Benefits in terms of improved communication, cost reduction and shorter time-to-market make Trampolino the right answer to the linguistic challenges of the global market.
Background
Technical publications produced by writers/authors follow rather loose rules that leave a lot of stylistic freedom. Some industries have recognized the need for a "constrained approach" based on Controlled English writing rules, in order to improve both readability and translatability. This trend is increasing in the aircraft manufacturing, civil engineering machinery, and farm equipment industries. (Caterpillar has invested over US$20 million in its Caterpillar Technical English (CTE) project.) Others will certainly follow under the pressure of globalization and legal liability issues. The concept rests on the simple observation that a well-written text is easier to read, to understand and, ultimately, to translate.
Training, Marketing, Legal, Public Relations and other internal publication departments of multinational corporations have traditionally used a "blind approach" to foreign language publications, basically consisting of sequential authoring, review and approval steps in the source language, before even considering translation into one or more foreign languages. This process is not only inefficient in terms of overall costs and turnaround time, but also misses one key point: the translation process is also a review step which may help catch mistakes overlooked at earlier stages.
By capitalizing on the feedback from the translation step (human or machine translator) to the writing step, the integrated concept described here achieves major gains in terms of cost reduction, time-to-market, and overall accuracy. Proven software solutions are used and most of the development effort is invested in integrating various software tools and building the required interfaces. This modularity is beneficial since new technologies (particularly in the field of machine translation) will easily be incorporated later, as they become available. To summarize, the proposed approach leads to customized language solutions modeled after SAP's philosophy of modular ERP "building blocks".
The Development Team
BridgeTerm (a division of InfoGraffiti), Linguistique TEST, and Smart Communications have committed their respective expertise in linguistics and knowledge management to improving multilingual communications.
- InfoGraffiti, through its BridgeFax division, is highly successful in providing company-wide fax server solutions to large organizations, such as the Montreal Stock Exchange, the Bank of Montreal and Nortel. BridgeTerm, another division of InfoGraffiti, has developed innovative linguistic tools for terminology extraction, fuzzy logic phrase matching, translation memory alignment and machine translation. Its flagship product, ProMemoria, is an integrated translation platform combining all these technologies into a synergistic translation environment. BridgeTerm has done custom development for Statistics Canada, Ernst & Young, and translation companies.
- TEST has provided translation services for almost 20 years to large corporations and government agencies, cooperating with Systran Software since 1980, as well as with other vendors of language processing tools. TEST is currently a distributor of Systran products and solutions.
- Smart Communications, Inc. is a leader in Controlled English solutions and its proven language verification tool, MAXit, is already used by large corporations for implementing Controlled English. Smart Communications is a recognized expert at training technical publication authors and writers.
Figure: Trampolino Integration Flow Chart
Process Description
The source sentence, as produced by an author in compliance with a Controlled English writing standard, is submitted to the MAXit verification tool. The text and the suggested corrections are displayed and, after a first editing pass, the text is sent to the translation process where a CE translation engine generates a target language translation. The machine-translated sentence, displayed in the target language window, is re-translated into the source language by a back-translation engine. Thus, the author is presented with the source text, the translated target text and the re-translated version. Depending on how the two English versions compare, the author can tweak the source text to improve its translatability. This system can be used (1) as a production tool for translating documents, (2) as a CE training tool, providing a clear demonstration of how different wording affects translatability, and (3) as an alignment tool for building translation memories from already translated documents.
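The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_round_trip`, the `RoundTrip` record and the three toy functions are hypothetical names, not part of MAXit, ProMemoria or Systran, and the English/French phrase table is an arbitrary stand-in for real engines. The author's manual editing pass between checking and translation is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RoundTrip:
    """The three texts shown to the author, plus the checker's comments."""
    source: str
    suggestions: List[str]
    target: str
    back_translation: str


def run_round_trip(
    source: str,
    check: Callable[[str], List[str]],
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
) -> RoundTrip:
    """Check the source, translate it, then translate the result back."""
    suggestions = check(source)
    target = translate(source)
    back = back_translate(target)
    return RoundTrip(source, suggestions, target, back)


# Toy stand-ins (assumptions for illustration only): a one-entry phrase
# table replaces the CE translator and the back translator.
TOY_FORWARD = {"close the valve": "fermez la vanne"}
TOY_BACKWARD = {target: source for source, target in TOY_FORWARD.items()}


def _key(sentence: str) -> str:
    """Normalize a sentence for lookup in the toy phrase tables."""
    return sentence.strip().rstrip(".").lower()


def toy_check(sentence: str) -> List[str]:
    """Flag sentences longer than 20 words (a hypothetical writing rule)."""
    count = len(sentence.split())
    return [f"{count} words: shorten the sentence."] if count > 20 else []


def toy_translate(sentence: str) -> str:
    return TOY_FORWARD.get(_key(sentence), "<no translation>")


def toy_back_translate(sentence: str) -> str:
    return TOY_BACKWARD.get(_key(sentence), "<no translation>")


result = run_round_trip(
    "Close the valve.", toy_check, toy_translate, toy_back_translate
)
```

Passing the three engines in as plain callables mirrors the modular approach of the proposal: any checker or translation engine with the same shape could be swapped in without changing the loop.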
Graphic User Interface
Screen capture: The ProMemoria interface screen
The existing ProMemoria interface screen (as shown above) will be used with the following window assignments:
- The Expressions Found and Expressions Not Found windows display phrases provided by the translation memory and the terminology extractor, respectively.
- The Source Sentence window contains the source text tagged by the CE Checker. This area is the source sentence editing area. An Accept button is used to save the source and target sentences when the translation loop provides a positive result.
- The Target Sentence window contains the translated text in the target language, as provided by the CE Translator. This sentence can be edited in the foreign language.
- The Back Translation Sentence window contains the target text re-translated into the source language by a translation engine.
By comparing the source and back-translated sentences, the editor can assess the accuracy of the overall loop. The editor can then: (1) edit the original text to improve its translatability, (2) add or modify dictionary entries, or (3) accept the translated sentence when satisfied that it is an accurate rendering of the source sentence. Validated sentences are used to populate the translation memory database.
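The comparison itself is left to the editor's judgment, but a rough numeric hint could be computed alongside it. The sketch below is an assumption, not a described feature of Trampolino: it scores word-level overlap with Python's standard `difflib.SequenceMatcher`, and the 0.8 threshold and the function names are hypothetical. A high score does not prove the target sentence is correct; it only suggests where the editor should look first.

```python
import difflib
import re
from typing import List, Tuple


def tokens(sentence: str) -> List[str]:
    """Lowercase the sentence and keep only its words and numbers."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def round_trip_similarity(source: str, back_translation: str) -> float:
    """Word-level similarity in [0, 1] between source and back translation."""
    matcher = difflib.SequenceMatcher(
        None, tokens(source), tokens(back_translation)
    )
    return matcher.ratio()


def suggest_action(
    source: str, back_translation: str, threshold: float = 0.8
) -> Tuple[str, float]:
    """Return a hint ('accept' or 'edit') and the score behind it.

    The threshold is an arbitrary illustrative value; the final decision
    stays with the human editor.
    """
    score = round_trip_similarity(source, back_translation)
    return ("accept" if score >= threshold else "edit"), score


good_hint, good_score = suggest_action("Close the valve.", "close the valve")
poor_hint, poor_score = suggest_action(
    "Remove the cover before you start the engine.",
    "Take off the lid before the motor is started.",
)
```

In the second pair the back translation preserves the meaning but not the wording, so a surface measure like this one scores it low. That is one reason the score can only be a hint and not a replacement for the editor's comparison.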
Functional Modules
Figure: Trampolino Process Block Diagram
- Translation Memory: (1) provides exact and fuzzy matches of the source sentence, (2) stores newly accepted source/target sentences. It should be noted that the translation memory is reversible, i.e. works in either language (translation to or from a foreign language).
- Terminology Extractor: (1) scans the source text for terms and expressions contained in the system dictionaries, (2) accepts new terms from human operator.
- Controlled English Checker: (1) scans source text against a dictionary and a predefined set of writing rules, (2) builds a readability index, (3) displays comments and suggestions to the author.
- Controlled English Translator: (1) translates edited source text into target language, taking into account the CE rules set.
- Back Translator: (1) translates text in target language back into source language. This translator need not be of the controlled-language type, since it should accept a broad spectrum of target language renderings.
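To make the Translation Memory module more concrete, here is a minimal sketch of exact and fuzzy lookup that works in either direction. It is not ProMemoria's implementation, which this document does not describe: the `TranslationMemory` class, its methods and the 0.75 cutoff are hypothetical, and character-level `difflib.SequenceMatcher` scoring stands in for whatever fuzzy matching the product actually uses.

```python
import difflib
from typing import List, Tuple

# (kind, score, matched sentence, its counterpart in the other language)
Match = Tuple[str, float, str, str]


class TranslationMemory:
    """Stores accepted sentence pairs and looks them up in either direction."""

    def __init__(self) -> None:
        self._pairs: List[Tuple[str, str]] = []

    def store(self, source: str, target: str) -> None:
        """Add an accepted source/target pair, skipping duplicates."""
        pair = (source, target)
        if pair not in self._pairs:
            self._pairs.append(pair)

    def lookup(
        self,
        sentence: str,
        reverse: bool = False,
        cutoff: float = 0.75,
        limit: int = 3,
    ) -> List[Match]:
        """Return exact and fuzzy matches, best first.

        With reverse=True the sentence is matched against the target side,
        which is what makes the memory usable in both directions.
        """
        side, other = (1, 0) if reverse else (0, 1)
        matches: List[Match] = []
        for pair in self._pairs:
            score = difflib.SequenceMatcher(
                None, sentence.lower(), pair[side].lower()
            ).ratio()
            if score >= cutoff:
                kind = "exact" if pair[side] == sentence else "fuzzy"
                matches.append((kind, round(score, 2), pair[side], pair[other]))
        # Exact matches first, then fuzzy matches by descending score.
        matches.sort(key=lambda m: (m[0] != "exact", -m[1]))
        return matches[:limit]


memory = TranslationMemory()
memory.store("Close the valve.", "Fermez la vanne.")
memory.store("Open the valve.", "Ouvrez la vanne.")

exact = memory.lookup("Close the valve.")
fuzzy = memory.lookup("Close the valves.")
backward = memory.lookup("Fermez la vanne.", reverse=True)
```

Keeping each pair as a single record, and choosing the side to match at lookup time, is one simple way to get the reversibility the module description calls for without maintaining two separate databases.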
Implementation of the Required Components
The project will be described in the context of a Controlled-English (CE) to Mandarin Chinese writing/translation system adapted to the requirements of a worldwide organization trying to support its efforts to penetrate the growing market of the People's Republic of China.
- Translation Memory: The existing translation memory used in BridgeTerm's ProMemoria will be used for this module. Only minor adaptation to the application will be required.
- Terminology Extractor: The existing extractor/builder used in ProMemoria will be retained for this module. The same extraction tool can be used for "data mining" the terminology contained in other documentary sources.
- Controlled English Checker: MAXit, developed by Smart Communications, is proposed. Under a cross-licensing agreement, the MAXit tool can be used with existing Controlled English dictionaries. Smart Communications can provide assistance, dictionary building, and training services.
- Controlled English Translator: A simple translation engine based on Controlled English rules and vocabulary will provide good results. The parsing done by MAXit ensures that the source text is clearly understandable, so the engine does not need a large number of rules to resolve ambiguities.
- Back Translator: Systran's Chinese-to-English translation engine will be used. Systran's API is already integrated into ProMemoria, so this development effort will be minimal.
- GUI, Databases, Middleware: These will be developed to link the various modules, together with various dictionary and translation memory management tools.
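As a rough illustration of the kind of work the Controlled English Checker module performs (dictionary scan, writing rules, readability index), the sketch below applies two toy rules and computes the Automated Readability Index. It does not reproduce MAXit's rule set, dictionaries or interface, none of which are described here; the approved word list, the 20-word limit and all function names are assumptions made for the example, and ARI is simply one well-known index chosen for illustration.

```python
import re
from typing import List

# Hypothetical approved vocabulary and sentence-length rule.
APPROVED_WORDS = {
    "close", "open", "remove", "start", "before", "you",
    "the", "valve", "cover", "engine",
}
MAX_WORDS_PER_SENTENCE = 20


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def words_of(sentence: str) -> List[str]:
    return re.findall(r"[A-Za-z']+", sentence)


def check_text(text: str) -> List[str]:
    """Return comments for over-long sentences and unapproved words."""
    comments: List[str] = []
    for number, sentence in enumerate(split_sentences(text), start=1):
        words = words_of(sentence)
        if len(words) > MAX_WORDS_PER_SENTENCE:
            comments.append(
                f"Sentence {number}: {len(words)} words "
                f"(limit {MAX_WORDS_PER_SENTENCE})."
            )
        for word in words:
            if word.lower() not in APPROVED_WORDS:
                comments.append(
                    f"Sentence {number}: '{word}' is not in the "
                    "approved dictionary."
                )
    return comments


def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * chars/words + 0.5 * words/sentences - 21.43."""
    sentences = split_sentences(text)
    words = [word for s in sentences for word in words_of(s)]
    if not sentences or not words:
        return 0.0
    characters = sum(len(word) for word in words)
    return (
        4.71 * characters / len(words)
        + 0.5 * len(words) / len(sentences)
        - 21.43
    )


comments = check_text("Close the valve. Unscrew the cover.")
index = automated_readability_index("Close the valve. Unscrew the cover.")
```

A restricted vocabulary of this kind is also what would keep the Controlled English Translator simple: every approved word can be given a single intended meaning, so fewer disambiguation rules are needed downstream.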
Conclusion
For a relatively modest investment of time and effort, MAXit and Trampolino can be integrated to provide a very useful multilingual document production platform aimed at large corporations and organizations. Proven software solutions are used and most of the development effort is devoted to integrating software modules and building the required interfaces. Another benefit of this modular structure is that new developments in the field of machine translation can easily be retrofitted as they become available. To summarize, the proposed approach leads to customized solutions modeled after SAP's philosophy of modular ERP building blocks. Trampolino has the flexibility to be integrated into a specific linguistic environment in order to consolidate your writing, translation and publishing activities into an efficient communication tool supporting your global expansion.