M. V. Umerova

State University–Higher School of Economics


This paper explores modern machine translation systems. Traditionally, translation programs are divided into two types: rule-based and example-based. Example-based systems work on the translation memory principle, which prevails in the computer programs used for translation today.
Today, information technologies in the sphere of translation include electronic dictionaries, terminology databases, reference books and encyclopedias, as well as special programs for computerized (or machine) translation. Let us consider the main application programs available today for computer-aided translation (CAT) of texts.

Traditionally, machine translation systems are divided into two categories: rule-based and example-based. Rule-based systems have a more thoroughly developed grammar and rely on explicit rules to a greater extent. Example-based systems are self-learning: they generate language rules dynamically from specific text examples. The boundary between the two is not sharp, because both use dictionaries and rules for working with those dictionaries. Today, the most common system built on the example-based principle is TRADOS, which works mainly on examples, with virtually no use of grammar rules. Programs of this kind are called Translation Memory tools (TM tools).

TM tools save parallel sentences in a database: a sentence from the original and its translation. Larger text fragments can also be stored, but it is sentences that are recorded automatically (the program processes the text from one full stop to the next). As soon as the translator starts work on a sentence, the program checks the database for the same or a similar sentence and offers the existing translation. The database thus expands over time, especially quickly in translation centers where a large number of translators perform many translations. Most of these programs can be installed on a personal computer or used over a network. When all translators use a common database, a unified translation style can be developed for the entire translation agency. The editing mode allows errors to be corrected and prevents their repetition in later translations.
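The lookup described above can be illustrated with a minimal sketch. It is not how TRADOS is implemented: the class name `TranslationMemory`, the 0.75 threshold and the use of Python's `difflib` character similarity are all assumptions chosen for brevity.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores source sentences with their translations."""

    def __init__(self):
        self.pairs = {}  # source sentence -> stored translation

    def add(self, source, target):
        """Save (or overwrite) the translation of a source sentence."""
        self.pairs[source] = target

    def lookup(self, sentence, threshold=0.75):
        """Return (stored_source, translation, similarity) for the best match,
        or None if no stored sentence is similar enough."""
        best = None
        for source, target in self.pairs.items():
            # Character-level similarity in [0, 1]; an assumed stand-in for
            # whatever matching a real TM tool uses.
            score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
            if best is None or score > best[2]:
                best = (source, target, score)
        if best is not None and best[2] >= threshold:
            return best
        return None


tm = TranslationMemory()
tm.add("Save the file.", "Speichern Sie die Datei.")
tm.add("Print the document.", "Drucken Sie das Dokument.")

print(tm.lookup("Save the file."))           # exact match, similarity 1.0
print(tm.lookup("Save the files."))          # similar sentence, offered for editing
print(tm.lookup("Restart the computer."))    # nothing similar enough: None
```

Every corrected translation passed back to `add` overwrites the stored one, which mirrors the way the editing mode prevents an error from being offered again.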

TM programs also provide terminological support: when the program detects a word or phrase that was marked as a term in previous translations, it draws the translator’s attention to this fragment and offers the existing translation. In this way, all translators in an agency can maintain consistent terminology. There are also ready-made terminology databases, which the translation customer can provide as an add-on to the agency’s existing program. This speeds up the translation process and improves its quality.
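Terminological support of this kind can be sketched as a search for known terms in the sentence being translated. The function name `find_terms` and the sample termbase are hypothetical, and the matching is deliberately naive: it looks for whole words only and does not handle inflected forms.

```python
import re


def find_terms(sentence, termbase):
    """Return (term, approved_translation) pairs for termbase entries
    that occur in the sentence as whole words."""
    hits = []
    lowered = sentence.lower()
    # Longer terms are checked first so that multi-word terms are reported
    # before any shorter terms.
    for term in sorted(termbase, key=len, reverse=True):
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        if re.search(pattern, lowered):
            hits.append((term, termbase[term]))
    return hits


# Hypothetical English-German termbase.
termbase = {
    "translation memory": "Translation-Memory",
    "segment": "Segment",
}

for term, translation in find_terms("The translation memory stores each segment.", termbase):
    print(f"term '{term}' -> use '{translation}'")
```

A production tool would need morphological analysis so that plural or inflected forms of a term are also recognized.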

The TRADOS system is intended for large translation centers where many parallel texts have accumulated. It spares the translator from translating the same sentence twice: it finds a similar sentence in a database of parallel texts and returns the translation already made. For large arrays of similar texts, this approach is very effective. The concept of a document array is important for machine translation. Most experts agree that machine translation is feasible only for applied (technical) texts, which can be grouped into well-defined, often simply gigantic, arrays. Fiction, whose translation requires choosing the contextual meaning of ambiguous words and rendering wordplay, contaminated speech, metaphors, allusions and other stylistic devices, will not be translated adequately by a computer, at least in the foreseeable future.

There are few machine translation systems on the Russian market, and the leader among them is PROMT. According to some estimates, PROMT holds up to 95% of the machine translation market in Russia, and its position is strong in Europe as well. PROMT is a commercial product, so the program's internal algorithms are not available to the wider research community. From this point of view, the program is something of a “black box” whose contents cannot be analyzed. It can nevertheless be argued with reasonable confidence that the system uses bilingual dictionaries supplied with the necessary linguistic information: morphological, syntactic and semantic. The program can separate unambiguous words from polysemous ones, after which the unambiguous words are translated using lists of equivalents. To translate polysemous words, special contextual dictionaries are used, whose entries are algorithms that query the context for the presence or absence of contextual determinants of meaning.
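Since PROMT's internals are not public, the following is only a hypothetical illustration of the general idea of a contextual dictionary, not a reconstruction of the product. The names `EQUIVALENTS`, `CONTEXT_DICTIONARY` and `translate_word`, the English-German data and the rule format are all assumptions.

```python
# Unambiguous words: translated directly from a list of equivalents.
EQUIVALENTS = {"oxygen": "Sauerstoff"}

# Polysemous words: each entry lists context determinants and the translation
# they select, plus a default used when no determinant is found.
CONTEXT_DICTIONARY = {
    "bank": {
        "rules": [
            ({"river", "shore", "water"}, "Ufer"),
            ({"money", "account", "loan"}, "Bank"),
        ],
        "default": "Bank",
    },
}


def translate_word(word, context_words):
    """Translate one word, querying the context if the word is polysemous.
    Returns None for words missing from both dictionaries."""
    word = word.lower()
    if word in EQUIVALENTS:
        return EQUIVALENTS[word]
    entry = CONTEXT_DICTIONARY.get(word)
    if entry is None:
        return None
    context = {w.lower() for w in context_words}
    for determinants, translation in entry["rules"]:
        # A rule fires when at least one determinant is present in the context.
        if determinants & context:
            return translation
    return entry["default"]


print(translate_word("bank", "they walked along the river bank".split()))
print(translate_word("bank", "he opened an account at the bank".split()))
```

Here the determinants are purely lexical; the approach described in the text also allows syntactic and morphological determinants, which this sketch leaves out.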

The well-known linguist Yu. N. Marchuk, whose works have become classics of applied linguistics, devotes much attention to the problem of lexical ambiguity in machine translation. The ambiguity is resolved by detecting lexical, syntactic and morphological determinants in the text that fix the translation of a polysemous word. Today, a large number of works are devoted to the patterns by which vocabulary, syntax and semantics are made explicit within specific sublanguages that have their own lexical, structural, syntactic and semantic features.

Yu. N. Marchuk attributes the low quality of translations produced by many machine translation systems to the fact that most developers treat the final translated text as a composition, or sum, of the translations of its individual parts (by analogy with the well-known property of building large units from smaller ones). Translation does not have this property: correct translations of individual parts of a text do not automatically yield a correct translation of the text as a whole. Marchuk sees the solution in taking precise account of the specific features of the subject field and the linguistic composition of specific sublanguages (i.e., areas that are obviously much smaller than the entire natural language system). For sublanguages, it is possible to define the meanings of individual linguistic units in such a way that their totality (linear combination) does not contradict the idea of the text as a whole. On this theoretical basis, the idea arose of creating contextual dictionaries for particular types of texts within particular semantic fields and sublanguages; today such dictionaries include interpretations in addition to contexts of use. Contextual dictionaries make it possible to create multilingual terminological databases for broad subject areas.

Thus, the contextological approach synthesizes the two main trends in the applied description of vocabulary: the vocabulary-centric and the text-centric. Many existing machine translation systems are based on the text-centric approach: vocabulary is described on the basis of auxiliary concordance dictionaries, which result from the analysis of large text arrays. What matters most here is the textual use of a word, i.e. the totality of its specific meanings in a given type of text. In earlier machine translation systems, a contextual dictionary was combined with a special algorithm that queried the context of each polysemous word for determinants. Modern programming languages make it possible to implement a dictionary system without tying it to a special algorithmic procedure, and today a dictionary can be implemented in a computer program in a variety of ways.

The most important result of using contextological dictionaries is an effective solution to the problem of lexical ambiguity, since it has long been proven that it is the lexical meanings of words that convey the main part of the semantic information of a sentence and text within the discourse of a certain subject area.

Today, the developers of the PROMT machine translation system offer an integrated version that combines PROMT XT Professional and the TRADOS TM system. The two systems work together as follows. First, TRADOS analyzes the document being translated and identifies segments whose translation is not in the database, or whose match with the database falls below a specified percentage. The identified segments are passed to PROMT XT Professional for machine translation. To improve translation quality, the user should first configure the system by selecting the topic of the document, which connects specialized dictionaries and ensures uniform terminology throughout the document. Machine-translated segments are added to the Translation Memory database with a note that they were created by PROMT. When working further with the document in TRADOS, the user only has to edit them and save them in the Translation Memory database. The joint use of the two systems yields not only the translated document but also the corresponding Translation Memory database, which can be used in subsequent translations.
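The division of labor between the two systems can be sketched as a small routing function. Nothing here reflects the real TRADOS or PROMT interfaces: `pretranslate`, `best_match`, the record layout and the stub machine translator are hypothetical, and `difflib` similarity stands in for the actual match percentage.

```python
from difflib import SequenceMatcher


def best_match(segment, memory):
    """Return (stored_source, similarity) of the closest stored segment."""
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best_source, best_score = source, score
    return best_source, best_score


def pretranslate(segments, memory, machine_translate, threshold=0.75):
    """Take translations from the memory where the match is good enough;
    send the remaining segments to machine translation and store the result
    in the memory, flagged as machine-made."""
    results = []
    for segment in segments:
        source, score = best_match(segment, memory)
        if source is not None and score >= threshold:
            results.append((segment, memory[source]["target"], "TM"))
        else:
            target = machine_translate(segment)
            # The origin flag corresponds to the note that the segment was
            # created by the machine translator and still needs editing.
            memory[segment] = {"target": target, "origin": "MT"}
            results.append((segment, target, "MT"))
    return results


def stub_mt(text):
    """Placeholder for a real machine translation engine (assumption)."""
    return "[MT] " + text


memory = {"Save the file.": {"target": "Speichern Sie die Datei.", "origin": "human"}}
for row in pretranslate(["Save the file.", "Restart the computer now."], memory, stub_mt):
    print(row)
```

After the run, the memory holds both the human translation and the flagged machine output, so a later edit only needs to overwrite the flagged entry.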

Today, the curricula of most language universities and departments include lectures on existing machine translation systems, but not all universities are able to give students hands-on experience with electronic translators. Since 2005, the PROMT and TRADOS translation systems have been supplied to some Russian universities (RGGU, Herzen RGPU and others), and universities are developing special courses on working with computer translation systems.

The developers also offer an updated version of X-Translator Revolution, built on a new translation engine that has significantly improved translation quality. The program works with six European languages in various combinations. There is a special application for translating messages in the ICQ program. The possibilities for translating specialized texts have been expanded by the release of new dictionaries on the topics “Commerce”, “Science” and “Technology”, with a total vocabulary of more than 250 thousand words and phrases. Connecting these dictionaries improves the quality of translations of contracts, financial documents, scientific articles and technical documentation. As advantages of the product, the developers highlight the Microsoft Office 2003 style interface, familiar to most users, and the ability to translate email and websites.

Besides commercial developments in the field of machine translation, there are also so-called academic ones. The most authoritative is the ETAP system, which works with the Russian-English language pair. For Russia, this system has the same significance that Systran, which has become a classic example, has for the whole world. ETAP marked the beginning of a whole scientific direction that has become the main one in Russian academic work on machine translation. The technologies of the ETAP system (unlike, for example, those of PROMT) have been described in open publications. Comparing these two systems, A. Sokirko, Candidate of Technical Sciences, judges in his publications that PROMT favors the semantic clarity of the translation at the expense of its grammatical correctness, because PROMT is aimed at the end user. The creators of ETAP, according to Sokirko, on the contrary pay a great deal of attention to grammatical correctness, because ETAP is an academic development focused specifically on the correctness of the result.

An attempt to synthesize these two approaches (semantic clarity and grammatical correctness) was made by the Dialing group (www.aot.ru), which led the development of a machine translation program. This program is distinguished by a so-called surface-semantic module, whose development is based on the procedure of semantic analysis well known in linguistics. Semantic analysis is implemented algorithmically not through the interpretation familiar to linguists but with the help of graphs (so-called semantic trees), whose nodes contain words or units equivalent to words in size. The relations in the graph are specified by a list and are called semantic relations (for example, the relations subject - object, subject - aspect, etc.). A. Sokirko shows how, for each sentence of the input text, its own semantic structure is built, on the basis of which machine translation is carried out. Let us give an example of a graph from A. Sokirko's article:

Such graphs can include not only words but also set phrases and expressions, abstract connectives, rigid syntactic groups (for example, “twenty-two boys”), etc. Translation based on such semantic structures proceeds as follows: from the Russian semantic structure an equivalent English one is built, with English words and phrases in the nodes, and then a chain of English words is synthesized from that structure. This is a multi-stage process that takes a certain amount of time, so one of the weak points of this system is its slow translation speed.
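The first step of this procedure, building an English structure that mirrors the Russian one, can be sketched as follows. The data layout, the relation labels `SUBJ` and `OBJ`, the transliterated Russian words and the function `transfer` are illustrative assumptions, not the Dialing group's actual representation; the synthesis of the English word chain is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    """Nodes hold words or fixed phrases; relations are labeled edges."""
    nodes: dict = field(default_factory=dict)      # node id -> word or phrase
    relations: list = field(default_factory=list)  # (relation, head id, dependent id)


def transfer(graph, lexicon):
    """Build a target-language graph with the same relations, replacing the
    content of every node by its lexicon equivalent (unknown words are kept)."""
    translated = {nid: lexicon.get(word, word) for nid, word in graph.nodes.items()}
    return SemanticGraph(nodes=translated, relations=list(graph.relations))


# Transliterated Russian for "the boy reads a book" (hypothetical analysis).
russian = SemanticGraph(
    nodes={0: "malchik", 1: "chitat", 2: "kniga"},
    relations=[("SUBJ", 1, 0), ("OBJ", 1, 2)],
)
lexicon = {"malchik": "boy", "chitat": "read", "kniga": "book"}

english = transfer(russian, lexicon)
print(english.nodes)
print(english.relations)
```

Even in this reduced form, the graph must be built before transfer and linearized afterwards, which hints at why the full multi-stage process is slow.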

Belarusian linguists A. Chistyakov and A. Skrebnev pay special attention in their works to another kind of system, in which automatic translation is built on mathematical logic and the statistical calculation of probability. Work in this direction, whose foundations were laid by the well-known scientist Franz Josef Och, has been going on for more than 15 years. Today one commercial product of this kind is available on the market: the automatic translation program Language Weaver. The program does not use any dictionaries containing ready-made lexical and grammatical data. It uses only parallel texts (a similar principle is used in Translation Memory tools), and the volume of texts is very large. The program analyzes originals and translations that contain a piece of text similar to the one that now needs to be translated. Having compared many translation options, it selects the one that was used most often. The program thus records the probability with which a given piece of text should be translated in a particular way, and this probability gradually approaches one hundred percent. This eliminates the need to load specialized dictionaries (an advantage of the statistical method over transformation-based programs), because the program itself builds a continually used dictionary tailored to the particular user. The creators of the program claim that their system now provides higher translation quality than other machine translation systems.

Text prepared for translation in Language Weaver can be supplied in various text and even audio file formats. The collected parallel texts are recognized and aligned at the sentence level to create a parallel text corpus. This corpus is processed by the Language Learner subroutine, which determines the probability of each translation and compiles probabilistic dictionaries, templates and rules, i.e. the translation parameters. These parameters are then used by the statistical translator (decoder) when translating new texts.
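The learning and decoding steps can be illustrated with a deliberately simplified sketch that estimates translation probabilities as relative frequencies over already aligned phrase pairs. The function names `learn_parameters` and `decode` are hypothetical, and a real statistical system also involves word alignment, a language model and a search over many hypotheses, none of which is shown here.

```python
from collections import Counter, defaultdict


def learn_parameters(aligned_pairs):
    """Estimate P(target | source) as a relative frequency over aligned
    (source, target) phrase pairs from a parallel corpus."""
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {target: n / total for target, n in targets.items()}
    return table


def decode(source, table):
    """Pick the most probable stored translation, or None if the phrase
    was never seen in the corpus."""
    options = table.get(source)
    if not options:
        return None
    return max(options, key=options.get)


# Toy English-German corpus: one rendering was used three times out of four.
corpus = [("good morning", "guten Morgen")] * 3 + [("good morning", "guten Tag")]
table = learn_parameters(corpus)

print(table["good morning"])
print(decode("good morning", table))
```

As more parallel text confirms one rendering, its relative frequency rises, which is the sense in which the probability of a translation gradually approaches one hundred percent.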

At the current stage of development, the program does not work completely independently. The final version is evaluated by the translator, who selects from the set proposed by the program the sentence that most closely matches the original in communicative equivalence. In subsequent translations, the program uses the selected sentence as a template or sample.

This system most clearly represents an example of the development of artificial intelligence and is likened by the authors to the process of a child mastering natural language: first, the child copies letters and words without understanding their meaning, and then proceeds to meaningful rewriting of words and expressions. Thus, this machine translation system works not with language material (the meanings of individual words and phrases, syntactic rules), but with precedent texts, i.e. with the meanings of statements that represent elements of intercultural communication. When working with texts in this way, there is a transformation of the linguistic shell of meaning as a cultural phenomenon, and not the word itself as a linguistic phenomenon.

According to the authors, with the increase in the technical power of computers, the ability to process parallel texts will increase, and the computer will be able to more or less independently establish intertextual connections. It is possible that mathematical laws of construction will be discovered in classical texts. In particular, in the last century, branches of mathematics arose that describe the works of Mozart and Beethoven with mathematical formulas. Perhaps the computer will be able to work with such paralinguistic sign systems as intonation and facial expressions.

If the problem of automatically checking the adequacy of a translation is solved in the future, the computer will become a completely self-learning system, capable not only of translating but also, in a sense, of generating texts, depending on the “cultural baggage” it contains in the form of parallel texts.

Today, Language Weaver regularly releases updated versions of its machine translation software, in particular version SMTS 4.2. Unlike other electronic translators, SMTS (Statistical Machine Translation Software) uses the statistical analysis methods described above to study sentences, phrases and structures, and selects the most suitable translation from the many available.

The program's support for output in TMX (Translation Memory eXchange) format simplifies the exchange of information between SMTS and other machine translation systems. The program has a built-in filter that makes it possible to translate Microsoft Office documents (MS Word, Excel and PowerPoint) and output the results in the original format.
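To show what such an exchange file looks like, here is a minimal sketch that writes translation pairs as TMX using Python's standard `xml.etree.ElementTree`. It says nothing about how SMTS produces its output; the function name `build_tmx` and the header values are assumptions, and the element and attribute names follow the TMX 1.4 layout as commonly documented, so real files should be validated against the specification.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for the xml:lang attribute; ElementTree writes it
# with the reserved "xml" prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, source_lang="en", target_lang="de"):
    """Serialize (source, target) sentence pairs as a TMX document string."""
    root = ET.Element("tmx", version="1.4")
    ET.SubElement(root, "header", {
        "creationtool": "example-script",   # assumed values for illustration
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for source, target in pairs:
        unit = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((source_lang, source), (target_lang, target)):
            variant = ET.SubElement(unit, "tuv", {XML_LANG: lang})
            ET.SubElement(variant, "seg").text = text
    return ET.tostring(root, encoding="unicode")


print(build_tmx([("Save the file.", "Speichern Sie die Datei.")]))
```

Because each unit simply pairs language-tagged segments, any tool that reads TMX can import the memory regardless of which system produced it.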

Thus, many modern researchers conclude that the quality of machine translation will improve. Most information technology specialists agree that adequate translation of any type of text, performed independently by a machine, will become a reality in 50-70 years as artificial intelligence develops.


Machine translation and its types and functions

1.1 Machine translation in the life of a translator

The most striking examples of how new technologies become indispensable tools can be found in everyday life. It is now difficult to imagine harvesting without a combine, factory work without all kinds of machines, or laundry without a washing machine. All this work used to be done by hand, and it hardly needs saying how much time and effort it took. There are good reasons to believe that Translation Memory technology will soon become firmly established in the lives of translators, and that its use will become as commonplace as chopping vegetables in a food processor. It used to be generally accepted that written translation is an exclusively creative process, akin to writing fiction; it is not for nothing that many well-known translators also made their names as poets or writers. Today, however, the realities of life demand accuracy in conveying information and speed of execution. The modern specificity of written translation lies in the need to translate large volumes of often repetitive technical or business documents. Technical translation generally requires a strict style and canonical forms, and there is little room for creativity when hundreds or thousands of pages of technical documentation must be translated in record time. Documents constantly repeat typical phrases, and if translators have to translate the same thing manually over and over again, this significantly reduces the speed of their work and, as a result, the company’s profit.

The productivity and quality of a translator’s work depend on personal experience and on the ability to learn continually from the experience and knowledge of others. What is a translator’s own experience? It is the translator’s memory, consulted every time to recall whether a particular word, phrase or sentence has been encountered before and how it was translated. Using other people’s knowledge comes down to searching dictionaries for the words and expressions most appropriate in a given context. Sooner or later, however, every translator faces a logical question: how to keep at hand not only dictionaries, but also all of one’s previous translations in a given field?

New problems require new solutions. One of the new translator tools is Translation Memory (TM) technology: a database in which completed translations are stored. Translation Memory is often confused with machine translation (MT), which is certainly also useful and interesting, but describing it is not the purpose of this article. TM technology increases translation speed by reducing the amount of mechanical work. TM does not perform the translation for the translator, but it greatly facilitates the work. The principle of operation is quite simple: during translation, pairs of “source text - final (translated) text” are accumulated in a database (or databases) and then used to translate new documents.

To make it easier to process information and compare different documents, a Translation Memory system breaks the text into separate pieces called segments. Segments are most often sentences, but other segmentation rules are possible. When translating a new text, the system compares each segment with those already in the database. If it finds a completely or partially matching segment, it displays the stored translation together with the degree of match as a percentage. Words and phrases that differ from the stored text are highlighted. These are “hints” of a kind, which ease the translator’s work to some extent and reduce the time needed to edit the translation. As a rule, the match threshold is set at no less than 75%. At a lower percentage, the cost of editing grows too high, and it is faster to translate the segment manually. As a result, when working with TM the translator only has to translate new segments and edit partially matching ones. Every change or new translation is saved in the TM, so there is no need to translate the same thing twice.
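Segmentation, the match percentage and the highlighting of differing words can be sketched in a few lines. The segmentation rule (split after ".", "!" or "?"), the word-level comparison with `difflib` and the function names are assumptions made for illustration; commercial tools use their own segmentation rules and match formulas.

```python
import re
from difflib import SequenceMatcher


def split_into_segments(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def match_report(new_segment, stored_segment):
    """Return (match percentage, words of new_segment that differ from
    stored_segment), comparing the two segments word by word."""
    new_words = new_segment.split()
    stored_words = stored_segment.split()
    matcher = SequenceMatcher(None, stored_words, new_words)
    changed = []
    for tag, _, _, j1, j2 in matcher.get_opcodes():
        # Words that replace or extend the stored wording are the ones a
        # TM tool would highlight for the translator.
        if tag in ("replace", "insert"):
            changed.extend(new_words[j1:j2])
    return round(matcher.ratio() * 100), changed


THRESHOLD = 75  # percent, the typical minimum mentioned in the text

stored = "Remove the new filter carefully."
for segment in split_into_segments("Remove the old filter carefully. Then close the cover."):
    percent, changed = match_report(segment, stored)
    action = "edit stored translation" if percent >= THRESHOLD else "translate manually"
    print(f"{percent:3d}%  {action:24s} changed words: {changed}")
```

The first segment differs from the stored one in a single word and clears the threshold, so only that word needs attention; the second falls below it and is translated from scratch.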

Like a diligent student, Translation Memory remembers terms and sentences, and the translation memory is built from them. TM is a constantly growing database (or several databases, if translations cover various topics) that “remembers” all completed translations and can become a “language memory” for a product or for the company’s activities as a whole. The database is replenished with each newly translated document, so the time spent on the next similar translation decreases, and financial costs decrease accordingly.

This technology helps to significantly reduce the cost and time of translating technical documentation by reusing repeated text fragments. In addition to reducing the labor intensity of translation, TM makes it possible to maintain consistent terminology and style throughout the documentation and to reduce the cost of subsequent layout of the translated documents.

Let us list the benefits that TM technology provides to users:

Increasing translator productivity. It is easy to see that substituting even 80% of matching segments from the translation database can reduce the time spent on translation by 50-60%. As practice shows, editing a completed translation is much faster than translating it again from scratch.

Saving money as a direct consequence of saving time.

Unity of style and terminology, provided a translation database exists on the subject of the document being translated. This is especially important when translating highly specialized documentation.

Creating continuity of the work process, which guarantees the absence of disruptions in the company’s work. The funds spent on creating a translation database are not costs, but rather an investment in stable and high-quality work, which increases not only profits, but also the value of the company itself.

Of course, not everything is so rosy and Translation Memory technology is not without a number of significant drawbacks:

Large start-up costs. It is quite obvious that to work productively with a Translation Memory system you need a ready-made database of translations on the subject of the texts to be translated. Without such a database, the Translation Memory program cannot help. You must therefore either buy such a database or create it manually.

A large amount of manual work. Even with a good translation database, the number of 100% matches will be very limited, so Translation Memory cannot completely automate manual labor.

Note that there is also the possibility of integrating TM systems with machine translation systems, which gives additional benefits in working with large flows of documentation. The user can extract terminology for subsequent work with it, create his own custom dictionaries, connect additional dictionaries, and, finally, the translation of segments that do not match those already in the TM translation database will be carried out automatically.

1.2 Machine translation and its types and functions

The active implementation of modern technologies in various fields of activity allows us to reduce the time and effort required to complete any work. The field of linguistics was no exception, especially such areas as legal translation or, for example, technical translation.

Translations of technical texts are characterized by an impressive volume of documentation that must be translated in the shortest possible time. Various programs - translators and electronic dictionaries - are designed to solve this problem.

These technologies can help a professional translator who must produce a high-quality legal translation from English in a short time. An experienced translator is well aware that machine-translated text must be checked against the original and corrected. Moreover, a legal text translated with the help of a program must be checked not only by the linguist but also by a qualified lawyer. Many newcomers and young freelancers type the query “legal translation online” into a search engine and use only the sites it returns. As a result, they perform just two actions, “copy the text into an online translator” and “paste the text into a document”, i.e. they obtain a machine translation and send it to the customer. As a rule, only a highly qualified translator working in a reputable translation agency carefully reads the original, checks all information obtained through machine translation against the source documentation, and uses various electronic dictionaries and reference books to adapt all terms and wording to the legal particularities of both countries. The result is a high-quality legal translation of the text. This approach to organizing the work is called automated.

To automate their translation work, many linguists use online translators such as Promt, Google and Transneed, the online dictionaries Multitran, Lingvo and MrTranslate, and other online resources. Many companies also develop programs for offline use. Programs that perform legal translation from and into English on the basis of the general and specialized Lingvo dictionaries, as a rule, cope well with the task and correctly interpret most words, dialect expressions, and technical and legal terms. The explanation is simple: ABBYY Lingvo is constantly improving its products and collaborates with the best translators.

CAT systems stand apart, for example Transit, Trados, Wordfast, Across, MetaTexis, Star, etc. They are at a much higher level than translator programs and portals that offer legal translation online, but they are still inferior to many translation agencies. Their distinguishing feature is a special Translation Memory in which correspondences between originals and translations are accumulated. When a legal text is translated, the program compares it with the information accumulated in its database and evaluates the matches. The linguist then only needs to edit the material to obtain a stylistically and logically coherent text. CAT systems are quite popular among translators, since in the course of legal translation the linguist builds up a correspondence database independently. This speeds up later work, even though at first the translator spends quite a lot of time analyzing information, consulting lawyers and searching for the unambiguously correct interpretation of a specific legal term or formulation. In this way, a professional translator not only accumulates knowledge when translating a legal text but also distributes effort rationally, and the customer receives a high-quality translation as a result.

Admittedly, no machine and no resource offering legal translation online is capable of sensing all the subtleties and nuances of a text that an experienced translator sees. Naturally, the larger the vocabulary and terminology base of a translation program, the greater its capabilities and the better the machine translation. However, only a master of the craft, a highly qualified linguist, can sense the specifics of a text and translate it competently.

Consequently, a legal translation from or into English performed with software necessarily requires further checking for adequacy, equivalence and correctness. A highly qualified translator plays the leading role here.

If a dilemma arises between contacting a reputable translation agency and performing the translation oneself with software products or online translators, preference should be given to the first option. Online legal translation or unedited machine translation may be used as reference material only. In addition, a translation agency, unlike these tools, takes responsibility for the work performed, for the reliability of all information and for the adequacy of the resulting text.

It is known that when legal translation is carried out in the context of business cooperation, the following are very important: the accuracy of the translation, a competent and correct presentation of the text, full compliance of the text with the legal systems and traditions of both countries, and correct formatting in accordance with international norms and standards. These factors affect the mutual understanding of the parties and the speed with which agreements, contracts and other legal documents are signed. As a result, high-quality translation becomes the key to successful and productive cooperation between companies, or between a job seeker and an employer.

The modern period of the development of society is characterized by the strong influence of computer technologies, which penetrate all spheres of human activity and ensure the spread of information flows in society, forming a global information space. An integral and important part of these processes is the computerization of translation. It has been one of the important tasks from the very beginning of the use of IT in science, and the dream of creating automatic machine translators has never left scientists. Although complete transfer of the process to the machine is impossible at this stage of IT development (the human factor is still needed as the final decision-making authority), the task of developers has been to give the translator all possible assistance through IT. Introducing computer tools into a process that was originally oriented only toward humans, with their ability to select the appropriate option on the basis of experience and a sense of style, requires special attention to detail and technology. In addition to developing suitable software of different types for the related tasks, it is a priority to train specialists in the use of these programs and to create comfortable conditions for their use.

Computer technologies are meant to become not an additional “makeweight” in translation but an integral part of the whole process: the translator's “right hand”, which significantly increases efficiency, speeds up translation and makes it more technologically advanced.

At this stage, the capabilities of IT in translation are used incompletely and insufficiently.

The main reason for this situation is insufficient attention to the possibilities of IT at the educational stage. The training of translators in our universities pays no attention to the capabilities of IT: there is no separate course, and the issue is not even discussed as part of the program. The teachers themselves are not always sufficiently familiar with the issue, so their advice cannot fully satisfy the needs of students either. At the current stage, finding opportunities to use IT in translation is 90% the task of the student translator himself.

Relevance of the study: computer technologies now strongly influence society, penetrating all spheres of human activity, ensuring the spread of information flows and forming a global information space. IT support is improving in many areas, including one as important as translation.

The object is the achievements of modern information technologies in the translation process.

The subject is computer programs and Internet resources designed to help the translator in the translation process.

Purpose: to highlight the possibilities of using modern information technologies in translation at the current stage of development, and to propose ways of increasing the efficiency with which existing achievements are used.

· Study the history of the development of computer technologies in the field of translation;

· Study the available translation tools, both software and Internet resources (IR);

· Consider the Lingvo electronic dictionary and the PROMT electronic translator;

· Identify the advantages and disadvantages of modern translation systems.

· Explore options for improving the efficiency of using translation tools (TSPs).

Chapter 1 History of the development of modern information technologies in translation.

Computer translation is a complex but interesting scientific task. Its main difficulty is that natural languages are hard to formalize. Hence the low quality of text produced by MT systems, whose content and form are a constant butt of jokes. However, the idea of machine translation goes back a long way.

The idea that machine translation might be possible was first expressed by Charles Babbage, who developed the "digital analytical engine" project in 1836-1848. Babbage's idea was that a memory of 1000 50-digit decimal numbers (50 gears in each register) could be used to store dictionaries. He cited this idea as a rationale when requesting from the British government the funds needed to build the analytical engine, which he was never able to complete.

About a hundred years later, in 1947, W. Weaver (director of the natural sciences division of the Rockefeller Foundation) wrote a letter to Norbert Wiener in which he proposed using decryption techniques to translate texts. That year is considered the birth year of machine translation. In the same year, an algorithm for word-by-word translation was developed, and in 1948 R. Richens proposed a rule for dividing a word into a stem and an ending. Over the next two decades, machine translation systems developed rapidly.

In January 1954, the first machine translation system, the IBM Mark II, was demonstrated on an IBM 701 machine. In 1966, however, a specially created commission of the US National Academy of Sciences judged machine translation to be unprofitable, which significantly slowed research in this area. Machine translation experienced a new rise in the 1970s, and in the 1980s it became economically viable thanks to the comparative cheapness of machine time.

In the USSR, however, research in the field of machine translation continued. After the demonstration of the IBM Mark II system, a group of VINITI scientists began developing a machine translation system for the BESM machine. The first sample translation from English into Russian was obtained by the end of 1955.

Another line of work arose in the Department of Applied Mathematics of the Mathematical Institute of the USSR Academy of Sciences (now the M. V. Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences) on the initiative of A. A. Lyapunov. The first machine translation programs developed by this team were implemented on the Strela machine. It was through the work on MT systems that applied linguistics took shape as a field.

In the 1970s, a group of developers at VINITI RAS worked on MT systems under the leadership of Prof. G. G. Belonogov. Their first MT system was developed in 1993; in 1996, after a number of modifications, it was registered with ROSAPO under the name Retrans. This system was used by the Ministries of Defence, Railways, and Science and Technology.

Parallel research was carried out in the Laboratory of Engineering Linguistics of the A. I. Herzen Leningrad State Pedagogical Institute (now a pedagogical university). It formed the basis of PROMT, now the most popular MT system. The latest versions of this product use advanced technologies and are built on augmented transition network technology and a neural network formalism.

Chapter 2 Classification of machine translation tools (according to Larry Childs)

New members of CompuServe's Foreign Language Forum often ask whether anyone can recommend a good machine translation program at a reasonable price. The answer to this question is invariably “no.” Depending on who is answering, the answer may rest on one of two main arguments: either that machines cannot translate, or that machine translation is too expensive.

Both of these arguments are valid to a certain extent. However, the answer is far from so simple. When studying the problem of machine translation (MT), it is necessary to consider separately the various subsections of this problem. The following division is based on lectures by Larry Childs given at the 1990 International Technical Communication Conference:

Machine translation systems perform automated text translation. The units of translation are words or phrases, and recent developments make it possible to take into account the morphology of the word being translated. “Developed MT systems carry out translation using translation algorithms specified by the developer and/or user-adjusted.”

To carry out machine translation, a special program implementing the translation algorithm is loaded into the computer. The algorithm is understood as a sequence of unambiguously and strictly defined operations on the text that find translation correspondences in a given language pair L1 - L2 for a given direction of translation (from one specific language into another). The machine translation system includes “bilingual dictionaries equipped with the necessary grammatical information (morphological, syntactic and semantic) to ensure the transmission of equivalent, variant and transformational translation correspondences, as well as algorithmic means of grammatical analysis that implement any of the formal grammars adopted for automatic text processing.” There are also machine translation systems designed to translate among three or more languages, but these are currently experimental.
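The idea of a translation algorithm as a strictly defined sequence of dictionary operations can be illustrated with a minimal sketch. It assumes a toy English-Russian dictionary; the entries, the `Entry` type and the function name are hypothetical and are not taken from any real MT system.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    target: str  # translation equivalent in L2
    pos: str     # grammatical information stored with the entry


# Hypothetical toy bilingual dictionary for the pair L1 = English, L2 = Russian.
BILINGUAL = {
    "machine": Entry("машина", "noun"),
    "translation": Entry("перевод", "noun"),
    "system": Entry("система", "noun"),
}


def translate_word_by_word(text: str) -> list[str]:
    """Replace each known L1 token with its L2 equivalent, in order.

    Unknown tokens are kept and wrapped in angle brackets so that a
    human translator can see what the dictionary did not cover.
    """
    result = []
    for token in text.lower().split():
        entry = BILINGUAL.get(token)
        result.append(entry.target if entry else f"<{token}>")
    return result
```

Such a lookup ignores word order, agreement and ambiguity, which is exactly why real systems add the grammatical analysis described above.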

Currently, there are two concepts for the development of machine translation systems:

1. The model of a “large dictionary with a complex structure”, which is embedded in most modern translator programs;

2. The “meaning-text” model, first formulated by A. A. Lyapunov, which has not yet been implemented in any commercial product.

Today the most famous machine translation systems are:

PROMT 2000/XT from PROMT;

Retrans Vista from Vista and Advantis;

Socrates is a set of programs from the Arsenal company.

The systems of the PROMT family use a morphological description that is almost unique in its completeness for all the languages they handle. It contains 800 inflection types for Russian, more than 300 each for German and French, and even for English, which is not an inflectional language, more than 250 inflection types are distinguished. The set of endings for each language is stored as tree structures, which provides not only efficient storage but also an efficient algorithm for morphological analysis.
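How a tree of endings can serve both storage and analysis may be sketched as follows. This is only an illustration under stated assumptions: the internal structures of PROMT are not described in the text, and the class name, the sample English endings and their tags are hypothetical. Endings are stored in reverse, so a word is analyzed by walking from its last letter toward its beginning.

```python
class EndingTrie:
    """A trie of word endings, keyed from the last letter backwards."""

    def __init__(self):
        self.children = {}
        self.tags = []  # inflection tags of the ending that terminates at this node

    def add(self, ending, tag):
        node = self
        for ch in reversed(ending):
            node = node.children.setdefault(ch, EndingTrie())
        node.tags.append(tag)

    def analyses(self, word):
        """Return (stem, ending, tag) for every stored ending the word ends with."""
        results = [(word, "", tag) for tag in self.tags]  # empty ending, if stored
        node = self
        for i in range(len(word) - 1, -1, -1):
            node = node.children.get(word[i])
            if node is None:
                break  # no stored ending continues with this letter
            for tag in node.tags:
                results.append((word[:i], word[i:], tag))
        return results


# Hypothetical sample endings, for illustration only.
trie = EndingTrie()
trie.add("s", "plural")
trie.add("es", "plural")
trie.add("ed", "past")
```

A single pass over the final letters of the word finds every candidate split into stem and ending, so the cost depends on the length of the longest ending rather than on the number of endings stored.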

Instead of the accepted linguistic approach, which treats the analysis and synthesis of a sentence as sequential processes, the architecture of the systems represents translation as a process with an “object-oriented” organization based on the hierarchy of the sentence components being processed. This made it possible to make PROMT systems stable and open. In addition, this approach made it possible to use different formalisms to describe translation at different levels. The systems use both network grammars, similar in type to augmented transition networks, and procedural algorithms for filling and transforming frame structures for the analysis of complex predicates.

The description of a lexical unit in a dictionary entry, which is practically unlimited in size and can contain many different features, is closely tied to the structure of the system's algorithms and is organized not around the eternal antithesis of syntax and semantics but around the levels of text components.

At the same time, the systems can work with incompletely described dictionary entries, which is an important point when dictionaries are opened to users, who cannot be expected to handle linguistic material with care.

The system distinguishes the level of lexical units, the level of groups, the level of simple sentences and the level of complex sentences. All these processes are connected and interact hierarchically in accordance with the hierarchy of text units, exchanging synthesized and inherited features. This arrangement of algorithms allows the use of different formal methods to describe algorithms at different levels.

An electronic dictionary is, as a rule, a computer database containing specially encoded dictionary entries that allow quick searching for the required words and phrases. The search takes into account morphology and word combinations (examples of use), and the direction of translation can be changed (for example, English-Russian or Russian-English).

The main difference between an electronic dictionary and a machine translation system is that the dictionary gives the translator the entire range of meanings recorded in its database for the word or phrase searched for, leaving the choice of the most suitable option to the person, while the machine translation system itself selects an option from the database using its built-in algorithms.
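The contrast can be shown with a small sketch that uses one database in two ways. Everything here is an assumption made for illustration: the entries, the domain labels and the selection rule (match the domain of the text, otherwise take the first sense) do not describe any particular product.

```python
# Hypothetical database: each headword maps to (translation, domain label) pairs.
DATABASE = {
    "bank": [("банк", "finance"), ("берег", "geography")],
    "key": [("ключ", "general"), ("клавиша", "computing")],
}


def dictionary_lookup(word):
    """Electronic dictionary: return every recorded sense; the person chooses."""
    return [translation for translation, _ in DATABASE.get(word, [])]


def mt_select(word, domain):
    """MT system: pick one sense automatically using a built-in rule."""
    senses = DATABASE.get(word, [])
    if not senses:
        return None
    for translation, label in senses:
        if label == domain:
            return translation
    return senses[0][0]  # fall back to the first recorded sense
```

For the word "key" in a computing text, the dictionary function returns both senses, while the selection function returns only "клавиша".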

In Esperanto, “Lingvo” means “language”; the ABBYY Lingvo dictionaries (LingvoUniversal and LingvoComputer) contain entries to that effect.

ABBYY Lingvo does not have a full-text translation function, but word-by-word translation of texts from the clipboard is possible. In some English, German and French dictionaries, most words have audio recorded by professional native speakers.

The program includes the Lingvo Tutor learning module, which helps you memorize new words.

In addition to the 150 professional dictionaries, which draw on lexicographic work by ABBYY employees and on authoritative paper and electronic dictionaries, there is an extensive database of free user dictionaries for the program. These dictionaries are pre-checked and publicly available on the website of the Lingvo Association of Lexicographers.

Varieties of ABBYY Lingvo x3:

· ABBYY Lingvo x3 European version - 130 general lexical and thematic dictionaries for translation from Russian into English, Spanish, Italian, German, Portuguese and French and vice versa.

· ABBYY Lingvo x3 Multilingual version - 150 general lexical and thematic dictionaries for translation from Russian into English, Spanish, Italian, Chinese, Latin, German, Portuguese, Turkish, Ukrainian and French and vice versa.

· Mobile multilingual dictionary ABBYY Lingvo x3 - a dictionary for smartphones, communicators and PDAs, containing 38 modern complete dictionaries for 8 languages.

· ABBYY Lingvo x3 English version - 57 general lexical and thematic English-Russian and Russian-English dictionaries.

· All versions contain explanatory dictionaries of English (Oxford and Collins) and the Big Explanatory Dictionary of the Russian Language by T. F. Efremova.

In addition to the software tools described above, there are also special Internet resources (IR) that allow you to search for translations online, without downloading and installing any software.

IR can also be divided into two types: dictionaries and similar online databases and machine translators.

The most famous online dictionary can rightfully be recognized as the Internet version of ABBYY Lingvo. In addition to the already familiar word-by-word translation and provision of dictionary entries, the site offers a wide range of additional features:

· FineReader Online, a convenient online OCR service that recognizes images, PDF files or photographs of documents and converts them into the required formats: Microsoft Word, Excel, TXT, RTF or searchable PDF;

· Written translation, a service developed by representatives of ABBYY Lingvo that allows the customer to optimize costs. The type of translation and its cost are determined by the purpose of the document, the subject area, the volume and the timing of the project;

· Individual training by phone (or online via Skype);

· An online version of the ABBYY Aligner program for aligning parallel texts and creating Translation Memory databases;

· The “Telephone Translation” service, a teleconference in which a remote translator takes part in addition to you and your interlocutor.
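The alignment of parallel texts mentioned in the list above can be pictured with a deliberately naive sketch. It is not the method used by ABBYY Aligner, which the text does not describe; it assumes that the source and the translation contain the same number of sentences in the same order, and the function names are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part.strip() for part in parts if part.strip()]


def align(source_text, target_text):
    """Pair sentences one-to-one by position to form Translation Memory entries."""
    source = split_sentences(source_text)
    target = split_sentences(target_text)
    if len(source) != len(target):
        # Merged or split sentences need a smarter method or a human editor.
        raise ValueError("sentence counts differ; manual alignment needed")
    return list(zip(source, target))
```

Each resulting pair of a source sentence and its translation is the kind of record that a Translation Memory database stores; real aligners must also cope with sentences that the translator merged, split or reordered.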

A resource such as Urban Dictionary is also worth noting. This online database was created to introduce users to the constantly changing and rapidly growing sphere of English slang, figurative phrases and colloquial expressions.

As for online translators, it is enough to note that most MT programs, including PROMT, have Internet versions. They offer the same set of features as their desktop counterparts.


Currently, computers occupy an increasingly significant place not only among programmers and engineers, but also among a wide variety of users, including linguists, translators and specialists who need prompt translation of foreign language information. In this regard, electronic dictionaries and programs that perform machine translation are very convenient tools for saving time and optimizing the process of understanding foreign language information. In addition, there are now translator programs that can produce more or less adequate translations of foreign language texts and can be of assistance in the work of specialists in various fields.

The topic of this research can be considered quite modern, since personal computers (especially ones powerful enough to run more or less modern machine translation programs) have been developed and used in daily life for hardly more than fifteen years. The topic becomes particularly relevant given that the Republic of Belarus is increasingly integrating into the international community and that, along with economic and political barriers, language barriers largely hinder this. At the same time, there are not many professional translators able and willing to support such communication between communities in all spheres of science and culture, because training a professional translator currently takes a long time and is very labor-intensive. It is therefore especially relevant now to look for ways to automate human translation as far as possible, in order both to make the translator's hard work as easy as possible and to make it as efficient as possible. This can be accomplished only by integrating the efforts of specialists in cybernetics, programming, psychology and, most importantly, linguistics.

This work examined the modern market of translation tools (TSPs) available to translators.

Various types of IT-assisted translation have been studied and described:

Fully automatic translation;

Automated machine translation with human participation;

Translation carried out by a person using a computer.

Various types of TSPs were reviewed, described and analyzed:

Electronic dictionaries;

Machine translation systems;

Online resources for translation.

Specific products currently available were reviewed, and their capabilities, advantages and disadvantages were analyzed.

At this stage of IT development, we can draw the following conclusion: the most promising area for using TSP is fully automated translation. Software development in this area occupies the minds of leading scientists and is one of the priority areas of research in the field of computational linguistics.

At present, TSP are most widely used as auxiliary tools in the translation process. In this area, modern developments offer the widest opportunities for searching for and interpreting words and expressions. There are databases not only of individual words but also of set expressions, jargon, slang and so on.

The main task in improving the translation process can now be considered the introduction of TSP at all levels, from the initial process of translator training at a university to the popularization of TSP in the media. Currently, the available capabilities of the TSP are not used to their full extent.

