Project

TermWise: Creating resources for specialised language use

The TermWise Knowledge Platform is a multidisciplinary co-operation between (applied) linguists and computer scientists, financed by the K.U.Leuven Association's Industrial Research Fund, with the explicit objective of developing user-oriented applications based on previous fundamental research. More specifically, the platform aims to develop software that will help language professionals, such as translators or copywriters, deal more effectively with specialised texts. These texts, e.g. legal or medical documents, are full of domain-specific jargon and terminology. In-depth knowledge of the typical words and expressions in a given discipline is essential for creating high-quality translations and texts. Unfortunately, the resources currently available to language professionals are still quite limited in several respects.

Therefore, the platform will develop computational knowledge acquisition algorithms that can create rich terminological databases in a time- and cost-effective way. The algorithms will be tested and validated in the domain of Belgian legal terminology in French and Dutch, in co-operation with the Belgian Federal Justice Department. However, the algorithms will be explicitly designed to be generic and portable to other languages and domains.

The platform will focus on three core aspects of terminological knowledge acquisition:

· Term extraction: The identification of words and expressions that are typical of a specialised domain, in this case the legal domain. The platform aims to offer better coverage of terminological units through an advanced term model that takes into account possible variation between legal subdomains, and through the integration of novel methods from the field of statistical corpus analysis.
· Term alignment: The retrieval of translational equivalents for terms across languages, in this case French and Dutch. The platform will optimise statistical alignment algorithms for parallel corpora and comparable corpora.
· Semantic modelling: The analysis of the meaning of terms and their typical usage contexts. The platform will apply Semantic Vector Space models to the large-scale analysis of meaning-context relationships.

Ultimately, the platform aims to make the acquired knowledge accessible to users in the form of a software tool that offers comprehensive multilingual terminological support to language professionals.
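
To make the term extraction step more concrete, the following is a minimal sketch of one common technique from statistical corpus analysis: scoring words by how much more frequent they are in a domain corpus than in a general reference corpus, using Dunning's log-likelihood ratio. This is an illustration only and not necessarily the method used by the platform; the function names (`log_likelihood`, `extract_terms`) and the toy English corpora are assumptions made for the example.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood ratio (G2) for a word that occurs a times in a domain
    corpus of c tokens and b times in a reference corpus of d tokens."""
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def extract_terms(domain_tokens, reference_tokens, top_n=5):
    """Return the top_n candidate terms as (word, score) pairs."""
    dom = Counter(domain_tokens)
    ref = Counter(reference_tokens)
    n_dom, n_ref = len(domain_tokens), len(reference_tokens)
    scored = []
    for word, a in dom.items():
        b = ref.get(word, 0)
        # Keep only words that are relatively more frequent in the domain corpus.
        if a / n_dom > b / n_ref:
            scored.append((word, log_likelihood(a, b, n_dom, n_ref)))
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical toy corpora, already tokenised.
domain = "the court shall hear the appeal and the court shall rule on the appeal".split()
reference = "the dog and the cat sat on the mat and the bird sang".split()

for word, score in extract_terms(domain, reference, top_n=3):
    print(f"{word}\t{score:.2f}")
```

In this sketch, function words such as "the" are filtered out because they are no more frequent in the domain corpus than in the reference corpus, while "appeal", "court" and "shall" rank highest. A realistic system would work on multi-word units and much larger corpora.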

Date: 1 Oct 2009 → 30 Sep 2013

Keywords: Specialised language use

Disciplines: Linguistics, Theory and methodology of linguistics, Other languages and literary studies
