MAT Research at ParaLexica

Using the power of computers to analyse natural language
Building language-independent, machine-assisted translation systems
Home of the ParaTExt glossing technologies



  • More than 7000 living languages, about 6500 still without a Bible, hundreds of translators at work today.
  • 1.5 billion people without Scripture in their mother tongue.


  • Learning Machines which can learn about a language for themselves.
  • Language-Independent Systems which are able to work with any language, without dictionaries or grammars.
  • Systems that don't need huge databases of linguistic information. Just the text.


  • Language is built around patterns.
    • Evaluate the results.
    • Check for consistency.
    • Find the gaps. Sometimes misses tell you more than hits.
  • Play to your strengths
    Computers: not bright but methodical
    Humans: quite bright, but can make mistakes...


Glossing Technologies

We continue to develop the ParaTExt glossing engine which already provides key term analysis and automatic interlinear back-translation for translators and consultants. Extending the effectiveness of the glosser for languages which are more complex morphologically is a high priority.

Complex Pattern Matchers

Working to improve the glosser has led us to focus on the problems that complex morphologies pose for machines, specifically non-concatenative or discontinuous morphologies. In such languages, computers are often unable to identify tokens in the text stream accurately.
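
As a rough illustration of why discontinuous patterns are hard for contiguous matching, here is a minimal Python sketch. It is not the glosser's method, only an assumption-laden example: the function name `contains_root`, the consonantal root `ktb`, and the short list of transliterated word forms (including the made-up non-match `bakat`) are all chosen for illustration. The check asks whether the letters of the root occur in the word in order, though not necessarily next to each other.

```python
def contains_root(word: str, root: str) -> bool:
    """Return True if the letters of `root` appear in `word` in order,
    allowing other letters in between (a subsequence test)."""
    remaining = iter(word)
    # `ch in remaining` advances the iterator until it finds `ch`,
    # so each root letter must be found after the previous one.
    return all(ch in remaining for ch in root)


# Hypothetical sample forms; "bakat" is a made-up word with the same
# letters in the wrong order, included as a negative case.
words = ["kataba", "kitab", "maktub", "katib", "bakat"]
matches = [w for w in words if contains_root(w, "ktb")]

# A plain substring search finds nothing, because the root is never contiguous.
substring_matches = [w for w in words if "ktb" in w]

print(matches)
print(substring_matches)
```

The substring search returns no hits at all, while the subsequence test recovers the related forms. A real system would also need to cope with affixes, vowel patterns, and false positives, which this sketch ignores.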

Sparse Data Analysis

We tend to think of semantics, morphology and syntax as discrete systems within language, but in reality all the components of language are closely related. When there is very little data to work with, most MT systems fail. By exploiting the relatedness of these components, we are learning to make the most of the opportunities that exist in very small data sets.


Current Projects

Find out more about our work:

Project Paddington

Learning from Sparse Data

Project CogNomen

A language-independent proper-name finder

Project Percival

Complex and discontinuous pattern recognition

Project PToleMy

An HMM implementation for mapping phoneme shifts

Project Augustus

Powering the ParaTExt glossing technologies



United Bible Societies Glossing Technologies Project

Wycliffe Bible Translators are at work all over the world

Co-developers of the ParaTExt translation editor

In partnership in the Institute of Computer Assisted Publishing

Working together to bring God's word to the world

ParaLexica - supporting global Bible translation


Brian Renes - UBS Americas

When working with these technologies, translators will use words like "incredible" and "unbelievable". How can the computer understand my language?

When something is made easier and better at the same time, it is truly a breakthrough. Your systems save time, make the job easier and in the end there is an assurance of higher quality.

Andy Warren-Rothlin - UBS Africa

Your systems have unquestionably contributed more to speeding up my work than any special budgets or new projects ever could!

Simon Crisp - UBS Scholarly Editions

The systems developed by the MAT team at ParaLexica can be said without exaggeration to have revolutionised the checking of translation drafts.

It is hard to imagine today's global Bible translation effort without these tools, which are widely appreciated and used by all agencies.


To get in touch with us, send a message using the form on this page or make contact via Skype.

Sign up for our regular e-newsletters

Read more at The Blog

Donate: Support the Team