Project: Creating Regular Expressions to examine potential translation errors

Team Members: Chaoyue Jiang, Peng Xu, Yuhan Song

Regular expressions (regex) are a useful tool in Trados that helps translators catch potential errors introduced during translation. My language pair is English to Chinese. Chaoyue Jiang, Peng Xu, and I created three regular expressions that can be used in the QA check:

  1. Zipcode
  2. Phone Number
  3. Date

I. Zipcode

US : (\D|^)[0-9]\d{4}$

ZH : (\D|^)[0-9]\d{5}$

In the United States, zip codes are 5 digits; in China, zip codes are 6 digits. Therefore, we set up a regex that shows a warning sign when the zip code in the target text does not match the one in the source text.

In this example, the Chinese translation turned the 5-digit zip code into a 6-digit one. This kind of error can be hard to spot when browsing through the target text. With this regex applied, Trados shows a warning sign after you confirm the translation.
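The two patterns can also be tried outside Trados. The following is a minimal Python sketch, not the Trados QA check itself: the function name `zip_length_changed` and the sample segments are made up for illustration, and `re.search` stands in for the way Trados applies the patterns to a segment.

```python
import re

# The two patterns from this section, copied unchanged.
US_ZIP = re.compile(r"(\D|^)[0-9]\d{4}$")  # segment ends in exactly 5 digits
ZH_ZIP = re.compile(r"(\D|^)[0-9]\d{5}$")  # segment ends in exactly 6 digits


def zip_length_changed(source: str, target: str) -> bool:
    """Return True when the source ends in a 5-digit code
    but the target ends in a 6-digit code."""
    return bool(US_ZIP.search(source)) and bool(ZH_ZIP.search(target))


# Hypothetical segments: an extra digit slipped into the target.
print(zip_length_changed("Monterey, CA 93940", "加利福尼亚州蒙特雷 939400"))  # True
print(zip_length_changed("Monterey, CA 93940", "加利福尼亚州蒙特雷 93940"))   # False
```

Because both patterns end in `$`, they only catch a zip code at the very end of a segment; a code in the middle of a sentence would need a different anchor.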

II. Phone Number

US: \(|\d{3,4}|\)|\s|\.|\-

ZH: ((\+86)?|\(\+86\))[ -]?(1\d{10}|((\d{3,4})?)[ -]?\d{8}?)

Phone numbers can be written in many ways, and China and the US use different formats. When we translate phone numbers, we should keep the original format from the source text. Therefore, our group created a regex to identify a target format that differs from the source text, and vice versa.

In this example, the phone number in the target text does not match the format in the source text, which triggers a warning sign in Trados.
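To see which strings the ZH pattern accepts, it can be run in Python. This is a rough sketch under our own assumptions: the helper names `is_chinese_format` and `format_was_localized` and the sample numbers are hypothetical, and `fullmatch` is used so that the whole string has to fit the pattern, which may be stricter than the way Trados searches a segment.

```python
import re

# The ZH pattern from this section, copied unchanged.
ZH_PHONE = re.compile(r"((\+86)?|\(\+86\))[ -]?(1\d{10}|((\d{3,4})?)[ -]?\d{8}?)")


def is_chinese_format(number: str) -> bool:
    """Return True when the whole string fits the ZH phone pattern."""
    return ZH_PHONE.fullmatch(number) is not None


def format_was_localized(source_number: str, target_number: str) -> bool:
    """Return True when the target looks like a Chinese-style number
    although the source did not, i.e. the format was not kept."""
    return not is_chinese_format(source_number) and is_chinese_format(target_number)


print(is_chinese_format("+86 13812345678"))   # True: 11-digit mobile number
print(is_chinese_format("010-12345678"))      # True: area code plus 8 digits
print(is_chinese_format("(831) 647-4123"))    # False: US-style formatting
print(format_was_localized("1 (831) 647-4123", "18316474123"))  # True
```

The last call shows the kind of case the check is meant to catch: once the US formatting is stripped, the digits happen to look like a Chinese mobile number.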

III. Date

Find (0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])[- \/.]((?:19|20)\d\d)

Replace $3/$1/$2

In the United States, dates are written in month, day, year order, while in Chinese, dates are written in year, month, day order. Our group set up a regex to check whether dates have been localized. In addition, the pattern does not match invalid dates such as 13/03/2022 and 12/201/2022. We use Find & Replace to convert each match to the date format used in China.

In this example, the date is automatically reordered into year-month-day. However, Trados still shows a warning sign because the numbers in the target no longer match those in the source.
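The Find and Replace pair above can be reproduced in Python as a quick check. This is a sketch rather than the Trados behavior itself: the function name `localize_dates` and the sample strings are made up for illustration, and Python writes the replacement as `\3/\1/\2` where Trados uses `$3/$1/$2`.

```python
import re

# The Find pattern from this section, copied unchanged:
# group 1 = month, group 2 = day, group 3 = year.
US_DATE = re.compile(
    r"(0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])[- \/.]((?:19|20)\d\d)"
)


def localize_dates(text: str) -> str:
    """Reorder month/day/year dates into year/month/day."""
    return US_DATE.sub(r"\3/\1/\2", text)


print(localize_dates("Due 03/13/2022"))  # Due 2022/03/13
print(localize_dates("13/03/2022"))      # unchanged: 13 is not a valid month
print(localize_dates("12/201/2022"))     # unchanged: 201 is not a valid day
```

The pattern checks each field's range on its own, so a date such as 02/30/2022 would still be matched and reordered.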

Overall, creating regular expressions is an interesting process. When deciding what to create, you need to know the differences between the two languages in your pair. You also need to watch for edge cases that the pattern must handle, such as an invalid date.
