The development of the world's first intelligent Chinese-Portuguese-English translation platform

15 Apr 2020

FST Associate Professor Wong Fai shares his research efforts and experience

Interface of the UM-Computer Aided Translation (UM-CAT) system, the online Chinese-Portuguese-English computer-aided translation platform

FST Associate Professor Wong Fai and his research team

FST Associate Professor Wong Fai (right) and his research assistant Ao Chi Hong

In the current era of globalization, the language gap is one of the main challenges for cross-cultural and cross-language communication, and machine translation plays a crucial role in bridging it. According to the China Language Service Industry Development Report, the output value of global language services has maintained an upward trajectory, reaching 46.52 billion US dollars in 2018 and projected to approach 56 billion US dollars in 2021, while China has more than 360,000 enterprises offering language services. In response to this growing demand for machine translation, Wong Fai, Associate Professor in the Faculty of Science and Technology (FST), and his research team have studied Chinese-Portuguese-English machine translation for more than 20 years. Prof. Wong has been developing and optimizing the translation platform ever since UM launched the world's first talking Chinese-Portuguese electronic dictionary (PCT) in 1999. To improve translation quality, he added more local terminology and terms commonly used in Macao in order to achieve a higher degree of accuracy for local users. As a result, UM's research on Chinese-Portuguese-English machine translation has remained at the forefront of the field, leading the development of the local translation industry and promoting exchange and communication between Macao and Portuguese-speaking countries.

Accurate translation fulfils local translation needs

A major breakthrough has been achieved in translation machine research since UM established the Natural Language Processing and Chinese-Portuguese Machine Translation Laboratory (NLP2CT) in 2009. With the rise of artificial intelligence technology, NLP2CT has kept pace with the times and launched its latest "Online Chinese-Portuguese-English Computer Aided Translation platform" (UM-CAT). In addition to translating content among the three languages, UM-CAT achieves a higher degree of accuracy because it has knowledge of local language and culture. It can translate Macao's street names, government departments, legal terminology and commonly used terms. For instance, if you type the Chinese name of Avenida de Almeida Ribeiro into Google Translate, the result may be unsatisfactory, because the name is rendered literally as "New Road" rather than as the street name. UM-CAT, by contrast, translates it correctly right away. UM-CAT also offers more professional functions, including translation memory, terminology management and intelligent assistant prompts. One key function is the creation of dedicated terminology databases, which allow users to review phrases translated in the past. For business users, there is also a project management function that enables a flexible division of labour across projects, including translation, reviewing and typesetting, and allows them to track translation and review progress effectively. Since its launch in December 2018, UM-CAT has been widely used by government departments, newspaper offices, law firms, and other enterprises and institutions, and its accurate translation has helped complete many important Chinese-Portuguese translation projects. To further advance this technology, Prof. Wong and his team are currently working on an intelligent conference translation system, a tool that converts speech in real time into text in four languages: Chinese, Cantonese, English and Portuguese. Machine translation technology can be applied in many different scenarios, and the system is expected to be available to the general public in the near future.
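
To illustrate how terminology management and translation memory can work in principle, here is a minimal Python sketch. It is not UM-CAT's implementation: the names GLOSSARY, MEMORY, find_terms and best_memory_match, the sample entries and the 0.7 similarity threshold are all assumptions made for illustration, and the fuzzy matching simply uses difflib from the Python standard library.

```python
from difflib import SequenceMatcher

# Hypothetical terminology database: source term -> approved translation.
GLOSSARY = {
    "新馬路": "Avenida de Almeida Ribeiro",
}

# Hypothetical translation memory: previously translated segments.
MEMORY = {
    "新馬路在哪裡?": "Where is Avenida de Almeida Ribeiro?",
}


def find_terms(text, glossary=GLOSSARY):
    """Return (term, translation) pairs whose term occurs in text, longest term first."""
    hits = [(term, target) for term, target in glossary.items() if term in text]
    return sorted(hits, key=lambda pair: len(pair[0]), reverse=True)


def best_memory_match(text, memory=MEMORY, threshold=0.7):
    """Return (source, translation, score) for the most similar past segment, or None."""
    best = None
    for source, translation in memory.items():
        # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
        score = SequenceMatcher(None, text, source).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


if __name__ == "__main__":
    print(find_terms("請問新馬路怎麼走?"))
    print(best_memory_match("新馬路在哪裡?"))
```

In a sketch like this, the glossary hit would be shown to the translator as the approved rendering of the street name, while the memory match would surface an earlier translation of a similar sentence for reuse or revision.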

Research career and achievements over the past 20 years

Looking back on Prof. Wong's career, he graduated from UM with a bachelor's degree in Software Engineering in 1995, went on to study for a master's degree in Software Engineering at UM, and then pursued a PhD at Tsinghua University. After that he returned to UM, starting out as a research assistant. Asked about his main motive for studying machine translation, Prof. Wong smiled and said, "I didn't think much about it. I was just interested in it. I was glad to take part in the PCT research at the beginning, so I just wanted to continue the research." Prof. Wong said that there was no machine translation research in Macao before the handover in 1999. He described the research field at that time as a "desert" where everything had to start from scratch. Fortunately, Macao had the advantage of its historical background and many Portuguese-speaking talents. Looking back over the past 20 years, the greatest achievement has been the successful launch of a number of translation systems, including PCT, a Chinese/Portuguese translation system; Um2T, an online interactive Chinese/Portuguese machine translation system; and the smart translation platform UM-CAT. In 2017, an English-Chinese machine translation system developed by the research team received the top three prizes as well as the fifth prize in the constrained English-to-Chinese machine translation campaign organized by the 13th China Workshop on Machine Translation. Nothing comes easy, and the same is true of research: success always comes from steady progress.

A Breeding Ground for Tech Talent

Beyond achieving multiple advances in machine translation, it is even more important to cultivate scientific and technological talent for society. Prof. Wong explained that machine translation is an interdisciplinary field of research, involving disciplines such as Computer Science, Mathematics and Linguistics. He said, "I ask students who want to advance in this field whether they really enjoy studying words. It will be difficult for them if they are passionate about Computer Science but have no interest in words, because machine translation and words are always inseparable. You can be motivated only by something you're really interested in." Over the past 20 years, Prof. Wong has nurtured many students. One UM alumnus, Ao Chi Hong, is a core member of the team behind UM-CAT. Ao is responsible for the system setup and management of UM-CAT as well as for ensuring a good user experience. He has worked as Prof. Wong's research assistant since completing his bachelor's and master's degrees in Computer Science.

UM alumnus Zeng Xiaodong, who graduated with a master's degree in E-Commerce Technology in 2012, was involved in interdisciplinary projects at NLP2CT while still studying at UM. In 2018, Zeng was named to the 35 Innovators Under 35 China list released by MIT Technology Review. Many students have achieved great career advancement after graduation: some work at well-known companies such as Tencent and Alibaba, while others have established their own businesses, bringing machine translation technology to different places.

Future Prospects

Today, UM's translation systems have vigorously promoted Chinese-Portuguese-English translation work, supporting economic and cultural exchange between Macao and Portuguese-speaking countries. However, with the rise of innovative technologies such as artificial intelligence and big data, Prof. Wong said that there is still a lot of room for the development of multi-language machine translation. Demand is high and the potential applications are varied, including movie subtitle generation, translation apps for personal travel, translation for teaching, meetings and conferences, speech-to-text tools for the hearing impaired, and even tools that convert text or speech into sign language for the deaf. In the future, Prof. Wong will continue to innovate and optimize machine translation, apply the technology in more industries, and contribute more to the development of science and technology and of local society.

  1. Online Chinese-Portuguese translation platform (UM-NMT)
    Click here to access: http://nlp2ct.cis.um.edu.mo/NMT/
  2. Online English-Chinese-Portuguese translation platform (UM-CAT)
    UM-CAT offers a variety of intelligent functions, including translation memory, terminology management and the creation of dedicated terminology databases that allow users to review phrases translated in the past. If you have English/Portuguese translation projects that require more professional software, please contact nlp2ct@um.edu.mo or Alice Lam at alicewmlam to create an account.

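In the era of globalization, cross-language communication faces the enormous challenge of the language gap. Machine translation is a key technology for breaking down that gap and plays a pivotal role in globalization. According to the China Language Service Industry Development Report, the output value of global language services has continued to grow: total output value was 46.52 billion US dollars in 2018 and will approach 56 billion US dollars in 2021, while China has more than 360,000 enterprises that offer language services. Facing this huge demand for machine translation, Wong Fai, Associate Professor in UM's Faculty of Science and Technology, and his team have for many years devoted themselves to research on Chinese-Portuguese-English translation machines. Since UM released the world's first talking Chinese-Portuguese bilingual dictionary, 中葡通, in 1999, Prof. Wong has continuously developed and optimized the translation platform, including region-specific translation and the addition of more local words and specialized terminology, in the hope of giving users more professional and more precise translation. As a result, UM's research on Chinese-Portuguese-English machine translation has stayed at the forefront of the world, leading the development of the local translation industry and promoting exchange between Macao and Portuguese-speaking countries.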

Precise translation that supports local translation needs

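After UM established the Natural Language Processing and Chinese-Portuguese Machine Translation Laboratory (NLP2CT) in 2009, research on translation machines achieved a major breakthrough. With the rise of artificial intelligence technology, NLP2CT has kept pace with the times. Its newly launched "Online Chinese-Portuguese-English Computer Aided Translation platform" (UM-Computer Aided Translation, UM-CAT) not only translates among Chinese, English and Portuguese but also has a strong grasp of local language and culture. UM-CAT adapts its translation to the region and can accurately translate street names, department names, legal language, commonly used vocabulary and more, and therefore achieves higher accuracy. For example, if the local term 新馬路 is entered into general translation software such as Google Translate, the result is the literal "new road" rather than the street name we are familiar with. With UM-CAT, it is translated accurately right away. In addition, UM-CAT incorporates a variety of intelligent functions, including translation memory, terminology management and intelligent assistant prompts. Among them, dedicated terminology databases give translators previously translated words and sentences for reference. For the convenience of enterprises, it also has a project management function: a translation task can be assigned to different translators, who are given different permissions according to their type of work, including translation, reviewing and typesetting, and the translation workflow and review progress can be tracked. Since its launch in December 2018, UM-CAT has been widely used by government departments, newspaper offices, law firms and other enterprises and institutions, and its precise translation has already helped handle many Chinese-English-Portuguese translation projects. Besides UM-CAT, Prof. Wong and his team have in recent years been actively researching an "intelligent conference translation system", a tool that converts speech into text. It can simultaneously generate text in four languages, namely Chinese, Cantonese, English and Portuguese, to support simultaneous-interpretation-style interaction in a variety of scenarios, and it is expected to be released for public use before long.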

Research career and achievements over the past twenty years

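Looking back on Prof. Wong's path: after graduating from the University of Macau with a bachelor's degree in Software Engineering in 1995, he continued on to a master's degree, later went to Tsinghua University for further study, and after completing his PhD returned to the University of Macau as a research assistant. Over a twenty-year research career he has gone from research assistant to associate professor. Asked why he wanted to work on translation machines in the first place, Prof. Wong smiled and said, "At the time I didn't think much about it. It was purely out of interest. By chance I was fortunate to take part in the 中葡通 research, and I simply wanted to keep going." Prof. Wong said that because there was no research in machine translation in Macao before the handover, he jokingly describes that period as a "desert" where everything had to start from zero. Fortunately Macao had real advantages, thanks to its historical background and its many Portuguese-speaking talents. Looking back on the past twenty years of research, the greatest reward has been the successful launch of many translation achievements, including 中葡通, 葡譯通, the Um2T online Chinese-Portuguese neural machine translation system, UM-CAT and other innovative technologies. Among them, the English-Chinese machine translation system swept first, second and third places in the English-Chinese machine translation evaluation campaign organized by the China Workshop on Machine Translation in 2017. One step at a time, nothing has come easily.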

Actively nurturing talent

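Beyond launching many achievements, what matters more is having nurtured a generation of technology talent for society. Prof. Wong explained that machine translation is an interdisciplinary field of research involving computer science, mathematics, linguistics and other disciplines. For students who aspire to work in machine translation, Prof. Wong, who has walked the same path, says, "I always ask students first whether they are really interested in words. Even if they are interested in computing, it will be very hard without an interest in words, because translation can never be separated from words. Only with interest comes the motivation to do research." Over the past twenty years, Prof. Wong has nurtured students who aspire to develop in machine translation. Among them, Ao Chi Hong is Prof. Wong's right-hand man and also the "chief steward" of UM-CAT; he joined the laboratory as a research assistant right after completing his bachelor's and master's degrees in Computer Science at UM. Another, Zeng Xiaodong, who graduated with a master's degree in E-Commerce Technology in 2012, also took an active part across departments in NLP2CT project research and development during his studies, learning artificial intelligence technology. In 2018 he was named to "35 Innovators Under 35 China", the list selected by MIT Technology Review and regarded as the world's most authoritative ranking of young technology innovators. Prof. Wong said that many students have developed well after graduation, with several joining well-known companies such as Tencent and Alibaba. Some have even started their own businesses, bringing machine translation technology to different places.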

Future outlook

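Today, the machine translation systems launched by UM have strongly promoted Chinese-Portuguese-English translation work, supporting Sino-Portuguese economic, trade and cultural exchange while also helping Macao reach out internationally. Yet with the rise of innovative technologies such as artificial intelligence and big data, Prof. Wong said that multi-language machine translation still has great room for development, because the potential market is enormous and there are many application scenarios across all industries. These include automatically generated subtitles for everyday video viewing, translation mobile apps for personal travel, teaching translation, conference translation, speech-to-text tools for the hard of hearing, and even tools that convert text or speech into sign language for deaf and non-speaking people. In the future, Prof. Wong hopes to keep innovating and optimizing machine translation and to apply the technology in more industries, contributing more to technological development and the progress of human society.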

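  1. Online Chinese-Portuguese translation platform (UM-NMT)
    Click this link: http://nlp2ct.cis.um.edu.mo/NMT/
  2. Online Chinese-Portuguese-English computer-aided translation platform (UM-CAT)
    Compared with general translation platforms, UM-CAT offers a variety of intelligent functions, including intelligent translation memory, terminology management and the creation of terminology databases, which help users look up phrases translated in the past. If you have English/Portuguese documents that need to be translated with more professional software, please email nlp2ct@um.edu.mo or contact Alice Lam (alicewmlam) to create an account.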