

A COMPUTATIONAL GRAMMAR OF SINHALA FOR ENGLISH-SINHALA MACHINE TRANSLATION
By Budditha Hettige (08/8021)
Supervised by Prof. Asoka S. Karunananda

Overview
1. Introduction
2. Aim and Objectives
3. Approaches to Machine Translation
4. Overview of English and Sinhala Languages
5. A Novel Approach to Machine Translation
6. Design and Implementation
7. Evaluation
8. Conclusion & Further works
9. Demonstration

Introduction
• According to the latest estimates, Sri Lanka has a population of about 20 million
• The majority of Sri Lankans (74%) use Sinhala as their spoken and written language
• Sri Lankans also use English as a second language, and many sources are in English
• However, only about 10% of the population can read and write English well
• 18 million cannot read and write English perfectly
• This leads to what is called the “language barrier”

Language Barrier
• A barrier to communication among communities using different languages
• It affects the acquisition of world knowledge
• It is also a barrier to the discovery and dissemination of knowledge

Solutions for Language Barrier
• Having the entire population learn English, but this is not practical, and the power of the mother tongue in the context of knowledge cannot be neglected
• Usage of translators
  – Human Translation
  – Machine Translation
• Machine Translation has been a potential solution
• Machine Translation is more cost-effective and faster than human translation
• Many Asian and European countries use machine translation systems to overcome their language barriers

Machine Translation
• Machine Translation is computer software that translates text or voice from one natural language to another, with or without human assistance [Wikipedia]
• This is an inherently difficult task due to the diversity of natural languages
• As such, many machine translation approaches appear to be rather ad hoc
• Thus, developing theory-based approaches to machine translation turns out to be a research challenge

Problem
The lack of a theory-based approach to machine translation has been one of the major obstacles to the development of efficient computer-based solutions for natural language translation

Aim
Design and develop an agent-based English to Sinhala machine translation system with a theoretical basis

Objectives
• Objective 1: Critically review the existing systems for machine translation.
• Objective 2: Study the concepts/techniques for Natural Language Processing.
• Objective 3: Study the concepts/techniques and adapt existing Morphological Analyzers and Parsers for the English language
• Objective 4: Design and develop a Morphological Analyzer/Generator and Parser/Composer for the Sinhala language

Objectives (cont.)
• Objective 5: Design and develop lexical databases for English to Sinhala Machine Translation
• Objective 6: Design and develop an English to Sinhala Machine Translation system by integrating the above sub-systems to form a multi-agent system
• Objective 7: Evaluate the system

Hypothesis
Concepts of “Varanageema” (Conjugation) in the Sinhala language can be used to drive English to Sinhala machine translation

Approaches to Machine Translation
• Human-assisted
• Rule-based
• Statistical
• Example-based
• Knowledge-based
• Hybrid
• Agent-based

Human-assisted Approach
• Uses human interaction for the pre-editing, post-editing and/or intermediate editing stages
• Examples
  – Anusaaraka
    • Translates among Indian languages and from English to Hindi
  – MaTra
    • Translates English to Hindi
    • Favours understandable output over wide coverage

Rule-based Approach
• Gives grammatically correct translations by using a set of rules
• Examples
  – Apertium
    • Open source system
    • Can be used to translate between any two related languages
  – OpenLogos
    • Open source
    • English and German to major European languages

Statistical Approach
• Generates translations using statistical methods based on bilingual text corpora
• Examples
  – Babel Fish
    • Web-based application developed by AltaVista
    • Can translate among English F
