English to Malay Automated Translation Resources – Proverb

[updated 12 Sept 2013]
These are several resources for my research on Malay proverb treatment in Malay-English machine translation:
  1. Google Translate - my dream to achieve... getting nearer
  2. BabelFish - another online translator by AltaVista, now owned by Yahoo!
  3. OpenLogos - machine translation, I can't get it to work for me yet.
  4. Statistical Machine Translation - the research page of SMT
  5. Moses: a state-of-the-art open source SMT system - got to try this... soon.
  6. CitCat - They have done English to Malay and Malay to English online translation... but still in BETA
  7. Machine Translation - comparison of machine translation applications on Wikipedia
  8. Lost In Translation - ever think about funny stuff here? here's one...
  9. My favourite Malay-English translation - http://www.citcat.com
  10. Proverb resources from all over the world, including proverb collections, articles on proverbs, proverb journals, and reference works - http://cogweb.ucla.edu/Discourse/Proverbs/index.html
  11. Articles for translators and translation agencies: Translation Theory: A Model of Translation Based on Proverbs and Their Metaphors: A Cognitive Descriptive Approach - http://www.translationdirectory.com/articles/article2283.php
  12. Manual Translation - http://www.translationdirectory.com/
  13. Stanford Natural Language Processing and Computational Linguistics Group - http://nlp.stanford.edu/projects/mt.shtml
  14. Google Translation Research group - http://research.google.com/pubs/MachineTranslation.html
  15. University Research Program for Google Translate - provides researchers in the field of automatic machine translation with tools to help compare and contrast with, and build on top of, Google's statistical machine translation system. Participation in the program gives researchers programmatic access to Google's translation service. http://research.google.com/university/translate/
Proverb/Idioms and Dictionaries
  1. Peribahasa Scanner - capable of detecting Malay proverbs in a sentence or part of a sentence. There are currently 3000+ Malay proverbs in the database. It can also suggest a Malay proverb based on the context of a Malay sentence. https://play.google.com/store/apps/details?id=net.kerul.peribahasa
  2. Large Malay idioms/proverbs collection by malaycivilization.ukm.my - http://malaycivilization.ukm.my/idc/groups/ukm_view_page/documents/malayportal/ukm_view_page.hcsp?nid=79
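
To give a rough idea of what detecting a proverb in a sentence can look like, here is a minimal Python sketch based on plain phrase matching. It is not how Peribahasa Scanner is implemented; the tiny lexicon, the rough English glosses, and the names normalize and detect_proverbs are all my own assumptions for illustration.

```python
import re

# Hypothetical mini-lexicon for illustration only; the real app's
# database of 3000+ proverbs is not reproduced here.
# Values are rough English glosses (assumed wording).
PROVERBS = {
    "bagai aur dengan tebing": "helping one another",
    "seperti kacang lupakan kulit": "forgetting one's origins",
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def detect_proverbs(sentence, lexicon=PROVERBS):
    """Return the lexicon proverbs that occur as whole-word phrases in the sentence."""
    # Padding with spaces makes the substring test match whole words only.
    padded = f" {normalize(sentence)} "
    return [p for p in lexicon if f" {normalize(p)} " in padded]


if __name__ == "__main__":
    print(detect_proverbs("Mereka hidup bagai aur dengan tebing, saling membantu."))
```

Exact matching like this misses inflected or reordered proverbs, which is where stemming, stop-word removal, or n-gram approaches would come in.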
Open Journal Directories
Some open-access thesis/dissertation directories; some have full-text documents:
  1. DORAS (Dublin City University Online Research Access Service) - http://doras.dcu.ie/theses/
  2. OhioLINK Electronic Theses and Dissertations Center - http://www.ohiolink.edu/etd/
  3. PQDT OPEN - http://pqdtopen.proquest.com/
  4. UDL Thesis - http://www.udltheses.com/
  5. UWC Thesis Online - http://etd.uwc.ac.za/
  6. Open Thesis - http://www.openthesis.org/
Conferences and Journals
  1. 8th International Conference of the Asian Association for Lexicography - http://asialex2013.org
  2. International Conference on Translation 2013, Penang, Malaysia
  3. MALINDO.org - MALINDO, as an annual international workshop, aims to bring together researchers and practitioners, representing different perspectives, to share and to exchange their ideas on the processing of South East Asian languages. Because of the venue of the 2012 Workshop, the Organisers of MALINDO will anticipate papers related to the processing of the varieties of Borneo languages. MALINDO’s objective is to highlight the effort and promising works done on the processing of the South East Asian languages, thus attracting more students to the CL and NLP fields.
  4. Computational Linguistics / NLP Conferences – 2013
  5. International Journal of Computer Science Issues - http://ijcsi.org/


Parallel Corpora


  •  European Corpus Initiative Multilingual Corpus I (ECI/MCI)
    A 98 million word corpus, covering most of the major European languages, as well as many others (viz. Albanian, Bulgarian, Chinese, Czech, Dutch, English, Estonian, French, Gaelic, German, Greek, Italian, Japanese, Latin, Lithuanian, Malay, Spanish, Danish, Uzbek, Norwegian, Portuguese, Russian, Serbian, Swedish, Turkish, Tibetan). The primary focus in this effort is on textual material of all kinds, including transcriptions of spoken material. ECI/MCI has 46 subcorpora in 27 (mainly European) languages. The total size of these is roughly 92 million (lexical) words. The corpora are marked up using TEI P2 conformant SGML (to varying levels of detail), with easy access to the source text without markup. Twelve of the component corpora are multilingual parallel corpora with between two and nine sub-corpora. All the alphabetic corpora (there is some Japanese and Chinese) are encoded in the ISO LATIN family of 8-bit character sets (ISO 8859-1, -5 and -7). A complete list of the contents is available at the listing URL given below.
    Unusually cheap: the ECI/MCI is available directly from ECI at a price of 95 DFl (for payments made by credit card or Eurocheque); 110 DFl (for payments by bank transfer); or 120 DFl (for payments by cheques other than Eurocheques). You need only sign a license agreement, available in PostScript or LaTeX versions.
    It is also available for $35 (or through membership) from the LDC on a CD-ROM in High Sierra format (ISO 9660), readable on at least UNIX, MS-DOS and Apple systems.

    Complete listing:  http://www.elsnet.org/ecilisting.html
    Some samples: http://www.elsnet.org/resources/eci-samples.html

  1. Treatment of German Proverbs – Dimitra Anastasiou
Articles on the issue of Malay Proverb Treatment in Machine Translation
  1. Khirulnizam Abd Rahman and Norita Md Norwawi, 2013. The Challenges of Handling Proverbs in Malay-English Machine Translation. International Conference on Translation 2013. Universiti Sains Malaysia, Penang.
  2. Khirulnizam Abd Rahman and Norita Md Norwawi, 2013. Proverb Treatment in Malay-English Machine Translation. International Conference on Machine Learning and Computer Science (IMLCS'2013). Kuala Lumpur.
  3. Khirulnizam Abd Rahman and Norita Md Norwawi, 2012. Malay Proverb Detection; Implementation on Mobile Environment. Mobile Computing, Applications and Services, September 2012, Universiti Teknikal Melaka, Malaysia.
  4. The Function of Stemming and Stop-word Removal in Filtering Malay Proverbs – (not yet published)
  5. Comparison of Pattern Matching, Bag of Words, N-gram and Bayesian in Proverb Detection – (not yet published)
  6. Implementation of Malay Proverb Filtering in Moses – (not yet published)
