Chair of Foundations of Programming
Seminar on Natural Language Processing, winter term 2010/11
The language of this seminar is going to be German, so some parts of this page are only available in German. If you do not speak German and you are interested in a seminar, please have a look at our offerings for next semester.
Introduction
We will study the interplay between the field of automated translation of human languages, in particular syntax-based statistical machine translation, and the theory of tree automata, tree transducers and related models.
Objectives
In brief: studying literature, giving a scientific talk, and writing a scientific essay. Every student is expected to give a talk (about 45 minutes) on their topic. Students who are not taking this course as a proseminar are also expected to write a seminar essay (about 15 pages). After the seminar, every student should have a grasp of the central statement of each of the presented topics. If the seminar is included in an oral examination, this material may be examined.
Prerequisites
It is advisable, though not compulsory, to have knowledge from one of our lectures about Machine Translation or Tree Automata. Basic formal language theory is also recommended.
Organization
During the first meeting, topics are assigned to the students. Please attend this meeting if you want to participate. A second meeting, around one month later, is intended to encourage your literature study. Further meetings will be scheduled individually between each student and his/her supervisor as needed. At the end of the lecture period, a day or two will be reserved for all the talks.
Important note: Students who fail to adhere to deadlines (i.e., who hand in too little or too late) will be excluded from the seminar. Any deviations from the deadlines have to be negotiated with the supervisor in advance.
Talks Schedule
All talks take place on Thursday, February 3. Each talk may take at most 45 minutes. This is a hard limit. Speakers can use their own laptops or the one provided on site. Speakers have to set up the respective laptop 15 minutes before their session begins.
10:00–10:15 | Opening
Morning Session. Chair: Toni Dietze
10:15 | Lars Engel: n-Grams (Proseminar) | slides
11:15–12:15 | Lunch Break
Afternoon Session. Chair: Toni Dietze
12:15 | Anja Fischer: Proof of the NP-Hardness of Decoding | essay, slides
13:15 | Stefan Prasse: State-Split Grammars | essay, slides
14:15–14:30 | Closing
Schedule
The following table outlines the structure of the seminar over the course of the semester. Meetings take place in room INF 3027. See the panel at the right for your supervisor's contact information.
Date | Event
15 October, 13:00 | first meeting: topic assignment
5 November, 13:00 | second meeting
November, December | individual meetings with the supervisor (as needed) to clarify questions about the literature or the essay
10 December, 23:59 CET | last opportunity to submit a draft of the essay for feedback (optional)
31 December, 23:59 CET | submission of the final version of the essay (it will be distributed by e-mail to all participants)
week of 17 January | individual meetings to discuss the slides for the talk
24 January | anyone who has not been excluded by this date definitely takes part in the talks
3 February, 10:00 | final meeting: talks
Topics
No. | Title | Literature | Supervisor | Student
1 | Language models: n-gram models (Proseminar) | [5] | Dietze | Lars Engel
2 | Syntax-based language models: state-split grammars | [2,3] | Dietze | Stefan Prasse
3 | Synchronous tree-sequence-substitution grammars | [1,4] | - | -
4 | Parsing problems for probabilistic synchronous tree-insertion grammars | [7] | - | -
5 | Binarization | [14,15,16] | - | -
6 | Alignments | [6,10,11,12] | - | -
7 | Syntactic realignment models | [13] | - | -
8 | Evaluation of MT systems | [17,18,19] | - | -
9 | Availability of corpora (Proseminar) | [20,21,22] | - | -
10 | Decoding: NP-completeness | [23,24,25] | Dietze | Anja Fischer
11 | Decoding: variational decoding | [25] | - | -
Literature
[1] Min Zhang, Hongfei Jiang, Aiti Aw, Haizhou Li, Chew Lim Tan, Sheng Li. A Tree Sequence Alignment-based Tree-to-Tree Translation Model. Proc. ACL-HLT, 2008.
[2] Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein. Learning Accurate, Compact, and Interpretable Tree Annotation. Proc. COLING/ACL (Main Conference), 2006.
[3] Slav Petrov, Dan Klein. Learning and Inference for Hierarchically Split PCFGs. AAAI (Nectar Track), 2007.
[4] David Chiang. Learning to Translate with Source and Target Syntax. Proc. ACL, pages 1443–1452, 2010.
[5] Daniel Jurafsky, James H. Martin. Speech and Language Processing. Pearson Education, 2009.
[6] Jason Riesa, Daniel Marcu. Hierarchical Search for Word Alignment. Proc. ACL, 2010.
[7] R. Nesson, S. M. Shieber, A. Rush. Induction of Probabilistic Synchronous Tree-Insertion Grammars for Machine Translation. Proc. 7th Conference of the Association for Machine Translation in the Americas (AMTA 2006), Boston, Massachusetts, 8–12 August 2006.
[8] Liang Huang, David Chiang. Better k-best Parsing. Parsing '05: Proc. Ninth International Workshop on Parsing Technology, pages 53–64, Morristown, NJ, USA. Association for Computational Linguistics, 2005.
[9] Adam Pauls, Dan Klein. K-best A* Parsing. ACL-IJCNLP '09: Proc. Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Volume 2, pages 958–966, Morristown, NJ, USA. Association for Computational Linguistics, 2009.
[10] Stephan Vogel, Hermann Ney, Christoph Tillmann. HMM-Based Word Alignment in Statistical Translation. Proc. ACL, 1996.
[11] Franz Josef Och, Hermann Ney. A Systematic Comparison of Various Statistical Alignment Models. Proc. ACL, 2003.
[12] Franz Josef Och, Hermann Ney. Improved Statistical Alignment Models. Proc. ACL, 2000.
[13] Jonathan May, Kevin Knight. Syntactic Re-Alignment Models for Machine Translation. Proc. EMNLP, 2007.
[14] Andreas Maletti. Why Synchronous Tree Substitution Grammars? Proc. NAACL, 2010.
[15] Wei Wang, Kevin Knight, Daniel Marcu. Binarizing Syntax Trees to Improve Syntax-Based Machine Translation Accuracy. Proc. EMNLP-CoNLL, 2007.
[16] Liang Huang, Hao Zhang, Daniel Gildea, Kevin Knight. Binarization of Synchronous Context-Free Grammars. Computational Linguistics, 35(4).
[17] Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu. BLEU: A Method for Automatic Evaluation of Machine Translation. Proc. ACL, pages 311–318, 2002.
[18] Joseph P. Turian, Luke Shen, I. Dan Melamed. Evaluation of Machine Translation and its Evaluation. Proc. MT Summit, 2003.
[19] Satanjeev Banerjee, Alon Lavie. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgements. Proc. ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 65–72, 2005.
[20] http://en.wikipedia.org/wiki/Text_corpus
[21] http://en.wikipedia.org/wiki/Parallel_text_alignment
[22] http://en.wikipedia.org/wiki/Treebank
[23] Kevin Knight. Squibs and Discussions: Decoding Complexity in Word-Replacement Translation Models. Computational Linguistics, 25(4), 1999.
[24] Francisco Casacuberta, Colin de la Higuera. Computational Complexity of Problems on Probabilistic Grammars and Transducers. LNCS, 2000.
[25] Zhifei Li, Jason Eisner, Sanjeev Khudanpur. Variational Decoding for Statistical Machine Translation. Proc. ACL, 2009.
Getting Help
We have some information on writing articles available online. In general, if you have questions, do not hesitate to contact your supervisor. The earlier you address your problems, the easier the solutions will be.
|
Contact
Prof. Dr.-Ing. habil. Dr. h.c./Univ. Szeged Heiko Vogler
Phone: +49 (0) 351 463-38232 Fax: +49 (0) 351 463-37959
e-mail contact form