What's New?

Keynote Talks

dekai@cs.ust.hk (Dekai Wu) — Fri, 02 Nov 2012 06:01:54 +0000

IWSLT 2012 is happy to feature three keynote talks, sponsored by NICT.

Dr. Dong Yu, Microsoft Research, USA
Prof. Hideki Isozaki, Okayama Prefectural University, Japan
Dr. Chai Wutiwiwatchai, National Electronics and Computer Technology Center (NECTEC), Thailand

Who Can Understand Your Speech Better — Deep Neural Network or Gaussian Mixture Model?
Dr. Dong Yu, Microsoft Research

Abstract: Recently we have shown that the context-dependent deep neural network (DNN) hidden Markov model (CD-DNN-HMM) can do surprisingly well for large vocabulary speech recognition (LVSR), as demonstrated on several benchmark tasks. Since then, much work has been done to understand its potential and to further advance the state of the art. In this talk I will share some of these thoughts and introduce some of the recent progress we have made.

In the talk, I will first briefly describe the CD-DNN-HMM and offer some insights into why DNNs can do better than shallow neural networks and Gaussian mixture models (GMMs). My discussion will be based on the fact that a DNN can be considered a joint model of a complicated feature extractor and a log-linear model. I will then describe how some of the obstacles to adopting CD-DNN-HMMs, such as training speed, decoding speed, sequence-level training, and adaptation, can be removed thanks to recent advances. After that, I will show ways to further improve the DNN structures to achieve better recognition accuracy and to support new scenarios. I will conclude the talk by indicating that DNNs not only do better but are also simpler than GMMs.

Bio: Dr. Dong Yu joined Microsoft Corporation in 1998 and the Microsoft Speech Research Group in 2002, where he is currently a senior researcher. He holds a PhD degree in computer science from the University of Idaho, an MS degree in computer science from Indiana University at Bloomington, an MS degree in electrical engineering from the Chinese Academy of Sciences, and a BS degree (with honors) in electrical engineering from Zhejiang University. His recent work focuses on deep neural networks and their applications to large vocabulary speech recognition. Dr. Dong Yu has published over 100 papers in speech processing and machine learning and is the inventor/co-inventor of around 50 granted/pending patents. He is currently serving as an associate editor of IEEE Transactions on Audio, Speech, and Language Processing (2011-) and has served as an associate editor of IEEE Signal Processing Magazine (2008-2011) and as the lead guest editor of the IEEE Transactions on Audio, Speech, and Language Processing special issue on deep learning for speech and language processing (2010-2011).

Head Finalization: Translation from SVO to SOV
Prof. Hideki Isozaki, Okayama Prefectural University

Abstract: Asian languages such as Japanese and Korean follow Subject-Object-Verb (SOV) word order, which is completely different from that of European languages such as English and French, which follow Subject-Verb-Object (SVO) word order. The difference is not limited to the position of the "Object" or the accusative case: the former are also called head-final and the latter head-initial. Because of this difference, phrase-based SMT between SVO and SOV languages does not work well. This talk introduces Head Finalization, which reorders sentences into head-final word order. According to the results of the NTCIR-9 workshop, Head Finalization was quite effective for English-to-Japanese patent translation.

Bio: Hideki Isozaki is a professor at Okayama Prefectural University, Japan. He received his B.E., M.E., and Ph.D. from the University of Tokyo in 1983, 1986, and 1998, respectively. Since joining Nippon Telegraph and Telephone Corporation (NTT) in 1986, he has worked on logical inference, information extraction, named entity recognition, question answering, summarization, and machine translation. From 1990 to 1991, he was a visiting scholar at Stanford University. He has authored or coauthored over 100 papers and Japanese books, including LaTeX with Complete Control and Question Answering Systems.

Toward Universal Network-based Speech Translation
Dr. Chai Wutiwiwatchai, National Electronics and Computer Technology Center (NECTEC)

Abstract: Speech translation technology is widely expected to play an important role in today's global communication. This talk will address the activities of a recently formed international consortium called Universal Speech Translation Advanced Research (U-STAR), which comprises 26 research organizations from 23 Asian and European countries. This large research consortium has jointly developed a network-based speech translation service that supports translation among 23 languages and accepts speech input in up to 17 languages. The service has been developed based on shared language resources in the travel and sport domains. Users are able to access the service via a freely available iPhone application named VoiceTra4U-M. This talk will start by describing the initiation of the U-STAR consortium, followed by a summary of the development issues in both language resources and system engineering. Some statistics and analyses of global usage during a few months of field testing after the service launch will be presented. Finally, the challenges of improving the service's accuracy and extending the number of supported languages and translation domains will be discussed.

Bio: Chai Wutiwiwatchai received his BEng (first-class honors) and MEng degrees in electrical engineering from Thammasat University and Chulalongkorn University, Thailand, in 1994 and 1997, respectively. He received his PhD in Computer Science from Tokyo Institute of Technology in 2004 under a Japanese government scholarship. He is now the Head of the Speech and Audio Technology Laboratory, National Electronics and Computer Technology Center (NECTEC), Thailand. His research work includes several international collaborative projects across a wide area of speech and language processing, including Universal Speech Translation Advanced Research (U-STAR), PAN Localization Network (PANL10N), and ASEAN Machine Translation. He is a member of the International Speech Communication Association (ISCA) and the Institute of Electronics, Information and Communication Engineers (IEICE), and served as a country representative on the ISCA international affairs committee from 2007 to 2009.

Release of ASR/SLT/MT References

michael.paul@nict.go.jp (Michael Paul) — Tue, 23 Oct 2012 08:58:08 +0000

The references of the IWSLT 2012 TED progress evaluation data set (tst2011) and the OLYMPICS evaluation data set (testset_IWLST12) are now available to participants.

TED Task: SLT and MT test sets available

cettolo@fbk.eu (Mauro Cettolo) — Fri, 14 Sep 2012 06:52:28 +0000

The test sets for both the SLT and MT tracks of the TED task are now available: visit the TED task page for details and to download them.

Please remember that the deadline for the submission of automatic translations is Sunday, September 23 [23:59 Japan Standard Time].

PARTICIPANTS ARE RESPONSIBLE FOR SUBMITTING THEIR RUNS IN THE CORRECT FORMAT.

If you have any problems, please contact the organizers by e-mail at iwslt2012.ted AT gmail DOT com

OLYMPICS Task Testset Released

michael.paul@nict.go.jp (Michael Paul) — Tue, 11 Sep 2012 00:42:18 +0000

The test set data files of the OLYMPICS task are now available. If you registered for the task and have not yet received the data sets, please contact the organizers immediately. The run submission deadline is set for September 18, 2012 [23:59 Japan Standard Time]. PARTICIPANTS ARE RESPONSIBLE FOR SUBMITTING THEIR RUNS IN THE CORRECT FORMAT. Incomplete or incorrectly formatted runs will be ignored for the IWSLT 2012 evaluation.

Good Luck! :-)

Accommodations: first come, first served; book early (rates guaranteed until 19 Oct 2012)

dekai@cs.ust.hk (Dekai Wu) — Tue, 04 Sep 2012 10:18:28 +0000

Accommodation information has been updated. Rooms are first come, first served, so please book early to avoid disappointment! Rates are guaranteed until 19 Oct 2012.

Scientific Paper Submission Deadline Extended

michael.paul@nict.go.jp (Michael Paul) — Tue, 04 Sep 2012 08:17:00 +0000

The submission deadline for scientific papers has been extended to:

Sep 30, 2012

Details can be found on the Important Dates and Submission pages.

TED task: ASR test sets available

cettolo@fbk.eu (Mauro Cettolo) — Fri, 31 Aug 2012 08:06:01 +0000

The test sets for the ASR track of the TED task are available. Please find them on the TED task page.

TED Run Submission Deadline Extended

cettolo@fbk.eu (Mauro Cettolo) — Fri, 24 Aug 2012 09:33:53 +0000

To avoid overlap with InterSpeech 2012, the run submission deadlines of the MT/SLT tracks of the IWSLT 2012 TED Task have been extended as follows:

Sep 14, 2012 : Release of TEST data
Sep 23, 2012 : Run submission of TEST

Details can be found at the TED Task page.

New TED MT additional language pair: Slovenian to English

cettolo@fbk.eu (Mauro Cettolo) — Fri, 24 Aug 2012 09:26:22 +0000

A new language pair has been added as an additional pair to the TED MT track:

from Slovenian to English

Training and development data are available at the TED Task page.

OLYMPICS Run Submission Deadline Extended

michael.paul@nict.go.jp (Michael Paul) — Fri, 10 Aug 2012 06:36:51 +0000

To give recently registered participants enough time to prepare their MT engines, the run submission deadlines of the IWSLT 2012 OLYMPICS Task have been extended as follows:

Sep 11, 2012 : Release of TEST data
Sep 18, 2012 : Run submission of TEST

Details can be found at the OLYMPICS Task page.