TIAD 2020

About

The third shared task for Translation Inference Across Dictionaries (TIAD 2020) explores methods and techniques for automatically generating new bilingual (and multilingual) dictionaries from existing ones. It does so within a coherent experimental framework that enables reliable validation of results and a sound comparison of the processes used. The initiative also aims to foster further research on inferring translations across languages.

TIAD 2020 will be held in conjunction with the GLOBALEX 2020 – Linked Lexicography workshop at the 12th Language Resources and Evaluation Conference (LREC 2020) on Tuesday, May 12, 2020 in Marseille (France).

UPDATE: Unfortunately, LREC 2020 was cancelled owing to the COVID-19 crisis and did not take place in Marseille as expected, nor was it replaced by an online event. However, the TIAD 2020 campaign continued as planned, the evaluation results were communicated, and the system papers were published jointly with the Globalex workshop proceedings. The authors will have the opportunity to present their papers to the community, physically or virtually, at the next TIAD edition (TIAD 2021).

Task definition

The objective of the TIAD shared task is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual resources. Such techniques would help in automatically generating new bilingual and multilingual dictionaries from existing ones.

In particular, the participating systems will be asked to generate new translations automatically among three languages (English, French and Portuguese), based on known translations contained in the Apertium RDF graph. As these languages (EN, FR, PT) are not directly connected in this graph, no translations can be obtained directly among them there. Based on the available RDF data, the participants will apply their methodologies to derive translations, mediated by any other language in the graph, between the pairs EN/FR, FR/PT and PT/EN. Other language pairs might be included in the evaluation process, in which case the participants will be informed in due course.
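
As an illustration only, the idea of deriving translations "mediated by any other language" can be sketched as a naive one-pivot lookup. The example below is a minimal sketch under stated assumptions: the word pairs are invented toy data (not taken from the Apertium RDF graph), Spanish is assumed as the pivot language, and the function name infer_via_pivot is hypothetical. It is not the method of any participating system.

```python
from collections import defaultdict


def infer_via_pivot(source_to_pivot, pivot_to_target):
    """Naive pivot inference (hypothetical helper).

    Both arguments are iterables of (word, translation) pairs.
    Returns {(source, target): set of pivot words linking the pair}.
    """
    # Index the second dictionary by pivot word for fast lookup.
    pivot_index = defaultdict(set)
    for pivot, target in pivot_to_target:
        pivot_index[pivot].add(target)

    # Every source -> pivot -> target path yields a candidate translation.
    support = defaultdict(set)
    for source, pivot in source_to_pivot:
        for target in pivot_index.get(pivot, ()):
            support[(source, target)].add(pivot)
    return dict(support)


# Toy EN->ES and ES->FR pairs, invented for illustration.
en_es = [("bank", "banco"), ("bench", "banco"), ("bank", "orilla")]
es_fr = [("banco", "banque"), ("banco", "banc"), ("orilla", "rive")]

candidates = infer_via_pivot(en_es, es_fr)
for (en, fr), pivots in sorted(candidates.items()):
    print(en, "->", fr, "via", sorted(pivots))
```

Because the pivot word "banco" is polysemous, this naive lookup also proposes incorrect pairs such as bench/banque. Filtering or ranking such candidates is the kind of problem that translation inference methods need to address.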

Participants may also make use of other freely available sources of background knowledge (e.g. lexical linked open data and parallel corpora) to improve performance, as long as no direct translation among the target language pairs is applied.

Evaluation of the results will be carried out by the organisers against manually compiled pairs of K Dictionaries and other resources.

Publication of results

Participants will submit a system paper that should include a description of the system, the way the data have been processed, the applied algorithms, the obtained results, as well as the conclusions and ideas for future improvements. The papers will be peer reviewed prior to publication to confirm that all aspects are well covered.

The workshop will also accept regular papers from authors who are not participating in the shared task but have worked on the topic of translation inference and want to publish novel results or ideas, possibly with datasets and an experimental basis different from the ones proposed in this shared task. Such papers will be peer reviewed on the basis of their scientific quality.

Both types of papers should be 6 to 10 pages long and formatted according to the LREC submission format. All accepted papers will be published as part of the GLOBALEX workshop proceedings and presented during the GLOBALEX workshop. Paper submissions will go through the START system.

UPDATE: The TIAD 2020 proceedings are already available as part of the Globalex 2020 workshop proceedings.

How to participate in the shared task

1. Contact us so we can be aware of your participation and inform you about any possible change, issue, etc. (see contact details at the bottom of this page)
2. Read the task and data description
3. Get the input data (initial translations)
4. Run your system on the input data
5. Get the output results (inferred translations) and format them according to the guidelines (see the task and data description section)
6. Send the output data to the organisers and wait for the evaluation results
7. Write and submit a system description paper
8. Present your paper at the workshop

Important dates

17/12/2019 – Technical description of the evaluation process and data provided by organisers
23/02/2020 (extended from 14/02/2020) – Submission of regular papers (not participating systems)
26/03/2020 (extended from 13/03/2020) – Submission of results by participating systems / notification of regular papers
09/04/2020 (extended from 02/04/2020) – Evaluation results communicated by organisers / camera-ready of regular papers
16/04/2020 (extended from 09/04/2020) – Submission of system description papers
12/05/2020 – Workshop day

Organisers

Jorge Gracia, University of Zaragoza, Spain
Besim Kabashi, Friedrich-Alexander University of Erlangen-Nuremberg and Ludwig-Maximilian University of Munich, Germany

In conjunction with the GLOBALEX organisers

Previous editions

TIAD 2017 at LDK 2017 in Galway (Ireland)
TIAD 2019 at LDK 2019 in Leipzig (Germany)

Reviewing committee

See the GLOBALEX scientific committee

References

Some papers describing previous work on translation inference across dictionaries:

Gracia, J., Kabashi, B., Kernerman, I. (Eds.): Proceedings of "TIAD-2019 Shared Task – Translation Inference Across Dictionaries" co-located with the 2nd Language, Data and Knowledge Conference (LDK 2019). Leipzig, Germany, May 20, 2019. See http://ceur-ws.org/Vol-2493/.
Gracia, J., Kabashi, B., Kernerman, I., Lanau-Coronas, M., Lonke, D.: Results of the Translation Inference Across Dictionaries 2019 Shared Task. In Proceedings of TIAD-2019 at LDK 2019, Leipzig, Germany, May 20, 2019. See http://ceur-ws.org/Vol-2493/summary.pdf
McCrae J. P., Bond, F., Buitelaar, P., Cimiano, Ph., Declerck, Th., Gracia, J., Kernerman, I., Montiel Ponsoda, E., Ordan, N. and Piasecki, M. (Eds.): Proceedings of the Workshop “Shared Task on Translation Inference Across Dictionaries”, co-located with the 1st Conference on Language, Data and Knowledge (LDK 2017). Galway, Ireland 2017. See http://ceur-ws.org/Vol-1899/.
Villegas, M., Melero, M., Gracia, J., and Bel, N. 2016. Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries. In LREC 2016 Proceedings: 613–622. http://repository.dlsi.ua.es/242/1/pdf/175_paper.pdf
Saralegi, X., Manterola, I. and San Vicente, I. 2011. Analyzing Methods for Improving Precision of Pivot Based Bilingual Dictionaries. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 846–856. ACL. http://dl.acm.org/citation.cfm?id=2145526.
Shezaf, D. and Rappoport, A. 2010. Bilingual Lexicon Generation Using Non-Aligned Signatures. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 98–107. ACL. http://dl.acm.org/citation.cfm?id=1858692
Kaji, H., Tamamura, S. and Erdenebat, D. 2008. Automatic Construction of a Japanese-Chinese Dictionary via English. In LREC 2008 Proceedings: 699–706.
Mausam, Soderland, S., Etzioni, O., Weld, D., Skinner, M. and Bilmes, J. 2008. Compiling a Massive, Multilingual Dictionary via Probabilistic Inference. In Annual Meeting of the Association of Computational Linguistics. ACL. https://www.cs.washington.edu/sites/default/files/ai/papers/tmpiVvJEg.pdf
Tanaka, K. and Umemura, K. 1994. Construction of a Bilingual Dictionary Intermediated by a Third Language. In Proceedings of the 15th Conference on Computational Linguistics, Volume 1, 297–303. ACL. http://dl.acm.org/citation.cfm?id=991937