Call for Papers

Workshop on Cross-Platform Text Mining and Natural Language Processing Interoperability

LREC 2016
Grand Hotel Bernardin Conference Center
Portorož, Slovenia
23 May 2016

Call for Submissions

http://interop2016.github.io

Description

Recent years have witnessed an upsurge in the quantity of available digital research data, offering new insights and opportunities for improved understanding. Following advances in Natural Language Processing (NLP), text and data mining (TDM) is emerging as an invaluable tool for harnessing the power of structured and unstructured content and data. Hidden and new knowledge can be discovered by using TDM at multiple levels and in multiple dimensions. However, text mining and NLP solutions are not easy for end users to discover, use, or combine.

Multiple efforts are being undertaken world-wide to create TDM and NLP platforms. These platforms are targeted at specific research communities, typically researchers in a particular location, e.g. OpenMinTeD, CLARIN (Europe), ALVEO (Australia), or LAPPS (USA). All of these platforms face similar problems in the following areas: discovery of content and analytics capabilities, integration of knowledge resources, legal and licensing aspects, data representation, and analytics workflow specification and execution.

The goal of cross-platform interoperability raises many problems. At the level of content, metadata, language resources, and text annotations, we use different data representations and vocabularies. At the level of workflows, there is no uniform process model that allows platforms to interact smoothly. The licensing status of content, resources, and analytics, and of any output produced under a combination of their licenses, is difficult to determine, and there is currently no way to reliably exchange such information between platforms. User identity management is often tightly coupled to the licensing requirements and is likewise an impediment to cross-platform interoperability.

Target audience

Language resources and technologies, NLP, computational linguistics, and text mining communities as well as their associated infrastructural initiatives.

Motivation and Topics of interest

Workshop topics include but are not limited to:

cross-repository discovery of content, language resources, and analytics
uniform access to content repositories or heterogeneous data sources (content, knowledge)
extraction of textual content from heterogeneous sources
orchestration of analytics workflows composed from analytics from different sources
orchestration of cross-platform analytics workflows
linking knowledge sources and uniformly accessing them from analytics workflows
annotation schema design best practices
mapping and transformation between annotation schemata
dynamic deployment of analytics to computing resources
machine-interpretable representation of legal and licensing metadata
policy making for TDM for an international open research environment and open access publishing

Format

The workshop is planned as an open-space event in which the workshop participants host and participate in discussions related to the topics of interest.

We invite submissions of extended abstracts/short papers (up to 4 pages) describing recent work, thoughts, or best practices on one or more of the topics of interest. All submissions will be reviewed using a simple blind process by at least three programme committee members and will be assessed based on their relevance, potential to create constructive discussion, and clarity of writing. The submissions must be formatted in compliance with the style sheet that will be adopted for the LREC Proceedings (to be announced later on the Conference web site).

Accepted papers will be presented at the workshop in the form of a 5-minute lightning talk and included in the workshop proceedings. If there is an unexpectedly high number of submissions, we may consider accepting some as posters.

At least one author of each paper is expected to register for the workshop. During the workshop, the author is expected to host or co-host a discussion group. We plan to align the topics of the discussion groups with the topics of the authors' submissions. The hosts will take minutes, which are to be aggregated into a report after the workshop. We encourage authors to offer the organizing committee their help in writing the report.

Important dates

Submission: February 19, 2016
Notification: March 4, 2016
Camera ready: March 25, 2016
Workshop: Monday, 23 May 2016

Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.

As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2016 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.

Contact Person

Richard Eckart de Castilho, Technische Universität Darmstadt, Germany

Organizing Committee

Richard Eckart de Castilho, Technische Universität Darmstadt, Germany
Sophia Ananiadou, University of Manchester, UK
Thomas Margoni, University of Stirling, UK
Wim Peters, University of Sheffield, UK
Stelios Piperidis, ILSP/ARC, Greece

Programme Committee

Dominique Estival, Western Sydney University, Australia
Iryna Gurevych, Technische Universität Darmstadt, Germany
Jens Grivolla, Universitat Pompeu Fabra, Spain
John Philip McCrae, National University of Ireland, Galway, Ireland
Joseph Mariani, LIMSI/CNRS, France
Kalina Bontcheva, University of Sheffield, UK
Lucie Guibault, University of Amsterdam, The Netherlands
Menzo Windhouwer, Meertens Institute, The Netherlands
Nancy Ide, Vassar College, USA
Natalia Manola, ILSP/ARC, Greece
Nicolas Hernandez, University of Nantes, France
Pei Chen, Wired Informatics, USA
Peter Klügl, Averbis GmbH, Germany
Rafal Rak, UberResearch and University of Manchester, UK
Renaud Richardet, EPFL, Switzerland
Robert Bossy, INRA, France
Thilo Götz, IBM, Germany
Torsten Zesch, University of Duisburg-Essen, Germany
Yohei Murakami, Kyoto University, Japan