SemPub2016
Semantic Publishing Challenge 2016

Acronym: SemPub16
Event in series: Semantic Publishing Challenge
Type: Workshop
Dates: 2016/05/29 – 2016/06/02
Homepage: github.com/ceurws/lod/wiki/SemPub2016
Location: Heraklion, Crete, Greece

Important dates
Abstracts: 2016/01/18
Papers: 2016/03/11
Posters: 2016/05/29
Demos: 2016/05/29
Camera ready due: 2016/04/24
Topics
This is the next iteration of the successful Semantic Publishing Challenge of ESWC 2014 and 2015. We continue pursuing the objective of assessing the quality of scientific output, evolving the dataset bootstrapped in 2014 and 2015 to take into account the wider ecosystem of publications.
To achieve that, this year’s challenge focuses on refining and enriching an existing linked open dataset about workshops, their publications and their authors. Aspects of “refining and enriching” include extracting deeper information from the HTML and PDF sources of the workshop proceedings volumes and enriching this information with knowledge from existing datasets.
Thus, the challenge's tasks require a combination of technologies that are broadly investigated in the Semantic Web field, such as Information Extraction (IE), Natural Language Processing (NLP), Named Entity Recognition (NER) and link discovery.
Submissions
We ask challengers to automatically annotate a set of multi-format input documents and to produce a linked open dataset (LOD) that fully describes these documents, their context, and relevant parts of their content. The evaluation consists of running a set of queries against the produced dataset to assess its correctness and completeness.
The primary input dataset is the LOD that has been extracted from the CEUR-WS.org workshop proceedings using the winning extraction tools of the 2014 and 2015 challenges, together with the full original HTML and PDF source documents. In addition, the challenge uses existing LOD on scholarly publications as linking targets.
The input dataset will be split into two parts: a training dataset and an evaluation dataset, the latter disclosed only a few days before the submission deadline. Participants will be asked to run their tools on the evaluation dataset and to produce both the final Linked Dataset and the output of the queries on that dataset.
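To make this setup concrete, here is a minimal sketch of how query output could be produced over such a dataset with Python and rdflib. The file names, the BIBO/Dublin Core vocabulary and the query itself are assumptions made for illustration, not the challenge's official evaluation tooling.

```python
# Minimal sketch of the evaluation setup (assumed file names, vocabulary and
# query): load a produced Linked Dataset and write the output of one
# SPARQL query to CSV with rdflib.
import csv

from rdflib import Graph

QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT ?volume ?title WHERE {
    ?volume a bibo:Proceedings ;
            dcterms:title ?title .
}
ORDER BY ?volume
"""

def run_query(dataset_path: str, out_path: str) -> None:
    graph = Graph()
    graph.parse(dataset_path, format="turtle")   # the produced LOD, as Turtle
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["volume", "title"])
        for volume, title in graph.query(QUERY):
            writer.writerow([str(volume), str(title)])

if __name__ == "__main__":
    run_query("produced-dataset.ttl", "query-results.csv")
```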
Further details about the organization are provided in the general rules page.
The Challenge will include three tasks:
Task 1: Extraction and assessment of workshop proceedings information
Participants are required to extract information from a set of HTML tables of contents and PDF papers published in CEUR-WS.org workshop proceedings. The extracted information is expected to answer queries about the quality of these workshops, for instance by measuring their growth, longevity, and similar indicators. The task is an extension of Task 1 of the 2014 and 2015 challenges: the most challenging quality indicators from last year's challenge will be reused, some will be defined more precisely, and others will be completely new.
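As a rough illustration of the HTML side of this task, the sketch below parses a simplified volume table of contents and emits RDF triples about its papers using BeautifulSoup and rdflib. The volume URI, HTML layout and vocabulary are assumptions for the example; real CEUR-WS.org volumes vary considerably and need more robust handling.

```python
# Minimal sketch for Task 1 (HTML side): parse a simplified CEUR-WS.org volume
# table of contents and emit RDF triples about its papers. The volume URI,
# HTML layout and vocabulary are assumptions made for illustration.
from bs4 import BeautifulSoup
from rdflib import Graph, Literal, Namespace, RDF, URIRef

BIBO = Namespace("http://purl.org/ontology/bibo/")
DCTERMS = Namespace("http://purl.org/dc/terms/")

def toc_to_rdf(html: str, volume_uri: str) -> Graph:
    graph = Graph()
    volume = URIRef(volume_uri)
    graph.add((volume, RDF.type, BIBO.Proceedings))
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: each paper is listed as <li><a href="paper1.pdf">Title</a></li>.
    for link in soup.select("li a[href$='.pdf']"):
        paper = URIRef(volume_uri.rstrip("/") + "/" + link["href"])
        graph.add((paper, RDF.type, BIBO.Article))
        graph.add((paper, DCTERMS.title, Literal(link.get_text(strip=True))))
        graph.add((paper, DCTERMS.isPartOf, volume))
    return graph

if __name__ == "__main__":
    # "Vol-1234" is a placeholder volume, not an actual challenge input.
    with open("Vol-1234-index.html", encoding="utf-8") as handle:
        graph = toc_to_rdf(handle.read(), "http://ceur-ws.org/Vol-1234/")
    graph.serialize(destination="Vol-1234.ttl", format="turtle")
```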
Task 2: Extracting information from the PDF full text of the papers
Participants are required to extract information from the textual content of the papers (in PDF). That information should describe the organization of each paper and provide a deeper understanding of the context in which it was written. In particular, the extracted information is expected to answer queries about the internal organization of sections, tables and figures, and about the authors' affiliations, research institutions and funding sources. The task mainly requires PDF mining techniques and some NLP processing.
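As a starting point, a minimal sketch of the PDF mining step might look as follows, using pdfminer.six to extract the plain text and a naive regular expression to guess numbered section headings. The file name and the heading pattern are illustrative assumptions; a competitive solution would need layout-aware analysis to recover tables, affiliations and funding information.

```python
# Minimal sketch for Task 2 (illustrative assumptions: file name, heading
# pattern): extract the plain text of a paper and guess numbered section
# headings with a regular expression, using pdfminer.six.
import re

from pdfminer.high_level import extract_text

# Matches lines like "3. Evaluation" or "2.1 Dataset" (a naive heuristic).
HEADING = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+([A-Z][^\n]{2,80})$", re.MULTILINE)

def guess_sections(pdf_path: str):
    """Return (section number, title) pairs found in the extracted text."""
    text = extract_text(pdf_path)
    return HEADING.findall(text)

if __name__ == "__main__":
    for number, title in guess_sections("paper1.pdf"):
        print(number, title.strip())
```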
Task 3: Interlinking
Participants are required to interlink the CEUR-WS.org linked dataset with relevant datasets already existing in the Linked Open Data cloud. In particular, they are expected to interlink persons, papers, events, organizations and publications. All these entities should be identified, disambiguated and interlinked to their counterparts in other LOD datasets. Task 3 can be accomplished either as a named entity recognition and disambiguation task (NLP-based entity linking), as an entity interlinking task, or as a combination of both.
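A minimal sketch of the interlinking idea, assuming a naive exact-label lookup of author names against DBpedia, is shown below. The endpoint, matching strategy and owl:sameAs output are illustrative assumptions; a real submission would add disambiguation, for example via affiliations or co-authorship.

```python
# Minimal sketch for Task 3 (an assumption-laden example, not the challenge's
# required method): look up author names on DBpedia by exact English label and
# collect owl:sameAs links for the matches.
from SPARQLWrapper import SPARQLWrapper, JSON
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

DBPEDIA = "https://dbpedia.org/sparql"

def find_dbpedia_person(name: str):
    """Return the DBpedia URI of a person with this exact label, or None."""
    endpoint = SPARQLWrapper(DBPEDIA)
    endpoint.setReturnFormat(JSON)
    endpoint.setQuery("""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?s WHERE {
            ?s rdfs:label "%s"@en ;
               a foaf:Person .
        } LIMIT 1
    """ % name.replace('"', '\\"'))
    bindings = endpoint.queryAndConvert()["results"]["bindings"]
    return bindings[0]["s"]["value"] if bindings else None

def link_authors(authors: dict) -> Graph:
    """Map local author URIs (URI -> name) to DBpedia via owl:sameAs links."""
    graph = Graph()
    for local_uri, name in authors.items():
        match = find_dbpedia_person(name)
        if match:
            graph.add((URIRef(local_uri), OWL.sameAs, URIRef(match)))
    return graph
```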
Important Dates
- January 18, 2016: Publication of the full description of tasks, rules and queries; publication of the training dataset
- March 24, 2016: Paper submission (extended)
- April 16, 2016: Notification and invitation to submit task results (extended)
- April 30, 2016: Conference camera-ready (see note below) (extended)
- May 5, 2016: Deadline for remarks on the training dataset and the evaluation tool (extended)
- May 12, 2016: Publication of the evaluation dataset details
- May 13, 2016: Results submission
- May 29 – June 2, 2016: Challenge days
Committees
- Co-Organizers
- General Co-Chairs
  - Angelo Di Iorio, Department of Computer Science and Engineering, University of Bologna, IT
  - Anastasia Dimou, Ghent University, BE
  - Christoph Lange, Enterprise Information Systems, University of Bonn / Fraunhofer IAIS, DE
  - Sahar Vahdati, University of Bonn, DE
- Panel Chair
  - Stefan Dietze, L3S Research Institute, Hannover, DE
  - Anna Tordai, Elsevier, NL
- Program Committee Members
  - Aliaksandr Birukou, Springer Verlag, Heidelberg, Germany
  - Lukasz Bolikowski, University of Warsaw, Poland
  - Kai Eckert, University of Mannheim, Germany
  - Maxim Kolchin, ITMO University, Saint Petersburg, Russia
  - Phillip Lord, Newcastle University, UK
  - Philipp Mayr, GESIS, Germany
  - Jodi Schneider, University of Pittsburgh, USA
  - Selver Softic, Graz University of Technology, Austria
  - Ruben Verborgh, Ghent University – iMinds, Belgium
  - Michael Wagner, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany