Discovering and Maintaining Links on the Web of Data
Bibliographical Metadata
Subject: Link Discovery
Keywords: Linked data, web of data, link discovery, link maintenance, record linkage, duplicate detection
Year: 2009
Authors: Julius Volz, Christian Bizer, Martin Gaedke, Georgi Kobilarov
Venue: ISWC
Content Metadata
Problem: Link Discovery
Approach: No data available now.
Implementation: Silk–Linking
Evaluation: No data available now.
Abstract
The Web of Data is built upon two simple ideas: use the RDF data model to publish structured data on the Web, and create explicit data links between entities in different data sources. This paper presents the Silk -- Linking Framework, a toolkit for discovering and maintaining data links between Web data sources. Silk consists of three components: (1) a link discovery engine, which computes links between data sources based on a declarative specification of the conditions that entities must fulfil in order to be interlinked; (2) a tool for evaluating the generated data links in order to fine-tune the linking specification; and (3) a protocol for maintaining data links between continuously changing data sources. The protocol allows data sources to exchange both linksets and detailed change information, and it enables continuous link recomputation. The interplay of all the components is demonstrated within a life science use case.
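The idea behind the link discovery engine, computing links from a declarative specification of linking conditions, can be illustrated with a minimal Python sketch. This is not Silk-LSL syntax and not Silk's actual code: the `SPEC` dictionary layout, the property names, the weights, the threshold, and the use of `difflib.SequenceMatcher` as the similarity metric are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical declarative specification (not Silk-LSL): each condition
# compares one property of a source entity with one property of a target
# entity, and contributes to the overall score according to its weight.
SPEC = {
    "link_type": "owl:sameAs",
    "threshold": 0.8,
    "conditions": [
        {"source": "label", "target": "name", "weight": 2.0},
        {"source": "cas", "target": "casNumber", "weight": 1.0},
    ],
}


def string_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(source, target, spec):
    """Weighted average of the similarities for all applicable conditions."""
    total = 0.0
    weight_sum = 0.0
    for cond in spec["conditions"]:
        a = source.get(cond["source"])
        b = target.get(cond["target"])
        if a is None or b is None:
            continue  # skip conditions whose properties are missing
        total += cond["weight"] * string_similarity(a, b)
        weight_sum += cond["weight"]
    return total / weight_sum if weight_sum else 0.0


def discover_links(sources, targets, spec):
    """Compare every source entity with every target entity and emit a
    (source URI, link type, target URI) triple when the score reaches
    the threshold."""
    links = []
    for s_uri, s in sources.items():
        for t_uri, t in targets.items():
            if score(s, t, spec) >= spec["threshold"]:
                links.append((s_uri, spec["link_type"], t_uri))
    return links


# Illustrative data only; the URIs and property values are made up.
sources = {"ex:a": {"label": "Aspirin", "cas": "50-78-2"}}
targets = {
    "ex:x": {"name": "aspirin", "casNumber": "50-78-2"},
    "ex:y": {"name": "Ibuprofen", "casNumber": "15687-27-1"},
}
print(discover_links(sources, targets, SPEC))
```

The point of keeping the conditions in data rather than in code is that the specification can be tuned (weights, threshold, compared properties) without touching the matching logic. The pairwise loop is quadratic and only suitable for small inputs.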
Conclusion
We presented the Silk framework, a flexible tool for discovering links between entities within different web data sources. The Silk-LSL link specification language was introduced and its applicability was demonstrated within a life science use case. We then proposed the WOD-LMP protocol for synchronizing and maintaining links between continuously changing Linked Data sources.
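How change information could drive link maintenance can be sketched as follows. This does not reproduce the WOD-LMP message format, which is not described on this page; the `Change` type, its `"updated"`/`"deleted"` kinds, and the `apply_changes` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical change notification about one target resource."""
    uri: str
    kind: str  # assumed values: "updated" or "deleted"


def apply_changes(links, changes):
    """Partition a linkset after the target source reports changes.

    links: set of (source_uri, target_uri) pairs.
    Returns (kept, recheck, removed):
      kept    - links whose target was not affected
      recheck - links whose target was updated and must be recomputed
      removed - links whose target was deleted
    """
    deleted = {c.uri for c in changes if c.kind == "deleted"}
    updated = {c.uri for c in changes if c.kind == "updated"}
    kept = {(s, t) for s, t in links if t not in deleted and t not in updated}
    recheck = {(s, t) for s, t in links if t in updated}
    removed = {(s, t) for s, t in links if t in deleted}
    return kept, recheck, removed


# Illustrative linkset and change notifications.
links = {("a", "x"), ("b", "y"), ("c", "z")}
changes = [Change("y", "updated"), Change("z", "deleted")]
kept, recheck, removed = apply_changes(links, changes)
print(kept, recheck, removed)
```

Only the links in `recheck` need to be passed back to the link discovery step, which is what makes recomputation incremental rather than a full rerun.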
Future work
Future work on Silk will focus on the following areas: We will implement further similarity metrics to support a broader range of linking use cases. To assist users in writing Silk-LSL specifications, machine learning techniques could be employed to adjust weightings or optimize the structure of the matching specification. Finally, we will evaluate the suitability of Silk for detecting duplicate entities within local datasets instead of using it to discover links between disparate RDF data sources. The value of the Web of Data rises and falls with the amount and the quality of links between data sources. We hope that Silk and other similar tools will help to strengthen the linkage between data sources and therefore contribute to the overall utility of the network.
Approach
Positive Aspects: No data available now.
Negative Aspects: No data available now.
Limitations: No data available now.
Challenges: No data available now.
Proposes Algorithm: No data available now.
Methodology: No data available now.
Requirements: No data available now.
Implementations
Download-page: http://silk.googlecode.com
Access API: No data available now.
Information Representation: No data available now.
Data Catalogue: No data available now.
Runs on OS: No data available now.
Vendor: No data available now.
Uses Framework: No data available now.
Has Documentation URL: http://www4.wiwiss.fu-berlin.de/bizer/silk/
Programming Language: Python
Version: No data available now.
Platform: No data available now.
Toolbox: No data available now.
GUI: Yes
Research Problem
Subproblem of: No data available now.
RelatedProblem: No data available now.
Motivation: No data available now.
Evaluation
Experiment Setup: No data available now.
Evaluation Method: A methodology that proved useful for optimizing link specifications is to manually create a small reference linkset and then tune the Silk linking specification until it reproduces these reference links, before running Silk against the complete target data source.
Hypothesis: No data available now.
Description: No data available now.
Dimensions: No data available now.
Benchmark used: DBpedia, DrugBank
Results: No data available now.
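The evaluation method above, tuning a specification against a small manually created reference linkset, can be sketched in Python. This is an illustrative assumption about how such tuning might be scored, not Silk's evaluation tool: precision, recall, and F1 are used here as quality measures, and the function names, candidate thresholds, and example scores are hypothetical.

```python
def evaluate(generated, reference):
    """Precision, recall, and F1 of generated links against reference links."""
    generated, reference = set(generated), set(reference)
    true_positives = len(generated & reference)
    precision = true_positives / len(generated) if generated else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def best_threshold(scored_pairs, reference, candidates):
    """Pick the candidate threshold whose generated links best match the
    reference linkset (highest F1; the first candidate wins ties).

    scored_pairs: dict mapping (source_uri, target_uri) -> similarity score.
    """
    best = None
    for threshold in candidates:
        generated = {pair for pair, s in scored_pairs.items() if s >= threshold}
        f1 = evaluate(generated, reference)[2]
        if best is None or f1 > best[1]:
            best = (threshold, f1)
    return best


# Illustrative scores and reference links; the values are made up.
scored = {("a", "x"): 0.9, ("b", "y"): 0.7, ("c", "z"): 0.4}
reference = {("a", "x"), ("b", "y")}
print(best_threshold(scored, reference, [0.3, 0.6, 0.85]))
```

Once a threshold (or, more generally, a specification) reproduces the reference links well on the small sample, it can be applied to the complete target data source.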