Discovering and Maintaining Links on the Web of Data

Bibliographical Metadata
Subject: Link Discovery
Keywords: Linked data, web of data, link discovery, link maintenance, record linkage, duplicate detection
Year: 2009
Authors: Julius Volz, Christian Bizer, Martin Gaedke, Georgi Kobilarov
Venue: ISWC
Content Metadata
Problem: Link Discovery
Approach: No data available now.
Implementation: Silk–Linking
Evaluation: No data available now.

Abstract

The Web of Data is built upon two simple ideas: employ the RDF data model to publish structured data on the Web, and create explicit data links between entities within different data sources. This paper presents the Silk -- Linking Framework, a toolkit for discovering and maintaining data links between Web data sources. Silk consists of three components: (1) a link discovery engine, which computes links between data sources based on a declarative specification of the conditions that entities must fulfil in order to be interlinked; (2) a tool for evaluating the generated data links in order to fine-tune the linking specification; and (3) a protocol for maintaining data links between continuously changing data sources. The protocol allows data sources to exchange both linksets and detailed change information, and it enables continuous link recomputation. The interplay of all the components is demonstrated within a life science use case.
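
The idea of a link discovery engine driven by a declarative linking condition can be illustrated with a minimal Python sketch. This is not Silk-LSL and not Silk's actual API: the function names, the `ex:` identifiers, the sample labels, and the 0.9 threshold are all hypothetical, and the standard-library `difflib` ratio stands in for whatever similarity metrics a real specification would combine.

```python
from difflib import SequenceMatcher


def label_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9, link_type="owl:sameAs"):
    """Compare every source entity with every target entity and emit a
    (source_id, link_type, target_id) triple when the labels are similar enough.

    `source` and `target` map entity identifiers to labels. The threshold is
    an illustrative assumption, not a value taken from the paper.
    """
    links = []
    for s_id, s_label in source.items():
        for t_id, t_label in target.items():
            if label_similarity(s_label, t_label) >= threshold:
                links.append((s_id, link_type, t_id))
    return links


# Hypothetical toy data standing in for two Web data sources.
source = {"ex:src/aspirin": "Aspirin", "ex:src/ibuprofen": "Ibuprofen"}
target = {"ex:tgt/DB1": "aspirin", "ex:tgt/DB2": "Paracetamol"}

links = discover_links(source, target)
for link in links:
    print(link)
```

The sketch compares all pairs, which is only workable for small inputs. It is meant to show the shape of the task (condition in, links out) rather than how Silk computes links.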

Conclusion

We presented the Silk framework, a flexible tool for discovering links between entities within different web data sources. The Silk-LSL link specification language was introduced and its applicability was demonstrated within a life science use case. We then proposed the WOD-LMP protocol for synchronizing and maintaining links between continuously changing Linked Data sources.

Future work

Future work on Silk will focus on the following areas: We will implement further similarity metrics to support a broader range of linking use cases. To assist users in writing Silk-LSL specifications, machine learning techniques could be employed to adjust weightings or optimize the structure of the matching specification. Finally, we will evaluate the suitability of Silk for detecting duplicate entities within local datasets instead of using it to discover links between disparate RDF data sources. The value of the Web of Data rises and falls with the amount and the quality of links between data sources. We hope that Silk and other similar tools will help to strengthen the linkage between data sources and therefore contribute to the overall utility of the network.

Approach

Positive Aspects: No data available now.

Negative Aspects: No data available now.

Limitations: No data available now.

Challenges: No data available now.

Proposes Algorithm: No data available now.

Methodology: No data available now.

Requirements: No data available now.


Implementations

Download-page: http://silk.googlecode.com

Access API: No data available now.

Information Representation: No data available now.

Data Catalogue: No data available now.

Runs on OS: No data available now.

Vendor: No data available now.

Uses Framework: No data available now.

Has Documentation URL: http://www4.wiwiss.fu-berlin.de/bizer/silk/

Programming Language: Python

Version: No data available now.

Platform: No data available now.

Toolbox: No data available now.

GUI: Yes

Research Problem

Subproblem of: No data available now.

RelatedProblem: No data available now.

Motivation: No data available now.

Evaluation

Experiment Setup: No data available now.

Evaluation Method: A methodology that proved useful for optimizing link specifications is to manually create a small reference linkset and tune the Silk linking specification until it reproduces these reference links, before Silk is run against the complete target data source.
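
The reference-linkset methodology can be sketched as a comparison between the links a specification generates and the manually created reference links. The page does not say how the fit is measured. Precision and recall are used here as an assumption, and the function name and sample link pairs are hypothetical.

```python
def evaluate_links(generated, reference):
    """Compare generated links against a manually created reference linkset.

    Both arguments are iterables of (source_id, target_id) pairs. Precision
    and recall are an assumed choice of measure, not one named by the paper.
    """
    generated, reference = set(generated), set(reference)
    correct = generated & reference
    return {
        "precision": len(correct) / len(generated) if generated else 0.0,
        "recall": len(correct) / len(reference) if reference else 0.0,
        # Reference links the specification failed to produce.
        "missing": sorted(reference - generated),
        # Generated links that are not in the reference linkset.
        "spurious": sorted(generated - reference),
    }


# Hypothetical reference linkset and specification output.
reference = [("ex:a", "ex:x"), ("ex:b", "ex:y")]
generated = [("ex:a", "ex:x"), ("ex:c", "ex:z")]

report = evaluate_links(generated, reference)
print(report)
```

In a tuning loop, the `missing` and `spurious` lists indicate whether the specification should be loosened or tightened before it is run against the complete target data source.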

Hypothesis: No data available now.

Description: No data available now.

Dimensions: No data available now.

Benchmark used: DBpedia, DrugBank

Results: No data available now.
