A Survey of Current Link Discovery Frameworks

From Openresearch
Revision as of 22:39, 11 July 2018 by Said (talk | contribs)
Bibliographical Metadata
Subject: Link Discovery
Year: 2017
Authors: Markus Nentwig, Michael Hartung, Axel-Cyrille Ngonga Ngomo, Erhard Rahm
Venue: Semantic Web Journal
Content Metadata
Problem: Link Discovery
Approach: No data available now.
Implementation: No data available now.
Evaluation: No data available now.

Abstract

Links build the backbone of the Linked Data Cloud. With the steady growth in the size of datasets comes an increased need for end users to know which frameworks to use for deriving links between datasets. In this survey, we comparatively evaluate current Link Discovery tools and frameworks. For this purpose, we outline general requirements and derive a generic architecture of Link Discovery frameworks. Based on this generic architecture, we study and compare the features of state-of-the-art linking frameworks. We also analyze reported performance evaluations for the different frameworks. Finally, we derive insights pertaining to possible future developments in the domain of Link Discovery.

Conclusion

We investigated ten Link Discovery (LD) frameworks and compared their functionality based on a common set of criteria. The criteria cover the main steps, such as the configuration of linking specifications and the methods for matching and runtime optimization. We also covered general aspects such as the supported input formats and link types, support for a GUI, and software availability as open source. We observed that the considered tools already provide rich functionality, with support for semi-automatic configuration including advanced learning-based approaches such as unsupervised genetic programming or active learning. On the other hand, we found that most tools still focus on simple property-based match techniques rather than using the ontological context within structural matchers. Furthermore, existing links and background knowledge are not yet exploited in the considered frameworks. More comprehensive support of efficiency techniques, such as the combined use of blocking, filtering and parallel processing, is also necessary. We also analyzed comparative evaluations of the LD frameworks to assess their relative effectiveness and efficiency. In this respect, the OAEI instance matching track is the most relevant effort, and we thus analyzed its match tasks as well as the tool participation and results in recent years. Unfortunately, participation has been rather low, which prevents a comparative evaluation of most of the tools. Moreover, the contest has so far focused on effectiveness, while runtime efficiency has not yet been evaluated. To better assess the relative effectiveness and efficiency of LD tools, it would be valuable to test them on a common set of benchmark tasks on the same hardware. Given the general availability of the tools and the existence of a considerable set of match task definitions and datasets, this should be feasible with reasonable effort.
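
To make the terms "property-based matching", "blocking" and "filtering" from the conclusion concrete, the following is a minimal sketch and not the method of any surveyed framework. It assumes each resource is described by a single label property, uses a three-character prefix as the blocking key, and keeps only pairs whose string similarity reaches a threshold. The names `block_key` and `discover_links`, the sample data, and the threshold value are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def block_key(label):
    """Hypothetical blocking key: first three characters of the lowercased label."""
    return label.strip().lower()[:3]


def discover_links(source, target, threshold=0.85):
    """Return (source_id, target_id, score) for label pairs scoring >= threshold."""
    # Blocking: index target resources by key so only same-block pairs are compared.
    blocks = defaultdict(list)
    for target_id, label in target.items():
        blocks[block_key(label)].append((target_id, label))

    links = []
    for source_id, label in source.items():
        for target_id, candidate in blocks.get(block_key(label), []):
            # Property-based matching: string similarity on one property (the label).
            score = SequenceMatcher(None, label.lower(), candidate.lower()).ratio()
            # Filtering: discard candidate pairs below the similarity threshold.
            if score >= threshold:
                links.append((source_id, target_id, score))
    return links


# Illustrative data only; identifiers and labels are made up.
source = {"s1": "Leipzig University", "s2": "Paderborn University"}
target = {"t1": "Leipzig Universty", "t2": "Paris Descartes University", "t3": "Berlin"}
links = discover_links(source, target)
print(links)
```

In this sketch only `s1` and `t1` share a block, so a single comparison is made instead of six. A prefix key is a crude choice that misses matches whose labels differ in the first characters, which is one reason real frameworks combine several efficiency techniques.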

Future work

No future work is listed.

Approach

Positive Aspects: No data available now.

Negative Aspects: No data available now.

Limitations: No data available now.

Challenges: No data available now.

Proposes Algorithm: No data available now.

Methodology: No data available now.

Requirements: No data available now.

Implementations

Download-page: No data available now.

Access API: No data available now.

Information Representation: No data available now.

Data Catalogue: -

Runs on OS: No data available now.

Vendor: No data available now.

Uses Framework: No data available now.

Has Documentation URL: No data available now.

Programming Language: No data available now.

Version: No data available now.

Platform: No data available now.

Toolbox: No data available now.

GUI: No

Research Problem

Subproblem of: No data available now.

RelatedProblem: No data available now.

Motivation: No data available now.

Evaluation

Experiment Setup: No data available now.

Evaluation Method: -

Hypothesis: No data available now.

Description: No data available now.

Dimensions: No data available now.

Benchmark used: -

Results: No data available now.
