SLINT: A Schema-Independent Linked Data Interlinking System

Bibliographical Metadata
Subject: Link Discovery
Keywords: linked data, schema-independent, blocking, interlinking
Year: 2012
Authors: Khai Nguyen, Ryutaro Ichise, Bac Le
Venue: OM
Content Metadata
Problem: Link Discovery
Approach: Weighted co-occurrence and adaptive filtering in blocking and instance matching
Implementation: SLINT
Evaluation: Accuracy Evaluation

Abstract

Linked data interlinking is the discovery of all instances that represent the same real-world object and are located in different data sources. Since different data publishers frequently use different schemas for storing resources, we aim to develop a schema-independent interlinking system. Our system automatically selects important predicates and useful predicate alignments, which are used as the key for blocking and instance matching. The key distinction of our system is the use of weighted co-occurrence and adaptive filtering in blocking and instance matching. Experimental results show that the system substantially improves precision and recall over several recent systems. The performance of the system and the efficiency of its main steps are also discussed.
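The blocking idea named in the abstract can be illustrated with a minimal sketch. It is not the SLINT implementation: the function name `candidate_pairs`, the `token_weight` table (a stand-in for alignment-derived weights), and the `keep_ratio` parameter are all hypothetical, and the relative cut-off below is only one plausible reading of "adaptive filtering". Each source instance is scored against target instances by the summed weight of the tokens they share, and only targets scoring close to that instance's best score are kept as candidates.

```python
from collections import defaultdict


def candidate_pairs(source, target, token_weight, keep_ratio=0.8):
    """Block two datasets by weighted token co-occurrence (illustrative only).

    source, target: dict mapping instance id -> set of tokens.
    token_weight:   dict mapping token -> weight (assumed, defaults to 1.0).
    keep_ratio:     a target is kept if its score is at least this fraction
                    of the best score found for the same source instance.
    """
    # Inverted index: token -> ids of target instances containing it.
    index = defaultdict(set)
    for tid, tokens in target.items():
        for tok in tokens:
            index[tok].add(tid)

    candidates = {}
    for sid, tokens in source.items():
        # Weighted co-occurrence score against every target sharing a token.
        scores = defaultdict(float)
        for tok in set(tokens):
            for tid in index.get(tok, ()):
                scores[tid] += token_weight.get(tok, 1.0)
        if not scores:
            continue
        # Threshold adapts to each source instance instead of being global.
        best = max(scores.values())
        candidates[sid] = sorted(
            tid for tid, score in scores.items() if score >= keep_ratio * best
        )
    return candidates


source = {"s1": {"alien", "1979"}, "s2": {"heat", "1995"}}
target = {
    "t1": {"alien", "1979"},
    "t2": {"alien", "1986"},
    "t3": {"heat", "1995"},
    "t4": {"casino", "1995"},
}
token_weight = {"1979": 0.5, "1986": 0.5, "1995": 0.5}  # years count less
blocks = candidate_pairs(source, target, token_weight)
print(blocks)
```

In this toy data, `t2` and `t4` share a token with a source instance but fall below the per-instance cut-off, so they are dropped before any detailed instance matching would run.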

Conclusion

In this paper, we present SLINT, an efficient schema-independent linked data interlinking system. We select important predicates by their coverage and discriminability. The predicate alignments are constructed and filtered to obtain key alignments. We implement an adaptive filtering technique to produce candidates and identities. Compared with the most recent systems, SLINT achieves substantially higher precision and recall in interlinking. SLINT is also fast: it takes around 1 minute to detect more than 13,000 identity pairs.
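Predicate selection by coverage and discriminability might look roughly like the sketch below. This page does not give the formulas, so the definitions here are assumptions: coverage is taken as the fraction of instances that use a predicate, and discriminability as the number of distinct objects divided by the number of triples for that predicate. The function names and both threshold values are hypothetical.

```python
from collections import defaultdict


def predicate_scores(triples):
    """Return {predicate: (coverage, discriminability)}.

    triples: iterable of (subject, predicate, object) tuples.
    Both measures follow assumed definitions, not ones confirmed by the source.
    """
    subjects = set()
    subjects_by_pred = defaultdict(set)
    objects_by_pred = defaultdict(set)
    triples_by_pred = defaultdict(int)
    for s, p, o in triples:
        subjects.add(s)
        subjects_by_pred[p].add(s)
        objects_by_pred[p].add(o)
        triples_by_pred[p] += 1

    scores = {}
    for p, n_triples in triples_by_pred.items():
        coverage = len(subjects_by_pred[p]) / len(subjects)
        discriminability = len(objects_by_pred[p]) / n_triples
        scores[p] = (coverage, discriminability)
    return scores


def select_predicates(triples, min_coverage=0.5, min_discriminability=0.5):
    """Keep predicates that pass both (hypothetical) thresholds."""
    scores = predicate_scores(triples)
    return sorted(
        p for p, (cov, dis) in scores.items()
        if cov >= min_coverage and dis >= min_discriminability
    )


triples = [
    ("m1", "title", "Alien"),
    ("m2", "title", "Heat"),
    ("m3", "title", "Solaris"),
    ("m1", "type", "Film"),
    ("m2", "type", "Film"),
    ("m3", "type", "Film"),
    ("m1", "runtime", "117"),
]
scores = predicate_scores(triples)
selected = select_predicates(triples)
print(selected)
```

Here `type` is used by every instance but always has the same value, and `runtime` is distinctive but rarely present, so only `title` passes both thresholds.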

Future work

Although SLINT achieves good results on the tested datasets, these are not sufficient to evaluate the scalability of our system, which we consider the current limiting point because of the use of a weighted co-occurrence matrix. We will investigate a solution to this issue in future work. We are also interested in automatically configuring every threshold used in SLINT and in extending SLINT into a novel cross-domain interlinking system.

Approach

Positive Aspects: No data available now.

Negative Aspects: No data available now.

Limitations: No data available now.

Challenges: No data available now.

Proposes Algorithm: No data available now.

Methodology: No data available now.

Requirements: No data available now.

Implementations

Download-page: http://ri-www.nii.ac.jp/SLINT/index.html

Access API: No data available now.

Information Representation: No data available now.

Data Catalogue: No data available now.

Runs on OS: No data available now.

Vendor: No data available now.

Uses Framework: No data available now.

Has Documentation URL: No data available now.

Programming Language: No data available now.

Version: No data available now.

Platform: No data available now.

Toolbox: No data available now.

GUI: No

Research Problem

Subproblem of: No data available now.

RelatedProblem: No data available now.

Motivation: No data available now.

Evaluation

Experiment Setup: 2.66 GHz quad-core CPU and 4 GB of memory

Evaluation Method: Compare the system with AgreementMaker, SERIMI, and Zhishi.Links

Hypothesis: No data available now.

Description: No data available now.

Dimensions: Accuracy

Benchmark used: LinkedMDB, GeoNames

Results: SLINT outperforms the other systems on both precision and recall. AgreementMaker's precision is competitive with SLINT's on dataset D3, but its recall is much lower. Zhishi.Links achieves very high results on dataset D3, but the overall F1 score of SLINT is still 0.05 higher.
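For readers unfamiliar with the measures used in this comparison, the sketch below shows how precision, recall, and F1 are conventionally computed for a set of discovered identity pairs against a reference alignment. The function name and the toy pairs are illustrative and are not taken from the paper's datasets.

```python
def precision_recall_f1(found, reference):
    """Standard precision, recall and F1 for sets of (source, target) pairs."""
    found, reference = set(found), set(reference)
    true_positives = len(found & reference)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 4 pairs reported, 4 pairs in the reference, 2 in common.
found = {("a", "x"), ("b", "y"), ("c", "q"), ("d", "r")}
reference = {("a", "x"), ("b", "y"), ("c", "z"), ("d", "w")}
result = precision_recall_f1(found, reference)
print(result)
```

F1 is the harmonic mean of precision and recall, so a system with high precision but much lower recall is penalised in the overall score.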
