Adaptive Integration of Distributed Semantic Web Data
Bibliographical Metadata | |
---|---|
Subject: | Querying Distributed RDF Data Sources |
Year: | 2010 |
Authors: | Steven Lynden, Isao Kojima, Akiyoshi Matono, Yusuke Tanimura |
Venue: | DNIS |
Content Metadata | |
Problem: | SPARQL Query Federation |
Approach: | Distributed Query Processing |
Implementation: | ADERIS |
Evaluation: | Performance Analysis |
Abstract
The use of RDF (Resource Description Framework) data is a cornerstone of the Semantic Web. RDF data embedded in Web pages may be indexed by semantic search engines. However, RDF data is often stored in databases that are accessible via Web Services using the SPARQL query language for RDF; these form part of the Deep Web, which is not accessible to search engines. This paper addresses the problem of effectively integrating RDF data stored in separate Web-accessible databases. An approach based on distributed query processing is described, in which data from multiple repositories are used to construct partitioned tables that are integrated using an adaptive query processing technique supporting join reordering. This limits reliance on statistics and metadata about SPARQL endpoints; such information is often inaccurate or unavailable, yet is required by existing systems supporting federated SPARQL queries. The approach extends existing approaches in this area by allowing tables to be added to the query plan while it is executing, and shows how an approach currently used within relational query processing can be applied to distributed SPARQL query processing. The approach is evaluated using a prototype implementation, and potential applications are discussed.
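To make the idea of tables built from multiple repositories more concrete, the following is a minimal sketch, not the ADERIS implementation. It assumes each endpoint returns a partition of triples, that triples are grouped into one (subject, object) table per predicate, and that two such tables are joined on their subject column. The endpoint contents, the `ex:` identifiers, and the helper names `add_partition` and `join_on_subject` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical result partitions from two separate endpoints (toy data only).
endpoint_a = [
    ("ex:doc1", "ex:topic", "ex:rdf"),
    ("ex:doc2", "ex:topic", "ex:sparql"),
    ("ex:doc1", "ex:year", "2009"),
]
endpoint_b = [
    ("ex:doc2", "ex:year", "2010"),
    ("ex:doc3", "ex:topic", "ex:rdf"),
]


def add_partition(tables, triples):
    """Append one endpoint's partition to the per-predicate tables."""
    for subject, predicate, obj in triples:
        tables[predicate].append((subject, obj))


def join_on_subject(left, right):
    """Hash join of two (subject, object) tables on the subject column."""
    index = defaultdict(list)
    for subject, obj in left:
        index[subject].append(obj)
    return [(s, l_obj, r_obj) for s, r_obj in right for l_obj in index.get(s, [])]


# Each table is the union of the partitions contributed by the endpoints.
tables = defaultdict(list)
for partition in (endpoint_a, endpoint_b):
    add_partition(tables, partition)

rows = join_on_subject(tables["ex:topic"], tables["ex:year"])
print(sorted(rows))
```

In this sketch `ex:doc3` has a topic but no year, so it drops out of the join. A real mediator would obtain the partitions by issuing SPARQL queries to the endpoints rather than from in-memory lists.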
Conclusion
An adaptive framework has been presented for executing queries over multiple SPARQL endpoints; it differs from existing approaches, which use static query optimisation techniques. Many SPARQL web services are currently available, and their number is growing. The work presented in this paper is a framework for executing queries over federations of such services. The framework allows adaptive query processing over dynamically constructed predicate tables to be performed while those tables are still being constructed, and was shown to perform relatively well in unpredictable environments where source query failures may occur. The prototype was evaluated using real data (a subset of DBPedia), showing some advantage in response times for adaptive over non-adaptive methods.
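The contrast between static and adaptive join ordering can be illustrated with a small sketch. This is a simplified greedy stand-in, not the algorithm used in the paper: it assumes predicate tables arrive in batches while the query runs, and at each step it joins whichever available table currently produces the smallest intermediate result, using observed sizes instead of endpoint statistics. The function names, variable names, and data are hypothetical.

```python
def join(rows, table, var_s, var_o):
    """Join variable bindings (a list of dicts) with a (subject, object) table."""
    out = []
    for row in rows:
        for s, o in table:
            # An unbound variable matches anything; a bound one must agree.
            if row.get(var_s, s) == s and row.get(var_o, o) == o:
                out.append({**row, var_s: s, var_o: o})
    return out


def adaptive_join(patterns, batches):
    """Greedy runtime reordering over tables that arrive during execution."""
    available, pending = {}, dict(patterns)
    rows, order = [{}], []
    for batch in batches:                 # new tables join the plan mid-query
        available.update(batch)
        while True:
            ready = [name for name in pending if name in available]
            if not ready:
                break                     # wait for further tables to arrive
            # Try every ready join and keep the smallest observed result.
            candidates = {n: join(rows, available[n], *pending[n]) for n in ready}
            best = min(candidates, key=lambda n: len(candidates[n]))
            rows = candidates[best]
            order.append(best)
            del pending[best]
    return rows, order


# Hypothetical query: ?x ex:type ?t . ?x ex:author ?a . ?a ex:name ?n
patterns = {
    "ex:type": ("?x", "?t"),
    "ex:author": ("?x", "?a"),
    "ex:name": ("?a", "?n"),
}
batches = [
    {"ex:author": [("d1", "p1"), ("d2", "p2"), ("d3", "p1")],
     "ex:type": [("d1", "Paper")]},
    {"ex:name": [("p1", "Ann"), ("p2", "Bob")]},
]

rows, order = adaptive_join(patterns, batches)
print(order)
print(rows)
```

Evaluating every candidate join before choosing one is wasteful and is done here only to keep the sketch short. A table that never arrives, for example because a source query fails, simply stays pending without blocking the joins that can proceed.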
Future work
Future work will investigate larger data sets and data sets with different characteristics. Because the approach presented in this paper focuses on efficiently executing one specific kind of query, namely adaptively ordering multiple joins, further work will focus on optimising other kinds of queries and on supporting more SPARQL query language features. Future work will also investigate how the approach can be applied in various domains.
Approach
Positive Aspects: No data available now.
Negative Aspects: No data available now.
Limitations: No data available now.
Challenges: No data available now.
Proposes Algorithm: No data available now.
Methodology: No data available now.
Requirements: No data available now.
Implementations
Download-page: No data available now.
Access API: No data available now.
Information Representation: RDF
Data Catalogue: Predicate List during setup phase
Runs on OS: OS independent
Vendor: No data available now.
Uses Framework: No data available now.
Has Documentation URL: No data available now.
Programming Language: Java
Version: No data available now.
Platform: -
Toolbox: No data available now.
GUI: Yes
Research Problem
Subproblem of: No data available now.
RelatedProblem: No data available now.
Motivation: No data available now.
Evaluation
Experiment Setup: Endpoint machines are connected to the machine on which the mediator is deployed (2 GHz AMD Athlon X2, 2 GB RAM) via a 100 Mb/s Ethernet LAN.
Evaluation Method: No data available now.
Hypothesis: No data available now.
Description: No data available now.
Dimensions: Performance
Benchmark used: DBPedia
Results: No data available now.