Towards a Knowledge Graph Representing Research Findings by Semantifying Survey Articles

Bibliographical Metadata
Subject: Scholarly communication
Keywords: Semantic Metadata Enrichment, Quality Assessment, Recommendation Services, Scholarly Communication, Semantic Publishing
Year: 2017
Authors: Said Fathalla, Sahar Vahdati, Sören Auer, Christoph Lange
Venue: TPDL
Content Metadata
Problem: Semantifying scholarly artifacts
Approach: Structuring research results via knowledge graph representation
Implementation: SemSur
Evaluation: Questionnaire-based evaluation

Abstract

Despite significant advances in technology, the way research is done and especially communicated has not changed much. We have the vision that ultimately researchers will work on a common knowledge base comprising comprehensive descriptions of their research, thus making research contributions transparent and comparable. The current approach for structuring, systematizing and comparing research results is via survey or review articles. In this article, we describe how surveys for research fields can be represented in a semantic way, resulting in a knowledge graph that describes the individual research problems, approaches, implementations and evaluations in a structured and comparable way. We present a comprehensive ontology for capturing the content of survey articles. We discuss possible applications and present an evaluation of our approach with the retrospective, exemplary semantification of a survey. We demonstrate the utility of the resulting knowledge graph by using it to answer queries about the different research contributions covered by the survey and evaluate how well the query answers serve readers’ information needs, in comparison to having them extract the same information from reading a survey paper.

Conclusion

In this article, we presented SemSur, a Semantic Survey Ontology, and an approach for creating a comprehensive knowledge graph representing research findings. We see this work as an initial step of a long-term research agenda to create a paradigm shift from document-based to knowledge-based scholarly communication. Our vision is to have this work deployed in an extended version of the existing OpenResearch.org platform. We have created instances of three selected surveys on different fields of research using the SemSur ontology. We evaluated our approach involving nine researchers. As we see in the evaluation results, SemSur enables successful retrieval of relevant and accurate results without users having to spend much time and effort compared to traditional ways. This ontology can have a significant influence on the scientific community especially for researchers who want to create a survey article or write literature on a certain topic. The results of our evaluation show that researchers agree that the traditional way of gathering an overview on a particular research topic is cumbersome and time-consuming. Much effort is needed and important information might be easily overlooked. Collaborative integration of research metadata provided by the community supports researchers in this regard. Interviewed domain experts mentioned that it might be necessary to read and understand 30 to 100 scientific articles to get a proper level of understanding or an overview of a topic or sub-topics. A collaboration of researchers as owners of each particular research work to provide a structured and semantic representation of their research achievements can have a huge impact in making their research more accessible. A similar effort is spent on preparing survey and overview articles.

Future work

Integrating our methodology with the procedure of publishing survey articles can help to create a paradigm shift. We plan to further extend the ontology to cover other research methodologies and fields. For a more robust implementation of the proposed approach, we plan to use and significantly expand the OpenResearch.org platform, and to provide user-friendly SPARQL auto-generation services that give non-expert users access to metadata analysis. A more comprehensive evaluation of the services will be done after the curation, exploration and discovery services have been implemented. In addition, we intend to develop and foster a living community around OpenResearch.org and SemSur, to extend the ontology and to ingest metadata to cover other research fields.
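As a rough illustration of what a template-based SPARQL auto-generation service for non-expert users might look like, the following minimal Python sketch fills a fixed query template with a user-supplied label. The namespace `http://example.org/semsur#`, the class `ex:Implementation`, the property `ex:addressesProblem`, and the function names are hypothetical placeholders chosen for this example; they are not taken from the SemSur ontology, and the service planned for OpenResearch.org may work quite differently.

```python
from string import Template

# Hypothetical prefixes: the "ex:" namespace is a placeholder,
# NOT the real SemSur namespace.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX ex: <http://example.org/semsur#>"
)

# One template per question type a non-expert user can choose from.
# Class and property names are illustrative assumptions.
TEMPLATES = {
    "implementations_for_problem": Template(
        "$prefixes\n"
        "SELECT ?impl WHERE {\n"
        "  ?impl a ex:Implementation ;\n"
        "        ex:addressesProblem ?problem .\n"
        "  ?problem rdfs:label \"$label\" .\n"
        "}"
    ),
}


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a SPARQL string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def build_query(question_type: str, label: str) -> str:
    """Return a SPARQL query string for the chosen question type and label."""
    template = TEMPLATES[question_type]
    return template.substitute(prefixes=PREFIXES, label=escape_literal(label))


if __name__ == "__main__":
    print(build_query("implementations_for_problem", "Semantifying scholarly artifacts"))
```

Keeping the query shape fixed and only substituting an escaped literal means the user never has to write SPARQL, and a stray quote in the input cannot change the structure of the query.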

Approach

Positive Aspects: No data available now.

Negative Aspects: No data available now.

Limitations: No data available now.

Challenges: No data available now.

Proposes Algorithm: No data available now.

Methodology: No data available now.

Requirements: No data available now.

Implementations

Download-page: http://sda.tech/SemSur/Documentation/Semsur.html

Access API: -

Information Representation: OWL

Data Catalogue: -

Runs on OS: OS independent

Vendor: Open source

Uses Framework: -

Has Documentation URL: http://sda.tech/SemSur/Documentation/Semsur.html

Programming Language: OWL

Version: 1.0

Platform: -

Toolbox: -

GUI: No

Research Problem

Subproblem of: Semantic publishing

RelatedProblem: Ontologies integration and reuse

Motivation: Making research contributions transparent and comparable.

Evaluation

Experiment Setup: The evaluation began by having the researchers read the given overview questions and try, in their own way, to find the respective answers.

Evaluation Method: Fill in a satisfaction questionnaire.

Hypothesis: Involved researchers should be aware of the domain in use.

Description: We followed these steps:
- A set of 10 predefined natural-language queries was prepared for the evaluation (Table 4). Participants were then asked to try to answer these queries using their own tools and services. The queries were arranged in increasing order of complexity.
- We implemented SPARQL queries corresponding to each of these queries so that non-expert participants, not familiar with SPARQL, could query the knowledge graph.
- We asked the researchers to review the answers to the predefined queries that we formulated based on the SemSur ontology, and to tell us whether they considered the provided answers and the way the queries were formulated comprehensive and reasonable.
- Finally, we asked the same researchers to fill in a satisfaction questionnaire with 18 questions.
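To make the idea of answering a natural-language question from a knowledge graph more concrete, here is a minimal sketch in plain Python. It is not the authors' SPARQL and does not use the SemSur ontology: the triples reuse values from this page's own metadata, while the predicate names (`hasProblem`, `hasImplementation`, and so on) and the functions `match` and `query` are illustrative assumptions. The join over triple patterns mimics, in a very reduced form, what a SPARQL basic graph pattern does.

```python
# Triples as (subject, predicate, object). Values come from this page's
# metadata; the predicate names are illustrative, not SemSur terms.
PAPER = ("Towards a Knowledge Graph Representing Research Findings "
         "by Semantifying Survey Articles")
TRIPLES = [
    (PAPER, "hasProblem", "Semantifying scholarly artifacts"),
    (PAPER, "hasImplementation", "SemSur"),
    (PAPER, "hasEvaluation", "Questionnaire-based evaluation"),
    (PAPER, "hasYear", "2017"),
    ("SemSur", "hasInfoRepresentation", "OWL"),
]


def match(pattern, triples=TRIPLES):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must bind consistently.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


def query(patterns, triples=TRIPLES):
    """Join several triple patterns, similar to a SPARQL basic graph pattern."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Replace already-bound variables by their values before matching.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(bound, triples):
                next_solutions.append({**solution, **binding})
        solutions = next_solutions
    return solutions


# "Which implementation addresses the problem of semantifying scholarly
# artifacts, and in which format is it represented?"
answers = query([
    ("?paper", "hasProblem", "Semantifying scholarly artifacts"),
    ("?paper", "hasImplementation", "?impl"),
    ("?impl", "hasInfoRepresentation", "?format"),
])
for row in answers:
    print(row["?impl"], row["?format"])
```

Each pattern narrows the set of candidate bindings, which is why questions that chain several relations are easy to answer from a graph but hard to answer with keyword search.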

Dimensions: Accuracy

Benchmark used: -

Results: 5 out of the 9 researchers immediately started with well-known standard Web search engines to explore the given topic. They tried several variations of keywords from the questions, e.g., “Federated Query Engines” and “SPARQL Federation”. They also used digital libraries and scientific metadata services, e.g., ACM DL or Microsoft Academic Search, following the same approach and sometimes using advanced search options and filters. However, the retrieved results were out of scope for the question and more closely related to the search keywords. Overall, 8 researchers found it difficult to collect information and reach a conclusive overview of the research topics or related work using current methods. Six of the participants pointed out that for some of the overview questions, search engines were as good as the proposed system, particularly when the framework name is part of the search keyword. They all agreed that for complicated questions our SemSur approach outperformed any existing approach or tool. Seven participants agreed that our system would be helpful for both new and experienced researchers. Two-thirds of them strongly agreed that the time and effort they spent finding such information with our system was relatively low compared to traditional ways. Finally, 100% of the participants would like to use the SemSur approach in their further research for studying the literature of a research topic or writing a survey article. Since the query results were shown to the participants in a table view, the main feedback from all participants about possible improvements was to provide a better way of representing the data.
