Towards a Knowledge Graph Representing Research Findings by Semantifying Survey Articles
Bibliographical Metadata | |
---|---|
Subject: | Scholarly communication |
Keywords: | Semantic Metadata Enrichment, Quality Assessment, Recommendation Services, Scholarly Communication, Semantic Publishing |
Year: | 2017 |
Authors: | Said Fathalla, Sahar Vahdati, Sören Auer, Christoph Lange |
Venue: | TPDL |

Content Metadata | |
---|---|
Problem: | Semantifying scholarly artifacts |
Approach: | Structuring research results via knowledge graph representation |
Implementation: | SemSur |
Evaluation: | Questionnaire-based evaluation |
Abstract
Despite significant advances in technology, the way research is done and especially communicated has not changed much. We have the vision that ultimately researchers will work on a common knowledge base comprising comprehensive descriptions of their research, thus making research contributions transparent and comparable. The current approach for structuring, systematizing and comparing research results is via survey or review articles. In this article, we describe how surveys for research fields can be represented in a semantic way, resulting in a knowledge graph that describes the individual research problems, approaches, implementations and evaluations in a structured and comparable way. We present a comprehensive ontology for capturing the content of survey articles. We discuss possible applications and present an evaluation of our approach with the retrospective, exemplary semantification of a survey. We demonstrate the utility of the resulting knowledge graph by using it to answer queries about the different research contributions covered by the survey and evaluate how well the query answers serve readers’ information needs, in comparison to having them extract the same information from reading a survey paper.
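As a rough illustration of the structure the abstract describes, the sketch below models a research contribution by the four facets named there (problem, approach, implementation, evaluation) and lines two records up facet by facet. The class `Contribution` and the helper `compare` are hypothetical names chosen for this example and are not taken from the SemSur ontology. The first record reuses the metadata listed on this page; the second is a placeholder invented only to show the comparison.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Contribution:
    """One research contribution, described by the four facets named in the abstract."""
    title: str
    problem: str
    approach: str
    implementation: str
    evaluation: str


def compare(a: Contribution, b: Contribution) -> dict:
    """Return, per facet, the pair of values so two contributions can be read side by side."""
    left, right = asdict(a), asdict(b)
    return {facet: (left[facet], right[facet]) for facet in left if facet != "title"}


# Values taken from the metadata on this page.
semsur = Contribution(
    title="Towards a Knowledge Graph Representing Research Findings by Semantifying Survey Articles",
    problem="Semantifying scholarly artifacts",
    approach="Structuring research results via knowledge graph representation",
    implementation="SemSur",
    evaluation="Questionnaire-based evaluation",
)

# Placeholder record, invented purely for illustration.
other = Contribution(
    title="(hypothetical paper)",
    problem="(hypothetical problem)",
    approach="(hypothetical approach)",
    implementation="(hypothetical tool)",
    evaluation="(hypothetical evaluation)",
)

table = compare(semsur, other)
for facet, (mine, theirs) in table.items():
    print(f"{facet:15} | {mine} | {theirs}")
```

Because every record exposes the same facets, comparing two contributions reduces to reading one row per facet, which is the kind of comparability the abstract aims for.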
Conclusion
In this article, we presented SemSur, a Semantic Survey Ontology, and an approach for creating a comprehensive knowledge graph representing research findings. We see this work as an initial step of a long-term research agenda to create a paradigm shift from document-based to knowledge-based scholarly communication. Our vision is to have this work deployed in an extended version of the existing OpenResearch.org platform. We have created instances of three selected surveys on different fields of research using the SemSur ontology. We evaluated our approach involving nine researchers. As we see in the evaluation results, SemSur enables successful retrieval of relevant and accurate results without users having to spend much time and effort compared to traditional ways. This ontology can have a significant influence on the scientific community especially for researchers who want to create a survey article or write literature on a certain topic. The results of our evaluation show that researchers agree that the traditional way of gathering an overview on a particular research topic is cumbersome and time-consuming. Much effort is needed and important information might be easily overlooked. Collaborative integration of research metadata provided by the community supports researchers in this regard. Interviewed domain experts mentioned that it might be necessary to read and understand 30 to 100 scientific articles to get a proper level of understanding or an overview of a topic or sub-topics. A collaboration of researchers as owners of each particular research work to provide a structured and semantic representation of their research achievements can have a huge impact in making their research more accessible. A similar effort is spent on preparing survey and overview articles.
Future work
Integrating our methodology with the procedure of publishing survey articles can help to create a paradigm shift. We plan to further extend the ontology to cover other research methodologies and fields. For a more robust implementation of the proposed approach, we are planning to use and significantly expand the OpenResearch.org platform and to provide user-friendly SPARQL auto-generation services that give non-expert users access to metadata analysis. A more comprehensive evaluation of the services will be done after the implementation of the curation, exploration and discovery services. In addition, our intention is to develop and foster a living community around OpenResearch.org and SemSur, to extend the ontology and to ingest metadata to cover other research fields.
Approach
Positive Aspects: No data available now.
Negative Aspects: No data available now.
Limitations: No data available now.
Challenges: No data available now.
Proposes Algorithm: No data available now.
Methodology: No data available now.
Requirements: No data available now.
Implementations
Download-page: http://sda.tech/SemSur/Documentation/Semsur.html
Access API: -
Information Representation: OWL
Data Catalogue: -
Runs on OS: OS independent
Vendor: Open source
Uses Framework: -
Has Documentation URL: http://sda.tech/SemSur/Documentation/Semsur.html
Programming Language: OWL
Version: 1.0
Platform: -
Toolbox: -
GUI: No
Research Problem
Subproblem of: Semantic publishing
RelatedProblem: Ontologies integration and reuse
Motivation: Making research contributions transparent and comparable.
Evaluation
Experiment Setup: In the first phase of the evaluation, researchers read the given overview questions and tried to find the respective answers in their own way.
Evaluation Method: Participants filled in a satisfaction questionnaire.
Hypothesis: Participating researchers should be familiar with the domain in use.
Description: We followed these steps:
– A set of 10 predefined natural-language queries was prepared for the evaluation (Table 4). Participants were then asked to try to answer these queries using their own tools and services. The queries were chosen in increasing order of complexity.
– We implemented SPARQL queries corresponding to each of these queries so that non-expert participants, not familiar with SPARQL, could query the knowledge graph.
– We asked researchers to review the answers to the predefined queries that we formulated based on the SemSur ontology, and to tell us whether they considered the provided answers and the way the queries are formulated comprehensive and reasonable.
– We finally asked the same researchers to fill in a satisfaction questionnaire with 18 questions.
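The pairing of each natural-language question with a prepared query, as described in the steps above, can be sketched as follows. This is a minimal illustration in plain Python rather than SPARQL, and it is not the authors' implementation. The triples reuse facts stated on this page, while the identifiers (`paper:SemSur2017`, `impl:SemSur`), the property names (`hasProblem`, `hasEvaluation`, and so on) and the helpers `match` and `answer` are assumptions made for the example, not terms confirmed from the SemSur ontology.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Tiny in-memory graph; facts come from this page, identifiers and
# property names are hypothetical.
GRAPH: List[Triple] = [
    ("paper:SemSur2017", "hasProblem", "Semantifying scholarly artifacts"),
    ("paper:SemSur2017", "hasImplementation", "SemSur"),
    ("paper:SemSur2017", "hasEvaluation", "Questionnaire-based evaluation"),
    ("impl:SemSur", "hasInfoRepresentation", "OWL"),
]


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard, like a query variable."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Each predefined question is paired with a prepared pattern, so a
# participant only picks a question and never writes a query.
PREPARED_QUERIES = {
    "Which problem does the paper address?":
        ("paper:SemSur2017", "hasProblem", None),
    "Which papers use a questionnaire-based evaluation?":
        (None, "hasEvaluation", "Questionnaire-based evaluation"),
}


def answer(question: str) -> List[Triple]:
    """Look up the prepared pattern for a question and run it against the graph."""
    s, p, o = PREPARED_QUERIES[question]
    return match(GRAPH, s, p, o)


for q in PREPARED_QUERIES:
    print(q, "->", answer(q))
```

In a real deployment the patterns would be SPARQL queries run against the knowledge graph; the lookup table stands in for that step only to show how non-expert users can be shielded from the query language.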
Dimensions: Accuracy
Benchmark used: -
Results: Five of the nine researchers immediately started with well-known standard Web search engines to explore the given topic. They tried several variations of keywords from the questions, e.g., “Federated Query Engines”, “SPARQL Federation”, etc. They also used digital libraries and scientific metadata services, e.g., ACM DL or Microsoft Academic Search, following the same approach and sometimes using advanced search options and filters. However, the retrieved results were out of scope for the question, being more closely related to the search keywords. Overall, eight researchers found it difficult to collect information and reach a conclusive overview of the research topics or related work using current methods. Six of the participants pointed out that for some of the overview questions, search engines were as good as the proposed system, particularly when the framework name is part of the search keyword. They all agreed that for complicated questions our SemSur approach outperformed any existing approach or tool. Seven participants agreed that our system would be helpful for both new and experienced researchers. Two-thirds of them strongly agreed that the time and effort they spent to find such information using our system is relatively low in comparison to traditional ways. Finally, 100% of the participants would like to use the SemSur approach in their further research for studying the literature of a research topic or writing a survey article. Since the results of queries were shown to the participants in a table view, the main feedback from all participants about possible improvements was to provide a better way of data representation.
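For orientation, the participant counts reported above can be converted to shares of the nine participants. The snippet only restates figures from the text; the dictionary labels are shorthand chosen for this example.

```python
PARTICIPANTS = 9

# Counts as reported in the results; labels are informal shorthand.
counts = {
    "started with Web search engines": 5,
    "found current methods difficult": 8,
    "search engines as good for some questions": 6,
    "helpful for new and experienced researchers": 7,
}


def share(count: int, total: int = PARTICIPANTS) -> float:
    """Percentage of participants, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}
for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{PARTICIPANTS} = {pct}%")
```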