WikiData Scientific Events
Introduction
- Simon Cobb
- Wolfgang Fahl
Simon Cobb
- System administrator, University of Exeter
- Wikidata editor in spare time
- Wikicite aspect
- Big import from ORCID data - affiliations
Wolfgang Fahl
- BITPlan
Confident and PTP
- http://confident.bitplan.com/
- https://confident.dbis.rwth-aachen.de/dblpconf/sample/dblp/inproceedings/1000
- http://ptp.bitplan.com
Conference Lookup e.g.
- https://rq.bitplan.com/index.php/Full_list_of_References
- https://rq.bitplan.com/index.php/AIDA#_SCITEe1bb391f4ac042a49ccdd81498dbf12e
- https://cr.bitplan.com/index.php/ISWC_2020
- https://www.semantic-mediawiki.org/wiki/List_of_Attendees
WikiData Import
- Main_Page#Concrete
- WikiData
- https://www.openresearch.org/wiki/Property:DblpSeries
- Data_Donations
- dblp conference xml dump import
- Microsoft Academic Graph dump - see ConferenceInstances.txt and ConferenceSeries.txt
Task
Import Scientific Event Series/Event instances into Wikidata as a pseudo-PID provider
Goals
- import Data Donations to Wikidata as a PID provider so that later DataCite PID creation is simpler
- reuse Wikidata curation infrastructure and resources
- Define minimum viable data set
Procedure
- Agree on Datasets to be imported
- Test importing a small sample to find out how comfortable we are with the quality of the result
- as minimum viable dataset reuse PTP elements
The Proceedings Title Parser parses Natural Language Proceedings titles to supply the following metadata elements in digital form:
- title
- acronym
- series
- homepage (+ archive link and state (blue link/red link))
- ordinal
- year
- month
- country
- region
- city
- startdate
- enddate
- frequency
- event type
- did happen?
- indicator of trust (1: good, 0: predatory)
- provenance identifiers (dblp, openresearch, GND, wikidata, ...)
- proceedings identifiers (url, doi, ...)
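The metadata elements above could be held in a single record type while a minimum viable data set is being agreed on. The following is a minimal sketch only: the class name `EventRecord`, the field names, the types, and the default set of required fields are assumptions made for illustration and are not taken from the Proceedings Title Parser itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EventRecord:
    """One scientific event carrying the metadata elements listed above.

    Hypothetical structure: names and types are assumptions, not the PTP schema.
    """
    title: str
    acronym: Optional[str] = None
    series: Optional[str] = None
    homepage: Optional[str] = None
    ordinal: Optional[int] = None
    year: Optional[int] = None
    month: Optional[int] = None
    country: Optional[str] = None
    region: Optional[str] = None
    city: Optional[str] = None
    start_date: Optional[str] = None  # ISO 8601 date string
    end_date: Optional[str] = None    # ISO 8601 date string
    frequency: Optional[str] = None
    event_type: Optional[str] = None
    did_happen: Optional[bool] = None
    trust: Optional[int] = None       # 1: good, 0: predatory
    provenance_ids: Dict[str, str] = field(default_factory=dict)   # e.g. {"dblp": "..."}
    proceedings_ids: Dict[str, str] = field(default_factory=dict)  # e.g. {"doi": "..."}

    def missing(self, required=("title", "acronym", "series", "year")) -> List[str]:
        """Return the names of required fields that are still empty."""
        return [name for name in required if getattr(self, name) in (None, "")]


record = EventRecord(
    title="International Semantic Web Conference", acronym="ISWC", year=2020
)
print(record.missing())  # ['series']
```

Which fields count as required is exactly the open question behind "Define minimum viable data set", so the `required` tuple is a parameter rather than a fixed rule.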
Progress
MVP Dataset
First import
- https://www.openresearch.org/wiki/List_of_DblpEventSeries (records where dblpseries is set only)
- Three Wikidata items updated with extra information and three new items created.
OPENRESEARCH | Event Series | Wikidata item
3DUI | IEEE Symposium on 3D User Interfaces | http://www.wikidata.org/entity/Q105456162
ACNS | International Conference on Applied Cryptography and Network Security | http://www.wikidata.org/entity/Q4781524
ASE | IEEE/ACM International Conference on Automated Software Engineering | http://www.wikidata.org/entity/Q17087684
ICRE | International Conference on Requirements Engineering | http://www.wikidata.org/entity/Q105456163
RE | IEEE International Requirements Engineering Conference | http://www.wikidata.org/entity/Q18358216
VR | IEEE Virtual Reality Conference and 3D User Interfaces | http://www.wikidata.org/entity/Q105455915
Second Step
- Identify and improve existing conference series items in Wikidata without DBLP Venue ID.
OPENRESEARCH / Acronym | Event Series | Wikidata item
AAAI | AAAI Conference on Artificial Intelligence | https://www.wikidata.org/entity/Q56682083
RecSys | ACM Conference on Recommender Systems | https://www.wikidata.org/entity/Q20888918
MobiHoc | ACM International Symposium on Mobile Ad Hoc Networking and Computing | https://www.wikidata.org/entity/Q6053806
ISPD | ACM International Symposium on Physical Design | https://www.wikidata.org/entity/Q6053813
UIST | ACM Symposium on User Interface Software and Technology | https://www.wikidata.org/entity/Q15994864
AFIPS NCC | AFIPS National Computer Conferences | https://www.wikidata.org/entity/Q6269156
COLT | Annual Conference Computational Learning Theory | https://www.wikidata.org/entity/Q75702617
MICRO | Annual IEEE/ACM International Symposium on Microarchitecture | https://www.wikidata.org/entity/Q6053802
RECOMB | Annual International Conference on Research in Computational Molecular Biology | https://www.wikidata.org/entity/Q16335149
CPM | Annual Symposium on Combinatorial Pattern Matching | https://www.wikidata.org/entity/Q30753285
ASP-DAC | Asia and South Pacific Design Automation Conference | https://www.wikidata.org/entity/Q4806492
BMVC | British Machine Vision Conference | https://www.wikidata.org/entity/Q18210447
BIS | Business Information Systems | https://www.wikidata.org/entity/Q18575820
ICoC | CCF Internet Conference of China | https://www.wikidata.org/entity/Q19854299
CLEF | Conference and Labs of the Evaluation Forum | https://www.wikidata.org/entity/Q5159889
CoLIS | Conference on Conceptions of Library and Information Sciences | https://www.wikidata.org/entity/Q5158423
EMNLP | Conference on Empirical Methods in Natural Language Processing | https://www.wikidata.org/entity/Q18353514
FAccT | Conference on Fairness, Accountability and Transparency | https://www.wikidata.org/entity/Q61312648
CIDR | Conference on Innovative Data Systems Research | https://www.wikidata.org/entity/Q5159947
DCC | Data Compression Conference | https://www.wikidata.org/entity/Q90429175
BTW | Datenbanksysteme für Business, Technologie und Web | https://www.wikidata.org/entity/Q25392673
DAC | Design Automation Conference | https://www.wikidata.org/entity/Q1529700
DATE | Design, Automation, and Test in Europe | https://www.wikidata.org/entity/Q5264252
DEBS | Distributed Event-Based Systems | https://www.wikidata.org/entity/Q5283117
EWDTS | East-West Design & Test Symposium | https://www.wikidata.org/entity/Q65214297
EVA | Electronic Information, the Visual Arts and Beyond Conference | https://www.wikidata.org/entity/Q5324595
SGP | Eurographics Symposium on Geometry Processing | https://www.wikidata.org/entity/Q7661884
ECCB | European Conference on Computational Biology | https://www.wikidata.org/entity/Q5412432
ETAPS | European Joint Conferences on Theory And Practice of Software | https://www.wikidata.org/entity/Q17085589
ESSIR | European Summer School in Information Retrieval | https://www.wikidata.org/entity/Q5413255
FSE | Fast Software Encryption Workshop | https://www.wikidata.org/entity/Q5436988
FCRC | Federated Computing Research Conference | https://www.wikidata.org/entity/Q5440680
SEAMS | ICSE Workshop on Software Engineering for Adaptive and Self-Managing Systems | https://www.wikidata.org/entity/Q7554201
INFOCOM | IEEE Conference on Computer Communications | https://www.wikidata.org/entity/Q5159938
CICC | IEEE Custom Integrated Circuits Conference | https://www.wikidata.org/entity/Q5196391
ETS | IEEE European Test Symposium | https://www.wikidata.org/entity/Q5413286
ICASSP | IEEE International Conference on Acoustics, Speech, and Signal Processing | https://www.wikidata.org/entity/Q6049570
ICDAR | IEEE International Conference on Document Analysis and Recognition | https://www.wikidata.org/entity/Q18626007
ICRA | IEEE International Conference on Robotics and Automation | https://www.wikidata.org/entity/Q6049675
IEEE ICWS | IEEE International Conference on Web Services | https://www.wikidata.org/entity/Q6049702
ISSCC | IEEE International Solid-State Circuits Conference | https://www.wikidata.org/entity/Q1666925
ISWC | IEEE International Symposium on Wearable Computers | https://www.wikidata.org/entity/Q6053821
WFCS | IEEE International Workshop on Factory Communication Systems | https://www.wikidata.org/entity/Q39055914
ARITH | IEEE Symposium on Computer Arithmetic | https://www.wikidata.org/entity/Q21015563
VIS | IEEE Visualization Conference | https://www.wikidata.org/entity/Q30633629
ISC | ISC High Performance | https://www.wikidata.org/entity/Q473962
IMS | Intelligent Memory Systems | https://www.wikidata.org/entity/Q5970402
ICCST | International Carnahan Conference on Security Technology | https://www.wikidata.org/entity/Q85769626
ICMC | International Computer Music Conference | https://www.wikidata.org/entity/Q4288208
COCOON | International Computing and Combinatorics Conference | https://www.wikidata.org/entity/Q30752700
SC | International Conference for High Performance Computing, Networking, Storage and Analysis | https://www.wikidata.org/entity/Q4392761
ARES | International Conference on Availability, Reliability and Security | https://www.wikidata.org/entity/Q6049591
BPM | International Conference on Business Process Management | https://www.wikidata.org/entity/Q28404180
ICCAD | International Conference on Computer Aided Design | https://www.wikidata.org/entity/Q15995196
CASA | International Conference on Computer Animation and Social Agents | https://www.wikidata.org/entity/Q16950795
CIT | International Conference on Computer and Information Technology | https://www.wikidata.org/entity/Q6049604
CONCUR | International Conference on Concurrency Theory | https://www.wikidata.org/entity/Q65119044
CSS | International Conference on Cryptography and Security Systems | https://www.wikidata.org/entity/Q47482923
ELPUB | International Conference on Electronic Publishing | https://www.wikidata.org/entity/Q54932105
CEAS | International Conference on Email and Anti-Spam | https://www.wikidata.org/entity/Q5159940
MEMOCODE | International Conference on Formal Methods and Models for Co-Design | https://www.wikidata.org/entity/Q63441024
CHI | International Conference on Human Factors in Computing Systems | https://www.wikidata.org/entity/Q781419
MobileHCI | International Conference on Human-Computer Interaction with Mobile Devices and Services | https://www.wikidata.org/entity/Q841716
K-CAP | International Conference on Knowledge Capture | https://www.wikidata.org/entity/Q64852380
REV | International Conference on Remote Engineering and Virtual Instrumentation | https://www.wikidata.org/entity/Q6049667
ICSOC | International Conference on Service Oriented Computing | https://www.wikidata.org/entity/Q25109670
ICSE | International Conference on Software Engineering | https://www.wikidata.org/entity/Q6049689
ICSEng | International Conference on Systems Engineering | https://www.wikidata.org/entity/Q17149596
TSD | International Conference on Text, Speech and Dialogue | https://www.wikidata.org/entity/Q7708376
TPDL | International Conference on Theory and Practice of Digital Libraries | https://www.wikidata.org/entity/Q5412433
ICEGOV | International Conference on Theory and Practice of Electronic Governance | https://www.wikidata.org/entity/Q6049696
Middleware | International Middleware Conference | https://www.wikidata.org/entity/Q6052020
ISAAC | International Symposium on Algorithms and Computation | https://www.wikidata.org/entity/Q16886853
ISMAR | International Symposium on Mixed and Augmented Reality | https://www.wikidata.org/entity/Q15995207
SYSTOR | International Systems and Storage Conference | https://www.wikidata.org/entity/Q18351895
ITC | International Teletraffic Congress | https://www.wikidata.org/entity/Q17092593
Louhi | International Workshop on Health Text Mining and Information Analysis | https://www.wikidata.org/entity/Q64936507
MLSP | International Workshop on Machine Learning for Signal Processing | https://www.wikidata.org/entity/Q56422379
OHS | International Workshop on Open Hypertext Systems | https://www.wikidata.org/entity/Q62932586
IWOCL | International Workshop on OpenCL | https://www.wikidata.org/entity/Q17047824
PRNI | International Workshop on Pattern Recognition in Neuroimaging | https://www.wikidata.org/entity/Q52769899
LAGOS | Latin-American Algorithms, Graphs and Optimization Symposium | https://www.wikidata.org/entity/Q30752285
MUC | Message Understanding Conference | https://www.wikidata.org/entity/Q1923926
NIME | New Interfaces for Musical Expression | https://www.wikidata.org/entity/Q4326497
PSB | Pacific Symposium on Biocomputing | https://www.wikidata.org/entity/Q7122721
PRIB | Pattern Recognition in Bioinformatics | https://www.wikidata.org/entity/Q16254527
Tapia | Richard Tapia Celebration of Diversity in Computing Conference | https://www.wikidata.org/entity/Q24964735
SCIA | Scandinavian Conference on Image Analysis | https://www.wikidata.org/entity/Q7429952
SMI | Shape Modeling International Conference | https://www.wikidata.org/entity/Q48968750
SOSP | Symposium on Operating Systems Principles | https://www.wikidata.org/entity/Q7661886
SPIRE | Symposium on String Processing and Information Retrieval | https://www.wikidata.org/entity/Q90416626
SIGCSE | Technical Symposium on Computer Science Education | https://www.wikidata.org/entity/Q16915368
TREC | Text Retrieval Conference | https://www.wikidata.org/entity/Q745623
USENIX | USENIX Annual Technical Conference | https://www.wikidata.org/entity/Q4174094
VRIC | Virtual Reality International Conference | https://www.wikidata.org/entity/Q15994865
VRST | Virtual Reality Software and Technology | https://www.wikidata.org/entity/Q2527904
WebSci | Web Science Conference | https://www.wikidata.org/entity/Q31836276
WM | Wissensmanagement | https://www.wikidata.org/entity/Q65014677
CHES | Workshop on Cryptographic Hardware and Embedded Systems | https://www.wikidata.org/entity/Q4035617
RepEval | Workshop on Evaluating Vector-Space Representations for NLP | https://www.wikidata.org/entity/Q60642955
BEA | Workshop on Innovative Use of NLP for Building Educational Applications | https://www.wikidata.org/entity/Q77325313
RepL4NLP | Workshop on Representation Learning for NLP | https://www.wikidata.org/entity/Q59310777
SWAT4LS | Workshop on Semantic Web Applications and Tools for Life Sciences | https://www.wikidata.org/entity/Q56846035
XP | XP/Agile Universe | https://www.wikidata.org/entity/Q47484880
OPENRESEARCH / Acronym | Event Series | Wikidata item
AAAI | AAAI Conference on Artificial Intelligence | https://www.wikidata.org/entity/Q56682083
KDD | ACM SIGKDD Conference on Knowledge Discovery and Data Mining | https://www.wikidata.org/entity/Q66305862
SAC | ACM Symposium on Applied Computing | https://www.wikidata.org/entity/Q105491087
ACL | Annual Meeting of the Association for Computational Linguistics | https://www.wikidata.org/entity/Q48620041
BIS | Business Information Systems | https://www.wikidata.org/entity/Q18575820
ESWC | Extended Semantic Web Conference | https://www.wikidata.org/entity/Q17012957
ICAC | IEEE International Conference on Autonomic Computing | https://www.wikidata.org/entity/Q105491169
EKAW | International Conference Knowledge Engineering and Knowledge Management | https://www.wikidata.org/entity/Q105491170
AINA | International Conference on Advanced Information Networking and Applications | https://www.wikidata.org/entity/Q105491209
ACNS | International Conference on Applied Cryptography and Network Security | https://www.wikidata.org/entity/Q4781524
ASPLOS | International Conference on Architectural Support for Programming Languages and Operating Systems | https://www.wikidata.org/entity/Q6049586
IEEE BigDataService | International Conference on Big Data Computing Service and Applications | https://www.wikidata.org/entity/Q105491171
ICCCI | International Conference on Computational Collective Intelligence | https://www.wikidata.org/entity/Q105491172
K-CAP | International Conference on Knowledge Capture | https://www.wikidata.org/entity/Q64852380
KESW | International Conference on Knowledge Engineering and the Semantic Web | https://www.wikidata.org/entity/Q105491173
LREC | International Conference on Language Resources and Evaluation | https://www.wikidata.org/entity/Q3206140
LAK | International Conference on Learning Analytics and Knowledge | https://www.wikidata.org/entity/Q105491174
ICML | International Conference on Machine Learning | https://www.wikidata.org/entity/Q17087718
MTSR | International Conference on Metadata and Semantics Research | https://www.wikidata.org/entity/Q105491176
TPDL | International Conference on Theory and Practice of Digital Libraries | https://www.wikidata.org/entity/Q5412433
ICWE | International Conference on Web Engineering | https://www.wikidata.org/entity/Q105491177
WISE | International Conference on Web Information Systems Engineering | https://www.wikidata.org/entity/Q105491178
DMS | International Distributed Multimedia Systems Conference on Visualization and Visual Languages | https://www.wikidata.org/entity/Q105491179
ISWC | International Semantic Web Conference | https://www.wikidata.org/entity/Q6053150
TREC | Text Retrieval Conference | https://www.wikidata.org/entity/Q745623
WWW | The Web Conference | https://www.wikidata.org/entity/Q3570023
WIMS | Web Intelligence, Mining and Semantics | https://www.wikidata.org/entity/Q105491180
WSDM | Web Search and Data Mining | https://www.wikidata.org/entity/Q105491181
ECDL | European Conference on Research and Advanced Technology for Digital Libraries | https://www.wikidata.org/entity/Q105491257
LDOW | Workshop on Linked Data on the Web | https://www.wikidata.org/entity/Q105491258
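The Second Step (finding existing conference series items without a DBLP Venue ID) could be supported by a query against the Wikidata Query Service. The sketch below only builds the SPARQL text in Python and does not send it. The function name is hypothetical, and the IDs used as defaults (`Q47258130` for the series class, `P8926` for the DBLP venue ID, `P1813` for the short name) are assumptions that should be checked on wikidata.org before use.

```python
def series_without_dblp_query(
    series_class: str = "Q47258130",  # assumed: class used for conference series items
    dblp_property: str = "P8926",     # assumed: DBLP venue ID property
    limit: int = 100,
) -> str:
    """Build a SPARQL query listing series items that lack the DBLP property."""
    return f"""
SELECT ?series ?seriesLabel ?shortName WHERE {{
  ?series wdt:P31 wd:{series_class} .
  OPTIONAL {{ ?series wdt:P1813 ?shortName . }}
  FILTER NOT EXISTS {{ ?series wdt:{dblp_property} ?dblp . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


print(series_without_dblp_query(limit=5))
```

The printed query can be pasted into the query service for a manual check; the result list would then be compared against the acronyms in the tables above.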
Combined DBLP and OpenResearch dataset
OpenResearch and Microsoft Academic data were matched with DBLP conference series data in this spreadsheet. Its 296 rows represent the strongest matches between DBLP and OpenResearch data and are part of a larger dataset with 5,385 series. The spreadsheet includes columns with the results of exact match, fingerprint, n-gram fingerprint, and difflib comparisons of the event series titles. In the difflib columns the values are scored as follows: match = 1, very similar > 0.9, quite similar > 0.8. The dataset is to be reviewed and discussed before ingest.
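The title comparisons described above can be approximated with the Python standard library. This is a sketch under assumptions: the fingerprint functions follow the general idea of key-collision clustering (lowercase, drop punctuation, sort and de-duplicate tokens or character n-grams) and may differ in detail from the method used for the spreadsheet; all function names are hypothetical. Only the difflib thresholds come from the text.

```python
import difflib
import re
import string
import unicodedata


def _ascii_fold(text: str) -> str:
    """Strip diacritics so that e.g. 'für' and 'fur' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fingerprint(title: str) -> str:
    """Lowercase, replace punctuation by spaces, sort and de-duplicate tokens."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = _ascii_fold(title).lower().translate(table).split()
    return " ".join(sorted(set(tokens)))


def ngram_fingerprint(title: str, n: int = 2) -> str:
    """Sorted, de-duplicated character n-grams of the alphanumeric characters."""
    chars = re.sub(r"[^a-z0-9]", "", _ascii_fold(title).lower())
    grams = {chars[i:i + n] for i in range(len(chars) - n + 1)}
    return "".join(sorted(grams))


def difflib_score(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] from difflib.SequenceMatcher."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def label(score: float) -> str:
    """Apply the thresholds given above to a difflib score."""
    if score == 1:
        return "match"
    if score > 0.9:
        return "very similar"
    if score > 0.8:
        return "quite similar"
    return "no match"


a, b = "Text Retrieval Conference", "Text REtrieval Conferences"
print(fingerprint(a) == fingerprint("Conference, Text  retrieval"))  # True
print(label(difflib_score(a, b)))  # very similar
```

Fingerprints catch reordered or differently punctuated titles, while the difflib ratio catches small spelling differences, which is why the spreadsheet keeps both kinds of column.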
Tools
- OpenRefine
- QuickStatements for upload
- Larger batches via scripts in bash - curl requests
- https://github.com/WolfgangFahl/py-3rdparty-mediawiki
- https://pypi.org/project/pylodstorage/
- https://pypi.org/project/geograpy3/
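For the QuickStatements upload, the rows of a reviewed table have to be turned into tab-separated commands. The following sketch generates QuickStatements V1 lines for one event series. The function name is hypothetical, and the property and class IDs (`P31`, `P1813`, `P8926`, `Q47258130`) are assumptions to verify on wikidata.org; the DBLP venue ID value in the usage example is illustrative only.

```python
from typing import List, Optional

# Assumed Wikidata IDs; verify each one before running a real batch.
P_INSTANCE_OF = "P31"
P_SHORT_NAME = "P1813"
P_DBLP_VENUE_ID = "P8926"
Q_CONFERENCE_SERIES = "Q47258130"


def quickstatements_rows(
    label: str, acronym: str, dblp_venue_id: str, qid: Optional[str] = None
) -> List[str]:
    """Return QuickStatements V1 lines that create or update one series item.

    With qid=None a new item is created and referred to as LAST;
    otherwise only the statements are added to the existing item.
    Labels containing double quotes are not escaped in this sketch.
    """
    subject = qid or "LAST"
    rows: List[str] = []
    if qid is None:
        rows.append("CREATE")
        rows.append("\t".join([subject, "Len", f'"{label}"']))
        rows.append("\t".join([subject, P_INSTANCE_OF, Q_CONFERENCE_SERIES]))
    rows.append("\t".join([subject, P_SHORT_NAME, f'en:"{acronym}"']))
    rows.append("\t".join([subject, P_DBLP_VENUE_ID, f'"{dblp_venue_id}"']))
    return rows


# Update an existing item from the first import (identifier value is illustrative).
for row in quickstatements_rows(
    "International Conference on Applied Cryptography and Network Security",
    "ACNS",
    "conf/acns",
    qid="Q4781524",
):
    print(row)
```

Generating the commands as plain text keeps a reviewable artifact between the matching step and the actual upload, which fits the "review and discuss before ingest" rule.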
Followup Issues
Instance of
Identifiers for events
Duplicate Removal
- Example:
- http://www.wikidata.org/entity/Q50566474
- http://www.wikidata.org/entity/Q100998436 (merged with Q50566474 2021-02-14)
- https://www.wikidata.org/wiki/Help:Merge
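Merge candidates such as the pair above could be found by looking for items that share the same external identifier. The sketch below groups (QID, identifier) pairs and reports identifiers used by more than one item. The function name and the identifier values in the usage example are hypothetical; only the QIDs are taken from this page.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def duplicate_candidates(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group QIDs by external identifier; keep identifiers shared by several items.

    QIDs are sorted numerically so the lowest-numbered item comes first.
    """
    by_identifier = defaultdict(set)
    for qid, identifier in pairs:
        by_identifier[identifier].add(qid)
    return {
        identifier: sorted(qids, key=lambda q: int(q[1:]))
        for identifier, qids in by_identifier.items()
        if len(qids) > 1
    }


# Identifier values are placeholders, not real data.
pairs = [
    ("Q100998436", "conf/example"),
    ("Q50566474", "conf/example"),
    ("Q745623", "conf/other"),
]
print(duplicate_candidates(pairs))
# {'conf/example': ['Q50566474', 'Q100998436']}
```

Each reported group is only a candidate; the actual merge remains a manual decision following Help:Merge.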
Quality and Trust Check