LD Connect is a Linked Data portal for IOS Press scientometrics. It contains all IOS Press bibliographic data, enriched with geographic information. The work is funded by IOS Press in collaboration with the STKO lab at UC Santa Barbara. A SPARQL endpoint for querying LD Connect is published at http://ld.iospress.nl:7200. This documentation describes the shared ontology, the embeddings, and the scientometric system, along with instructions on how to reuse it. More information is provided in our paper "LD Connect: A Linked Data Portal for IOS Press Scientometrics".
The ontology file can be found at data/ontology/ontology.ttl. The two schema diagrams below show the ontology fragments for iospress:Publication and iospress:Contributor, respectively. In addition, data/triples contains a recent selection of triples extracted from LD Connect for convenient reuse: categories.ttl contains the mapping between each iospress:Journal and its corresponding iospress:Category, geocoded.ttl contains geocoded information about each iospress:Organization, and triplify-union.ttl contains the union of all triples in LD Connect at the time of data collection.
Semantic search is available at http://ld.iospress.nl/explore/semantic-search/. The sample SPARQL query below retrieves information about papers whose first author is affiliated with an organization located in China.
select ?title (group_concat(?keyword; separator=',') as ?keywords)
       ?year ?journal ?first_author_name ?org_name
{
  ?paper iospress:publicationTitle ?title ;
         iospress:publicationIncludesKeyword ?keyword ;
         iospress:publicationDate ?date ;
         iospress:articleInIssue/iospress:issueInVolume/iospress:volumeInJournal ?journal ;
         iospress:publicationAuthorList ?author_list .
  ?author_list rdf:_0 ?first_author .
  ?first_author iospress:contributorFullName ?first_author_name ;
                iospress:contributorAffiliation ?org .
  ?org iospress:geocodingInput ?org_name ;
       iospress:geocodingOutput/iospress-geocode:country ?org_country .
  bind(year(?date) as ?year)
  values ?org_country { "China"@en }
}
group by ?title ?year ?journal ?first_author_name ?org_name
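A query like the one above can also be sent to the endpoint programmatically. The following is a minimal Python sketch that uses only the standard library and the SPARQL 1.1 Protocol (a GET request with a `query` parameter, asking for results in SPARQL JSON format). The path in `ENDPOINT` is a placeholder and is not taken from this documentation; the exact query URL under http://ld.iospress.nl:7200 is an assumption to verify against the server. The helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: the path after the host is a placeholder; substitute the real
# query URL exposed under http://ld.iospress.nl:7200.
ENDPOINT = "http://ld.iospress.nl:7200/repositories/REPOSITORY_ID"


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({name: cell["value"] for name, cell in binding.items()})
    return rows


def run_query(endpoint, query, timeout=30):
    """Send the query and return its rows (requires network access)."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(json.load(response))


# Offline illustration with a hand-written result document (not real data).
sample = {
    "head": {"vars": ["title", "year"]},
    "results": {"bindings": [
        {"title": {"type": "literal", "value": "Example title"},
         "year": {"type": "literal", "value": "2020"}},
    ]},
}
rows = parse_bindings(sample)
print(rows)
```

Once the real query URL is known, `run_query(ENDPOINT, query_text)` would return one dict per result row, keyed by the variable names in the select clause.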
A version of the pre-trained embeddings is located in data/embeddings/. Document embeddings are provided in plain text format (see data/embeddings/IOS-Doc2Vec-TXT/): doc2vec.txt is the Doc2Vec model, and doc2vec_voc.txt lists the paper entity URLs of the document embeddings; w2v.txt is the corresponding Word2Vec model, and w2v_voc.txt lists the vocabulary of the word embeddings. Knowledge graph embeddings are also provided in plain text format (see data/embeddings/IOS-TransE/). Specifically, the graph embeddings in TransE_person.txt cover contributor information. In addition, entity_sameAs_merge_mapping_iri.json is a JSON file that records how identical entities (e.g., contributors and affiliations) are linked after co-reference resolution. All embeddings have 200 dimensions.
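A minimal Python sketch for reading the plain text embeddings follows. It assumes, without confirmation from the files themselves, that each line of an embedding file holds one whitespace-separated vector and that the i-th line of the matching vocabulary file names the paper URL or word for the i-th vector. The loader raises an error rather than guessing when a line does not match the expected dimension. `load_embeddings` is an illustrative helper, not part of the repository.

```python
import io

EMBEDDING_DIM = 200  # dimension stated in this documentation


def load_embeddings(vec_file, voc_file, dim=EMBEDDING_DIM):
    """Pair each vocabulary entry with the vector on the same line number.

    ASSUMPTION: one whitespace-separated vector per line in vec_file and one
    key (paper URL or word) per line in voc_file, in the same order.
    """
    keys = [line.strip() for line in voc_file if line.strip()]
    vectors = []
    for line_no, line in enumerate(vec_file, start=1):
        if not line.strip():
            continue  # ignore blank lines
        values = [float(x) for x in line.split()]
        if len(values) != dim:
            raise ValueError(
                f"line {line_no}: expected {dim} values, got {len(values)}"
            )
        vectors.append(values)
    if len(keys) != len(vectors):
        raise ValueError(f"{len(keys)} keys but {len(vectors)} vectors")
    return dict(zip(keys, vectors))


# Self-contained demonstration on tiny in-memory files with dim=3.
demo_vectors = io.StringIO("0.1 0.2 0.3\n0.4 0.5 0.6\n")
demo_keys = io.StringIO("http://example.org/paper/1\nhttp://example.org/paper/2\n")
demo = load_embeddings(demo_vectors, demo_keys, dim=3)
print(demo["http://example.org/paper/2"])
```

With the real data, the two arguments would be open file handles for, say, doc2vec.txt and doc2vec_voc.txt, and `dim` would stay at its default of 200. If the files turn out to use a different layout (for example a header line or a key at the start of each vector line), the parsing step needs to be adapted.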
To see how the embeddings are put to use on IOS Press data, refer to server.js, which shows how we achieve the embedding-based similarity search in our scientometric system.
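Embedding-based similarity search is commonly implemented as a nearest-neighbour ranking by cosine similarity. The Python sketch below illustrates that general idea only; it is not a transcription of server.js, whose actual method may differ, and the function names and toy vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(query_key, embeddings, top_k=5):
    """Rank all other keys by cosine similarity to query_key."""
    query = embeddings[query_key]
    scored = [
        (key, cosine_similarity(query, vec))
        for key, vec in embeddings.items()
        if key != query_key
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors; the real embeddings have 200 dimensions.
toy = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.9, 0.1, 0.0],
    "paper_c": [0.0, 1.0, 0.0],
}
ranking = most_similar("paper_a", toy, top_k=2)
print(ranking)
```

The same function works for papers (Doc2Vec vectors) and for contributors (TransE vectors), since it only needs a mapping from keys to vectors of equal length.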
IOS Press scientometrics can be downloaded from the scientometrics folder, migrated to other academic knowledge graphs, and reused for related applications and research. Follow the instructions below to set it up and run it locally.
- After cloning this repository, run the following commands in the terminal:

      $ cd scientometrics/
      $ npm install

- Create a folder data/ within scientometrics/sites/. Copy both pre-trained embedding folders (data/embeddings/IOS-Doc2Vec-TXT/ and data/embeddings/IOS-TransE/) to the scientometrics/sites/data/ directory.

- Launch the server on an open port:

      $ node src/server/server.js

  You can change the port by modifying N_PORT in server.js. The default is 7200.

- Now, open a browser and navigate to http://localhost:N_PORT/iospress_scientometrics.
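The folder-copying step above can also be scripted. The following Python sketch assumes the directory layout matches the paths named in the steps and that the function is pointed at the repository root; `copy_embeddings` is an illustrative helper, not part of the repository. The dry run at the end uses a throwaway directory with placeholder files rather than the real data.

```python
import shutil
import tempfile
from pathlib import Path

EMBEDDING_FOLDERS = ["IOS-Doc2Vec-TXT", "IOS-TransE"]


def copy_embeddings(repo_root="."):
    """Copy both embedding folders into scientometrics/sites/data/."""
    root = Path(repo_root)
    source = root / "data" / "embeddings"
    target = root / "scientometrics" / "sites" / "data"
    target.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing
    copied = []
    for name in EMBEDDING_FOLDERS:
        destination = target / name
        # dirs_exist_ok (Python 3.8+) lets the script be re-run safely.
        shutil.copytree(source / name, destination, dirs_exist_ok=True)
        copied.append(destination)
    return copied


# Dry run in a throwaway directory that mimics the repository layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in EMBEDDING_FOLDERS:
        folder = Path(tmp) / "data" / "embeddings" / name
        folder.mkdir(parents=True)
        (folder / "sample.txt").write_text("placeholder")
    copied_names = sorted(path.name for path in copy_embeddings(tmp))
print(copied_names)
```

For the real repository, calling `copy_embeddings()` from the repository root performs the copy in place.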
IOS Press scientometrics can also be accessed online at http://stko-roy.geog.ucsb.edu:7200/iospress_scientometrics. Note that the URL must use HTTP rather than HTTPS.
These scientometrics include Home (a choropleth map), Country Collaboration, Author Map, Author Similarity, Paper Similarity, Keyword Graph, and Streamgraph. Select a journal category first and then a journal of interest for bibliographic analysis, visualization, and embedding-based similarity search. An example of how information is displayed for the Semantic Web journal is attached below.
This repository is distributed under the CC0-1.0 License. See LICENSE for more information.