PID Graph GraphQL Example: Research Organization

User story

As an administrator for the University of Oxford, I am interested in the reuse of research outputs from our university, so that I can help identify the most interesting ones.

Query strategy

We will query for research outputs where at least one author is affiliated with the University of Oxford, using the Research Organization Registry (ROR) identifier. We will ask for resource type (e.g. publication or dataset), the number of citations, views and downloads, and the authors and affiliations of the authors.

ROR adoption is still at an early stage; for example, ROR identifiers were not yet implemented for Crossref DOIs as of April 2020. The returned research outputs will therefore be only a small subset of the total research output of the University of Oxford. We further filter the results to return only research outputs with at least 100 views, counted according to the COUNTER Code of Practice.

Why GraphQL

The query combines the results from queries to two separate services (ROR and the DataCite DOIs API). In addition, it combines DOI metadata with citation and usage data, the latter provided by the repository where the research output is hosted. GraphQL not only fetches all of this information in a single query, but also returns only the fields needed for the user story.

The output from the query is standard JSON and can be processed further, for example in a Jupyter notebook.

Use the following query in the GraphQL client at https://api.datacite.org/graphql:

{
  organization(id: "https://ror.org/052gg0110") {
    id
    name
    alternateName
    citationCount
    viewCount
    downloadCount
    works(hasViews: 100, first: 100) {
      totalCount
      years {
        title
        count
      }
      resourceTypes {
        title
        count
      }
      nodes {
        id
        type
        publisher
        publicationYear
        titles {
          title
        }
        creators {
          id
          name
          affiliation {
            id
            name
          }
        }
        citationCount
        viewCount
        downloadCount
      }
    }
  }
}
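
For further processing outside the GraphQL client, the query can also be sent as an HTTP POST request. The following is a minimal Python sketch using only the standard library, not an official client. It assumes the endpoint accepts the usual GraphQL-over-HTTP JSON body with a "query" key and returns the result under a "data" key that mirrors the query. The query is shortened to the fields used here, and the function names run_query and summarize_works are illustrative.

```python
import json
import urllib.request

ENDPOINT = "https://api.datacite.org/graphql"

# Shortened version of the query above: only the fields used in summarize_works.
QUERY = """
{
  organization(id: "https://ror.org/052gg0110") {
    name
    works(hasViews: 100, first: 100) {
      totalCount
      nodes {
        id
        type
        publicationYear
        titles {
          title
        }
        citationCount
        viewCount
        downloadCount
      }
    }
  }
}
"""


def run_query(query, endpoint=ENDPOINT, timeout=120):
    """POST a GraphQL query and return the decoded JSON response.

    The timeout is generous on purpose, since this query can be slow.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize_works(response):
    """Return one flat dict per research output, sorted by views (highest first)."""
    works = response["data"]["organization"]["works"]["nodes"]
    rows = []
    for work in works:
        titles = work.get("titles") or []
        rows.append({
            "id": work["id"],
            "type": work.get("type"),
            "year": work.get("publicationYear"),
            "title": titles[0]["title"] if titles else None,
            # Missing or null counts are treated as zero.
            "citations": work.get("citationCount") or 0,
            "views": work.get("viewCount") or 0,
            "downloads": work.get("downloadCount") or 0,
        })
    return sorted(rows, key=lambda row: row["views"], reverse=True)
```

In a notebook, summarize_works(run_query(QUERY)) would return a list of rows that can be passed on to a table or plotting library.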

Related GitHub issue from 2018 FREYA workshop:

Nice example, could be useful for tracking down an organization's research output. One concern: the example takes about 30 seconds to execute, and when you repeat it, it takes about 30 seconds again (so the result does not appear to be cached).


@vasilyb while I am sure we can improve performance going forward, there is also a tradeoff to be made between query complexity and request duration. This is, for example, a query that can probably be run in the background, as the data will not change that frequently. We can also adjust how many records are returned by the query and, for example, use pagination (still working on finalizing this).
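
To illustrate the pagination idea, here is a minimal Python sketch of collecting results page by page. Since pagination is not finalized, everything about the page shape is an assumption: the sketch supposes Relay-style cursors, where each page returns its nodes plus a pageInfo object with endCursor and hasNextPage. The fetch_page callable is hypothetical and stands in for whatever request fetches one page.

```python
def fetch_all_works(fetch_page, page_size=25, max_pages=100):
    """Collect the nodes from every page of a cursor-paginated result.

    fetch_page(first, after) is a hypothetical callable that requests one page
    and returns a dict shaped like (assumed, Relay-style):
        {"nodes": [...], "pageInfo": {"endCursor": "...", "hasNextPage": True}}
    max_pages is a safety limit against endless loops.
    """
    nodes = []
    cursor = None  # None requests the first page
    for _ in range(max_pages):
        page = fetch_page(page_size, cursor)
        nodes.extend(page["nodes"])
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            break
        cursor = info["endCursor"]
    return nodes
```

Smaller pages keep each request short, at the cost of more round trips.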

I tried this for CERN and it works in general, although at this point the only information it returns is the name and alternate name.

@artemislav thanks. I hope that Zenodo supports ROR IDs for affiliations soon.

I tried this example (after removing the hasViews part) with my own institution’s ROR ID, but I’m getting no results. Does that mean our ROR ID is implemented incorrectly within the DataCite XML, or that the query only uses the affiliationIdentifier tag? Our datasets do not list persons as authors, only organizations. For instance, our DataCite records contain XML like:

<creator>
    <creatorName nameType="Organizational">Ocean Networks Canada</creatorName>
    <nameIdentifier nameIdentifierScheme="ROR" schemeURI="https://ror.org/">05gknh003</nameIdentifier>
</creator>

Trying to figure out what we might be doing wrong.

Yes, we are currently only looking at the affiliation field. But I can change that.
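
To make the difference concrete, here is a small Python sketch of the two places a ROR ID can appear on a creator: as a nameIdentifier of an organizational creator (as in the XML above), or on the affiliation of a creator. This is an illustration only, not the actual indexing code. It assumes the affiliationIdentifier and affiliationIdentifierScheme attributes on the affiliation element as defined in the DataCite metadata schema, and it ignores XML namespaces for brevity. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET

ROR_PREFIX = "https://ror.org/"


def normalize_ror(value):
    """Return the ROR ID as a full URL, whether or not the prefix was given."""
    value = value.strip()
    return value if value.startswith(ROR_PREFIX) else ROR_PREFIX + value


def ror_ids_for_creator(creator):
    """Collect ROR IDs from both the nameIdentifier and affiliation of a creator."""
    ids = []
    # Case 1: the creator itself is an organization identified by a ROR ID.
    for element in creator.findall("nameIdentifier"):
        text = (element.text or "").strip()
        if element.get("nameIdentifierScheme") == "ROR" and text:
            ids.append(normalize_ror(text))
    # Case 2: the creator is affiliated with an organization (assumed attributes).
    for element in creator.findall("affiliation"):
        identifier = element.get("affiliationIdentifier")
        if element.get("affiliationIdentifierScheme") == "ROR" and identifier:
            ids.append(normalize_ror(identifier))
    return ids


# The organizational creator from the post above.
ORGANIZATIONAL_CREATOR = """
<creator>
    <creatorName nameType="Organizational">Ocean Networks Canada</creatorName>
    <nameIdentifier nameIdentifierScheme="ROR" schemeURI="https://ror.org/">05gknh003</nameIdentifier>
</creator>
"""

creator = ET.fromstring(ORGANIZATIONAL_CREATOR)
print(ror_ids_for_creator(creator))
```

A lookup that only checks the second case would miss records like the one above, which matches the empty result described in the post.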

That would be great to support our use case, thank you!

I’ve noticed that the query works now so the update seems to have been made. Thanks!
