@luc thanks for raising the GitHub issues. We will include additional metadata from GRID in the ROR output; let us know if you are interested in anything in particular.
The paging only works up to a point. Paging from the start goes fine, but when I tried it, the first page reported 96793 results, which at 20 results per page comes to 4840 pages. When I tried to get page 4840, I got the following error:
TransportError at /organizations TransportError(500, 'search_phase_execution_exception', 'Result window is too large, from + size must be less than or equal to: [10000] but was [96800]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.')
And indeed, I get that error starting from page 501, the first page for which from + size exceeds 10000.
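As a rough check of these numbers, here is a minimal Python sketch of the page arithmetic. It assumes pages are 1-indexed and that the API computes from = (page - 1) * size, which is consistent with the 96800 in the error message but is not confirmed by the ROR code; the function names are made up for illustration.

```python
import math

MAX_RESULT_WINDOW = 10_000  # limit quoted in the error message
PAGE_SIZE = 20              # results per page, as observed in the API output


def total_pages(total_results, page_size=PAGE_SIZE):
    """Number of pages needed to show every result."""
    return math.ceil(total_results / page_size)


def window_end(page, page_size=PAGE_SIZE):
    """Value of from + size for a 1-indexed page (assumed offset scheme)."""
    return (page - 1) * page_size + page_size


def last_reachable_page(max_window=MAX_RESULT_WINDOW, page_size=PAGE_SIZE):
    """Highest page whose from + size still fits inside the result window."""
    return max_window // page_size


print(total_pages(96793))      # 4840
print(window_end(4840))        # 96800
print(window_end(501))         # 10020
print(last_reachable_page())   # 500
```

So page 500 ends exactly at the 10000 limit, and page 501 is the first to go over it.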
Also, the error is accompanied by a complete Django stack trace, which is probably something you want to disable in production, so as not to give people with bad intentions more information than is strictly necessary.
Thanks @martijn. This is a known limitation of search indexes such as Elasticsearch, the deep paging problem, and there is an issue open in the ror-api GitHub repo. API documentation is here. I think the solution is threefold:
limit pagination to 10,000 results
implement cursor-based pagination, which overcomes this limitation
provide a full data dump for people who basically want the complete dataset
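To illustrate why a cursor avoids the deep paging problem, here is a small in-memory Python sketch. It is not the ROR API or its planned implementation: `fetch_page` and `iterate_all` are hypothetical names, and a sorted list of ids stands in for the index. Instead of an offset, each request carries the last id already seen, so the cost of a page does not grow with its depth. Elasticsearch's `search_after` parameter is built on the same idea.

```python
from bisect import bisect_right


def fetch_page(sorted_ids, cursor=None, size=20):
    """Return (page, next_cursor) for the items strictly after `cursor`.

    `sorted_ids` must be sorted and free of duplicates. `next_cursor` is
    None once the last item has been returned.
    """
    # Locate the first item after the cursor without counting skipped pages.
    start = 0 if cursor is None else bisect_right(sorted_ids, cursor)
    page = sorted_ids[start:start + size]
    # Hand out a cursor only if more items remain after this page.
    next_cursor = page[-1] if start + size < len(sorted_ids) else None
    return page, next_cursor


def iterate_all(sorted_ids, size=20):
    """Walk the whole collection page by page by following the cursor."""
    cursor = None
    while True:
        page, cursor = fetch_page(sorted_ids, cursor, size)
        yield from page
        if cursor is None:
            break


ids = list(range(96793))            # stand-in for 96793 organization ids
page, cursor = fetch_page(ids, cursor=9999)
print(page[0], len(page), cursor)   # 10000 20 10019
print(sum(1 for _ in iterate_all(ids)))  # 96793
```

The trade-off is that a client can no longer jump straight to an arbitrary page number; it can only move forward from a cursor it has already received, which is why a capped page parameter and a data dump are still useful alongside it.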