CERN Search REST API

This section lists some common errors encountered in CERN Search, together with possible solutions. More information can be found in invenio-troubleshooting.

Host HOST_NAME is not trusted

If a query returns a 401 response with this message, the configuration variable ALLOWED_HOSTS is not properly set. To set it through the environment, the variable name must carry the INVENIO_ prefix. For example:

export INVENIO_ALLOWED_HOSTS="['test-cern-search.web.cern.ch']"
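A common mistake is exporting the value with the wrong quoting. The sketch below is a quick sanity check, assuming the environment value is evaluated as a Python literal (which is how invenio-config is understood to load INVENIO_-prefixed variables). The helper name parse_allowed_hosts is hypothetical and not part of CERN Search.

```python
import ast
import os


def parse_allowed_hosts(raw):
    """Parse an ALLOWED_HOSTS value as a Python literal and validate its shape."""
    # literal_eval raises ValueError or SyntaxError when the quoting is wrong.
    hosts = ast.literal_eval(raw)
    if not isinstance(hosts, (list, tuple)):
        raise ValueError("ALLOWED_HOSTS must be a list of host names")
    if not all(isinstance(host, str) for host in hosts):
        raise ValueError("every entry in ALLOWED_HOSTS must be a string")
    return list(hosts)


# Read the same variable that the export above sets; default to an empty list.
raw_value = os.environ.get("INVENIO_ALLOWED_HOSTS", "[]")
print(parse_allowed_hosts(raw_value))
```

If the value was exported without the inner quotes or brackets, the check fails immediately instead of surfacing later as an untrusted host.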

Authentication

Check that the request headers are set correctly. The required headers are:

Authorization: Bearer $TOKEN
Content-Type: application/json
Accept: application/json

In curl they can be set with -H.
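For clients written in Python, the same headers can be kept in one place. This is a minimal sketch: the helper name build_headers and the TOKEN environment variable are assumptions for illustration.

```python
import os


def build_headers(token):
    """Return the three headers listed above for a given access token."""
    return {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }


# TOKEN is assumed to be exported in the environment, as in the curl examples.
headers = build_headers(os.environ.get("TOKEN", "<token>"))
for name, value in headers.items():
    print(name + ": " + value)
```

The resulting dictionary can be passed to whichever HTTP client is in use.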

Insecure connection

When using curl against an instance with a self-signed certificate, pass the -k option to allow insecure connections.

TypeError: 'RequestError' object is not callable

You may see a traceback similar to:

[ERROR] Error handling request /api/records/
Traceback (most recent call last):
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 284, in handle
    keepalive = self.handle_request(req, conn)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 333, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 2000, in __call__
    return self.wsgi_app(environ, start_response)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/werkzeug/wsgi.py", line 826, in __call__
    return app(environ, start_response)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 2000, in __call__
    return self.wsgi_app(environ, start_response)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 1991, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 1567, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 1642, in full_dispatch_request
    response = self.make_response(rv)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/flask/app.py", line 1746, in make_response
    rv = self.response_class.force_type(rv, request.environ)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/werkzeug/wrappers.py", line 921, in force_type
    response = BaseResponse(*_run_wsgi_app(response, environ))
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/werkzeug/wrappers.py", line 59, in _run_wsgi_app
    return _run_wsgi_app(*args)
  File "/workspace/virtual_env/cernsearch/lib/python2.7/site-packages/werkzeug/test.py", line 923, in run_wsgi_app
    app_rv = app(environ, start_response)
TypeError: 'RequestError' object is not callable

Such a traceback is most likely due to an Elasticsearch error or incompatibility. First, check the Elasticsearch error logs for errors parsing fields. Another possible cause is a version mismatch between the Python libraries and the ES instance.
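To rule out a version mismatch, the major version of the Python client can be compared with the one reported by the cluster. The sketch below assumes the usual convention that the client's major version should equal the server's. The function name is hypothetical. The client version would typically come from elasticsearch.__version__ (a tuple), and the server version from the version.number field returned by the cluster root endpoint.

```python
def major_versions_match(client_version, server_version_number):
    """Compare the major version of the client tuple and the server string."""
    client_major = int(client_version[0])
    server_major = int(server_version_number.split(".")[0])
    return client_major == server_major


# Illustrative values only; read the real ones from your environment.
print(major_versions_match((5, 5, 3), "5.6.16"))
print(major_versions_match((6, 3, 1), "5.6.16"))
```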

JSONSchema errors

When performing operations on a schema other than the default one, specify it with the $schema field in the JSON data of the request. Schemas must have the instance's SEARCH_INDEX as their alias in order to be found, and the schema URL has the following format:

http://<INSTANCE_URL>/schemas/<INSTANCE_NAME>/<DOC_NAME>.json

For example, the instance dev-cern-search.web.cern.ch has the alias cernsearch-test, and this alias is the INSTANCE_NAME used in the URL. The alias is the name of the folder that holds the instance's JSON schemas (webservices, indico and cernsearch-test are among the existing folders), and it is the same as the entry point set in the setup.py file. The INSTANCE_NAME is also the first part of the index name. The DOC_NAME is the second part of the index name (what remains once the INSTANCE_NAME is removed), and it is the same as the document name in the mapping file.

The URLs for the cernsearch-test schemas are:

"$schema": "http://dev-cern-search.web.cern.ch/schemas/cernsearch-test/collection_v0.0.1.json" <-- Index name "cernsearch-test-collection_v0.0.1.json"
"$schema": "http://dev-cern-search.web.cern.ch/schemas/cernsearch-test/doc_v0.0.1.json" <-- Index name "cernsearch-test-doc_v0.0.1.json"
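The URL format above can be assembled programmatically to avoid typos. This is a small sketch: the helper name schema_url is hypothetical, and the values are the cernsearch-test ones listed above.

```python
import json


def schema_url(instance_url, instance_name, doc_name):
    """Build http://<INSTANCE_URL>/schemas/<INSTANCE_NAME>/<DOC_NAME>.json."""
    return "http://{}/schemas/{}/{}.json".format(instance_url, instance_name, doc_name)


# Request body fragment carrying the non-default schema.
record = {
    "$schema": schema_url("dev-cern-search.web.cern.ch", "cernsearch-test", "doc_v0.0.1"),
}
print(json.dumps(record, indent=2))
```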

My query does not work

The CERN Search API uses the default search factory, which implements the Elasticsearch query_string query (https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-query-string-query.html). For example:

curl -k -X GET -H 'Content-Type: application/json' -H 'Accept: application/json' -H "Authorization: Bearer $TOKEN" 'https://<host>:<port>/api/records/?q=field_one:value_one+AND+field_two:value_two&access=egroup'

Note: Do not forget to URL-encode special characters in the query when testing from the command line.
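One way to avoid manual encoding is to let a library build the query string. The sketch below uses urlencode from the Python standard library. The helper name search_url is hypothetical, and the host placeholder is kept as in the curl example.

```python
from urllib.parse import urlencode


def search_url(base_url, query, access):
    """Append an encoded q/access query string to the records endpoint."""
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return base_url + "?" + urlencode({"q": query, "access": access})


print(search_url(
    "https://<host>:<port>/api/records/",
    "field_one:value_one AND field_two:value_two",
    "egroup",
))
```

Characters such as quotes, ampersands and colons inside field values are encoded as well, so they cannot be mistaken for URL syntax.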

Cannot apply PATCH

Make sure the path attribute does not end with a trailing slash.
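The sketch below assumes the PATCH body is a list of JSON Patch operations (RFC 6902), where each operation carries a path. In JSON Pointer syntax a trailing slash addresses an empty-named member, which is rarely what is intended. The helper name normalize_patch_paths is hypothetical.

```python
def normalize_patch_paths(operations):
    """Return a copy of the operations with trailing slashes removed from 'path'."""
    cleaned = []
    for operation in operations:
        fixed = dict(operation)  # shallow copy so the caller's data is untouched
        path = fixed.get("path", "")
        stripped = path.rstrip("/")
        # Leave paths made only of slashes (for example "/") as they are.
        if stripped:
            fixed["path"] = stripped
        cleaned.append(fixed)
    return cleaned


patch = [{"op": "replace", "path": "/title/", "value": "New title"}]
print(normalize_patch_paths(patch))
```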

Redis RDB

If Redis is failing due to RDB backups to disk, you can deactivate them by running redis-cli config set save "" in the console.

Status code 429 - TooManyRequests Exception

Rate limiting is provided by invenio-rest / Flask-Limiter, and by default it is set to 5000 requests per hour.
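On the client side, a 429 can be handled by waiting and retrying. The following is a minimal sketch, not part of CERN Search: send is a hypothetical callable that performs the request and returns a (status_code, headers) pair. The Retry-After header is used when the server provides it as a number of seconds, and an exponential backoff is used otherwise.

```python
import time


def call_with_retry(send, max_attempts=5, default_wait=1.0, sleep=time.sleep):
    """Call send() until the status is not 429 or the attempts are exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt < max_attempts - 1:
            delay = default_wait * (2 ** attempt)  # exponential backoff fallback
            retry_after = headers.get("Retry-After")
            if retry_after is not None:
                try:
                    delay = float(retry_after)
                except ValueError:
                    pass  # Retry-After given as a date: keep the fallback delay
            sleep(delay)
    # Every attempt was rate limited; return the last response as is.
    return status, headers
```

Passing sleep as a parameter keeps the function easy to exercise without real waiting.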

Status code 413 - Request Entity Too Large

Nginx enforces a maximum size for the request body. The limit can be changed with the client_max_body_size directive.
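Before sending a large document, its serialized size can be compared with the configured limit. This sketch assumes the body is sent exactly as produced by json.dumps, and it uses 1 MiB as the limit because that is the Nginx default for client_max_body_size. The value of an actual deployment may differ, and both function names are hypothetical.

```python
import json

# Assumed limit: Nginx's default client_max_body_size of 1m.
ASSUMED_LIMIT_BYTES = 1024 * 1024


def body_size_bytes(payload):
    """Size in bytes of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_limit(payload, limit_bytes=ASSUMED_LIMIT_BYTES):
    """True when the serialized payload does not exceed the limit."""
    return body_size_bytes(payload) <= limit_bytes


document = {"title": "example", "content": "x" * 2000}
print(body_size_bytes(document), fits_limit(document))
```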

Invalid request block size from uWSGI

For development purposes you might want to run the CERN Search instance (the REST API, the UI, or both) locally.

If you run uWSGI behind an Nginx server, you can start the uWSGI server with the --socket option. However, if you run it on its own with that option, you will encounter the following error when it receives requests, because --socket expects the binary uwsgi protocol rather than plain HTTP:

...
invalid request block size: 21573 (max 4096)...skip
...

To fix it, run the uWSGI server with the --http option instead of --socket.