Web component

The BioThings SDK web component contains tools used to generate and customize an API, given an Elasticsearch index with data. The web component uses the Tornado Web Server to respond to incoming API requests.

Server boot script

Settings

Config module

BiothingWebSettings

BiothingESWebSettings

Handlers

BaseHandler

class biothings.web.handlers.BaseHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]
get_sentry_client()[source]

Returns the sentry client configured in the application. If you need to change the behaviour to do something else to get the client, then subclass this method

FrontPageHandler

class biothings.web.handlers.FrontPageHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]
get_template_path()[source]

Override to customize template path for each handler.

By default, we use the template_path application setting. Return None to load templates relative to the calling file.

StatusHandler

class biothings.web.handlers.StatusHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]

Web service health check

BaseESRequestHandler

BiothingHandler

class biothings.web.handlers.BiothingHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]

Biothings Annotation Endpoint

URL pattern examples:

/{pre}/{ver}/{typ}/? /{pre}/{ver}/{typ}/([^/]+)/?

queries a term against a pre-determined field that represents the id of a document, like _id and dbsnp.rsid

GET -> {…} or [{…}, …] POST -> [{…}, …]

QueryHandler

class biothings.web.handlers.QueryHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]

Biothings Query Endpoint

URL pattern examples:

/{pre}/{ver}/{typ}/query/? /{pre}/{ver}//query/?

GET -> {…} POST -> [{…}, …]

MetadataFieldHandler

class biothings.web.handlers.MetadataFieldHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]

GET /metadata/fields

MetadataSourceHandler

class biothings.web.handlers.MetadataSourceHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]

GET /metadata

extras(_meta)[source]

Override to add app specific metadata.

ESRequestHandler

BaseAPIHandler

class biothings.web.handlers.BaseAPIHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]
get_template_path()[source]

Override to customize template path for each handler.

By default, we use the template_path application setting. Return None to load templates relative to the calling file.

on_finish()[source]

This is a tornado lifecycle hook. Override to provide tracking features.

prepare()[source]

Called at the beginning of a request before get/post/etc.

Override this method to perform common initialization regardless of the request method.

Asynchronous support: Use async def or decorate this method with .gen.coroutine to make it asynchronous. If this method returns an Awaitable execution will not proceed until the Awaitable is done.

New in version 3.1: Asynchronous support.

set_default_headers()[source]

Override this to set HTTP headers at the beginning of the request.

For example, this is the place to set a custom Server header. Note that setting such headers in the normal flow of request processing may not do what you want, since headers may be reset during error handling.

write(chunk)[source]

Writes the given chunk to the output buffer.

To write the output to the network, use the flush() method below.

If the given chunk is a dictionary, we write it as JSON and set the Content-Type of the response to be application/json. (if you want to send JSON as a different Content-Type, call set_header after calling write()).

Note that lists are not converted to JSON because of a potential cross-site security vulnerability. All JSON output should be wrapped in a dictionary. More details at http://haacked.com/archive/2009/06/25/json-hijacking.aspx/ and https://github.com/facebook/tornado/issues/1009

write_error(status_code, **kwargs)[source]

Override to implement custom error pages.

write_error may call write, render, set_header, etc to produce output as usual.

If this error was caused by an uncaught exception (including HTTPError), an exc_info triple will be available as kwargs["exc_info"]. Note that this exception may not be the “current” exception for purposes of methods like sys.exc_info() or traceback.format_exc.

APISpecificationHandler

class biothings.web.handlers.APISpecificationHandler(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs: Any)[source]

Pipeline

Elasticsearch Query Builder

class biothings.web.query.ESQueryBuilder(user_query=None, scopes_regexs=(), scopes_default=('_id',), allow_random_query=True, allow_nested_query=False, metadata=None)[source]

Build an Elasticsearch query with elasticsearch-dsl.

apply_extras(search, options)[source]

Process non-query options and customize their behaviors. Customized aggregation syntax string is translated here.

build(q=None, **options)[source]

Build a query according to q and options. This is the public method called by API handlers.

Regarding multisearch:

TODO

Regarding scopes:

scopes: [str] nonempty, match query. scopes: NoneType, or [], no scope, so query string query.

Additionally support these options:

explain: include es scoring information userquery: customized function to interpret q

  • additional keywords are passed through as es keywords

    for example: ‘explain’, ‘version’ …

default_match_query(q, scopes, options)[source]

Override this to customize default match query. By default it implements a multi_match query.

default_string_query(q, options)[source]

Override this to customize default string query. By default it implements a query string query.

Elasticsearch Query Execution

class biothings.web.query.ESQueryBackend(client, indices=None)[source]

Elasticsearch Result Transformer

class biothings.web.query.ESResultFormatter(licenses=None, license_transform=None, field_notes=None, excluded_keys=())[source]

Class to transform the results of the Elasticsearch query generated prior in the pipeline. This contains the functions to extract the final document from the elasticsearch query result in `Elasticsearch Query`_. This also contains the code to flatten a document etc.

transform(response, **options)[source]

Transform the query response to a user-friendly structure. Mainly deconstruct the elasticsearch response structure and hand over to transform_doc to apply the options below.

Options:

# generic transformations for dictionaries # —————————————— dotfield: flatten a dictionary using dotfield notation _sorted: sort keys alaphabetically in ascending order always_list: ensure the fields specified are lists or wrapped in a list allow_null: ensure the fields specified are present in the result,

the fields may be provided as type None or [].

# additional multisearch result transformations # ———————————————— template: base dict for every result, for example: {“success”: true} templates: a different base for every result, replaces the setting above template_hit: a dict to update every positive hit result, default: {“found”: true} template_miss: a dict to update every query with no hit, default: {“found”: false}

# document format and content management # ————————————— biothing_type: result document type to apply customized transformation.

for example, add license field basing on document type’s metadata.

one: return the individual document if there’s only one hit. ignore this setting

if there are multiple hits. return None if there is no hit. this option is not effective when aggregation results are also returned in the same query.

native: bool, if the returned result is in python primitive types. version: bool, if _version field is kept. score: bool, if _score field is kept.

transform_aggs(res)[source]

Transform the aggregations field and make it more presentable. For example, these are the fields of a two level nested aggregations:

aggregations.<term>.doc_count_error_upper_bound aggregations.<term>.sum_other_doc_count aggregations.<term>.buckets.key aggregations.<term>.buckets.key_as_string aggregations.<term>.buckets.doc_count aggregations.<term>.buckets.<nested_term>.* (recursive)

After the transformation, we’ll have:

facets.<term>._type facets.<term>.total facets.<term>.missing facets.<term>.other facets.<term>.terms.count facets.<term>.terms.term facets.<term>.terms.<nested_term>.* (recursive)

Note the first level key change doesn’t happen here.

transform_hit(path, doc, options)[source]

Transform an individual search hit result. By default add licenses for the configured fields.

If a source has a license url in its metadata, Add “_license” key to the corresponding fields. Support dot field representation field alias.

If we have the following settings in web_config.py

LICENSE_TRANSFORM = {

“exac_nontcga”: “exac”, “snpeff.ann”: “snpeff”

},

Then GET /v1/variant/chr6:g.38906659G>A should look like: {

“exac”: {

“_license”: “http://bit.ly/2H9c4hg”, “af”: 0.00002471},

“exac_nontcga”: {

“_license”: “http://bit.ly/2H9c4hg”, <— “af”: 0.00001883}, …

} And GET /v1/variant/chr14:g.35731936G>C could look like: {

“snpeff”: {

“_license”: “http://bit.ly/2suyRKt”, “ann”: [{“_license”: “http://bit.ly/2suyRKt”, <—

“effect”: “intron_variant”, “feature_id”: “NM_014672.3”, …}, {“_license”: “http://bit.ly/2suyRKt”, <— “effect”: “intron_variant”, “feature_id”: “NM_001256678.1”, …}, …]

}, …

}

The arrow marked fields would not exist without the setting lines.

transform_mapping(mapping, prefix=None, search=None)[source]

Transform Elasticsearch mapping definition to user-friendly field definitions metadata result

static traverse(obj, leaf_node=False)

Output path-dictionary pairs. For example, input: {

‘exac_nontcga’: {‘af’: 0.00001883}, ‘gnomad_exome’: {‘af’: {‘af’: 0.0000119429, ‘af_afr’: 0.000123077}}, ‘snpeff’: {‘ann’: [{‘effect’: ‘intron_variant’,

‘feature_id’: ‘NM_014672.3’}, {‘effect’: ‘intron_variant’, ‘feature_id’: ‘NM_001256678.1’}]}

} will be translated to a generator: (

(“exac_nontcga”, {“af”: 0.00001883}), (“gnomad_exome.af”, {“af”: 0.0000119429, “af_afr”: 0.000123077}), (“gnomad_exome”, {“af”: {“af”: 0.0000119429, “af_afr”: 0.000123077}}), (“snpeff.ann”, {“effect”: “intron_variant”, “feature_id”: “NM_014672.3”}), (“snpeff.ann”, {“effect”: “intron_variant”, “feature_id”: “NM_001256678.1”}), (“snpeff.ann”, [{ … },{ … }]), (“snpeff”, {“ann”: [{ … },{ … }]}), (‘’, {‘exac_nontcga’: {…}, ‘gnomad_exome’: {…}, ‘snpeff’: {…}})

) or when traversing leaf nodes: (

(‘exac_nontcga.af’, 0.00001883), (‘gnomad_exome.af.af’, 0.0000119429), (‘gnomad_exome.af.af_afr’, 0.000123077), (‘snpeff.ann.effect’, ‘intron_variant’), (‘snpeff.ann.feature_id’, ‘NM_014672.3’), (‘snpeff.ann.effect’, ‘intron_variant’), (‘snpeff.ann.feature_id’, ‘NM_001256678.1’)

)