biothings.hub.standalone

This standalone module was originally located in the “biothings/standalone” repo. It is used for Standalone/Autohub instances.

class biothings.hub.standalone.AutoHubFeature(managers, version_urls, indexer_factory=None, validator_class=None, *args, **kwargs)

Bases: object

version_urls is a list of URLs, each pointing to a versions.json file. The name of the data release is taken from the URL (http://…s3.amazon.com/<the_name>/versions.json) unless the entry is specified as a dict: {“name” : “custom_name”, “url” : “http://…”}
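The naming rule above can be sketched as follows. This is only an illustration of the described behavior, not the library's implementation: release_name is a hypothetical helper, and it assumes the name is the folder that directly contains versions.json.

```python
import posixpath
from urllib.parse import urlparse


def release_name(entry):
    """Hypothetical helper: return (name, url) for one version_urls entry."""
    if isinstance(entry, dict):
        # Explicit form: {"name": ..., "url": ...}
        return entry["name"], entry["url"]
    # Plain URL form: use the folder that contains versions.json as the name
    path = urlparse(entry).path  # e.g. "/mygene.info/versions.json"
    return posixpath.basename(posixpath.dirname(path)), entry


url = "https://biothings-releases.s3-us-west-2.amazonaws.com/mygene.info/versions.json"
plain = release_name(url)
custom = release_name({"name": "custom_name", "url": url})
```

Here plain carries the name derived from the URL path, while custom keeps the name given in the dict.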

If indexer_factory is passed, it is used to create the indexer that dumps/checks the versions currently installed on ES, restores snapshots, indexes, etc. An indexer_factory is typically used to generate indexers dynamically (ES host, index name, etc.) according to the URLs, for instance. See the standalone.hub.DynamicIndexerFactory class for an example. It is typically used when many data releases are managed by the Hub (so there is no need to manually update the STANDALONE_CONFIG parameter).

If indexer_factory is None, a config parameter named STANDALONE_CONFIG is used, with the following format:

{“_default” : {“es_host”: “…”, “index”: “…”, “doc_type” : “…”},

“the_name” : {“es_host”: “…”, “index”: “…”, “doc_type” : “…”}}

When the name of a data release (taken from the URL) matches an entry, that entry is used to configure which ES backend to target; otherwise the default one is used.
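A minimal sketch of this lookup with fallback, assuming the dict layout shown above. The host, index and doc_type values are placeholders, and resolve_backend is a hypothetical helper rather than part of the hub's API.

```python
# Placeholder values for illustration only
STANDALONE_CONFIG = {
    "_default": {"es_host": "localhost:9200", "index": "default_index", "doc_type": "doc"},
    "mygene.info": {"es_host": "localhost:9200", "index": "mygene_current", "doc_type": "gene"},
}


def resolve_backend(name, config=STANDALONE_CONFIG):
    """Hypothetical helper: pick the entry matching the release name, else "_default"."""
    return config.get(name, config["_default"])


matched = resolve_backend("mygene.info")   # dedicated entry
fallback = resolve_backend("unknown.info")  # no entry, so "_default" is used
```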

If validator_class is passed, it is used to provide validation methods for the installation step. If validator_class is None, AutoHubValidator is used as a fallback.

DEFAULT_DUMPER_CLASS

alias of BiothingsDumper

DEFAULT_UPLOADER_CLASS

alias of BiothingsUploader

DEFAULT_VALIDATOR_CLASS

alias of AutoHubValidator

configure()

Configure the autohub either from a static definition (STANDALONE_CONFIG), where different hard-coded index names can be managed on different ES servers, or with an indexer factory, where index names are taken from version_urls but only one ES host is used.

configure_auto_release(config)
extract(urls)
get_class_name(folder)

Return class-compliant name from a folder name

get_folder_name(url)
install(src_name, version='latest', dry=False, force=False, use_no_downtime_method=True)

Update hub’s data up to the given version (default is latest available), using full and incremental updates to get up to that given version (if possible).

list_biothings()

Example: [{‘name’: ‘mygene.info’, ‘url’: ‘https://biothings-releases.s3-us-west-2.amazonaws.com/mygene.info/versions.json’}]

class biothings.hub.standalone.AutoHubServer(source_list, features=None, name='BioThings Hub', managers_custom_args=None, api_config=None, reloader_config=None, dataupload_config=None, websocket_config=None, autohub_config=None)

Bases: HubServer

Helper to set up and instantiate the common managers usually used in a hub (e.g. dumper manager, uploader manager, etc.). “source_list” is either:

  • a list of strings corresponding to paths to datasources modules

  • a package containing sub-folders with datasources modules

Specific managers can be retrieved by adjusting the “features” parameter, where each feature corresponds to one or more managers. The parameter defaults to all available features. Managers are configured/initialized in the same order as the list, so if a manager (e.g. job_manager) is required by all others, it must be first in the list. “managers_custom_args” is an optional dict used to pass specific arguments when initializing managers:

managers_custom_args={“upload” : {“poll_schedule” : “*/5 * * * *”}}

will set the poll schedule to check uploads every 5 min (instead of the default 10 s). “reloader_config”, “dataupload_config”, “autohub_config” and “websocket_config” can be used to customize the reloader, dataupload, autohub and websocket features. If None, the default config is used. If explicitly False, the feature is deactivated.
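The None/False convention for the *_config parameters can be illustrated with a small sketch. feature_state is a hypothetical helper written for this example only, and the custom dict content is a placeholder; the point is that only an explicit False deactivates a feature, while None falls back to the default config.

```python
def feature_state(config):
    """Hypothetical helper: classify a *_config value as described above."""
    if config is False:
        return "disabled"  # explicitly False: feature is deactivated
    if config is None:
        return "default"   # None: the default config is used
    return "custom"        # anything else is treated as a custom config


states = {
    "reloader_config": feature_state(False),
    "websocket_config": feature_state(None),
    "autohub_config": feature_state({"some_option": "value"}),  # placeholder content
}
```

The identity check (is False) matters here: an empty dict is falsy in Python but would still count as a custom config under this reading, not as a deactivated feature.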

DEFAULT_FEATURES = ['job', 'autohub', 'terminal', 'config', 'ws']