biothings.hub

class biothings.hub.HubCommands[source]

Bases: OrderedDict

class biothings.hub.HubSSHServer[source]

Bases: SSHServer

PASSWORDS = {}
SHELL = None
begin_auth(username)[source]

Authentication has been requested by the client

This method will be called when authentication is attempted for the specified user. Applications should use this method to prepare whatever state they need to complete the authentication, such as loading in the set of authorized keys for that user. If no authentication is required for this user, this method should return False to cause the authentication to immediately succeed. Otherwise, it should return True to indicate that authentication should proceed.

If blocking operations need to be performed to prepare the state needed to complete the authentication, this method may be defined as a coroutine.

Parameters:

username (str) – The name of the user being authenticated

Returns:

A bool indicating whether authentication is required

connection_lost(exc)[source]

Called when a connection is lost or closed

This method is called when a connection is closed. If the connection is shut down cleanly, exc will be None. Otherwise, it will be an exception explaining the reason for the disconnect.

connection_made(connection)[source]

Called when a connection is made

This method is called when a new TCP connection is accepted. The connection parameter should be stored if needed for later use.

Parameters:

connection (SSHServerConnection) – The connection which was successfully opened

password_auth_supported()[source]

Return whether or not password authentication is supported

This method should return True if password authentication is supported. Applications wishing to support it must have this method return True and implement validate_password() to return whether or not the password provided by the client is valid for the user being authenticated.

By default, this method returns False indicating that password authentication is not supported.

Returns:

A bool indicating if password authentication is supported or not

session_requested()[source]

Handle an incoming session request

This method is called when a session open request is received from the client, indicating it wishes to open a channel to be used for running a shell, executing a command, or connecting to a subsystem. If the application wishes to accept the session, it must override this method to return either an SSHServerSession object to use to process the data received on the channel or a tuple consisting of an SSHServerChannel object created with create_server_channel and an SSHServerSession, if the application wishes to pass non-default arguments when creating the channel.

If blocking operations need to be performed before the session can be created, a coroutine which returns an SSHServerSession object can be returned instead of the session itself. This can be either returned directly or as a part of a tuple with an SSHServerChannel object.

To reject this request, this method should return False to send back a “Session refused” response or raise a ChannelOpenError exception with the reason for the failure.

The details of what type of session the client wants to start will be delivered to methods on the SSHServerSession object which is returned, along with other information such as environment variables, terminal type, size, and modes.

By default, all session requests are rejected.

Returns:

One of the following:

  • An SSHServerSession object or a coroutine which returns an SSHServerSession

  • A tuple consisting of an SSHServerChannel and the above

  • A callable or coroutine handler function which takes AsyncSSH stream objects for stdin, stdout, and stderr as arguments

  • A tuple consisting of an SSHServerChannel and the above

  • False to refuse the request

Raises:

ChannelOpenError if the session shouldn’t be accepted

validate_password(username, password)[source]

Return whether password is valid for this user

This method should return True if the specified password is a valid password for the user being authenticated. It must be overridden by applications wishing to support password authentication.

If the password provided is valid but expired, this method may raise PasswordChangeRequired to request that the client provide a new password before authentication is allowed to complete. In this case, the application must override change_password() to handle the password change request.

This method may be called multiple times with different passwords provided by the client. Applications may wish to limit the number of attempts which are allowed. This can be done by having password_auth_supported() begin returning False after the maximum number of attempts is exceeded.

If blocking operations need to be performed to determine the validity of the password, this method may be defined as a coroutine.

By default, this method returns False for all passwords.

Parameters:
  • username (str) – The user being authenticated

  • password (str) – The password sent by the client

Returns:

A bool indicating if the specified password is valid for the user being authenticated

Raises:

PasswordChangeRequired if the password provided is expired and needs to be changed
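
The attempt-limiting pattern described above can be sketched as follows. This is a minimal stand-alone illustration, not the hub's actual code: the class deliberately does not import asyncssh so it can run on its own (a real server would subclass SSHServer), and `PasswordAuthSketch`, `MAX_ATTEMPTS` and the SHA-256 password table are assumptions made for the example.

```python
import hashlib
import hmac


class PasswordAuthSketch:
    """Hypothetical stand-in for an SSHServer subclass."""

    # username -> hex SHA-256 digest of the password (illustrative only;
    # a real deployment should use a salted, slow password hash)
    PASSWORDS = {"guest": hashlib.sha256(b"s3cret").hexdigest()}
    MAX_ATTEMPTS = 3  # hypothetical limit, not a library default

    def __init__(self):
        self.attempts = 0

    def password_auth_supported(self):
        # Stop offering password authentication once the limit is reached
        return self.attempts < self.MAX_ATTEMPTS

    def validate_password(self, username, password):
        if not self.password_auth_supported():
            return False
        self.attempts += 1
        expected = self.PASSWORDS.get(username)
        if expected is None:
            return False
        supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched
        return hmac.compare_digest(supplied, expected)
```

Counting attempts on the server object works here because one server instance is created per connection; the counter therefore resets for each new client connection.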

class biothings.hub.HubSSHServerSession(name, shell)[source]

Bases: SSHServerSession

break_received(msec)[source]

The client has sent a break

This method is called when the client requests that the server perform a break operation on the terminal. If the break is performed, this method should return True. Otherwise, it should return False.

By default, this method returns False indicating that no break was performed.

Parameters:

msec (int) – The duration of the break in milliseconds

Returns:

A bool to indicate if the break operation was performed or not

connection_made(chan)[source]

Called when a channel is opened successfully

This method is called when a channel is opened successfully. The chan parameter should be stored if needed for later use.

Parameters:

chan (SSHServerChannel) – The channel which was successfully opened.

data_received(data, datatype)[source]

Called when data is received on the channel

This method is called when data is received on the channel. If an encoding was specified when the channel was created, the data will be delivered as a string after decoding with the requested encoding. Otherwise, the data will be delivered as bytes.

Parameters:
  • data (str or bytes) – The data received on the channel

  • datatype – The extended data type of the data, from extended data types
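
Because data may arrive either as str or as bytes depending on the channel encoding, a handler often normalizes it before splitting it into lines. The sketch below shows one way to do that; `to_text` and `LineBufferSketch` are hypothetical names, and the class is a plain Python stand-in rather than a real SSHServerSession subclass.

```python
def to_text(data, encoding="utf-8"):
    """Return channel data as str, whether it arrived as str or bytes."""
    if isinstance(data, bytes):
        return data.decode(encoding, errors="replace")
    return data


class LineBufferSketch:
    """Hypothetical stand-in for a session that collects input lines."""

    def __init__(self):
        self.pending = ""  # trailing partial line, not yet terminated
        self.lines = []    # complete lines received so far

    def data_received(self, data, datatype=None):
        self.pending += to_text(data)
        # Everything before the last newline is complete; the rest stays buffered
        *complete, self.pending = self.pending.split("\n")
        self.lines.extend(complete)
```

Note that decoding each bytes chunk separately can mangle a multi-byte character split across two chunks; an incremental decoder would be needed to handle that case.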

eof_received()[source]

Called when EOF is received on the channel

This method is called when an end-of-file indication is received on the channel, after which no more data will be received. If this method returns True, the channel remains half open and data may still be sent. Otherwise, the channel is automatically closed after this method returns. This is the default behavior for classes derived directly from SSHSession, but not when using the higher-level streams API. Because input is buffered in that case, streaming sessions enable half-open channels to allow applications to respond to input read after an end-of-file indication is received.

eval_lines(lines)[source]
exec_requested(command)[source]

The client has requested to execute a command

This method should be implemented by the application to perform whatever processing is required when a client makes a request to execute a command. It should return True to accept the request, or False to reject it.

If the application returns True, the session_started() method will be called once the channel is fully open. No output should be sent until this method is called.

By default this method returns False to reject all requests.

Parameters:

command (str) – The command the client has requested to execute

Returns:

A bool indicating if the exec request was allowed or not
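
The accept-then-defer contract described above (return True from exec_requested(), send output only once session_started() is called) can be sketched like this. The class is a stand-alone illustration that does not import asyncssh; `ExecSessionSketch`, the `ALLOWED` whitelist and the `output` list are assumptions for the example, and a real session would write to its channel instead.

```python
class ExecSessionSketch:
    """Hypothetical stand-in for an SSHServerSession subclass."""

    ALLOWED = {"status", "help"}  # hypothetical command whitelist

    def __init__(self):
        self.command = None
        self.output = []  # stands in for writes to the channel

    def exec_requested(self, command):
        words = command.split()
        if not words or words[0] not in self.ALLOWED:
            return False  # reject the request
        # Remember the command, but do not produce any output yet
        self.command = command
        return True

    def session_started(self):
        # The channel is fully open now, so output may be sent
        if self.command is not None:
            self.output.append("running: " + self.command)
```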

session_started()[source]

Called when the session is started

This method is called when a session has started up. For client and server sessions, this will be called once a shell, exec, or subsystem request has been successfully completed. For TCP and UNIX domain socket sessions, it will be called immediately after the connection is opened.

shell_requested()[source]

The client has requested a shell

This method should be implemented by the application to perform whatever processing is required when a client makes a request to open an interactive shell. It should return True to accept the request, or False to reject it.

If the application returns True, the session_started() method will be called once the channel is fully open. No output should be sent until this method is called.

By default this method returns False to reject all requests.

Returns:

A bool indicating if the shell request was allowed or not

soft_eof_received()[source]

The client has sent a soft EOF

This method is called by the line editor when the client sends a soft EOF (Ctrl-D on an empty input line).

By default, soft EOF will trigger an EOF to an outstanding read call but still allow additional input to be received from the client after that.

class biothings.hub.HubServer(source_list, features=None, name='BioThings Hub', managers_custom_args=None, api_config=None, reloader_config=None, dataupload_config=None, websocket_config=None, autohub_config=None)[source]

Bases: object

Helper to set up and instantiate the managers commonly used in a hub (e.g. dumper manager, uploader manager, etc.).

“source_list” is either:

  • a list of strings corresponding to paths to datasource modules

  • a package containing sub-folders with datasource modules

Specific managers can be selected by adjusting the “features” parameter, where each feature corresponds to one or more managers. The parameter defaults to all available features (see DEFAULT_FEATURES). Managers are configured and initialized in the same order as the list, so if a manager (e.g. job_manager) is required by all others, it must come first in the list.

“managers_custom_args” is an optional dict used to pass specific arguments when initializing managers:

managers_custom_args={"upload": {"poll_schedule": "*/5 * * * *"}}

will set the poll schedule to check uploads every 5 minutes (instead of the default of every 10 seconds).

“reloader_config”, “dataupload_config”, “autohub_config” and “websocket_config” can be used to customize the reloader, dataupload, autohub and websocket features, respectively. If None, the default config is used. If explicitly False, the feature is deactivated.
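
As a rough illustration of how a custom value could take precedence over DEFAULT_MANAGERS_ARGS, the sketch below merges the two dicts per feature. The helper `merge_manager_args` and its merge semantics are assumptions made for this example; the way HubServer actually combines the arguments may differ.

```python
import copy

# Default taken from the class attribute listed in this page
DEFAULT_MANAGERS_ARGS = {"upload": {"poll_schedule": "* * * * * */10"}}


def merge_manager_args(defaults, custom=None):
    """Hypothetical helper: per-feature merge where custom values win."""
    merged = copy.deepcopy(defaults)  # never mutate the shared defaults
    for feature, params in (custom or {}).items():
        merged.setdefault(feature, {}).update(params)
    return merged


managers_custom_args = {"upload": {"poll_schedule": "*/5 * * * *"}}
merged = merge_manager_args(DEFAULT_MANAGERS_ARGS, managers_custom_args)
print(merged["upload"]["poll_schedule"])
```

The default schedule uses six cron fields, with the last one expressing seconds, whereas the custom schedule uses the usual five fields, so its smallest unit is the minute.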

DEFAULT_API_CONFIG = {}
DEFAULT_AUTOHUB_CONFIG = {'es_host': None, 'indexer_factory': None, 'validator_class': None, 'version_urls': []}
DEFAULT_DATAUPLOAD_CONFIG = {'upload_root': '.biothings_hub/archive/dataupload'}
DEFAULT_FEATURES = ['config', 'job', 'dump', 'upload', 'dataplugin', 'source', 'build', 'auto_archive', 'diff', 'index', 'snapshot', 'auto_snapshot_cleaner', 'release', 'inspect', 'sync', 'api', 'terminal', 'reloader', 'dataupload', 'ws', 'readonly', 'upgrade', 'autohub', 'hooks']
DEFAULT_MANAGERS_ARGS = {'upload': {'poll_schedule': '* * * * * */10'}}
DEFAULT_RELOADER_CONFIG = {'folders': None, 'managers': ['source_manager', 'assistant_manager'], 'reload_func': None}
DEFAULT_WEBSOCKET_CONFIG = {}
add_api_endpoint(endpoint_name, command_name, method, **kwargs)[source]

Add an API endpoint to expose command named “command_name” using HTTP method “method”. **kwargs are used to specify more arguments for EndpointDefinition

before_configure()[source]

Hook triggered before configure(), used e.g. to adjust the features list

before_start()[source]
clean_features(features)[source]

Sanitize (ie. remove duplicates) features
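
Since managers are configured in list order, removing duplicates has to preserve the order of first appearance, which rules out a plain set(). A minimal sketch of order-preserving de-duplication follows; `dedupe_keep_order` is a hypothetical name, and this is not necessarily how clean_features() is implemented.

```python
def dedupe_keep_order(features):
    """Drop repeated feature names, keeping the first occurrence of each."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(features))


print(dedupe_keep_order(["job", "dump", "job", "upload", "dump"]))
```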

configure()[source]
configure_api_endpoints()[source]
configure_api_manager()[source]
configure_auto_archive_manager()[source]
configure_auto_snapshot_cleaner_manager()[source]
configure_autohub_feature()[source]

See bt.hub.standalone.AutoHubFeature

configure_build_manager()[source]
configure_commands()[source]

Configure hub commands according to available managers

configure_config_feature()[source]
configure_dataplugin_manager()[source]
configure_dataupload_feature()[source]
configure_diff_manager()[source]
configure_dump_manager()[source]
configure_extra_commands()[source]

Same as configure_commands() but commands are not exposed publicly in the shell (they are shortcuts or commands for API endpoints, supporting commands, etc…)

configure_hooks_feature()[source]

Ingest user-defined commands into the hub namespace, giving access to all pre-defined commands (commands, extra_commands). This method prepares the hooks, but the ingestion is done later, once all commands are defined

configure_index_manager()[source]
configure_inspect_manager()[source]
configure_ioloop()[source]
configure_job_manager()[source]
configure_managers()[source]
configure_readonly_api_endpoints()[source]

Assuming read-write API endpoints have previously been defined (self.api_endpoints is set), extract commands and their endpoint definitions only when the method is GET. That is, for an API definition that honors REST principles for HTTP verbs, generate endpoints only for read-only actions.
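
The GET-only filtering described above can be illustrated with plain dicts. The layout used here (endpoint name mapped to a list of definitions with "method" and "command" keys) and the command names are hypothetical; the hub's real endpoint definitions are objects whose structure is not shown on this page.

```python
def readonly_endpoints(api_endpoints):
    """Keep only the definitions whose HTTP method is GET."""
    readonly = {}
    for name, definitions in api_endpoints.items():
        gets = [d for d in definitions if d["method"].upper() == "GET"]
        if gets:  # skip endpoints that have no read-only definition
            readonly[name] = gets
    return readonly


# Hypothetical endpoint table, for illustration only
api_endpoints = {
    "source": [{"method": "get", "command": "source_info"}],
    "dump": [{"method": "put", "command": "dump"}],
    "builds": [
        {"method": "get", "command": "builds"},
        {"method": "delete", "command": "remove_build"},
    ],
}
readonly = readonly_endpoints(api_endpoints)
print(sorted(readonly))
```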

configure_readonly_feature()[source]

Define then expose read-only Hub API endpoints so Hub can be accessed without any risk of modifying data

configure_release_manager()[source]
configure_reloader_feature()[source]
configure_remaining_features()[source]
configure_snapshot_manager()[source]
configure_source_manager()[source]
configure_sync_manager()[source]
configure_terminal_feature()[source]
configure_upgrade_feature()[source]

Allows a Hub to check for new versions (new commits to apply on running branch) and apply them on current code base

configure_upload_manager()[source]
configure_ws_feature()[source]
export_command_documents(filepath)[source]
get_websocket_urls()[source]
ingest_hooks()[source]
mixargs(feat, params=None)[source]
process_hook_file(hook_file)[source]
quick_index(datasource_name, doc_type, indexer_env, subsource=None, index_name=None, **kwargs)[source]

Intended for datasource developers to quickly create an index to test their datasources. Automatically creates a temporary build config and build collection, then calls the index method with the temporary build collection’s name.

start()[source]
class biothings.hub.JobRenderer[source]

Bases: object

cron_and_strdelta_info(job)[source]
render(job)[source]
render_cron(c)[source]
render_func(f)[source]
render_lambda(l)[source]
render_method(m)[source]
render_partial(p)[source]
render_strdelta(job)[source]
biothings.hub.get_schedule(loop)[source]

Try to render scheduled jobs in a human-readable way…

async biothings.hub.start_ssh_server(loop, name, passwords, keys=['bin/ssh_host_key'], shell=None, host='', port=8022)[source]
biothings.hub.status(managers)[source]

Return a global hub status (number of sources, documents, etc…) according to available managers

Modules

Commands