biothings.cli

Entrypoint for the biothings-cli tool

biothings.cli.check_module_import_status(module: str) → bool[source]

Verify that a module can be imported before creating the command-line tooling that depends on it
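A minimal sketch of such a check, assuming it amounts to asking the import system whether the module can be located. The helper name `can_import` is hypothetical, and this is not necessarily how the library implements it.

```python
import importlib.util


def can_import(module: str) -> bool:
    """Return True if `module` can be located by the import system."""
    try:
        return importlib.util.find_spec(module) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when the parent package of a
        # dotted name is missing, and ValueError for an empty module name.
        return False


print(can_import("json"))
print(can_import("surely_not_a_real_module_xyz"))
```

Locating the module without importing it keeps the check free of import side effects; a stricter variant could call `importlib.import_module` instead.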

biothings.cli.main()[source]

The entrypoint for running the BioThings CLI to test your local data plugin

biothings.cli.dataplugin

Module for creating the command-line interface for data plugins

biothings.cli.dataplugin.clean_data(plugin_name: str | None = None, dump: bool | None = False, upload: bool | None = False, clean_all: bool | None = False)[source]

Delete all dumped files and/or drop uploaded sources tables

biothings.cli.dataplugin.create_data_plugin(name: str, multi_uploaders: bool = False, parallelizer: bool = False)[source]

Create a new data plugin from a pre-defined template

biothings.cli.dataplugin.dump_and_upload(plugin_name: str | None = None)[source]

Sequentially execute the dump and upload commands

Operation order:

1. Downloads source data files to the local file system
2. Converts them into JSON documents
3. Uploads those JSON documents to the source database
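The three steps above can be sketched end to end with standard-library stand-ins. Everything here is an assumption made for illustration: the function names, the TSV input, the table name, and the toy records are hypothetical, and the real commands delegate to the plugin's own dumper and uploader.

```python
import csv
import json
import sqlite3
import tempfile
from pathlib import Path


def dump(data_folder: Path) -> Path:
    """Stand-in for the download step: write a small TSV file locally."""
    data_file = data_folder / "records.tsv"
    data_file.write_text("_id\tsymbol\n1\tCDK2\n2\tTP53\n", encoding="utf-8")
    return data_file


def parse(data_file: Path) -> list:
    """Convert each TSV row into a JSON-serializable document."""
    with data_file.open(newline="", encoding="utf-8") as handle:
        return [dict(row) for row in csv.DictReader(handle, delimiter="\t")]


def upload(docs: list, connection: sqlite3.Connection, table: str) -> int:
    """Store each document as JSON text keyed by its _id."""
    connection.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" (_id TEXT PRIMARY KEY, doc TEXT)'
    )
    connection.executemany(
        f'INSERT OR REPLACE INTO "{table}" VALUES (?, ?)',
        [(doc["_id"], json.dumps(doc)) for doc in docs],
    )
    connection.commit()
    return len(docs)


def run_pipeline(data_folder: Path, connection: sqlite3.Connection) -> int:
    """Run dump, parse, and upload in order; return the number of stored docs."""
    data_file = dump(data_folder)
    docs = parse(data_file)
    return upload(docs, connection, "toy_source")


connection = sqlite3.connect(":memory:")
with tempfile.TemporaryDirectory() as folder:
    stored = run_pipeline(Path(folder), connection)
rows = connection.execute('SELECT doc FROM "toy_source" ORDER BY _id').fetchall()
print(stored, [json.loads(row[0]) for row in rows])
```

Keeping the steps as separate functions mirrors the separate dump and upload commands, so each stage can be rerun on its own.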

biothings.cli.dataplugin.dump_source(plugin_name: str | None = None, show_dump: bool | None = True)[source]

Download the source data files to the local file system

biothings.cli.dataplugin.index_plugin(plugin_name: str | None = None, sub_source_name: str | None = None)[source]

(experimental) Create an Elasticsearch index from a data source database

A quick-index function for quickly creating an Elasticsearch index from a source backend

Currently, indexing is only supported from MongoDB to Elasticsearch

NOTE: Only works correctly if the upload command has been run

biothings.cli.dataplugin.inspect_source(plugin_name: str | None = None, sub_source_name: str | None = '', mode: str | None = 'type, stats', limit: int | None = None, merge: bool | None = False, output: str | None = None)[source]

Derive detailed information about the document data structure from the parsed documents

NOTE: Only works correctly if the upload command has been run
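To give a feel for what deriving structure from parsed documents can mean, here is a much-simplified sketch that records which value types occur under each dotted field path. The function `field_types` is hypothetical and only illustrates the idea; it is not the inspector used by the command and does not reproduce its modes or output format.

```python
from collections import defaultdict


def field_types(docs):
    """Map each dotted field path to the sorted type names observed for it."""
    seen = defaultdict(set)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # List items share the path of the field that holds the list.
            for item in value:
                walk(item, path)
        else:
            seen[path].add(type(value).__name__)

    for doc in docs:
        walk(doc, "")
    return {path: sorted(names) for path, names in seen.items()}


sample_docs = [
    {"_id": "1", "n": 1},
    {"_id": "2", "n": "x", "tags": ["a", {"k": 2.5}]},
]
print(field_types(sample_docs))
```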

biothings.cli.dataplugin.listing(plugin_name: str | None = None, dump: bool | None = True, upload: bool | None = True, hubdb: bool | None = False)[source]

List dumped files, uploaded sources, or internal hubdb contents

biothings.cli.dataplugin.serve(plugin_name: str | None = None, host: str | None = 'localhost', port: int | None = 9999)[source]

Run a simple API server for serving documents from the source database

For example, we have a source_name = "test" with the following document structure:

    doc = {
        "_id": "123",
        "key": {
            "a": {"b": "1"},
            "x": [
                {"y": "3", "z": "4"},
                "5"
            ]
        }
    }

An API server will run at http://host:port/<your source name>/ (e.g., http://localhost:9999/test/)
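The URL layout above, together with the document and query routes described for DocHandler and QueryHandler in biothings.cli.web_app, can be sketched with a few helpers. The helper names are hypothetical, the trailing slash on the document URL is an assumption, and the query value is a placeholder because the supported query syntax is not specified here.

```python
from urllib.parse import quote, urlencode


def source_url(source, host="localhost", port=9999):
    """Base URL for a source, using the serve command's default host and port."""
    return f"http://{host}:{port}/{quote(source)}/"


def document_url(source, doc_id, host="localhost", port=9999):
    """URL for the detail view of one document: /<source>/<doc_id>/."""
    return f"{source_url(source, host, port)}{quote(str(doc_id), safe='')}/"


def query_url(source, query, host="localhost", port=9999):
    """URL for a query passed through the q parameter: /<source>/?q=<query>."""
    return f"{source_url(source, host, port)}?{urlencode({'q': query})}"


print(source_url("test"))
print(document_url("test", "123"))
print(query_url("test", "placeholder"))
```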

biothings.cli.dataplugin.upload_source(plugin_name: str | None = None, batch_limit: int | None = 10000, parallel: bool | None = False, show_upload: bool | None = True)[source]

Parse the downloaded data files from the dump operation and upload to the source database

The default database is sqlite3, but MongoDB is supported if it is configured and an instance is set up

NOTE: Only works correctly if the dump command has been run

biothings.cli.dataplugin.validate_manifest(plugin_name: str | None = None, show_schema: bool | None = None) → None[source]

(experimental) Validate a provided manifest file via JSONSchema

Performs jsonschema validation against the manifest file. It does not validate whether the modules referenced within the manifest can be loaded

If the --show-schema argument is supplied, the biothings manifest schema is displayed

The schema is located within the biothings repository at the following path relative to root: <biothings/hub/dataplugin/loaders/schema/manifest.json>

For a reference about jsonschema itself, see the following: https://json-schema.org/
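As a rough illustration of schema validation in general, the sketch below checks a dictionary against a toy schema with the third-party jsonschema package. The schema, the `schema_errors` helper, and the sample manifests are hypothetical; the toy schema is not the biothings manifest schema, and the import guard is there only so the snippet loads when the package is absent.

```python
try:
    import jsonschema  # third-party package, installed separately
except ImportError:
    jsonschema = None

# Toy schema for illustration only; NOT the biothings manifest schema.
TOY_SCHEMA = {
    "type": "object",
    "required": ["version"],
    "properties": {"version": {"type": "string"}},
}


def schema_errors(manifest, schema=TOY_SCHEMA):
    """Return the validation error messages for `manifest`, sorted for stable output."""
    if jsonschema is None:
        raise RuntimeError("the jsonschema package is required for validation")
    validator = jsonschema.Draft7Validator(schema)
    return sorted(error.message for error in validator.iter_errors(manifest))


if jsonschema is not None:
    print(schema_errors({"version": "0.3"}))
    print(schema_errors({}))
```

Collecting every error with `iter_errors`, rather than stopping at the first, makes it easier to fix a manifest in one pass.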

biothings.cli.dataplugin_hub

biothings.cli.utils

Utility functions for the biothings-cli tool

These are semantically separated from the operations in that these functions help the operations perform a task. Usually, anything related to plugin metadata, job handling, and data manipulation should logically exist here

biothings.cli.utils.clean_dumped_files(data_folder: str | Path, plugin_name: str)[source]

Remove all dumped files by a data plugin in the data folder.

biothings.cli.utils.clean_uploaded_sources(working_dir, plugin_name)[source]

Remove all uploaded sources by a data plugin in the working directory.

biothings.cli.utils.display_inspection_table(source_name: str, mode: str, inspection_mapping: dict, validate: bool = True)[source]

biothings.cli.utils.get_manifest_content(working_dir: str | Path) → dict[source]

Return the manifest content of the data plugin in the working directory
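A minimal sketch of reading a manifest from a working directory, assuming the manifest is a JSON file named manifest.json. The file name, the JSON-only handling, the `read_manifest` helper, and the sample content are all assumptions of this sketch rather than the library's behavior.

```python
import json
import tempfile
from pathlib import Path


def read_manifest(working_dir):
    """Load <working_dir>/manifest.json and return its content as a dict."""
    manifest_path = Path(working_dir) / "manifest.json"
    if not manifest_path.is_file():
        raise FileNotFoundError(f"No manifest found at {manifest_path}")
    with manifest_path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Demonstrate with a throwaway plugin folder holding a toy manifest.
with tempfile.TemporaryDirectory() as folder:
    toy_manifest = {"version": "0.3"}
    (Path(folder) / "manifest.json").write_text(json.dumps(toy_manifest), encoding="utf-8")
    manifest = read_manifest(folder)
print(manifest)
```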

biothings.cli.utils.get_plugin_name(plugin_name=None, with_working_dir=True)[source]

Return a valid plugin name (the name of the folder that contains a data plugin). When plugin_name is None, the current working folder is used. When with_working_dir is True, a (plugin_name, working_dir) tuple is returned

biothings.cli.utils.get_uploaded_collections(src_db, uploaders)[source]

A helper function to get the uploaded collections in the source database

biothings.cli.utils.get_uploaders(working_dir: Path) → List[str][source]

A helper function to get the uploaders from the manifest file in the working directory. It is used by the show_uploaded_sources function

biothings.cli.utils.process_inspect(source_name, mode, limit, merge) → dict[source]

Perform an inspection of the given source. It is used by the do_inspect function

biothings.cli.utils.remove_files_in_folder(folder_path)[source]

Remove all files in a folder.
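A minimal sketch of clearing a folder, assuming that only regular files directly inside it are removed and that subfolders and the folder itself are kept. That scope and the `remove_files` helper name are assumptions; the actual function may behave differently.

```python
import tempfile
from pathlib import Path


def remove_files(folder_path):
    """Delete regular files directly inside folder_path; return how many were removed."""
    removed = 0
    # Materialize the listing first so deletion does not affect iteration.
    for entry in list(Path(folder_path).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed


# Demonstrate on a throwaway folder with two files and one subfolder.
with tempfile.TemporaryDirectory() as folder:
    root = Path(folder)
    (root / "a.txt").write_text("a", encoding="utf-8")
    (root / "b.txt").write_text("b", encoding="utf-8")
    (root / "nested").mkdir()
    removed = remove_files(root)
    remaining = sorted(entry.name for entry in root.iterdir())
print(removed, remaining)
```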

biothings.cli.utils.show_dumped_files(data_folder: str | Path, plugin_name: str) → None[source]

A helper function to show the dumped files in the data folder

biothings.cli.utils.show_hubdb_content()[source]

Output hubdb content in a pretty format.

biothings.cli.utils.show_source_build(build_instance: DataBuilder, build_configuration_name: str)[source]

A helper function to show the build information for the plugin source

async biothings.cli.utils.show_source_index(index_name: str, index_manager: IndexManager, elasticsearch_mapping: dict)[source]

A helper function to show the elasticsearch index for the plugin source

biothings.cli.utils.show_uploaded_sources(working_dir, plugin_name)[source]

A helper function to show the uploaded sources from given plugin.

biothings.cli.utils.write_mapping_to_file(output_file: str | Path, mapping: dict) → None[source]

Takes the generated mapping data and writes it to a local file
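A minimal sketch of writing mapping data to a local file, assuming the mapping is serialized as indented JSON. The output format, the `write_mapping` helper, and the sample mapping are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_mapping(output_file, mapping):
    """Write `mapping` to `output_file` as indented JSON, creating parent folders."""
    output_path = Path(output_file)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", encoding="utf-8") as handle:
        json.dump(mapping, handle, indent=2, sort_keys=True)


# Round-trip a toy mapping through a temporary file.
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "out" / "mapping.json"
    write_mapping(target, {"symbol": {"type": "keyword"}})
    reloaded = json.loads(target.read_text(encoding="utf-8"))
print(reloaded)
```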

biothings.cli.web_app

class biothings.cli.web_app.BaseHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]

Bases: RequestHandler

set_default_headers()[source]

Override this to set HTTP headers at the beginning of the request.

For example, this is the place to set a custom Server header. Note that setting such headers in the normal flow of request processing may not do what you want, since headers may be reset during error handling.

class biothings.cli.web_app.CLIApplication(db, table_space: List[str], **settings)[source]

Bases: Application

The main application class, which defines the routes and handlers.

class biothings.cli.web_app.DocHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]

Bases: BaseHandler

The handler for the detail view of a document, e.g. /<source>/<doc_id>/

async get(slug, item_id)[source]

class biothings.cli.web_app.HomeHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]

Bases: BaseHandler

The handler for the landing page, which lists all available routes

async get()[source]

class biothings.cli.web_app.QueryHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]

Bases: BaseHandler

The handler for returning a list of docs matching the query terms passed to the "q" parameter, e.g. /<source>/?q=<query>

async get(slug)[source]

async biothings.cli.web_app.get_available_routes(db, table_space) → Tuple[list, list][source]

Return a list of available URLs/routes based on the table_space and the actual collections in the database

biothings.cli.web_app.get_example_queries(db, table_space)[source]

Populate example queries for a given table_space

async biothings.cli.web_app.main(host, port, db, table_space)[source]

The main entrypoint for starting and running the cli server