ersilia.publish package

Submodules

ersilia.publish.deploy module

ersilia.publish.dockerhub module

class ersilia.publish.dockerhub.DockerHubUploader(model_id: str, config_json=None)[source]

Bases: ErsiliaBase

Class for uploading Docker images to DockerHub.

Parameters:
  • model_id (str) – The ID of the model to be uploaded.

  • config_json (str, optional) – Path to the configuration JSON file.

Examples

uploader = DockerHubUploader(
    model_id="model_id",
    config_json="path/to/config.json",
)
uploader.set_credentials(
    docker_user="username", docker_pwd="password"
)
uploader.upload()

build_image()[source]

Build the Docker image for the model.

set_credentials(docker_user: str, docker_pwd: str)[source]

Set DockerHub credentials.

Parameters:
  • docker_user (str) – DockerHub username.

  • docker_pwd (str) – DockerHub password.

upload()[source]

Upload the Docker image to DockerHub.

ersilia.publish.inspect module

class ersilia.publish.inspect.ModelInspector(model: str, dir: str, config_json=None)[source]

Bases: object

Class for inspecting model repositories.

Parameters:
  • model (str) – The ID of the model to be inspected.

  • dir (str) – The directory where the model repository is located.

  • config_json (str, optional) – Path to the configuration JSON file.

Examples

inspector = ModelInspector(
    model="model_id", dir="path/to/repo"
)
result = inspector.check_repo_exists()
result = inspector.check_complete_metadata()

BENTOML_FILES = ['model/framework/run.sh', 'README.md', 'LICENSE', 'Dockerfile', 'metadata.json', 'src/service.py', 'pack.py', '.gitignore', 'input.csv']
BENTOML_FOLDERS = ['model', 'src', '.github']
COMMON_FILES = ['model/framework/run.sh', 'README.md', 'LICENSE']
ERSILIAPACK_FILES = ['model/framework/run.sh', 'README.md', 'LICENSE', 'install.yml', 'metadata.yml', 'model/framework/examples/input.csv', 'model/framework/examples/output.csv', '.dockerignore', '.gitignore', '.gitattributes']
ERSILIAPACK_FOLDERS = ['model', '.github']
REQUIRED_FIELDS = ['Publication', 'Source Code', 'S3', 'DockerHub']
RUN_FILE = 'model/framework/run.sh'
check_complete_folder_structure()[source]

Check if the folder structure of the repository is complete.

Returns:

A namedtuple containing the success status and details of the check.

Return type:

Result

check_complete_metadata()[source]

Check if the metadata file is complete.

Returns:

A namedtuple containing the success status and details of the check.

Return type:

Result
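As a rough illustration of a completeness check over REQUIRED_FIELDS, the sketch below flags fields that are absent or empty in a metadata dictionary. The helper name missing_required_fields, the rule that an empty value counts as missing, and the wording of the details messages are assumptions made for this example; they are not taken from ModelInspector itself.

```python
from collections import namedtuple

# Mirrors the documented Result(success, details) namedtuple.
Result = namedtuple("Result", ["success", "details"])

# Field names copied from ModelInspector.REQUIRED_FIELDS.
REQUIRED_FIELDS = ["Publication", "Source Code", "S3", "DockerHub"]


def missing_required_fields(metadata: dict) -> Result:
    """Hypothetical helper: report required fields that are absent or empty."""
    missing = [field for field in REQUIRED_FIELDS if not metadata.get(field)]
    if missing:
        return Result(False, "Missing fields: " + ", ".join(missing))
    return Result(True, "Metadata is complete.")


# Example metadata with one valid field and one empty field.
result = missing_required_fields(
    {"Publication": "https://example.org/paper", "S3": ""}
)
print(result.success, result.details)
```

Returning the same two-field tuple for both outcomes lets callers treat every check uniformly, which matches how the check_* methods are documented here.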

check_computational_performance()[source]

Check the computational performance of the model.

Returns:

A namedtuple containing the success status and details of the check.

Return type:

Result

check_dependencies_are_valid()[source]

Check if the dependencies in the Dockerfile or install.yml are valid.

Returns:

A namedtuple containing the success status and details of the check.

Return type:

Result

check_no_extra_files()[source]

Check if there are no extra files in the repository.

Returns:

A namedtuple containing the success status and details of the check.

Return type:

Result

check_repo_exists()[source]

Check if the model repository exists.

Returns:

A namedtuple containing the success status and details of the check.

Return type:

Result

get_pack_type()[source]

Determine the packaging method of the model.

Returns:

The packaging method, either ‘bentoml’ or ‘fastapi’.

Return type:

str

validate_repo_structure()[source]

Validate the repository structure.

Returns:

List of missing items.
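A minimal sketch of how a list of missing items could be computed from the ERSILIAPACK_FILES and ERSILIAPACK_FOLDERS constants listed above. The helper find_missing_items is hypothetical and uses only the standard library; the real method may apply different rules.

```python
import os
import tempfile

# Copied from ModelInspector.ERSILIAPACK_FILES and ERSILIAPACK_FOLDERS.
ERSILIAPACK_FILES = [
    "model/framework/run.sh", "README.md", "LICENSE", "install.yml",
    "metadata.yml", "model/framework/examples/input.csv",
    "model/framework/examples/output.csv", ".dockerignore", ".gitignore",
    ".gitattributes",
]
ERSILIAPACK_FOLDERS = ["model", ".github"]


def find_missing_items(repo_dir, folders, files):
    """Hypothetical helper: list required folders and files absent from repo_dir."""
    missing = [d for d in folders if not os.path.isdir(os.path.join(repo_dir, d))]
    missing += [f for f in files if not os.path.isfile(os.path.join(repo_dir, f))]
    return missing


# Build a deliberately incomplete repository in a temporary directory.
with tempfile.TemporaryDirectory() as repo:
    os.makedirs(os.path.join(repo, "model", "framework"))
    for name in ("README.md", "LICENSE", "model/framework/run.sh"):
        open(os.path.join(repo, name), "w").close()
    missing = find_missing_items(repo, ERSILIAPACK_FOLDERS, ERSILIAPACK_FILES)
    print(missing)
```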

class ersilia.publish.inspect.Result(success, details)

Bases: tuple

details

Alias for field number 1

success

Alias for field number 0
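Because Result is a namedtuple, its fields can be read by name, by position, or by unpacking. The sketch below rebuilds an equivalent type with the standard library rather than importing it from ersilia; the details message is made up for the example.

```python
from collections import namedtuple

# Stand-in with the same field order as ersilia.publish.inspect.Result.
Result = namedtuple("Result", ["success", "details"])

result = Result(success=True, details="Repository found.")

# Access by field name or by index (success is field 0, details is field 1).
by_name = (result.success, result.details)
by_index = (result[0], result[1])

# Tuple unpacking works as well.
ok, details = result
print(ok, details)
```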

ersilia.publish.lake module

class ersilia.publish.lake.LakeStorer(model_id, config_json, credentials_json)[source]

Bases: ErsiliaBase

Class to handle storing data in the lake.

Parameters:
  • model_id (str) – The ID of the model.

  • config_json (dict) – Configuration in JSON format.

  • credentials_json (dict) – Credentials in JSON format.

store()[source]

Store data in the lake.

ersilia.publish.publish module

class ersilia.publish.publish.ModelPublisher(model_id, config_json, credentials_json)[source]

Bases: ErsiliaBase

Class for publishing models to GitHub.

Parameters:
  • model_id (str) – The ID of the model to be published.

  • config_json (str) – Path to the configuration JSON file.

  • credentials_json (str) – Path to the credentials JSON file.

create(public=True)[source]

Create a new GitHub repository for the model.

Parameters:

public (bool, optional) – Whether the repository should be public or private. Default is True.

docker()[source]

Handle Docker-related tasks.

dvc()[source]

Set up DVC (Data Version Control) for the model repository.

git_push(message=None)[source]

Push changes to the GitHub repository.

Parameters:

message (str, optional) – The commit message. If not provided, a default message is used.

push()[source]

Set up DVC and push changes to the GitHub repository.

rebase()[source]

Rebase the model repository with the template repository.

test()[source]

Test the publishing process.

ersilia.publish.rebase module

class ersilia.publish.rebase.TemplateRebaser(model_id: str, template_repo='eos-template', config_json=None, credentials_json=None)[source]

Bases: ErsiliaBase

Class for rebasing model repositories with a template repository.

Parameters:
  • model_id (str) – The ID of the model to be rebased.

  • template_repo (str, optional) – The name of the template repository. Default is ‘eos-template’.

  • config_json (str, optional) – Path to the configuration JSON file.

  • credentials_json (str, optional) – Path to the credentials JSON file.

clean()[source]

Clean up temporary directories.

clone_current_model()[source]

Clone the current model repository.

clone_template()[source]

Clone the template repository.

dvc_part()[source]

Set up DVC (Data Version Control) for the model repository.

rebase()[source]

Rebase the model repository with the template repository.

ersilia.publish.s3 module

class ersilia.publish.s3.S3BucketRepoUploader(model_id: str, config_json=None)[source]

Bases: ErsiliaBase

Class for uploading model repositories to an S3 bucket.

Parameters:
  • model_id (str) – The ID of the model to be uploaded.

  • config_json (str, optional) – Path to the configuration JSON file.

Examples

uploader = S3BucketRepoUploader(
    model_id="model_id",
    config_json="path/to/config.json",
)
uploader.set_credentials(
    aws_access_key_id="access_key",
    aws_secret_access_key="secret_key",
)
uploader.upload()

set_credentials(aws_access_key_id: str, aws_secret_access_key: str)[source]

Set AWS credentials.

Parameters:
  • aws_access_key_id (str) – AWS access key ID.

  • aws_secret_access_key (str) – AWS secret access key.

upload(repo_path=None)[source]

Upload the model repository to the S3 bucket.

Parameters:

repo_path (str, optional) – Path to the local repository. If not provided, the repository will be cloned from GitHub.

upload_zip(repo_path=None)[source]

Upload the zipped model repository to the S3 bucket.

Parameters:

repo_path (str, optional) – Path to the local repository. If not provided, the repository will be cloned from GitHub.

ersilia.publish.store module

Classes used to store a model once its development is complete.

class ersilia.publish.store.ModelRemover(config_json=None, credentials_json=None)[source]

Bases: ErsiliaBase

Class for removing models from OSF.

Parameters:
  • config_json (str, optional) – Path to the configuration JSON file.

  • credentials_json (str, optional) – Path to the credentials JSON file.

Examples

remover = ModelRemover(
    config_json="path/to/config.json",
    credentials_json="path/to/credentials.json",
)
remover.remove(model_id="model_id")

remove(model_id: str)[source]

Remove model from OSF.

Parameters:

model_id (str) – The ID of the model to be removed.

class ersilia.publish.store.ModelStorager(config_json=None, credentials_json=None, overwrite=True)[source]

Bases: ErsiliaBase

Class for storing models in the local data directory and in OSF.

Parameters:
  • config_json (str, optional) – Path to the configuration JSON file.

  • credentials_json (str, optional) – Path to the credentials JSON file.

  • overwrite (bool, optional) – Whether to overwrite existing files. Default is True.

store(path: str, model_id: str)[source]

Store model in the local data directory and in OSF.

Parameters:
  • path (str) – Path to the model directory.

  • model_id (str) – The ID of the model to be stored.

ersilia.publish.test module

class ersilia.publish.test.CheckService(logger: Any, model_id: str, dir: str, from_github: bool, from_dockerhub: bool, ios: IOService)[source]

Bases: object

Service for performing various checks on the model.

Parameters:
  • logger (logging.Logger) – Logger for logging messages.

  • model_id (str) – Identifier of the model.

  • dir (str) – Directory where the model repository is located.

  • from_github (bool) – Flag indicating whether to fetch the repository from GitHub.

  • from_dockerhub (bool) – Flag indicating whether to fetch the repository from DockerHub.

  • ios (IOService) – Instance of IOService for handling input/output operations.

Examples

check_service = CheckService(
    logger=logger,
    model_id="model_id",
    dir="/path/to/dir",
    from_github=True,
    from_dockerhub=False,
    ios=ios,
)
check_service.check_files()

INPUT_SHAPE = {'List', 'List of Lists', 'Pair', 'Pair of Lists', 'Single'}
MODEL_OUTPUT = {'Boolean', 'Compound', 'Descriptor', 'Distance', 'Experimental value', 'Image', 'Other value', 'Probability', 'Protein', 'Score', 'Text'}
MODEL_TASKS = {'Classification', 'Clustering', 'Dimensionality reduction', 'Generative', 'Regression', 'Representation', 'Similarity'}
OUTPUT_SHAPE = {'Flexible List', 'List', 'Matrix', 'Serializable Object', 'Single'}
check_consistent_output(**kwargs)
check_example_input(**kwargs)
check_files()[source]

Check the existence of required files for the model.

check_information(**kwargs)
check_model_output_content(run_example, run_model)[source]
get_inputs(run_example, types)[source]
validate_file_content(file_path, input_type)[source]
class ersilia.publish.test.CheckStrategy(check_function, success_key, details_key)[source]

Bases: object

Executes a strategy for checking inspect commands.

Parameters:
  • check_function (callable) – The function to check.

  • success_key (str) – The key for success.

  • details_key (str) – The key for details.

execute()[source]

Execute the check strategy.

Returns:

The results of the check.

Return type:

dict
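The following sketch shows one way such a strategy could wrap a check function. It assumes the check returns an object with success and details attributes and that execute() maps them onto the two configured keys. The class SketchCheckStrategy, the example check, and the key names are hypothetical; this is not the library's implementation.

```python
from collections import namedtuple

Result = namedtuple("Result", ["success", "details"])


class SketchCheckStrategy:
    """Illustrative stand-in for CheckStrategy."""

    def __init__(self, check_function, success_key, details_key):
        self.check_function = check_function
        self.success_key = success_key
        self.details_key = details_key

    def execute(self):
        # Assumption: the check returns an object with `success` and `details`.
        result = self.check_function()
        return {
            self.success_key: result.success,
            self.details_key: result.details,
        }


def check_repo_exists():
    # Hypothetical check that always passes.
    return Result(True, "Repository found.")


strategy = SketchCheckStrategy(check_repo_exists, "repo_exists", "repo_exists_details")
report = strategy.execute()
print(report)
```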

class ersilia.publish.test.Checks(value)[source]

Bases: Enum

Enum for different check types.

CONSISTENCY = 'Model Output Was Consistent'
DIR_SIZE = 'Directory Size Mb'
ENV_SIZE = 'Environment Size Mb'
IMAGE_SIZE = 'Image Size Mb'
INCONSISTENCY = 'Inconsistent Output Detected'
MODEL_CONSISTENCY = 'Check Consistency of Model Output'
PREDEFINED_EXAMPLE = 'Check Predefined Example Input'
RUN_BASH = 'RMSE-MEAN'
SIZE_CACL_FAILED = 'Size Calculation Failed'
SIZE_CACL_SUCCESS = 'Size Successfully Calculated'
class ersilia.publish.test.IOService(logger, model_id: str, dir: str)[source]

Bases: object

Service for handling input/output operations related to model testing.

Parameters:
  • logger (logging.Logger) – Logger for logging messages.

  • model_id (str) – Identifier of the model.

  • dir (str) – Directory where the model repository is located.

Examples

ios = IOService(
    logger=logger,
    model_id="model_id",
    dir="/path/to/dir",
)
ios.read_information()

BENTOML_FILES = ['Dockerfile', 'metadata.json', 'model/framework/run.sh', 'src/service.py', 'pack.py', 'README.md', 'LICENSE']
ERSILIAPACK_FILES = ['install.yml', 'metadata.yml', 'model/framework/examples/input.csv', 'model/framework/examples/output.csv', 'model/framework/run.sh', 'README.md', 'LICENSE']
RUN_FILE = 'model/framework/run.sh'
calculate_directory_size(path: str) → int[source]

Calculate the size of a directory.

Parameters:

path (str) – The path to the directory.

Returns:

The size of the directory.

Return type:

int
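The unit of the returned size is not stated here. As a sketch of the general technique, the hypothetical helper below walks a directory tree with the standard library and sums file sizes in bytes, then converts to megabytes for display. It is an illustration under those assumptions and may differ from what IOService does.

```python
import os
import tempfile


def directory_size_bytes(path: str) -> int:
    """Hypothetical helper: total size in bytes of regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so only files stored in this tree are counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


# Create a small tree with known sizes: 1024 + 2048 bytes.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "nested"))
    with open(os.path.join(tmp, "a.bin"), "wb") as fh:
        fh.write(b"x" * 1024)
    with open(os.path.join(tmp, "nested", "b.bin"), "wb") as fh:
        fh.write(b"y" * 2048)
    size = directory_size_bytes(tmp)
    print(size, "bytes =", size / (1024 * 1024), "MB")
```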

calculate_image_size(tag='latest')[source]

Calculate the size of a Docker image.

Parameters:

tag (str, optional) – The tag of the Docker image (default is ‘latest’).

Returns:

The size of the Docker image.

Return type:

str

collect_and_save_json(results, output_file)[source]

Helper function to collect JSON results and save them to a file.

get_conda_env_size()[source]

Get the size of the Conda environment for the model.

Returns:

The size of the Conda environment in megabytes.

Return type:

int

Raises:

Exception – If there is an error calculating the size.

get_directories_sizes(**kwargs)
get_env_sizes(**kwargs)
get_file_requirements() → List[str][source]

Get the list of required files for the model.

Returns:

List of required files.

Return type:

List[str]

Raises:

ValueError – If the model type is unsupported.
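A sketch of the selection logic described above, using the BENTOML_FILES and ERSILIAPACK_FILES lists from this class. The helper name file_requirements, the string values "bentoml" and "fastapi", and the pairing of "fastapi" with ERSILIAPACK_FILES are assumptions for illustration.

```python
# Copied from IOService.BENTOML_FILES and IOService.ERSILIAPACK_FILES.
BENTOML_FILES = [
    "Dockerfile", "metadata.json", "model/framework/run.sh",
    "src/service.py", "pack.py", "README.md", "LICENSE",
]
ERSILIAPACK_FILES = [
    "install.yml", "metadata.yml", "model/framework/examples/input.csv",
    "model/framework/examples/output.csv", "model/framework/run.sh",
    "README.md", "LICENSE",
]


def file_requirements(pack_type: str) -> list:
    """Hypothetical helper mapping a packaging method to its required files."""
    if pack_type == "bentoml":
        return list(BENTOML_FILES)
    if pack_type == "fastapi":
        # Assumption: FastAPI-packaged models use the ersilia-pack file list.
        return list(ERSILIAPACK_FILES)
    raise ValueError(f"Unsupported model type: {pack_type}")


print(file_requirements("fastapi"))
```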

static get_model_type(model_id: str, repo_path: str) → str[source]

Get the type of the model based on the repository contents.

Parameters:
  • model_id (str) – Identifier of the model.

  • repo_path (str) – Path to the model repository.

Returns:

The type of the model (e.g., PACK_METHOD_BENTOML, PACK_METHOD_FASTAPI).

Return type:

str

read_information() → dict[source]

Read the information file for the model.

Returns:

The contents of the information file.

Return type:

dict

Raises:

FileNotFoundError – If the information file does not exist.

update_metadata(json_data)[source]

Processes JSON/YAML metadata to extract size and performance information, then updates the metadata.

Parameters:

json_data (dict) – Report data from the command output.

Returns:

Updated metadata containing computed performance and size information.

Return type:

dict

class ersilia.publish.test.InspectService(dir: str, model: str, remote: bool = False, config_json: str | None = None, credentials_json: str | None = None)[source]

Bases: ErsiliaBase

Service for inspecting models and their configurations.

Parameters:
  • dir (str) – Directory where the model is located.

  • model (str) – Model identifier.

  • remote (bool, optional) – Flag indicating whether the model is remote.

  • config_json (str, optional) – Path to the configuration JSON file.

  • credentials_json (str, optional) – Path to the credentials JSON file.

Examples

inspector = InspectService(
    dir="/path/to/model", model="model_id"
)
results = inspector.run()

run(check_keys: list | None = None) → dict[source]

Run the inspection checks on the specified model.

Parameters:

check_keys (list, optional) – A list of check keys to execute. If None, all checks will be executed.

Returns:

A dictionary containing the results of the inspection checks.

Return type:

dict

Raises:
  • ValueError – If the model is not specified.

  • KeyError – If any of the specified keys do not exist.
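The check_keys behaviour can be pictured as a lookup in a registry of named checks: run everything when no keys are given, and fail with KeyError when an unknown key is requested. The function run_checks and the key names below are hypothetical; the actual keys accepted by InspectService.run are not listed here.

```python
def run_checks(registry: dict, check_keys=None) -> dict:
    """Hypothetical dispatcher: run all checks, or only those named in check_keys."""
    if check_keys is None:
        check_keys = list(registry)
    unknown = [key for key in check_keys if key not in registry]
    if unknown:
        raise KeyError(f"Unknown check keys: {unknown}")
    return {key: registry[key]() for key in check_keys}


# Hypothetical check names and results, for illustration only.
registry = {
    "folder_structure": lambda: True,
    "metadata": lambda: False,
}

all_results = run_checks(registry)
subset = run_checks(registry, ["metadata"])
print(all_results, subset)
```

Validating every requested key before running any check means a typo fails fast instead of after a long-running inspection.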

class ersilia.publish.test.ModelTester(model, level, from_dir, from_github, from_dockerhub, from_s3, version, shallow, deep, as_json)[source]

Bases: ErsiliaBase

Class to handle model testing. Initializes the model tester services and runs the tests.

Parameters:
  • model (str) – The ID of the model.

  • level (str) – The level of testing.

  • from_dir (str) – The directory for the model.

  • from_github (bool) – Flag indicating whether to fetch the repository from GitHub.

  • from_dockerhub (bool) – Flag indicating whether to fetch the repository from DockerHub.

  • from_s3 (bool) – Flag indicating whether to fetch the repository from S3.

  • version (str) – Version of the model.

  • shallow (bool) – Flag indicating whether to perform shallow checks.

  • deep (bool) – Flag indicating whether to perform deep checks.

  • as_json (bool) – Flag indicating whether to output results as JSON.

run()[source]

Run the model tester.

class ersilia.publish.test.Options(value)[source]

Bases: Enum

Enum for different options.

BASE = 'base'
INPUT_CSV = 'input.csv'
INPUT_TYPES = ['str', 'list', 'csv']
NUM_SAMPLES = 5
OUTPUT1_CSV = 'output1.csv'
OUTPUT2_CSV = 'output2.csv'
OUTPUT_CSV = 'result.csv'
OUTPUT_FILES = ['file.csv', 'file.h5', 'file.json']
class ersilia.publish.test.RunnerService(model_id: str, logger, ios_service: IOService, checkup_service: CheckService, setup_service: SetupService, level: str, dir: str, model_path: Callable, from_github: bool, from_s3: bool, from_dockerhub: bool, version: str, shallow: bool, deep: bool, as_json: bool, inspector: InspectService)[source]

Bases: object

Service for running model tests and checks.

Parameters:
  • model_id (str) – Identifier of the model.

  • logger (logging.Logger) – Logger for logging messages.

  • ios_service (IOService) – Instance of IOService for handling input/output operations.

  • checkup_service (CheckService) – Instance of CheckService for performing various checks on the model.

  • setup_service (SetupService) – Instance of SetupService for setting up the environment and fetching the model repository.

  • level (str) – Level of checks to perform.

  • dir (str) – Directory where the model repository is located.

  • model_path (Callable) – Callable to get the model path.

  • from_github (bool) – Flag indicating whether to fetch the repository from GitHub.

  • from_s3 (bool) – Flag indicating whether to fetch the repository from S3.

  • from_dockerhub (bool) – Flag indicating whether to fetch the repository from DockerHub.

  • version (str) – Version of the model.

  • shallow (bool) – Flag indicating whether to perform shallow checks.

  • deep (bool) – Flag indicating whether to perform deep checks.

  • as_json (bool) – Flag indicating whether to output results as JSON.

  • inspector (InspectService) – Instance of InspectService for inspecting models and their configurations.

fetch()[source]

Fetch the model repository from the specified directory.

run()[source]

Run the model tests and checks.

Raises:

ImportError – If required packages are missing.

run_bash(**kwargs)
run_example(n_samples: int, file_name: str | None = None, simple: bool = True, try_predefined: bool = False)[source]

Generate example input samples for the model.

Parameters:
  • n_samples (int) – Number of samples to generate.

  • file_name (str, optional) – Name of the file to save the samples.

  • simple (bool, optional) – Flag indicating whether to generate simple samples.

  • try_predefined (bool, optional) – Flag indicating whether to try predefined samples.

Returns:

List of generated input samples.

Return type:

list

run_model(inputs: list, output: str, batch: int)[source]

Run the model with the given input and output parameters.

Parameters:
  • inputs (list) – List of input samples.

  • output (str) – Path to the output file.

  • batch (int) – Batch size for running the model.

Returns:

The output of the command.

Return type:

str

class ersilia.publish.test.STATUS_CONFIGS(value)[source]

Bases: Enum

Enum for status configurations.

FAILED = ('FAILED', 'red', '✘')
NA = ('N/A', 'dim', '~')
PASSED = ('PASSED', 'green', '✔')
SUCCESS = ('SUCCESS', 'green', '★')
WARNING = ('WARNING', 'yellow', '⚠')
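Each member's value looks like a (label, color, icon) tuple. Under that assumption, the sketch below unpacks a value to build a status string. StatusSketch copies two members for illustration, and the bracketed color markup is an assumed output format, not something this page documents.

```python
from enum import Enum


class StatusSketch(Enum):
    """Illustrative copy of two STATUS_CONFIGS members."""

    PASSED = ("PASSED", "green", "✔")
    FAILED = ("FAILED", "red", "✘")


def render(status: StatusSketch) -> str:
    # Assumption: the tuple is (label, color, icon), as the values suggest.
    label, color, icon = status.value
    return f"[{color}]{icon} {label}[/{color}]"


rendered = render(StatusSketch.PASSED)
print(rendered)
```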
class ersilia.publish.test.SetupService(model_id: str, dir: str, from_github: bool, from_s3: bool, logger: Any)[source]

Bases: object

Service for setting up the environment and fetching the model repository.

Parameters:
  • model_id (str) – Identifier of the model.

  • dir (str) – Directory where the model repository will be cloned.

  • from_github (bool) – Flag indicating whether to fetch the repository from GitHub.

  • from_s3 (bool) – Flag indicating whether to fetch the repository from S3.

  • logger (Any) – Logger for logging messages.

BASE_URL = 'https://github.com/ersilia-os/'
static get_conda_env_location(model_id: str, logger) → str[source]

Get the location of the Conda environment for the model.

Parameters:
  • model_id (str) – Identifier of the model.

  • logger (logging.Logger) – Logger for logging messages.

Returns:

The location of the Conda environment.

Return type:

str

Raises:

subprocess.CalledProcessError – If the command to list Conda environments returns a non-zero exit code.

get_model()[source]
static run_command(command: str, logger, capture_output: bool = False, shell: bool = True) → str[source]

Run a shell command.

Parameters:
  • command (str) – The command to run.

  • logger (logging.Logger) – Logger for logging messages.

  • capture_output (bool, optional) – Flag indicating whether to capture the command output.

  • shell (bool, optional) – Flag indicating whether to run the command in the shell.

Returns:

The output of the command.

Return type:

str

Raises:

subprocess.CalledProcessError – If the command returns a non-zero exit code.
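The documented contract (return the command output, raise subprocess.CalledProcessError on a non-zero exit code) matches what subprocess.run does with check=True. The sketch below shows that pattern with a hypothetical helper, run_command_sketch, which omits the logger argument; it is not the SetupService implementation.

```python
import subprocess


def run_command_sketch(command: str, capture_output: bool = False, shell: bool = True) -> str:
    """Hypothetical helper: run a command and return its stdout (empty if not captured)."""
    completed = subprocess.run(
        command,
        shell=shell,
        check=True,  # raise CalledProcessError on a non-zero exit code
        capture_output=capture_output,
        text=True,
    )
    return completed.stdout or ""


output = run_command_sketch("echo hello", capture_output=True)
print(output.strip())

# A failing command surfaces as CalledProcessError with the exit code attached.
exit_code = None
try:
    run_command_sketch("exit 3")
except subprocess.CalledProcessError as err:
    exit_code = err.returncode
    print("command failed with exit code", exit_code)
```

Note that shell=True passes the string to the system shell, so commands built from untrusted input should be avoided or quoted carefully.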

class ersilia.publish.test.TableConfig(title: str, headers: List[str])[source]

Bases: object

Configuration for a table.

headers: List[str]
title: str
class ersilia.publish.test.TableType(value)[source]

Bases: Enum

Enum for different table types.

COMPUTATIONAL_PERFORMANCE = 'Computational Performance Summary'
CONSISTENCY_BASH = 'Consistency Summary Between Ersilia and Bash Execution Outputs'
DEPENDECY_CHECK = 'Dependency Check'
FINAL_RUN_SUMMARY = 'Test Run Summary'
INSPECT_SUMMARY = 'Inspect Summary'
MODEL_DIRECTORY_SIZES = 'Model Directory Sizes'
MODEL_ENV_SIZES = 'Model Environment Sizes'
MODEL_FILE_CHECKS = 'Model File Checks'
MODEL_INFORMATION_CHECKS = 'Model Metadata Checks'
MODEL_OUTPUT = 'Model Output Content Validation Summary'
RUNNER_CHECKUP_STATUS = 'Runner Checkup Status'
SHALLOW_CHECK_SUMMARY = 'Validation and Size Check Summary'

Module contents