ersilia.core package¶
Submodules¶
ersilia.core.base module¶
ersilia.core.model module¶
- class ersilia.core.model.ErsiliaModel(model: str, output_source: OutputSource | None = None, service_class: str | None = None, config_json: dict | None = None, credentials_json: dict | None = None, verbose: bool | None = None, fetch_if_not_available: bool = True, preferred_port: int | None = None, cache: bool = True, maxmemory: float | None = None)[source]¶
Bases:
ErsiliaBase
ErsiliaModel class for managing and interacting with different models.
This class provides methods to fetch, serve, run, and close models from a model hub. It also supports tracking runs and handling various input and output formats.
- Parameters:
model (str) – The identifier of the model.
output_source (OutputSource, optional) – The source of the output, by default OutputSource.LOCAL_ONLY.
service_class (str, optional) – The service class, by default None.
config_json (dict, optional) – Configuration in JSON format, by default None.
credentials_json (dict, optional) – Credentials in JSON format, by default None.
verbose (bool, optional) – Verbosity flag, by default None.
fetch_if_not_available (bool, optional) – Whether to fetch the model if not available locally, by default True.
preferred_port (int, optional) – Preferred port for serving the model, by default None.
track_runs (bool, optional) – Whether to track runs, by default False.
cache (bool, optional) – Whether to use the Redis cache, by default True.
maxmemory (float, optional) – Fraction of memory that Redis may use, by default None.
Examples
Fetching a model (this requires asyncio, since fetch is a coroutine):
model = ErsiliaModel(model="model_id")
model.fetch()
Serving a model:
model = ErsiliaModel(model="model_id")
model.serve()
Running a model:
model = ErsiliaModel(model="model_id")
result = model.run(
    input="input_data.csv",
    output="output_data.csv",
)
Closing a model:
model = ErsiliaModel(model="model_id")
model.close()
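The serve, run, and close steps are typically combined in a single script. The sketch below is not taken from the Ersilia documentation; it shows one way to make sure close() is called even if serve() or run() raises, using contextlib.closing from the standard library. StandInModel is a hypothetical placeholder that only records method calls so the snippet runs without Ersilia installed; in real use it would presumably be replaced by ErsiliaModel(model="model_id").

```python
from contextlib import closing


class StandInModel:
    """Hypothetical stand-in exposing the serve/run/close methods used above."""

    def __init__(self, model):
        self.model = model
        self.calls = []

    def serve(self):
        self.calls.append("serve")

    def run(self, input, output):
        self.calls.append("run")
        # Assumption: echo the output path; the real return value may differ.
        return output

    def close(self):
        self.calls.append("close")


def run_once(model, input_path, output_path):
    # closing() calls model.close() on exit, even if serve() or run() raises.
    with closing(model) as m:
        m.serve()
        return m.run(input=input_path, output=output_path)


model = StandInModel(model="model_id")
result = run_once(model, "input_data.csv", "output_data.csv")
```

Because closing() only requires a close() method, the same helper should work with any object that follows the serve/run/close pattern shown in the examples.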
- api(api_name=None, input=None, output=None, batch_size=100)[source]¶
Run the specified API with the given input and output.
This method executes the specified API (usually the run endpoint) using the provided input and output parameters. It handles file splitting and caching if necessary.
- Parameters:
api_name (str, optional) – The name of the API to run, by default None.
input (str, optional) – The input data, by default None.
output (str, optional) – The output data, by default None.
batch_size (int, optional) – The batch size, by default DEFAULT_BATCH_SIZE (100 in the signature above).
- Returns:
The result of the API run.
- Return type:
Any
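To illustrate what splitting an input by batch_size can look like, here is a minimal sketch. It is not Ersilia's implementation: iter_batches is a hypothetical helper, and the input list is placeholder data. It only shows the general idea of cutting a long input into chunks of at most batch_size items.

```python
from itertools import islice


def iter_batches(items, batch_size=100):
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 250 placeholder inputs, split with the default batch size from the signature.
smiles = ["C" * n for n in range(1, 251)]
batches = list(iter_batches(smiles, batch_size=100))
batch_lengths = [len(b) for b in batches]
```

With 250 inputs and a batch size of 100, the last batch is shorter than the others, so downstream code should not assume that every batch is full.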
- api_task(api_name, input, output, batch_size)[source]¶
Run the specified API task with the given input and output.
This method executes the specified API task using the provided input and output parameters. It returns the result of the API task, which can be a generator, file, or other data types.
- Parameters:
api_name (str) – The name of the API to run.
input (str) – The input data.
output (str) – The output data.
batch_size (int) – The batch size.
- Returns:
The result of the API task.
- Return type:
Any
- close()[source]¶
Close the model services and session.
This method stops the model service and closes the session.
- example(n_samples, file_name=None, simple=True)[source]¶
Generate example data for the model.
This method generates example data for the model using the specified number of samples. The generated data can be saved to a file if a file name is provided.
- Parameters:
n_samples (int) – The number of samples to generate.
file_name (str, optional) – The file name to save the examples, by default None.
simple (bool, optional) – Whether to generate simple examples, by default True.
- Returns:
The generated example data (for example, a file path or a list of SMILES strings).
- Return type:
Any
- get_apis()[source]¶
Get the list of available APIs for the model.
This method retrieves the list of APIs that are available for the model.
- Returns:
The list of available APIs.
- Return type:
list
- info()[source]¶
Get the information of the model.
This method reads the information file of the model and returns its content as a dictionary.
- Returns:
The information of the model.
- Return type:
dict
- property input_type¶
Get the input type of the model.
This property reads the input type information from the model’s card file and returns it as a list of input types.
- Returns:
The list of input types (such as compounds).
- Return type:
list
- is_valid()[source]¶
Check if the model identifier is valid.
This method verifies if the provided model identifier is valid by checking its existence and validity in the model hub.
- Returns:
True if the model identifier is valid, False otherwise.
- Return type:
bool
- property meta¶
Get the metadata of the model.
This property returns the metadata of the model, which provides additional information about the model, such as its description, version, and author.
- Returns:
The metadata of the model.
- Return type:
dict
- property output_type¶
Get the output type of the model.
This property reads the output type information from the model’s card file and returns it as a list of output types.
- Returns:
The list of output types (such as descriptor, score, or probability).
- Return type:
list
- property paths¶
Get the paths related to the model.
This property returns a dictionary containing various paths related to the model, such as the destination path, repository path, and BentoML path.
- Returns:
The dictionary containing paths.
- Return type:
dict
- run(**kwargs)¶
- property schema¶
Get the schema of the model.
This property returns the schema of the model, which defines the structure and format of the model’s input and output data.
- Returns:
The schema of the model.
- Return type:
dict
- serve(**kwargs)¶
- setup()[source]¶
Set up the necessary requirements for the model.
This method ensures that the required dependencies and resources for the model are available.
- property size¶
Get the size of the model.
This property reads the size information from the model’s size file and returns it as a dictionary.
- Returns:
The size of the model.
- Return type:
dict
ersilia.core.modelbase module¶
- class ersilia.core.modelbase.ModelBase(**kwargs)[source]¶
Bases:
ErsiliaBase
Base class for managing models.
This class provides foundational functionality for handling models, including initialization, validation, and checking local availability.
- Parameters:
model_id_or_slug (str, optional) – The model identifier or slug, by default None.
repo_path (str, optional) – The repository path, by default None.
config_json (dict, optional) – Configuration in JSON format, by default None.
- is_available_locally()[source]¶
Check whether the model is available locally, based on either the status file or DockerHub.
- Returns:
True if the model is available locally, False otherwise.
- Return type:
bool
ersilia.core.session module¶
- class ersilia.core.session.Session(config_json)[source]¶
Bases:
ErsiliaBase
Session class for managing model sessions.
This class provides functionality to manage sessions, including opening, closing, and updating session information. Sessions are essential for tracking the state and usage of models, ensuring that all necessary information is stored and can be retrieved when needed.
- Parameters:
config_json (dict) – Configuration in JSON format.
- close()[source]¶
Close the current session.
This method removes the session file, effectively closing the session.
- current_identifier()[source]¶
Get the current identifier from the session.
This method retrieves the current identifier from the session data.
- Returns:
The current identifier, or None if no session data is available.
- Return type:
str or None
- current_model_id()[source]¶
Get the current model ID from the session.
This method retrieves the current model ID from the session data.
- Returns:
The current model ID, or None if no session data is available.
- Return type:
str or None
- current_output_source()[source]¶
Get the current output source from the session.
This method retrieves the current output source from the session data.
- Returns:
The current output source, or None if no session data is available.
- Return type:
str or None
- current_service_class()[source]¶
Get the current service class from the session.
This method retrieves the current service class from the session data.
- Returns:
The current service class, or None if no session data is available.
- Return type:
str or None
- get()[source]¶
Get the current session data.
This method retrieves the current session data from the session file. The session file is a JSON file that contains information about the current session, such as the model ID, timestamp, identifier, tracking status, service class, and output source.
- Returns:
The session data, or None if no session file exists.
- Return type:
dict or None
- open(model_id, track_runs)[source]¶
Open a new session for the specified model.
This method creates a new session for the specified model and saves the session data.
- Parameters:
model_id (str) – The identifier of the model.
track_runs (bool) – Whether to track runs.
- register_output_source(output_source)[source]¶
Register the output source in the session.
This method updates the session data with the provided output source.
- Parameters:
output_source (str) – The output source to register.
- register_service_class(service_class)[source]¶
Register the service class in the session.
This method updates the session data with the provided service class.
- Parameters:
service_class (str) – The service class to register.
- tracking_status()[source]¶
Get the tracking status from the session.
This method retrieves the tracking status from the session data.
- Returns:
The tracking status, or None if no session data is available.
- Return type:
bool or None
- update_cpu_time(cpu_time)[source]¶
Update the total CPU time usage in the session data by adding the provided CPU time.
- Parameters:
cpu_time (float) – The CPU time to add.
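The session behavior described above (a JSON file that is created on open, read on get, updated in place, and removed on close) can be sketched as follows. This is an illustration under stated assumptions, not the Session implementation: SessionFileSketch, the file location, and the key names such as total_cpu_time are all hypothetical.

```python
import json
import os
import tempfile
import time


class SessionFileSketch:
    """Hypothetical JSON-backed session; key names are illustrative only."""

    def __init__(self, path):
        self.path = path

    def get(self):
        # Mirror the documented behavior: None when no session file exists.
        if not os.path.exists(self.path):
            return None
        with open(self.path, "r") as f:  # built-in open, not the method below
            return json.load(f)

    def _write(self, data):
        with open(self.path, "w") as f:
            json.dump(data, f, indent=4)

    def open(self, model_id, track_runs):
        self._write({
            "model_id": model_id,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
            "track_runs": track_runs,
            "total_cpu_time": 0.0,
        })

    def update_cpu_time(self, cpu_time):
        data = self.get()
        if data is None:
            return
        # Accumulate rather than overwrite, as the description above states.
        data["total_cpu_time"] = data.get("total_cpu_time", 0.0) + cpu_time
        self._write(data)

    def close(self):
        # Closing the session means removing the session file.
        if os.path.exists(self.path):
            os.remove(self.path)


with tempfile.TemporaryDirectory() as tmp:
    session = SessionFileSketch(os.path.join(tmp, "session.json"))
    before_open = session.get()
    session.open(model_id="model_id", track_runs=False)
    session.update_cpu_time(1.5)
    session.update_cpu_time(0.5)
    while_open = session.get()
    session.close()
    after_close = session.get()
```

Keeping all state in one small file makes every getter a read of that file, which is consistent with the documented getters returning None when no session data is available.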
ersilia.core.tracking module¶
- class ersilia.core.tracking.AwsConfig[source]¶
Bases:
ErsiliaBase
This class is responsible for retrieving AWS credentials from environment variables or from the AWS configuration files. It first checks the environment for AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION. If they are not found there, it looks for them in ~/.aws/credentials and ~/.aws/config. If the credentials are found, they are returned as a dictionary.
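The lookup order described for AwsConfig (environment variables first, then the files under ~/.aws) can be sketched with the standard library alone. This is not the AwsConfig implementation: find_aws_settings, its parameters, and the keys of the returned dictionary are hypothetical, and the key values in the demo are obvious placeholders. The option names aws_access_key_id, aws_secret_access_key, and region are the ones used by the standard AWS shared files.

```python
import configparser
import os
import tempfile


def find_aws_settings(env=None, credentials_path="~/.aws/credentials",
                      config_path="~/.aws/config", profile="default"):
    """Return AWS settings as a dict, or None if the key pair is missing."""
    env = os.environ if env is None else env
    access_key = env.get("AWS_ACCESS_KEY_ID")
    secret_key = env.get("AWS_SECRET_ACCESS_KEY")
    region = env.get("AWS_REGION")

    # Fall back to the shared credentials file for a missing key pair.
    if not (access_key and secret_key):
        creds = configparser.ConfigParser()
        creds.read(os.path.expanduser(credentials_path))  # missing files are skipped
        if creds.has_section(profile):
            access_key = access_key or creds.get(profile, "aws_access_key_id", fallback=None)
            secret_key = secret_key or creds.get(profile, "aws_secret_access_key", fallback=None)

    # Fall back to the shared config file for a missing region.
    if not region:
        conf = configparser.ConfigParser()
        conf.read(os.path.expanduser(config_path))
        # Non-default profiles are stored as "[profile NAME]" in the config file.
        section = profile if profile == "default" else f"profile {profile}"
        if conf.has_section(section):
            region = conf.get(section, "region", fallback=None)

    if not (access_key and secret_key):
        return None
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region": region,
    }


with tempfile.TemporaryDirectory() as tmp:
    cred_file = os.path.join(tmp, "credentials")
    conf_file = os.path.join(tmp, "config")
    with open(cred_file, "w") as f:
        f.write("[default]\naws_access_key_id = FILE_KEY\naws_secret_access_key = FILE_SECRET\n")
    with open(conf_file, "w") as f:
        f.write("[default]\nregion = eu-central-1\n")

    env_values = {
        "AWS_ACCESS_KEY_ID": "ENV_KEY",
        "AWS_SECRET_ACCESS_KEY": "ENV_SECRET",
        "AWS_REGION": "us-east-1",
    }
    from_env = find_aws_settings(env_values, cred_file, conf_file)
    from_files = find_aws_settings({}, cred_file, conf_file)
    missing = find_aws_settings({}, os.path.join(tmp, "absent"), os.path.join(tmp, "absent2"))
```

Passing the environment and file paths as arguments keeps the lookup testable without touching the real ~/.aws directory.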
- class ersilia.core.tracking.RunTracker(model_id, config_json)[source]¶
Bases:
ErsiliaBase
This class is responsible for tracking model runs. It calculates the desired metadata from a model’s inputs, outputs, and other run-specific features, then uploads it to AWS for ingestion into Ersilia’s Splunk dashboard.
- Parameters:
model_id (str) – The identifier of the model.
config_json (dict) – Configuration in JSON format.
- create_event_data(**kwargs)¶
- get_file_sizes(input_file, output_file)[source]¶
Calculate the size of the input and output dataframes.
- Parameters:
input_file (str) – File path containing the input data.
output_file (str) – File path containing the output data.
- Returns:
Dictionary containing the input size, output size.
- Return type:
dict
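As a rough illustration of a size calculation over two file paths, the sketch below measures on-disk size with os.path.getsize. It is an assumption-laden stand-in, not the get_file_sizes implementation: file_sizes_kb, the dictionary keys, and the choice of kilobytes as the unit are all hypothetical, and the real method may measure something different, such as in-memory dataframe size.

```python
import os
import tempfile


def file_sizes_kb(input_file, output_file):
    """Return the on-disk size of both files in kilobytes (1 KB = 1024 bytes)."""
    return {
        "input_size": os.path.getsize(input_file) / 1024,
        "output_size": os.path.getsize(output_file) / 1024,
    }


with tempfile.TemporaryDirectory() as tmp:
    input_path = os.path.join(tmp, "input_data.csv")
    output_path = os.path.join(tmp, "output_data.csv")
    # Write a known number of bytes so the sizes are predictable.
    with open(input_path, "wb") as f:
        f.write(b"x" * 2048)
    with open(output_path, "wb") as f:
        f.write(b"y" * 512)
    sizes = file_sizes_kb(input_path, output_path)
```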
- log_files_metrics(file_log)[source]¶
Log the number of errors and warnings in the log files.
- Parameters:
file_log (str) – The log file to be read.
- Returns:
A dictionary containing the error count and warning count.
- Return type:
dict
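Counting errors and warnings in a log file can be sketched as a single pass over its lines. The example below is not the log_files_metrics implementation: count_log_levels, the dictionary keys, the log line layout, and the rule of matching the substrings ERROR and WARNING are all assumptions made for illustration.

```python
import os
import tempfile


def count_log_levels(file_log):
    """Count lines that mention ERROR or WARNING (hypothetical matching rule)."""
    counts = {"error_count": 0, "warning_count": 0}
    with open(file_log, "r") as f:
        for line in f:
            if "ERROR" in line:
                counts["error_count"] += 1
            elif "WARNING" in line:
                counts["warning_count"] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "run.log")
    with open(log_path, "w") as f:
        f.write("12:00:01 | INFO | Starting\n")
        f.write("12:00:02 | WARNING | Slow response\n")
        f.write("12:00:03 | ERROR | Request failed\n")
        f.write("12:00:04 | ERROR | Retry failed\n")
    log_counts = count_log_levels(log_path)
```

Substring matching is deliberately simple; a stricter version would parse the level field so that a message containing the word ERROR is not miscounted.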
- summarize_output(output_file)[source]¶
Summarize the output of a model run.
- Parameters:
output_file (str) – The path to the output file.
- Returns:
data – A dictionary containing the summarized data.
- Return type:
dict
- track(input, output, metadata, time_seconds)[source]¶
Track the model run and upload to an S3 bucket.
This method collects relevant data for the run, updates the session file with the stats, and uploads the data to AWS if credentials are available.
- Parameters:
input (str) – The input data used in the model run.
output (str) – The output data in the form of a CSV file path.
metadata (dict) – The metadata of the model.
- Returns:
This method does not return any value.
- Return type:
None
- upload_to_s3(event_id)[source]¶
Upload event information into an S3 bucket
- Parameters:
event_id (list) – Event identifier.
- Returns:
True if uploading completed successfully, False otherwise.
- Return type:
bool
- validate_aws_access(**kwargs)¶