lamindb.core.MuDataAnnotator

class lamindb.core.MuDataAnnotator(mdata, var_index, categoricals=None, using='default', verbosity='hint', organism=None)

Bases: object

Annotation flow for a MuData object.

Parameters:
  • mdata (MuData) – The MuData object to annotate.

  • var_index (dict[str, dict[str, DeferredAttribute]]) – The registry field used to map the .var index of each modality. For example: {"modality_1": bt.Gene.ensembl_gene_id, "modality_2": bt.CellMarker.name}

  • categoricals (dict[str, DeferredAttribute] | None, default: None) – A dictionary mapping .obs.columns to a registry field. Prefix a column with its modality key to specify categoricals for a MuData slot, for example {"rna:cell_type": bt.CellType.name}.

  • using (str, default: 'default') – A reference LaminDB instance.

  • verbosity (str, default: 'hint') – The verbosity level.

  • organism (str | None, default: None) – The organism name.

Examples

>>> import lamindb as ln
>>> import bionty as bt
>>> annotate = ln.Annotate.from_mudata(
...     mdata,
...     var_index={"rna": bt.Gene.ensembl_gene_id, "adt": bt.CellMarker.name},
...     categoricals={"cell_type_ontology_id": bt.CellType.ontology_id, "donor_id": ln.ULabel.name},
...     organism="human",
... )
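
The modality-key convention described for the categoricals parameter can be illustrated with a short, self-contained sketch. The helper split_modality_key is hypothetical and not part of lamindb. It assumes that a key of the form "modality:column" refers to the .obs of that modality, and that a key without a prefix refers to the top-level .obs; the actual parsing inside lamindb may differ.

```python
from typing import Optional, Tuple


def split_modality_key(key: str) -> Tuple[Optional[str], str]:
    """Split a categoricals key into (modality, column).

    Hypothetical helper for illustration only. Keys without a
    "modality:" prefix are assumed to target the top-level .obs,
    in which case the modality is returned as None.
    """
    if ":" in key:
        # Split on the first colon only, so column names may contain colons.
        modality, column = key.split(":", 1)
        return modality, column
    return None, key


# One modality-scoped key and one top-level key.
keys = ["rna:cell_type", "donor_id"]
parsed = [split_modality_key(k) for k in keys]
print(parsed)
```

Returning None for unprefixed keys mirrors the modality=None default of add_new_from and add_validated_from documented below.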

Attributes

categoricals dict

Return the obs fields to validate against.

var_index FieldAttr

Return the registry field to validate variables index against.

Methods

add_new_from(key, modality=None, organism=None, **kwargs)

Add validated & new categories.

Parameters:
  • key (str) – The key referencing the slot in the DataFrame.

  • modality (str | None, default: None) – The modality name.

  • organism (str | None, default: None) – The organism name.

  • **kwargs – Additional keyword arguments to pass to the registry model.

add_new_from_columns(modality, column_names=None, organism=None, **kwargs)

Update column records.

Parameters:
  • modality (str) – The modality name.

  • column_names (list[str] | None, default: None) – The column names to save.

  • organism (str | None, default: None) – The organism name.

  • **kwargs – Additional keyword arguments to pass to the registry model.

add_new_from_var_index(modality, organism=None, **kwargs)

Update variable records.

Parameters:
  • modality (str) – The modality name.

  • organism (str | None, default: None) – The organism name.

  • **kwargs – Additional keyword arguments to pass to the registry model.

add_validated_from(key, modality=None, organism=None)

Add validated categories.

Parameters:
  • key (str) – The key referencing the slot in the DataFrame.

  • modality (str | None, default: None) – The modality name.

  • organism (str | None, default: None) – The organism name.

lookup(using=None)

Lookup categories.

Parameters:

  • using (str | None, default: None) – The instance where the lookup is performed. If None (default), the lookup is performed on the instance specified in the "using" parameter of the validator. If "public", the lookup is performed on the public reference.

Return type:

AnnotateLookup

save_artifact(description=None, **kwargs)

Save the validated MuData and metadata.

Parameters:
  • description (str | None, default: None) – Description of the MuData object.

  • **kwargs – Object level metadata.

Returns:

A saved artifact record.

validate(organism=None)

Validate categories.

Return type:

bool
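
The methods above suggest a validate, register, save sequence. The following is a minimal sketch of that sequence, not lamindb's own workflow: validate_then_save and FakeAnnotator are hypothetical names introduced here, and FakeAnnotator only stands in for a real annotator so the sketch runs without a LaminDB instance. The only calls assumed on the annotator are validate(), add_new_from(key, modality=...) and save_artifact(description=...), with the signatures documented on this page.

```python
def validate_then_save(annotator, new_keys, description):
    """Validate, register new categories if needed, then save.

    `annotator` is any object exposing validate(), add_new_from() and
    save_artifact() as documented above. `new_keys` is a list of
    (key, modality) pairs chosen by the caller.
    """
    if not annotator.validate():
        for key, modality in new_keys:
            annotator.add_new_from(key, modality=modality)
        # Re-check after registering the new categories.
        if not annotator.validate():
            raise ValueError("validation still fails after adding new categories")
    return annotator.save_artifact(description=description)


class FakeAnnotator:
    """Stand-in that records calls; it is not a lamindb class."""

    def __init__(self, pending):
        self.pending = set(pending)  # (key, modality) pairs not yet registered
        self.calls = []

    def validate(self, organism=None):
        self.calls.append("validate")
        return not self.pending

    def add_new_from(self, key, modality=None, organism=None, **kwargs):
        self.calls.append(f"add_new_from:{key}")
        self.pending.discard((key, modality))

    def save_artifact(self, description=None, **kwargs):
        self.calls.append("save_artifact")
        return {"description": description}


fake = FakeAnnotator(pending={("cell_type", "rna")})
result = validate_then_save(fake, [("cell_type", "rna")], "demo MuData")
print(fake.calls)
```

With a real annotator, inspect the categories that failed validation before registering them, since add_new_from creates new records.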