lamindb.Transform
- class lamindb.Transform(name: str, key: str | None = None, version: str | None = None, type: TransformType | None = None, is_new_version_of: Transform | None = None)
Bases: Registry, HasParents, IsVersioned
Data transformations.
A transform can refer to a simple Python function, a script, a notebook, or a pipeline. If you execute a transform, you generate a run of a transform (Run). A run has input and output data.
A pipeline is typically created with a workflow tool (Nextflow, Snakemake, Prefect, Flyte, MetaFlow, redun, Airflow, …) and stored in a versioned repository.
Transforms are versioned so that a given transform maps 1:1 to a specific version of code. If you switch on sync_git_repo, any script-like transform is synced with its hashed state in a git repository.
If you execute a transform, you generate a Run record. The definition of transforms and runs is consistent with the OpenLineage specification, where a Transform record would be called a “job” and a Run record a “run”.
- Parameters:
name – str
A name or title.
key – str | None = None
A short name or path-like semantic key.
version – str | None = None
A version.
type – TransformType | None = "pipeline"
Either 'notebook', 'pipeline' or 'script'.
is_new_version_of – Transform | None = None
An old version of the transform.
Examples
Create a transform for a pipeline:
>>> import lamindb as ln
>>> transform = ln.Transform(name="Cell Ranger", version="7.2.0", type="pipeline")
>>> transform.save()
Create a transform from a notebook:
>>> ln.track()
View parents of a transform:
>>> transform.view_parents()
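The relationship between a transform, its versions, and its runs can be illustrated with a minimal sketch in plain Python. It does not use LaminDB: the names TransformSketch, RunSketch and execute are hypothetical stand-ins, and the file names and the older version string are made up for illustration. It only mirrors the ideas described above, namely that one transform (an OpenLineage "job") can be executed many times, that each execution is a separate run with its own inputs and outputs, and that a new version points back to the version it supersedes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TransformSketch:
    """Hypothetical stand-in for a transform record (an OpenLineage "job")."""

    name: str
    version: Optional[str] = None
    type: str = "pipeline"
    # Link to the version this transform supersedes, if any.
    is_new_version_of: Optional[TransformSketch] = None


@dataclass(frozen=True)
class RunSketch:
    """Hypothetical stand-in for a run record: one execution of a transform."""

    transform: TransformSketch
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def execute(transform: TransformSketch, inputs, outputs) -> RunSketch:
    """Executing a transform yields a new run with its own inputs and outputs."""
    return RunSketch(transform=transform, inputs=tuple(inputs), outputs=tuple(outputs))


# Two versions of the same pipeline; the newer one references the older one.
old = TransformSketch(name="Cell Ranger", version="7.1.0")
new = TransformSketch(name="Cell Ranger", version="7.2.0", is_new_version_of=old)

# The same transform executed twice gives two distinct runs.
first_run = execute(new, inputs=["sample_a.fastq"], outputs=["sample_a_counts.h5"])
second_run = execute(new, inputs=["sample_b.fastq"], outputs=["sample_b_counts.h5"])
```

Both runs share one transform, so the code version that produced sample_a_counts.h5 and sample_b_counts.h5 can be read off a single record.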
Fields
- version CharField
Version (default None).
Defines version of a family of records characterized by the same stem_uid.
Consider using semantic versioning with Python versioning.
- id AutoField
Internal id, valid only in one DB instance.
- uid CharField
Universal id.
- name CharField
A name or title. For instance, a pipeline name, notebook title, etc.
- key CharField
A key for concise reference & versioning (optional).
- description CharField
A description (optional).
- type CharField
Transform type (default "pipeline").
- latest_report ForeignKey
Latest run report.
- source_code ForeignKey
Source of the transform if stored as artifact within LaminDB.
- reference CharField
Reference for the transform, e.g., a URL.
- reference_type CharField
Type of reference, e.g., ‘url’ or ‘doi’.
- created_at DateTimeField
Time of creation of record.
- updated_at DateTimeField
Time of last update to record.
- created_by ForeignKey
Creator of record, a User.
- ulabels ManyToManyField
Accessor to the related objects manager on the forward and reverse sides of a many-to-many relation.
In the example:
class Pizza(Model):
    toppings = ManyToManyField(Topping, related_name='pizzas')
Pizza.toppings and Topping.pizzas are ManyToManyDescriptor instances.
Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.
- parents ManyToManyField
Parent transforms (predecessors) in data flow.
These are auto-populated whenever a transform loads an artifact or collection as run input.
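One way to picture how parents could be derived from run inputs is the following plain-Python sketch. It is not LaminDB's implementation: the LineageSketch class, its method names, and the artifact and transform names are all hypothetical. The assumption is that each artifact remembers which transform produced it, and that loading the artifact as input to another transform makes the producer a parent of the consumer.

```python
from typing import Dict, Set


class LineageSketch:
    """Hypothetical bookkeeping of transform parents, keyed by plain names."""

    def __init__(self) -> None:
        # artifact name -> name of the transform that produced it
        self.producer: Dict[str, str] = {}
        # transform name -> names of its parent transforms
        self.parents: Dict[str, Set[str]] = {}

    def record_output(self, transform: str, artifact: str) -> None:
        """Register that a run of `transform` wrote `artifact`."""
        self.producer[artifact] = transform
        self.parents.setdefault(transform, set())

    def record_input(self, transform: str, artifact: str) -> None:
        """Register that a run of `transform` loaded `artifact` as input."""
        parents = self.parents.setdefault(transform, set())
        producer = self.producer.get(artifact)
        # Artifacts with no known producer, or produced by the same
        # transform, do not add a parent.
        if producer is not None and producer != transform:
            parents.add(producer)


lineage = LineageSketch()
lineage.record_output("Cell Ranger", "counts.h5")
lineage.record_input("QC notebook", "counts.h5")
lineage.record_input("QC notebook", "external_metadata.csv")
```

After these calls, "Cell Ranger" is the only parent of "QC notebook"; the metadata file has no recorded producer and therefore contributes no parent.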
Methods
- delete()
- Return type: None
- get_type_display(*, field=<django.db.models.fields.CharField: type>)