bionty

Registries for basic biological entities, coupled to public ontologies.

Overview

  • Create records from entries in public ontologies using .from_public().

  • Access full underlying public ontologies via .public() to search & bulk-create records.

  • Create in-house ontologies by using hierarchical relationships among records (.parents).

  • Use .synonyms and .abbr to manage synonyms and abbreviations.

All registries inherit from CanValidate & HasParents to standardize, validate & annotate data, and from Registry for query & search.

How to ensure reproducibility across different versions of public ontologies?

It’s important to track versions of external data dependencies.

bionty handles this under the hood:

  • Versions of public databases are auto-tracked in PublicSource.

  • Records are indexed by universal ids, created by hashing name & ontology_id for portability across databases.
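
The idea of a hash-derived universal id can be illustrated with a minimal sketch. It is not bionty's actual hashing scheme: the helper name `universal_id`, the choice of SHA-256, the `|` separator, and the 8-character length are all assumptions made for illustration. It only shows why hashing name & ontology_id yields the same id in every database.

```python
import base64
import hashlib


def universal_id(name, ontology_id, n_chars=8):
    """Hypothetical helper: derive a short, deterministic id from name and ontology_id."""
    # Assumption: the two fields are joined with "|" before hashing.
    payload = f"{name}|{ontology_id}".encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # URL-safe base64 keeps the id compact; keep only the first n_chars characters.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:n_chars]


# The same inputs give the same id, no matter where the code runs.
uid_a = universal_id("hematopoietic stem cell", "CL:0000037")
uid_b = universal_id("hematopoietic stem cell", "CL:0000037")
# A record with a different name and no ontology_id gets a different id.
uid_c = universal_id("my new cell type", "")
print(uid_a == uid_b, uid_a == uid_c)
```

Because the id depends only on the record's content, two independently created databases can agree on it without coordinating.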

Installation

$ pip install 'lamindb[bionty]'

Setup

$ lamin init --storage <storage_name> --schema bionty

Quickstart

Import bionty:

>>> import bionty as bt

Access public ontologies:

>>> genes = bt.Gene.public()
>>> genes.validate(["BRCA1", "TCF7"], field="symbol")
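
Conceptually, validating values against a field amounts to a per-value membership check against the reference entries. The following pure-Python sketch shows that idea only. The `validate` function and the tiny `reference_symbols` set are hypothetical stand-ins, not bionty's implementation or data.

```python
def validate(values, reference):
    """Hypothetical stand-in: report, per value, whether it occurs in the reference."""
    reference = set(reference)
    return [value in reference for value in values]


# Made-up reference for illustration; a public ontology holds far more symbols.
reference_symbols = {"BRCA1", "TCF7", "TP53"}

result = validate(["BRCA1", "TCF7", "brca-1"], reference_symbols)
print(result)  # [True, True, False]
```

The misspelled "brca-1" fails the check. Entries like this are the ones that need standardization before they can be used for annotation.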

Create records from public ontologies:

>>> cell_type = bt.CellType.from_public(ontology_id="CL:0000037")
>>> cell_type.save()

View ontological hierarchy:

>>> cell_type.view_parents()

Create in-house ontologies:

>>> cell_type_new = bt.CellType(name="my new cell type")
>>> cell_type_new.save()
>>> cell_type_new.parents.add(cell_type)
>>> cell_type_new.view_parents()
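
To see what linking a custom record to a public one achieves, here is a minimal sketch of a parent hierarchy in plain Python. The `Term` class and `ancestor_names` function are hypothetical and do not use bionty. The three-term hierarchy is a toy and does not reproduce the real Cell Ontology structure.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    parents: list = field(default_factory=list)  # direct parent Term objects


def ancestor_names(term):
    """Return the names of all ancestors, nearest first, each listed once."""
    seen = []
    queue = deque(term.parents)
    while queue:
        parent = queue.popleft()
        if parent.name not in seen:
            seen.append(parent.name)
            queue.extend(parent.parents)
    return seen


# Toy hierarchy: a public term with one parent, plus an in-house child term.
root = Term("cell")
public_term = Term("hematopoietic stem cell", parents=[root])
custom_term = Term("my new cell type")
custom_term.parents.append(public_term)

lineage = ancestor_names(custom_term)
print(lineage)  # ['hematopoietic stem cell', 'cell']
```

Once the in-house term points at a public parent, it inherits that parent's whole lineage, so queries over ancestors also cover custom records.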

Manage synonyms:

>>> cell_type_new.add_synonyms(["my cell type", "my cell"])
>>> cell_type_new.set_abbr("MCT")
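
The bookkeeping behind adding synonyms can be sketched as follows, assuming synonyms are kept in a single separator-delimited string. That storage format, the `|` separator, and the helper `merge_synonyms` are assumptions for illustration, not a description of bionty's internals.

```python
def merge_synonyms(existing, new_synonyms, sep="|"):
    """Hypothetical helper: merge new synonyms into a separator-delimited string."""
    current = [s for s in existing.split(sep) if s]
    for synonym in new_synonyms:
        synonym = synonym.strip()
        if sep in synonym:
            # A synonym containing the separator would corrupt the stored string.
            raise ValueError(f"synonym must not contain {sep!r}: {synonym!r}")
        if synonym and synonym not in current:
            current.append(synonym)
    return sep.join(current)


synonyms = merge_synonyms("", ["my cell type", "my cell"])
# Adding an already known synonym does not create a duplicate.
synonyms = merge_synonyms(synonyms, ["my cell", "MCT-like"])
print(synonyms)  # my cell type|my cell|MCT-like
```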

Note

For more background on how public ontologies are accessed, see the utility library bionty-base.

API

Import the package:

import bionty as bt

Basic biological registries:

Organism()

Organisms - NCBI Taxonomy, Ensembl Organism.

Gene()

Genes - Ensembl, NCBI Gene.

Protein()

Proteins - UniProt.

CellMarker()

Cell markers - CellMarker.

CellType()

Cell types - Cell Ontology.

CellLine()

Cell lines - Cell Line Ontology.

Tissue()

Tissues - Uberon.

Disease()

Diseases - Mondo, Human Disease.

Pathway()

Pathways - Gene Ontology, Pathway Ontology.

Phenotype()

Phenotypes - Human Phenotype, Phecodes, Mammalian Phenotype, Zebrafish Phenotype.

ExperimentalFactor()

Experimental factors - Experimental Factor Ontology.

DevelopmentalStage()

Developmental stages - Human Developmental Stages, Mouse Developmental Stages.

Ethnicity()

Ethnicity - Human Ancestry Ontology.

Settings:

settings

Global settings.

Public ontology versions:

PublicSource()

Versions of public ontologies.

Developer API:

core

Developer API.