faltwerk Python API basics
For more specialized operations, see also the functions related to geometry and stats.
Fold: basic structure manipulation and annotation
- class faltwerk.models.Fold(fp, quiet=True, annotate=True, strict=True, reindex: bool = False, reindex_start_position: int = 1)
Core object to manipulate single-component protein structures.
The Fold object is the basis for many protein-centric operations in the faltwerk library.
Basic usage:
>>> from faltwerk import Fold
>>> model = Fold('af2_prediction.pdb')
- __init__(fp, quiet=True, annotate=True, strict=True, reindex: bool = False, reindex_start_position: int = 1)
Create a faltwerk.Fold object.
Optional arguments:
strict (default True) – check that the structure contains a single chain
annotate (default True) – add a track that annotates positions from the N- to the C-terminus
- align_to(ref, mode=0, minscore=0.5)
Align a structure to another one using foldseek. Returns the TM-score (> 0.5 is good) and a copy of the query structure:
>>> qry = Fold(...)
>>> ref = Fold(...)
>>> tm_score, cp_qry = qry.align_to(ref)
Optional arguments (passed to foldseek, see https://github.com/steineggerlab/foldseek):
mode (default 0)
minscore (default 0.5)
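For intuition about the score returned by align_to, here is a minimal sketch of the standard TM-score formula, computed from per-residue distances after superposition. It is not how faltwerk or foldseek compute the value internally. The function name `tm_score` and its arguments are hypothetical, and rejecting short references is a simplification made for this sketch.

```python
def tm_score(distances, ref_length):
    """TM-score from distances (in Angstrom) between aligned residue pairs.

    distances: one value per aligned pair, measured after superposition.
    ref_length: number of residues in the reference structure.
    """
    if ref_length <= 21:
        # Simplification for this sketch: the length-dependent scale d0
        # is not meaningful for very short chains.
        raise ValueError("reference too short for this sketch")
    d0 = 1.24 * (ref_length - 15) ** (1 / 3) - 1.8
    return sum(1 / (1 + (d / d0) ** 2) for d in distances) / ref_length


# All 100 residues aligned perfectly vs. only half of them aligned.
perfect = tm_score([0.0] * 100, ref_length=100)
half = tm_score([0.0] * 50, ref_length=100)
```

Because the sum is divided by the reference length, unaligned residues lower the score even when the aligned part superimposes exactly.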
- annotate_(key, values, check=True)
Annotate a structure with a track of some feature (solvent accessibility, selected sites, surface probability, …), with one value for each residue. In-place operation.
Optional arguments:
check (default True) – verify that the number of feature values equals the number of residues
>>> from faltwerk.io import load_bfactor_column
>>> features = load_bfactor_column('dMASIF.pdb')
>>> model = Fold(...)
>>> model.annotate_('surface', features)
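The length check and the in-place behaviour can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not faltwerk's implementation: `ToyFold` and its `annotation` attribute are hypothetical, and so is the choice of `ValueError` for a mismatch.

```python
class ToyFold:
    """Hypothetical stand-in for a structure with per-residue tracks."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.annotation = {}

    def annotate_(self, key, values, check=True):
        values = list(values)
        if check and len(values) != len(self.sequence):
            raise ValueError(
                f"{len(values)} values for {len(self.sequence)} residues"
            )
        self.annotation[key] = values  # in place, nothing is returned


toy = ToyFold("MKTAY")
toy.annotate_("surface", [0.1, 0.9, 0.4, 0.0, 0.7])

# A track with the wrong length is rejected before anything is stored.
try:
    toy.annotate_("too_short", [0.1, 0.9])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```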
Complex: basic structure complex manipulation and annotation
- class faltwerk.models.Complex(fp, scores=None, reindex: bool = False, reindex_start_position: int = 1)
Core object to manipulate multi-component protein structures.
Many proteins interact in protein-protein complexes, and a growing number of models, such as AlphaFold v2, can predict such binding patterns. The Complex object collects methods to manipulate multi-component structures: selecting individual components, annotating binding sites, etc.
Basic usage:
>>> from faltwerk import Complex, Layout
>>> cx = Complex('af2_multichain_prediction.pdb', reindex=True)
Remove and select chains:
>>> cx - 'AB'  # rm chains, same as cx - ['A', 'B']
>>> cx * 'C'  # select
>>> cx.chains['C']
>>> cx[2]
Visualise:
>>> Layout(cx).geom_ribbon().render()
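The chain removal and selection operators shown above can be pictured with a toy container. This is only an illustration of the operator semantics: `ToyComplex` is hypothetical, and returning a new object rather than modifying the original is an assumption of the sketch, not a documented property of faltwerk.Complex.

```python
class ToyComplex:
    """Hypothetical container mapping chain IDs to chain data."""

    def __init__(self, chains):
        self.chains = dict(chains)

    def __sub__(self, chain_ids):
        # Iterating 'AB' yields 'A' and 'B', so a string and a list agree.
        drop = set(chain_ids)
        return ToyComplex(
            {k: v for k, v in self.chains.items() if k not in drop}
        )

    def __mul__(self, chain_ids):
        keep = set(chain_ids)
        return ToyComplex(
            {k: v for k, v in self.chains.items() if k in keep}
        )


toy = ToyComplex({"A": "MKT", "B": "GSH", "C": "PLV"})
removed = toy - "AB"   # drop chains A and B
selected = toy * "C"   # keep only chain C
```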
- __init__(fp, scores=None, reindex: bool = False, reindex_start_position: int = 1)
Create a faltwerk.Complex object.
Optional arguments:
scores (default None) – add quality scores in ColabFold format [json]
reindex (default False) – reindex residues (if not numbered 1:n)
reindex_start_position (default 1) – where to start residue index
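To make the two reindexing arguments concrete, the sketch below renumbers residue IDs so that they run consecutively from a chosen start position. It is an assumption-laden illustration rather than the library's code: `reindex_residues` is a hypothetical helper, and closing numbering gaps is a choice made here that faltwerk may or may not share.

```python
def reindex_residues(residue_ids, start=1):
    """Map original residue numbers to consecutive ones, keeping order."""
    mapping = {}
    for rid in residue_ids:
        if rid not in mapping:  # a residue appears once per atom
            mapping[rid] = start + len(mapping)
    return [mapping[rid] for rid in residue_ids]


# Residue numbers as they might appear per atom, starting at 27 with a gap.
original = [27, 27, 28, 30, 30, 31]
renumbered = reindex_residues(original, start=1)
```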
Binding: map likely ligand binding sites to protein structure
- class faltwerk.models.Binding(fold, option='confident')
Object to predict ligand binding based on mapped Pfam domains, using the method first explored in the “InteracDome” (Kobren and Singh).
The basic idea there was to take all PDB structures with interacting ligands, mark the interface between the two on the (linear) sequence, and then map the (linear) sequence to the protein sequence underlying a protein structure, marking the likely interacting residues on this query structure. There can be multiple ligands for each Pfam domain.
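The mapping step described above can be sketched as a projection of per-domain-position binding frequencies onto the query sequence. This is a simplified illustration, not the Binding implementation: `map_binding`, the alignment dictionary, and the toy frequencies are all hypothetical.

```python
def map_binding(seq_length, hmm_to_seq, binding_freq):
    """Project per-domain-position binding frequencies onto a sequence.

    hmm_to_seq: {domain position: 0-based sequence position}, from an
        alignment of the domain model to the query sequence.
    binding_freq: {domain position: fraction of structures in which this
        position contacts the ligand}.
    """
    track = [0.0] * seq_length
    for hmm_pos, seq_pos in hmm_to_seq.items():
        track[seq_pos] = binding_freq.get(hmm_pos, 0.0)
    return track


# Toy domain of 4 positions aligned to residues 2..5 of an 8-residue protein.
alignment = {1: 2, 2: 3, 3: 4, 4: 5}
frequencies = {2: 0.8, 4: 0.3}
track = map_binding(8, alignment, frequencies)
```

The result has one value per residue, which is the shape that annotate_ expects for a feature track.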
Because this approach was “trained” on Pfam v31, and all coordinates are based on this version, please only use v31 as reference.
There are three sets of contact data to make predictions from:
confident – “correspond to domain-ligand interactions that had nonredundant instances across three or more distinct PDB entries and achieved a cross-validated precision of at least 0.5. [Kobren and Singh] recommend using this collection to annotate potential ligand-binding positions in protein sequences”
representable/nonredundant – “correspond to domain-ligand interactions that had nonredundant instances across three or more distinct PDB structures. [Kobren and Singh] recommend using this collection to learn more about domain binding properties” (nonredundant is the nonredundant version of representable)
Update: For an end-to-end approach to the same problem, see DiffDock – https://github.com/gcorso/DiffDock
Usage:
>>> from faltwerk import Fold, Binding, Layout
>>> model = Fold('faltwerk/data/zinc_finger/1mey.pdb')
>>> hmms = 'path/to/pfam_v31/Pfam-A.hmm'
>>> b = Binding(model, option='non-redundant')
>>> b.predict_binding_(hmms)  # inplace operation
>>> # Explore: b.domains, b.ligands
>>> model.annotate_('binding', b.get_binding('PF13912.5', 'ZN'))
>>> Layout(model).geom_surface('binding').render()
>>> # Marked in yellow are two symmetric zinc binding sites