ovo.core.utils.residue_selection

Module Contents

Classes

ContigSegment

Class for annotating a segment of the structure.

MappedContigSegment

Class for annotating a segment of the structure with mapping to input structure numbering.

Functions

from_residues_to_segments

Convert a list of residues [3,4,5,6,9,10,…] into a list of segments [3-6,9-10,…].

from_residues_to_hotspots

Convert a chain_id (e.g. A) and a list of residues [3,4,…] into a list of hotspots [A3, A4, …].

parse_selections

Input: a JSON-encoded representation of the selections, e.g. {"sequenceSelections":[{"chainId":"A","residues":[10,11,12,13,15,16,17,18,19,20]}]}. Output: a list of segments, e.g. ["A10-13", "A15-20"].

get_chains_and_contigs

from_hotspots_to_segments

from_segments_to_hotspots

Convert a list of segments [“A3-5”,”A9-10”,…] to a STRING of hotspots “A3,A4,A5,A9,A10”.

from_contig_to_residues

Convert a str chain_segment / contig “A3-5/A9-10” to a list of residues [3,4,5,9,10].

from_residues_to_chain_breaks

parse_partial_diffusion_binder_contig

Get binder length and DESIGNED segments from binder contig as they correspond to positions in binder chain A

create_partial_diffusion_binder_contig

Create a new binder contig for partial diffusion from the designed segments and old binder contig

_parse_range

parse_contig_for_input_structure

Parse contig string and return segment annotations mapping to the INPUT structure numbering

split_subcontig

Split a single-chain subcontig into list of segments (not including the trailing /0)

parse_contig_for_output_structure

Parse contig string and return segment annotations mapping to our standardized RFdiffusion OUTPUT structure numbering.

API

ovo.core.utils.residue_selection.from_residues_to_segments(chain_id: str, residues: list[int], start_res: int | None = None, end_res: int | None = None) -> list[str]

Convert a list of residues [3,4,5,6,9,10,…] into a list of segments [3-6,9-10,…].

Parameters:
  • chain_id (str) – Required. The chain ID.

  • residues (list[int]) – Required. A list of residues.

  • start_res (int, optional) – The start residue. If specified, trims residues from start_res. Defaults to None.

  • end_res (int, optional) – The end residue. If specified, trims residues to end_res. Defaults to None.

Returns: list[str]: A list of segments, e.g. [3-6,9-10,…]
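The grouping described above can be sketched as follows. This is a minimal, self-contained illustration and not the module's implementation: group_into_segments is a hypothetical name, sorting and de-duplicating the input is an assumption, and the chain_id argument is left out because this reference does not show whether the real function prefixes segments with the chain ID.

```python
def group_into_segments(residues, start_res=None, end_res=None):
    """Collapse residue numbers into inclusive 'start-end' ranges (illustrative only)."""
    # Optional trimming, mirroring the documented start_res / end_res arguments.
    kept = sorted(
        r
        for r in set(residues)
        if (start_res is None or r >= start_res) and (end_res is None or r <= end_res)
    )
    runs = []
    for r in kept:
        if runs and r == runs[-1][1] + 1:
            runs[-1][1] = r  # residue continues the current run
        else:
            runs.append([r, r])  # gap found, start a new run
    return [f"{start}-{end}" for start, end in runs]


print(group_into_segments([3, 4, 5, 6, 9, 10]))               # ['3-6', '9-10']
print(group_into_segments([3, 4, 5, 6, 9, 10], start_res=5))  # ['5-6', '9-10']
```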

ovo.core.utils.residue_selection.from_residues_to_hotspots(chain_id: str, residues: list[int])

Convert a chain_id (e.g. A) and a list of residues [3,4,…] into a list of hotspots [A3, A4, …].

ovo.core.utils.residue_selection.parse_selections(selections: str) -> list[str]

Input: a JSON-encoded representation of the selections, e.g. {"sequenceSelections":[{"chainId":"A","residues":[10,11,12,13,15,16,17,18,19,20]}]}. Output: a list of segments, e.g. ["A10-13", "A15-20"].
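To make the input and output shapes concrete, here is a small stand-alone sketch of the conversion. It does not call the library; selections_to_segments is a hypothetical helper, and the handling of unsorted residues (they are sorted first) is an assumption.

```python
import json


def selections_to_segments(selections: str) -> list[str]:
    """Turn the JSON selection payload into chain-prefixed segments (illustrative only)."""
    data = json.loads(selections)
    segments = []
    for selection in data["sequenceSelections"]:
        chain = selection["chainId"]
        run_start = prev = None
        for residue in sorted(selection["residues"]):
            if prev is not None and residue == prev + 1:
                prev = residue  # still inside the current run
                continue
            if run_start is not None:
                segments.append(f"{chain}{run_start}-{prev}")  # close the previous run
            run_start = prev = residue
        if run_start is not None:
            segments.append(f"{chain}{run_start}-{prev}")  # close the final run
    return segments


payload = '{"sequenceSelections":[{"chainId":"A","residues":[10,11,12,13,15,16,17,18,19,20]}]}'
print(selections_to_segments(payload))  # ['A10-13', 'A15-20']
```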

ovo.core.utils.residue_selection.get_chains_and_contigs(pdb_str: str | None) -> Dict[str, str] | None
ovo.core.utils.residue_selection.from_hotspots_to_segments(hotspots: str) -> list[str] | None
ovo.core.utils.residue_selection.from_segments_to_hotspots(segments: list[str] | None) -> str

Convert a list of segments [“A3-5”,”A9-10”,…] to a STRING of hotspots “A3,A4,A5,A9,A10”.
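The expansion from segments to a hotspot string might look like the sketch below. It is a self-contained approximation rather than the module's code: segments_to_hotspots is a hypothetical name, single-letter chain IDs are assumed, and returning an empty string for None is an assumption based only on the str return type.

```python
import re

# One chain letter, a start residue, and an optional "-end" part.
SEGMENT = re.compile(r"^([A-Za-z])(\d+)(?:-(\d+))?$")


def segments_to_hotspots(segments):
    """Expand ['A3-5', 'A9-10'] into 'A3,A4,A5,A9,A10' (illustrative only)."""
    hotspots = []
    for segment in segments or []:
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"Unexpected segment format: {segment!r}")
        chain = match.group(1)
        start = int(match.group(2))
        end = int(match.group(3) or match.group(2))  # 'A7' is treated as 'A7-7'
        hotspots.extend(f"{chain}{i}" for i in range(start, end + 1))
    return ",".join(hotspots)


print(segments_to_hotspots(["A3-5", "A9-10"]))  # A3,A4,A5,A9,A10
```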

ovo.core.utils.residue_selection.from_contig_to_residues(chain_segment: str | None) -> list[int] | None

Convert a str chain_segment / contig “A3-5/A9-10” to a list of residues [3,4,5,9,10].
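As a rough illustration of this conversion, the sketch below expands each "/"-separated part into its residue numbers. contig_to_residues is a hypothetical stand-in; dropping the chain letter and passing None through unchanged are assumptions inferred from the signature, not confirmed behavior.

```python
def contig_to_residues(chain_segment):
    """Expand 'A3-5/A9-10' into [3, 4, 5, 9, 10] (illustrative only)."""
    if chain_segment is None:
        return None
    residues = []
    for part in chain_segment.split("/"):
        span = part.lstrip("ABCDEFGHIJKLMNOPQRSTUVWXYZ")  # drop the chain letter
        start, _, end = span.partition("-")
        residues.extend(range(int(start), int(end or start) + 1))
    return residues


print(contig_to_residues("A3-5/A9-10"))  # [3, 4, 5, 9, 10]
```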

ovo.core.utils.residue_selection.from_residues_to_chain_breaks(residues: list[int]) -> List[str] | None
ovo.core.utils.residue_selection.parse_partial_diffusion_binder_contig(binder_contig: str) -> tuple[int, list[str]]

Get binder length and DESIGNED segments from binder contig as they correspond to positions in binder chain A

In the default case, a peptide of 12 residues has binder_contig=“12-12”, which is turned into [“A1-12”]. When redesigning only A1 and A3-7 in a peptide of 12 residues, binder_contig=“1-1/A2-2/5-5/A8-12” is turned into [“A1-1”, “A3-7”].

Parameters:

binder_contig – binder contig, e.g. 12-12 or 1-1/A2-2/5-5/A8-12

Returns:

binder length (int) and list of designed segments (list of str, e.g. [“A1-1”, “A3-7”])
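The two documented examples can be reproduced with the following stand-alone sketch. It is written from the description above, not from the module's source: parse_binder_contig is a hypothetical name, and it assumes that generated parts are written as resolved N-N ranges and that fixed parts carry a one-letter chain prefix.

```python
def parse_binder_contig(binder_contig: str):
    """Return (binder_length, designed_segments) for a binder contig (illustrative only)."""
    position = 0  # last binder position consumed so far (1-based numbering)
    designed = []
    for part in binder_contig.split("/"):
        if part == "0":
            continue  # chain-break marker, carries no residues
        is_fixed = part[0].isalpha()
        start, _, end = (part[1:] if is_fixed else part).partition("-")
        if is_fixed:
            length = int(end) - int(start) + 1  # fixed residues kept from the input
        else:
            if start != end:
                raise ValueError(f"Generated part must be resolved (N-N): {part!r}")
            length = int(start)  # N-N means N designed residues
            designed.append(f"A{position + 1}-{position + length}")
        position += length
    return position, designed


print(parse_binder_contig("12-12"))               # (12, ['A1-12'])
print(parse_binder_contig("1-1/A2-2/5-5/A8-12"))  # (12, ['A1-1', 'A3-7'])
```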

ovo.core.utils.residue_selection.create_partial_diffusion_binder_contig(redesigned_segments: list[str], binder_length: int) -> str

Create a new binder contig for partial diffusion from the designed segments and old binder contig

In the default case for a peptide of 12 residues, redesigned_segments=[] -> binder_contig=“12-12”. When redesigning only A1 and A3-7 in a peptide of 12 residues, redesigned_segments=[“A1-1”, “A3-7”] -> binder_contig=“1-1/A2-2/5-5/A8-12”.

Parameters:
  • redesigned_segments – list of designed segments (list of str, e.g. [“A1-1”, “A3-7”])

  • binder_length – length of binder (int)

Returns:

new binder contig (str)
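The reverse direction can be sketched in the same spirit. build_binder_contig is a hypothetical helper written only from the examples above; it assumes the redesigned segments lie on chain A, do not overlap, and fall within 1..binder_length.

```python
def build_binder_contig(redesigned_segments, binder_length):
    """Rebuild a partial-diffusion binder contig from designed segments (illustrative only)."""
    if not redesigned_segments:
        # Documented default: with no segments listed, the whole binder is redesigned.
        return f"{binder_length}-{binder_length}"
    spans = sorted(
        tuple(int(x) for x in segment[1:].split("-")) for segment in redesigned_segments
    )
    parts = []
    cursor = 1  # next binder position not yet covered
    for start, end in spans:
        if start > cursor:
            parts.append(f"A{cursor}-{start - 1}")  # fixed stretch before this segment
        length = end - start + 1
        parts.append(f"{length}-{length}")  # designed stretch, written as N-N
        cursor = end + 1
    if cursor <= binder_length:
        parts.append(f"A{cursor}-{binder_length}")  # fixed tail
    return "/".join(parts)


print(build_binder_contig([], 12))                  # 12-12
print(build_binder_contig(["A1-1", "A3-7"], 12))    # 1-1/A2-2/5-5/A8-12
```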

class ovo.core.utils.residue_selection.ContigSegment

Class for annotating a segment of the structure.

Parameters:
  • start – Starting residue number of the segment (inclusive).

  • end – Ending residue number of the segment (inclusive).

  • chain – Chain ID of the segment.

  • color – Color of the segment in hex format (e.g. “0x00ff00”), optional.

  • start_label – Label to show at the start of the segment, optional.

  • middle_label – Label to show in the middle of the segment, optional.

  • end_label – Label to show at the end of the segment, optional.

start: int

None

end: int

None

chain: str

None

color: str | None

None

start_label: str

None

middle_label: str

None

end_label: str

None

class ovo.core.utils.residue_selection.MappedContigSegment

Bases: ovo.core.utils.residue_selection.ContigSegment

Class for annotating a segment of the structure with mapping to input structure numbering.

In addition to ContigSegment fields, also includes:

Parameters:
  • input_start – Starting residue number of the segment in the input structure (inclusive).

  • input_end – Ending residue number of the segment in the input structure (inclusive).

  • input_chain – Chain ID of the segment in the input structure.

input_start: int | None

None

input_end: int | None

None

input_chain: str | None

None

ovo.core.utils.residue_selection._parse_range(range_str: str) -> tuple[int, int]
ovo.core.utils.residue_selection.parse_contig_for_input_structure(contig: str, include_generated: bool = False) -> list[ovo.core.utils.residue_selection.ContigSegment]

Parse contig string and return segment annotations mapping to the INPUT structure numbering

By default, includes only fixed segments (originating from input structure).

Parameters:
  • contig – contig string, e.g. “A1-5/A6-10 B1-8” or “A1-5/5-10 B1-8”

  • include_generated – include generated segments (e.g. 5-10) in the output, defaults to False. Note that generated segments will have start, end, and chain set to None!
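The effect of include_generated can be illustrated with a simplified stand-in that returns plain (chain, start, end) tuples instead of ContigSegment objects. fixed_segments is a hypothetical name, and the sketch assumes fixed parts start with a one-letter chain ID while generated parts are bare ranges.

```python
def fixed_segments(contig, include_generated=False):
    """List (chain, start, end) in input numbering for each contig part (illustrative only)."""
    segments = []
    for subcontig in contig.split():  # one subcontig per chain
        for part in subcontig.split("/"):
            if part == "0":
                continue  # trailing chain-break marker
            if part[0].isalpha():
                start, end = part[1:].split("-")
                segments.append((part[0], int(start), int(end)))
            elif include_generated:
                # Generated residues have no counterpart in the input structure.
                segments.append((None, None, None))
    return segments


print(fixed_segments("A1-5/5-10 B1-8"))
# [('A', 1, 5), ('B', 1, 8)]
print(fixed_segments("A1-5/5-10 B1-8", include_generated=True))
# [('A', 1, 5), (None, None, None), ('B', 1, 8)]
```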

ovo.core.utils.residue_selection.split_subcontig(subcontig: str) -> list[str]

Split a single-chain subcontig into list of segments (not including the trailing /0)

Given “A1-5/5-10/A11-15/0”, returns [“A1-5”, “5-10”, “A11-15”]
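A behavior-equivalent sketch for the documented example is shown below. split_parts is a hypothetical stand-in, not the module's function, and it only handles the single trailing /0 case described here.

```python
def split_parts(subcontig: str) -> list[str]:
    """Split a single-chain subcontig on '/' and drop a trailing '0' (illustrative only)."""
    parts = subcontig.split("/")
    if parts and parts[-1] == "0":
        parts.pop()  # the trailing /0 marks a chain break, not a segment
    return parts


print(split_parts("A1-5/5-10/A11-15/0"))  # ['A1-5', '5-10', 'A11-15']
```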

ovo.core.utils.residue_selection.parse_contig_for_output_structure(contig: str) -> list[ovo.core.utils.residue_selection.MappedContigSegment]

Parse contig string and return segment annotations mapping to our standardized RFdiffusion OUTPUT structure numbering.

See standardize_pdb.py script inside rfdiffusion pipeline.

Assumptions:

  • The “subcontigs” in the contig (split by whitespace) are ordered the same as the chains in the output structure (contigs with generated segments come first, then contigs with only fixed segments)

  • Generated segments should be “resolved”, meaning that they should have the same start and end in the range (5-5, not 5-10)

  • Chains that contain any RFdiffusion-generated segments are renumbered starting from 1 in the output structure, and they are in chains A, B, … in order of appearance (in case of multiple generated chains, typically just A)

  • Fixed chains (with no designed segments) are numbered the same as in the input structure, and are assigned the remaining chains B, C, etc. in order of appearance
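Under the assumptions listed above, the output numbering can be sketched as follows. This is an illustrative approximation, not the module's code: map_output_numbering is a hypothetical name, and it returns plain tuples (out_chain, out_start, out_end, in_chain, in_start, in_end) in place of MappedContigSegment objects. Because subcontigs are assumed to be ordered like the output chains, output chain letters are simply assigned in order.

```python
from string import ascii_uppercase


def map_output_numbering(contig):
    """Map each contig part to output and input numbering (illustrative only)."""
    rows = []
    for out_chain, subcontig in zip(ascii_uppercase, contig.split()):
        parts = [p for p in subcontig.split("/") if p != "0"]
        has_generated = any(not p[0].isalpha() for p in parts)
        position = 0  # residues already placed on this output chain
        for part in parts:
            if part[0].isalpha():  # fixed part copied from the input structure
                in_chain = part[0]
                in_start, in_end = (int(x) for x in part[1:].split("-"))
                length = in_end - in_start + 1
            else:  # generated part, must be resolved (N-N)
                low, high = part.split("-")
                if low != high:
                    raise ValueError(f"Generated part must be resolved (N-N): {part!r}")
                in_chain = in_start = in_end = None
                length = int(low)
            if has_generated:
                # Chains with generated residues are renumbered from 1.
                out_start, out_end = position + 1, position + length
            else:
                # Fully fixed chains keep their input numbering.
                out_start, out_end = in_start, in_end
            position += length
            rows.append((out_chain, out_start, out_end, in_chain, in_start, in_end))
    return rows


for row in map_output_numbering("A1-5/5-5/A20-24/0 B1-8"):
    print(row)
# ('A', 1, 5, 'A', 1, 5)
# ('A', 6, 10, None, None, None)
# ('A', 11, 15, 'A', 20, 24)
# ('B', 1, 8, 'B', 1, 8)
```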