drugforge.data.schema.ligand.Ligand

class drugforge.data.schema.ligand.Ligand(*, compound_name: str | None = None, ids: LigandIdentifiers | None = None, provenance: LigandProvenance, experimental_data: ExperimentalCompoundData | None = None, expansion_tag: StateExpansionTag | None = None, charge_provenance: ChargeProvenance | None = None, bespoke_parameters: BespokeParameters | None = None, tags: dict[str, str] = {}, conf_tags: dict[str, list] | None = {}, data: str, data_format: Literal[DataStorageType.sdf] = DataStorageType.sdf)[source]

Bases: DataModelAbstractBase

Schema for a Ligand.

Has first class serialization support for SDF files as well as the typical JSON and dictionary serialization.

Note that equality comparisons are done on the chemical structure data found in the data field, not on the other fields or the SD tags in the original SDF. This means you can change the other fields and still have equality, but changing the chemical structure data will change equality.

You must provide either a compound_name or an ids field; otherwise the ligand will be invalid.
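The two rules above (equality driven only by the structure data, and the need for a name or identifiers) can be illustrated with a small stand-in class. This is a sketch using only the standard library, not drugforge's implementation: `LigandSketch`, its plain-dict `ids`, and the error message are hypothetical, and the real class enforces its rules through pydantic validation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class LigandSketch:
    """Hypothetical stand-in that mimics two documented Ligand rules."""

    data: str  # chemical structure block (SDF text in the real class)
    compound_name: Optional[str] = None
    ids: Optional[dict] = None  # stand-in for LigandIdentifiers
    tags: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Rule 1: a ligand needs a compound_name or ids to be valid.
        if self.compound_name is None and self.ids is None:
            raise ValueError("provide compound_name or ids")

    def __eq__(self, other: object) -> bool:
        # Rule 2: equality looks only at the structure data.
        if not isinstance(other, LigandSketch):
            return NotImplemented
        return self.data == other.data


a = LigandSketch(data="<mol block A>", compound_name="cmpd-1", tags={"pIC50": "6.2"})
b = LigandSketch(data="<mol block A>", compound_name="renamed")
c = LigandSketch(data="<mol block B>", compound_name="cmpd-1")

same_structure = a == b        # names and tags differ, structure matches
different_structure = a == c   # same name, different structure
```

Here `same_structure` is True and `different_structure` is False, which mirrors the documented behaviour that only the data field takes part in comparisons.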

Parameters:
  • compound_name (str, optional) – Name of compound, by default None

  • ids (Optional[LigandIdentifiers], optional) – LigandIdentifiers Schema for identifiers associated with this ligand, by default None

  • experimental_data (Optional[ExperimentalCompoundData], optional) – ExperimentalCompoundData Schema for experimental data associated with the compound, by default None

  • tags (dict[str, str], optional) – Dictionary of SD tags, by default {}

  • data (str, optional, private) – Chemical structure data from the SDF file, stored as a string

  • data_format (DataStorageType, optional, private, const) – Enum describing the data storage method, by default DataStorageType.sdf

__init__(**data: Any) None

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Methods

__init__(**data)

Create a new model by parsing and validating input data from keyword arguments.

clear_SD_data()

Clear the SD data for the ligand

construct([_fields_set])

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

data_equal(other)

dict(*[, include, exclude, by_alias, ...])

from_dict(dict)

from_inchi(inchi, **kwargs)

Create a Ligand from an InChI string

from_json(json_str)

from_json_file(file)

from_mol2(mol2_file, **kwargs)

Read in a ligand from an MOL2 file extracting all possible SD data into internal fields.

from_oemol(mol, **kwargs)

Create a Ligand from an OEMol extracting all SD tags into the internal model

from_openfe(mol, **kwargs)

Create a Ligand from an openfe SmallMoleculeComponent

from_orm(obj)

from_sdf(sdf_file, **kwargs)

Read in a ligand from an SDF file extracting all possible SD data into internal fields.

from_sdf_str(sdf_str, **kwargs)

Create a Ligand from an SDF string

from_single_conformers(confs)

Create a Ligand object from a list of Ligand objects, each representing a single conformer.

from_smiles(smiles, **kwargs)

Create a Ligand from a SMILES string

full_equal(other)

get_chemical_relationship(other)

Get the chemical relationship between two ligands

get_schema_version()

get_single_conf_SD_data([i])

Get the SD data for the ligand for a particular conformer.

has_same_charge(other)

Check if the ligand has the same charge as another ligand (the ligands can be the same).

is_chemically_equal(other)

Check if the ligand is chemically equal to another ligand using the inchikey.

is_protonation_state_isomer(other)

Check if the ligand is a conjugate acid or base of another ligand by neutralizing both ligands and checking if they are chemically equal.

is_stereoisomer(other)

Check if the ligand is a possible stereoisomer of another ligand.

is_tautomer(other)

Check if the ligand is a tautomer of another ligand, excluding protonation state isomers.

json(*[, include, exclude, by_alias, ...])

model_construct([_fields_set])

Creates a new instance of the Model class with validated data.

model_copy(*[, update, deep])

Returns a copy of the model.

model_dump(*[, mode, include, exclude, ...])

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

model_dump_json(*[, indent, include, ...])

Generates a JSON representation of the model using Pydantic’s to_json method.

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(context, /)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

Validate the given JSON data against the Pydantic model.

model_validate_strings(obj, *[, strict, ...])

Validate the given object with string data against the Pydantic model.

parse_file(path, *[, content_type, ...])

parse_obj(obj)

parse_raw(b, *[, content_type, encoding, ...])

print_SD_data()

Print the SD data for the ligand

schema([by_alias, ref_template])

schema_json(*[, by_alias, ref_template])

set_SD_data(data)

Set the SD data for the ligand, uses an update to overwrite existing data in line with OpenEye behaviour

set_expansion(parent, provenance)

Set the expansion of the ligand with a reference to the parent ligand and the settings used to create the expansion.

sort_confs_by_sd_tag_value(by[, ascending])

Sort the conformers of the ligand by a particular sd tag.

to_json_file(file)

to_oemol()

Convert the current molecule state to an OEMol including all fields as SD tags

to_openfe()

Convert to an openfe SmallMoleculeComponent via the rdkit interface.

to_rdkit()

Convert the current molecule state to an RDKit molecule including all fields as SD tags.

to_sdf(filename[, allow_append])

Write out the ligand to an SDF file with all attributes stored as SD tags

to_sdf_str()

Return a string representation of the ligand and its SD data that can be written out to an SDF file

to_single_conformers()

Return a Ligand object for each conformer.

update_forward_refs(**localns)

validate(value)

Attributes

canonical_tautomer

Get the canonical tautomer of the ligand.

fixed_inchi

fixed_inchikey

flattened

Return a version of the ligand with the 3D coordinates and stereochemical information removed.

has_defined_stereo

Check if the ligand has defined stereochemistry.

has_multiple_poses

Check if the ligand has multiple poses.

has_perceived_stereo

Check if the ligand has any stereo bonds or chiral centers.

inchi

Get the InChI string for the ligand

inchikey

Get the InChIKey string for the ligand

model_computed_fields

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_extra

Get extra fields set during validation.

model_fields

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

neutralized

Get the neutralized version of the ligand.

non_iso_smiles

Get the non-isomeric canonical SMILES string for the ligand

num_poses

Get the number of poses in the ligand.

size

Size of the resulting JSON object for this class

smiles

Get the canonical isomeric SMILES string for the ligand

compound_name

ids

provenance

experimental_data

expansion_tag

charge_provenance

bespoke_parameters

tags

conf_tags

data

data_format

property canonical_tautomer: Ligand

Get the canonical tautomer of the ligand. Not necessarily the most physiologically relevant tautomer, but helpful for comparing ligands.

clear_SD_data() None[source]

Clear the SD data for the ligand

copy(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, update: Dict[str, Any] | None = None, deep: bool = False) Self

Returns a copy of the model.

Deprecated: this method is now deprecated; use model_copy instead.

If you need include or exclude, use:

    data = self.model_dump(include=include, exclude=exclude, round_trip=True)
    data = {**data, **(update or {})}
    copied = self.model_validate(data)

Args:

  • include – Optional set or mapping specifying which fields to include in the copied model.

  • exclude – Optional set or mapping specifying which fields to exclude in the copied model.

  • update – Optional dictionary of field-value pairs to override field values in the copied model.

  • deep – If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.
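The replacement recipe for the deprecated copy boils down to filtering the dumped fields and then overlaying the updates. The sketch below reproduces that merge on plain dictionaries so the precedence is easy to see; `copy_with` is a hypothetical helper written for illustration and is not part of pydantic or drugforge.

```python
from typing import Any, Optional


def copy_with(
    fields: dict[str, Any],
    include: Optional[set[str]] = None,
    exclude: Optional[set[str]] = None,
    update: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: filter the fields, then overlay the updates."""
    keep = set(fields) if include is None else set(include)
    keep -= exclude or set()
    data = {name: value for name, value in fields.items() if name in keep}
    # Same merge as `{**data, **(update or {})}`: updates win on key clashes.
    return {**data, **(update or {})}


original = {"compound_name": "cmpd-1", "tags": {"series": "A"}, "data": "<mol block>"}
copied = copy_with(original, exclude={"tags"}, update={"compound_name": "cmpd-1-copy"})
```

In the real recipe the merged dictionary is then passed to model_validate, so the updated values are validated, unlike the update argument of model_copy.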

property fixed_inchi: str

The fixed hydrogen InChI for the ligand.

property fixed_inchikey: str

The fixed hydrogen layer InChIKey for the ligand.

property flattened: Ligand

Return a version of the ligand with the 3D coordinates and stereochemical information removed.

classmethod from_inchi(inchi: str, **kwargs) Ligand[source]

Create a Ligand from an InChI string

classmethod from_mol2(mol2_file: str | Path, **kwargs) Ligand[source]

Read in a ligand from an MOL2 file extracting all possible SD data into internal fields.

Parameters:

mol2_file (Union[str, Path]) – Path to the MOL2 file

classmethod from_oemol(mol: openeye.oechem.OEMol, **kwargs) Ligand[source]

Create a Ligand from an OEMol extracting all SD tags into the internal model

classmethod from_openfe(mol: gufe.components.SmallMoleculeComponent, **kwargs) Ligand[source]

Create a Ligand from an openfe SmallMoleculeComponent

classmethod from_sdf(sdf_file: str | Path, **kwargs) Ligand[source]

Read in a ligand from an SDF file extracting all possible SD data into internal fields.

Parameters:

sdf_file (Union[str, Path]) – Path to the SDF file

classmethod from_sdf_str(sdf_str: str, **kwargs) Ligand[source]

Create a Ligand from an SDF string

classmethod from_single_conformers(confs: list[Ligand]) ['Ligand'][source]

Create a Ligand object from a list of Ligand objects, each representing a single conformer.

This is a bit complicated because we want to ensure that the resulting Ligand object has the same data as all the original conformers.
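One way to picture the bookkeeping involved is to merge the per-conformer tag dictionaries into the dict[str, list] shape used by conf_tags. The sketch below is an assumption about the general idea only: `merge_conformer_tags` is a hypothetical helper, and how the real method treats conformers with mismatched tags is not stated in this page.

```python
def merge_conformer_tags(per_conf_tags: list[dict[str, str]]) -> dict[str, list]:
    """Hypothetical helper: collect one value per conformer for every tag."""
    if not per_conf_tags:
        raise ValueError("need at least one conformer")
    keys = set(per_conf_tags[0])
    # Refuse to merge when conformers disagree on which tags exist,
    # because the merged lists would no longer line up with the conformers.
    if any(set(tags) != keys for tags in per_conf_tags):
        raise ValueError("conformers carry different tag sets")
    return {key: [tags[key] for tags in per_conf_tags] for key in sorted(keys)}


single = [{"score": "-7.1", "pose": "0"}, {"score": "-6.4", "pose": "1"}]
merged = merge_conformer_tags(single)
```

Each list in `merged` keeps the conformer order of the input, so index i in every list refers to the same conformer.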

classmethod from_smiles(smiles: str, **kwargs) Ligand[source]

Create a Ligand from a SMILES string

get_chemical_relationship(other: Ligand) ChemicalRelationship[source]

Get the chemical relationship between two ligands

get_single_conf_SD_data(i: int = 0) dict[str, str][source]

Get the SD data for the ligand for a particular conformer. Defaults to the first one. If you’d like to get SD data for all the conformers, those are saved in Ligand.conf_tags.

Parameters:

i (int) – Return the ith conformer. Defaults to the first one (i=0).

Returns:

A dictionary of key: value pairs for the SD tags.

Return type:

dict[str, str]
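The relationship between the per-conformer lists in conf_tags and the flat dictionary returned for one conformer can be sketched as an index lookup. This is an illustration under assumptions, not the library's code: `single_conf_sd_data` is a hypothetical helper, and converting every value to str is assumed from the documented dict[str, str] return type.

```python
def single_conf_sd_data(conf_tags: dict[str, list], i: int = 0) -> dict[str, str]:
    """Hypothetical helper: pick the i-th value of every per-conformer tag."""
    return {key: str(values[i]) for key, values in conf_tags.items()}


conf_tags = {"docking_score": [-7.1, -6.4], "pose_id": ["a", "b"]}
first = single_conf_sd_data(conf_tags)        # defaults to conformer 0
second = single_conf_sd_data(conf_tags, i=1)  # explicit conformer index
```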

property has_defined_stereo: bool

Check if the ligand has defined stereochemistry. Will be true if there are chiral centers and they are defined. If there are defined stereo bonds but no chiral centers (possible if some places are “over-defined”) this will be false.

property has_multiple_poses: bool

Check if the ligand has multiple poses.

property has_perceived_stereo: bool

Check if the ligand has any stereo bonds or chiral centers. Will be true if there are chiral centers even if they are undefined. Returns True if the ligand contains any stereochemistry, otherwise False.

has_same_charge(other: Ligand) bool[source]

Check if the ligand has the same charge as another ligand (the ligands can be the same).

property inchi: str

Get the InChI string for the ligand

property inchikey: str

Get the InChIKey string for the ligand

is_chemically_equal(other: Ligand) bool[source]

Check if the ligand is chemically equal to another ligand using the inchikey. The ligands must either both have defined stereochemistry or both lack it.

is_protonation_state_isomer(other: Ligand) bool[source]

Check if the ligand is a conjugate acid or base of another ligand by neutralizing both ligands and checking if they are chemically equal.

is_stereoisomer(other: Ligand) bool[source]

Check if the ligand is a possible stereoisomer of another ligand. Returns False if the ligands are the same.

is_tautomer(other: Ligand) bool[source]

Check if the ligand is a tautomer of another ligand, excluding protonation state isomers. Returns False if the ligands are the same or stereoisomers.

model_config: ClassVar[ConfigDict] = {'validate_assignment': True}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

classmethod model_construct(_fields_set: set[str] | None = None, **values: Any) Self

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

!!! note

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Args:

  • _fields_set – A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the model_fields_set attribute. Otherwise, the field names from the values argument will be used.

  • values – Trusted or pre-validated data dictionary.

Returns:

A new instance of the Model class with validated data.

model_copy(*, update: Mapping[str, Any] | None = None, deep: bool = False) Self

Returns a copy of the model.

!!! note

The underlying instance’s __dict__ attribute is copied. This might have unexpected side effects if you store anything in it, on top of the model fields (e.g. the value of cached properties created with functools.cached_property).

Args:

  • update – Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.

  • deep – Set to True to make a deep copy of the model.

Returns:

New model instance.
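The practical difference between a shallow and a deep copy shows up with mutable fields such as tags. The sketch below uses the standard library's copy module on a hypothetical `Record` class as an analogy for deep=False and deep=True; it does not call pydantic itself.

```python
import copy


class Record:
    """Hypothetical stand-in for a model holding a mutable field."""

    def __init__(self, name: str, tags: dict) -> None:
        self.name = name
        self.tags = tags


original = Record("cmpd-1", {"series": "A"})
shallow = copy.copy(original)      # analogous to deep=False
deep = copy.deepcopy(original)     # analogous to deep=True

original.tags["series"] = "B"      # mutate the nested dict in place

shallow_value = shallow.tags["series"]  # shares the dict with the original
deep_value = deep.tags["series"]        # owns an independent dict
```

After the mutation `shallow_value` is "B" while `deep_value` is still "A", which is why a deep copy is the safer choice when the copy's tags will be edited independently.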

model_dump(*, mode: Literal['json', 'python'] | str = 'python', include: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, exclude: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, context: Any | None = None, by_alias: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, round_trip: bool = False, warnings: bool | Literal['none', 'warn', 'error'] = True, fallback: Callable[[Any], Any] | None = None, serialize_as_any: bool = False) dict[str, Any]

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Args:

  • mode – The mode in which to_python should run. If mode is ‘json’, the output will only contain JSON serializable types. If mode is ‘python’, the output may contain non-JSON-serializable Python objects.

  • include – A set of fields to include in the output.

  • exclude – A set of fields to exclude from the output.

  • context – Additional context to pass to the serializer.

  • by_alias – Whether to use the field’s alias in the dictionary key if defined.

  • exclude_unset – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults – Whether to exclude fields that are set to their default value.

  • exclude_none – Whether to exclude fields that have a value of None.

  • round_trip – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.

  • fallback – A function to call when an unknown value is encountered. If not provided, a pydantic_core.PydanticSerializationError error is raised.

  • serialize_as_any – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.

model_dump_json(*, indent: int | None = None, include: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, exclude: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, context: Any | None = None, by_alias: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, round_trip: bool = False, warnings: bool | Literal['none', 'warn', 'error'] = True, fallback: Callable[[Any], Any] | None = None, serialize_as_any: bool = False) str

Generates a JSON representation of the model using Pydantic’s to_json method.

Args:

  • indent – Indentation to use in the JSON output. If None is passed, the output will be compact.

  • include – Field(s) to include in the JSON output.

  • exclude – Field(s) to exclude from the JSON output.

  • context – Additional context to pass to the serializer.

  • by_alias – Whether to serialize using field aliases.

  • exclude_unset – Whether to exclude fields that have not been explicitly set.

  • exclude_defaults – Whether to exclude fields that are set to their default value.

  • exclude_none – Whether to exclude fields that have a value of None.

  • round_trip – If True, dumped values should be valid as input for non-idempotent types such as Json[T].

  • warnings – How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.

  • fallback – A function to call when an unknown value is encountered. If not provided, a pydantic_core.PydanticSerializationError error is raised.

  • serialize_as_any – Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.

property model_extra: dict[str, Any] | None

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or None if config.extra is not set to “allow”.

property model_fields_set: set[str]

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.

classmethod model_json_schema(by_alias: bool = True, ref_template: str = '#/$defs/{model}', schema_generator: type[~pydantic.json_schema.GenerateJsonSchema] = <class 'pydantic.json_schema.GenerateJsonSchema'>, mode: ~typing.Literal['validation', 'serialization'] = 'validation') dict[str, Any]

Generates a JSON schema for a model class.

Args:

  • by_alias – Whether to use attribute aliases or not.

  • ref_template – The reference template.

  • schema_generator – To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications.

  • mode – The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.

classmethod model_parametrized_name(params: tuple[type[Any], ...]) str

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Args:

  • params – Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where params are passed to cls as type variables.

Raises:

TypeError: Raised when trying to generate concrete names for non-generic models.

model_post_init(context: Any, /) None

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

classmethod model_rebuild(*, force: bool = False, raise_errors: bool = True, _parent_namespace_depth: int = 2, _types_namespace: MappingNamespace | None = None) bool | None

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Args:

  • force – Whether to force the rebuilding of the model schema, defaults to False.

  • raise_errors – Whether to raise errors, defaults to True.

  • _parent_namespace_depth – The depth level of the parent namespace, defaults to 2.

  • _types_namespace – The types namespace, defaults to None.

Returns:

Returns None if the schema is already “complete” and rebuilding was not required. If rebuilding was required, returns True if rebuilding was successful, otherwise False.

classmethod model_validate(obj: Any, *, strict: bool | None = None, from_attributes: bool | None = None, context: Any | None = None, by_alias: bool | None = None, by_name: bool | None = None) Self

Validate a pydantic model instance.

Args:

  • obj – The object to validate.

  • strict – Whether to enforce types strictly.

  • from_attributes – Whether to extract data from object attributes.

  • context – Additional context to pass to the validator.

  • by_alias – Whether to use the field’s alias when validating against the provided input data.

  • by_name – Whether to use the field’s name when validating against the provided input data.

Raises:

ValidationError: If the object could not be validated.

Returns:

The validated model instance.

classmethod model_validate_json(json_data: str | bytes | bytearray, *, strict: bool | None = None, context: Any | None = None, by_alias: bool | None = None, by_name: bool | None = None) Self

Validate the given JSON data against the Pydantic model.

Args:

  • json_data – The JSON data to validate.

  • strict – Whether to enforce types strictly.

  • context – Extra variables to pass to the validator.

  • by_alias – Whether to use the field’s alias when validating against the provided input data.

  • by_name – Whether to use the field’s name when validating against the provided input data.

Returns:

The validated Pydantic model.

Raises:

ValidationError: If json_data is not a JSON string or the object could not be validated.

classmethod model_validate_strings(obj: Any, *, strict: bool | None = None, context: Any | None = None, by_alias: bool | None = None, by_name: bool | None = None) Self

Validate the given object with string data against the Pydantic model.

Args:

  • obj – The object containing string data to validate.

  • strict – Whether to enforce types strictly.

  • context – Extra variables to pass to the validator.

  • by_alias – Whether to use the field’s alias when validating against the provided input data.

  • by_name – Whether to use the field’s name when validating against the provided input data.

Returns:

The validated Pydantic model.

property neutralized: Ligand

Get the neutralized version of the ligand.

property non_iso_smiles: str

Get the non-isomeric canonical SMILES string for the ligand

property num_poses: int

Get the number of poses in the ligand.

print_SD_data() None[source]

Print the SD data for the ligand

set_SD_data(data: dict[str, str | list]) None[source]

Set the SD data for the ligand, uses an update to overwrite existing data in line with OpenEye behaviour
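The update semantics described here match those of a dictionary update: existing keys are overwritten, new keys are added, and untouched keys are kept. The sketch below shows this on a plain dict. It is only an analogy, since set_SD_data also accepts list values (presumably per-conformer data), which this example does not model.

```python
tags = {"series": "A", "pIC50": "6.2"}
new_data = {"pIC50": "6.8", "assay": "fluorescence"}

# Overwrites "pIC50", adds "assay", and leaves "series" untouched.
tags.update(new_data)
```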

set_expansion(parent: Ligand, provenance: dict[str, Any]) None[source]

Set the expansion of the ligand with a reference to the parent ligand and the settings used to create the expansion.

Parameters:
  • parent – The parent ligand from which this child was created.

  • provenance – The provenance dictionary of the state expander used to create this ligand, created via `expander.provenance()`, where the keys are fields of the expander and the values capture the associated settings.

property size: ByteSize

Size of the resulting JSON object for this class

property smiles: str

Get the canonical isomeric SMILES string for the ligand

sort_confs_by_sd_tag_value(by: str, ascending: bool = True) ndarray[source]

Sort the conformers of the ligand by a particular sd tag. Changes the Ligand object IN PLACE and returns the indices of the conformers in the sorted order.

Parameters:
  • by (str) – Key value of SD tag to use

  • ascending (bool) – Whether to sort the values in ascending order, by default True.

Returns:

Array of len(num_confs) returned by np.argsort. Represents the set of indices that sorts the original conformer list into the new order.

Return type:

np.ndarray

Raises:

ValueError – If ‘by’ tag not found in ligand tags or if unable to sort the conformers
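The in-place sort and the returned index array can be sketched with plain lists. This is a stand-in under stated assumptions rather than the library's code: `sort_confs_by_tag` is hypothetical, it returns a list instead of an np.ndarray, and it assumes the tag values can be converted to float for comparison.

```python
def sort_confs_by_tag(conf_tags: dict[str, list], by: str, ascending: bool = True) -> list[int]:
    """Hypothetical helper: reorder all per-conformer lists in place by one tag."""
    if by not in conf_tags:
        raise ValueError(f"tag {by!r} not found")
    try:
        keys = [float(value) for value in conf_tags[by]]
    except (TypeError, ValueError) as err:
        raise ValueError(f"cannot sort conformers by {by!r}") from err
    # Argsort: positions of the original conformers in their new order.
    order = sorted(range(len(keys)), key=keys.__getitem__, reverse=not ascending)
    for values in conf_tags.values():
        values[:] = [values[i] for i in order]  # slice assignment mutates in place
    return order


conf_tags = {"score": ["-6.4", "-7.1", "-5.0"], "pose_id": ["a", "b", "c"]}
order = sort_confs_by_tag(conf_tags, by="score")
```

Here `order` is [1, 0, 2]: the conformer originally at index 1 has the lowest score and moves to the front, and every tag list is reordered the same way.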

to_oemol() openeye.oechem.OEMol[source]

Convert the current molecule state to an OEMol including all fields as SD tags

to_openfe() gufe.components.SmallMoleculeComponent[source]

Convert to an openfe SmallMoleculeComponent via the rdkit interface.

to_rdkit() Chem.Mol[source]

Convert the current molecule state to an RDKit molecule including all fields as SD tags.

to_sdf(filename: str | Path, allow_append=False) None[source]

Write out the ligand to an SDF file with all attributes stored as SD tags

Parameters:
  • filename (Union[str, Path]) – Path to the SDF file

  • allow_append (bool, optional) – Allow appending to the file, by default False

to_sdf_str() str[source]

Return a string representation of the ligand and its SD data that can be written out to an SDF file

to_single_conformers() ['Ligand'][source]

Return a Ligand object for each conformer.