This page was generated from /home/docs/checkouts/readthedocs.org/user_builds/asapdiscovery/checkouts/latest/docs/_collections/notebooks/running_md_simulations.ipynb.
Running MD simulations
Running vanilla molecular dynamics (MD) simulations on protein-ligand complexes can provide information about the dynamics of a given ligand inside the binding site and can be used to assess the quality and stability of a proposed binding mode.
Below we will demonstrate an example of running MD simulations on a protein ligand complex, following docking, using the drugforge framework.
We will be running this example on an ASAP target, the SARS-CoV-2 nsp3 Mac1 macrodomain; this removes ADP ribose from viral and host cell proteins. The removal of this post-translational modification reduces the inflammatory and antiviral responses to infection — facilitating replication. For more information on Mac1, follow this link.
Required modules
To execute this example, the following drugforge modules to be installed:
data,docking,modelingsimulation
To enable the visualization of the docking results, the following modules will also need to be installed:
datavizspectrum.
Note this example requires users have an OpenEye license.
Please refer to the installation instructions for more details.
[1]:
# import some dependencies
from drugforge.data.testing.test_resources import fetch_test_file
from drugforge.docking.docking import DockingInputPair
from drugforge.docking.openeye import POSITDocker
from drugforge.data.schema.complex import Complex, PreppedComplex
from drugforge.data.schema.ligand import Ligand
from drugforge.simulation.simulate import VanillaMDSimulator
Docking an arbitrary ligand to Mac1
Let us fetch an example structure of Mac1, hosted in the ASAP testing repository. For convenience, we will utilize the fetch_test_file function (part of the data module) to download this file and return its location.
[2]:
# fetch a PDB file from the test suite, in this case a PDB from the COVID MOONSHOT.
protein_pdb_file = fetch_test_file("SARS2_Mac1A-A1013.pdb")
# print out the location of the pdb file we just downloaded.
print(protein_pdb_file)
/Users/hugomacdermott/Library/Caches/asapdiscovery_testing/SARS2_Mac1A-A1013.pdb
Now we will create a Complex object from the .pdb file (see the tutorial on base level ASAP abstractions for more info)
[3]:
# make a complex
mac1_complex = Complex.from_pdb(protein_pdb_file, ligand_kwargs={"compound_name": "A1013"}, target_kwargs={"target_name": "SARS2_Mac1A"})
We will need to create a Ligand object to dock into the structure. In this case we will generate the ligand from a SMILES string. Note, in addition to SMILES, the ligands can also be created from InChI strings, SDF files, and OEMol instances.
[4]:
# make the ligand we want to dock, a simple alkane
ligand = Ligand.from_smiles("CCCCCCC", compound_name="alkane")
Next, we will run protein preparation.
[5]:
# prepare our structure
prepped_mac1_complex_complex = PreppedComplex.from_complex(mac1_complex)
# pair it up with the ligand we want to dock.
docking_input_pair = DockingInputPair(complex=prepped_mac1_complex_complex, ligand=ligand)
Warning: No BioAssembly transforms found, using input molecule as biounit: ---_LIG
DPI: 0.12, RFree: 0.28, Resolution: 1.48
Processing BU # 1 with title: ---_LIG, chains AB
Now, we dock it to our protein. Note, use_dask is set to False, disabling parallel execution, as it is not required for this example (Note, dask is discussed at the end of this tutorial).
[6]:
# run OpenEye POSIT docking,
docker = POSITDocker(use_omega=False)
results = docker.dock([docking_input_pair], use_dask=False)
[7]:
print(results[0].posed_ligand.tags)
{'docking-confidence-POSIT': 0.019999999552965164, '_POSIT_method': 'FRED'}
Vizualise the docked pose
Let us now vizualise our results! For more information on vizualisations, see the vizualisation notebook hosted in the examples directory.
[8]:
# create a visualization factory.
from drugforge.dataviz.html_viz import HTMLVisualizer
html_vizualizer = HTMLVisualizer(
target="SARS-CoV-2-Mac1",
color_method="subpockets",
align=True,
output_dir="tutorial_files/running_md_simulations/",
write_to_disk=True,
)
vizs_from_docked = html_vizualizer.visualize(inputs=results, outpaths=["visualise_docked.html"], use_dask=False)
2024-05-10 13:03:23,875 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0
2024-05-10 13:03:23,875 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai
2024-05-10 13:03:23,875 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294
2024-05-10 13:03:23,875 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb
2024-05-10 13:03:24,022 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpsqm6hmwx/
[9]:
from IPython.display import IFrame
IFrame(vizs_from_docked["html_path_pose"][0], 1000, 1000)
[9]:
Running a single MD simulation
Great, our pose looks good! Lets use the VanillaMDSimulator to run simulations of the protein-ligand complex.
The VanillaMDSimulator has many options for running simulations in different configurations, however a basic configuration should be sufficient for this example. For the purposes of this tutorial we will keep the simulations very, very short (simulation time <1 minute for a single simulation on a typical GPU-enabled computer).
[10]:
VanillaMDSimulator?
Init signature:
VanillaMDSimulator(
*,
output_dir: pathlib.Path = 'md',
debug: bool = False,
collision_rate: pydantic.types.PositiveFloat = 1,
openmm_logname: str = 'openmm_log.tsv',
openmm_platform: asapdiscovery.simulation.simulate.OpenMMPlatform = <OpenMMPlatform.Fastest: 'Fastest'>,
temperature: pydantic.types.PositiveFloat = 300,
pressure: pydantic.types.PositiveFloat = 1,
timestep: pydantic.types.PositiveFloat = 4,
equilibration_steps: pydantic.types.PositiveInt = 5000,
reporting_interval: pydantic.types.PositiveInt = 1250,
num_steps: pydantic.types.PositiveInt = 2500000,
rmsd_restraint: bool = False,
rmsd_restraint_atom_indices: list[int] = [],
rmsd_restraint_type: Optional[str] = None,
rmsd_restraint_force_constant: pydantic.types.PositiveFloat = 50,
truncate_steps: bool = True,
small_molecule_force_field: str = 'openff-2.2.0',
**extra_data: Any,
) -> None
Docstring: Base class for Simulators.
Init docstring:
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
File: ~/Desktop/asap/asapdiscovery/asapdiscovery-simulation/asapdiscovery/simulation/simulate.py
Type: ModelMetaclass
Subclasses:
Let us set up the simulator:
[11]:
md_simulator = VanillaMDSimulator(
output_dir="tutorial_files/running_md_simulations/",
equilibration_steps=1,
num_steps=1,
reporting_interval=1)
To run the simulation, we will pass the output from the docking performed early (saved as results) to the simulation function of the VanillaMDSimulation instance. This will launch a simulation using the OpenMM simulation package.
[12]:
simulation_results = md_simulator.simulate(
results,
use_dask=False)
This function returns a list, where each entry contains an instance of SimulationResult corresponding to each of the simulations executed. In this case, we only ran a single simulation, and to query the output we just need to set the index to be 0. For example, let us check to see if the simulation completed successfully:
[13]:
simulation_results[0].success
[13]:
True
The simulator makes unique paths for the resulting simulations, which can be access via the simulation_results output. The pathes that are returned are relevative to the directory where the simulation was launched. Note, future releases will add additional flexibility with regards to simulation output.
[14]:
print(simulation_results[0].traj_path)
tutorial_files/running_md_simulations/SARS2_Mac1A-b27f22555232d2d68273612ffce5a119d6e22526d95ce3eb0db9012632bcdaf6+FHHVXLFEHODNRQ-XCZWEQHLNA-M_alkane-IMNFDUFMRHMDMM-UHFFFAOYNA-N/traj.xtc
[15]:
print(simulation_results[0].final_pdb_path)
tutorial_files/running_md_simulations/SARS2_Mac1A-b27f22555232d2d68273612ffce5a119d6e22526d95ce3eb0db9012632bcdaf6+FHHVXLFEHODNRQ-XCZWEQHLNA-M_alkane-IMNFDUFMRHMDMM-UHFFFAOYNA-N/final.pdb
Running multiple simulations in parallel with dask-cuda
We can pass a series of DockingResults to the VanillaMDSimulator and have dask-cuda parallelize work over available GPU resources (see here).
For this we will need a LocalCUDACluster. You will only see parallelism here if you have more than one GPU, e.g on an HPC cluster, otherwise, the simulations will run sequentially.
NOTE dask_cuda is not available for MacOS computers.
[ ]:
# create a dask_cuda LocalCUDACluster
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
cluster = LocalCUDACluster()
gpu_client = Client(cluster)
[ ]:
# drugforge provides a convenience function to do this
from drugforge.data.util.dask_utils import DaskType, make_dask_client_meta
gpu_client = make_dask_client_meta(DaskType.LOCAL_GPU)
We can see by inspecting the signature of the simulate method that it can accept a dask Client
[ ]:
VanillaMDSimulator.simulate?
Lets extend the results list of inputs so that these could be run in parallel; this will result in 3 simulation instances.
[ ]:
results_par = results + results + results # duplicate X3
[ ]:
md_simulator = VanillaMDSimulator(
output_dir="tutorial_files/running_md_simulations/",
equilibration_steps=1,
num_steps=1,
reporting_interval=1)
simulation_results_parallel = md_simulator.simulate(
results_par,
dask_client=gpu_client,
failure_mode="skip",
use_dask=True)
The simulation results are stored in a list, where the outputs for each simulation can be accessed in the same way as demonstrated above:
[ ]:
for i, sim_result in enumerate(simulation_results_parallel):
print(f"simulation {i}: ", sim_result.traj_path)
[ ]: