{
"cells": [
{
"cell_type": "markdown",
"id": "3fec30d7-4fa5-460a-b251-f6f8e830feb8",
"metadata": {},
"source": [
"# Visualizing ASAP targets"
]
},
{
"cell_type": "markdown",
"id": "98d9335f-2f85-4e3c-8dea-bc5d70b10cc9",
"metadata": {},
"source": [
"A key aspect of communicating computational chemistry is having easy to interpret visual aids that assist decision making. To this end we have developed easy ways to visualize ASAP targets in portable and easy to interpret ways.\n",
"\n",
"This includes visualizing protein-ligand conformations, molecular dynamics simulations and viral fitness data. "
]
},
{
"cell_type": "markdown",
"id": "56c5d71f-6c4c-4535-93dc-1b66632944e9",
"metadata": {},
"source": [
"## HTML views of protein-ligand conformations\n",
"\n",
"Protein-ligand conformations are central to the drug design DMTA cycle and need to be viewed quickly and in large numbers. To this end we developed a portable interactive HTML representation of protein-ligand conformations for our targets based on [3DMol](https://3dmol.csb.pitt.edu/) that can easily be shared between team members and outside collaborators, embedded into various platforms and hosted on cloud repositories. \n",
"\n",
"To make one of these HTML representations, follow the steps below!"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9da29e7b-f540-45dd-86eb-12a86b36b429",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# import some dependencies\n",
"\n",
"from drugforge.data.backend.openeye import oechem\n",
"from drugforge.data.schema.complex import Complex, PreppedComplex\n",
"from drugforge.data.schema.ligand import Ligand\n",
"from drugforge.data.testing.test_resources import fetch_test_file\n",
"from drugforge.dataviz.html_viz import HTMLVisualizer\n",
"from drugforge.docking.docking import DockingInputPair\n",
"from drugforge.docking.openeye import POSITDocker, POSITDockingResults\n",
"from drugforge.simulation.simulate import SimulationResult\n",
"from IPython.display import display, HTML, IFrame"
]
},
{
"cell_type": "markdown",
"id": "cea33ea3-5aea-4058-bfc2-63b7d38efafb",
"metadata": {},
"source": [
"To learn more about how the base level abstractions such as `Ligand`, `Complex` etc work, it is reccomended to run through the `working_with_data` tutorial (see Tutorial index)."
]
},
{
"cell_type": "markdown",
"id": "f9cae1a7-08d6-409d-af53-2d85db2c5e7d",
"metadata": {},
"source": [
"We have designed the `Visualization` module (and others) so that they work seamlessly with multiple levels of abstraction. Here we will be exploring making HTML renders from a **PDB file**, an in-memory `Complex` object and from a set of **docking results**. This gives flexibility to work with data that is more or less structured with ease. "
]
},
{
"cell_type": "markdown",
"id": "150d2864-81bc-49af-a795-78b86895d410",
"metadata": {},
"source": [
"### From a PDB file "
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "97d0de5f-9c19-4cd0-b952-7ef8a208f035",
"metadata": {},
"outputs": [],
"source": [
"protein = fetch_test_file(\n",
" \"Mpro-P2660_0A_bound-prepped_complex.pdb\"\n",
") # fetch a PDB file from the test suite, in this case a PDB from the COVID MOONSHOT."
]
},
{
"cell_type": "markdown",
"id": "378e9d50-0cd0-4dda-8386-f5aca710352f",
"metadata": {},
"source": [
"We will use the `HTMLVisualizer` factory class to create our renders. Let's inspect its arguments:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "eaba416a-01e3-4915-98a0-2aa7159340ba",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\u001b[0;31mInit signature:\u001b[0m\n",
"\u001b[0mHTMLVisualizer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0masapdiscovery\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mservices\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpostera\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmanifold_data_validation\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTargetTags\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mcolor_method\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0masapdiscovery\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdataviz\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhtml_viz\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mColorMethod\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m<\u001b[0m\u001b[0mColorMethod\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msubpockets\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'subpockets'\u001b[0m\u001b[0;34m>\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mdebug\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mbool\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mwrite_to_disk\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mbool\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0moutput_dir\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mpathlib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mPath\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'html'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0malign\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mbool\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mfitness_data\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mOptional\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mAny\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mfitness_data_logoplots\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mOptional\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mAny\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mreference_protein\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mOptional\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mAny\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mDocstring:\u001b[0m \n",
"Class for generating HTML visualizations of poses.\n",
"\n",
"The main method is `visualize`, which takes a list of inputs and returns a list of HTML strings or writes them to disk, optionally a list of output paths can\n",
"be provided.\n",
"\n",
"The `visualize` is heavily overloaded and can take a list of `DockingResult`, `Path`, `Complex`, or a tuple of `Complex` and a list of `Ligand`.\n",
"\n",
"Parameters\n",
"----------\n",
"target : TargetTags\n",
" Target to visualize poses for\n",
"color_method : ColorMethod\n",
" Protein surface coloring method. Can be either by `subpockets` or `fitness`\n",
"debug : bool\n",
" Whether to run in debug mode\n",
"write_to_disk : bool\n",
" Whether to write the HTML files to disk or return them as strings\n",
"output_dir : Path\n",
" Output directory to write HTML files to\n",
"align : bool\n",
" Whether to align the poses to the reference protein\n",
"\u001b[0;31mInit docstring:\u001b[0m\n",
"Create a new model by parsing and validating input data from keyword arguments.\n",
"\n",
"Raises ValidationError if the input data cannot be parsed to form a valid model.\n",
"\u001b[0;31mFile:\u001b[0m ~/Desktop/asap/asapdiscovery/asapdiscovery-dataviz/asapdiscovery/dataviz/html_viz.py\n",
"\u001b[0;31mType:\u001b[0m ModelMetaclass\n",
"\u001b[0;31mSubclasses:\u001b[0m "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"HTMLVisualizer?"
]
},
{
"cell_type": "markdown",
"id": "e5099ff1-98cd-4203-becc-168ae9a55b34",
"metadata": {},
"source": [
"* We need to provide an ASAP target for the `target` argument, e.e `SARS-CoV-2-Mpro`\n",
"* We would like to colour by `subpocket` (more on other options later)\n",
"* We would like to align to a canonical reference structure `align=True`\n",
"* For the purposes of this notebook we will write to a folder called `html`"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "50b4990a-916c-494f-8c19-bb930dcc1e2a",
"metadata": {},
"outputs": [],
"source": [
"# create a visualization factory.\n",
"html_vizualizer = HTMLVisualizer(\n",
" target=\"SARS-CoV-2-Mpro\",\n",
" color_method=\"subpockets\",\n",
" align=True,\n",
" output_dir=\"tutorial_files/visualizing_asap_targets/\",\n",
" write_to_disk=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "707482df-7041-48b7-b3a3-4bd488886b55",
"metadata": {},
"source": [
"Fantastic! Ok now let's run our renders, passing in our list of inputs. We can optionally use [dask](https://www.dask.org/) to parallelize over our list of inputs for higher performance. This is important when dealing with lots of structures or inputs, but should be unnessecary for now. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7a6c91a9-602b-4e14-9596-29c5b749bffb",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-05-10 10:39:08,032 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0\n",
"2024-05-10 10:39:08,032 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai\n",
"2024-05-10 10:39:08,032 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294\n",
"2024-05-10 10:39:08,032 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb\n",
"2024-05-10 10:39:08,251 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpwk7rm22z/\n"
]
}
],
"source": [
"# create our visualizations, explicitly specifying an output path\n",
"vizs = html_vizualizer.visualize(\n",
" inputs=[protein], outpaths=[\"subpockets_render.html\"], use_dask=False\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "e253e45d-8db6-42ab-aad9-66f684c561df",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" ligand_id | \n",
" target_id | \n",
" SMILES | \n",
" html_path_pose | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" Mpro-P2660_0A_bound-prepped_complex_ligand | \n",
" Mpro-P2660_0A_bound-prepped_complex_target | \n",
" CNC(=O)CN1C[C@]2(CCN(C2=O)c3cncc4c3cc(cc4)Cl)c... | \n",
" tutorial_files/visualizing_asap_targets/subpoc... | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" ligand_id \\\n",
"0 Mpro-P2660_0A_bound-prepped_complex_ligand \n",
"\n",
" target_id \\\n",
"0 Mpro-P2660_0A_bound-prepped_complex_target \n",
"\n",
" SMILES \\\n",
"0 CNC(=O)CN1C[C@]2(CCN(C2=O)c3cncc4c3cc(cc4)Cl)c... \n",
"\n",
" html_path_pose \n",
"0 tutorial_files/visualizing_asap_targets/subpoc... "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vizs # result is a dataframe"
]
},
{
"cell_type": "markdown",
"id": "1c60675d-2a3b-47b7-ab03-f1586a1d8c29",
"metadata": {},
"source": [
"Ok now we have our render in memory, lets try and display it in this notebook!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "89c7cbf0-ce96-456c-b1c5-b36400680868",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import IFrame\n",
"\n",
"IFrame(vizs[\"html_path_pose\"][0], 1000, 1000)"
]
},
{
"cell_type": "markdown",
"id": "6af480f1-0f02-4952-b5fa-9fe2863c435a",
"metadata": {},
"source": [
"Wow! Very cool, we now have an interactive way to view ligand-protein complexes of ASAP targets, annotated with key interactions and important protein subpockets for the target of interest. Our medicinal chemists find this very useful for quickly viewing key interactions in docked virtual designs and crystal structures. "
]
},
{
"cell_type": "markdown",
"id": "556df9a8-8e7a-40c5-8ac6-920b1daf9a57",
"metadata": {},
"source": [
"### From an in-memory Complex representation. \n",
"We can follow similar steps to render an in-memory representation of our ligand to an HTML view."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "b741c16a-574f-4b07-97f0-d79e4c58c550",
"metadata": {},
"outputs": [],
"source": [
"# make a complex\n",
"sars_cov_2_complex = Complex.from_pdb(\n",
" protein,\n",
" ligand_kwargs={\"compound_name\": \"Mpro-P2660-bound-target\"},\n",
" target_kwargs={\"target_name\": \"Mpro-P2660\"},\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "aa260816-4cdc-44a9-a93a-b624e5256834",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-05-10 10:39:09,241 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0\n",
"2024-05-10 10:39:09,241 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai\n",
"2024-05-10 10:39:09,241 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294\n",
"2024-05-10 10:39:09,241 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb\n",
"2024-05-10 10:39:09,448 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpr6r76eog/\n"
]
}
],
"source": [
"# we can re-use our factory from before\n",
"vizs = html_vizualizer.visualize(\n",
" inputs=[sars_cov_2_complex], outpaths=[\"subpockets_from_complex.html\"], use_dask=False\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "5ec2dfe2-b3a8-4296-8e11-14213b4d6e4c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import IFrame\n",
"\n",
"IFrame(vizs[\"html_path_pose\"][0], 1000, 1000)"
]
},
{
"cell_type": "markdown",
"id": "616bbf36-d5e2-4499-80be-fd95a242945d",
"metadata": {},
"source": [
"Note that you can also easily open it with your web browser. e.g `google-chrome render.html`"
]
},
{
"cell_type": "markdown",
"id": "ae4e80cc-8515-4e69-8d3b-609b5ea30808",
"metadata": {},
"source": [
"### Docking a new structure!\n",
"\n",
"We have shown pre-prepared examples here so far. What if we want to dock and visualize a new structure?\n",
"\n",
"Note that docking will not be covered in depth here (see `Docking and Scoring` tutorial for more information. Lets dock our structure "
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "9b41f52a-8a9d-4295-942f-f5d3f0f70693",
"metadata": {},
"outputs": [],
"source": [
"# make the ligand we want to dock, a simple alkane\n",
"ligand = Ligand.from_smiles(\"CCCCCCC\", compound_name=\"alkane\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "c0b79882-2afc-4712-98d2-bc23b09be4df",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning: No BioAssembly transforms found, using input molecule as biounit: DesignUnit Components_LIG\n",
"Warning: Iridium - Structure: DesignUnit Components_LIG has no REMARK data\n",
"Processing BU # 1 with title: DesignUnit Components_LIG, chains AB\n"
]
}
],
"source": [
"# prepare our structure\n",
"prepped_sars_cov_2_complex = PreppedComplex.from_complex(sars_cov_2_complex)\n",
"# pair it up with the ligand we want to dock.\n",
"docking_input_pair = DockingInputPair(complex=prepped_sars_cov_2_complex, ligand=ligand)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "0127e48a-06d8-4cc0-b25f-646e982a1d57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[POSITDockingResults(type='POSITDockingResults', input_pair=DockingInputPair(complex=PreppedComplex(target=PreppedTarget(target_name='Mpro-P2660', ids=None, data_format=, target_hash='2353f6855b9359b5c6693a8e1dccd24b33c634f839f72d192b68e55b0e7d78b5'), ligand=Ligand(compound_name='Mpro-P2660-bound-target', ids=None, provenance=LigandProvenance(isomeric_smiles='CNC(=O)CN1C[C@]2(CCN(C2=O)c3cncc4c3cc(cc4)Cl)c5cc(ccc5C1=O)Cl', inchi='InChI=1S/C24H20Cl2N4O3/c1-27-21(31)12-29-13-24(19-9-16(26)4-5-17(19)22(29)32)6-7-30(23(24)33)20-11-28-10-14-2-3-15(25)8-18(14)20/h2-5,8-11H,6-7,12-13H2,1H3,(H,27,31)/t24-/m1/s1', inchi_key='JZJCSVMJFIAMQB-XMMPIXPASA-N', fixed_inchi='InChI=1/C24H20Cl2N4O3/c1-27-21(31)12-29-13-24(19-9-16(26)4-5-17(19)22(29)32)6-7-30(23(24)33)20-11-28-10-14-2-3-15(25)8-18(14)20/h2-5,8-11H,6-7,12-13H2,1H3,(H,27,31)/t24-/m1/s1/f/h27H', fixed_inchikey='JZJCSVMJFIAMQB-DLYUOGNHNA-N'), experimental_data=None, expansion_tag=None, tags={}, conf_tags={}, data_format=)), ligand=Ligand(compound_name='alkane', ids=None, provenance=LigandProvenance(isomeric_smiles='CCCCCCC', inchi='InChI=1S/C7H16/c1-3-5-7-6-4-2/h3-7H2,1-2H3', inchi_key='IMNFDUFMRHMDMM-UHFFFAOYSA-N', fixed_inchi='InChI=1/C7H16/c1-3-5-7-6-4-2/h3-7H2,1-2H3', fixed_inchikey='IMNFDUFMRHMDMM-UHFFFAOYNA-N'), experimental_data=None, expansion_tag=None, tags={}, conf_tags={}, data_format=)), posed_ligand=Ligand(compound_name='alkane', ids=None, provenance=LigandProvenance(isomeric_smiles='CCCCCCC', inchi='InChI=1S/C7H16/c1-3-5-7-6-4-2/h3-7H2,1-2H3', inchi_key='IMNFDUFMRHMDMM-UHFFFAOYSA-N', fixed_inchi='InChI=1/C7H16/c1-3-5-7-6-4-2/h3-7H2,1-2H3', fixed_inchikey='IMNFDUFMRHMDMM-UHFFFAOYNA-N'), experimental_data=None, expansion_tag=None, tags={'docking-confidence-POSIT': 0.019999999552965164, '_POSIT_method': 'FRED'}, conf_tags={'compound_name': ['alkane'], 'provenance': ['{\"isomeric_smiles\": \"CCCCCCC\", \"inchi\": \"InChI=1S/C7H16/c1-3-5-7-6-4-2/h3-7H2,1-2H3\", \"inchi_key\": \"IMNFDUFMRHMDMM-UHFFFAOYSA-N\", \"fixed_inchi\": \"InChI=1/C7H16/c1-3-5-7-6-4-2/h3-7H2,1-2H3\", \"fixed_inchikey\": \"IMNFDUFMRHMDMM-UHFFFAOYNA-N\"}'], 'data_format': ['sdf'], 'docking-confidence-POSIT': ['0.019999999552965164'], '_POSIT_method': ['FRED']}, data_format=), probability=0.019999999552965164, provenance={'oechem': '20230910', 'oeomega': '20230910', 'oedocking': '20230910'})]\n"
]
}
],
"source": [
"# run OpenEye POSIT docking,\n",
"docker = POSITDocker(use_omega=False)\n",
"results = docker.dock([docking_input_pair], use_dask=False)\n",
"\n",
"# results is a list of POSITDockingResults, lots of info in here\n",
"print(results)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "36985506-7ebc-4a61-b0b5-8ab057b17552",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-05-10 10:39:55,212 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0\n",
"2024-05-10 10:39:55,212 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai\n",
"2024-05-10 10:39:55,212 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294\n",
"2024-05-10 10:39:55,212 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb\n",
"2024-05-10 10:39:55,387 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmplo04f74m/\n"
]
}
],
"source": [
"vizs_from_docked = html_vizualizer.visualize(\n",
" inputs=results, outpaths=[\"subpockets_from_docked.html\"], use_dask=False\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "40dca725-0ab7-4f45-b5f7-8f329ae15f85",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import IFrame\n",
"\n",
"IFrame(vizs_from_docked[\"html_path_pose\"][0], 1000, 1000)"
]
},
{
"cell_type": "markdown",
"id": "2b101683-2720-4c25-bf88-01537495a453",
"metadata": {},
"source": [
"We can see our alkane was docked nicely to the active site!\n"
]
},
{
"cell_type": "markdown",
"id": "cb81da0d-43f2-4086-8dc5-8305dc320eae",
"metadata": {},
"source": [
"Note that for embedding into applications you can also set `write_to_disk=False` to get the raw HTML string, for example "
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "a1b19aca-386d-461e-be13-50068e394ba9",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-05-10 10:39:57,096 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0\n",
"2024-05-10 10:39:57,096 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai\n",
"2024-05-10 10:39:57,096 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294\n",
"2024-05-10 10:39:57,096 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb\n",
"2024-05-10 10:39:57,257 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpes4h5c1i/\n"
]
}
],
"source": [
"# create a visualization factory.\n",
"html_vizualizer = HTMLVisualizer(\n",
" target=\"SARS-CoV-2-Mpro\",\n",
" color_method=\"subpockets\",\n",
" align=True,\n",
" write_to_disk=False,\n",
")\n",
"\n",
"vizs_from_docked_raw = html_vizualizer.visualize(\n",
" inputs=results, use_dask=False\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "0ace4e5d-5ce4-4de6-8f64-77d03869db30",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" 0 | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" <!DOCTYPE HTML>\\n<html lang=\"en\">\\n <head>\\n ... | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" 0\n",
"0 \\n\\n \\n ..."
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vizs_from_docked_raw"
]
},
{
"cell_type": "markdown",
"id": "83f21e7d-5c43-428d-bfcf-435a35329a18",
"metadata": {},
"source": [
"## HTML views of protein-ligand conformations with fitness data\n",
"\n",
"ASAP's targets are viral protein, and thus are highly mutable. An effective therapeutic must not only bind to the predominant variant currently circulating, but also regions of accessible sequence space. For this reason it is beneficial to select for interactions with highly conserved residues. \n",
"\n",
"ASAP has worked with the Bloom lab to obtain Deep Mutational Scanning ([DMS](https://www.nature.com/articles/nmeth.3027)) data for SARS-CoV-2-Mpro (https://doi.org/10.1101/2023.01.30.526314) and SARS-CoV-2-Mac1 (https://doi.org/10.1101/2023.01.30.526314) which can be visualized on the 3D protein structure to inform medicinal chemists if designed compounds are interacting with conserved or non-conserved residues. \n",
"\n",
"\n",
"These vizualisations also contain [logoplots](https://en.wikipedia.org/wiki/Sequence_logo) that can inform the viewer about the sequence space for each residue.\n",
"\n",
"\n",
"We are in the process of spinning out this fitness viewer in a self contained package called `choppa` (https://github.com/asapdiscovery/choppa), so watch this space!"
]
},
{
"cell_type": "markdown",
"id": "a75292bb-5c4e-4152-bf34-996c0fd72d7f",
"metadata": {},
"source": [
"You can easily make these visualizations by setting the `color_method` keyword to `fitness`\n",
"\n",
"Residues highlighted in red are highly mutable, white are less mutable and blue are missing data. \n"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "c4ad613a-52cd-45fb-99ce-576d58f4028f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/hugomacdermott/Desktop/asap/asapdiscovery/asapdiscovery-dataviz/asapdiscovery/dataviz/html_viz.py:815: UserWarning: Warning: no unfit residues found for residue 108 in chain A.\n",
" warnings.warn(\n",
"2024-05-10 10:40:16,776 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0\n",
"2024-05-10 10:40:16,776 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai\n",
"2024-05-10 10:40:16,776 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294\n",
"2024-05-10 10:40:16,777 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb\n",
"2024-05-10 10:40:16,957 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpz9xu9gin/\n"
]
}
],
"source": [
"# create a visualization factory.\n",
"html_vizualizer = HTMLVisualizer(\n",
" target=\"SARS-CoV-2-Mpro\",\n",
" color_method=\"fitness\",\n",
" align=True,\n",
" write_to_disk=True,\n",
" output_dir=\"tutorial_files/visualizing_asap_targets/\",\n",
" \n",
")\n",
"\n",
"vizs_from_docked_fitness = html_vizualizer.visualize(\n",
" inputs=results, outpaths=[\"fitness_from_docked.html\"], use_dask=False\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "a783c200-c498-4b1a-9090-2097c223c98d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import IFrame\n",
"\n",
"IFrame(vizs_from_docked_fitness[\"html_path_fitness\"][0], 1000, 1000)"
]
},
{
"cell_type": "markdown",
"id": "2295333f-292e-4b3d-ba32-37eeac23f956",
"metadata": {},
"source": [
"## Working with your own fitness data\n",
"\n",
"Currently the process to add fitness data to the drugforge repo is convolouted and labour intensive. It is better to use our prototype standalone fitness renderer [choppa](https://github.com/asapdiscovery/choppa). We are really excited about `choppa` which makes it easy to work with your own fitness data and render HTML and PyMol views easily.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
}
},
"nbformat": 4,
"nbformat_minor": 5
}