drugforge.spectrum.fitness.apply_bloom_abstraction

drugforge.spectrum.fitness.apply_bloom_abstraction(fitness_dataframe: DataFrame, threshold: float) → dict

Take a pandas DataFrame of fitness data, as parsed from JSON by .parse_fitness_json(), and return a dictionary of fitness scores averaged per residue. This is the currently recommended way to reduce the data to a single value per residue. The function can be extended if that recommendation changes.

Parameters:
  • fitness_dataframe (pd.DataFrame) – DataFrame containing columns [gene, site, mutant, fitness, expected_count, wildtype]

  • threshold (float) – Minimum fitness value at which a mutation is treated as acceptably fit.

Returns:

fitness_dict

Dictionary whose keys are residue indices joined to chain IDs with an underscore. Each value holds: [

mean_fitness, wildtype_residue, most fit mutation, least fit mutation, total count (a rough proxy for confidence)

]

Return type:

dict
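
Example:

The per-residue aggregation described above can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the drugforge implementation: the function name sketch_bloom_abstraction, the chain column, the "<site>_<chain>" key format, the snake_case value keys, and the n_fit_mutants entry (one guess at how threshold might be applied) are all hypothetical, and the toy values are made up.

```python
import pandas as pd


def sketch_bloom_abstraction(df: pd.DataFrame, threshold: float) -> dict:
    """Hypothetical stand-in: collapse per-mutant rows to one record per residue."""
    out = {}
    # ASSUMPTION: a "chain" column exists; the documented columns do not list one.
    for (site, chain), grp in df.groupby(["site", "chain"]):
        best = grp.loc[grp["fitness"].idxmax()]   # row of the fittest mutant
        worst = grp.loc[grp["fitness"].idxmin()]  # row of the least fit mutant
        out[f"{site}_{chain}"] = {
            "mean_fitness": float(grp["fitness"].mean()),
            "wildtype_residue": grp["wildtype"].iloc[0],
            "most_fit_mutation": best["mutant"],
            "least_fit_mutation": worst["mutant"],
            "total_count": float(grp["expected_count"].sum()),
            # ASSUMPTION: threshold counts mutants at or above the cutoff.
            "n_fit_mutants": int((grp["fitness"] >= threshold).sum()),
        }
    return out


# Toy data with made-up values, using the documented columns plus "chain".
toy = pd.DataFrame(
    {
        "gene": ["geneX"] * 4,
        "site": [41, 41, 145, 145],
        "chain": ["A"] * 4,
        "mutant": ["H", "Y", "C", "S"],
        "fitness": [0.0, -4.0, 0.0, -2.0],
        "expected_count": [100.0, 50.0, 80.0, 20.0],
        "wildtype": ["H", "H", "C", "C"],
    }
)

result = sketch_bloom_abstraction(toy, threshold=-1.0)
print(result["41_A"])
```

Grouping on site and chain keeps the result to one entry per residue, which matches the "single value per residue" goal stated in the description. The exact keys and the threshold handling in the real function may differ.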