Viva Collaboration! A new paper in a PLOS journal – and epic responses to reviewers

I am excited to say that our new paper on Chagas disease has just been published in PLOS Neglected Tropical Diseases. This was a relatively painless process compared with previous PLOS experiences. We submitted on March 31, had reviews back by May 8 and the paper was finally accepted on June 5 after some back and forth. We did have some pretty extensive editor and reviewer comments to address, and in my bid for openness I list the reviewer comments and our responses below. I am grateful to the editors and reviewers for their input. Of course I am also very grateful to our collaborators at SRI and UCSD and to my co-corresponding author Jair!

To say this was an adventure would be an understatement:

This work was funded by NIAID as a phase I STTR to CDD, Inc. Upon funding we quickly had to find another collaborator, as ours moved to South America and that’s a no-no for such grants. We lucked out in several respects: the McKerrow group were kind enough to do the in vitro testing, and when we found we had some interesting results we were able to get some in vivo data. There were a few dead ends along the way as we transitioned from our initial idea for the project to using machine learning models based on the public data. There was some remarkable luck in finding pyronaridine, as it appeared to have been overlooked in an earlier screen by the Broad. We took a chance, retested it and then did something no one else had: we put it in the mouse model for Chagas disease. I hope this is just the beginning for this molecule in this disease, as it’s already used in a combination medicine for malaria. It took several years of hard work by the team involved to get to this point. It’s been exhausting and exhilarating in equal measure. What happens next? That’s the BIG QUESTION. Viva Collaboration!


Responses to reviews

The manuscript has been reviewed by three experts in the field with their comments and recommendations below.  I agree that this manuscript has significant merit, but requires “major revision”.  There was concern that the reported enrichment might be overstated if the chemical libraries are biased for bioactives (see reviewer 1).  The introduction contains many statements that oversimplify the true situation (see reviewer 3) and there are many other comments about style and clarity of the writing.  As pointed out by reviewer 1, the finding of several nitrofurans as hits does not support the notion that the method identifies non-obvious compounds.  The question about appropriate controls in the in vivo experiment needs to be carefully addressed.  The specific comments of the three reviewers are as follows:

Response: We have addressed the comments described below. We have proposed to use our approach with many other NTDs in future, and at that point we will certainly evaluate other sets of compounds for T. cruzi activity. We were focused on drugs and natural products (components/starting points of many drugs). In our opinion the current study represents a good validation of the method without screening all 7200 compounds to obtain complete statistics, which was outside the scope and budget of the project. We do not think we have oversimplified the situation in the introduction, but we hope our edits have improved it. We found several non-nitrofurans as hits. It is our opinion that we had appropriate controls for the in vivo experiments (known positive controls and vehicle).

Reviewer #1:

General comments: The authors present a new Bayesian machine learning model aimed at enriching the selection of anti-trypanosoma compounds from large chemical libraries which do not need to have biological annotation. This methodological approach may be very powerful in order to reduce sampling compound numbers down whenever there is limitation in the assay throughput. It is potentially applicable to other diseases. The validation of the approach is based on the comparison of the hit rate obtained from the final selection versus historical experience in random screening. Nevertheless it is not experimentally proven that the chemical libraries used as compound source are not biased for bioactives. If they were, the enrichment factor offered by the Bayesian model would be contaminated. It is recommended that the authors provide more compelling evidence in this regard.

Response: We did not filter any molecules out of any of the libraries. Many of the libraries were made up of drugs, so these could be classed as bioactives against a whole array of different targets (antibiotics, antifungals, antidepressants etc.). Without screening every molecule in vitro it is unclear how many actives and inactives there are vs T. cruzi in the 7200 compounds used for virtual screening. The whole point of the virtual screening approach is that we test just a fraction. In the discussion we have tried to address any compounds overlapping with the training set, which was our bigger concern.

As for the Target Prediction methodology, it would also be desirable that it were experimentally proven. However, it is understood that this experimentation may fall out of the scope of the current paper, so the manuscript could be published without it.

Response: We include the target prediction methodologies in the absence of experimental verification because this is work that will be performed in future. Suggesting potential targets is useful because it provides some idea of mechanism and allows the scientific community to empirically verify the hypothetical target.

Comments for authors´consideration: 1.    Page 4, Author´s summary, and page 11, last paragraph: the enrichment factor of the Bayesian selection vs. a random selection is not higher than 4-8 fold (as you cite far down in the paper). Can we guarantee that the compounds set “in silico screened” are not bias for bioactives (e.g. malaria, NP (cytotoxic)? If the compound sets used as source for the analysis were biased for antibiotic activity, an enrichment over random selection would also be observed. Why not a much larger compound database with random and broad chemical diversity coverage was used? Why not to test all compounds, or at least a subset randomly selected in order to ascertain false negatives and compare success rate of random vs. Bayesian tool-driven selections? Or retrospectively, for instance compounds in reference 26.

Response – It is unclear where the reviewer obtains the 4-8 fold enrichment value. As described above, none of the libraries we searched were filtered to remove any compounds such as antimalarials etc. We did not try to remove drugs of any class. It is unclear why we would want to do that if we were trying to find new molecules for T. cruzi. Similarly we did not filter for cytotoxic compounds. We could certainly screen much larger datasets but our goal was to see if the model could help identify molecules in the smaller datasets for which compounds were readily available. Clearly our results did not reveal any antibiotics as hits and it is unclear why we would want to search for antibiotics.

Ref. 26 is the GSK kineto-box, which became available only after we did the initial screening and compound selection. Scoring the GSK kineto-box Chagas hits with the Bayesian model may be useful to see if there were any compounds overlapping. Our goal was to perform prospective testing rather than retrospective testing.

2.    Page 6, Paragraph 1, line 9. I suggest to consider the following more recent review on this matter: Nature Reviews Drug Discovery, published online 18 July 2014; doi:10.1038/nrd4336

Response: We have added this paper, ‘The discovery of first-in-class drugs: origins and evolution’.

3.   Page 7, Paragraph 2: You may want to comment on the negative outcome of the clinical trial with Posaconazole and Ravuconazole. Since you cite non-CYP51 preclinical and clinical assets, I miss a consideration to fexinidazole and oxaboroles.

Response: We added a few sentences about the CYP51 Phase II results. The fexinidazole work is not published yet (we spoke to Eric Chatelain). We also mentioned the fexinidazole and oxaborole programs led by DNDi and added references.

4.   Page 9, Paragraph 2, line 4 and page 10, paragraph 1: Have you had the chance of including compounds in reference [26] in your analysis? It might be a valuable set to assess false negatives retrospectively.

Response: This paper also came out after we had completed our model and screening and we have not included these compounds. Future work could make use of this although again this would be another retrospective analysis and of limited value.

5.    Page 13, line 1: typo of bracket? “… purchased from eMolecules (La Jolla, CA)…”

Response: Thank you, we have corrected the position of the parenthesis.

6.    Page 14, Paragraph 2: what´s the statistical cut-off at 3xStDev? Is 50% well far above?

Response: Usually 60% is close to the 3x StDev from DMSO controls. In our screening specifically, the 3x StDev from DMSO controls was 70% in replicate 1 and 66% in replicate 2. By using a 50% cut-off we are being less stringent: we know we will select some false hits for dose-response, but we reduce our odds of missing a true active (a false negative).

7.    Page 16, Paragraph 3 (Results, Bayesian Models) and Figures S1 and S3. I miss the authors do not remark the nitro-aromatic/heterocyclic moiety as a “good feature” (i.e. G13 in Figure S1?). Actually, 3 out of the 11 compounds that turned out to be active in vitro (Table S2) contain this motif. Moreover, these 3 compounds are nitro-furans which clearly resemble the chemical structure of Nifurtimox, a classical toxic anti-chagasic compound in the clinic. I think the authors should discuss about this very predictable finding. It is expected that the Bayesian model also leads to selection of novel compounds which are not so close analogues of existing gold standards. Otherwise, its value-added is questionable.

Response: We described this feature as “aromatic fragments containing basic nitrogen”. We certainly found non-obvious compounds like tetrandrine and pyronaridine.

8.    Page 17, Paragraph 2: I suggest indicating the total number of molecules scored through the Bayesian model. Approx 7,500?

Response: We added “Approximately 7200 molecules were screened using the Bayesian model.”

9.    Page 17, Paragraph 2: Aren´t there 5, and not only 4, compounds with EC50<1 uM (including SC-0011754 in table S2)? Noteworthy, 2 out of these top 5 are nitrofurans.

Response: We have corrected this.

10.    Page 17, Paragraph 3: May the authors elaborate on the observations of adverse/toxic effects of Furazolidone and the nitrofural prodrug? And in comparison with Nifurtimox, which is not included in the experiment?

Response: With respect, we were not interested in the side effects of these very well-known and well characterized drugs for which we provided references.

11.    Table S2: SC-0011801 is last compound with EC50 figures in the table. I assume all the compounds below are inactive (>10), but why are the cells in the table empty? I am pleased to see the inclusion of EC90 and Hill slope in the table.

Response: We did not test these inactive compounds in dose-response experiments, therefore the values are not available.

12.    Table S2: I suggest you define the legend of “Infection Ratio” column. Are the responses from the primary screening at 10 uM in duplicate?

Response: Infection Ratio: number of infected cells divided by the total number of cells. Yes, primary screening was done in duplicate, thus the two values for infection ratio (at 10 uM).
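As a small illustration of the definition above, the infection ratio for each duplicate well can be computed as follows. This is only a minimal sketch: the function name and the cell counts are hypothetical and are not taken from the paper.

```python
def infection_ratio(infected_cells: int, total_cells: int) -> float:
    """Number of infected cells divided by the total number of cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= infected_cells <= total_cells:
        raise ValueError("infected_cells must be between 0 and total_cells")
    return infected_cells / total_cells


# Hypothetical (infected, total) counts for one compound
# screened in duplicate at 10 uM.
replicates = [(120, 400), (90, 360)]
ratios = [infection_ratio(infected, total) for infected, total in replicates]
print(ratios)
```

Reporting both replicate values, as in Table S2, lets a reader judge how well the duplicates agree before any averaging.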

13. Table S2: SC-0011754 is Nitrofural, isn´t it? If so, please add its name to the table.

 Response: Thank you – we have added it.

14. Page 18, Paragraph 2: Without experimental testing, the target prediction based on the Pathway Genome Data Base constructed cannot be validated. It is true you are building a sound hypothesis, but the goodness of the prediction is not assessed in the paper. Since, it is not experimentally proven that TR is the target for pyronaridine, I would not stress in excess the value added by the this approach.

Response: Thank you for your suggestions. The creation of this database is described in this paper, and it is now available to other researchers for the first time. We clearly described that our goal is ultimately to use the database for target identification. We did use the metabolites from the database for similarity analysis to propose targets (along with other approaches) for future testing. Validating these targets represents considerable future work, but proposing them is useful.

15. Page 20, Paragraph 1: If you discount the 3 nitrofurans as a foreseen selection, the success rate is 8/97, i.e. 8%. That means a 4-8 fold enrichment vs. random selection. Might the authors discuss whether this is higher than expected and the value-added? In order to fairly evaluate the goodness and value of the predictive model, I miss an assessment of false negatives, and a comparison with a random selection of compounds from the same set. One caveat: some of the compounds do not look like tractable for an oral and safe drug, e.g. SC-0011752, SC-0011796. Do you have cytotoxicity results for all 11 hits? Which of the 11 hits were present in the training set?

Response: Our previous work with Mycobacterium tuberculosis has shown with prospective testing that, depending on the training sets and test sets, we would see similar enrichments to those described by the reviewer. We have also reported hit rates from 20-70%. Generally HTS provides hit rates < 1%. We did not try to perform random selection and did not test every compound in the libraries that were scored. Clearly it would have been prohibitively expensive to do this. Cytotoxicity data for all the hits is available only in Table S2.
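The arithmetic behind the enrichment discussion can be sketched as follows. The counts (11 hits, or 8 when the nitrofurans are excluded, out of 97 tested) come from the comments above. The 1% and 2% random-screening baselines are assumptions chosen only to show how a 4-8 fold figure could arise, and the function names are illustrative.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tested compounds that were confirmed as hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return hits / tested


def fold_enrichment(hits: int, tested: int, baseline_rate: float) -> float:
    """Observed hit rate divided by an assumed random-screening hit rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return hit_rate(hits, tested) / baseline_rate


print(f"all hits: {hit_rate(11, 97):.1%}")
print(f"excluding nitrofurans: {hit_rate(8, 97):.1%}")

# Assumed baselines, not measured values from the paper.
for baseline in (0.01, 0.02):
    fold = fold_enrichment(8, 97, baseline)
    print(f"baseline {baseline:.0%}: {fold:.1f}-fold enrichment")
```

The result depends heavily on the baseline chosen, which is why the reviewer's request for a random selection from the same libraries matters for interpreting the figure.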

16.    Page 20, Paragraph 2: You may want to mention that Pyronaridine has been in clinical use in China. And positive opinion by EME. Current clinical trials for combinations.

Response: We have now added the following to address this: ‘Pyronaridine is in clinical use as an antimalarial [90,91], is a P-glycoprotein inhibitor [92] and was given a positive opinion by the European Medicines Agency using this molecule in a combination therapy [93].’

Reviewer #2:

A very interesting and innovative approach to bring chemo and bioinformatics together to address research gaps in the Chagas field, and attempt to connect phenotypic and mechanistically driven approaches to drug discovery, with the goal of populating and diversifying the late stage discovery pipeline. The public access output of the research is an excellent resource. The identification of pyronaridine as a Tc in vivo active is an interesting discovery and PoC for the methodology.

Response: Thank you

The authors do not discuss any potential improvements that could be made to the model building/machine learning based on the molecules that were identified by the process.

Response: This is an interesting question. Our focus was not on how to improve the model/ machine learning. We have applied the same approach as was found to work reasonably well with TB and other diseases in our hands. A whole paper could probably be written on what other approaches could be tried which would be well outside the current scope. We could change the algorithm and descriptors and it would likely have a small effect. We could filter the training set further based on some criteria etc.

We have added: ‘There are many steps we could take to update our computational models such as incorporating the current data and using other machine learning algorithms.’

Specifically: (p12) the same set of molecular descriptors were used to evaluate natural product libraries and “drug-like” libraries (would the model be applicable to fragment libraries?)

Response: It is possible we could filter other vendor libraries and fragments. We have not tried this, and it is unclear if or how this would be an improvement on what we have done.

(p16) the good molecular features identified in the DR response data alone model (Fig S1) are not well represented in the actual hit list

Response: These are just some of the good features in the training set. It is possible these features did not appear that often in the test sets of compounds.

(p17) functional groups known to be problematic in drug development are well represented in the hits eg nitro groups, furans, micheal acceptors, metal chelators (hydroxamic acid), gramine (reactive metabolite formation), poly carboxylic acids, high MW compounds eg steroids (what was the MW filter?), bisguanidines, amidines and polyoxygenated compounds.

Although most of these compounds did not repeat in the hit confirmation assay, can the authors comment on the filtering and selection process to remove nuisance compounds (frequents hitters in HTS sets – now described as PAINS (Baell et al)) or reactive functional groups from the learning libraries.

Response: We did not filter the sets of drugs and other natural products using the PAINS filters (if we were using large vendor libraries this would be a valid filter). It is well known that molecules active against T. cruzi contain these groups, and such compounds have been known for a long time. We performed dose response analysis only on the active compounds; 11/17 had EC50 < 10 uM.

Reviewer #3:

This is potentially an interesting study that uses various computational approaches to facilitate drug discovery for Chagas disease and applies these approaches in proof-of-principle experiments. However, the actual content suffers from many shortcomings that need to be addressed before the story can be published. The main concerns of this reviewer include: 1. There is no entry for T. cruzi in the BioCyc database even though the authors claim they have created such a database. 2. Some experiments were executed without proper controls. 3. Manuscript text is not written well and contains many unsubstantiated claims, numerous typos, missing information and it is difficult to follow at times. It needs significant work and I flagged the most obvious examples in the specific comment section below, but that list is not exhaustive at all.

Response 1. We sincerely apologize that T. cruzi was unavailable in BioCyc when the reviewer attempted to access it. There was a server powerdown and it did not automatically come back up, and we are working to ensure it does not go down again. The database can be accessed at http://node2.csl.sri.com:1555/

Response 2. We have corrected the omission of the vehicle control in the methods; it was previously described only in the figure.

Response 3. We have made significant changes to address all these concerns.

Specific comments. Methodology/Principal Findings p.3 – “97 compounds…”- Please change to “Ninety-seven compounds…”.

Response: We have changed this.

p.3 – “We progressed five compounds to an in vivo mouse efficacy model…” Efficacy model of what disease? This is the first time in the manuscript any mouse model is mentioned.

Response: We have changed this.

Author Summary p.4 – “We have used data from a phenotypic screen to build Bayesian models to predict activity against T. cruzi in vitro.”  Predict activity of what?

Response: We have changed this to “anti-parasitic activity”.

p.4 – “We identified the antimalarial pyronaridine has having in vivo efficacy and providing us with a new starting point for further investigation and optimization.” Please change this sentence to a grammatically correct form.

Response: We have changed this.

Introduction p.6 – “In the 1980’s the pharmaceutical industry took advantage of advances in molecular biology/genetic engineering and began replacing cumbersome phenotypic and whole cell HTS with target-based screening assays.” The reason for replacing phenotypic assays with target-based ones was not because of the former being cumbersome. They are not – please delete this word. The sentence also implies that phenotypic and whole cell HTS assays were not one and the same thing. I am not aware of any phenotypic HTS run in the 1980s or earlier that would not be a whole cell assay. Please use ‘phenotypic’, ‘whole cell’, or ‘phenotypic, whole cell’.

Response: We have changed this.

p.6 – ‘While target-based screens using simple recombinant protein enzyme assays offer many advantages in terms of cost and scalability,…’ Please change ‘enzyme’ to ‘enzymatic’.

Response: We have changed this.

Target-based assays do not provide advantages over cell-based assays in terms cost or scalability, and these are definitely not the reasons why they are being used. This sentence needs to be changed.

Response: We respectfully disagree with the reviewer. In our experience target-based assays are cheaper and can be performed at a much higher throughput than a cell-based assay.

p.6 – “they rely on the assumption that the selected target is in fact the best and most druggable target for a given disease.”  Again, this is not the case. The assumption is that the target is good enough to yield a drug that can meet the drug target product profile for a given disease.

Response: We have changed this.

p.6 – “…especially for neglected infectious diseases where drug targets are poorly understood or target-based approaches have been unsuccessful in the past [1] (or a complete failure [2]).” Please remove the sentence in the parentheses and reference [2]. Common bacterial infections described in this article (such those caused by Staphylococcus aureus) do not belong among neglected infectious diseases. Additionally, cell-based screens described in the reference [2] fared equally bad in terms of finding promising hits as the target-based screens described in the same paper, so the reference does not support the point authors are trying to make.

Response: We changed it to: “Nonetheless, in the last decade, there has been a shift back towards using phenotypic screens as a starting point for drug discovery, especially for infectious diseases where drug targets are poorly understood or target-based approaches have been unsuccessful in the past [1]. In fact, analysis of the origin of first-in-class small molecules found that phenotypic screens identified more novel inhibitors than any other approach between 1999 and 2008 [3].”

p.6 – “One such disease area, where target-based drug discovery has largely failed, is in the field of neglected tropical diseases (NTDs).” This is again incorrect. There simply are almost no validated targets for these diseases and target-based drug discovery was barely attempted. If authors have knowledge of failed target-based programs, they should include relevant references after this statement.

Response: We respectfully differ in our opinion. We provide plenty of examples of target-based drug discovery, such as CYP51 and cruzain as mentioned in the text, glycoprotein and glycolipid synthesis (mucins, trans-sialidase), DNA topoisomerase and other enzymes involved in DNA replication, phosphodiesterase C, etc.

6, 7 – “The trend towards using phenotypic screens over target-based screens is particularly strong for NTDs and related bacterial and fungal pathogens.” There are no fungal diseases on the NTD list as far as I know. In what sense are these fungal pathogens related to the NTDs?

Response: We have changed it to ‘The trend towards using phenotypic screens over target-based screens is particularly strong for NTDs as well as bacterial and fungal pathogens.’

p. 7 – “The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity (currently focused on 3 targets, CYP51, cruzain and genes associated with DNA damage) [12-14]. Reference 14 leads to a website that offers Internet Information Services (IIS) for Windows® Server. Please provide references that support claims made in the text (perhaps this one? – ‘http://www.bvgh.org/Current-Programs/Neglected-Disease-Product-Pipelines/Global-Health-Primer.aspx’).

Response: We apologize for this broken link – we have changed this.

7 – “The remaining three products in clinical development (Phase I and II) target a single enzyme, CYP51, which has been the focus of Chagas disease research to date [17-22].” This is incorrect; please change to ‘the focus of Chagas disease drug development’.  

Response: We have changed this.

7 – “The only additional novel drug target with a single compound in preclinical development is cruzain, a T. cruzi cysteine protease and there is considerable literature surrounding this class of inhibitor.” Please change to ‘this inhibitor’ or ‘this class of inhibitors’.

Response: We have changed this.

8 – “There have been some target-based high throughput screens for CYP51 [22] and cruzain [24] as well as virtual screening for cruzain [23].” I assume that the authors mean that there were some screens to discover INHIBITORS of CYP51 and cruzain. If that is the case, please modify the sentence.

Response: We have changed this.

p. 9 – “In addition we have created a BioCyc database for T. cruzi, which complements other sources of related metabolic pathway data (including KEGG T. cruzi pathways [48]. Currently, there is no Trypanosoma cruzi organism database in BioCyc.

Response: We sincerely apologize that T. cruzi was unavailable in BioCyc when the reviewer attempted to access it. There was a server powerdown and it did not automatically come back up, and we are working to ensure it does not go down again. The database can be accessed at http://node2.csl.sri.com:1555/

Methods p. 10 – “CDD database and Chagas datasets.” The authors need to specify the name of the dataset that was created as part of this publication. Is it “Trypanosome: Chagas Disease Literature Compounds”?

Response: The Broad dataset was named “TRYPANOSOME: Broad Primary HTS to identify inhibitors of T. cruzi Replication”.

In the process of this work we also curated the dataset “Trypanosome: Chagas Disease Literature Compounds”.

The molecules in Table S2 are currently in a private vault and will be shared at a later date.

p. 10 – “Data annotation and Pathway Genome Data Base construction”.  A large part of this text belongs into the Results section. Figure 1 is not very informative as it completely lacks any description other than Pathway Genome Data Base for T. cruzi. It is unclear to me what various symbols shown in the Figure mean.

Response: We have added a results section for the PGDB and a description for Figure 1.

“A PGDB was constructed for T. cruzi using the complete genome sequence of the Dm28c strain (Figure 1).  The underlying genome sequence consisted of 5,287 contigs assembled into 1,378 scaffolds of 30,716,540 base pairs.  Pathologic found 11,349 distinct gene products, at least 880 of which were found to be enzymes and at least 16 of which are transporters. Pathologic was able to infer 1030 enzymatic reactions and 122 pathways from these assignments as well as the existence of 806 metabolic compounds. This set was filtered to 358 molecules after removal of compounds with R- groups and small nuisance molecules. This dataset was then used to infer potential targets by comparing the Tanimoto similarity with a phenotypic screening hit [39].”
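To illustrate the similarity comparison mentioned in the quoted text, here is a minimal sketch of the Tanimoto coefficient computed on sets of fingerprint features and used to rank metabolites against a screening hit. The feature sets and metabolite names are made up for illustration; a real workflow would derive fingerprints from chemical structures with a cheminformatics toolkit, and the paper's exact procedure may differ.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared features / features present in either set."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def rank_by_similarity(hit: set, metabolites: dict) -> list:
    """Return (name, similarity) pairs, most similar metabolite first."""
    scored = [(name, tanimoto(hit, fp)) for name, fp in metabolites.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprint features (integers stand in for substructure bits).
hit_features = {1, 2, 3, 5, 8}
metabolite_features = {
    "metabolite_A": {1, 2, 3, 5, 9},  # 4 shared features out of 6 in total
    "metabolite_B": {2, 8, 13},       # 2 shared features out of 6 in total
    "metabolite_C": {21, 34},         # no shared features
}

ranking = rank_by_similarity(hit_features, metabolite_features)
for name, similarity in ranking:
    print(f"{name}: {similarity:.2f}")
```

A metabolite that ranks highly suggests the enzyme that binds it as a candidate target, which remains a hypothesis until it is tested experimentally.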

12 – “These models were used to score the following drug libraries; Selleck Chemicals (Houston, TX) natural product library (139 molecules)…” The Selleck natural product library contains 131 natural products, not 139. Please check which of these 2 numbers is correct.

Response: The version of this library downloaded several years ago contains 139 molecules.

12 – “…and Traditional Chinese Medicine components (373 molecules).” Please, include reference for this library or, if a reference is unavailable, a brief description how it was put together.

Response: This library represents some common single-component TCMs and was kindly provided by Dr. Ni Ai, Zhejiang University, China.

14 – “Hit selection and secondary screening (dose-response assay)” This section includes a repetition of the T. cruzi infection assay description from the preceding section (“Primary In vitro screening”) and contains some additional experimental details. If both sections refer to the same assay as it seems, the authors should avoid the repetition and should include only the more detailed protocol version in the manuscript.

Response: The methods section has been modified following the reviewer’s advice.

15 – “Mice were housed at a maximum of 5 per cage and kept in a specific-pathogen free (SPF) room…” Please change to “specific pathogen-free…”.

Response: We have changed this.

15 – “To infect the mice, trypomastigotes of T. cruzi Brazil luc strain were harvested from culture supernatant and injected intraperitonealy, 10^5 trypomastigotes per mouse.” Please describe how trypomastigotes were harvested and the volume of parasite suspension used for the infection. The level of detail is not sufficient for the experiment to be repeated by an independent group.

Response: We have changed this. Additional information was included.

Correct spelling is ‘intraperitoneally’.

Response: We have changed this.

p. 15 – “Starting on day 3 the infected mice were treated with test compounds at 50 mg/kg administered in 20% Kolliphor, i.p., b.i.d., for four consecutive days.” This experiment is missing the negative control in which infected mice are treated with the vehicle by IP injection. As both parasites and experimental compounds are injected into intraperitoneal cavity, the observed anti-parasitic effect might be at least partly due to the direct effect of the vehicle on the parasites proliferating in the IP cavity. The authors need to repeat this experiment with all the necessary controls.


Response: The controls were administered using the same route as the tested compounds: IP. The text was corrected.

Please include the volume used for injection of compounds. ‘Intraperitoneal’ is one word and should be abbreviated as ‘IP’ instead of  ‘i.p.’


Response: The volume was included.

p. 15 – “At day 7 post-infection, the luminescent signal from infected mice was read upon injection of D-luciferin.” Please describe in more detail how this was done or include a reference. The level of detail is not sufficient for the experiment to be repeated by an independent group.


Response: We have changed this.

15 – “The absolute numbers of measured photons/s/cm2 were averaged between all five mice in each group and compared directly with compound-treated mice and the control groups.” This sentence does not make much sense. How did authors compare the luminescence signals with compound-treated mice and the control groups?


Response: We have changed this.

p. 15 – “The efficacy percentage was calculated based on relative luminescence signal reduction compared to the controls.” I assume they used the vehicle-treated group for luciferase signal normalization. This needs to be specified in more detail how it was done.


Response: We have changed this.
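One common way to express efficacy from luminescence data is as the percent reduction in mean signal relative to the vehicle-treated group. The sketch below assumes that formula, which may differ from the exact calculation used in the paper, and the photon flux values are invented.

```python
from statistics import mean


def percent_efficacy(treated_signal, vehicle_signal):
    """Percent reduction of the mean treated signal relative to the vehicle mean."""
    vehicle_mean = mean(vehicle_signal)
    if vehicle_mean <= 0:
        raise ValueError("vehicle mean signal must be positive")
    return 100.0 * (1.0 - mean(treated_signal) / vehicle_mean)


# Hypothetical photons/s/cm2 readings for five mice per group at day 7.
vehicle = [1.0e6, 1.2e6, 0.8e6, 1.1e6, 0.9e6]
treated = [1.0e5, 1.5e5, 0.5e5, 1.2e5, 0.8e5]

efficacy = percent_efficacy(treated, vehicle)
print(f"efficacy: {efficacy:.1f}%")
```

With this definition a compound that has no effect scores 0%, and a negative value means the treated group had a higher signal than the vehicle group.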

p. 16 – “Two tailed paired Student t test was used to assess statistical significance between luminescence values from vehicle-treated and compound-treated groups at day 7 postinfection.” The authors need to specify what hypothesis was tested – presumably that two such datasets are different?

Response: We have rephrased this.

p. 16 – “The UCSD Institutional Animal Care and Use Committee reviewed and ‘approved’ this study with protocol number S14187.” Why is the word “‘approved'” used with quotation marks?

Response: We had written – ‘All animal protocols were approved and carried out in accordance with the guidelines established by the Institutional Animal Care and Use Committee from UCSD (Protocol S14187).’

Results p. 16 – “Using either dose response data alone or the combination or dose response and cytotoxicity (dual activity) resulted in statistically comparable models.” This sentence is difficult to understand. What combination do authors have in mind?


Response: We changed it to: ‘Using either dose response data alone or the combination of dose response and cytotoxicity (dual activity) resulted in statistically comparable models.’

p. 16 – “Both had leave one out ROC values greater than 0.8 (Table 1).” I do not understand this sentence. Do the authors mean ‘leave-one-out ROC values’? Also, what is ROC? This parameter is not described in the Methods section. Does this refer to the ROC AUC parameter?


Response: ROC is now spelled out – it can be used interchangeably with ROC AUC. This is a standard measure of how well the model predicts compounds that were left out (whether by leave-one-out, 5-fold cross validation, or leave out 50% x 100 cross validation, etc.).
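For readers unfamiliar with the metric, here is a minimal sketch of how a ROC AUC can be computed from held-out predictions. It assumes each compound has already been scored by a model trained without it (as in leave-one-out), and it uses the pairwise-ranking definition of AUC. The function name `roc_auc` and the labels and scores are hypothetical; this is not the modeling software used for the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active, 0 for inactive.
    scores: held-out model scores, higher meaning more likely active.
    Ties between an active and an inactive count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one active and one inactive")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical held-out predictions (illustrative only)
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from inactives.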

p. 16 – “The use of FCFP_6 fingerprints enabled the good features important for activity to be visualized in the dose response data alone model (Figure S1)…” What do authors mean by the good features? This is not postulated anywhere in the article. Do they mean features that positively correlate with compound potency or selectivity?


Response: We have changed the text to: ‘The use of FCFP_6 fingerprints enabled the features important for activity (termed good features) to be visualized in the dose response data alone model (Figure S1) which included tertiary amines, piperidines and aromatic fragments containing basic nitrogen functionality while those features that were negatively related to activity included cyclic hydrazines prone to tautomerization as well as a number of electron-poor chlorinated aromatic systems (Figure S2).’

p. 17 – “…providing very clear trends and perhaps a rationale for why the enrichments are so dramatic for these systems.” It is not explained anywhere what do the authors mean by ‘the enrichments’.


Response: We have deleted this statement.

p. 17 – “Ninety seven molecules were tested and 11 were found to have EC50 values less than 10 uM (Table S2). Four of these molecules (verapamil, pyronaridine, furazolidone and tetrandrine) had in vitro EC50 values less than 1 uM (Table 2).” The authors need to list how many independent experiments they ran for determining the EC50 values (n). Based on the Methods section it appears that they ran only one experiment in duplicate. If that is the case, they need to repeat experiments so that n=3 at least.


Response: Primary screening was done in duplicate.

p. 17,18 – “In vivo testing” The experiment needs to be repeated with proper controls. Please see my comment in the Methods sections.


Response: As described in the methods we used the correct vehicle control.

p. 18 – “The molecules with the highest Tanimoto similarity in CDD were T. cruzi GAPDH inhibitors (Figure S6).” First of all, it is not clear from the Figure S6 legend what this figure shows. What molecule was used for this similarity search? Secondly, the shown molecules have only 43% similarity to pyronaridine (assuming this compound was used in the search) and look very different. The authors do not make any effort to introduce a validated metric that would assess how reliable this type of prediction could be. Unless they present some experimental evidence, this section does not have any value and should be deleted. Such a conclusion is further confirmed by the quite extensive list of proposed targets that follows in the text and includes polyamine biosynthesis, trypanothione disulfide reductase, and topoisomerase VI.


Response: Figure S6 shows the similarity search, using pyronaridine, of a public curated dataset in CDD of Chagas-related compounds from the literature. This is a different analysis from the similarity search of the metabolites in the PGDB, which resulted in one molecule with 67% Tanimoto similarity. Tanimoto similarity is widely used for similarity searching. We propose that this section of the manuscript uses all the molecule sets available and the metabolites from the PGDB to try to infer potential targets. Similarly, searching ChEMBL and other public sources is a sensible approach to try to leverage the information that already exists for compounds and their known targets.
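For context on the 43% and 67% figures, here is a minimal sketch of the Tanimoto coefficient on two sets of fingerprint features: the number of shared features divided by the number of features present in either molecule. The integer feature IDs below are hypothetical stand-ins; real fingerprints would come from a cheminformatics toolkit, and the exact fingerprint used in CDD is not specified here.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two feature sets."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        raise ValueError("both feature sets are empty")
    return len(a & b) / len(union)


# Hypothetical fingerprint feature IDs (illustrative only)
query = {1, 2, 3, 4}
candidate = {3, 4, 5, 6}
print(tanimoto(query, candidate))
```

The value ranges from 0 (no shared features) to 1 (identical feature sets), so it depends on the fingerprint chosen as much as on the molecules compared.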

Discussion p. 20 – “Historically, for a diversity-based library undergoing HTS, it is expected a range of 1 to 2% of hits based on observed activity (usually >50% antiparasitic activity at 10 μM and no signs of cytotoxicity at this concentration) will be observed [30].” Screening in the attached reference was done at 3.75 uM compound concentration and resulted in identification of ~4,000 hits from a 300k small-molecule library. As the screening described in the current article was done at 10 uM compound concentration, one would expect the hit rate to be significantly higher than 1-2%. To assess whether a 10% hit rate is higher than the hit rate in an unbiased screen, the authors need to include in Table S2 CC50 values for the listed compounds.


Response: Cytotoxicity data was generated only for compounds with dose response data.
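The hit rates under discussion can be reproduced with simple arithmetic from the numbers quoted in these comments: 11 of 97 tested molecules with EC50 below 10 uM, and roughly 4,000 hits from a 300k library in the reference screen. The sketch below only computes the raw ratio; as the reviewer notes, the two screens used different compound concentrations, so the ratio is not a like-for-like enrichment. The function name `hit_rate_percent` is hypothetical.

```python
def hit_rate_percent(hits, tested):
    """Hit rate as a percentage of compounds tested."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * hits / tested


# Numbers quoted in the reviewer comments above
this_study = hit_rate_percent(11, 97)               # EC50 < 10 uM
reference_screen = hit_rate_percent(4000, 300000)   # approximate, 3.75 uM screen

# Raw ratio only; not adjusted for the different screening concentrations
raw_ratio = this_study / reference_screen
print(round(this_study, 1), round(reference_screen, 2), round(raw_ratio, 1))
```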
