My experience of submitting another manuscript to PLOS ONE

It's been several years since I last submitted to PLOS ONE. Since then we embarked on a Phase II grant working on tuberculosis, and after much work by many labs at Rutgers and SRI we are submitting that work today, so we decided to give PLOS ONE a second chance. In the interim I have published in PLOS Computational Biology and PLOS Neglected Tropical Diseases.

One of the challenges of having a large collaboration is the number of co-authors; this one has 13. In Editorial Manager, adding the author details alone took nearly an hour, and they do not make it easy to enter addresses for people from the same group either. Then there are the picky figure resolution and image file format requirements to deal with, which create hours of work for the non-graphic-designers out there. After trying to get images of the right resolution and size, I have re-uploaded at least 10 times, each time waiting for the figure quality report only to see the word FAIL. I have become far too familiar with a graphics program called GIMP and the removal of the "alpha channel". Frankly, I do not want to spend my time working as a graphic artist just to publish in PLOS; I am beyond tired of their requirements. Compared with other publishers and journals like ACS, Springer, F1000Reports etc., submitting to PLOS is time consuming in my opinion. After spending 5.5 hours submitting the manuscript, 4 figures and 5 supplemental files, I am wondering why anyone would want to do this again. Let's see how it does in review.
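For anyone facing the same alpha-channel battle, the flattening can be scripted rather than done by hand in GIMP. This is a minimal sketch assuming the Pillow library; the image here is a stand-in for a real figure, and the 300 dpi LZW TIFF settings are just the commonly requested ones, not a guarantee of passing any particular journal's checker:

```python
from PIL import Image

# Stand-in for a real figure: a red square with 50% transparency.
img = Image.new("RGBA", (100, 100), (255, 0, 0, 128))

# Flatten onto a white background, discarding the alpha channel.
flat = Image.new("RGB", img.size, (255, 255, 255))
flat.paste(img, mask=img.split()[-1])  # use the alpha band as the paste mask

# Save as an LZW-compressed TIFF at 300 dpi.
flat.save("figure1.tiff", dpi=(300, 300), compression="tiff_lzw")
```

Looping something like this over a folder of figures beats re-uploading ten times to find out what failed.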



Wearing many hats

I wear a few different hats, and yesterday I was CEO, working on the commercialization plan for a Phase II grant for a rare disease. Up to this point the big selling point was getting a Rare Pediatric Disease Priority Review Voucher, but this is a challenge because it's unclear whether there will be more of these in the future. This led me to write a post for the good people at Rare Disease Report (any fees will be donated to rare disease foundations).

Today is another day and I am wearing another hat, working on grants for a different company with different challenges.


Targeted Drug Development: The FDA perspective – A more approachable regulator?

As I slowly catch up on my reading, I came across a new white-paper-style document from the FDA, published a few weeks ago, called Targeted Drug Development: Why Are Many Diseases Lagging Behind?

It makes for an interesting read and of course conjures up images of diseases on a race track… but should it? Tortoise and hare, each being a different disease: Alzheimer's and Hepatitis C, perhaps, in this analogy.

I did get a sense of the FDA throwing down a challenge, saying that discovery and testing had not kept up while the regulatory side was doing just fine, thanks. But that's just my take.

The diseases chosen were those in which billions of dollars/euros have been invested: cancer, diabetes, Alzheimer's, HIV/AIDS and Hepatitis C. And then there is a brief mention of rare diseases, which appear to be the Cinderella here: they are generally poorly funded, have few patients, and have to be creative to make their mark.

Take home messages –

The main ones were the need for, and use of, surrogate endpoints in clinical trials. Flexible trial designs were also mentioned, which could mean many things, such as adaptive trials or anything that veers away from the gold-standard randomized placebo-controlled trial. The need for the FDA to work closely (collaborate) with companies, academia and patient organizations was also touted as important.

Is this a sign of a more approachable (read warmer and cuddlier) FDA as they push the US as leading the world in the introduction of novel drugs?

Personally I think a 13-page PDF is a bit much; it could have been boiled down to one page. My summary: even if you spend billions on a big disease over decades, you still may not have what it takes to find a treatment. Focus on what clearly matters to the FDA, come up with a good surrogate endpoint, plan a flexible trial, collaborate, and voila, you should be able to succeed. According to this document it sounds like the FDA has development covered, and it is a shot across the bows of discovery to clean up its act and get with the program. Not so fast, you might say; I bet a comparable document from the NIH is coming soon, saying we need more funding to improve the discovery situation. So is this all posturing ready for the next president, you might ask? Well, I will let you decide, but frankly I wish the FDA would just focus on getting more drugs approved and leave white papers to others.



Wiki, Wiki, Wiki – for chemical probes

Sorry for the cheap title (for those born after the mid-80s, it's a reference to a song by a funk band called Newcleus).

A very interesting commentary was published today in Nature Chemical Biology by Arrowsmith et al., and it's all about chemical probes. Go check it out, and note that it took over 50 people to put it together. I am not an expert on these things, but at a guess it could have been written by a handful of authors… how many of these authors are tacked on for show?

I found out about it yesterday when I was asked to comment on it for Chemistry World. I would have much preferred to review it for Nature Chemical Biology, and then perhaps it would have ended up in a journal more appropriate to the results it contains.

The results are a wiki… and so far the wiki contains only 7 chemical probes. Yes, I had to look a few times to make sure I had not gone to the wrong address today when it published; when I looked yesterday all I could find was a GoDaddy site. With over 50 authors from very distinguished labs I would have expected them (or their students) to push out a grand total of more than 7 probes. Dare I mention how many millions of dollars these labs pull in from the taxpayer to work on finding chemical probes… naming no specific names. And then there are the big pharma folks; pretty sure they have wasted a good few billion dollars pursuing avenues of research based on duff probes, so they could have chipped in a few that actually work. Why not add them to the commentary for good measure? Yes, wikis take lots of effort, but based on this, the wiki was an afterthought; the commentary came first. I would hazard a wild guess that there will be fewer than 50 people submitting probes to this wiki (outside of the authors). I hope the Wellcome Trust do not look at what they have funded anytime soon; they may want to wait until the number of probes at least reaches double figures. And with that, the authors should have waited to publish until there was something substantial to show.

I do not mean to come across as cynical, because a few of the authors are widely admired (by me at least, for their consistent efforts in exposing crummy molecules, probes and drugs). Some of the authors are admired for other things that generally do not involve chemical probes or developing wikis.

My actual email to the reporter from Chemistry World is below; you can see how just one line was chosen out of several choice quotes. IMHO the ideal would have been a database that is at least structure-searchable (we are dealing with molecules here, after all) and contains all the probes out there (good and bad). And how can you totally ignore the massively funded NIH chemical probes effort when one of the authors is from the NIH? Why did this have to come from the Structural Genomics Consortium? The more I read it and think about it, the more questions I have.


Dear Ida,

I am happy to provide some comment on reading this paper. Thank you for bringing it to my attention.

My first comment is: does it really need 53 authors to preach to the choir on this topic? Or does this journal only accept mega-author papers?
Many of these groups are either responsible for putting probes out there or have been funded to the tune of over $500M by the NIH over the past decade; they should have thought about the consequences of what they were doing at the outset and planned a database of probes and information.

Is it me, or has the definition of (and requirements for) a probe continually shifted with each publication?

A chemical probes portal is a great idea, but the URL appeared to fail; did the reviewers of the article test it? (I get a GoDaddy site!)
Most people would write a paper after they had done something useful; perhaps it would have helped if they created the wiki first, made sure it was functional, and then published a paper on it?
No idea if such a portal would actually be searchable by structure; that was not suggested in the paper by the authors!

Most people would probably come across the problematic probes on the "In the Pipeline" blog; Derek Lowe does a good job of alerting the community anyway.

The authors did not cite our recent summary of issues with chemical probes, which is unfortunate; I know at least one of the 53 authors read it: "Parallel worlds of public and commercial bioactive chemistry data" (PubMed).

Hope that helps.







What’s needed for TB: A critical analysis of the drug discovery pipeline and efforts

After attending the Gordon Conference on Tuberculosis (TB) Drug Discovery (without breaking the rules on describing what was presented), I think I can safely make a few general comments that were already in my mind before the meeting. This week, what stood out for me was the lack of knowledge of all the efforts going on globally in TB. No single person has attempted to summarize the various global initiatives, the multigroup collaborations, and the consortia of all shapes, sizes and stages. When you ask colleagues you know, or those you meet for the first time, they do not have a complete picture; most of the TB initiatives fly under the radar. Frankly, if we are to make the case for more funding, there needs to be an unbiased and transparent assessment of the preclinical and clinical efforts, the funding situation in each country has to undergo some scrutiny, and any non-essential duplication should be avoided.
How could this be remedied? A team needs to be formed that can cover the biology and chemistry, as well as some of the funding elements, and write it all up succinctly. While there are pipelines for drugs, there is no definitive source of all the preclinical hits or leads. There is no summary table of all the consortia or major funded collaborations in TB. It is not clear which consortia are open or closed to those not in them. It is not even clear what efforts are being made to learn from the data that has been generated historically or is being generated now (e.g. all the screening data). For which targets is there duplication of effort, which have failed, and which are considered more or less valuable? What libraries have been missed? What chemical space has been covered, and what should be done next (most do not truly understand the drug discovery process)? This kind of analysis probably happens frequently in many other diseases, but I think it is sorely needed for TB. Is there anyone willing to help do this?


Waging war on infectious diseases

While I am at the Gordon Conference on Tuberculosis (TB) Drug Discovery and Development this week I cannot live Tweet due to their policy. I do however have some thoughts based on my observations.

This is not a war fought by soldiers armed with guns and artillery against an enemy similarly armed; this is a long war in which each side has evolved its weapons. The enemy is infectious diseases: they mutate, they morph, they resist our weapon of choice, the drug. Fortunately, through technology advances our drugs also change; we identify new chemistry, we find new targets, and we combine our weapons to make them lethal.

These soldiers do not get recognition; their battlefield is the laboratory, and their enemy is contained within plastic plates or mice infected with the disease. This practice battleground prepares them for the real battle in countries around the globe, with billions affected and millions of casualties every year.

What scars do our soldiers show from the years of research against a microscopic foe? Their brains are the real arsenal: combine and collaborate. But do they suffer too? We take for granted our scientists and the mental anguish they face. Take a disease like TB, which has existed for millennia; humans have battled it for just as long. Only in the past 70 years have we fought back, and now we are at a stalemate: our drugs are losing their bite, and our funding is diminishing in the west. Are we losing our way? Where are our leaders, and what new heavy artillery do we have to put in the front line?

As I speak to young scientists focused on the disease, it's pretty clear they plan to devote their whole careers to it; their goal is to find a cure. The pipeline looks dire whichever one you pick (1, 2), so we have to look in new places for ideas. I think we have to invest in the new generation of scientists as we look to the future. This war will not be won by refusing to fund R&D, and it will not be won by continuing the same old approaches. I look forward to openly sharing my ideas this week.



Observations on big and small collaborations

It's been a pretty hectic year to date, with several grants coming to their conclusion and the need to write up final reports and manuscripts. It has also provided just a little time for reflection. Pretty much everything I am involved in is a collaboration of some sort, so I am just a tiny piece of a pretty complex puzzle. Some of the collaborations are small scale (one or two labs) while others are much bigger (MM4TB, ~20 labs). In the latter case I have noticed that as we got further into the project, the sense of collaboration really took off. This likely suggests that funding a group for 5 years is really too short, because just as they get into their stride the project ends. I think this is probably natural, as it takes a while for researchers to get to know each other, and the more moving parts, the longer this takes. Initially, within such a collaboration, groups likely self-organize with those they already know or feel comfortable with.

But how do you get a big team to address a complex set of objectives, to think as one? Does this also suggest there is some optimal group size for collaboration? Possibly. What do you need as a core team for a drug discovery collaboration? Obviously it may depend on the disease to some extent. Take TB: you need in vitro screening, molecular biology, structural biology, medicinal chemistry, computational chemistry, ADME screening and in vivo testing skill sets. How many people are needed for each component? Do you need duplication of resources to verify what is seen in multiple labs, or to cover for illness or experiments that do not progress? There are a lot of questions. Ideally, in a collaboration everyone should know each other before the start and be comfortable with their colleagues, which may mean understanding everyone's strengths and weaknesses. It is also likely you need at least 2 of everything, which would suggest about 10 labs.
This is much closer to the NIH ‘center grant’ type model. Too many groups and collaboration may only happen in small clusters, too few groups and everyone is overwhelmed.

If I were to do another large-scale collaboration (e.g. MM4TB, which has been an amazing and productive experience to be involved in), I would probably tackle TB from my personally biased direction. Namely, use all the TB knowledge we have accumulated and use computational models to identify new in vivo active compounds (start with in vivo, not in vitro). I would also combine this with some structure-based work, docking known in vitro whole-cell actives into TB targets as a way to de-orphan compounds. There are literally thousands of HTS hits from whole-cell screening, but there have been minimal efforts to identify which of these have in vivo activity. For all the faults in the old in vivo TB literature data, I would use it as a resource from the beginning. There would also be very tight loops between prediction, testing and then updating the models. So future collaborations in TB should have a bigger computational and modeling component to complement the medicinal chemistry and biology aspects. While I think smaller-scale collaborations do a better job of balancing the experimental and computational aspects, they often lack the breadth of team you get in a big collaboration; you miss out on that accumulated knowledge. It will be interesting to see if any of these ideas are addressed at next week's TB Gordon Conference. Of course, team size pretty much depends on the grant source: STTR and SBIR funds dictate small-scale collaborations, while bigger center grants and the past FP7 funding from the EC have led to the larger-scale TB collaborations. I am still learning from exposure to these kinds of projects, and time will tell which will have long-term impact. Either way, we need to focus on training students to work well in collaborations.
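The tight predict-test-update loop mentioned above can be sketched abstractly. Everything here is a hypothetical placeholder: the scoring function, the assay and the retraining step stand in for the Bayesian model, the wet-lab test and the model rebuild, not the actual MM4TB workflow:

```python
def screening_loop(candidates, score, assay, retrain, rounds=3, batch=10):
    """Iterate: rank candidates by model score, test the top batch in the
    assay, then retrain the model on everything tested so far."""
    tested = []
    for _ in range(rounds):
        ranked = sorted(candidates, key=score, reverse=True)
        picked, candidates = ranked[:batch], ranked[batch:]
        tested.extend((c, assay(c)) for c in picked)  # the wet-lab step
        score = retrain(tested)                       # the updated model
    return tested
```

The point of the sketch is the ordering: the model is rebuilt after every tested batch, so each round of compound selection benefits from the newest in vitro (or, as argued above, in vivo) results.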

It also occurred to me that these observations could be useful for other diseases and drug discovery in general. For example, PubMed contains similar numbers of references on drug discovery for TB as for the bigger diseases we usually think of, such as cancer and depression (see the rough figure). While admittedly much of my recent work has been on more neglected or rare diseases (Charcot-Marie-Tooth, Ebola, Chagas disease etc.), could the general ideas for a collaborative drug discovery team gleaned from TB be applicable to smaller and bigger diseases? This has implications, because it may point to ways in which we can equalize the effort and spread more resources across the thousands of rare diseases. Perhaps young researchers will self-organize and work on the rare and neglected diseases, but there has to be funding there for small-molecule drug discovery. We cannot pull the money from such collaborations, as we are seeing in Europe. This is perhaps something that could be expanded as a future topic: how do we learn the best ways to collaborate, pass these skills on to the next generation, and make sure that all diseases benefit from these efforts?

[Figure: rough comparison of PubMed drug discovery reference counts across diseases]



Viva Collaboration! A new paper in a PLOS journal – and epic responses to reviewers

I am excited to say our new paper on Chagas disease has just been published in PLOS Neglected Tropical Diseases. This was a relatively painless process compared to previous PLOS experiences: we submitted March 31, had reviews back by May 8, and the paper was accepted June 5 after some back and forth. We did have some pretty extensive editor and reviewer comments to address, and in my bid for openness I list the reviewer comments and our responses below. I am grateful to the editors and reviewers for their input. Of course I am also very grateful to our collaborators at SRI and UCSD, and to my co-corresponding author Jair!

To say this was an adventure would be an understatement:

This work was funded by NIAID as a Phase I STTR to CDD, Inc., and upon funding we quickly had to find another collaborator, as ours moved to South America (a no-no for such grants). We lucked out in several respects: the McKerrow group were kind enough to do the in vitro testing, and when we found we had some interesting results we were able to get some in vivo data. There were a few dead ends along the way as we transitioned from our initial idea for the project to using machine learning models based on public data. There was some remarkable luck in finding pyronaridine, as it appeared to have been overlooked in an earlier screen by the Broad. We took a chance, retested it, and then did something no one else had: we put it in the mouse model for Chagas disease. I hope this is just the beginning for this molecule in this disease, as it's already used in a combination medicine for malaria. It took several years of hard work by the team to get to this point, and it's been exhausting and exhilarating in equal measure. What happens next? That's the BIG QUESTION. Viva Collaboration!


Responses to reviews

The manuscript has been reviewed by three experts in the field with their comments and recommendations below.  I agree that this manuscript has significant merit, but requires “major revision”.  There was concern that the reported enrichment might be overstated if the chemical libraries are biased for bioactives (see reviewer 1).  The introduction contains many statements that oversimplify the true situation (see reviewer 3) and there are many other comments about style and clarity of the writing.  As pointed out by reviewer 1, the finding of several nitrofurans as hits does not support the notion that the method identifies non-obvious compounds.  The question about appropriate controls in the in vivo experiment needs to be carefully addressed.  The specific comments of the three reviewers are as follows:

Response: We have addressed the comments described below. We propose to use our approach with many other NTDs in future, and at that point we will certainly evaluate other sets of compounds for T. cruzi activity. We were focused on drugs and natural products (components/starting points of many drugs). The current study represents a good validation of the method, in our opinion, without screening all 7200 compounds to obtain complete statistics, which was outside the scope and budget of the project. We do not think we have oversimplified the situation in the introduction, but we hope our edits have improved it. We found several non-nitrofuran hits. In our opinion we had appropriate controls for the in vivo experiments (known positive controls and vehicle).

Reviewer #1:

General comments: The authors present a new Bayesian machine learning model aimed at enriching the selection of anti-trypanosoma compounds from large chemical libraries which do not need to have biological annotation. This methodological approach may be very powerful in order to reduce sampling compound numbers down whenever there is limitation in the assay throughput. It is potentially applicable to other diseases. The validation of the approach is based on the comparison of the hit rate obtained from the final selection versus historical experience in random screening. Nevertheless it is not experimentally proven that the chemical libraries used as compound source are not biased for bioactives. If they were, the enrichment factor offered by the Bayesian model would be contaminated. It is recommended that the authors provide more compelling evidence in this regard.

Response: We did not filter any molecules out of any of the libraries. Many of the libraries were made up of drugs, so these could be classed as bioactives against a whole array of different targets (antibiotics, antifungals, antidepressants, etc.). Without screening every molecule in vitro, it is unclear how many actives/inactives there are vs. T. cruzi in the 7200 compounds used for virtual screening; the whole point of the virtual screening approach is that we test just a fraction. We have tried to address in the discussion any compounds overlapping with the training set, which was our bigger concern.

As for the Target Prediction methodology, it would also be desirable that it were experimentally proven. However, it is understood that this experimentation may fall out of the scope of the current paper, so the manuscript could be published without it.

Response: We include the target prediction methodologies in the absence of experimental verification because this is work that will be performed in future. Suggesting potential targets is useful to provide some idea of mechanism and allows the scientific community to empirically verify the hypothetical target.

Comments for authors' consideration: 1.    Page 4, Author's summary, and page 11, last paragraph: the enrichment factor of the Bayesian selection vs. a random selection is not higher than 4-8 fold (as you cite far down in the paper). Can we guarantee that the compounds set “in silico screened” are not bias for bioactives (e.g. malaria, NP (cytotoxic)? If the compound sets used as source for the analysis were biased for antibiotic activity, an enrichment over random selection would also be observed. Why not a much larger compound database with random and broad chemical diversity coverage was used? Why not to test all compounds, or at least a subset randomly selected in order to ascertain false negatives and compare success rate of random vs. Bayesian tool-driven selections? Or retrospectively, for instance compounds in reference 26.

Response: It is unclear where the reviewer obtains the 4-8 fold enrichment value. As described above, none of the libraries we searched were filtered to remove compound classes such as antimalarials; we did not try to remove drugs of any class, and it is unclear why we would want to if we were trying to find new molecules for T. cruzi. Similarly, we did not filter for cytotoxic compounds. We could certainly screen much larger datasets, but our goal was to see if the model could help identify molecules in the smaller datasets for which compounds were readily available. Our results did not reveal any antibiotics as hits, and it is unclear why we would want to search specifically for antibiotics.

Ref. 26 is the GSK kineto-box, which became available only after we did the initial screening and compound selection. Scoring the GSK kineto-box Chagas hits with the Bayesian model may be useful to see if any compounds overlap; however, our goal was to perform prospective rather than retrospective testing.

2.    Page 6, Paragraph 1, line 9. I suggest to consider the following more recent review on this matter: Nature Reviews Drug Discovery, published online 18 July 2014; doi:10.1038/nrd4336

Response: We have added this paper, "The discovery of first-in-class drugs: origins and evolution."

3.   Page 7, Paragraph 2: You may want to comment on the negative outcome of the clinical trial with Posaconazole and Ravuconazole. Since you cite non-CYP51 preclinical and clinical assets, I miss a consideration to fexinidazole and oxaboroles.

Response: We added a few sentences about the CYP51 Phase II results. The fexinidazole work is not published yet (we spoke to Eric Chatelain). We also mentioned the fexinidazole and oxaborole programs led by DNDi and added references.

4.   Page 9, Paragraph 2, line 4 and page 10, paragraph 1: Have you had the chance of including compounds in reference [26] in your analysis? It might be a valuable set to assess false negatives retrospectively.

Response: This paper also came out after we had completed our model and screening, so we have not included these compounds. Future work could make use of them, although again this would be another retrospective analysis of limited value.

5.    Page 13, line 1: typo of bracket? “… purchased from eMolecules (La Jolla, CA)…”

Response: Thank you we have corrected the position of the parenthesis.

6.    Page 14, Paragraph 2: what's the statistical cut-off at 3xStDev? Is 50% well far above?

Response: Usually 60% is close to 3x the StDev from DMSO controls. In our screen specifically, the 3x StDev cutoff from DMSO controls was 70% in replicate 1 and 66% in replicate 2. When we use a 50% cut-off we are being less stringent; we know we will select false hits for dose-response, but we reduce our odds of missing a false negative.
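For readers unfamiliar with the convention, the 3x StDev cutoff is just the mean of the negative-control (DMSO) signal plus three standard deviations. A sketch with made-up control values (not the actual screen data), chosen so the strict cutoff lands near the 66-70% reported above:

```python
import statistics

def hit_threshold(dmso_controls, n_sd=3):
    """Mean + n_sd standard deviations of the negative-control signal."""
    return statistics.mean(dmso_controls) + n_sd * statistics.stdev(dmso_controls)

# Hypothetical normalized control readings (%), for illustration only.
controls = [55.0, 62.0, 58.0, 61.0, 59.0, 57.0]
strict = hit_threshold(controls)   # the statistical 3x StDev cutoff
relaxed = 50.0                     # the deliberately less stringent cutoff
cutoff = min(strict, relaxed)      # lower cutoff: fewer missed false negatives
```

Picking the lower of the two cutoffs mirrors the response above: more false positives get carried into dose-response, but fewer real actives are thrown away at the primary-screen stage.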

7.    Page 16, Paragraph 3 (Results, Bayesian Models) and Figures S1 and S3. I miss the authors do not remark the nitro-aromatic/heterocyclic moiety as a “good feature” (i.e. G13 in Figure S1?). Actually, 3 out of the 11 compounds that turned out to be active in vitro (Table S2) contain this motif. Moreover, these 3 compounds are nitro-furans which clearly resemble the chemical structure of Nifurtimox, a classical toxic anti-chagasic compound in the clinic. I think the authors should discuss about this very predictable finding. It is expected that the Bayesian model also leads to selection of novel compounds which are not so close analogues of existing gold standards. Otherwise, its value-added is questionable.

Response: We described this feature as "aromatic fragments containing basic nitrogen". We certainly found non-obvious compounds such as tetrandrine and pyronaridine.

8.    Page 17, Paragraph 2: I suggest indicating the total number of molecules scored through the Bayesian model. Approx 7,500?

Response: We added: "Approximately 7200 molecules were screened using the Bayesian model."

9.    Page 17, Paragraph 2: Aren't there 5, and not only 4, compounds with EC50 <1 uM (including SC-0011754 in Table S2)? Noteworthy, 2 out of these top 5 are nitrofurans.

Response: We have corrected this.

10.    Page 17, Paragraph 3: May the authors elaborate on the observations of adverse/toxic effects of Furazolidone and the nitrofural prodrug? And in comparison with Nifurtimox, which is not included in the experiment?

Response: With respect, we were not interested in the side effects of these very well-known and well-characterized drugs, for which we provided references.

11.    Table S2: SC-0011801 is last compound with EC50 figures in the table. I assume all the compounds below are inactive (>10), but why are the cells in the table empty? I am pleased to see the inclusion of EC90 and Hill slope in the table.

Response: We did not test these inactive compounds in dose-response experiments, therefore the values are not available.

12.    Table S2: I suggest you define the legend of “Infection Ratio” column. Are the responses from the primary screening at 10 uM in duplicate?

Response: Infection ratio: the number of infected cells divided by the total number of cells. Yes, primary screening was done in duplicate, hence the two values for infection ratio (at 10 uM).

13. Table S2: SC-0011754 is Nitrofural, isn't it? If so, please add its name to the table.

 Response: Thank you – we have added it.

14. Page 18, Paragraph 2: Without experimental testing, the target prediction based on the Pathway Genome Data Base constructed cannot be validated. It is true you are building a sound hypothesis, but the goodness of the prediction is not assessed in the paper. Since it is not experimentally proven that TR is the target for pyronaridine, I would not stress in excess the value added by this approach.

Response: Thank you for your suggestions. The database creation is described in this paper, and the database is now available to other researchers for the first time. We clearly described that our goal is to ultimately use the database for target identification. We did use the metabolites from the database for similarity analysis to propose targets (along with other approaches) for future testing. This represents considerable future work, but it is useful to propose such targets.

15. Page 20, Paragraph 1: If you discount the 3 nitrofurans as a foreseen selection, the success rate is 8/97, i.e. 8%. That means a 4-8 fold enrichment vs. random selection. Might the authors discuss whether this is higher than expected and the value-added? In order to fairly evaluate the goodness and value of the predictive model, I miss an assessment of false negatives, and a comparison with a random selection of compounds from the same set. One caveat: some of the compounds do not look like tractable for an oral and safe drug, e.g. SC-0011752, SC-0011796. Do you have cytotoxicity results for all 11 hits? Which of the 11 hits were present in the training set?

Response: Our previous work with Mycobacterium tuberculosis has shown, with prospective testing, that depending on the training and test sets we would see enrichments similar to those described by the reviewer. We have also reported hit rates from 20-70%, whereas HTS generally provides hit rates of <1%. We did not perform a random selection and did not test every compound in the libraries that were scored; clearly it would have been prohibitively expensive to do so. Cytotoxicity data for all the hits is available in Table S2.
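The enrichment arithmetic under discussion is simple enough to spell out. The numbers below come from the exchange above (11 actives among the 97 compounds tested, or 8 if the three nitrofurans are discounted, against a typical <1% random HTS hit rate); the function itself is only an illustration:

```python
def enrichment_factor(hits, n_tested, background_hit_rate):
    """Fold improvement of a model-selected subset over random screening."""
    return (hits / n_tested) / background_hit_rate

# 11 in vitro actives out of 97 compounds picked by the Bayesian model,
# against a typical HTS background hit rate of ~1%.
ef = enrichment_factor(11, 97, 0.01)             # roughly 11-fold
# With the reviewer's discount of the 3 nitrofurans: 8/97 vs. 1%.
ef_discounted = enrichment_factor(8, 97, 0.01)   # roughly 8-fold
```

The disagreement in the review thread comes down to the assumed background rate: against a 1% random hit rate the enrichment is around 8-11-fold, and a higher assumed background rate shrinks it toward the reviewer's 4-8-fold figure.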

16. Page 20, Paragraph 2: You may want to mention that pyronaridine has been in clinical use in China, the positive opinion by the EMA, and the current clinical trials for combinations.

Response: We have now added the following to address this: ‘Pyronaridine is in clinical use as an antimalarial [90,91], is a P-glycoprotein inhibitor [92] and was given a positive opinion by the European Medicines Agency for use in a combination therapy [93].’

Reviewer #2:

A very interesting and innovative approach to bring chemo and bioinformatics together to address research gaps in the Chagas field, and attempt to connect phenotypic and mechanistically driven approaches to drug discovery, with the goal of populating and diversifying the late stage discovery pipeline. The public access output of the research is an excellent resource. The identification of pyronaridine as a Tc in vivo active is an interesting discovery and PoC for the methodology.

Response: Thank you.

The authors do not discuss any potential improvements that could be made to the model building/machine learning based on the molecules that were identified by the process.

Response: This is an interesting question. Our focus was not on how to improve the model/machine learning. We have applied the same approach as was found to work reasonably well with TB and other diseases in our hands. A whole paper could probably be written on what other approaches could be tried, which would be well outside the current scope. We could change the algorithm and descriptors, and it would likely have a small effect. We could also filter the training set further based on some criteria.

We have added: ‘There are many steps we could take to update our computational models such as incorporating the current data and using other machine learning algorithms.’

Specifically: (p12) the same set of molecular descriptors were used to evaluate natural product libraries and “drug-like” libraries (would the model be applicable to fragment libraries?)

Response: It is possible we could filter other vendor libraries and fragments; we have not tried this, and it is unclear if/how this would improve on what we have done.

(p16) the good molecular features identified in the DR response data alone model (Fig S1) are not well represented in the actual hit list

Response: These are just some of the good features in the training set; it is possible these features did not appear that often in the test sets of compounds.

(p17) functional groups known to be problematic in drug development are well represented in the hits eg nitro groups, furans, Michael acceptors, metal chelators (hydroxamic acid), gramine (reactive metabolite formation), poly carboxylic acids, high MW compounds eg steroids (what was the MW filter?), bisguanidines, amidines and polyoxygenated compounds.

Although most of these compounds did not repeat in the hit confirmation assay, can the authors comment on the filtering and selection process to remove nuisance compounds (frequent hitters in HTS sets – now described as PAINS (Baell et al.)) or reactive functional groups from the learning libraries?

Response: We did not filter the sets of drugs and other natural products using the PAINS filters (if we were using large vendor libraries this would be a valid filter). It is well known that the T. cruzi-active molecules contain these groups, and they have been known for a long time. We performed dose-response analysis only on the active compounds; 11/17 had EC50 < 10 µM.

Reviewer #3:

This is potentially an interesting study that uses various computational approaches to facilitate drug discovery for Chagas disease and applies these approaches in proof-of-principle experiments. However, the actual content suffers from many shortcomings that need to be addressed before the story can be published. The main concerns of this reviewer include: 1. There is no entry for T. cruzi in the BioCyc database even though the authors claim they have created such a database. 2. Some experiments were executed without proper controls. 3. Manuscript text is not written well and contains many unsubstantiated claims, numerous typos, missing information and it is difficult to follow at times. It needs significant work and I flagged the most obvious examples in the specific comment section below, but that list is not exhaustive at all.

Response 1. We sincerely apologize that T. cruzi was unavailable in BioCyc when the reviewer attempted to access it. There was a server power outage and it did not automatically come back up; we are working to ensure it does not go down again. The database can be accessed at http://node2.csl.sri.com:1555/

Response 2. We have corrected the methods to include the vehicle control, which was previously described only in the figure.

Response 3. We have made significant changes to address all these concerns.

Specific comments. Methodology/Principal Findings p.3 – “97 compounds…”- Please change to “Ninety-seven compounds…”.

Response: We have changed this.

p.3 – “We progressed five compounds to an in vivo mouse efficacy model…” Efficacy model of what disease? This is the first time in the manuscript any mouse model is mentioned.

Response: We have changed this.

Author Summary p.4 – “We have used data from a phenotypic screen to build Bayesian models to predict activity against T. cruzi in vitro.”  Predict activity of what?

Response: We have changed this to ‘anti-parasitic activity’.

p.4 – “We identified the antimalarial pyronaridine has having in vivo efficacy and providing us with a new starting point for further investigation and optimization.” Please change this sentence to a grammatically correct form.

Response: We have changed this.

Introduction p.6 – “In the 1980’s the pharmaceutical industry took advantage of advances in molecular biology/genetic engineering and began replacing cumbersome phenotypic and whole cell HTS with target-based screening assays.” The reason for replacing phenotypic assays with target-based ones was not because of the former being cumbersome. They are not – please delete this word. The sentence also implies that phenotypic and whole cell HTS assays were not one and the same thing. I am not aware of any phenotypic HTS run in the 1980s or earlier that would not be a whole cell assay. Please use ‘phenotypic’, ‘whole cell’, or ‘phenotypic, whole cell’.

Response: We have changed this.

p.6 – ‘While target-based screens using simple recombinant protein enzyme assays offer many advantages in terms of cost and scalability,…’ Please change ‘enzyme’ to ‘enzymatic’.

Response: We have changed this.

Target-based assays do not provide advantages over cell-based assays in terms of cost or scalability, and these are definitely not the reasons why they are being used. This sentence needs to be changed.

Response: We respectfully disagree with the reviewer. In our experience, target-based assays are cheaper and can be run at much higher throughput than cell-based assays.

p.6 – “they rely on the assumption that the selected target is in fact the best and most druggable target for a given disease.”  Again, this is not the case. The assumption is that the target is good enough to yield a drug that can meet the drug target product profile for a given disease.

Response: We have changed this.

p.6 – “…especially for neglected infectious diseases where drug targets are poorly understood or target-based approaches have been unsuccessful in the past [1] (or a complete failure [2]).” Please remove the sentence in the parentheses and reference [2]. Common bacterial infections described in this article (such as those caused by Staphylococcus aureus) do not belong among neglected infectious diseases. Additionally, cell-based screens described in reference [2] fared equally badly in terms of finding promising hits as the target-based screens described in the same paper, so the reference does not support the point the authors are trying to make.

Response: We changed it to: “Nonetheless, in the last decade, there has been a shift back towards using phenotypic screens as a starting point for drug discovery, especially for infectious diseases where drug targets are poorly understood or target-based approaches have been unsuccessful in the past [1]. In fact, analysis of the origin of first-in-class small molecules found that phenotypic screens identified more novel inhibitors than any other approach between 1999 and 2008 [3].”

p.6 – “One such disease area, where target-based drug discovery has largely failed, is in the field of neglected tropical diseases (NTDs).” This is again incorrect. There simply are almost no validated targets for these diseases and target-based drug discovery was barely attempted. If authors have knowledge of failed target-based programs, they should include relevant references after this statement.

Response: We respectfully differ in our opinion. We provide plenty of examples of target-based drug discovery, such as CYP51 and cruzain as mentioned in the text, glycoprotein and glycolipid synthesis (mucins, trans-sialidase), DNA topoisomerase and other enzymes involved in the replication of DNA, phosphodiesterase C, etc.

p. 6-7 – “The trend towards using phenotypic screens over target-based screens is particularly strong for NTDs and related bacterial and fungal pathogens.” There are no fungal diseases on the NTD list as far as I know. In what sense are these fungal pathogens related to the NTDs?

Response: We have changed it to ‘The trend towards using phenotypic screens over target-based screens is particularly strong for NTDs as well as bacterial and fungal pathogens.’

p. 7 – “The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity (currently focused on 3 targets, CYP51, cruzain and genes associated with DNA damage) [12-14].” Reference 14 leads to a website that offers Internet Information Services (IIS) for Windows® Server. Please provide references that support claims made in the text (perhaps this one? – ‘http://www.bvgh.org/Current-Programs/Neglected-Disease-Product-Pipelines/Global-Health-Primer.aspx’).

Response: We apologize for this broken link – we have changed this.

p. 7 – “The remaining three products in clinical development (Phase I and II) target a single enzyme, CYP51, which has been the focus of Chagas disease research to date [17-22].” This is incorrect; please change to ‘the focus of Chagas disease drug development’.

Response: We have changed this.

p. 7 – “The only additional novel drug target with a single compound in preclinical development is cruzain, a T. cruzi cysteine protease and there is considerable literature surrounding this class of inhibitor.” Please change to ‘this inhibitor’ or ‘this class of inhibitors’.

Response: We have changed this.

p. 8 – “There have been some target-based high throughput screens for CYP51 [22] and cruzain [24] as well as virtual screening for cruzain [23].” I assume that the authors mean that there were some screens to discover INHIBITORS of CYP51 and cruzain. If that is the case, please modify the sentence.

Response: We have changed this.

p. 9 – “In addition we have created a BioCyc database for T. cruzi, which complements other sources of related metabolic pathway data (including KEGG T. cruzi pathways [48]).” Currently, there is no Trypanosoma cruzi organism database in BioCyc.

Response: We sincerely apologize that T. cruzi was unavailable in BioCyc when the reviewer attempted to access it. There was a server power outage and it did not automatically come back up; we are working to ensure it does not go down again. The database can be accessed at http://node2.csl.sri.com:1555/

Methods p. 10 – “CDD database and Chagas datasets.” The authors need to specify the name of the dataset that was created as part of this publication. Is it “Trypanosome: Chagas Disease Literature Compounds”?

Response: The Broad dataset was named ‘TRYPANOSOME: Broad Primary HTS to identify inhibitors of T. cruzi Replication’.

In the process of this work we also curated the dataset ‘Trypanosome: Chagas Disease Literature Compounds’.

The molecules in Table S2 are currently in a private vault and will be shared at a later date.

p. 10 – “Data annotation and Pathway Genome Data Base construction”. A large part of this text belongs in the Results section. Figure 1 is not very informative as it completely lacks any description other than Pathway Genome Data Base for T. cruzi. It is unclear to me what the various symbols shown in the Figure mean.

Response: We have added a Results section for the PGDB and a description for Figure 1.

“A PGDB was constructed for T. cruzi using the complete genome sequence of the Dm28c strain (Figure 1). The underlying genome sequence consisted of 5,287 contigs assembled into 1,378 scaffolds of 30,716,540 base pairs. Pathologic found 11,349 distinct gene products, at least 880 of which were found to be enzymes and at least 16 of which are transporters. Pathologic was able to infer 1030 enzymatic reactions and 122 pathways from these assignments as well as the existence of 806 metabolic compounds. This set was filtered to 358 molecules after removal of compounds with R-groups and small nuisance molecules. This dataset was then used to infer potential targets by comparing the Tanimoto similarity with a phenotypic screening hit [39].”

p. 12 – “These models were used to score the following drug libraries; Selleck Chemicals (Houston, TX) natural product library (139 molecules)…” The Selleck natural product library contains 131 natural products, not 139. Please check which of these two numbers is correct.
Response: The version of this library downloaded several years ago contains 139 molecules.

p. 12 – “…and Traditional Chinese Medicine components (373 molecules).” Please include a reference for this library or, if a reference is unavailable, a brief description of how it was put together.
Response: This library represents some common single-component TCMs and was kindly provided by Dr. Ni Ai, Zhejiang University, China.

p. 14 – “Hit selection and secondary screening (dose-response assay)” This section includes a repetition of the T. cruzi infection assay description from the preceding section (“Primary In vitro screening”) and contains some additional experimental details. If both sections refer to the same assay as it seems, the authors should avoid the repetition and should include only the more detailed protocol version in the manuscript.
Response: The methods section has been modified following the reviewer’s advice.

p. 15 – “Mice were housed at a maximum of 5 per cage and kept in a specific-pathogen free (SPF) room…” Please change to “specific pathogen-free…”.
Response: We have changed this.

p. 15 – “To infect the mice, trypomastigotes of T. cruzi Brazil luc strain were harvested from culture supernatant and injected intraperitonealy, 10^5 trypomastigotes per mouse.” Please describe how trypomastigotes were harvested and the volume of parasite suspension used for the infection. The level of detail is not sufficient for the experiment to be repeated by an independent group.

Response: We have changed this. Additional information was included.

Correct spelling is ‘intraperitoneally’.

Response: We have changed this.

p. 15 – “Starting on day 3 the infected mice were treated with test compounds at 50 mg/kg administered in 20% Kolliphor, i.p., b.i.d., for four consecutive days.” This experiment is missing the negative control in which infected mice are treated with the vehicle by IP injection. As both parasites and experimental compounds are injected into intraperitoneal cavity, the observed anti-parasitic effect might be at least partly due to the direct effect of the vehicle on the parasites proliferating in the IP cavity. The authors need to repeat this experiment with all the necessary controls.


Response: The controls were administered using the same route as the tested compounds: IP. The text was corrected.

Please include the volume used for injection of compounds. ‘Intraperitoneal’ is one word and should be abbreviated as ‘IP’ instead of ‘i.p.’


Response: The volume was included.

p. 15 – “At day 7 post-infection, the luminescent signal from infected mice was read upon injection of D-luciferin.” Please describe in more detail how this was done or include a reference. The level of detail is not sufficient for the experiment to be repeated by an independent group.


Response: We have changed this.

p. 15 – “The absolute numbers of measured photons/s/cm2 were averaged between all five mice in each group and compared directly with compound-treated mice and the control groups.” This sentence does not make much sense. How did the authors compare the luminescence signals with compound-treated mice and the control groups?


Response: We have changed this.

p. 15 – “The efficacy percentage was calculated based on relative luminescence signal reduction compared to the controls.” I assume they used the vehicle-treated group for luciferase signal normalization. It needs to be specified in more detail how this was done.


Response: We have changed this.
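As a sketch of the normalization the reviewer is asking about: efficacy is the percentage reduction of the mean luminescence signal relative to the vehicle-treated group. A minimal illustration in Python (the numbers and function name are made up; the exact calculation is as described in the revised methods):

```python
def efficacy_percent(treated, vehicle):
    """Percent reduction of mean luminescence (photons/s/cm2) vs. vehicle control."""
    mean = lambda xs: sum(xs) / len(xs)
    return 100.0 * (1.0 - mean(treated) / mean(vehicle))

vehicle = [1.0e6, 2.0e6, 1.5e6, 1.0e6, 1.5e6]  # vehicle-treated mice (made-up values)
treated = [1.0e5, 2.0e5, 1.5e5, 1.0e5, 1.5e5]  # compound-treated mice (made-up values)
assert round(efficacy_percent(treated, vehicle)) == 90  # ~90% signal reduction
```

A value of 0% would mean no difference from the vehicle group; 100% would mean no detectable parasite signal.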

p. 16 – “Two tailed paired Student t test was used to assess statistical significance between luminescence values from vehicle-treated and compound-treated groups at day 7 postinfection.” The authors need to specify what hypothesis was tested – presumably that two such datasets are different?

Response: We have rephrased this.

p. 16 – “The UCSD Institutional Animal Care and Use Committee reviewed and ‘approved’ this study with protocol number S14187.” Why is the word “‘approved'” used with quotation marks?
Response: We had written – ‘All animal protocols were approved and carried out in accordance with the guidelines established by the Institutional Animal Care and Use Committee from UCSD (Protocol S14187).’

Results p. 16 – “Using either dose response data alone or the combination or dose response and cytotoxicity (dual activity) resulted in statistically comparable models.” This sentence is difficult to understand. What combination do authors have in mind?


Response: We changed it to: ‘Using either dose response data alone or the combination of dose response and cytotoxicity (dual activity) resulted in statistically comparable models.’

p. 16 – “Both had leave one out ROC values greater than 0.8 (Table 1).” I do not understand this sentence. Do the authors mean ‘leave-one-out ROC values’? Also, what is ROC? This parameter is not described in the Methods section. Does this refer to the ROC AUC parameter?


Response: ROC is now spelled out – it can be used interchangeably with ROC AUC. This is a standard measure of how the model performs in predicting the compounds left out (whether leave-one-out, 5-fold cross validation or leave-out-50% x 100 cross validation, etc.).
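For readers unfamiliar with the metric: the ROC AUC is the probability that a randomly chosen active is ranked above a randomly chosen inactive, and "leave-one-out" means each compound is scored by a model trained on all the others. A minimal pure-Python sketch (our own illustration with generic `fit`/`predict` callables, not the actual modeling software used):

```python
def roc_auc(labels, scores):
    """Probability a random active outranks a random inactive (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def leave_one_out_scores(samples, labels, fit, predict):
    """Score each sample with a model trained on all the other samples."""
    held_out = []
    for i in range(len(samples)):
        model = fit([x for j, x in enumerate(samples) if j != i],
                    [y for j, y in enumerate(labels) if j != i])
        held_out.append(predict(model, samples[i]))
    return held_out
```

`roc_auc(labels, leave_one_out_scores(...))` then yields a leave-one-out ROC value of the kind reported in Table 1: 0.5 is random ranking, 1.0 is perfect.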

p. 16 – “The use of FCFP_6 fingerprints enabled the good features important for activity to be visualized in the dose response data alone model (Figure S1)…” What do authors mean by the good features? This is not postulated anywhere in the article. Do they mean features that positively correlate with compound potency or selectivity?


Response: We have changed the text to: ‘The use of FCFP_6 fingerprints enabled the features important for activity (termed good features) to be visualized in the dose response data alone model (Figure S1) which included tertiary amines, piperidines and aromatic fragments containing basic nitrogen functionality while those features that were negatively related to activity included cyclic hydrazines prone to tautomerization as well as a number of electron-poor chlorinated aromatic systems (Figure S2).’
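The "good features" are FCFP_6 substructures that receive positive weights in the Bayesian model. As a hedged illustration of how such weights are commonly derived (a Laplacian-corrected estimate in the style of Bayesian fingerprint classifiers; our own simplified sketch, not the exact implementation used in the paper):

```python
import math

def feature_weights(fingerprints, labels):
    """Laplacian-corrected naive Bayes weight per fingerprint feature.

    fingerprints: one set of feature ids (e.g. FCFP_6 bits) per molecule
    labels:       1 = active, 0 = inactive
    Positive weights mark 'good' features; negative weights mark 'bad' ones.
    """
    p_base = sum(labels) / len(labels)  # baseline fraction of actives
    totals, actives = {}, {}
    for feats, y in zip(fingerprints, labels):
        for f in feats:
            totals[f] = totals.get(f, 0) + 1
            actives[f] = actives.get(f, 0) + y
    return {f: math.log((actives[f] + 1.0) / (totals[f] * p_base + 1.0))
            for f in totals}
```

A molecule's overall score is the sum of its feature weights, so features like the tertiary amines and piperidines above carry positive weights, while the cyclic hydrazines and electron-poor chlorinated aromatics carry negative ones.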

p. 17 – “…providing very clear trends and perhaps a rationale for why the enrichments are so dramatic for these systems.” It is not explained anywhere what do the authors mean by ‘the enrichments’.


Response: We have deleted this statement.

p. 17 – “Ninety seven molecules were tested and 11 were found to have EC50 values less than 10 uM (Table S2). Four of these molecules (verapamil, pyronaridine, furazolidone and tetrandrine) had in vitro EC50 values less than 1 uM (Table 2).” The authors need to list how many independent experiments they ran for determining the EC50 values (n). Based on the Methods section it appears that they ran only one experiment in duplicate. If that is the case, they need to repeat experiments so that n=3 at least.


Response: Primary screening was done in duplicate.

p. 17,18 – “In vivo testing” The experiment needs to be repeated with proper controls. Please see my comment in the Methods sections.


Response: As described in the methods we used the correct vehicle control.

p. 18 – “The molecules with the highest Tanimoto similarity in CDD were T. cruzi GAPDH inhibitors (Figure S6).” First of all, it is not clear from the Figure S6 legend what this figure shows. What molecule was used for this similarity search? Secondly, the shown molecules have only 43% similarity to pyronaridine (assuming this compound was used in the search) and look very different. The authors do not make any effort to introduce a validated metric that would assess how reliable this type of prediction could be. Unless they present some experimental evidence, this section does not have any value and should be deleted. Such a conclusion is further confirmed by the quite extensive list of proposed targets that follows in the text and includes polyamine biosynthesis, trypanothione disulfide reductase, and topoisomerase VI.


Response: Figure S6 shows a similarity search, using pyronaridine as the query, of a public curated dataset in CDD of Chagas-related compounds from the literature. This is a different analysis from the similarity search of the metabolites in the PGDB, which resulted in one molecule with 67% Tanimoto similarity. Tanimoto similarity is widely used for similarity searching. This section of the manuscript uses all the available molecule sets and the metabolites from the PGDB to try to infer potential targets. Similarity searching in ChEMBL and other public sources is a sensible approach to leverage the existing information on compounds and their known targets.
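For clarity, the Tanimoto coefficient on binary fingerprints is simply the intersection over the union of the set bits, and ranking a compound library against a query follows directly. A minimal sketch (fingerprints represented as sets of bit indices; the names are illustrative, and in practice the fingerprints would come from a cheminformatics toolkit such as RDKit):

```python
def tanimoto(fp_a, fp_b):
    """Intersection-over-union of two fingerprints given as sets of bit ids."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    return len(fp_a & fp_b) / len(fp_a | fp_b) if (fp_a or fp_b) else 0.0

def rank_by_similarity(query_fp, library):
    """Sort a {name: fingerprint} library by Tanimoto similarity to the query."""
    return sorted(library, key=lambda name: tanimoto(query_fp, library[name]),
                  reverse=True)
```

A "67% Tanimoto similarity" as reported above corresponds to a coefficient of 0.67 on this 0-1 scale.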

Discussion p. 20 – “Historically, for a diversity-based library undergoing HTS, it is expected a range of 1 to 2% of hits based on observed activity (usually >50% antiparasitic activity at 10 μM and no signs of cytotoxicity at this concentration) will be observed [30].” Screening in the attached reference was done at 3.75 uM compound concentration and resulted in identification of ~4,000 hits from a 300k small molecule library. As the screening described in the current article was done at 10 uM compound concentration, one would expect the hit rate to be significantly higher than 1-2%. To assess whether the 10% hit rate is higher than the hit rate in an unbiased screen, the authors need to include CC50 values for the listed compounds in Table S2.


Response: Cytotoxicity data was generated only for compounds with dose-response data.


The Case for Why the EC Needs to Fund Small Molecule TB Drug Discovery

My previous post may have been too refined, so let me revert to a drier scientific style. In the last few weeks Iain Old, Giovanna Riccardi, myself and several others from the MM4TB project have been lobbying MEPs and basically anyone who will listen, to try to raise awareness of the lack of funding for TB drug discovery in Horizon 2020. We have summarized the dire situation in the following, which we have shared by email.
Tuberculosis (TB) is a truly ‘global health crisis’ and a persistent threat in high income countries [1], affecting more than two billion people around the world [2, 3]. On an annual basis, nearly 9 million people are infected and 1.5 million of them die. In 2013 the UK, a country with a population of over 60 million, had over 7,800 tuberculosis cases. Although effective TB drugs have existed for over 50 years, multidrug resistant (MDR) strains have spread across the globe (including Europe [4, 5]), which hinders treatment and increases costs [2, 6-11]. Each year, at least half a million new MDR TB cases occur. Although incidence, prevalence and mortality rates are falling in the African, Eastern Mediterranean and European regions, this is not fast enough to meet the 2015 global targets of the Millennium Development Goals. At least $8bn is required to deal with this epidemic, and that excludes the estimated $2bn needed for drug and diagnostic discovery. There are currently 15 vaccines in clinical trials [3]. TB vaccines are widely used in Europe and elsewhere and seem to offer some protection from the disease, but they have proved a dismal failure in developing countries, like India and in Africa, where the disease is endemic. In contrast, there are only 10 drugs for TB in clinical development [3], which is inadequate to address resistance to existing drugs. Trials of drug combinations have so far proved inferior to the current 6 month standard of care [3]. It is therefore widely agreed that more early stage drug discovery is needed to feed the clinical pipeline for this disease [1, 3]. While extensive work on new vaccine approaches is being undertaken, some supported by the EU, it seems likely that it will take two generations to reach widespread success, if indeed this can ever be achieved.
Thus, there will be a continuing need for drugs to treat TB and particularly ones that are effective against currently virulent strains and mutations of these, particularly those that are already multi-drug resistant.
In the past the European Commission has funded TB research in FP7 with over €100 million (€16.3 million in major vaccine projects (NEWTBVAC), €20.2 million in drugs (MM4TB and ORCHID), €6.3 million on diagnostics and €19 million on epidemiology (TB PAN-NET)) [13]. We are concerned by the lack of funding from the European Commission for TB drug discovery in Horizon 2020. For example, in the EC’s own press release for World TB Day 2015 they made no mention of tuberculosis drug development in Horizon 2020 [12], and the only two projects funded (€26.2 million) concern vaccines (EMI-TB and TBVAC2020). While some work on TB is being done in IMI, the lack of big pharma interest in tuberculosis drug development means that IMI is unsuited to tuberculosis drug development projects. If we are to retain the talented international research teams working to find drugs for TB, which could help fight the rapidly advancing drug resistance, we must fund them in Horizon 2020 rather than putting all EC money on vaccines and hoping it will pay off one day.
1.         Lonnroth, K., et al., Towards tuberculosis elimination: an action framework for low-incidence countries. Eur Respir J, 2015. 45(4): p. 928-52.
2.         WHO, Global Tuberculosis Report. 2013, World Health Organization: Geneva.
3.         WHO. Global Tuberculosis Report 2014. 2014; Available from: http://apps.who.int/iris/bitstream/10665/137094/1/9789241564809_eng.pdf?ua=1.
4.         Jakab, Z., et al., Consolidated Action Plan to Prevent and Combat Multidrug- and Extensively Drug-resistant Tuberculosis in the WHO European Region 2011-2015: Cost-effectiveness analysis. Tuberculosis (Edinb), 2015.
5.         Zignol, M., et al., Drug-resistant tuberculosis in the WHO European Region: an analysis of surveillance data. Drug Resist Updat, 2013. 16(6): p. 108-15.
6.         Velayati, A.A., P. Farnia, and M.R. Masjedi, The totally drug resistant tuberculosis (TDR-TB). Int J Clin Exp Med, 2013. 6(4): p. 307-309.
7.         Abubakar, I., et al., Drug-resistant tuberculosis: time for visionary political leadership. The Lancet Infectious Diseases, 2013. 13(6): p. 529-539.
8.         Dheda, K. and G.B. Migliori, The global rise of extensively drug-resistant tuberculosis: is the time to bring back sanatoria now overdue? The Lancet, 2012. 379(9817): p. 773-775.
9.         Gothi, D. and J.M. Joshi, Resistant TB: Newer Drugs and Community Approach. Recent Pat Antiinfect Drug Discov, 2011. 6(1): p. 27-37.
10.       Udwadia, Z.F., et al., Totally Drug-Resistant Tuberculosis in India. Clinical Infectious Diseases, 2012. 54(4): p. 579-581.
11.       Velayati, A.A., et al., Emergence of new forms of totally drug-resistant tuberculosis bacilli: Super extensively drug-resistant tuberculosis or totally drug-resistant strains in Iran. CHEST Journal, 2009. 136(2): p. 420-425.
13.       Commission, E. World Tuberculosis Day 2015: EU Research to Fight Tuberculosis. 2015; Available from: http://ec.europa.eu/research/index.cfm?pg=world-tuberculosis-day-2015.


The war against Tuberculosis needs new drugs

This is not a call to arms for a war fought with bombs and bullets, but one which is equally catastrophic in terms of human loss. We are fighting armies of the microscopic bacteria that cause tuberculosis (TB), which know no country boundaries, infecting 9 million and killing 1.5 million people per year. Humankind has only been able to defend itself since the discovery of TB drugs nearly 70 years ago, yet the effectiveness of this armament is rapidly dwindling. TB has regrouped, with multidrug-resistant and extensively drug-resistant forms spreading across the globe. This threatens to cripple our health systems with costly and lengthy treatments. The most effective weapons have been drugs; an effective vaccine has eluded us, yet we still invest in the 15 current clinical trials. In stark contrast, there are only 10 drugs for treating TB in clinical development, many of which are new indications for existing drugs or combinations. Given the very high failure rate of costly clinical trials, perhaps just one of these might prove effective, but at what cost? While our knowledge of how TB functions increases, there still need to be more ‘early stage’ drug discovery efforts to feed the clinical pipeline. In the past the European Commission has funded TB research in the FP7 program*, with €16.3 million spent on vaccine projects versus €20.2 million on drug discovery projects in academia and industry. This is a drop in the ocean compared with the billions of dollars needed annually. In 2016 the focus will be on vaccines, although success will likely take generations, with no funding for drug discovery in Horizon 2020 at all. The result will be the loss of our infantry, the many drug researchers who battle on the front lines in their laboratories. If Europe hopes to win the war, it must reverse this retreat from drugs in favor of vaccines and fund these scientists.
This will take courage on the part of politicians and program officers, as drugs are not trendy new technologies that grab headlines, but the reliable workhorse that has held TB at bay. We should not surrender our only opportunity to defeat this nemesis.

*I am involved with the FP7 funded MM4TB project.
