Finally after 18 years – Time for a Research Statement

It's been something I continually grapple with. As a non-academic doing a diverse range of scientific projects, how could I come up with a vision statement (if indeed I needed one, as I have managed for 18 years without one)? A few days ago I put a research statement together. I think it basically summarizes where I am now. I am also so ticked off every time I get on a call with people or meet them face to face and they continually want ‘proofs of concept’ without going to the literature and looking at the body of work I have put together with a huge number of collaborators over the years. So here goes: why not make my research statement public so that people can refer to this too? Any typos or vagueness – I take credit for.

My research over the past 18 years has predominantly fused applied preclinical drug discovery with computational methods in both large pharmaceutical and small software companies. My contribution has been to apply and develop methods and technologies to identify liabilities in small molecules as well as improve the hit rate of screening. The fundamental underpinning of this work is that we can learn from data that is available and we can test the predictions we make. My research has covered enzymes, transporters, ion channels and receptors. It has also been applied to the whole cell and the whole organism. Most recently, the application of machine learning methods with collaborators to large phenotypic screening and in vivo datasets for tuberculosis may revolutionize how we address diseases where there is a limited research budget and an urgent need to discover new therapeutics. Separately, with collaborators, I am at the forefront of evolving computational approaches into apps for mobile devices. Independently, I am applying the computational and intellectual insights gained through my pharmaceutical experience to rare disease research, an area which has a deficit of scientists to cover over 7000 diseases. Integrating the computational learning methods I have applied over the years for identifying compounds and predicting targets can provide a head start of a year or more over even established competitors. The big picture for this research is performing intelligent drug discovery and putting it to good effect.

My research is currently funded at Collaborative Drug Discovery, Inc. by The Bill and Melinda Gates Foundation, NIAID (2 grants), NCATS (1 grant), and the European Commission. As Collaborations in Chemistry, I also consult widely with pharmaceutical and consumer product companies as well as collaborate independently with several researchers at Johns Hopkins University (1 grant) and Rutgers (3 grants) as a consultant. These efforts illustrate both my extensive collaborative network and my efforts at independently funding my work, which I intend to expand. My research interests are fundable in the long term. The following represent my major independent and collaborative research foci:

Computational Toxicology: Computational methods have been widely applied to toxicology across pharmaceutical, consumer product and environmental fields over the past decade. For 18 years I have been actively involved in developing computational models for hepatotoxicity (e.g. for drug-induced liver injury and Pregnane X Receptor), cardiotoxicity (hERG), drug-drug interactions (P450) and drug transporters (NTCP, ASBT, MATE, P-gp). My recent efforts in collaborations with Dr. James Polli (University of Maryland) and Dr. Stephen Wright (University of Arizona) have led to an array of novel models and inhibitors for human drug transporters without needing to screen large numbers of compounds randomly. This has enabled rapid understanding of molecular features required for binding as well as extensive use of machine learning methods. Considerable progress has been made in computational toxicology over the past decade, both in model development and in the availability of larger scale or ‘big data’ models. The future efforts in toxicology data generation will likely provide us with hundreds of thousands of compounds that are readily accessible for machine learning models. These models will cover relevant chemistry space for pharmaceutical, consumer product and environmental applications.

Bigger Data, Collaborative Tools and the Future of Predictive Drug Discovery: Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools as either free websites or software as a service (SaaS) commercial offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers either openly or through secure transactions using commercial tools. A major challenge in the future will be how such databases and software approaches handle larger data as it accumulates from high throughput screening and enable the user to draw insights, enable predictions and move projects forward. My work with an array of collaborators has developed and applied such models and used tools to foster collaborations in areas such as tuberculosis and other neglected diseases. I am active in developing more tools and integrating methods.

Whole cell and whole organism models for drug discovery: Over 5 years of initially independent then collaborative work on applying machine learning methods to tuberculosis phenotypic screening data has resulted in identification of hundreds of new active compounds. These efforts at learning from over 350,000 compounds in the public domain exemplify how we can use such data, developed at great public cost, to shape future decisions. With Dr. Joel Freundlich (Rutgers University) and collaborators we have recently extended these efforts to predicting the response of the mouse model to antituberculars. This is valuable because selecting and translating in vitro leads for a disease into molecules with in vivo activity in an animal model of the disease is a challenge that takes considerable time and money. Recent years have seen whole-cell phenotypic screens of millions of compounds yielding over 1500 inhibitors of Mycobacterium tuberculosis (Mtb). This is growing with the efforts of the TB Drug Accelerator, which has already doubled this number. These hits must be prioritized for testing in the mouse in vivo assay for Mtb infection, a validated model utilized to select compounds for further testing. We recently demonstrated learning from in vivo active and inactive compounds using machine learning classification models (Bayesian, Support Vector Machines and recursive partitioning) built from a dataset of 773 compounds. The Bayesian model correctly predicted 8 of the 11 additional in vivo actives that were held out of the model as an external test set. Curation of seventy years of Mtb data shows we can therefore provide statistically robust computational models to focus resources on in vivo active small molecule antituberculars. This points to a cost-effective way of prioritizing in vivo testing that could also be applied to other diseases.
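To make the idea of training a Bayesian classifier on actives and inactives and then checking it against an external test set more concrete, here is a minimal sketch in Python. It is not the published model: the fingerprints are made-up sets of integer feature ids, the function names `train_naive_bayes`, `score` and `recall` are hypothetical, and the scoring is a simplified Laplace-smoothed naive Bayes that only sums weights for features present in a compound.

```python
import math
from collections import Counter


def train_naive_bayes(fingerprints, labels):
    """Fit per-feature log-odds weights with Laplace (add-one) smoothing.

    fingerprints: list of sets of integer feature ids (toy stand-ins for
    structural fingerprints); labels: list of bools (True = active).
    Assumes at least one active and one inactive in the training set.
    """
    n_active = sum(labels)
    n_inactive = len(labels) - n_active
    active_counts, inactive_counts = Counter(), Counter()
    for fp, is_active in zip(fingerprints, labels):
        (active_counts if is_active else inactive_counts).update(fp)

    weights = {}
    for feature in set(active_counts) | set(inactive_counts):
        p_active = (active_counts[feature] + 1) / (n_active + 2)
        p_inactive = (inactive_counts[feature] + 1) / (n_inactive + 2)
        weights[feature] = math.log(p_active / p_inactive)
    prior = math.log(n_active / n_inactive)
    return weights, prior


def score(fingerprint, weights, prior):
    """Sum the prior and the weights of known features; > 0 means 'active'."""
    return prior + sum(weights.get(feature, 0.0) for feature in fingerprint)


def recall(external_actives, weights, prior):
    """Fraction of held-out actives that the model scores as active."""
    hits = sum(1 for fp in external_actives if score(fp, weights, prior) > 0)
    return hits / len(external_actives)


# Hypothetical training data: three actives followed by three inactives.
train_fps = [{1, 2, 3}, {1, 2, 4}, {2, 3, 5}, {6, 7, 8}, {6, 7, 9}, {7, 8, 10}]
train_labels = [True, True, True, False, False, False]
weights, prior = train_naive_bayes(train_fps, train_labels)

# Hypothetical external test set of actives that were never used in training.
external_actives = [{1, 2, 9}, {6, 7, 3}, {2, 5, 11}]
print(f"external recall: {recall(external_actives, weights, prior):.2f}")
```

In this toy example two of the three held-out actives are recovered. For the figures reported above, the same measure would be 8/11, or roughly 73%.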

New Target Prediction and Visualization Tools: With collaborators Alex Clark (Molecular Materials Informatics) and Malabika Sarker (SRI), I recently developed a freely available mobile app (TB Mobile) for both iOS and Android platforms that displays Mycobacterium tuberculosis (Mtb) active molecule structures and their targets with links to associated data. The app was developed to make target information available to as large an audience as possible. We recently updated it to include enhancements that use an implementation of ECFP_6 fingerprints that we have made open source. Using these fingerprints, the user can propose compounds with possible anti-TB activity, and view the compounds within a dynamic cluster landscape. Proposed compounds can also be compared to existing target data, using a naïve Bayesian scoring system to rank probable targets. This is important because phenotypic screening does not provide any indication of potential targets. We have curated an additional 60 new compounds and their targets for Mtb and added these to the original set of 745 compounds, giving 805 compounds with associated targets. We have also curated 20 further compounds (many without targets in TB Mobile) to evaluate this version of the app. TB Mobile can now manage a small collection of compounds that can be imported from external sources, or exported by various means such as email or app-to-app inter-process communication. This means that TB Mobile can be used as a node within a growing ecosystem of mobile apps for cheminformatics. It can also cluster compounds and use internal algorithms to help identify potential targets based on molecular similarity. TB Mobile therefore represents a valuable dataset, data-visualization aid and prediction tool, and the approach could be cloned to create similar tools for other diseases.
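As a rough illustration of suggesting targets from molecular similarity, the sketch below ranks targets for a query compound by its best Tanimoto similarity to any compound annotated with that target. This is a simpler nearest-neighbour stand-in, not the naïve Bayesian scoring used in TB Mobile, and nothing here reflects the app's actual code: the fingerprints are invented sets of integer feature ids rather than real ECFP_6 output, and the target names and the `tanimoto` and `rank_targets` helpers are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_targets(query, reference):
    """Rank targets by the highest similarity of any annotated compound.

    query: set of feature ids for the proposed compound.
    reference: list of (fingerprint_set, target_name) pairs.
    Returns (target_name, best_similarity) pairs, most similar first.
    """
    best = {}
    for fingerprint, target in reference:
        similarity = tanimoto(query, fingerprint)
        if similarity > best.get(target, -1.0):
            best[target] = similarity
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical curated compounds with placeholder target labels.
reference = [
    ({1, 2, 3, 4}, "target_A"),
    ({1, 2, 5}, "target_A"),
    ({6, 7, 8}, "target_B"),
    ({2, 6, 9}, "target_C"),
]
query = {1, 2, 3, 9}

ranking = rank_targets(query, reference)
for target, similarity in ranking:
    print(f"{target}: {similarity:.2f}")
```

A ranked list like this only suggests where to look first. Any proposed target would still need experimental confirmation.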

Multifaceted Roles of Ultra-Rare and Rare Disease Parents / Patients in Drug Discovery: Individual parents and patients are increasingly doing more to fund, discover and develop treatments for rare and ultra-rare diseases that afflict their children, themselves or their friends. They are performing roles in business development that would be classed as entrepreneurial, while their organizational roles in driving the science are in some cases equivalent to those of Principal Investigators. These roles are in addition to their usual one as advocates. Through their efforts and those of the collaborative networks which they have developed, they may be in a position to disrupt drug discovery. For several years I have been working with at least three such rare disease foundations, writing NIH grants (RDCRN, NeuroNext, SBIR etc.) and starting companies. Through these groups I am involved in rich scientific networks to drive therapeutic discovery and development at a fraction of the cost of doing this in big pharma. We have also developed a free mobile app to encourage open sharing of research and information for rare and neglected diseases, called Open Drug Discovery Teams.

Other Collaborative work: With collaborators I have been involved in evaluating the quality of the data underpinning the computational models we develop and how the data is generated. Most recently this resulted in a fundamental study into how apparent biological activity is affected by how a compound is dispensed. I have used simple cheminformatics approaches to predict immunoassay cross reactivity for drugs of abuse and “bath salts”, which is important for emergency medicine. I have also co-developed the first mobile app for green chemistry, called Green Solvents, which has an educational component to change behavior. In addition, I have a long running project using computational models to evaluate nuclear hormone receptor function and evolution as well as to search for novel PXR antagonists.

Future Work: I can see the efforts above evolving and overlapping to an extent. I will continue to use computational approaches to identify hypotheses and compounds to test. Neglected diseases with high levels of phenotypic screening are creating massive training sets, which suggests that we can shift the focus from random screening to more focused selection with an idea of the compounds’ potential targets. At the opposite end of the scale, rare diseases have little data, so the challenge will be to mine the accumulated human knowledgebase and focus on the most promising approaches to have an immediate impact. I see this as a huge future opportunity for which my prior efforts have prepared me. My other collaborative efforts will nurture adjacent and new technologies and ideas which may inform my major research foci. New problems to work on may represent new opportunities to show how computational approaches can impact drug discovery. It will be important for the next generation that we train others who can extend the reach of these efforts. Mobile apps could potentially bring powerful software into the hands of anyone. With that possibility come challenges in ensuring that data are interpreted correctly. My vision is to have an active role as we transition such science from the desktop to the myriad of mobile devices that could be used in our own home environments to do drug discovery research and perhaps make the next great discoveries. Bringing a degree of independence and autonomy to drug discoverers using software tools could also fundamentally change the way research is done.
