Drug discovery: what is it and why does data integrity matter?

Drug discovery is a complex journey that involves making crucial decisions at various stages, based on key data coming from multiple streams. The integrity of that data, and the speed with which a company can act on it, is make-or-break in our fast-paced, data-rich world. At Kaleidoscope, we understand the importance of getting data integrity right. In this post, we break down the key phases and data considerations of the drug discovery process.


Summary Points

  • Drug discovery is complicated. The journey from a target to a drug is a long and convoluted one with many players involved along the way. Across the industry, billions of dollars are poured into R&D each year, yet only a handful of drugs ever make it to market.
  • Good data practices leading up to pivotal review points can make or break a drug. Two key review points along the process, the Investigational New Drug (IND) application and the New Drug Application (NDA), are where a company goes through intense regulatory scrutiny to determine whether the potential drug is safe and ready for patients. Having practical and efficient data management systems in place leading up to these inflection points is key to successful approval of the prospective drug.
  • Not all tools are fit-for-purpose. Drug discovery brings unique challenges, like managing critical decisions, information hand-off, and high volumes of iterative data, over prolonged periods of time. The tools you use need to support this model. At Kaleidoscope, we set out to build software made for exactly these kinds of dynamic, collaborative, and high-stakes projects.

Drug discovery is a time-consuming and expensive process, where the chances of success are low but the upsides are massive. Bringing a drug to market takes 10+ years on average and can cost upwards of $2B, with key inflection points that can make or break the drug’s prospects. The process typically begins with identifying a target, often a specific biological molecule or pathway associated with a disease. Researchers then set out to find or design molecules that can modulate this target, with the ultimate goal of developing a safe and effective therapeutic intervention. The process can broadly be divided into several key stages:

  1. Research and Development
  2. Preclinical Studies
  3. Clinical Trials
  4. Review and Approval by the FDA

In this post, we’ll discuss the main aspects of each step of this long process, as well as the key industry players involved, through the lens of small molecule drug discovery.

Research and Development

Research and Development, or R&D, starts with validating the target to be pursued for a new drug campaign. Eligible targets can come from a variety of sources, including new insights about the mechanisms of a disease, breakthroughs in a new technology, or repurposing existing drugs based on unexpected side effects. Years of work and countless dollars go into making an actual drug, so the first step, validating that the predicted target is truly connected to the disease of interest, is key to homing in on targets that have a higher chance of success. Because most studies are done in model systems, it’s impossible to know for sure whether the results will translate to humans, but key experiments can help to build a strong case early in the process.

The next phase, hit discovery and lead identification, involves screening thousands (or even millions) of compounds in order to find starting chemical compounds that have some form of interaction with the target. Promising compounds are then selected to undergo rounds of iterative enhancements in a process called lead optimization. The goal of this process is to improve a compound’s overall drug profile, specifically on metrics like: 

  • Affinity: how tightly does it bind?
  • Specificity: how specifically does it bind the intended target?
  • Selectivity: how well does it avoid other targets?
  • Potency: how well does it induce the intended biological response?
  • Pharmacological properties: how well does it behave as a drug once it is in the body?

Once a compound, or series of compounds, hits enough of these marks to be promising, it is selected as the lead candidate (or candidates) and moved on to the next stage of the process: preclinical studies.

Preclinical Studies

In this stage, the selected compounds undergo rigorous testing in laboratory and animal models to assess their safety and efficacy. These tests are crucial to determining the full pharmacological profile of a drug as well as any potential toxicities that would make it too dangerous to give to people.

Because every disease and drug is a bit different, there isn’t an exact playbook for what needs to be done during this phase. Broadly speaking, these studies focus on triaging the lead compound through three key areas: efficacy evaluation, pharmacology, and safety assessment.

  • Efficacy Evaluation: Preclinical studies measure a drug candidate's ability to produce desired therapeutic effects with the goal of predicting its potential effectiveness in humans. These tests are done across a number of in vitro (in test tube) and in vivo (in organism) models, to understand how effectively and reproducibly the drug candidate has the desired effect. 
  • Pharmacology: Researchers investigate two main aspects of the pharmacology of the drug. The first is pharmacodynamics (PD), which describes what the drug does to the body. The second is pharmacokinetics (PK), which describes what the body does to the drug. PK can be further broken down into absorption, distribution, metabolism, and excretion (ADME), the individual steps by which the body processes a new drug. These steps need to be understood in order to predict how long a drug will remain available in the body and thus be able to exert an effect. Collectively, this information guides dosage selection and regimen optimization for clinical trials.
  • Safety Assessment: Preclinical studies rigorously evaluate a drug candidate's safety profile, identifying potential toxic effects in animal models before the drug is given to people. To minimize the risk of missing potential toxicities, studies must be done in at least two model organisms. Of course, no model is perfectly representative, and new technologies like in silico modeling and “organs on a chip” based on human cells aim to replace animal models. For now, it’s important to keep this step and not rely on a single model species, because of the risk of over-indexing on one source of positive signal.
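
To make the pharmacokinetics point above concrete, here is a minimal Python sketch of how an elimination rate relates to how long a drug stays available in the body. It assumes a one-compartment model with first-order elimination after a single intravenous dose, which is a textbook simplification and not a description of any particular compound. The function names and all numbers are hypothetical.

```python
import math


def half_life(k_elim: float) -> float:
    """Half-life in hours for first-order elimination (k_elim in 1/hour)."""
    if k_elim <= 0:
        raise ValueError("k_elim must be positive")
    return math.log(2) / k_elim


def concentration(c0: float, k_elim: float, t: float) -> float:
    """Concentration at time t (hours), assuming C(t) = C0 * exp(-k * t)."""
    return c0 * math.exp(-k_elim * t)


def time_above(c0: float, k_elim: float, threshold: float) -> float:
    """Hours until the concentration falls to `threshold` (0.0 if it starts at or below it)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if c0 <= threshold:
        return 0.0
    return math.log(c0 / threshold) / k_elim
```

For instance, with a hypothetical 4-hour half-life, a starting concentration of 10 mg/L falls to 2.5 mg/L after 8 hours (two half-lives). Real compounds often need richer models, so this is only a way to build intuition.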

In parallel with the above, Chemistry, Manufacturing and Controls (CMC) processes are done to ensure that the lead compound can be manufactured reproducibly and at scale. Especially as the lead compound is tested across a variety of models and assays, it’s critical that the compound quality is consistent across manufacturing batches so that the data can be accurately compared. 

The culmination of the work done during this key phase is an Investigational New Drug (IND) application. Before a drug candidate can advance to human clinical trials, it must be cleared by regulatory agencies such as the FDA in the United States. This clearance is sought through the submission of an IND application, which outlines the preclinical data supporting the safety and efficacy of the drug candidate, the manufacturing profile and plan for making the compound at scale, and the proposed clinical study plans. Once the application is compiled, the sponsor company submits it to the regulatory agency for review.

Clinical Trials

Clinical trials are the next critical phase in the drug development process, where potential treatments are rigorously evaluated in human subjects. Divided into several phases, clinical trials aim to assess the safety, efficacy, and optimal dosage of a drug candidate. 

  • Phase I trials involve only a small number of healthy volunteers and focus on safety and dosage escalation. 
  • Phase II trials expand the study to a larger group of patients to evaluate efficacy and further assess safety. 
  • Phase III trials involve an even larger patient population and compare the new treatment to standard therapies to determine its effectiveness. 
  • Phase IV refers to the period after a drug is approved, during which people who receive the new drug are monitored for long-term risks and benefits.

These main phases of a clinical trial can span many years and many clinical sites, depending on the drug and the outcomes being tested. Successful completion of clinical trials sets the stage for the compilation of a New Drug Application (NDA), an exhaustive document (essentially the drug’s magnum opus) presenting comprehensive evidence from all the preceding years of work.

Review and Approval

Following the submission of an NDA, the FDA conducts a thorough review to assess the safety, efficacy, and quality of the proposed new drug. This involves experts from various disciplines scrutinizing the data presented in the NDA to ensure compliance with rigorous scientific standards and regulatory guidelines.

The FDA review process encompasses several stages across 6-12 months. Initially, an administrative review verifies the NDA's completeness and adherence to regulatory requirements. Pharmacologists and toxicologists then evaluate preclinical data to understand the drug's pharmacological activity and safety profile.

Clinical reviewers analyze data from all trial phases to assess efficacy, safety, and dosage regimens, while chemists evaluate the drug's formulation and manufacturing process for consistency and reliability. Regulatory writers ensure labeling accurately reflects the drug's indications and dosing instructions.

Upon completion of the review, the FDA issues a decision on NDA approval. Approved drugs may be subject to post-marketing requirements for further monitoring. FDA approval marks a significant milestone, allowing the new drug to be marketed and distributed to patients. It’s estimated that only about 14% of all drugs that enter Phase I will eventually gain approval, though this varies by drug type and indication.
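
Taking the roughly 14% figure above at face value, a short Python sketch shows what it implies for a pipeline. The function names are hypothetical, and the second function assumes every candidate succeeds or fails independently with the same probability, which real portfolios rarely satisfy.

```python
def candidates_per_approval(success_rate: float) -> float:
    """Average number of Phase I entrants needed for one approval."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1 / success_rate


def prob_at_least_one_approval(success_rate: float, n_candidates: int) -> float:
    """Chance that at least one of n independent candidates is approved."""
    if not 0 <= success_rate <= 1:
        raise ValueError("success_rate must be in [0, 1]")
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    return 1 - (1 - success_rate) ** n_candidates
```

At a 14% success rate, this works out to about seven Phase I entrants per approved drug, and a portfolio of five independent candidates has a little over a 50% chance of producing at least one approval.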

Major Players in the Ecosystem

The journey to a new drug is a long one and many people play different roles along the way. At the very beginning of this trajectory are academia and universities, which serve as the hubs of new innovations and discoveries. Biotechs then spin out of universities, or pull discoveries from published research literature, in order to test the new innovations and their potential for commercialization. They do a lot of the heavy lifting in terms of early R&D up through preclinical and early clinical trials, often in conjunction with Contract Research Organizations (CROs) that specialize in key types of assays. Finally, big pharmaceutical companies will invest in, partner with, or outright buy the biotech companies or their assets as the research looks more and more promising.

While this is an oversimplification of the ecosystem, a key reason why it plays out this way is that early-stage discoveries, while new and exciting, are also risky. Far more of these early discoveries fail than pan out. Academia and biotech are more willing to take on that risk, and it’s also cheaper to do so at the earlier stages. On the other hand, clinical trials are very expensive and resource-intensive, so as biotechs move assets through their pipeline, it’s advantageous to secure a Pharma partnership or sell the asset entirely. Money made from the successful drugs is then invested back into early discovery research and biotechs, and the process continues.

Data Management and Integrity

The data generated at each stage of the drug discovery process is invaluable and serves as the foundation for key decision-making all along the way. The integrity of this data is crucial for several reasons:

  • Reliability and Reproducibility: High-quality, reliable data ensures that experiments can be repeated by other researchers, leading to consistent results. This is essential for validating findings, building a robust scientific foundation, and putting together a compelling research package for regulators or potential partners.
  • Regulatory Compliance: Drug development is subject to strict regulatory standards. Navigating key data inflection points like the IND application, NDA, and portfolio evaluations by partnering companies is less painful when a team knows beforehand exactly where all the key data is and that it aligns with regulatory requirements.
  • Cost and Time Efficiency: Inaccurate or incomplete data can lead to misguided decisions, resulting in wasted time and resources. High data integrity accelerates the drug development process by providing trustworthy information for efficient decision-making.
  • Patient Safety: Ultimately, drug discovery is about improving patient outcomes. Ensuring the accuracy and reliability of data is vital to developing safe and effective medications.
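
One small, generic practice that supports the points above is recording a checksum for every raw data file, so that later changes can be detected. The sketch below is an illustration only, not a description of how Kaleidoscope or any regulator handles data. It uses Python's standard `hashlib` and `pathlib` modules, and the function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file under data_dir (by relative path) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of_file(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files added, removed, or modified between two manifests."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

A manifest built when an experiment is completed can be compared with one built before a submission or hand-off. A checksum only shows that a file has not changed; it says nothing about whether the data was correct to begin with.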

At Kaleidoscope, we understand these crucial needs and have developed tools that help facilitate data management best practices and storytelling in intuitive and user-friendly ways, both for driving internal projects forward and for communicating effectively with external investors or board members. In everything that we do, our goal is to build tools purpose-made for the drug discovery process, so teams don’t have to do the work of force-fitting generic software onto their specific workflows.

By helping researchers uphold the highest standards in data collection, analysis, and reporting, we contribute to the advancement of science and the development of innovative therapies that have the potential to transform lives. In the dynamic world of drug discovery, data integrity is not just a best practice; it's a cornerstone of success.


If you want to chat more about anything we wrote, or you’re interested in finding a way to work together, let us know!