How AstraZeneca is Overcoming Obstacles to Analytical Data Access

Industry

Pharma

Company Size

~8500 employees

Location

Sites

Cambridge (UK), Gothenburg (Sweden) and Boston (USA)

The Data Accessibility Problem

AstraZeneca is a global pharmaceutical company with R&D sites around the world. As a leader in the Oncology R&D division with responsibility for the UK analytical team, Nichola Davies recognized that analytical data was difficult for scientists to find and access, which led to inefficiencies and duplicated effort.

“We had existing methods to share live analytical data to individual scientists, via emails with data links and attachments. But it was a challenge to find historical data, or data acquired by others. The process involved numerous applications and steps to retrieve data, and the process was different for each function. It was even more difficult accessing data generated by our colleagues in the Oncology R&D team in the US which was organized very differently.”
Nichola Davies, Director, Structural Chemistry, Oncology R&D

The Cost of Ineffective Analytical Data Management

The initial synthesis of a drug product occurs years before it transitions into the development environment. Analytical teams in discovery collect large volumes of data to understand structure and develop chromatographic methods for purification (including complex chiral methods) in the lead-up to candidate nomination. Unfortunately, this work is often lost in the transition to development. Analytical teams in development regenerate analytical data, reassign spectra, and redevelop chromatographic methods because their organizations lack effective analytical data management that would let them build on the data and knowledge acquired earlier in discovery. At best they rely on personal networks and email, which leaves data access siloed, inconsistent, and time-consuming.

Nichola and her team recognized this was true for R&D at AstraZeneca and decided to be the changemakers, starting with their own division.

Meet the Trailblazers

Nichola Davies

Richard Lewis

Johan Ulander

Prakash Rathi

Goals

Breaking Down Barriers to Data Access

  1. A centralized, cloud-based solution that makes analytical data accessible to all functions within the organization
  2. Raw and processed analytical data stored and accessible live (immediately reusable)
  3. Analytical data available for the development of machine learning models

AstraZeneca’s Analytical Data Management Strategy

Using Spectrus to drive efficiencies, deliver insights & prepare for AI

ACD/Labs’ applications on the Spectrus Platform were already ingrained in analytical workflows at AstraZeneca. Groups were using these tools to analyze and interpret NMR and LC/MS data and manage analytical knowledge, primarily within the development functions.

Embarking on the Global Analytical Database (GAD) project, the team decided to fully capitalize on their global licenses of Spectrus applications and use ACD/Labs services to integrate hardware and software, automate workflows, and fully realize their vision of centralized, accessible analytical knowledge.

Starting Small

A pilot project for analytical data management was implemented in 2016 to support Oncology Chemistry workflows across the UK and US teams, from inception to candidate nomination. Beginning with a small proof-of-concept deployment allowed the team to demonstrate the advantages of investing in an analytical data management project at scale. Following the success of this deployment, the work evolved into a global project, GAD, covering data for all therapeutic areas and CRO partners. GAD was deployed broadly across AstraZeneca in 2021.

AstraZeneca’s Global Analytical Database

The workflow enabled by ACD/Labs technology makes live and processed data accessible to all R&D functions in AstraZeneca, including data science.

The automated analytical data management system at AstraZeneca collects raw NMR and LC/MS data from analytical instruments, extracts metadata, adds structures from the ELN or registry, processes the data according to the datatype, and creates a database record in a cloud-based, searchable repository. Raw instrument data files are copied to an archive for regulatory and IP purposes.
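The automated flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AstraZeneca's or ACD/Labs' implementation: the file-naming convention, the `REGISTRY` lookup, the `PROCESSORS` table, and the in-memory `database` list are all hypothetical stand-ins for instrument metadata, the ELN or registry, datatype-specific processing, and the cloud repository.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for an ELN/registry lookup: sample ID -> structure (SMILES).
REGISTRY: Dict[str, str] = {"AZ-0001": "CCO"}

# Hypothetical per-datatype processing steps; real processing would be far richer.
PROCESSORS: Dict[str, Callable[[bytes], str]] = {
    "NMR": lambda raw: f"processed NMR spectrum ({len(raw)} bytes)",
    "LCMS": lambda raw: f"processed LC/MS chromatogram ({len(raw)} bytes)",
}


@dataclass
class Record:
    """One searchable database entry created per acquired data file."""
    sample_id: str
    datatype: str
    instrument: str
    structure: Optional[str]
    processed: str
    archive_path: Path


def extract_metadata(raw_file: Path) -> Dict[str, str]:
    """Parse metadata from an assumed '<sample>__<datatype>__<instrument>.raw' name."""
    sample_id, datatype, instrument = raw_file.stem.split("__")
    return {"sample_id": sample_id, "datatype": datatype, "instrument": instrument}


def ingest(raw_file: Path, archive_dir: Path, database: List[Record]) -> Record:
    """Collect a raw file, enrich it, process it, record it, and archive the original."""
    meta = extract_metadata(raw_file)                       # extract metadata
    structure = REGISTRY.get(meta["sample_id"])             # add structure, if registered
    processed = PROCESSORS[meta["datatype"]](raw_file.read_bytes())  # process by datatype
    archive_dir.mkdir(parents=True, exist_ok=True)
    archive_path = Path(shutil.copy2(raw_file, archive_dir / raw_file.name))  # archive raw copy
    record = Record(meta["sample_id"], meta["datatype"], meta["instrument"],
                    structure, processed, archive_path)
    database.append(record)                                 # create the database record
    return record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "AZ-0001__NMR__bruker01.raw"
        raw.write_bytes(b"fake FID bytes")
        db: List[Record] = []
        print(ingest(raw, Path(tmp) / "archive", db))
```

Keeping the processing step behind a lookup keyed on datatype mirrors the idea that each kind of data is handled by its own routine, while every file still ends up as the same kind of record.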

Global Access to Data

Hundreds of analytical, medicinal, and computational chemists access the GAD across the globe (Waltham and Gaithersburg, US; Cambridge and Macclesfield, UK; Gothenburg, Sweden; and Oss, the Netherlands).

Integrated Instruments

  • 92 analytical instruments*
  • 33 Bruker NMR instruments
  • 59 Agilent Q-TOF and Waters MassLynx LC/MS instruments

*New instruments are continually added in ongoing efforts to further expand the data being managed in the GAD.

Impact

Benefits Being Realized from GAD

Available Data

Analytical data is available minutes after acquisition and is searchable regardless of where it was acquired. It is also accessible to automation workflows and to data scientists for ML applications.

Consistent Data

No matter which instrument generated the data, consistent metadata and processing mean that it is easy to find using one or more search terms, which makes comparisons possible and re-use easier.
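To illustrate why consistent metadata matters for search, here is a small sketch in which every record carries the same fields, so a single function can filter on one or more terms. The field names, sample values, and the `search` helper are hypothetical and chosen only for illustration; they do not reflect the actual GAD schema or query interface.

```python
from typing import Dict, List

# Hypothetical records; field names and values are illustrative only.
RECORDS: List[Dict[str, str]] = [
    {"sample_id": "S-001", "technique": "NMR", "site": "Cambridge", "project": "PRJ-A"},
    {"sample_id": "S-001", "technique": "LC/MS", "site": "Waltham", "project": "PRJ-A"},
    {"sample_id": "S-002", "technique": "NMR", "site": "Gothenburg", "project": "PRJ-B"},
]


def search(records: List[Dict[str, str]], **terms: str) -> List[Dict[str, str]]:
    """Return records whose metadata matches every supplied term, ignoring case."""
    wanted = {key: value.lower() for key, value in terms.items()}
    return [
        rec for rec in records
        if all(rec.get(key, "").lower() == value for key, value in wanted.items())
    ]


if __name__ == "__main__":
    # One term: all NMR data, wherever it was acquired.
    print(search(RECORDS, technique="NMR"))
    # Two terms: data for one sample acquired at one site.
    print(search(RECORDS, sample_id="S-001", site="Waltham"))
```

Because each record uses the same keys regardless of instrument or site, adding a search term only narrows the result; no per-instrument special cases are needed.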

Time Savings

Accessing historical data for patent and publication writing is quick and easy and no longer requires contacting an analyst to retrieve it.

Easy, Standardized Reporting

Everyday reporting of data (NMR, MS, and analytical UHPLC) has been standardized across sites, and publishing results is faster than ever. Scientists can gather all the data they need with a few mouse clicks.

Efficiency

Colleagues in downstream workflows no longer need to repeat experiments—they can gather project data easily before they start on refinement and further investigations.

Insights

Scientists can identify trends that would be difficult to spot without a large volume of standardized data.

AI/ML-ready Data

Analytical data is standardized and engineered for data science projects to a degree not previously available.

“We now have a foundational platform where the data is organized and accessible. We’re expanding on that to use it in different ways. This project will continue to grow over time.”
Prakash Rathi, Augmented Drug Design Engineering Lead, R&D IT

An AI Outlook

Standardized and contextualized analytical data becomes available for data science applications.

The teams at AstraZeneca are investing heavily in methods to characterize, understand, and predict data, and are using the platform to quickly build an understanding of relationships between data and certain properties. These insights would otherwise require more rigorous experimentation or be unavailable to them.

“We’re embracing the use of analytical data to build models that put it to use beyond its primary purpose. The future application of our analytical data could be pattern recognition algorithms that would make the traditional interpretation of spectra and chromatograms that scientists undertake today a thing of the past.”
Johan Ulander, Principal Scientist, Data Science and Modelling

“We are using our analytical data to build our own predictors for NMR spectra. We may be able to use chromatography data to predict retention times and decide on the best purification methods without relying on method screening. That’s both efficient and contributes to more sustainable, green chemistry.”
Richard Lewis, Principal Scientist, R&I

Analytical data is at the center of R&D and raises unique challenges in data digitalization projects. AstraZeneca have successfully digitalized their analytical data, made it broadly accessible, and ensured it can be leveraged for machine learning projects.
