Industry: Pharma
Company Size: ~8500 employees
Sites: Cambridge (UK), Gothenburg (Sweden), and Boston (USA)
1. A centralized, cloud-based solution to make analytical data accessible to all functions within the organization
2. Both raw and processed analytical data stored and made accessible live (immediately reusable)
3. Analytical data made available for the development of machine learning models
AstraZeneca’s Analytical Data Management Strategy
Using Spectrus to drive efficiencies, deliver insights & prepare for AI
ACD/Labs’ applications on the Spectrus Platform were already ingrained in analytical workflows at AstraZeneca. Groups were using these tools to analyze and interpret NMR and LC/MS data and manage analytical knowledge, primarily within the development functions.
Embarking on the Global Analytical Database (GAD) project, the team decided to fully capitalize on their global licenses of Spectrus applications and use ACD/Labs services to integrate hardware and software, automate workflows, and fully realize their vision of centralized, accessible analytical knowledge.
Starting Small
A pilot project for analytical data management was implemented in 2016 to support Oncology Chemistry workflows across the UK and US teams, from inception to candidate nomination. Beginning with a small proof-of-concept deployment allowed the team to demonstrate the advantages of investing in an analytical data management project at scale. Following the success of this deployment, the pilot evolved into a global project, GAD, which includes data for all therapeutic areas and CRO partners and was deployed broadly across AstraZeneca in 2021.
Impact
Benefits Being Realized from GAD
Available Data
Analytical data is available minutes after it has been acquired, and is searchable regardless of where it was acquired. The data is also accessible to automation workflows and to data scientists for ML applications.
Consistent Data
No matter which instruments generate the data, consistent metadata and processing mean that the data is easy to find using one or more search terms, which makes comparison possible and re-use easier.
Time Savings
Accessing historical data for patent and publication writing is quick and easy and no longer requires contacting an analyst to retrieve it.
Easy, Standardized Reporting
Everyday reporting of NMR, MS, and analytical UHPLC data has been standardized across sites, and publishing results is faster than ever. Scientists can gather all the data they need with a few mouse clicks.
Efficiency
Colleagues in downstream workflows no longer need to repeat experiments—they can gather project data easily before they start on refinement and further investigations.
Insights
Scientists are able to pull trends from data that would be difficult to identify without the large volume of standardized data.
AI/ML-ready Data
Analytical data is standardized and engineered, beyond anything previously available, for data science projects.
“We now have a foundational platform where the data is organized and accessible. We’re expanding on that to use it in different ways. This project will continue to grow over time.”
Prakash Rathi, Augmented Drug Design Engineering Lead, R&D IT
An AI Outlook
Standardized and contextualized analytical data becomes available for data science applications.
The teams at AstraZeneca are investing heavily in methods to characterize, understand, and predict data, and are leveraging the ability to quickly build an understanding of relationships between data and certain properties. These are insights that would otherwise require more rigorous experimentation or be unavailable to them.
“We’re embracing the use of analytical data to build models that put it to use beyond its primary purpose. The future application of our analytical data could be pattern recognition algorithms that would make the traditional data interpretation of spectra and chromatograms scientists undertake today, a thing of the past.”
John Ulander, Principal Scientist, Data Science and Modelling
“We are using our analytical data to build our own predictors for NMR spectra. We may be able to use chromatography data to predict retention times and decide on the best purification methods without relying on method screening. That’s both efficient and contributes to more sustainable, green chemistry.”
Richard Lewis, Principal Scientist, R&I
Analytical data is at the center of R&D and raises unique challenges in data digitalization projects. AstraZeneca have successfully digitalized their analytical data, made it broadly accessible, and ensured it can be leveraged for machine learning projects.