Efficiency in R&D makes us more productive and frees time to optimize and innovate. Our recent virtual symposium “Driving Efficiency with Spectrus” brought together scientists from industry-leading organizations (including GSK, AstraZeneca, Genentech, and more) to share the strategies they have implemented to:

  • Accelerate structure elucidation and verification by NMR and LC/MS
  • Make method development efficient, robust, and sustainable
  • Manage high throughput chemical and analytical data to speed research

This episode recaps all 8 presentations and how ACD/Labs’ software on the Spectrus Platform was used to support these workflows.

Read the full transcript

Driving Efficiency with Spectrus

Sarah Srokosz  00:08

At ACD/Labs, we are fortunate to have customers from many of the most innovative chemical and pharmaceutical companies in the world. It is always thought-provoking to see what these folks are working on, and how our software helps them in their research.

Baljit Bains  00:21

One of the ways we showcase this incredible work is our annual virtual symposium. This event includes presentations from scientists who are working on structure characterization, method development, analytical data management, and more.

Jesse Harris  00:33

Today, we’re going to be discussing some of the insights from this year’s symposium, which focused on the theme of driving efficiency with Spectrus.

Baljit Bains  00:42

Hi, I’m Bally.

Sarah Srokosz  00:43

I’m Sarah.

Jesse Harris  00:44

And I’m Jesse. We’re the hosts of The Analytical Wavelength, a podcast about chemistry and chemical data.

Baljit Bains  00:51

In this episode, we will share previews and highlights of the topics that were covered. If you want to watch the full presentations, they’re all available on our website.

Sarah Srokosz  00:59

We kicked off the virtual symposium with a presentation from our collaborators over at Merck KGaA, or MilliporeSigma as it is known in the US and Canada. The presenter, Coralie Leonard, is a digital business model developer in analytical chemistry, with a strong background in reference materials and analysis standardization. She’s now the project lead for their digital reference material platform, ChemisTwin.

Baljit Bains  01:24

What’s a digital reference material?

Sarah Srokosz  01:27

It’s essentially certified analytical data (in this case, NMR) of a reference material, which can be used for comparative structure verification and, eventually, quantitation.

Jesse Harris  01:37

So how do they create them?

Sarah Srokosz  01:40

In her presentation, Coralie goes into more detail about how they prepare them and then explains how ChemisTwin leverages NMR Workbook’s structure verification technology to enable chemists to qualitatively verify their structures in a third of the time.

Coralie Leonard  01:55

As a starting point at Merck, we have a wide library of physical reference materials, over 25,000 physical products that we are offering in the area of reference materials. From this library, we take experimental NMR spectra in our own facility; these are mostly 600 megahertz spectra. Then we take the known structure together with our experimental spectrum, and we input them into NMR Workbook Suite. NMR Workbook Suite will predict the spectrum based on the molecular structure, and then we will correct that prediction manually and optimize it. Once we have a perfect match between the prediction and the reality, we do a quality control, quality assurance release of the data (so it gets verified by an additional two people), and once it’s released, we call this a digital reference material. Finally, when you as a customer go onto ChemisTwin, you will upload your own spectrum, you will select your digital reference material, and here a customized version of NMR Workbook will do the comparison between your sample and the digital reference material data. In this case, ACD/Labs only needs to predict the differences that can be accounted for by your experimental setting, if you have a different solvent or a different magnetic field, but it will no longer do a prediction from the molecular structure alone, as the prediction has already been corrected against the exact experimental spectrum. If you have a good fit between the two, then you get a positive interpretation and a positive match of your sample with the selected DRM. If not, then you receive an answer that this might not be a good match, or is even no match at all.
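
If you’re curious what such a comparative verification boils down to computationally, here is a minimal sketch in Python. It assumes each spectrum has been reduced to a simple peak list of chemical shifts; the tolerance and scoring are illustrative only, not ChemisTwin’s actual algorithm.

```python
# Minimal sketch of comparative spectrum matching: each spectrum is a
# peak list of 1H chemical shifts (ppm). Tolerance and scoring are
# illustrative, not ChemisTwin's actual algorithm.

def match_score(reference_ppm, sample_ppm, tol=0.02):
    """Fraction of reference peaks that find a sample peak within tol ppm."""
    unmatched = list(sample_ppm)
    hits = 0
    for ref in reference_ppm:
        # Find the closest still-unmatched sample peak.
        best = min(unmatched, key=lambda s: abs(s - ref), default=None)
        if best is not None and abs(best - ref) <= tol:
            hits += 1
            unmatched.remove(best)
    return hits / len(reference_ppm)

# Digital reference material peak list vs. a customer's experimental peaks.
drm_peaks = [1.25, 2.10, 3.67, 7.26, 7.41]
sample_peaks = [1.24, 2.11, 3.66, 7.27, 7.40]

score = match_score(drm_peaks, sample_peaks)
print(f"match score: {score:.2f}")  # 1.00 -> consistent with the DRM
```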

Jesse Harris  03:59

By reducing the need for physical reference materials, that would make these structure verification workflows more environmentally sustainable as well.

Sarah Srokosz  04:07

Absolutely.

Baljit Bains  04:09

Speaking of sustainability, the next presentation was by Matt Osborne from AstraZeneca. Matt has extensive experience in discovery, pharmaceutical sciences, and product development. He provides an overview of sustainability goals for labs that want to be greener. As we all know, green chemistry is a hot topic, with many organizations constantly striving to become more sustainable.

Jesse Harris  04:30

Yes, exactly. In his presentation, Matt discusses taking a structured approach to method development to reduce the amount of experimentation and promote sustainability through some easy wins.

Sarah Srokosz  04:43

Everyone likes easy wins. What are some of Matt’s suggestions?

Matthew Osborne  04:48

So these are some of the key easy wins that we can achieve as analytical scientists when we’re trying to deliver some of our method development strategies. This is about driving the work that we do with a get-it-right-first-time mentality, so that we’re not wastefully repeating experiments that have failed for reasons that could otherwise have been prevented. And where we are seeing a trend of analysis not being right first time, have we got structured problem-solving measures in place to really allow us to understand what the root cause of that potential problem might be? Are we setting clear expectations for our scientists and our users of instruments about how they look after the instruments and how they’re trained to use them, so that they understand the instruments well enough to do basic troubleshooting, without necessarily being super engineers able to fix everything? And then, once we treat our instruments with care, look after them properly, and know what they’re doing, can we start to reduce the number of physical experiments that we actually need to do to get the end result, and start to use the data that we’re generating on a regular basis in more of an in silico way, a way that allows us to model and simulate chromatographic separations?

Jesse Harris  06:13

He then goes on to share how green chemistry principles and software can be employed to meet these goals, and how physicochemical property prediction tools in our Percepta software, as well as method simulation tools in LC Simulator, allow for more efficient method development.

Baljit Bains  06:30

This was touched on by the poll during the presentation, which asked how much time and/or effort is saved using an in silico versus a purely experimental approach to method development. Matt addresses this point as well; here’s what he had to say.

Matthew Osborne  06:45

And I saw the results from the poll that was just up on the screen. We’ve done work in-house and shown that actually, by using modeling and simulation, as opposed to doing lots and lots of experiments, we can reduce the need for practical experimentation by somewhere in the region of 30 to 50%.

Sarah Srokosz  07:06

Wow, those are big reductions.
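
To give a feel for how simulation can stand in for bench runs, here is a minimal sketch of retention modeling under the classic linear solvent strength relationship. The analytes and coefficients are invented for illustration; they are not AstraZeneca’s data or LC Simulator’s actual implementation.

```python
# Minimal sketch of isocratic retention modeling with the classic linear
# solvent strength (LSS) relationship: log10(k) = log10(kw) - S * phi,
# where phi is the organic fraction of the mobile phase. Coefficients
# below are invented; tools like LC Simulator fit such models from a
# small number of real scouting runs.

def retention_factor(log_kw, S, phi):
    return 10 ** (log_kw - S * phi)

def retention_time(k, t0=1.0):
    """Retention time (min) from retention factor and column dead time."""
    return t0 * (1 + k)

analytes = {"impurity A": (2.9, 4.8), "API": (3.1, 5.0)}  # (log10 kw, S)

# Screen organic fractions in silico instead of running each on the bench.
for phi in (0.30, 0.40, 0.50):
    times = {name: retention_time(retention_factor(log_kw, S, phi))
             for name, (log_kw, S) in analytes.items()}
    gap = abs(times["API"] - times["impurity A"])
    print(f"phi={phi:.2f}  gap between peaks = {gap:.2f} min")
```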

Jesse Harris  07:09

After Matt’s presentation, we had Kayleigh Mercer from GSK. She has been working in the high throughput experimentation (HTE) group for a number of years, focusing on using automation and robotics to optimize and miniaturize reactions. For those who don’t know, HTE is an approach to chemistry where you run many experiments at once, which is great for accelerating your research. For example, Kayleigh uses HTE to optimize experimental conditions, sometimes running over 1500 experiments on a single plate. However, this can lead to challenges in data management.

Sarah Srokosz  07:43

What kind of challenges did she mention?

Jesse Harris  07:46

Well, just answering basic questions like “did I make the product I intended?” or “which reaction was most successful?” can be surprisingly challenging. When your experimental design and analytical systems are disconnected, it can lead to transcription errors and wasted time managing files. Kayleigh spoke about how one of our software tools, Katalyst D2D, helps to manage this challenge.

Kayleigh Mercer  08:11

So at GSK, we’re using Katalyst D2D to capture our high throughput experiments at the moment. So what can Katalyst D2D do to assist us in HTE design and execution? GSK and ACD/Labs began a partnership in 2016, and in 2019, Katalyst D2D was launched. I have personally been using Katalyst since around 2019. It has the ability to encompass the entire HTE workflow within a single software application, which is really powerful for high throughput chemistry. It’s able to interface with existing systems that we have at GSK, such as our materials inventory, which is really important for being able to track our starting materials. And it also allows for automatic data transfer and processing with our chosen analysis method. To start with, we have our chemists identifying a problem that they would like to solve with HTE. This could be either reaction optimization or just synthesizing multiple compounds all at the same time. They would then define the variables that they would like to test, and at this point, they’re able to move into Katalyst and use Katalyst for most of the rest of the workflow. Chemists can identify the materials used for the experiment within Katalyst. They can then design a plate and create a visualization to make it really easy to see what their plan is in the lab. They can use instructions to physically create stock solutions and reaction plates. The chemists can then complete the screen in the lab and create physical analysis plates. The data from these analysis plates is returned to Katalyst for processing, which gives a visualization of the data and helps chemists to identify the best conditions. Having this answer, chemists can then take the information and scale up their reaction, or deal with that information as they wish.
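
The design step Kayleigh describes is, at its core, a full-factorial mapping of reaction variables onto wells. Here is a minimal sketch of that idea; the reagents and plate size are hypothetical, and this is not Katalyst D2D’s API.

```python
from itertools import product

# Minimal sketch of a full-factorial HTE plate design: every combination
# of the chosen variables is mapped to a well. Reagents and plate size
# are hypothetical; this is not Katalyst D2D's API.
catalysts = ["Pd(OAc)2", "Pd2(dba)3", "XPhos Pd G3"]
solvents = ["DMF", "MeCN", "2-MeTHF", "EtOH"]
temps_c = [40, 60]

conditions = list(product(catalysts, solvents, temps_c))  # 3*4*2 = 24 runs

# Map each condition onto a 24-well plate (rows A-D, columns 1-6).
rows, cols = "ABCD", range(1, 7)
wells = [f"{r}{c}" for r in rows for c in cols]

plate = dict(zip(wells, conditions))
for well, (cat, solv, t) in list(plate.items())[:3]:
    print(well, cat, solv, f"{t} C")
```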

Baljit Bains  10:12

She then shared a case study for a particular reaction.

Jesse Harris  10:15

Yes, it was really interesting to see how this is used in practice. I would recommend this presentation to anyone who’s interested in HTE. It also reminded me of a podcast episode we did with Neil Fazakerley, another member of the GSK team; there’ll be a link to that in the show notes that you can check out.

Baljit Bains  10:33

The final presentation of the first day of the symposium was by Jun Wang from Merck. Jun works in the high throughput purification group and specializes in the purification of small molecules and peptides. In her presentation, she talks about how the HTP group supports discovery chemistry by providing routine and specialized purification and analytical support.

Sarah Srokosz  10:53

The group uses the Spectrus Platform for retention modeling and management of their chromatography data from different vendors to create efficiencies in their workflows and leverage knowledge.

Jesse Harris  11:04

Jun discusses the workflow for the HTP group and the challenges they face, particularly in data integrity and management. Like most labs, they have multiple modes and formats of data, and instruments and software from many different vendors, making it challenging to share data and perform analytics.

Sarah Srokosz  11:22

To face these challenges, the team has set some goals which they hope to achieve with our Spectrus Platform: expanding access to their HTP data within the group and the wider chemistry department. Jun highlights those goals.

Jun Wang  11:35

The goal here is that we are exploring a new platform to expand access to the data within HTP and the chemistry department. We hope we can get automatic data flow to minimize the need for manual import. We also hope to improve our speed, so that data analysis and access happen more automatically. We hope our data is protected and locatable through the whole lifecycle. And we are also looking for tools to speed up our workflow and make it higher throughput.

Baljit Bains  12:19

The Spectrus Platform is multi-technique and vendor neutral, making it a great fit for Jun’s requirements. Her team can use it for their LCMS and their NMR data, across all four of their different instrument vendors. Here is what she had to say about the implementation of Spectrus within her team.

Jun Wang  12:35

So here we come to the first conclusion about the benefits and future enhancements of Spectrus. The benefit is that Spectrus is the new platform to expand access to data within the HTP group, and it even gives the whole chemistry department access to all the purification data. All this data is protected and locatable. The automatic data flow minimizes a lot of manual input. And speed enhancements are achieved by automatic data analysis and rapid access.

Baljit Bains  13:18

Day 2 of the symposium started with a presentation by Azzedine Dabo from GSK. Azzedine is part of the method innovation team, and his areas of interest include in silico modeling and quality-by-design method development. In this presentation, Azzedine walks through a method development case study from start to final product. He details the steps and decisions made using a combination of his expertise and the method development software AutoChrom to accelerate the project and achieve a final robust method.

Jesse Harris  13:45

Interestingly, the response to the poll during the presentation showed that the majority of the audience, 65%, said they manually screen their method parameters. What are the benefits of using in silico modeling instead?

Baljit Bains  14:00

Beyond sustainability and alignment with green chemistry, Azzedine highlights other key reasons for using in silico modeling.

Azzedine Dabo  14:07

One of the primary reasons why we use in silico modeling is that it really enables us to be more efficient with method development, but more importantly to create shorter, more robust methods, which therefore enables us to be more sustainable going forward.

Sarah Srokosz  14:27

Azzedine shows how he uses AutoChrom to screen various parameters (pH, buffer, column, column temperature), to optimize them with 2D or 3D models, and how he uses in silico modeling to improve the separation between peaks.

Baljit Bains  14:43

By modeling various versions of the method he was able to reduce the runtime and improve resolution while also minimizing the number of practical experiments conducted.

Azzedine Dabo  14:53

So what I indicated is that the method is robust, has good repeatability, and, moreover, was developed using in silico modeling software.
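
As a rough illustration of what “screening parameters and optimizing with models” means computationally, here is a toy grid search that picks the pH and temperature giving the best worst-case peak separation. The retention functions are invented for illustration; AutoChrom fits real models from scouting runs.

```python
from itertools import product

# Toy illustration of model-based method optimization: evaluate a grid of
# (pH, temperature) points against modeled retention times and keep the
# combination with the best worst-case peak separation. The retention
# responses are invented for illustration only.
def modeled_times(ph, temp_c):
    # Hypothetical retention responses for three peaks (minutes).
    return sorted([
        4.0 + 0.50 * ph - 0.020 * temp_c,
        5.5 + 0.20 * ph - 0.030 * temp_c,
        6.0 - 0.10 * ph - 0.010 * temp_c,
    ])

def min_gap(times):
    """Smallest separation between adjacent peaks."""
    return min(b - a for a, b in zip(times, times[1:]))

grid = product([2.5, 4.5, 6.5, 8.5], [25, 35, 45])  # pH x temperature (C)
best = max(grid, key=lambda p: min_gap(modeled_times(*p)))
print("best (pH, temp):", best,
      "min gap:", round(min_gap(modeled_times(*best)), 2), "min")
```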

Sarah Srokosz  15:03

Our next presenter was Alexei Buevich, who has over 20 years of experience in the pharmaceutical industry and now leads the NMR group at Merck as a principal scientist. Like many others in the field of natural product structure elucidation, he has noticed a disturbing trend of erroneous natural product structures being published in the literature.

Baljit Bains  15:24

Historically, when such structures have later been revised, it has been done by total synthesis.

Alexei Buevich  15:29

The same review shows that the revision of a structure, once an error has been identified, has historically been done by total synthesis, and that is like a gold standard for chemists to prove a structure. But total synthesis is really labor intensive, and it requires resources, it requires time, and it requires capital investment that some labs cannot afford. And in the pharmaceutical industry, time is everything.

Jesse Harris  16:10

Alexei offers an alternative to total synthesis, which isn’t as time consuming or labor intensive.

Alexei Buevich  16:17

So the solution we see for the problem of structure revision of natural products, which is really creeping in, is the utilization of software approaches. And we think that the combination of the computer-assisted structure elucidation program developed by ACD/Labs and quantum mechanical methods such as DFT is a much better and more efficient way of solving that particular problem.

Sarah Srokosz  16:50

He then presents three examples of structures that were revised by total synthesis and shows that this combination of Structure Elucidator and DFT also arrives at the correct structure in a matter of minutes. He presents this as not only a way to revise erroneous structures currently in the literature, but also to prevent them from being published in the first place.
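
The quantitative heart of combining CASE with DFT is comparing computed shifts for each candidate structure against experiment and ranking the candidates. Here is a minimal sketch of that comparison with invented shift values; real workflows use linearly scaled shifts and statistics such as DP4.

```python
# Minimal sketch of ranking candidate structures by agreement between
# DFT-computed and experimental 13C shifts (ppm). Values are invented;
# real workflows use linearly scaled shifts and statistics such as DP4.
experimental = [14.1, 22.7, 31.9, 128.3, 140.0]

candidates = {
    "original structure": [15.0, 25.1, 34.0, 126.0, 136.5],
    "revised structure":  [14.3, 22.9, 31.5, 128.0, 139.6],
}

def mae(calc, expt):
    """Mean absolute error between computed and experimental shifts."""
    return sum(abs(c - e) for c, e in zip(calc, expt)) / len(expt)

for name, calc in sorted(candidates.items(),
                         key=lambda kv: mae(kv[1], experimental)):
    print(f"{name}: MAE = {mae(calc, experimental):.2f} ppm")
# The candidate with the lowest MAE is the better fit to experiment.
```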

Jesse Harris  17:11

Our next talk, “Building a database now and for the future,” was presented by Sarah Robinson from Genentech. She leads structure elucidation and compound registration analysis and is involved in several machine learning research projects. Her presentation was about how analytical data management can accelerate your research.

Baljit Bains  17:31

One of the comments she made that really struck me was the way that this impacts scientists downstream and not just synthetic chemists, because

Sarah Robinson  17:38

We’re really looking at so many different compounds, and we might be moving an oxygen around and building out SAR, and later our downstream colleagues are seeing that as a metabolite from a P450 enzyme, or as an oxidation impurity in development when looking at the stability of our API. And so having the ability to dereplicate their data relative to data that’s already captured there makes their structure elucidation or metabolite ID much more efficient. And synthetic chemists are always responsible for capturing IP, you know, maybe publishing all that NMR and high-res data in patents or publications. Having the ability to pull all of that information out per project, in the format that they need, was a really important way for them to be able to interact with the database as well. So early on, working with ACD, they were able to put together these user stories and create an infrastructure and scripts and all sorts of different buttons for everyone to access the database. When we were building this out, we had three different teams where everyone was giving their input on how they wanted to interact with it. Specifically for chemists, when thinking about patents, not over-reporting those multiplets is an important feature. For us, the automatic data processing was really important. And for our downstream colleagues in small molecule analytical chemistry working in development, being able to have Markush structures that aren’t over-assigned, but are very specific to that part of the molecule, localizing say this extra oxygen to that area, is important. It’s really easy to move between this database processor and ChemSketch. And I would say my number one favorite thing about the database is that it is so structure-centric. So if I want to ask, is there anything similar, or do a substructure search on a compound, within ChemSketch you can specify your database and either do a compound search or an SS (substructure) search for a specific region of that molecule, to pull everything in your database out and see the raw data associated with it.
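
For a sense of what a structure-centric substructure search looks like in code, here is a minimal sketch using the open-source RDKit as a stand-in. The compound IDs and SMILES are made up, and ChemSketch and the Spectrus database expose their own interfaces.

```python
from rdkit import Chem

# Minimal sketch of a substructure search over a compound collection,
# using open-source RDKit as a stand-in; ChemSketch and the Spectrus
# database expose their own interfaces. Entries are illustrative.
database = {
    "GEN-0001": "CCOC(=O)c1ccccc1",   # ethyl benzoate
    "GEN-0002": "Cc1ccc(O)cc1",       # p-cresol
    "GEN-0003": "OC(=O)c1ccc(O)cc1",  # 4-hydroxybenzoic acid
}

query = Chem.MolFromSmarts("c1ccc(O)cc1")  # phenol substructure

for cid, smiles in database.items():
    mol = Chem.MolFromSmiles(smiles)
    if mol is not None and mol.HasSubstructMatch(query):
        print(cid, smiles)  # hits: GEN-0002, GEN-0003
```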

Sarah Srokosz  20:01

Sarah’s talk really reminded me of our conversation with Hans from last month. All the work began with understanding the Genentech team’s needs and then building a solution to meet those needs.

Jesse Harris  20:13

Yes, details like how they manage Markush structures for DMPK studies, or not over-reporting multiplets for patents, only come from working closely with the chemists and understanding their needs.

Sarah Srokosz  20:28

We closed the symposium with Tatiana Didenko, a principal scientist at Amgen in the lead discovery and characterization group. The group performs structure elucidation of molecules both large and small by NMR, and she talked about their high throughput NMR workflow for small molecules.

Baljit Bains  20:45

I’m familiar with high throughput experimentation, but what does that look like for NMR?

Sarah Srokosz  20:50

It can vary a bit depending on what it’s being used for, but in Tatiana’s case, it consists of automated sample preparation and data acquisition, storage of the data in a unified database along with their LCMS data, and automated data processing and interpretation, also known as automated structure verification.

Jesse Harris  21:10

Well, I know that sample preparation and acquisition are a little out of our wheelhouse, but automated processing and storage of NMR and LCMS data sounds like somewhere we can help out.

Sarah Srokosz  21:20

You got it. In her presentation, Tatiana gives us an overview of the instrumentation and data flows, and how the different hardware and software components work together to make these high throughput workflows a reality. She highlights that, in addition to having their NMR and LCMS data in the same place, they really appreciate that they can also easily retrieve it for human or machine use.

Tatiana Didenko  21:44

With regards to the storage of the data, which we get into a nice and minimal format, we use the unified database solution from ACD/Labs, and here is the basic description of it. We store all the data obtained during the process that I described previously, so LCMS and NMR; we store it all in the same place, and this data is accessible to everyone and is actually mineable. So now let’s see how it is organized practically. This is the access to the unified database that we have. We access it through a browser, and we use Citrix for that. This is actually quite convenient, because it doesn’t require a fast computer and you can access it from anywhere. One of the most important and most useful features for us is that everything is in the same place, and it’s very mineable, so it has a well-developed search. Here, for example, you can perform a basic search based on the sample ID using some logical operators. But since it has a lot of metadata, you can also search the data using such parameters as the NMR probe head number, which is shown here. And you can also add logical operators and look at the data acquired in some specific timeframe, for example.
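
The kind of metadata search Tatiana demonstrates maps naturally onto structured queries. Here is a minimal sketch using SQLite with invented field names; the actual Spectrus database has its own schema and interfaces, so this only illustrates combining logical operators with acquisition metadata.

```python
import sqlite3

# Minimal sketch of metadata-driven retrieval: combine technique and
# instrument metadata with an acquisition time window using logical
# operators. Field names, sample IDs, and the use of SQLite are all
# illustrative; the real Spectrus database has its own schema.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE spectra (
    sample_id TEXT, technique TEXT, probe_head INTEGER, acquired TEXT)""")
con.executemany("INSERT INTO spectra VALUES (?, ?, ?, ?)", [
    ("AMG-1001", "NMR",  3, "2023-01-10"),
    ("AMG-1002", "LCMS", 0, "2023-02-02"),
    ("AMG-1003", "NMR",  5, "2023-03-15"),
])

rows = con.execute("""
    SELECT sample_id, acquired FROM spectra
    WHERE technique = 'NMR'
      AND probe_head = 3
      AND acquired BETWEEN '2023-01-01' AND '2023-06-30'
""").fetchall()
print(rows)  # [('AMG-1001', '2023-01-10')]
```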

Baljit Bains  23:38

So what kind of work do they do with high throughput NMR?

Sarah Srokosz  23:43

Good question. She wraps everything up with an example of how they use it for quality control of compound libraries that they store or buy, and provides the results of automated structure verification for a library of more than 5000 samples.

Baljit Bains  23:59

And that should cover what happened at the Driving Efficiency with Spectrus Symposium.

Jesse Harris  24:03

Oh, that’s all? Seriously though, this was actually a really incredible event. I’m very happy with how things turned out. Thank you so much to all of the presenters who participated, and to everybody who helped to make this event possible.

Sarah Srokosz  24:18

As we said at the beginning, you can watch all these presentations on our website. You can also find a link to a page with all the presentations in our show notes.

Baljit Bains  24:28

So that’s all for today. Thanks for listening, and remember to subscribe so you never miss an episode.

Sarah Srokosz  24:35

The Analytical Wavelength is brought to you by ACD/Labs. We create software to help scientists make the most of their analytical data by predicting molecular properties and by organizing and analyzing their experimental results. To learn more, please visit us at www.acdlabs.com


Enjoying the show?

Subscribe to the podcast using your favourite service.