The Importance of Digitalization in Pharmaceutical R&D

In this episode, ACD/Labs Vice President of Innovation and Informatics Strategy, Andrew Anderson, and Strategic Partnerships Director, Graham McGibbon continue their discussion on digitalization and digital transformation in the context of pharmaceutical R&D. They assess how well the industry is adapting to the current digital landscape, and what can be done to contribute to the overall digitalization process.

Andrew and Graham also share their thoughts on how digitalization can be used alongside machine learning and artificial intelligence tools to assist with the innovation process through generating and enhancing the synthesis of novel chemical molecules.

Barbora Townsend 00:15

Welcome back to the Analytical Wavelength, a podcast about chemistry and analytical data, brought to you by ACD/Labs. I’m Barbora.

Baljit Bains 00:23

And I’m Bally. We are your hosts today. Last episode, we were joined by our VP of Innovation and Informatics Strategy, Andrew Anderson, and our Strategic Partnerships Director, Graham McGibbon, who discussed the basics of digitalization, digital transformation, machine learning, and AI, and how these fit in the context of pharmaceutical R&D.

Barbora Townsend 00:43

In the second part of the series, Andrew and Graham continue their discussion on digitalization as they delve deeper into digital transformation in the pharma industry and assess how it is performing in the current digital landscape.

Sarah Srokosz 00:58

So as you mentioned, this world that you and Graham live in, you kind of get exposure to what these pharmaceutical companies and other chemical companies are interested in doing with their data. And I’m sure you have come across, you know, several times, at least now the idea of using AI in the pharmaceutical industry. And so how exactly does that relate with digital transformation, and how do they kind of fit together?

Andrew Anderson 01:39

Yeah, I’ll field this one to start, and I’m sure Graham will have spirited commentary on it, no doubt. So yeah, indeed, to me, there’s a couple of different, call them value propositions for digital transformation. Certainly, one of them is productivity, right? And we sort of alluded to productivity, where, you know, you don’t have to print things, you don’t have to review things, right? We talked a little bit about presentation of data to machine interfaces, right, for automated interpretation. So I have a belief, a philosophy, that machine learning, and correspondingly, artificial intelligence, relies on a volume of specific observational or interpretive data to establish, you know, patterns, even if, in the case of artificial intelligence, those patterns aren’t necessarily obvious a priori, right? So, if you think about how AI is leveraged, right? I’m a layman, right? And I tend to think of it as a black box where you produce well-structured data, and you structure that data using, you know, some sort of ontology or controlled vocabulary to represent your structure and present that data. There’s quite a few different approaches to AI, no doubt, but they all require well-structured training data, and ultimately, that training data will be used in a validative exercise, right, whatever that exercise is.

So you know, the goal of AI, or implementation of AI is really, at least in the industries we work in, is two things from, from my perspective. One is to support the innovation process, right, being able to identify new business opportunities. That’s the innovation process in a nutshell, right? And so you augment your innovation process using artificial intelligence, finding patterns and correlations that you wouldn’t ordinarily be able to find on your own, with a collection of humans, right? A collection of humans looking at data. That’s the way I see AI. I’m sure the like AI practitioners will have their own definition, no doubt. But you know, in mine and Graham’s world, we tend to think about things in the abstract. So when it comes to any sort of like application of AI in an industry, you start with well-structured data, right? And here we are.

All of us, you know, in these various industries, we’re all working on digital transformation, and so I don’t want to say AI is the free prize inside it. But in a way, it is, right? If you establish a set of good practices, right? I’ve heard the term data hygiene practices. You often, you know, are able to benefit by implementing those practices to direct your collections of data to places where you build and validate models. Okay, so you know, if you’re going to the effort of digital transformation, you know, that’s when you make the lives of the folks that are involved in disciplines like data science easier, right, by making data more available. I’ve said this so many times in so many interviews, but if you look at the standard of effort required to implement a so-called learning organization, right, one that uses the physical activities that are going on within the organization and captures the insights that one obtains from those activities, right? So there’s some mechanism by which you capture observations, you’re able to then learn from those observations, and then improve your processes. That’s my definition of a learning organization. Okay, so being able to improve the ability to interpret observations, that’s a great application of AI, right? So capture, synthesis, and then receipt of those observations or insights, I think, is the application of AI in an industrial setting, no doubt. Then there’s a whole discipline around generative AI. You know, again, those models require good, well-structured training data to assure that your recommendations are valid, right? So Sarah, I hope that helps. Graham, I’m sure you have, like, additional comments on the subject, no doubt.

Graham McGibbon 07:17

Sure, things that we’ve talked about, you and I, in other contexts. I think for me, the big thing is that the underlying requirement to leverage artificial intelligence and machine learning in the pharma industry is that you have a substantial amount of data. And Andrew, you talked about data hygiene, and that’s important, because you need not just large data sets, not necessarily noise-free, but not biased data sets. Because I think there’s a lot of stories out there, not necessarily from the pharma industry, but many industries, where, if you have bias, either conscious or unconscious, in the acquisition of the data, and that spills over into the data, and you’re going to do some kind of analysis using these approaches, you’re going to end up with outputs that reflect those biases, and those can be wasteful or detrimental. Could be that they’re positive or effective enough, but that’s where that cleaning and structuring of the data comes in, as well as having a lot of it. I think that the pharma industry has a challenge where we’ve seen some of these tools, like the large language models, for instance, of artificial intelligence, they could help in the way that people describe their experiments and understanding and interacting as colleagues. I think that there’s really a role in looking at the literature and trying to get insights, or parsing of the literature using AI to assist a scientist. And I haven’t really thought about that too much before preparing for the podcast and some of the things that you and I have talked about over the more recent course of time, in the past year or two. But you mentioned generative AI, and that’s a place for leveraging those pools of data and trying to support the innovation process.
You said that, Andrew, I think that’s really key here is can those systems, and we’re seeing it, give ways of generating new chemicals which are not identical to the way that chemists currently think about them, because that’s where the advance and advantage comes in. The computers can propose molecules much, much faster. But if a system is going to synthesize them again, it goes to that, you know, the large array of conditions and materials, you have to pare it down.

And Sarah, you and Andrew both noted the importance of data reduction and also have the concerns about data reduction, but it’s absolutely essential, and that’s what machine learning approaches are doing. They’re taking huge pools of data, but not infinite. We don’t have the capacity to store infinite amounts of data. We’re trying to pick out what’s important from it. Modern machine learning approaches, and going into deep learning has been highly focused on neural nets, and neural network types of analyses and optimizations and representations and deep neural nets used to be that people used just a few layers in them, but now they use more layers computing power has increased. Cloud computing makes it possible to do these kinds of things on large data sets, and that’s really the technology behind it has changed, and it means that you can represent complex data and tease out subtle relationships in those data to help you with your improved generation of molecules, for instance, that’s one of the big areas where it’s used in pharma.

But I see Andrew, I think you’re right. You know, process optimization, even beyond the simple, how are my instruments being used? I think that’s also an early area where it was you know, how can we optimize the use of equipment. How can we make sure our maintenance cycles keep operating efficiently? But it’s going to go beyond that much more, I think, into what are the optimum conditions, not just for like a yield of product, but for the entire impurity profile. Because you’ve talked about that a few times, that digital twin is not just the attributes of the substance which you expected to make, but the composition of matter in terms of everything that’s in the sample, and how do you control that, and what happens to all of them? So I think it’s a really interesting and complex area right now.

Sarah Srokosz 11:56

Yeah, no doubt, certainly, yeah. So we’ve kind of laid the groundwork now, but my next question is more about your opinions. So how do you think that the pharmaceutical industry is performing in its digital transformation? What is your assessment, I guess, of the current landscape?

Andrew Anderson 12:22

It’s a great question. Sarah, I laugh because it’s… here’s the good news, right? It’s like we’re all in this together. That’s the way I feel. There’s a feeling, there’s a number of, like, pre-competitive consortia involved in this very area, you know, let’s not also like, there is the tendency amongst stakeholders in this community to look at like progress in digital transformation, and to give some score, right? And, like, I can’t comment on the like, progress towards a goal, because that depends on what one’s goal is, right? If you look at digital transformation, at ACD/Labs, like what we have done in our own digital transformation journey, you know, not to get too detailed here, but, like we’ve implemented certain digital tools that replace paper ones, right, or at least document ones in mine and Graham’s department, right? We’ve moved to digital systems for certain document-based, formerly document-based activities, right? So, I could put a goal, and you know, but that goal depends on, like, what are your aspirations, right? What do you want to accomplish through digital transformation. So, if I take you know that approach right to comment on, how are things going in digital transformation, in pharma, what I typically see is the goals are between, like five years ago and 2030 right? And so, you know, leaders in these organizations definitely take a long-term innovation view and then attempt to determine their goals in the future, right? And those goals in the future are based on, instead of like insights about the future, they call these foresights about the future. So in order to predict, like you know, your journey from now to your future, and your future, accomplishing your future set of goals, you have to think a little bit about, you know, your foresights, right, the insights about the future that you want to act upon and establish a plan. To address, you know, whatever those foresights have revealed about the future.

So when I look at the vast collection of activities, right, I see that you’ve still got some efforts to go in terms of digital transformation, no doubt. Secondly, you see great applications of AI in the industries we work in and around, right? No doubt, you see lots of publications about, you know, use of AI. I think the focus in the industries we work in is about extensibility, right? Implementing this digital twin, for example, the digital twin paradigm at scale, right, is an aspirational goal of lots of organizations, and we still have some, what I call foundational goals to achieve, right? Meaning, and Graham will laugh when I say this, because I’m going to use a term that I use in my day job: you can’t shoot a cannon out of a canoe, right? What that alludes to is, if you’ve got a goal, right, reaching that goal, when it’s aspirational, requires that you have a solid footing, a solid foundation upon which you climb the mountain, so to speak, to mix my metaphors. So in this case, right, if you picture someone trying to shoot a cannon out of a canoe, the canoe is going to flip over, right? So we use that metaphor all the time in AI, right? Like you have to establish a really solid foundation for streaming well-structured data to the AI machine for suitable consumption. And ideally, that data gets streamed automatically, as opposed to poor data scientists having to, you know, wrangle, literally, that’s the term they use, like data wrangling, to get data so that the ML or the AI model can chew on it, can consume it, to further mix my metaphors. So from my perspective, you know, right now, in 2024, there’s a combination of extending the utility or application of AI across a wide set of application areas in industry, that’s one thing, and then the second thing is to further establish a solid foundation for feeding the models with well-structured data.
So what we often see is a matrix approach to arriving at one’s goals, right? Continuing to identify activities inside an organization that stream data, observational data or collected data, that aren’t yet well structured. So there’s ongoing electronic data capture efforts, number one, and then number two is, can we then build models that will allow for ML- or AI-augmented activities, iterative activities over time? And that, for me, is like a two-part or a two-by-two, if you will, matrix that we see in industry right now. You know, each organization, each person within an organization, has their own, you know, position, or you know, where they are in their digital transformation journey, no doubt, but I think everyone at least knows what the problem is, right, and knows, likely, where their, you know, soft spots, so to speak, are, and are attempting to address them in a number of different ways. So that’s what I see in the state of the industries we work in and around. Graham, how do you feel?

Graham McGibbon 19:49

So I’m going to mix metaphors and get a little philosophical, since you started it. So, I’m going to go off piste here and say that I think that the fundamental goals like have the philosophical side of things. Do you agree or not that the basic goals of the pharmaceutical industry are still the betterment of the human condition at its like the initial goal, right? Because I got a couple other things, yeah, and I do too. And the reason I bring that up, Sarah is because a lot of people in society will, and you see it in the media criticisms of the pharmaceutical industry for some of the shortcomings, you know, people have made, you know, financially advantageous but ethically questionable, if not unjustified activities and decisions and behaviors in the industry as a whole. I’m not singling people out, but there are examples of this. But I’ve traveled to a lot of sites, and Andrew, you have too, and I met a lot of people at a lot of levels in pharma. And those people are not like that. These people are really trying to make medicines that improve the quality of life for people all around the globe and at all levels. And that to me, it’s admirable. It’s one of the reasons why, in my former life, I was proud to be in the industry. And so I know I come from a position of bias, but I think that’s really important, because if that’s their goal, and then you layer on top of it, the demand in the modern world for sustainability, so sustainability of processes and things like that, that’s something our organization’s concerned about. And the demand for information, everybody has a greater demand for information, including the customers and consumers of these medicines, and lastly, but not least, quality, how do we bring that all together?

So the goals for me, that I see for the pharma industry are satisfying, that, how do we better the human condition? How do we have sustainability in our organization and what our activities are? How do we acquire and share information and have increasing quality of doing all of that? So they need to make meds. They need to make money, because otherwise they don’t have sustainability of a business, let alone the process of being greener and simpler and things like that. And they want to do it faster, because it’s a competitive industry. They want it to be safer. That’s the point of quality is to make safer medications. But because of this information demand, they need what you refer to Andrew many times as the democratization of data, right? And I’m not going to beat up on silos, because they’re very useful things for what they’re designed for, for sharing, right? Yeah. I mean, these are tools, right?

So recognize that a tool has a value, but it becomes a metaphor for all that’s wrong, instead of recognizing the things that that have value. And that’s my last point. Is that organizations are across the spectrum in the stages of their journey, because they have the goals, but it’s what are their priorities that they’ve articulated out of those goals we’ve just talked about, and how do they realize and increase the value of the data that they have and understand that some data reduction is necessary. You can’t keep all data. So this understanding of value is what data needs to be, you know, raised to the top, what is eliminated or reduced, data abstraction and data reduction, because a data lake that has everything in it is expensive and less and less useful over time. And that’s kind of, I know, a hugely philosophical but that’s my kind of perception. So, the industry as a whole is moving on this, but different companies are at different places in that journey, depending on their prioritizations of those various factors. I think Sarah, yeah.

Andrew Anderson 24:00

So Sarah, if there is a title to this pod, this series, this, uh, episode of the podcast, please let it be cannons out of canoes in the data lake or something like that.

Graham McGibbon 24:15

That’s awesome.

Sarah Srokosz 24:16

Definitely. I’m loving the metaphors like I think it’s very helpful from a very layman perspective, it really does kind of put things into context. And so kind of switching a little bit from the very philosophical and stuff we, you know, obviously make tools, then we are in these spaces of digital transformation. And, you know, AI, even though that is not, you know, something that we do directly as we’ve talked about it is closely related. And so, how do ACD/Labs software and services help companies address this topic? Where do we fit into all of this?

Andrew Anderson 25:03

Yeah, I think that fundamentally we help with the translational step from, like, instrument signal to insight, to knowledge, to wisdom, right? There’s that journey, right? And whether it’s digital or machine-driven, at the end of the day, organizations aspire to be learning organizations, right? And so there’s a journey from the physical activity you do, to observations, to what you can infer from those observations, to holding certain knowledge or wisdom, you know, over time. And establishing that wisdom requires energy, requires activity, effort. We want to cut down both the magnitude of the effort and the time required to obtain the insight, the knowledge, the wisdom. And so, you know, we pretty meticulously maintain a portfolio of products that help with that duty cycle of seeking wisdom, not to get too philosophical, but that’s our place. You know, depending on the product, right, there is a very wide breadth of data formats. Analytical data is not monolithic, right? I try to make that point a lot in the early days of attempts to standardize analytical data. Metaphorically speaking, there was an attempt to describe analytical data the way audio files were characterized, right? So folks saw different formats in audio files and said, MP3 is the answer. Well, analytical data isn’t monolithic, right? Think spectra, think chromatograms, that’s the simplest way to kind of defuse the metaphor, right? It doesn’t apply. There’s different dimensions of data, and so you can’t treat analytical data as a monolith. Furthermore, you know, to digitalize the entire cycle, analytical data is just one part of the overall cycle, right? The DMTA cycle in this context. So being able to juxtapose or represent analytical data relative to, you know, other types of data is just as important as the representation itself.
And so what Graham and I like to talk about is data contextualization. A classic example is representing a series of data in a quantitative study, right? The most basic contextualization is helping folks organize data sets from an instrument, right? And those data sets have, you know, some relationality to them, like a quantitative study with, you know, certain concentration values for each sample that you have in such a study, or a longitudinal study, for that matter, where, you know, you profile a certain experiment over time, and that profiling is represented by either probe points or sample points that are characterized offline, ex situ, if you will. So we help, you know, in two ways. One is, you know, by being able to capture and visualize and facilitate interpretation of data from various techniques and formats. That’s the first place. And then further contextualize that data when you have numerous data sets. They could be orthogonal, they could be similar in some sort of grouping, you know, and we represent that grouping digitally, okay, with further context. So that’s what our products do, you know, whether it’s our workbook collection, whether it’s Spectrus Processor, whether it’s the new Spectrus Processor JS, right? All of them have a capability to represent data files, you know, originally in some format, and understanding the technique. There are, you know, capabilities within our applications to facilitate either machine or human interpretation of that data. So I hope that makes sense. Graham, anything to add?

Graham McGibbon 30:24

I would only add one tiny thing, and that is that, because we know customers want many similar things, but with nuances or variations in the order of the way that they do things or tiny differences, sometimes, or large differences in the kind of data extraction that they want to perform. We use professional services and the set of tools enable customization of workflows on top of their sort of capability set. There’s the way to augment those capabilities and include some more configuration or customization towards that, but I think you got the portfolio. It’s you have the strategic lead on that, and there’s a good reason for it.

Andrew Anderson 31:15

Yeah, Sarah, the only other thing I’d add is, again, our software applications, you know, they’re built for human interpretation. That’s one part of the portfolio. The other part, on the automation side and enterprise informatics side of our portfolio, is to facilitate presentation of those digitally transformed data sets, contextualized data sets, for consumption by machines, right? One of our portfolio elements is to help organizations reach their digital transformation goals by helping implement a digital twin paradigm, right, by helping transform well-assembled and contextualized data, making it available for machines to consume and to build models and the like. The other part, I’ll say, is we have our own, you know, collection of AI-based or ML-based products, right? We have well-developed, well-validated, highly accurate prediction applications in our Percepta line that are sort of the proof in the pudding, right, for companies to leverage. We’ve done some of the data assembly, data aggregation, and data validation work in certain contexts to deliver, you know, sort of ready-to-go models for certain descriptors and attributes and properties. So we have that available as well. And, you know, cutting down the amount of time and effort required to deploy a predictive model in industry is another part of our portfolio.

Sarah Srokosz 33:17

Yeah. So you know, since we’re throwing around a whole bunch of metaphors and everything today, which, as I said, I think it’s great, sometimes at a very high level. I’m sure the metaphor falls apart if you look into it too closely, but at a very high level, you know, I think of digital transformation and the implementation of AI, sort of like if you’re cooking a dish. And I sort of see our tools as like playing a part in, like the mise-en-place, so when you are chopping up all your ingredients and measuring everything out and preparing that. And I kind of see that as getting things into the form that it will be helpful to you in, in the quantity, in the right places, that kind of thing. And I see that our tools as being a part of those steps, sort of like your knife and your cutting board and your measuring cups kind of thing, where they apply regardless of what your end goal might be. You might be cooking various types of dishes kind of thing, but our tools kind of sit below that and help you prepare for those things and set the stage I guess.

Andrew Anderson 34:39

That’s a good one. Yeah, I like it anytime we’re talking about, like food and science. Sign me up. Not durian fruit, right? Graham, but, you know,

Graham McGibbon 34:47

But I hear it tastes good.

Andrew Anderson 34:51

I’m sure, no doubt. Love the analogy, you know, the combination of a collection of tools that facilitate the mise-en-place, plus also facilitate the tasting, right? Like, I’ll extend the metaphor a little bit more. You know, it’s a tool for the generator of the dish, but also the taster of the dish, to facilitate interpretation of what has been made, is another part. Not to overuse the metaphor of the culinary experience, but yeah, after the meal has been prepared, to facilitate, you know, interpretation, right? Maybe sensory analysis, and they call it descriptive analysis, right? Helping with determining the taste and flavor of the meal is another place we fit in well, I think, at least.

Sarah Srokosz 36:00

Yeah, yeah, no, that’s great. Thank you for adding on that part. So we are running a little long, so I appreciate you both sticking around. But the one last thing that I want to end off on is, if you had to give one piece of advice to companies that want to start or are in the midst of their digital transformation journey, what would you tell them? Graham, do you want to go first for this one?

Graham McGibbon 36:34

Sure, so I would say that they need to get started today and like we talked about a little bit earlier, goals are really important. What are the goals that they have, the important goals, and what data are needed for decision making and quality that was where I talked earlier, because then you can identify the data and information, as Andrew said, building upon that, from the data sources to that you know, the knowledge, insights and the wisdom, if you’re a learning organization, because that’s what their aspiration is to be, and then they can get that strategy, because that’s what controls it. What is their priority in terms of generating a digital twin that will enable them to leverage their data. Andrew?

Andrew Anderson 37:25

Yeah, my only other advice would be, you know, keep calm and carry on, I guess, is the right term. Transformational activities are often met with, like, anxiety, right? Because change can be difficult, and what I like to do for myself is eat the elephant one bite at a time, right? But savor the bite, right? What I mean by that, in using that metaphor, is that if you break the challenge of digital transformation into digestible, you know, actionable steps and have goals for each step, you can sort of celebrate the value of reaching and accomplishing each goal, and that infuses passion and enthusiasm for the project overall, right? All of us hate being on projects where you have this, like, lofty, aspirational goal, and it’s a really tedious journey to get to the goal. Speaking of cannons and canoes, you can often get disillusioned by the tedium of difficult, transformational processes. So I suggest to folks, have milestones that are relatively easy to achieve in the early phases of projects, so that when you reach them, all the stakeholders can kind of celebrate success. So, certainly keep an eye on the overall prize, no doubt, but anytime you’re in a transformational, and dare I say, disruptively transformational journey, make sure you break it into easy-to-achieve milestones over time. I hope that’s a good suggestion for all the listeners of the podcast.

Sarah Srokosz 39:44

Yeah, I think it will be. I mean, again, from an outside perspective, it does seem like a lofty goal, and something where, if your only criterion is, you know, this one ultimate kind of lofty goal that defines success, that’s going to be really difficult to do. But, yeah, if you can kind of set yourself up for these smaller successes along the way, I think that makes a big difference, when people can recognize the progress. And at the end of the day, there are still people involved at the heart of all of these efforts, and so it’s important to manage that kind of side of things as well.

Andrew Anderson 40:37

Absolutely. What a fun discussion. I really enjoyed it.

Sarah Srokosz 40:40

This has been great. I really enjoyed it. Yeah, I really appreciate both of you sitting down with me for so long and then lending your expertise. And we always enjoy having you on the podcast, and we will hope to have you back again soon.

Andrew Anderson 40:56

Thanks, everybody.

Baljit Bains 40:57

That wraps up our fascinating two-part series on digitalization and digital transformation in pharmaceutical R&D. Thank you again, Andrew and Graham for taking the time to talk to us and sharing your thoughts and expertise on this topic.

Barbora Townsend 41:10

If you enjoyed this episode, don’t forget to subscribe to the Analytical Wavelength on your favorite podcast platform.

The Analytical Wavelength is brought to you by ACD/Labs. We create software to help scientists make the most of their analytical data by predicting molecular properties and by organizing and analyzing their experimental results. To learn more, please visit us at www.acdlabs.com

Season 4, Episode 8