Presentation Summary
Thorsten Gressling from Bayer discussed the integration of the ISA-88 standard with ACD/Labs software for process chemistry. He highlighted the importance of combining batch data with analytical information for decision-making and automation. Bayer uses a four-layer architecture: benchtop experiments, a database layer, an authoring layer, and a data science layer. Key vendors include Dotmatics, SYSTAG, and Mettler Toledo. The process involves exporting planned recipes in ISA-88 format, integrating analytical data, and using tools like ChatGPT for insights. Gressling emphasized the benefits of ISA-88 for Industry 4.0 and data-driven process optimization.
Transcript
Thorsten Gressling 00:00
Yeah. Thank you very much. Thank you for the invitation to the ACD/Labs conference. My name is Thorsten Gressling. Let me introduce myself. I’m working at Bayer in Wuppertal, which is quite a remarkable location. As you can see here, beside the famous River Wupper, there are the laboratories on this side, and here we have production. It’s one of the rare constellations where you have R&D and production side by side. That’s very interesting, especially for me as a chemist by training. I love that very much, that you are so close to your colleagues and can do really intense scale-ups. As my other profession, I’m also a lecturer at Humboldt University in Berlin for cheminformatics and lab informatics. So a lot of activities are ongoing. My role is Digital Chemistry Transformation Lead.
Thorsten Gressling 01:05
Okay, so let’s go to the talk today: Accepting What Works: ISA-88 for Process Chemistry. When I suggested this title, there was a bit of silence in the room. Is that really a good title? Yeah, actually, I think it is a really good title, because one of my fundamental assumptions is that a lot of good things were done in the past, but there was not the time to reach a good maturity level. And that’s why I’m telling this. It does not always have to be ChatGPT, the latest and biggest achievement, for the audience. There are also things that have evolved over a long time and are working very well. Yeah, so not always ChatGPT; ISA-88 is also working.
Thorsten Gressling 01:59
Well, what is analytics in process chemistry? Let’s, yeah, let’s ask ChatGPT about that. So: why do we need analytical data for process development? We have quite a profound answer here, with nine points, so I can refine and say, okay, please limit it to the four most important topics. And as you can see, it’s also tolerant of typos. I love that very much. And you can see, okay, for example, analytics is important for process assessment, yeah, sure, and for optimization, to identify bottlenecks and inefficiencies in a process. But the most important thing is bullet point three, which is decision-making support. That is where we are going, because we are moving in the direction of data-driven science, and the automation of decisions is getting more and more important. We are moving in a direction where not only humans but also algorithms, and as you can see here, maybe also neural networks, become more and more the driver, and that is also true for reactions, yeah. And by the way, why is analytics important for process chemistry? What is the first thing, the molecule or the reaction? Yeah? Sure, the reaction. And that is why we have to put our focus on that. Okay, let’s continue.
Thorsten Gressling 03:41
How is ACD/Labs software used in this context? Well, it’s a very profound answer. I’m very happy that you are, let me say, rooted in this GPT. ACD/Labs software efficiently supports improvement of the process based on data-driven insights. That’s the last sentence here: data-driven insights. That’s what we’re talking about today. And the final question I gave to ChatGPT is: I need a real-world example with real-world process data. The answer: unfortunately, due to confidentiality and so on, I cannot give an example. And that is good. If there were an example, I would be really concerned. Good.
Thorsten Gressling 04:25
So now let’s go into details. What are we talking about, ISA-88, what is it? Well, if you want to have insights, and especially later on features for artificial intelligence, you need to combine the information you have for the batch, the experiment, which is what you’ve done on the benchtop or in the kilo lab, with analytical information. And that combination with analytical information is definitely the home turf of ACD/Labs, Luminata for example, and that’s what we’re talking about. So the analytical part is in the middle; we will talk about that later.
Thorsten Gressling 05:01
But we start with the batch data and unit operations, the building blocks, and what Bayer has done here to collect all these things. Are we the first to collect all the batch data across different functions? We got the idea two years ago to use a data format that, as I said in the beginning, is quite mature. So it’s the ISA-88 format, which has been standardized since 1988, yeah. And if you look at a very common blueprint of a unit operation sequence, you see on the left side that, for example, you put things into a tank here, so raw materials A, B, or C, and then you do the preparation. After that, from the tank, you go into the reactor, yeah, and you also decide if it’s reactor one or two. You have the reaction, the chemical reaction, and that’s the end of it. And how do you put that into data? Of course, you use this standard data format. It’s a classic thing: batch information, which is the process action, the process operation, the process stage, and the process itself. That’s it. It’s nothing more, very simple. And, you know, you do not need to reinvent the wheel. Okay, so that’s why we committed to that and said, okay, it’s ISA-88. But it’s 30 years old, so what about Industry 4.0? Yeah. And it says, okay, in Industry 4.0 it aids the integration of data from various sources. That’s excellent. This harmonization is crucial for creating a connected and transparent system, one of the key principles of Industry 4.0. Take the first bullet point and you’re happy, yeah. The question was “Is ISA-88’s used in Industry 4.0”, also a typo here, perfect. Okay, let’s commit to that. We have done that.
Thorsten Gressling 07:07
And so I’ll give you a short example. Don’t be afraid, you will understand it very, very soon. So this is the standard, classic file format, yeah: a taxonomy that is hierarchical. You start, in this case, with an experiment. You go to the process, the process stages, the phases, yeah. And for the operation sequence, for example, you have something like dose or heat here. These are the things that are collected in this data file. It does not have to be a data file; it’s just a data structure. And then it closes. It is a hierarchical thing. This taxonomy then closes at the bottom, and there you have, for example, the chemicals or other meta information you need for what you have done in the experiment or the batch. That’s it. It’s quite simple. Yeah, it’s a hierarchical taxonomy, and that’s why it is such a pretty and wonderful thing to use.
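Editor’s sketch (not shown in the talk): the hierarchy described above, experiment, process, stage, operation, and actions such as dose or heat, with the chemicals listed at the bottom, could look roughly like the XML below. All tag and attribute names are hypothetical and chosen only for illustration; they are not taken from the ISA-88 standard or from any vendor schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tag and attribute names are illustrative only.
RECIPE_XML = """
<Experiment id="EXP-001">
  <Process name="Example synthesis">
    <Stage name="Reaction">
      <Operation name="Charge and heat">
        <Action type="Dose" material="A" amount="10" unit="g"/>
        <Action type="Heat" target="60" unit="degC"/>
      </Operation>
    </Stage>
  </Process>
  <Materials>
    <Material id="A" name="Raw material A"/>
  </Materials>
</Experiment>
"""


def list_actions(xml_text):
    """Return (stage, operation, action type) for every action in the tree."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for stage in root.iter("Stage"):
        for operation in stage.iter("Operation"):
            for action in operation.iter("Action"):
                rows.append(
                    (stage.get("name"), operation.get("name"), action.get("type"))
                )
    return rows


actions = list_actions(RECIPE_XML)
for row in actions:
    print(row)
```

Because the structure is a plain tree, the same walk works whether the data arrives as a file or as a node retrieved from a database.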
Thorsten Gressling 08:13
Good. So what about the ISA-88 architecture we have? Because Bayer is very big. We do not have just one reactor or one benchtop; we have hundreds of them. And so we need an architecture to integrate all these tiny experiments and things. So now a second slide. Don’t be afraid. This is an architecture that consists of four layers. At the bottom we have all the things that are carried out on the benchtop, where we have, for example, Siemens or Mettler Toledo. We have SYSTAG reactors and things like Laboperator. Then, on the layer that is in red, we have the famous basics database, yeah, where you have collected all this ISA-88 information. It’s not a file system. It’s really a database that you can query and where you can exchange nodes. Because we are in chemistry, this is combined with a database called the Open Reaction Database. If you have information about what is happening in the reactor, you do not yet have information about the reaction. It’s chemistry. So the chemistry and what is happening in the reactor are complementary. And then, after the red layer, you have the authoring layer. Authoring means that here the recipes are created. Yeah, this is where you write your unit operations and what you intend to do, which is later carried out in the lab. So it’s what we call the master recipe. And when you get the result with all the time series data, for example from the sensors, I will show you all the curves later, then you have the control recipe, yeah. And finally, on the upper right here, you have, oops, the data science layer, and that’s where ACD/Labs comes into the game. Because here you map the analytical information about the process to the recipe. And here you also get the information about what has really happened, because here you have the eyes that are looking into the reaction. And then you go with that to data science, and later to AI. I will also give you an example for that.
Thorsten Gressling 10:40
Good. Every green bullet is implemented, or, let me say, at least has an MVP or a proof of concept. We’re quite mature here. So let’s go to the building blocks, the different vendors. How do they support ISA-88? I will start, for good reasons, with the experiment, which in our case is based on Dotmatics. Dotmatics is an ELN that is very, very nicely designed for chemical reactions. On the right side you see, for example, the decision tree that leads to a structure with the different experiments. But in the middle you also see analytics, for example for proven acceptable ranges; we are going into solubility and all these things you need for process chemistry. And by the way, on the lower right side you also have statistical analysis. Yeah, good. That is the key aspect of Dotmatics. But also, when you’re creating an experiment, and I will give you an example right now, I’ve obfuscated the structure here, you have this typical sheet for experiment planning. This is the upper part of it, and there you have the lower part. And you can see, I will enlarge that here, you have all the operations that are planned to be carried out. Yeah, so you add chemicals, which is the starting point, and then you dose later on, and then you heat and stir. All these things are unit operations we are carrying out. And now what we are doing is we export this. This is an export button. We call it Export XML, and it’s ISA-88. We export that into our infrastructure so that we have the planned recipe. Yeah. We’ve seen that before. This is exactly what we then have in the database, and that is what is carried out by the lab equipment.
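Editor’s sketch (not the actual Dotmatics export): the step from a planning sheet to a planned recipe can be pictured as serializing an ordered list of unit operations into XML. The operation names, parameters, and tag names below are made up for illustration, and the function `export_planned_recipe` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Planned unit operations as they might appear on a planning sheet.
# Names and values are invented for this example.
planned_operations = [
    {"type": "AddChemical", "material": "A", "amount_g": "10"},
    {"type": "Dose", "material": "B", "amount_g": "5"},
    {"type": "Heat", "target_degC": "60"},
    {"type": "Stir", "speed_rpm": "300"},
]


def export_planned_recipe(experiment_id, operations):
    """Serialize the planned operations, in order, as a small XML document."""
    root = ET.Element("Experiment", id=experiment_id)
    stage = ET.SubElement(root, "Stage", name="Reaction")
    for index, op in enumerate(operations, start=1):
        # Everything except the operation type becomes an attribute.
        attributes = {key: value for key, value in op.items() if key != "type"}
        ET.SubElement(stage, "Action", type=op["type"], order=str(index), **attributes)
    return ET.tostring(root, encoding="unicode")


xml_text = export_planned_recipe("EXP-001", planned_operations)
print(xml_text)
```

Keeping an explicit `order` attribute is a design choice of this sketch, so that the planned sequence survives even if a consumer does not preserve document order.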
Thorsten Gressling 12:47
Next example: what is the building block for synthesis automation? One of them is SYSTAG. SYSTAG is a Swiss company creating these machines. These are some photos of them. I’m pretty sure you’re aware of these synthesis machines, these robots. We have quite a mature digitalization level here. So for example, we can access all machines through a software interface. This is an example. You can also put this on devices like the HoloLens when you’re in the lab, and you can control the reactors via virtual screens. That’s really excellent. And we can carry out the recipe that we have put from Dotmatics into the SYSTAG, and vice versa, we of course get out the executed things. Yeah. So this is a very simple example, one of the first implementations. Here you can see, for example, curves that come out of these reactors. Good, SYSTAG.
Thorsten Gressling 14:00
What else do we have? Laboperator. Not everything on the benchtop is based on a completely assembled reactor. We have an ecosystem, it’s called Labforward, which is more for the scattered things that are on the table. So if you have, for example, a stirrer that does not map to other ecosystems, or a heater or something like that, this is done by little boxes based on Raspberry Pi. It’s a really robust system. As you can see here in the photo, there is a scale, there’s a shaker, and that’s also working. And Laboperator can now also create this ISA-88 out of the many, many different things we have right here. So these are the trend data for temperature, for example. Check, works.
Thorsten Gressling 14:57
Next one, okay, Mettler Toledo, yeah. They are the pioneers here. Actually, this ISA-88 format was adopted by them many years ago. And I’m very happy, and it’s really a great pleasure, that they are working in such a smart way and also share their knowledge. It’s really great. I appreciate that. And Mettler Toledo, I will also give a photo here, also has very excellent automated reactors. And they also speak this language, so you can get the control batch, the control data, out of them. And as you can see here, it’s also based on ISA-88, and here too you have all the time series. Good. That’s it. Just a few examples of how the building blocks can work together.
Thorsten Gressling 15:46
We also have a tiny viewer app, so once you have carried out your experiment or your batch, you can look into it. It’s very simple, yeah: you open the experiment in the database and can immediately look into what we call trends. That’s very helpful. And as it is all database based, you do not need file handling or something like that. Yeah.
Thorsten Gressling 16:10
Okay, back to our infrastructure. What is missing? Well, as you can see, we integrate a lot of things. Where does ACD/Labs come in? And do we agree on all the things we have defined? That’s why we and CSL Behring took the Pistoia Alliance on board. Maybe you’re aware of it. It is an alliance of pharma companies, a non-profit organization incorporated in 2009, and we said, okay, to be sure that all these different things in our infrastructure work together, let’s start a work group. Yeah, so it should not be licensed. No one owns the standard. That is currently being worked on, and we’re very happy about that. So now we have the data. Now we have the batch. What is next? For sure, let’s take all the analytical data we have and merge it together to get insights. So you start with a simple Luminata record without any iControl or S-88 attachments. This is the starting point. The display is very simple, and here you have a button. Maybe some of you have realized what it is. It is called iControl. Here, there it is, yeah, iControl. And with that, you open the door to the new universe I’ve shown you. So there is the next thing, the newly designed S-88 import tool. This connects to the database. Here are two experiments you’ve seen before. You can import the data; the XML is then retrieved from the database. You also see a little bit of the database administration UI here. And then there it is. All the trends are imported and displayed. So at the bottom of this screen, you see the trends that were recorded in an experiment. That’s great, isn’t it? So now we have all the information, as you can see here, temperature and stirring, pH meter, and all these things. So where is the analytics?
Thorsten Gressling 18:26
Now the next step is, if you have not already done it, you have to insert, well, we have metadata for sure. We have all the metadata, for example from what you have seen in the recipe authoring. That’s quite sure. But now you also have to map the analytical things, and for that we need time points. Others call them event frames. That’s quite fair. You can call them time points, and you can add them manually whenever you have taken a spectrum, for example, for what I call IPC, in-process control. But maybe the software is also able to log them, yeah. Mettler Toledo, for example, can do that. You have some of these events, and you need them later on to put the spectra on them, so that you can overlay the reaction on the different points in your batch. And then, for sure, you get some changes in the chromatograms, and that’s going in the direction of the result. So this is the key point. What you see here are different conditions, which are the time points you see in the upper screen. So condition one, condition two, and so on as time passes. And on the lower screen you have the control chart; I will enlarge that. Here in the control chart, you have the integration of the diode array. And as you can see here, the concentrations of reactants are changing. So in this example, one is increasing, whereas, for example, maybe one of the educts is decreasing, yeah, and that’s exactly what we would like to have, so we can have our time series combined with the analytical data. And that is the key point for what we use in process chemistry.
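Editor’s sketch (not from the talk): mapping analytical time points onto the batch time series amounts to finding, for each IPC sample, the closest reading in the trend. The data and the helper `nearest_reading` below are invented for illustration and do not reflect how Luminata implements this.

```python
from bisect import bisect_left

# Made-up trend: (seconds since start, reactor temperature in degC), sorted by time.
trend = [(0, 20.0), (60, 35.0), (120, 50.0), (180, 60.0), (240, 60.0)]

# Made-up time points at which an IPC sample was taken.
ipc_samples = [
    {"label": "IPC-1", "time_s": 70},
    {"label": "IPC-2", "time_s": 200},
]


def nearest_reading(trend, time_s):
    """Return the (time, value) pair of the trend closest in time to time_s."""
    times = [t for t, _ in trend]
    i = bisect_left(times, time_s)
    if i == 0:
        return trend[0]
    if i == len(trend):
        return trend[-1]
    before, after = trend[i - 1], trend[i]
    # On a tie, prefer the earlier reading.
    return before if time_s - before[0] <= after[0] - time_s else after


annotated = []
for sample in ipc_samples:
    trend_time, temperature = nearest_reading(trend, sample["time_s"])
    annotated.append(
        {**sample, "trend_time_s": trend_time, "temperature_degC": temperature}
    )

for row in annotated:
    print(row)
```

With each sample tied to a point on the time axis, spectra or chromatograms taken at that moment can be overlaid on the process trends.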
Thorsten Gressling 20:25
Finally, you have a kinetic plot. As you can see here, this is an example from ACD/Labs where the educt decreases and the product increases, for example. And our plan, for sure, is that all of this is now orchestrated in a very visual manner. You should then later be able to export or hand over the data to Reaction Lab from Mettler Toledo, where you can do real kinetic calculations and then check whether your assumptions for the scheme are correct or not. That is really great.
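Editor’s sketch (not from the talk): as a very simple stand-in for a kinetic calculation, the snippet below fits a first-order decay to educt concentrations by linear regression on ln(c) versus time. The first-order assumption and the synthetic data are the editor’s; a dedicated kinetics tool would test the actual reaction scheme.

```python
import math


def fit_first_order(times_s, concentrations):
    """Least-squares fit of ln(c) = ln(c0) - k*t; returns (k, c0)."""
    ys = [math.log(c) for c in concentrations]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


# Synthetic educt concentrations generated with k = 0.01 1/s and c0 = 1.0.
times_s = [0, 60, 120, 180, 240]
educt = [math.exp(-0.01 * t) for t in times_s]

k, c0 = fit_first_order(times_s, educt)
# For a simple educt-to-product scheme, product formed is what the educt lost.
product = [c0 - c for c in educt]
print(f"k = {k:.4f} 1/s, c0 = {c0:.3f}")
```

If the fitted line describes the measured points poorly, that is a hint that the assumed scheme is wrong, which is the kind of check the speaker describes.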
Thorsten Gressling 21:06
Okay, so what is the difference to PAT? Many of you know that there are also major applications of very dense time series. Let me enlarge this here. When you have a time axis and you observe the spectra, you can have that in a very, very dense way. So that is the difference to what we have done here. This is the automatic collection, yeah, this is called PAT, and what we are doing is IPC, in-process control, which is generally simpler and less costly, because what you see on the right side is very cost intensive. But maybe it has advantages. Yeah, for example, you can use it in production environments to control your reaction automatically. But for that you need a model, you need calibration, and that’s the difference to IPC, which we have seen before, because if you need a model and calibration, you are not free as a scientist to develop your reaction in the direction you intend. So the classic PAT is much more, let me say, tied to a use case. IPC can cover a wider variety and does not require calibration or modeling. That’s the difference.
Thorsten Gressling 22:39
Yeah, okay, let’s go to data science. We’re coming to the end right now, the handover to machine learning. Data science means, for example, that you have Python, and this is the example you’ve seen in one of the slides before, the time series. So if you plot the trends we gather from the database, you get this. And, for example, a very modern fashion to, let me say, discuss these curves is the connection to ChatGPT, yeah. It’s very, very easy to understand this program code. You get one of the experiment series. This is the example I always have here. For example, I can have some data and I can have the time axis here. Then you connect to OpenAI and ask, on the basis of the trends you have imported here, for example: what are the important landmarks in these trends? And with this cliffhanger about the answer, I will finish my lecture here. I hope it was not too boring. Yeah, it contains ChatGPT, ACD/Labs, and ISA-88. Thank you very much, and I’m open for questions, preferably by chat right now or by LinkedIn. Thank you very much.
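Editor’s sketch (not the code shown on the slide): the idea of asking a language model about trend landmarks can be illustrated by turning a time series into a plain-text prompt. The helper `build_trend_prompt` and the data are hypothetical, and the actual call to a model API is deliberately left out, since the speaker’s code is not reproduced in the transcript.

```python
def build_trend_prompt(name, unit, points, question):
    """Format a trend as CSV-like text followed by the question to ask."""
    lines = [f"Trend: {name} ({unit})", "time_s,value"]
    lines += [f"{t},{v}" for t, v in points]
    lines.append("")
    lines.append(question)
    return "\n".join(lines)


# Made-up temperature trend: (seconds since start, degC).
temperature = [(0, 20.0), (60, 35.0), (120, 50.0), (180, 60.0), (240, 60.0)]

prompt = build_trend_prompt(
    "reactor temperature",
    "degC",
    temperature,
    "What are the important landmarks in these trends?",
)
print(prompt)
# The resulting text would then be sent to a language model of your choice.
```

Dense trends would need downsampling before being placed in a prompt, which this sketch does not attempt.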