June 23, 2022
by Richard Lee, Director, Core Technology and Capabilities, ACD/Labs
The world generates a jaw-dropping amount of data and the trend continues upward. By 2025, the amount of data generated each day is expected to reach 463 exabytes globally.
Analytical data, a large amount of which is collected in R&D labs worldwide every day, poses its own challenges. It is often collected across multiple instruments and vendors, in varying file types and formats. As a result, the data is distributed across a variety of systems. Furthermore, more than one dataset or data type is typically used in go/no-go decisions.
For scientists who use this data in daily decision-making, automation of analytical workflows can help relieve the burden of data management and support faster decision-making, among other benefits. For data scientists who want to use vast quantities of the same data to generate insights not possible by human analysis, automation can significantly reduce data collection and preparation time.
Should you be looking to automate analytical data workflows? The answer is that you should consider it if you’re interested in:
1. Transforming Data with Automation
Scientists want easy access to data. One way to provide it is to gather data collected on different instruments, from different vendors, into one system. For example, data from Waters Empower and Thermo Chromeleon can be stored together, queried as a whole, and prepared for business intelligence and machine learning (ML) applications. Automation can help standardize analytical data and collect metadata in a unified format using software designed for analytical data handling in R&D.
Metadata is data that describes one or more aspects of the collected data. It provides context, such as information about the instrument, method, and experiment, so scientists can understand where the data originated and what it was intended for. Automation can be used to unify metadata across a variety of vendor formats, as long as a metadata mapping system is in place, so that data can be queried across all vendors.
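To make the idea of a metadata mapping system concrete, here is a minimal Python sketch. The vendor labels, field names, and sample values are all hypothetical and chosen only for illustration; real vendor exports will use different keys and will need a richer mapping.

```python
# Hypothetical mapping from vendor-specific metadata keys to a shared schema.
# None of these field names are taken from a real vendor format.
VENDOR_FIELD_MAP = {
    "vendor_a": {
        "InstrumentName": "instrument",
        "MethodName": "method",
        "SampleName": "sample_id",
    },
    "vendor_b": {
        "instrument_id": "instrument",
        "acq_method": "method",
        "sample": "sample_id",
    },
}


def unify_metadata(vendor: str, raw: dict) -> dict:
    """Translate vendor-specific metadata keys into the shared schema."""
    mapping = VENDOR_FIELD_MAP[vendor]
    unified = {common: raw[source] for source, common in mapping.items() if source in raw}
    unified["vendor"] = vendor  # keep track of where the record came from
    return unified


records = [
    unify_metadata(
        "vendor_a",
        {"InstrumentName": "HPLC-01", "MethodName": "Gradient_5min", "SampleName": "S-001"},
    ),
    unify_metadata(
        "vendor_b",
        {"instrument_id": "HPLC-02", "acq_method": "Gradient_5min", "sample": "S-002"},
    ),
]

# Once the keys are unified, one query works across both vendors.
matches = [r["sample_id"] for r in records if r["method"] == "Gradient_5min"]
```

With the mapping in place, the query at the end only needs to know the unified field names, not which vendor produced each record.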
Automation also makes it possible to combine separate analytical data files and assemble them into a representation of a physical experiment or experimental study. This process creates a digital twin of the study, giving scientists context on how their experiment performed. For example, for a scientist conducting a metabolism study, automation helps assemble all the data files across multiple species and time points in one place, so the scientist can spend more time analyzing the data rather than collating it manually. It helps quickly transform the data needed to advance an experiment. Automation is especially helpful in process chemistry, when scientists are looking for the optimal conditions for a synthetic route and impurities at each stage of the synthesis need to be tracked and summarized.
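As a rough illustration of the metabolism study example, the Python sketch below groups data files by species and orders them by time point. The file names, species, and time points are hypothetical; in practice these values would come from the metadata captured with each file.

```python
from collections import defaultdict

# Hypothetical file records, deliberately out of order.
files = [
    {"path": "rat_4h.raw", "species": "rat", "time_h": 4},
    {"path": "dog_0h.raw", "species": "dog", "time_h": 0},
    {"path": "rat_0h.raw", "species": "rat", "time_h": 0},
    {"path": "dog_4h.raw", "species": "dog", "time_h": 4},
]


def assemble_study(file_records):
    """Group data files by species, then sort each group by time point."""
    grouped = defaultdict(list)
    for rec in file_records:
        grouped[rec["species"]].append(rec)
    return {
        species: sorted(recs, key=lambda r: r["time_h"])
        for species, recs in grouped.items()
    }


study = assemble_study(files)
for species, recs in study.items():
    print(species, [r["path"] for r in recs])
```

The resulting structure is a very small stand-in for the study-level view described above: every file is reachable from the study, in an order that reflects the experiment.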
2. Using Data More Effectively and Efficiently
ML is often discussed in the context of automation, and for good reason: simply having analytical data is not enough to reap the benefits of ML, artificial intelligence (AI), and business intelligence applications. These applications require data to be in a specific format or structure, which means the data must be engineered through automation.
ML frameworks require systems that can pull and abstract data from a set of analytical data into a machine-readable format. This process needs automation because the data used by ML frameworks must be consistent, standardized, clean, and of a certain quality. Combining automation and ML/AI helps scientists use data effectively, access data more quickly, and spend more time analyzing results rather than pulling the data manually. On the IT side, automation streamlines the team's workflow because the data repositories are simplified and unified.
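The Python sketch below shows, in a simplified way, what "consistent, standardized, clean" can mean in practice: converting values to a single unit and dropping incomplete records before the data reaches an ML framework. The record layout, field names, and values are hypothetical, and real preparation steps will depend on the application.

```python
# Hypothetical peak records with mixed units and one missing value.
raw_peaks = [
    {"sample_id": "S-001", "retention_time": 2.5, "rt_unit": "min", "area": 1200.0},
    {"sample_id": "S-002", "retention_time": 150.0, "rt_unit": "s", "area": 980.0},
    {"sample_id": "S-003", "retention_time": None, "rt_unit": "min", "area": 1010.0},
]

SECONDS_PER_UNIT = {"min": 60.0, "s": 1.0}


def prepare_rows(peaks):
    """Return rows with retention time in minutes, skipping incomplete records."""
    rows = []
    for peak in peaks:
        if peak["retention_time"] is None or peak["area"] is None:
            continue  # incomplete records are left out rather than guessed
        seconds = peak["retention_time"] * SECONDS_PER_UNIT[peak["rt_unit"]]
        rows.append(
            {
                "sample_id": peak["sample_id"],
                "retention_time_min": seconds / 60.0,
                "area": peak["area"],
            }
        )
    return rows


rows = prepare_rows(raw_peaks)
```

Every row that comes out has the same keys and the same units, which is the property downstream tools depend on.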
Tips on how to get started automating analytical workflows
If you are interested in doing more with the data your organization already collects, here is how to start.
Determine which workflows and hence data flows to automate
It’s best to start with a workflow that is similar to others and for which automation can impact the business significantly. Perhaps the same instruments are used for multiple workflows, which means the automation can later be extended to those as well.
Evaluate all manual touchpoints
Ideally, some manual touchpoints can be automated, but you will also need to account for manual steps that still require a scientist’s oversight. You may want to include notifications that indicate when data is ready for review, to prompt those manual steps.
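One way to picture this is a workflow that runs its automated steps and then pauses with a notification when it reaches a step that needs review. The Python sketch below is only an illustration: the step names are hypothetical, and the notify function stands in for whatever messaging channel your organization uses.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    needs_review: bool = False


def run_workflow(steps, notify):
    """Run steps in order; stop and notify at the first step needing review.

    Returns the names of the completed steps and the name of the step
    awaiting review (or None if every step was automated).
    """
    completed = []
    for step in steps:
        if step.needs_review:
            notify(f"'{step.name}' is ready for review")
            return completed, step.name
        completed.append(step.name)
    return completed, None


# Hypothetical workflow with one manual touchpoint.
steps = [
    Step("collect data files"),
    Step("process chromatograms"),
    Step("approve integration results", needs_review=True),
    Step("publish report"),
]

messages = []
completed, waiting_on = run_workflow(steps, notify=messages.append)
```

Here the notification is simply appended to a list; in a real deployment it might be an email or a task in a laboratory system.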
Plan out your process and map how the data should flow
From the source to the destination, you want to map out what will happen to the data. This will help you understand how many systems the data flow will interact with, how users will interact with it, and where that data will go in terms of the destination.
Consider the volume of data generated
Whether it comes from a particular instrument or a whole laboratory, you need to understand the volume of data being generated.
If you’re generating a lot of small data files, for example in high-performance liquid chromatography (HPLC) in an open access environment, that is a good candidate for automation because of the time savings possible when such high volumes of data are involved.
On the other hand, structure elucidation workflows may result in a smaller number of very large datasets across different techniques (e.g. high-resolution LC/MS data and a variety of 2D NMR experiments). Here, automation can help assemble those disparate datasets in one location and save the scientist time spent searching for and gathering information.
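A quick back-of-the-envelope calculation can help compare these two situations. In the Python sketch below, the file counts and file sizes are made-up placeholders, not measurements; substitute figures from your own instruments.

```python
def summarize_volume(files_per_day: int, avg_file_mb: float) -> dict:
    """Estimate daily data volume from file count and average file size."""
    daily_mb = files_per_day * avg_file_mb
    return {
        "files_per_day": files_per_day,
        "avg_file_mb": avg_file_mb,
        "daily_gb": daily_mb / 1024,
    }


# Placeholder numbers for illustration only.
open_access_hplc = summarize_volume(files_per_day=400, avg_file_mb=2.0)
structure_elucidation = summarize_volume(files_per_day=5, avg_file_mb=2048.0)

for label, summary in [
    ("open access HPLC", open_access_hplc),
    ("structure elucidation", structure_elucidation),
]:
    print(f"{label}: {summary['files_per_day']} files/day, {summary['daily_gb']:.2f} GB/day")
```

Looking at file count and total size side by side shows whether the main cost is handling many files or moving and assembling a few large ones.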
Want to learn more? Watch our webinar “A Practical Guide to Digitalizing Analytical Data Management” and sign up to receive more information.
We offer solutions that support a variety of ways for scientists to automate repetitive, tedious processes. To identify an approach that best meets your organization’s needs, get in touch for a consultation.