Optimizing DoE and Production Runs with Little Data

For many batch processes (e.g. in Life Sciences and Food & Beverage), Design of Experiments (DoE) is usually conducted before scaling up to production runs. We believe that Bayesian Optimization and its variants can significantly improve the performance of both DoE and production runs, especially when little historical data is available. We demonstrate this with our Penicillin Fermentation simulation. If you are interested in how Bayesian Optimization works, please check out our other posts.

In practice, before both DoE and production runs, subject matter experts (SMEs) usually already have a baseline recipe (e.g. based on first-principles models for DoE, or the best DoE run for production runs), as well as safe search ranges for the algorithm to explore. For the experiments here, we use the proposed recipe from the penicillin fermentation project as the baseline and allow the algorithm to explore recipes that are at most 10% away from the baseline.
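To make the 10% rule concrete, here is a minimal sketch of how such a search range could be derived from a baseline recipe. It assumes "at most 10% away" means each parameter may deviate by up to 10% of its own baseline value; the parameter names and values are hypothetical and are not taken from the penicillin fermentation project.

```python
def search_bounds(baseline, max_deviation=0.10):
    """Return {name: (low, high)} within +/- max_deviation of each baseline value.

    Assumption: the deviation is relative and applied per parameter.
    """
    bounds = {}
    for name, value in baseline.items():
        delta = abs(value) * max_deviation
        bounds[name] = (value - delta, value + delta)
    return bounds


# Hypothetical baseline recipe; names and values are illustrative only.
baseline = {"temperature_K": 298.0, "ph": 6.5, "substrate_feed_L_per_h": 0.05}
bounds = search_bounds(baseline)

for name, (low, high) in bounds.items():
    print(f"{name}: [{low:.4g}, {high:.4g}]")
```

In a real setup the SMEs would likely override some of these ranges by hand, since a uniform relative deviation is not equally safe for every parameter.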

Design of Experiments 

For DoE, the goal is to find the best recipe with as few batches as possible. In other words, the metric that matters is the maximum yield achieved by any single batch. Usually, only a small number of batches are run at this stage, so we would like the algorithm to have full flexibility to explore uncertain areas as it sees fit, provided it is safe to do so. An example run of 10 batches with Bayesian Optimization looks like this:

With as few as 5 batches, the algorithm finds a recipe that improves yield by 7.2% compared to the baseline.
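For readers who want to see the shape of such a DoE loop, the following is a rough sketch, not the setup used for the run above. It assumes a zero-mean Gaussian process with a squared-exponential kernel, an upper-confidence-bound (UCB) rule for choosing the next batch, random candidate sampling, and a toy objective standing in for the fermentation simulation. All function names, the kernel length scale, and the `kappa` exploration weight are illustrative choices.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of points (one point per row)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_query, jitter=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP with unit prior variance."""
    k_tt = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def toy_yield(x):
    """Stand-in objective (NOT the fermentation simulation); its peak is at x = 0.7."""
    return float(1.0 - np.sum((x - 0.7) ** 2))


def run_doe(n_batches=10, dim=2, kappa=2.0, seed=0):
    """Run n_batches of UCB-driven experiments starting from the baseline recipe."""
    rng = np.random.default_rng(seed)
    baseline = np.full(dim, 0.5)  # center of the search box, scaled to [0, 1]^dim
    xs, ys = [baseline], [toy_yield(baseline)]
    for _ in range(n_batches):
        candidates = rng.uniform(0.0, 1.0, size=(500, dim))
        y_arr = np.array(ys)
        # Center the observations so the zero-mean prior is reasonable;
        # the constant shift does not change which candidate maximizes UCB.
        mean, std = gp_posterior(np.array(xs), y_arr - y_arr.mean(), candidates)
        x_next = candidates[np.argmax(mean + kappa * std)]
        xs.append(x_next)
        ys.append(toy_yield(x_next))
    return np.array(xs), np.array(ys)


xs, ys = run_doe()
best_improvement = (ys.max() - ys[0]) / ys[0] * 100.0
print(f"best yield improvement over baseline: {best_improvement:.1f}%")
```

A larger `kappa` pushes the search toward uncertain regions, which matches the DoE goal of finding the single best recipe rather than making every batch good.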

Production Runs

Once we obtain the best recipe from DoE, we may consider scaling up to production runs. For production runs, the goal is to optimize every batch, so the metric to track is the average yield improvement. Usually, we restrict the algorithm to areas where it is highly likely to perform well. Here is an example run of 100 batches with Bayesian Optimization in production:

As the plot suggests, the average yield improvement rises steadily over the course of the production run. At the end of this 100-batch run, the average yield improvement is 6.6%, with a maximum yield improvement of 9.7%.
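The two metrics differ only in how the per-batch improvements are aggregated. The sketch below shows one way to compute them, assuming improvement is measured as the percentage change in yield relative to the baseline yield. The function names and the sample yields are made up for illustration and are not results from the run above.

```python
def yield_improvements(yields, baseline_yield):
    """Percentage improvement of each batch yield over the baseline yield."""
    return [(y - baseline_yield) / baseline_yield * 100.0 for y in yields]


def summarize(yields, baseline_yield):
    """Return the DoE-style metric (max) and the production-style metric (average)."""
    improvements = yield_improvements(yields, baseline_yield)
    running_average = []
    total = 0.0
    for count, improvement in enumerate(improvements, start=1):
        total += improvement
        running_average.append(total / count)
    return {
        "max": max(improvements),
        "average": running_average[-1],
        "running_average": running_average,
    }


# Made-up yields for three batches against a baseline yield of 100.
summary = summarize([102.0, 104.0, 106.0], baseline_yield=100.0)
print(summary["max"], summary["average"])
```

Plotting `running_average` against the batch index gives the kind of curve described above for a production run.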

Human in the Loop Optimization

While little historical data may be available, there is usually someone with deep subject matter expertise for any process, and it would be wasteful not to use that knowledge. Hence, for both DoE and production environments, we highly recommend a Human in the Loop approach: SMEs work alongside the algorithm to monitor and adjust the search space, objective, and constraints to achieve even better performance.
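As one small illustration of what adjusting the search space could look like in code, the sketch below intersects the algorithm's current ranges with limits supplied by an SME between batches. It only covers narrowing the search space; widening it, or changing the objective or constraints, would need a different mechanism. The function name, parameter names, and numbers are hypothetical.

```python
def apply_sme_limits(bounds, sme_limits):
    """Intersect the search bounds with SME-provided limits, parameter by parameter.

    Parameters without an SME limit keep their current range.
    """
    adjusted = {}
    for name, (low, high) in bounds.items():
        sme_low, sme_high = sme_limits.get(name, (low, high))
        new_low, new_high = max(low, sme_low), min(high, sme_high)
        if new_low > new_high:
            raise ValueError(f"SME limits for {name!r} do not overlap the search range")
        adjusted[name] = (new_low, new_high)
    return adjusted


# Illustrative ranges; an SME decides that pH should stay between 6.2 and 6.8.
bounds = {"ph": (5.85, 7.15), "temperature_K": (268.2, 327.8)}
adjusted = apply_sme_limits(bounds, {"ph": (6.2, 6.8)})
print(adjusted)
```

Raising an error on non-overlapping limits is a deliberate choice here: a silent fallback could let the algorithm explore a region the SME has just ruled out.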

If you are interested in optimizing your processes, or are curious about how Bayesian Optimization can be useful to you in general, please feel free to reach out. We would love to chat with you!