# Investigations 22-23: Modeling the Effect of Solvent-to-Solid Ratio and Extraction Time on Extraction Yield of Danshensu

• Contributed by David Harvey
• Professor (Chemistry and Biochemistry) at DePauw University

Although the results in Table 2 are instructive in helping us understand how the extraction time and the solvent-to-solid ratio affect danshensu's extraction yield, a more quantitative model will provide us with a better ability to predict its extraction yield for any combination of factor levels. We can build an empirical model for danshensu's extraction yield using a second-order polynomial equation of the general form

$EY={ \beta }_{ 0 }+{ \beta }_{ a }A+{ \beta }_{ b }B+{ \beta }_{ aa }{ A }^{ 2 }+{ \beta }_{ bb }{ B }^{ 2 }+{ \beta }_{ ab }AB\nonumber$

where $EY$ is the extraction yield, $A$ is the extraction time, $B$ is the solvent-to-solid ratio, and $\beta_0$, $\beta_a$, $\beta_b$, $\beta_{aa}$, $\beta_{bb}$, and $\beta_{ab}$ are the model's coefficients.
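As a minimal sketch of how this general form is evaluated, the Python function below computes $EY$ from the two factor levels and the six coefficients. The function and argument names are illustrative choices, not part of the original module, and the coefficient values in the check at the end are arbitrary numbers picked only to make the arithmetic easy to follow.

```python
def extraction_yield(A, B, b0, ba, bb, baa, bbb, bab):
    """Evaluate EY = b0 + ba*A + bb*B + baa*A^2 + bbb*B^2 + bab*A*B.

    A is the extraction time and B is the solvent-to-solid ratio, in
    whatever units were used when the coefficients were determined.
    """
    intercept = b0                        # EY when both factors are zero
    linear = ba * A + bb * B              # first-order effect of each factor
    curvature = baa * A**2 + bbb * B**2   # second-order (curvature) terms
    interaction = bab * A * B             # how one factor changes the other's effect
    return intercept + linear + curvature + interaction


# Arbitrary illustrative values (NOT experimental data): with every
# coefficient set to 1, A = 2, and B = 3, the terms are 1 + (2 + 3) + (4 + 9) + 6.
check = extraction_yield(2, 3, 1, 1, 1, 1, 1, 1)
print(check)
```

Grouping the terms as intercept, linear, curvature, and interaction mirrors the way the coefficients are usually discussed when interpreting a second-order model.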

Investigation 22

What does it mean to describe a model as empirical instead of theoretical? What are the advantages and the disadvantages of an empirical model? What is the significance of each coefficient in this empirical model in terms of how it affects the extraction yield?

We can fit this empirical model to the data in Table 2 using a linear regression analysis. The resulting empirical model of

$EY=0.575+0.0225A+0.00905B-0.00125{ A }^{ 2 }-0.000165{ B }^{ 2 }+0.000100AB\nonumber$

is significant at p = 0.0057, with $\beta_0$ significant at p < 0.001, $\beta_a$ significant at p < 0.01, and $\beta_b$ and $\beta_{bb}$ significant at p < 0.05.
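The sketch below illustrates, under stated assumptions, how a least-squares fit of the full second-order model can be carried out. It is not the procedure used to obtain the coefficients above, which the text does not describe in detail. Because Table 2 is not reproduced here, the factor levels are hypothetical coded levels for a two-factor central-composite design, and the responses are synthetic values generated from the reported coefficients. The only purpose is to show that solving the normal equations recovers the coefficients that generated the data. All function names are illustrative.

```python
def design_row(A, B):
    """One row of the design matrix: the terms 1, A, B, A^2, B^2, AB."""
    return [1.0, A, B, A * A, B * B, A * B]


def solve_linear_system(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_second_order(points, responses):
    """Return [b0, ba, bb, baa, bbb, bab] from the normal equations (X^T X) b = X^T y."""
    X = [design_row(A, B) for A, B in points]
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, responses)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


# HYPOTHETICAL coded levels (not the levels in Table 2): four factorial
# points, four axial points at +/- sqrt(2), and one center point.
ALPHA = 2 ** 0.5
DESIGN = [(-1, -1), (1, -1), (-1, 1), (1, 1),
          (-ALPHA, 0), (ALPHA, 0), (0, -ALPHA), (0, ALPHA), (0, 0)]

# SYNTHETIC responses generated from the coefficients reported in the text,
# used here only as a self-consistency check on the fitting code.
REPORTED = [0.575, 0.0225, 0.00905, -0.00125, -0.000165, 0.000100]
SYNTHETIC_EY = [sum(b * t for b, t in zip(REPORTED, design_row(A, B)))
                for A, B in DESIGN]

fitted = fit_second_order(DESIGN, SYNTHETIC_EY)
print([round(b, 6) for b in fitted])
```

To work with real data, replace `DESIGN` and `SYNTHETIC_EY` with the factor levels and experimental extraction yields from Table 2. This sketch returns only the coefficients; the p-values quoted in the text require the additional significance tests that a statistics package normally reports alongside the fit.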

Investigation 23

What does it mean to say that the regression analysis is significant at p = 0.0057? Do the results of this regression analysis, as expressed in the model's coefficients, agree with your results from Investigation 21? Why or why not? What is the meaning of the intercept in this model and how does it affect your understanding of the empirical model's validity? Use the full regression model to calculate danshensu's predicted extraction yields for the central-composite design in Table 2. Organize your results in a table with columns for the factor levels, the experimental extraction yields, and the predicted extraction yields. Add a column that shows the difference between the experimental extraction yields and the predicted extraction yields. Calculate the mean, the standard deviation, and the 95% confidence interval for these difference values and comment on your results.
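One possible way to organize the final calculation in this investigation is sketched below. The function returns the mean, the sample standard deviation, and a 95% confidence interval for the differences between experimental and predicted extraction yields, using the interval mean ± t·s/√n. The five pairs of values are hypothetical placeholders, not the data in Table 2, and the critical value of t must match the number of differences actually used: 2.776 is the two-tailed 95% value for 4 degrees of freedom.

```python
import math
import statistics


def summarize_differences(experimental, predicted, t_critical):
    """Return (mean, sample std dev, (lower, upper)) for experimental - predicted.

    t_critical is the two-tailed t value for the chosen confidence level
    and n - 1 degrees of freedom, where n is the number of differences.
    """
    diffs = [e - p for e, p in zip(experimental, predicted)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    s = statistics.stdev(diffs)               # divides by n - 1
    half_width = t_critical * s / math.sqrt(n)
    return mean, s, (mean - half_width, mean + half_width)


# HYPOTHETICAL placeholder values (NOT the results in Table 2).
experimental = [0.80, 0.82, 0.85, 0.83, 0.81]
predicted = [0.81, 0.81, 0.84, 0.84, 0.80]
T_CRIT_4_DOF = 2.776  # two-tailed, 95% confidence, n - 1 = 4 degrees of freedom

mean_diff, sd_diff, (ci_low, ci_high) = summarize_differences(
    experimental, predicted, T_CRIT_4_DOF)
print(f"mean = {mean_diff:.4f}, s = {sd_diff:.4f}, "
      f"95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

A confidence interval that includes zero suggests there is no evidence of a systematic bias between the experimental and predicted values, which is one useful point to consider when commenting on the results.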

You can read more about linear regression in Chapter 5 of Analytical Chemistry 2.0. Although that chapter covers fitting a straight line to a response that depends on a single factor, its general approach, though not its specific equations, also applies to fitting a full second-order polynomial to data with two factors. For a more detailed discussion of central-composite designs and linear regression, see Myers, R. H.; Montgomery, D. C. Response Surface Methodology, Wiley Series in Probability and Statistics, Wiley-Interscience: New York, 2002, or Brereton, R. G. Chemometrics: Data Analysis for the Laboratory and Chemical Plant, Wiley: Chichester, England, 2003.