Órdenes de experimentación en diseños factoriales

  1. Alexander Alberto Correa Espinal
Supervised by:
  1. Pere Grima Cintas Director

Defence university: Universitat Politècnica de Catalunya (UPC)

Year of defence: 2007

Committee:
  1. Xavier Tort-Martorell Llabrés Chair
  2. Josep Ginebra Molins Secretary
  3. Enrique Francisco González Dávila Committee member
  4. Alberto José Ferrer Riquelme Committee member

Type: Thesis

Abstract

A common recommendation when planning a factorial design is to randomize the run order. The purpose of this randomization is to protect the response from the possible influence of unknown factors. This influence is expected to be spread across all the effects, so that none of them is especially affected and no mistakes are made when assessing their statistical significance. But this practice has two essential problems:

1. The number of factor level changes due to randomization may be large (much larger than in other sequences). A randomized order can also be difficult to conduct, making the experimentation more complicated and more expensive.
2. Under reasonable hypotheses about the influence of the unknown factors, some sequences are clearly better than others at minimizing the influence of these undesirable factors.

Many authors have worked on this topic, and some questions have already been solved. For instance, the experimentation sequence that best neutralises the influence of unknown factors has already been determined, but without taking into consideration the number of level changes that this sequence implies. The problem of finding sequences with the minimum number of level changes has also been solved, but without simultaneously considering the potential influence of unknown factors. When both the influence of unknown factors and the number of level changes are considered, the problem has been solved for designs with up to 16 runs, but not beyond, because the search procedures used become unviable when the number of possible sequences is so large (with 32 runs the number of different sequences is 32! ≈ 2.6 · 10^35).

The aim of this thesis is to find a procedure that makes it possible to obtain run sequences with the minimum number of level changes that also minimize the influence of unknown factors on the effect estimates, for any 2-level factorial design.
Moreover, the desired run sequence should be easy for the experimenter to obtain with the proposed procedure.

The content is structured in 7 chapters and 8 appendixes. Chapter 1 presents the motivation that led to the choice of this research topic. It also defines the basic elements of this work (complete and fractional 2-level factorial designs, the problems that appear when randomizing these designs, and how to quantify the influence of unknown and undesired factors on the effect estimates). In addition, it presents the hypotheses and the context in which the search for run orders with the desired properties takes place.

Chapter 2 gives an exhaustive bibliographic review of the existing solutions for run orders in these designs that are robust to the influence of factors alien to the experimentation and/or have the minimum number of level changes. The end of the chapter lists the weaknesses of the current state of the art and anticipates the expected contributions of this thesis.

Chapter 3 presents an original procedure for finding run orders for 2-level factorial designs with the minimum number of factor level changes and a known bias. We call this procedure the duplication method: by duplicating the rows of a 2^k design and adding a factor with a specific sign sequence, a 2^(k+1) design with the same properties as the first design is obtained. An important property of this method is that it can be applied to any number of factors. The procedure guarantees the minimum number of level changes, but it does not always guarantee the minimum bias (a measure of the influence that unknown factors have on the effect estimates).

Chapter 4 presents different methods for finding run orders with less bias than the one produced by the duplication method. These methods are:

- Random search with restrictions: The procedure generates the run order randomly, but in such a way that each run is followed by one that differs from it in the level of only one factor (so the minimum number of changes is guaranteed). Once the sequence is completed its bias is calculated, and the sequences with a bias under a threshold are stored.
- Exhaustive search: An algorithm proposed by Dickinson (1974) and adapted by De León (2005) is used. It is similar to the previous algorithm, but it does not generate the runs randomly. Instead, it proceeds systematically in order to find all the possible run orders. With this algorithm the best run order for designs with 32 experiments has been found (it was unknown until now). The best run order is the one that has the minimum number of level changes and, among those, the least bias.
- Exhaustive search with forced feeding: The exhaustive exploration of all possible run orders is impossible for designs with more than 32 runs. Instead, an exhaustive search is carried out around a good run order already found with one of the previous methods, which allows the most promising region of run orders to be explored.

For designs with more than 32 runs the best run orders are obtained from a combination of the proposed methods. For designs with 64 runs the best order comes from the exhaustive search with forced feeding, feeding the algorithm with a run order obtained from the random search with restrictions. We used the same procedure to obtain the best run order for 128 runs, but feeding the algorithm with a run order obtained by duplicating the one for 64 runs.

The methods described in chapter 4 provide the so-called seed orders: from these orders, new ones with the same properties can be derived. Chapter 5 presents two procedures for obtaining orders with the desired properties from the seed orders. These are the permutation and sign change method and the expansion columns method. Both methods have been programmed as Minitab macros, making it possible to generate automatically and at random (among all possible ones) orders with the desired properties.
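The random search with restrictions described for chapter 4 can be illustrated with a minimal Python sketch. It is not the thesis's FreeBasic program. The function names are hypothetical, and the bias measure is an assumption: the unknown factor is taken to be a linear drift over the run positions 1..N, and the bias of an order is summarised as the largest absolute drift contribution to any main effect. The abstract does not state the exact measure used in the thesis.

```python
import random
from itertools import product


def level_changes(order):
    """Total number of factor level changes between consecutive runs."""
    return sum(
        sum(a != b for a, b in zip(r1, r2))
        for r1, r2 in zip(order, order[1:])
    )


def trend_bias(order):
    """ASSUMPTION: largest absolute main-effect bias under a linear drift t = 1..N."""
    n = len(order)
    k = len(order[0])
    return max(
        abs(sum(run[j] * t for t, run in enumerate(order, start=1))) / (n / 2)
        for j in range(k)
    )


def restricted_random_order(k, rng):
    """Random walk over the 2^k runs that changes exactly one factor per step.

    Returns a complete run order, or None if the walk reaches a dead end
    before every run has been visited.
    """
    runs_left = set(product((-1, 1), repeat=k))
    current = rng.choice(sorted(runs_left))
    order = [current]
    runs_left.remove(current)
    while runs_left:
        # Runs that differ from the current one in a single factor level.
        neighbours = [
            current[:j] + (-current[j],) + current[j + 1:]
            for j in range(k)
        ]
        candidates = [r for r in neighbours if r in runs_left]
        if not candidates:
            return None
        current = rng.choice(candidates)
        order.append(current)
        runs_left.remove(current)
    return order


def search(k, trials, threshold, seed=0):
    """Keep the complete orders whose assumed bias is under the threshold."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        order = restricted_random_order(k, rng)
        if order is not None and trend_bias(order) < threshold:
            kept.append(order)
    return kept
```

Every complete order returned by `restricted_random_order(k, rng)` has 2^k - 1 level changes, the minimum possible, because each step changes a single factor. Discarding dead-end walks is a simplification chosen for this sketch.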
Chapter 6 presents a new measure of how well the influence of factors alien to the experimentation is attenuated. It allows the attenuation achieved by factorial designs with different numbers of factors to be compared, and thus shows that the duplication procedure presented in chapter 3 is appropriate for obtaining run orders with the desired properties in designs with more than 128 runs. Finally, chapter 7 gives the main conclusions and identifies possible future research areas that could extend our studies.

Appendix 1 shows the orders proposed by De León (2005) for designs with 8 and 16 experiments, which are cited several times in the thesis and are one of our starting points. Appendix 2 describes the FreeBasic programming language, used for implementing the search algorithms. Appendixes 3 and 4 include two programs: random search for designs with 32 runs (appendix 3) and exhaustive search for designs with 32 experiments (appendix 4). Appendix 5 shows one of the orders obtained with the desired properties for designs with 128 runs. Appendixes 6 and 7 contain the Minitab macros that, using the seed orders for each kind of experiment, propose an order among all the possible ones with the desired properties. Finally, appendix 8 offers some comments on the proposed run orders, with restrictions on the randomization, and summarizes the proposals on this topic.
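The duplication idea from chapter 3 can be sketched in Python as follows. The abstract does not give the "specific sign sequence" of the added factor, so the alternating pattern used here (-, +, +, -, -, +, ...) is an assumption, chosen only because it keeps one level change per step. The function names are hypothetical, and the sketch makes no claim about the bias of the resulting order.

```python
def level_changes(order):
    """Total number of factor level changes between consecutive runs."""
    return sum(
        sum(a != b for a, b in zip(r1, r2))
        for r1, r2 in zip(order, order[1:])
    )


def duplicate(order):
    """Build a 2^(k+1) run order from a 2^k run order.

    Each run is written twice and a new factor is appended.
    ASSUMPTION: the new factor follows the pattern -, +, +, -, -, +, ...
    so that only the new factor changes inside a duplicated pair and
    only an original factor changes between pairs.
    """
    new_order = []
    sign = -1
    for run in order:
        new_order.append(run + (sign,))
        new_order.append(run + (-sign,))
        sign = -sign  # the next pair starts at the level where this one ended
    return new_order


# Start from the one-factor design and duplicate repeatedly.
seed_order = [(-1,), (1,)]
order_4 = duplicate(seed_order)  # [(-1, -1), (-1, 1), (1, 1), (1, -1)]
order_8 = duplicate(order_4)
```

If the starting order changes one factor per step, the duplicated order does too, so an order with N runs keeps the minimum of N - 1 level changes after each duplication.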