Speaker: Prof. Cheng-Yu Sun (Institute of Statistics, NTHU)
Topic: Factorial designs under baseline parameterization and space-filling designs with applications to big data
Date Time: Fri., Dec 17, 2021, 10:40 AM - 11:30 AM
Place: 4F-427, Assembly Building I
Online Seminar - Google Meet
https://meet.google.com/xcu-vcbo-dsx
Abstract
In this talk, I will report on my research on three topics: two-level factorial designs under the baseline parameterization, space-filling designs, and sub-data selection for big data. In the study of two-level factorial designs, factorial effects are usually defined through the orthogonal parameterization. But if each factor has an intrinsic baseline level, the baseline parameterization is a more appropriate alternative. We obtain a relationship between these two types of parameterization and show that certain design properties are invariant. The relationship also allows us to construct an attractive class of robust baseline designs. We then consider two classes of space-filling designs driven by very different considerations: uniform projection designs and strong orthogonal arrays (SOAs). The former are obtained by minimizing the uniform projection criterion, while the latter are a special kind of orthogonal array. We express the uniform projection criterion in terms of the stratification characteristics related to an SOA. This new expression is then used to show that certain SOAs are optimal or nearly optimal under the uniform projection criterion. Finally, we consider the problem of selecting a representative sub-dataset from a big dataset so that statistical analyses can be carried out without massive computation. In the nonparametric regression setting, we present a two-phase selection method that embodies two important ideas. First, the sub-dataset should be a space-filling subset of the full dataset. Second, more data points should be selected in areas where the response surface is more rugged. Simulations are conducted to demonstrate the usefulness of our method.