As the timing behavior of good and defective chips becomes statistical, the traditional notion that a one-dimensional timing boundary separates good from defective behavior may no longer hold. This paper studies issues in test optimization for screening statistical delay defects. After the first silicon tapeout, test data learning based on silicon samples can be used to optimize the test set for mass production. This approach depends on the availability of known-good and known-defective samples. This paper focuses on silicon-sample-based test optimization. We relate this problem to binary classification, and relate pattern selection to the feature selection problem in statistical learning. Experimental results are presented to explain the methodologies and the new concepts.
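
The link between pattern selection and feature selection can be illustrated with a minimal sketch. It is not the method of this paper: it assumes each chip is described by one measured delay per test pattern, treats each pattern as a feature, and ranks patterns by a Fisher-style separation score between the known-good and known-defective classes. The data are synthetic, and every name and value (`make_samples`, `select_patterns`, `SENSITIVE`, the delay mean and spread, the defect shift) is hypothetical and chosen only for illustration.

```python
import random
import statistics


def fisher_score(good, defective):
    """Squared gap between class means divided by the summed class variances."""
    gap = statistics.mean(defective) - statistics.mean(good)
    spread = statistics.variance(good) + statistics.variance(defective)
    return gap * gap / spread


def select_patterns(good_chips, defective_chips, k):
    """Return the indices of the k patterns that best separate the two classes."""
    n_patterns = len(good_chips[0])
    scores = []
    for p in range(n_patterns):
        good = [chip[p] for chip in good_chips]
        defective = [chip[p] for chip in defective_chips]
        scores.append((fisher_score(good, defective), p))
    scores.sort(reverse=True)  # highest separation first
    return sorted(p for _, p in scores[:k])


def make_samples(rng, n_chips, n_patterns, sensitive, shift):
    """Synthetic delays (assumed model): Gaussian noise plus a shift on sensitive patterns."""
    chips = []
    for _ in range(n_chips):
        chip = []
        for p in range(n_patterns):
            delay = rng.gauss(1.0, 0.1)
            if p in sensitive:
                delay += shift
            chip.append(delay)
        chips.append(chip)
    return chips


# Hypothetical setup: only patterns 1 and 4 are sensitive to the defect.
SENSITIVE = {1, 4}
rng = random.Random(0)
good_chips = make_samples(rng, 200, 6, SENSITIVE, 0.0)
defective_chips = make_samples(rng, 200, 6, SENSITIVE, 0.3)

selected = select_patterns(good_chips, defective_chips, k=2)
print(selected)
```

A per-pattern filter score is used here only because it is short. It ignores correlation between patterns, so it cannot capture cases where good and defective chips are separable only by a boundary that spans several patterns at once.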