Click-through rate estimation, the core task of programmatic display advertising, poses typical big data problems. Online algorithms for generalized linear models, such as Logistic Regression, are the most widely used data mining techniques for learning at such a massive scale. Because these models cannot by themselves capture nonlinear patterns in the data, conjunction features are often introduced. This paper focuses on the problem of selecting the most informative 2nd- and 3rd-order conjunction features for use in Logistic Regression. The performance of different feature selection methods based on mutual information is compared on a real-world dataset with over 10 million records. The empirical evaluation shows the effectiveness of the proposed approach.
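
To make the idea of scoring conjunction features concrete, the following is a minimal sketch of one plain mutual-information filter: it builds every 2nd-order conjunction of categorical fields and ranks them by their empirical mutual information with the click label. It is an illustration under stated assumptions, not the specific selection methods compared in the paper. The field names (`site`, `ad`, `hour`), the toy data, and the helper names `mutual_information` and `rank_conjunctions` are all hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in nats, of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), expressed with raw counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


def rank_conjunctions(rows, labels, order=2):
    """Score each `order`-way conjunction of fields by its MI with the label."""
    names = sorted(rows[0])
    scores = {}
    for combo in combinations(names, order):
        # A conjunction feature is the tuple of the joined field values.
        conj = [tuple(row[f] for f in combo) for row in rows]
        scores[combo] = mutual_information(conj, labels)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy impressions: a click occurs only for mismatched
# (site, ad) pairs, so no single field is informative on its own.
rows = [
    {"site": s, "ad": a, "hour": h}
    for s, a, h in product("ab", "xy", ("am", "pm"))
]
labels = [int((r["site"] == "a") != (r["ad"] == "x")) for r in rows]

ranked = rank_conjunctions(rows, labels, order=2)
for combo, score in ranked:
    print(combo, round(score, 4))
```

In this toy case the `(ad, site)` conjunction carries all of the information about the label while the conjunctions involving `hour` carry none, which is the kind of nonlinear interaction a linear model cannot represent without an explicit conjunction feature. Passing `order=3` scores 3rd-order conjunctions in the same way. At the scale of millions of records, the counts would need to be accumulated in a streaming or distributed fashion.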