Conditional logistic regression is often used to study the relationship between event outcomes and specific prognostic factors. This research applies logistic regression and its predictive capabilities to environmental studies, and seeks to demonstrate a novel approach to implementing conditional logistic regression in environmental research through inference methods based on longitudinal data. Statistical analysis of longitudinal data requires methods that properly account for the within-subject dependence among the response measurements. If this correlation is ignored, inferences such as statistical tests and confidence intervals can be seriously invalid.
The conditional regression model is estimated in an analysis of environmental pollution as a function of oil production and environmental factors. Generalized estimating equations (GEE) are used to formulate inference methods for the conditional logistic regression model that take advantage of the actual correlations between responses in the data, as well as the specified correlation structure, through robust sandwich estimators (RSE), together with several model selection criteria. Because the efficiency of the estimates depends on the specification of the working correlation matrix, an appropriate choice of working correlation matrix can substantially improve the efficiency of GEE inference. A comparison of the selection criteria indicated that QIC is the criterion best suited to the GEE method. The application results showed that QIC had the lowest information loss in the GEE method when the objective is to develop a predictive model from the candidate set. This research also demonstrates that conditional logistic regression is an effective tool that can be used in other studies to explore the relationships between response and explanatory variables.
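
For readers unfamiliar with the quantities named above, the following is a minimal sketch of the standard textbook forms of the GEE estimating equation, the robust sandwich covariance, and QIC. The notation is an assumption introduced here for illustration and is not taken from the paper: subject $i$ has response vector $y_i$ with mean $\mu_i(\beta)$, $A_i$ is the diagonal matrix of variance functions, $R(\alpha)$ is the working correlation matrix, and $\phi$ is the dispersion parameter. The paper's own formulation may differ in detail.

```latex
% Assumed notation (illustrative, not from the paper):
%   D_i = \partial \mu_i / \partial \beta^{\top},
%   V_i = \phi \, A_i^{1/2} R(\alpha) A_i^{1/2}  (working covariance of y_i)

% GEE estimating equation, summed over the K subjects:
\[
  U(\beta) \;=\; \sum_{i=1}^{K} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) \;=\; 0 .
\]

% Robust sandwich estimator of the covariance of \hat{\beta}:
\[
  \widehat{V}_R \;=\; B^{-1} M B^{-1},
  \qquad
  B = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} D_i ,
  \qquad
  M = \sum_{i=1}^{K} D_i^{\top} V_i^{-1}
      \bigl( y_i - \hat{\mu}_i \bigr) \bigl( y_i - \hat{\mu}_i \bigr)^{\top}
      V_i^{-1} D_i .
\]

% QIC for a candidate working correlation R, where Q(\cdot\,; I) is the
% quasi-likelihood evaluated under the independence working model and
% \widehat{\Omega}_I is the model-based information matrix under independence:
\[
  \mathrm{QIC}(R) \;=\; -2\, Q\bigl( \hat{\beta}(R);\, I \bigr)
  \;+\; 2\, \operatorname{tr}\bigl( \widehat{\Omega}_I \, \widehat{V}_R \bigr) .
\]
```

Under this sketch, the sandwich form keeps the covariance estimate consistent even when $R(\alpha)$ is misspecified, while a better-chosen $R(\alpha)$ improves efficiency; among candidate working correlation structures, the one with the smallest QIC is preferred.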