Hydrocarbon contamination of soil is among the most hazardous forms of pollution worldwide. It has several causes and is aggravated when industrial facilities, most prominently those involved in oil production, fail to comply with environmental protection controls. This work used two sets of petroleum-contaminated soil samples to demonstrate principal component analysis (PCA) and partial least squares (PLS) regression modeling. The model variables were selected from spectroscopic analysis in the 1700-1800 nm and 2200-2400 nm ranges, where distinct absorption peaks at 1720, 1750, 2220, 2300, and 2350 nm indicated crude oil content. Chemical analysis of the samples provided the reference values used to relate the spectra to oil content and to build the PCA and PLS models, which achieved a match of up to 90%. The results suggest that this technique could improve field investigation of oil contamination by providing an accurate in-field method.
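As a rough illustration of the band-selection and PLS steps summarized above, the following Python sketch fits a one-component PLS1 model to synthetic spectra. It is not the authors' implementation. Only the two wavelength windows and the five peak positions come from the text. Everything else is an assumption made for the example: the Gaussian peak shape and 15 nm width, the 10 nm sampling grid, the noise level, the oil-content range, the 30/10 train/test split, and all function names. The PCA step is omitted.

```python
import math
import random

PEAKS_NM = [1720, 1750, 2220, 2300, 2350]   # peak positions named in the text
WINDOWS_NM = [(1700, 1800), (2200, 2400)]   # spectral ranges named in the text

def synthetic_spectrum(oil, wavelengths, rng):
    """Made-up absorbance: flat baseline + oil-scaled Gaussian peaks + noise."""
    return [
        0.2
        + oil * sum(math.exp(-((wl - p) / 15.0) ** 2) for p in PEAKS_NM)
        + rng.gauss(0.0, 0.005)
        for wl in wavelengths
    ]

def select_windows(wavelengths, spectrum):
    """Keep only the bands that fall inside the two windows."""
    return [a for wl, a in zip(wavelengths, spectrum)
            if any(lo <= wl <= hi for lo, hi in WINDOWS_NM)]

def fit_pls1(X, y):
    """One-component PLS1: returns (x_mean, y_mean, weights, slope)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the band-space direction with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores along that direction, then regress y on the scores.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    slope = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, slope

def predict(model, x):
    """Predict oil content for one window-selected spectrum."""
    x_mean, y_mean, w, slope = model
    score = sum((a - m) * wj for a, m, wj in zip(x, x_mean, w))
    return y_mean + slope * score

def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out samples."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

rng = random.Random(0)                              # fixed seed for repeatability
wavelengths = list(range(1600, 2501, 10))           # assumed 10 nm grid
oil = [rng.uniform(0.0, 0.1) for _ in range(40)]    # hypothetical mass fractions
X = [select_windows(wavelengths, synthetic_spectrum(c, wavelengths, rng))
     for c in oil]

train_X, test_X = X[:30], X[30:]
train_y, test_y = oil[:30], oil[30:]
model = fit_pls1(train_X, train_y)
r2 = r_squared(test_y, [predict(model, x) for x in test_X])
print(f"bands used: {len(train_X[0])}, held-out R^2: {r2:.3f}")
```

With real data, the synthetic spectra would be replaced by measured reflectance or absorbance values and the `oil` list by the laboratory chemical analysis results. More than one latent component is typically needed for real soils, and the number would normally be chosen by cross-validation.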