Mixtures of benzene, toluene, ethylbenzene, p-xylene and naphthalene dissolved in water were probed with an array of partially selective gold nanoparticle chemiresistor sensors. A full factorial experimental design was followed to generate every possible combination (unary, binary, ternary, quaternary and quinary). The nominal concentrations of the individual components in the mixtures were 0, 0.5, 1, 5 or 10 mg/L and the combined concentrations were between 0 and 45 mg/L, a range relevant to the EPA-defined maximum contaminant levels in drinking water. Several statistical techniques were used to predict the component concentrations in the mixtures from the sensor array responses. The most accurate technique was random forests, a non-linear ensemble method. The overall root mean square error of the residuals (the differences between the predicted and measured concentrations) was 0.2–1.5 mg/L for the mixtures with a nominal component concentration of 10 mg/L. The accuracy of the random forests predictions was not unduly affected by increasing mixture complexity. Random forests analysis is a statistical technique suitable for quantifying the relationship between the responses of partially selective sensors and the concentrations of different hydrocarbons in water.
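
The accuracy figure quoted above is a root mean square error (RMSE) taken over the residuals for each component. The following is a minimal sketch of that calculation, not the authors' analysis code: the function name `rmse_per_component`, the dictionary-based data layout and the two example mixtures are assumptions made purely for illustration, and the numbers are invented rather than taken from the study.

```python
import math

# The five analytes named in the abstract.
COMPONENTS = ["benzene", "toluene", "ethylbenzene", "p-xylene", "naphthalene"]


def rmse_per_component(measured, predicted):
    """Return {component: RMSE in mg/L} over a set of mixtures.

    `measured` and `predicted` are equal-length lists of dicts that map
    each component name to a concentration in mg/L (hypothetical layout).
    """
    if not measured or len(measured) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    result = {}
    for name in COMPONENTS:
        # Residual = predicted minus measured concentration for one mixture.
        squared = [(p[name] - m[name]) ** 2 for m, p in zip(measured, predicted)]
        result[name] = math.sqrt(sum(squared) / len(squared))
    return result


# Invented example values (mg/L), for illustration only.
example_measured = [
    {"benzene": 10.0, "toluene": 0.0, "ethylbenzene": 5.0, "p-xylene": 1.0, "naphthalene": 0.5},
    {"benzene": 10.0, "toluene": 10.0, "ethylbenzene": 0.0, "p-xylene": 0.0, "naphthalene": 1.0},
]
example_predicted = [
    {"benzene": 9.0, "toluene": 0.0, "ethylbenzene": 5.0, "p-xylene": 1.0, "naphthalene": 0.5},
    {"benzene": 11.0, "toluene": 10.0, "ethylbenzene": 0.0, "p-xylene": 0.0, "naphthalene": 1.0},
]
example_rmse = rmse_per_component(example_measured, example_predicted)
```

In this sketch the benzene residuals are -1 and +1 mg/L, so the benzene RMSE is 1 mg/L, while the other components are predicted exactly and have an RMSE of 0.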