Predicting gas phase entropy of select hydrocarbon classes through specific information-theoretical molecular descriptors

Abstract

The usefulness of five specific information-theoretical molecular descriptors was investigated for predicting the gas phase entropy of selected classes of acyclic and cyclic compounds. Among them, total information on atomic number (TIZ), graph vertex complexity (HV) and total information on bonds (TIBAT), taken together, showed the best correlation with the gas phase entropy values of 130 compounds, along with a low standard deviation (r2 = 0.97, s = 21.14). The multiple regression equation treating these three indices as independent variables was statistically highly significant, as evident from the F-statistic. In particular, the very small difference between the r2 and r2-pred values indicates that the regression model is not overfitted and is therefore suitable for prediction purposes. When the 130 compounds were used as a genuine training set to predict, from the regression equation, the entropy of 40 additional compounds, the correlation was very high (r2 = 0.975) and remained almost identical (r2 = 0.97) for the combined data set of 170 compounds. The three indices thus appear to be useful descriptors, producing a correlation that remains stable as the size of the data set changes. The information-theoretical measures also appear to capture the additive-cum-constitutive nature of gas phase entropy, yielding an acceptable statistical fit.
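The workflow described in the abstract (fit a three-descriptor multiple regression on a training set, inspect r2, s, the F-statistic and a predictive r2, then predict an external set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers, not the paper's TIZ, HV and TIBAT values, the coefficients are invented, and r2-pred is taken here to be the leave-one-out (PRESS-based) statistic, which the abstract does not specify.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bk*xk."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])           # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    s = float(np.sqrt(ss_res / (n - k - 1)))       # standard error of estimate
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    # Leave-one-out residuals from the hat matrix (assumed definition of r2-pred).
    hat_diag = np.diag(A @ np.linalg.pinv(A))
    loo_resid = resid / (1.0 - hat_diag)
    r2_pred = 1.0 - float(loo_resid @ loo_resid) / ss_tot
    return {"coef": coef, "r2": r2, "s": s, "F": f_stat, "r2_pred": r2_pred}

def predict(coef, X):
    """Apply the fitted regression equation to new descriptor values."""
    return coef[0] + X @ coef[1:]

# Synthetic stand-ins for three descriptors; NOT the paper's data.
rng = np.random.default_rng(0)
true_coef = np.array([10.0, -5.0, 3.0])            # hypothetical coefficients

X_train = rng.normal(size=(130, 3))                # 130 "training" compounds
y_train = 50.0 + X_train @ true_coef + rng.normal(scale=2.0, size=130)

X_ext = rng.normal(size=(40, 3))                   # 40 "external" compounds
y_ext = 50.0 + X_ext @ true_coef + rng.normal(scale=2.0, size=40)

stats = fit_ols(X_train, y_train)
y_hat = predict(stats["coef"], X_ext)
r2_ext = float(np.corrcoef(y_hat, y_ext)[0, 1] ** 2)  # squared correlation

print({key: np.round(val, 3) for key, val in stats.items()})
print("external r2:", round(r2_ext, 3))
```

In this sketch, a small gap between `r2` and `r2_pred` plays the role the abstract assigns to it: a sign that the fit does not depend heavily on any single compound. The external-set value is computed as a squared correlation between predicted and observed values, which is one common convention and may differ from the one used in the paper.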

Publication
In SAR and QSAR in Environmental Research
Md Imbesat Hassan Rizvi
Technical (Research) Associate

My research interests include scientific machine learning, natural language processing, reinforcement learning, robotics and human-robot interaction.