| Version: | 1.2 |
| Name: | (Q)SAR Model Reporting Format |
| Author: | European Chemicals Bureau |
| Date: | July 2007 |
| Contact: | Joint Research Centre, European Commission |
| e-mail: | qsardb@jrc.it |
| www: | http://ecb.jrc.ec.europa.eu/qsar/ |
Molcode QSAR for abiotic degradation in air (NO3 radical reaction of volatile organic compounds)
QSARModel 4.0.4
Molcode Ltd., Turu 2, Tartu, 51014, Estonia
http://www.molcode.com
17.02.2010
Indrek Tulp
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Tarmo Tamm
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Gunnar Karelson
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Dimitar Dobchev
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Dana Martin
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Kaido Tämm
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Deniss Savchenko
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Jaak Jänes
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Eneli Härk
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Andres Kreegipuu
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Mati Karelson
Molcode Ltd.
Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
Molcode model development team
Molcode Ltd. Turu 2, Tartu, 51014, Estonia
models@molcode.com
http://www.molcode.com
16.02.2010
Software is proprietary but model training and test sets provided. Algorithm provided.
None to date
Not applicable - environmental fate parameter
2.Environmental fate parameters. 2.Persistence: Abiotic degradation in air (Phototransformation). 2.2.b.Indirect photolysis (OH-radical reaction, ozone-radical reaction, other)
Rate constant for NO3 radical reaction (degradation).
The dominant chemical removal processes for chemicals in the gas phase are their reactions with OH radicals, NO3 radicals and ozone.
cm3 s-1 molecule-1
-logK (NO3) (original rate constants were transformed into log scale and multiplied by -1 to reduce data range and obtain positive values)
The selected data are for reactions at 25 °C and 1 atm. The gas-phase reaction rate constants of NO3 radical and organic chemicals have been directly measured.
Original experimental data were collected from ref 1.
Statistics (for -logK (NO3)):
max value: 17.5
min value: 9.41
standard deviation: 2.20
skewness: -0.305
QSAR
Multilinear regression QSAR
Multilinear regression QSAR derived with BMLR (Best Multiple Linear Regression) method
-logK (NO3) = -7.355
+9.660E-002*HASA-2 (AM1) (all)
-2.070*HOMO energy (AM1)
+12.005*Relative number of aromatic bonds
HASA-2 (AM1) (all), [au]: area-weighted surface charge of hydrogen bonding acceptor atoms (from AM1 calculation)
HOMO energy (AM1), [eV]: energy of the highest occupied molecular orbital
Relative number of aromatic bonds, [unitless]: relative number of aromatic bonds
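The regression equation above can be evaluated directly once the three descriptor values are known. The following is a minimal sketch, not the QSARModel implementation: the function names are hypothetical, the descriptor values in the usage note are illustrative only, and the conversion back to a rate constant assumes that the "log scale" in the report is base 10.

```python
# Coefficients copied from the reported model equation for -logK (NO3).
INTERCEPT = -7.355
COEF_HASA2 = 9.660e-2      # HASA-2 (AM1) (all), [au]
COEF_HOMO = -2.070         # HOMO energy (AM1), [eV]
COEF_AROMATIC = 12.005     # Relative number of aromatic bonds, [unitless]


def predict_neg_log_k(hasa2, homo_ev, rel_aromatic_bonds):
    """Return the predicted -logK (NO3) from the three model descriptors."""
    return (
        INTERCEPT
        + COEF_HASA2 * hasa2
        + COEF_HOMO * homo_ev
        + COEF_AROMATIC * rel_aromatic_bonds
    )


def rate_constant(neg_log_k):
    """Convert -logK back to a rate constant in cm3 s-1 molecule-1.

    Assumption: the logarithm used in the report is base 10.
    """
    return 10.0 ** (-neg_log_k)
```

For example, illustrative descriptor values of HASA-2 = 0, HOMO energy = -10.0 eV and no aromatic bonds give -7.355 + 20.70 = 13.345 for -logK. The descriptors themselves still have to come from an AM1 calculation, which this sketch does not perform.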
Initial pool of ~1000 descriptors for each structure calculated. Stepwise descriptor selection was applied to reduce the pool based on a set of statistical selection rules.
For one-parameter equations: Fisher criterion and R2 over threshold, variance and t-test value over threshold, intercorrelation with another descriptor not over threshold.
Two-parameter correlations were developed from the previously reduced pool, with the following statistical selection applied: intercorrelation coefficient below threshold, significant correlation with the endpoint in terms of correlation coefficient and t-test.
Stepwise trial of additional descriptors not significantly correlated to any already in the model. See refs 2-3.
1D, 2D, and 3D theoretical calculations. Descriptors derived from mol files. Quantum chemical descriptors from AM1 calculations. Model developed by multilinear regression using ordinary least squares.
QSARModel 4.0.4
QSAR/QSPR package that will compute chemically meaningful descriptors and includes statistical tools for regression modeling
Molcode Ltd, Turu 2, Tartu, 51014, Estonia
http://www.molcode.com
27.67 ( 83 chemicals / 3 descriptors)
Applicability domain based on training set:
a) by chemical identity: Diverse set of Volatile Organic Compounds (aliphatic and aromatic hydrocarbons, alcohols, amines, halogenated compounds, etc.)
b) by descriptor value range: The model is suitable for compounds that have the descriptors in the following minimal-maximal range:
HASA-2 (AM1) (all): 0 - 24.9
HOMO energy (AM1): -11.6 - -8.02
Relative number of aromatic bonds: 0 - 0.400
By chemical identity - compounds must be similar to training set compounds in terms of functionality.
By descriptor value range: range of descriptor values similar to training set with ±30% confidence. Descriptor values must fall between maximal and minimal descriptor values of training set ±30%.
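The descriptor-range criterion can be checked mechanically. The sketch below uses the training set ranges listed above; the names are hypothetical, and the report does not say how the ±30% margin is computed. Here it is assumed to be 30% of the range width (maximum minus minimum), added on each side. If the margin is meant relative to the limit values themselves, `extended_range` would need to change.

```python
# Training set descriptor ranges as listed in the applicability domain.
TRAINING_RANGES = {
    "HASA-2 (AM1) (all)": (0.0, 24.9),
    "HOMO energy (AM1)": (-11.6, -8.02),
    "Relative number of aromatic bonds": (0.0, 0.400),
}


def extended_range(lo, hi, margin=0.30):
    """Widen [lo, hi] by margin * (hi - lo) on each side.

    Assumption: the +/-30% tolerance refers to the range width.
    """
    width = hi - lo
    return lo - margin * width, hi + margin * width


def in_domain(descriptors, margin=0.30):
    """Return a dict mapping each descriptor name to True/False (inside domain)."""
    result = {}
    for name, (lo, hi) in TRAINING_RANGES.items():
        ext_lo, ext_hi = extended_range(lo, hi, margin)
        result[name] = ext_lo <= descriptors[name] <= ext_hi
    return result
```

A compound would be treated as inside the descriptor domain only if every entry of the returned dictionary is true. The chemical identity criterion (similar functionality) is not covered by this check.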
QSARModel 4.0.4
QSAR/QSPR package that will compute chemically meaningful descriptors and includes statistical tools for regression modeling
Molcode Ltd, Turu 2, Tartu, 51014, Estonia
http://www.molcode.com
See 5.1
Yes
Chemname:Yes
SMILES:No
CAS RN:Yes
InChI:No
MOL file:Yes
Formula:Yes
All
All
83 data points: 0 negative values; 83 positive values
Original source dataset of 114 compounds split into training and test sets: compounds were sorted by experimental value, and every 4th structure was assigned to the test set, the others to the training set.
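The splitting rule described above (sort by experimental value, send every 4th structure to the test set) can be sketched as follows. The function name and the record layout are hypothetical, and the report does not state where the counting starts. This sketch assumes the 4th, 8th, 12th, ... compounds in the sorted list go to the test set.

```python
def split_every_fourth(records):
    """Split (name, value) records into training and test sets.

    Records are sorted by value; every 4th record in sorted order
    (positions 4, 8, 12, ...) goes to the test set.
    Assumption: counting starts at the first sorted record.
    """
    ordered = sorted(records, key=lambda record: record[1])
    train = [rec for i, rec in enumerate(ordered) if i % 4 != 3]
    test = ordered[3::4]
    return train, test
```

Because the test compounds are taken at regular intervals along the sorted endpoint values, both sets span roughly the same range of -logK (NO3). Applied to an illustrative list of 110 records, this rule yields 83 training and 27 test records.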
No more than specified in 3.5
R2 = 0.914 (Squared correlation coefficient)
s2 = 0.661 (Standard error of the estimate)
F = 256.8 (Fisher function)
R2CV = 0.905
R2CVMO = 0.904 (80% : 20%, training : testing)
ABC analysis (2:1 training : prediction) on data sorted in increasing order of endpoint value and divided into 3 subsets (A, B, C). Each training set was formed from 2/3 of the compounds (sets A+B, A+C, B+C) and the corresponding validation set consisted of the remaining 1/3 (C, B, A). Average R2 (fitting) = 0.916; average R2 (prediction) = 0.899
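The construction of the three ABC training/validation pairs can be sketched as below. The report does not state how the sorted data are distributed over subsets A, B and C. This sketch assumes the sorted compounds are dealt in turn to A, B and C, so that each subset covers the whole endpoint range. The function name is hypothetical, and model fitting is left out.

```python
def abc_folds(values):
    """Build the three (training, validation) pairs of an ABC analysis.

    Assumption: sorted values are assigned alternately to A, B, C.
    Returns pairs in the order (A+B, C), (A+C, B), (B+C, A).
    """
    ordered = sorted(values)
    subsets = [ordered[i::3] for i in range(3)]  # A, B, C
    folds = []
    for held_out in (2, 1, 0):  # validate on C, then B, then A
        training = [
            v for i, subset in enumerate(subsets) if i != held_out for v in subset
        ]
        folds.append((training, subsets[held_out]))
    return folds
```

For each pair, the model would be refitted on the training part and R2 evaluated on both parts. The averages of those values correspond to the fitting and prediction figures reported above.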
Yes
Chemname:Yes
SMILES:No
CAS RN:Yes
InChI:No
MOL file:Yes
Formula:Yes
All
All
27 data points: 0 negative values; 27 positive values
Original source dataset split into test and training sets. From the original source data, sorted by endpoint value, every 4th compound was assigned to the test set.
R2 = 0.908 (Coefficient of determination)
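For reference, a coefficient of determination for an external test set is commonly computed from the observed and predicted endpoint values as one minus the ratio of the residual to the total sum of squares. The sketch below shows that common definition. The report does not state which exact formula was used (for example, the squared Pearson correlation would be an alternative), so this is an assumption, and the function name is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: SS_tot is taken around the mean of the observed values
    in the same set; other conventions exist for external validation.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

With this definition, perfect predictions give 1.0, and predicting the mean of the observed values for every compound gives 0.0.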
All are in range of applicability domain:
HASA-2 (AM1) (all): 0 - 11.1
HOMO energy (AM1): -11.8 - -8.75
Relative number of aromatic bonds: 0 - 0.286
The validation coefficient of determination (R2) is close to the coefficients derived by internal validation (R2CV and R2CVMO).
The descriptor "HASA-2 (AM1) (all)" represents the hydrogen bond acceptor capability of the molecule relative to the total surface area. "HOMO energy (AM1)" is an indicator of the nucleophilicity of the molecule: reactive molecules have relatively higher HOMO energy. "Relative number of aromatic bonds" represents a (relative) measure of aromaticity, which differentiates these compounds from aliphatic ones. The descriptors in the model represent important molecular properties related to H-abstraction. For most compounds, H-abstraction is known to be the predominant pathway for reactions with NO3 radicals. As HOMO energy has a negative coefficient in the equation for -logK, the higher the energy, the faster the reaction. Strong hydrogen bond acceptor type compounds as well as aromatic compounds have smaller rate constants, as indicated by the positive coefficients of the corresponding descriptors in the equation for -logK.
A posteriori mechanistic interpretation, consistent with published scientific interpretations of experiments.
Most published studies and models (see refs 4-5) indicate that the HOMO energy is the most important factor determining the rate constants for gas-phase reactions with NO3 radicals. Other descriptors depend on the training set used but usually add corrections for structural variations (e.g. aromatics) or heteroatoms.
Atkinson R (1991). Kinetics and mechanisms of the gas-phase reactions of the NO3 radical with organic compounds. Journal of Physical and Chemical Reference Data 20, 459-507.
Karelson M, Dobchev D, Tamm T, Tulp I, Jänes J, Tämm K, Lomaka A, Savchenko D & Karelson G (2008). Correlation of blood-brain penetration and human serum albumin binding with theoretical descriptors. ARKIVOC 16, 38-60.
Karelson M, Dobchev D, Tamm T, Tulp I, Jänes J, Tämm K, Lomaka A, Savchenko D & Karelson G (2008). Correlation of blood-brain penetration and human serum albumin binding with theoretical descriptors. ARKIVOC 16, 38-60.
Gramatica P, Pilutti P & Papa E (2003). Predicting the NO3 tropospheric degradability of organic pollutants by theoretical molecular descriptors. Atmospheric Environment 37, 3115-3124.
OECD (2004). OECD Series on Testing and Assessment, Number 49, The Report from the Expert Group on (Quantitative) Structure-Activity Relationships [(Q)SARs] on the Principles for the Validation of (Q)SARs.
Training data set: Photo Train83
Validation data set: Photo Test 27
Other documents
Molcode, abiotic degradation in air, NO3 radical reaction, volatile organic compounds