ere based on the type of the dataset. In our PCM modeling, all four kernels were evaluated by 10-fold cross-validation to select an effective kernel function. The cross-validation results

Polynomial Kernel: $k(x, y) = (\ldots)^{d}$
Puk: $k(x, y) = \dfrac{1}{\left[1 + \left(\dfrac{2\sqrt{\lVert x - y\rVert^{2}}\,\sqrt{2^{1/\omega} - 1}}{\sigma}\right)^{2}\right]^{\omega}}$
RBF Kernel: $k(x, y) = \exp(\ldots)$
doi:10.1371/journal.pone.0122416.t001

PCM Modeling by New Protein Fingerprints and EPIF

| Kernel | a | b | c | Average |
|---|---|---|---|---|
| Normalized Poly Kernel | 0.52 | 0.49 | 0.47 | 0.49 |
| Polynomial Kernel | 0.35 | 0.36 | 0.31 | 0.34 |
| Puk | 0.40 | 0.26 | 0.38 | 0.35 |
| RBF Kernel | 0.51 | 0.49 | 0.43 | 0.48 |

a. Models created using antibody fingerprint and antigen fingerprint with EPIF as cross-term.
b. Models created using antibody fingerprint and antigen fingerprint with the multiplication of antibody fingerprint and antigen fingerprint as cross-term.
c. Models created using only antibody fingerprint and antigen fingerprint.
doi:10.1371/journal.pone.0122416.t002

Development and evaluation of Proteochemometric Modeling

Proteochemometric models with different combinations of descriptors were summarized in

| Metric | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| R² | 0.92 | 0.99 | 0.91 | 0.79 | 0.39 | 0.81 | 0.57 | 0.41 |
| Q² (test) | 0.74 | 0.61 | 0.68 | 0.50 | 0.21 | 0.44 | 0.42 | 0.22 |
| MAE | 124.10 | 139.44 | 131.17 | 137.86 | 150.17 | 150.11 | 137.56 | 149.66 |
| RMSE | 164.28 | 187.92 | 175.30 | 188.21 | 193.58 | 214.66 | 179.80 | 193.45 |
| RAE | 69.12% | 77.66% | 73.06% | 82.13% | 94.85% | 84.44% | 86.88% | 94.52% |
| RRSE | 69.41% | 79.39% | 74.06% | 86.26% | 97.70% | 92.70% | 90.75% | 97.64% |

Model labels present in the source: Fab-Fag-MLPD (b), Sab-Sag-EPIF (f), Gab-Gag-EPIF (g), Gab-Gag-MLPD (h).

a. Models created using antibody fingerprint and antigen fingerprint with EPIF as cross-term.
b. Models created using antibody fingerprint and antigen fingerprint with the multiplication of antibody fingerprint and antigen fingerprint as cross-term.
c. Models created using only antibody fingerprint and antigen fingerprint.
d. Models created using only sequence similarity descriptor of antibody and sequence similarity descriptor of antigen.
e. Models created using only geometry descriptor of antibody and geometry descriptor of antigen.
f. Models created using sequence similarity descriptor of antibody and sequence similarity descriptor of antigen with EPIF as cross-term.
g. Models created using geometry descriptor of antibody and geometry descriptor of antigen with EPIF as cross-term.
h. Models created using geometry descriptor of antibody and geometry descriptor of antigen with the multiplication of antibody descriptor and antigen descriptor as cross-term.
doi:10.1371/journal.pone.0122416.t003

features may be more suitable for cross-terms. Cross-terms calculated by multiplying ligand and target descriptors may not reliably reflect the binding side, and models built with them sometimes performed even worse than models that used only the fingerprints of the antibody and antigen. This may indicate that, in the case of antigen-antibody recognition, model performance is significantly improved only when a suitable cross-term such as EPIF is used in proteochemometric modeling.

Compared with peers

Existing protein descriptors can be divided into sequence similarity descriptors and geometric structure descriptors. In this study, both the sequence similarity descriptor and the geometry descriptor were compared with our fingerprints. For the sequence similarity descriptor, the amino acid sequences of all the antigen and antibody proteins were retrieved from PDB. BLAST was used to calculate sequence identities of all the antigen a
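
To make the idea of a multiplicative cross-term concrete, here is a minimal Python sketch. It assumes the cross-term is the set of all pairwise products between antibody and antigen descriptor values (a flattened outer product), which is a common choice in proteochemometric modeling; this section does not spell out the exact construction used for the models above, so it may differ. The function names and the toy fingerprint values are hypothetical and are not taken from the study.

```python
from typing import List, Optional, Sequence


def multiplicative_cross_terms(antibody: Sequence[float],
                               antigen: Sequence[float]) -> List[float]:
    """Return every pairwise product antibody[i] * antigen[j].

    Assumption: the cross-term is a flattened outer product of the two
    descriptor vectors. The source text does not confirm this definition.
    """
    return [a * g for a in antibody for g in antigen]


def build_feature_vector(antibody: Sequence[float],
                         antigen: Sequence[float],
                         cross_terms: Optional[Sequence[float]] = None) -> List[float]:
    """Concatenate the two descriptor blocks and, optionally, a cross-term block."""
    features = list(antibody) + list(antigen)
    if cross_terms is not None:
        features.extend(cross_terms)
    return features


# Toy binary fingerprints (made-up values for illustration only).
antibody_fp = [1.0, 0.0, 1.0]
antigen_fp = [0.0, 1.0]

cross = multiplicative_cross_terms(antibody_fp, antigen_fp)

# Input for a model without a cross-term (descriptor blocks only).
no_cross_input = build_feature_vector(antibody_fp, antigen_fp)
# Input for a model with the multiplicative cross-term appended.
cross_input = build_feature_vector(antibody_fp, antigen_fp, cross)

print(len(no_cross_input), len(cross_input))
```

A product term is non-zero only where both descriptor values are non-zero, so it records co-occurrence of features on the two sides rather than any measured contact between them. That is one plausible reading of why such a cross-term may not reflect the binding side reliably.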