CSU(Central South University)-Caco2
Dataset: Download it here.
Dataset description: A curated dataset of 1,018 compounds with experimental \(\log P_{\text{app}}\) values (the logarithm of the apparent permeability coefficient).
Dataset preprocessing
- Obtain the raw dataset from here, which contains 1,272 compounds;
- Remove 34 duplicate rows based on SMILES strings and logPapp values;
| logPapp | smi |
|---|---|
| -5.220000 | Fc1c(cccc1F)\C(=C/C(=O)NC)\c1cc2n(C(C)C)c(nc2cc1)N |
| -5.220000 | Fc1c(cccc1F)\C(=C/C(=O)NC)\c1cc2n(C(C)C)c(nc2cc1)N |
| -4.600000 | Fc1cc2c(N(C=C(C(O)=O)C2=O)CC)cc1N1CCN(CC1)C |
| -4.602060 | Fc1cc2c(N(C=C(C(O)=O)C2=O)CC)cc1N1CCN(CC1)C |
| -4.600000 | Fc1cc2c(N(C=C(C(O)=O)C2=O)CC)cc1N1CCN(CC1)C |
| -5.630000 | O(C)c1ccc(cc1)CC(NCC(O)c1cc(NC=O)c(O)cc1)C |
| -5.630000 | O(C)c1ccc(cc1)CC(NCC(O)c1cc(NC=O)c(O)cc1)C |
| -5.770000 | O1c2c(C(=O)C(O)=C1c1cc(O)c(O)c(O)c1)c(O)cc(O)c2 |
| -5.770000 | O1c2c(C(=O)C(O)=C1c1cc(O)c(O)c(O)c1)c(O)cc(O)c2 |
| -6.700000 | O1c2c(C(=O)C(O)=C1c1cc(O)c(O)cc1)c(O)c(O)c(O)c2 |
| -6.700000 | O1c2c(C(=O)C(O)=C1c1cc(O)c(O)cc1)c(O)c(O)c(O)c2 |
| -5.090000 | O1c2c(C(=O)C(OC)=C1c1cc(O)c(O)cc1)c(O)c(OC)c(OC)c2 |
| -5.090000 | O1c2c(C(=O)C(OC)=C1c1cc(O)c(O)cc1)c(O)c(OC)c(OC)c2 |
| -5.050000 | O1c2c(C(=O)C=C1c1cc(O)c(O)cc1)c(O)cc(O)c2 |
| -5.050000 | O1c2c(C(=O)C=C1c1cc(O)c(O)cc1)c(O)cc(O)c2 |
| -6.000000 | O1c2c(C(=O)C=C1c1cc(OC)c(O)c(O)c1)c(O)cc(O)c2 |
| -6.000000 | O1c2c(C(=O)C=C1c1cc(OC)c(O)c(O)c1)c(O)cc(O)c2 |
| -5.100000 | O1c2c(C(=O)C=C1c1cc(OC)c(OC)c(OC)c1)c(O)cc(O)c2 |
| -5.100000 | O1c2c(C(=O)C=C1c1cc(OC)c(OC)c(OC)c1)c(O)cc(O)c2 |
| -5.180000 | O1c2c(c(O)c(OC)c(O)c2)C(=O)C=C1c1ccc(O)cc1 |
| -5.180000 | O1c2c(c(O)c(OC)c(O)c2)C(=O)C=C1c1ccc(O)cc1 |
| -5.020000 | O1c2c3c4c(c(O)c2C)C(=O)C(NC(=O)/C(=C\C=C[C@H](C)C@HC@@HC@@HC@@HC@HC@HC@@H\C=C\O[C@]1(C)C3=O)/C)=C1NC2(N=C14)CCN(CC2)CC(C)C |
| -5.020000 | O1c2c3c4c(c(O)c2C)C(=O)C(NC(=O)/C(=C\C=C[C@H](C)C@HC@@HC@@HC@@HC@HC@HC@@H\C=C\O[C@]1(C)C3=O)/C)=C1NC2(N=C14)CCN(CC2)CC(C)C |
| -4.630000 | O=C(NO)C(C)c1ccc(cc1)CC(C)C |
| -4.630000 | O=C(NO)C(C)c1ccc(cc1)CC(C)C |
| -4.630000 | O=C(NO)C(C)c1ccc(cc1)CC(C)C |
| -4.770000 | Oc1c(cccc1C(C)C)C(C)C |
| -4.770000 | Oc1c(cccc1C(C)C)C(C)C |
| -5.260000 | S(=O)(=O)(N)c1ccc(N)cc1 |
| -5.260000 | S(=O)(=O)(N)c1ccc(N)cc1 |
| -4.360000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)c(F)cc1)C(C)C |
| -4.360000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)c(F)cc1)C(C)C |
| -4.480000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)c(OC)cc1)C(C)C |
| -4.480000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)c(OC)cc1)C(C)C |
| -4.240000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)cc(F)c1)C(C)C |
| -4.240000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)cc(F)c1)C(C)C |
| -4.540000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1)C(C)C |
| -4.140000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1)C(C)C |
| -4.140000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1)C(C)C |
| -4.170000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1)NC(=O)N(C)C |
| -4.170000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1)NC(=O)N(C)C |
| -4.460000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1F)C(C)C |
| -4.460000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1cc(F)ccc1F)C(C)C |
| -4.330000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1ccc(F)cc1)C(C)C |
| -4.330000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1ccc(F)cc1)C(C)C |
| -4.370000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1ccc(F)cc1F)C(C)C |
| -4.370000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)N)/c1ccc(F)cc1F)C(C)C |
| -4.190000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cc(F)ccc1)CCC |
| -4.190000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cc(F)ccc1)CCC |
| -4.060000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cc(F)ccc1)NC(=O)N(C)C |
| -4.060000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cc(F)ccc1)NC(=O)N(C)C |
| -4.260000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cc(F)ccc1F)C(C)C |
| -4.260000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cc(F)ccc1F)C(C)C |
| -4.140000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1ccc(F)c(F)c1F)C(C)C |
| -4.140000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1ccc(F)c(F)c1F)C(C)C |
| -4.210000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cccc(F)c1F)C(C)C |
| -4.210000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NC)/c1cccc(F)c1F)C(C)C |
| -4.070000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NCC)/c1cccc(F)c1F)C(C)C |
| -4.070000 | S(=O)(=O)(n1c2cc(ccc2nc1N)/C(=C/C(=O)NCC)/c1cccc(F)c1F)C(C)C |
| -4.360000 | S(=O)(=O)(n1c2cc(ccc2nc1N)\C(=C\C(=O)N)\c1ccccc1)C(C)C |
| -4.360000 | S(=O)(=O)(n1c2cc(ccc2nc1N)\C(=C\C(=O)N)\c1ccccc1)C(C)C |
| -4.200000 | S(=O)(=O)(n1c2cc(ccc2nc1N)\C(=C\C(=O)NC)\c1ccccc1)C(C)C |
| -4.200000 | S(=O)(=O)(n1c2cc(ccc2nc1N)\C(=C\C(=O)NC)\c1ccccc1)C(C)C |
| -3.870000 | S(=O)(=O)(n1c2cc(ccc2nc1N)\C(=N\O)\c1ccccc1)C(C)C |
| -3.870000 | S(=O)(=O)(n1c2cc(ccc2nc1N)\C(=N\O)\c1ccccc1)C(C)C |
| -4.400000 | S1c2c(cccc2)C(=O)Cc2cc(ccc12)C(C(O)=O)C |
| -4.400000 | S1c2c(cccc2)C(=O)Cc2cc(ccc12)C(C(O)=O)C |
| -4.410000 | s1c(ccc1C(C(O)=O)C)C(=O)c1ccccc1 |
| -4.410000 | s1c(ccc1C(C(O)=O)C)C(=O)c1ccccc1 |
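The exact-duplicate removal above can be sketched in plain Python. This is only an illustration: the source does not say which tooling was used, the helper name `drop_exact_duplicates` is hypothetical, and it assumes that two rows count as duplicates only when both the SMILES string and the logPapp value match exactly.

```python
def drop_exact_duplicates(rows):
    """Keep the first occurrence of each (smiles, logPapp) pair, preserving order."""
    seen = set()
    unique = []
    for smiles, log_papp in rows:
        key = (smiles, log_papp)
        if key not in seen:
            seen.add(key)
            unique.append((smiles, log_papp))
    return unique


# A few rows taken from the table above.
rows = [
    ("Fc1cc2c(N(C=C(C(O)=O)C2=O)CC)cc1N1CCN(CC1)C", -4.600000),
    ("Fc1cc2c(N(C=C(C(O)=O)C2=O)CC)cc1N1CCN(CC1)C", -4.602060),
    ("Fc1cc2c(N(C=C(C(O)=O)C2=O)CC)cc1N1CCN(CC1)C", -4.600000),
    ("Oc1c(cccc1C(C)C)C(C)C", -4.770000),
    ("Oc1c(cccc1C(C)C)C(C)C", -4.770000),
]
deduplicated = drop_exact_duplicates(rows)
```

Under this reading, the `-4.602060` row survives next to a single `-4.600000` row for the same SMILES, because the two values are not identical. Such near-duplicates are handled by the later steps.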
- Use RDKit to transform the SMILES to their canonical forms;
- For the 146 compounds with more than one \(\log P_{\text{app}}\) value, drop the 59 compounds (131 rows) whose \(\log P_{\text{app}}\) values differ by more than 0.1;
| logPapp | canonical_smiles |
|---|---|
| -3.860000 | C#CC#CCCCC/C=C/C(=O)NCC(C)C |
| -3.750000 | C#CC#CCCCC/C=C/C(=O)NCC(C)C |
| -5.658789 | C/N=C(\NC#N)NCCSCc1csc(N=C(N)N)n1 |
| -5.880000 | C/N=C(\NC#N)NCCSCc1csc(N=C(N)N)n1 |
| -4.020000 | C=CCOc1ccccc1OCC(O)CNC(C)C |
| -4.247504 | C=CCOc1ccccc1OCC(O)CNC(C)C |
| -4.680000 | C=CCOc1ccccc1OCC(O)CNC(C)C |
| -3.837653 | C=CCc1ccccc1OCC(O)CNC(C)C |
| -4.370207 | C=CCc1ccccc1OCC(O)CNC(C)C |
| -4.122469 | CC#CC#CCC/C=C\C=C\C(=O)NCC(C)CC |
| -3.780000 | CC#CC#CCC/C=C\C=C\C(=O)NCC(C)CC |
| -4.424591 | CC(=O)CC(c1ccccc1)c1c(O)c2ccccc2oc1=O |
| -4.630000 | CC(=O)CC(c1ccccc1)c1c(O)c2ccccc2oc1=O |
| -5.620000 | CC(=O)Nc1ccc(OCC(O)CNC(C)C)cc1 |
| -5.744728 | CC(=O)Nc1ccc(OCC(O)CNC(C)C)cc1 |
| -5.600904 | CC(=O)Nc1ccc(OCC(O)CNC(C)C)cc1 |
| -4.625000 | CC(=O)[C@H]1CC[C@H]2[C@@H]3CCC4=CC(=O)CC[C@]4(C)[C@H]3CC[C@]12C |
| -4.280000 | CC(=O)[C@H]1CC[C@H]2[C@@H]3CCC4=CC(=O)CC[C@]4(C)[C@H]3CC[C@]12C |
| -4.696804 | CC(C(=O)O)c1ccc(-c2ccccc2)c(F)c1 |
| -4.660000 | CC(C(=O)O)c1ccc(-c2ccccc2)c(F)c1 |
| -4.470000 | CC(C(=O)O)c1ccc(-c2ccccc2)c(F)c1 |
| -4.707191 | CC(C(=O)O)c1cccc(C(=O)c2ccccc2)c1 |
| -4.351000 | CC(C(=O)O)c1cccc(C(=O)c2ccccc2)c1 |
| -4.390000 | CC(C(=O)O)c1cccc(C(=O)c2ccccc2)c1 |
| -4.280853 | CC(C)(C)NCC@@HOC(=O)C1CC1 |
| -4.635000 | CC(C)(C)NCC@@HOC(=O)C1CC1 |
| -4.639015 | CC(C)(C)NCC@HCOc1nsnc1N1CCOCC1 |
| -4.893000 | CC(C)(C)NCC@HCOc1nsnc1N1CCOCC1 |
| -5.780000 | CC(C)C[C@H]1C(=O)N2CCC[C@H]2[C@]2(O)OC@(C(C)C)C(=O)N12 |
| -5.910000 | CC(C)C[C@H]1C(=O)N2CCC[C@H]2[C@]2(O)OC@(C(C)C)C(=O)N12 |
| -4.710000 | CC(C)NCC(O)COc1cccc2[nH]ccc12 |
| -4.414867 | CC(C)NCC(O)COc1cccc2[nH]ccc12 |
| -4.463947 | CC(C)NCC(O)COc1cccc2ccccc12 |
| -4.580000 | CC(C)NCC(O)COc1cccc2ccccc12 |
| -5.540000 | CC(C)NCC(O)c1ccc(NS(C)(=O)=O)cc1 |
| -5.376751 | CC(C)NCC(O)c1ccc(NS(C)(=O)=O)cc1 |
| -5.760000 | CC(C)NCC(O)c1ccc(NS(C)(=O)=O)cc1 |
| -4.540000 | CC(C)S(=O)(=O)n1c(N)nc2ccc(/C(=C/C(N)=O)c3cccc(F)c3)cc21 |
| -4.140000 | CC(C)S(=O)(=O)n1c(N)nc2ccc(/C(=C/C(N)=O)c3cccc(F)c3)cc21 |
| -4.230000 | CC(C)S(=O)(=O)n1c(N)nc2ccc(/C(=C/C(N)=O)c3cccc(F)c3F)cc21 |
| -4.430000 | CC(C)S(=O)(=O)n1c(N)nc2ccc(/C(=C/C(N)=O)c3cccc(F)c3F)cc21 |
| -5.540000 | CC1=C(C(=O)O)N2C(=O)C@@H[C@H]2SC1 |
| -5.690000 | CC1=C(C(=O)O)N2C(=O)C@@H[C@H]2SC1 |
| -4.890000 | CC1=C(CC(=O)O)c2cc(F)ccc2/C1=C\c1ccc(S(C)=O)cc1 |
| -5.236829 | CC1=C(CC(=O)O)c2cc(F)ccc2/C1=C\c1ccc(S(C)=O)cc1 |
| -4.260000 | CCC(=O)N(c1ccccc1)C1(COC)CCN(CCn2nnn(CC)c2=O)CC1 |
| -3.824779 | CCC(=O)N(c1ccccc1)C1(COC)CCN(CCn2nnn(CC)c2=O)CC1 |
| -3.800000 | CCC(=O)N(c1ccccc1)C1(COC)CCN(CCn2nnn(CC)c2=O)CC1 |
| -4.960000 | CCC12CCN(CC3(O)CC3)C(Cc3ccc(O)cc31)C2(C)C |
| -4.731645 | CCC12CCN(CC3(O)CC3)C(Cc3ccc(O)cc31)C2(C)C |
| -4.730000 | CCN(CC)CC(=O)Nc1c(C)cccc1C |
| -4.285000 | CCN(CC)CC(=O)Nc1c(C)cccc1C |
| -4.209997 | CCN(CC)CC(=O)Nc1c(C)cccc1C |
| -5.860251 | CC[C@H]1OC(=O)C@HC@@HC@HC@@HC@(O)CC@@HC(=O)C@HC@@H[C@]1(C)O |
| -5.430000 | CC[C@H]1OC(=O)C@HC@@HC@HC@@HC@(O)CC@@HC(=O)C@HC@@H[C@]1(C)O |
| -5.515409 | CN(C)C(=N)N=C(N)N |
| -6.200000 | CN(C)C(=N)N=C(N)N |
| -4.260000 | CN(C)CCC=C1c2ccccc2CCc2ccccc21 |
| -3.900665 | CN(C)CCC=C1c2ccccc2CCc2ccccc21 |
| -4.750000 | CN(C)CCCN1c2ccccc2Sc2ccc(Cl)cc21 |
| -5.062984 | CN(C)CCCN1c2ccccc2Sc2ccc(Cl)cc21 |
| -6.400000 | CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)NC@@HC(=O)O)cc1 |
| -6.100000 | CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)NC@@HC(=O)O)cc1 |
| -5.090000 | CN1/C(=C(\O)Nc2ccccn2)C(=O)c2sccc2S1(=O)=O |
| -4.678651 | CN1/C(=C(\O)Nc2ccccn2)C(=O)c2sccc2S1(=O)=O |
| -4.450000 | CN1C(=O)CN=C(c2ccccc2)c2cc(Cl)ccc21 |
| -4.390000 | CN1C(=O)CN=C(c2ccccc2)c2cc(Cl)ccc21 |
| -3.924453 | CN1C(=O)CN=C(c2ccccc2)c2cc(Cl)ccc21 |
| -4.173839 | CN1C(=O)CN=C(c2ccccc2)c2cc(Cl)ccc21 |
| -3.954677 | CN1CCC(=C2c3ccccc3CC(=O)c3sccc32)CC1 |
| -4.950000 | CN1CCC(=C2c3ccccc3CC(=O)c3sccc32)CC1 |
| -4.710000 | CN1CCCC1c1cccnc1 |
| -4.390000 | CN1CCCC1c1cccnc1 |
| -4.608477 | CN1CCCC1c1cccnc1 |
| -5.450000 | CN1CC[C@]23c4c5ccc(O)c4O[C@H]2C@@HC=C[C@H]3[C@H]1C5 |
| -5.100000 | CN1CC[C@]23c4c5ccc(O)c4O[C@H]2C@@HC=C[C@H]3[C@H]1C5 |
| -4.670000 | CNCCCN1c2ccccc2CCc2ccccc21 |
| -4.558530 | CNCCCN1c2ccccc2CCc2ccccc21 |
| -4.245240 | COc1cc(Cc2cnc(N)nc2N)cc(OC)c1OC |
| -4.460000 | COc1cc(Cc2cnc(N)nc2N)cc(OC)c1OC |
| -5.886056 | C[C@H]1OC@@HCC@H[C@@H]1O |
| -5.585000 | C[C@H]1OC@@HCC@H[C@@H]1O |
| -4.430000 | C[C@]12CC[C@H]3C@@H[C@@H]1CC[C@@H]2O |
| -4.300000 | C[C@]12CC[C@H]3C@@H[C@@H]1CC[C@@H]2O |
| -4.470000 | Cc1cc(=O)n(-c2ccccc2)n1C |
| -3.890535 | Cc1cc(=O)n(-c2ccccc2)n1C |
| -4.400000 | Cc1ccc2c(=O)c3cccc(CC(=O)O)c3oc2c1C |
| -4.600000 | Cc1ccc2c(=O)c3cccc(CC(=O)O)c3oc2c1C |
| -5.152427 | Cc1ncc(CO)c(CO)c1O |
| -4.520000 | Cc1ncc(CO)c(CO)c1O |
| -3.920000 | Cc1nnc2n1-c1ccc(Cl)cc1C(c1ccccc1)=NC2 |
| -4.257226 | Cc1nnc2n1-c1ccc(Cl)cc1C(c1ccccc1)=NC2 |
| -4.770000 | Cc1noc(NS(=O)(=O)c2ccc(N)cc2)c1C |
| -4.920000 | Cc1noc(NS(=O)(=O)c2ccc(N)cc2)c1C |
| -4.363412 | NC(N)=N/N=C/c1c(Cl)cccc1Cl |
| -4.680000 | NC(N)=N/N=C/c1c(Cl)cccc1Cl |
| -5.890000 | NC(N)=NC(=O)c1nc(Cl)c(N)nc1N |
| -5.564235 | NC(N)=NC(=O)c1nc(Cl)c(N)nc1N |
| -5.630000 | NC(N)=NC(=O)c1nc(Cl)c(N)nc1N |
| -4.810000 | NC1[C@H]2CN(c3nc4c(cc3F)c(=O)c(C(=O)O)cn4-c3ccc(F)cc3F)C[C@@H]12 |
| -4.520000 | NC1[C@H]2CN(c3nc4c(cc3F)c(=O)c(C(=O)O)cn4-c3ccc(F)cc3F)C[C@@H]12 |
| -4.870000 | NCCc1ccc(O)c(O)c1 |
| -5.030000 | NCCc1ccc(O)c(O)c1 |
| -5.821955 | NC@@HC(=O)O |
| -6.050000 | NC@@HC(=O)O |
| -4.678971 | NC@@HC(=O)O |
| -5.000000 | NC@@HC(=O)O |
| -5.000000 | Nc1ccc(S(=O)(=O)Nc2ccccn2)cc1 |
| -4.670000 | Nc1ccc(S(=O)(=O)Nc2ccccn2)cc1 |
| -4.667561 | Nc1ccc(S(=O)(=O)Nc2ccccn2)cc1 |
| -6.700000 | O=C(/C=C/c1ccc(O)c(O)c1)OC(Cc1ccc(O)c(O)c1)C(=O)O |
| -6.280749 | O=C(/C=C/c1ccc(O)c(O)c1)OC(Cc1ccc(O)c(O)c1)C(=O)O |
| -7.018865 | O=C(O[C@H]1Cc2c(O)cc(O)cc2O[C@H]1c1ccc(O)c(O)c1)c1cc(O)c(O)c(O)c1 |
| -6.840000 | O=C(O[C@H]1Cc2c(O)cc(O)cc2O[C@H]1c1ccc(O)c(O)c1)c1cc(O)c(O)c(O)c1 |
| -3.858864 | O=C1CN=C(c2ccccc2)c2cc(Cl)ccc2N1 |
| -4.017729 | O=C1CN=C(c2ccccc2)c2cc(Cl)ccc2N1 |
| -5.230000 | O=C1Cc2cc(CCN3CCN(c4nsc5ccccc45)CC3)c(Cl)cc2N1 |
| -4.910047 | O=C1Cc2cc(CCN3CCN(c4nsc5ccccc45)CC3)c(Cl)cc2N1 |
| -4.540000 | O=C1N(c2ccccc2)c2ccccc2C1(Cc1ccncc1)Cc1ccncc1 |
| -4.770000 | O=C1N(c2ccccc2)c2ccccc2C1(Cc1ccncc1)Cc1ccncc1 |
| -4.620000 | O=C1N(c2ccccc2)c2ccccc2C1(Cc1ccncc1)Cc1ccncc1 |
| -3.928269 | O=C1Nc2ccc(Cl)cc2C(c2ccccc2)=NC1O |
| -4.040958 | O=C1Nc2ccc(Cl)cc2C(c2ccccc2)=NC1O |
| -5.310000 | O=c1c(O)c(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12 |
| -6.115000 | O=c1c(O)c(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12 |
| -4.035013 | O=c1ccc2ccccc2o1 |
| -4.250000 | O=c1ccc2ccccc2o1 |
| -4.660000 | OC(Cn1cncn1)(Cn1cncn1)c1ccc(F)cc1F |
| -4.552494 | OC(Cn1cncn1)(Cn1cncn1)c1ccc(F)cc1F |
| -6.810000 | OC[C@H]1OC@@HC@HC@@H[C@H]1O |
| -6.545757 | OC[C@H]1OC@@HC@HC@@H[C@H]1O |
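A minimal sketch of the consistency filter, assuming that "difference" means the spread (maximum minus minimum) of all logPapp values recorded for one canonical SMILES. The function name `split_by_spread` is hypothetical. The spread is rounded to six decimals, the precision of the tabulated values, so that floating-point noise does not push a spread of exactly 0.1 over the threshold.

```python
from collections import defaultdict

MAX_SPREAD = 0.1  # threshold stated in the preprocessing step


def split_by_spread(rows, max_spread=MAX_SPREAD):
    """Group values by SMILES; return (kept, dropped) dicts of SMILES -> values."""
    groups = defaultdict(list)
    for smiles, log_papp in rows:
        groups[smiles].append(log_papp)

    kept, dropped = {}, {}
    for smiles, values in groups.items():
        # Assumed definition of "difference": max minus min within the group.
        spread = round(max(values) - min(values), 6)
        if spread > max_spread:
            dropped[smiles] = values
        else:
            kept[smiles] = values
    return kept, dropped


# Rows taken from the tables in this document.
rows = [
    ("CN(C)C(=N)N=C(N)N", -5.515409),
    ("CN(C)C(=N)N=C(N)N", -6.200000),
    ("CCCCc1nc(Cl)c(C=O)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1", -5.010000),
    ("CCCCc1nc(Cl)c(C=O)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1", -5.110000),
    ("c1ccc(Cn2ccnc2)cc1", -4.170000),
    ("c1ccc(Cn2ccnc2)cc1", -4.166853),
    ("CO", -4.490000),
]
kept, dropped = split_by_spread(rows)
```

Compounds with a single measurement have a spread of zero and are always kept. A compound whose two values differ by exactly 0.1 is also kept, because the rule only drops differences larger than 0.1.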
- Calculate the average \(\log P_{\text{app}}\) value for the remaining 87 compounds:
| logPapp | canonical_smiles |
|---|---|
| -4.670000 | C=CCN1CC[C@]23c4c5ccc(O)c4O[C@H]2C(=O)CC[C@@]3(O)[C@H]1C5 |
| -4.571772 | C=CCN1CC[C@]23c4c5ccc(O)c4O[C@H]2C(=O)CC[C@@]3(O)[C@H]1C5 |
| -4.229574 | CC#CC#CCCCC/C=C/C(=O)NCC(C)CC |
| -4.230000 | CC#CC#CCCCC/C=C/C(=O)NCC(C)CC |
| -4.950000 | CC(C(=O)O)c1cccc(Oc2ccccc2)c1 |
| -4.946921 | CC(C(=O)O)c1cccc(Oc2ccccc2)c1 |
| -4.320000 | CC(C)C(=O)Nc1ccc(N(O)O)c(C(F)(F)F)c1 |
| -4.323307 | CC(C)C(=O)Nc1ccc(N(O)O)c(C(F)(F)F)c1 |
| -4.343375 | CC(C)Cc1ccc(C(C)C(=O)O)cc1 |
| -4.280000 | CC(C)Cc1ccc(C(C)C(=O)O)cc1 |
| -5.679136 | CC(C)C@HC(=O)OCCOCn1cnc2c(=O)nc(N)[nH]c21 |
| -5.720000 | CC(C)C@HC(=O)OCCOCn1cnc2c(=O)nc(N)[nH]c21 |
| -5.638272 | CC(C)C@HC(=O)OCCOCn1cnc2c(=O)nc(N)[nH]c21 |
| -5.440000 | CC1CN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CCN1 |
| -5.441849 | CC1CN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CCN1 |
| -4.660771 | CC1CN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CCN1C |
| -4.660000 | CC1CN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CCN1C |
| -4.790000 | CC1COc2c(N3CCN(C)CC3)c(F)cc3c(=O)c(C(=O)O)cn1c23 |
| -4.791587 | CC1COc2c(N3CCN(C)CC3)c(F)cc3c(=O)c(C(=O)O)cn1c23 |
| -4.790000 | CCCC1O[C@@H]2C[C@H]3[C@@H]4CCC5=CC(=O)C=C[C@]5(C)[C@H]4C@@HC[C@]3(C)[C@]2(C(=O)CO)O1 |
| -4.890000 | CCCC1O[C@@H]2C[C@H]3[C@@H]4CCC5=CC(=O)C=C[C@]5(C)[C@H]4C@@HC[C@]3(C)[C@]2(C(=O)CO)O1 |
| -5.481486 | CCCCCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -5.480000 | CCCCCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.769551 | CCCCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.770000 | CCCCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.858492 | CCCCN1CCN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CC1C |
| -4.860000 | CCCCN1CCN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CC1C |
| -4.585027 | CCCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.590000 | CCCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -5.010000 | CCCCc1nc(Cl)c(C=O)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1 |
| -5.110000 | CCCCc1nc(Cl)c(C=O)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1 |
| -4.820000 | CCCN1CCN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CC1C |
| -4.819078 | CCCN1CCN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CC1C |
| -4.570000 | CCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.568636 | CCCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -5.187087 | CCCSc1ccc2nc(NC(=O)OC)[nH]c2c1 |
| -5.190000 | CCCSc1ccc2nc(NC(=O)OC)[nH]c2c1 |
| -4.163171 | CCCc1cc(=O)[nH]c(=S)[nH]1 |
| -4.170000 | CCCc1cc(=O)[nH]c(=S)[nH]1 |
| -5.480000 | CCCc1nc(CC)c(C=O)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1 |
| -5.380000 | CCCc1nc(CC)c(C=O)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1 |
| -4.760928 | CCCc1nc2c(C)cc(-c3nc4ccccc4n3C)cc2n1Cc1ccc(-c2ccccc2C(=O)O)cc1 |
| -4.821000 | CCCc1nc2c(C)cc(-c3nc4ccccc4n3C)cc2n1Cc1ccc(-c2ccccc2C(=O)O)cc1 |
| -4.341829 | CCCc1nn(C)c2c(=O)[nH]c(-c3cc(S(=O)(=O)N4CCN(C)CC4)ccc3OCC)nc12 |
| -4.318759 | CCCc1nn(C)c2c(=O)[nH]c(-c3cc(S(=O)(=O)N4CCN(C)CC4)ccc3OCC)nc12 |
| -4.651695 | CCN(CC)CCNC(=O)c1cc(Cl)c(N)cc1OC |
| -4.650000 | CCN(CC)CCNC(=O)c1cc(Cl)c(N)cc1OC |
| -4.770000 | CCN1CCN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CC1C |
| -4.772345 | CCN1CCN(c2cc3c(cc2F)c(=O)c(C(=O)O)cn3C2CC2)CC1C |
| -4.640985 | CCOC(=O)C1=C(C)NC(C)=C(C(=O)OC)C1c1cccc(Cl)c1Cl |
| -4.640000 | CCOC(=O)C1=C(C)NC(C)=C(C(=O)OC)C1c1cccc(Cl)c1Cl |
| -4.550000 | CCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.552842 | CCOc1cccc(NC(=O)OCC[NH+]2CCCC2)c1.[Cl-] |
| -4.600000 | CCn1cc(C(=O)O)c(=O)c2cc(F)c(N3CCN(C)CC3)cc21 |
| -4.602060 | CCn1cc(C(=O)O)c(=O)c2cc(F)c(N3CCN(C)CC3)cc21 |
| -4.660000 | CN(C)CCC(c1ccc(Br)cc1)c1ccccn1 |
| -4.744728 | CN(C)CCC(c1ccc(Br)cc1)c1ccccn1 |
| -4.990000 | CNC(C)C(O)c1ccccc1 |
| -4.970000 | CNC(C)C(O)c1ccccc1 |
| -4.054532 | COC(=O)C1=C(C)NC(C)=C(C(=O)OC(C)C)C1c1cccc2nonc12 |
| -4.050000 | COC(=O)C1=C(C)NC(C)=C(C(=O)OC(C)C)C1c1cccc2nonc12 |
| -4.305563 | COC1=CC(=O)CC@@H[C@]12Oc1c(Cl)c(OC)cc(OC)c1C2=O |
| -4.400000 | COC1=CC(=O)CC@@H[C@]12Oc1c(Cl)c(OC)cc(OC)c1C2=O |
| -4.980000 | COCC(=O)O[C@]1(CCN(C)CCCc2nc3ccccc3[nH]2)CCc2cc(F)ccc2[C@@H]1C(C)C |
| -4.955000 | COCC(=O)O[C@]1(CCN(C)CCCc2nc3ccccc3[nH]2)CCc2cc(F)ccc2[C@@H]1C(C)C |
| -4.630000 | COc1c(O)cc(O)c2c(=O)cc(-c3ccccc3)oc12 |
| -4.632256 | COc1c(O)cc(O)c2c(=O)cc(-c3ccccc3)oc12 |
| -6.150000 | COc1cc([C@@H]2c3cc4c(cc3C@@H[C@H]3COC(=O)[C@H]23)OCO4)cc(OC)c1O |
| -6.150000 | COc1cc([C@@H]2c3cc4c(cc3C@@H[C@H]3COC(=O)[C@H]23)OCO4)cc(OC)c1O |
| -4.842332 | COc1ccc(-c2cc(=O)c3c(O)cc(O)cc3o2)cc1 |
| -4.840000 | COc1ccc(-c2cc(=O)c3c(O)cc(O)cc3o2)cc1 |
| -4.530000 | COc1ccc(-c2cc(=O)c3c(OC)c(OC)c(OC)c(OC)c3o2)cc1 |
| -4.531567 | COc1ccc(-c2cc(=O)c3c(OC)c(OC)c(OC)c(OC)c3o2)cc1 |
| -4.530000 | COc1ccc(-c2coc3cc(O)ccc3c2=O)cc1 |
| -4.529355 | COc1ccc(-c2coc3cc(O)ccc3c2=O)cc1 |
| -4.450000 | COc1ccc(Cc2nccc3cc(OC)c(OC)cc23)cc1OC |
| -4.452964 | COc1ccc(Cc2nccc3cc(OC)c(OC)cc23)cc1OC |
| -4.260000 | COc1ccc2nc(S(=O)Cc3ncc(C)c(OC)c3C)[nH]c2c1 |
| -4.170696 | COc1ccc2nc(S(=O)Cc3ncc(C)c(OC)c3C)[nH]c2c1 |
| -4.630000 | CS(=O)(=O)Nc1cccc(Cn2cnc3ccccc32)c1 |
| -4.630784 | CS(=O)(=O)Nc1cccc(Cn2cnc3ccccc32)c1 |
| -4.899987 | CSc1ccc2ncn(/C=C/C(=O)O)c(=O)c2c1 |
| -4.985000 | CSc1ccc2ncn(/C=C/C(=O)O)c(=O)c2c1 |
| -4.519040 | C[C@@H]1CC[C@H]2C@@HC(=O)O[C@@H]3O[C@@]4(C)CC[C@@H]1[C@]32OO4 |
| -4.520000 | C[C@@H]1CC[C@H]2C@@HC(=O)O[C@@H]3O[C@@]4(C)CC[C@@H]1[C@]32OO4 |
| -4.810657 | C[C@@H]1C[C@H]2[C@@H]3CCC4=CC(=O)C=C[C@]4(C)[C@@]3(F)C@@HC[C@]2(C)[C@@]1(O)C(=O)CO |
| -4.830000 | C[C@@H]1C[C@H]2[C@@H]3CCC4=CC(=O)C=C[C@]4(C)[C@@]3(F)C@@HC[C@]2(C)[C@@]1(O)C(=O)CO |
| -4.540000 | C[C@]12CCC(=O)C=C1CC[C@@H]1[C@@H]2C@@HC[C@@]2(C)[C@H]1CC[C@]2(O)C(=O)CO |
| -4.568636 | C[C@]12CCC(=O)C=C1CC[C@@H]1[C@@H]2C@@HC[C@@]2(C)[C@H]1CC[C@]2(O)C(=O)CO |
| -4.730000 | C[C@]12CC[C@@H]3c4ccc(O)cc4CC[C@H]3[C@@H]1CC[C@@H]2O |
| -4.732296 | C[C@]12CC[C@@H]3c4ccc(O)cc4CC[C@H]3[C@@H]1CC[C@@H]2O |
| -4.280000 | Cc1ccc(-c2ncc(Cl)cc2-c2ccc(S(C)(=O)=O)cc2)cn1 |
| -4.281498 | Cc1ccc(-c2ncc(Cl)cc2-c2ccc(S(C)(=O)=O)cc2)cn1 |
| -5.075721 | Cc1ccc(C(=O)c2ccc(CC(=O)O)n2C)cc1 |
| -5.090000 | Cc1ccc(C(=O)c2ccc(CC(=O)O)n2C)cc1 |
| -4.747147 | Cc1cccc(Nc2ccccc2C(=O)O)c1C |
| -4.750000 | Cc1cccc(Nc2ccccc2C(=O)O)c1C |
| -4.505356 | Cc1ccnc2c1NC(=O)c1cccnc1N2C1CC1 |
| -4.520000 | Cc1ccnc2c1NC(=O)c1cccnc1N2C1CC1 |
| -4.400000 | Cc1ncc2n1-c1ccc(Cl)cc1C(c1ccccc1F)=NC2 |
| -4.361012 | Cc1ncc2n1-c1ccc(Cl)cc1C(c1ccccc1F)=NC2 |
| -4.580000 | Clc1cccc(Cl)c1NC1=NCCN1 |
| -4.497726 | Clc1cccc(Cl)c1NC1=NCCN1 |
| -4.140000 | N#Cc1cccc(Cn2cccn2)c1 |
| -4.143271 | N#Cc1cccc(Cn2cccn2)c1 |
| -4.168770 | N#Cc1cccc(Cn2ccnc2)c1 |
| -4.170000 | N#Cc1cccc(Cn2ccnc2)c1 |
| -4.181774 | N#Cc1cccc(Cn2cnc3ccccc32)c1 |
| -4.180000 | N#Cc1cccc(Cn2cnc3ccccc32)c1 |
| -4.995678 | N=c1nc(N2CCCCC2)cc(N)n1O |
| -5.000000 | N=c1nc(N2CCCCC2)cc(N)n1O |
| -5.164309 | NC(=O)c1cccc(Cn2ccnc2)c1 |
| -5.160000 | NC(=O)c1cccc(Cn2ccnc2)c1 |
| -4.600000 | NC(=O)c1cccc(Cn2cnc3ccccc32)c1 |
| -4.600326 | NC(=O)c1cccc(Cn2cnc3ccccc32)c1 |
| -4.820000 | NC(N)=NCC1COc2ccccc2O1 |
| -4.819587 | NC(N)=NCC1COc2ccccc2O1 |
| -4.550000 | NCc1cccc(Cn2cccn2)c1 |
| -4.546682 | NCc1cccc(Cn2cccn2)c1 |
| -4.559091 | NCc1cccc(Cn2cnc3ccccc32)c1 |
| -4.560000 | NCc1cccc(Cn2cnc3ccccc32)c1 |
| -4.630784 | NS(=O)(=O)c1ccc(Cn2cnc3ccccc32)cc1 |
| -4.630000 | NS(=O)(=O)c1ccc(Cn2cnc3ccccc32)cc1 |
| -4.241088 | Nc1cccc(Cn2cnc3ccccc32)c1 |
| -4.240000 | Nc1cccc(Cn2cnc3ccccc32)c1 |
| -4.739965 | O=C(CCc1ccc(O)cc1)c1ccc(O)cc1O |
| -4.740000 | O=C(CCc1ccc(O)cc1)c1ccc(O)cc1O |
| -4.416802 | O=C(NCC1CCCCN1)c1cc(OCC(F)(F)F)ccc1OCC(F)(F)F |
| -4.420000 | O=C(NCC1CCCCN1)c1cc(OCC(F)(F)F)ccc1OCC(F)(F)F |
| -5.539102 | O=C(O)Cc1ccc(Cn2cccn2)cc1 |
| -5.540000 | O=C(O)Cc1ccc(Cn2cccn2)cc1 |
| -6.770000 | O=C(O)[C@H]1OC@@HC@HC@@H[C@@H]1O |
| -6.769776 | O=C(O)[C@H]1OC@@HC@HC@@H[C@@H]1O |
| -4.740000 | O=C(O)c1cc(-c2ccc(F)cc2F)ccc1O |
| -4.735182 | O=C(O)c1cc(-c2ccc(F)cc2F)ccc1O |
| -5.270000 | O=C(O)c1ccc(Cc2ncccn2)cc1 |
| -5.268411 | O=C(O)c1ccc(Cc2ncccn2)cc1 |
| -5.440000 | O=C(O)c1ccc(Cn2cccn2)cc1 |
| -5.436519 | O=C(O)c1ccc(Cn2cccn2)cc1 |
| -5.500000 | O=C(O)c1ccc(Cn2cnc3ccccc32)cc1 |
| -5.496209 | O=C(O)c1ccc(Cn2cnc3ccccc32)cc1 |
| -5.214670 | O=C(O)c1cccc(Cc2ncccn2)c1 |
| -5.210000 | O=C(O)c1cccc(Cc2ncccn2)c1 |
| -5.533133 | O=C(O)c1cccc(Cn2cccn2)c1 |
| -5.530000 | O=C(O)c1cccc(Cn2cccn2)c1 |
| -4.738855 | O=C(O)c1ccccc1O |
| -4.820000 | O=C(O)c1ccccc1O |
| -5.315940 | O=C(O)c1cn(C2CC2)c2cc(N3CCNCC3)c(F)cc2c1=O |
| -5.410000 | O=C(O)c1cn(C2CC2)c2cc(N3CCNCC3)c(F)cc2c1=O |
| -5.370000 | O=C(c1ccccc1)c1ccc2n1CCC2C(=O)O |
| -5.366531 | O=C(c1ccccc1)c1ccc2n1CCC2C(=O)O |
| -6.590000 | O=C1CC@@HOc2cc(O)ccc21 |
| -6.587513 | O=C1CC@@HOc2cc(O)ccc21 |
| -4.530000 | O=C1NC(=O)C(c2ccccc2)(c2ccccc2)N1 |
| -4.529394 | O=C1NC(=O)C(c2ccccc2)(c2ccccc2)N1 |
| -5.420217 | O=c1[nH]c2ccccc2n1CCCN1CCC(n2c(=O)[nH]c3cc(Cl)ccc32)CC1 |
| -5.410000 | O=c1[nH]c2ccccc2n1CCCN1CCC(n2c(=O)[nH]c3cc(Cl)ccc32)CC1 |
| -4.570000 | O=c1c(-c2ccc(O)cc2)coc2cc(O)cc(O)c12 |
| -4.650000 | O=c1c(-c2ccc(O)cc2)coc2cc(O)cc(O)c12 |
| -4.678890 | O=c1c(-c2ccc(O)cc2)coc2cc(O)ccc12 |
| -4.680000 | O=c1c(-c2ccc(O)cc2)coc2cc(O)ccc12 |
| -6.208804 | O=c1c(O)c(-c2ccc(O)cc2O)oc2cc(O)cc(O)c12 |
| -6.210000 | O=c1c(O)c(-c2ccc(O)cc2O)oc2cc(O)cc(O)c12 |
| -5.111191 | O=c1c(O)c(-c2ccccc2)oc2cc(O)cc(O)c12 |
| -5.110000 | O=c1c(O)c(-c2ccccc2)oc2cc(O)cc(O)c12 |
| -6.370000 | O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O[C@@H]3OC@HC@@HC@H[C@H]3O)cc(O)c12 |
| -6.368266 | O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O[C@@H]3OC@HC@@HC@H[C@H]3O)cc(O)c12 |
| -5.200000 | O=c1cc(-c2ccc(O)cc2)oc2cc(O)c(O)c(O)c12 |
| -5.240000 | O=c1cc(-c2ccc(O)cc2)oc2cc(O)c(O)c(O)c12 |
| -5.138636 | O=c1cc(-c2ccccc2)oc2cc(O)c(O)c(O)c12 |
| -5.140000 | O=c1cc(-c2ccccc2)oc2cc(O)c(O)c(O)c12 |
| -4.170000 | c1ccc(Cn2ccnc2)cc1 |
| -4.166853 | c1ccc(Cn2ccnc2)cc1 |
| -4.148130 | c1ccc(Cn2cnc3ccccc32)cc1 |
| -4.150000 | c1ccc(Cn2cnc3ccccc32)cc1 |
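The averaging step can be sketched as follows, assuming a plain arithmetic mean of the logPapp values recorded for each canonical SMILES (the source does not specify any weighting). The helper name `average_replicates` is hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def average_replicates(rows):
    """Collapse repeated measurements to one mean logPapp per SMILES."""
    groups = defaultdict(list)
    for smiles, log_papp in rows:
        groups[smiles].append(log_papp)
    return {smiles: fmean(values) for smiles, values in groups.items()}


# Rows taken from the table above.
rows = [
    ("CNC(C)C(O)c1ccccc1", -4.990000),
    ("CNC(C)C(O)c1ccccc1", -4.970000),
    ("Nc1cccc(Cn2cnc3ccccc32)c1", -4.241088),
    ("Nc1cccc(Cn2cnc3ccccc32)c1", -4.240000),
]
averaged = average_replicates(rows)
```

Each compound ends up with exactly one row. For the first compound above, for example, the mean of -4.99 and -4.97 is approximately -4.98.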
- Exclude one compound with fewer than four heavy (non-hydrogen) atoms.
| logPapp | canonical_smiles |
|---|---|
| -4.490000 | CO |
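The two RDKit-dependent steps, SMILES canonicalization and the size filter, might look like the sketch below. It reads the size cutoff as a count of heavy (non-hydrogen) atoms, which is an assumption: it is consistent with methanol (`CO`, two heavy atoms) being the excluded compound, but the source does not state it. The helper names `canonicalize` and `is_large_enough` and the constant `MIN_HEAVY_ATOMS` are hypothetical.

```python
from rdkit import Chem

MIN_HEAVY_ATOMS = 4  # assumed reading of the size cutoff


def canonicalize(smiles):
    """Return RDKit's canonical SMILES, or None if the input cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToSmiles(mol)


def is_large_enough(smiles, min_heavy_atoms=MIN_HEAVY_ATOMS):
    """True if the molecule parses and has at least `min_heavy_atoms` heavy atoms."""
    mol = Chem.MolFromSmiles(smiles)
    return mol is not None and mol.GetNumHeavyAtoms() >= min_heavy_atoms


# Two spellings of methanol map to the same canonical string.
methanol_a = canonicalize("OC")
methanol_b = canonicalize("CO")

# Methanol is filtered out; a larger compound from the raw table passes.
methanol_passes = is_large_enough("CO")
propofol_like_passes = is_large_enough("Oc1c(cccc1C(C)C)C(C)C")
```

Canonicalizing before grouping matters because the same compound can appear under several SMILES spellings in the raw data. Grouping on the raw strings would miss those repeated measurements.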
Reference
- N.-N. Wang, J. Dong, Y.-H. Deng, M.-F. Zhu, M. Wen, Z.-J. Yao, A.-P. Lu, J.-B. Wang, and D.-S. Cao, ADME properties evaluation in drug discovery: prediction of Caco-2 cell permeability using a combination of NSGA-II and boosting, Journal of Chemical Information and Modeling 56, 763 (2016).