AZ(AstraZeneca)-lipo

Dataset: Download it here.

Dataset description: 4,195 compounds with measured \(\log D_{7.4}\) values, i.e. the octanol-water distribution coefficient at pH 7.4, determined by a shake-flask method and deposited in ChEMBL by AstraZeneca.

Dataset preprocessing

The raw data contain duplicate measurements for three compounds. The first two are:

logD7.4 canonical_smiles
3.04 CN1CCC[C@@H]1CCOC@(c1ccccc1)c1ccc(Cl)cc1
3.48 CN1CCC[C@@H]1CCOC@(c1ccccc1)c1ccc(Cl)cc1
-0.66 CN1[C@@H]2CC[C@H]1CC@@HC2
-0.09 CN1[C@@H]2CC[C@H]1CC@@HC2
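
Rows that share a SMILES string, like those in the table above, can be located by grouping on the SMILES column. The following is only a minimal sketch using the Python standard library, not the procedure actually used for this dataset. The function name `find_duplicates`, the placeholder keys `SMILES_1` to `SMILES_3`, and the single non-duplicated row are assumptions made for illustration; the four duplicated logD values are taken from the table. It also assumes the SMILES strings are already canonical, so identical compounds have identical strings.

```python
from collections import defaultdict

def find_duplicates(rows):
    """Return {smiles: [logD values]} for SMILES strings that occur more than once."""
    grouped = defaultdict(list)
    for logd, smiles in rows:
        grouped[smiles].append(logd)
    return {s: v for s, v in grouped.items() if len(v) > 1}

# Placeholder keys stand in for real SMILES strings.
# The last row is an invented non-duplicate, added only to show it is filtered out.
rows = [
    (3.04, "SMILES_1"), (3.48, "SMILES_1"),
    (-0.66, "SMILES_2"), (-0.09, "SMILES_2"),
    (1.00, "SMILES_3"),
]

duplicates = find_duplicates(rows)

# Spread between the largest and smallest measurement for each duplicated compound.
spreads = {s: max(v) - min(v) for s, v in duplicates.items()}
```

Here `spreads` gives a quick view of how far apart the repeated measurements are for each duplicated compound.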

For the third compound, we take the average of the two measurements:

logD7.4 canonical_smiles
2.16 C=C[C@H]1CN2CC[C@H]1C[C@H]2C@Hc1ccnc2ccc(OC)cc12
2.26 C=C[C@H]1CN2CC[C@H]1C[C@H]2C@Hc1ccnc2ccc(OC)cc12
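
Averaging repeated measurements of one compound can be sketched as below. This is an illustrative sketch under stated assumptions, not the original preprocessing script: the function name `average_duplicates`, the placeholder keys `SMILES_A` and `SMILES_B`, and the value 1.50 are hypothetical, while 2.16 and 2.26 are the two measurements from the table above.

```python
from collections import defaultdict
from statistics import mean

def average_duplicates(rows):
    """Collapse (logD, smiles) rows that share a SMILES string into their mean logD."""
    grouped = defaultdict(list)
    for logd, smiles in rows:
        grouped[smiles].append(logd)
    return {smiles: mean(values) for smiles, values in grouped.items()}

# Placeholder keys stand in for real SMILES strings; the last row is invented.
rows = [(2.16, "SMILES_A"), (2.26, "SMILES_A"), (1.50, "SMILES_B")]

averaged = average_duplicates(rows)
```

After this step each SMILES string maps to a single value; for the duplicated compound that value is the mean of 2.16 and 2.26, i.e. about 2.21.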

Reference

  1. M. Wenlock and N. Tomkinson, Experimental in vitro DMPK and physicochemical data on a set of publicly disclosed compounds.