Which molecular generation method is best for large molecular generations?

Finding a good lead molecule is an important task in drug discovery. Recently, several deep graph generative models have been developed to generate novel molecules that can then be tested for synthesizability in the drug development process. Most of these models are trained on small molecules with a maximum length of thirty. However, there is also a need to generate larger molecules. We tested six recently proposed graph neural network-based molecular generation methods on their ability to generate large molecules, using two datasets from the LigandBox database that contain larger molecules than the commonly used ZINC250k and QM9 datasets. We evaluated the quality of the generated molecules with twelve measures, including stability measures such as logP and QED.
This work was supported in part by the National Research Foundation of Korea grant funded by the Korean government (2018R1A5A1060031). (Corresponding author: Lee Sael.)