CLiDE Pro is the latest incarnation of software to emerge from the long-term CLiDE (Chemical Literature Data Extraction) project. Chemical OCR involves three main problems: (a) identification of chemical images within a document, (b) compilation of chemical graphs of individual molecules from chemical images, and (c) interpretation of complex objects such as generic molecules and reaction schemes using the retrieved chemical graphs. The structure recognition methods implemented in CLiDE Pro will be presented. Structure features which frequently cause problems such as crossing bonds, lines found in various chemical entities such as single bonds attached to triple bonds, dashed bonds and parts of atom labels commonly misclassified as lines (e.g. I and Cl) will be discussed together with our solutions to these problems. A key component of the presentation will be CLiDE Pro’s approach to the interpretation of generic structures.
Get in touch
You need a high-performing, reliable and easy-to-use software solution to speed up your next big scientific breakthrough. Getting the right solution is integral to advance your research and workflow.