Scientific R&D organizations often have a hard time wringing out the full use of their data. Some data spill out in instrument readouts, legacy platforms, electronic laboratory notebooks, or even on paper. “In fact, we had one company we worked with where we digitized information for them, and they found, I think it was, 50–75% of the content had been duplicated,” says Jennifer Sexton, director of custom services for CAS, a division of the American Chemical Society specializing in scientific knowledge management. “So they had actually redone the experiments because they didn’t know they had done them.” Other companies had “storage sheds full of these lab notebooks,” she says.

And the data aren’t always neatly linked. As a curator of chemical knowledge, CAS recognized this need for cohesion and the consequence of its absence: a lack of artificial intelligence readiness. Fragmented data ingested by machine learning tools yield inaccurate output and wasted investment. To provide chemical and pharmaceutical organizations with a platform for data harmonization, CAS built the CAS Intelligence Hub, which launched in January.

The cloud-based Hub prepares companies’ proprietary data for AI ingestion. It can also combine company data with CAS reference data to enhance model accuracy. Essentially, CAS applies some of its own information-structuring tools to customer data, “combining our decades of curation expertise with secure infrastructure that accelerates discovery and enables confident AI adoption,” CAS president Manuel Guzman said in a press release.

“Content management and data management are something we’ve been doing for a very long time. You could argue it’s
Sydney Smith (Mon,) studied this question.