DeepInnovationAI is a comprehensive global cross-modal dataset that integrates AI-related academic papers and patents, using BERT-based classification and hypergraph-based novelty quantification. For the method used to calculate novelty, see https://github.com/jameswweis/delphi
The DeepInnovationAI dataset is publicly available on Figshare (permanent access link: https://doi.org/10.6084/m9.figshare.28578947). The dataset features a modular structure with three core files: patent data (DeepPatentAI.csv), academic papers (DeepDiveAI.csv), and paper-patent similarity (DeepCosineAI.csv).
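As a quick orientation after downloading, the following minimal sketch reports the column names and row count of each of the three files. It uses only the Python standard library. The file names come from the description above; the helper name `summarize_dataset`, the local directory argument, the UTF-8 encoding, and the presence of a header row are assumptions for illustration and are not confirmed by the dataset documentation.

```python
import csv
from pathlib import Path

# File names as listed in the dataset description.
DATASET_FILES = {
    "patents": "DeepPatentAI.csv",
    "papers": "DeepDiveAI.csv",
    "similarity": "DeepCosineAI.csv",
}


def summarize_dataset(data_dir):
    """Return {label: (column_names, row_count)} for each file found in data_dir.

    Hypothetical helper: assumes UTF-8 text and that the first row is a header.
    Files that are missing from data_dir are skipped.
    """
    summary = {}
    for label, filename in DATASET_FILES.items():
        path = Path(data_dir) / filename
        if not path.exists():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])           # first row, or [] if the file is empty
            row_count = sum(1 for _ in reader)  # stream rows without loading them all
        summary[label] = (header, row_count)
    return summary


if __name__ == "__main__":
    # "." is a placeholder for wherever the Figshare files were saved.
    for label, (columns, rows) in summarize_dataset(".").items():
        print(f"{label}: {rows} rows, columns = {columns}")
```

Because the files are read as a stream, the sketch does not need to hold a whole file in memory. Inspecting the printed column names first is a sensible step before writing any analysis that depends on specific fields.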