Template-type: ReDIF-Paper 1.0
Author-Name: Manera, Maria
Author-Email: maria.manera@unito.it
Author-Workplace-Name: Department of Economics and Statistics Cognetti de Martiis, University of Turin, Italy
Author-Workplace-Homepage: https://www.est.unito.it
Author-Name: Fusillo, Fabrizio
Author-Email: fabrizio.fusillo@unito.it
Author-Workplace-Name: Department of Economics and Statistics Cognetti de Martiis, University of Turin, Italy
Author-Workplace-Homepage: https://www.est.unito.it
Author-Name: Orsatti, Gianluca
Author-Email: gianluca.orsatti@unito.it
Author-Workplace-Name: Department of Economics and Statistics Cognetti de Martiis, University of Turin, Italy
Author-Workplace-Homepage: https://www.est.unito.it
Author-Name: Quatraro, Francesco
Author-Email: francesco.quatraro@unito.it
Author-Workplace-Name: Department of Economics and Statistics Cognetti de Martiis, University of Turin, Italy
Author-Workplace-Homepage: https://www.est.unito.it
Title: Addressing the identification of Critical Raw Material Patents Using Pretrained and Large Language Models
Abstract: In modern technologies, critical raw materials (CRMs) have gained attention due to supply chain risks, environmental concerns, and their essential role in industries such as renewable energy, electric vehicles, and advanced electronics. However, identifying and classifying CRM-related patents, and thus technologies, remains challenging due to the lack of specific classification systems. Traditional approaches, such as keyword-based searches and Cooperative Patent Classification (CPC) and International Patent Classification (IPC) codes, suffer from inaccuracies due to evolving terminology, ambiguous context, and an inability to recognize alternative material usage. This study proposes a novel methodology leveraging advanced natural language processing (NLP) tools to overcome these limitations. Our approach addresses two key objectives: (1) distinguishing between substitutable and non-substitutable CRMs in patent abstracts through the GPT-3.5-turbo-16k model and (2) identifying CRM-related patents via a fine-tuned BERT for Patents model. Our findings reveal distinct geographical, technological, and temporal patterns in CRM-related innovation, emphasizing the significance of NLP techniques in overcoming traditional classification challenges. This research offers policymakers and industry stakeholders valuable insights into CRM innovation trends, supporting strategic decision-making for sustainable resource management.
Length: 49 pages
Creation-Date: 2025-05
File-URL: https://www.est.unito.it/do/home.pl/Download?doc=/allegati/wp2025dip/wp_05_2025.pdf
File-Format: Application/PDF
Handle: RePEc:uto:dipeco:202505