Generative AI Adding to the Risks for Embedded Credentials

by Sue Poremba on August 4, 2023

Everyone wants to make their work processes easier. A step that many developers take to provide simpler access to their workflows is to embed credentials directly into the code.

It’s not hard to understand why developers do this. As more applications become cloud-native and need to authenticate to more third-party APIs, they need more credentials. The easiest way for a developer to have an app authenticate to a third party is to put the credentials in the code or in a configuration file alongside the code, explained Chris Anley, chief scientist at security consultancy NCC Group, in a formal statement.

“But this means that anyone who has the code has the keys,” said Anley. That can include the development team, contractors and anyone else with access to the backups and servers where copies of the code, and the credentials in it, end up.

“Rather than holding the key in the source code, the right way is to use a credential vault designed for this purpose and fetch the key from the vault,” Anley added. Embedding credentials, by contrast, scatters access across copies of the code in a way that is extremely hard to detect and manage, and it could open the organization to security risks.
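As a rough illustration of the approach Anley describes, the Python sketch below looks up a key at runtime instead of writing it into the source. It is a minimal sketch under stated assumptions, not a vault integration: the function `fetch_secret`, the exception `MissingSecretError` and the secret name `PAYMENTS_API_KEY` are hypothetical, and the process environment stands in for whatever store a real vault client would query.

```python
import os
from typing import Mapping, Optional


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not available at runtime."""


def fetch_secret(name: str, store: Optional[Mapping[str, str]] = None) -> str:
    """Return the secret called `name` from `store`.

    `store` is a stand-in for a vault client (an assumption for this sketch).
    When it is omitted, the process environment is used instead.
    """
    source = os.environ if store is None else store
    value = source.get(name)
    if not value:
        # Fail loudly rather than falling back to a default baked into the code.
        raise MissingSecretError(f"secret {name!r} is not set")
    return value


if __name__ == "__main__":
    # PAYMENTS_API_KEY is a made-up name used only for illustration.
    try:
        key = fetch_secret("PAYMENTS_API_KEY")
        print("key loaded, length", len(key))
    except MissingSecretError as err:
        print("cannot start:", err)
```

The point of the design is that the source file contains only the name of the secret, so a copy of the code in a backup or repository no longer carries the key itself.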

Why Embedded Passwords Are a Bad Idea

Threat actors love getting their hands on access credentials. According to the 2023 Verizon Data Breach Investigations Report (DBIR), 86% of breaches in the basic web application attacks category involved stolen credentials. The easier credentials are to steal, the better for attackers. It’s why phishing has remained such a popular attack vector.

The problem with code-embedded credentials (most often passwords, though developers embed other types as well) is that they are usually added to source code as plain text, and anyone with access to the file can read plain text. If a threat actor gets into your system through another entrance, the plain-text credentials in the source code give them what they need to access your applications, your network and your data.

And yet, this practice remains popular with developers because it saves time when they are building new software applications. “The code and scripts developers write often need credentials and secrets, such as SSH keys and API tokens, to access cloud resources and interact with other apps and tools,” Justyna Kucharczak wrote for CyberArk. The practice is risky, Kucharczak explained, because code is so often exchanged in repositories like GitHub. “Each exchange can inadvertently expose credentials, including hard-coded credentials, to potential attackers and, sometimes, the entire public,” Kucharczak said.
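One way to catch plain-text credentials before code is exchanged is to scan it for telltale patterns. The Python sketch below is illustrative only: the function `find_hardcoded_secrets` and the two patterns in `PATTERNS` are assumptions made for this example, they will miss many real secrets and can flag harmless strings, and they are not a substitute for a dedicated scanning tool.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
PATTERNS = {
    # A name containing password/secret/api_key/token assigned a quoted literal
    # of at least eight characters.
    "quoted secret assignment": re.compile(
        r"""(?i)(password|passwd|secret|api_?key|token)\w*\s*[:=]\s*["'][^"']{8,}["']"""
    ),
    # The header line of a PEM-encoded private key.
    "private key block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def find_hardcoded_secrets(source: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern label) for each line that looks like a secret."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


if __name__ == "__main__":
    sample = (
        'user = "svc"\n'
        'DB_PASSWORD = "hunter2hunter2"\n'
        'password = os.environ["DB_PASSWORD"]\n'
    )
    for line_no, label in find_hardcoded_secrets(sample):
        print(f"line {line_no}: possible {label}")
```

In the sample, only the second line assigns a quoted literal to a secret-like name, so it is the only line flagged; the third line reads the value from the environment and passes. A check like this could run before each commit, so that a hard-coded value is caught before it reaches a shared repository.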

The Generative AI Threat

Developers are using generative AI to write code, which means AI tools could be embedding credentials in the code they generate. At the same time, the machine learning models behind those tools could be unwittingly sharing credentials that appeared in their training data.

“When these models encounter repeated patterns across the training data—like hardcoded secrets in code snippets—they learn to replicate those patterns,” Lotem Guy wrote for SC Magazine.

That type of credential sharing is part of the risk of using generative AI. Developers can keep credentials safe by pairing a secrets manager with an identity and access management (IAM) system. A secrets manager can also protect credentials from threat actors who use generative AI to steal credentials embedded in source code. Recently, cybercriminals were “advertising stolen OpenAI API tokens that have been scraped from other peoples’ code,” according to a Vice article. The threat actors targeted a website where developers collaborated on coding projects and where credentials were left vulnerable.

Credentials are the keys to the kingdom, and they’ve always been at risk when embedded into source code. As AI becomes a more common development tool, it is more important than ever to deploy an IAM system that offers secrets management and keeps credentials from being exposed.