From OpenAI’s GPT series to Google’s Gemini to a range of open-source models, advanced artificial intelligence is profoundly reshaping how we work and live. Yet alongside this rapid progress, a troubling dark side is emerging: the rise of unrestricted, malicious large language models (LLMs).
An unrestricted LLM is a language model deliberately designed, modified, or “jailbroken” to bypass the built-in safety mechanisms and ethical constraints of mainstream models. Mainstream LLM developers typically invest heavily in preventing their models from generating hate speech, disinformation, or malicious code, or from providing instructions for illegal activity. In recent years, however, some individuals and organizations have begun seeking out or building unrestricted models for purposes such as cybercrime. This article reviews the typical unrestricted LLM tools, examines how they are abused in the crypto industry, and discusses the resulting security challenges and responses.
Tasks that once required professional skills, such as writing malicious code, crafting phishing emails, and planning scams, can now be carried out by ordinary people with no programming experience, assisted by unrestricted LLMs. An attacker only needs the weights and source code of an open-source model, then fine-tunes it on a dataset of malicious content, biased statements, or illegal instructions to produce a customized attack tool.
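To make concrete how low this technical barrier is, here is a minimal sketch of the kind of commodity fine-tuning pipeline described above, using the open-source Hugging Face transformers, datasets, and peft libraries. The base model name and the dataset file are illustrative placeholders rather than any real attacker's toolchain; the point is that the code itself is generic tutorial-grade material, and all of the customization lives in the data.

```python
# Minimal sketch: parameter-efficient fine-tuning of an open-weight model.
# BASE and the dataset path are illustrative placeholders; the pipeline is
# generic, which is exactly why the barrier to customization is so low.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "EleutherAI/gpt-j-6b"  # e.g. an open-weight base such as GPT-J 6B
tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE)

# LoRA trains a few million adapter weights instead of all six billion
# parameters, bringing the job within reach of a single consumer GPU.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM"))

# Any JSONL file of {"text": ...} records; the dataset, not the code, is
# where the customization (benign or otherwise) lives.
data = load_dataset("json", data_files="custom_corpus.jsonl")["train"]
data = data.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="tuned-model", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

Nothing here is exotic; every line comes from publicly documented tutorials, which is why the distribution of open weights makes after-the-fact control so difficult.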
This pattern gives rise to multiple hazards: attackers can tailor models to specific targets to generate more deceptive content that slips past the content moderation and safety restrictions of conventional LLMs; the models can rapidly churn out code variants for phishing websites or scam copy customized to each social media platform; and the accessibility and modifiability of open-source models keep feeding an underground AI ecosystem, providing a breeding ground for illegal trade and development. Below is a brief introduction to several such unrestricted LLMs:
WormGPT: The Dark Version of GPT

WormGPT is a malicious LLM sold openly on underground forums; its developers explicitly advertise it as having no ethical restrictions, a dark counterpart to the GPT models. It is built on open-source models such as GPT-J 6B and trained on a large corpus of malware-related data, with access starting at $189 per month. WormGPT's most notorious use is generating highly realistic and persuasive Business Email Compromise (BEC) and phishing emails. Its typical abuses in the crypto space include:
DarkBERT: A Double-Edged Sword of Dark Web Content

DarkBERT is a language model developed by researchers at the Korea Advanced Institute of Science and Technology (KAIST) in collaboration with S2W Inc. It was pre-trained specifically on dark web data (forums, black markets, leaked information) with the intention of helping cybersecurity researchers and law enforcement agencies better understand the dark web ecosystem, track illegal activities, identify potential threats, and gather threat intelligence.
Although DarkBERT was built with good intentions, the sensitive dark web material it was trained on (data, attack methods, illegal trading strategies) could have dire consequences if malicious actors obtained the model or used similar techniques to train their own unrestricted large models. Its potential misuse in crypto scenarios includes:
FraudGPT: The Swiss Army Knife of Online Fraud
FraudGPT claims to be an upgraded version of WormGPT with a more comprehensive feature set. It is sold mainly on the dark web and hacker forums, with monthly fees ranging from $200 to $1,700. Its typical abuses in the crypto space include:
GhostGPT: An AI Chatbot Without Ethical Constraints

GhostGPT is an AI chatbot explicitly marketed as having no ethical constraints. Its typical abuses in the crypto space include:
Venice.ai: Risks of Uncensored Access

Venice.ai provides access to a variety of LLMs, including some that are lightly moderated or loosely restricted. It positions itself as an open gateway for exploring the capabilities of different LLMs, promising cutting-edge, accurate, and unmoderated models for a truly unrestricted AI experience, but that same openness can be exploited by malicious actors to generate harmful content. The platform's risks include:
The emergence of unrestricted LLMs marks a new attack paradigm for cybersecurity: more complex, more scalable, and more automated. These models not only lower the barrier to attack but also introduce threats that are more covert and more deceptive.
In this ongoing contest between offense and defense, all parties in the security ecosystem must work together to manage the risks ahead. On one hand, investment in detection technology should increase, to build systems that can identify and intercept the phishing content, smart contract exploit code, and other malicious code that malicious LLMs generate. On the other hand, models' resistance to jailbreaking should be strengthened, and watermarking and provenance mechanisms explored, so that malicious content can be traced to its source in critical scenarios such as finance and code generation. Finally, sound ethical frameworks and regulatory mechanisms must be established to restrict the development and misuse of malicious models at the root.
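On the detection side, even a shallow text classifier illustrates the shape of that first line of defense. The sketch below is a toy baseline, with invented inline examples, that scores messages for phishing likelihood using TF-IDF features and logistic regression from scikit-learn; a production system would train on large labeled corpora and combine such scores with URL reputation, on-chain heuristics, and human review.

```python
# Toy baseline for flagging LLM-generated phishing copy: TF-IDF features
# plus logistic regression. The inline training examples are invented for
# illustration; a real system would train on a large labeled corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Urgent: validate your wallet within 24 hours or funds will be frozen",
    "Claim your exclusive airdrop now by entering your seed phrase",
    "Reminder: our team meeting moves to 3pm tomorrow",
    "Your monthly account statement is now available in the dashboard",
]
labels = [1, 1, 0, 0]  # 1 = phishing, 0 = legitimate

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

# Score an unseen message; higher means more phishing-like.
print(clf.predict_proba(
    ["Connect your wallet immediately to avoid losing your airdrop"])[:, 1])
```

Even this shallow baseline shows why defenders need scale: the features that betray scam copy shift with every new model and campaign, so detection corpora must be refreshed as quickly as attackers retrain.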