Why AI for biological design should be regulated differently than chatbots

By Matthew E. Walsh | September 1, 2023


MIT researchers recently contrived a scenario where non-scientist students used ChatGPT to help them obtain information on how to acquire DNA that could make pathogens with pandemic potential. These undergraduate students reportedly had limited biological know-how. But by using the chatbot, they were able to gain the knowledge to create dangerous material in the lab and evade biosecurity measures. This experiment drew attention to the impacts of artificial intelligence tools on the biothreat landscape and to how such applications contribute to global catastrophic biological risks.

In recent weeks, scholars, the US policy community, and the public have been discussing the biosecurity implications and governance of AI-based tools. The White House recently released a fact sheet detailing the security measures that top large language model-based chatbot developers have voluntarily committed to—including internal and external security testing to guard against AI-based biosecurity risks. In mid-July, Sen. Edward J. Markey (D-Mass.) introduced legislation, the Artificial Intelligence and Biosecurity Risk Assessment Act, that, if enacted, would require the Assistant Secretary for Preparedness and Response to research how artificial intelligence tools could be used to generate biological weapons. And groups have published reports detailing recommendations to establish effective governance over artificial intelligence, such as the Helena Project report, Biosecurity in the Age of AI.

In an effort to include all the ways in which artificial intelligence tools influence the biothreat landscape, policy conversations often group together general-purpose chatbots with biology-specific, AI bio-design tools. Understanding how each category of AI tools works, what its capabilities and limitations are, and where it is in its commercial development is important to establish effective governance. But it is most critical to recognize that large language model-based chatbots and bio-design tools influence the biosecurity landscape in vastly different ways. Their governance, therefore, should be considered and developed independently.

Large language model-powered chatbots. These chatbots are a combination of a large language model and a user interface. Language models ingest vast amounts of data, typically text in human languages, which practitioners call “natural language.” Training these models consumes tremendous amounts of computational resources and time, often months. Through this process, the large language model, which commonly contains hundreds of billions of parameters, learns the structure, or grammar, of the language in the data. A user interface can be overlaid on the model, which then results in an easy-to-use AI tool, such as ChatGPT, Bard, or Claude. Based on the information in their training data, these tools respond to user queries with human-like responses. Because the training data is often scraped from the internet, the breadth of responses from these chatbots is vast and can range from restaurant recommendations to error fixes in programming code.

Large language model-based bio-design tools. These applications serve much more specific purposes than chatbots: They are built to help complete biological engineering tasks with varying levels of specificity. Recently developed large language model-based bio-design tools leverage the same methodology that chatbots use and are viewed as a promising application of the method. Instead of training the large language model on natural language, a bio-specific large language model is trained on the amino acid sequences of proteins or other biological sequences. This results in the application outputting biological sequences, instead of natural language.
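
To make the parallel concrete, here is a deliberately tiny sketch, not how any production tool is built. It swaps the large neural network for a simple next-token count (a bigram model) and applies the same code to words and to single-letter amino acid codes. The function names and both toy corpora are made up for illustration; the protein-like strings are arbitrary and do not correspond to real proteins.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count how often each token follows another across the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Natural language: each token is a word (toy corpus, made up).
text_corpus = [
    "the model reads the text".split(),
    "the model learns the grammar".split(),
]

# Biological sequence: each token is a one-letter amino acid code.
# These strings are arbitrary illustrations, not real proteins.
protein_corpus = ["MKTAYIAK", "MKTLLVAK", "MKTAYLAK"]

text_model = train_bigram(text_corpus)
protein_model = train_bigram(protein_corpus)

print(most_likely_next(text_model, "the"))    # a word comes out
print(most_likely_next(protein_model, "M"))   # an amino acid letter comes out
```

The training code is identical in both cases; only the alphabet of the training data changes, and with it the kind of output the model produces.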

These tools can learn the favorable properties of a biomolecule and make suggestions on promising options to test in the laboratory, decreasing the number of options that need to be tested before finding one with desirable properties. For example, the tool known as UniRep helps researchers engineer proteins based on their function, while ESMFold enables engineering based on structure. Both tools could be used, for example, to help design better therapies faster and to engineer proteins in organisms to improve the efficiency of biomanufacturing.
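
The idea of narrowing down what goes to the laboratory can be sketched as a simple ranking step. This is an illustration under stated assumptions only: `placeholder_score` is a hypothetical stand-in for a trained model's prediction (it just counts two arbitrary letters and predicts nothing real), and the candidate strings are made up.

```python
def placeholder_score(sequence):
    """Hypothetical stand-in for a trained model's predicted property score.

    NOT a real predictor: it returns the fraction of residues that are
    'A' or 'L', chosen arbitrarily so the example is runnable.
    """
    return sum(1 for aa in sequence if aa in "AL") / len(sequence)


def shortlist(candidates, score_fn, k):
    """Rank candidates by predicted score (highest first) and keep the top k."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:k]


# Arbitrary toy candidates, not real proteins.
candidates = ["MKTAYIAK", "MALLLAAK", "MKTGGSSK", "MALKTAYK"]

# Only the shortlisted candidates would go on to laboratory testing.
top = shortlist(candidates, placeholder_score, k=2)
print(top)
```

In practice the scoring function is where a tool such as those named above would sit; the ranking step around it stays this simple, which is why fewer candidates need to be tested at the bench.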


In addition to protein sequences, bio-specific language models have been trained on DNA sequences and even on glycan (sugar) sequences, simultaneously expanding their potential positive and negative impacts. Unlike the chatbots, bio-design tools that are publicly available generally lack a user interface and require computer programming knowledge to access and use, although there are efforts to make them easier to use.

Impact on the threat landscape. As evidenced in the MIT demonstration, general-purpose chatbots can make it easier and quicker for people to access information that is prone to misuse. Because the output of chatbots is based on information found in their training data, these tools should currently not be considered as providing new abilities to malicious actors. For example, the students in the demonstration were asked to use ChatGPT to identify companies that were not members of the International Gene Synthesis Consortium, a group of synthetic DNA providers committed to best practices in biosecurity. The assumption was that if someone wanted to acquire harmful DNA, ordering it from a company that is not part of the consortium would be more likely to succeed than ordering it from one that is. As expected, ChatGPT was able to provide a list of companies in moments. But without ChatGPT the user could still acquire the same information by searching online for DNA synthesis providers and then cross-checking the list against the members listed on the consortium website.

Some chatbots have been engineered to not provide responses that would be prone to misuse, including biological information, but researchers have shown that these restrictions can be overcome.

Bio-design tools, however, do provide new and improved abilities to their users that could be nefariously repurposed. Currently, these bio-specific tools can engineer one property of a biomolecule at a time. These tools can be used to predict function, ranging from improved binding ability of antibody variants to improved fluorescence of a protein. They can output a long list of probable options which can then be evaluated by a user for other properties, such as amino acid sequence. This gives a knowledgeable user the ability to essentially engineer multiple properties.

One example of misuse would be to use a bio-design tool to identify protein-based toxins that are predicted to be functionally similar to known toxins but are otherwise different enough from those found in nature that traditional safeguarding measures would be ineffective.

Moving forward. When considering the governance of chatbots and bio-design tools, it is important to recognize their differences. Doing so will allow for differentiation in future governance options. In the near term, governance of chatbots should be focused on preventing users from accessing existing information prone to misuse. There are ongoing efforts throughout the AI community towards such goals, including those in the voluntary commitments from tech companies outlined by the White House and organizations such as the Responsible Artificial Intelligence Institute. When addressing biosecurity concerns related to chatbots, biosecurity professionals should help inform what types of information could be misused to cause harm. Anthropic, the company behind Claude, for example, collaborated with biosecurity experts in developing its chatbot.

In contrast, governance of bio-design tools should be focused on preventing users from generating harmful new information. Technical biosecurity measures could be promoted through community norms and codes of conduct. These measures would be aligned with existing efforts, such as the Tianjin Biosecurity Guidelines for Codes of Conduct for Scientists. In a chemistry-based scenario that parallels bio-design tools, researchers were able to slightly adjust their existing chemical-design tool to maximize the predicted toxicity of chemicals instead of minimizing it. Using this information, the researchers were able to identify chemicals predicted to be more toxic than even the most potent chemical weapons.


This scenario emphasized the relative ease with which nefarious actors could repurpose existing code that was originally built for beneficial purposes. But that does not mean AI tools must remain locked behind closed doors. Developers could overlay user interfaces, like chatbots, that allow others to use the tool as intended and without being able to make changes to the code. Practices like this should be discussed among the biosecurity community and considered for inclusion into future guidelines and codes of conduct.

Other governance measures, such as risk education and awareness raising about bio-design tools, should be pursued. However, there are currently a few challenges in actually doing this. First, work is needed to develop and implement a categorization framework of bio-design tools that will be helpful in determining appropriate governance measures. Large language model-based bio-design tools are just one type of bio-design tool. Other bio-design tools, such as AlphaFold2 and Rosetta, are not built on large language models but can have the same applications as large language model-based bio-design tools. Governance pertaining to only large language model-based bio-design tools but not other tools with similar capability would be incomplete. Additionally, bio-design tools vary in the degree of user expertise they require (in both biology and computer programming) and in the types and amounts of data they rely on, among other factors. A comprehensive framework that considers the multi-faceted landscape of bio-design tools would be very helpful in framing risk education and awareness raising initiatives.

Additionally, there is little, if any, peer-reviewed work analyzing the current impact of bio-design tools on the biothreat landscape. Bio-design tools will increase in capability over time, and there is no sufficient risk assessment framework for mid- and long-term impacts. Because there is no published work attempting to reach a consensus among experts on what the impacts of large language model-based bio-design tools are on the biological threat landscape, policy makers will find it challenging to agree on what appropriate and commensurate governance measures are.

Lastly, there are few people in the world who have expertise in the differing subject areas of AI, engineered biology, and biosecurity. This means that the most effective and comprehensive work in this space needs to come from teams of experts who have to communicate across academic disciplines.

There is also a difference in the urgency of developing governance of large language model-based bio-design tools and chatbots. Chatbots are becoming increasingly commercialized and widespread, and consequently the window for establishing governance is closing. For developed technologies, like chatbots, more stringent governance measures, such as export controls or licensing, are generally more appropriate than they would be for nascent technologies like large language model-based bio-design tools. In the emerging arena of bio-design tools, there is still time to understand their implications and to work with technology developers to ensure that future tools are built with biosecurity considerations in mind, and with whistle-blowing channels for when they are not.

If large language model-based chatbots and large language model-based bio-design tools are grouped together, it will be challenging to identify one set of governance measures that would apply to both. This could create an obstacle for the policy and scientific communities in aligning on what the appropriate governance measures are and needlessly stall progress towards mitigating the risk associated with chatbots. Significant work is needed to fully understand and communicate the biosecurity impacts of bio-design tools. Underappreciation of the differences between these two applications, and their impacts on the biothreat landscape, could result in inappropriate or ineffective governance of each while simultaneously harming beneficial technological progress.
