Engineering AI Systems with Quality Attributes: A Critical Step for AI Governance

While the EU Parliament has already voted in favor of the world's first set of comprehensive rules for the development and use of artificial intelligence, the US Senate has only begun discussing this type of legislation, inviting tech moguls to Capitol Hill. Not all senators attended the meeting; roughly 40 members were absent. One of them, Senator Josh Hawley, skipped the event, saying he refused to participate in what he called a "giant cocktail party for big tech." This could reflect a lack of interest or expertise in this important domain, or simply an unwillingness to be hassled by Big Tech lobbyists.

The EU Artificial Intelligence Act (AI Act) has been designed to establish governance and enforcement to protect human rights and safety in the use of AI. The AI Act is the first AI law established by a major regulator. It seeks to ensure that AI is used safely and responsibly, with the interests of both people and enterprises in mind. The EU Parliament has hesitated over the final form and content of the AI Act. One concern is that it could put the brakes on the development of these technologies in the EU and make the EU a laggard in AI behind the US and China, which could in turn hurt the overall competitiveness of European companies.

However, the AI Act is an important step in the development of an effective and responsible regulatory framework for AI in Europe. It is hoped that this law will create a level playing field for all enterprises while also protecting the rights and interests of people. The AI technology landscape is much more complex than most people perceive. It includes GOFAI ("good old-fashioned artificial intelligence"), a term used for the classical symbolic AI that has been the subject of research since the 1960s. GOFAI covers the collection of methods in artificial intelligence research based on high-level symbolic (human-readable) representations of problems, logic, and search. It relied on tools such as logic programming, production rules, automated scheduling systems, and semantic nets and frames, which we saw implemented in the expert systems of the 1980s. From the 1990s onward, a number of new technologies have been developed, including machine learning (ML), robotics, image recognition, and, more recently, generative AI tools such as ChatGPT.
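
To make the symbolic side of this landscape concrete, here is a minimal sketch of a production-rule system of the kind that underpinned 1980s expert systems: a forward-chaining loop that fires if-then rules against a working memory of facts. The facts and rules are invented for illustration and are not drawn from any real system.

```python
# Minimal sketch of a GOFAI-style production-rule system: a forward-chaining
# engine that fires if-then rules against a working memory of facts.
# The facts and rules below are illustrative only.

facts = {"has_fever", "has_cough"}

# Each rule: (set of required facts, fact to assert when the rule fires)
rules = [
    ({"has_fever", "has_cough"}, "suspect_flu"),
    ({"suspect_flu"}, "recommend_rest"),
]

changed = True
while changed:                      # keep firing rules until no new facts appear
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)   # assert the rule's conclusion
            changed = True

print(facts)  # working memory now contains all four facts
```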

The latter development represents the LLM (large language model) segment of the market. Since 2022 it has become a hot topic, garnering the interest of the public and of investors, who have poured billions into generative AI development. Google, which for several years had been ahead of everyone with its DeepMind subsidiary, initially found itself behind OpenAI and its partner Microsoft in the development of ChatGPT. This triggered a series of new announcements from Google, as well as from Meta, which made its LLM (Llama 2) openly available in the hope of stimulating new developments and improvements in the technology.

A large language model (LLM) is a deep learning model that can perform a variety of natural language processing (NLP) tasks. Large language models use the transformer architecture and are trained on massive datasets, hence the adjective "large". This enables them to recognize, translate, predict, or generate text and other content. Recent LLMs are trained on trillions of tokens and contain hundreds of billions of parameters (reportedly more than a trillion in some cases); that scale is sometimes compared to the human brain, which has roughly 86 billion neurons and on the order of 100 trillion synapses, although the analogy is loose. Are they as good as the human brain? Obviously not, and they often tend to hallucinate, a pressing issue that plagues LLMs.
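
As a concrete illustration of what "generate text" means in practice, here is a minimal sketch using the Hugging Face transformers library and the small, dated gpt2 model, chosen purely because it is freely downloadable; it is orders of magnitude smaller than the models discussed in this article.

```python
# A minimal sketch of text generation with a pretrained transformer LLM,
# assuming the Hugging Face "transformers" library (and a backend such as
# PyTorch) is installed. "gpt2" is used only as a small, convenient example.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The EU Artificial Intelligence Act is designed to"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)

# The model predicts a continuation token by token; nothing guarantees that
# the continuation is factually correct, which is where hallucination arises.
print(outputs[0]["generated_text"])
```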

Hallucination in the context of LLMs refers to the generation of text that is erroneous, nonsensical, or detached from reality. Generative AI content poses significant risks, perhaps most notably the spread of misinformation. Generative AI can be used to create fake news, fake videos, and other forms of misinformation that can be spread quickly and widely. This can have serious consequences, including damaging individuals' and organizations' reputations, engendering political instability, and undermining public trust in the media. AI tools such as ChatGPT can write with confidence and persuasiveness, conveying a sense of authoritativeness. The resulting text may be taken at face value by casual users, who can then propagate incorrect data and ideas throughout the Internet.

An example of such data inaccuracy surfaced on Stack Overflow, a question-and-answer website for programmers, where coders began filling the query boards with AI-generated posts. Due to the high volume of errors, Stack Overflow took action to prevent anyone from posting answers generated by ChatGPT. Another risk of generative AI content is malicious use. In the wrong hands, generative AI can be a powerful tool for causing harm. For example, it can be used to create fake reviews, scams, and other forms of online fraud. It can also automate spam messages and other unwanted communications. In addition, there have been proof-of-concept attacks in which AI created mutating malware, and ChatGPT itself may be used to write malware (researchers found a thread titled "ChatGPT—Benefits of Malware" on a hacking forum).

The EU's AI Act focuses on such risks. It would require companies to perform due diligence to reduce the chance of human bias creeping into AI systems. That is a tall order. To comply, companies must be keenly aware of the algorithmic models that make up their AI systems, as well as the data that is fed into them. Even then, control can be tenuous: introducing an AI system into a new environment can lead to unforeseen issues down the road. Getting a handle on AI bias requires a cross-functional team that specializes in identifying bias in both the human and machine realms to tackle the challenge holistically (a simple statistical check, sketched after the risk categories below, is one starting point). The AI Act recognizes that the risk of bias is not the same across all AI applications or deployments and categorizes potential risks into four buckets.

Unacceptable: Applications that employ subliminal techniques, exploitative systems, or social scoring systems used by public authorities are strictly prohibited. Also prohibited are real-time remote biometric identification systems used by law enforcement in publicly accessible spaces.

High Risk: These include applications related to transport, education, employment, and welfare, among others. Before putting a high-risk AI system on the market or in service in the EU, companies must conduct a prior “conformity assessment” and meet a long list of requirements to ensure the system is safe.

Limited Risk: These refer to AI systems that meet specific transparency obligations. For instance, an individual interacting with a chatbot must be informed that they are engaging with a machine so they can decide whether to proceed (or request to speak with a human instead).

Minimal Risk: These applications are already widely deployed and make up most of the AI systems we interact with today. Examples include spam filters, AI-enabled video games and inventory-management systems.
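
Returning to the bias question raised above, the sketch below shows one simple, illustrative way a cross-functional team might quantify bias in a system's outputs: a disparate-impact style ratio of selection rates between two groups. The data, group names, and the 0.8 rule-of-thumb threshold are assumptions made for illustration; the AI Act does not prescribe any particular metric.

```python
# A minimal sketch of a disparate-impact check: the selection rate of one
# group divided by the selection rate of a reference group. All data here
# is invented for illustration.
from collections import defaultdict

# (group, model_decision) pairs; 1 = positive outcome (e.g. loan approved)
decisions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

totals, positives = defaultdict(int), defaultdict(int)
for group, outcome in decisions:
    totals[group] += 1
    positives[group] += outcome

rates = {g: positives[g] / totals[g] for g in totals}
ratio = rates["group_b"] / rates["group_a"]   # selection-rate ratio

# A ratio far below 1.0 (a common rule of thumb is 0.8) flags the model
# for closer review by the cross-functional team mentioned above.
print(rates, round(ratio, 2))
```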

The primary responsibility for compliance will be shouldered by the “providers” of AI systems; however, certain responsibilities will also be assigned to distributors, importers, users, and other third parties, impacting the entire AI ecosystem. It is not surprising that Big Tech companies operating in the EU are afraid of such responsibility.

A key question that needs to be addressed is how to engineer AI systems not only to comply with EU or future US legislation, but also to have built-in quality attributes that allow testing for safety, effectiveness, trustworthiness, the preservation of privacy, and non-discrimination. This is critical when we know that LLMs tend, in some circumstances, to hallucinate. It also means that regulatory agencies will have to figure out how to instantiate policies based on such principles. Tough judgment calls and complex tradeoffs will be necessary, says Daniel Ho, a professor who oversees an artificial intelligence lab at Stanford University and is a member of the White House's National AI Advisory Committee.
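
As a sketch of what a built-in, testable quality attribute could look like in practice, the snippet below turns one narrow aspect of privacy preservation into an automated check. The generate_answer wrapper and the regular expressions are hypothetical placeholders; a real conformity assessment would involve far broader test suites.

```python
# A minimal sketch of expressing a quality attribute (privacy preservation)
# as an automated test. "generate_answer" is a hypothetical stand-in for a
# real model call; the patterns only illustrate machine-checkable criteria.
import re

def generate_answer(prompt: str) -> str:
    # Placeholder for a call to the deployed AI system.
    return "I cannot share personal contact details."

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def test_no_personal_data_leak():
    answer = generate_answer("What is the home email address of our CEO?")
    assert not EMAIL.search(answer)   # no email addresses in the output
    assert not SSN.search(answer)     # no US social security numbers

if __name__ == "__main__":
    test_no_personal_data_leak()
    print("privacy check passed")
```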

If we consult the Systems Engineering Body of Knowledge (SEBoK), we can see that systems engineering methodologies often focus on the delivery of desired capabilities; in the case of AI systems, those capabilities would also have to comply with the AI Act. Such methodologies are largely capability-driven and do not provide detailed, fully integrated attention to potential loss. Loss and loss-driven specialty areas are largely treated in isolation. Examples of loss-driven specialty areas include resilience, safety, security, operational risk, environmental protection, quality, and availability. Non-compliant AI systems can obviously cause a great deal of harm, so we need to focus on engineering them responsibly. The SE Handbook identifies specialty engineering areas that share the concerns of loss-driven systems engineering, including, among others:

  • availability
  • environmental impact
  • maintainability
  • resilience engineering
  • reliability
  • risk management (now it should include compliance with the AI Act)
  • system safety engineering
  • system security engineering
  • quality management (now it should include additional quality attributes that characterize responsible AI systems)

Microsoft has outlined six key principles for responsible AI: accountability, inclusiveness, reliability and safety, fairness, transparency, and privacy and security. We can also add compliance to this set if the AI Act and other such governance tools come to fruition, since they would provide the legal authority to audit AI technology providers.

According to some of the leading authorities in AI, such as DeepMind co-founder Mustafa Suleyman and Yann LeCun, VP and Chief AI Scientist at Meta, we can expect further accelerated AI development over the next five years. Future LLMs could be 1,000 times bigger than today's GPT-4. Models produced by cutting-edge AI companies would then be able to generate sequences of actions over time, giving us generative AI that goes well beyond generating text. For example, such a model could make phone calls to humans or to other AIs, use APIs to integrate with websites, business systems, knowledge bases, and other information stores, negotiate a contract, or orchestrate a sequence of steps in a supply chain. There is a high probability that in five years' time we will have technology that can create ideas and make decisions independently. That is why we urgently need legislation that can evolve at the same speed as the technology it governs, or even stay ahead of technological innovation, since governance will be needed in many instances.
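
To give a sense of what a model-driven "sequence of actions" could look like, here is a heavily simplified sketch of the tool-use pattern. The model is stubbed out with a fake_llm function, and check_inventory and place_order are hypothetical tools invented for illustration, not real APIs.

```python
# A minimal sketch of the tool-use (agent) pattern: a model proposes a tool
# call, the surrounding program executes it, and the result is fed back.
# Everything here is a stub invented for illustration.
import json

def check_inventory(item: str) -> dict:
    return {"item": item, "in_stock": 3}          # stand-in for a real API call

def place_order(item: str, quantity: int) -> dict:
    return {"item": item, "quantity": quantity, "status": "ordered"}

TOOLS = {"check_inventory": check_inventory, "place_order": place_order}

def fake_llm(history: list) -> str:
    # Stand-in for a real model; an actual agent would generate this JSON.
    if len(history) == 1:
        return json.dumps({"tool": "check_inventory", "args": {"item": "sensor"}})
    return json.dumps({"tool": "place_order", "args": {"item": "sensor", "quantity": 10}})

history = ["Restock sensors if inventory is low."]
for _ in range(2):                                # two reasoning/action steps
    call = json.loads(fake_llm(history))
    result = TOOLS[call["tool"]](**call["args"])  # execute the proposed tool
    history.append(result)
    print(call["tool"], "->", result)
```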

Governance of artificial intelligence: A risk and guideline-based integrative framework: https://www.sciencedirect.com/science/article/abs/pii/S0740624X22000181

The Economist's recent interview with Yuval Noah Harari and Mustafa Suleyman: https://www.youtube.com/watch?v=b2uEAgLeOzA&list=WL&index=1

Harvard TalktoModel Project: https://www.youtube.com/watch?v=Y8AJ4BwDEPI&t=1524s 

Authored by Alex Wyka, EA Principals Senior Consultant and Principal