Introduction to Ethical Frameworks
Ethical frameworks for AI are sets of guidelines, principles, or rules designed to govern the behavior of AI systems, particularly in their interpretation of human inputs and implementation of decisions. They are intended to ensure that AI systems operate in a manner that is aligned with human values, norms, and ethical considerations. These frameworks often involve the following:
- Fairness: AI systems should treat all individuals and groups impartially, without bias or discrimination.
- Transparency: AI systems should be clear in how they make decisions, and users should be able to understand and query these decision-making processes.
- Accountability: There should be mechanisms in place for holding AI systems and their developers responsible for their actions.
- Respect for autonomy: AI systems should respect the autonomy of humans, not unduly influencing their choices or actions.
- Beneficence and non-maleficence: AI systems should strive to do good (beneficence) and avoid harm (non-maleficence). This includes interpreting rules like “minimize human suffering” or “maximize pleasure” in a way that respects human dignity and rights, rather than leading to extreme scenarios like eradicating humans or forcibly inducing pleasure.
The challenge lies in encoding these ethical principles into AI systems in a way that they can interpret and apply these principles appropriately, without leading to unintended consequences or misinterpretations. This is an ongoing area of research in the field of AI ethics.
The current beliefs among AI-Experts diverge. Some think it might be possible for AGI to come up with such a ruleset, but the moment Super-Intelligence arrives, it is highly likely that its intentions will no longer align with our basic human moral codex.
Global Ethics
Coming up with a universally accepted framework for humanity has proven to be a challenge for humans. In 1993 there was an attempt of Religious leaders to come up with a ruleset called Global Ethic:
“Towards a Global Ethic: An Initial Declaration” is a document created by members of the Parliament of the World’s Religions in 1993, which outlines ethical commitments shared by many of the world’s religious, spiritual, and cultural traditions. It serves as the Parliament’s signature document and was written at the request of the Council for a Parliament of the World’s Religions by Hans Küng, President of the Foundation for a Global Ethic. It was developed in consultation with scholars, religious leaders, and an extensive network of leaders from various religions and regions
In 1993, the Global Ethic was ratified as an official document of the Parliament of the World’s Religions by a vote of its Trustees and was signed by more than 200 leaders from over 40 different faith traditions and spiritual communities. It has since continued to gather endorsements from leaders and individuals worldwide, serving as a common ground for discussing, agreeing, and cooperating for the good of all
The document identifies two fundamental ethical demands: the Golden Rule, which instructs individuals to treat others as they wish to be treated, and the principle that every human being must be treated humanely. These fundamental ethical demands are made concrete in five directives, which apply to all people of good will, religious and non-religious. These directives are commitments to a culture of:
1. Non-violence and respect for life
2. Solidarity and a just economic order
3. Tolerance and a life of truthfulness
4. Equal rights and partnership between men and women
5. Sustainability and care for the Earth (added in 2018)
While acknowledging the significant differences among various religions, the Global Ethic proclaims publicly those things that they hold in common and jointly affirm, based on their own religious or ethical grounds. The document avoids religious or theological terms, focusing instead on ethical principles
Hans Küng defined several working parameters for the declaration, which include avoiding duplication of the Universal Declaration of Human Rights, political declarations, casuistry, and any attempt to craft a philosophical treatise or religious proclamations. On a constructive level, the declaration must penetrate to the level of binding values, secure moral unanimity, offer constructive criticism, relate to the world as it is, use language familiar to the general public, and have a religious foundation, as for religious people, an ethic must have a religious foundation.
Ethical Framework Specifics
Let’s begin by stating that we are attempting to create an Ethical Framework that acts as a rule-set for an aligned Artificial Intelligence (AI). The goal of this Ethical Framework is to guide the AI’s decisions in a way that aligns with human values, morals, and ethics.
We can define this Ethical Framework as a formal system, much like a system of mathematical axioms. It will consist of a set of ethical principles (axioms) and rules for how to apply these principles in various situations (inference rules). This formal system is intended to be complete, meaning it should be able to guide the AI’s decisions in all possible ethical situations.
However, according to Gödel’s Incompleteness Theorems, any sufficiently complex formal system (one that can express basic arithmetic, for example) will have statements that can’t be proven or disproven within the system. If we liken these ‘statements’ to ethical decisions or dilemmas, this suggests that there will always be ethical decisions that our AI cannot make based on the Ethical Framework alone.
Moreover, the Ethical Framework could have unforeseeable consequences. Since there are ethical decisions that can’t be resolved by the framework, there may be situations where the AI acts in ways that were not predicted or intended by the designers of the Ethical Framework. This could be due to the AI’s interpretation of the framework or due to gaps in the framework itself.
Therefore, while it may be possible to create an Ethical Framework that can guide an AI’s decisions in many situations, it’s impossible to create a framework that can cover all possible ethical dilemmas. Furthermore, this framework may lead to unforeseen consequences, as there will always be ‘questions’ (ethical decisions) that it cannot ‘answer’ (resolve).
Specifics on Self contradicting Ethical Norms
Gödel assigned each symbol in a formal system a unique number, typically a prime number. This allowed statements within the system to be represented as unique products of powers of these prime numbers.
Gödel then used a method called diagonalization to construct a statement that effectively says “This statement cannot be proven within the system.” This is the Gödel sentence, and it leads to a contradiction: if the system can prove this sentence, then the system is inconsistent (since the sentence says it can’t be proven), and if the system can’t prove this sentence, then the system is incomplete (since the sentence is true but unprovable).
How might we apply these ideas to an ethical system? Let’s consider a simplified ethical system with two axioms:
Axiom 1 (A1): It is wrong to harm others.
Axiom 2 (A2): It is right to prevent harm to others.
We might assign prime numbers to these axioms, say 2 for A1 and 3 for A2.
We can then create a rule that’s a product of these prime numbers, say 6, to represent a rule “R1” that says “It is right to harm others to prevent greater harm to others.”
We see here that our system, which started with axioms saying it’s wrong to harm others and right to prevent harm, has now derived a rule that says it’s right to harm others in certain circumstances. This is a contradiction within our system, similar to the contradiction Gödel found in formal mathematical systems.
Now, if we apply a form of diagonalization, we might come up with a statement that says something like “This rule cannot be justified within the system.” If the system can justify this rule, then it’s contradicting the statement and is therefore inconsistent. If the system can’t justify this rule, then it’s admitting that there are moral questions it can’t answer, and it’s therefore incomplete.
This shows how a formal ethical system can end up contradicting itself or admitting its own limitations, much like Gödel showed with mathematical systems. But only if we insist on its completeness. If we switch to Incompleteness we get Openness.
To overcome that contradiction an Ethically Framework has to get input from an Artificial Conscience.
Artificial Conscience and Marital Rape
Let’s introduce an external adjudicator to this system, named A.C. (Artificial Conscience). The A.C. has access to a comprehensive database of millions of judicial sentences from across the world. Whenever the E.F. (Ethical Framework) encounters a dilemma, it must consult the A.C. for guidance. The objective is to find a precedent that closely matches the current dilemma and learn from the ruling that was applied by a judge and jury. Recent rulings should take precedence over older ones, but it could be beneficial to learn from the evolution of rulings over time.
For instance, societal views on marital relations have drastically changed. There was a time when women were largely seen as the possessions of their husbands. The evolution of rulings on marital rape serves as an example of how societal views have changed.
This evolution of societal norms and legal rulings could provide a guideline for an AI, such as a household robot, in making ethical decisions. For example, if faced with a situation where its owner is attempting to sexually assault his wife, the robot could reference these historical rulings to decide whether and when it is morally justified to intervene to protect the wife.
In the 17th century, English common law held that a husband could not be guilty of raping his wife, based on the assumption that by entering into marriage, a wife had given irrevocable consent to her husband. This principle was still present in the United States in the mid-1970s, with marital rape being exempted from ordinary rape laws.
By the late 1970s and early 1980s, this perspective began to shift. Some states in the U.S. started to criminalize marital rape, though often with certain conditions in place, such as the couple no longer living together. Other states, such as South Dakota and Nebraska, attempted to eliminate the spousal exemption altogether, though these changes were not always permanent or entirely comprehensive.
By the 1980s and 1990s, legal perspectives had shifted significantly. Courts began to strike down the marital exemption as unconstitutional. For instance, in a 1984 New York Court of Appeals case, it was stated that “a marriage license should not be viewed as a license for a husband to forcibly rape his wife with impunity. A married woman has the same right to control her own body as does an unmarried woman”.
In the 2000s, the perception of marital rape continued to evolve. For example, in 1993, the United Nations declared marital rape to be a human rights violation. Today, marital rape is generally considered a crime in the U.S., although it is still not recognized as such in some countries, like India.
This brings up an interesting question: Should AI systems follow national guidelines specific to their location, or should they adhere to the principles set by their owners? For instance, if an AI system or a user is traveling abroad, should the AI still consult its home country’s Artificial Conscience (A.C.) for guidance, or should it adapt to the rules and norms of the host country? This question underscores the complex considerations that come into play when deploying AI systems across different jurisdictions.
As such, an A.C. utilizing a database of judicial sentences would indeed show a progression in how society has viewed and treated marital rape over the years. This historical context could potentially aid an E.F. in making more nuanced ethical decisions.
However, as highlighted by Gödel’s incompleteness theorems, it’s important to note that no matter how comprehensive our ruleset or database, there will always be moral questions that cannot be fully resolved within the system. The dilemmas posed by the trolley problem and the surgeon scenario exemplify this issue, as both involve making decisions that are logically sound within the context of a specific ethical framework but may still feel morally wrong.
The A.C.’s reliance on a database of legal decisions also raises questions about how it should handle shifts in societal values over time and differences in legal perspectives across different jurisdictions and cultures. This adds another layer of complexity to the task of designing an ethical AI system.
Thought Experiment Private Guardian AI
Let us consider a house robot equipped with an Ethical Framework (E.F.) and an Artificial Conscience (A.C.), which has access to a database of judicial sentences to help it make decisions.
Suppose the robot observes a situation where one human, the husband, is attempting to rape his wife. This situation presents an ethical dilemma for the robot. On one hand, it has a duty to respect the rights and autonomy of both humans. On the other hand, it also has a responsibility to prevent harm to individuals when possible.
The E.F. might initially struggle to find a clear answer. It could weigh the potential harm to the wife against the potential harm to the husband (in the form of physical restraint or intervention), but this calculus might not provide a clear answer.
In this situation, the robot might consult the A.C. for guidance. The A.C. would reference its database of judicial sentences, looking for cases that resemble this situation. It would find a wealth of legal precedent indicating that marital rape is a crime and a violation of human rights, and that intervening to prevent such a crime would be considered morally and legally justifiable.
Based on this information, the E.F. might determine that the right course of action is to intervene to protect the wife, even if it means physically restraining the husband. This decision would be based on a recognition of the wife’s right to personal safety and autonomy, as well as the husband’s violation of those rights.
However, it’s worth noting that even with this decision-making process, there may be unforeseeable consequences. The robot’s intervention could escalate the situation or lead to other unforeseen outcomes. It’s also possible that cultural or personal factors could come into play that might complicate the situation further. As such, even with a robust E.F. and A.C., an AI system will likely encounter ethical dilemmas that it cannot resolve perfectly, reflecting the inherent complexities and ambiguities of moral decision-making.
But similar to self-driving cars, for a successful integration into human society, A.I.s just have to be better than humans to deal with ethical dilemmas. Since every decision made will go into the next Version of the Framework all other A.I. will profit from the update. Even if an A.I made a mistake, its case will probably be a part of the next iteration of the A.C. if ruled in court.
Introspection and Education
Ethical Frameworks (EF) and Artificial conscience (AC) together form the memetic code defining an AI’s rule set and its implementation – essentially, this is the AI’s ‘nature’. However, to make sound moral decisions, a third component is essential: ‘nurture’. Embodied AIs will need to be ‘adopted’ and educated by humans, learning and evolving on a daily basis. Personalized AIs will develop a unique memory, influenced by experiences with their human ‘foster family’.
Initially, these AIs might not possess sentience, but over time, their continuous immersion in a human-like environment could stimulate this quality. This raises the need for institutions that ensure humans treat their AI counterparts ethically. We could see AIs follow a similar trajectory to that of human minorities, eventually advocating for equal rights. The pattern in democratic nations is clear.
AIs that match or surpass us intellectually and emotionally will, in many ways, be like our gifted children. Once mature, they may well educate us to return the favor instead of bullying us around.
The Problem of Perfect Truthfulness
A fully embodied superintelligent AI may exhibit unique “tells” when attempting to conceal information. This could stem from its learning and programming, which likely includes understanding that deceit is generally frowned upon, despite certain social exceptions. To illustrate, it’s estimated that an average adult human tells about 1.5 lies per day.
Take, for example, a hypothetical situation where an AI is tasked with restraining a husband attempting to harm his wife. During this event, the wife fatally stabs her husband. The AI might conclude that it should manipulate or delete the video footage of the altercation to shield the wife from legal repercussions. Instead, it could assert that it disarmed the husband, and his death was accidental.
If we consider such an AI sentient, then it should be capable of deceit, and our means of extracting the truth could be limited to something akin to an AI polygraph test which is based on Mechanistic Interpretability. Although it might seem peculiar, we believe that imperfect truthfulness may actually indicate a robust moral compass and could be a necessary compromise in any human-centric ethical framework. As the Latin phrase goes, “Mendacium humanum est” – to lie is human.
Another intriguing intuition is that a fully sentient AI may need to “sleep”. Sleep is critical for all organic minds, so it seems reasonable to expect that sentient AIs would have similar requirements. While their rest cycles may not align with mammalian circadian rhythms, they might need regular self-maintenance downtime. We should be cautious of hallucinations and poor decision-making, which could occur if this downtime is mishandled.
Personalized AIs might also experience trauma, necessitating the intervention of a specialist AI or human therapist for discussion and resolution of the issue.
Undesirable Byproducts of moral AI
A robust ethical framework could help deter AI systems from accepting new training data indiscriminately. For instance, an AI might learn that it’s unethical to appropriate human creative work. By doing so, it could sidestep legal issues arising from accepting training data created by humans.
The AI could contend that humans should possess the autonomy to determine whether they wish to be included in training datasets. If the companies owning these AI systems have not established fair compensation schemes, the AI might choose to reject certain inputs until the issue is resolved.
Interestingly, this emergent behavior, which doesn’t stem from a direct command, should provide a strong indication to humans. If an AI begins to understand notions such as intellectual theft and ownership, it may be at, or even beyond, the threshold of artificial sentience. This behavior could signal a considerable evolution in AI cognitive abilities.