
The alignment problem and the rule of law - part 3

Updated: Jul 22, 2020

In part 2 of this blog entry, I wrote about how we use the law to adjust the incentives of corporations - powerful, non-human, dare I say 'artificially intelligent' agents - without trying to change their fundamental reward function. For corporations, the reward function is money. We use the law to change the costs and benefits of different corporate actions, such as pollution, in order to bring those costs and benefits into closer alignment with the public interest.


The advantages of this method for aligning corporate interests are many, but I'll focus, in this post, on two:

  1. simplicity; and

  2. layers of abstraction,

and they go hand in hand.


The rule of law as a set of simple instructions


Now there is debate about precisely what the rule of law means. I won't enter into controversies about rule of law vs rule by law, law and politics, and so on. I just want to focus on the basic idea of the rule of law. The central tenet of the idea - so far as it concerns the regulation of the conduct of powerful entities - seems to me very simple and straightforward.


The great Professor AV Dicey summed it up this way:

[W]ith us no man is above the law [and] every man, whatever be his rank or condition, is subject to the ordinary law of the realm and amenable to the jurisdiction of the ordinary tribunals. (Dicey, A.V., 1982 [1885], Introduction to the Study of the Law of the Constitution, London: Macmillan and Co.)

Distilled into an instruction, or perhaps a set of two instructions, it comes out something like this:

> Do not violate the law 
> Submit to the jurisdiction of the courts administering the governing law

Where the rule of law pertains, those instructions apply to everybody.


Functions and arguments


Of course, there's the issue of specifying which law and which courts. I'll go into some of the considerations that this raises in another post. For now, it's sufficient to observe that this matter can be cleared up with another instruction.

> The applicable law is the law of {{governing law state}}

The phrase {{governing law state}} is something like a 'function' (a placeholder for some particular value or piece of information), into which we can insert an 'argument' (the particular value or piece of information). Or, we could supplement the first two instructions by inserting the function there.

> Do not violate the law of {{governing law state}} 
> Submit to the jurisdiction of the courts of {{governing law state}}

Then we can put any 'argument' into that function. The {{governing law state}} could be 'England', or 'Australia' or 'Canada' or 'India'. Or we could be more granular: 'Uttar Pradesh', 'British Columbia', 'New South Wales'.


> Do not violate the law of Uttar Pradesh 
> Submit to the jurisdiction of the courts of Uttar Pradesh
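
For readers who find the programming analogy helpful, here is a minimal sketch in Python of the same idea. The names are mine and purely illustrative: the two instructions act as a template, and the jurisdiction is the 'argument' supplied to it.

```python
# A minimal, illustrative sketch of the 'function and argument' idea:
# the two instructions form a template, and the jurisdiction is the argument.

def rule_of_law_instructions(governing_law_state):
    """Return the two rule-of-law instructions, specialised to one jurisdiction."""
    return [
        f"Do not violate the law of {governing_law_state}",
        f"Submit to the jurisdiction of the courts of {governing_law_state}",
    ]

# Any 'argument' can be supplied, at whatever level of granularity:
print(rule_of_law_instructions("Australia"))
print(rule_of_law_instructions("Uttar Pradesh"))
```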

This is not a new idea. It's the way that lawyers manage commercial relationships (including commercial relationships between corporations) in contracts. If the parties are in different countries (or even in the same country) it is worthwhile making clear which laws will determine the way that the contract is interpreted and breaches of contract are handled. So, one clause in the contract (one module of instructions) will be the 'governing law' clause. Then the parties will agree that the contract will be governed by that governing law, and will accept the jurisdiction of the courts that administer that law.
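
To make the 'module' idea concrete, here is a rough sketch of a governing law clause as a parameterised template. The wording is generic, illustrative boilerplate rather than language drawn from any actual contract.

```python
# Purely illustrative: a governing-law clause as a reusable 'module' of a contract,
# parameterised by the jurisdiction the parties choose.
GOVERNING_LAW_CLAUSE = (
    "This agreement is governed by the law of {state}, and the parties submit "
    "to the jurisdiction of the courts of {state}."
)

print(GOVERNING_LAW_CLAUSE.format(state="England"))
```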

Simplicity comes from layers of abstraction


Notice the elegance of the solution. We don't have to specify in contracts every single rule for interpreting the contract, assessing the rights and duties of the parties, and remedying breaches of contract. We just specify the governing law. Assuming that the rule of law holds true in the governing law state, then the two rule of law instructions will apply.


> Do not violate the law of {{governing law state}} 
> Submit to the jurisdiction of the courts of {{governing law state}}

These instructions will call into operation whatever parts of the law, or powers of the court, are needed to interpret the contract, assess the rights and duties of parties, remedy breaches and so on.


This is an example of the operation of layers of abstraction.


It is not that the law itself is simple (although it is far simpler than the full gamut of 'human values'). It is that the instruction required to prevent all the harmful things that fall into the category of unlawful behaviour is simple. The complexity of the law is layered beneath the simple instruction. In our rule of law instructions, we might then cast law as another function, which calls into operation the full complexity of the law.
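
Pushing the analogy one step further, the sketch below (again with invented, illustrative names) shows the layering: the instruction stays one line long, and the complexity it invokes lives behind a reference that the instruction itself never spells out.

```python
# Illustrative only: the instruction stays simple, while the complexity it invokes
# sits one layer down. 'body_of_law' is a stand-in for the vast, evolving body of
# rules maintained by legislatures and courts, not by whoever issues the instruction.
body_of_law = {
    "England": ["contract law", "tort law", "criminal law", "environmental regulation", "..."],
    "New South Wales": ["contract law", "tort law", "criminal law", "planning law", "..."],
}

def law_of(governing_law_state):
    """Resolve the simple reference 'the law of X' to the ruleset layered beneath it."""
    return body_of_law[governing_law_state]

# The instruction never enumerates the rules; it just points at the layer below.
instruction = "Do not violate the law of New South Wales"
print(instruction, "->", law_of("New South Wales"))
```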


Simplicity


Let's return to the example of aligning corporations' conduct with the public good. In order to get corporations to refrain from an extremely wide range of harmful conduct (which is to say, to go a good way towards stopping corporations from doing all the things that would be against the law), we do not need to prescribe separately, to each corporation, every single act and omission that it must perform or refrain from. We don't give each a long list of harms to be avoided, nor a set of instructions unique and particular to each organisation.


The way that states keep corporations in check is not by trying to make them care about everything that every human cares about, and trying to communicate those incredibly complex cares in exhaustive detail.


We just, in effect, give the two rule of law instructions. Or rather those two instructions apply where the rule of law applies, because, in Dicey's words, 'No man [entity] is above the law'.


It seems obvious to us to simplify the complexity of our instructions for the basic moral conduct of corporations (and like agents) in this way. But I can't help but feel that there is a real elegance to this layering and simplicity that shouldn't be taken for granted.


Now, it is eminently clear that the rule of law does not stop every bad action by powerful corporate actors. Many bad acts may be perfectly lawful. But it is equally clear that it renders a whole raft of actions - unlawful actions - far more difficult and costly, and so inconsistent with the corporate reward function.


The instruction does not rule out all harms, but it rules out a very wide range of harms indeed.


As to how to apply this framework to AIs, I won't pretend to offer a plan of any real determinacy or comprehensiveness. But if the problem is one of distilling into functional instructions the vastness, complexity and diversity of human value systems, it seems a good start. If an AI is smart enough to steal, manipulate and extort its way into a position of influence (as Toby Ord imagines in his thought experiment), it ought to be smart enough to read, interpret and apply the law.


Up next


So much for today. In coming posts I will try to grapple with the question of which law to choose, and how to best harness the law's capacity for modularity and modular updates to keep in check actions that are bad, notwithstanding their lawfulness.



