Maximize Worst Case Bayes Score

In this post, I propose an answer to the following question:

Given a consistent but incomplete theory, how should one choose a random model of that theory?

My proposal is rather simple. Just assign probabilities to sentences in such a way that if an adversary were to choose a model, your Worst Case Bayes Score is maximized. This assignment of probabilities represents a probability distribution on models, and we choose randomly from this distribution. However, it will take some work to show that what I just described even makes sense. We need to show that Worst Case Bayes Score can be maximized, that such a maximum is unique, and that this assignment of probabilities to sentences represents an actual probability distribution. This post gives the necessary definitions, and proves these three facts.

Finally, I will show that any given probability assignment is coherent if and only if it is impossible to change the probability assignment in a way that simultaneously improves the Bayes Score by an amount bounded away from 0 in all models. This is nice because it gives us a measure of how far a probability assignment is from being coherent. Namely, we can define the “incoherence” of a probability assignment to be the supremum amount by which you can simultaneously improve the Bayes Score in all models. This could be a useful notion since we usually cannot compute a coherent probability assignment so in practice we need to work with incoherent probability assignments which approach a coherent one.
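Written out as a formula, using the Bayes Score defined below, one reading of this definition is the following. The name \mbox{Incoh} is introduced here purely as shorthand, and Q ranges over all probability assignments:

{\displaystyle \mbox{Incoh}(P)=\sup_{Q}\inf_{M\models T}\left(\mbox{Bayes}(M,Q)-\mbox{Bayes}(M,P)\right). }

Whenever \mbox{Bayes}(M,P) is finite for all M, taking Q=P shows that this quantity is at least 0, and the claim above says that it equals 0 exactly when P is coherent.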

Now, let’s move on to the formal definitions and proofs.

Fix some language L, for example the language of first order set theory. Fix a consistent theory T of L, for example ZFC. Fix a nowhere zero probability measure \mu on L, for example \mu(\phi)=2^{-\ell(\phi)}, where \ell(\phi) is the number of bits necessary to encode \phi.

A probability assignment of L is any function from L to the interval [0,1]. Note that this can be any function, and does not have to represent a probability distribution. Given a probability assignment P of L, and a model M of T, we can define the Bayes Score of P with respect to M by

{\displaystyle \mbox{Bayes}(M,P)=\sum_{M\models \phi}\log_2(P(\phi))\mu(\phi)+\sum_{M\models\neg \phi}\log_2(1-P(\phi))\mu(\phi). }

We define the Worst Case Bayes Score \mbox{WCB}(P) to be the infimum of \mbox{Bayes}(M,P) over all models M of T.
Let \mathbb{P} denote the probability assignment that maximizes the function \mbox{WCB}. We will show that this maximum exists and is unique, so \mathbb{P} is well defined.
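To make these two definitions concrete, here is a minimal sketch in Python. It assumes a toy setting that is not in the post: a finite "language" of three sentences with weights \mu summing to 1, and a finite list of models given as truth assignments. The real definitions range over infinitely many sentences and models, so this illustrates only the formulas, not the construction itself.

```python
import math
from itertools import product

def bayes_score(model, P, mu):
    """Bayes(M, P): mu-weighted log2 score of P against the truth values in M."""
    total = 0.0
    for phi, weight in mu.items():
        # Probability that P gave to the truth value phi actually has in M.
        credit = P[phi] if model[phi] else 1.0 - P[phi]
        if credit == 0.0:
            return -math.inf  # certainty in a falsehood diverges to -infinity
        total += weight * math.log2(credit)
    return total

def worst_case_bayes(models, P, mu):
    """WCB(P): the minimum Bayes score over a finite list of models."""
    return min(bayes_score(m, P, mu) for m in models)

# Toy language (an assumption): three sentences over two atoms A and B.
mu = {"A": 0.5, "B": 0.25, "A and B": 0.25}
models = [{"A": a, "B": b, "A and B": a and b}
          for a, b in product([False, True], repeat=2)]

# The identically-1/2 assignment scores -1 in every model, so its WCB is -1.
half = {phi: 0.5 for phi in mu}
wcb_half = worst_case_bayes(models, half, mu)
```

In this toy setting the infimum is just a minimum over four models; in the post's setting it is an infimum over all models of T.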

In fact, \mathbb{P} is also coherent, meaning that there exists a probability distribution on the set of all models of T such that \mathbb{P}(\phi) is exactly the probability that a randomly chosen model satisfies \phi. Since the natural definition of a measurable subset of models comes from unions and intersections of the sets of all models satisfying a given sentence, we can think of \mathbb{P} as an actual probability distribution on the set of all models of T.

First, we must show that there exists a probability assignment P which maximizes \mbox{WCB}.

Note that \mbox{Bayes}(M,P) either diverges to -\infty, or converges to a non-positive real number. If P is the identically 1/2 function, then \mbox{WCB}(P)=-1, so there is at least one P for which \mbox{WCB}(P) is finite. This means that when maximizing \mbox{WCB}(P), we need only consider P for which \mbox{Bayes}(M,P) converges to a number between -1 and 0 for all M.

Assume by way of contradiction that there is no P which maximizes \mbox{WCB}. Then there must be some supremum value m such that \mbox{WCB} can get arbitrarily close to m, but never equals or surpasses m. Consider an infinite sequence of probability assignments \{P_i\} such that \mbox{WCB}(P_i)\rightarrow m. We may take a subsequence of \{P_i\} in order to assume that \{P_i(\phi)\} converges for every sentence \phi. Let P be such that P_i(\phi)\rightarrow P(\phi) for all \phi.

By assumption, \mbox{WCB}(P) must be less than m. Take any model M for which \mbox{Bayes}(M,P)<m. Then there exists a finite subset S of L such that \mbox{Bayes}_S(M,P)<m, where

{\displaystyle \mbox{Bayes}_S(M,P)=\sum_{\phi\in S, M\models \phi}\log_2(P(\phi))\mu(\phi)+\sum_{\phi\in S, M\models\neg \phi}\log_2(1-P(\phi))\mu(\phi). }

Note that in order to keep the Bayes score at least -1, any P_i must satisfy 2^{-1/\mu(\phi)}\leq P_i(\phi)\leq 1 if M\models \phi, and 0\leq P_i(\phi)\leq 1-2^{-1/\mu(\phi)} if M\models\neg\phi. Consider the space of all functions f from S to [0,1] satisfying these inequalities. Since there are only finitely many values restricted to closed and bounded intervals, this space is compact. Further, \mbox{Bayes}_S(M,f) is a continuous function of f, defined everywhere on this compact set. Therefore,

{\displaystyle \lim_{i\rightarrow\infty}\mbox{Bayes}_S(M,P_i)=\mbox{Bayes}_S(M,P)<m.}

However, clearly \mbox{WCB}(P_i)\leq\mbox{Bayes}(M,P_i)\leq\mbox{Bayes}_S(M,P_i), so

{\displaystyle \lim_{i\rightarrow\infty}\mbox{WCB}(P_i)<m,}

contradicting our assumption that \mbox{WCB}(P_i) converges to m.

Next, we will show that there is a unique probability assignment which maximizes \mbox{WCB}. Assume by way of contradiction that there were two probability assignments, P_1 and P_2 which maximize \mbox{WCB}. Consider the probability assignment P_3, given by

{\displaystyle P_3(\phi)=\frac{\sqrt{P_1(\phi)P_2(\phi)}}{\sqrt{P_1(\phi)P_2(\phi)}+\sqrt{(1-P_1(\phi))(1-P_2(\phi))}}.}

It is quick to check that this definition satisfies both

{\displaystyle \log_2(P_3(\phi))\geq \frac{\log_2(P_1(\phi))+\log_2(P_2(\phi))}{2}}


{\displaystyle \log_2(1-P_3(\phi))\geq \frac{\log_2(1-P_1(\phi))+\log_2(1-P_2(\phi))}{2},}

and in both cases equality holds only when P_1(\phi)=P_2(\phi).
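One way to check the first inequality (the second is symmetric) is the following short derivation, where a=P_1(\phi) and b=P_2(\phi) are shorthand introduced here. Taking the logarithm of the definition of P_3 gives

{\displaystyle \log_2(P_3(\phi))=\frac{\log_2(a)+\log_2(b)}{2}-\log_2\left(\sqrt{ab}+\sqrt{(1-a)(1-b)}\right), }

and by the Cauchy-Schwarz inequality,

{\displaystyle \sqrt{ab}+\sqrt{(1-a)(1-b)}\leq\sqrt{(a+(1-a))(b+(1-b))}=1, }

with equality exactly when a=b. So the subtracted logarithm is never positive, and it is strictly negative whenever P_1(\phi)\neq P_2(\phi).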

Therefore, we get that for any fixed model M,

{\displaystyle \mbox{Bayes}(M,P_3)\geq \frac{\mbox{Bayes}(M,P_1)+\mbox{Bayes}(M,P_2)}{2}.}

By looking at the improvement coming from a single sentence \phi with P_1(\phi)\neq P_2(\phi), we see that

{\displaystyle \mbox{Bayes}(M,P_3)-\frac{\mbox{Bayes}(M,P_1)+\mbox{Bayes}(M,P_2)}{2}}

is actually bounded away from 0 uniformly in M, which means that

{\displaystyle \mbox{WCB}(P_3)> \frac{\mbox{WCB}(P_1)+\mbox{WCB}(P_2)}{2}=\mbox{WCB}(P_1),}

which contradicts the fact that P_1 and P_2 maximize \mbox{WCB}.

This means that there is a unique probability assignment, \mathbb{P}, which maximizes \mbox{WCB}, but we still need to show that \mathbb{P} is coherent. For this, we will use the alternate definition of coherence given in Theorem 1 here. Namely, \mathbb{P} is coherent if and only if \mathbb{P} assigns probability 0 to every contradiction, probability 1 to every tautology, and satisfies \mathbb{P}(\phi)=\mathbb{P}(\phi\wedge\psi)+\mathbb{P}(\phi\wedge\neg\psi) for all \phi and \psi.

Clearly \mathbb{P} assigns probability 0 to every contradiction, since otherwise we could increase the Bayes Score in all models by the same amount by updating that probability to 0. Similarly \mathbb{P} clearly assigns probability 1 to all tautologies.

If \mathbb{P}(\phi)\neq\mathbb{P}(\phi\wedge\psi)+\mathbb{P}(\phi\wedge\neg\psi), then we update all three probabilities as follows:

\mathbb{P}(\phi)\mapsto \frac{1}{1+\frac{1-\mathbb{P}(\phi)}{\mathbb{P}(\phi)}(2^{-x/\mu(\phi)})},

\mathbb{P}(\phi\wedge\psi)\mapsto \frac{1}{1+\frac{1-\mathbb{P}(\phi\wedge\psi)}{\mathbb{P}(\phi\wedge\psi)}(2^{x/\mu(\phi\wedge\psi)})},


\mathbb{P}(\phi\wedge\neg\psi)\mapsto \frac{1}{1+\frac{1-\mathbb{P}(\phi\wedge\neg\psi)}{\mathbb{P}(\phi\wedge\neg\psi)}(2^{x/\mu(\phi\wedge\neg\psi)})},

where x is the unique real number such that the three new probabilities satisfy \mathbb{P}(\phi)=\mathbb{P}(\phi\wedge\psi)+\mathbb{P}(\phi\wedge\neg\psi). This correction increases the Bayes Score by the same amount in all models, and therefore increases \mbox{WCB}, contradicting the maximality of \mbox{WCB}(\mathbb{P}). Therefore \mathbb{P} is coherent as desired.
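The update can be tried numerically. The sketch below is an illustration under assumptions that are not in the post: it looks only at the three sentences involved, uses made-up weights and starting probabilities strictly between 0 and 1, and finds x by bisection on a fixed interval. The three "models" are the three consistent truth patterns for (\phi, \phi\wedge\psi, \phi\wedge\neg\psi).

```python
import math

def shift(p, mu, x):
    """Multiply the odds p/(1-p) by 2**(x/mu) and return the new probability."""
    return 1.0 / (1.0 + (1.0 - p) / p * 2.0 ** (-x / mu))

def updated(p, mu, x):
    """The update from the text: phi moves with +x, both conjunctions with -x."""
    return (shift(p[0], mu[0], x), shift(p[1], mu[1], -x), shift(p[2], mu[2], -x))

def solve_x(p, mu, lo=-10.0, hi=10.0, iters=200):
    """Bisection for the x at which new P(phi) equals the sum of the other two."""
    def gap(x):
        a, b, c = updated(p, mu, x)
        return a - b - c  # increasing in x, so it has a single root
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def score(p, mu, truth):
    """Contribution of the three sentences to the Bayes score in one model."""
    return sum(m * math.log2(q if t else 1.0 - q) for q, m, t in zip(p, mu, truth))

# Truth patterns for (phi, phi & psi, phi & ~psi) that a model can realize.
MODELS = [(False, False, False), (True, True, False), (True, False, True)]

# Hypothetical incoherent starting point: 0.5 != 0.3 + 0.3.
p = (0.5, 0.3, 0.3)
mu = (0.2, 0.1, 0.1)
x = solve_x(p, mu)
q = updated(p, mu, x)
gains = [score(q, mu, t) - score(p, mu, t) for t in MODELS]
```

Here `gains` holds the change in score for each of the three truth patterns. The entries agree up to rounding and are positive, which is the behavior the argument above relies on.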

Finally, we show that any given probability assignment P is coherent if and only if it is impossible to simultaneously improve the Bayes Score by an amount bounded away from 0 in all models. The above proof that \mathbb{P} is coherent actually shows one direction of this proof, since the only fact it used about \mathbb{P} is that you could not simultaneously improve the Bayes Score by an amount bounded away from 0 in all models. For the other direction, assume by way of contradiction that P is coherent, and that there exists a Q and an \epsilon>0 such that \mbox{Bayes}(M,Q)-\mbox{Bayes}(M,P)>\epsilon for all M.

In particular, since P is coherent, it represents a probability distribution on models, so we can choose a random model M from the distribution P. If we do so, we must have that

{\displaystyle \mathbb{E}(\mbox{Bayes}(M,Q))-\mathbb{E}(\mbox{Bayes}(M,P))\geq\epsilon>0.}

However, this contradicts the well known fact that the expectation of Bayes Score is maximized by choosing honest probabilities corresponding to the actual distribution M is chosen from.
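The well known fact used here can be checked for a single sentence with a small grid search. This is only a sketch: the function names and the grid resolution are choices made for this illustration, and it covers one sentence rather than the full weighted sum.

```python
import math

def expected_score(p, q):
    """Expected log2 score for reporting q when the sentence is true with probability p."""
    return p * math.log2(q) + (1 - p) * math.log2(1 - q)

def best_report(p, steps=100):
    """Grid search over reports q in (0, 1) for the one with the highest expected score."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda q: expected_score(p, q))

# On this grid, the best report coincides with the true probability.
best_for_090 = best_report(0.9)
best_for_025 = best_report(0.25)
```

Since the full Bayes Score is a weighted sum of such terms, the same holds sentence by sentence.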

I would be very grateful if anyone can come up with a proof that this probability distribution which maximizes Worst Case Bayes Score has the property that its Bayes Score is independent of the choice of what model we use to judge it. In other words, show that \mbox{Bayes}(M,\mathbb{P}) is independent of M. I believe it is true, but have not yet found a proof.

Terminal and Instrumental Beliefs

As you may know from my past posts, I believe that probabilities should not be viewed as uncertainty, but instead as weights on how much you care about different possible universes. This is a very subjective view of reality. In particular, it seems to imply that when other people have different beliefs than me, there is no sense in which they can be wrong. They just care about the possible futures with different weights than I do. I will now try to argue that this is not a necessary conclusion.

First, let’s be clear what we mean by saying that probabilities are weights on values. Imagine I have an unfair coin which gives heads with probability 90%. I care 9 times as much about the possible futures in which the coin comes up heads as I do about the possible futures in which the coin comes up tails. Notice that this does not mean I want the coin to come up heads. What it means is that I would prefer getting a dollar if the coin comes up heads to getting a dollar if the coin comes up tails.

Now, imagine that you are unaware of the fact that it is an unfair coin. By default, you believe that the coin comes up heads with probability 50%. How can we express the fact that I have a correct belief, and you have an incorrect belief in the language of values?

We will take advantage of the language of terminal and instrumental values. A terminal value is something that you try to get because you want it. An instrumental value is something that you try to get because you believe it will help you get something else that you want.

If you believe a statement S, that means that you care more about the worlds in which S is true. If you terminally assign a higher value to worlds in which S is true, we will call this belief a terminal belief. On the other hand, if you believe S because you think that S is logically implied by some other terminal belief, T, we will call your belief in S an instrumental belief. 

Instrumental values can be wrong if you are factually mistaken in thinking that the instrumental value will help achieve your terminal values. Similarly, an instrumental belief can be wrong if you are factually mistaken in thinking that it is implied by your terminal belief.

Your belief that the coin will come up heads with probability 50% is an instrumental belief. You have a terminal belief in some form of Occam’s razor. This causes you to believe that coins are likely to behave similarly to how coins have behaved in the past. In this case, that was not valid, because you did not take into consideration the fact that I chose the coin for the purpose of this thought experiment. Your instrumental belief is in this case wrong. If your belief in Occam’s razor is terminal, then it would not be possible for Occam’s razor to be wrong.

This is probably a distinction that you are already familiar with. I am talking about the difference between an axiomatic belief and a deduced belief. So why am I viewing it like this? I am trying to strengthen my understanding of the analogy between beliefs and values. To me, they appear to be two different sides of the same coin, and building up this analogy might allow us to translate some intuitions or results from one view into the other view.

Preferences without Existence

My current beliefs say that there is a Tegmark 4 (or larger) multiverse, but there is no meaningful “reality fluid” or “probability” measure on it. We are all in this infinite multiverse, but there is no sense in which some parts of it exist more or are more likely than any other part. I have tried to illustrate these beliefs as an imaginary conversation between two people. My goal is to either share this belief, or more likely to get help from you in understanding why it is completely wrong.

A: Do you know what the game of life is?

B: Yes, of course, it is a cellular automaton. You start with a configuration of cells, and they update following a simple deterministic rule. It is a simple kind of simulated universe.

A: Did you know that when you run the game of life on an initial condition of a 2791 by 2791 square of live cells, and run it for long enough, creatures start to evolve? (Not true)

B: No. That’s amazing!

A: Yeah, these creatures have developed language and civilization. Time step 1,578,891,000,000,000 seems like it is a very important era for them. They have developed much technology, and someone has developed the theory of a doomsday device that will kill everyone in their universe, and replace the entire thing with emptiness, but at the same time, many people are working hard on developing a way to stop him.

B: How do you know all this?

A: We have been simulating them on our computers. We have simulated up to that crucial time.

B: Wow, let me know what happens. I hope they find a way to stop him.

A: Actually, the whole project is top secret now. The simulation will still be run, but nobody will ever know what happens.

B: That’s too bad. I was curious, but I still hope the creatures live long, happy, interesting lives.

A: What? Why do you hope that? It will never have any effect over you.

B: My utility function includes preferences between different universes even if I never get to know the result.

A: Oh, wait, I was wrong. It says here the whole project is canceled, and they have stopped simulating.

B: That is too bad, but I still hope they survive.

A: They won’t survive, we are not simulating them any more.

B: No, I am not talking about the simulation, I am talking about the simple set of mathematical laws that determine their world. I hope that those mathematical laws if run long enough do interesting things.

A: Even though you will never know, and it will never even be run in the real universe?

B: Yeah. It would still be beautiful if it never gets run and no one ever sees it.

A: Oh, wait. I missed something. It is not actually the game of life. It is a different cellular automaton they used. It says here that it is like the game of life, but the actual rules are really complicated, and take millions of bits to describe.

B: That is too bad. I still hope they survive, but not nearly as much.

A: Why not?

B: I think information theoretically simpler things are more important and more beautiful. It is a personal preference. It is much more desirable to me to have a complex interesting world come from simple initial conditions.

A: What if I told you I lied, and none of these simulations were run at all and never would be run. Would you have a preference over whether the simple configuration or the complex configuration had the life?

B: Yes, I would prefer the simple configuration to have the life.

A: Is this some sort of Solomonoff probability measure thing?

B: No actually. It is independent of that. If the only existing things were this universe, I would still want laws of math to have creatures with long happy interesting lives arise from simple initial conditions.

A: Hmm, I guess I want that too. However, that is negligible compared to my preferences about things that really do exist.

B: That statement doesn’t mean much to me, because I don’t think this existence you are talking about is a real thing.

A: What? That doesn’t make any sense.

B: Actually, it all adds up to normality.

A: I see why you can still have preferences without existence, but what about beliefs?

B: What do you mean?

A: Without a concept of existence, you cannot have Solomonoff induction to tell you how likely different worlds are to exist.

B: I do not need it. I said I care more about simple universes than complicated ones, so I already make my decisions to maximize utility weighted by simplicity. It comes out exactly the same, I do not need to believe simple things exist more, because I already believe simple things matter more.

A: But then you don’t actually anticipate that you will observe simple things rather than complicated things.

B: I care about my actions more in the cases where I observe simple things, so I prepare for simple things to happen. What is the difference between that and anticipation?

A: I feel like there is something different, but I can’t quite put my finger on it. Do you care more about this world than that game of life world?

B: Well, I am not sure which one is simpler, so I don’t know, but it doesn’t matter. It is a lot easier for me to change our world than it is for me to change the game of life world. I therefore will make choices that roughly maximize my preferences about the future of this world in the simplest models.

A: Wait, if simplicity changes preferences, but does not change the level of existence, how do you explain the fact that we appear to be in a world that is simple? Isn’t that a priori extremely unlikely?

B: This is where it gets a little bit fuzzy, but I do not think that question makes sense. Unlikely by what measure? You are presupposing an existence measure on the collection of theoretical worlds just to ask that question.

A: Okay, it seems plausible, but kind of depressing to think that we do not exist.

B: Oh, I disagree! I am still a mind with free will, and I have the power to use that will to change my own little piece of mathematics: the output of my decision procedure. To me that feels incredibly beautiful, eternal, and important.

Logical and Indexical Uncertainty

Imagine I shot a photon at a half silvered mirror which reflects the photon with “probability” 1/2 and lets the photon pass through with “probability” 1/2.

Now, imagine I calculated the trillionth decimal digit of pi, and checked whether it was even or odd. As a Bayesian, you use the term “probability” in this situation too, and to you, the “probability” that the digit is odd is 1/2.

What is the difference between these two situations? Assuming the many worlds interpretation of quantum mechanics, the first probability comes from indexical uncertainty, while the second comes from logical uncertainty. In indexical uncertainty, both possibilities are true in different parts of whatever your multiverse model is, but you are unsure which part of that multiverse you are in. In logical uncertainty, only one of the possibilities is true, but you do not have information about which one. It may seem at first like this should not change our decision theory, but I believe there are good reasons why we should care about what type of uncertainty we are talking about.

I present here 7 reasons why we potentially care about the 2 different types of uncertainties. I do not agree with all of these ideas, but I present them anyway, because it seems reasonable that some people might argue for them. Is there anything I have missed?

1) Anthropics

Suppose Sleeping Beauty volunteers to undergo the following experiment, which is described to her before it begins. On Sunday she is given a drug that sends her to sleep, and a coin is tossed. If the coin lands heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug that makes her forget the events of Monday only, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. Beauty wakes up in the experiment and is asked, “With what subjective probability do you believe that the coin landed heads?”

People argue about whether the “correct answer” to this question should be 1/3 or 1/2. Some say that the question is malformed, and needs to be rewritten as a decision theory question. Another view is that the question actually depends on the coin flip:

If the coin flip is an indexical coin flip, then there are effectively 3 copies of Sleeping Beauty, and in 1 of those copies the coin came up heads, so you should assign probability 1/3 to heads. On the other hand, if it is a logical coin flip, then you cannot compare the two copies of you waking up in one possible world with the one copy of you waking up in the other possible world. Only one of the worlds is logically consistent. The trillionth digit of pi is not changed by you waking up, and you will wake up regardless of the state of the trillionth digit of pi.
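The two ways of counting can be seen in a small simulation. This is a sketch only: the function name, trial count, and seed are choices made here, and the code does not settle which count is the right one for either kind of coin. It just shows that counting per experiment and counting per awakening give different frequencies of heads.

```python
import random

def sleeping_beauty(trials=100_000, seed=0):
    """Return (fraction of experiments with heads, fraction of awakenings with heads)."""
    rng = random.Random(seed)
    heads_runs = 0
    awakenings = 0
    heads_awakenings = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        wakes = 1 if heads else 2  # heads: Monday only; tails: Monday and Tuesday
        awakenings += wakes
        if heads:
            heads_runs += 1
            heads_awakenings += wakes
    return heads_runs / trials, heads_awakenings / awakenings

per_experiment, per_awakening = sleeping_beauty()
```

The per-experiment frequency comes out near 1/2, while the per-awakening frequency comes out near 1/3, matching the "3 copies" count.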

2) Risk Aversion

Imagine that I were to build a doomsday device. The device flips a coin, and if the coin comes up heads, it destroys the Earth, and everything on it. If the coin comes up tails, it does nothing. Would you prefer if the coin flip were a logical coin flip, or an indexical coin flip?

You probably prefer the indexical coin flip. It feels safer to have the world continue on in half of the universes than to risk destroying the world in all universes. I do not think this feeling arises from biased thinking, but instead from a true difference in preferences. To me, destroying the world in all of the universes is actually much more than twice as bad as destroying the world in half of the universes.

3) Preferences vs Beliefs

In updateless decision theory, you want to choose the output of your decision procedure. If there are multiple copies of yourself in the universe, you do not ask about which copy you are, but instead just choose the output which maximizes your utility of the universe in which all of your copies output that value. The “expected” utility comes from your logical uncertainty about what the universe is like. There is not much room in this theory for indexical uncertainty. Instead the indexical uncertainty is encoded into your utility function. The fact that you prefer to be given a reward with indexical probability 99% than given a reward with indexical probability 1% should instead be viewed as you preferring the universe in which 99% of the copies of you receive the reward to the universe in which 1% of the copies of you receive the reward.

In this view, it seems that indexical uncertainty should be viewed as preferences, while logical uncertainty should be viewed as beliefs. It is important to note that this all adds up to normality. If we are trying to maximize our expected utility, the only thing we do with preferences and beliefs is multiply them together, so for the most part it doesn’t change much to think of something as a preference as opposed to belief.
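The remark about multiplying preferences and beliefs together can be illustrated with a small sketch. The action names, weights, and payoffs below are made up for this example. The same weights are applied once as probabilities and once folded into the utilities, and the chosen action comes out the same.

```python
def expected_utility(weights, utilities):
    """Sum of weight * utility over possible worlds."""
    return sum(w * u for w, u in zip(weights, utilities))

def best_action(weights, actions):
    """Name of the action with the highest weighted utility."""
    return max(actions, key=lambda name: expected_utility(weights, actions[name]))

# Hypothetical example: two worlds, two actions, utilities listed per world.
weights = [0.99, 0.01]
actions = {"bet_on_first": [10.0, 0.0], "bet_on_second": [0.0, 100.0]}

# Reading 1: the weights are beliefs (probabilities of the two worlds).
choice_as_belief = best_action(weights, actions)

# Reading 2: the weights are preferences, folded into the utilities,
# and every world gets the same flat weight.
folded = {name: [w * u for w, u in zip(weights, us)] for name, us in actions.items()}
choice_as_preference = best_action([1.0, 1.0], folded)
```

Only the product of weight and utility enters the decision, so moving the weight from one factor to the other leaves the choice unchanged.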

4) Altruism

In Subjective Altruism, I asked a question about whether or not, when being altruistic towards someone else, you should try to maximize their expected utility relative to your probability function or relative to their probability function. If your answer was to choose the option which maximizes your expectation of their utility, then it is actually very important whether indexical uncertainty is a belief or a preference.

5) Sufficient Reflection

In theory, given enough time, you can settle logical uncertainties just by thinking about them. However, given enough time, you can settle indexical uncertainties by making observations. It seems to me that there is not a meaningful difference between observations that take place entirely within your mind and observations about the outside world. I therefore do not think this difference means very much.

6) Consistency

Logical uncertainty seems like it is harder to model, since it means you are assigning probabilities to possibly inconsistent theories, and all inconsistent theories are logically equivalent. You might want some measure of equivalence of your various theories, and it would have to be different from logical equivalence. Indexical uncertainty does not appear to have the same issues, at least not in an obvious way. However, I think this issue only comes from looking at the problem in the wrong way. I believe that probabilities should only be assigned to logical statements, not to entire theories. Then, since everything is finite, you can treat sentences as equivalent only after you have proven them equivalent.

7) Counterfactual Mugging

Omega appears and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don’t want to give up your $100. But Omega also tells you that if the coin came up heads instead of tails, it’d give you $10000, but only if you’d agree to give it $100 if the coin came up tails.

It seems reasonable to me that people might feel very differently about this question based on whether the coin is logical or indexical. To me, it makes sense to give up the $100 either way, but it seems possible to change the question in such a way that the type of coin flip might matter.
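For reference, the arithmetic behind giving up the $100 can be sketched as the value of a policy, evaluated before the coin is tossed. The function name and parameters are chosen for this illustration, and treating dollars as utility is an assumption.

```python
def policy_value(pays_when_asked, p_heads=0.5, cost=100, reward=10_000):
    """Expected dollars, before the toss, for an agent with a fixed policy."""
    tails_outcome = -cost if pays_when_asked else 0
    heads_outcome = reward if pays_when_asked else 0  # Omega rewards only payers
    return p_heads * heads_outcome + (1 - p_heads) * tails_outcome

value_if_paying = policy_value(True)     # 0.5 * 10000 - 0.5 * 100
value_if_refusing = policy_value(False)
```

This calculation treats the coin the same way whether it is logical or indexical, which is exactly the step that someone might want to question.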

Thought Crimes

In my morals, at least up until recently, one of the most obvious universal rights was freedom of thought. Agents should be allowed to think whatever they want, and should not be discouraged from doing so. This feels like a terminal value to me, but it is also instrumentally useful. Freedom of thought encourages agents to be rational and search for the truth. If you are punished for believing something true, you might not want to search for truth. This could slow science and hurt everyone. On the other hand, religions often discourage freedom of thought, and this is a major reason for my moral problems with religions. It is not just that religions are wrong; everyone is wrong about lots of stuff. It is that many religious beliefs restrict freedom of thought by punishing doubters with ostracism or eternal suffering. I recognize that there are some “religions” which do not exhibit this flaw (as much).

Recently, my tune has changed. There are two things which have caused me to question the universality of the virtue of freedom of thought:

1) Some truths can hurt society

Topics like unfriendly artificial intelligence make me question the assumption that I always want intellectual progress in all areas. If we as a modern society were to choose one topic on which restricting thought might be very useful, UFAI seems like a good choice. Freedom of thought on this issue might be a necessary casualty to avoid a much worse outcome.

2) Simulations

This is the main point I want to talk about. If we get to the point where minds can simulate other minds, then we run into major issues. Should one mind be allowed to simulate another mind and torture it? It seems like the answer should be no, but this rule seems very hard to enforce without sacrificing not only free thought, but what would seem like the most basic right to privacy. Even today, people can have preferences over the thoughts of other people, but our intuition tells us that the one who is doing the thinking should get the final say. If the mind is simulating another mind, shouldn’t the simulated mind also have rights? What makes advanced minds simulating torture so much worse than a human today thinking about torture? (Or even worse, thinking about 3^^^^3 people with dust specks in their eyes. (That was a joke, I know we can’t actually think about 3^^^^3 people.))

The first thing seems like a possible practical concern, but it does not bother me nearly as much as the second one. The first seems like it is just an example of the basic right of freedom of thought contradicting another basic right of safety. However, the second thing confuses me. It makes me wonder whether or not I should treat freedom of thought as a virtue as much as I currently do. I am also genuinely not sure whether or not I believe that advanced minds should not be free to do whatever they want to simulations in their own minds. I think they should not, but I am not sure about this, and I do not know if this restriction should be extended to humans.

What do you think? What is your view on the morality of drawing the line between the rights of a simulator and the rights of a simulatee? Do simulations within human minds have any rights at all? What conditions (if any) would make you think rights should be given to simulations within human minds?

Functional Side Effects

You have probably heard the argument in favor of functional programming languages that functions act like functions in mathematics, and therefore have no side effects. When you call a function, you get an output, and with the exception of possibly the running time nothing matters except for the output that you get. This is in contrast with other programming languages where a function might change the value of some other global variable and have a lasting effect.

Unfortunately the truth is not that simple. All functions can have side effects. Let me illustrate this with Newcomb’s problem. In front of you are two boxes. The first box contains 1000 dollars, while the second box contains either 1,000,000 or nothing. You may choose to take either both boxes or just the second box. An Artificial Intelligence, Omega, can predict your actions with high accuracy, and has put 1,000,000 in the second box if and only if he predicts that you will take only the second box.

You, being a good reflexive decision agent, take only the second box, and it contains 1,000,000.
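One common way to see why taking only the second box looks attractive is to compare expected payoffs as a function of Omega's accuracy. This is a rough sketch: the `accuracy` parameter is an assumption standing in for "high accuracy", and conditioning the box contents on your choice in this way is itself a contested step.

```python
def expected_payoffs(accuracy, small=1_000, big=1_000_000):
    """Expected dollars for one-boxing and two-boxing, given predictor accuracy."""
    one_box = accuracy * big                # box is full when the prediction was right
    two_box = small + (1 - accuracy) * big  # box is full only when Omega erred
    return one_box, two_box

one_box_99, two_box_99 = expected_payoffs(0.99)
one_box_50, two_box_50 = expected_payoffs(0.5)
```

With a 99% accurate predictor, one-boxing comes out far ahead; with a coin-flip predictor, two-boxing is ahead by the 1000 dollars in the first box.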

Omega can be viewed as a single function in a functional programming language, which takes in all sorts of information about you and the universe, and outputs a single number, 1,000,000 or 0. This function has a side effect. The side effect is that you take only the second box. If Omega did not simulate you and just output 1,000,000, and you knew this, then you would take two boxes.

Perhaps you are thinking “No, I took one box because I BELIEVED I was being simulated. This was not a side effect of the function, but instead a side effect of my beliefs about the function. That doesn’t count.”

Or, perhaps you are thinking “No, I took one box because of the function from my actions to states of the box. The side effect is no way dependent on the interior workings of Omega, but only on the output of Omega’s function in counterfactual universes. Omega’s code does not matter. All that matters is the mathematical function from the input to the output.”

These are reasonable rebuttals, but they do not carry over to other situations.

Imagine two programs, Omega 1 and Omega 2. They both simulate you for an hour, then output 0. The only difference is that Omega 1 tortures the simulation of you for an hour, while Omega 2 tries its best to satisfy the values of the simulation of you. Which of these functions would you rather be run?

The fact that you have a preference between these (assuming you do have a preference) shows that the function has a side effect that is not just a consequence of the function application in counterfactual universes.

Further, notice that even if you never know which function is run, you still have a preference. It is possible to have preferences over things that you do not know about. Therefore, this side effect is not just a function of your beliefs about Omega.

Sometimes the input-output model of computation is an oversimplification.
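As a rough illustration of the Omega 1 and Omega 2 comparison, here is a minimal Python sketch. The function names, the `world_log` list, and the log strings are all hypothetical stand-ins for "what actually happens while the program runs"; nothing here models a real simulation.

```python
def omega_1(agent_description, world_log):
    # Hypothetical side effect: what happens to the simulated agent during the run.
    world_log.append("simulation tortured for an hour")
    return 0


def omega_2(agent_description, world_log):
    # Same input and same return value, but a different side effect.
    world_log.append("simulation treated well for an hour")
    return 0


log_1, log_2 = [], []
same_output = omega_1("you", log_1) == omega_2("you", log_2)
same_side_effects = log_1 == log_2
print(same_output, same_side_effects)
```

Viewed purely as maps from input to output, the two functions are indistinguishable; the difference shows up only in what they did along the way.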

Let’s look at an application of thinking about side effects to Wei Dai’s Updateless Decision Theory. I will not try to explain UDT if you don’t already know about it, so this post should not be viewed alone.

UDT 1.0 is an attempt at a reflexive decision theory. It views a decision agent as a machine with code S, given input X, and having to choose an output Y. It advises the agent to consider different possible outputs, Y, and consider all consequences of the fact that the code S when run on X outputs Y. It then outputs the Y which maximizes his perceived utility of all the perceived consequences.

Wei Dai noticed an error with UDT 1.0 with the following thought experiment:

“Suppose Omega appears and tells you that you have just been copied, and each copy has been assigned a different number, either 1 or 2. Your number happens to be 1. You can choose between option A or option B. If the two copies choose different options without talking to each other, then each gets $10, otherwise they get $0.”

The problem is that all the reasons that S(1)=A are the exact same reasons why S(2)=A, so the two copies will probably output the same result. Wei Dai proposes a fix, UDT 1.1, which is that instead of choosing an output S(1), you instead choose a function S, from {1,2} to {A,B}, from the 4 available functions, which maximizes utility. I think this was not the correct correction, which I will probably talk about in the future. I prefer UDT 1.0 to UDT 1.1.
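To make the UDT 1.1 step concrete, here is a small Python sketch that enumerates the 4 functions from {1,2} to {A,B} and scores each with the payoff from the thought experiment. The names `payoff`, `policies`, and `best_policies` are illustrative only, and the sketch covers just this one toy problem, not UDT in general.

```python
from itertools import product

INPUTS = (1, 2)
OPTIONS = ("A", "B")


def payoff(policy):
    # Each copy gets $10 if the two copies choose different options, otherwise $0.
    return 10 if policy[1] != policy[2] else 0


# All 4 functions from {1, 2} to {A, B}, each represented as a dictionary.
policies = [dict(zip(INPUTS, outputs))
            for outputs in product(OPTIONS, repeat=len(INPUTS))]

best_value = max(payoff(p) for p in policies)
best_policies = [p for p in policies if payoff(p) == best_value]
print(best_policies)
```

Two of the four functions reach the maximum, so both copies would still need some shared way of settling on the same one; how that tie is broken is not specified here.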

Instead, I would like to offer an alternative way of looking at this thought experiment. The error is in the fact that S only looked at the outputs, and ignored possible side effects. I am aware that when S looked at the outputs, he was also considering his output in simulations of himself, but those are not side effects of the function. Those are direct results of the output of the function.

We should look at this problem and think, “I want to output A or B, but in such a way that has the side effect that the other copy of me outputs B or A respectively.” S could search through functions considering their output on input 1 and the side effects of that function. S might decide to run the UDT 1.1 algorithm, which would have the desired result.

The difference between this and UDT 1.1 is that in UDT 1.1 S(1) is acting as though it had complete control over the output of S(2). In this thought experiment that seems like a fair assumption, but I do not think it is a fair assumption in general, so I am trying to construct a decision theory which does not have to make this assumption. This is because if the problem was different, then S(1) and S(2) might have had different utility functions.

Generalized Even Odds

The discussion on Less Wrong about my recent post, Even Odds, raised questions about what to do with more than two alternatives, and what to do with more than two players. Here are some generalizations.

If you have 3 or more different possibilities, then the generalization of the algorithm is simple. Both players report all of their probabilities. Find a proposition P such that the expected payoff for both players if you run the even odds algorithm on P is maximized, and run the even odds algorithm on P. (Break ties randomly.) This is clearly fair. To show that this is strategy proof, assume that Alice were to lie about her probabilities. There are two cases. Either Alice’s lie changes the proposition P, or it does not. If the lie does not change P, then Alice is no better off for lying, by the strategy proofness of the original even odds algorithm. If Alice’s lie changes the proposition from P to Q, then notice that Alice prefers telling the truth and betting on P to telling the truth and betting on Q, and Alice prefers telling the truth and betting on Q to lying and betting on Q, so Alice must prefer telling the truth and betting on P to any lie which makes her bet on Q. Therefore, this algorithm is strategy proof.
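The selection step can be sketched in Python as follows. This post does not restate the even odds payoff rule, so `expected_payoff` below is a labeled stand-in: it only assumes that expected profit grows with the gap between the two reported probabilities. The function names and the example probabilities are likewise hypothetical.

```python
import random


def expected_payoff(p_alice, p_bob):
    # ASSUMPTION: stand-in for the even odds expected payoff, taken here to
    # grow with the gap between the reported probabilities.
    return (p_alice - p_bob) ** 2


def choose_proposition(alice, bob, rng=random):
    """alice, bob: dicts mapping each proposition to a reported probability."""
    scores = {prop: expected_payoff(alice[prop], bob[prop]) for prop in alice}
    best = max(scores.values())
    candidates = [prop for prop, score in scores.items() if score == best]
    return rng.choice(candidates)  # break ties randomly


alice = {"X": 0.7, "Y": 0.2, "Z": 0.1}
bob = {"X": 0.3, "Y": 0.3, "Z": 0.4}
chosen = choose_proposition(alice, bob)
print(chosen)
```

Any stand-in that increases with the gap picks the same proposition, so the choice of the squared gap affects only the scores, not which P is selected.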

If there are 3 people, observe that the player who assigns the highest probability to the proposition and the player who assigns the lowest probability to the proposition have no incentive to wager with anyone besides each other, since they get more profit for wagering with probabilities that are further apart. Therefore, it would make sense to not force these two players to bet with anyone else. However, this is not strategy proof, since now the third player might be incentivised to lie slightly about his probability in order to participate in a bet. This would not be fair or strategy proof.

Instead, we want to do all pairwise bets, with each player willing to bet half their maximum in each bet. This would clearly be strategy proof, since the component bets are strategy proof. However, this is not fair. If one of the bets expects a higher profit than the others, then the player not participating in that high-wager bet expects less profit than the other two. This can be fixed by saying that for each bet, if both players are expected to gain x dollars, then they both also give x/3 dollars to the third player. This makes the wager also fair. It is still strategy proof, since we scaled expected payout by 2/3. However, if a player believes a statement 100%, and the other two think he is wrong 100%, and he is wrong, then he will end up paying d/2 in each bet, and an additional d/6 to each player as a tax for the bets he thought he would win. This totals 4d/3. We should therefore renormalize and have each bet be at 3d/8 instead, so that each player pays at most d total.
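The worst-case arithmetic for three players can be checked with exact fractions. This is only a sketch: the function name is made up, and it follows the paragraph above in taking the expected gain x of a fully confident player to be the whole stake of the bet.

```python
from fractions import Fraction


def three_player_worst_case(stake):
    """Total paid by a player who is 100% confident, and wrong, against two others."""
    lost_bets = 2 * stake        # he loses the full stake in both of his bets
    tax = 2 * (stake / 3)        # x/3 to the third player per bet, with x = stake
    return lost_bets + tax


d = Fraction(1)                                      # the most a player agreed to lose
before = three_player_worst_case(d / 2)              # stakes of d/2 per bet
after = three_player_worst_case(Fraction(3, 8) * d)  # renormalized stakes of 3d/8
print(before, after)
```

With stakes of d/2 the total comes to 4d/3, and scaling every stake by 3/4 (to 3d/8) brings the worst case back to exactly d.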

To generalize this to n players, run all n(n-1)/2 pairwise bets. For each bet, both players pay 1/n times their expected profit to every other player. Each of these bets is an implementation of the even odds algorithm with maximum bet dn/(2(n-1)^2).
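As a sanity check on the maximum bet dn/(2(n-1)^2), the following Python sketch computes the most a single player can pay out across all of his bets. The function names are hypothetical, and the calculation assumes the same worst case as in the three-player discussion: a player who is fully confident in every one of his n-1 bets, loses all of them, and pays tax on the full stake of each.

```python
from fractions import Fraction


def max_bet(d, n):
    # Maximum stake per pairwise bet, as given in the text.
    return d * n / (2 * (n - 1) ** 2)


def worst_case_total(d, n):
    stake = max_bet(d, n)
    bets = n - 1                          # one bet against each other player
    lost = bets * stake                   # every bet is lost in full
    tax = bets * (n - 2) * stake / n      # stake/n to each of the n-2 bystanders, per bet
    return lost + tax


d = Fraction(1)
totals = {n: worst_case_total(d, n) for n in range(2, 8)}
print(totals)
```

For n = 3 the stake is 3d/8, matching the three-player case, and the worst-case total works out to d for every n tried here.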

The algorithm for n options seems optimal to me, but I am not sure that the algorithm for n players is optimal.