• leo
    882
    What are the following in your view?

    1. Probability

    2. Determinism

    3. Non-determinism
    TheMadFool

    1. Probability expresses incomplete knowledge that we have about a system.

    2. The exact same initial states in a deterministic system lead to the exact same outcome.

    3. The exact same initial states in a non-deterministic system can lead to different outcomes.

    What are they in your view?

    Also read my previous two posts carefully, I think eventually it will click for you. I’m taking quite a lot of time to help you understand, so it would be fair if you took at least as much time to read and attempt to understand my posts.
  • leo
    882
    As to why the observed frequencies are often (not always) close to 1/6, consider the following:

    Since in a deterministic system the outcome depends solely on the initial state, the observed frequencies of the outcomes depend solely on the initial states that are chosen.

    And the key point to understand: there are many more ways to pick initial states leading to outcomes that have similar frequency, than there are ways to pick initial states leading to outcomes with very different frequencies. This is a result arrived at through combinatorics, something I have mentioned a few times but that you have consistently ignored.

    And this result implies that in most experiments where the die is thrown arbitrarily (that is where we aren’t preferring some initial states over some others), the observed outcomes have a similar frequency, close to 1/6. Which in no way implies that the die is behaving non-deterministically at any point. Nor that the initial states are chosen non-deterministically.
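    The point above can be sketched with a small simulation. The map below is a toy stand-in I've chosen for the die's physics, not an actual model of a die: the outcome is a fixed, deterministic function of the initial state, and yet when initial states are picked arbitrarily (here, sampled without preferring any region of the state space), the observed frequencies come out close to 1/6.

```python
import random

def deterministic_die(initial_state: float) -> int:
    """Toy stand-in for the die's physics: a fixed, deterministic map
    from an initial state in [0, 1) to an outcome in {1, ..., 6}.
    The same initial state always yields the same outcome."""
    return int(initial_state * 10**8) % 6 + 1

# "Arbitrary" initial states: a seeded PRNG is used purely as a way of
# picking states without preferring some initial states over others.
rng = random.Random(0)
throws = [deterministic_die(rng.random()) for _ in range(60_000)]

freqs = {face: throws.count(face) / len(throws) for face in range(1, 7)}
for face, freq in sorted(freqs.items()):
    print(face, round(freq, 3))  # each close to 1/6 ≈ 0.167
```

    Nothing non-deterministic happens anywhere in this code; the near-1/6 frequencies come entirely from how the initial states are spread over the state space.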
  • TheMadFool
    13.8k
    1. Probability expresses incomplete knowledge that we have about a system.

    2. The exact same initial states in a deterministic system lead to the exact same outcome.

    3. The exact same initial states in a non-deterministic system can lead to different outcomes.

    What are they in your view?

    Also read my previous two posts carefully, I think eventually it will click for you. I’m taking quite a lot of time to help you understand, so it would be fair if you took at least as much time to read and attempt to understand my posts.
    leo

    That's a great explanation. Thank you for your time and patience.

    However...

    In your definition of non-determinism you concede that there is something you don't know, viz. the outcomes, and then you go on to say that probability is about incomplete knowledge. So it must follow that non-determinism is just probability. Or are you claiming that there's a difference that depends on what you're ignorant about (only the initial states or only the outcomes), so that probability would be a matter of ignorance regarding initial states while non-determinism would be ignorance about outcomes despite having knowledge of the initial states?

    If that's the case you're making, then non-determinism can't be understood in any way, because the outcomes will not exhibit any pattern whatsoever. In other words, non-determinism is true randomness with every outcome having equal probability, and that brings us to where we began - that non-determinism = probability.
  • leo
    882
    In your definition of non-determinism you concede that there is something you don't know, viz. the outcomes, and then you go on to say that probability is about incomplete knowledge. So it must follow that non-determinism is just probability. Or are you claiming that there's a difference that depends on what you're ignorant about (only the initial states or only the outcomes), so that probability would be a matter of ignorance regarding initial states while non-determinism would be ignorance about outcomes despite having knowledge of the initial states?

    If that's the case you're making, then non-determinism can't be understood in any way, because the outcomes will not exhibit any pattern whatsoever. In other words, non-determinism is true randomness with every outcome having equal probability, and that brings us to where we began - that non-determinism = probability.
    TheMadFool

    Long post here, I hope you will read all of it carefully in order to understand, I could have made it shorter but I wanted to answer you as clearly as possible.


    1.
    In a deterministic system, the outcome is a deterministic function of the initial state, let’s write it O = f(I). No matter how many times you run the system from the same initial state I, you get the same outcome O.

    You can have incomplete knowledge of the initial state I, or incomplete knowledge of how the system behaves (the function f), or incomplete knowledge of the outcomes O, or any combination of the three.

    1.a)
    Without any knowledge about that system, we don’t know anything about the outcomes, anything is possible.

    1.b)
    If we know that the system involves the throw of a six-sided die numbered from one to six and the outcome is the top face of the die when the die has stopped moving, we know that the outcome can be any number in the set {1, 2, 3, 4, 5, 6}. This counts as partial knowledge of the outcomes O.

    You can express that by saying that any outcome outside of this set cannot occur, that it has 0% probability of occurring. But for now you have zero knowledge of whether all outcomes in this set actually occur, in principle it is possible that the outcome is always ‘3’, so at this point you can’t assign any probability to the outcomes in the set.

    1.c)
    If we know that the initial state of the die is the initial position/orientation/velocity of the die, we can determine the range of possible initial states that exist, this counts as partial knowledge of the initial states I.

    But we still don’t know anything about the function f, we still don’t know how any initial state transforms into any outcome, so we still can’t assign probabilities to the outcomes {1, 2, 3, 4, 5, 6}.

    1.d)
    If we know that the die is perfectly symmetrical, then combining that knowledge with our incomplete knowledge of the initial states and outcomes described in the previous paragraphs, we can conclude that 1/6th of the initial states lead to outcome ‘1’, 1/6th of the initial states lead to outcome ‘2’, 1/6th of the initial states lead to outcome ‘3’, and so on. This is the same as saying that each outcome has probability 1/6 of being realized, that’s the definition of probability. This result isn’t obvious but it can be proven mathematically, offering us partial knowledge of the function f.

    If we knew nothing of the function f, even if we knew the initial states perfectly we couldn’t predict anything about the outcomes, we couldn’t predict how the die is going to behave while it is flying and bouncing, but the symmetries of the die and the determinism of the function f allow us to say that no matter how the die behaves, it behaves exactly the same whether in the initial state the side ‘1’ is facing upwards or any other side is facing upwards.

    1.e)
    Then if we have more complete knowledge of the function f (more complete knowledge of how the die behaves while it is flying and bouncing), and more complete knowledge of the initial state when the die is thrown, we can predict the outcome more accurately, and this changes the probabilities from 1/6 to something else that depends on the initial state. And if we have complete knowledge then we can predict the outcomes exactly from the initial states and we don’t need to talk of probabilities anymore.


    2.
    In a non-deterministic system, the outcome is a non-deterministic function of the initial state, let’s write it O = pf(I). Even if you run the system many times from the same initial state I, you don’t always get the same outcome O. You may get some outcomes more often than some others, but you don’t get only one outcome.

    In such a system, even if you gain complete knowledge of the initial states and of the function pf, you still don’t know what outcome you are going to get each time you run the experiment. But you do know things, for instance you may know which outcomes are possible (have a non-zero probability of occurring) and which outcomes are impossible (have zero probability of occurring). You may know that some outcomes are more likely than others, and assign probabilities to them.

    So as you can see, it is not the case that in a non-deterministic system the outcomes will not exhibit any pattern whatsoever, it isn’t the case that a non-deterministic system is totally random. You may know that starting from initial state I, the outcome will be O1 90% of the time and O2 10% of the time. But this probability is irreducible, in the sense that it cannot be removed from gaining more knowledge, because that additional knowledge doesn’t exist.

    Basically in non-deterministic systems there is irreducible probability even if you have complete knowledge of the system, whereas in deterministic systems the probabilities are only a sign of incomplete knowledge, and disappear when we have complete knowledge.


    Personally I believe that we shouldn’t even make a distinction, I believe that there is no such thing as non-deterministic systems, that probabilities are always due to incomplete knowledge. And when you see things that way you clearly see that formulating probabilities isn’t a sign that you’re dealing with a non-deterministic system, but merely that you’re expressing the incomplete knowledge you have of a system.
  • TheMadFool
    13.8k
    1.d)
    If we know that the die is perfectly symmetrical, then combining that knowledge with our incomplete knowledge of the initial states and outcomes described in the previous paragraphs, we can conclude that 1/6th of the initial states lead to outcome ‘1’, 1/6th of the initial states lead to outcome ‘2’, 1/6th of the initial states lead to outcome ‘3’, and so on. This is the same as saying that each outcome has probability 1/6 of being realized, that’s the definition of probability. This result isn’t obvious but it can be proven mathematically, offering us partial knowledge of the function f.
    leo

    This is what I've been saying all along. Deterministic systems can behave probabilistically.

    Let me get this straight.

    1. In a deterministic system there's a well defined function that maps each initial state (I) to a unique outcome (O) like so: f(I) = O.

    2. In a non-deterministic system there is no such function, because there is more than one outcome, e.g. initial state A could lead to outcomes x, y, z,...

    You mentioned a "function" pf(I) = O, but if memory serves, a function can't have more than one output, which is what's happening in non-deterministic systems according to you: one initial state and multiple outcomes.

    Basically in non-deterministic systems there is irreducible probability even if you have complete knowledge of the system, whereas in deterministic systems the probabilities are only a sign of incomplete knowledge, and disappear when we have complete knowledge.leo

    A fine point. :up:

    So as you can see, it is not the case that in a non-deterministic system the outcomes will not exhibit any pattern whatsoever, it isn’t the case that a non-deterministic system is totally random.leo

    So, there's a difference between non-determinism and randomness but you have to admit that both can be described with mathematical probability.

    Thanks for being so helpful.
  • leo
    882
    This is what I've been saying all along. Deterministic systems can behave probabilistically.TheMadFool

    No no this is where your confusion lies. What do you mean exactly by “behave probabilistically”? It can be interpreted in various ways:


    I. Either you mean “behave non-deterministically”, but by definition a deterministic system does not behave non-deterministically. Also in order to arrive at the result that “1/6th of initial states lead to a specific outcome” we had to assume in the first place that the system behaves deterministically, so this result does not mean at all that the system behaves non-deterministically.


    II. Or you mean that the behavior of the system depends on probabilities. But as we have seen, probabilities in a deterministic system are an expression of our incomplete knowledge of that system, and surely the behavior of a deterministic system does not depend on the knowledge that we have about it. So it isn’t meaningful to say that a deterministic system “behaves probabilistically” in this sense.


    III. Or you mean that when we throw the die many times, the observed frequencies converge towards 1/6 for each outcome, which is similar to how a non-deterministic system would behave. But you have to realize that this is false, because:

    a) If we have complete knowledge of the deterministic system, we can throw the die such that we always get the outcomes we want, and then the observed frequencies can be totally different from “1/6 for each outcome”. Would you still say that the system “behaves probabilistically” then?

    b) Whereas in a non-deterministic system, you might always start from the same initial state and get 6 different outcomes each with frequency 1/6; you never get that in a deterministic system.

    c) In the experiment of the die the observed frequencies converge towards 1/6 only in special cases: when the initial states are chosen such that they lead to outcomes with similar frequencies. As it turns out, when we have no knowledge of the initial states we often choose them unwittingly in this manner, for the simple reason that there are many more combinations of initial states that have this property than there are combinations of initial states without this property (this result can be arrived at through combinatorics, if you understand this it will finally click for you, but you will never understand if you keep ignoring this).


    1. In a deterministic system there's a well defined function that maps each initial state (I) to a unique outcome (O) like so: f(I) = O.

    2. In a non-deterministic system there is no such function, because there is more than one outcome, e.g. initial state A could lead to outcomes x, y, z,...

    You mentioned a "function" pf(I) = O, but if memory serves, a function can't have more than one output, which is what's happening in non-deterministic systems according to you: one initial state and multiple outcomes.
    TheMadFool

    A function maps inputs to outputs. Deterministic functions (the ones we are used to) map one or several inputs to one output. Non-deterministic functions can map one input to several outputs.

    One example of such a non-deterministic function would be: when input is I, there is 90% probability that output is O1, and 10% probability that output is O2. Each time you run the function you only get one output, either O1 or O2. But when you run it a very high number of times, you would get O1 90% of the time and O2 10% of the time.
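    This example can be sketched as follows (using the post's own names pf, O1, O2; note that on a computer we can only imitate non-determinism with a pseudo-random generator, which is itself deterministic hidden state):

```python
import random

def pf(I, rng=random):
    """Sketch of the non-deterministic function described above:
    from the very same input I, the output is 'O1' 90% of the time
    and 'O2' 10% of the time."""
    return "O1" if rng.random() < 0.9 else "O2"

# Running it many times from the exact same initial state:
rng = random.Random(42)
outcomes = [pf("I", rng) for _ in range(100_000)]
print(outcomes.count("O1") / len(outcomes))  # close to 0.9
```

    Each individual call yields only one output, yet over many runs the frequencies settle near the stated probabilities, which is the pattern referred to above.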

    If that makes you uneasy, you can consider like I do that non-deterministic systems fundamentally do not exist, that if we get different outcomes from the exact same initial state, it’s simply that we falsely believe that it was the exact same initial state, while in fact there was something different about it that we didn’t take into account.

    So, there's a difference between non-determinism and randomness but you have to admit that both can be described with mathematical probability.TheMadFool

    There is randomness involved in non-deterministic systems, sure. And randomness can be described with probabilities, sure. But this does not imply that there is randomness in deterministic systems. Because the probabilities in deterministic systems refer to incomplete knowledge, not to randomness. A deterministic system only seems to have randomness in it when we don’t fully understand it.

    For instance with a poor understanding of how planets move, their motion in the sky can seem to be partially random, but apparent randomness isn’t fundamental randomness. As another example, you believe there is fundamental randomness involved in the throw of a die, because you haven’t yet understood fully how the frequencies that we observe can be explained without invoking randomness.
  • TheMadFool
    13.8k
    No no this is where your confusion lies. What do you mean exactly by “behave probabilistically”?leo

    There is no confusion at all. A die is deterministic and it behaves probabilistically. This probably needs further clarification.

    A die is a deterministic system in that each initial state has one and only one outcome but if the initial states are random then the outcomes will be random.
  • TheMadFool
    13.8k
    :up: :smile: :ok:
  • leo
    882
    A die is deterministic and it behaves probabilistically. This probably needs further clarification because it looks like you're confused.TheMadFool

    I explained carefully why saying that “the die behaves probabilistically” is at best meaningless and at worst a contradiction, and yet you’re saying I’m the one who is confused ...

    A die is a deterministic system in that each initial state has one and only one outcome but if the initial states are random then the outcomes will be random.TheMadFool

    In a deterministic system where all initial states lead to the same outcome, even if the initial states are picked randomly the outcome isn’t random.

    In a deterministic system where the initial states don't all lead to the same outcome, there are subsets of initial states within which, even if you pick initial states randomly, the outcomes aren't random.

    In the example of the die you can pick the initial states deterministically (rather than randomly) and still get outcomes with frequency 1/6.

    Clearly, the underlying reason why the observed frequencies are often 1/6 is not that the initial states are picked randomly, you are still confused about that.

    It is correct that picking the initial states randomly in the example of the die leads often (not always) to frequencies close to 1/6 for each outcome, but it is incorrect to believe that randomness is required to obtain such outcomes. The same frequencies can be obtained deterministically.

    With your current understanding, you can’t explain why we can pick initial states deterministically and get outcomes with frequency 1/6 each. Because your understanding is incomplete. Now you can keep believing I’m the one who is confused if you want, but meanwhile you’re the one who hasn’t addressed many of the points I’ve made.
  • Dawnstorm
    242
    There is no confusion at all. A die is deterministic and it behaves probabilistically. This probably needs further clarification.

    A die is a deterministic system in that each initial state has one and only one outcome but if the initial states are random then the outcomes will be random.
    TheMadFool

    A variable has an event space, and that event space has a distribution. How you pick a value for the variable determines whether the variable is independent or dependent. An independent variable can be a random variable, and a dependent variable can depend on one or more random variables.

    How we retrieve the values for the variable in an experiment (i.e. if it's a random variable or not) has no influence on the distribution of the event space of the variable, but it can introduce a bias into our results.

    That the same variable with the same distribution can have its values computed or chosen at random in different mathematical contexts is no mystery. It's a question of methodology.
  • TheMadFool
    13.8k
    I explained carefully why saying that “the die behaves probabilistically” is at best meaningless and at worst a contradiction, and yet you’re saying I’m the one who is confused ...leo

    With your current understanding, you can’t explain why we can pick initial states deterministically and get outcomes with frequency 1/6 each.leo

    I think I get what you mean.

    Assuming that the die is a deterministic system two things are possible:

    A. The usual way we throw the die - randomly - without knowing the initial state. The outcomes in this case would have a relative frequency that can be calculated in terms of the ratio between desired outcomes and total number of possible outcomes. It doesn't get more probabilistic than this does it?

    B. If we have complete information about the die then we can deliberately select the initial states to produce outcomes that look exactly like A above with perfectly matching relative frequencies.

    When I said "a deterministic system is behaving probabilistically" I did so on the basis of A above. The reason for this is simple: though each outcome is fully determined by the initial state of the die, the initial states were themselves randomly selected which precludes definite knowledge of outcomes. Thus we must resort to probability theory and it seems to work pretty well; too well in my opinion in that the die when thrown without knowledge of the initial states behaves in a way that matches theoretical probability.

    I'm in no way saying B can't be done.

    However, there's a major difference between A and B, to wit: the probabilities on a single throw of the die will be poles apart. In situation A, the probability of any outcome will be between 0 and 1, but it will never be 1 or 100%, whereas in situation B every outcome will have a probability of 1 or 100%.


    A variable has an event space, and that event space has a distribution. How you pick a value for the variable determines whether the variable is independent or dependent. An independent variable can be a random variable, and a dependent variable can depend on one or more random variables.

    How we retrieve the values for the variable in an experiment (i.e. if it's a random variable or not) has no influence on the distribution of the event space of the variable, but it can introduce a bias into our results.

    That the same variable with the same distribution can have its values computed or chosen at random in different mathematical contexts is no mystery. It's a question of methodology.
    Dawnstorm

    :chin:
  • leo
    882
    we must resort to probability theory and it seems to work pretty well; too well in my opinion in that the die when thrown without knowledge of the initial states behaves in a way that matches theoretical probability.TheMadFool

    I want you to focus on that, on this feeling that it seems to work too well. That feeling is telling you something, that there is something off in your understanding that you can’t quite pinpoint yet, so don’t stop now thinking that the confusion has disappeared. But we’re circling in on that confusion, and I think you aren’t far from finally seeing it.

    A. The usual way we throw the die - randomly - without knowing the initial state. The outcomes in this case would have a relative frequency that can be calculated in terms of the ratio between desired outcomes and total number of possible outcomes. It doesn't get more probabilistic than this does it?TheMadFool

    You absolutely have to understand this: the theoretical probabilities do not tell us about the relative frequencies that we will observe. They merely express the best knowledge that we have when we don’t know the initial state we’re in. And indeed as you have correctly noticed, it seems strange that the observed frequencies match the theoretical probabilities so well, why would they?

    And the “why” is what you need to understand now.

    When you throw the die arbitrarily many times, what you are essentially doing is picking an arbitrary combination of initial states. Now why would this combination contain each outcome with about the same frequency? That’s what you can’t explain yet. Why is it that the bigger the combination, the more similar the frequencies of each outcome are? Once you understand that you will finally see the confusion, and you will finally get it.

    If you pick an arbitrary combination of initial states, and most of the time that combination contains each outcome with about the same frequency, do you agree that either something magical is going on and guiding your hand when you pick the initial states, or it means that there are many more combinations where each outcome has about the same frequency, than combinations where the frequencies are different?

    And indeed this is something that we can prove: in the example of the die, there are many more combinations of initial states where each outcome has about the same frequency, than there are combinations where the frequencies are different.
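    To put rough numbers on that claim (the run length of 60 throws is an arbitrary choice for illustration): treating a run of throws as a sequence of outcomes, the number of sequences realizing a given vector of face counts is a multinomial coefficient, and balanced counts correspond to astronomically more sequences than skewed ones.

```python
from math import factorial

def sequences_with_counts(counts):
    """Number of outcome sequences whose face counts are exactly
    `counts` (a multinomial coefficient: n! / (c1! * c2! * ...))."""
    n = sum(counts)
    result = factorial(n)
    for c in counts:
        result //= factorial(c)
    return result

# 60 throws of a six-sided die:
balanced = sequences_with_counts([10] * 6)             # each face exactly 10 times
skewed = sequences_with_counts([60, 0, 0, 0, 0, 0])    # always the same face

print(balanced)  # about 10**42 sequences
print(skewed)    # exactly 1 sequence
```

    Summing over all near-balanced count vectors makes the imbalance even starker, which is why an arbitrarily picked run of initial states almost always yields frequencies close to 1/6.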

    If you want we can focus on proving that, if you finally understand that this is the only way that we can make sense of what we observe, without invoking magic or randomness, without saying that our ignorance of the initial states somehow makes the die behave differently.
  • leo
    882
    However, there's a major difference between A and B to wit the probabilities on a single throw of the die will be poles apart. In situation A, the probability of any outcome will be between 0 and 1 but never will it be 1 or 100% but in situation B every outcome will have a probability 1 or 100%TheMadFool

    In situation A the probability of one outcome is also 1 or 100% once the die is thrown; it is simply our incomplete knowledge that makes us say that any outcome is possible, but the outcome that is about to be realized is already determined.

    So there is no fundamental difference between A and B, the only superficial difference is that in situation A we don’t know what the outcome is going to be, the difference lies in our knowledge of the system and not in how the system behaves.

    When we have no knowledge of the initial states, the frequencies of the outcomes are often similar simply because we pick the initial states arbitrarily, and there are many more combinations of initial states where outcomes have a similar frequency, so we pick such combinations much more often. That’s all there is to it.

    The die behaves deterministically, there is no fundamental randomness, there is no magical force guiding our hand or the die in order to yield these frequencies. The probabilities we talk about are not a result of an underlying randomness but simply an expression of our incomplete knowledge. The observed frequencies often match these probabilities not because the system behaves probabilistically (randomly, non-deterministically), but because most combinations of initial states lead to these frequencies.

    And you can see that it is a coincidence that these observed frequencies match these probabilities, it isn’t always true. For instance if the initial states are picked deliberately so that the observed frequencies do not match these probabilities, then this is what happens. And in some cases the initial states are picked arbitrarily and still the observed frequencies are very different from these probabilities, even after many throws.
  • TheMadFool
    13.8k
    In situation A the probably of one outcome is also 1 or 100% once the die is thrown, it is simply our incomplete knowledge that makes us say that any outcome is possible, but the outcome that is about to be realized is already determined.leo

    The scenarios A and B in my previous post were meant to explain that deterministic systems can behave probabilistically, and I think they accomplished their purpose.

    Bear in mind though that I don't mean deterministic systems are non-deterministic. I just mean that sometimes, as when we have incomplete knowledge, we can use probability on deterministic systems.

    Considering we can use probability on non-deterministic systems too, it must follow that probability theory has within its scope both non-determinism and determinism, some part of which we're ignorant of.

    If you want we can focus on proving that, if you finally understand that this is the only way that we can make sense of what we observe, without invoking magic or randomness, without saying that our ignorance of the initial states somehow makes the die behave differently.leo

    Yes, I believe I wrote something to that effect in my reply to Harry Hindu, but that was because I thought he claimed ignorance had some kind of a causal connection to randomness. Later in my discussions with him/her and you, I realized that ignorance of deterministic systems is not a cause of, but rather an occasion for, probability. I hope we're clear on that.

    When we have no knowledge of the initial states, the frequencies of the outcomes are often similar simply because we pick the initial states arbitrarily, and there are many more combinations of initial states where outcomes have a similar frequency, so we pick such combinations much more often.leo

    This is an obvious fact and doesn't contradict anything I've said so far.
  • Harry Hindu
    5.1k
    Yes, I believe I wrote something to that effect in my reply to Harry Hindu, but that was because I thought he claimed ignorance had some kind of a causal connection to randomness. Later in my discussions with him/her and you, I realized that ignorance of deterministic systems is not a cause of, but rather an occasion for, probability. I hope we're clear on that.TheMadFool

    Just as long as what we are clear on is that probabilities only exist in the system of your mind, not in the system of dice being rolled. Determinism exists in both systems. The idea of probabilities is a determined outcome of ignorant minds. When you are ignorant of the facts, you can't help but engage in the idea of probabilities, just as when you aren't ignorant of the facts, you can't think in terms of probabilities. The system is determined from your perspective, which just means that you understand the causal relationships that preceded what it is that you are observing or talking about in this moment.

    Can you think of any point of your life where you were not ignorant of the facts and still thought of the system as possessing probability or indeterminism? Can you think of any point in your life where you were ignorant of the facts and you perceived the system as being deterministic? It seems to me that ignorance and probabilities aren't just a correlation, but a causal relationship.
  • TheMadFool
    13.8k
    Just as long as what we are clear on is that probabilities only exist in the system of your mind, not in the system of dice being rolled. Determinism exists in both systems. The idea of probabilities is a determined outcome of ignorant minds.Harry Hindu

    So, probability didn't exist before there was such a thing as mind, say 9 billion years ago when the earth hadn't even formed? Everything was deterministic before minds came into being, and now probability exists because there are minds and, to add, these minds can be ignorant?

    Do you mean to imply that if, by some freak of nature, all minds were wiped out, probability would disappear?

    Surely, you don't mean to say that do you?

    If so, what exactly do you mean by "probability only exists in the system of your mind"?

    I agree that the restricted domain herein, of die throwing, is ultimately deterministic, and that whereof we're ignorant we can only guess. Ignorance being a state of mind, there is a sense in which your statement is true. But if your statement means that non-determinism, i.e. true randomness, doesn't exist, and that every instance of probabilistic behavior is simply us being forced to engage in mathematical guessing (probability theory) due to ignorance, then I need more convincing, if you don't mind.
  • Dawnstorm
    242
    My latest post seems to have come out more technical than I meant it to. I went through a lot of drafts, discarded a lot, and ended up with this. But there's a point in there somewhere:

    A. The usual way we throw the die - randomly - without knowing the initial state. The outcomes in this case would have a relative frequency that can be calculated in terms of the ratio between desired outcomes and total number of possible outcomes. It doesn't get more probabilistic than this does it?

    B. If we have complete information about the die then we can deliberately select the initial states to produce outcomes that look exactly like A above with perfectly matching relative frequencies.
    TheMadFool

    The scenarios A and B in my previous post was to explain that deterministic systems can behave probabilistically and I think it accomplished its purpose.TheMadFool

    It's clear to me that you think scenarios A and B explain why deterministic systems "behave probabilistically", but as leo pointed out "behaving probabilistically" isn't well defined, and in any case the maths works the same in both A and B.

    You use terms like "the initial state" and "complete information about the die", but those terms aren't well defined. "The initial state" is the initial state of a probabilistic system, but that's pure math and not the real world. We use math to make statements about the real world. The philosophy here is: "How does mathematics relate to the real world?"

    The mathematical system of the probability of a fair die has a single variable: the outcome of a die throw. There is no initial state of the system, you just produce random results time and again. The real world always falls short of this perfect system. You understand this, which is why you're comparing ideal dice to real dice. "Initial states" aren't initial states of ideal dice, but of real dice. (I understand you correctly so far, no?)

    Now to describe a real die you need to expand the original system to include other variables. That is, you expand the original ideal system into a new ideal system, but one with more variables taken into account. This ideal system will have an "initial state", but it's - again - an ideal system, and if you look at the "initial state", you'll see that the variables that make up the initial state can be described, too. This is important, because you're arriving at the phrase "complete information about the die" and you go on to say that "we can deliberately select the initial states." But there are systematic theoretical assumptions included in this, in such a way that which initial states we pick is not part of the system we use to describe the die throw. (But, then, is the information really "complete"? What do you mean by "complete"?)

    So now to go back to my original post:

    A variable has an event space, and that event space has a distribution.Dawnstorm

    Take a look at a die. A die has six sides, and there are numbers printed on every side, and it's those numbers we're interested in. This is what makes the event space:

    1, 2, 3, 4, 5, 6

    The distribution is just an assumption we make. We assume that everyone of those outcomes is equally likely. This isn't an arbitrary assumption: it's a useful baseline to which we can compare any deviation. If a real die, for example, were most likely to throw a 5 due to some physical imbalance, then it's not a fair die. The distribution changes.

    In situations such as games of chance we want dice to behave as closely to a fair die as possible. We can arrange this even without knowing each die's distribution, for example by a simple rule: never throw the same die twice. The idea here is that we introduce a new random variable: which die to throw. Different dice are likely to have different biases, so individual biases won't have as much an effect on the outcome. In effect, we'd be using many different real dice to simulate an ideal one.

    And now we can make the assumption that biases cancel each other out, i.e. there are equally many dice biased towards 1 as towards 2, etc. This too is an ideal assumption with its own distribution, and maybe there's an even more complicated system which evens out the real/ideal difference for this one, too. For puny human brains this gets harder and harder every step up. But the more deterministic a system is, the easier it gets to create such descriptive systems. And with complete knowledge of the entire universe, you can calculate every probability very precisely: you don't need to rely on assumptions, and the distinction between ideal and real dice disappears.
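The "biases cancel out" idea can be sketched with a quick simulation. The two dice below are hypothetical, with deliberately complementary biases chosen so that their 50/50 mixture is exactly uniform in theory; real-world biases need not cancel this neatly:

```python
import random
from collections import Counter

random.seed(42)

FACES = [1, 2, 3, 4, 5, 6]

# Two hypothetical dice with complementary biases: die A moves all of
# face 6's weight onto face 1; die B does the reverse.
WEIGHTS_A = [2, 1, 1, 1, 1, 0]   # biased towards 1
WEIGHTS_B = [0, 1, 1, 1, 1, 2]   # biased towards 6

def throw_mixture(n):
    """For each throw, pick one of the two biased dice at random, then throw it."""
    outcomes = []
    for _ in range(n):
        weights = random.choice([WEIGHTS_A, WEIGHTS_B])
        outcomes.append(random.choices(FACES, weights=weights)[0])
    return outcomes

N = 60_000
counts = Counter(throw_mixture(N))
freqs = {face: counts[face] / N for face in FACES}

# A single biased die is off by 1/6 on faces 1 and 6; the mixture's
# observed frequencies should all sit close to 1/6.
max_dev = max(abs(f - 1 / 6) for f in freqs.values())
print(f"max deviation from 1/6: {max_dev:.4f}")
```

The complementary weights are the idealised case of the "biases cancel" assumption; with arbitrary real dice the mixture would only be closer to uniform, not exactly uniform.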

    Under perfect knowledge of a deterministic system, probability amounts to the frequentist description of a system of limited variables. An incomplete frequentist description of a deterministic system will always include probabilities, because of this. If, however, you follow the chain of causality for a single throw of a die, what you have isn't a frequentist description, and probability doesn't apply. They're just different perspectives: how the throw of a die relates to all the other events thus categorised, and how it came about. There's no contradiction.
  • leo
    882
    Bear in mind though that I don't mean deterministic systems are non-deterministic. I just mean that sometimes, as when we have incomplete knowledge, we can use probability on deterministic systems.

    Considering we can use probability on non-deterministic systems too, it must follow that probability theory has within its scope non-determinism and determinism,some part of which we're ignorant of.
    TheMadFool

    OK we can agree on that. With the caveat that it would be wrong to expect that throwing the die many many times will always yield each outcome with the same frequency as the others, it would be wrong to expect that the observed frequencies will always match the theoretical probabilities we’ve come up with, it would be wrong to expect that if you throw the die a gazillion times you will always get 1/6 frequency for each outcome.

    But still...

    This is an obvious fact and doesn't contradict anything I've said so far.TheMadFool

    It isn’t an obvious fact, it’s not easy to prove. And I would venture to say that if it was so obvious you wouldn’t have made this thread in the first place, and you still wouldn’t be saying that observed frequencies fit the probabilities too well.

    Say you pick a number in the set {1, 2, 3, 4, 5, 6} and you do that 100 times. You get a combination of 100 numbers, each number being either 1 2 3 4 5 or 6. You can compute how many times each number appears in the combination, compute what its frequency is.

    If it was obvious that there are many more such combinations in which each number’s frequency is about the same than there are combinations in which the frequencies are very different, then it would be obvious that when we throw the die arbitrarily 100 times we often get each number with about the same frequency, and we wouldn’t think that it’s weird even though we’re dealing with a deterministic system. If the latter isn’t obvious then the former isn’t either...

    Also, it is misleading to say that the system behaves probabilistically, it creates confusion, what’s more accurate is to say that the outcomes of a deterministic system can be distributed evenly when the system is run many times. Fundamentally there is nothing mysterious about that, what appears to be mysterious is why oftentimes the outcomes of a die throw are distributed evenly, but this is explained as above.
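leo's combinatorial point can be checked empirically: generate many arbitrary 100-number combinations and count how often every number's frequency lands near 1/6. A minimal sketch (the 0.1 tolerance and 2,000 repetitions are arbitrary choices, and the arbitrary picking is modelled by a uniform random generator):

```python
import random
from collections import Counter

random.seed(0)

def near_even(seq, tolerance=0.1):
    """True if every face's frequency in seq is within `tolerance` of 1/6."""
    counts = Counter(seq)
    n = len(seq)
    return all(abs(counts[face] / n - 1 / 6) <= tolerance for face in range(1, 7))

TRIALS = 2000
THROWS = 100

# Pick 100 numbers from {1..6} arbitrarily, many times over, and see how
# often the resulting frequencies are roughly even.
hits = sum(near_even([random.randint(1, 6) for _ in range(THROWS)])
           for _ in range(TRIALS))
print(f"{hits / TRIALS:.0%} of sequences have every frequency within 0.1 of 1/6")
```

The large majority of sequences come out roughly even, which is the combinatorial claim: there are many more combinations with near-equal frequencies than with very unequal ones.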
  • TheMadFool
    13.8k
    it would be wrong to expect that the observed frequencies will always match the theoretical probabilities we’ve come up with, it would be wrong to expect that if you throw the die a gazillion times you will always get 1/6 frequency for each outcome.leo

    What about the law of large numbers, which says exactly the opposite of what you're saying? The law of large numbers states that the average of the values of a variable will approach the expected value of that variable as the number of experiments becomes larger and larger.

    If the random variable x is the number of odd outcomes in an experiment of T die throws, then we could conduct n such experiments and get the following values: x1, x2, ..., xn
    The average value for x = (x1+x2+...+xn)/n

    The expected value for x, E(x) = P(x) * T = (3/6) * T, where P(x) is the theoretical probability of event x.

    The law of large numbers states that (x1+x2+...+xn)/n will approach E(x) = P(x) * T, and this is only possible if the observed frequencies themselves are in the vicinity of the theoretical probability.

    Note: my math may be a little off the mark. Kindly correct any errors

    Your claim that we shouldn't expect observed frequencies to match theoretical probabilities is applicable only to small numbers of experiments.

    It isn’t an obvious fact, it’s not easy to prove.leo

    What could be more obvious than saying that if there are more ways of x happening than of y, then x will happen more frequently, provided all outcomes are equally likely?

    Your comments are basically about practical limitations and these can be safely ignored because, as actual experimentation shows, even a standard-issue die/coin behaves probabilistically.

    There's no contradiction.Dawnstorm
    That is correct.
  • leo
    882
    The expected value for x, E(x) = P(x) * T = (3/6) * T, where P(x) is the theoretical probability of event x.

    The law of large numbers states that (x1+x2+...+xn)/n will approach E(x) = P(x) * T

    Note: my math may be a little off the mark. Kindly correct any errors
    TheMadFool

    You have a flawed understanding of expected value, it is 1*P(1)+2*P(2)+3*P(3)+4*P(4)+5*P(5)+6*P(6) = (1+2+3+4+5+6)*1/6 = 3.5

    That ‘law’ states that the average of outcomes will converge towards 3.5, not towards 1/6 times the number of trials (that wouldn’t make sense).
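leo's calculation, checked in a few lines (the empirical part assumes fair, independent throws, modelled here with a uniform random generator):

```python
import random

random.seed(2)

# Expected value of a fair die: sum of face * probability of that face.
expected = sum(face * (1 / 6) for face in range(1, 7))   # = 3.5

# Empirical check: the average of many throws settles near the expected value.
N = 100_000
mean = sum(random.randint(1, 6) for _ in range(N)) / N

print(f"E = {expected}, empirical mean after {N} throws = {mean:.3f}")
```

Note that it is the running average of outcomes that approaches 3.5, not each outcome's count that approaches some fixed number of trials.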

    What about the law of large numbers which says exactly the opposite of what you're saying? The law of large numbers states that the average of the values of a variable will approach the expected value of that variable as the number of experiments become larger and larger.TheMadFool

    People have come up with plenty of ‘laws’, are they always correct? Just because something is called a ‘law’ that means it’s always true? A ‘law’ has a domain of validity, the law of large numbers doesn’t always apply. Try to understand how and why people have come to formulate this law, rather than assuming it’s always true and applies everywhere.

    There are two main limitations to this ‘law’ here:

    1. Even if you pick the initial states randomly, it is possible that the average of outcomes will not converge towards the expected value no matter how many times you throw the die (it’s possible that you always get some outcome, or never get some outcome, or get totally different frequencies, it’s rare but possible).

    2. You don’t know in the first place that you are picking initial states randomly. For instance if you unwittingly always pick initial states that never lead to outcome ‘6’, then it’s wrong to say that outcome ‘6’ has probability 1/6 of occurring, it has probability 0 of occurring, and then the average of outcomes won’t converge towards 3.5.
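Point 2 can be illustrated with a sketch: model a throwing style whose initial states never lead to a 6 as a uniform draw over the remaining faces (that uniformity is an assumption made purely for illustration), and watch where the average lands:

```python
import random

random.seed(3)

N = 100_000
# Hypothetical throwing style whose initial states never produce a 6;
# modelled (an assumption) as uniform over the remaining five faces.
mean = sum(random.randint(1, 5) for _ in range(N)) / N

expected_if_fair = 3.5
expected_here = sum(face / 5 for face in range(1, 6))   # = 3.0

print(f"mean = {mean:.3f} -> converges to {expected_here}, not {expected_if_fair}")
```

The law of large numbers still holds, but the average converges to the expected value of the distribution actually being sampled, not to the fair-die value of 3.5.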

    What could be more obvious than saying if there are more ways of x happening than y then x will happen more frequently if the probabilities of all outcomes are equally likely?TheMadFool

    Firstly, what’s not obvious is that there are more ways of x happening than y.

    Secondly, the probabilities of all outcomes are the same only theoretically, in practice the effective probabilities depend on how you throw the die.
  • TheMadFool
    13.8k
    You have a flawed understanding of expected value, it is 1*P(1)+2*P(2)+3*P(3)+4*P(4)+5*P(5)+6*P(6) = (1+2+3+4+5+6)*1/6 = 3.5leo

    I was expecting that but the math works out in my explanation. The law of large numbers does say that the experimental probability will approach the theoretical probability.

    People have come up with plenty of ‘laws’, are they always correct?leo
    Are you in any way challenging the law of large numbers?

    A special form of the LLN (for a binary random variable) was first proved by Jacob Bernoulli.[7] It took him over 20 years to develop a sufficiently rigorous mathematical proof which was published in his Ars Conjectandi (The Art of Conjecturing) in 1713. — Wikipedia

    Talk to Jacob Bernoulli :grin:
  • TheMadFool
    13.8k
    Firstly, what’s not obvious is that there are more ways of x happening than y.leo

    Then why did you claim it?
  • TheMadFool
    13.8k
    You have a flawed understanding of expected value, it is 1*P(1)+2*P(2)+3*P(3)+4*P(4)+5*P(5)+6*P(6) = (1+2+3+4+5+6)*1/6 = 3.5

    That ‘law’ states that the average of outcomes will converge towards 3.5, not towards 1/6 times the number of trials (that wouldn’t make sense).
    leo

    Jokes aside, the 3.5 value is obtained because the probability of each outcome is 1/6.
  • Dawnstorm
    242
    Your comments are basically about practical limitations and these can be safely ignored because, as actual experimentation shows, even a standard-issue die/coin behaves probabilistically.TheMadFool

    On the one hand, you say that practical limitations can be safely ignored, and on the other hand you wish to appeal to actual experimentation. You have to choose one. Practical limitations may not be important to the law of large numbers when it comes to an ideal die, but they're certainly vitally important to actual experimentation. That's a theoretical issue, by the way: the universe we live in is only a very small sample compared to the infinite number of throws, and what any sample we throw in the real world converges to is the actual distribution of the variable, and not the ideal distribution (though the sets can and often will overlap).

    More importantly, though, since you're talking about determinism, you're actually interested in practical limitations and how they relate to probability. It's me who says practical limitations are unimportant to the law of large numbers, because it's an entirely mathematical concept (and thus entirely logical). Not even a universe in which nothing but sixes are thrown would have anything of interest to say about the law of large numbers.

    I'd say the core problem is that without a clearly defined number of elements in a set (N), you have no sense of scale. How do you answer the question whether all the die throws in the universe constitute a "large number" when you're talking about a totality of infinite tries? If you plot out tries (real or imagined, doesn't matter) you'll see that the curve doesn't linearly approach the expected value but goes up and down and stabilises around the value. If all the tries in the universe come up 6, this is certainly unlikely (1/6^N; N = number of dice thrown in the universe), but in the context of an ideal die thrown an infinite number of times, this would just be a tiny local divergence.
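The up-and-down approach to the expected value is easy to see in a short simulation of running means (fair die assumed; the seed and sample size are arbitrary):

```python
import random

random.seed(4)

N = 10_000
total = 0
running_means = []
for i in range(1, N + 1):
    total += random.randint(1, 6)
    running_means.append(total / i)

# The running mean wanders above and below 3.5 before settling near it;
# count how often the curve changes direction.
direction_changes = sum(
    (running_means[i] - running_means[i - 1])
    * (running_means[i + 1] - running_means[i]) < 0
    for i in range(1, N - 1)
)
print(f"final running mean = {running_means[-1]:.3f}, "
      f"direction changes = {direction_changes}")
```

The curve is anything but monotone, yet it ends near 3.5; convergence here is a statement about where the wandering stabilises, not about a steady linear approach.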

    That ‘law’ states that the average of outcomes will converge towards 3.5, not towards 1/6 times the number of trials (that wouldn’t make sense).leo

    The two of you work with different x's. Your x is the outcome of a die throw {1,2,3,4,5,6}. His x is the number of odd die-throws in a sample of size T. He's using the probability of throwing an odd number to get the expected value. Explaining the particulars here is beyond me, as I'm out of the loop for over a decade, but he's basically using an indicator function for x (where the value = 1 for {1,3,5} and 0 for {2,4,6}).

    As far as I can tell, what he's doing here is fine.
  • TheMadFool
    13.8k
    On the one hand, you say that practical limitations can be safely ignored, and on the other hand you wish to appeal to actual experimentation. You have to choose one. Practical limitations may not be important to the law of large numbers when it comes to an ideal die, but they're certainly vitally important to actual experimentation. That's a theoretical issue, by the way: the universe we live in is only a very small sample compared to the infinite number of throws, and what any sample we throw in the real world converges to is the actual distribution of the variable, and not the ideal distribution (though the sets can and often will overlap).

    More importantly, though, since you're talking about determinism, you're actually interested in practical limitations and how they relate to probability. It's me who says practical limitations are unimportant to the law of large numbers, because it's an entirely mathematical concept (and thus entirely logical). Not even a universe in which nothing but sixes are thrown would have anything of interest to say about the law of large numbers.

    I'd say the core problem is that without a clearly defined number of elements in a set (N), you have no sense of scale. How do you answer the question whether all the die throws in the universe constitute a "large number" when you're talking about a totality of infinite tries? If you plot out tries (real or imagined, doesn't matter) you'll see that the curve doesn't linearly approach the expected value but goes up and down and stabilises around the value. If all the tries in the universe come up 6, this is certainly unlikely (1/6^N; N = number of dice thrown in the universe), but in the context of an ideal die thrown an infinite number of times, this would just be a tiny local divergence.
    Dawnstorm

    I think that if we ignore practical limitations then we can rely on experiments. If it weren't that way, we could never be sure of experimental results because they would never be perfect enough.

    You mentioned things like changing the die with every throw and other variations to die throwing that, to me, were an attempt to make the process ideal.

    However, as you already know, experimentation with dice bought from any stall under normal non-ideal conditions yields results that agree with the law of large numbers. So I saw no need for us to go in that direction because it was unnecessary.

    I did some very basic research and you're right in that no finite large number can compare to infinity but it seems we really don't need infinity to see the trend of the sample mean/average value of outcomes approaching the expected value.

    Thanks.
  • sandman
    41
    This seems similar to the Schrodinger's cat example.
    The uncertainty lies in the radioactive sample, not the cat.
    The uncertainty lies in the dynamics of the toss, not the die.
