## The Little Vulgar Book of Mechanics (v0.17.0) - Probability I

Last updated: June 11th 2022

Just updated this section of the book: Probability I

## Probability I #

"We have to come back to something like ordinary language after all when we want to talk "about" mathematics!" – Sir Harold Jeffreys (1891–1989)

Probability can be confusing because everyone – including myself, but also your school teacher, your textbook, and even your favorite "probability rockstar" in pop-sci circles – talks about probabilities using phrasing that gives many people the wrong idea, even in "technical" contexts.

I have three goals for this introduction:

- Tell you *why* Probability exists.
- Give you a couple of examples of why loose language can confuse people.
- Clarify what one actually means when one talks about probabilities.

There will be no function graphs, no numbers, no sets, or combinatorics talk in this section. All of those things will enter the picture in the following sections. Though I will introduce some basic symbols (without the numbers) at the very end.

To understand why Probability exists, let's first see a case where we *don't* use it. Consider the following proposition:

"John Petrucci owns a 7-string electric guitar."

For most practical purposes, that proposition is going to be either `False` or `True`, and that'll be the end of it. In a computer program, a "boolean" data type would suffice to describe the state of knowledge about it.

But sometimes we need more than `True` or `False`. Consider this question:

- Was the UFO reported by the geologist at the Antarctica research station an alien spaceship?

When confronted with such questions, we usually don't have enough information to be able to say either `True` or `False`, so we want something more granular, which we can use while in the process of discovering the truth – i.e. the process of arriving at either `True` or `False` (at least temporarily). Something to use as we accumulate/improve data from observations, experience, etc.

And that is why Probability exists: To represent, with mathematical and logical rigour, the different stages of what we might call "truth discovery," which in common parlance we express with phrases such as:

- "I'd be *very* surprised..."
- "I don't know..."
- "I bet $50 bucks..."

...and so on.

Since `True` and `False` are not useful enough as values for such purposes, we use numbers between `0` and `1` (never actually `0` or `1`, cos they're just the equivalents of good ol' `False` and `True` respectively), along with certain operations, to engage in "probabilistic" reasoning and deductions. **In this sense, Probability is an extension of Logic.**

We also follow strict rules for calculation, so in the study of Probability there's also a calculus to be learned – in fact, a century ago French treatises would talk about "Calculus of Probabilities" instead of "Probability Theory" as we call it nowadays – which we will study later on.

(The big c, *Calculus*, will also enter the picture later, but in the previous paragraph I'm talking about Probability Theory having *a* "calculus" too. A calculus is any set of rules for calculation in some context. E.g. Propositional calculus is the system that specifies how to make inferences in Logic.)

Of course, it's not about pulling arbitrary numbers out of your ass to express your feelings about some guesses you have. I.e. you don't just pick `0.416` out of the blue to express how likely you think it is that the UFO from our Antarctica researcher was an alien spaceship. You must explain why `0.416` and not, say, `0.415`. So we will learn that probabilities are derived from data (or, at the very least, from some *extremely* common, common-sense, near-true/near-false, "for all practical purposes"-type assumptions).

(Spoiler alert: Having methods and algorithms to *count* things is going to help a lot. Also: There Will Be Fractions. And numerators and denominators in said fractions will be based on counted things. And from such fractions, and operations on them, you'll arrive at specific values such as `0.416`.)
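As a tiny preview of that spoiler (the proper treatment comes in later sections), here is a minimal sketch of how a fraction built from counted things can land on a value like `0.416`. The function name `probability_from_counts` and the counts `52` and `125` are made up purely for illustration; they are not data about UFOs or anything else.

```python
from fractions import Fraction


def probability_from_counts(favourable: int, total: int) -> Fraction:
    """Return the exact ratio of two counts as a fraction.

    `favourable` and `total` are hypothetical names: the number of counted
    cases that fit the proposition, and the number of cases counted overall.
    """
    if total <= 0:
        raise ValueError("total must be a positive count")
    if not 0 <= favourable <= total:
        raise ValueError("favourable must be between 0 and total")
    return Fraction(favourable, total)


# Made-up counts, chosen only so the arithmetic is easy to follow.
p = probability_from_counts(52, 125)

print(p)         # the exact fraction
print(float(p))  # the same value as a decimal
```

The point is only that the decimal is not picked out of the blue: change either count and the number changes with it.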

OK that's enough of *why* we use Probability. Now let's talk about the shit we all say irresponsibly. The loose way of talking about probabilities which can and does confuse others, and even ourselves.

Consider this:

"The object encountered by the geologist has a probability of 1 in 55000 of being an alien spaceship."

Do you see anything peculiar about that proposition?

I know what you're thinking: It is a "probabilistic" proposition. Yes, it's *meant* to be. But here's what's "peculiar" about it: It doesn't make sense!

Can you see why? Let's see if it gets better if I rephrase it this way:

- "The geologist has a chance of 1 in 55000 of having seen an alien spaceship."

What do you think? Does it make sense now? Let's put the two next to each other:

- "The object encountered by the geologist **has** a probability of 1 in 55000 of being an alien spaceship."
- "The geologist **has** a chance of 1 in 55000 of having seen an alien spaceship."

Which one do you think is more accurate?

The right answer is: Neither! Both propositions are examples of the wrong speak I'm talking about. Both make the mistake of talking about some mythical substance. Who "has" the probability? The geologist? The UFO? The answer is neither. Because here's the thing: **There is no such thing as a probability.**

By which I mean: Nobody "has" a probability. No object does. No event does. Both examples above are trying to express the same idea, but they're being loose with language in the same way: They seem to say that a probability is somehow a *property*, or *attribute*, of a person, or some UFO, or event.

Another example:

"There is a 30% probability that Zander Noriega's next song is good."

Well, my next song doesn't even exist, so that "30% probability" certainly can't be a property of *it*. **An entity that doesn't exist can't have any property.**

Which leads us to the second lesson in this introduction: **Probability is not a property of anything. A probability is a measure of an observer's uncertainty.**

That's good news, though. I mean, what would you prefer?

- **Probability as a mystical substance somehow reified**, floating around, being embedded in objects, people, events, as an imaginary property.
- **Probability as a rigorously computed number**, reflecting someone's degree of uncertainty with respect to some proposition about the world.

Call me crazy, but I'm glad we got the latter. No mysticism here (though plenty of *belief* all over the place, as I'll explain next). But you can see why loose language can confuse people into thinking Probability is a mystical substance. Here's one last example, coming from a guy who constantly reminds everyone that he is a master probabilist:

"...the risk of being killed as a pedestrian is one per 47,000 years."

See his use of "the" risk. Implying there's "the" probability of being killed as a pedestrian. But now you know that that's not a thing: No things have "the" probability of this and that. No cars, no pedestrians, no drivers, or roads. "The" probability is not a property of anything in spacetime.

What he *means*, or, what you should derive from his babble, is that, according to some data that *he* (presumably) has reviewed, plus his experience and so on, the quantification of his degree of belief in anyone getting killed by a car as a pedestrian is "1 in 47,000 years." And I'm sure somewhere in the calculation there's some total of street crossings per capita, per year, etc. And some "Exponentials," of course.

But what if this month *you* are living in a particularly busy and chaotic urban area, and have to cross particularly busy intersections on your way to work, and it's a month with particularly extreme weather, with slippery pavement and poor visibility? Is "the" probability of getting wrecked still "1 in 47,000"? Is the number provided by Mr. Probability Guru of any use to you?

Probably not. (See what I did there?) I mean, you *could* take his number, act on it *as if* it was "data," and/or go around regurgitating it. But there's no such thing as "the" probability of being killed as a pedestrian, or "the" probability of dying in a plane crash, or "the" probability of a UFO encounter.

There's only *a* calculation *you* can do, based on *some* data, leading to *a* number that will reflect *your* level of uncertainty. Other people might then just regurgitate *your* number, *as if* it was "data," but that's a different thing from your number being "the" probability, as some kind of property of the fabric of the Universe or whatever. Never ever forget: **A probability is a number expressing someone's degree of belief in something.**
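To make that concrete, here is a deliberately crude sketch in which a plain ratio of counts stands in for the real calculation that later sections develop. Everything in it is an assumption for illustration: the function name `number_from_data`, the idea of counting "person-years," and both sets of counts are made up, not real traffic data.

```python
from fractions import Fraction


def number_from_data(events: int, exposures: int) -> Fraction:
    """Crude stand-in for 'a calculation based on some data'.

    Returns the fraction of counted exposures in which the event happened.
    Both parameter names are hypothetical.
    """
    if exposures <= 0:
        raise ValueError("need at least one counted exposure")
    return Fraction(events, exposures)


# Made-up counts: a broad, averaged data set (the guru's) versus the
# data for one icy month of busy intersections (yours).
guru_data = {"events": 1, "exposures": 47_000}
your_data = {"events": 3, "exposures": 9_000}

p_guru = number_from_data(**guru_data)
p_you = number_from_data(**your_data)

print(p_guru)          # the guru's number, from the guru's data
print(p_you)           # your number, from your data
print(p_you > p_guru)  # same proposition, different data, different number
```

Neither result is "the" probability. Each one is just the number that follows from the data that particular observer used, which is exactly what a degree of belief is.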

Speaking of, let me finish this introduction with a word on "belief." A probability is either:

- Our degree of belief in a hypothesis `H` (from "hypothesis"), given some data `D`.
- Our degree of belief in data `D`, given our belief in a hypothesis `H`.

Respectively expressed symbolically:

\[P(H \mid D)\]

\[P(D \mid H)\]

It's *all* about *belief*. Yes, *belief*. Not "facts." Facts are `True` and `Right` stuff. Which is handy, but, whether you like it or not, almost all of your actions are based on *belief*. Aside from mathematical theorems, "facts" are a minuscule part of your life. You don't really know much of anything for a fact.

You have very little idea of what's gonna happen in your day after you wake up (if you wake up). All kinds of substances, physical and mental illnesses mess with your perceptions and memories. Software and hardware bugs confuse your monitoring devices. Not to mention personal biases, fears, life goals, peer pressure, group think, and emotions in general. And that's not even taking into account that you might also just be dumb as a fence post.

Nonetheless, you still build things. And so do I. That's literally all I do, all day, every day: Create things. Engineer things. From music to software to meals. So we still need to at least compare the relative (un)certainties with regard to various possible beliefs. **Probabilities encode our ever-changing incompleteness of information, and Probability Theory provides the logic and calculus for operating with them.**

Here's a last quote by one of the biggest minds in history:

"The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind." – James Clerk Maxwell (1850)

Let this be your welcome to Probability I.

In the following sections, we'll be looking at examples drawn from the various areas covered in the rest of the book.
