How would you interpret that utterance? We will use Bayes' Theorem to figure it out. There are two possibilities we will consider here: "planetary" and "plant a tree." The listener's interpretation will be whichever possibility the listener deems to have the higher posterior probability. That is:
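The rule can be written out as follows. This is a reconstruction from the surrounding text, and it writes the recording as "audio" to match the likelihood expressions used later in this section (the `cases` environment assumes the amsmath package).

```latex
\[
\text{interpretation} =
\begin{cases}
\text{``planetary''} & \text{if } p(\text{``planetary''} \mid \text{audio}) > p(\text{``plant a tree''} \mid \text{audio}) \\
\text{``plant a tree''} & \text{if } p(\text{``plant a tree''} \mid \text{audio}) > p(\text{``planetary''} \mid \text{audio})
\end{cases}
\]
```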

Therefore, to determine which interpretation arises, we need to figure out which of these posterior probabilities is greater. Using Bayes' Theorem, we rewrite the posterior probabilities as follows:
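Applying Bayes' Theorem to each hypothesis gives the standard form below. The notation is a reconstruction that assumes the same "audio" label used in the likelihood expressions later in this section.

```latex
\[
p(\text{``planetary''} \mid \text{audio}) =
\frac{p(\text{audio} \mid \text{``planetary''})\; p(\text{``planetary''})}{p(\text{audio})}
\]
\[
p(\text{``plant a tree''} \mid \text{audio}) =
\frac{p(\text{audio} \mid \text{``plant a tree''})\; p(\text{``plant a tree''})}{p(\text{audio})}
\]
```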

Recall from the tutorial about the dress that we can ignore the denominators on the right-hand sides: they are the same for both equations, and all we care about is which posterior probability is larger, not what the actual values are. Therefore, we can rewrite the posterior probabilities as follows (using the ∝ symbol to mean "is proportional to"):
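With the shared denominator p(audio) dropped, each posterior is proportional to its likelihood times its prior. The notation is again a reconstruction from the surrounding text.

```latex
\[
p(\text{``planetary''} \mid \text{audio}) \propto
p(\text{audio} \mid \text{``planetary''})\; p(\text{``planetary''})
\]
\[
p(\text{``plant a tree''} \mid \text{audio}) \propto
p(\text{audio} \mid \text{``plant a tree''})\; p(\text{``plant a tree''})
\]
```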

This simplification is useful because it means there is one fewer quantity to calculate. Two terms remain on each right-hand side: the likelihood and the prior.
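As a rough sketch of why dropping the denominator is safe, the Python snippet below compares likelihood × prior for the two hypotheses and then checks that normalizing does not change the winner. The function name and all numbers are placeholders made up for illustration, not values from this tutorial.

```python
def unnormalized_posteriors(likelihoods, priors):
    """Return likelihood * prior for each hypothesis (no denominator)."""
    return {h: likelihoods[h] * priors[h] for h in likelihoods}


# Placeholder numbers, chosen only to illustrate the arithmetic.
likelihoods = {"planetary": 0.2, "plant a tree": 0.6}
priors = {"planetary": 0.5, "plant a tree": 0.5}

scores = unnormalized_posteriors(likelihoods, priors)
best = max(scores, key=scores.get)

# Dividing every score by the same denominator rescales the values
# but cannot change which hypothesis has the larger one.
evidence = sum(scores.values())
posteriors = {h: s / evidence for h, s in scores.items()}
best_normalized = max(posteriors, key=posteriors.get)

print(best, scores, posteriors)
```

Because both scores are divided by the same positive number, the comparison between them comes out the same either way.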

First, we will determine the likelihoods of these two possibilities, that is, p(audio | "planetary") and p(audio | "plant a tree"). How should you interpret these expressions? p(audio | "planetary") asks: supposing that the speaker is saying "planetary," how likely is it that their utterance would sound the way it does in the audio clip? Similarly, p(audio | "plant a tree") asks: supposing that the speaker is saying "plant a tree," how likely is it that their utterance would sound the way it does in the audio clip?

A major factor in determining these likelihoods is your past experience. For example, British and American speakers pronounce "planetary" differently. Brits generally reduce the final "a" so that it is very short or even unpronounced.

Therefore, since our mystery recording does not have a clear vowel between the "t" and the "r", a Brit is relatively likely to hear it as "planetary," while an American would probably hear it unambiguously as "plant a tree."

Because of these differences, we will compute likelihoods under two sets of conditions: for an American speaker and for a British speaker.

One way we could compute the likelihood would be to treat the presence or absence of the "a" as a binary variable; that is, this vowel either will appear (with some probability p) or will not appear (with probability 1-p). However, a more realistic framing of the problem is that the vowel has some duration and that the longer the vowel is, the more likely the word is to be interpreted as "planetary." Therefore, we will instead base the likelihoods on the duration of this vowel. The vowel itself is difficult to measure properly, so as a proxy for its length we will use the duration of the portion of the utterance from the "t" onward.
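To make the two framings concrete, here is a minimal Python sketch. It rests on assumptions that go beyond the text: the duration-based likelihood is modeled as a normal (Gaussian) density, and every parameter value (means, standard deviations, the probability p) is a hypothetical placeholder rather than a measurement from the recordings.

```python
import math


def bernoulli_likelihood(vowel_present, p_vowel):
    """Binary framing: the vowel appears with probability p_vowel."""
    return p_vowel if vowel_present else 1.0 - p_vowel


def normal_pdf(x, mean, sd):
    """Duration framing (assumed model): normal density evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical parameters, in seconds, for the "t"-onward duration.
# They are placeholders and are NOT estimated from real data.
PARAMS = {
    "planetary": {"mean": 0.40, "sd": 0.05},
    "plant a tree": {"mean": 0.30, "sd": 0.05},
}

observed_duration = 0.30  # hypothetical measurement of the mystery clip

duration_likelihoods = {
    phrase: normal_pdf(observed_duration, p["mean"], p["sd"])
    for phrase, p in PARAMS.items()
}
print(duration_likelihoods)
```

The binary version only registers whether the vowel is present, while the duration version gives a graded likelihood that changes smoothly with the measured length.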

Below we have collected utterances from American speakers and British speakers saying "planetary" and "plant a tree." For each one, measure the duration of the part of the word from "t" to the end of the word. We will then use these measurements to get a sense of the distribution of durations.

How to do this: Each of these images is a spectrogram, which is a representation of the acoustic properties of a sound. If you click a point in an image and then drag your mouse to the right, it will highlight the portion of the spectrogram that you have dragged over. For each image, highlight the portion that runs from the "t" sound to the end of the word. To help, you can use the "Play entire clip" button to hear the whole clip, and once you've highlighted a section, you can click "Play highlighted portion" to make sure that what you've highlighted sounds like either "tree" or "tary." Note: Don't feel like you have to spend too much time measuring the spectrograms carefully. The important thing is that you understand how Bayes' Theorem is being used here, so it's fine if the measurements are not perfect.

American speaker 1

"Plant a tree"