ESL Pronunciation Explained: The 10 Elements of Proper English Pronunciation

Mentioning pronunciation in the ESL classroom opens up a whole can of worms.

The trick with being an ESL teacher is that all the concepts you’re teaching are intuitive to you.

You know how to do it all. You’ve been speaking English for a long time, perhaps since birth or early childhood, and you’re really, really good at it.

But can you explain the language properly to someone else?

It’s one thing to explain sentence structure, grammar patterns and vocabulary usage. Even the most complicated details still follow rules and patterns.

When it comes time to teach English pronunciation, you’ve got to explain how sounds sound, why different letter combinations make different sounds, how to produce those sounds with your mouth, tongue, teeth and throat—and, to boot, English pronunciation is notoriously irregular and funky.

Not to mention, students can have trouble hearing their own accents, and may not even be able to recognize a pronunciation problem.

So, what to do?

If you’re after a super fun way to model native English pronunciation, then look no further than FluentU.

FluentU takes authentic videos—like music videos, movie trailers, news and inspiring talks—and turns them into personalized language lessons.

Hearing native speakers is one of the best ways for your students to overcome their pronunciation problems. Be sure to request a free trial of the FluentU program and explore all the other ways FluentU can help in the classroom.

1. Vowels

English vowels are often mistaken as simple by ESL students. Just “A, E, I, O, U.”

Vowels can actually be the most complicated aspect of English pronunciation to learn and to teach.

The English language has 44 sounds, 20 of which are vowel sounds. 

Even the simple vowels are anything but simple. Simple vowels are often referred to as “short” vowels. The letter “a” all by itself can be pronounced several ways. Try saying it in these English words:

  • apple
  • all
  • father

(And, of course, the way English speakers say these words varies depending on their background and their particular accent.)

The letter “a” in these three words represents three distinct simple vowel sounds. Simply teaching your students that a particular letter represents a particular sound generally does not work with English, though it may with your students’ first language.

When two vowel sounds glide together within a single syllable, they are no longer simple vowels. Instead, they combine to form a diphthong. (See the explanation of diphthongs below.) In various letter combinations, “a” becomes part of a diphthong and takes on a new sound. For example:

  • may
  • mate
  • wait

Or it sounds as a different diphthong such as in

  • hair
  • wear

and another one again in words such as

  • ear

They all sound different. So how can you explain to your students how to pronounce the different vowel sounds?

Use the chart!

Even if you haven’t learned phonetics, and you are not very comfortable with the symbols, the Phonemic Chart can help you because it has “hints” or words to tell you what each sound is.

You can point your students to the chart, too, and they don’t have to learn the symbols (although they often find that it’s fun to do so!).

Just looking at the simple vowels on the chart, their positions on the chart give clues about how they’re pronounced. So if we just use the “hints”:

ship      sheep      book          shoot

left        her          teacher     door

Basically, the relevance of the rows and columns is:

  • In a very generalized sense, moving from left to right on the chart the vowels are formed from the front to the back of the mouth. Forming a sound at the back of the mouth just means the raised part of the tongue is further back.
  • In general, from left to right the lips will need to be increasingly more rounded.
  • Moving from the top row to the bottom row of the chart, the mouth becomes more relaxed and open.

With this information, now try saying each of the words in sequence and feel for yourself how your mouth moves. This may help you to explain to your students how to correct a particular mispronunciation.

Let’s look at some main vowel problem areas:

  • ship and sheep

Many languages do not make this vowel distinction, and often have a simple vowel that is pronounced somewhere between these two, usually a bit closer to “sheep” than to “ship.” There are a number of “minimal pairs”—two words with different meanings that are identical other than this vowel sound difference—using these two sounds, so you need to teach your students to make the distinction both in speaking and listening.

Ask them to move the sound forward in their mouths, closer to the tips of their tongues, to make the “ship” sound. Then ask them to lengthen the “sheep” vowel sound to accentuate the difference. Practice this until they get more used to it.

  • book and shoot

Again, many languages do not have the distinction between these two vowel sounds, and most commonly have a simple vowel that is partway between the two, slightly closer to the “shoot” sound.

However, you may be aware that in some dialects of English (especially northern England), the distinction between these two sounds is somewhat blurred. Words such as “room,” for example, can be said with either sound. Other words, such as “moon,” are only ever pronounced the one way.

There are not a lot of minimal pairs using these two sounds. A few are “pull” and “pool,” “full” and “fool” and “look” and “Luke.” In order to develop good, clear pronunciation, you should encourage your students to feel the difference. Again, lengthening the “shoot” sound when practicing can help.

  • her and teacher

The sound at the end of “teacher” is often referred to as “schwa.” It is the most common vowel sound in the English language, partly because it can be spelled with almost any vowel letter or combination of vowel letters. For example:

  • teacher
  • collar
  • doctor
  • measure
  • zebra
  • garden
  • fossil
  • lion
  • circus

The schwa is always in an unstressed syllable, and many (but not all) vowels in unstressed syllables become the schwa sound. This is one of the things that makes English spelling difficult, because in normal speech you can’t hear the specific vowels in the unstressed syllables. Whatever the spelling, the schwa is usually pronounced as a short, relaxed “uh” sound.

This is particularly difficult for students whose first language has a different stress system or an almost phonetic spelling system.

The sound in “her” is very much like the schwa, but it is longer and can be stressed. Students sometimes resort to it when they are struggling with the short, weak schwa. While these two sounds do not seem to form significant minimal pairs, it is important for students to be able to pronounce the schwa correctly because it is so common in English.

  • “left” and “hat”

The sound in “hat” does not occur in other languages as frequently as you might expect, and many students struggle with it. Some students will tend to replace it with the sound in “left,” which is found in many other languages.

Teach your students to practice dropping their jaws, opening their mouths a little, and lowering their tongues to find the new sound.

  • “hat” and “up”

Sometimes when struggling to pronounce the “hat” sound, students will come out with the “up” sound because it is more similar to a sound produced in their first language. In fact, in some English dialects (such as in northern England) words which elsewhere have the “hat” sound are pronounced with the “up” sound.

But a common problem with the “up” sound is the fact that it is most often written with the letter “u.” This causes students to expect to use the vowels which form the sounds in “book” or “shoot.” (Again, some dialects of English do pronounce words such as “up” and “umbrella” with the sound that is in “book.”)

Help your students to distinguish between “hat” (an open vowel made at the front of the mouth) and “up” (an open vowel made nearer the middle of the mouth), but also make sure they don’t get confused by the letter “u” in the written form.

2. Diphthongs

A diphthong is two vowel sounds that glide together and become like one long vowel, taking up only one syllable together. In some languages this doesn’t occur, and adjacent vowels must form two syllables (often separated by a “glottal stop,” which is like a catch in the throat).

Some languages only allow diphthongs in special positions such as the end of a word. It is uncommon to have as many diphthongs, or the same diphthongs, as we have in English.

Students learning English will often either:

  • Shorten them (like saying “kek” for “cake”).
  • Split them into two short sounds.

So if your students are having difficulties with diphthongs:

  • Make sure your students are aware of which two sounds make up the diphthong. This is clear from the chart, but not always obvious from the spelling of the words.
  • Help your students to blend the two sounds smoothly together, and don’t worry if the sound seems a bit long. Diphthongs occur mostly in stressed syllables, where a lengthened vowel sound is acceptable. Make a fun activity of practicing long diphthong sounds.

3. Consonants

The consonants are actually quite a bit simpler than the vowels. There are generally six types of consonants:

  • Plosives (sometimes called “stops”) are formed when the air is stopped at a particular point in the mouth and then suddenly released. These are: p, b, t, d, k, g.
  • Fricatives are made by allowing the air to pass through a narrow gap causing friction. These include: f, v, th, s, z, sh, h and the sound of “si” in “television.”
  • Affricates are basically plosives that blend into fricatives. These are the sounds at the beginning of “cheese” and “joke.”
  • Nasals are sounds that vibrate through the nasal cavity. These are “m,” “n” and the sound usually written “ng” as in “thing.”
  • Liquids and laterals. These are the sounds “l” and “r.”
  • Semi-vowels. There are two of these: “w” and “y,” as they sometimes work as vowels and sometimes as consonants.

Generally there are three main points of articulation, or places in your mouth where the sounds are made. These are:

  • Right at the front of your mouth, using lips and/or teeth, and/or tongue. (p, b, f, v, th, m, w)
  • Behind your teeth with the tip of your tongue against the ridge behind your teeth, or further back against your palate. (t, d, s, z, sh, n, l, r)
  • In the back of your mouth near your throat. (k, g, ng, h)

Let’s look at some consonant problem sounds for ESL students: 

  • The liquids and laterals, “l” and “r.” Try saying these two sounds yourself, and work out what you do with your tongue. With “l” your tongue actually touches the roof of your mouth and the air rushes past on either side (hence the “lateral”). If your students’ first languages do not distinguish these two sounds (as is the case for many speakers of Japanese and Korean, for example), you will need to point this out to them, get them to practice and then have a signal or hand sign to help them notice the different sounds when they are listening as well as when they speak.
  • Liquid “r.” Issues with “r” sounds go far beyond trouble distinguishing “l” and “r.” There are many different “r” sounds in English. Among English accents and dialects there are some that trill the letter “r.” There is also the difference between rhotic accents (such as most American, Canadian, Scottish and Irish accents), which pronounce “r” after a vowel, and non-rhotic accents (such as most accents of England, Australia and New Zealand), which do not. For some of your students there may be significant differences between the various “r” sounds in their first language. As a result, your students may flap, trill or retroflex the “r” sound in English. The difference is all about where the tip of the tongue goes and how it moves while executing the sound. Maybe you could have a lesson where you all practice making the sound. Working with a small mirror may help students to work out what it is that they are really doing with their tongues.
  • Fricative “th.” Many students are unfamiliar with this sound, and in a half-hearted attempt to create it they usually end up sounding out a “t” or “d.” To correctly produce the sound, it is vital for them to stick the tip of their tongue right out between their teeth. At first they may think that you are kidding! Get them to work with a partner and/or a mirror to make sure that their tongue is actually visible between their teeth. Or you can get them to put a piece of paper or their finger right in front of their lips and make sure their tongue touches it while they practice the sound.
  • Consonants using lips and teeth. Sometimes students get confused between “v” and “w.” You need to make sure they realize that for “v” (and also “f”) their top teeth should be resting on their bottom lip, whereas for “w” their lips are merely close together.
  • Sibilant fricative “s.” This sound is found in most languages, so it usually doesn’t cause a problem. But sometimes you might find that your students are saying it with a slight lisp, or making it sound more like “sh.” Try saying these sounds yourself, and you will realize that simply moving the tip of your tongue further forward or back changes the sound. Too far forward will cause a lisp, and further back is where the “sh” sound is formed. Let students try making the different sounds by moving their tongue tip around.

4. Voicing

Some sounds are made using our voices, and some are not. Vowels are always voiced, but not all consonants are. To tell which is which, simply place your fingers gently on your voice-box as you speak and feel the vibrations there.

  • These consonants are voiced: b, d, g, v, z, m, n, ng, l, r, w, y, “-si-” (television), “j” (joke). There is also a voiced “th” (this).
  • These are voiceless, or unvoiced: p, t, k, f, s, h, th, sh, “ch” (cheese).

Students who have a problem with the unvoiced “th” will also struggle with the voiced version. Try saying these words:

  • “bath” and “bathe”: The “th” in the first word is unvoiced, in the second one it is voiced.
  • “cloth” and “clothes.”
  • “the,” “then” and “that”: The “th” is voiced at the beginning of these common words, but in “thanks,” “theory” and “thick” the sound is unvoiced.

There is no easy rule about when to voice the sound. It is simply a case of becoming familiar with the words. But the first step is to make sure that your students can make both sounds accurately.

Consonants are affected by the sounds around them, because there are some situations where it is uncomfortable or very difficult to pronounce an unvoiced consonant next to a voiced consonant or surrounded by vowels.

For example, the simple past tense verb ending is “-ed,” but when spoken it doesn’t always have the voiced “d” sound. If the verb ends in an unvoiced consonant (e.g. “wash,” “pick”), the “-ed” sounds like “-t” (although it is still written the same). Just try saying “washed” or “picked” to hear that “-t” sound I’m talking about.

This aspect of pronunciation is generally practiced as part of a grammar lesson, but it helps to make students aware of the general principle.
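
If it helps to see the “-ed” rule laid out step by step, here is a minimal sketch in Python. It is only an illustration, not a pronunciation tool: the function name and the sound labels are invented for this example, and you have to supply the final sound of the base verb yourself (the sound, not the spelling). The third case, where verbs that end in a “t” or “d” sound take an extra syllable (“wanted,” “needed”), is not covered above but is a standard part of the same rule.

```python
# Final sounds of the base verb, written as simple labels.
# These labels are an assumption made for this sketch; they stand
# for sounds, not spellings.
UNVOICED = {"p", "k", "f", "s", "sh", "ch", "th"}  # "th" as in "bath"


def ed_ending(final_sound: str) -> str:
    """Return how "-ed" is pronounced after a verb ending in final_sound."""
    if final_sound in {"t", "d"}:
        return "id"  # extra syllable: "wanted", "needed"
    if final_sound in UNVOICED:
        return "t"   # "washed", "picked"
    return "d"       # voiced consonants and vowels: "robbed", "played"


for verb, sound in [("wash", "sh"), ("pick", "k"), ("play", "vowel"), ("want", "t")]:
    print(verb + "ed", "->", ed_ending(sound))
```

Students can use the same three questions as a checklist: does the verb end in a “t” or “d” sound, does it end in another unvoiced sound, or is it anything else?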

It is also nice to discuss how many English dialects or accents tend to use voiced consonants where others would use unvoiced. For example, some might pronounce “better” as “bedder.”

5. Aspiration

Some plosive consonants (e.g. “p”) are aspirated.

That means that there is a little puff of air after the sound.

To test this, you can ask students to hold a sheet of paper up in front of their mouths while they say words with plosive consonants such as “paper.” They should notice that the first “p” has a puff of air, but the second one does not.

In English the aspiration is not significant. There are no minimal pairs where it makes a difference in meaning, and we tend to aspirate at the beginning of words but not in the middle or end. If we are making a point or trying to accentuate something we may add aspiration, without affecting the meaning. However, in other languages the aspiration may be more relevant.

If students say a word such as “paper” without the expected aspiration, it can sound like they are instead using the voiced consonant “b.” It can sound a little confusing, so it is worth explaining aspiration to your students and practicing it with them.

6. Unreleased Consonants

The consonants at the end of words are often not “released.”

For example, if you say the word “stop,” you close your lips on the final “p” and keep them closed—unless you are very excited, in which case the final sound might burst forth along with saliva and exasperation.

Some languages, including a number of Asian languages, have a very strong CVCV (consonant, vowel) pattern, and native speakers of those languages find final consonants difficult. They tend to add extra vowels rather than allow a word to end in a consonant, especially an unreleased one. Thus “Get up!” comes out as “Geta upa!” These students need to be taught to relax and let the consonants stay unreleased.

In other languages (e.g. Malay), when the final consonant is a plosive it is only present in the written form, neither sounded nor released. These students need to be encouraged to make the effort to actually form the final consonant and make some sound from it.

7. The Sounds Between Words

When the final consonant is unreleased, it generally reappears at the start of the next word…if that next word starts with a vowel.

Thus, in naturally spoken English, the words all run into one another. They may form a continuous stream right up until the end of the phrase, clause or even sentence. While this makes listening (and understanding) difficult for language learners, it is important for them to learn to speak this way too. Students need to move from speaking word by word to speaking in whole chunks of language. That is how fluency is attained!

Teach pronunciation of words in context. Once they can pronounce a particular word, practice saying it next to other words. So now that expression “Geta upa!” should become “Getup!”

Practice dictation. Speak your mind, say one complete thought (e.g. clause or phrase) at a natural pace, all in one go, and let students try breaking it down into words. Get them to do the same in pairs.

8. Syllable Stress

Incorrect stress is not only uncomfortable to listen to, but it can also change the meaning of words.

In some languages, syllable stress is almost irrelevant to meaning. However, in English, changing the stress can change the meaning of a word and the grammatical structure of a whole sentence. For example:

  • desert, desert, dessert: With the stress on the first syllable (DESert), this is the noun for dry land. With the stress on the second syllable (deSERT), it is the verb meaning to abandon, which sounds the same as “dessert” (desSERT).
  • PERmit, perMIT: These two words are clearly related in meaning. However, the first one is a noun (a piece of paper) and the second one is a verb (the action of allowing something). There are many other words like this.

While native speakers of English can generally understand a word even when the stress is misplaced, it can be very uncomfortable or confusing to listen to.

With long words in English which have added prefixes and suffixes, the stress often changes from the base word. This can also change the vowel sounds as they move from stressed to unstressed syllables. For example:

  • PHOto, phoTOGrapher, phoTOGraphy, photoGRAPHic.

*Note: Notice how the “o” sound changes quality (from a diphthong as in “show” to a simple vowel sound as in “on”) depending on whether or not it is in the stressed syllable.

This can be very confusing for language learners, and distressing when they are faced with reading aloud a text which contains a number of long multi-syllable words.

There are some rules (although they naturally also have exceptions) which you can teach your students to practice and increase their confidence in saying long words. For example:

  • Stress falls on the third-last syllable in words ending in a consonant plus “y” (but not “-ly”).
  • Stress falls on the third-last syllable in words ending in “-ize.”
  • Stress falls on the third-last syllable in words ending in “-ate.”
  • Stress falls on the syllable just before “-ic” or “-tion”/“-sion”/“-cion”/“-xion.”

Although learning these rules may not help students at the moment when they are about to say a word, if they are preparing themselves to read something aloud they can practice new words until they are familiar with them.
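
For teachers who like to see rules written out mechanically, the four stress rules above can be sketched in Python. Treat this as a rough illustration under some assumptions: the function names are invented for this example, the word has to be supplied already split into syllables (the code does not work out syllables for you), and, as noted above, the rules themselves have exceptions that this sketch ignores.

```python
from typing import List, Optional

VOWELS = set("aeiou")


def stressed_index(syllables: List[str]) -> Optional[int]:
    """Return the position of the stressed syllable, or None if no rule applies."""
    word = "".join(syllables).lower()
    n = len(syllables)

    # Rule: stress falls on the syllable just before "-ic" or "-tion" etc.
    # This assumes the ending sits inside the last syllable.
    if word.endswith(("ic", "tion", "sion", "cion", "xion")) and n >= 2:
        return n - 2

    # Rules: third-last syllable for "-ize", "-ate",
    # and consonant + "y" (but not "-ly").
    consonant_y = (
        len(word) >= 2
        and word[-1] == "y"
        and word[-2] not in VOWELS
        and not word.endswith("ly")
    )
    if (word.endswith(("ize", "ate")) or consonant_y) and n >= 3:
        return n - 3

    return None


def mark_stress(syllables: List[str]) -> str:
    """Join the syllables, writing the stressed one in capital letters."""
    i = stressed_index(syllables)
    return "".join(s.upper() if k == i else s for k, s in enumerate(syllables))


print(mark_stress(["pho", "tog", "ra", "phy"]))     # phoTOGraphy
print(mark_stress(["pho", "to", "graph", "ic"]))    # photoGRAPHic
print(mark_stress(["or", "ga", "nize"]))            # ORganize
print(mark_stress(["com", "mu", "ni", "cate"]))     # comMUnicate
```

When none of the rules applies (for example, “quickly”), the word is returned unmarked, which is a useful reminder that students still need a dictionary for many words.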

9. Sentence Stress

English is generally considered to be a stress-timed language. While for linguistic purists this is not hard and fast, it does demonstrate an important difference in English compared to other languages which are syllable-timed.

What it means is that the number of important words in a sentence will determine how long it takes to say the sentence, rather than the overall number of words. The little, unimportant words are mumbled through quickly in between the important words.

So, for example, the following sentences all have the same important words (in capital letters), and adding in the other words/syllables does not make the sentence any longer when spoken:

  • SAM LIVES in a NICE, OLD HOUSE.
  • SAM LIVES in a LOVEly, OLD HOUSE.
  • SAM’s been LIVing in a deLIGHTful, OLD HOUSE.
  • SAM’ll be LIVing in a deLIGHTful, VicTORian COTtage.

In each of these sentences there are five stressed syllables, and so they essentially take the same time to say. Try clicking your fingers to the beat as you say the stressed syllables.
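
If you want to check the count for yourself, or for sentences of your own, a few lines of Python will do it. This is just a sketch that relies on the capital-letter convention used above: it treats every run of two or more capitals as one stressed syllable. The function name is made up for this example, and the trick would miss a stressed syllable written as a single capital letter (such as “I”).

```python
import re

SENTENCES = [
    "SAM LIVES in a NICE, OLD HOUSE.",
    "SAM LIVES in a LOVEly, OLD HOUSE.",
    "SAM's been LIVing in a deLIGHTful, OLD HOUSE.",
    "SAM'll be LIVing in a deLIGHTful, VicTORian COTtage.",
]


def count_stressed(sentence: str) -> int:
    """Count stressed syllables marked as runs of two or more capitals."""
    # A single capital (the "V" in "VicTORian") is ordinary spelling,
    # so it is not counted as a stress mark.
    return len(re.findall(r"[A-Z]{2,}", sentence))


for sentence in SENTENCES:
    print(count_stressed(sentence), len(sentence.split()), sentence)
```

Each line prints the number of stressed syllables first and the number of words second. The word count grows from seven to eight while the stress count stays at five, which is the point of the exercise.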

Secondly, in English, the deeper meaning behind a statement is in the stress. Exactly the same sentence can hold a different meaning depending on how it is stressed. Take this sentence for example:

  • HAVE you seen my new red car? (Really? Have you actually seen it?)
  • Have YOU seen my new red car? (Because everyone else has seen it.)
  • Have you SEEN my new red car? (You’ve heard about it, but have you seen it?)
  • Have you seen MY new red car? (There are lots of cars out there, this one is mine.)
  • Have you seen my NEW red car? (Yes, I had one before, this is my new one.)
  • Have you seen my new RED car? (I have several new cars, this is my red one!)
  • Have you seen my new red CAR? (It matches my other red toys.)

Students can have great fun dramatizing these sentences.

Ask your students to try stressing the right syllables in these sentences to get the correct meaning:

  • David stole the money, not Mike. (Stress “David” and “not.”)
  • David stole the money. He didn’t have permission to take it. (Stress “stole.”)
  • I haven’t seen the film, but David has. (Stress “I” and “David.”)
  • David stole the money. He didn’t touch the jewelry. (Stress “money.”)
  • Mike’s birthday is on the 28th, not the 24th. (Stress “8th.”)

10. Intonation

Even students who achieve a high level of accuracy in their general pronunciation of sounds and words can still struggle with intonation.

Although not a tonal language (like Chinese, for example), English has a particularly musical intonation, with pitch that generally goes higher and lower than in many other languages.

Listening to a native English speaker trying to speak another language and using English intonation can send speakers of that language into fits of laughter. So when students try to use English intonation, they actually feel a little embarrassed and often end up sounding rather flat!

The theory of English intonation is complicated, and not really necessary to learn to develop good intonation skills. It’s better to use immersion, and get students to listen to and copy as much natural English speaking as possible—including the intonation.

The high point, or peak syllable, comes at the end of an utterance, so this is where the drama happens. When focusing on the intonation for a particular sentence, always start at the back end. For example:

The sentence is: “Making my own pancakes every day is such a chore!”

Try saying it with attitude!

In this case the peak syllable is “chore,” so the pitch here should be high. Also “such” should be high, as well as maybe “own” and “every.”

Now to practice:

  • “…chore!”
  • “… such a chore!”
  • “… every day is such a chore!”
  • “… my own pancakes every day is such a chore!”
  • “Making my own pancakes every day is such a chore!”
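
If you would like to prepare this kind of back-to-front drill for other sentences, the steps above can be sketched in a few lines of Python. This is a small convenience sketch only: the function name is invented for this example, and you decide where the chunk boundaries go, since the code does not know where natural pauses fall.

```python
from typing import List


def back_chain(chunks: List[str]) -> List[str]:
    """Return practice lines that grow from the end of the sentence."""
    lines = []
    for start in range(len(chunks) - 1, -1, -1):
        text = " ".join(chunks[start:])
        # Mark every line except the full sentence as a fragment.
        prefix = "… " if start > 0 else ""
        lines.append(prefix + text)
    return lines


chunks = ["Making", "my own pancakes", "every day is", "such a", "chore!"]
for line in back_chain(chunks):
    print(line)
```

The first line printed is the final chunk on its own and the last line is the whole sentence, so students always finish each repetition on the peak syllable.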

Choose a favorite line from a movie, and let your students have a lot of fun!

Make ESL Pronunciation Fun and Easy

We’ve wrapped up those 10 main areas to focus on for pronunciation practice, so I’ll just send you off with some final tips.

  • Use that phonemic chart! You may choose to let your students try using the chart. It is very user-friendly, has a number of useful built-in functions and students often enjoy learning the symbols as well.
  • Use mirrors. Some of your students may already carry a pocket mirror, or you could get hold of some small mirrors to share around. If students are struggling with particular sounds, such as “th,” then it is useful for them to see what they are doing with their mouths.
  • Use rhythm and beat. Especially when trying to teach or explain stress and intonation, get into some rhythm and beat. Besides being very relevant to the lesson, it is also very motivating for students, especially (but not only) younger ones. Go on the Internet and find some Jazz Chants, or make up some of your own.

That is all for now—and that is quite a lot of information.

Best of luck teaching pronunciation.

Just remember to teach all of these pronunciation lessons a little bit at a time.

You might be surprised by how much your students enjoy learning about this!
