My uncle picked us up from the airport in Suva, Fiji, around 4 a.m. local time.
As we drove to his house, I leaned my head back and looked through the sunroof at the stars.
Being from California, I was used to the stars and constellations of the Northern Hemisphere: Orion, the North Star, the Big and Little Dippers.
That Fijian morning, there were no points of reference.
I stared into the sky the whole way home and felt completely lost for every second of it.
If you’re starting to learn Chinese, you’ll probably experience that same lost feeling when you stare at a Mandarin Chinese textbook, your plane ticket to China or the face of a Chinese person who can’t speak English.
Having no points of reference will do that to you.
So if you feel a little overwhelmed at the thought of trying to pronounce Mandarin, you’re not alone.
As you may know, there are levels of sound and inflection that can affect your 意思 (yì si — meaning, as in what you’re trying to convey).
The wrong tone can make a genuine expression sound sarcastic. The wrong inflection can make a confident expression sound nervous. If you mispronounce a word and accidentally say another word in the wrong context, you can really get yourself in trouble.
Those are just a few of the challenges of speaking English, but I’m sure you’re handling them well. Chinese isn’t much different in these respects, so don’t panic. Here’s your guide to the Chinese pronunciation galaxy.
Don’t Panic: Your Guide to the Chinese Pronunciation Galaxy
Why You Can Pronounce Chinese Fearlessly
The first word almost anyone learns in Chinese is 你好 (nǐ hǎo — hello). This word is a great example of how to use English sounds to pronounce Chinese.
- The “n” in nǐ sounds like the “n” in “nose.”
- The “i” in nǐ sounds like the “ee” in “beep.”
- The “h” in hǎo sounds like the “h” in “horse.”
- The “a” in hǎo sounds like the “ah” in “blah.”
- The “o” in hǎo sounds like the “oh” in… “oh.”
- The combination of “a” and “o” in hǎo sounds like the “ow” in “towel.”
Now we’ll combine the sounds:
- n + ee gives you the English word “knee.”
- h + ah + oh gives you the English word “how.”
You just used English to pronounce the most common Chinese greeting there is. Here’s why you can do that.
How Pronunciation Works
Pronunciation (the sounds you make) comes from a combination of three factors: mouth shape, tongue placement and air flow.
- Mouth shape. Mouth shape is the width of your jaw and the shape of your lips. For example, make the “ee” sound without actually making a sound. Your jaw is closed so your teeth almost touch, and the corners of your lips are pulled back, almost as if you were smiling. Most Chinese sounds require the same mouth shapes as English sounds.
- Tongue placement. Your tongue goes all over the place when you talk. Very slowly, re-read that last sentence out loud (you can whisper if you want) and think about the different parts of your mouth your tongue touches. Depending on your accent, your tongue might have had two placements when you said “your” (behind your bottom teeth for the “y” sound and then a quick move upwards to make the “r” sound). Most Chinese sounds require the same tongue placements as English sounds.
- Air flow. The air flow of English sounds involves your diaphragm, lungs, throat, nose and tongue. Chinese sounds don’t use the diaphragm and use little of the throat, so in a way you’re already equipped to speak Chinese. If you speak from your diaphragm and use more oomph than necessary to pronounce Chinese, it won’t ruin your pronunciation.
Different muscle movements throughout your torso create different sounds. Professional linguistics has specific terms for different sounds based on how you use your mouth, throat and tongue to make them. These terms may be helpful when practicing at home, so don’t disregard them. At the same time, you probably don’t think about how saying “la-la-la” requires liquid alveolar articulation, so if you don’t have professional linguistic help, you can still do just fine.
How Tones Work
Tones are big. Really big. They can instantly turn your kindness into a misunderstood assault of words or make a deep thought sound like you never went to school. Practicing your tones is the greatest foundation you can lay for solid Chinese pronunciation. Here’s how you can use English tones to help you use Chinese tones.
Yes, English uses tones, too. Below is an example dialogue. Your lines have no punctuation so you can focus on putting your personal emotion into it. When you read your lines, give it all you’ve got. Person A is your life-long best friend who happens to be horribly monotone and dreary, and Person B is you:
A: “I just won the $50 million lottery.”
B: “Whaaaaaat” (unbridled, happy shock)
A: “I just tried it for the fun of it and I won.”
B: “Yeah” (excitedly said as a question, seeking confirmation of what was just said)
A: “Yeah, but it’s all meaningless anyway…”
B: “Huh” (mind-numbing confusion)
A: “…wouldn’t you agree?”
B: “No” (an exclamation equivalent to a verbal slap)
If you executed those lines well, you just used all four tones in Mandarin.
- The first tone (unbridled happiness) is a longer, drawn-out tone that doesn’t change pitch.
- The second tone (question) rises, just like you’re asking a question.
- The third tone (utter confusion) dips down a bit, then rises (say “huh” really slowly to hear it well).
- The fourth tone (exclamation) starts high and drops very sharply.
Try that conversation again so you can hear your tones.
The tricky part is removing your native emotion from those tones.
Now try the conversation one more time: Keep your emotional cues so your tones maintain form, but this time say your lines with emotional limpness.
That third time was probably a little harder. Once you can remove English-speaking emotion from those four inflections, you’ve nailed all four Chinese tones.
There’s also a neutral tone, which sounds like a syllable that’s said only because it has to be said. For example, 谢谢 (xiè xie — thank you) emphasizes the first “xiè” because it’s the fourth tone, but the second “xie” is dull and short. If you’re uncertain how to pronounce a syllable with no specified tone, then pronounce it with said uncertainty, because that’s how it’s supposed to sound.
Emotion in Chinese is expressed with intensity and volume. In English, if I hand you a small package of peanuts for no apparent reason, you might say “thanks” with deadened zeal, expressing your confusion about what just happened. If I hand you movie tickets and say “happy anniversary,” you (hopefully) would say “thanks” with a little more vigor.
Similarly, you can say 谢谢 (xiè xie) with complete un-excitement if you have no deep feelings toward what just happened, or with a higher intensity and volume because you sincerely appreciate someone’s gesture.
Our brains are good at separating and organizing communication concepts, so if you’ve never learned a language before, don’t worry about Chinese affecting your English. At the same time, training your brain to accept Chinese communication concepts will take time, so don’t give up if you don’t get it right away. Once you learn to pronounce Chinese properly, learning words and their meanings will be the only thing standing between you and fluency.
Why You Should Learn 拼音 (pīn yīn)
Chinese characters work a bit like hieroglyphs: the written form gives a beginner few reliable clues for pronunciation. That’s why you probably didn’t know how to pronounce 拼音 (pīn yīn — phonetic writing) when you first saw it. Although some Chinese learners feel that pinyin is too much of a crutch, most will say that it’s very helpful, and it has a lot of uses outside of learning pronunciation.
Pinyin uses “English” letters to create new sounds. That may sound challenging, but if you can say “tortilla” correctly (tor-TEE-ya), you can already do this.
If someone says something to you in Chinese, pinyin is good to know so you can look up what it means. For example, if someone said the word 拼音 (pīn yīn) to you, you would know how to write “p-i-n-y-i-n” to spell out the sounds and find the translation.
Two useful resources for English-Chinese translation, among others, are Pleco and Baidu Fanyi.
- Pleco (available on Android and iOS mobile platforms) is probably the translation resource most used by English-speaking learners of Chinese. Some benefits are color-coded tones, OCR (optical character recognition, meaning you can scan a character and it will look it up for you) and a clipboard reader that lets you copy entire sentences and work through them word by word.
A couple of downsides: each word entry has a slew of meanings, which is great for studying but may be overwhelming when you need on-the-spot translation, and Pleco doesn’t specify if a word is a noun, verb, adjective or more than one of those, so you’ll have to work to figure it out.
- Baidu Fanyi is a great English-Chinese and Chinese-English translation website. It searches the web for sentences that use the word you’re looking up and shows you different contexts for using that word. If a word can be translated several ways, you can select the different meanings and see how they fit into the context. The trouble is, once you find the meaning you’re looking for, it gives you characters but not the pinyin, so you’ll have to look that up yourself.
Pinyin is also useful for daily life. If you hear a word often enough that you want to write it down to look up later, you know how to spell the pronunciation. Chinese people regularly use pinyin for texting since the stroke order of a character isn’t easy to remember.
Pinyin is taught in schools in China as well, so it even helps Chinese kids understand how their pronunciation works. Some characters that look alike may have very different meanings, so seeing the pinyin can help avoid confusion. Learning Chinese well without pinyin is possible, but highly improbable, and its uses go far beyond the training wheels of pronunciation.
The Basics of Pinyin and Mandarin Pronunciation
Mandarin syllables are broken down into initial sounds (initials), final sounds (finals) and tones.
Initials and Finals
Mandarin has a total of 21 initials and 34 finals that combine to make around 400 sounds (not all combinations are used). In contrast, English has around 20 vowel sounds and 24 consonant sounds but can have layers of sound combinations in one word (“not,” “naut,” “nought,” “ought,” “drought,” “draught,” “laughter,” “daughter”). In this respect, Chinese pinyin is actually easier.
Tones and Their Marks
Pinyin uses tone marks to signal the “shape” of the tone:
- The first tone is long and straight (mā).
- The second tone rises from its starting point (má).
- The third tone drops down a bit, then rises (mǎ).
- The fourth tone drops sharply (mà).
- The neutral tone does absolutely nothing (ma).
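If you type pinyin with tone numbers (ma1, ma3) and want the marked form, the placement of the mark can be sketched in a few lines of Python. This is only a rough illustration: the function name add_tone_mark and the use of 5 for the neutral tone are assumptions made for this example, not part of any official tool. It follows the usual placement convention, where the mark goes on “a” or “e” if the syllable has one, on the “o” of “ou,” and otherwise on the last vowel.

```python
# Marked forms of each vowel, in order: first, second, third, fourth tone.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def add_tone_mark(syllable: str, tone: int) -> str:
    """Return a toneless pinyin syllable with the mark for the given tone.

    Tones 1-4 get a mark; 5 (an assumed code for the neutral tone)
    leaves the syllable unmarked.
    """
    if tone == 5:
        return syllable
    if tone not in (1, 2, 3, 4):
        raise ValueError("tone must be 1, 2, 3, 4 or 5")

    # 'a' or 'e' takes the mark whenever the syllable contains one.
    for vowel in "ae":
        if vowel in syllable:
            target = syllable.index(vowel)
            break
    else:
        if "ou" in syllable:
            # In 'ou', the mark goes on the 'o'.
            target = syllable.index("o")
        else:
            # Otherwise the last vowel in the syllable takes the mark.
            positions = [i for i, ch in enumerate(syllable) if ch in TONE_MARKS]
            if not positions:
                raise ValueError("no vowel found in syllable")
            target = positions[-1]

    marked = TONE_MARKS[syllable[target]][tone - 1]
    return syllable[:target] + marked + syllable[target + 1:]


for tone in (1, 2, 3, 4, 5):
    print(tone, add_tone_mark("ma", tone))
```

Running the loop prints the five forms of “ma” from the list above, one per line.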
The tone marks are helpful for training your pronunciation; if you miss the tone, you might as well be speaking another language.
The combination of initials, finals and tone marks helps you read pinyin. Let’s take the first character in “pīnyīn” (拼 — pīn) as an example:
- The initial is “p” (like the “p” in “pot”).
- The final is “in” (like the “een” sound in “scene”).
- The tone is the first tone, marked by a straight line over the vowel, resulting in “pīn.”
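The same breakdown can be sketched in Python. Treat this as a minimal sketch under a few assumptions: the name parse_syllable is made up for this example, the input is a single syllable typed with a trailing tone number (pin1 for pīn), and a missing number is read as the neutral tone, coded here as 5. The list holds the 21 initials; “y” and “w” are not among them, so a syllable like “yin” comes back with an empty initial.

```python
# The 21 Mandarin initials. Two-letter initials come first so that
# "zh" is matched before "z", "ch" before "c" and "sh" before "s".
INITIALS = [
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s",
]


def parse_syllable(text: str) -> tuple[str, str, int]:
    """Split one numbered pinyin syllable into (initial, final, tone)."""
    text = text.strip().lower()

    # A trailing digit 1-5 is the tone; no digit is treated as neutral (5).
    if text and text[-1] in "12345":
        tone = int(text[-1])
        body = text[:-1]
    else:
        tone = 5
        body = text

    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone

    # No initial matched (for example "yin" or "an"): everything is the final.
    return "", body, tone


print(parse_syllable("pin1"))
print(parse_syllable("zhong1"))
```

For “pin1” this gives the initial “p,” the final “in” and tone 1, matching the list above.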
For practice, you can try reading the following sentence out loud, preferably with someone who can tell you how you did. Be patient and read it slowly at first, then see if you can pick up your speed a little.
wǒ de zhōng wén pīn yīn dú de hěn hǎo (我的中文拼音读得很好 — My Chinese pinyin reading is good.)
Side note: 读 (dú) refers to reading aloud, not reading a book silently to yourself as you might do when sitting by a fire. Unless, of course, you read out loud next to the fire, which is purely a personal choice.
The English equivalent of pinyin is the set of pronunciation symbols that an English dictionary prints next to each word. It would be supremely helpful if we learned how to read those symbols as children, but that would negate the value of the unnecessarily complicated written language you’re reading now. Unlike English words, Chinese characters are rich with meaning but can’t be sounded out, which makes both pinyin and the written language necessary. Although some may think it lacks long-term benefits, learning pinyin is definitely the way to go.
Challenging Chinese Sounds for an English Speaker to Pronounce
As previously mentioned, the majority of Chinese sounds resemble English sounds. The word “majority” implies that there will be hiccups along the way. This part of the guide will help you navigate the parts of Mandarin that are trickier for English speakers.
Note: The pinyin pronunciations with links below will take you to Baidu Fanyi, where you can click on the sound icon to hear the pronunciation of example words. The “combinations” section includes examples of all the sounds not given examples in the “initials” and “finals” sections.
You can also find these sounds and much more in use in authentic videos on FluentU.
8 Challenging Initials
For these first three initials, you should feel the bottom of your tongue against the ridge of your mouth. “The ridge” is the part of your mouth behind your top teeth that turns upward toward your palate. Linguists and oral health experts refer to this as the alveolar ridge; the rest of us refer to this as “the part that gets burnt when you eat really hot pizza.” When you say “la-la-la,” your tongue hits somewhere around the ridge. English rarely requires your tongue to go above the ridge and closer to your palate, but this is an important ability when speaking Chinese.
- zh (English “j” with your tongue above the ridge).
- ch (English “ch” with your tongue above the ridge).
- sh (English “sh” with your tongue above the ridge).
For the next initial, your tongue should almost touch the roof of your mouth. The higher you can get your tongue, the more authentic your pronunciation will sound.
- r (English “r” with your tongue almost touching the roof of your mouth).
- 人 (rén — person).
For the next two initials, the trick is to relax your lips and not pucker when you pronounce them. Mouth shape is an important part of pronunciation.
- x (English “sh” through your teeth).
- q (English “ch” through your teeth).
The last two are English sounds but can be easy to mix up when speaking. Chinese language students have an especially hard time with this, so if you’ll be taking Chinese classes, train yourself to differentiate the two before starting your class.
- z (“ds,” as in “kids”).
- 自己 (zì jǐ — the suffix “-self”; also used to refer to a person mentioned earlier in a sentence).
- c (“ts,” as in “kits”).
- 词典 (cí diǎn — dictionary).
Consider that even in English, “kids” and “kits” have two very different meanings leading to two very different reactions:
- “Our office doesn’t have any first aid kids.”
- “I was carrying two kits and dropped one of them.”
Context helps, but even in context, you would probably correct someone who said “first aid kids.”
3 Challenging Finals
There are only three finals listed, but they’re used a lot. The first two finals are extremely nasal.
- ong (“o” as in “home,” “ng” pronounced strongly through the nose). I chose “home” as the example word because when Chinese people pronounce 红 (hóng — the color red), it sometimes sounds like they’re saying “home.” That’s how strong the nasal sound should be.
- ing (“iing” as in “skiing,” “ng” pronounced strongly through the nose). I chose “skiing” as the example word because when Chinese people pronounce an ing word slowly and deliberately, it sounds like the “iing” in “skiing” (“ee-ing” squished into one sound). For example, you speak 英文 (yīng wén — English).
The last final is as non-English as it gets in Chinese.
- ü (“Louie” without the “L,” softening the ending). This is another word where, when Chinese people pronounce the sound slowly and deliberately, you can clearly hear the “oo-wee” sound (again, squished into one sound). You’ll need this if you ask for 绿茶 (lǜ chá — green tea) or if you tell someone you speak 英语 (yīng yǔ — another way of saying English).
8 Challenging Combinations
Let’s start with the ridge again.
- zhi (English “j” with your tongue above the ridge, finishing with a soft “r” sound; like “jerk” without the “k”).
- 知道 (zhī dào — to know).
- chi (English “ch” with your tongue above the ridge, finishing with a soft “r” sound; like “chirp” without the “p”).
- 吃饭 (chī fàn — to eat).
- shi (English “sh” with your tongue above the ridge, finishing with a soft “r” sound; like “shirt” without the “t”).
- 是 (shì — to be; often used to say “yes”).
For all three of those sounds, the faint “r” sound at the end comes naturally for Chinese speakers. Training yourself to finish with that soft “r” will give your pronunciation a more authentic feel. It will also help you distinguish those sounds from the next three (think “first aid kids”):
- zhe (English “j” with your tongue above the ridge, finishing with an “uh” sound).
- 这 (zhè — this).
- che (English “ch” with your tongue above the ridge, finishing with an “uh” sound).
- 车 (chē — car).
- she (English “sh” with your tongue above the ridge, finishing with an “uh” sound).
- 设 (shè — to design; to establish). Beginners don’t use this sound often, but it’s still good for practice so you’ll be prepared when you reach intermediate levels.
The next two are said through your teeth. Again, don’t pucker your lips.
- xi (English word “she” said through your teeth).
- 西 (xī — west).
- qi (English word “cheat” without the “t” said through your teeth).
- 七 (qī — seven).
8 Tricky Pinyin Pronunciations
The next four are pinyin rules you’ll have to memorize. The u sound in these words is the ü sound even though the two dots aren’t written over the vowel.
- ju (English “j” plus the ü sound).
- 句子 (jùzi — sentence, as in phrase).
- xu (English “sh” through your teeth plus the ü sound).
- 需要 (xūyào — to need).
- qu (English “ch” through your teeth plus the ü sound).
- 去 (qù — to go).
- yu (only the ü sound)
- 语 (yǔ — language). That’s why yīngyǔ didn’t have the two dots over the u earlier.
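The rule above can be written out as a tiny Python check. This is a hedged sketch, not a full pinyin parser: the name reveal_umlaut is invented for this example, and it assumes a single syllable typed without tone marks. It treats a “u” written directly after j, q, x or y as ü, which also covers longer syllables built on the same sounds, such as “quan.”

```python
def reveal_umlaut(syllable: str) -> str:
    """Show the hidden ü in a toneless pinyin syllable.

    A "u" written directly after j, q, x or y is pronounced ü,
    even though the two dots are not written.
    """
    if len(syllable) >= 2 and syllable[0] in "jqxy" and syllable[1] == "u":
        return syllable[0] + "ü" + syllable[2:]
    # Any other syllable is returned unchanged ("lu" keeps its plain u).
    return syllable


for written in ("ju", "xu", "qu", "yu", "lu"):
    print(written, "->", reveal_umlaut(written))
```

Note that “jiu” is left alone, because its “u” does not come directly after the “j.”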
That last one, yu, introduces the next pinyin rule. In the following four sounds, the first letter is silent. Pinyin was invented long after Chinese was a written language. The silent initial letter is written because every syllable needs an initial and a final, but in these cases, it shouldn’t be pronounced.
- yu (only the ü sound).
- 母语 (mǔ yǔ — literally “mother language,” as in your native language).
- yi (only the i sound, pronounced “ee”).
- 一 (yī — one).
- ying (only the ing sound).
- 英国 (yīng guó — England).
- wu (only the u sound, pronounced “oo” as in “food”).
- 五 (wǔ — five).
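Here is the silent-letter rule as a short Python sketch. The name spoken_final is an assumption made for this example, and the input is one syllable typed without tone marks. It covers syllables that start with “yi,” “wu” or “yu,” which includes the four sounds above plus close relatives such as “yin.” Other syllables that start with “y” or “w” (“ya,” “wo”) are not covered by this rule and come back unchanged.

```python
def spoken_final(syllable: str) -> str:
    """Drop the silent first letter from yi-, wu- and yu- syllables."""
    if syllable.startswith("yu"):
        # The written u after y is really ü, and the y itself is silent.
        return "ü" + syllable[2:]
    if syllable.startswith("yi") or syllable.startswith("wu"):
        # The y or w is silent; what remains is the sound you say.
        return syllable[1:]
    return syllable


for written in ("yu", "yi", "ying", "wu"):
    print(written, "->", spoken_final(written))
```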
On resources like Learn NC, you can find more detailed help for initials, simple finals, compound finals and nasal finals, with audio files.
Don’t forget, though, your greatest Chinese language learning resources are actual Chinese-speaking humans.
And if you’re hungry for live practice, there’s always your local Chinese restaurant at the end of the Mandarin universe.