A question of vocabulary (part two)
Getting to grips with Google N-gram
In my previous blog I outlined a few ways you can intuitively respond to students who ask about differences in word meanings. If time and resources allow, another way of handling questions of vocabulary is with the Google Ngram Viewer - a really useful tool for English language teachers!
An N-gram is a sequence of one or more words - it can be a single word (like 'get'), or a word cluster (like 'a lot of'). Google N-gram can cope with clusters of up to five words. The program calculates how frequently the N-gram occurs in a large collection of texts (which is called a 'corpus'). Google has a really large collection of texts - the Google Books project, which draws on millions of scanned books and amounts to hundreds of billions of words. Google's N-gram Viewer gives you a visual display of the frequency of your N-gram over time, either in the entire corpus or in a specific sub-corpus, for example American English.
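If you are curious what 'finding the frequency of an N-gram' means in practice, here is a minimal Python sketch. It is only an illustration on a made-up sentence, not Google's actual method: the function names `ngrams` and `relative_frequency` are my own, and splitting on spaces is a simplifying assumption (real corpora need much more careful handling of punctuation).

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text (lowercased)."""
    words = text.lower().split()  # assumption: words are separated by spaces
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequency(text, phrase):
    """Share of all N-grams of the same length that match the phrase."""
    n = len(phrase.split())
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return Counter(grams)[phrase.lower()] / len(grams)


# A toy 'corpus' of nine words, invented for this example.
sample = "a lot of people get a lot of advice"

print(ngrams("a lot of people", 3))
print(relative_frequency(sample, "a lot of"))
print(relative_frequency(sample, "get"))
```

The point to notice is that the result is a proportion rather than a raw count, which is what makes it possible to compare years (or corpora) that contain very different amounts of text.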
This is how it works. Go to the Google Ngram Viewer. Type your keyword into the search bar at the top and click to see the results as a graph. You can select which corpus interests you in the drop-down menu - select 'English' for everything, or only 'British English' (for example), to search a more specific corpus. You can also adjust the time frame - as far back as 1500. Cool huh? If this is all new to you, let me show you some ways this might be useful to you as a language teacher.
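If you want to hand students a ready-made link rather than talk them through the menus, the search can be written straight into the web address. The sketch below is a rough one built on assumptions: the parameter names (`content`, `year_start`, `year_end`, `corpus`, `smoothing`) are simply what appears in the address bar after a search, and they may change. The function name `ngram_url` is my own, and the value for `corpus` should be copied from the address bar after you pick a corpus in the menu, since I am not going to guess at the codes.

```python
from urllib.parse import urlencode

# Assumption: this is the address of the Viewer's graph page.
BASE_URL = "https://books.google.com/ngrams/graph"


def ngram_url(phrases, year_start=1800, year_end=2000, corpus=None, smoothing=3):
    """Build a link to an Ngram Viewer graph for one or more phrases."""
    params = {
        "content": ",".join(phrases),  # several phrases are separated by commas
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    if corpus is not None:
        # Assumption: copy this value from the address bar; it is not guessed here.
        params["corpus"] = corpus
    return BASE_URL + "?" + urlencode(params)


print(ngram_url(["ought to"], year_start=1900, year_end=2000))
print(ngram_url(["at the weekend", "on the weekend"]))
```

Paste the printed link into a browser to check that it opens the graph you expect before sharing it with a class.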
English over time
Because the Ngram Viewer plots the frequency of the N-gram's occurrence over time, it's easy to see how words are coming in and out of favor. As an example, have a look at what's happening to 'ought to':
From the looks of it, this phrase is becoming less frequent. Does that seem right?
The Ngram Viewer can also give you some insight into how words are changing. An obvious example is what has happened to the word 'gay' over the last century:
The later increase in the word's frequency shows when it took on its new meaning. It's also quite interesting to see how its use (with its meaning of 'carefree') rose slightly between the two world wars.
Because you can put multiple items in the search area (separated by commas), you can compare the use of different words. So, to take a straightforward example, let's have a look at the difference between 'apothecary', 'chemist' and 'pharmacy':
It is clear that 'apothecary' is becoming increasingly rare. That's probably no surprise. That 'pharmacy' is gaining ground over 'chemist' was a surprise for me, given that I never buy anything from a pharmacy. I must have learned to speak English in the 1960s during the 'chemist' heyday.
We can do more! Do you say 'at the weekend' or 'on the weekend'? Is either preferable? Let's see what Ngram says:
Is that the answer? Possibly, but we can explore further. Using the 'corpus' dropdown menu just below the search bar, we can restrict the search to British English only (filtering out the American influence), and get this result:
Now it's easy to see that the British are more prone to doing things 'at the weekend'. I wonder why people didn't refer to the weekend much before the second world war. Is doing things at/on the weekend a modern thing?
Giving students autonomy
We can teach students to use the Ngram Viewer to get some answers for themselves. Some student questions are really easy to handle, but others feel like any response relies on your own idiosyncratic usage of English. One example may be whether it's better to say 'toward' or 'towards'. The answer to this one is usually along the lines of: 'towards' is British. We can check:
It's true! The Americans say 'toward'. What do Australians say? How about something a little less obvious? Should it be 'different from', or 'different than'? I've been part of a few fearsome staffroom brawls on this one. If a student has this question, simply whip out your Ngram Viewer, and the answer is no longer a mystery.
Staffroom brawl averted, and the student's question answered.
There is much more that the Ngram Viewer can do. When you want to find a variety of responses to a single search, you can use wildcards (using *, as in 'as * as'), and the program will find the most common items that fit (in this case, 'as well as' is the most common). You can also compare a word's different parts of speech, as with the word 'book', using the search tags 'book_NOUN, book_VERB'. To demonstrate these two functions together, imagine you are planning to teach phrasal verbs with 'get', but there are too many to focus on in just one lesson. Using the Ngram Viewer, you can see which are the most common with the search term 'get *_ADP'.
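To make the wildcard idea concrete, here is a small Python sketch of what a search like 'as * as' is doing: slide along the text, keep every stretch of words that fits the pattern, and rank the results by how often they occur. It is only an illustration on a made-up sentence, not the Viewer's own code; the function name `wildcard_matches` is hypothetical, and it does not attempt part-of-speech tags like `_NOUN` or `_ADP`, which need a tagged corpus.

```python
from collections import Counter


def wildcard_matches(text, pattern, top=3):
    """Rank the word sequences in text that fit pattern ('*' = any one word)."""
    words = text.lower().split()  # assumption: words are separated by spaces
    slots = pattern.lower().split()
    counts = Counter()
    for i in range(len(words) - len(slots) + 1):
        window = words[i:i + len(slots)]
        # A window fits if each slot is a wildcard or the same word.
        if all(slot == "*" or slot == word for slot, word in zip(slots, window)):
            counts[" ".join(window)] += 1
    return counts.most_common(top)


# A toy text invented for this example.
sample = "as well as we can and as soon as we can and as well as ever"

print(wildcard_matches(sample, "as * as"))
```

In this toy text 'as well as' comes out on top, with 'as soon as' behind it.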
It's really worth the time and trouble to explore its functions, which are outlined here and are quite easy to follow. Obviously, as with all things statistical, there are some problems with the Ngram Viewer's results, and it may interest you to find out about these, which are introduced in this blog on Wired. You can then spot a few issues with some of the results I have presented here.
Problems aside, the Ngram Viewer is a great tool for language teachers. If you feel like playing, open the Ngram Viewer and see what you can find out about the following:
Does 'enrol' have one or two l's?
What's more common - 'couch' or 'sofa'?
Is 'beneath' better than 'underneath'?
Should we say 'have a shower', or 'take a shower'?
Is it okay to say 'if I were you', or should it be 'if I was you'?
The Ngram Viewer works on your phone too. Get your students to see what they can find out - if you are allowed to have them using their phones in class!
Steve has been a teacher and teacher trainer for over 30 years, and is currently a lecturer on the Master’s in TESOL program at King Mongkut’s University of Technology Thonburi in Bangkok.
Great blog! Very informative.
I think I may use this as source material for an IELTS writing Part One task.
By Joko, Myanmar (5th October 2016)