The hunt for cromulent words in the online wild

October 11, 2015 By Matthew Crowley

Heard any memorable neologisms, like bibliobesity or misophonia, lately? Maybe you saw them on social media or read them in a blog. You love these words, but typing them into Microsoft Word blows the spellchecker’s mind, causing a string of red underlines.

Erin McKean, a lexicographer and former editor-in-chief of American Dictionaries for Oxford University Press, knows there are bundles of these words floating around. So she wants to corral them for proper display on Wordnik, a not-for-profit online dictionary she co-founded in 2009. McKean and Wordnik last month launched a Kickstarter campaign to fund a search for a million “missing words” absent from traditional lexicons.

“I’ve been working on dictionaries for more than 20 years,” McKean says in a video on the Kickstarter campaign page. “People often come up to me and say, ‘I went to look up this word and it wasn’t in the dictionary.’ And often that breaks my heart, because the word that they’re interested in is a perfectly good word, a perfectly cromulent word, a great word oftentimes.”

(As Bustle’s Jaime Lutz notes, David X. Cohen, a writer for “The Simpsons,” coined cromulent for a 1996 episode of the show called “Lisa the Iconoclast.” The word means “fine” or “acceptable.”)

The campaign, McKean explains, will let Wordnik hunt for these words in the online wild and see them used in real examples by real people, perhaps in tweets, Flickr photo captions or elsewhere. In particular, McKean adds, she and her lexicographic team will search for “free-range definitions,” in which the examples define the words. These are also known as FRDs, or “freds.”

On the Kickstarter page, McKean said Wordnik will hunt for nonce-words (words used once); affixes (prefixes and suffixes) and words created by adding affixes to existing words; fixed idioms and phrases; and words English writers and speakers have borrowed from other languages but that other dictionaries haven’t classified as naturalized in English.

Wordnik will start its project by using thousands of already-found free-range definitions as a training set for a new machine-learning-based tool, McKean wrote.
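For the curious, here is a rough sketch of what spotting a “fred” could look like in its simplest form. It is not Wordnik’s tool, which McKean describes as machine-learning-based; it is a hand-written pattern matcher in Python, and the cue phrases, the function name find_candidate_freds and the tiny known-word list are all assumptions made up for illustration.

```python
import re

# Hypothetical cue phrases that often signal a word being defined in passing.
# These are illustrative guesses, not rules taken from Wordnik.
CUE_PATTERNS = [
    re.compile(r"\b(?P<word>[A-Za-z][A-Za-z-]*), which means (?P<gloss>[^.]+)"),
    re.compile(r"\b(?P<word>[A-Za-z][A-Za-z-]*) is (?:a|an|the) (?P<gloss>[^.]+)"),
]


def find_candidate_freds(sentence, known_words):
    """Return (word, gloss) pairs for words not in known_words
    that the sentence appears to define."""
    found = []
    for pattern in CUE_PATTERNS:
        for match in pattern.finditer(sentence):
            word = match.group("word").lower()
            if word not in known_words:
                found.append((word, match.group("gloss").strip()))
    return found


# Stand-in for a traditional dictionary's headword list (assumed data).
known = {"dictionary", "word", "book"}
print(find_candidate_freds("Misophonia is a hatred of sound.", known))
# prints [('misophonia', 'hatred of sound')]
```

A real system would need far more than two patterns, which is presumably why a training set of thousands of examples is the starting point.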

Although the campaign had already exceeded its $50,000 goal by Sunday, Oct. 11, it was set to continue until Friday, Oct. 16. As of Sunday, 640 backers had signed up.

Contribution levels include random backer listing ($1 donation), which will get a random donor’s name displayed when Web surfers click Wordnik’s “random word” link; word adopter ($25 donation), for which Wordnik will list the donor as a neologism’s adopter for a year; and many others.

If you can’t find a neologism to sponsor, Wordnik will, for a donation of $7,500 or more, create one for you and shower you with accompanying adoption perks, which include a set of Wordnik and word adoption stickers and a downloadable adoption certificate.

In case you were wondering, misophonia, cited in the Huffington Post, is a hatred of sound. Bibliobesity, which The New York Times credited to Wall Street Journal theater critic Terry Teachout, is a problem of bloated books.

Other words awaiting lexicographical moments include budthrill, which PRI’s Patrick Cox describes as an “unforgettable moment in a podcast,” and farecasting, which New York Times travel columnist Stephanie Rosenbloom describes as the effort, often on travel apps, to predict which ticket-buying date will yield the lowest fares. In an Oct. 3 article on Wordnik’s campaign, the Times’ Natasha Singer mentioned Rosenbloom’s “farecast” and “roomnesia,” a condition in which people forget why they walked into a room. (Let me suggest “fridgetfulness,” the same phenomenon as roomnesia, except that it happens with one’s head in the icebox. Yup, just thought of it.)

A million is a big number, and some skeptics may wonder whether that many unrecorded English words are really at large. Absolutely, McKean writes, citing 2010 Harvard research published in the journal Science.

McKean wrote that researchers tapped the Google Books Corpus, which includes 5 million books and 361 billion words, and compared samples to dictionaries including the Oxford English Dictionary and the Merriam-Webster Unabridged Dictionary. The conclusion? More than half of the English lexicon, 52 percent, consists of “lexical ‘dark matter’ undocumented in standard references.”
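The arithmetic behind a figure like that 52 percent can be shown with a toy example. The Python sketch below simply counts the distinct words in a sample that do not appear in a reference list; it does not reproduce the researchers’ sampling method, and the function name undocumented_share and both word lists are invented for illustration.

```python
def undocumented_share(corpus_words, dictionary_words):
    """Fraction of distinct corpus words that are absent from the dictionary."""
    corpus = {w.lower() for w in corpus_words}
    documented = {w.lower() for w in dictionary_words}
    if not corpus:
        return 0.0  # avoid dividing by zero on an empty sample
    missing = corpus - documented
    return len(missing) / len(corpus)


# Made-up sample data, not drawn from the Google Books Corpus.
corpus_sample = ["cromulent", "roomnesia", "book", "sound"]
dictionary = ["book", "sound", "word"]
print(undocumented_share(corpus_sample, dictionary))  # 2 of 4 distinct words, prints 0.5
```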

Why would anyone support this project to make seemingly fringy words lookupable? Maybe to make Wordnik, already a powerful tool for headline writers, even handier. Or maybe for simple logophilia.

“If you’re a writer, you should back this Kickstarter because we’ll find you many more words to use,” McKean says in the video. “If you like to cheat at Words With Friends, you should back this Kickstarter because we will find more words for you to play. … We believe every word deserves a place in the dictionary.”

Matthew Crowley is a copy editor, writer and member of the American Copy Editors Society and Editorial Freelancers Association. Follow him on Twitter @copyjockey or e-mail him at copyjockey.mcc@gmail.com.
