In the weird and wonderful world of words, which world of words is the weirdest? And if we replace ‘weird’ with ‘hard’, we find one of those eternal questions facing language learners: which language is the most difficult?

The question usually ends up in a battle between disparate but well-known languages, compared on the basis of their obscure writing systems, difficult pronunciation, complicated grammar or unfamiliar structure. Tones, clicks and ideograms all play their part. Yet if reason prevails and alcohol levels don’t get too high, most arguments are settled on the reasonable assumption that the most difficult or weirdest language to learn is always that one which is most different from the one(s) you already speak. Knowing English, for instance, means you should have a much easier time of picking up a language of Germanic or Romance origin. Range a little further in the Indo-European family and things get a little trickier, but still far easier than tackling a language from a different group altogether.

Yet when considering which languages are weird, we are liable to approach the topic from an Anglophone and Western-centric standpoint. Riding the chargers of history into every corner of the globe, Indo-European languages exhibiting similar characteristics can be found on every continent in every hemisphere, spoken by billions of people, plus many more who can speak one as a second language. It’s on the basis of these pillars that many of us consider ‘normal’ the features found in these languages: they appear to be commonly held principles across a large number of different tongues spoken by huge swathes of the population.

So how does this all weigh up if we recalibrate the scales? The people at Idibon produced an interesting little study[1] using the data stored in the World Atlas of Language Structures, which tracks 2,679 different languages according to an assortment of linguistic features, including sound types and syntactic and grammatical structures. In total 192 topics are covered, ranging from the use of genders through to the order of words in a sentence, with many technical offerings in between. The group limited themselves to 21 features for which data was available for at least 100 languages, and eliminated correlated features, which would otherwise have given undue weight to particular traits. Since the data in the WALS is relatively sparse, they limited their little report to 239 languages which could be compared using at least two-thirds of the selected features.
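
For the curious, the two coverage filters described above can be sketched in a few lines of Python. This is only my own rough illustration, not Idibon's actual code: the function names, the toy data and its feature labels are all invented, and the step that removes correlated features is left out entirely.

```python
from collections import defaultdict


def select_features(data, min_languages=100):
    """Keep features that have a recorded value in at least min_languages languages.

    data maps a language name to a dict of {feature name: value}.
    (Hypothetical layout, chosen for illustration only.)
    """
    counts = defaultdict(int)
    for values in data.values():
        for feature in values:
            counts[feature] += 1
    return sorted(f for f, n in counts.items() if n >= min_languages)


def select_languages(data, features):
    """Keep languages with values for at least two-thirds of the chosen features."""
    kept = []
    for language, values in data.items():
        covered = sum(1 for f in features if f in values)
        # Integer form of covered / len(features) >= 2/3, avoiding float rounding.
        if 3 * covered >= 2 * len(features):
            kept.append(language)
    return sorted(kept)


# Invented toy data: three made-up languages, three made-up features.
toy = {
    "A": {"word_order": "SVO", "question": "inversion", "gender": "yes"},
    "B": {"word_order": "SOV", "question": "particle"},
    "C": {"word_order": "SOV"},
}

features = select_features(toy, min_languages=2)  # threshold lowered for the toy data
languages = select_languages(toy, features)
print(features)   # ['question', 'word_order']
print(languages)  # ['A', 'B']
```

With the real thresholds (100 languages per feature, two-thirds coverage per language), this kind of filtering is what whittles 2,679 languages down to a comparable subset.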

The interesting thing is just how far those common languages we take as being regular stand out from the crowd. German, Czech and Spanish all crop up in the 25 weirdest languages, with English appearing at number 33. What makes them so extraordinary? Two example features are used to highlight how unusual many Indo-European languages are. Inverting subject and verb to form a question – ‘You are happy’ vs. ‘Are you happy?’ – is in fact a rarity, occurring in only 1.4% of languages surveyed (of the 955 languages covered, the vast majority use a form of question particle and only 13 use an altered word order). Similarly, having subject pronouns present most of the time, as found in the Germanic languages, appears in only 82 of the 711 languages with data. At the opposite end of the scale, languages such as Basque, Cantonese, Turkish and Hindi all tally up as rather average, with Hindi being the most normal of them all (just one weird characteristic).
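
The proportions are easy enough to check from the counts quoted above. A quick sketch (the helper name `share` is my own):

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(share(13, 955))  # 1.4  (question formed by changing word order)
print(share(82, 711))  # 11.5 (subject pronouns present most of the time)
```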

Of course, the survey has its limits, given the paucity of data in many areas, the fact that a majority of languages aren’t documented or covered by the survey, and the issue that the chosen database by its very nature reflects the interests and observations of speakers of Western languages. Nevertheless, it remains an interesting conclusion that English, despite its wide geographic spread and seeming ubiquity, is actually one of the weirder languages out there.

1. Since originally starting this post, the website has unfortunately died. I should probably learn to finish my draft posts in a little less than two years.

