Peter Norvig’s Spelling Corrector in 21 Lines of CoffeeScript

CoffeeScript is a very nice (and relatively new) language that compiles down to JavaScript, making web programming (as well as writing Firefox plugins, Node.js apps, and so forth) possibly more joyful. Its object model is the same as JavaScript’s (one of CoffeeScript’s mottos is “Unfancy JavaScript”), and its compiled JS output is quite easy to read and debug. It has many niceties, including array and object comprehensions (heavily influenced by Python’s list comprehensions).

Ruby also influenced the language, for example in the optional parentheses on method and function invocation. In fact, the original CoffeeScript compiler was written in Ruby (nowadays CoffeeScript is self-hosting: the compiler is itself written in CoffeeScript).

CoffeeScript has been used by several projects, including a mobile framework written by 37signals. I’ve been using it for about a year (including some open source work and a port of some Ruby functionality).

Because of all the Ruby and Python influence on the language, and because CoffeeScript lends itself to beautiful and concise code, I had a hunch that it could earn a really good position in Peter Norvig’s collection of Spelling Corrector implementations (the smallest JavaScript version currently has 53 lines, more than Python’s 21). With some work, I managed to implement it in 21 lines as well:

words = (text) -> (t for t in text.toLowerCase().split(/[^a-z]+/) when t.length > 0)
Array::or = (arrayFunc) -> if @length > 0 then @ else arrayFunc()
Array::flat = -> if @length == 0 then @ else @[0].concat(@[1..].flat())
train = (features) ->
  model = {}
  (model[f] = if model[f] then model[f] + 1 else 2) for f in features
  return model
NWORDS = train(words(require('fs').readFileSync('./lib/big.txt', 'utf8')))
alphabet = 'abcdefghijklmnopqrstuvwxyz'.split ""
edits1 = (word) ->
  s = ([word.substring(0, i), word.substring(i)] for i in [0..word.length])
  deletes = (a.concat b[1..] for [a, b] in s when b.length > 0)
  transposes = (a + b[1] + b[0] + b.substring(2) for [a, b] in s when b.length > 1)
  replaces = (a + c + b.substring(1) for c in alphabet for [a, b] in s when b.length > 0)
  inserts = (a + c + b for c in alphabet for [a, b] in s)
  return deletes.concat transposes.concat replaces.flat().concat inserts.flat()
known_edits2 = (word) -> ((e2 for e2 in edits1(e1) when NWORDS[e2]? for e1 in edits1(word)).flat())
known = (words) -> (w for w in words when NWORDS[w])
correct = (word) ->
  candidates = known([word]).or -> known(edits1(word)).or -> known_edits2(word).or -> [word]
  ({k: w, v: NWORDS[w] or 1} for w in candidates).sort((a, b) -> b.v - a.v)[0].k

All the code is hosted on GitHub. The code above can be seen in a more readable version here. There is also a more testable version, along with Jasmine tests.
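To try the counting step without downloading big.txt, here is a minimal sketch that mirrors the words and train functions of the listing. The two-sentence corpus is a hypothetical stand-in, not data from the original post.

```coffeescript
# Tokenizer and training step, mirroring the listing.
words = (text) -> (t for t in text.toLowerCase().split(/[^a-z]+/) when t.length > 0)
train = (features) ->
  model = {}
  (model[f] = if model[f] then model[f] + 1 else 2) for f in features
  return model

# Hypothetical corpus standing in for big.txt.
counts = train words "The cat sat. The cat ran."

console.log counts.the   # 3: two occurrences on top of the implicit count of 1
console.log counts.sat   # 2: one occurrence on top of the implicit count of 1
console.log counts.dog   # undefined: never seen in the corpus
```

Every word starts from an implicit count of 1, as in Norvig’s Python version, which is why the first occurrence stores 2 and why correct falls back to `NWORDS[w] or 1` for unseen words.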

Considerations

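Python’s re.findall has no identically named counterpart in JavaScript. Here it is replaced by splitting on the complementary regex and discarding the empty strings (see line 1). String::match with a global regex would return the same list.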
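As a small sketch of that equivalence, the snippet below tokenizes a hypothetical sample sentence both ways. The split version is the one used on line 1. The String::match version is shown only for comparison and is not part of the original code.

```coffeescript
# Hypothetical sample text, not taken from big.txt.
text = "The quick, brown fox -- the END."

# Approach used in the corrector: split on runs of non-letters,
# then drop the empty strings left at the edges.
bySplit = (t for t in text.toLowerCase().split(/[^a-z]+/) when t.length > 0)

# Comparison: String::match with a global regex returns every match,
# or null when there is none, hence the fallback to an empty array.
byMatch = text.toLowerCase().match(/[a-z]+/g) or []

console.log bySplit   # the, quick, brown, fox, the, end
console.log byMatch   # same six words
```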

Array::or (line 2) had to be implemented because Python’s truthiness treats a collection (in fact, any container that defines a length) as true only when it is not empty, whereas an empty array is truthy in JavaScript. Array::flat (line 3) had to be implemented because CoffeeScript’s loop comprehensions differ a bit from Python’s: a double loop (for example, x + y for y in col1 for x in col2) returns an array of arrays instead of a single flat array.
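Both points can be seen in a few lines. This is only an illustrative sketch: Array::or is copied from the listing, while flatten is a hypothetical stand-alone function that mirrors the recursion of Array::flat without touching the Array prototype.

```coffeescript
# In JavaScript an empty array is truthy, so `or` never reaches the fallback.
viaOperator = [] or ['fallback']          # stays []

# Array::or as in the listing: keep the array if it has elements,
# otherwise call the function to compute a replacement lazily.
Array::or = (arrayFunc) -> if @length > 0 then @ else arrayFunc()
viaHelper = [].or -> ['fallback']         # ['fallback']

# A double comprehension yields one inner array per outer iteration.
nested = (x + y for y in ['a', 'b'] for x in ['1', '2'])
# nested is [['1a', '1b'], ['2a', '2b']]

# One-level flatten (hypothetical helper mirroring Array::flat).
flatten = (arrays) ->
  if arrays.length is 0 then [] else arrays[0].concat flatten(arrays[1..])

console.log flatten(nested)               # 1a, 1b, 2a, 2b in one flat array
```

Passing a function to or, rather than a value, keeps the more expensive candidate sets (such as known_edits2) from being computed unless the cheaper ones come back empty.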

Also note that the order of the loop clauses in a comprehension is inverted. That is, x + y for x in xs for y in ys in Python translates to x + y for y in ys for x in xs in CoffeeScript.
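The inversion is easiest to see next to explicit loops. In the sketch below, xs and ys are hypothetical inputs. The rightmost for clause of the comprehension is the outer loop.

```coffeescript
xs = ['a', 'b']
ys = ['1', '2']

# Python: [x + y for x in xs for y in ys]  (leftmost `for` is the outer loop)
# CoffeeScript: the rightmost `for` is the outer loop, so the clauses swap.
pairs = (x + y for y in ys for x in xs)

# The same iteration written as explicit nested loops.
explicit = []
for x in xs
  row = []
  for y in ys
    row.push x + y
  explicit.push row

console.log JSON.stringify(pairs)      # [["a1","a2"],["b1","b2"]]
console.log JSON.stringify(explicit)   # [["a1","a2"],["b1","b2"]]
```

Once flattened, the elements come out in the same order as Python’s flat list: a1, a2, b1, b2.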

This version runs really fast on Node.js 0.4.1, and I was quite happy with how the resulting code looks. I was even happier that I did not have to write the compiled JavaScript file by hand, with its whopping 148 lines of Spelling Corrector (minus the tests).

Finally, please check out Peter Norvig’s original post.
