Issue 77 * August 20 2009

Introducing Spellcheck and RunRevPlanet
Our newest revSelect Vendor lets you in on some of the secrets of creating a SpellChecker in Rev

by Scott McDonald

Would you or your clients benefit from having spell checking in your Revolution applications? You may have seen Revolution spell checking solutions that are not cross-platform or look a little dated. If you need a solution with neither of these limitations, there is the new RunRevPlanet SpellCheck Stack.

With the RunRevPlanet SpellCheck Stack you can add spell checking with two lines of script:

call "Initialize" of stack "rrpSpellCheck"
call "SpellCheckModal myTextField" of stack "rrpSpellCheck"

Where myTextField is the name of the field object that you want to spell check. It can be that simple, and it only takes minutes to add full spell checking to your application. This script adds what some would call an old-fashioned (but still very effective and easy to use) type of spell checking. If you prefer the modern look, a line like this:

call "SpellCheckModeless myContextualMenu, myTextField" of stack "rrpSpellCheck"

Is for modeless spell checking, the type where the incorrect words are highlighted with an underline and you can correct them in any order you want with a pop-up menu. There is more script required when adding modeless spell checking, but it is no more complex than the lines above.

There is more to spell checking than just comparing words with those that exist in a list. The RunRevPlanet SpellCheck Stack includes three separate 60,000 word dictionaries for UK, US and Australian versions. These dictionaries are our own, developed over 15 years of real-world use, and you can use them in the RunRevPlanet SpellCheck Stack without any further licensing complications or costs.

Introducing RunRevPlanet

Given that RunRevPlanet is a new name for the Revolution community, how did the RunRevPlanet SpellCheck Stack come about? Scott McDonald PC Services provides reporting and assessment software to schools, largely in Australia, but until now we have focused mainly on the Windows platform using Delphi. In the search for a suitable cross-platform development environment, Revolution made the short list. To be sure that it would meet our programming needs and expectations, we wrote a spell checker as a test of the speed and robustness of Revolution.

Two characteristics of a spellchecker are:

  • The amount of data being handled can be large
  • Speed of execution has a direct impact on the user's experience

Is Revolution up to the task?

Brief look inside

Assuming you have a suitable list of words for the dictionary, two simple ways (there are others) to store the dictionary are:

  1. Lines of text in a single variable or property
  2. Elements in an array

During development the dictionary can be stored in a text file with one word per line and looks something like this:

a
aardvark
aardvarks
aback
abacus
abacuses
abaft
...
zucchini
zucchinis

Then the dictionary can be loaded into Revolution when doing a check. Let's look at how a dictionary could work by storing the words in a single variable.

global gWordDict
put URL "file:dictionarylist.txt" into gWordDict

The second line of script puts all the words from the dictionary file into the global variable gWordDict. Then it is very easy to create a function that tests whether a word is in the dictionary. For example:

function IsValidWord pWord
   global gWordDict -- declared here too, so this handler sees the loaded dictionary
   return pWord is among the words of gWordDict
end IsValidWord

Is one way to do the test. This simple function could form the basis of a simple spellchecker. If the amount of text being checked is small and the dictionary suitably modest, a spellchecker based on code like this may perform adequately. The problem is that once the text being checked runs to thousands of words and the dictionary is more complete, performance becomes an issue.

This is because searching for a single word among all the lines in the dictionary is inefficient: the dictionary is scanned from the start for every word, and for a misspelled word the scan runs all the way to the end. In preliminary experiments for the RunRevPlanet SpellCheck Stack, storing the dictionary in such a simple structure could result in spell checks taking minutes when checking tens of thousands of words. Keeping users waiting that long really isn't acceptable, so further thinking was required.
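To see where the time goes, here is a minimal sketch of a checking loop built on IsValidWord. ListUnknownWords is a hypothetical name, not part of the RunRevPlanet SpellCheck Stack, and the sketch assumes punctuation has already been removed from the text, since word chunks are split on whitespace.

```livecode
-- Hypothetical helper: returns the words of pText that IsValidWord rejects,
-- one per line. Assumes punctuation has already been stripped from pText.
function ListUnknownWords pText
   local tWord, tUnknown
   repeat for each word tWord in pText
      -- With the line-based dictionary, every call searches gWordDict from the start
      if not IsValidWord(tWord) then
         put tWord & return after tUnknown
      end if
   end repeat
   -- Drop the trailing return (harmless when tUnknown is empty)
   return char 1 to -2 of tUnknown
end ListUnknownWords
```

Called as ListUnknownWords(field "myTextField"), it makes one dictionary lookup per word, so a 50,000 word article means 50,000 searches of the dictionary.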

Since the dictionary is sorted, it would be possible to speed up the search for a word with a more sophisticated approach that doesn't need to scan the entire dictionary, such as a binary search. You can look up "Binary Search" on Wikipedia if you are not familiar with it, but suffice it to say that it is fast, real fast, if you are searching through sorted data. While a binary search script is not overly difficult to write, why use one if Revolution has a solution that requires less script? Storing the dictionary in an array, the second option listed earlier, is better and doesn't require a binary search to make it fast.
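For comparison, here is a rough sketch of what the binary search alternative might look like. It is not taken from the RunRevPlanet SpellCheck Stack; LoadSortedWords, IsValidWordBinary, sWordArray and sWordCount are hypothetical names. It assumes the words are purely alphabetic, so that the < operator compares them as text in the same order the sort command produces. The words are first copied into a numerically indexed array, because asking for line n of a large variable makes the engine count lines from the start each time.

```livecode
-- Hypothetical sketch, not the RunRevPlanet implementation.
-- Script-local variables shared by the two handlers below.
local sWordArray, sWordCount

-- Copies the word list (one word per line) into sWordArray[1..sWordCount]
on LoadSortedWords pWordList
   local tWord
   -- Re-sort so the order matches the text comparisons used in the search
   sort lines of pWordList ascending text
   put empty into sWordArray
   put 0 into sWordCount
   repeat for each line tWord in pWordList
      if tWord is empty then next repeat
      add 1 to sWordCount
      put tWord into sWordArray[sWordCount]
   end repeat
end LoadSortedWords

-- Halves the search range on each pass until the word is found or the range is empty
function IsValidWordBinary pWord
   local tLow, tHigh, tMid
   put 1 into tLow
   put sWordCount into tHigh
   repeat while tLow <= tHigh
      put (tLow + tHigh) div 2 into tMid
      if sWordArray[tMid] = pWord then
         return true
      else if sWordArray[tMid] < pWord then
         put tMid + 1 into tLow
      else
         put tMid - 1 into tHigh
      end if
   end repeat
   return false
end IsValidWordBinary
```

Even in this form it is noticeably more script to write and test than a plain array lookup.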

Binary search not needed, shorter scripts, faster development

An array in Revolution is not a traditional array like those in a third generation language such as Delphi or C++. It is actually an "associative array" that has the characteristic, as described in the Revolution User Guide, that "Each element in an array can be accessed in constant time." This allows for very fast checking without complex algorithms and scripts. That is one benefit of Revolution: less code, faster development. Loading the dictionary now looks like this:

global gWordDict
local tWordList, tWord
put URL "file:dictionarylist.txt" into tWordList
repeat for each line tWord in tWordList
   put true into gWordDict[tWord]
end repeat

And the valid word function becomes:

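function IsValidWord pWord
   global gWordDict -- must be declared in each handler that uses it
   -- A missing key yields empty, so compare with true to always return a boolean
   return gWordDict[pWord] is true
end IsValidWord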

So how much faster is the IsValidWord function when using an array? The graph below shows the difference in performance between using lines of text in a variable and using an array when checking a 50,000 word article.


That black line is not a mistake. Using the array in the script above took only 196 milliseconds, quite a difference. That is over 500 times faster with very little effort, which by that ratio puts the line-based version at more than 98 seconds (500 × 196 ms = 98,000 ms). That's real fast without complicated code. That is not to say that there is anything wrong with the among operator, just that it is not appropriate for this purpose.
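If you want to take a similar measurement yourself, a minimal timing sketch might look like the following. TimeWordChecks is a hypothetical name, and this is not the test harness used for the figures above. It times whichever version of IsValidWord is currently in the script, so you would run it once with each version on the same text.

```livecode
-- Hypothetical timing helper: looks up every word of pText and returns
-- "elapsedMilliseconds,unknownWordCount".
function TimeWordChecks pText
   local tStart, tWord, tUnknownCount
   put 0 into tUnknownCount
   put the milliseconds into tStart
   repeat for each word tWord in pText
      -- "is not true" also copes with a lookup that returns empty
      if IsValidWord(tWord) is not true then add 1 to tUnknownCount
   end repeat
   return (the milliseconds - tStart) & comma & tUnknownCount
end TimeWordChecks
```

The unknown word count is returned alongside the time so that you can confirm both versions agree on the same text before comparing their speed.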

More than just searching

Of course, there is much more to a spellchecker than just testing whether words are in a dictionary, and we put considerable effort into documenting it and making it easy to use. Satisfying the expectation that a spellchecker can offer good suggestions for each incorrect word was not a trivial problem to solve, and is worthy of an article all its own. Ultimately we were most impressed with Revolution. We produced a spell checking stack that:

  • Required less code than our previous spellchecker written in a third generation language.
  • Was developed more efficiently than we could have hoped for.
  • Is cross-platform with little extra effort.

The experience was so positive that Scott McDonald PC Services has mapped out a series of projects in Revolution, with the emphasis on supplying tools and stacks for Revolution developers to save you time and effort. The RunRevPlanet SpellCheck Stack is available now to allow you to add spell checking to your own Revolution applications.

About the Author

Scott McDonald is the owner of Scott McDonald PC Services, providing software to schools in the NSW DET in Australia. His search for cross-platform tools led to Revolution and resulted in the RunRevPlanet.com initiative. Visit RunRevPlanet here.

