How to Handle Spelling Mistakes in Joomla’s Search

A company that we deal with runs a very large Joomla website mainly discussing law topics. The company complained that Joomla’s search doesn’t address a very common human issue: spelling mistakes.

For example, if you mean to search for Lawyer but type Lauyer, then you won’t find anything using Joomla’s search, because these are technically two different words. However, it’s obvious that you made a mistake and that you meant to search for Lawyer instead of Lauyer.

Another issue that the company was facing was the spelling differences between US English and UK English. For example, when someone from the UK searches for the word colour, they won’t find anything if the articles use the US spelling, color. So, how can this be addressed?

Well, there are two ways:

  1. Create an algorithm that will generate all the possible erroneous ways a word can be spelled and then link these misspelled words to the original word.
  2. Use the MySQL built-in SoundEx function to search for all occurrences that sound like a certain word.

Now, unless you work for a very large company and you have access to a lot of user-generated data, the first option is out of the question: you will spend months working on a solution, and you will still miss many misspelled words. Not to mention that this solution can be very slow and inefficient.

The second option is much better because it’s simpler and much more efficient. All you need to do is to tell MySQL that you want matches that contain words that sound like your word. So, if you’re searching for the word colour, you’ll get the matches that contain the word color. If you’re searching for Jon Smith, then you’ll get matches containing John Smith. If you make a spelling mistake that still sounds like the intended word, then Joomla will be flexible and will return results, despite your errors! Now that’s a good search engine!

So, how does the SoundEx function work?

OK, we have stated that the SoundEx function is the key to the solution, but we haven’t mentioned how it works. In short, and without getting into the details, the SoundEx function returns a phonetic representation of a word, which is a short code describing how the word sounds. When two words have the same SoundEx value, they are considered to sound alike. Try the experiment below:

  • Log in to phpMyAdmin.
  • Click on any database in the left panel, and then click on the SQL tab at the top.

  • Run the following query:

    SELECT SOUNDEX('itoctopus');

    You will get a value of I32312.

  • Now run the following query:

    SELECT SOUNDEX('itoctopous');

    (Notice the extra “o”)

    And you will get the same result as above. This means that both words sound the same.
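
For the curious, here is a rough Python sketch of what a Soundex function of this kind does. It is only an illustration and not MySQL's actual source code: the names soundex and _CODES are made up for this example, and the sketch assumes plain ASCII letters and the behavior that MySQL documents for SOUNDEX (vowels, H, W, and Y are dropped first, repeated codes are collapsed second, and short results are padded with zeros to four characters).

```python
# Letter groups of the classic Soundex table. Any letter that is not listed
# here (the vowels plus H, W, and Y) has no digit and is simply dropped.
_CODES = {
    letter: str(digit)
    for digit, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1)
    for letter in group
}


def soundex(word):
    """Return a Soundex-style code for an ASCII word (illustrative sketch)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]                  # the first letter is kept as-is
    last = _CODES.get(letters[0], "")    # its code still counts for duplicates
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != last:        # skip dropped letters and repeats
            result += code
            last = code
    return result.ljust(4, "0")          # pad short codes, e.g. "L6" -> "L600"


print(soundex("itoctopus"), soundex("itoctopous"))  # I32312 I32312
print(soundex("Lawyer"), soundex("Lauyer"))          # L600 L600
print(soundex("colour"), soundex("color"))           # C460 C460
print(soundex("Jon"), soundex("John"))               # J500 J500
```

Note that the first letter is always kept, so a typo in the first letter of a word (Kolor instead of Color, for example) produces a different code and will not be matched.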

So, how do you add this functionality to Joomla?

Unfortunately, while the concept is easy, putting it into practice is not as easy as one might think! Here are the steps that need to be done:

  • Create a plugin that will automatically create an index of all the words (and their SoundEx value) used in an article every time you create a new one.
  • Modify the search plugin (the content one, and/or the K2 one if you’re using K2) in the following way: each time someone enters a query, split that query into words and, for each word, look up the word in your index (which is a database table) that has the same SoundEx value. Finally, re-create the query out of the correctly spelled words from your index and apply the search.
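
The two steps above can be sketched in a few lines of Python. This is a simplified illustration and not the actual Joomla plugin code: it uses an in-memory SQLite database instead of MySQL, and the table name word_index, its columns, and the functions create_index, index_article, and rewrite_query are all hypothetical. It also assumes that, when several indexed words share a code, an exact match wins and otherwise the most frequently used word is picked, which is only one of several reasonable tie-breaking rules.

```python
import re
import sqlite3

_CODES = {
    letter: str(digit)
    for digit, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1)
    for letter in group
}


def soundex(word):
    """Soundex-style code (vowels dropped first, repeated codes second)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result, last = letters[0], _CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != last:
            result, last = result + code, code
    return result.ljust(4, "0")


def create_index(conn):
    """Create the (hypothetical) index table: one row per distinct word."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_index ("
        "word TEXT PRIMARY KEY, code TEXT NOT NULL, hits INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_code ON word_index (code)")


def index_article(conn, text):
    """Step 1: store every word of a newly saved article with its code."""
    for word in re.findall(r"[a-z]+", text.lower()):
        conn.execute(
            "INSERT OR IGNORE INTO word_index VALUES (?, ?, 0)",
            (word, soundex(word)),
        )
        conn.execute("UPDATE word_index SET hits = hits + 1 WHERE word = ?", (word,))


def rewrite_query(conn, query):
    """Step 2: replace each query word with an indexed word of the same code."""
    fixed = []
    for word in re.findall(r"[a-z]+", query.lower()):
        row = conn.execute(
            "SELECT word FROM word_index WHERE code = ? "
            "ORDER BY (word = ?) DESC, hits DESC, word LIMIT 1",
            (soundex(word), word),
        ).fetchone()
        fixed.append(row[0] if row else word)  # no match: keep the word as typed
    return " ".join(fixed)


conn = sqlite3.connect(":memory:")
create_index(conn)
index_article(conn, "A lawyer explains the color of title doctrine.")
print(rewrite_query(conn, "lauyer colour"))  # lawyer color
```

The rewritten query ("lawyer color" in this example) is what then gets passed to the regular search. Words with no match in the index are left as typed, so the search behaves as before for them.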

As you can see, the concept of handling spelling mistakes when searching in Joomla is simple but the implementation is definitely challenging. It took us nearly a week to devise, implement, and test the above solution – but it was worth it! Our client was definitely happy and we were thrilled because we felt that we made a critical functionality in Joomla even better!

But, will the above method slow down Joomla’s search functionality?

The short answer is yes. The not-so-short answer is that Joomla’s default search can always be optimized, and those optimizations can be used to make up for the slight performance hit caused by this fuzzy search. (Feel free to Google fuzzy search to know exactly what we mean. In fact, this whole post is about implementing fuzzy search in Joomla, but we figured that not many people out there are familiar with the technical term.)

If you want to implement the above but you don’t have the necessary resources/expertise/time to do it, then feel free to contact us! Our rates are favorable, our work is professional, we know our Joomla, and we are the friendliest Joomla developers on planet Earth!

One Response to “How to Handle Spelling Mistakes in Joomla’s Search”
  1. Comment by David Larpent — July 6, 2014 @ 12:35 pm

    Hi – this was interesting. I am attempting to improve a custom search function in a joomla custom component, to allow fuzzy searching on variously spelled location names. For example, “brazil” currently returns x results, “brasil” returns y results. We need to adapt our query so that either spelling returns x+y results.

    Our current search query is:

    // Filter by buyers destination
    $destination = $jinput->get('destination', '', 'string');
    if ($destination != '') {
        $destination = $db->Quote('%' . $db->escape($destination, true) . '%');
        $query->where('a.deliverydestination LIKE ' . $destination . ' AND a.status="Collecting Bids"');
        $query->order('a.created DESC');
    }

    Grateful for a sense of how you can help us.
