The last few weeks have seen us sidetracked from the series on the eDiscovery process by posts on other topics, and now it is time to return. This entire post is devoted to just one method of finding documents that have been uploaded to an eDiscovery database solution. Keyword searching is the most common method, but you may be wondering why, in the title, I have stated that this is an introduction for SA. The reason is simply that in other jurisdictions such as the USA and UK the market is more advanced and they tend not to rely upon keyword searching alone (or in some cases not at all!). SA is a fledgling eDiscovery market, so before my friends and contacts in the USA and UK think this is “old hat” they should remember that they were here once!
The use of keyword searching is an effective method of finding the documents that you want but it is also easy to get it completely wrong. My good friend Andy King in New Zealand supports this view in his excellent and practical recent blog post which is well worth reading.
Let us begin with what you may consider to be obvious, and you may recall from my very first blog that a UK lawyer that I know included “….finding the right search terms…” in her description of eDiscovery.
Choosing effective keywords is not as simple as preparing a list of words that the lawyer thinks will help him find documents in a collection that he would want to see. It needs more than careful thought and discussion with the client at the outset as to what the case is all about. You have to understand what you are looking for before you look for it! It also needs consideration with the service provider and, furthermore, this needs to be done at an early stage. If you do not do this, I can guarantee that you will have some false positives and you will probably miss documents that you really need to see. Experienced advisers will be able to tell you that some of your chosen words will not be as helpful as you may think and will advise on methods to obtain the best results. An adviser who is experienced not only with the technology but also with the litigation process will help to put together a syntax showing how variants of your keywords will assist. Furthermore, and this is crucial, do not think that your first selected list of keywords will be all that you need. The list will need to be revised and refined as the review progresses. Remember, it is not just about finding documents - the process MUST be defensible and proportionate.
I was once involved in a contractual dispute and one of the keywords on the list supplied by my law firm client was “disclaimer”. The first thing that a good service provider will tell you is that by itself this is never going to be an effective keyword. Disclaimer will probably appear in the footer of every single email in the collection and is also likely to appear in numerous contractual documents, to the point that there will be more occurrences of the word than there are documents! If documents containing the word disclaimer are relevant to you then you need refinements to filter and find just those documents. For example, a good solution will have the ability to look for the word in the content of the email only and ignore the headers and footers.
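To make the “disclaimer” point concrete, here is a minimal Python sketch of searching the content of an email while ignoring its footer. It is an illustration only and not how any particular eDiscovery product works: the sample emails, the function names and the rule that a footer begins at a line starting with “Disclaimer” or “Confidentiality notice” are all my own assumptions.

```python
import re

# Assumption for illustration: a boilerplate footer begins at the first line
# that starts with "Disclaimer" or "Confidentiality notice".
FOOTER_START = re.compile(r"^\s*(disclaimer|confidentiality notice)\b", re.IGNORECASE)


def body_without_footer(text: str) -> str:
    """Return the email text up to (not including) the assumed footer line."""
    kept = []
    for line in text.splitlines():
        if FOOTER_START.match(line):
            break
        kept.append(line)
    return "\n".join(kept)


def body_mentions(text: str, keyword: str) -> bool:
    """True if the keyword appears as a whole word in the body, footer excluded."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return bool(pattern.search(body_without_footer(text)))


# Hypothetical sample collection: both emails carry the same footer.
emails = {
    "email-1": "Hi John,\nPlease see the attached invoice.\n\n"
               "Disclaimer: This email is confidential.",
    "email-2": "Hi Jane,\nThe disclaimer in clause 12 of the contract needs review.\n\n"
               "Disclaimer: This email is confidential.",
}

hits = [doc_id for doc_id, text in emails.items() if body_mentions(text, "disclaimer")]
print(hits)
```

A plain search for “disclaimer” would return both emails; with the footer excluded, only the email that actually discusses a disclaimer is returned. Real footers vary far more than this one rule allows, which is one reason to talk to your service provider.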
Going back to how systems help, most database solutions have search features such as stemming, fuzzy search, wildcards, the use of operators or connectors, Boolean searching, date and number recognition (the latter being great for credit card numbers), noise words and more. I won’t explain all of these to you in this post (you can contact me if you wish to know more) but these are matters that your service provider should know about. As I said, you must produce your draft list of keywords or phrases and discuss it with your service provider. A good and experienced provider will advise on the effectiveness or otherwise of your list and will also probably want to know what you are actually looking for so that they can advise on the best methods and features of their software in order to achieve that.
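For readers who like to see how such features behave, the following is a rough Python sketch of four of them: stemming, wildcards, fuzzy search and number recognition, combined with a Boolean connector. Everything here is a simplified assumption of mine using only the Python standard library. The stemmer is deliberately crude, the 0.8 similarity threshold is arbitrary, and the sample documents are invented; commercial platforms use far more sophisticated indexing.

```python
import re
from difflib import SequenceMatcher
from fnmatch import fnmatchcase


def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def crude_stem(word):
    """Very crude stemmer (assumption): strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_match(text, term):
    """Stemming: 'breach' also finds breached, breaches, breaching."""
    return crude_stem(term.lower()) in {crude_stem(t) for t in tokens(text)}


def wildcard_match(text, pattern):
    """Wildcard: 'contract*' finds contract, contracts, contractual."""
    return any(fnmatchcase(t, pattern.lower()) for t in tokens(text))


def fuzzy_match(text, term, threshold=0.8):
    """Fuzzy search: tolerate misspellings by comparing word similarity."""
    term = term.lower()
    return any(SequenceMatcher(None, t, term).ratio() >= threshold for t in tokens(text))


def luhn_ok(digits):
    """Luhn checksum, the standard check-digit test for payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d = d * 2 - 9 if d * 2 > 9 else d * 2
        total += d
    return total % 10 == 0


# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def find_card_numbers(text):
    """Number recognition: return digit strings that pass the Luhn check."""
    candidates = (re.sub(r"\D", "", m.group()) for m in CARD_PATTERN.finditer(text))
    return [c for c in candidates if luhn_ok(c)]


# Invented sample documents; "witheld" is misspelt on purpose.
docs = {
    "doc-1": "The supplier breached the contract and payment was witheld.",
    "doc-2": "Contractual terms were agreed; no breach occurred.",
    "doc-3": "Card 4111 1111 1111 1111 was charged on 3 March.",
}

stem_hits = [d for d, t in docs.items() if stem_match(t, "breach")]
wildcard_hits = [d for d, t in docs.items() if wildcard_match(t, "contract*")]
fuzzy_hits = [d for d, t in docs.items() if fuzzy_match(t, "withheld")]
# Boolean connector: breach (stemmed) AND NOT withheld (fuzzy).
boolean_hits = [d for d, t in docs.items()
                if stem_match(t, "breach") and not fuzzy_match(t, "withheld")]
card_hits = {d: find_card_numbers(t) for d, t in docs.items() if find_card_numbers(t)}

print(stem_hits, wildcard_hits, fuzzy_hits, boolean_hits, card_hits)
```

Notice that an exact search for “withheld” would miss the first document because of the misspelling, and an exact search for “contract” would miss “contractual” in the second. These are exactly the kinds of variants that a single-word list overlooks.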
I don't want to bore you by going into all the detail of the different methods mentioned but I hope I have said enough to show you that there is a complexity and skill requirement here as very careful thought needs to go into successful keyword searching. It is vital that you discuss with your service provider how words can be joined etc. and if all that your service provider does is apply your list as it is without giving you advice, then you need to know why.
I mentioned, as does Andy King in his blog, that the first list of keywords should not be the last. As the matter proceeds and particularly as reviews progress, you must revisit your list and refine it by changing, deleting or adding words. Feed back to your service provider and obtain their advice. I doubt that a short list of single words, without the use of the methods I have referred to, would constitute a “reasonable search”.
At the beginning of this post I implied that keyword searching alone is no longer considered to be reliable. I take the view that it should be used in conjunction with other functionality and features within your database solution, such as clustering, concept searching, email threads, near duplicates, visualisation techniques and technology assisted review (predictive coding). I will discuss all of these in future posts but let us first of all digest keyword searching. Use it, it works, but use it properly, work closely with your service provider and always look to refine the list. I hope this has provided an insight for those who have never used this method of searching, whilst for those who have, I hope it has perhaps highlighted something they had not realised before.
Let me conclude by telling you about a seminar I attended in London a few years ago when a high-ranking Judge was a panellist and there were a number of litigation lawyers in the audience. The Judge looked at the audience and, in his stentorian tones, said, “If you don't know the key custodians, the key dates and the key words in your case, you are negligent”. Food for thought?