eDiscovery in SA - the importance of processing hard copy properly

I have been specifically asked by a number of people to write again on this topic for SA and I am not at all surprised. There is still a lot of paper in SA involved in litigation, regulatory and internal investigations, arbitrations and competition cases. I come across it all the time here, and I receive a lot of questions as well as hearing a lot of misconceptions!

Firstly, let me remind everyone that wherever possible we should not be printing electronic documents such as emails if they exist in their native format. However, in many cases that I have come across in SA there are still large volumes of “genuine” paper documents that need to be dealt with. Last year, I wrote a post on the subject - and it seems that this has helped to prompt people to enquire further. I am more than happy to answer, because it is important to handle paper documents properly and in accordance with eDiscovery guidelines and best practice.

One of the issues surrounding paper is that it is not as “sexy” as electronic data, where we speak of finding deleted documents, extracting metadata which tells us “Who knew What, When” or using all the clever analytical tools available. Paper is the “poor relation” if you like, but let me tell you that dealing with paper properly in a case is every bit as important as anything we do with electronic documents.

That being the case, let us look again at how paper documents are managed. The first thought in many lawyers’ minds when they are faced with files or boxes of paper is to photocopy them and start reviewing. That is fine if all you have is a few files, but what if you have several boxes? It costs time and money to copy, and then you still have to look at them all and try to remember where you saw an earlier document that relates to the one in front of you. Worse, you may split the copies between various members of the team for them to review, but how can they possibly see a relationship between a document they are reviewing and one which has been reviewed by someone else? So, the answer is not to copy but to scan - easy! No, not quite so easy. Scan, yes, but you still have to review, find relationships, search for similar documents, and group documents by date, custodian, type, issue etc.

Even the scanning is far from the easy process many people think, because for litigation (or other types of cases) I want it to be scanned properly. This is not a commercial scanning exercise. I want to know that the scanning company has experience of dealing with legal documents and can look out for important matters such as the occasional double-sided page, flimsy paper, poor quality images, bound documents and pages containing colour text. I want my scanning company to use good quality scanners and great scanning software that together will “add value” by, for example, “de-skewing” and “de-speckling” on the fly, and I want to know that there is a comprehensive paper-to-image QC process. I want to know that they will do whatever they can to enhance a poor image. All of the above is common practice for an eDiscovery service provider that handles paper, but it is not common practice for a commercial scanning company, which works in a very different way.

I mentioned colour text and poor images earlier. Often I see scanning companies simply scanning in colour whenever they see colour, or using greyscale to scan documents where an image is unclear. Those decisions may not be right, and they have an impact that the commercial scanning company would not have considered. The file size of a colour or greyscale image is much larger than that of a mono image, and if there are thousands of them the overall size of the collection grows accordingly. That is important because, in most cases, hosting platforms charge per GB. There is also the slight problem that, depending upon the reviewing firm’s in-house IT systems, these images may take slightly longer to open, thereby slowing the review.

I always advise that a document should only be scanned in colour where the colour affects the integrity of the document. For example, a document containing a graph with a colour-coded key must be scanned in colour to identify the codes! However, a letter where the only colour is the sending company’s logo does not need to be scanned in colour. As to greyscale, this can often be avoided by enhancing the black and white image using different contrast settings on the scanner.
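
To put rough numbers on the point about image size, here is a small sketch in Python. It is only an illustration under my own assumptions, none of which come from a real project: A4 pages scanned at 300 dpi, no compression at all, and a hosting rate that you supply yourself. Real scanned files are compressed, so the actual sizes will be smaller and the ratios will differ, but the direction of the effect is the same.

```python
# Back-of-envelope estimate of collection size by scan mode.
# Assumptions (illustrative only): A4 at 300 dpi is 2480 x 3508 pixels,
# images are uncompressed, and the hosting rate is whatever you pass in.

A4_300DPI_PIXELS = 2480 * 3508

# Bits stored per pixel for each scan mode.
BITS_PER_PIXEL = {"mono": 1, "greyscale": 8, "colour": 24}


def collection_gb(pages, mode):
    """Uncompressed size of `pages` scanned pages, in decimal gigabytes."""
    total_bytes = pages * A4_300DPI_PIXELS * BITS_PER_PIXEL[mode] / 8
    return total_bytes / 1e9


def hosting_cost(pages, mode, rate_per_gb):
    """Hosting charge for the collection at a hypothetical per-GB rate."""
    return collection_gb(pages, mode) * rate_per_gb


if __name__ == "__main__":
    for mode in BITS_PER_PIXEL:
        print(f"{mode:10s} {collection_gb(10_000, mode):8.2f} GB")
```

On these assumptions a greyscale page is 8 times the size of a mono page and a colour page is 24 times the size, which is why a blanket decision to scan in colour can matter once the hosting bill arrives.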

Then of course, as I mentioned in my earlier post, I want the best possible OCR software to be used to give the very best chance of making the text searchable. Not all text will benefit from OCR: handwriting and very poor quality or illegible text, for example, will not convert reliably.

Once scanned, and with OCR software applied, how are the documents going to be reviewed by lawyers? For example, if the documents are contained in binders, it would be normal practice to supply a PDF of each binder. But given that each binder may well contain 350-400 pages, that still leaves the reviewer with the task of scrolling through page by page on their computer. Proprietary software such as Adobe could be used to facilitate text searching, but there is still the problem of “linking” or sorting documents. Furthermore, if there is more than one reviewer they will not be able to collaborate, because they are not using a collaborative database solution. The point I am making here is that, paper or not, the material still needs an eDiscovery solution for reviewing purposes.

Then, of course, we come to the most important part of processing paper documents - unitisation and coding. I dealt with these processes in depth in my post - and therefore I will not repeat myself but merely add to those comments. Without unitisation and coding the task of review is so much harder and the risk of missing relevant documents that much greater. These two processes identify all of the individual documents within a scanned collection and then give us the ability to search, not just across text but across the important properties of each document. This is the “paper documents” equivalent of metadata.
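
For readers who like to see the idea in concrete form, here is a minimal Python sketch of what unitisation and coding produce. It is my own illustration and not any provider's actual format: the field names (doc_date, doc_type, author, custodian) and the helper functions are hypothetical. Unitisation turns a run of scanned pages into individual documents with page ranges, and coding attaches searchable properties to each one.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CodedDocument:
    """One unitised document plus its coded properties (hypothetical fields)."""
    doc_id: str
    first_page: int  # position within the scanned page run
    last_page: int
    doc_date: date
    doc_type: str
    author: str
    custodian: str


def unitise(first_pages, total_pages):
    """Turn the page numbers where documents start into (first, last) ranges."""
    starts = sorted(first_pages)
    ends = [s - 1 for s in starts[1:]] + [total_pages]
    return list(zip(starts, ends))


def search(docs, **criteria):
    """Return documents whose coded fields match every given criterion."""
    return [d for d in docs
            if all(getattr(d, k) == v for k, v in criteria.items())]


# A 12-page scanned binder in which documents start at pages 1, 4 and 9.
ranges = unitise([1, 4, 9], 12)  # [(1, 3), (4, 8), (9, 12)]
docs = [
    CodedDocument("DOC001", *ranges[0], date(2012, 3, 1), "Letter", "A Smith", "Finance"),
    CodedDocument("DOC002", *ranges[1], date(2012, 3, 5), "Contract", "B Jones", "Legal"),
    CodedDocument("DOC003", *ranges[2], date(2012, 4, 2), "Letter", "B Jones", "Finance"),
]
letters_from_finance = search(docs, doc_type="Letter", custodian="Finance")
```

Once every document carries properties like these, sorting by date or filtering by custodian and type becomes a simple query in the review platform, which is something a single 400-page PDF cannot offer.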

All of the above is the correct way to deal with paper in the types of matters we are talking about. In my time here I have already witnessed problems of various kinds because these processes have not been followed. I have documented methodologies for many of these processes, which I insist are followed. As far as scanning is concerned, there are very few providers in SA who understand the process properly and are able to deliver in the manner I describe. All I can say on this subject is think very carefully before you instruct a scanning company - even better if you involve me to ensure that it is done properly. The processes of unitisation and coding are more difficult here, as there are few providers who do enough of this work to be really effective and competent. In the UK, I did more of this type of work than almost any other service provider for many years. I am going to break my unwritten rule by specifically mentioning a provider that I have used for 15 years for unitisation and coding. The company is Cenza Technologies in Chennai, India.
Be under no misapprehension about unitisation and coding - it is absolutely essential and needs to be done correctly by experienced people. As I say, I have worked with Cenza for 15 years and am delighted still to be working with them now that I have moved to SA. Without doubt they are the best company for this work that I know. I have visited them at their facility, which is very impressive, and together we have handled some of the world’s biggest cases. I do not just recommend Cenza, I would not work with anyone else. That is how important this work is and how seriously I take it. Do not hesitate to contact me for further information about the processes or about Cenza.

In summary, think very carefully about how to deal with paper documents in litigation, investigations etc. and do not underestimate its importance. Few people understand it properly and thoroughly, and I spend a large amount of my time dealing with queries and projects concerning paper collections. Make contact with me before you embark on a paper exercise!