Monday, November 17, 2014

Visualizing Data in a Predictive Coding Project – Part Two

By Ralph Losey
This is part two of my presentation of an idea for visualization of data in a predictive coding project. Please read part one first.
As most of you already know, the ranking of all documents according to their probable relevance, or other criteria, is the purpose of predictive coding. The ranking allows accurate predictions to be made as to how the documents should be coded. In part one I shared the idea by providing a series of images of a typical document ranking process. I only included a few brief verbal descriptions. This week I will spell it out and further develop the idea. Next week I hope to end on a high note with random sampling and math.
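For readers who like to see an idea in code, here is a minimal sketch of what ranking by probable relevance means. It is only an illustration, not how any particular predictive coding software works. The `Document` class, the document IDs, the probability scores, and the 0.5 cutoff are all hypothetical values made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    probability: float  # hypothetical probable-relevance score, 0.0 to 1.0


def rank_documents(documents):
    """Return the documents sorted from most to least probably relevant."""
    return sorted(documents, key=lambda d: d.probability, reverse=True)


def predict_coding(document, cutoff=0.5):
    """Predict 'relevant' at or above the (assumed) cutoff, else 'irrelevant'."""
    return "relevant" if document.probability >= cutoff else "irrelevant"


# A made-up four-document collection with made-up scores.
collection = [
    Document("DOC-001", 0.12),
    Document("DOC-002", 0.91),
    Document("DOC-003", 0.50),
    Document("DOC-004", 0.37),
]

ranked = rank_documents(collection)
for doc in ranked:
    print(doc.doc_id, doc.probability, predict_coding(doc))
```

The point of the sketch is simply that once every document carries a score, sorting by that score produces the ranking, and a cutoff on the ranking produces the predicted coding.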
Vertical and Horizontal Axes of the Images
The visualizations presented here all represent a collection of documents. Each is supposed to be a pointillist image, with one point for each document. At the beginning of a document review project, before any predictive coding training has been applied to the collection, the documents are all unranked. They are relatively unknown. This is shown by the fuzzy round cloud of unknown data.