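E-discovery process (edrm.net)
1. Identify relevant data
2a. Preservation
2b. Collection
3a. Processing
3b. Evaluation
3c. Analysis
4. Conclusion
5. Acting on the conclusion
Throughout the process, the volume of data decreases while its relevance increases.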
Vocabulary: plaintiff (the party who brings the suit), defendant (the party being sued), deposition (testimony), conjugation (change of verb form)
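Document search effectiveness can be measured by recall (the share of the correct documents that are actually found) and precision (the number of target files retrieved divided by all files retrieved).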
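A minimal sketch of the two measures, assuming documents are identified by hypothetical ids and that the set of responsive documents is already known (in practice it usually has to be estimated by sampling). The function name `recall_precision` is made up for illustration.

```python
def recall_precision(retrieved, responsive):
    """Return (recall, precision) for a collection of retrieved document ids."""
    retrieved = set(retrieved)
    responsive = set(responsive)
    hits = retrieved & responsive  # responsive documents that were actually retrieved
    recall = len(hits) / len(responsive) if responsive else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical case: 4 responsive documents exist; the search returns 5, of which 3 are responsive.
recall, precision = recall_precision(
    retrieved={"d1", "d2", "d3", "d8", "d9"},
    responsive={"d1", "d2", "d3", "d4"},
)
print(recall, precision)  # 0.75 0.6
```

Here one responsive document (d4) is missed, which lowers recall, and two non-responsive documents (d8, d9) are returned, which lowers precision.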
Relevance ranking: the more often a keyword appears in a document, the higher that document's relevance; but the more documents of the corpus the term appears in, the less important the term is.
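The weighting described above is commonly implemented as TF-IDF. The sketch below is one possible version, not necessarily the formula used by any particular e-discovery tool: it assumes a naive whitespace tokenizer, term frequency normalized by document length, and a plain logarithmic inverse document frequency. The function name `tf_idf_rank` and the sample corpus are invented for illustration.

```python
import math
from collections import Counter


def tf_idf_rank(query_terms, docs):
    """Rank documents by the summed TF-IDF weight of the query terms.

    docs maps a document id to its text; tokenization is a naive lowercase split.
    """
    tokens = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokens)
    scores = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in query_terms:
            term = term.lower()
            # Number of documents in the corpus that contain the term
            df = sum(1 for w in tokens.values() if term in w)
            if df == 0 or not words:
                continue
            tf = counts[term] / len(words)  # frequent in this document -> higher weight
            idf = math.log(n_docs / df)     # common across the corpus -> lower weight
            score += tf * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical three-document corpus
docs = {
    "a": "contract breach contract damages",
    "b": "contract signed by both parties",
    "c": "meeting notes about lunch",
}
ranking = tf_idf_rank(["contract", "breach"], docs)
print([doc_id for doc_id, _ in ranking])  # ['a', 'b', 'c']
```

Document "a" ranks first because it repeats "contract" and also contains the rarer term "breach"; "contract" alone carries less weight because it appears in two of the three documents.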
Clustering: rule-based or sample-based
Search effectiveness can be measured by recall, the number of responsive documents retrieved divided by the total number of responsive documents, and precision, the number of responsive documents retrieved divided by the total number of documents retrieved.
A variety of technologies have been introduced to improve both the recall and the precision of keyword search. Boolean search, which allows the use of AND, OR, and NOT operators in queries, and proximity search, which finds documents that contain terms within a specified distance of each other, have been used to improve precision by reducing false positives. Stemming, wildcard, and fuzzy search, which find documents containing variations of the specified terms, such as differences in case, conjugation, and spelling, have been used to improve recall by finding variations of the search word that have the same or similar meaning.
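The search techniques named above can be sketched with the Python standard library. This is an illustrative toy, not how a production search engine works: every function name is hypothetical, matching is done on a simple lowercase token list, `fnmatch` stands in for wildcard search, and `difflib` similarity stands in for fuzzy search. Stemming is left out because it needs a dedicated stemmer.

```python
import difflib
import fnmatch
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (this also removes case differences)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_match(tokens, all_of=(), any_of=(), none_of=()):
    """Boolean search: AND over all_of, OR over any_of, NOT over none_of."""
    present = set(tokens)
    return (
        all(t in present for t in all_of)
        and (not any_of or any(t in present for t in any_of))
        and not any(t in present for t in none_of)
    )


def proximity_match(tokens, first, second, max_distance):
    """Proximity search: first and second occur within max_distance tokens of each other."""
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= max_distance for i in pos_first for j in pos_second)


def wildcard_match(tokens, pattern):
    """Wildcard search: tokens matching a shell-style pattern such as 'terminat*'."""
    return [t for t in tokens if fnmatch.fnmatchcase(t, pattern)]


def fuzzy_match(tokens, term, cutoff=0.8):
    """Fuzzy search: tokens whose similarity ratio to term is at least cutoff."""
    return difflib.get_close_matches(term, sorted(set(tokens)), n=5, cutoff=cutoff)


# Hypothetical document containing a misspelling ("agreemnt")
text = "The employee was terminated after the agreemnt was breached."
tokens = tokenize(text)

print(boolean_match(tokens, all_of=["employee", "breached"], none_of=["settlement"]))
print(proximity_match(tokens, "employee", "terminated", max_distance=3))
print(wildcard_match(tokens, "terminat*"))
print(fuzzy_match(tokens, "agreement"))
```

The Boolean and proximity checks narrow the result set, which is why they help precision; the wildcard and fuzzy checks widen it, catching "terminated" from a stem pattern and the misspelled "agreemnt" from the correctly spelled query, which is why they help recall.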