| 
eDoc PDF Data Extractor - Version 1.0

The purpose of eDoc Data Extractor is to extract text
from a searchable PDF in a batch process, and use this 
text to rename the file and optionally create a CSV 
file. The searchable PDFs can come from an application
or from the output of scanning/OCR programs.
Since it will mostly be used to process scanned files
with OCR content, and OCR is not perfect, the program
was designed to validate the captured data with rules.
It was also designed to be flexible about the area it
captures, because scanned, OCR'd files are not always
formatted exactly the same. In other words, one file
may have a value on line one and the next file may have
the same value on line two.
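
The idea of validating captured data with a rule can be pictured with a short sketch. This is only an illustration in Python, not the program's actual rule engine: the rule format (a regular expression), the 6-to-10-digit invoice pattern, and the names `INVOICE_RULE` and `validate_capture` are all assumptions made for the example.

```python
import re

# Hypothetical rule: an invoice number is 6 to 10 digits and nothing else.
INVOICE_RULE = re.compile(r"\d{6,10}")


def validate_capture(captured, rule=INVOICE_RULE):
    """Return True if the trimmed captured text matches the rule in full."""
    return rule.fullmatch(captured.strip()) is not None


if __name__ == "__main__":
    # A clean capture passes once surrounding whitespace is trimmed.
    print(validate_capture(" 00123456 "))
    # A typical OCR slip (letter O read in place of a zero) is rejected.
    print(validate_capture("0O123456"))
```

A capture that fails its rule could then be flagged for review instead of being used to rename the file.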
Since the line will most likely always contain a static
value such as "Invoice Number", that value can be used
to locate the line to parse. If the line does not
contain a static value, a line offset can be added to a
line that does have one. So it can be set to look for a
line that has "Invoice", add two lines, and capture the first 30  |