Docscan
Docscan is a lightweight document scanner. It opens a document and returns its contents as strings, which can then be searched with regular expressions.
Requirements:
- zipfile
- io
- re
- xml
Usage:
Note: fileName must include the directory the file is in. Example: Docscan(r"C:\Users\You\Desktop\folder1\test.pdf") (the r prefix makes it a raw string, so Python does not read the backslashes as escape sequences)
- Instantiate the class: variable = Docscan('fileName')
- Use print(variable.returnFileText())
- Use print(variable.executeRegex('regex here'))
- Use print(variable.executeHeaderRegex('regex here'))
- Use print(variable.executeFooterRegex('regex here'))
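The 'regex here' placeholder takes an ordinary Python regular expression. As a rough illustration, and assuming executeRegex behaves like re.findall over the extracted text (this README does not confirm that), the sketch below runs two patterns against a made-up string. The name sample_text and both patterns are hypothetical and are not part of Docscan.

```python
import re

# Hypothetical stand-in for the string that returnFileText() might return.
sample_text = "Invoice 2041 issued 2023-05-17. Contact: billing@example.com"

# Assumption: executeRegex works like re.findall and returns every match in a list.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", sample_text)
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", sample_text)

print(dates)
print(emails)
```

Writing patterns as raw strings (the r prefix) keeps backslashes such as \d from being interpreted by Python before the regex engine sees them.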
Methods:
- returnFileText() - Returns the text of a file.
- executeRegex(regexExpression) - Creates a list of all matches of regexExpression.
- executeHeaderRegex(regularExpression) - Creates a list of all matches of regularExpression in the header XML.
- executeFooterRegex(regularExpression) - Creates a list of all matches of regularExpression in the footer XML.
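The header and footer methods search XML, and the requirements list names zipfile, io, and re, which suggests that the document is treated as a zip archive of XML parts, as .docx files are. Docscan's own implementation is not shown here, so the following is only a guess at the general approach. The helper names (make_sample_docx, part_text, find_in_part) are hypothetical, and the sample archive is a simplified stand-in with a placeholder root element, not a complete file that Word could open. In real .docx files the body typically lives in word/document.xml, with headers and footers in parts such as word/header1.xml and word/footer1.xml.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by text runs (<w:t>) in .docx parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_sample_docx():
    """Build a tiny in-memory archive with .docx-style parts (illustration only)."""
    def part(text):
        # Simplified part: a placeholder root wrapping one paragraph, run, and text node.
        return f'<w:root xmlns:w="{W_NS}"><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:root>'

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", part("Order 1234 shipped"))
        archive.writestr("word/header1.xml", part("Acme Corp 2023"))
        archive.writestr("word/footer1.xml", part("Page 7"))
    return buffer.getvalue()


def part_text(docx_bytes, part_name):
    """Return the concatenated text of every <w:t> element in one archive part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read(part_name))
    return "".join(node.text or "" for node in root.iter(f"{{{W_NS}}}t"))


def find_in_part(docx_bytes, part_name, pattern):
    """List all matches of pattern in the text of the given part."""
    return re.findall(pattern, part_text(docx_bytes, part_name))


sample = make_sample_docx()
body_numbers = find_in_part(sample, "word/document.xml", r"\d+")
header_numbers = find_in_part(sample, "word/header1.xml", r"\d+")
footer_numbers = find_in_part(sample, "word/footer1.xml", r"\d+")
print(body_numbers, header_numbers, footer_numbers)
```

Keeping body, header, and footer as separate lookups mirrors the split between executeRegex, executeHeaderRegex, and executeFooterRegex, although the actual class may organize this differently.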