This repository has been archived by the owner. It is now read-only.

aidenybai / docscan Archived

👓 Scans documents and returns strings

Docscan

Docscan is a lightweight document scanner. It opens a document and returns the information inside as strings, either as the full text or as the list of matches for a regular expression.

Requirements:

zipfile
io
re
XML

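Usage:

Note: fileName must include the directory (pass the full path to the file).
Example: Docscan(r"C:\Users\You\Desktop\folder1\test.pdf") (the r prefix keeps Python from treating the backslashes as escape sequences)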

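Instantiate the class: variable = Docscan('fileName').
Use print(variable.returnFileText()) to print the text of the file.
Use print(variable.executeRegex('regex here')) to print the matches in the file.
Use print(variable.executeHeaderRegex('regex here')) to print the matches in the header.
Use print(variable.executeFooterRegex('regex here')) to print the matches in the footer.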

Methods:

returnFileText() - Returns the text of the file.
executeRegex(regexExpression) - Returns a list of all matches of regexExpression in the file.
executeHeaderRegex(regexExpression) - Returns a list of all matches of regexExpression in the header XML.
executeFooterRegex(regexExpression) - Returns a list of all matches of regexExpression in the footer XML.
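
The implementation of these methods is not shown here, so the following is only a minimal sketch of how they might behave for a .docx file, using the modules listed under Requirements. It assumes the standard .docx layout (a zip archive holding word/document.xml, word/header*.xml and word/footer*.xml). DocscanSketch and build_sample_docx are hypothetical names introduced for illustration, PDF input is not covered, and the real Docscan class may work differently.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET


class DocscanSketch:
    """Hypothetical stand-in for Docscan; handles .docx input only."""

    def __init__(self, source):
        # source may be a file path or a binary file-like object.
        with zipfile.ZipFile(source) as archive:
            self._parts = {
                name: archive.read(name)
                for name in archive.namelist()
                if name.startswith("word/") and name.endswith(".xml")
            }

    def returnFileText(self):
        # itertext() yields the character data of every element in document
        # order; paragraphs are joined without separators in this sketch.
        root = ET.fromstring(self._parts["word/document.xml"])
        return "".join(root.itertext())

    def executeRegex(self, regexExpression):
        return re.findall(regexExpression, self.returnFileText())

    def _searchParts(self, prefix, regexExpression):
        # Assumption: header/footer searches run against the raw XML,
        # tags included, of every part whose name starts with prefix.
        matches = []
        for name in sorted(self._parts):
            if name.startswith(prefix):
                xml_text = self._parts[name].decode("utf-8")
                matches.extend(re.findall(regexExpression, xml_text))
        return matches

    def executeHeaderRegex(self, regexExpression):
        return self._searchParts("word/header", regexExpression)

    def executeFooterRegex(self, regexExpression):
        return self._searchParts("word/footer", regexExpression)


def build_sample_docx():
    """Build a stripped-down, in-memory .docx-like archive for the demo."""
    ns = 'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'

    def paragraph(text):
        return f"<w:p><w:r><w:t>{text}</w:t></w:r></w:p>"

    body = paragraph("Invoice 1042 due 2021-03-01")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "word/document.xml",
            f"<w:document {ns}><w:body>{body}</w:body></w:document>",
        )
        archive.writestr(
            "word/header1.xml", f"<w:hdr {ns}>{paragraph('ACME Corp')}</w:hdr>"
        )
        archive.writestr(
            "word/footer1.xml", f"<w:ftr {ns}>{paragraph('Page 7')}</w:ftr>"
        )
    buffer.seek(0)
    return buffer


doc = DocscanSketch(build_sample_docx())
text = doc.returnFileText()
numbers = doc.executeRegex(r"\d+")
header = doc.executeHeaderRegex(r"ACME \w+")
footer = doc.executeFooterRegex(r"Page \d+")

print(text)     # Invoice 1042 due 2021-03-01
print(numbers)  # ['1042', '2021', '03', '01']
print(header)   # ['ACME Corp']
print(footer)   # ['Page 7']
```

Because the header and footer searches in this sketch run against raw XML, a pattern such as \d+ would also match digits inside tag attributes (for example the namespace URL), so patterns for those two methods are best anchored on surrounding text.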
