API
Module containing the base class for computing the mask that marks the parts of a document that must not be processed.
DegrotesqueMarker
The base class for computing the mask that marks the parts of a document that must not be processed.
get_extensions()
abstractmethod
Returns the extensions of file types that can be processed using this marker.
get_mask(document, to_skip=None)
abstractmethod
Returns a string in which all parts to exclude from replacements are denoted as '1' and all parts with plain content that shall be processed are denoted as '0'.
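To illustrate the contract of the two abstract methods, here is a minimal sketch of a concrete marker. It does not import the real library: `MarkerSketch` is a stand-in for `DegrotesqueMarker`, and `BacktickMarker` is a hypothetical subclass. The sketch assumes that the mask has one character per document character and that `get_extensions()` returns a list of strings; neither is confirmed by the text above. The meaning of `to_skip` is not described here, so the sketch accepts it and ignores it.

```python
from abc import ABC, abstractmethod


class MarkerSketch(ABC):
    """Stand-in for DegrotesqueMarker (assumption, not the library class)."""

    @abstractmethod
    def get_extensions(self):
        """Returns the extensions of file types this marker can process."""

    @abstractmethod
    def get_mask(self, document, to_skip=None):
        """Returns a '0'/'1' string; '1' marks parts to exclude."""


class BacktickMarker(MarkerSketch):
    """Hypothetical marker that excludes `inline code` spans."""

    def get_extensions(self):
        # Assumed return shape: a list of extension strings.
        return ["txt"]

    def get_mask(self, document, to_skip=None):
        # to_skip is ignored in this sketch; its semantics are not given.
        mask = ["0"] * len(document)
        start = document.find("`")
        while start != -1:
            end = document.find("`", start + 1)
            if end == -1:
                break  # unmatched backtick: leave the rest unmasked
            for i in range(start, end + 1):
                mask[i] = "1"  # exclude the span, including both backticks
            start = document.find("`", end + 1)
        return "".join(mask)
```

Calling `BacktickMarker().get_mask(text)` yields a mask that a caller can consult position by position before applying a replacement.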
apply_masks(document, mask)
Masks all URLs and ISSN / ISBN numbers, that is, sets the mask to '1' at the positions they occupy.
The method is assumed to be called after an initial mask has been computed.
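The following sketch shows one way such a post-processing step could work. It is not the library's implementation: `apply_masks_sketch` is a hypothetical free function, the three regular expressions are simplified assumptions about what counts as a URL, ISSN, or ISBN, and returning the updated mask is likewise assumed. Positions already set to '1' in the initial mask are left unchanged.

```python
import re

# Simplified patterns, assumed for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"']+")
ISSN_RE = re.compile(r"\bISSN[:\s]+\d{4}-\d{3}[\dXx]\b")
ISBN_RE = re.compile(r"\bISBN[:\s]+[\d-]{9,16}[\dXx]\b")


def apply_masks_sketch(document, mask):
    """Returns a copy of mask with '1' set over URLs and ISSN / ISBN numbers."""
    chars = list(mask)
    for pattern in (URL_RE, ISSN_RE, ISBN_RE):
        for match in pattern.finditer(document):
            for i in range(match.start(), match.end()):
                chars[i] = "1"  # only ever sets '1', never clears an entry
    return "".join(chars)


document = "See https://example.com now"
initial_mask = "0" * len(document)  # stands in for a get_mask() result
final_mask = apply_masks_sketch(document, initial_mask)
```

Because the function only ever writes '1', it can safely run after any marker-specific mask has been computed, which matches the call order described above.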