Apryse SDK

Returns:

The appearance stream for this annotation, or NULL if the annotation does not have an appearance for the given combination of annotation and appearance states.

GetBorderStyle()[source]

Gets the border style for the annotation. Typically used for Link annotations.

Return type:: BorderStyle
Returns:: Annotation’s border style.

GetColorAsCMYK()[source]

Returns the annotation’s color in CMYK color space.

Return type:

ColorPt

Returns:

A ColorPt object containing an array of four numbers in the range 0.0 to 1.0, representing a CMYK color used for the following purposes:

The background of the annotation’s icon when closed

The title bar of the annotation’s pop-up window

The border of a link annotation

If the annotation does not specify an explicit color, a default color is returned. Text annotations return ‘default yellow’; all others return black.

GetColorAsGray()[source]

Returns the annotation’s color in Gray color space.

Return type:: ColorPt
Returns:: A ColorPt object containing a number in the range 0.0 to 1.0, representing a Gray Scale color used for the following purposes:

The background of the annotation’s icon when closed
The title bar of the annotation’s pop-up window
The border of a link annotation

If the annotation does not specify an explicit color, black color is returned.

GetColorAsRGB()[source]

Gets an annotation’s color in RGB color space.

Return type:

ColorPt

Returns:

A ColorPt object containing an array of three numbers in the range 0.0 to 1.0, representing an RGB color used for the following purposes:

The background of the annotation’s icon when closed
The title bar of the annotation’s pop-up window
The border of a link annotation

If the annotation does not specify an explicit color, a default color is returned. Text annotations return ‘default yellow’; all others return black.

GetColorCompNum()[source]

Returns the number of color components, which identifies the color space the annotation’s color is represented in.

Return type:: int
Returns:: An integer that is either 1 (DeviceGray), 3 (DeviceRGB), or 4 (DeviceCMYK). If the annotation has no color (that is, it is transparent), 0 is returned.
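
The component count can be used to pick the matching GetColorAs* getter. The following is a minimal sketch, not part of the SDK: `read_color` and the two lookup tables are hypothetical helpers, and `annot` is assumed to be any object exposing the methods documented above.

```python
# Component counts documented for GetColorCompNum().
COLOR_SPACE_BY_COMPONENTS = {
    0: "transparent (no color)",
    1: "DeviceGray",
    3: "DeviceRGB",
    4: "DeviceCMYK",
}

# Getter documented for each non-zero component count.
GETTER_BY_COMPONENTS = {
    1: "GetColorAsGray",
    3: "GetColorAsRGB",
    4: "GetColorAsCMYK",
}


def color_space_name(comp_num):
    """Translate a component count into a color space name."""
    if comp_num not in COLOR_SPACE_BY_COMPONENTS:
        raise ValueError("unexpected component count: %r" % (comp_num,))
    return COLOR_SPACE_BY_COMPONENTS[comp_num]


def read_color(annot):
    """Return the annotation color in its own color space, or None if transparent."""
    comp_num = annot.GetColorCompNum()
    if comp_num == 0:
        return None
    getter_name = GETTER_BY_COMPONENTS.get(comp_num)
    if getter_name is None:
        raise ValueError("unexpected component count: %r" % (comp_num,))
    return getattr(annot, getter_name)()
```

Checking for 0 first avoids reading a default color from an annotation that has no color at all.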

GetContents()[source]

Extract the content of this annotation. (Optional).

Return type:: string
Returns:: A Unicode string object with the text that is associated with this annotation. This is the text that the annotation displays on user interaction, if the annotation type supports it.

GetCustomData(key)[source]

Returns custom data associated with the given key.

Parameters:: key (string) – The key for which to retrieve the associated data.
Return type:: string
Returns:: the custom data string. If no data is available an empty string is returned.
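
Because a missing key yields an empty string rather than an error, callers that need a fallback have to test for it explicitly. A minimal sketch, assuming `annot` exposes GetCustomData() as documented; `get_custom_data_or` is a hypothetical helper name.

```python
def get_custom_data_or(annot, key, default):
    """Return the custom data stored under key, or default when none is available."""
    value = annot.GetCustomData(key)
    # An empty string is the documented "no data" result.
    if value == "":
        return default
    return value
```

With this approach a key that was deliberately stored with an empty value is indistinguishable from a key that was never set.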

GetDate()[source]

Gets an annotation’s last modified date.

Return type:: Date
Returns:: The annotation’s last modified time and date. If the annotation has no associated date structure, the returned date is not valid (date.IsValid() returns false). Corresponds to the “M” entry of the annotation dictionary.

GetFlag(flag)[source]

Return type:: boolean
Returns:: The value of the given flag.
Parameters:: flag (int) – The Flag property to query.

GetHandleInternal()[source]

GetOptionalContent()[source]

Returns optional content associated with this annotation.

Return type:: Obj
Returns:: An SDF object corresponding to the group of optional properties.

Notes: The return value is an Optional Content Group (OCG) or Optional Content Membership Dictionary (PDF::OCG::OCMD) specifying the optional content properties for the annotation. Before the annotation is drawn, its visibility shall be determined based on this entry as well as the annotation flags specified in the Flag entry. If it is determined to be invisible, the annotation shall be skipped, as if it were not in the document.

GetPage()[source]

Gets the page the annotation is associated with.

Return type:: Page
Returns:: A Page object or null page object if the page reference is not available. The page object returned is an indirect reference to the page object with which this annotation is associated. This entry shall be present in screen annotations associated with rendition actions.

Optional. PDF 1.3 PDF 1.4 PDF 1.5 not used in FDF files.

GetRect()[source]

Return type:: Rect
Returns:: Annotation’s bounding rectangle, specified in user space coordinates.

The meaning of the rectangle depends on the annotation type. For Link and RubberStamp annotations, the rectangle specifies the area containing the hyperlink area or stamp. For Note annotations, the rectangle describes the pop-up window when it is open; when it is closed, the icon is positioned at the lower-left corner.

GetRotation()[source]

Returns the rotation value of the annotation. The Rotation specifies the number of degrees by which the annotation shall be rotated counterclockwise relative to the page. The value shall be a multiple of 90.

Return type:: int
Returns:: An integer representing the rotation value of the annotation.

Notes: This property is part of the appearance characteristics dictionary, which shall be used in constructing a dynamic appearance stream specifying the annotation’s visual presentation on the page.

GetSDFObj()[source]

Return type:: Obj
Returns:: The underlying SDF/Cos object.

GetStructParent()[source]

Returns the struct parent of an annotation. (Required if the annotation is a structural content item; PDF 1.3)

Return type:: int
Returns:: An integer which is the integer key of the annotation’s entry in the structural parent tree.

Notes: The StructParent is the integer key of the annotation’s entry in the structural parent tree.

GetTriggerAction(trigger)[source]

Get the Action associated with the selected Annot Trigger event.

Parameters:: trigger (int) – the type of trigger event to get
Return type:: Obj
Returns:: the Action Obj if present, otherwise NULL

GetType()[source]

Return type:: int
Returns:: The type of this annotation. Corresponds to the “Subtype” entry of annotation dictionary, as per PDF Reference Manual section 12.5.2
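
The integer returned by GetType() can be translated back into a readable name for logging. The sketch below copies the type constants listed later in this class reference into a plain list indexed by value; `annot_type_name` is a hypothetical helper, not an SDK function.

```python
# Annotation type constants from this reference, indexed by their integer value.
ANNOT_TYPE_NAMES = [
    "e_Text", "e_Link", "e_FreeText", "e_Line", "e_Square",
    "e_Circle", "e_Polygon", "e_Polyline", "e_Highlight", "e_Underline",
    "e_Squiggly", "e_StrikeOut", "e_Stamp", "e_Caret", "e_Ink",
    "e_Popup", "e_FileAttachment", "e_Sound", "e_Movie", "e_Widget",
    "e_Screen", "e_PrinterMark", "e_TrapNet", "e_Watermark", "e_3D",
    "e_Redact", "e_Projection", "e_RichMedia", "e_Unknown",
]


def annot_type_name(type_value):
    """Map a GetType() result to its constant name; fall back to e_Unknown."""
    if 0 <= type_value < len(ANNOT_TYPE_NAMES):
        return ANNOT_TYPE_NAMES[type_value]
    return "e_Unknown"
```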

GetUniqueID()[source]

Return type:: Obj
Returns:: The unique identifier for this annotation, or NULL if the identifier is not available. The returned value is a String object and is the value of the “NM” field, which was added as an optional attribute in PDF 1.4.

GetVisibleContentBox()[source]

It is possible during viewing that GetRect does not return the most accurate bounding box of what is actually rendered. This method calculates the bounding box, rather than relying on what is specified in the PDF document. The bounding box is defined as the smallest rectangle that includes all the visible content on the annotation.

Return type:: Rect
Returns:: the bounding box for this annotation. The dimensions are specified in user space coordinates.

IsMarkup()[source]

Return true if this annotation is classified as a markup annotation.

Return type:: boolean
Returns:: boolean value, true if this annotation is classified as a markup annotation.

IsValid()[source]

Return type:: boolean
Returns:: True if this is a valid (non-null) annotation, false otherwise. If the function returns false the underlying SDF/Cos object is null or is not valid and the annotation object should be treated as a null object.

RefreshAppearance(args)[source]

Overload 1:

Regenerates the appearance stream for the annotation. This method can be used to auto-generate the annotation appearance after creating or modifying the annotation without providing an explicit appearance or setting the “NeedAppearances” flag in the AcroForm dictionary.

Notes: If this annotation contains text and has been added to a rotated page, the text in the annotation may be rotated. If RefreshAppearance is called after the annotation is added to a rotated page, any text will be rotated in the opposite direction of the page rotation. If this method is called before the annotation is added to any rotated page, no counter-rotation will be applied. If you wish to call RefreshAppearance on an annotation already added to a rotated page but do not want the text to be rotated, you can do one of the following: temporarily un-rotate the page, or temporarily remove the “Rotate” object from the annotation. To support users adding text annotations while using a PDF viewer, you can also add any viewer rotation to the annotation’s Rotate object, so that text is always rotated correctly from the user’s perspective.

Overload 2:

A version of RefreshAppearance allowing custom options to make slight tweaks in behaviour.

Parameters:: options (RefreshOptions) – The RefreshOptions.

RemoveAppearance(args)[source]

Removes the annotation’s appearance for the given combination of annotation and appearance states.

Parameters:

annot_state (int, optional) – the annotation’s appearance state, which selects the applicable appearance stream from the appearance sub-dictionary. An annotation can define as many as three separate appearances: The normal, rollover, and down appearance.
app_state (string, optional) – an optional parameter specifying the appearance state (e.g. “Off”, “On”, etc.) under which the new appearance should be stored. If app_state is NULL, the annotation will have only one annotation state.

Resize(newrect)[source]

Scales the geometry of the annotation so that its appearance would now fit a new rectangle on the page, in user units. Users still have to call RefreshAppearance() later if they want a corresponding appearance stream to be generated for the new rectangle. The main reason for not combining the two operations together is to be able to resize annotations that do not have an appearance stream.

Parameters:: newrect (Rect) – A reference to the new rectangle to which this annotation has to be resized.
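
Since Resize() does not regenerate the appearance stream, the two calls are typically paired. A minimal sketch of that pairing, assuming `annot` exposes Resize() and RefreshAppearance() as documented here; `resize_and_refresh` is a hypothetical helper.

```python
def resize_and_refresh(annot, new_rect, refresh=True):
    """Resize an annotation and, optionally, regenerate its appearance stream."""
    annot.Resize(new_rect)
    if refresh:
        # Only needed when an appearance stream for the new rectangle is wanted.
        annot.RefreshAppearance()
```

Passing `refresh=False` keeps the behaviour described above for annotations that have no appearance stream.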

SetActiveAppearanceState(astate)[source]

Sets the annotation’s active appearance state. (Required if the appearance dictionary AP contains one or more subdictionaries; PDF 1.2)

Parameters:: astate (string) – Character string representing the name of the active appearance state. The string used to select the annotation’s appearance state, which selects the applicable appearance stream from an appearance subdictionary.

SetAppearance(args)[source]

Sets the annotation’s appearance for the given combination of annotation and appearance states. (Optional; PDF 1.2)

Parameters:

app_stream (Obj) – a content stream defining the new appearance.
annot_state (int, optional) – the annotation’s appearance state, which selects the applicable appearance stream from the appearance sub-dictionary. An annotation can define as many as three separate appearances: The normal, rollover, and down appearance.
app_state (string, optional) – an optional parameter specifying the appearance state (e.g. “Off”, “On”, etc.) under which the new appearance should be stored. If app_state is NULL, the annotation will have only one annotation state.

SetBorderStyle(bs, oldStyleOnly=False)[source]

Sets the border style for the annotation.

Parameters:

bs (BorderStyle) – New border style for this annotation.
oldStyleOnly (boolean, optional) – The PDF manual specifies two ways to add border information to an annotation object: an array named ‘Border’ (PDF 1.0) or a dictionary called ‘BS’ (PDF 1.2), with the latter taking precedence over the former. However, a border with rounded corners can only be created using the PDF 1.0 Border specification; in that case, set oldStyleOnly to true when calling SetBorderStyle(). This parameter defaults to false and does not need to be used otherwise.

SetColor(col, numcomp=3)[source]

Sets an annotation’s color. (Optional; PDF 1.1)

Parameters:

col (ColorPt) –

A ColorPt object in RGB or Gray or CMYK color space representing the annotation’s color. The ColorPt contains an array of numbers in the range 0.0 to 1.0, representing a color used for the following purposes:

The background of the annotation’s icon when closed
The title bar of the annotation’s pop-up window
The border of a link annotation

The number of array elements determines the color space in which the color shall be defined:

0 No color; transparent
1 DeviceGray
3 DeviceRGB
4 DeviceCMYK

Parameters:

numcomp (int, optional) –

The number of color components used to represent the color (i.e. 1, 3, 4).
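
Before building a ColorPt, it can help to validate the component values and derive numcomp from them. The sketch below is plain Python and does not call the SDK; `check_color_components` is a hypothetical helper.

```python
def check_color_components(components):
    """Validate color components and return the matching numcomp value.

    One component means DeviceGray, three DeviceRGB, four DeviceCMYK.
    """
    values = [float(c) for c in components]
    if len(values) not in (1, 3, 4):
        raise ValueError("expected 1, 3 or 4 components, got %d" % len(values))
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("component %r is outside the range 0.0 to 1.0" % value)
    return len(values)
```

The returned count is the value one would pass as numcomp alongside a ColorPt built from the same components.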

SetContents(contents)[source]

Sets the content of this annotation. (Optional).

Parameters:: contents (string) – A reference to a Unicode string object with the text that will be associated with this annotation. This is the text that the annotation displays on user interaction, if the annotation type supports it.

SetCustomData(key, value)[source]

Sets the custom data associated with the specified key.

Parameters:

key (string) – The key under which to store this custom data
value (string) – The custom data string to store

SetDate(date)[source]

Sets an annotation’s last modified date.

Parameters:: date (Date) – The annotation’s last modified time and date. Corresponds to the “M” entry of the annotation dictionary.

SetFlag(flag, value)[source]

Sets the value of given Flag.

Parameters:

flag (int) – The Flag property to modify.
value (boolean) – The new value for the property.
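
The flag constants listed in this class reference (e_invisible = 0 through e_locked_contents = 9) follow the order of the annotation flag bits in the PDF specification. The sketch below decodes a raw integer flags value under the assumption that each constant is the zero-based bit index of the corresponding flag; that mapping is an assumption of this example, and `decode_flags` is a hypothetical helper rather than an SDK call.

```python
# Flag constants from this reference, assumed to be zero-based bit indices.
ANNOT_FLAGS = {
    "e_invisible": 0,
    "e_hidden": 1,
    "e_print": 2,
    "e_no_zoom": 3,
    "e_no_rotate": 4,
    "e_no_view": 5,
    "e_read_only": 6,
    "e_locked": 7,
    "e_toggle_no_view": 8,
    "e_locked_contents": 9,
}


def flag_mask(flag_index):
    """Return the bit mask for a flag index."""
    return 1 << flag_index


def decode_flags(flags_value):
    """List the names of all flags set in a raw integer flags value."""
    return [name for name, index in ANNOT_FLAGS.items()
            if flags_value & flag_mask(index)]
```

With the SDK itself, GetFlag(flag) and SetFlag(flag, value) take the constant directly, so this decoding is only needed when working with the raw integer.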

SetOptionalContent(content)[source]

Associates optional content with this annotation. (Optional, PDF1.5).

Parameters:: content (Obj) – A pointer to an SDF object corresponding to the optional content, a PDF::OCG::Group or membership dictionary specifying the PDF::OCG::Group properties for the annotation. Before the annotation is drawn, its visibility shall be determined based on this entry as well as the annotation flags specified in the Flag entry. If it is determined to be invisible, the annotation shall be skipped, as if it were not in the document.

SetPage(page)[source]

Sets the reference to a page the annotation is associated with. (Optional PDF 1.3; not used in FDF files)

Parameters:: page (Page) – The page object user wants the annotation to be associated with.

Notes: The parameter should be an indirect reference to the page object with which this annotation is associated. This entry shall be present in screen annotations associated with rendition actions.

SetRect(pos)[source]

Sets the size and location of an annotation on its page.

Parameters:: pos (Rect) – Annotation’s bounding rectangle, specified in user space coordinates.

The meaning of the rectangle depends on the annotation type. For Link and RubberStamp annotations, the rectangle specifies the area containing the hyperlink area or stamp. For Note annotations, the rectangle describes the pop-up window when it is open; when it is closed, the icon is positioned at the lower-left corner.

SetRotation(angle)[source]

Sets the rotation value of the annotation. The Rotation specifies the number of degrees by which the annotation shall be rotated counterclockwise relative to the page. The value shall be a multiple of 90. (Optional)

Parameters:: angle (int) – An integer representing the rotation value of the annotation.

Notes: This property is part of the appearance characteristics dictionary, which shall be used in constructing a dynamic appearance stream specifying the annotation’s visual presentation on the page.
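
Because the rotation must be a multiple of 90, it can be worth checking the angle before passing it to SetRotation(). In the sketch below, `normalize_rotation` is a hypothetical helper; reducing the angle into the range 0 to 270 is a choice made for this example, not something the reference requires.

```python
def normalize_rotation(angle):
    """Validate a rotation angle and reduce it to 0, 90, 180 or 270."""
    if angle % 90 != 0:
        raise ValueError("rotation must be a multiple of 90, got %r" % (angle,))
    # Python's % always returns a non-negative result for a positive modulus.
    return angle % 360
```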

SetStructParent(parkeyval)[source]

Sets the struct parent of an annotation. (Required if the annotation is a structural content item; PDF 1.3)

Parameters:: parkeyval (int) – An integer which is the integer key of the annotation’s entry in the structural parent tree.

Notes: The StructParent is the integer key of the annotation’s entry in the structural parent tree.

SetUniqueID(id, id_buf_sz=0)[source]

Sets the unique identifier for this annotation.

Parameters:

id (string) – A buffer containing a unique identifier for this annotation.
id_buf_sz (int, optional) – The size of ‘id’ buffer, or 0 if the string is NULL terminated.

Notes: It is necessary to ensure that the unique ID generated is actually unique.

e_3D = 24: 3D annotation

e_Caret = 13: Caret annotation

e_Circle = 5: Circle annotation

e_FileAttachment = 16: File attachment annotation

e_FreeText = 2: Free text annotation

e_Highlight = 8: Highlight annotation

e_Ink = 14: Ink annotation

e_Line = 3: Line annotation

e_Link = 1: Link annotation

e_Movie = 18: Movie annotation

e_Polygon = 6: Polygon annotation

e_Polyline = 7: Polyline annotation

e_Popup = 15: Pop-up annotation

e_PrinterMark = 21: Printer’s mark annotation

e_Projection = 26: Projection annotation, Adobe supplement to ISO 32000

e_Redact = 25: Redact annotation

e_RichMedia = 27: Rich Media annotation, Adobe supplement to ISO 32000

e_Screen = 20: Screen annotation

e_Sound = 17: Sound annotation

e_Square = 4: Square annotation

e_Squiggly = 10: Squiggly-underline annotation

e_Stamp = 12: Rubber stamp annotation

e_StrikeOut = 11: Strikeout annotation

e_Text = 0: Text annotation

e_TrapNet = 22: Trap network annotation

e_Underline = 9: Underline annotation

e_Unknown = 28: Any other annotation type, not listed in PDF spec and unrecognized by PDFTron software

e_Watermark = 23: Watermark annotation

e_Widget = 19: Widget annotation

e_action_trigger_activate = 0

e_action_trigger_annot_blur = 6

e_action_trigger_annot_down = 3

e_action_trigger_annot_enter = 1

e_action_trigger_annot_exit = 2

e_action_trigger_annot_focus = 5

e_action_trigger_annot_page_close = 8

e_action_trigger_annot_page_invisible = 10

e_action_trigger_annot_page_open = 7

e_action_trigger_annot_page_visible = 9

e_action_trigger_annot_up = 4

e_down = 2

e_hidden = 1

e_invisible = 0

e_locked = 7

e_locked_contents = 9

e_no_rotate = 4

e_no_view = 5

e_no_zoom = 3

e_normal = 0

e_print = 2

e_read_only = 6

e_rollover = 1

e_toggle_no_view = 8

property mp_annot

property thisown: The membership flag

class apryse_sdk.Appearance[source]

Bases: object

Class used to customize the appearance of the optional redaction overlay.

property Border: Border specifies if the overlay will be surrounded by a border.

property HorizTextAlignment: Specifies the horizontal text alignment in the overlay: align<0 -> text will be left aligned. align==0 -> text will be center aligned. align>0 -> text will be right aligned.

property MaxFontSize

property MinFontSize: Specifies the minimum and maximum font size used to represent the text in the overlay.

property NegativeOverlayColor: NegativeOverlayColor defines the overlay background color in RGB color space for negative redactions.

property PositiveOverlayColor: PositiveOverlayColor defines the overlay background color in RGB color space for positive redactions.

property RedactedContentColor: Specifies the color used to paint the regions where content was removed. Only useful when ShowRedactedContentRegions == true. Default value is Gray color.

property RedactionOverlay: If RedactionOverlay is set to true, Redactor will draw an overlay covering all redacted regions. The rest of properties in the Appearance class defines visual properties of the overlay. If false the overlay region will not be drawn.

property ShowRedactedContentRegions: Specifies whether an overlay should be drawn in place of the redacted content. This option can be used to indicate the areas where the content was removed from without revealing the content itself. Default value is False. Notes: The overlay region used RedactedContentColor as a fill color.

property TextColor: Specifies the color used to paint the text in the overlay (in RGB).

property TextFont: Specifies the font used to represent the text in the overlay.

property UseOverlayText: Specifies if the text (e.g. “Redacted” etc.) should be placed on top of the overlay. The remaining properties relate to the positioning, and styling of the overlay text.

property VertTextAlignment: Specifies the vertical text alignment in the overlay: align<0 -> text will be top aligned. align==0 -> text will be center aligned. align>0 -> text will be bottom aligned.
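
The sign convention shared by HorizTextAlignment and VertTextAlignment can be summarized in a few lines. This is an illustrative sketch only; both function names are hypothetical.

```python
def horizontal_alignment(align):
    """Describe a HorizTextAlignment value: negative, zero or positive."""
    if align < 0:
        return "left"
    if align == 0:
        return "center"
    return "right"


def vertical_alignment(align):
    """Describe a VertTextAlignment value: negative, zero or positive."""
    if align < 0:
        return "top"
    if align == 0:
        return "center"
    return "bottom"
```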

property thisown: The membership flag

class apryse_sdk.AttrObj(args)[source]

Bases: object

An application or plug-in extension that processes logical structure can attach additional information, called attributes, to any structure element. The attribute information is held in one or more attribute objects associated with the structure element. An attribute object is a dictionary or stream that includes an entry identifying the application or plug-in that owns the attribute information. Other entries represent the attributes: the keys are attribute names, and values are the corresponding attribute values.

GetOwner()[source]

Return type:: string
Returns:: The name of the application or plug-in extension owning the attribute data.

GetSDFObj()[source]

Return type:: Obj
Returns:: Pointer to the underlying SDF/Cos object.

property thisown: The membership flag

class apryse_sdk.BarcodeModule[source]

Bases: object

Static interface to Apryse SDK’s barcode extraction functionality.

static ExtractBarcodes(args)[source]

Perform barcode extraction on a PDF. Scan the PDF for barcodes, and save a JSON array of detected barcodes to the specified file. By default, this will search for all supported barcode types in all orientations. The time required to process the document will depend on the number of barcode types and orientations to search for. Thus, the default behavior is the slowest. To improve speed, specify a subset of barcode types and orientations to search for using the options parameter. Very small barcodes may not be detected. While there is no hard limit to barcode size, accuracy will begin to decrease as barcodes get smaller. The smallest barcode that can be detected will depend on a number of factors, including page size, barcode type, and (if applicable) image quality.

Parameters:

src (PDFDoc) – The source document.
output_file_path (string) – The path to the output file.
options (BarcodeOptions, optional) – Barcode options (optional).

static ExtractBarcodesAsString(args)[source]

Perform barcode extraction on a PDF. Scan the PDF for barcodes, and return a JSON array of detected barcodes as a string. By default, this will search for all supported barcode types in all orientations. The time required to process the document will depend on the number of barcode types and orientations to search for. Thus, the default behavior is the slowest. To improve speed, specify a subset of barcode types and orientations to search for using the options parameter. Very small barcodes may not be detected. While there is no hard limit to barcode size, accuracy will begin to decrease as barcodes get smaller. The smallest barcode that can be detected will depend on a number of factors, including page size, barcode type, and (if applicable) image quality.

Parameters:

src (PDFDoc) – The source document.
options (BarcodeOptions, optional) – Barcode options (optional).

Return type:

string

Returns:

JSON string representing barcode extraction results.

static IsModuleAvailable()[source]

Find out whether the Barcode Extraction Module is available (and licensed).

Return type:: boolean
Returns:: Returns true if barcode extraction can be performed.
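
A common pattern is to check module availability before attempting extraction. The sketch below assumes `module` is BarcodeModule (or any object with the same static methods) and that the returned string parses as JSON, as described above; `extract_barcodes` is a hypothetical wrapper.

```python
import json


def extract_barcodes(module, doc, options=None):
    """Run barcode extraction and return the parsed JSON result."""
    if not module.IsModuleAvailable():
        raise RuntimeError("Barcode Extraction Module is not available or not licensed")
    if options is None:
        raw = module.ExtractBarcodesAsString(doc)
    else:
        raw = module.ExtractBarcodesAsString(doc, options)
    return json.loads(raw)
```

Failing early with a clear error is usually easier to diagnose than an exception raised from inside the extraction call.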

property thisown: The membership flag

class apryse_sdk.BarcodeOptions[source]

Bases: object

GetBarcodeOrientations()[source]

Gets the value BarcodeOrientations from the options object. Specifies a set of barcode orientations to be searched for in the target PDF. This value can be created by bitwise OR-ing together various values from BarcodeOrientation to select the orientations of interest. By default, all orientations are searched for. Additional search directions can have a modest impact on search time. Orientation only affects the following barcode types: e_linear, e_post_net_planet, e_four_state, e_gs1_databar_stacked, e_pdf417, e_micro_pdf417, e_patch_code and e_pharma_code.

Return type:: int
Returns:: The current value for BarcodeOrientations.

GetBarcodeProfile()[source]

Gets the value BarcodeProfile from the options object. Specifies the barcode detection profile. Depending on the type and quality of the input, specialized profiles may return a better result at the cost of a slight runtime performance penalty. Barcode detection has the best runtime performance on high quality sources with the default profile.

Return type:: int
Returns:: The current value for BarcodeProfile.

GetBarcodeSearchTypes()[source]

Gets the value BarcodeSearchTypes from the options object. Specifies a set of barcode types to be searched for in the target PDF. This value can be created by bitwise OR-ing together various values from BarcodeTypeGroup to select the types of interest. Searching for barcodes takes approximately linear time in the number of barcode types to be searched for. By specifying only the types of barcodes of interest, runtime may be significantly improved. By default, all types are searched for.

Return type:: int
Returns:: The current value for BarcodeSearchTypes.

GetDataOutputFormat()[source]

Gets the value DataOutputFormat from the options object. Specifies the format of the data output. The default is “auto”, which will attempt to decode the barcode data to a string if possible. Otherwise, a Base64-encoded binary stream will be used. If the data is known to be binary, the output format can be set to “binary” to avoid unnecessary decoding attempts. Data returned as binary will be stored in the “data” field of the barcode object in the output JSON, while decoded data will be stored in the “text” field.

Return type:: int
Returns:: The current value for DataOutputFormat.

GetPages()[source]

Gets the value Pages from the options object. Specifies a range of pages on which to perform barcode extraction, such as “1-5”, or “1-3,5,7-10”. Open-ended ranges are supported, e.g., “3-”. By default all pages are processed. The first page is page number 1.

Return type:: string
Returns:: The current value for Pages.

SetBarcodeOrientations(value)[source]

Sets the value for BarcodeOrientations in the options object. Specifies a set of barcode orientations to be searched for in the target PDF. This value can be created by bitwise OR-ing together various values from BarcodeOrientation to select the orientations of interest. By default, all orientations are searched for. Additional search directions can have a modest impact on search time. Orientation only affects the following barcode types: e_linear, e_post_net_planet, e_four_state, e_gs1_databar_stacked, e_pdf417, e_micro_pdf417, e_patch_code and e_pharma_code.

Parameters:: value (int) – The new value for BarcodeOrientations.
Return type:: BarcodeOptions
Returns:: This object, for call chaining.
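
The orientation value is a bit mask. The sketch below mirrors the orientation constants listed in this class reference (e_horizontal = 1, e_vertical = 2, e_diagonal = 4) as plain integers so that it runs without the SDK; `combine_orientations` is a hypothetical helper.

```python
from functools import reduce
from operator import or_

# Values copied from the constants listed for BarcodeOptions.
E_HORIZONTAL = 1
E_VERTICAL = 2
E_DIAGONAL = 4


def combine_orientations(*orientations):
    """Bitwise-OR any number of orientation constants into one mask."""
    return reduce(or_, orientations, 0)


def has_orientation(mask, orientation):
    """Check whether a given orientation is selected in a mask."""
    return bool(mask & orientation)
```

A mask built this way is the kind of integer SetBarcodeOrientations(value) expects; dropping the diagonal search is one way to trade coverage for speed.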

SetBarcodeProfile(value)[source]

Sets the value for BarcodeProfile in the options object. Specifies the barcode detection profile. Depending on the type and quality of the input, specialized profiles may return a better result at the cost of a slight runtime performance penalty. Barcode detection has the best runtime performance on high quality sources with the default profile.

Parameters:: value (int) – The new value for BarcodeProfile.
Return type:: BarcodeOptions
Returns:: This object, for call chaining.

SetBarcodeSearchTypes(value)[source]

Sets the value for BarcodeSearchTypes in the options object. Specifies a set of barcode types to be searched for in the target PDF. This value can be created by bitwise OR-ing together various values from BarcodeTypeGroup to select the types of interest. Searching for barcodes takes approximately linear time in the number of barcode types to be searched for. By specifying only the types of barcodes of interest, runtime may be significantly improved. By default, all types are searched for.

Parameters:: value (int) – The new value for BarcodeSearchTypes.
Return type:: BarcodeOptions
Returns:: This object, for call chaining.
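
Search types are combined the same way, and only some of them are affected by the orientation setting. The sketch below copies the type constants listed in this class reference into a dictionary; the helper names are hypothetical and nothing here calls the SDK.

```python
# Barcode type constants from this reference, in ascending bit order.
BARCODE_TYPES = {
    "e_linear": 1,
    "e_post_net_planet": 2,
    "e_four_state": 4,
    "e_gs1_databar_stacked": 8,
    "e_qr": 16,
    "e_data_matrix": 32,
    "e_aztec": 64,
    "e_maxi": 128,
    "e_micro_qr": 256,
    "e_pdf417": 512,
    "e_micro_pdf417": 1024,
    "e_patch_code": 2048,
    "e_pharma_code": 4096,
}

# Types documented as being affected by BarcodeOrientations.
ORIENTATION_SENSITIVE = (
    "e_linear", "e_post_net_planet", "e_four_state", "e_gs1_databar_stacked",
    "e_pdf417", "e_micro_pdf417", "e_patch_code", "e_pharma_code",
)


def decode_search_types(mask):
    """List the names of all barcode types selected in a mask."""
    return [name for name, bit in BARCODE_TYPES.items() if mask & bit]


def orientation_matters(mask):
    """Return True if any selected type is affected by the orientation setting."""
    return any(mask & BARCODE_TYPES[name] for name in ORIENTATION_SENSITIVE)
```

When only QR and Data Matrix codes are of interest, for example, the orientation setting has no effect on the search according to the list above.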

SetDataOutputFormat(value)[source]

Sets the value for DataOutputFormat in the options object. Specifies the format of the data output. The default is “auto”, which will attempt to decode the barcode data to a string if possible. Otherwise, a Base64-encoded binary stream will be used. If the data is known to be binary, the output format can be set to “binary” to avoid unnecessary decoding attempts. Data returned as binary will be stored in the “data” field of the barcode object in the output JSON, while decoded data will be stored in the “text” field.

Parameters:: value (int) – The new value for DataOutputFormat.
Return type:: BarcodeOptions
Returns:: This object, for call chaining.
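
When reading the output JSON, a consumer has to handle both the “text” and the “data” field. The sketch below works on a single barcode object that has already been parsed into a dictionary; only the two field names come from the description above, and any other structure of the JSON is not assumed. `barcode_payload` is a hypothetical helper.

```python
import base64


def barcode_payload(barcode):
    """Return decoded text (str), raw bytes, or None for one barcode object."""
    if "text" in barcode:
        # Decoded data is stored as a string.
        return barcode["text"]
    if "data" in barcode:
        # Binary data is stored as a Base64-encoded stream.
        return base64.b64decode(barcode["data"])
    return None
```

Returning `str` for text and `bytes` for binary data lets the caller distinguish the two cases with `isinstance`.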

SetPages(value)[source]

Sets the value for Pages in the options object. Specifies a range of pages on which to perform barcode extraction, such as “1-5”, or “1-3,5,7-10”. Open-ended ranges are supported, e.g., “3-”. By default all pages are processed. The first page is page number 1.

Parameters:: value (string) – The new value for Pages.
Return type:: BarcodeOptions
Returns:: This object, for call chaining.
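
To preview which pages a range string selects, the syntax can be expanded by hand. The parser below is a sketch inferred from the examples above (“1-5”, “1-3,5,7-10”, “3-”); it is not the SDK’s own parser, and `expand_page_ranges` is a hypothetical helper.

```python
def expand_page_ranges(spec, page_count):
    """Expand a page range string into a list of 1-based page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "3-" runs to the last page.
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError("invalid page range: %r" % part)
        # Clamp the range to the pages that actually exist.
        pages.extend(range(start, min(end, page_count) + 1))
    return pages
```

The string itself, not the expanded list, is what SetPages(value) takes.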

e_auto = 0: The default setting. The barcode data will be decoded to a string if possible. Otherwise, a Base64-encoded binary stream will be used.

e_aztec = 64

e_binary = 1: The barcode data will be returned as a Base64-encoded binary stream.

e_data_matrix = 32

e_diagonal = 4

e_four_state = 4

e_gs1_databar_stacked = 8

e_high_quality_source_profile = 1: The default profile assumes an input quality of mediocre to high, such as vector graphics, or an average to high quality scan of a flat paper. This setting provides the fastest barcode read performance.

e_horizontal = 1

e_linear = 1

e_low_quality_source_profile = 2: This profile is useful for scanned paper of very poor quality, such as a very low resolution or a lot of noise. The barcode read performance is slightly slower.

e_maxi = 128

e_micro_pdf417 = 1024

e_micro_qr = 256

e_natural_picture_profile = 4: This profile should be selected for natural pictures, such as photographs of real-world objects that contain a barcode. The barcode read performance is significantly slower.

e_none = 0

e_patch_code = 2048

e_pdf417 = 512

e_pharma_code = 4096

e_post_net_planet = 2

e_qr = 16

e_small_barcodes_profile = 3: This profile is suitable for vector graphics or scanned paper containing an unusually small barcode. The barcode read performance is significantly slower.

e_vertical = 2

property thisown: The membership flag

class apryse_sdk.BitmapInfo(args)[source]

Bases: object

GetBuffer()[source]

property dpi

property height

property stride

property thisown: The membership flag

property width

class apryse_sdk.Bookmark(args)[source]

Bases: object

A PDF document may optionally display a document outline on the screen, allowing the user to navigate interactively from one part of the document to another. The outline consists of a tree-structured hierarchy of Bookmarks (sometimes called outline items), which serve as a ‘visual table of contents’ to display the document’s structure to the user.

Each Bookmark has a title that appears on screen, and an Action that specifies what happens when a user clicks on the Bookmark. The typical action for a user-created Bookmark is to move to another location in the current document, although any action (see PDF::Action) can be specified.

Bookmark is a utility class used to simplify work with PDF bookmarks (or outlines; see section 8.2.2 ‘Document Outline’ in PDF Reference Manual for more details).

AddChild(args)[source]

Overload 1:

Adds a new Bookmark as the new last child of this Bookmark.

Parameters:: in_title (string) – The title string value of the new Bookmark.
Return type:: Bookmark
Returns:: The newly created child Bookmark.

Notes: If this Bookmark previously had no children, it will be open after the child is added.

Overload 2:

Adds the specified Bookmark as the new last child of this Bookmark.

Parameters:: in_bookmark (Bookmark) – The Bookmark object to be added as a last child of this Bookmark.

Notes: Parameter in_bookmark must not be linked to a bookmark tree. If this Bookmark previously had no children, it will be open after the child is added.

AddNext(args)[source]

Overload 1:

Adds a new Bookmark to the tree containing this Bookmark, as the new right sibling.

Parameters:: in_title (string) – The title string value of the new Bookmark.
Return type:: Bookmark
Returns:: The newly created sibling Bookmark.

Overload 2:

Adds the specified Bookmark as the new right sibling to this Bookmark, adjusting the tree containing this Bookmark appropriately.

Parameters:: in_bookmark (Bookmark) – The Bookmark object to be added to this Bookmark.

Notes: Parameter in_bookmark must not be linked to a bookmark tree.

AddPrev(args)[source]

Overload 1:

Adds a new Bookmark to the tree containing this Bookmark, as the new left sibling.

Parameters:: in_title (string) – The title string value of the new Bookmark.
Return type:: Bookmark
Returns:: The newly created sibling Bookmark.

Overload 2:

Adds the specified Bookmark as the new left sibling to this Bookmark, adjusting the tree containing this Bookmark appropriately.

Parameters:: in_bookmark (Bookmark) – The Bookmark object to be added to this Bookmark.

Notes: Parameter in_bookmark must not be linked to a bookmark tree.

static Create(in_doc, in_title)[source]

Creates a new valid Bookmark with given title in the specified document.

Parameters:

in_doc (PDFDoc) – The document in which a Bookmark is to be created.
in_title (string) – The title string value of the new Bookmark.

Return type:

Bookmark

Returns:

The new Bookmark.

Notes: The new Bookmark is not linked to the outline tree. Use the AddChild(), AddNext(), or AddPrev() methods to link the Bookmark to the outline tree.
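The create-then-link sequence described in the note can be sketched as follows. This is a minimal illustration, not the SDK's own sample: the helper name `add_chapter_with_sections` and its parameters are hypothetical, `bookmark_cls` is assumed to be the `apryse_sdk.Bookmark` class, and `parent` is assumed to be a Bookmark that is already part of the outline tree.

```python
def add_chapter_with_sections(bookmark_cls, doc, parent, chapter_title, section_titles):
    """Create a chapter Bookmark, link it under `parent`, then add one child per section.

    `bookmark_cls` is assumed to expose the static Create(in_doc, in_title)
    documented above; the helper itself is hypothetical.
    """
    # Create() returns a Bookmark that is not yet linked to any tree.
    chapter = bookmark_cls.Create(doc, chapter_title)
    # AddChild overload 2: link the existing, unlinked Bookmark as the last child.
    parent.AddChild(chapter)
    # AddChild overload 1: create and link a new child from a title string.
    for title in section_titles:
        chapter.AddChild(title)
    return chapter
```

Linking the chapter before adding its sections keeps every AddChild() call operating on a Bookmark that already belongs to the tree.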

static CreateInternal(impl)[source]

Delete()[source]: Removes the Bookmark’s subtree from the bookmark tree containing it.

Find(in_title)[source]

Returns the Bookmark specified by the given title string.

Parameters:: in_title (string) – The title string value of the Bookmark to find.
Return type:: Bookmark
Returns:: A Bookmark matching the title string value specified.

GetAction()[source]

Returns the Bookmark’s action.

Return type:: Action
Returns:: The Bookmark’s action.

GetColor()[source]

Returns the Bookmark’s RGB color value.

Parameters:

out_r – Reference to a variable that receives the red component of the color.
out_g – Reference to a variable that receives the green component of the color.
out_b – Reference to a variable that receives the blue component of the color.

Notes: The three numbers out_r, out_g, and out_b are in the range 0.0 to 1.0, representing the components in the DeviceRGB color space of the color to be used for the Bookmark’s text.

Example:

double red, green, blue;
bookmark.GetColor(red, green, blue);

GetFirstChild()[source]

Returns the Bookmark’s first child.

Return type:: Bookmark
Returns:: The Bookmark’s first child.

GetFlags()[source]

Returns the Bookmark’s flags.

Return type:: int
Returns:: The flags of the Bookmark object. Bit 1 (least-significant bit) indicates italic font whereas bit 2 indicates bold font. Therefore, 0 indicates normal, 1 is italic, 2 is bold, and 3 is bold-italic.
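The bit layout described above can be decoded with two masks. The helper below is a minimal sketch in plain Python; the name `describe_bookmark_flags` is hypothetical and not part of the SDK. It only interprets the integer that GetFlags() returns.

```python
def describe_bookmark_flags(flags: int) -> str:
    """Map the integer returned by Bookmark.GetFlags() to a style name."""
    italic = bool(flags & 0x1)  # bit 1 (least-significant bit): italic
    bold = bool(flags & 0x2)    # bit 2: bold
    if bold and italic:
        return "bold-italic"
    if bold:
        return "bold"
    if italic:
        return "italic"
    return "normal"
```

The same masks can be combined with `|` to build the value passed to SetFlags(), for example `0x1 | 0x2` for bold-italic.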

GetHandleInternal()[source]

GetIndent()[source]

Returns the indentation level of the Bookmark in its containing tree.

Return type:: int
Returns:: The indentation level of the Bookmark in its containing tree.

Notes: The root level has an indentation level of zero.

GetLastChild()[source]

Returns the Bookmark’s last child.

Return type:: Bookmark
Returns:: The Bookmark’s last child.

GetNext()[source]

Returns the Bookmark’s next (right) sibling.

Return type:: Bookmark
Returns:: The Bookmark’s next (right) sibling.

GetOpenCount()[source]

Returns the number of opened bookmarks in this subtree.

Return type:: int
Returns:: The number of opened bookmarks in this subtree (not including this Bookmark). If the item is closed, the value is a negative integer whose absolute value specifies how many descendants would appear if the item were reopened.
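The sign convention of the returned count can be unpacked as shown below. This is a sketch in plain Python; the helper name `interpret_open_count` is hypothetical, and treating a count of zero as "open" is an assumption made for illustration only.

```python
def interpret_open_count(open_count: int):
    """Split the value from Bookmark.GetOpenCount() into (state, descendant_count).

    A negative value means the item is closed; its absolute value is the
    number of descendants that would appear if the item were reopened.
    Zero is reported as "open" here, which is an assumption of this sketch.
    """
    if open_count < 0:
        return ("closed", -open_count)
    return ("open", open_count)
```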

GetParent()[source]

Returns the Bookmark’s parent Bookmark.

Return type:: Bookmark
Returns:: The Bookmark’s parent Bookmark.

GetPrev()[source]

Returns the Bookmark’s previous (left) sibling.

Return type:: Bookmark
Returns:: The Bookmark’s previous (left) sibling.

GetSDFObj()[source]

Returns the underlying SDF/Cos object.

Return type:: Obj
Returns:: The underlying SDF/Cos object.

Notes: A null (non-valid) bookmark returns a null object.

GetTitle()[source]

Returns the Bookmark’s title string.

Return type:: string
Returns:: The Bookmark’s title string.

GetTitleObj()[source]

Returns the Bookmark’s title string object.

Return type:: Obj
Returns:: The Bookmark’s title string object.

HasChildren()[source]

Indicates whether the Bookmark has children.

Return type:: boolean
Returns:: True if the Bookmark has children; otherwise false.

IsOpen()[source]

Indicates whether the Bookmark is open.

Return type:: boolean
Returns:: True if this Bookmark is open; otherwise false.

Notes: An open Bookmark shows all its children.

IsValid()[source]

Indicates whether the Bookmark is valid (non-null).

Return type:: boolean
Returns:: True if this is a valid (non-null) Bookmark; otherwise false.

Notes: If this method returns false the underlying SDF/Cos object is null and the Bookmark object should be treated as null as well.
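IsValid() is the natural loop condition when walking an outline. The sketch below collects (indent, title) pairs depth-first using only methods documented in this class. The helper name `collect_outline` is hypothetical, and the sketch assumes that GetNext() returns a non-valid Bookmark after the last sibling.

```python
def collect_outline(first_item):
    """Return (indent, title) pairs for a Bookmark and its following siblings, depth-first."""
    entries = []
    item = first_item
    # Assumption: GetNext() yields a non-valid Bookmark past the last sibling.
    while item.IsValid():
        entries.append((item.GetIndent(), item.GetTitle()))
        if item.HasChildren():
            # Recurse into the subtree before moving on to the next sibling.
            entries.extend(collect_outline(item.GetFirstChild()))
        item = item.GetNext()
    return entries
```

Passing the first top-level Bookmark of a document produces a flat list that can be printed with indentation to reproduce the visual table of contents.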

RemoveAction()[source]: Removes the Bookmark’s action.

SetAction(in_action)[source]

Sets the Bookmark’s action.

Parameters:: in_action (Action) – The new Action for the Bookmark.

SetColor(in_r=0.0, in_g=0.0, in_b=0.0)[source]

Sets the Bookmark’s color value.

Parameters:

in_r (double, optional) – The red component of the color.
in_g (double, optional) – The green component of the color.
in_b (double, optional) – The blue component of the color.

Notes: The three numbers in_r, in_g, and in_b are in the range 0.0 to 1.0, representing the components in the DeviceRGB color space of the color to be used for the Bookmark’s text. Default color value is black, [0.0 0.0 0.0].
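Colors are often available as 8-bit channel values, which must be scaled into the 0.0 to 1.0 range before they are passed to SetColor(). The conversion below is a minimal sketch; the helper name `rgb8_to_unit` is hypothetical and not part of the SDK.

```python
def rgb8_to_unit(r: int, g: int, b: int):
    """Scale 8-bit RGB channels (0-255) to the 0.0-1.0 range used by SetColor()."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError(f"channel value out of range 0-255: {value}")
    return (r / 255.0, g / 255.0, b / 255.0)
```

With a Bookmark instance named `bookmark`, a call would then look like `bookmark.SetColor(*rgb8_to_unit(255, 0, 0))`.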

SetFlags(in_flags)[source]

Sets the Bookmark’s flags.

Parameters:: in_flags (int) – The new bookmark flags. Bit 1 (the least-significant bit) indicates italic font whereas bit 2 indicates bold font. Therefore, 0 indicates normal, 1 is italic, 2 is bold, and 3 is bold-italic.

SetOpen(in_open)[source]

Opens or closes the Bookmark.

Parameters:: in_open (boolean) – Boolean value that contains the status. If true, the Bookmark is opened. Otherwise the Bookmark is closed.

Notes: An opened Bookmark shows its children, while a closed Bookmark does not.

SetTitle(title)[source]

Sets the Bookmark’s title string.

Parameters:: title (string) – The new title string for the bookmark.

Unlink()[source]

Unlinks this Bookmark from the bookmark tree that contains it, and adjusts the tree appropriately.

Notes: After the bookmark is unlinked, it can be moved to another place in the bookmark tree located in the same document.

property mp_obj

property thisown: The membership flag

class apryse_sdk.BorderStyle(args)[source]

Bases: object

BorderStyle structure specifies the characteristics of the annotation’s border. The border is specified as a rounded rectangle.

Destroy()[source]: Frees the native memory of the object.

GetDash()[source]

Return type:: std::vector< double,std::allocator< double > >
Returns:: the border dash pattern.

See also: ConversionOptions

Overload 2:

Create a TemplateDocument object from an office file suitable for generating any number of PDFs from supplied template data.

Template filling is performed entirely within PDFNet and handles incoming files in .docx, .xlsx, .pptx, .doc, .ppt, and .xls format.

This method does not perform any template filling and can be expected to return quickly. To do the actual work, use the returned TemplateDocument object.

See also: ConversionOptions

static CreateReflow(in_page, json_zones)[source]

static FromCAD(in_pdfdoc, in_filename, opts=None)[source]

Convert the specified CAD file to PDF and append converted pages to the specified PDF document. This conversion requires that the optional PDFTron CAD add-on module is available. See the CADConvertOptions class for the available options. See also: the ‘CADModule’ class

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_filename (string) – the path to the CAD document to convert
opts (CADConvertOptions, optional) – The options to use when converting.

static FromDICOM(in_pdfdoc, in_filename, opts=None)[source]

Convert the specified AdvancedImaging file to PDF and append converted pages to the specified PDF document. This conversion requires that the optional PDFTron AdvancedImaging add-on module is available. See also: the ‘AdvancedImagingModule’ class

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_filename (string) – the path to the DICOM file to convert
opts (AdvancedImagingConvertOptions, optional) – The options to use when converting.

static FromEmf(in_pdfdoc, in_filename)[source]

Convert the specified EMF to PDF and append converted pages to the specified PDF document. The EMF will be fitted to the page.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_filename (string) – the path to the EMF document to convert

Notes: This method is available only on Windows platforms.

static FromSVG(in_pdfdoc, in_filename, opts=None)[source]

Convert the specified SVG file to PDF and append converted pages to the specified PDF document.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_filename (string) – the path to the SVG document to convert
opts (SVGConvertOptions, optional) – The options to use when converting.

static FromText(args)[source]

Convert the specified plain text file to PDF and append converted pages to the specified PDF document.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_filename (string) – the path to the plain text document to convert
in_options (Obj, optional) – the conversion options. The available options are:

| Option Name             | Type    | Note                                                    |
|-------------------------|---------|---------------------------------------------------------|
| BytesPerBite            | Integer | In bytes. Use for streaming conversion only.            |
| FontFace                | String  | Set the font face used for the conversion.              |
| FontSize                | Integer | Set the font size used for the conversion.              |
| LineHeightMultiplier    | Double  | Set the line height multiplier used for the conversion. |
| MarginBottom            | Double  | In inches. Set the bottom margin of the page.           |
| MarginLeft              | Double  | In inches. Set the left margin of the page.             |
| MarginRight             | Double  | In inches. Set the right margin of the page.            |
| MarginTop               | Double  | In inches. Set the top margin of the page.              |
| PageHeight              | Double  | In inches. Set the page height.                         |
| PageWidth               | Double  | In inches. Set the page width.                          |
| UseSourceCodeFormatting | Boolean | Set whether to use mono font for the conversion.        |
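The options listed above are passed as entries of a dictionary Obj. The sketch below shows one way to fill such an object from a plain Python dict. The helper name `apply_text_options` is hypothetical, and the Obj methods PutBool(), PutNumber(), and PutString() are not documented in this section, so treat their use here as an assumption to verify against the Obj class reference. How the dictionary Obj itself is created is also outside this section.

```python
def apply_text_options(options_obj, settings):
    """Copy FromText option values from a Python dict into a dictionary Obj.

    Assumes `options_obj` exposes PutBool, PutNumber, and PutString
    (assumed SDF Obj dictionary methods, not shown in this section).
    """
    for name, value in settings.items():
        # bool must be tested first because bool is a subclass of int.
        if isinstance(value, bool):
            options_obj.PutBool(name, value)
        elif isinstance(value, (int, float)):
            options_obj.PutNumber(name, value)
        elif isinstance(value, str):
            options_obj.PutString(name, value)
        else:
            raise TypeError(f"unsupported value type for option {name!r}")
    return options_obj
```

The filled object would then be passed as the in_options argument of FromText().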

static FromTiff(in_pdfdoc, in_data)[source]

Convert the specified TIFF filter to PDF and append converted pages to the specified PDF document.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_data (Filter) – the source TIFF data.

static FromXps(args)[source]

Overload 1:

Convert the specified XPS document to PDF and append converted pages to the specified PDF document.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
in_filename (string) – the path to the XPS document to convert

Overload 2:

Convert the specified XPS document contained in memory to PDF and append converted pages to the specified PDF document.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append to
buf (string) – the buffer containing the xps document
buf_sz (int) – the size of the buffer

static OfficeToPDF(args)[source]

Overload 1:

Convert an office document (in .docx, .xlsx, .pptx, or .doc format) to PDF and append it to the specified PDF document. This conversion is performed entirely within PDFNet and does not rely on Word interop or any other external functionality.

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_filename (string) – the path to the source document.
options (ConversionOptions) – the conversion options

Raises:

PDFNetException

See also: ConversionOptions

See also: StreamingPDFConversion() if you would like more control over the conversion process

Overload 2:

Convert an office document (in .docx, .xlsx, .pptx, or .doc format) to PDF and append it to the specified PDF document. This conversion is performed entirely within PDFNet and does not rely on Word interop or any other external functionality.

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_stream (Filter) – the source document data.
options (ConversionOptions) – the conversion options

Raises:

PDFNetException

See also: ConversionOptions

See also: StreamingPDFConversion() if you would like more control over the conversion process

static PageToHtml(page)[source]

Convert a page to HTML and return the HTML as a string.

Parameters:: page (Page) – the page to convert to HTML
Return type:: string
Returns:: a string containing the page’s html

static PageToHtmlZoned(page, json_zones)[source]

static RequiresPrinter(in_filename)[source]

Utility function to determine if ToPdf or ToXps will require the PDFNet printer to convert a specific external file to PDF.

Parameters:: in_filename (string) – the path to the document to be checked
Return type:: boolean
Returns:: true if ToPdf requires the printer to convert the file, false otherwise.

Notes: The current implementation looks only at the file extension, not the file contents. If the file extension is missing, false is returned.
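Because printer-based conversion is only supported on Windows, a caller may want to check RequiresPrinter() before calling ToPdf(). The sketch below shows that guard. The helper name `append_if_no_printer_needed` is hypothetical, and `convert` is assumed to be the Convert class documented here.

```python
def append_if_no_printer_needed(convert, doc, path):
    """Append `path` to `doc` via ToPdf() only when no PDFNet printer is required.

    Returns True if the file was converted and False if it was skipped.
    """
    # RequiresPrinter() inspects only the file extension, not the contents.
    if convert.RequiresPrinter(path):
        return False
    convert.ToPdf(doc, path)
    return True
```

Skipped files can then be routed to a different conversion path, such as the built-in office conversion, where the format allows it.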

static StreamingPDFConversion(args)[source]

Overload 1:

Create a DocumentConversion object suitable for converting a file to PDF and appending to the specified PDF document. Handles incoming files in .docx, .xlsx, .pptx, .doc, .ppt, .xls, .png, .jpg, .bmp, .gif, .jp2, .tif, .txt, .xml, and .md format. This conversion is performed entirely within PDFNet and does not rely on any external functionality.

This method allows more control over the conversion process than the single-call ToPdf() interface. It does not perform any conversion logic immediately and can be expected to return quickly. To perform the actual conversion, use the returned DocumentConversion object.

See also: DocumentConversion

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_filename (string) – the path to the source document.
options (ConversionOptions) – the conversion options

Return type:

DocumentConversion

Returns:

A DocumentConversion object which encapsulates this particular conversion.

See also: ConversionOptions

Overload 2:

Create a DocumentConversion object suitable for converting a file to PDF. Handles incoming files in .docx, .xlsx, .pptx, .doc, .ppt, .xls, .png, .jpg, .bmp, .gif, .jp2, .tif, .txt, .xml, and .md format. This conversion is performed entirely within PDFNet and does not rely on any external functionality.

This method allows more control over the conversion process than the single-call ToPdf() interface. It does not perform any conversion logic immediately and can be expected to return quickly. To perform the actual conversion, use the returned DocumentConversion object.

See also: DocumentConversion

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_filename (string) – the path to the source document.
options (ConversionOptions) – the conversion options

Return type:

DocumentConversion

Returns:

A DocumentConversion object which encapsulates this particular conversion.

See also: ConversionOptions

Overload 3:

Create a DocumentConversion object suitable for converting an office document (in .docx, .xlsx, .pptx, or .doc format) to PDF and appending to the specified PDF document. This conversion is performed entirely within PDFNet and does not rely on Word interop or any other external functionality.

This method does not perform any conversion logic and can be expected to return quickly. To do the actual conversion, use the returned DocumentConversion object.

See also: DocumentConversion

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_stream (Filter) – the source document data.
options (ConversionOptions) – the conversion options

Return type:

DocumentConversion

Returns:

A DocumentConversion object which encapsulates this particular conversion.

See also: ConversionOptions

Overload 4:

Create a DocumentConversion object suitable for converting an office document (in .docx, .xlsx, .pptx, or .doc format) to PDF. This conversion is performed entirely within PDFNet and does not rely on Word interop or any other external functionality.

This method does not perform any conversion logic and can be expected to return quickly. To do the actual conversion, use the returned DocumentConversion object.

See also: DocumentConversion

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_stream (Filter) – the source document data.
options (ConversionOptions) – the conversion options

Return type:

DocumentConversion

Returns:

A DocumentConversion object which encapsulates this particular conversion.

See also: ConversionOptions
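Since StreamingPDFConversion() returns before any work is done, the caller drives the conversion through the returned object. The loop below is a sketch of that pattern. The helper name `run_conversion` is hypothetical, and the methods GetConversionStatus() and ConvertNextPage(), as well as the "incomplete" status constant passed in by the caller, are not documented in this section; treat them as assumptions and confirm them against the DocumentConversion class reference.

```python
def run_conversion(conversion, incomplete_status):
    """Drive a DocumentConversion-like object until it stops reporting 'incomplete'.

    Assumes `conversion` exposes GetConversionStatus() and ConvertNextPage()
    (assumed method names, not documented in this section).
    Returns (final_status, steps_performed).
    """
    steps = 0
    while conversion.GetConversionStatus() == incomplete_status:
        conversion.ConvertNextPage()  # perform one increment of the conversion
        steps += 1
    return conversion.GetConversionStatus(), steps
```

The final status should be compared against the SDK's success and failure constants before the resulting document is used.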

static ToEmf(args)[source]

Overload 1:

Convert the PDFDoc to EMF and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to EMF
in_filename (string) – the path to the EMF files to create, one file per page

Notes: This method is available only on Windows platforms.

Overload 2:

Convert the Page to EMF and save to the specified path

Parameters:

in_page (Page) – the Page to convert to EMF
in_filename (string) – the path to the EMF file to create

Notes: This method is available only on Windows platforms.

static ToEpub(args)[source]

Overload 1:

Convert a file to EPUB format and save to the specified path

Parameters:

in_filename (string) – the file to convert to EPUB
out_path (string) – the path to the EPUB file to create
html_options (HTMLOutputOptions) – the conversion options
epub_options (EPUBOutputOptions) – the conversion options

See also: HTMLOutputOptions

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 2:

Convert a file to EPUB format and save to the specified path

Parameters:

in_filename (string) – the file to convert to EPUB
out_path (string) – the path to the EPUB file to create
html_options (HTMLOutputOptions) – the conversion options

See also: HTMLOutputOptions

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 3:

Convert a file to EPUB format and save to the specified path

Parameters:

in_filename (string) – the file to convert to EPUB
out_path (string) – the path to the EPUB file to create

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 4:

Convert the PDFDoc to EPUB format and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to EPUB
out_path (string) – the path to the EPUB file to create
html_options (HTMLOutputOptions) – the conversion options
epub_options (EPUBOutputOptions) – the conversion options

See also: HTMLOutputOptions

See also: ToPdf()

Overload 5:

Convert the PDFDoc to EPUB format and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to EPUB
out_path (string) – the path to the EPUB file to create
html_options (HTMLOutputOptions) – the conversion options

See also: HTMLOutputOptions

See also: ToPdf()

Overload 6:

Convert the PDFDoc to EPUB format and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to EPUB
out_path (string) – the path to the EPUB file to create

See also: ToPdf()

static ToExcel(args)[source]

Overload 1:

Convert a PDF file to Excel and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to Excel
out_path (string) – the path to where generated content will be stored
options (ExcelOutputOptions) – the conversion options

See also: StructuredOutputModule

Overload 2:

Convert a PDF file to Excel and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to Excel
out_path (string) – the path to where generated content will be stored

See also: StructuredOutputModule

Overload 3:

Convert PDF to Excel and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to Excel
out_path (string) – the path to where generated content will be stored
options (ExcelOutputOptions) – the conversion options

See also: StructuredOutputModule

See also: ToPdf()

Overload 4:

Convert PDF to Excel and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to Excel
out_path (string) – the path to where generated content will be stored

See also: StructuredOutputModule

See also: ToPdf()

static ToHtml(args)[source]

Overload 1:

Convert a file to HTML and save to the specified path. In e_reflow_paragraphs mode, this conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to HTML
out_path (string) – the path to where generated content will be stored
options (HTMLOutputOptions) – the conversion options

See also: HTMLOutputOptions

See also: ToPdf()

See also: StructuredOutputModule

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 2:

Convert a file to HTML and save to the specified path

Parameters:

in_filename (string) – the file to convert to HTML
out_path (string) – the path to where generated content will be stored

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 3:

Convert the PDF to HTML and save to the specified path. In e_reflow_paragraphs mode, this conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to HTML
out_path (string) – the path to where generated content will be stored
options (HTMLOutputOptions) – the conversion options

See also: HTMLOutputOptions

See also: ToPdf()

See also: StructuredOutputModule

Overload 4:

Convert the PDF to HTML and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to HTML
out_path (string) – the path to where generated content will be stored

See also: ToPdf()

static ToPdf(in_pdfdoc, in_filename)[source]

Convert the file or document to PDF and append to the specified PDF document

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to append the converted document to. The PDFDoc can then be converted to XPS, EMF or SVG using the other functions in this class.
in_filename (string) – the path to the document to be converted to pdf

Notes: Formats converted internally include BMP, EMF, JPEG, PNG, TIF, and XPS.

Formats that require external applications for conversion use the Convert::Printer class and require the PDFNet printer to be installed. This is only supported on Windows platforms. Document formats in this category include RTF (MS Word or Wordpad), TXT (Notepad or Wordpad), DOC and DOCX (MS Word), PPT and PPTX (MS PowerPoint), XLS and XLSX (MS Excel), OpenOffice documents, HTML and MHT (Internet Explorer), PUB (MS Publisher), and MSG (MS Outlook).

static ToPowerPoint(args)[source]

Overload 1:

Convert a PDF file to PowerPoint and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to PowerPoint
out_path (string) – the path to where generated content will be stored
options (PowerPointOutputOptions) – the conversion options

See also: StructuredOutputModule

Overload 2:

Convert a PDF file to PowerPoint and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to PowerPoint
out_path (string) – the path to where generated content will be stored

See also: StructuredOutputModule

Overload 3:

Convert PDF to PowerPoint and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to PowerPoint
out_path (string) – the path to where generated content will be stored
options (PowerPointOutputOptions) – the conversion options

See also: StructuredOutputModule

See also: ToPdf()

Overload 4:

Convert PDF to PowerPoint and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to PowerPoint
out_path (string) – the path to where generated content will be stored

See also: StructuredOutputModule

See also: ToPdf()

static ToSvg(args)[source]

Overload 1:

Convert the PDFDoc to SVG and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to SVG
in_filename (string) – the path to the SVG files to create, one file per page
in_options (SVGOutputOptions) – the conversion options

Overload 2:

Convert the PDFDoc to SVG and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to SVG
in_filename (string) – the path to the SVG files to create, one file per page

Overload 3:

Convert the Page to SVG and save to the specified path

Parameters:

in_page (Page) – the Page to convert to SVG
in_filename (string) – the path to the SVG file to create
in_options (SVGOutputOptions) – the conversion options

Overload 4:

Convert the Page to SVG and save to the specified path

Parameters:

in_page (Page) – the Page to convert to SVG
in_filename (string) – the path to the SVG file to create

static ToTiff(args)[source]

Overload 1:

Convert a file to multipage TIFF and save to the specified path

Parameters:

in_filename (string) – the file to convert to multipage TIFF
out_path (string) – the path to the TIFF file to create
options (TiffOutputOptions) – the conversion options

See also: TiffOutputOptions

Overload 2:

Convert a file to multipage TIFF and save to the specified path

Parameters:

in_filename (string) – the file to convert to multipage TIFF
out_path (string) – the path to the TIFF file to create

Overload 3:

Convert the PDF to multipage TIFF and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to multipage TIFF
out_path (string) – the path to the TIFF file to create
options (TiffOutputOptions) – the conversion options

See also: TiffOutputOptions

Overload 4:

Convert the PDF to multipage TIFF and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to multipage TIFF
out_path (string) – the path to the TIFF file to create

Overload 5:

Convert a file to multipage TIFF and write to the provided filter

Parameters:

in_filename (string) – the file to convert to multipage TIFF
out_filter (Filter) – the output filter where the TIFF data will be written
options (TiffOutputOptions) – the conversion options

See also: TiffOutputOptions

Overload 6:

Convert a file to multipage TIFF and write to the provided filter

Parameters:

in_filename (string) – the file to convert to multipage TIFF
out_filter (Filter) – the output filter where the TIFF data will be written

Overload 7:

Convert the PDF to multipage TIFF and write to the provided filter

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to multipage TIFF
out_filter (Filter) – the output filter where the TIFF data will be written
options (TiffOutputOptions) – the conversion options

See also: TiffOutputOptions

Overload 8:

Convert the PDF to multipage TIFF and write to the provided filter

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to multipage TIFF
out_filter (Filter) – the output filter where the TIFF data will be written

static ToWord(args)[source]

Overload 1:

Convert a PDF file to Word and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to Word
out_path (string) – the path to where generated content will be stored
options (WordOutputOptions) – the conversion options

See also: StructuredOutputModule

Overload 2:

Convert a PDF file to Word and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_filename (string) – the file to convert to Word
out_path (string) – the path to where generated content will be stored

See also: StructuredOutputModule

Overload 3:

Convert PDF to Word and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to Word
out_path (string) – the path to where generated content will be stored
options (WordOutputOptions) – the conversion options

See also: StructuredOutputModule

See also: ToPdf()

Overload 4:

Convert PDF to Word and save the output to the specified path. This conversion requires that the optional PDFTron StructuredOutput add-on module is available.

Parameters:

in_pdfdoc (PDFDoc) – the PDF doc to convert to Word
out_path (string) – the path to where generated content will be stored

See also: StructuredOutputModule

See also: ToPdf()
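ToWord(), ToExcel(), ToPowerPoint(), ToEpub(), and ToTiff() all accept an (in_filename, out_path) pair, so a caller can pick the method from the output path. The dispatcher below is a sketch of that idea. The helper name `convert_by_output_extension` and the extension-to-method mapping are illustrative assumptions, and `convert` is assumed to be the Convert class documented here.

```python
import os


def convert_by_output_extension(convert, in_filename, out_path):
    """Call the Convert method that matches the extension of `out_path`.

    The extension mapping is an illustrative assumption, not SDK behavior.
    Returns the matched extension.
    """
    converters = {
        ".docx": convert.ToWord,
        ".xlsx": convert.ToExcel,
        ".pptx": convert.ToPowerPoint,
        ".epub": convert.ToEpub,
        ".tif": convert.ToTiff,
        ".tiff": convert.ToTiff,
    }
    ext = os.path.splitext(out_path)[1].lower()
    if ext not in converters:
        raise ValueError(f"no converter registered for extension {ext!r}")
    # Use the two-argument overload: (in_filename, out_path).
    converters[ext](in_filename, out_path)
    return ext
```

The Word, Excel, and PowerPoint branches still require the optional StructuredOutput add-on module noted in the method descriptions.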

static ToXod(args)[source]

Overload 1:

Convert the input file to XOD format and save to the specified path

Parameters:

in_filename (string) – the file to convert to XOD
out_filename (string) – the path to the XOD file to create
options (XODOutputOptions) – the conversion options

See also: XODOutputOptions

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 2:

Convert the input file to XOD format and save to the specified path

Parameters:

in_filename (string) – the file to convert to XOD
out_filename (string) – the path to the XOD file to create

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 3:

Convert the input file to XOD format and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to XOD
out_filename (string) – the path to the XOD file to create
options (XODOutputOptions) – the conversion options

See also: XODOutputOptions

See also: ToPdf()

Overload 4:

Convert the input file to XOD format and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to XOD
out_filename (string) – the path to the XOD file to create

See also: ToPdf()

Overload 5:

Generate a stream that incrementally converts the input file to XOD format.

Parameters:

in_filename (string) – the file to convert to XOD
options (XODOutputOptions) – the conversion options

Return type:

Filter

Returns:

A filter from which the file can be read incrementally.

See also: XODOutputOptions

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 6:

Generate a stream that incrementally converts the input file to XOD format.

Parameters:: in_filename (string) – the file to convert to XOD
Return type:: Filter
Returns:: A filter from which the file can be read incrementally.

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 7:

Generate a stream that incrementally converts the input file to XOD format.

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to XOD
options (XODOutputOptions) – the conversion options

Return type:

Filter

Returns:

A filter from which the file can be read incrementally.

See also: XODOutputOptions

See also: ToPdf()

Overload 8:

Generate a stream that incrementally converts the input file to XOD format.

Parameters:: in_pdfdoc (PDFDoc) – the PDFDoc to convert to XOD
Return type:: Filter
Returns:: A filter from which the file can be read incrementally.

See also: ToPdf()

static ToXodWithMonitor(args)[source]

static ToXps(args)[source]

Overload 1:

Convert the PDFDoc to XPS and save to the specified path

Parameters:

in_pdfdoc (PDFDoc) – the PDFDoc to convert to XPS
in_filename (string) – the path to the document to create
options (XPSOutputOptions) – the conversion options

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

Overload 4:

Convert the input file to XPS format and save to the specified path

Parameters:

in_inputFilename (string) – the file to convert to XPS
in_outputFilename (string) – the path to the XPS file to create

See also: ToPdf()

Notes: Requires the Convert::Printer class for all file formats that ToPdf also requires.

static WordToPDF(args)[source]

Overload 1:

Convert a Word document (in .docx format) to PDF and append it to the specified PDF document. This conversion is performed entirely within PDFNet, and does not rely on Word interop or any other external functionality.

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_filename (string) – the path to the source document. The source must be in .docx format.
options (WordToPDFOptions) – the conversion options

Raises:

PDFNetException

See also: WordToPDFOptions

See also: WordToPDFConversion() if you would like more control over the conversion process.

Overload 2:

Convert a Word document (in .docx format) to PDF and append it to the specified PDF document. This conversion is performed entirely within PDFNet, and does not rely on Word interop or any other external functionality.

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_stream (Filter) – the source document data. The source must be in .docx format.
options (WordToPDFOptions) – the conversion options

Raises:

PDFNetException

See also: WordToPDFOptions

See also: WordToPDFConversion() if you would like more control over the conversion process.
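A minimal sketch of calling the filename overload from Python. The helper name `append_docx` and the `convert_cls` parameter are hypothetical: the class exposing the static `WordToPDF` method is passed in so that no import path is assumed. The sketch also assumes that `None` is accepted for `options` and that the SDK has already been initialized and licensed.

```python
import os


def append_docx(convert_cls, pdfdoc, docx_path, options=None):
    """Append a .docx file to pdfdoc via the WordToPDF filename overload.

    convert_cls is whatever class exposes the static WordToPDF method
    documented above (an assumption of this sketch, not an SDK name).
    """
    # The reference states that the source must be in .docx format,
    # so reject other extensions before calling into the SDK.
    if os.path.splitext(docx_path)[1].lower() != ".docx":
        raise ValueError(f"WordToPDF expects a .docx source: {docx_path!r}")
    # Argument order follows the documented parameters:
    # in_pdfdoc, in_filename, options.
    convert_cls.WordToPDF(pdfdoc, docx_path, options)
    return pdfdoc
```

Because the conversion is documented to raise PDFNetException on failure, callers may want to wrap the call in their own error handling.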

static WordToPDFConversion(args)[source]

Overload 1:

Create a DocumentConversion object suitable for converting a Word document (in .docx format) to PDF and appending it to the specified PDF document. This conversion will be performed entirely within PDFNet, and does not rely on Word interop or any other external functionality.

This method allows for more control over the conversion process than the single-call WordToPDF() interface. This method does not perform any conversion logic and can be expected to return quickly. To do the actual conversion, use the returned DocumentConversion object.

See also: DocumentConversion

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_filename (string) – the path to the source document. The source must be in .docx format.
options (WordToPDFOptions) – the conversion options

Return type:

DocumentConversion

Returns:

A DocumentConversion object which encapsulates this particular conversion.

See also: WordToPDFOptions

Overload 2:

Create a DocumentConversion object suitable for converting a Word document (in .docx format) to PDF and appending it to the specified PDF document. This conversion will be performed entirely within PDFNet, and does not rely on Word interop or any other external functionality.

This method allows for more control over the conversion process than the single-call WordToPDF() interface. This method does not perform any conversion logic and can be expected to return quickly. To do the actual conversion, use the returned DocumentConversion object.

See also: DocumentConversion

Notes: Font requirements: on some systems you may need to specify extra font resources to aid in conversion. Please see http://www.pdftron.com/kb_fonts_and_builtin_office_conversion

Parameters:

in_pdfdoc (PDFDoc) – the conversion result will be appended to this pdf.
in_stream (Filter) – the source document data. The source must be in .docx format.
options (WordToPDFOptions) – the conversion options

Return type:

DocumentConversion

Returns:

A DocumentConversion object which encapsulates this particular conversion.

See also: WordToPDFOptions

e_default = 2: Render text that are somewhat clipped or occluded.

e_fast = 2: Feature reduce PDF while trying to preserve some complex PDF features (such as vector figures, transparency, shadings, blend modes, Type3 fonts etc.) for pages that are already fast to render. This option can also result in smaller faster files compared to e_simple, but the pages may have more complex structure.

e_high_quality = 3: Preserve vector content where possible. In particular only feature reduce PDF files containing overprint or very complex vector content. Currently this option can only be used with XODOutputOptions.

e_keep_all = 4: Only render text that are completely occluded, or used as a clipping path.

e_keep_most = 3: Only render text that are seriously clipped or occluded.

e_off = 0: Disable flattening and convert all content as is.

e_simple = 1: Feature reduce PDF to a simple two layer representation consisting of a single background RGB image and a simple top text layer.

e_strict = 1: Render text that are marginally clipped or occluded.

e_very_strict = 0: Render (flatten) any text that is clipped or occluded.

property thisown: The membership flag

class apryse_sdk.DataExtractionModule[source]

Bases: object

The class DataExtractionModule. Static interface to the Apryse SDK's data extraction functionality.

static DetectAndAddFormFieldsToPDF(doc, options=None)[source]

Perform automatic form field detection, then insert the fields into the PDF. Note: The FormKeyValue engine is experimental and subject to change.

Parameters:

doc (PDFDoc) – The PDF document where fields are detected from and inserted into.
options (DataExtractionOptions, optional) – Data extraction options (optional).

static ExtractData(args)[source]

Overload 1:

Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.

Parameters:

input_pdf_file (string) – The source document filename.
engine (int) – The extraction engine.
options (DataExtractionOptions, optional) – Data extraction options (optional).

Return type:

string

Returns:

JSON string representing the extracted results.

Overload 2:

Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.

Parameters:

input_pdf_file (string) – The source document filename.
output_json_file (string) – The resulting JSON filename.
engine (int) – The extraction engine.
options (DataExtractionOptions, optional) – Data extraction options (optional).

Overload 3:

Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.

Parameters:

input_pdf_file (string) – The source document filename.
output_json_file (string) – The resulting JSON filename.
engine (int) – The extraction engine.
options – Data extraction options (optional).

static ExtractToXLSX(args)[source]

Overload 1:

Perform data extraction on a PDF in XLSX output format.

Parameters:

input_pdf_file (string) – The source document filename.
output_xlsx_file (string) – The resulting XLSX filename.
options (DataExtractionOptions, optional) – Data extraction options (optional).

Overload 2:

Perform data extraction on a PDF in XLSX output format.

Parameters:

input_pdf_file (string) – The source document filename.
output_xlsx_stream (Filter) – The resulting XLSX filter.
options (DataExtractionOptions, optional) – Data extraction options (optional).

Overload 3:

Perform data extraction on a PDF in XLSX output format.

Parameters:

input_pdf_file (string) – The source document filename.
output_xlsx_stream (Filter) – The resulting XLSX filter.
options – Data extraction options (optional).

static IsModuleAvailable(engine)[source]

Find out whether the specified data extraction engine is available (and licensed).

Parameters:: engine (int) – The extraction engine.
Return type:: boolean
Returns:: Returns true if data extraction operations can be performed.
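A minimal sketch of combining IsModuleAvailable() with the first ExtractData() overload. The helper name `extract_tables_as_json` and its `module` parameter are hypothetical: the DataExtractionModule class is passed in so that no import path is assumed, and the SDK is assumed to be initialized already. The e_Tabular constant is the enum value listed for this class.

```python
import json


def extract_tables_as_json(module, pdf_path):
    """Run the Tabular engine on pdf_path and return the parsed JSON.

    module is expected to be the DataExtractionModule class documented
    here (passed in as a parameter; an assumption of this sketch).
    """
    engine = module.e_Tabular
    # IsModuleAvailable reports whether the engine is present and licensed.
    if not module.IsModuleAvailable(engine):
        raise RuntimeError("Tabular data extraction engine is not available")
    # Overload 1 of ExtractData returns the results as a JSON string.
    return json.loads(module.ExtractData(pdf_path, engine))
```

Checking availability first gives a clearer error than letting the extraction call fail when the add-on module is missing.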

e_DocStructure = 2: Document structure engine. This engine discovers the full logical structure, including headers, footers, paragraphs, list items, table columns, cells, borders, images and graphics.

e_Form = 1: Form field extraction engine. This engine uses artificial intelligence and computer vision to detect form fields in documents that do not have any interactive field annotations embedded.

e_FormKeyValue = 3: Form field with key value extraction engine. This engine uses artificial intelligence and computer vision to detect form fields, including field name and values, in documents that do not have any interactive field annotations embedded.

e_GenericKeyValue = 4: Generic key value extraction engine. This engine uses artificial intelligence to detect arbitrary pairs of key and value in documents. Note: This engine is experimental and subject to change.

e_Tabular = 0: Tabular Data engine. This engine identifies column and row structure and analyzes numeric columns. It is especially suited to documents that are table-based such as spreadsheets.

property thisown: The membership flag

class apryse_sdk.DataExtractionOptions[source]

Bases: object

AddExclusionZonesForPage(value, page_num)[source]

Adds the value to the ExclusionZonesForPage array. Optional list of page areas to be excluded from analysis. Zones should be provided as a collection of Rects paired with a page number. The Rects are then applied to the corresponding page. Rects are specified in User Space coordinates. If this is set, the specified areas will not be analyzed. If neither this nor InclusionZonesForPage is set, the entire page will be analyzed. This option only affects the GenericKeyValue, FormKeyValue, and FormField engines.

Parameters:

value (RectCollection) – List of page areas to be excluded from analysis.
page_num (int) – The page number (1-indexed) to which the regions are applied.

Return type:

DataExtractionOptions

Returns:

This object, for call chaining.

AddInclusionZonesForPage(value, page_num)[source]

Adds the value to the InclusionZonesForPage array. Optional list of page areas to be included in analysis (to the exclusion of all other areas). Zones should be provided as a collection of Rects paired with a page number. The Rects are then applied to the corresponding page. Rects are specified in User Space coordinates. If this is set, only the areas specified will be analyzed. If neither this nor ExclusionZonesForPage is set, the entire page will be analyzed. This option only affects the GenericKeyValue, FormKeyValue, and FormField engines.

Parameters:

value (RectCollection) – List of page areas to be included in analysis.
page_num (int) – The page number (1-indexed) to which the regions are applied.

Return type:

DataExtractionOptions

Returns:

This object, for call chaining.

GetDeepLearningAssist()[source]

Gets the value DeepLearningAssist from the options object. Specifies if Deep Learning is used with table recognition in the DocStructure engine. The default is false. When true, table recognition accuracy improves at the cost of increased processing time. This only affects the DocStructure engine.

Return type:: boolean
Returns:: The current value for DeepLearningAssist.

GetFormExtractionEngine()[source]

Gets the value FormExtractionEngine from the options object. Specifies the form extraction engine used in DetectAndAddFormFieldsToPDF, either ‘Form’ or ‘FormKeyValue’. The default is ‘Form’.

Return type:: string
Returns:: The current value for FormExtractionEngine.

GetLanguage()[source]

Gets the value Language from the options object. Specifies the OCR language(s). Use 3-letter ISO 639-2 language codes, separated by spaces. Example: “eng deu spa fra”. The default is English.

Return type:: string
Returns:: The current value for Language.

GetOverlappingFormFieldBehavior()[source]

Gets the value OverlappingFormFieldBehavior from the options object. When a detected form field overlaps with an existing one, keep either the old field (value ‘KeepOld’), or the new one (value ‘KeepNew’, default).

Return type:: string
Returns:: The current value for OverlappingFormFieldBehavior.

GetPDFPassword()[source]

Gets the value PDFPassword from the options object. Specifies the password if the PDF requires one. The default is no password.

Return type:: string
Returns:: The current value for PDFPassword.

GetPages()[source]

Gets the value Pages from the options object. Specifies a range of pages to be converted, such as “1-5”. By default all pages are converted. The first page has the page number of 1.

Return type:: string
Returns:: The current value for Pages.

SetDeepLearningAssist(value)[source]

Sets the value for DeepLearningAssist in the options object. Specifies if Deep Learning is used with table recognition in the DocStructure engine. The default is false. When true, table recognition accuracy improves at the cost of increased processing time. This only affects the DocStructure engine.

Parameters:: value (boolean) – The new value for DeepLearningAssist.
Return type:: DataExtractionOptions
Returns:: This object, for call chaining.

SetFormExtractionEngine(value)[source]

Sets the value for FormExtractionEngine in the options object. Specifies the form extraction engine used in DetectAndAddFormFieldsToPDF, either ‘Form’ or ‘FormKeyValue’. The default is ‘Form’.

Parameters:: value (string) – The new value for FormExtractionEngine.
Return type:: DataExtractionOptions
Returns:: This object, for call chaining.

SetLanguage(value)[source]

Sets the value for Language in the options object. Specifies the OCR language(s). Use 3-letter ISO 639-2 language codes, separated by spaces. Example: “eng deu spa fra”. The default is English.

Parameters:: value (string) – The new value for Language.
Return type:: DataExtractionOptions
Returns:: This object, for call chaining.

SetOverlappingFormFieldBehavior(value)[source]

Sets the value for OverlappingFormFieldBehavior in the options object. When a detected form field overlaps with an existing one, keep either the old field (value ‘KeepOld’), or the new one (value ‘KeepNew’, default).

Parameters:: value (string) – The new value for OverlappingFormFieldBehavior.
Return type:: DataExtractionOptions
Returns:: This object, for call chaining.

SetPDFPassword(value)[source]

Sets the value for PDFPassword in the options object. Specifies the password if the PDF requires one. The default is no password.

Parameters:: value (string) – The new value for PDFPassword.
Return type:: DataExtractionOptions
Returns:: This object, for call chaining.

SetPages(value)[source]

Sets the value for Pages in the options object. Specifies a range of pages to be converted, such as “1-5”. By default all pages are converted. The first page has the page number of 1.

Parameters:: value (string) – The new value for Pages.
Return type:: DataExtractionOptions
Returns:: This object, for call chaining.
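Because each setter returns the options object, several settings can be applied in one chained expression. The following is a minimal sketch; the helper name `configure_extraction` and its default values are illustrative, and `options` is assumed to be a DataExtractionOptions instance created by the caller.

```python
def configure_extraction(options, pages="1-5", languages=("eng", "deu")):
    """Apply a page range, OCR languages, and deep-learning assist.

    options is expected to be a DataExtractionOptions instance.
    """
    # Language codes are 3-letter ISO 639-2 codes separated by spaces.
    language_value = " ".join(languages)
    # Each setter returns the options object, so the calls can be chained.
    return (
        options
        .SetPages(pages)
        .SetLanguage(language_value)
        .SetDeepLearningAssist(True)
    )
```

Note that SetDeepLearningAssist only affects the DocStructure engine, so it has no effect when another engine is used.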

property thisown: The membership flag

class apryse_sdk.Date(args)[source]

Bases: TRN_date

The Date class is a utility class used to simplify work with PDF date objects.

PDF defines a standard date format, which closely follows the international standard ASN.1 (Abstract Syntax Notation One). A date is a string of the form (D:YYYYMMDDHHmmSSOHH’mm’). See the PDF Reference Manual for details.

A Date can be associated with an SDF/Cos date string using the Date(Obj) constructor, or later using the Date::Attach(Obj) or Date::Update(Obj) methods.

Date keeps a local date/time cache, so it is necessary to call the Date::Update() method if changes to the Date should be saved in the attached Cos/SDF string.
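To illustrate the date string layout described above, here is a small standard-library sketch that parses the text between the parentheses. It is not part of the SDK, and the function name `parse_pdf_date` is hypothetical. It assumes the full-length form with all fields through seconds present (PDF also permits shorter forms, which this sketch rejects) and ASCII apostrophes in the offset.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS followed by an optional UT relationship:
# either Z, or a sign with HH'mm' (the final apostrophe is optional here).
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'(\d{2})'?)?$"
)


def parse_pdf_date(text):
    """Parse a full-length PDF date string into a datetime."""
    match = _PDF_DATE.match(text)
    if match is None:
        raise ValueError(f"not a full-length PDF date string: {text!r}")
    groups = match.groups()
    year, month, day, hour, minute, second = (int(g) for g in groups[:6])
    zulu, sign, off_hours, off_minutes = groups[6:]
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_hours), minutes=int(off_minutes))
        tz = timezone(offset if sign == "+" else -offset)
    else:
        tz = None  # no relationship to Universal Time was given
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)
```

The O, HH, and mm offset fields correspond to the values handled by SetUT(), SetUTHour(), and SetUTMinutes().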

Attach(d)[source]

Attach the Cos/SDF object to the Date.

Parameters:

d (Obj) – underlying Cos/SDF object. Must be an SDF::Str containing a PDF date object.

GetDay()[source]

GetHour()[source]

GetMinute()[source]

GetMonth()[source]

GetSecond()[source]

GetUT()[source]

GetUTHour()[source]

GetUTMin()[source]

GetYear()[source]

Return type:: int
Returns:: The year.

IsValid()[source]

Indicates whether the Date is valid (non-null).

Return type:: boolean
Returns:: True if this is a valid (non-null) Date; otherwise false.

Notes: If this method returns false the underlying SDF/Cos object is null and the Date object should be treated as null as well.

SetCurrentTime()[source]: Sets the date object to the current date and time. The method also updates the associated SDF object.

SetUT(ut)[source]

Set the relationship of local time to Universal Time (UT), denoted by one of the characters +, -, or Z.

Parameters:: ut (char) – the relationship of local time to Universal Time (UT): one of the characters +, -, or Z

SetUTHour(ut_hour)[source]

Set the absolute value of the offset from UT in hours(00-23)

Parameters:: ut_hour (Int8) – the absolute value of the offset from UT in hours(00-23)

SetUTMinutes(ut_minutes)[source]

Set the absolute value of the offset from UT in minutes(00-59)

Parameters:: ut_minutes (Int8) – the absolute value of the offset from UT in minutes(00-59)

Update(d=0)[source]

Saves changes made to the Date object in the attached (or specified) SDF/Cos string.

Parameters:

d (Obj, optional) – an optional parameter indicating an SDF string that should be updated and attached to this Date. If parameter d is NULL or is omitted, the update is performed on the previously attached Cos/SDF date.

Return type:

boolean

Returns:

true if the attached Cos/SDF string was successfully updated, false otherwise.

property thisown: The membership flag

class apryse_sdk.Destination(args)[source]

Bases: object

A destination defines a particular view of a document, consisting of the following:

The page of the document to be displayed
The location of the document window on that page
The magnification (zoom) factor to use when displaying the page

Destinations may be associated with Bookmarks, Annotations, and Remote Go-To Actions.

Destination is a utility class used to simplify work with PDF Destinations. Please refer to section 8.2.1 ‘Destinations’ in the PDF Reference Manual for details.

static CreateFit(page)[source]

Create a new ‘Fit’ Destination.

The new Destination displays the page designated by ‘page’, with its contents magnified just enough to fit the entire page within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, use the smaller of the two, centering the page within the window in the other dimension.

Parameters:: page (Page) – Page object to display
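The magnification rule described for ‘Fit’ can be illustrated with plain arithmetic. The sketch below is not SDK code: `fit_zoom` is a hypothetical helper showing how a viewer might derive the zoom factor and centering offsets, assuming page and window sizes are given in the same units.

```python
def fit_zoom(page_width, page_height, window_width, window_height):
    """Return (zoom, x_offset, y_offset) for a 'Fit'-style view.

    The offsets are the margins, in window units, left on each side
    once the scaled page is centered in the window.
    """
    horizontal = window_width / page_width
    vertical = window_height / page_height
    # Use the smaller factor so the whole page fits in both directions.
    zoom = min(horizontal, vertical)
    # Center the page in whichever dimension has space left over.
    x_offset = (window_width - page_width * zoom) / 2
    y_offset = (window_height - page_height * zoom) / 2
    return zoom, x_offset, y_offset
```

For a 612 x 792 page in a 1224 x 792 window, the vertical factor (1.0) is the smaller one, so the page is shown at 100% and centered horizontally.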

static CreateFitB(page)[source]

Create a new ‘FitB’ Destination.

The new Destination displays the page designated by ‘page’, with its contents magnified just enough to fit its bounding box entirely within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, use the smaller of the two, centering the bounding box within the window in the other dimension.

Parameters:: page (Page) – Page object to display

static CreateFitBH(page, top)[source]

Create a new ‘FitBH’ Destination.

The new Destination displays the page designated by ‘page’, with the vertical coordinate ‘top’ positioned at the top edge of the window and the contents of the page magnified just enough to fit the entire width of its bounding box within the window.

Parameters:

page (Page) – Page object to display
top (double) – vertical coordinate of the top edge of the window

static CreateFitBV(page, left)[source]

Create a new ‘FitBV’ Destination.

The new Destination displays the page designated by ‘page’, with the horizontal coordinate ‘left’ positioned at the left edge of the window and the contents of the page magnified just enough to fit the entire height of its bounding box within the window.

Parameters:

page (Page) – Page object to display
left (double) – horizontal coordinate of the left edge of the window

static CreateFitH(page, top)[source]

Create a new ‘FitH’ Destination.

The new Destination displays the page designated by ‘page’, with the vertical coordinate ‘top’ positioned at the top edge of the window and the contents of the page magnified just enough to fit the entire width of the page within the window.

Parameters:

page (Page) – Page object to display
top (double) – vertical coordinate of the top edge of the window

static CreateFitR(page, left, bottom, right, top)[source]

Create a new ‘FitR’ Destination.

The new Destination displays the page designated by ‘page’, with its contents magnified just enough to fit the rectangle specified by the coordinates ‘left’, ‘bottom’, ‘right’, and ‘top’ entirely within the window both horizontally and vertically. If the required horizontal and vertical magnification factors are different, use the smaller of the two, centering the rectangle within the window in the other dimension.

Parameters:

page (Page) – Page object to display
left (double) – horizontal coordinate of the left edge of the window
bottom (double) – vertical coordinate of the bottom edge of the window
right (double) – horizontal coordinate of the right edge of the window
top (double) – vertical coordinate of the top edge of the window

static CreateFitV(page, left)[source]

Create a new ‘FitV’ Destination.

The new Destination displays the page designated by ‘page’, with the horizontal coordinate ‘left’ positioned at the left edge of the window and the contents of the page magnified just enough to fit the entire height of the page within the window.

Parameters:

page (Page) – Page object to display
left (double) – horizontal coordinate of the left edge of the window

static CreateXYZ(page, left, top, zoom)[source]

Create a new ‘XYZ’ Destination.

The new Destination displays the page designated by ‘page’, with the coordinates (‘left’, ‘top’) positioned at the top-left corner of the window and the contents of the page magnified by the factor ‘zoom’. A null value for any of the parameters ‘left’, ‘top’, or ‘zoom’ specifies that the current value of that parameter is to be retained unchanged. A ‘zoom’ value of 0 has the same meaning as a null value.

Parameters:

page (Page) – Page object to display
left (double) – horizontal coordinate of the left edge of the window
top (double) – vertical coordinate of the top edge of the window
zoom (double) – amount to zoom the page by

GetExplicitDestObj()[source]

Return type:: Obj
Returns:: the explicit destination SDF/Cos object. This is always an Array as shown in Table 8.2 in PDF Reference Manual.
Raises:: An Exception is thrown if this is not a valid Destination.

GetFitType()[source]

Return type:: int
Returns:: destination’s FitType.
Raises:: An Exception is thrown if this is not a valid Destination.

GetPage()[source]

Return type:: Page
Returns:: the Page that this destination refers to.
Raises:: An Exception is thrown if this is not a valid Destination.

GetSDFObj()[source]

Return type:: Obj
Returns:: the underlying SDF/Cos object. The returned SDF/Cos object is an explicit destination (i.e. the Obj is either an array defining the destination, using the syntax shown in Table 8.2 in PDF Reference Manual, or a dictionary with a ‘D’ entry whose value is such an array). The latter form allows additional attributes to be associated with the destination.

IsValid()[source]

Return type:: boolean
Returns:: True if this is a valid Destination and can be resolved, false otherwise.

Notes: If this method returns false the underlying SDF/Cos object is null and the Destination object should be treated as null as well.

SetPage(page)[source]

Modify the destination so that it refers to the new ‘page’ as the destination page.

Parameters:: page (Page) – The new page associated with this Destination.
Raises:: An Exception is thrown if this is not a valid Destination.

e_Fit = 1

e_FitB = 5

e_FitBH = 6

e_FitBV = 7

e_FitH = 2

e_FitR = 4

e_FitV = 3

e_XYZ = 0
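A minimal sketch of turning the integer returned by GetFitType() into a readable name, using the enum values listed above. The helper name `describe_fit_type` is hypothetical. Since GetFitType() is documented to raise for an invalid Destination, the sketch checks IsValid() first.

```python
# Enum values as listed for Destination in this reference.
FIT_TYPE_NAMES = {
    0: "XYZ",
    1: "Fit",
    2: "FitH",
    3: "FitV",
    4: "FitR",
    5: "FitB",
    6: "FitBH",
    7: "FitBV",
}


def describe_fit_type(destination):
    """Return the fit type name of a Destination, or a fallback label."""
    # GetFitType raises for an invalid Destination, so guard the call.
    if not destination.IsValid():
        return "invalid"
    return FIT_TYPE_NAMES.get(destination.GetFitType(), "unknown")
```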

property mp_dest

property thisown: The membership flag

class apryse_sdk.DictIterator(args)[source]

Bases: object

DictIterator is used to traverse key/value pairs in a dictionary. For example a DictIterator can be used to print out all the entries in a given Obj dictionary as follows:

DictIterator itr = dict.GetDictIterator();
while (itr.HasCurrent()) {
    // Print the name of the current key.
    Obj key = itr.Key();
    cout << key.GetName() << endl;
    Obj value = itr.Value();
    // ...
    // Advance to the next key/value pair.
    itr.Next();
}
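The snippet above is written in C++. A Python rendering of the same loop might look like the following sketch, which wraps the traversal in a generator. The helper name `iter_dict_entries` is hypothetical, and `dict_obj` is assumed to be a dictionary Obj exposing GetDictIterator() as in the C++ snippet.

```python
def iter_dict_entries(dict_obj):
    """Yield (key_name, value) pairs from an SDF/Cos dictionary Obj."""
    itr = dict_obj.GetDictIterator()
    # HasCurrent() is preferred over the deprecated HasNext().
    while itr.HasCurrent():
        yield itr.Key().GetName(), itr.Value()
        itr.Next()
```

A caller can then write `for name, value in iter_dict_entries(some_dict): ...` instead of driving the iterator by hand.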

Destroy()[source]: Frees the native memory of the object.

HasCurrent()[source]

Return type:: boolean
Returns:: true if the current iterator is not the end of the collection

HasNext()[source]

Return type:: boolean
Returns:: true if the current iterator is not the end of the collection. Deprecated: prefer HasCurrent().

Key()[source]

Return type:: Obj
Returns:: the key of the current dictionary entry.

Next()[source]: Advances the iterator to the next element of the collection.

Value()[source]

Return type:: Obj
Returns:: the value of the current dictionary entry.

property mp_impl

property thisown: The membership flag

class apryse_sdk.DiffOptions[source]

Bases: object

GetAddGroupAnnots()[source]

Gets the value AddGroupAnnots from the options object. Whether we should add an annot layer indicating the difference regions.

Return type:: boolean
Returns:: a bool, the current value for AddGroupAnnots.

GetBlendMode()[source]

Gets the value BlendMode from the options object. How the two colors should be blended.

Return type:: GState::BlendMode
Returns:: a GState::BlendMode, the current value for BlendMode.

GetColorA()[source]

Gets the value ColorA from the options object. The difference color for the first page.

Return type:: ColorPt
Returns:: a ColorPt, the current value for ColorA.

GetColorB()[source]

Gets the value ColorB from the options object. The difference color for the second page.

Return type:: ColorPt
Returns:: a ColorPt, the current value for ColorB.

GetInternalObj()[source]

SetAddGroupAnnots(value)[source]

Sets the value for AddGroupAnnots in the options object. Whether we should add an annot layer indicating the difference regions.

Parameters:: value: – the new value for AddGroupAnnots
Return type:: DiffOptions
Returns:: this object, for call chaining

SetBlendMode(value)[source]

Sets the value for BlendMode in the options object. How the two colors should be blended.

Parameters:: value: – the new value for BlendMode
Return type:: DiffOptions
Returns:: this object, for call chaining

SetColorA(value)[source]

Sets the value for ColorA in the options object. The difference color for the first page.

Parameters:: value: – the new value for ColorA
Return type:: DiffOptions
Returns:: this object, for call chaining

SetColorB(value)[source]

Sets the value for ColorB in the options object. The difference color for the second page.

Parameters:: value: – the new value for ColorB
Return type:: DiffOptions
Returns:: this object, for call chaining

property thisown: The membership flag

class apryse_sdk.DigestAlgorithm[source]

Bases: object

static CalculateDigest(in_digest_algorithm_type, in_message_buf)[source]

Calculates a digest of arbitrary data. Useful during CMS generation custom signing workflows for digesting signedAttributes before sending off for CMS signatureValue generation (e.g. by HSM device or cloud signing platform).

Parameters:

in_digest_algorithm_type (int) – the digest algorithm to use
in_message_buf (std::vector< UChar,std::allocator< UChar > >) – the message to digest

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

a container of bytes corresponding to the digest of the message
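To show what "a digest of arbitrary data" means in practice, here is a stand-in sketch built on Python's standard hashlib module rather than on the SDK. The function name `digest_message` and the module-level constants are hypothetical; the integer values mirror the DigestAlgorithm enum listed for this class. e_RIPEMD160 is left out because hashlib support for it depends on the local OpenSSL build.

```python
import hashlib

# Integer values mirror the DigestAlgorithm enum in this reference.
E_SHA1, E_SHA256, E_SHA384, E_SHA512 = 0, 1, 2, 3

_HASHLIB_NAMES = {
    E_SHA1: "sha1",
    E_SHA256: "sha256",
    E_SHA384: "sha384",
    E_SHA512: "sha512",
}


def digest_message(algorithm_type, message):
    """Return the digest bytes of message (stand-in, not the SDK call)."""
    try:
        name = _HASHLIB_NAMES[algorithm_type]
    except KeyError:
        raise ValueError(f"unsupported digest algorithm: {algorithm_type}") from None
    return hashlib.new(name, message).digest()
```

In a custom signing workflow, the bytes produced by the SDK's CalculateDigest() play the same role: they are what gets handed to the HSM or cloud signing platform.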

static SignDigest(args)[source]

Overload 1:

Sign the digest of arbitrary data with the private key in the provided PKCS #12 key file (.pfx). This function is part of the low-level custom signing API, and works with GenerateESSSigningCertPAdESAttribute, GenerateCMSSignedAttributes, and GenerateCMSSignature.

Parameters:

digest_buf (std::vector< UChar,std::allocator< UChar > >) – The digest to sign.
digest_algorithm_type (int) – The digest algorithm used to generate the digest.
pkcs12_keyfile_path (string) – The path to the PKCS #12 keyfile (usually has a .pfx extension) to use for signing.
pkcs12_password (string) – The password to use to parse the PKCS #12 keyfile.

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

the DER-serialized bytes of the signature.

Overload 2:

Sign the digest of arbitrary data with the private key in the provided PKCS #12 key buffer (as usually stored in .pfx files). This function is part of the low-level custom signing API, and works with GenerateESSSigningCertPAdESAttribute, GenerateCMSSignedAttributes, and GenerateCMSSignature.

Parameters:

digest_buf (std::vector< UChar,std::allocator< UChar > >) – The digest to sign.
digest_algorithm_type (int) – The digest algorithm used to generate the digest.
pkcs12_buf (std::vector< UChar,std::allocator< UChar > >) – The buffer containing the PKCS #12 key (as usually stored in .pfx files) to use for signing.
pkcs12_password (string) – The password to use to parse the PKCS #12 key.

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

the DER-serialized bytes of the signature.

e_RIPEMD160 = 4

e_SHA1 = 0

e_SHA256 = 1

e_SHA384 = 2

e_SHA512 = 3

e_unknown_digest_algorithm = 5

property thisown: The membership flag

class apryse_sdk.DigitalSignatureField(args)[source]

Bases: object

A class representing a digital signature form field.

CalculateDigest(args)[source]

Calculates the digest of the relevant bytes of the document for this signature field, in order to allow the caller to perform custom signing/processing. The signature field must first be prepared using one of the non-sign overloads (CreateSigDictForCustomSigning/Certification), and then the document must be saved; after that, this function can be called. The ByteRanges that the most recent save has entered into the signature dictionary within this signature field will be used to calculate the digest.

Parameters:: in_digest_algorithm_type (int, optional) – the enumerated type of digest algorithm to use for the calculation. The default is SHA-256.
Return type:: std::vector< UChar,std::allocator< UChar > >
Returns:: an array of bytes containing the digest value
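
To make the idea of a digest over ByteRanges concrete, here is a minimal standard-library sketch that hashes only selected (offset, length) spans of a byte string. It is not the SDK's implementation: the function name digest_over_byte_ranges, the toy data, and the use of plain tuples in place of ByteRange objects are all assumptions for illustration.

```python
import hashlib
from typing import Iterable, Tuple


def digest_over_byte_ranges(
    data: bytes,
    byte_ranges: Iterable[Tuple[int, int]],
    algorithm: str = "sha256",
) -> bytes:
    """Hash only the (offset, length) spans of `data`, in the order given."""
    hasher = hashlib.new(algorithm)
    for offset, length in byte_ranges:
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError(f"byte range ({offset}, {length}) is outside the data")
        hasher.update(data[offset:offset + length])
    return hasher.digest()


# Toy stand-in for a saved file: the middle span plays the role of the
# reserved Contents value, which is excluded from the digest.
saved = b"HEAD<00000000>TAIL"
ranges = [(0, 4), (14, 4)]
print(digest_over_byte_ranges(saved, ranges).hex())
```

Because the reserved Contents span is skipped, filling it in later with the real signature does not change the digest that was signed.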

CertifyOnNextSave(args)[source]

Overload 1:

Must be called to prepare a signature for certification, which is done afterwards by calling Save. Throws if document already certified. Default document permission level is e_annotating_formfilling_signing_allowed. Throws if signature field already has a digital signature dictionary.

Parameters:

in_pkcs12_keyfile_path (string) – The path to the PKCS #12 private keyfile to use to certify this digital signature.
in_password (string) – The password to use to parse the PKCS #12 keyfile.

Overload 2:

Must be called to prepare a signature for certification, which is done afterwards by calling Save. Throws if document already certified. Default document permission level is e_annotating_formfilling_signing_allowed. Throws if signature field already has a digital signature dictionary.

Parameters:

in_pkcs12_buffer (UChar) – A buffer of bytes containing the PKCS #12 private key certificate store to use to certify this digital signature.
in_buf_size (int) – The size of the buffer.
in_password (string) – The password to use to parse the PKCS #12 buffer.

CertifyOnNextSaveWithCustomHandler(in_signature_handler_id)[source]

Must be called to prepare a signature for certification, which is done afterwards by calling Save. Throws if document already certified. Default document permission level is e_annotating_formfilling_signing_allowed. Throws if signature field already has a digital signature dictionary.

Parameters:: in_signature_handler_id (int) – The unique id of the signature handler to use to certify this digital signature.

ClearSignature()[source]: Clears cryptographic signature, if present. Otherwise, does nothing. Do not need to call HasCryptographicSignature before calling this. After clearing, other signatures should still pass validation if saving after clearing was done incrementally. Clears the appearance as well.

CreateSigDictForCustomCertification(in_filter_name, in_subfilter_type, in_contents_size_to_reserve)[source]

Prepares the field for certification without actually performing certification.

Useful for custom signing workflows. It is not necessary to call HasCryptographicSignature before calling this function.

Parameters:

in_filter_name (string) – the Filter name to use, representing the name of the signature handler that will be used to sign and verify the signature (e.g. Adobe.PPKLite)
in_subfilter_type (int) – the SubFilter name to use, representing an interoperable signature type identifier for third-party verification (e.g. adbe.pkcs7.detached, ETSI.CAdES.detached, etc.)
in_contents_size_to_reserve (int) – The size of the empty Contents entry to create. For security reasons, set the contents size to a value greater than but as close as possible to the size you expect your final signature to be.

CreateSigDictForCustomSigning(in_filter_name, in_subfilter_type, in_contents_size_to_reserve)[source]

Prepares the field for approval signing without actually performing signing.

Useful for custom signing workflows. It is not necessary to call HasCryptographicSignature before calling this function.

Parameters:

in_filter_name (string) – the Filter name to use, representing the name of the signature handler that will be used to sign and verify the signature (e.g. Adobe.PPKLite)
in_subfilter_type (int) – the SubFilter name to use, representing an interoperable signature type identifier for third-party verification (e.g. adbe.pkcs7.detached, ETSI.CAdES.detached, etc.)
in_contents_size_to_reserve (int) – The size of the empty Contents entry to create. For security reasons, set the contents size to a value greater than but as close as possible to the size you expect your final signature to be.

EnableLTVOfflineVerification(in_verification_result)[source]

Given a successful verification result that required online information to verify trust (trust verification must have been enabled and successful during the verification), embeds data into the PDF document that allows the signature to be verified offline. (This is accomplished using DSS and VRI dictionaries.) When this operation completes successfully, one of the two components of secure long term validation (LTV) will be in place. The other necessary component is to timestamp the document appropriately while the signature is still verifiable, so as to maintain a chain of unexpired secure timestamps attesting to the integrity of the document. The signature should then remain verifiable despite any certificate expiry, algorithm compromise, or key compromise that would otherwise have rendered it invalid had it been verified against a future time rather than a securely-signed, timestamp-derived time nearer the time of signing (at which the signature was verifiable without extra data). This function, if given a good verification result, can also make timestamp (DocTimeStamp ETSI.RFC3161) signatures LTV-enabled. Doing so is a necessary first step when you intend to add another timestamp around an already-timestamped document to extend or enhance its verifiability (as described above), as per the PDF 2.0 and ETSI TS 102 778-4 (PAdES Level 4) specifications.

Parameters:: in_verification_result (VerificationResult) – a successful verification result containing a successful TrustVerificationResult
Return type:: boolean
Returns:: a boolean status that reflects whether offline verification information was added successfully

Notes: It is necessary to save the document incrementally after this function completes successfully in order to actually write the LTV data into the document.

static GenerateCMSSignature(args)[source]

Overload 1:

Low-level function belonging to custom-signing APIs. Using low-level inputs that permit incorporation of remote key usage (cloud keystore, Hardware Security Module (HSM) device, etc.), generates bytes representing a Cryptographic Message Syntax (CMS)-format signature encoded in DER. The resulting data can be passed to SaveCustomSignature.

Parameters:

in_signer_cert (X509Certificate) – the X509 public-key certificate of the signature’s signer (mathematically associated with private key used)
in_chain_certs_list (std::vector< Crypto::X509Certificate,std::allocator< Crypto::X509Certificate > >) – the intermediate and root certificates to include in the CMS to allow verifiers to establish the chain/path of trust
in_digest_algorithm_oid (ObjectIdentifier) – the OID of the digest algorithm used, for embedding in the CMS
in_signature_algorithm_oid (ObjectIdentifier) – the OID of the signature algorithm used, for embedding in the CMS
in_signature_value_buf (std::vector< UChar,std::allocator< UChar > >) – a buffer containing the signature value to embed in the CMS
in_signedattributes_buf (std::vector< UChar,std::allocator< UChar > >) – a buffer containing signedAttributes for embedding into the CMS (must exactly match those used when creating signature value)

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

finished CMS data for embedding into the document using SaveCustomSignature

Overload 2:

Low-level function belonging to custom-signing APIs. Using low-level inputs that permit incorporation of remote key usage (cloud keystore, Hardware Security Module (HSM) device, etc.), generates bytes representing a Cryptographic Message Syntax (CMS)-format signature encoded in DER. The resulting data can be passed to SaveCustomSignature.

Parameters:

signer_cert (X509Certificate) – The X509 public-key certificate of the signature’s signer (mathematically associated with private key used).
chain_certs_list (std::vector< Crypto::X509Certificate,std::allocator< Crypto::X509Certificate > >) – The intermediate and root certificates to include in the CMS to allow verifiers to establish the chain/path of trust.
digest_algorithm_id (AlgorithmIdentifier) – The digest algorithm used, for embedding in the CMS.
signature_algorithm_id (AlgorithmIdentifier) – The signature algorithm used, for embedding in the CMS.
signature_value_buf (std::vector< UChar,std::allocator< UChar > >) – A buffer containing the signature value to embed in the CMS.
signedattributes_buf (std::vector< UChar,std::allocator< UChar > >) – A buffer containing signedAttributes for embedding into the CMS (must exactly match those used when creating signature value).
cms_options (CMSSignatureOptions, optional) – Optional extra data to store in the CMS.

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

The finished CMS data for embedding into the document using SaveCustomSignature.

Overload 3:

Low-level function belonging to custom-signing APIs. Using low-level inputs that permit incorporation of remote key usage (cloud keystore, Hardware Security Module (HSM) device, etc.), generates bytes representing a Cryptographic Message Syntax (CMS)-format signature encoded in DER. The resulting data can be passed to SaveCustomSignature.

Parameters:

signer_cert (X509Certificate) – The X509 public-key certificate of the signature’s signer (mathematically associated with private key used).
chain_certs_list (std::vector< Crypto::X509Certificate,std::allocator< Crypto::X509Certificate > >) – The intermediate and root certificates to include in the CMS to allow verifiers to establish the chain/path of trust.
digest_algorithm_id (AlgorithmIdentifier) – The digest algorithm used, for embedding in the CMS.
signature_algorithm_id (AlgorithmIdentifier) – The signature algorithm used, for embedding in the CMS.
signature_value_buf (std::vector< UChar,std::allocator< UChar > >) – A buffer containing the signature value to embed in the CMS.
signedattributes_buf (std::vector< UChar,std::allocator< UChar > >) – A buffer containing signedAttributes for embedding into the CMS (must exactly match those used when creating signature value).
cms_options – Optional extra data to store in the CMS.

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

The finished CMS data for embedding into the document using SaveCustomSignature.

static GenerateCMSSignedAttributes(args)[source]

Low-level function belonging to custom-signing APIs. Creates the signedAttributes component of Cryptographic Message Syntax (CMS). The result of this function can then be encrypted by a remote private key (cloud service, Hardware Security Module (HSM) device, etc.), using some external API that returns the bytes of a not-already-CMS-embedded signature value (e.g. RSA PKCS #1 v1.5 format). Following that, CMS generation can be performed using GenerateCMSSignature, after which the resulting signature can be inserted into a resulting signed version of the PDF document using the PDFDoc function SaveCustomSignature.

Parameters:

in_digest_buf (std::vector< UChar,std::allocator< UChar > >) – a buffer containing the digest of the document within ByteRanges of this DigitalSignatureField (see CalculateDigest)
in_custom_signedattributes_buf (std::vector< UChar,std::allocator< UChar > >, optional) – a buffer containing any optional custom BER-encoded signedAttributes to add, including potentially the PAdES one (see GenerateESSSigningCertPAdESAttribute). (Do not place an ASN.1 constructed type around all of the attributes.) Do not pass any of the normal attributes (content type or message digest) as custom attributes because otherwise they will be duplicated.

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

the BER-encoded bytes of the future signedAttrs component of a CMS signature, with no surrounding constructed type

GenerateContentsWithEmbeddedTimestamp(in_timestamping_config, in_timestamp_response_verification_options)[source]

Contacts a remote timestamp authority over network, sends CMS digest, receives and verifies timestamp token, combines the timestamp token and the data of an existing CMS-type (adbe.pkcs7.detached or ETSI.CAdES.detached subfilter) main document signature, and then returns that data to the user. At least one signing time, whether “M” (see SetSigDictTimeOfSigning) or a secure embedded timestamp, is required to be added in order to create a PAdES signature.

Notes: This function does not insert the final CMS-type document signature into the document. You must retrieve it from the result using GetData and then pass that to PDFDoc SaveCustomSignature.

Parameters:

in_timestamping_config (TimestampingConfiguration) – Configuration options to store for timestamping. These will include various items related to contacting a timestamping authority. Incorrect configuration will result in an exception being thrown. The usability of a combination of a TimestampingConfiguration and VerificationOptions can be checked ahead of time to prevent exceptions by calling TestConfiguration on TimestampingConfiguration and passing VerificationOptions.
in_timestamp_response_verification_options (VerificationOptions) – Options for the timestamp response verification step (which is required by RFC 3161 to be done as part of timestamping). These response verification options should include the root certificate of the timestamp authority, so that the trust status of the timestamp signature can be verified. The options that should be passed are the same ones that one expects the timestamp to be verifiable with in the future (once it is embedded in the document), except that the response verification requires online revocation information whereas the later verification may not (depending on whether LTV offline verification information for the embedded timestamp gets embedded into the document by that time). The timestamp response verification step makes sure (a) that the timestamp response has a success status, which is the only time that this is verified in the entire workflow, and which prevents embedding an unsuccessful response; (b) that it digests the main signature digest correctly and is otherwise generally verifiable; and (c) that the nonce is correct (which is the only time that this is verifiable in the entire workflow) to prevent replay attacks (unless the TimestampingConfiguration requested that the nonce mechanism be disabled).

Return type:

TimestampingResult

Returns:

The result of the timestamp request, including the final document signature as DER-encoded CMS with a timestamp embedded

static GenerateESSSigningCertPAdESAttribute(in_signer_cert, in_digest_algorithm_type)[source]

Low-level optional function belonging to custom-signing APIs allowing creation of PAdES signatures with key elsewhere, allowing CMS to be generated automatically later. Represents one of the components of the functionality of SignDigest which are not key-related. Creates the necessary attribute for a PAdES signature (ETSI.CAdES.detached subfilter type). The result of this function can be passed as a contiguous part of the custom attributes buffer parameter of GenerateCMSSignedAttributes. At least one signing time, whether “M” (see SetSigDictTimeOfSigning) or a secure embedded timestamp (see GenerateContentsWithEmbeddedTimestamp), is also required to be added in order to create a PAdES signature.

The result will be either the BER-serialized bytes of an ESS_signing_cert or ESS_signing_cert_V2 CMS Attribute (an ASN.1 SEQUENCE containing the correct OID and ESSCertID or ESSCertIDv2), as is appropriate, depending on what digest algorithm type is provided (see RFC 5035).

Parameters:

in_signer_cert (X509Certificate) – the X509 public-key certificate of the signature’s signer (mathematically associated with private key to be used)
in_digest_algorithm_type (int) – the digest algorithm to be used

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

the BER-serialized bytes of an ESS_signing_cert or ESS_signing_cert_V2 CMS attribute
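
The choice between the two attribute versions can be pictured with the standard-library sketch below. It is an assumption, based on RFC 5035 rather than on anything this reference states about the SDK's internals, that SHA-1 maps to the original signing-certificate attribute and every other digest maps to the V2 attribute. The function name ess_attribute_plan and the placeholder certificate bytes are hypothetical, and the sketch only picks the OID and hashes the certificate; it does not produce the BER encoding that the SDK function returns.

```python
import hashlib
from typing import Tuple

# Attribute OIDs from the ESS specifications.
OID_SIGNING_CERTIFICATE = "1.2.840.113549.1.9.16.2.12"     # id-aa-signingCertificate
OID_SIGNING_CERTIFICATE_V2 = "1.2.840.113549.1.9.16.2.47"  # id-aa-signingCertificateV2


def ess_attribute_plan(cert_der: bytes, hash_name: str) -> Tuple[str, bytes]:
    """Return (attribute OID, hash of the whole DER certificate)."""
    name = hash_name.lower()
    cert_hash = hashlib.new(name, cert_der).digest()
    # Assumption: SHA-1 selects the original attribute, anything else the V2 one.
    if name == "sha1":
        return OID_SIGNING_CERTIFICATE, cert_hash
    return OID_SIGNING_CERTIFICATE_V2, cert_hash


fake_cert = b"placeholder bytes standing in for a DER-encoded certificate"
oid, cert_hash = ess_attribute_plan(fake_cert, "sha256")
print(oid, len(cert_hash))
```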

GetByteRanges()[source]

Retrieves the ranges of byte indices within the document over which this signature is intended to apply/be verifiable.

Return type:: std::vector< Common::ByteRange,std::allocator< Common::ByteRange > >
Returns:: a container of byte range objects

Notes: This function does not verify that the signature is valid over its byte ranges. It merely returns them. This can be useful when a document consists of multiple incremental revisions, the latter of which may or may not have been signed, for telling which revisions were actually signed by which signature. The outputs of this function can also be used to truncate the document at the end of a signed byte range, in order that the signed document revision may be retrieved from a document with later incremental revisions. Of course, to be certain that the signature is valid, it must also then be verified using the verification API. Also, the caller is responsible for making sure that the byte ranges returned from this function actually make sense (i.e. fit inside the document).
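
The truncation and sanity check described in the note can be sketched with plain Python as follows. The byte ranges are represented here as (offset, length) tuples, which is an assumption: this section does not show the accessors of the ByteRange objects, so converting them to tuples is left to the caller. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def check_byte_ranges(data_len: int, byte_ranges: Sequence[Tuple[int, int]]) -> None:
    """Raise ValueError unless every (offset, length) pair fits inside the file."""
    for offset, length in byte_ranges:
        if offset < 0 or length < 0 or offset + length > data_len:
            raise ValueError(f"byte range ({offset}, {length}) does not fit the file")


def signed_revision(data: bytes, byte_ranges: Sequence[Tuple[int, int]]) -> bytes:
    """Return the file truncated at the end of the last signed byte range."""
    if not byte_ranges:
        raise ValueError("no byte ranges to truncate at")
    check_byte_ranges(len(data), byte_ranges)
    end = max(offset + length for offset, length in byte_ranges)
    return data[:end]


# Toy file: a signed first revision followed by a later incremental update.
doc = b"REV1<sig>REV1-END" + b"REV2-INCREMENT"
ranges = [(0, 4), (9, 8)]
print(signed_revision(doc, ranges))
```

The truncated bytes are only the revision the signature claims to cover; they still need to be verified with the verification API.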

GetCert(in_index)[source]

Gets a certificate in the certificate chain (Cert entry) of the digital signature dictionary by index. Throws if Cert is not Array or String, throws if index is out of range and Cert is Array, throws if index is > 1 and Cert is string, otherwise retrieves the certificate. Only to be used for old-style adbe.x509.rsa_sha1 signatures; for other signatures, use CMS getter functions instead.

Parameters:: in_index (int) – An integral index which must be greater than 0 and less than the cert count as retrieved using GetCertCount.
Return type:: std::vector< UChar,std::allocator< UChar > >
Returns:: A vector of bytes containing the certificate at the index. Returns empty vector if Cert is missing.

GetCertCount()[source]

Gets number of certificates in certificate chain (Cert entry of digital signature dictionary). Must call HasCryptographicSignature first and use it to check whether the signature is signed. Only to be used for old-style adbe.x509.rsa_sha1 signatures; for other signatures, use CMS getter functions instead.

Return type:: int
Returns:: An integer value - the number of certificates in the Cert entry of the digital signature dictionary.

GetCertPathsFromCMS()[source]

Retrieves all constructible certificate paths from an adbe.pkcs7.detached or ETSI.CAdES.detached digital signature.

The signer will always be returned if the signature is CMS-based and not corrupt. Must only be called on signed adbe.pkcs7.detached signatures. The order of the certificates in each of the paths returned is as follows: the signer will be first, and issuers come after it in order of the issuer of the previous certificate. The default behaviour is to return a sub-path for each marginal issuer in a max-length path.

Return type:: std::vector< std::vector< Crypto::X509Certificate,std::allocator< Crypto::X509Certificate > >,std::allocator< std::vector< Crypto::X509Certificate,std::allocator< Crypto::X509Certificate > > > >
Returns:: a container of certificate paths, each of which is a container of X509Certificate objects

Notes: This function does not verify the paths. It merely extracts certificates and constructs paths. This function only works when the build has support for verification-related APIs.

GetContactInfo()[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Returns the contact information of the signer from the digital signature dictionary. Must call HasCryptographicSignature first and use it to check whether the signature is signed.

Return type:: string
Returns:: A unicode string containing the contact information of the signer from within the digital signature dictionary. Empty if ContactInfo entry not present.

GetDocumentPermissions()[source]

If HasCryptographicSignature, returns most restrictive permissions found in any reference entries in this digital signature. Returns Lock-resident (i.e. tentative) permissions otherwise. Throws if invalid permission value is found.

Return type:: int
Returns:: An enumeration value representing the level of restrictions (potentially) placed on the document by this signature.

GetLocation()[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Returns the Location of the signature from the digital signature dictionary. Must call HasCryptographicSignature first and use it to check whether the signature is signed.

Return type:: string
Returns:: A unicode string containing the signing location from within the digital signature dictionary. Empty if Location entry not present.

GetLockedFields()[source]

Returns the fully-qualified names of all fields locked by this signature using the field permissions feature. Retrieves from the digital signature dictionary if the form field HasCryptographicSignature. Otherwise, retrieves from the Lock entry of the digital signature form field. Result is invalidated by any field additions or removals. Does not take document permissions restrictions into account.

Return type:: std::vector< std::string,std::allocator< std::string > >
Returns:: A vector of UStrings representing the fully-qualified names of all fields locked by this signature.

GetReason()[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Returns the Reason for the signature from the digital signature dictionary. Must call HasCryptographicSignature first and use it to check whether the signature is signed.

Return type:: string
Returns:: A unicode string containing the reason for the signature from within the digital signature dictionary. Empty if Reason entry not present.

GetSDFObj()[source]

Retrieves the SDF Obj of the digital signature field.

Return type:: Obj
Returns:: the underlying SDF/Cos object.

GetSignatureName()[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Returns the name of the signer of the signature from the digital signature dictionary. Must call HasCryptographicSignature first and use it to check whether the signature is signed.

Return type:: string
Returns:: A unicode string containing the name of the signer from within the digital signature dictionary. Empty if Name entry not present.

GetSignerCertFromCMS()[source]

Returns the signing certificate. Must only be called on signed adbe.pkcs7.detached or ETSI.CAdES.detached signatures.

Return type:: X509Certificate
Returns:: An X509Certificate object.

Notes: This function does not verify the signature. It merely extracts the claimed signing certificate. This function only works when the build has support for verification-related APIs.

GetSigningTime()[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Returns the “M” entry from the digital signature dictionary, which represents the signing date/time. Must call HasCryptographicSignature first and use it to check whether the signature is signed.

Return type:: Date
Returns:: A PDF::Date object holding the signing date/time from within the digital signature dictionary. Returns a default-constructed PDF::Date if no date is present.

GetSubFilter()[source]

Returns the SubFilter type of the digital signature. Specification says that one must check the SubFilter before using various getters. Must call HasCryptographicSignature first and use it to check whether the signature is signed.

Return type:: int
Returns:: An enumeration describing what the SubFilter of the digital signature is from within the digital signature dictionary.

HasCryptographicSignature()[source]

Returns whether the digital signature field has been cryptographically signed. Checks whether there is a digital signature dictionary in the field and whether it has a Contents entry. Must be called before using various digital signature dictionary-related functions. Does not check validity - will return true even if a valid hash has not yet been generated (which will be the case after [Certify/Sign]OnNextSave[WithCustomHandler] has been called on the signature but even before Save is called on the document).

Return type:: boolean
Returns:: A boolean value representing whether the digital signature field has a digital signature dictionary with a Contents entry.

HasVisibleAppearance()[source]

Returns whether the field has a visible appearance. Can be called without checking HasCryptographicSignature first, since it operates on the surrounding Field dictionary, not the “V” entry (i.e. digital signature dictionary). Performs the zero-width+height check, the Hidden bit check, and the NoView bit check as described by the PDF 2.0 specification, section 12.7.5.5 “Signature fields”.

Return type:: boolean
Returns:: A boolean representing whether or not the signature field has a visible signature.
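
The three checks named above can be approximated in plain Python as below. This is a sketch of the rule as described in the PDF specification, not the SDK's code: the function name looks_visible is hypothetical, the rectangle is assumed to be given as (x1, y1, x2, y2), and the flag values follow the specification's annotation flags table, where Hidden is bit 2 and NoView is bit 6 (counting from 1).

```python
from typing import Tuple

# Annotation flag bits (bit positions are 1-based in the PDF specification).
FLAG_HIDDEN = 1 << 1   # bit 2
FLAG_NO_VIEW = 1 << 5  # bit 6


def looks_visible(rect: Tuple[float, float, float, float], flags: int) -> bool:
    """Apply the zero-size, Hidden, and NoView checks to a signature widget."""
    x1, y1, x2, y2 = rect
    # A rectangle with zero width and zero height marks an invisible field.
    if (x2 - x1) == 0 and (y2 - y1) == 0:
        return False
    # Either flag being set also makes the signature count as not visible.
    if flags & (FLAG_HIDDEN | FLAG_NO_VIEW):
        return False
    return True


print(looks_visible((0, 0, 0, 0), 0))
print(looks_visible((50, 50, 250, 120), 0))
```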

IsCertification()[source]

Returns whether or not this signature is a certification.

Return type:: boolean
Returns:: a boolean value representing whether or not this signature is a certification.

IsLockedByDigitalSignature()[source]

Returns whether this digital signature field is locked against modifications by any digital signatures. Can be called when this field is unsigned.

Return type:: boolean
Returns:: A boolean representing whether this digital signature field is locked against modifications by any digital signatures in the document.

SetContactInfo(in_contact_info)[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Sets the ContactInfo entry in the digital signature dictionary. Must create a digital signature dictionary first using [Certify/Sign]OnNextSave[WithCustomHandler]. If this function is called on a digital signature field that has already been cryptographically signed with a valid hash, the hash will no longer be valid, so do not call Save (to sign/create the hash) until after you call this function, if you need to call this function in the first place. Essentially, call this function after [Certify/Sign]OnNextSave[WithCustomHandler] and before Save.

Parameters:: in_contact_info (string) – A string containing the ContactInfo to be set.

static SetDigSigLogFilename(filename)[source]

Sets the digital signature logging filename, and enables the logging. This function is expected to be called only once. Subsequent calls to the function will have no effect.

Parameters:: filename (string) – The name (and path) of the log file.
Return type:: boolean
Returns:: True if this operation was successful and false if it failed because the logging process has already started.

SetDocumentPermissions(in_perms)[source]

Sets the document locking permission level for this digital signature field. Call only on unsigned signatures, otherwise a valid hash will be invalidated.

Parameters:: in_perms (int) – An enumerated value representing the document locking permission level to set.

SetFieldPermissions(args)[source]

Tentatively sets which fields are to be locked by this digital signature upon signing. It is not necessary to call HasCryptographicSignature before using this function. Throws if non-empty array of field names is passed along with FieldPermissions Action == e_lock_all.

Parameters:

in_action (int) – An enumerated value representing which sort of field locking should be done. Options are All (lock all fields), Include (lock listed fields), and Exclude (lock all fields except listed fields).
in_field_names (std::vector< std::string,std::allocator< std::string > >, optional) – A list of field names; can be empty (and must be empty, if Action is set to All). Empty by default.
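
The effect of the three actions, and the rule that All must come with an empty list, can be modelled in a few lines of plain Python. The LockAction enum and the fields_to_lock function are hypothetical stand-ins for illustration; they are not SDK names, and the real enumerated values are not shown in this section.

```python
from enum import Enum
from typing import Iterable, Set


class LockAction(Enum):
    """Hypothetical stand-in for the SDK's field-permissions action enum."""
    ALL = "all"
    INCLUDE = "include"
    EXCLUDE = "exclude"


def fields_to_lock(
    action: LockAction, listed: Iterable[str], all_fields: Iterable[str]
) -> Set[str]:
    """Return the set of field names that would end up locked."""
    listed_set = set(listed)
    every_field = set(all_fields)
    if action is LockAction.ALL:
        # Mirrors the documented rule: a non-empty list with All is an error.
        if listed_set:
            raise ValueError("field names must be empty when the action is ALL")
        return every_field
    if action is LockAction.INCLUDE:
        return listed_set
    return every_field - listed_set


form_fields = ["name", "date", "signature2"]
print(sorted(fields_to_lock(LockAction.EXCLUDE, ["signature2"], form_fields)))
```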

SetLocation(in_location)[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Sets the Location entry in the digital signature dictionary. Must create a digital signature dictionary first using [Certify/Sign]OnNextSave[WithCustomHandler]. If this function is called on a digital signature field that has already been cryptographically signed with a valid hash, the hash will no longer be valid, so do not call Save (to sign/create the hash) until after you call this function, if you need to call this function in the first place. Essentially, call this function after [Certify/Sign]OnNextSave[WithCustomHandler] and before Save.

Parameters:: in_location (string) – A string containing the Location to be set.

SetPreferredDigestAlgorithm(in_digest_algorithm_type, in_make_mandatory=True)[source]

Sets the preferred digest algorithm to use when signing this field. This is done by setting DigestMethod in the Seed Value dictionary. This function can be called before a signature field is even prepared for signing.

Parameters:

in_digest_algorithm_type (int) – the digest algorithm to use
in_make_mandatory (boolean, optional) – whether to tell signing software to give up if the preferred algorithm is unsupported. Default value for this parameter is true.

SetReason(in_reason)[source]

Should not be called when SubFilter is ETSI.RFC3161 (i.e. on a DocTimeStamp). Sets the Reason entry in the digital signature dictionary. Must create a digital signature dictionary first using [Certify/Sign]OnNextSave[WithCustomHandler]. If this function is called on a digital signature field that has already been cryptographically signed with a valid hash, the hash will no longer be valid, so do not call Save (to sign/create the hash) until after you call this function, if you need to call this function in the first place. Essentially, call this function after [Certify/Sign]OnNextSave[WithCustomHandler] and before Save.

Parameters:: in_reason (string) – A string containing the Reason to be set.

SetSigDictTimeOfSigning(in_date)[source]

Adds the “M” key and value, representing the PDF-time-of-signing (not to be confused with embedded timestamps, DocTimeStamps, or CMS signing time), to the digital signature dictionary. The digital signature field must have been prepared for signing first. This function should only be used if no secure embedded timestamping support is available from your signing provider. Useful for custom signing workflows, where signing time is not set automatically by the Apryse SDK, unlike in the usual standard handler signing workflow. A secure embedded timestamp can also be added later and should override this “M” date entry when the signature is read by signature-verifying PDF processor applications. At least one signing time, whether “M” or a secure embedded timestamp (see GenerateContentsWithEmbeddedTimestamp), is required to be added in order to create a PAdES signature.

Parameters:: in_date (Date) – the PDF Date datetime value to set

static SignDigest(args)[source]

Overload 1:

Returns a CMS detached signature incorporating a digest that is provided using the provided PKCS #12 key file (.pfx). This function is part of the custom signing API, but cannot be used for workflows where the key is not in PFX format or when the signature comes from a source that cannot generate CMS signatures (e.g. Hardware Security Modules (HSM) devices, cloud signing services). In such cases, the low-level parts of the custom signing API should be used instead of this function (e.g. GenerateESSSigningCertPAdESAttribute, GenerateCMSSignedAttributes, GenerateCMSSignature). This function is a shortcut for situations in which use of more low-level custom signing functions is unnecessary. Therefore, this function will generate necessary CMS components, such as signedAttrs, internally. Notes: This function does not change the DigitalSignatureField. Call SaveCustomSignature to write a signature to its PDFDoc.

Parameters:

in_digest (std::vector< UChar,std::allocator< UChar > >) – – the document digest value to use
in_pkcs12_keyfile_path (string) – – the path to the PKCS #12 key file (usually has a .pfx extension) to use for signing
in_keyfile_password (string) – – the password to use to decrypt the PKCS #12 key file
in_pades_mode (boolean) – – whether to create a PAdES-type signature (PDF Advanced Electronic Signatures standards)
in_digest_algorithm_type (int) – – the identifier to use to write the digest algorithm

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

the DER-serialized bytes of a CMS detached signature (CMS ContentInfo)

Overload 2:

Returns a CMS detached signature that incorporates the supplied digest and is produced with the provided PKCS #12 key buffer (the contents of a .pfx file). This function is part of the custom signing API, but it cannot be used for workflows where the key is not in PFX format or where the signature comes from a source that cannot generate CMS signatures (e.g. Hardware Security Module (HSM) devices or cloud signing services). In such cases, use the low-level parts of the custom signing API instead (e.g. GenerateESSSigningCertPAdESAttribute, GenerateCMSSignedAttributes, GenerateCMSSignature). This function is a shortcut for situations in which the lower-level custom signing functions are unnecessary, so it generates the necessary CMS components, such as signedAttrs, internally. Notes: This function does not change the DigitalSignatureField. Call SaveCustomSignature to write a signature to its PDFDoc.

Parameters:

in_digest (std::vector< UChar,std::allocator< UChar > >) – – the document digest value to use
in_pkcs12_buffer (std::vector< UChar,std::allocator< UChar > >) – – a buffer containing the PKCS #12 key (as usually stored in .pfx files) to use for signing
in_keyfile_password (string) – – the password to use to decrypt the PKCS #12 key file data in the buffer
in_pades_mode (boolean) – – whether to create a PAdES-type signature (PDF Advanced Electronic Signatures standards)
in_digest_algorithm_type (int) – – the identifier to use to write the digest algorithm

Return type:

std::vector< UChar,std::allocator< UChar > >

Returns:

the DER-serialized bytes of a CMS detached signature (CMS ContentInfo)

SignOnNextSave(args)[source]

Overload 1:

Must be called to prepare a signature for signing, which is done afterwards by calling Save. Cannot sign two signatures during one save (throws). Default document permission level is e_annotating_formfilling_signing_allowed. Throws if signature field already has a digital signature dictionary.

Parameters:

in_pkcs12_keyfile_path (string) – – The path to the PKCS #12 private keyfile to use to sign this digital signature.
in_password (string) – – The password to use to parse the PKCS #12 keyfile.

Overload 2:

Must be called to prepare a signature for signing, which is done afterwards by calling Save. Cannot sign two signatures during one save (throws). Default document permission level is e_annotating_formfilling_signing_allowed. Throws if signature field already has a digital signature dictionary.

Parameters:

in_pkcs12_buffer (UChar) – – A buffer of bytes containing the PKCS #12 private key certificate store to use to sign this digital signature.
in_buf_size (int) – – buffer size.
in_password (string) – – The password to use to parse the PKCS #12 buffer.

SignOnNextSaveWithCustomHandler(in_signature_handler_id)[source]

Must be called to prepare a signature for signing, which is done afterwards by calling Save. Cannot sign two signatures during one save (throws). Default document permission level is e_annotating_formfilling_signing_allowed. Throws if signature field already has a digital signature dictionary.

Parameters:: in_signature_handler_id (int) – – The unique id of the signature handler to use to sign this digital signature.

TimestampOnNextSave(in_timestamping_config, in_timestamp_response_verification_options)[source]

Must be called to prepare a secure PDF-embedded timestamp signature (RFC 3161 DocTimeStamp) for signing, which is done afterwards by calling Save on the document with an e_incremental flag. Throws if the document is locked by other signatures, if the signature is already signed, or if another signature has already been prepared for signing on the next save (because only one signing operation can be done per incremental save). Default document permission level is e_annotating_formfilling_signing_allowed.

Parameters:

in_timestamping_config (TimestampingConfiguration) – – Configuration options to store for timestamping. These will include various items related to contacting a timestamping authority. Incorrect configuration will result in document Save throwing an exception. The usability of a combination of a TimestampingConfiguration and VerificationOptions can be checked ahead of time to prevent exceptions by calling TestConfiguration on TimestampingConfiguration and passing VerificationOptions.
in_timestamp_response_verification_options (VerificationOptions) – – Options for the timestamp response verification step (which is required by RFC 3161 to be done as part of timestamping). These response verification options should include the root certificate of the timestamp authority, so that the trust status of the timestamp signature can be verified. The options that should be passed are the same ones that one expects the timestamp to be verifiable with in the future (once it is embedded in the document), except the response verification requires online revocation information whereas the later verification may not (depending on whether LTV offline verification information for the timestamp signature gets embedded into the document by that time). The timestamp response verification step makes sure that (a) the timestamp response has a success status, which is the only time that this is verified in the entire workflow, which prevents embedding an unsuccessful response; (b) that it digests the document correctly and is otherwise generally verifiable; and (c) that the nonce is correct (which is the only time that this is verifiable in the entire workflow) to prevent replay attacks (if it was not requested in the TimestampingConfiguration that the nonce mechanism should be disabled).

Notes: A failure in timestamp response verification will result in document Save throwing an exception. It is recommended to use TimestampingConfiguration.TestConfiguration with the VerificationOptions ahead of time to avoid this.

UseSubFilter(in_subfilter_type, in_make_mandatory=True)[source]

Sets the requested SubFilter value (which identifies a signature type) as the only one to use during future signing, overwriting all such previous settings. It is not necessary to call HasCryptographicSignature before calling this function. For example, this function can be used to switch to PAdES signing mode.

Parameters:

in_subfilter_type (int) – – The SubFilter type to set.
in_make_mandatory (boolean, optional) – – Whether to make usage of this SubFilter mandatory for future signing applications. Default value for this parameter is true.

Verify(in_opts)[source]

Verifies this cryptographic digital signature in the manner specified by the VerificationOptions.

Parameters:: in_opts (VerificationOptions) – – The options specifying how to do the verification.
Return type:: VerificationResult
Returns:: A VerificationResult object containing various information about the verifiability of the cryptographic digital signature.

e_ETSI_CAdES_detached = 3

e_ETSI_RFC3161 = 4

e_absent = 6

e_adbe_pkcs7_detached = 1

e_adbe_pkcs7_sha1 = 2

e_adbe_x509_rsa_sha1 = 0

e_annotating_formfilling_signing_allowed = 3

e_exclude = 2

e_formfilling_signing_allowed = 2

e_include = 1

e_lock_all = 0

e_no_changes_allowed = 1

e_unknown = 5

e_unrestricted = 4

property m_impl

property thisown: The membership flag

class apryse_sdk.DigitalSignatureFieldIterator(args)[source]

Bases: object

Supports a simple iteration over a generic collection.

Current()[source]

Note: HasNext() must be true before calling Current()

Return type:: DigitalSignatureField
Returns:: the current element in the collection

Destroy()[source]: Frees the native memory of the object.

HasNext()[source]

Return type:: boolean
Returns:: true if the iterator can be successfully advanced to the next element; false if the iterator is no longer valid.

Next()[source]: Note: HasNext() must be true before calling Next() Advances the iterator to the next element of the collection.

property mp_impl

property thisown: The membership flag

class apryse_sdk.DisallowedChange(args)[source]

Bases: object

The class DisallowedChange. Data pertaining to a change detected in a document during a digital signature modification permissions verification step, the change being both made after the signature was signed and disallowed by the signature’s permissions settings.

Destroy()[source]

GetObjNum()[source]

Returns the SDF object number of the indirect object associated with this DisallowedChange.

Return type:: int
Returns:: An unsigned 32-bit integer value.

GetType()[source]

Returns an enumeration value representing the semantic type of this disallowed change.

Return type:: int
Returns:: An enumeration value of type: Type of DisallowedChange.

GetTypeAsString()[source]

Returns a string value representing the semantic type of this disallowed change.

Return type:: string
Returns:: A string.

e_annotation_created_or_updated_or_deleted = 3

e_digital_signature_signed = 1

e_form_filled = 0

e_other = 4

e_page_template_instantiated = 2

e_unknown = 5

property m_impl

property thisown: The membership flag

class apryse_sdk.DocSnapshot(args)[source]

Bases: object

The class DocSnapshot. Represents a state of the document.

Destroy()[source]

Equals(snapshot)[source]

Returns whether this snapshot’s document state is equivalent to another.

Parameters:: snapshot (DocSnapshot) – – the other snapshot with which to compare.
Return type:: boolean
Returns:: Whether this snapshot’s document state is equivalent to another.

GetHash()[source]

Returns a hash that is unique to particular document states.

Return type:: int
Returns:: A hash that is unique to particular document states.

IsValid()[source]

Returns whether this snapshot is valid.

Return type:: boolean
Returns:: Whether this snapshot is valid.

property m_impl

property thisown: The membership flag

class apryse_sdk.DocumentConversion(args)[source]

Bases: object

The class DocumentConversion. Encapsulates the conversion of a single document from one format to another.

DocumentConversion instances are created through methods belonging to the Convert class. See Convert.WordToPDFConversion for an example.

CancelConversion()[source]: Cancel the current conversion, forcing TryConvert or Convert to return.

Convert()[source]: Perform the conversion. Will throw an exception on failure.

ConvertNextPage()[source]: Perform the conversion. Will throw an exception on failure. Does nothing if the conversion is already complete. Use GetConversionStatus() to check if there is remaining content to be converted.

static CreateInternal(impl)[source]

Destroy()[source]

GetConversionStatus()[source]

Get the state of the conversion process. Pair this with ConvertNextPage().

Return type:

int

Returns:

The current conversion status: eSuccess, eIncomplete, or eFailure.
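
A page-by-page loop that pairs GetConversionStatus() with ConvertNextPage() might look like the sketch below. FakeConversion is a hypothetical stand-in that reuses the documented method names and the status values listed for this class; it assumes the status stays eIncomplete until all content has been converted. The helper convert_page_by_page is illustrative only.

```python
E_SUCCESS, E_INCOMPLETE, E_FAILURE = 0, 1, 2  # values listed for DocumentConversion


class FakeConversion:
    """Stand-in with the documented method names; not the SDK class."""

    def __init__(self, total_pages):
        self._total = total_pages
        self._done = 0

    def GetConversionStatus(self):
        return E_SUCCESS if self._done >= self._total else E_INCOMPLETE

    def ConvertNextPage(self):
        if self._done < self._total:  # does nothing once complete
            self._done += 1

    def GetNumConvertedPages(self):
        return self._done

    def GetErrorString(self):
        return ""


def convert_page_by_page(conversion, on_page=None):
    """Convert until the status is no longer incomplete; return the page count."""
    while conversion.GetConversionStatus() == E_INCOMPLETE:
        conversion.ConvertNextPage()
        if on_page is not None:
            on_page(conversion.GetNumConvertedPages())
    if conversion.GetConversionStatus() == E_FAILURE:
        raise RuntimeError(conversion.GetErrorString())
    return conversion.GetNumConvertedPages()


print(convert_page_by_page(FakeConversion(3), on_page=print))
```

The loop condition relies on the status rather than on GetProgress(), which is documented as an indicator only.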

GetCurrentExcelSheetName()[source]

Retrieve the name of the Excel sheet placed on the last converted page if any.

Return type:: string
Returns:: The name of the Excel sheet.

GetDoc()[source]

Gets the PDFDoc from the conversion. Can be accessed at any time during or after conversion.

Return type:: PDFDoc
Returns:: The conversion’s PDFDoc.

GetErrorString()[source]

If the conversion finished with an error, this returns the error description; otherwise it returns an empty string.

Return type:: string
Returns:: The error description. Will be blank unless GetConversionStatus returns Failure.

GetHandleInternal()[source]

GetNextExcelSheetCellCount()[source]

Retrieve the number of cells in the Excel sheet that will be converted next.

Return type:: int
Returns:: The number of cells.

GetNumConvertedPages()[source]

Returns the number of pages which have been added to the destination document. Will never decrease, and will not change after the conversion status becomes “complete”.

Return type:: int
Returns:: The number of pages that have been converted.

GetNumWarnings()[source]

Return the number of warning strings generated during the conversion process. Warning: experimental interface; this method may be renamed or replaced with equivalent functionality in the future.

Return type:: int
Returns:: The number of stored warning strings.

GetProgress()[source]

Returns a number from 0.0 to 1.0, representing the best estimate of conversion progress. This number is only an indicator, and should not be used to dictate program logic (in particular, it is possible for this method to return 1.0 while there is still work to be done. Use GetConversionStatus() to find out when the conversion is fully complete).

Return type:: double
Returns:: The conversion progress. Will never return a smaller number than a previous call.

GetProgressLabel()[source]

Returns the label for the current conversion stage. May return a blank string. Warning: experimental interface; this method may be renamed or replaced with equivalent functionality in the future.

Return type:: string
Returns:: The stage label.

GetWarningString(index)[source]

Retrieve warning strings that have been collected during the conversion process. Warning: experimental interface; this method may be renamed or replaced with equivalent functionality in the future.

Parameters:: index (int) – – the index of the string to be retrieved. Must be less than GetNumWarnings().
Return type:: string
Returns:: The value of the particular warning string.
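
Because warnings are addressed by index, gathering them all is a short loop over GetNumWarnings(). The sketch below uses a hypothetical stand-in object (WarningSource) exposing only those two documented methods; collect_warnings is an illustrative helper name.

```python
class WarningSource:
    """Stand-in exposing GetNumWarnings/GetWarningString; not the SDK class."""

    def __init__(self, warnings):
        self._warnings = list(warnings)

    def GetNumWarnings(self):
        return len(self._warnings)

    def GetWarningString(self, index):
        return self._warnings[index]


def collect_warnings(conversion):
    """Return every stored warning string, in index order."""
    return [conversion.GetWarningString(i)
            for i in range(conversion.GetNumWarnings())]


print(collect_warnings(WarningSource(["missing font", "unsupported field"])))
```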

HasProgressTracking()[source]

Determine whether this DocumentConversion has progress reporting capability.

Return type:: boolean
Returns:: True if GetProgress is expected to return usable values.

IsCancelled()[source]

Reports whether the conversion has been cancelled.

Return type:: boolean
Returns:: Returns true if CancelConversion has been called previously.

SkipNextExcelSheet()[source]: Skip the next Excel sheet. The sheet will not be converted.

TryConvert()[source]

Perform the conversion. If the result of the conversion is failure, then GetErrorString will contain further information about the failure.

Return type:: int
Returns:: Indicates that the conversion succeeded, failed, or was cancelled.

eFailure = 2

eIncomplete = 1

eSuccess = 0

property m_impl

property thisown: The membership flag

class apryse_sdk.EPUBOutputOptions[source]

Bases: object

A class containing options common to ToEpub functions

SetExpanded(expanded)[source]

Create the EPUB in expanded format. Default is false.

Parameters:: expanded (boolean) – if false a single EPUB file will be generated, otherwise, the generated EPUB will be in unzipped (expanded) format

SetReuseCover(reuse)[source]

Set whether the first content page in the EPUB uses the cover image or not. If this is set to true, then the first content page will simply wrap the cover image in HTML. Otherwise, the page will be converted the same as all other pages in the EPUB. Default is false.

Parameters:: reuse (boolean) – if true the first page will simply be EPUB cover image, otherwise, the first page will be converted the same as the other pages

property thisown: The membership flag

class apryse_sdk.Element(args)[source]

Bases: object

Element is the abstract interface used to access graphical elements used to build the display list.

Just like many other classes in PDFNet (e.g. ColorSpace, Font, Annot, etc), Element class follows the composite design pattern. This means that all Elements are accessed through the same interface, but depending on the Element type (that can be obtained using GetType()), only methods related to that type can be called. For example, if GetType() returns e_image, it is illegal to call a method specific to another Element type (i.e. a call to a text specific GetTextData() will throw an Exception).

GetBBox()[source]

Obtains the bounding box for a graphical element.

Calculates the bounding box for a graphical element (i.e. an Element that belongs to one of the following types: e_path, e_text, e_image, e_inline_image, e_shading, e_form). The returned bounding box is guaranteed to encompass the Element, but is not guaranteed to be the smallest box that could contain the element. For example, for Bezier curves the bounding box will enclose all control points, not just the curve itself.

Return type:: Rect
Returns:: true if this is a graphical element and the bounding box can be calculated; false for non-graphical elements which don’t have bounding box.
Parameters:: out_bbox – (Filled by the method) A reference to a rectangle specifying the bounding box of Element (a rectangle that surrounds the entire element). The coordinates are represented in the default PDF page coordinate system and use units called points (1 point = 1/72 inch = 2.54/72 centimeters). The bounding box already accounts for the effects of the current transformation matrix (CTM), text matrix, font size, and other properties in the graphics state. If this is a non-graphical element (i.e. the method returns false) the bounding box is undefined.
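
The unit relationship quoted above (1 point = 1/72 inch = 2.54/72 centimeters) translates directly into code. This is a small sketch with hypothetical helper names; it works on plain numbers and does not touch the SDK's Rect class.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def points_to_inches(points):
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH


def points_to_cm(points):
    """Convert a length in PDF points to centimeters."""
    return points * CM_PER_INCH / POINTS_PER_INCH


# A bounding box 612 x 792 points in size, expressed in inches.
print(points_to_inches(612.0), points_to_inches(792.0))
```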

GetBitsPerComponent()[source]

Return type:: int
Returns:: the number of bits used to represent each color component. Only a single value may be specified; the number of bits is the same for all color components. Valid values are 1, 2, 4, and 8.

GetCTM()[source]

Return type:: Matrix2D
Returns:: Current Transformation Matrix (CTM) that maps coordinates to the initial user space.

GetCharIterator()[source]

Return type:: CharIterator
Returns:: a CharIterator addressing the first CharData element in the text run.

CharIterator points to CharData. CharData is a data structure that contains the char_code number (used to retrieve glyph outlines, to map to Unicode, etc.), character positioning information (x, y), and the number of bytes taken by the character within the text buffer.

Notes: CharIterator follows the standard STL forward-iterator interface.

An example of how to use CharIterator.

for (CharIterator itr = element.GetCharIterator(); itr.HasNext(); itr.Next()) {
    unsigned int char_code = itr.Current().char_code;
    double char_pos_x = itr.Current().x;
    double char_pos_y = itr.Current().y;
}

Character positioning information (x, y) is represented in text space. In order to get the positioning in the user space, the returned value should be scaled using the text matrix (GetTextMatrix()) and the current transformation matrix (GetCTM()). See section 4.2 ‘Other Coordinate Spaces’ in PDF Reference Manual for details and PDFNet FAQ - “How do I get absolute/relative text and character positioning?”.

Within a text run, a character may occupy more than a single byte (e.g. in the case of composite/Type0 fonts). The role of CharIterator/CharData is to provide a uniform and easy-to-use interface to access character information.
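
To illustrate the scaling step mentioned above, the sketch below maps a text-space position through a text matrix and then a CTM using plain six-number tuples in the PDF [a b c d h v] order. The helpers apply_matrix, concat, and text_to_user are hypothetical and stand in for Matrix2D operations; the tuples in the example are made-up values, not output from GetTextMatrix() or GetCTM().

```python
def apply_matrix(m, x, y):
    """Transform (x, y) by a PDF matrix given as (a, b, c, d, h, v)."""
    a, b, c, d, h, v = m
    return (a * x + c * y + h, b * x + d * y + v)


def concat(m1, m2):
    """Return the matrix equivalent to applying m1 first, then m2."""
    a1, b1, c1, d1, h1, v1 = m1
    a2, b2, c2, d2, h2, v2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            h1 * a2 + v1 * c2 + h2, h1 * b2 + v1 * d2 + v2)


def text_to_user(x, y, text_matrix, ctm):
    """Map a text-space position through the text matrix, then the CTM."""
    return apply_matrix(concat(text_matrix, ctm), x, y)


text_matrix = (1, 0, 0, 1, 100, 200)  # translate by (100, 200)
ctm = (2, 0, 0, 2, 0, 0)              # scale by 2
print(text_to_user(10, 0, text_matrix, ctm))
```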

GetComponentNum()[source]

Return type:: int
Returns:: the number of color components per sample.

GetDecodeArray()[source]

Return type:: Obj
Returns:: Decode array or NULL if the parameter is not specified. A decode object is an array of numbers describing how to map image samples into the range of values appropriate for the color space of the image. If ImageMask is true, the array must be either [0 1] or [1 0]; otherwise, its length must be twice the number of color components required by ColorSpace. Default value depends on the color space, See Table 4.36 in PDF Ref. Manual.
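
As a rough illustration of what a decode array does, the sketch below applies the linear mapping from the PDF specification, value = Dmin + sample × (Dmax − Dmin) / (2^n − 1), to plain integers. The function names are hypothetical, and the decode array is passed as an ordinary Python list rather than an Obj.

```python
def decode_sample(sample, bits_per_component, d_min, d_max):
    """Map one raw sample into the range [d_min, d_max]."""
    max_sample = (1 << bits_per_component) - 1
    return d_min + sample * (d_max - d_min) / max_sample


def decode_pixel(samples, bits_per_component, decode):
    """Decode one pixel; `decode` holds a (min, max) pair per component."""
    if len(decode) != 2 * len(samples):
        raise ValueError("decode array must have twice as many entries as components")
    return [decode_sample(s, bits_per_component, decode[2 * i], decode[2 * i + 1])
            for i, s in enumerate(samples)]


# An 8-bit RGB pixel with the identity decode array.
print(decode_pixel([0, 255, 128], 8, [0, 1, 0, 1, 0, 1]))
# A 1-bit image mask sample with the inverted decode array [1 0].
print(decode_sample(1, 1, 1, 0))
```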

GetGState()[source]

Return type:: GState
Returns:: GState of this Element

GetImageColorSpace()[source]

Return type:: ColorSpace
Returns:: The SDF object representing the color space in which image samples are specified, or NULL if the image is an image mask.

The returned color space may be any type of color space except Pattern.

GetImageData()[source]

Return type:: Filter
Returns:: A stream (filter) containing decoded image data

GetImageDataSize()[source]

Return type:: int
Returns:: the size of image data in bytes
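
For a sanity check on buffer sizes, the sketch below computes the size of packed, uncompressed sample data from the width, height, component count, and bits per component, padding each row to a whole byte as PDF sample data does. Whether GetImageDataSize() returns exactly this figure for a given image is an assumption; packed_image_size is a hypothetical helper.

```python
def packed_image_size(width, height, components, bits_per_component):
    """Bytes needed for packed samples, with each row padded to a byte boundary."""
    bits_per_row = width * components * bits_per_component
    bytes_per_row = (bits_per_row + 7) // 8
    return bytes_per_row * height


print(packed_image_size(100, 50, 3, 8))  # 8-bit RGB
print(packed_image_size(10, 4, 1, 1))    # 1-bit mask, rows padded to 2 bytes
```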

GetImageHeight()[source]

Return type:: int
Returns:: the height of the image, in samples.

GetImageRenderingIntent()[source]

Return type:: int
Returns:: The color rendering intent to be used in rendering the image.

GetImageWidth()[source]

Return type:: int
Returns:: the width of the image, in samples.

GetMCPropertyDict()[source]

Return type:: Obj
Returns:: a dictionary containing the property list or NULL if property dictionary is not present.

Notes: the function automatically looks under Properties sub-dictionary of the current resource dictionary if the dictionary is not in-line. Therefore you can assume that returned Obj is dictionary if it is not NULL.

GetMCTag()[source]

Return type:: Obj
Returns:: the tag, a name object indicating the role or significance of the marked content point/sequence.

GetMask()[source]

Return type:: Obj
Returns:: an image XObject defining an image mask to be applied to this image (See ‘Explicit Masking’, 4.8.5), or an array specifying a range of colors to be applied to it as a color key mask (See ‘Color Key Masking’).

If IsImageMask() returns true, this method will return NULL.

GetNewTextLineOffset()[source]

Returns the offset (out_x, out_y) to the start of the current line relative to the beginning of the previous line.

out_x and out_y are numbers expressed in unscaled text space units. The returned numbers correspond to the arguments of ‘Td’ operator.

GetParentStructElement()[source]

Return type:: SElement
Returns:: Parent logical structure element (such as ‘span’ or ‘paragraph’). If the Element is not associated with any structure element, the returned SElement will not be valid (i.e. selem.IsValid() -> false).

GetPathData()[source]

Returns the PathData stored by the path element.

Return type:: PathData
Returns:: The PathData which contains the operators and corresponding point data.

GetPosAdjustment()[source]

Return type:: double
Returns:: The number used to adjust text matrix in horizontal direction when drawing text. The number is expressed in thousandths of a unit of text space. The returned number corresponds to a number value within TJ array. For ‘Tj’ text strings the returned value is always 0.

Notes: because CharIterator positioning information already accounts for TJ adjustments this method is rarely used.

GetShading()[source]

Return type:: Shading
Returns:: the SDF object of the Shading object.

GetStructMCID()[source]

Return type:: int
Returns:: Marked Content Identifier (MCID) for this Element or a negative number if the element is not assigned an identifier/MCID.

Marked content identifier can be used to associate an Element with logical structure element that refers to the Element.

GetTextData()[source]

Return type:: std::vector< unsigned char,std::allocator< unsigned char > >
Returns:: a pointer to the internal text buffer for this text element.

Notes: GetTextData() returns the raw text data and not a Unicode string. In PDF text can be encoded using various encoding schemes so it is necessary to consider Font encoding while processing the content of this buffer.

Most of the time GetTextString() is what you are looking for instead. GetTextString() maps the raw text directly into Unicode (as specified by Adobe Glyph List (AGL) ). Even if you would prefer to decode text yourself it is more convenient to use CharIterators returned by CharBegin()/CharEnd() and PDF::Font code mapping methods.

The buffer owner is the current element (i.e. ElementReader or ElementBuilder).

GetTextDataSize()[source]

Return type:: int
Returns:: the size of the internal text buffer returned in GetTextData().

GetTextLength()[source]

Return type:: double
Returns:: The text advance distance in text space.

The total sum of all of the advance values from rendering all of the characters within this element, including the advance value on the glyphs, the effect of properties such as ‘char-spacing’, ‘word-spacing’ and positioning adjustments on ‘TJ’ elements.

Notes: Computed text length is represented in text space. In order to get the length of the text run in the user space, the returned value should be scaled using the text matrix (GetTextMatrix()) and the current transformation matrix (GetCTM()). See section 4.2 ‘Other Coordinate Spaces’ in PDF Reference Manual for details.

GetTextMatrix()[source]

Return type:: Matrix2D
Returns:: a reference to the current text matrix (Tm).

GetTextString()[source]

Return type:: string
Returns:: a pointer to Unicode string for this text Element. The function maps character codes to Unicode array defined by Adobe Glyph List (http://partners.adobe.com/asn/developer/type/glyphlist.txt).

Notes: In PDF, text can be encoded using various encoding schemes, and in some cases it is not possible to extract a Unicode encoding. If a character code cannot be mapped to Unicode, the function maps it to 0xFFFD, the Unicode replacement character.

If you would like to map raw text to Unicode (or some other encoding) yourself use CharIterators returned by CharBegin()/CharEnd() and PDF::Font code mapping methods.

The string owner is the current element (i.e. ElementReader or ElementBuilder).

GetType()[source]

Return type:: int
Returns:: the current element type.

GetXObject()[source]

Return type:: Obj
Returns:: the SDF object of the Image/Form object.

HasTextMatrix()[source]

Return type:: boolean
Returns:: true if this element is directly associated with a text matrix (that is Tm operator is just before this text element) or false if the text matrix is default or is inherited from previous text elements.

IsClipWindingFill()[source]

Return type:: boolean
Returns:: true if the current clip path is using non-zero winding rule, or false for even-odd rule.

IsClippingPath()[source]

Return type:: boolean
Returns:: true if the current path element is a clipping path and should be added to clipping path stack.

IsFilled()[source]

Return type:: boolean
Returns:: true if the current path element should be filled

IsImageInterpolate()[source]

Return type:: boolean
Returns:: a boolean indicating whether image interpolation is to be performed.

IsImageMask()[source]

Return type:: boolean
Returns:: a boolean indicating whether the inline image is to be treated as an image mask.

IsOCVisible()[source]

Return type:: boolean
Returns:: true if this element is visible in the optional-content context (OCG::Context). The method considers the context’s current OCMD stack, the group ON-OFF states, the non-OC drawing status, the drawing and enumeration mode, and the intent.

When enumerating page content, OCG::Context can be passed as a parameter in ElementReader.Begin() method. When using PDFDraw, PDFRasterizer, or PDFView class to render PDF pages use PDFDraw::SetOCGContext() method to select an OC context.

IsStroked()[source]

Return type:: boolean
Returns:: true if the current path element should be stroked

IsWindingFill()[source]

Return type:: boolean
Returns:: true if the current path should be filled using non-zero winding rule, or false if the path should be filled using even-odd rule.

According to the non-zero winding rule, you can determine whether a test point is inside or outside a closed curve as follows: Draw a line from the test point to a point that is distant from the curve. Count the number of times the curve crosses the test line from left to right, and count the number of times the curve crosses the test line from right to left. If those two numbers are the same, the test point is outside the curve; otherwise, the test point is inside the curve.

According to even-odd rule, you can determine whether a test point is inside or outside a closed curve as follows: Draw a line from the test point to a point that is distant from the curve. If that line crosses the curve an odd number of times, the test point is inside the curve; otherwise, the test point is outside the curve.
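
The two rules above can be sketched for polygonal paths as follows. The test line is a horizontal ray toward +x, and crossings are classified by whether the edge heads upward or downward, which corresponds to the two crossing directions in the description. All names are hypothetical, and a path is represented as a list of closed subpaths, each a list of (x, y) vertices, rather than as PathData.

```python
def count_crossings(point, subpaths):
    """Count edges crossing a rightward ray from `point`, split by direction."""
    px, py = point
    up = down = 0
    for poly in subpaths:
        n = len(poly)
        for i in range(n):
            x0, y0 = poly[i]
            x1, y1 = poly[(i + 1) % n]
            # The half-open test avoids counting a shared vertex twice.
            if (y0 <= py) != (y1 <= py):
                # x coordinate where the edge meets the line y = py
                x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
                if x_at > px:
                    if y1 > y0:
                        up += 1
                    else:
                        down += 1
    return up, down


def inside_nonzero(point, subpaths):
    """Non-zero winding rule: inside when the two direction counts differ."""
    up, down = count_crossings(point, subpaths)
    return up != down


def inside_even_odd(point, subpaths):
    """Even-odd rule: inside when the total number of crossings is odd."""
    up, down = count_crossings(point, subpaths)
    return (up + down) % 2 == 1


outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
inner = [(3, 3), (7, 3), (7, 7), (3, 7)]  # same direction as outer
print(inside_nonzero((5, 5), [outer, inner]), inside_even_odd((5, 5), [outer, inner]))
```

With two nested squares drawn in the same direction, the center counts as inside under the non-zero winding rule but outside under the even-odd rule, which is the usual way the two rules diverge.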

SetClipWindingFill(winding_rule)[source]

Sets clipping path’s fill rule.

Parameters:: winding_rule (boolean) – if winding_rule is true clipping should use non-zero winding rule, or false for even-odd rule.

SetNewTextLineOffset(dx, dy)[source]

Sets the offset (dx, dy) to the start of the current line relative to the beginning of the previous line.

Parameters:

dx (double) – horizontal offset to the start of the current line
dy (double) – vertical offset to the start of the current line

SetPathClip(clip)[source]

Indicate whether the path is a clipping path or non-clipping path

Parameters:: clip (boolean) – true to set path to clipping path. False for non-clipping path.

SetPathData(data)[source]: Set the PathData of this element. The PathData contains the array of points stored by the element and the array of path segment types.

SetPathFill(fill)[source]

Indicate whether the path should be filled

Parameters:: fill (boolean) – true to set path to be filled. False for no fill path.

SetPathStroke(stroke)[source]

Indicate whether the path should be stroked

Parameters:: stroke (boolean) – true to set path to be stroked. False for no stroke path.

SetPosAdjustment(adjust)[source]

Parameters:: adjust (double) – number to set the horizontal adjustment to

Notes: Positive values move the current text element backwards (along text direction). Negative values move the current text element forward (along text direction).
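
Since the adjustment is expressed in thousandths of a unit of text space, its effect on the horizontal position can be estimated with the TJ displacement term from the PDF specification: −adjust / 1000 × font size × horizontal scaling. The sketch below is an assumption-laden illustration; the function name and the font_size and horizontal_scale parameters are hypothetical inputs, not values read from the SDK.

```python
def adjustment_to_text_space(adjust, font_size, horizontal_scale=1.0):
    """Horizontal shift caused by a TJ-style adjustment.

    Positive adjustments give a negative shift (backwards along the text
    direction); negative adjustments give a positive shift (forward).
    """
    return -adjust / 1000.0 * font_size * horizontal_scale


print(adjustment_to_text_space(-500, 12))  # moves forward
print(adjustment_to_text_space(250, 10))   # moves backwards
```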

SetTextData(buf_text_data, text_data_size)[source]

Set the text data for the current e_text Element.

Parameters:

buf_text_data (UChar) – a pointer to a buffer containing text.
text_data_size (int) – the size of the internal text buffer

SetTextMatrix(args)[source]

Overload 1:

Sets the text matrix for a text element.

Parameters:: mtx (Matrix2D) – The new text matrix for this text element

Overload 2:

Sets the text matrix for a text element. This method accepts text transformation matrix components directly.

A transformation matrix in PDF is specified by six numbers, usually in the form of an array containing six elements. In its most general form, this array is denoted [a b c d h v]; it can represent any linear transformation from one coordinate system to another. For more information about PDF matrices please refer to section 4.2.2 ‘Common Transformations’ in PDF Reference Manual, and to documentation for Matrix2D class.

Parameters:

a (double) –
- horizontal ‘scaling’ component of the new text matrix.
b (double) –
- ‘rotation’ component of the new text matrix.
c (double) –
- ‘rotation’ component of the new text matrix.
d (double) –
- vertical ‘scaling’ component of the new text matrix.
h (double) –
- horizontal translation component of the new text matrix.
v (double) –
- vertical translation component of the new text matrix.
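
As an illustration of how the six components relate to familiar transforms, the sketch below builds [a b c d h v] for a scale, followed by a counter-clockwise rotation, followed by a translation, using the PDF rotation form [cos θ, sin θ, −sin θ, cos θ, 0, 0]. The helper text_matrix_components is hypothetical; its result would be passed as the six arguments of this overload.

```python
import math


def text_matrix_components(scale_x, scale_y, angle_degrees, tx, ty):
    """Return (a, b, c, d, h, v) for scale, then rotation, then translation."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a = scale_x * cos_t
    b = scale_x * sin_t
    c = -scale_y * sin_t
    d = scale_y * cos_t
    return (a, b, c, d, tx, ty)


# Text rotated by 90 degrees and placed at (72, 720).
print(text_matrix_components(1.0, 1.0, 90.0, 72.0, 720.0))
```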

SetWindingFill(winding_rule)[source]

Sets path’s fill rule.

Parameters:: winding_rule (boolean) – if winding_rule is true path will be filled using non-zero winding fill rule, otherwise even-odd fill will be used.

UpdateTextMetrics()[source]

Recompute the character positioning information (i.e. CharIterator-s) and text length.

Element objects cache text length and character positioning information. If the user modifies the text data or graphics state, the cached information is no longer correct. UpdateTextMetrics() can be used to recalculate the correct positioning and length information.

e_form = 9

e_group_begin = 10

e_group_end = 11

e_image = 6

e_inline_image = 7

e_marked_content_begin = 12

e_marked_content_end = 13

e_marked_content_point = 14

e_null = 0

e_path = 1

e_shading = 8

e_text = 3

e_text_begin = 2

e_text_end = 5

e_text_new_line = 4

property mp_elem

property thisown: The membership flag

class apryse_sdk.ElementBuilder[source]

Bases: object

ElementBuilder is used to build new PDF::Elements (e.g. image, text, path, etc) from scratch. In conjunction with ElementWriter, ElementBuilder can be used to create new page content.

Notes: Analogous to ElementReader, every call to one of the ElementBuilder Create methods (CreateImage, CreateRect, etc.) destroys the Element currently associated with the builder, and all previous Element pointers are invalidated.

For C++ developers: analogous to ElementReader, ElementBuilder is the owner of all Element objects it creates.

ArcTo(args)[source]

Overload 1:

Draw an arc with the specified parameters (lower left corner, width, height and angles).

Parameters:

x (double) – The horizontal x coordinate of the lower left corner of the ellipse encompassing rectangle
y (double) – The vertical y coordinate of the lower left corner of the ellipse encompassing rectangle
width (double) – overall width of the full ellipse (not considering the angular extents).
height (double) – overall height of the full ellipse (not considering the angular extents).
start (double) – starting angle of the arc in degrees
extent (double) – angular extent of the arc in degrees

Overload 2:

Draw an arc from the current point to the end point.

Parameters:

xr (double) – x radius for the arc
yr (double) – y radius for the arc
rx (double) – x-axis rotation in radians
isLargeArc (boolean) – indicates whether the smaller or larger arc is chosen (1 - one of the two larger arc sweeps is chosen, 0 - one of the two smaller arc sweeps is chosen)
sweep (boolean) – direction in which arc is drawn (1 - clockwise, 0 - counterclockwise)
endX (double) – x coordinate of end point
endY (double) – y coordinate of end point

Notes: The Arc is defined the same way as it is specified by SVG or XPS standards. For further questions please refer to the XPS or SVG standards.

ClosePath()[source]: Closes the current subpath.

CreateEllipse(x, y, width, height)[source]

Create an ellipse (or circle, if width == height) path Element.

Parameters:

x (double) – The horizontal x coordinate of the ellipse center.
y (double) – The vertical y coordinate of the ellipse center.
width (double) – The width of the ellipse rectangle.
height (double) – The height of the ellipse rectangle.

Return type:: Element
Returns:: the path Element

CreateForm(args)[source]

Overload 1:

Create a Form XObject Element.

Parameters:: form (Obj) – a Form XObject content stream

Overload 2:

Create a Form XObject Element using the content of the existing page. This method assumes that the XObject will be used in the same document as the given page. If you need to create the Form XObject in a different document use CreateForm(Page, Doc) method.

Parameters:: page (Page) – A page used to create the Form XObject.

Overload 3:

Create a Form XObject Element using the content of the existing page. Unlike CreateForm(Page) method, you can use this method to create form in another document.

Parameters:

page (Page) – A page used to create the Form XObject.
doc (PDFDoc) – Destination document for the Form XObject.

CreateGroupBegin()[source]: Create e_group_begin Element (i.e. ‘q’ operator in PDF content stream). The function saves the current graphics state.

CreateGroupEnd()[source]: Create e_group_end Element (i.e. ‘Q’ operator in PDF content stream). The function restores the previous graphics state.

CreateImage(args)[source]

Overload 1:

Create a content image Element out of a given document Image.

Parameters:: img (Image) – the given image.

Overload 2:

Create a content image Element out of a given document Image.

Parameters:

img (Image) – the given image.
mtx (Matrix2D) – the image transformation matrix.

Overload 3:

Create a content image Element out of a given document Image with the lower left corner at (x, y), and scale factors (hscale, vscale).

Parameters:

img (Image) – the given image.
x (double) – The horizontal x position to place the lower left corner of the image
y (double) – The vertical y position to place the lower left corner of the image
hscale (double) – The horizontal scale of the image
vscale (double) – The vertical scale of the image
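
As a rough illustration of choosing values for overload 3, the sketch below computes a position and scale that fit an image inside a box while preserving its aspect ratio. It assumes that hscale and vscale are the drawn width and height in user space units (PDF images occupy a unit square before scaling); that interpretation, and the helper name fit_image, are assumptions rather than something stated in this reference.

```python
def fit_image(img_w, img_h, box_x, box_y, box_w, box_h):
    """Return (x, y, hscale, vscale) that centres the image inside the box.

    Assumption: hscale/vscale are the placed width/height in user units.
    """
    scale = min(box_w / img_w, box_h / img_h)   # keep the aspect ratio
    hscale = img_w * scale
    vscale = img_h * scale
    x = box_x + (box_w - hscale) / 2.0          # centre horizontally
    y = box_y + (box_h - vscale) / 2.0          # centre vertically
    return (x, y, hscale, vscale)


# A 640x480 pixel image placed inside a 200x200 unit box at the origin.
placement = fit_image(640, 480, 0.0, 0.0, 200.0, 200.0)
print(placement)
```

The resulting four numbers would then be passed as x, y, hscale, vscale after the image argument.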

CreateMarkedContentBegin(tag, property_dict)[source]

Create e_marked_content_begin element with an associated property dictionary (i.e. BMC or BDC operator in PDF content stream).

Parameters:

tag (string) – the tag entry for this element.
property_dict (Obj) – the property dictionary.

Return type:: Element
Returns:: the marked content begin element.

CreateMarkedContentBeginInlineProperties(tag)[source]

Create e_marked_content_begin element with an inline property dictionary (i.e. BDC operator in PDF content stream).

Parameters:: tag (string) – the tag entry for this element.
Return type:: Element
Returns:: the marked content begin element.

Notes: The inline property dictionary can be accessed and edited using element.GetMCPropertyDict()

CreateMarkedContentEnd()[source]

Create e_marked_content_end element (i.e. EMC operator in PDF content stream).

Return type:: Element
Returns:: the marked content end element.

CreateMarkedContentPoint(tag, property_dict)[source]

Create e_marked_content_point element with an associated property dictionary (i.e. MP or DP operator in PDF content stream).

Parameters:

tag (string) – the tag entry for this element.
property_dict (Obj) – the property dictionary.

Return type:: Element
Returns:: the marked content point element.

CreateMarkedContentPointInlineProperties(tag)[source]

Create e_marked_content_point element with an inline property dictionary (i.e. DP operator in PDF content stream).

Parameters:: tag (string) – the tag entry for this element.
Return type:: Element
Returns:: the marked content point element.

Notes: The inline property dictionary can be accessed and edited using element.GetMCPropertyDict()

CreatePath(points, seg_types)[source]

Create a path Element using given path segment data

Return type:: Element
Returns:: the path Element

CreateRect(x, y, width, height)[source]

Create a rectangle path Element.

Parameters:

x (double) – The horizontal coordinate of the lower left corner of the rectangle.
y (double) – The vertical coordinate of the lower left corner of the rectangle.
width (double) – The width of the rectangle.
height (double) – The height of the rectangle.

Return type:: Element
Returns:: the path Element

CreateShading(sh)[source]

Create a shading Element.

Parameters:: sh (Shading) – A Shading object. Shading objects represent a flat interface around all PDF shading types (e_function_shading, e_axial_shading, etc.).

CreateShapedTextRun(text_data)[source]

Create a new text run from shaped text. Shaped text can be created with an appropriate Font, using the Font::GetShapedText() method.

Parameters:: text_data (ShapedText) – the shaped text data

Notes: You must set the current Font and font size before calling this function. The font must be created using the Font::CreateCIDTrueTypeFont() method and should be the same font used to generate the shaped text content.

For best results, the font should be encoded using the e_Indices encoding scheme.

A text run can be created only within a text block.

CreateTextBegin(args)[source]

Overload 1:

Start a text block (‘BT’ operator in PDF content stream). The function installs the given font in the current graphics state.

Parameters:

font (Font) – font to set the text in the text block to
font_sz (double) – size to set the text in the text block to

Overload 2:

Start a text block (‘BT’ operator in PDF content stream).

CreateTextEnd()[source]: Ends a text block.

CreateTextNewLine(args)[source]

Overload 1:

Create e_text_new_line Element (i.e. a Td operator in PDF content stream). Move to the start of the next line, offset from the start of the current line by (dx , dy). dx and dy are numbers expressed in unscaled text space units.

Parameters:

dx (double) – The horizontal x offset from the start of the current line
dy (double) – The vertical y offset from the start of the current line

Return type:: Element
Returns:: the e_text_new_line Element

Overload 2:

Create e_text_new_line Element (i.e. a T* operator in PDF content stream).

Return type:: Element
Returns:: the e_text_new_line Element
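
The difference between the two overloads can be sketched with a small model of the start-of-line position. This is plain Python, not SDK code: TextLineCursor is a hypothetical class, the text matrix is ignored for simplicity, and T* is modelled as Td with an offset of (0, -leading), as the PDF specification defines it.

```python
class TextLineCursor:
    """Tracks the start-of-line position in unscaled text space units."""

    def __init__(self, leading=0.0):
        self.line_start = (0.0, 0.0)
        self.leading = leading

    def td(self, dx, dy):
        # Td: the offset is relative to the start of the *current* line.
        x, y = self.line_start
        self.line_start = (x + dx, y + dy)
        return self.line_start

    def t_star(self):
        # T*: move to the next line using the current leading.
        return self.td(0.0, -self.leading)


cursor = TextLineCursor(leading=14.0)
positions = [
    cursor.td(72.0, 700.0),   # first line
    cursor.td(0.0, -14.0),    # explicit offset (overload 1)
    cursor.t_star(),          # leading-based offset (overload 2)
]
print(positions)
```

Because each offset is taken from the start of the current line rather than from where the last glyph ended, consecutive lines stay left-aligned regardless of their length.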

CreateTextRun(args)[source]

Overload 1:

Create a text run using the given font. Notes: a text run can be created only within a text block

Overload 2:

Create a new text run. Notes: a text run can be created only within a text block. You must set the current Font and font size before calling this function.

CreateUnicodeTextRun(text_data, text_data_sz)[source]

Create a new Unicode text run.

Parameters:

text_data (int) – pointer to Unicode text
text_data_sz (int) – number of characters (not bytes) in text_data

Notes: You must set the current Font and font size before calling this function, and the font must be created using the Font::CreateCIDTrueTypeFont() method.

A text run can be created only within a text block.

CurveTo(cx1, cy1, cx2, cy2, x2, y2)[source]

Draw a Bezier curve from the current point to the given point (x2, y2) using (cx1, cy1) and (cx2, cy2) as control points.

Parameters:

cx1 (double) – The x component of the first control point
cy1 (double) – The y component of the first control point
cx2 (double) – The x component of the second control point
cy2 (double) – The y component of the second control point
x2 (double) – The horizontal x component of the goal point
y2 (double) – The vertical y component of the goal point
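
For readers who want to see where the curve actually goes, here is a minimal plain-Python evaluation of the cubic Bezier defined by the current point, the two control points, and the goal point. The function name cubic_bezier and the sample coordinates are illustrative only and are not part of the SDK.

```python
def cubic_bezier(p0, c1, c2, p3, t):
    """Evaluate a cubic Bezier at parameter t in [0, 1].

    p0 is the current point, c1/c2 the control points, p3 the goal point.
    """
    u = 1.0 - t
    w0, w1, w2, w3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    x = w0 * p0[0] + w1 * c1[0] + w2 * c2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * c1[1] + w2 * c2[1] + w3 * p3[1]
    return (x, y)


start, ctrl1, ctrl2, goal = (0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)
midpoint = cubic_bezier(start, ctrl1, ctrl2, goal, 0.5)
print(midpoint)
```

The curve starts at the current point, ends at (x2, y2), and is pulled toward the control points without generally passing through them.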

Destroy()[source]: Frees the native memory of the object.

Ellipse(x, y, width, height)[source]

Add an ellipse (or circle, if the x and y radii are equal) to the current path as a complete subpath. Setting the current point is not required before using this function.

Parameters:

x (double) – The x coordinate of the ellipse center.
y (double) – The y coordinate of the ellipse center.
width (double) – The x radius of the ellipse.
height (double) – The y radius of the ellipse.

LineTo(x, y)[source]

Draw a line from the current point to the given point.

Parameters:

x (double) – The horizontal x component of the goal point
y (double) – The vertical y component of the goal point

MoveTo(x, y)[source]

Set the current point.

Parameters:

x (double) – The horizontal x component of the point
y (double) – The vertical y component of the point

PathBegin()[source]: Starts building a new path Element that can contain an arbitrary sequence of lines, curves, and rectangles.

PathEnd()[source]

Finishes building of the path Element.

Return type:: Element
Returns:: the path Element

Rect(x, y, width, height)[source]

Add a rectangle to the current path as a complete subpath. Setting the current point is not required before using this function.

Parameters:

x (double) – The x coordinate of the lower left corner of the rectangle.
y (double) – The y coordinate of the lower left corner of the rectangle.
width (double) – The width of the rectangle.
height (double) – The height of the rectangle.

Reset(args)[source]

The function sets the graphics state of this Element to the given value. If ‘gs’ parameter is not specified or is NULL the function resets the graphics state of this Element to the default graphics state (i.e. the graphics state at the beginning of the display list).

The function can be used in situations where the same ElementBuilder is used to create content on several pages, XObjects, etc. If the graphics state is not Reset() when moving to a new display list, the new Element will have the same graphics state as the last Element in the previous display list (and this may or may not be your intent).

Another use of Reset(gs) is to make sure that two Elements have the same graphics state.

Parameters:: gs (GState, optional) – GState (graphics state) object. If NULL or unspecified, resets graphics state to default.

property mp_builder

property thisown: The membership flag

class apryse_sdk.ElementReader(args)[source]

Bases: object

ElementReader can be used to parse and process content streams. ElementReader provides a convenient interface used to traverse the Element display list of a page. The display list representing graphical elements (such as text-runs, paths, images, shadings, forms, etc) is accessed using the intrinsic iterator. ElementReader automatically concatenates page contents spanning multiple streams and provides a mechanism to parse contents of sub-display lists (e.g. form XObjects and Type3 fonts).

A sample use case for ElementReader is given below:

   ...
   ElementReader reader;
   reader.Begin(page);
   for (Element element=reader.Next(); element; element = reader.Next()) // Read page contents
   {
     switch (element.GetType())    {
       case Element::e_path: { // Process path data...
          const double* data = element.GetPathPoints();
           int sz = element.GetPointCount();
       }
       break;
       case Element::e_text:
           // ...
       break;
     }
   }
   reader.End();

For a full sample, please refer to ElementReader and ElementReaderAdvTest sample projects.

AppendResource(res)[source]

Parameters:: res (Obj) – resource dictionary for finding images, fonts, etc.

Begin(args)[source]

Overload 1:

Begin processing a page.

Parameters:

page (Page) – A page to start processing.
ocg_context (Context, optional) – An optional parameter used to specify the Optional Content (OC) Context that should be used when processing the page. When the OCG::Context is specified, Element::IsOCVisible() will return ‘true’ or ‘false’ depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context.

Notes: When page processing is completed, make sure to call ElementReader.End().

Overload 2:

Begin processing given content stream. The content stream may be a Form XObject, Type3 glyph stream, pattern stream or any other content stream.

Parameters:

content_stream (Obj) – A stream object representing the content stream (usually a Form XObject).
resource_dict (Obj, optional) – An optional ‘/Resource’ dictionary parameter. If the content stream refers to named resources that are not present in the local Resource dictionary, the names are looked up in the supplied resource dictionary.
ocg_context (Context, optional) – An optional parameter used to specify the Optional Content (OC) Context that should be used when processing the page. When the OCG::Context is specified, Element::IsOCVisible() will return ‘true’ or ‘false’ depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context.

Notes: When page processing is completed, make sure to call ElementReader.End().

Overload 3:

Begin processing given content stream. The content stream may be a Form XObject, Type3 glyph stream, pattern stream or any other content stream.

Parameters:

content_stream (Obj) – A stream object representing the content stream (usually a Form XObject).
resource_dict (Obj, optional) – An optional ‘/Resource’ dictionary parameter. If the content stream refers to named resources that are not present in the local Resource dictionary, the names are looked up in the supplied resource dictionary.
ocg_context – An optional parameter used to specify the Optional Content (OC) Context that should be used when processing the page. When the OCG::Context is specified, Element::IsOCVisible() will return ‘true’ or ‘false’ depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context.

Notes: When page processing is completed, make sure to call ElementReader.End().

Overload 4:

Begin processing given content stream. The content stream may be a Form XObject, Type3 glyph stream, pattern stream or any other content stream.

Parameters:

content_stream (Obj) – A stream object representing the content stream (usually a Form XObject).
resource_dict – An optional ‘/Resource’ dictionary parameter. If the content stream refers to named resources that are not present in the local Resource dictionary, the names are looked up in the supplied resource dictionary.
ocg_context – An optional parameter used to specify the Optional Content (OC) Context that should be used when processing the page. When the OCG::Context is specified, Element::IsOCVisible() will return ‘true’ or ‘false’ depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context.

Notes: When page processing is completed, make sure to call ElementReader.End().

ClearChangeList()[source]: Clear the list containing identifiers of modified graphics state attributes. The list of modified attributes is then accumulated during a subsequent call(s) to ElementReader.Next().

Current()[source]

Return type:: Element
Returns:: the current Element or a ‘NULL’ Element. The current element is the one returned in the last call to Next().

Notes: Every call to ElementReader::Next() destroys the current Element. Therefore, an Element becomes invalid after subsequent ElementReader::Next() operation.

Destroy()[source]: Frees the native memory of the object.

End()[source]

Close the current display list.

If the current display list is a sub-list created using FormBegin(), PatternBegin(), or Type3FontBegin() methods, the function will end the sub-list and will return processing to the parent display list at the point where it left off before entering the sub-list.

Return type:: boolean
Returns:: true if the closed display list is a sub-list or false if it is a root display list.

FormBegin()[source]

When the current element is a form XObject you have the option to skip form processing (by not calling FormBegin()) or to open the form stream and continue Element traversal into the form.

To open a form XObject display list use FormBegin() method. The Next() returned Element will be the first Element in the form XObject display list. Subsequent calls to Next() will traverse form’s display list until NULL is returned. At any point you can close the form sub-list using ElementReader::End() method. After the form display list is closed (using End()) the processing will return to the parent display list at the point where it left off before entering the form XObject.
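
The traversal behaviour described above (enter a sub-list, read until NULL, then resume the parent where it left off) can be modelled with a stack of iterators. The sketch below is a plain-Python model and not the SDK implementation; DisplayListStack and its method names are hypothetical, and element types are represented by simple strings.

```python
class DisplayListStack:
    """Toy model of nested display-list traversal."""

    def __init__(self, root):
        self._stack = [iter(root)]

    def next(self):
        # Returns None (standing in for a NULL Element) at the end of a list.
        return next(self._stack[-1], None)

    def begin_sub(self, sub_list):
        # Comparable to FormBegin(): start reading a nested display list.
        self._stack.append(iter(sub_list))

    def end(self):
        # True if a sub-list was closed, False if the root list was closed.
        self._stack.pop()
        return len(self._stack) > 0


page = ["path", "form", "text"]
form_contents = ["image"]

reader = DisplayListStack(page)
seen = []
element = reader.next()
while element is not None:
    seen.append(element)
    if element == "form":
        reader.begin_sub(form_contents)
        inner = reader.next()
        while inner is not None:
            seen.append("form/" + inner)
            inner = reader.next()
        closed_sub_list = reader.end()   # back to the page list
    element = reader.next()
closed_root = reader.end()
print(seen, closed_sub_list, closed_root)
```

Skipping the begin_sub call corresponds to not calling FormBegin(): the form is then treated as a single element and traversal continues with the next page element.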

GetChangesIterator()[source]

Return type:: GSChangesIterator
Returns:: an iterator to the beginning of the list containing identifiers of modified graphics state attributes since the last call to ClearChangeList(). The list can be consulted to determine which graphics states were modified between two Elements. Attributes are ordered in the same way as they are set in the content stream. Duplicate attributes are eliminated.

GetColorSpace(name)[source]

Parameters:: name (string) – string of the name of the SDF/Cos object to get

Notes: see ElementReader::GetFont

GetExtGState(name)[source]

Parameters:: name (string) – string of the name of the SDF/Cos object to get

Notes: see ElementReader::GetFont

GetFont(name)[source]

Parameters:: name (string) – string of the name of the SDF/Cos object to get
Return type:: Obj
Returns:: SDF/Cos object matching the specified name in the current resource dictionary. For ‘Page’ the name is looked up in the page’s /Resources/<Class> dictionary. For Form XObjects, Patterns, and Type3 fonts that have a content stream within page content stream the specified resource is first looked-up in the resource dictionary of the inner stream. If the resource is not found, the name is looked up in the outer content stream’s resource dictionary. The function returns NULL if the resource was not found.
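
The lookup order described for nested content streams (inner resource dictionary first, then the outer one) can be sketched as follows. This is a plain-Python model using ordinary dicts; find_resource, the resource class names, and the sample values are illustrative assumptions, not SDK objects.

```python
def find_resource(resource_stack, res_class, name):
    """Search the innermost resource dictionary first, then outward.

    resource_stack is ordered from outermost (page) to innermost stream.
    Returns None when the name is not found at any level.
    """
    for resources in reversed(resource_stack):
        found = resources.get(res_class, {}).get(name)
        if found is not None:
            return found
    return None


page_resources = {"Font": {"F1": "Helvetica", "F2": "Times-Roman"}}
form_resources = {"Font": {"F1": "Courier"}}
stack = [page_resources, form_resources]

print(find_resource(stack, "Font", "F1"))   # resolved by the form itself
print(find_resource(stack, "Font", "F2"))   # falls back to the page
print(find_resource(stack, "Font", "F9"))   # not found anywhere
```

GetColorSpace, GetExtGState, GetPattern, GetShading, and GetXObject are documented as following the same rule, with a different resource class in place of the font class.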

GetPattern(name)[source]

Parameters:: name (string) – string of the name of the SDF/Cos object to get

Notes: see ElementReader::GetFont

GetShading(name)[source]

Parameters:: name (string) – string of the name of the SDF/Cos object to get

Notes: see ElementReader::GetFont

GetXObject(name)[source]

Parameters:: name (string) – string of the name of the SDF/Cos object to get

Notes: see ElementReader::GetFont

IsChanged(attrib)[source]

Parameters:: attrib (int) – the GState attribute to test
Return type:: boolean
Returns:: true if the given GState attribute was changed since the last call to ClearChangeList().

Next()[source]

Return type:: Element
Returns:: a page Element or a ‘NULL’ element if the end of current-display list was reached. You may use GetType() to determine the type of the returned Element.

Notes: Every call to ElementReader::Next() destroys the current Element. Therefore, an Element becomes invalid after subsequent ElementReader::Next() operation.

PatternBegin(fill_pattern, reset_ctm_tfm=False)[source]

A method used to spawn the sub-display list representing the tiling pattern of the current element in the ElementReader. You can call this method at any point as long as the current element is valid.

Parameters:

fill_pattern (boolean) – If true, the filling pattern of the current element will be spawned; otherwise, the stroking pattern of the current element will be spawned. Note that the graphics state will be inherited from the parent content stream (the content stream in which the pattern is defined as a resource) automatically.
reset_ctm_tfm (boolean, optional) – An optional parameter used to indicate whether the pattern’s display list should set its initial CTM and transformation matrices to the identity matrix. In general, this should be left as false.

To open a tiling pattern sub-display list use PatternBegin(pattern) method. The Next() returned Element will be the first Element in the pattern display list. Subsequent calls to Next() will traverse pattern’s display list until NULL is encountered. At any point you can close the pattern sub-list using ElementReader::End() method. After the pattern display list is closed, the processing will return to the parent display list at the point where pattern display list was spawned.

Type3FontBegin(char_data, resource_dict=0)[source]

A method used to spawn a sub-display list representing a Type3 Font glyph. You can call this method at any point as long as the current element in the ElementReader is a text element whose font type is type 3.

Parameters:

char_data (CharData) – The information about the glyph to process. You can get this information by dereferencing a CharIterator.
resource_dict (Obj, optional) – An optional ‘/Resource’ dictionary parameter. If any glyph descriptions refer to named resources but the Font Resource dictionary is absent, the names are looked up in the supplied resource dictionary.

To open a Type3 font sub-display list use Type3FontBegin() method. The Next() returned Element will be the first Element in the glyph’s display list. Subsequent calls to Next() will traverse glyph’s display list until NULL is returned. At any point you can close the glyph sub-list using ElementReader::End() method. After the glyph display list is closed, the processing will return to the parent display list at the point where glyph display list was spawned.

property mp_reader

property thisown: The membership flag

class apryse_sdk.ElementWriter(args)[source]

Bases: object

ElementWriter can be used to assemble and write new content to a page, Form XObject, Type3 Glyph stream, pattern stream, or any other content stream.

Begin(args)[source]

Overload 1:

Begin writing to the given page.

By default, new content will be appended to the page, as foreground graphics. It is possible to add new page content as background graphics by passing e_underlay as the placement parameter (e.g. writer.Begin(page, ElementWriter.e_underlay)).

Parameters:

page (Page) – The page to write content.
placement (int, optional) – An optional flag indicating whether the new content should be added as a foreground or background layer to the existing page. By default, the new content will appear on top of the existing graphics.
page_coord_sys (boolean, optional) – An optional flag used to select the target coordinate system. If true (default), the coordinates are relative to the lower-left corner of the page; otherwise the coordinates are defined in the PDF user coordinate system (which may or may not coincide with the page coordinates).
compress (boolean, optional) – An optional flag indicating whether the page content stream should be compressed. This may be useful for debugging content streams. Also some applications need to do a clear text search on strings in the PDF files. By default, all content streams are compressed.
resources (Obj, optional) – the resource dictionary in which to store resources for the final page. By default, a new resource dictionary will be created.

Overload 2:

Begin writing an Element sequence to a new stream. Use this function to write Elements to a content stream other than the page. For example, you can create Form XObjects (See Section ‘4.9 Form XObjects’ in PDF Reference for more details) pattern streams, Type3 font glyph streams, etc.

Parameters:

doc (SDFDoc) – A low-level SDF/Cos document that will contain the new stream. You can access the low-level document using the PDFDoc::GetSDFDoc() or Obj::GetDoc() methods.
compress (boolean, optional) – An optional flag indicating whether the page content stream should be compressed. This may be useful for debugging content streams. Also some applications need to do a clear text search on strings in the PDF files. By default, all content streams are compressed.

Notes: the newly created content stream object is returned when writing operations are completed (i.e. after the call to ElementWriter::End()).

Overload 3:

Begin writing an Element sequence to a new stream. Use this function to write Elements to a content stream other than the page. For example, you can create Form XObjects (See Section ‘4.9 Form XObjects’ in PDF Reference for more details) pattern streams, Type3 font glyph streams, etc.

Parameters:

doc (SDFDoc) – A low-level SDF/Cos document that will contain the new stream. You can access the low-level document using the PDFDoc::GetSDFDoc() or Obj::GetDoc() methods.
compress – An optional flag indicating whether the page content stream should be compressed. This may be useful for debugging content streams. Also some applications need to do a clear text search on strings in the PDF files. By default, all content streams are compressed.

Notes: the newly created content stream object is returned when writing operations are completed (i.e. after the call to ElementWriter::End()).

Overload 4:

Begin writing an Element sequence to a stream. Use this function to write Elements to a content stream which will replace an existing content stream in an object passed as a parameter.

Parameters:

stream_obj_to_update (Obj) – A low-level SDF stream object that will contain the new stream. The old stream inside that object will be discarded.
compress (boolean, optional) – An optional flag indicating whether the content stream should be compressed. This may be useful for debugging content streams. Also some applications need to do a clear text search on strings in the PDF files. By default, all content streams are compressed.
resources (Obj, optional) – the resource dictionary in which to store resources for the final page. By default, a new resource dictionary will be created.

Notes: The content stream object is returned when writing operations are completed (i.e. after the call to ElementWriter::End()).

Overload 5:

Begin writing an Element sequence to a stream. Use this function to write Elements to a content stream which will replace an existing content stream in an object passed as a parameter.

Parameters:

stream_obj_to_update (Obj) – A low-level SDF stream object that will contain the new stream. The old stream inside that object will be discarded.
compress (boolean, optional) – An optional flag indicating whether the content stream should be compressed. This may be useful for debugging content streams. Also some applications need to do a clear text search on strings in the PDF files. By default, all content streams are compressed.
resources – the resource dictionary in which to store resources for the final page. By default, a new resource dictionary will be created.

Notes: The content stream object is returned when writing operations are completed (i.e. after the call to ElementWriter::End()).

Overload 6:

Begin writing an Element sequence to a stream. Use this function to write Elements to a content stream which will replace an existing content stream in an object passed as a parameter.

Parameters:

stream_obj_to_update (Obj) – A low-level SDF stream object that will contain the new stream. The old stream inside that object will be discarded.
compress – An optional flag indicating whether the content stream should be compressed. This may be useful for debugging content streams. Also some applications need to do a clear text search on strings in the PDF files. By default, all content streams are compressed.
resources – the resource dictionary in which to store resources for the final page. By default, a new resource dictionary will be created.

Notes: The content stream object is returned when writing operations are completed (i.e. after the call to ElementWriter::End()).

Destroy()[source]: Frees the native memory of the object.

End()[source]

Finish writing to a page

Return type:: Obj
Returns:: A low-level stream object that was used to store Elements.

Flush()[source]: The Flush method flushes all pending Element writing operations. This method is typically only required to be called when intermixing direct content writing (i.e. WriteBuffer/WriteString) with Element writing.

SetDefaultGState(reader)[source]

This method is used to initialize ElementWriter state with the state of a given ElementReader. This can be used to avoid incorrectly writing inherited GState attributes.

Parameters:: reader (ElementReader) – ElementReader.

WriteBuffer(data)[source]: Writes an arbitrary buffer to the content stream. This function can be used to insert comments, inline-image data, and chunks of arbitrary content to the output stream.

WriteElement(element)[source]

Writes the Element to the content stream.

Parameters:: element (Element) – The element to write to the content stream.

WriteGStateChanges(element)[source]

Write only the graphics state changes applied to this element and skip writing the element itself. This is especially useful when rewriting page content, but with the intention to skip certain elements.

Parameters:: element (Element) – The element for which to write graphics state changes.

WritePlacedElement(element)[source]

A utility function that surrounds the given Element with a graphics state Save/Restore Element (i.e. in PDF content stream represented as ‘q element Q’).

The function is equivalent to calling WriteElement three times: WriteElement(eSave); WriteElement(element); WriteElement(eRestore); where eSave is an ‘e_group_begin’ Element and eRestore is an ‘e_group_end’ Element.

The function is useful when XObjects such as Images and Forms are drawn on the page.

Parameters:: element (Element) – Element object to enact function on.
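
Why the save/restore pair matters can be seen in a small plain-Python model of a graphics state stack. The interpreter below understands only ‘q’, ‘Q’, and ‘cm’; everything else is treated as a painting operator and recorded together with the transformation matrix in effect. The names run, placed, and concat are hypothetical helpers, and the assumption that a placed image is emitted as a ‘cm’ followed by a ‘Do’ is an illustration rather than a statement about the SDK's exact output.

```python
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def concat(m, ctm):
    """Return m x ctm: apply m first, then the existing CTM ('cm' semantics)."""
    a1, b1, c1, d1, h1, v1 = m
    a2, b2, c2, d2, h2, v2 = ctm
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            h1 * a2 + v1 * c2 + h2, h1 * b2 + v1 * d2 + v2)


def run(ops):
    """Interpret the operators; return the final CTM and what was painted."""
    ctm, stack, painted = IDENTITY, [], []
    for op in ops:
        if op == "q":
            stack.append(ctm)               # save graphics state
        elif op == "Q":
            ctm = stack.pop()               # restore graphics state
        elif isinstance(op, tuple) and op[0] == "cm":
            ctm = concat(op[1], ctm)
        else:
            painted.append((op, ctm))       # painting operator
    return ctm, painted


def placed(element_ops):
    """Model of 'q element Q'."""
    return ["q", *element_ops, "Q"]


image = [("cm", (200.0, 0.0, 0.0, 100.0, 50.0, 50.0)), "Do /Im0"]

ctm_placed, painted_placed = run(placed(image) + ["f"])
ctm_plain, painted_plain = run(image + ["f"])
print(painted_placed[1], painted_plain[1])
```

In the placed variant the later ‘f’ operator is painted with the identity matrix, whereas without the save/restore pair it inherits the image's scaling and translation.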

WriteString(str)[source]

Writes an arbitrary string to the content stream. Serves the same purpose as WriteBuffer().

Parameters:: str (string) – String to write to the content stream.

e_overlay = 1: element appears on top of the existing graphics

e_replacement = 2: element will replace current page contents

e_underlay = 0: element is put in the background layer of the page

property mp_writer

property thisown: The membership flag

class apryse_sdk.EmbeddedTimestampVerificationResult(args)[source]

Bases: object

This class represents the result of verifying a secure embedded timestamp digital signature.

Destroy()[source]

GetCMSDigestStatus()[source]

Retrieves the result condition associated with the CMS signed digest verification step.

Return type:: int
Returns:: A DigestStatus-type enumeration value.

GetCMSDigestStatusAsString()[source]

Retrieves the result condition associated with the CMS signed digest verification step, as a descriptive string.

Return type:: string
Returns:: a string

Notes: Output may change in future versions.

GetCMSSignatureDigestAlgorithm()[source]

Retrieves an enumeration value representing the digest algorithm used to sign the timestamp token.

Return type:: int
Returns:: A DigestAlgorithm enumeration value.

GetMessageImprintDigestAlgorithm()[source]

Retrieves an enumeration value representing the digest algorithm used inside the message imprint field of the timestamp to digest the main signature value.

Return type:: int
Returns:: A DigestAlgorithm enumeration value.

GetMessageImprintDigestStatus()[source]

Retrieves the result condition associated with the message imprint digest verification step.

Return type:: int
Returns:: A DigestStatus-type enumeration value.

GetMessageImprintDigestStatusAsString()[source]

Retrieves the result condition associated with the message imprint digest verification step, as a descriptive string.

Return type:: string
Returns:: a string

Notes: Output may change in future versions.

GetTrustStatus()[source]

Retrieves the result condition associated with the trust verification step.

Return type:: int
Returns:: A TrustStatus-type enumeration value.

GetTrustStatusAsString()[source]

Retrieves the result condition associated with the trust verification step, as a descriptive string.

Return type:: string
Returns:: a string

Notes: Output may change in future versions.

GetTrustVerificationResult()[source]

Retrieves the detailed result associated with the trust step of the verification operation that returned this EmbeddedTimestampVerificationResult, if such a detailed trust result is available. Must call HasTrustVerificationResult first and check for a true result. Notes: This function will throw if there is no trust result available.

Return type:: TrustVerificationResult
Returns:: A TrustVerificationResult object.
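A minimal sketch of the call order described above: check HasTrustVerificationResult() before asking for the detailed trust result. The helper relies only on the method names documented for this class. FakeResult and its status strings are made up for the demonstration and are not actual SDK output.

```python
def describe_timestamp_result(result):
    """Summarise an EmbeddedTimestampVerificationResult-like object."""
    summary = {
        "ok": result.GetVerificationStatus(),
        "cms_digest": result.GetCMSDigestStatusAsString(),
        "message_imprint": result.GetMessageImprintDigestStatusAsString(),
        "trust": result.GetTrustStatusAsString(),
        "trust_detail": None,
    }
    # GetTrustVerificationResult() throws when no detailed result exists,
    # so it is only called after HasTrustVerificationResult() returns true.
    if result.HasTrustVerificationResult():
        summary["trust_detail"] = result.GetTrustVerificationResult()
    return summary


class FakeResult:
    """Hypothetical stand-in; the strings below are placeholders."""

    def GetVerificationStatus(self):
        return False

    def GetCMSDigestStatusAsString(self):
        return "placeholder digest status"

    def GetMessageImprintDigestStatusAsString(self):
        return "placeholder imprint status"

    def GetTrustStatusAsString(self):
        return "placeholder trust status"

    def HasTrustVerificationResult(self):
        return False

    def GetTrustVerificationResult(self):
        raise RuntimeError("no trust result available")


summary = describe_timestamp_result(FakeResult())
```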

GetUnsupportedFeatures()[source]

Retrieves reports about unsupported features encountered during verification of the timestamp. Current possible values:

“GeneralizedTime format with length <number greater than 15>”, “unsupported digest algorithm”

Return type:: std::vector< std::string,std::allocator< std::string > >
Returns:: a container of strings representing unsupported features encountered during verification of the timestamp

Notes: Output may change in future versions.

GetVerificationStatus()[source]

Retrieves the main verification status. The main status is determined based on the other statuses.

Return type:: boolean
Returns:: A boolean representing whether or not the verification operation was completely successful.

HasTrustVerificationResult()[source]

Returns whether there is a detailed TrustVerificationResult in this EmbeddedTimestampVerificationResult.

Return type:: boolean
Returns:: A boolean

property m_impl

property thisown: The membership flag

class apryse_sdk.ExcelOutputOptions[source]

Bases: object

A class containing options common to ToExcel functions

GetFootnotesSetting()[source]

Get the setting for footnotes from this options object.

Return type:: int
Returns:: The current footnote setting.

GetHeadersAndFootersSetting()[source]

Get the setting for headers and footers from this options object.

Return type:: int
Returns:: The current header and footer setting.

SetCustomOCRLanguage(ocrlang)[source]

Specifies the custom OCR languages to use. Notes: Use 3-letter ISO 639-2 language codes, separated by spaces. Example: “eng deu spa fra”. The default is English.

Parameters:: ocrlang (string) – the OCR language(s).
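As an illustration of the expected argument format, the helper below builds the space-separated string from a list of codes. It is a sketch, not part of the SDK: build_ocr_language_string is a hypothetical name, lower-casing the codes is an assumption, and the check only verifies the three-letter shape, not that a code is a real ISO 639-2 entry.

```python
def build_ocr_language_string(codes):
    """Join 3-letter language codes with spaces, e.g. 'eng deu spa fra'."""
    cleaned = [code.strip().lower() for code in codes]
    for code in cleaned:
        # Shape check only: three ASCII letters.
        if len(code) != 3 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"expected a 3-letter code, got {code!r}")
    return " ".join(cleaned)


ocr_lang = build_ocr_language_string(["eng", "deu", "spa", "fra"])
print(ocr_lang)
```

The resulting string is what would be passed as the ocrlang argument.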

SetFootnotesSetting(option)[source]

Specifies how footnotes should be converted. Default is e_Recover, which will include them as footnotes.

Parameters:: option (int) – The footnote setting.

SetHeadersAndFootersSetting(option)[source]

Specifies how headers and footers should be converted. Default is e_Recover, which will include them as headers and footers.

Parameters:: option (int) – The header and footer setting.

SetLanguage(language)[source]

Specifies the OCR language. Default is automatic language detection.

Parameters:: language (int) – the OCR language.

SetNonTableContent(non_tables)[source]

Specifies whether to convert non-tabular content. Default is false.

Parameters:: non_tables (boolean) – If false, only tabular content is converted to Excel. If true, all textual content is converted to Excel.

SetPDFPassword(password)[source]

Specifies the password if the PDF requires one.

Parameters:: password (string) – the PDF password, if required; an empty string otherwise.

SetPageSingleSheet(page_single)[source]

Specifies whether to combine all tables on a page into a single sheet. Default is false.

Parameters:: page_single (boolean) – If false, each logical table on a page goes to a separate Excel sheet. If true, all logical tables for a page are combined into a single Excel sheet.

SetPages(page_from, page_to)[source]

Specifies a range of pages to be converted. By default all pages are converted. The first page has the page number of 1.

Parameters:

page_from (int) – the first page to be converted.
page_to (int) – the last page to be converted (inclusive). Use a negative value to specify the last page in the PDF.
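The page-range semantics above (1-based, inclusive, negative page_to meaning the last page) can be sketched as follows. resolve_page_range is a hypothetical helper, and raising ValueError for out-of-range input is an assumption of this sketch; the section does not say how the SDK handles such values.

```python
def resolve_page_range(page_from, page_to, page_count):
    """Return the inclusive, 1-based (first, last) pages to convert."""
    # A negative page_to stands for the last page in the PDF.
    last = page_count if page_to < 0 else page_to
    if not 1 <= page_from <= last <= page_count:
        raise ValueError("page range is outside the document")
    return page_from, last


print(resolve_page_range(2, -1, 10))
print(resolve_page_range(1, 3, 10))
```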

SetPreferredOCREngine(engine)[source]

Specifies the preferred OCR engine.

Parameters:: engine (int) – The PreferredOCREngine to use for OCR.

SetSearchableImageSetting(setting)[source]

Specifies how scanned image pages should be converted. Default is e_ocr_text.

Parameters:: setting (int) – the searchable image setting.

Remarks: Pre-existing OCRed content is ignored and a new OCR is performed from scratch. See also: SearchableImageSetting

SetSingleSheet(single_sheet)[source]

Specifies whether to combine all tables into a single sheet. Default is false.

Parameters:: single_sheet (boolean) – If false, each logical table goes to a separate Excel sheet. If true, all logical tables are combined into a single Excel sheet.

e_ocr_always = 4

e_ocr_off = 3

e_ocr_text = 2

property thisown: The membership flag

class apryse_sdk.FDFDoc(args)[source]

Bases: object

FDFDoc is a class representing Forms Data Format (FDF) documents. FDF is typically used when submitting form data to a server, receiving the response, and incorporating it into the interactive form. It can also be used to export form data to stand-alone files that can be stored, transmitted electronically, and imported back into the corresponding PDF interactive form. In addition, beginning in PDF 1.3, FDF can be used to define a container for annotations that are separate from the PDF document to which they apply.

Note: While the constructor does not, a few methods in FDFDoc will cause it to count as a document for consumption-based licensing if it was not created through PDFDoc::FDFExtract(). Please consult individual API documentation for exact details.

Close()[source]: Close FDFDoc

static CreateFromXFDF(file_name)[source]

Create a new FDFDoc from XFDF input. Input can be either a XFDF file path, or the XFDF data itself.

Parameters:

file_name (string) –

string containing either the file path to a XFDF file, or the XML buffer containing the XFDF.

Return type:

FDFDoc

Returns:

A new FDFDoc.

FieldCreate(args)[source]

GetFDF()[source]

Return type:: Obj
Returns:: the FDF dictionary located in “/Root” or NULL if dictionary is not present.

GetField(field_name)[source]

Parameters:

field_name (string) –

a string representing the fully qualified name of the field (e.g. “employee.name.first”).

Return type:

FDFField

Returns:

a FDFField associated with the given field_name or invalid field (null) if the field is not found.

GetFieldIterator(args)[source]

Overload 1:

An interactive form (sometimes referred to as an AcroForm) is a collection of fields for gathering information interactively from the user. A FDF document may contain any number of fields appearing on any combination of pages, all of which make up a single, global interactive form spanning the entire document.

The following methods are used to access and manipulate Interactive form fields (sometimes referred to as AcroForms).

Return type:: FDFFieldIterator
Returns:: an iterator to the first FDFField in the document.

Notes: if the document has no AcroForms, HasNext() will return false.

Overload 2:

An interactive form (sometimes referred to as an AcroForm) is a collection of fields for gathering information interactively from the user. A FDF document may contain any number of fields appearing on any combination of pages, all of which make up a single, global interactive form spanning the entire document.

The following methods are used to access and manipulate Interactive form fields (sometimes referred to as AcroForms).

Parameters:: field_name (string) – String representing the name of the FDFField to get.
Return type:: FDFFieldIterator
Returns:: an iterator to the FDFField in the document.

Notes: if the document has no AcroForms, HasNext() will return false.

GetID()[source]

Get the ID entry from “/Root/FDF” dictionary.

Return type:

Returns:

An object representing the ID entry in “/Root/FDF” dictionary.

GetPDFFileName()[source]

Get the PDF document file that this FDF file was exported from or is intended to be imported into.

Return type:: string
Returns:: a String with the PDF document file name.

GetRoot()[source]

Return type:

Returns:

A dictionary representing the Cos root of the document (/Root entry within the trailer dictionary)

Notes: This method will count as a document usage for consumption-based licensing if the current document has not yet been counted.

GetSDFDoc()[source]

Return type:: SDFDoc
Returns:: document’s SDF/Cos document

Notes: This method will count as a document usage for consumption-based licensing if the current document has not yet been counted.

GetTrailer()[source]

Return type:

Returns:

A dictionary representing the document’s trailer.

Notes: This method will count as a document usage for consumption-based licensing if the current document has not yet been counted.

IsModified()[source]

Return type:

boolean

Returns:

true if document was modified, false otherwise

MergeAnnots(args)[source]

Merge the annotations from XFDF file into FDF file

Parameters:

command_file (string) –
- string containing the xml command file path or xml string of the command
permitted_user (string, optional) –
- optional user name of the permitted user

Save(args)[source]

Overload 1:

Saves the document to a file.

If a full save is requested to the original path, the file is saved to a file system-determined temporary file, the old file is deleted, and the temporary file is renamed to path.

A full save with the remove-unused or linearization option may rearrange objects in the cross-reference table. Therefore, all pointers and references to document objects and resources should be reacquired in order to continue document editing.

In order to use incremental save the specified path must match original path and e_incremental flag bit should be set.

Parameters:

path (string) –

The full path name to which the file is saved.

Raises:

An Exception object will be thrown if the file can’t be opened for saving or if there is a problem during Save.

Notes: This method will count as a document usage for consumption-based licensing if the current document has not yet been counted.

Overload 2:

Saves the document to a memory buffer.

Raises:

if there is a problem during Save an Exception object will be thrown.

Notes: This method will count as a document usage for consumption-based licensing if the current document has not yet been counted.

SaveAsXFDF(args)[source]

Overload 1:

Export FDF file as an XFDF file

Parameters:

filepath (string) –

the filepath of the exported XFDF file

Overload 2:

Export FDF file as an XFDF file

Parameters:

filepath (string) –
- the filepath of the exported XFDF file
opts (FDF::XFDFExportOptions) – Options controlling finer parameters of xfdf export

Overload 3:

Export FDF file as a XFDF string

Parameters:: opts – Options controlling finer parameters of xfdf export

Overload 4:

Export FDF file as a XFDF string

Parameters:: opts (FDF::XFDFExportOptions) – Options controlling finer parameters of xfdf export
Return type:: string
Returns:: A UString containing the XFDF representation of the FDF file

SetID(id)[source]

Set the ID entry in “/Root/FDF” dictionary.

Parameters:

id (Obj) –

ID array object.

SetPDFFileName(filepath)[source]

Set the PDF document file that this FDF file was exported from or is intended to be imported into.

Parameters:

filepath (string) –

pathname to the file.

property mp_doc

property thisown: The membership flag

class apryse_sdk.FDFField(args)[source]

Bases: object

FindAttribute(attrib)[source]

The function returns the specified attribute.

Parameters:

attrib (string) –

name of the attribute to find

Return type:

Returns:

return the attribute value if the given attribute name was found or a NULL object if the given attribute name was not found.

GetName()[source]

Return type:: string
Returns:: a string representing the fully qualified name of the field (e.g. “employee.name.first”).

GetPartialName()[source]

Return type:: string
Returns:: a string representing the partial name of the field (e.g. “first” when “employee.name.first” is fully qualified name).
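The relationship between the two names can be sketched in plain Python. This assumes the partial name is simply the last period-separated component of the fully qualified name, as the example suggests. partial_name is a hypothetical helper, not an SDK call.

```python
def partial_name(fully_qualified_name):
    """Return the last dot-separated component of a field name."""
    return fully_qualified_name.rsplit(".", 1)[-1]


print(partial_name("employee.name.first"))
```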

GetSDFObj()[source]

Return type:: Obj
Returns:: the object to the underlying SDF/Cos object.

GetValue()[source]

Return type:: Obj
Returns:: the value of the Field (the value of its /V key) or NULL if the value is not specified. The format of field’s value varies depending on the field type.

SetValue(value)[source]

Sets the value of the FDFField (the value of the field’s /V key).

Parameters:

value (Obj) –

the value to set the FDFField to

Notes: in order to remove/erase the existing value use SetValue(SDF::Null)

property thisown: The membership flag

class apryse_sdk.FDFFieldIterator(args)[source]

Bases: object

Supports a simple iteration over a generic collection.

Current()[source]

Note: HasNext() must be true before calling Current()

Return type:: FDFField
Returns:: the current element in the collection

Destroy()[source]: Frees the native memory of the object.

HasNext()[source]

Return type:: boolean
Returns:: true if the iterator can be successfully advanced to the next element; false if the iterator is no longer valid.

Next()[source]: Note: HasNext() must be true before calling Next() Advances the iterator to the next element of the collection.

property mp_impl

property thisown: The membership flag

class apryse_sdk.Field(args)[source]

Bases: object

An interactive form (sometimes referred to as an AcroForm) is a collection of fields for gathering information interactively from the user. A PDF document may contain any number of Fields appearing on any combination of pages, all of which make up a single, global interactive form spanning the entire document.

PDFNet fully supports reading, writing, and editing PDF forms and provides many utility methods so that work with forms is simple and efficient. Using the PDFNet forms API, arbitrary subsets of form fields can be imported or exported from the document, new forms can be created from scratch, and the appearance of existing forms can be modified.

In PDFNet Fields are accessed through FieldIterator-s. The list of all Fields present in the document can be traversed as follows:

FieldIterator itr = pdfdoc.GetFieldIterator();
for(; itr.HasNext(); itr.Next()) {
  Field field = itr.Current();
  Console.WriteLine("Field name: {0}", field.GetName());
}

For a full sample, please refer to ‘InteractiveForms’ sample project.

To search for a field by name, use the FieldFind method. For example:

FieldIterator itr = pdfdoc.FieldFind("name");
if (itr.HasNext()) {
  Console.WriteLine("Field name: {0}", itr.Current().GetName());
}
else { /* field was not found */ }

If a given field name was not found or if the end of the field list was reached the iterator HasNext() will return false.

If you have a valid iterator you can access the Field using Current() method. For example: Field field = itr.Current();
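Since this reference covers the Python bindings, the same traversal can be sketched in Python as a generator over the HasNext()/Current()/Next() protocol. ListBackedIterator is a hypothetical stand-in used so the snippet runs without the SDK; with a real document one would presumably pass the iterator returned by GetFieldIterator() and call GetName() on each yielded Field.

```python
def iterate(itr):
    """Yield each element of an iterator exposing HasNext/Current/Next."""
    while itr.HasNext():
        yield itr.Current()
        itr.Next()


class ListBackedIterator:
    """Hypothetical stand-in for FieldIterator, backed by a Python list."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def HasNext(self):
        return self._pos < len(self._items)

    def Current(self):
        return self._items[self._pos]

    def Next(self):
        self._pos += 1


names = list(iterate(ListBackedIterator(["employee.name.first",
                                         "employee.name.last"])))
for name in names:
    print("Field name:", name)
```

Checking HasNext() before each Current() call follows the precondition stated for the iterator classes.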

Using Flatten(…) method it is possible to merge field appearances with the page content. Form ‘flattening’ refers to the operation that changes active form fields into a static area that is part of the PDF document, just like the other text and images in the document. A completely flattened PDF form does not have any widget annotations or interactive fields.

Destroy()[source]: Frees the native memory of the object.

EraseAppearance()[source]: Removes any appearances associated with the field.

FindInheritedAttribute(attrib)[source]

Some of the Field attributes are designated as inheritable. If such an attribute is omitted from a Field object, its value is inherited from an ancestor node in the Field tree. If the attribute is a required one, a value must be supplied in an ancestor node; if it is optional and no inherited value is specified, the default value should be used.

The function walks up the Field inheritance tree in search for specified attribute.

Return type:

Returns:

The attribute value if the given attribute name was found, or a NULL object if the given attribute name was not found.

Resources dictionary (Required; inheritable) MediaBox rectangle (Required; inheritable) CropBox rectangle (Optional; inheritable) Rotate integer (Optional; inheritable)

Flatten(page)[source]

Flatten/Merge existing form field appearances with the page content and remove widget annotation.

Form ‘flattening’ refers to the operation that changes active form fields into a static area that is part of the PDF document, just like the other text and images in the document. A completely flattened PDF form does not have any widget annotations or interactive fields.

Parameters:: page (Page) – page object to flatten

Notes: an alternative approach to set the field as read only is using Field.SetFlag(Field::e_read_only, true) method. Unlike Field.SetFlag(…), the result of Flatten() operation can not be programmatically reversed.

GetDefaultAppearance()[source]

Return type:: GState
Returns:: The default graphics state that should be used in formatting the text. The state corresponds to /DA entry in the field dictionary.

GetDefaultValue()[source]

Return type:: Obj
Returns:: The default value to which the field reverts when a reset-form action is executed or NULL if the default value is not specified.

The format of field’s value varies depending on the field type.

GetDefaultValueAsString()[source]

GetFlag(flag)[source]

Return type:: boolean
Returns:: the value of given field flag

GetJustification()[source]

Return type:: int
Returns:: the form of quadding (justification) to be used in displaying the text fields.

GetMaxLen()[source]

Return type:: int
Returns:: The maximum length of the field’s text, in characters, or a negative number if the length is not limited.

Notes: This method is specific to a text field.
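A small sketch of how the return value might be interpreted, with a negative number meaning "no limit". fits_max_len is a hypothetical helper; it counts Python code points with len(), which may not match how the SDK counts characters.

```python
def fits_max_len(text, max_len):
    """True if text respects max_len; a negative max_len means no limit."""
    return max_len < 0 or len(text) <= max_len


print(fits_max_len("abc", -1))
print(fits_max_len("abcd", 3))
```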

GetName()[source]

Return type:: string
Returns:: a string representing the fully qualified name of the field (e.g. “employee.name.first”).

GetOpt(index)[source]

Parameters:: index (int) – index position of the option to retrieve.
Return type:: string
Returns:: The string of the option at the given index.

Notes: The index must be less than the value returned by GetOptCount().

GetOptCount()[source]: Returns the total number of options in a list or combo box.
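GetOpt() and GetOptCount() are naturally used together. The sketch below collects every option while keeping the index below the count, as the note requires. FakeChoiceField is a hypothetical stand-in so the snippet runs on its own; only the two documented method names are assumed.

```python
def list_options(field):
    """Collect every option string of a list or combo box field."""
    return [field.GetOpt(i) for i in range(field.GetOptCount())]


class FakeChoiceField:
    """Hypothetical stand-in for a choice Field."""

    def __init__(self, options):
        self._options = list(options)

    def GetOptCount(self):
        return len(self._options)

    def GetOpt(self, index):
        return self._options[index]


options = list_options(FakeChoiceField(["Red", "Green", "Blue"]))
print(options)
```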

GetPartialName()[source]

Return type:: string
Returns:: a string representing the partial name of the field (e.g. “first” when “employee.name.first” is fully qualified name).

GetSDFObj()[source]

Return type:: Obj
Returns:: the underlying SDF/Cos object.

GetTriggerAction(trigger)[source]

Get the Action associated with the selected Field Trigger event.

Parameters:: trigger (int) – the type of trigger event to get
Return type:: Obj
Returns:: the Action Obj if present, otherwise NULL

GetType()[source]

Return type:: int
Returns:: The field’s type, as one of the type enumeration values (e_button, e_check, e_radio, e_text, e_choice, e_signature or e_null). See the descriptions of individual field types for further information.
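For logging or debugging it can help to turn the integer into a readable name. The mapping below is a sketch that assumes the constants e_button (0) through e_null (6) listed at the end of this class form the field type enumeration. field_type_name is a hypothetical helper.

```python
# Values copied from the constants listed for this class.
FIELD_TYPE_NAMES = {
    0: "e_button",
    1: "e_check",
    2: "e_radio",
    3: "e_text",
    4: "e_choice",
    5: "e_signature",
    6: "e_null",
}


def field_type_name(type_value):
    """Return a readable name for a value returned by GetType()."""
    return FIELD_TYPE_NAMES.get(type_value, f"unknown ({type_value})")


print(field_type_name(3))
```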

GetUpdateRect()[source]

Return type:: Rect
Returns:: The rectangle that should be refreshed after changing a field.

GetValue()[source]

Return type:: Obj
Returns:: the value of the Field (the value of its /V key) or NULL if the value is not specified.

The format of field’s value varies depending on the field type.

GetValueAsBool()[source]

Return type:: boolean
Returns:: Field value as a boolean.

Notes: This method is usually for check-box and radio button fields.

GetValueAsString()[source]

IsAnnot()[source]

Return type:: boolean
Returns:: true if this Field is a Widget Annotation

Determines whether or not this Field is an Annotation.

IsLockedByDigitalSignature()[source]

Returns whether modifying this field would invalidate a digital signature in the document.

Return type:: boolean
Returns:: whether modifying this field would invalidate a digital signature in the document

IsValid()[source]

Return type:: boolean
Returns:: whether this is a valid (non-null) Field. If the function returns false the underlying SDF/Cos object is null and the Field object should be treated as null as well.

RefreshAppearance()[source]

Regenerates the appearance stream for the Widget Annotation containing variable text. Call this method if you modified field’s value and would like to update field’s appearance.

Notes: If this field contains text, and has been added to a rotated page, the text in the field may be rotated. If RefreshAppearance is called after the field is added to a rotated page, then any text will be rotated in the opposite direction of the page rotation. If this method is called before the field is added to any rotated page, then no counter rotation will be applied. If you wish to call RefreshAppearance on a field already added to a rotated page, but you don’t want the text to be rotated, you can do one of the following: temporarily un-rotate the page, or temporarily remove the “P” object from the field.

Rename(field_name)[source]

Modifies the field name.

Parameters:: field_name (string) – a string representing the fully qualified name of the field (e.g. “employee.name.first”).

SetFlag(flag, value)[source]

Set the value of given FieldFlag.

Notes: You can use this method to set the field as read-only. An alternative approach to set the field as read only is using Page.Flatten(…) method. Unlike Flatten(…), the result of SetFlag(…) can be programmatically reversed.

SetJustification(j)[source]

Sets the justification to be used in displaying the text field.

Parameters:: j (int) – enum representing justification to set the text field to, options are e_left_justified, e_centered and e_right_justified

Notes: This method is specific to a text field.

SetMaxLen(max_len)[source]

Sets the maximum length of the field’s text, in characters.

Parameters:: max_len (int) – maximum length of a field’s text.

Notes: This method is specific to a text field.

SetValue(args)[source]

Overload 1:

Sets the value of the field (i.e. the value of the field’s /V key). The format of field’s value varies depending on the field type.

Parameters:: value (string) – the new field value.

Notes: in order to remove/erase the existing value, pass a SDF::Null object to SetValue().

In PDF, Field’s value is separate from its annotation (i.e. how the field appears on the page). After you modify Field’s value you need to refresh Field’s appearance using RefreshAppearance() method.

Alternatively, you can delete “AP” entry from the Widget annotation and set “NeedAppearances” flag in AcroForm dictionary (i.e. doc.GetAcroForm().Put(“NeedAppearances”, Obj.CreateBool(true)); ) This will force viewer application to auto-generate new field appearances every time the document is opened.

Yet another option is to generate a custom annotation appearance using ElementBuilder and ElementWriter and then set the “AP” entry in the widget dictionary to the new appearance stream. This functionality is useful in applications that need advanced control over how the form fields are rendered.

Overload 2:

Sets the value of a check-box or radio-button field.

Parameters:: value (boolean) – If true, the field will be set to ‘True’, if false the field will be set to ‘False’.

Notes: This method is usually for check-box and radio button fields.
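The first option described for SetValue, setting the value and then refreshing the appearance, can be sketched as a two-step helper. FakeField is a hypothetical stand-in that only records the calls it receives; the helper assumes nothing beyond the SetValue() and RefreshAppearance() names documented here.

```python
def set_value_and_refresh(field, value):
    """Set a field value, then regenerate its appearance stream."""
    field.SetValue(value)
    # The value and the on-page appearance are separate in PDF,
    # so the appearance is refreshed after the value changes.
    field.RefreshAppearance()


class FakeField:
    """Hypothetical stand-in for Field that logs method calls."""

    def __init__(self):
        self.calls = []

    def SetValue(self, value):
        self.calls.append(("SetValue", value))

    def RefreshAppearance(self):
        self.calls.append(("RefreshAppearance",))


field = FakeField()
set_value_and_refresh(field, "Jane")
print(field.calls)
```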

UseSignatureHandler(signature_handler_id)[source]

Sets the signature handler to use for adding a signature to this field. If the signature handler is not found in PDFDoc’s signature handlers list, this field will not be signed. To add signature handlers, use PDFDoc.AddSignatureHandler method.

If a signature handler is already assigned to this field and this method is called once again, the associated signature handler for this field will be updated with the new handler.

Parameters:: signature_handler_id (int) – The unique id of the SignatureHandler to use for adding signature in this field.
Return type:: Obj
Returns:: The signature dictionary created using the SignatureHandler, or NULL pointer if the signature handler is not found.

e_action_trigger_calculate = 16

e_action_trigger_format = 14

e_action_trigger_keystroke = 13

e_action_trigger_validate = 15

e_button = 0

e_centered = 1

e_check = 1

e_choice = 4

e_comb = 12

e_combo = 14

e_commit_on_sel_change = 18

e_edit = 15

e_file_select = 9

e_left_justified = 0

e_multiline = 7

e_multiselect = 17

e_no_export = 2

e_no_scroll = 11

e_no_spellcheck = 10

e_null = 6

e_password = 8

e_pushbutton_flag = 3

e_radio = 2

e_radio_flag = 4

e_radios_in_unison = 6

e_read_only = 0

e_required = 1

e_rich_text = 13

e_right_justified = 2

e_signature = 5

e_sort = 16

e_text = 3

e_toggle_to_off = 5

property mp_field

property thisown: The membership flag

class apryse_sdk.FieldIterator(args)[source]

Bases: object

Supports a simple iteration over a generic collection.

Current()[source]

Note: HasNext() must be true before calling Current()

Return type:: Field
Returns:: the current element in the collection

Destroy()[source]: Frees the native memory of the object.

HasNext()[source]

Return type:: boolean
Returns:: true if the iterator can be successfully advanced to the next element; false if the iterator is no longer valid.

Next()[source]: Note: HasNext() must be true before calling Next() Advances the iterator to the next element of the collection.

property mp_impl

property thisown: The membership flag

class apryse_sdk.FileAttachment(args)[source]

Bases: Markup

A file attachment annotation contains a reference to a file, which may be embedded in the PDF document.

static Create(args)[source]

Overload 1:

Creates a file attachment annotation.

A file attachment annotation contains a reference to a file, which typically is embedded in the PDF file.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Rect) – A rectangle specifying the annotation’s bounds, in user space coordinates. Note that FileAttachment icons can differ in their appearance dimensions, so you may want to match these Rectangle dimensions or the aspect ratio to avoid a squished or stretched appearance: e_Graph: 40 x 40; e_PushPin: 28 x 40; e_Paperclip: 14 x 34; e_Tag: 40 x 32.
fs (FileSpec) – a file specification object used to initialize the file attachment annotation.
icon_name (int, optional) – The name of an icon to be used in displaying the annotation, default is PushPin.

Notes: PDF Viewer applications should provide predefined icon appearances for at least the following standard names: Graph, PushPin, Paperclip, Tag. Additional names may be supported as well. Default value: PushPin.

Return type:: FileAttachment
Returns:: A new file attachment annotation.

Overload 2:

Creates a file attachment annotation.

A file attachment annotation contains a reference to a file, which typically is embedded in the PDF file.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Rect) – A rectangle specifying the annotation’s bounds, in user space coordinates. Note that FileAttachment icons can differ in their appearance dimensions, so you may want to match these Rectangle dimensions or the aspect ratio to avoid a squished or stretched appearance: e_Graph: 40 x 40; e_PushPin: 28 x 40; e_Paperclip: 14 x 34; e_Tag: 40 x 32.
path (string) – The path to the file which should be attached
icon_name (int, optional) – An icon to be used in displaying the annotation, default is PushPin.

Notes: PDF Viewer applications should provide predefined icon appearances for at least the following standard names: Graph, PushPin, Paperclip, Tag. Additional names may be supported as well. Default value: PushPin.

Return type:: FileAttachment
Returns:: A new file attachment annotation.

Overload 3:

Creates a file attachment annotation.

A file attachment annotation contains a reference to a file, which typically is embedded in the PDF file.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Rect) – A rectangle specifying the annotation’s bounds, in user space coordinates. Note that FileAttachment icons can differ in their appearance dimensions, so you may want to match these Rectangle dimensions or the aspect ratio to avoid a squished or stretched appearance: e_Graph: 40 x 40; e_PushPin: 28 x 40; e_Paperclip: 14 x 34; e_Tag: 40 x 32.
path (string) – The path to the file which should be attached
icon_name – An icon to be used in displaying the annotation, default is PushPin.

Notes: PDF Viewer applications should provide predefined icon appearances for at least the following standard names: Graph, PushPin, Paperclip, Tag. Additional names may be supported as well. Default value: PushPin.

Return type:: FileAttachment
Returns:: A new file attachment annotation.

Overload 4:

Creates a file attachment annotation. This method should be used when a nonstandard icon type is desired in the annotation.

A file attachment annotation contains a reference to a file, which typically is embedded in the PDF file.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Rect) – A rectangle specifying the annotation’s bounds, in user space coordinates.
path (string) – The path to the file which should be attached
icon_name (string) – The name of an icon to be used in displaying the annotation.

Notes: PDF Viewer applications should provide predefined icon appearances for at least the following standard names: Graph, PushPin, Paperclip, Tag. Additional names may be supported as well. Default value: PushPin.

Return type:: FileAttachment
Returns:: A new file attachment annotation.
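The icon dimensions listed for the pos parameter can be turned into rectangle coordinates with a small helper. This is a sketch: icon_rect and ICON_SIZES are hypothetical names, the sizes are copied from the parameter description, and the returned tuple would presumably supply the four coordinates of a Rect (whose constructor is not covered in this section).

```python
# Width x height per icon, as listed for the pos parameter.
ICON_SIZES = {
    "e_Graph": (40, 40),
    "e_PushPin": (28, 40),
    "e_Paperclip": (14, 34),
    "e_Tag": (40, 32),
}


def icon_rect(x, y, icon, scale=1.0):
    """Return (x1, y1, x2, y2) with the lower-left corner at (x, y)."""
    width, height = ICON_SIZES[icon]
    # Scaling both sides by the same factor preserves the aspect ratio.
    return (x, y, x + width * scale, y + height * scale)


print(icon_rect(100, 200, "e_Paperclip"))
```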

static CreateAnnot(args)[source]

Export(args)[source]

The function saves the data referenced by this File Attachment to an external file.

If the file is embedded, the function saves the embedded file. If the file is not embedded, the function will copy the external file. If the file is not embedded and the external file can’t be found, the function returns false.

Parameters:: save_as (string, optional) – An optional parameter indicating the filepath and filename where the data should be saved. If this parameter is not specified the function will attempt to save the file using FileSpec.GetFilePath().
Return type:: boolean
Returns:: true if the file was saved successfully, false otherwise.

GetFileSpec()[source]

Return type:: FileSpec
Returns:: the file specification that contains a file reference or the embedded file data stream.

GetIcon()[source]

Return type:: int
Returns:: The type of the associated icon style. Default: e_PushPin.

Notes: The annotation dictionary’s appearance stream, if present, will take precedence over this entry when displaying the annotation in the viewer.

GetIconName()[source]

Returns the name of the icon associated with the FileAttachment annotation.

Return type:: string
Returns:: A string denoting the name of the icon.

See also: GetIcon(). GetIconName() returns the icon name as it appears in the annotation dictionary, while GetIcon() returns the same icon name converted to an enumeration value.

Notes: The annotation dictionary’s appearance stream, if present, will take precedence over this entry when displaying the annotation in the viewer.

SetFileSpec(file)[source]

Sets the file specification.

Parameters:: file (FileSpec) – The file specification to associate with this annotation. The file specification contains a file reference or the embedded file data stream.

SetIcon(args)[source]

Sets the icon style associated with FileAttachment annotation. (Optional)

Parameters:: type (int, optional) – The icon style. Default: e_PushPin.

Notes: The annotation dictionary’s appearance stream, if present, will take precedence over this entry when displaying the annotation in the viewer.

SetIconName(iname)[source]

Sets the name of the icon associated with the FileAttachment annotation. (Optional)

Parameters:: iname (string) – A string denoting the name of the icon.

Notes: This method should be used to assign a non-standard icon type to the annotation. The annotation dictionary’s appearance stream, if present, will take precedence over this entry when displaying the annotation in the viewer.

See also: SetIcon()

e_Graph = 0: The icon has a graph appearance.

e_Paperclip = 2: The icon has a paper clip appearance.

e_PushPin = 1: The icon has a push pin appearance.

e_Tag = 3: The icon has a tag appearance.

e_Unknown = 4: The icon has an unrecognized appearance type.
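The relationship between GetIconName() and GetIcon() can be pictured as a name-to-constant lookup. The following is a minimal pure-Python sketch, not SDK code: the integer values are copied from the constants listed above, while ICON_BY_NAME and icon_from_name are hypothetical names, and mapping unrecognized names to e_Unknown is an assumption inferred from that constant's description.

```python
# Constant values copied from the FileAttachment icon list above.
e_Graph, e_PushPin, e_Paperclip, e_Tag, e_Unknown = 0, 1, 2, 3, 4

# Hypothetical lookup table from standard icon names to constants.
ICON_BY_NAME = {
    "Graph": e_Graph,
    "PushPin": e_PushPin,
    "Paperclip": e_Paperclip,
    "Tag": e_Tag,
}


def icon_from_name(name):
    """Return the constant for a standard icon name, else e_Unknown (assumed)."""
    return ICON_BY_NAME.get(name, e_Unknown)
```

A non-standard name set through SetIconName() would, under this assumption, be reported by the enumeration lookup as e_Unknown while the name itself stays available as a string.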

property thisown: The membership flag

class apryse_sdk.FileSpec(args)[source]

Bases: object

FileSpec corresponds to the PDF file specification object.

A PDF file can refer to the contents of another file by using a file specification, which can take either of the following forms:

A simple file specification gives just the name of the target file in a standard format, independent of the naming conventions of any particular file system.
A full file specification includes information related to one or more specific file systems.
A URL reference.

Although the file designated by a file specification is normally external to the PDF file referring to it, it is also possible to embed the file allowing its contents to be stored or transmitted along with the PDF file. However, embedding a file does not change the presumption that it is external to (or separate from) the PDF file.

For more details on file specifications, please refer to Section 3.10, ‘File Specifications’ in the PDF Reference Manual.

static Create(doc, path, embed=True)[source]

Creates a file specification for the given file. By default, the specified file is embedded in the PDF.

Parameters:

doc (SDFDoc) – A document to which the FileSpec should be added. To obtain SDFDoc from PDFDoc use PDFDoc::GetSDFDoc() or Obj::GetDoc().
path (string) – The path to convert into a file specification.
embed (boolean, optional) – A flag indicating whether to embed the specified file in the PDF. By default, all files are embedded.

Return type:

FileSpec

Returns:

newly created FileSpec object.

static CreateURL(doc, url)[source]

Creates a URL file specification.

Parameters:

doc (SDFDoc) – A document to which the FileSpec should be added. To obtain SDFDoc from PDFDoc use PDFDoc::GetSDFDoc() or Obj::GetDoc().
url (string) – A uniform resource locator (URL) of the form defined in Internet RFC 1738, Uniform Resource Locators Specification.

Return type:

FileSpec

Returns:

newly created FileSpec object.

Export(args)[source]

The function saves the data referenced by this FileSpec to an external file.

Parameters:: save_as (string, optional) – An optional parameter indicating the filepath and filename where the data should be saved. If this parameter is not specified, the function will attempt to save the file using FileSpec.GetFilePath().

If the file is embedded, the function saves the embedded file. If the file is not embedded, the function will copy the external file. If the file is not embedded and the external file can’t be found, the function returns false.

Return type:: boolean
Returns:: true if the file was saved successfully, false otherwise.
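The fallback order described for Export() (embedded data first, then the external file, otherwise failure) can be modeled in a few lines. This is a hedged, standard-library sketch of that decision logic only; export_file and its parameters are hypothetical and do not call the SDK.

```python
import os
import shutil


def export_file(embedded_data, external_path, save_as):
    """Model of the documented fallback order; returns True on success."""
    if embedded_data is not None:
        # Embedded case: write the embedded bytes to the target path.
        with open(save_as, "wb") as out:
            out.write(embedded_data)
        return True
    if external_path and os.path.isfile(external_path):
        # Not embedded, but the external file exists: copy it.
        shutil.copyfile(external_path, save_as)
        return True
    # Not embedded and the external file cannot be found.
    return False
```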

GetFileData()[source]

The function returns data referenced by this FileSpec.

Return type:: Filter
Returns:: A stream (filter) containing file data. If the file is embedded, the function returns a stream to the embedded file. If the file is not embedded, the function will return a stream to the external file. If the file is not embedded and the external file can’t be found, the function returns NULL.

GetFilePath()[source]

Return type:: string
Returns:: The file path for this file specification.

If the FileSpec is a dictionary, a corresponding platform-specific path is returned (DOS, Mac, or Unix). Otherwise the function returns the path represented in the form described in Section 3.10.1, ‘File Specification Strings,’ or, if the file system is URL, as a uniform resource locator (URL). If the FileSpec is not valid, an empty string is returned.

GetSDFObj()[source]

Return type:: Obj
Returns:: The underlying SDF/Cos object.

IsValid()[source]

Return type:: boolean
Returns:: whether this is a valid (non-null) FileSpec. If the function returns false the underlying SDF/Cos object is null or is not valid and the FileSpec object should be treated as null as well.

SetDesc(desc)[source]: The function sets the descriptive text associated with the file specification. This text is typically used in the EmbeddedFiles name tree.

property mp_impl

property thisown: The membership flag

class apryse_sdk.Filter(args)[source]

Bases: object

Provides a generic view of a sequence of bytes.

A Filter is the abstract base class of all filters. A filter is an abstraction of a sequence of bytes, such as a file, an input/output device, an inter-process communication pipe, or a TCP/IP socket. The Filter class and its derived classes provide a generic view of these different types of input and output, isolating the programmer from the specific details of the operating system and the underlying devices.

Besides providing access to input/output sources, Filters can also be used to transform the data (e.g. to compress the data stream, to normalize the image data, to encrypt data, etc). Filters can also be attached to each other to form pipelines. For example, a filter used to open an image data file can be attached to a filter that decompresses the data, which is attached to another filter that will normalize the image data.

Depending on the underlying data source or repository, filters might support only some of these capabilities. An application can query a stream for its capabilities by using the IsInputFilter() and CanSeek() properties.
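The pipeline idea can be illustrated without the SDK. The sketch below is an assumption-laden model in plain Python: SourceFilter and InflateFilter are hypothetical classes, zlib stands in for a decompression stage, and can_seek() mimics a capability query in the spirit of CanSeek().

```python
import io
import zlib


class SourceFilter:
    """Hypothetical source stage: wraps raw bytes and supports seeking."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def can_seek(self):
        return True

    def read(self, size=-1):
        return self._stream.read(size)


class InflateFilter:
    """Hypothetical transform stage: decompresses what the attached stage yields."""

    def __init__(self, attached):
        self._attached = attached
        self._inflater = zlib.decompressobj()

    def can_seek(self):
        # In this sketch a transforming stage does not offer seeking.
        return False

    def read(self, size=-1):
        chunk = self._attached.read(size)
        return self._inflater.decompress(chunk)


compressed = zlib.compress(b"hello filter pipeline")
pipeline = InflateFilter(SourceFilter(compressed))
result = pipeline.read()
```

The caller only talks to the outermost stage, which is the point of chaining: each stage pulls from the one attached to it.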

Notes: To read or write data to a filter, a user will typically use the FilterReader/FilterWriter classes instead of using Filter methods directly.

For example:

StdFile file("my_stream.txt", StdFile::e_read_mode);
FilterReader reader(file);
while (reader.Read(..)) ...

AttachFilter(attach_filter)[source]

Attaches a filter to this filter. If this filter owns another filter it will be deleted. This filter then becomes the owner of the attached filter.

Parameters:: attach_filter (Filter) – filter object to attach

CanSeek()[source]

Return type:

boolean

Returns:

true if the stream supports seeking; otherwise, false. The default implementation returns false.

Consume(num_bytes)[source]

Moves the Begin() pointer num_bytes forward.

Parameters:

num_bytes (int) – number of bytes to consume. num_bytes must be less than or equal to Size().

Count()[source]

Return type:

int

Returns:

the number of bytes consumed since opening the filter or since the last Seek operation.

CreateInputIterator()[source]

Creates a Filter iterator. A Filter iterator is similar to a regular filter. However, there can be only one owner of the attached filter.

Notes: Derived classes should make sure that there is only one owner of the attached stream. Otherwise the attached stream may be deleted several times.

Raises:

throws an exception if the method is not implemented in the derived class

Destroy()[source]: Frees the native memory of the object.

Flush()[source]: Forces any data remaining in the buffer to be written to input or output filter.

FlushAll()[source]: Forces any data remaining in the filter chain to the source or destination.

GetAttachedFilter()[source]

Return type:

Filter

Returns:

The attached Filter, or a NULL filter if no filter is attached.

GetDecodeName()[source]

Return type:

string

Returns:

string representing the name of the corresponding decode filter as it should appear in the document (e.g. both ASCIIHexDecode and ASCIIHexEncode should return ASCIIHexDecode).

GetFilePath()[source]

Return type:: string
Returns:: the file path to the underlying file stream. Default implementation returns empty string.

GetName()[source]

Return type:

string

Returns:

descriptive name of the filter.

GetSourceFilter()[source]

Return type:

Filter

Returns:

The first filter in the chain (usually a file filter).

IsInputFilter()[source]

Return type:

boolean

Returns:

boolean indicating whether this is an input filter.

ReleaseAttachedFilter()[source]

Release the ownership of the attached filter. After the attached filter is released this filter points to NULL filter.

Return type:

Filter

Returns:

The previously attached filter.

Seek(offset, origin)[source]

When overridden in a derived class, sets the position within the current stream.

Parameters:

offset (ptrdiff_t) – A byte offset relative to origin. If offset is negative, the new position will precede the position specified by origin by the number of bytes specified by offset. If offset is zero, the new position will be the position specified by origin. If offset is positive, the new position will follow the position specified by origin by the number of bytes specified by offset.
origin (int) – A value of type ReferencePos indicating the reference point used to obtain the new position.

Notes: - After each Seek() operation the number of consumed bytes (i.e. Count()) is set to 0.

Raises:

throws FilterExc if the method is not implemented in derived class
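The offset and origin rules for Seek() amount to simple arithmetic. The sketch below is a hedged illustration in plain Python: resolve_seek is a hypothetical helper, and the origin values 0, 1, and 2 are taken from the e_begin, e_cur, and e_end constants listed for this class.

```python
# Origin values as listed for Filter: e_begin = 0, e_cur = 1, e_end = 2.
E_BEGIN, E_CUR, E_END = 0, 1, 2


def resolve_seek(offset, origin, current, length):
    """Return the absolute position that a seek with (offset, origin) targets."""
    if origin == E_BEGIN:
        base = 0
    elif origin == E_CUR:
        base = current
    elif origin == E_END:
        base = length
    else:
        raise ValueError("unknown origin")
    # Negative offsets land before the reference point, positive ones after it.
    return base + offset
```

For a 100-byte stream, an offset of -4 relative to the end targets byte 96, and an offset of 0 relative to the current position leaves the position unchanged.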

SetCount(new_count)[source]

Sets a new counting point for the current filter. All subsequent Consume() operations will increment this counter.

Make sure that the output filter is flushed before using SetCount().

Parameters:

new_count (int) – number to set the counting point of the filter to.

Return type:

int

Returns:

the value of previous counter
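How Count(), Consume(), Seek(), and SetCount() interact can be modeled with a tiny counter class. This is a minimal sketch under the behavior stated in this section; CountingCursor and its method names are hypothetical and only track the counter, not any data.

```python
class CountingCursor:
    """Hypothetical model of a filter's consumed-byte counter."""

    def __init__(self):
        self.count = 0

    def consume(self, num_bytes):
        # Each consume operation increments the counter.
        self.count += num_bytes

    def seek(self):
        # Documented note: the counter is reset to 0 after each seek.
        self.count = 0

    def set_count(self, new_count):
        # Install a new counting point and report the previous value.
        previous = self.count
        self.count = new_count
        return previous
```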

SetStreamLength(bytes)[source]

The function specifies the length of the data stream. The default implementation doesn’t do anything. For some derived filters such as file segment filter it may be useful to override this function in order to limit the stream length.

Parameters:: bytes (int) – the length of stream in bytes

Size()[source]

Return type:

int

Returns:

the size of the buffer returned by Begin(). If Size() returns 0, the end of data has been reached.

Tell()[source]

Reports the current read position in the stream relative to the stream origin.

Return type:

ptrdiff_t

Returns:

The current position in the stream

Raises:

throws FilterExc if the method is not implemented in derived class

Truncate(new_size)[source]

Truncates the underlying data.

This method is for a writeable, seekable filter only and will throw otherwise.

Notes: For a filter representing a file, truncation would mean resizing the file.

Parameters:

new_size (int) –

the number of bytes to resize the filter to

Return type:

int

Returns:

The new size of the filter

Raises:

throws FilterExc if the method is not implemented in derived class

WriteToFile(path, append)[source]

Writes the entire filter, starting at current position, to specified filepath. Should only be called on an input filter.

Parameters:

path (string) – the output filepath.
append (boolean) – ‘true’ to append to existing file contents, ‘false’ to overwrite.

e_begin = 0

e_cur = 1

e_end = 2

property m_impl

property m_owner

property thisown: The membership flag

class apryse_sdk.FilterReader(args)[source]

Bases: object

FilterReader is a utility class providing a convenient way to read data from an input filter (using Filter directly is not very intuitive).

For example:

StdFile file("my_stream.txt", StdFile::e_read_mode);
FilterReader reader(file);
while (reader.Read(...)) ...

AttachFilter(filter)[source]

Attaches a filter to this FilterReader.

Parameters:: filter (Filter) – filter object to attach

Count()[source]

Return type:

int

Returns:

the number of bytes consumed since opening the filter or since the last Seek operation.

Flush()[source]: Forces any data remaining in the buffer to be written to input or output filter.

FlushAll()[source]: Forces any data remaining in the filter chain to the source or destination.

Get()[source]

Return type:

int

Returns:

the next character from the stream or EOF (-1) if the end of file is reached.

GetAttachedFilter()[source]

Return type:

Filter

Returns:

The attached Filter or a NULL filter if no filter is attached.

Peek()[source]

Return type:

int

Returns:

the next character without extracting it from the stream, or EOF (-1) if the end of file is reached.
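The difference between Get() and Peek() is whether the read position advances. The following plain-Python sketch mirrors that contract on an io.BytesIO stream; the get and peek helpers are hypothetical stand-ins, not SDK calls.

```python
import io

EOF = -1


def get(stream):
    """Read and consume one byte; return EOF (-1) at end of data."""
    b = stream.read(1)
    return b[0] if b else EOF


def peek(stream):
    """Return the next byte without consuming it; EOF (-1) at end of data."""
    pos = stream.tell()
    b = stream.read(1)
    stream.seek(pos)  # restore the position so nothing is consumed
    return b[0] if b else EOF


s = io.BytesIO(b"AB")
```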

Read(buf_size)[source]

Return type:

std::vector< unsigned char,std::allocator< unsigned char > >

Returns:

The bytes actually read from the stream. Fewer than buf_size bytes may be returned if the end of the stream is encountered first.
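Because a read may return less than buf_size near the end of the stream, callers usually loop until nothing more comes back. The sketch below shows that loop on a standard-library stream; read_all is a hypothetical helper and io.BytesIO is only a stand-in for a reader.

```python
import io


def read_all(stream, buf_size=4096):
    """Collect a stream in buf_size chunks; the last chunk may be shorter."""
    chunks = []
    while True:
        chunk = stream.read(buf_size)
        if not chunk:
            break  # an empty read signals the end of the stream
        chunks.append(chunk)
    return chunks


chunks = read_all(io.BytesIO(b"x" * 10), buf_size=4)
```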

Seek(offset, origin)[source]

Sets the position within the current stream.

Parameters:

offset (ptrdiff_t) – A byte offset relative to origin. If offset is negative, the new position will precede the position specified by origin by the number of bytes specified by offset. If offset is zero, the new position will be the position specified by origin. If offset is positive, the new position will follow the position specified by origin by the number of bytes specified by offset.
origin (int) – A value of type ReferencePos indicating the reference point used to obtain the new position.

Notes: After each Seek() operation the number of consumed bytes (i.e. Count()) is set to 0.

Raises:

throws an exception if the method is not implemented in the associated filter.

Tell()[source]

Reports the current read position in the stream relative to the stream origin.

Return type:

ptrdiff_t

Returns:

The current position in the stream

Raises:

throws an exception if the method is not implemented in the associated filter.

property m_impl

property thisown: The membership flag

class apryse_sdk.FilterWriter(args)[source]

Bases: object

FilterWriter is a utility class providing a convenient way to write data to an output filter (using Filter directly is not very intuitive).

For example:

StdFile outfile("file.dat", StdFile::e_write_mode);
FilterWriter fwriter(outfile);
fwriter.WriteBuffer(buf, buf_sz);
fwriter.Flush();

AttachFilter(filter)[source]

Attaches a filter to this FilterWriter.

Parameters:: filter (Filter) – filter object to attach

Count()[source]

Return type:

int

Returns:

the number of bytes consumed since opening the filter or since the last Seek operation.

Flush()[source]: Forces any data remaining in the buffer to be written to input or output filter.

FlushAll()[source]: Forces any data remaining in the filter chain to the source or destination.

GetAttachedFilter()[source]

Return type:

Filter

Returns:

The attached Filter or a NULL filter if no filter is attached.

Seek(offset, origin)[source]

Sets the position within the current stream.

Parameters:

offset (ptrdiff_t) – A byte offset relative to origin. If offset is negative, the new position will precede the position specified by origin by the number of bytes specified by offset. If offset is zero, the new position will be the position specified by origin. If offset is positive, the new position will follow the position specified by origin by the number of bytes specified by offset.
origin (int) – A value of type ReferencePos indicating the reference point used to obtain the new position.

Notes: After each Seek() operation the number of consumed bytes (i.e. Count()) is set to 0.

Raises:

throws an exception if the method is not implemented in the associated filter.

Tell()[source]

Reports the current read position in the stream relative to the stream origin.

Return type:

ptrdiff_t

Returns:

The current position in the stream

Raises:

throws an exception if the method is not implemented in the associated filter.

WriteBuffer(buf)[source]

Parameters:

buf (std::vector< unsigned char,std::allocator< unsigned char > >) – buffer object to write out.

Return type:

int

Returns:

The number of bytes actually written to the stream. This number may be less than the size of buf if the stream is corrupted.

WriteFilter(reader)[source]

Write the entire input stream to the output stream (i.e. to this FilterWriter).

Parameters:: reader (FilterReader) – A FilterReader attached to an input stream.

WriteInt16(num)[source]

Write an integer to the output stream.

Parameters:: num (int) – An integer to write to the output stream.

WriteInt32(num)[source]

WriteInt64(num)[source]

WriteLine(line, eol=13)[source]

Writes out a null-terminated ‘line’ followed by an end-of-line character. The default end-of-line character is carriage return.

Parameters:

line (string) – string to write out.
eol (char, optional) – end of line character. Defaults to carriage return (0x0D).
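The effect of the eol parameter can be shown with a byte buffer. This is a minimal sketch rather than SDK code: write_line is a hypothetical helper, the default of 13 (carriage return, 0x0D) comes from the signature above, and the Latin-1 text encoding is an assumption made only for the illustration.

```python
def write_line(out, line, eol=13):
    """Append `line` plus one end-of-line byte to a bytearray."""
    out.extend(line.encode("latin-1"))  # encoding is an assumption of this sketch
    out.append(eol)                     # 13 is carriage return (0x0D)
    return out


buf = write_line(bytearray(), "BT")
buf = write_line(buf, "ET", eol=10)     # 10 is line feed (0x0A)
```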

WriteString(args)[source]

Overload 1:

Write a string to the output stream.

Parameters:: str (string) – A string to write to the output stream.

Overload 2:

Write a null terminated string

Parameters:: str (string) – A null-terminated string to write to the output stream.

WriteUChar(ch)[source]

Write a single character to the output stream.

Parameters:: ch (UChar) – An unsigned character to write to the output stream.

WriteUInt16(num)[source]

WriteUInt32(num)[source]

WriteUInt64(num)[source]

property m_impl

property thisown: The membership flag

class apryse_sdk.FlateEncode(input_filter, compression_level=-1, buf_sz=256)[source]

Bases: Filter

FlateEncode filter can be used to compress any data stream using Flate (i.e. ZIP) compression method.
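Flate compression is the zlib/deflate method, so Python's standard zlib module gives a feel for what such a filter does to a byte stream. The sketch below does not use the SDK; it assumes, without confirmation from the source, that a compression_level of -1 means "library default" in the same way zlib's -1 does.

```python
import zlib

data = b"stream data " * 100

# A level of -1 asks zlib for its default compression level.
compressed = zlib.compress(data, -1)

# Decompressing restores the original bytes exactly.
restored = zlib.decompress(compressed)
```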

property thisown: The membership flag

class apryse_sdk.Flattener[source]

Bases: object

Flattener is an optional PDFNet add-on that can be used to simplify and optimize existing PDFs to render faster on devices with lower memory and speeds.

PDF documents can frequently contain very complex page description (e.g. thousands of paths, different shadings, color spaces, blend modes, large images etc.) that may not be suitable for interactive viewing on mobile devices. Flattener can be used to speed-up PDF rendering on mobile devices and on the Web by simplifying page content (e.g. flattening complex graphics into images) while maintaining vector text whenever possible.

By using the FlattenMode::e_simple option each page in the PDF will be reduced to a single background image, with the remaining text over top in vector format. Some text may still get flattened, in particular any text that is clipped, or underneath, other content that will be flattened.

On the other hand the FlattenMode::e_fast will not flatten simple content, such as simple straight lines, nor will it flatten Type3 fonts.

Notes: ‘Flattener’ is available as a separately licensable add-on to PDFNet core license.

See ‘pdftron.PDF.Optimizer’ for alternate approach to optimize PDFs with focus on file size reduction.

Destroy()[source]: Frees the native memory of the object.

Process(args)[source]

Overload 1:

Process each page in the PDF, flattening content that matches the mode criteria.

Parameters:

doc (PDFDoc) – the document to flatten.
mode (int) – indicates the criteria for which elements are flattened.

Overload 2:

Process the given page, flattening content that matches the mode criteria.

Parameters:

page (Page) – the page to flatten.
mode (int) – indicates the criteria for which elements are flattened.

SetDPI(dpi)[source]

The output resolution, from 1 to 1000, in Dots Per Inch (DPI) at which to render elements which cannot be directly converted. The default value is 150 Dots Per Inch.

Parameters:: dpi (int) – the resolution in Dots Per Inch
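To estimate what a given DPI means for rasterized content, recall that PDF user space uses 72 points per inch. The helper below is a hypothetical, SDK-independent sketch; the 1 to 1000 range check and the default of 150 follow the description above.

```python
def raster_size(width_pt, height_pt, dpi=150):
    """Pixel dimensions of a region rendered at `dpi` (72 points per inch)."""
    if not 1 <= dpi <= 1000:
        raise ValueError("dpi must be between 1 and 1000")
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


letter = raster_size(612, 792)  # US Letter page at the default 150 DPI
```

A full US Letter page at 150 DPI comes to 1275 by 1650 pixels, which is useful when weighing resolution against image size.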

SetJPGQuality(quality)[source]

Specifies the compression quality to use when generating JPEG images.

Parameters:: quality (int) – the JPEG compression quality, from 0(highest compression) to 100(best quality).

SetMaximumImagePixels(max_pixels)[source]

Specifies the maximum image size in pixels.

Parameters:: max_pixels (int) – the maximum number of pixels an image can have.

SetPathHinting(enable_hinting)[source]

Enable or disable path hinting.

Parameters:: enable_hinting (boolean) – if true path hinting is enabled. Path hinting is used to slightly adjust paths in order to avoid or alleviate artifacts of hair line cracks between certain graphical elements. This option is turned on by default.

SetPreferJPG(jpg)[source]

Specifies whether to leave images in existing compression, or as JPEG.

Parameters:: jpg (boolean) – if true PDF will contain all JPEG images.

SetThreshold(threshold)[source]

Used to control how precise or relaxed text flattening is. When some text is preserved (not flattened to image) the visual appearance of the document may be altered.

Parameters:: threshold (int) – the threshold setting to use.

e_fast = 1: Feature reduce PDF while trying to preserve some complex PDF features (such as vector figures, transparency, shadings, blend modes, Type3 fonts etc.) for pages that are already fast to render. This option can also result in smaller faster files compared to e_simple, but the pages may have more complex structure.

e_simple = 0: Feature reduce PDF to a simple two layer representation consisting of a single background RGB image and a simple top text layer.

e_threshold_default = 2: Render text that is somewhat clipped or occluded.

e_threshold_keep_all = 4: Only render text that is completely occluded, or used as a clipping path.

e_threshold_keep_most = 3: Only render text that is seriously clipped or occluded.

e_threshold_strict = 1: Render text that is marginally clipped or occluded.

e_threshold_very_strict = 0: Render (flatten) any text that is clipped or occluded.
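Read in numeric order, the threshold constants form a scale from "flatten the most text" to "keep the most text". The sketch below restates them as a Python IntEnum purely for illustration; the Threshold class is hypothetical and only the integer values come from the list above.

```python
from enum import IntEnum


class Threshold(IntEnum):
    """Values copied from the e_threshold_* constants listed above."""
    VERY_STRICT = 0
    STRICT = 1
    DEFAULT = 2
    KEEP_MOST = 3
    KEEP_ALL = 4


# Lower values flatten more text; higher values keep more text as vectors.
ordered = sorted(Threshold)
```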

property mp_impl

property thisown: The membership flag

class apryse_sdk.FlowDocument[source]

Bases: object

The class FlowDocument. Encapsulates document creation API.

AddList()[source]

Adds a list to the document.

Return type:: List
Returns:: The list object

AddParagraph(args)[source]

Overload 1:

Adds a paragraph to the document.

Return type:: Paragraph
Returns:: The paragraph object

Overload 2:

Adds a paragraph to the document and sets the text.

Return type:: Paragraph
Returns:: The paragraph object

AddTable()[source]

Adds a table to the document.

Return type:: Table
Returns:: The table object

GetBody()[source]

Gets the body of the document.

The body is the root of the content tree. It can be used to traverse the content tree via the ContentNodeIterator object.

Return type:: ContentNode
Returns:: The body of the document

PaginateToPDF()[source]

Paginates the content tree into a PDFDoc object.

Return type:: PDFDoc
Returns:: The PDFDoc object

SetDefaultMargins(left, top, right, bottom)[source]

Set the default margins for the document.

Parameters:

left (double) – The left margin in points
top (double) – The top margin in points
right (double) – The right margin in points
bottom (double) – The bottom margin in points

SetDefaultPageSize(width, height)[source]

Set the default page size for the document.

Parameters:

width (double) – The width in points
height (double) – The height in points
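Since margins and page sizes are given in points, it can help to convert from inches (72 points per inch) and check how much room remains for content. The helpers below are hypothetical and independent of the SDK; they only perform the arithmetic.

```python
POINTS_PER_INCH = 72.0


def inches(value):
    """Convert inches to points."""
    return value * POINTS_PER_INCH


def content_box(width, height, left, top, right, bottom):
    """Width and height (in points) left for content inside the margins."""
    return width - left - right, height - top - bottom


page_w, page_h = inches(8.5), inches(11)  # US Letter: 612 x 792 points
box = content_box(page_w, page_h, inches(1), inches(1), inches(1), inches(1))
```

With one-inch margins on a US Letter page, the content area is 468 by 648 points.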

property m_impl

property thisown: The membership flag

class apryse_sdk.Font(args)[source]

Bases: object

A font that is used to draw text on a page. It corresponds to a Font Resource in a PDF file. More than one page may reference the same Font object. A Font has a number of attributes, including an array of widths, the character encoding, and the font’s resource name.

PDF document can contain several different types of fonts and Font class represents a single, flat interface around all PDF font types.

There are two main classes of fonts in PDF: simple and composite fonts.

Simple fonts are Type1, TrueType, and Type3 fonts. All simple fonts have the following properties:

Glyphs in the font are selected by single-byte character codes obtained from a string that is shown by the text-showing operators. Logically, these codes index into a table of 256 glyphs; the mapping from codes to glyphs is called the font’s encoding. Each font program has a built-in encoding. Under some circumstances, the encoding can be altered by means described in Section 5.5.5 “Character Encoding” in PDF Reference Manual.

Each glyph has a single set of metrics. Therefore simple fonts support only horizontal writing mode.

A composite font is one whose glyphs are obtained from a font-like object called a CIDFont (e.g. CIDType0Font and CIDType2Font). A composite font is represented by a font dictionary whose Subtype value is Type0. The Type 0 font is known as the root font, while its associated CIDFont is called its descendant. CID-keyed fonts provide a convenient and efficient method for defining multiple-byte character encodings and fonts with a large number of glyphs. These capabilities provide great flexibility for representing text in writing systems for languages with large character sets, such as Chinese, Japanese, and Korean (CJK).

static Create(args)[source]

Overload 1:

Create a PDF::Font object for the given standard font (also known as a base 14 font).

Overload 2:

Create a CID TrueType PDF font with the characteristics specified in the LOGFONTA structure.

Parameters:

doc (SDFDoc) – document in which the external font should be embedded.
logfonta – A pointer to a Windows LOGFONTA structure that defines the characteristics of the logical font.
embed – a boolean indicating whether the font should be embedded or not. For accurate font reproduction set the embed flag to ‘true’.
subset – a boolean indicating whether the embedded font should be subsetted.
encoding – the encoding type, either e_IdentityH (default) or e_Indices (to write glyph indices rather than unicode).

Notes: This method is available only on Windows platforms.

Create a CID TrueType PDF font with the characteristics specified in the LOGFONTW structure.

Parameters:

doc (SDFDoc) – document in which the external font should be embedded.
logfontw – A pointer to a Windows LOGFONTW structure that defines the characteristics of the logical font.
embed – a boolean indicating whether the font should be embedded or not. For accurate font reproduction set the embed flag to ‘true’.
subset – a boolean indicating whether the embedded font should be subsetted.
encoding – the encoding type, either e_IdentityH (default) or e_Indices (to write glyph indices rather than unicode).

Notes: This method is available only on Windows platforms.

Create a new Unicode font based on the description of an existing PDF font.

Parameters:

doc (SDFDoc) – document in which the external font should be embedded.
from (Font) – A Font object that provides the name for choosing a font. If the font with that name can be located and it covers a sufficient character set, characters from that font will be used. Otherwise the font object created will be from another font that covers the character set.
char_set (string) – An initial character set. This provides an approach to specify any characters that are required to be included in the final font as part of a string. Note that additional characters will be added to the character set as needed, so it is not required to specify them here. (empty string is a perfectly valid and common value for this argument)

Overload 3:

Create a new Unicode font based on the description of an existing PDF font.

Parameters:

doc (SDFDoc) – document in which the external font should be embedded.
name (string) – A font name that provides a hint when choosing a font. If the font with that name can be located and it covers a sufficient character set, characters from that font will be used. Otherwise the font object created will be from another font that covers the character set.
char_set (string) – An initial character set. This provides an approach to specify any characters that are required to be included in the final font as part of a string. Note that additional characters will be added to the character set as needed, so it is not required to specify them here. (empty string is a perfectly valid and common value for this argument)

static CreateCIDTrueTypeFont(args)[source]

Embed an external TrueType font in the document as a CID font. By default the function selects “Identity-H” encoding that maps 2-byte character codes ranging from 0 to 65,535 to the same Unicode value. Other predefined encodings are listed in Table 5.15 ‘Predefined CMap names’ in PDF Reference Manual.

Parameters:

doc (SDFDoc) – document in which the external font should be embedded.
font_path (string) – path to the external font file.
embed (boolean, optional) – a boolean indicating whether the font should be embedded or not. For accurate font reproduction set the embed flag to ‘true’.
subset (boolean, optional) – a boolean indicating whether the embedded font should be subsetted.
encoding (int, optional) – the encoding type, either e_IdentityH (default) or e_Indices (to write glyph indices rather than unicode).
ttc_font_index (int, optional) – if a TrueTypeCollection (TTC) font is loaded this parameter controls which font is actually picked.
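The 2-byte character codes used with “Identity-H” can be pictured as each code point written in two bytes, high-order byte first. The sketch below is an illustration in plain Python, not SDK code: to_two_byte_codes is a hypothetical helper, and it assumes the text contains only code points in the range 0 to 65,535 described above.

```python
def to_two_byte_codes(text):
    """Encode text as big-endian 2-byte codes equal to the code points."""
    codes = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError("code point outside the 2-byte range 0..65535")
        codes += cp.to_bytes(2, "big")
    return bytes(codes)


encoded = to_two_byte_codes("A\u4e2d")  # 'A' followed by a CJK character
```

A simple font, by contrast, selects glyphs with single-byte codes, which is why text with large character sets calls for a CID font.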

static CreateTrueTypeFont(doc, font_path, embed=True, subset=True)[source]

Embed an external TrueType font in the document as a Simple font.

Notes: glyphs in the Simple font are selected by single-byte character codes. If you want to work with multi-byte character codes (e.g. UTF16) you need to create a CID font.

Parameters:

doc (SDFDoc) – Document in which the external font should be embedded.
font_path (string) – Path to the external font file.
embed (boolean, optional) – A boolean indicating whether the font should be embedded or not. For accurate font reproduction set the embed flag to ‘true’.
subset (boolean, optional) – A boolean indicating whether the embedded font should be subsetted.

static CreateType1Font(doc, font_path, embed=True)[source]

Embed an external Type1 font in the document.

Parameters:

doc (SDFDoc) –
- document in which the external font should be embedded.
font_path (string) –
- path to the external font file.
embed (boolean, optional) –
- a boolean indicating whether the font should be embedded or
not. For accurate font reproduction set the embed flag to ‘true’.

Destroy()[source]: Frees the native memory of the object.

GetAscent()[source]

The face’s ascender is the vertical distance from the baseline to the topmost point of any glyph in the face. This field’s value is a positive number, expressed in the glyph coordinate system. For all font types except Type 3, the units of glyph space are one-thousandth of a unit of text space. Some font designs use a value different from ‘bbox.yMax’.

Notes: Only relevant for scalable formats.

GetBBox()[source]

Return type:: Rect
Returns:: A rectangle expressed in the glyph coordinate system, specifying the font bounding box. This is the smallest rectangle enclosing the shape that would result if all of the glyphs of the font were placed with their origins coincident and then filled.

GetCharCodeIterator()[source]: GetCharCodeIterator represents an iterator interface used to traverse a list of char codes for which there is a glyph outline in the embedded font.

GetDescendant()[source]

Return type:: Font
Returns:: descendant CIDFont.

Notes: Relevant only for a Type0 font.

GetDescent()[source]

The face’s descender is the vertical distance from the baseline to the bottommost point of any glyph in the face. This field’s value is a negative number expressed in the glyph coordinate system. For all font types except Type 3, the units of glyph space are one-thousandth of a unit of text space. Some font designs use a value different from ‘bbox.yMin’.

Notes: Only relevant for scalable formats.
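
As a rough illustration of how the ascent and descent values relate to text space, the following plain-Python sketch converts them to a height at a given font size. It does not call the SDK: the function name and the sample metrics are hypothetical, and it assumes the convention described above that glyph space units are one-thousandth of a text space unit (so it does not apply to Type 3 fonts).

```python
def extent_in_text_space(ascent, descent, font_size):
    """Height spanned from descender to ascender at `font_size`.

    `ascent` is positive and `descent` is negative, both expressed in
    glyph space (1/1000 of a text space unit).
    """
    return (ascent - descent) / 1000.0 * font_size


# Hypothetical metrics, not taken from any real font.
height = extent_in_text_space(ascent=750, descent=-250, font_size=12)
```

In practice the two inputs would come from GetAscent() and GetDescent(); the result is not scaled by the text matrix or the CTM.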

GetDescriptor()[source]

Return type:: Obj
Returns:: an SDF/Cos object representing the FontDescriptor, or NULL if the FontDescriptor is not present.

GetEmbeddedFont()[source]

Return type:: Obj
Returns:: the stream object of the embedded font, or NULL if the font is not embedded.

Notes: This function is not applicable to Type3 fonts and will throw an exception.

GetEmbeddedFontBufSize()[source]

Return type:: int
Returns:: the size of decoded buffer containing embedded font data or 0 if this information is not known in advance.

Notes: The size of the decoded buffer may not be known in advance for all fonts and may not be correct. This function is not applicable to Type3 fonts and will throw an exception.

GetEmbeddedFontName()[source]

Return type:: string
Returns:: the PostScript font name for the embedded font. If the embedded font name is not available, the function returns an empty string.

GetFamilyName()[source]

Return type:: string
Returns:: the face’s family name. This is an ASCII string, usually in English, which describes the typeface’s family (like ‘Times New Roman’, ‘Bodoni’, ‘Garamond’, etc). This is a least common denominator used to list fonts.

GetGlyphPath(char_code, conics2cubics, transform=None)[source]

The function retrieves the glyph outline for a given character code.

Parameters:

char_code (int) – character to query
conics2cubics (boolean) – if set to true converts all quadratic Bezier curves to cubic Beziers, otherwise no conversion is performed.
transform (Matrix2D, optional) – An optional matrix used to transform glyph data coordinates. If null/unspecified, glyph data points will not be transformed.

Return type:

PathData

Returns:

A PathData object containing the path information.

Notes: The function can return only the following operators: Element::e_moveto, Element::e_lineto, Element::e_cubicto, and optionally Element::e_conicto if the conics2cubics parameter is set to false.

This function is not applicable to Type3 font and will throw an exception. Use GetType3GlyphStream instead.

Check PathData::IsDefined to see if the char_code was mapped to ‘undefined character code’.
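
Because GetGlyphPath throws for Type3 fonts, callers typically branch on the font type first. The sketch below shows one way to do that. The helper name `glyph_outline_or_stream` and the `_StubFont` class are hypothetical and exist only so the example runs without the SDK; the method names and the value 3 for e_Type3 are the ones listed in this reference.

```python
E_TYPE3 = 3  # value listed for Font.e_Type3 in this reference


def glyph_outline_or_stream(font, char_code):
    """Pick the glyph accessor that is valid for the font's type."""
    if font.GetType() == E_TYPE3:
        # GetGlyphPath is not applicable to Type3 fonts.
        return font.GetType3GlyphStream(char_code)
    # Request cubic curves only, so callers handle a single curve operator.
    return font.GetGlyphPath(char_code, True)


class _StubFont:
    """Stand-in used only to exercise the helper without the SDK."""

    def __init__(self, font_type):
        self._type = font_type

    def GetType(self):
        return self._type

    def GetType3GlyphStream(self, char_code):
        return ("type3-stream", char_code)

    def GetGlyphPath(self, char_code, conics2cubics, transform=None):
        return ("path", char_code, conics2cubics)


result_type3 = glyph_outline_or_stream(_StubFont(3), 65)
result_truetype = glyph_outline_or_stream(_StubFont(1), 65)
```

With a real Font object the same helper would return either an Obj (glyph stream) or a PathData, so the caller still needs to handle both result kinds.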

GetMaxWidth()[source]

Return type:: double
Returns:: the maximal advance width, in font units, for all glyphs in this face.

GetMissingWidth()[source]

Return type:: double
Returns:: the default width to use for character codes whose widths are not specified in a font dictionary’s Widths array.

GetName()[source]

Return type:: string
Returns:: the name of a font. The behavior depends on the font type; for a Type 3 font it gets the value of the Name key in a PDF Font resource. For other types it gets the value of the BaseFont key in a PDF font resource.

GetSDFObj()[source]

Return type:: Obj
Returns:: a SDF/Cos object of this Font.

GetShapedText(text_to_shape)[source]

Creates a set of positioned glyphs corresponding to the visual representation of the provided text string.

The shaped text will take into account any advanced positioning and substitution features provided by an underlying embedded font file. For example, these features could include kerning, ligatures, and diacritic positioning. Typically the resulting shaped text would be fed into ElementBuilder.CreateShapedTextRun().

Parameters:: text_to_shape (string) – the string to be shaped.
Return type:: ShapedText
Returns:: A ShapedText object representing the result of the shaping operation.

Notes: Shaping requires a Type0 font with an embedded font file which covers all the unicode codepoints in the source text. For best results, this font should use the e_Indices encoding scheme, as shaping features that combine multiple codepoints into one glyph (ligatures, for example) will not work well in non-index encoded fonts.

GetStandardType1FontType()[source]

Return type:: int
Returns:: Font::e_null if the font is not a standard Type1 font or some other StandardType1Font value for a standard Type1 font.

GetType()[source]

Return type:: int
Returns:: Font Type

GetType3FontMatrix()[source]

Return type:: Matrix2D
Returns:: Type3 font matrix, mapping glyph space to text space. A common practice is to define glyphs in terms of a 1000-unit glyph coordinate system, in which case the font matrix is [0.001 0 0 0.001 0 0].

Notes: Relevant only for a Type3 font.
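
To make the role of the font matrix concrete, here is a small plain-Python sketch that applies a six-number PDF matrix [a b c d e f] to a glyph-space point. The function name is hypothetical and the SDK is not used; with a real font the six numbers would be read from the Matrix2D returned by GetType3FontMatrix().

```python
def glyph_to_text_space(matrix, x, y):
    """Apply a PDF matrix [a b c d e f] to the point (x, y)."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


# The common 1000-unit font matrix mentioned above.
FONT_MATRIX = (0.001, 0.0, 0.0, 0.001, 0.0, 0.0)
corner = glyph_to_text_space(FONT_MATRIX, 1000, 500)
```

A point at (1000, 500) in glyph space lands at approximately (1.0, 0.5) in text space under this matrix.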

GetType3GlyphStream(char_code)[source]

Return type:: Obj
Returns:: a SDF/Cos glyph stream for the given char_code. If specified char_code is not found in the CharProcs dictionary the function returns NULL.

Notes: Relevant only for a Type3 font.

GetUnitsPerEm()[source]

Return type:: int
Returns:: the number of font units per EM square for this face. This is typically 2048 for TrueType fonts and 1000 for Type1 fonts.

Notes: Only relevant for scalable formats (such as TrueType and Type1).

This function is not applicable to Type3 font and will throw an exception. Use GetType3FontMatrix instead.

GetVerticalAdvance(char_code)[source]

Return type:

std::vector< double,std::allocator< double > >

Returns:

The vertical advance. The vertical advance is a displacement vector for vertical writing mode (i.e. writing mode 1); its horizontal component is always 0.

Parameters:

char_code (int) – character to query for vertical advance
out_pos_vect_x – Initialized by the method. Horizontal component of the position vector defining the position of the vertical writing mode origin relative to horizontal writing mode origin.
out_pos_vect_y – Initialized by the method. Vertical component of the position vector defining the position of the vertical writing mode origin relative to horizontal writing mode origin.

Notes: Use this method only for composite fonts with vertical writing mode (i.e. if Font.IsHorizontalMode() returns false). The method will return 0 as vertical advance for simple fonts or for composite fonts with only horizontal writing mode. Relevant only for a Type0 font.

GetWidth(char_code)[source]

Return type:: double
Returns:: advance width, measured in glyph space units for the glyph matching given character code.

Notes: 1000 glyph units = 1 text space unit. The width returned has NOT been scaled by the font size, text matrix, nor the CTM.

The function gets the advance width of the font glyph. The advance width is the amount by which the current point advances when the glyph is drawn. The advance width may not correspond to the visible width of the glyph and for this reason, the advance width cannot be used to determine the glyphs’ bounding boxes.
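
Since the returned width is unscaled, a caller usually converts it before measuring text. The following plain-Python sketch sums a run of advance widths and scales them by the font size only. The function name and sample widths are hypothetical, the SDK is not called, and the result deliberately ignores the text matrix, the CTM, and any character or word spacing.

```python
def advance_in_text_space(glyph_widths, font_size):
    """Sum advance widths (glyph space) and scale them to text space.

    Assumes 1000 glyph units = 1 text space unit, as noted above.
    """
    return sum(glyph_widths) / 1000.0 * font_size


# Hypothetical advance widths, e.g. collected from GetWidth(char_code).
run_width = advance_in_text_space([500, 600, 400], 10)
```

As the paragraph above warns, this is an advance, not a bounding-box width, so it should not be used to compute the visible extent of the glyphs.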

IsAllCap()[source]

Return type:: boolean
Returns:: true if font contains no lowercase letters

IsCFF()[source]

Return type:: boolean
Returns:: true if the embedded font is represented as CFF (Compact Font Format).

Notes: Only Type1 and Type1C fonts can be represented in CFF format

IsEmbedded()[source]

Tests whether or not the specified font is stored as a font file in a stream embedded in the PDF file.

Return type:: boolean
Returns:: true if the font is embedded in the file, false otherwise.

IsFixedWidth()[source]

Return type:: boolean
Returns:: true if all glyphs have the same width

IsForceBold()[source]

Return type:: boolean
Returns:: true if bold glyphs should be painted with extra pixels at very small text sizes.

IsHorizontalMode()[source]

Return type:: boolean
Returns:: true if the font uses horizontal writing mode, false for vertical writing mode.

IsItalic()[source]

Return type:: boolean
Returns:: true if glyphs have dominant vertical strokes that are slanted.

IsSerif()[source]

Return type:: boolean
Returns:: true if glyphs have serifs

IsSimple()[source]

Return type:: boolean
Returns:: true for non-CID based fonts such as Type1, TrueType, and Type3

All simple fonts have the following properties:

Glyphs in the font are selected by single-byte character codes obtained from a string that is shown by the text-showing operators. Logically, these codes index into a table of 256 glyphs; the mapping from codes to glyphs is called the font’s encoding. Each font program has a built-in encoding. Under some circumstances, the encoding can be altered by means described in Section 5.5.5 “Character Encoding” in PDF Reference Manual.
Each glyph has a single set of metrics. Therefore simple fonts support only horizontal writing mode.
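
The "table of 256 glyphs" idea can be pictured with a short plain-Python sketch. Everything here is illustrative: the `build_encoding` helper and the three glyph assignments are hypothetical and do not reflect any built-in encoding or SDK call.

```python
def build_encoding(assignments):
    """Return a 256-entry table mapping single-byte codes to glyph names."""
    table = [".notdef"] * 256
    for code, glyph_name in assignments.items():
        if not 0 <= code <= 255:
            raise ValueError("simple-font codes must fit in one byte")
        table[code] = glyph_name
    return table


# A tiny, made-up encoding covering three codes.
encoding = build_encoding({65: "A", 66: "B", 32: "space"})

# Each byte of the shown string selects exactly one glyph.
glyphs = [encoding[b] for b in b"AB A"]
```

Codes without an assignment fall back to ".notdef", which is why text needing more than 256 distinct glyphs calls for a CID font instead.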

IsSymbolic()[source]

Return type:: boolean
Returns:: true if font contains characters outside the Adobe standard Latin character set.

MapToCID(char_code)[source]

Return type:: int
Returns:: a CID matching specified charcode.

Notes: Relevant only for a Type0 font.

MapToUnicode(char_code)[source]

Maps the encoding specific ‘charcode’ to Unicode. Conversion of ‘charcode’ to Unicode can result in up to four Unicode characters.

Parameters:

char_code (int) – encoding specific ‘charcode’ that needs to be converted to Unicode.
out_uni_arr – A pointer to an array of Unicode characters where the conversion result will be stored.
in_uni_sz – The number of characters that can be written to out_uni_arr. You can assume that the function will never map to more than 10 characters.
out_char_num – The function will modify this value to return the number of Unicode characters written in ‘out_uni_arr’ array.

Return type:

string

Returns:

true if char_code was mapped to Unicode public area or false if the char_code was mapped to Unicode private area.

A char_code is mapped to Unicode private area when the information required for proper mapping is missing in PDF document (e.g. a predefined encoding or ToUnicode CMap).

Notes: This function is not applicable to CIDFonts (e_CIDType0 and e_CIDType2) and will throw an exception if called.

e_CIDType0 = 5

e_CIDType2 = 6

e_IdentityH = 0

e_Indices = 1

e_MMType1 = 2

e_TrueType = 1

e_Type0 = 4

e_Type1 = 0

e_Type3 = 3

e_courier = 8

e_courier_bold = 9

e_courier_bold_oblique = 11

e_courier_oblique = 10

e_helvetica = 4

e_helvetica_bold = 5

e_helvetica_bold_oblique = 7

e_helvetica_oblique = 6

e_null = 14

e_symbol = 12

e_times_bold = 1

e_times_bold_italic = 3

e_times_italic = 2

e_times_roman = 0

e_zapf_dingbats = 13

property mp_font

property thisown: The membership flag

class apryse_sdk.FreeText(args)[source]

Bases: Markup

A FreeText annotation (PDF 1.3) displays text directly on the page. Unlike an ordinary Text annotation, a FreeText annotation has no open or closed state; the content of the FreeText annotation is always visible instead of being displayed in a popup window.

static Create(doc, pos)[source]

Creates a new FreeText annotation in the specified document.

Parameters:

doc (SDFDoc) – A document to which the FreeText annotation is added.
pos (Rect) – A rectangle specifying the FreeText annotation’s bounds in default user space units.

Return type:

FreeText

Returns:

A newly created blank FreeText annotation.

static CreateAnnot(doc, pos)[source]

GetCalloutLinePoint1()[source]

Returns the callout line points of the FreeText annotation. (PDF 1.6)

Parameters:

p1 – The target point. (where the ending style is used)
p2 – The ending point.
p3 – The knee point.

Return type:

Point

Returns:

Three point objects if the line is bent or two point objects if the line is straight.

Notes: If the line is straight, i.e. only has two points, two points will be returned in p1 and p2, and p3 will be the same as p2. The coordinates are given in default user space.

GetCalloutLinePoint2()[source]

GetCalloutLinePoint3()[source]

GetDefaultAppearance()[source]

Returns the default appearance of the FreeText annotation.

Return type:: string
Returns:: A string representing the default appearance of the annotation.

Notes: The default appearance string is used to format the text. The annotation dictionary’s Appearance entry, if present, will take precedence over this entry. This method corresponds to the ‘DA’ entry in the annotation dictionary.

GetEndingStyle()[source]

Returns the ending style of the callout line of the FreeText Annotation.

Return type:: int
Returns:: The ending style represented as one of the entries of the enum “EndingStyle”

Notes: The ending style specifies the line ending style that shall be used in drawing the callout line specified in CallOut Line Points (CL). The enum entry shall specify the line ending style for the endpoint defined by the target point(p1) of the CallOut Line Points. Default value: e_None.

GetFontSize()[source]

Get the default appearance font size. To get the actual font size used, call RefreshAppearance and then use ElementReader on the content stream of this annotation.

Return type:: double
Returns:: the default font size, where a value of zero indicates auto sizing.

GetIntentName()[source]

Returns the intent name of the FreeText annotation. (PDF 1.4)

Return type:: int
Returns:: The intent name of the annotation as an entry from the enum “IntentName”.

GetLineColor()[source]

Returns the line and border color of the FreeText Annotation.

Parameters:

color – reference to ColorPt object, where results will be saved.
col_comp – reference to an integer, where number of colorant components will be written.

GetLineColorCompNum()[source]

GetQuaddingFormat()[source]

Returns the quadding format of the FreeText annotation. (PDF 1.4)

Return type:: int
Returns:: An int code indicating the quadding format of the FreeText annotation.

Notes: The following are the quadding formats corresponding to each int code: 0 = Left-justified, 1 = Centered, 2 = Right-justified.
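
When displaying the code to a user, a small lookup keeps the mapping in one place. This is a plain-Python sketch; the dictionary and function names are hypothetical and not part of the SDK.

```python
# Codes and labels as listed in the note above.
QUADDING_NAMES = {0: "Left-justified", 1: "Centered", 2: "Right-justified"}


def quadding_name(code):
    """Translate a quadding int code into a readable label."""
    return QUADDING_NAMES.get(code, "Unknown")
```

A call such as `quadding_name(annot.GetQuaddingFormat())` would then yield a label, assuming `annot` is a FreeText annotation.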

GetTextColor()[source]

Returns the text color of the FreeText Annotation.

Notes: In rich text annotations, some or all of the text may have a different color than the default text color.

GetTextColorCompNum()[source]

SetCalloutLinePoints(args)[source]

Overload 1:

Sets the callout line points of the FreeText annotation. (Optional; meaningful only if IT is FreeTextCallout; PDF 1.6)

Parameters:

p1 (Point) – The target point. (where the ending style is used)
p2 (Point) – The knee point.
p3 (Point) – The ending point.

Notes: The coordinates are defined in default user space.

Overload 2:

Sets the callout line points of the FreeText annotation. (Optional; meaningful only if IT is FreeTextCallout; PDF 1.6)

Parameters:

p1 (Point) – The target point. (where the ending style is used)
p2 (Point) – The ending point.

Notes: The coordinates are defined in default user space.

SetDefaultAppearance(app_str)[source]

Sets the default appearance of the FreeText annotation.

Parameters:: app_str (string) – A string representing the default appearance of the annotation.

Notes: The default appearance string is used to format the text. The annotation dictionary’s Appearance entry, if present, will take precedence over this entry. This method corresponds to the ‘DA’ entry in the annotation dictionary.

SetEndingStyle(args)[source]

Overload 1:

Sets the ending style of the callout line of the FreeText Annotation. (Optional; meaningful only if CL is present; PDF 1.6)

Parameters:: style (int) – The ending style represented using one of the entries of the enum “EndingStyle”

Notes: The ending style specifies the line ending style that shall be used in drawing the callout line specified in CallOut Line Points (CL). The enum entry shall specify the line ending style for the endpoint defined by the target point(p1) of the CallOut Line Points. Default value: e_None.

Overload 2:

Sets the ending style of the callout line of the FreeText Annotation. (Optional; meaningful only if CL is present; PDF 1.6)

Parameters:: est (string) – The ending style represented using a string.

Notes: The ending style specifies the line ending style that shall be used in drawing the callout line specified in CallOut Line Points (CL). The enum entry shall specify the line ending style for the endpoint defined by the target point(p1) of the CallOut Line Points. Default value: “None”.

SetFontName(fontName)[source]

Sets the default appearance font name.

Parameters:: fontName (string) – The default font name to set.

SetFontSize(font_size)[source]

Sets the default appearance font size. A value of zero specifies that the font size should adjust so that the text uses as much of the FreeText bounding box as possible.

Parameters:: font_size (double) – Set the default font size. A value of zero means auto resize font.

SetIntentName(args)[source]

Sets the Intent name of the FreeText annotation. (Optional; PDF 1.4)

Parameters:: mode (int, optional) – The intent name of the annotation as an entry from the enum “IntentName”.

SetLineColor(color, col_comp)[source]

Sets the line and border color of the FreeText Annotation.

Parameters:

color (ColorPt) – ColorPt object representing the color.
col_comp (int) – number of colorant components in ColorPt object.

Notes: The current implementation of this method creates a non-standard entry in the annotation dictionary and uses it to generate the appearance stream. Make sure you call RefreshAppearance() after changing the text or line color, and remember that editing the annotation in other PDF applications will produce a different appearance.

SetQuaddingFormat(format)[source]

Sets the quadding format of the FreeText annotation. (Optional; PDF 1.4)

Parameters:: format (int) – An int code indicating the quadding format of the FreeText annotation. Default value: 0 (left-justified).

Notes: The int code specifies the form of quadding (justification) that shall be used in displaying the annotation’s text: 0 = Left-justified, 1 = Centered, 2 = Right-justified.

SetTextColor(color, col_comp)[source]

Sets the text color of the FreeText Annotation.

Parameters:

color (ColorPt) – ColorPt object representing the color.
col_comp (int) – number of colorant components in ColorPt object.

Notes: The current implementation of this method creates a non-standard entry in the annotation dictionary and uses it to generate the appearance stream. Make sure you call RefreshAppearance() after changing the text or line color, and remember that editing the annotation in other PDF applications will produce a different appearance.

e_FreeText = 0: The annotation intended to function as a plain FreeText annotation.

e_FreeTextCallout = 1: The annotation is intended to function as a callout.

e_FreeTextTypeWriter = 2: The annotation is intended to function as a click-to-type or typewriter object and no callout line is drawn.

e_Unknown = 3: User defined or Invalid.

property thisown: The membership flag

class apryse_sdk.Function(args)[source]

Bases: object

Although PDF is not a programming language it provides several types of function object that represent parameterized classes of functions, including mathematical formulas and sampled representations with arbitrary resolution. Functions are used in various ways in PDF, including device-dependent rasterization information for high-quality printing (halftone spot functions and transfer functions), color transform functions for certain color spaces, and specification of colors as a function of position for smooth shadings. Functions in PDF represent static, self-contained numerical transformations.

PDF::Function represents a single, flat interface around all PDF function types.

Destroy()[source]: Frees the native memory of the object.

Eval(in_arr)[source]

Evaluate the function at a given point.

Notes: The size of the ‘in’ array must be greater than or equal to the function’s input cardinality, and the size of the ‘out’ array must be greater than or equal to the function’s output cardinality.
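
To illustrate what input and output cardinality mean, the sketch below evaluates an exponential interpolation function (the kind identified by e_exponential) in plain Python, using the standard PDF definition y_j = C0_j + x^N * (C1_j - C0_j). It is not the SDK's implementation; the function name and the sample coefficients are hypothetical.

```python
def eval_exponential(x, c0, c1, n):
    """Evaluate a one-input exponential interpolation function.

    Input cardinality is 1 (the scalar x); output cardinality is len(c0).
    """
    return [a + (x ** n) * (b - a) for a, b in zip(c0, c1)]


# Made-up coefficients blending from black toward an orange tone.
rgb = eval_exponential(0.5, c0=[0.0, 0.0, 0.0], c1=[1.0, 0.5, 0.0], n=1)
```

Here one input value produces three output values, so an equivalent Function object would report an input cardinality of 1 and an output cardinality of 3.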

GetInputCardinality()[source]

Return type:: int
Returns:: the number of input components required by the function

GetOutputCardinality()[source]

Return type:: int
Returns:: the number of output components returned by the function

GetSDFObj()[source]

Return type:: Obj
Returns:: the underlying SDF/Cos object

GetType()[source]

Return type:: int
Returns:: The function type

e_exponential = 2

e_postscript = 4

e_sampled = 0

e_stitching = 3

property mp_func

property thisown: The membership flag

class apryse_sdk.GSChangesIterator(args)[source]

Bases: object

The Iterator specialization for integer type.

Current()[source]

Destroy()[source]: Frees the native memory of the object.

HasNext()[source]

Next()[source]

property mp_impl

property thisown: The membership flag

class apryse_sdk.GState(args)[source]

Bases: object

GState is a class that keeps track of a number of style attributes used to visually define graphical Elements. Each PDF::Element has an associated GState that can be used to query or set various graphics properties.

Notes: current clipping path is not tracked in the graphics state for efficiency reasons. In most cases tracking of the current clipping path is best left to the client.

Concat(args)[source]

Overload 1:

Concatenate the given matrix to the transformation matrix of this element.

Parameters:: mtx (Matrix2D) – Matrix2D object to concatenate the current matrix with.

Overload 2:

Concatenate the given matrix expressed in its values to the transformation matrix of this element.

Parameters:

a (double) – horizontal ‘scaling’ component of the new text matrix.
b (double) – ‘rotation’ component of the new text matrix.
c (double) – ‘rotation’ component of the new text matrix.
d (double) – vertical ‘scaling’ component of the new text matrix.
h (double) – horizontal translation component of the new text matrix.
v (double) – vertical translation component of the new text matrix.

GetAISFlag()[source]

Return type:: boolean
Returns:: the alpha source flag (‘alpha is shape’), specifying whether the current soft mask and alpha constant are to be interpreted as shape values (true) or opacity values (false).

GetAutoStrokeAdjust()[source]

Return type:: boolean
Returns:: a flag specifying whether stroke adjustment is enabled in the graphics state. Corresponds to the /SA key within the ExtGState’s dictionary.

GetBlackGenFunct()[source]

Return type:: Obj
Returns:: currently selected black-generation function (NULL by default) used during conversion between DeviceRGB and DeviceCMYK. Corresponds to the /BG key within the ExtGState’s dictionary.

GetBlendMode()[source]

Return type:: int
Returns:: the current blend mode to be used in the transparent imaging model. Corresponds to the /BM key within the ExtGState’s dictionary.

GetCharSpacing()[source]

Return type:: double
Returns:: currently selected character spacing.

The character spacing parameter is a number specified in unscaled text space units. When the glyph for each character in the string is rendered, the character spacing is added to the horizontal or vertical component of the glyph’s displacement, depending on the writing mode. See Section 5.2.1 in PDF Reference Manual for details.

GetDashes()[source]

Return type:: std::vector< double,std::allocator< double > >
Returns:: The method fills the vector with an array of numbers representing the dash pattern

The line dash pattern controls the pattern of dashes and gaps used to stroke paths. It is specified by a dash array and a dash phase. The dash array’s elements are numbers that specify the lengths of alternating dashes and gaps; the dash phase specifies the distance into the dash pattern at which to start the dash. The elements of both the dash array and the dash phase are expressed in user space units.
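
The interaction between the dash array and the dash phase can be worked through with a short plain-Python sketch. It computes which intervals along a straight path of a given length are painted. The function name is hypothetical, the SDK is not used, and an empty dash array is treated as a solid line.

```python
from itertools import cycle


def dash_segments(dash_array, phase, length):
    """Return (start, end) intervals along the path that are painted."""
    if not dash_array or sum(dash_array) <= 0:
        return [(0.0, float(length))]  # no pattern: solid line
    segments = []
    pos = 0.0 - phase  # begin `phase` units into the pattern
    on = True          # the pattern starts with a dash, then a gap
    for d in cycle(dash_array):
        end = pos + d
        if on and end > 0:
            # Clip the dash to the visible part of the path.
            segments.append((max(pos, 0.0), min(end, float(length))))
        pos = end
        on = not on
        if pos >= length:
            break
    return segments


no_phase = dash_segments([3, 2], 0, 10)
with_phase = dash_segments([3, 2], 1, 10)
```

With a phase of 1 the first dash is shortened by one unit, because the pattern is entered one unit past its start.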

GetFillColor()[source]

Return type:: ColorPt
Returns:: a color value/point represented in the current fill color space

GetFillColorSpace()[source]

Return type:: ColorSpace
Returns:: color space used for filling

GetFillOpacity()[source]

Return type:: double
Returns:: the opacity value for painting operations other than stroking. Returns the value of the /ca key in the ExtGState dictionary. If the value is not found, the default value of 1 is returned.

GetFillOverprint()[source]

Return type:: boolean
Returns:: whether overprint is enabled for fill painting operations. Corresponds to the /op key within the ExtGState’s dictionary.

GetFillPattern()[source]

Return type:: PatternColor
Returns:: the pattern color of currently selected pattern color space used for filling.

GetFlatness()[source]

Return type:: double
Returns:: current value of flatness tolerance

Flatness is a number in the range 0 to 100; a value of 0 specifies the output device’s default flatness tolerance.

The flatness tolerance controls the maximum permitted distance in device pixels between the mathematically correct path and an approximation constructed from straight line segments.

GetFont()[source]

Return type:: Font
Returns:: currently selected font

GetFontSize()[source]

Return type:: double
Returns:: the font size

GetHalftone()[source]

Return type:: Obj
Returns:: currently selected halftone dictionary or stream (NULL by default). Corresponds to the /HT key within the ExtGState’s dictionary. Halftoning is a process by which continuous-tone colors are approximated on an output device that can achieve only a limited number of discrete colors.

GetHorizontalScale()[source]

Return type:: double
Returns:: currently selected horizontal scale

The horizontal scaling parameter adjusts the width of glyphs by stretching or compressing them in the horizontal direction. Its value is specified as a percentage of the normal width of the glyphs, with 100 being the normal width. The scaling always applies to the horizontal coordinate in text space, independently of the writing mode. See Section 5.2.3 in PDF Reference Manual for details.

GetLeading()[source]

Return type:: double
Returns:: currently selected leading parameter

The leading parameter is measured in unscaled text space units. It specifies the vertical distance between the baselines of adjacent lines of text. See Section 5.2.4 in PDF Reference Manual for details.

GetLineCap()[source]

Return type:: int
Returns:: currently selected LineCap style

The line cap style specifies the shape to be used at the ends of open sub-paths (and dashes, if any) when they are stroked.

GetLineJoin()[source]

Return type:: int
Returns:: currently selected LineJoin style

The line join style specifies the shape to be used at the corners of paths that are stroked.

GetLineWidth()[source]

Return type:: double
Returns:: the thickness of the line used to stroke a path.

Notes: A line width of 0 denotes the thinnest line that can be rendered at device resolution: 1 device pixel wide.

GetMiterLimit()[source]

Return type:: double
Returns:: current value of miter limit.

The miter limit imposes a maximum on the ratio of the miter length to the line width. When the limit is exceeded, the join is converted from a miter to a bevel.
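
The ratio mentioned above depends only on the angle between the two joined segments: miter length divided by line width equals 1 / sin(angle / 2). The plain-Python sketch below uses that relation to predict whether a join stays mitered. The function name is hypothetical and the SDK is not called.

```python
import math


def join_is_mitered(angle_degrees, miter_limit):
    """True if a miter join at this angle stays within the miter limit."""
    if angle_degrees <= 0:
        return False  # degenerate join: the miter would be unbounded
    half_angle = math.radians(angle_degrees) / 2.0
    ratio = 1.0 / math.sin(half_angle)  # miter length / line width
    return ratio <= miter_limit


right_angle = join_is_mitered(90, 10)  # ratio is about 1.41
sharp_angle = join_is_mitered(10, 10)  # ratio is about 11.47
```

With a miter limit of 10, joins sharper than roughly 11.5 degrees are converted to bevels.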

GetOverprintMode()[source]

Return type:: int
Returns:: the overprint mode used by this graphics state. Corresponds to the /OPM key within the ExtGState’s dictionary.

GetPhase()[source]

Return type:: double
Returns:: the phase of the currently selected dash pattern. dash phase is expressed in user space units.

GetRenderingIntent()[source]

Return type:: int
Returns:: The color intent to be used for rendering the Element

static GetRenderingIntentType(name)[source]

A utility function that maps a string representing a rendering intent to RenderingIntent type.

Parameters:: name (string) – string that represents the rendering intent to get.
Return type:: int
Returns:: The color rendering intent type matching the specified string

GetSmoothnessTolerance()[source]

Return type:: double
Returns:: the smoothness tolerance used to control the quality of smooth shading. Corresponds to the /SM key within the ExtGState’s dictionary. The allowable error (or tolerance) is expressed as a fraction of the range of the color component, from 0.0 to 1.0.

GetSoftMask()[source]

Return type:: Obj
Returns:: Associated soft mask. NULL if the soft mask is not selected or SDF dictionary representing the soft mask otherwise.

GetSoftMaskTransform()[source]

Return type:: Matrix2D
Returns:: The soft mask transform. This is the transformation matrix at the moment the soft mask is established in the graphics state with the gs operator. This information is only relevant when applying the soft mask that may be specified in the graphics state to the current element.

GetStrokeColor()[source]

Return type:: ColorPt
Returns:: a color value/point represented in the current stroke color space

GetStrokeColorSpace()[source]

Return type:: ColorSpace
Returns:: color space used for stroking

GetStrokeOpacity()[source]

Return type:: double
Returns:: opacity value for stroke painting operations for paths and glyph outlines. Returns the value of the /CA key in the ExtGState dictionary. If the value is not found, the default value of 1 is returned.

GetStrokeOverprint()[source]

Return type:: boolean
Returns:: whether overprint is enabled for stroke painting operations. Corresponds to the /OP key within the ExtGState’s dictionary.

GetStrokePattern()[source]

Return type:: PatternColor
Returns:: the SDF pattern object of currently selected PatternColorSpace used for stroking.

GetTextRenderMode()[source]

Return type:: int
Returns:: current text rendering mode.

The text rendering mode determines whether showing text causes glyph outlines to be stroked, filled, used as a clipping boundary, or some combination of the three. See Section 5.2.5 in PDF Reference Manual for details.

GetTextRise()[source]

Return type:: double
Returns:: current value of text rise

Text rise specifies the distance, in unscaled text space units, to move the baseline up or down from its default location. Positive values of text rise move the baseline up.

GetTransferFunct()[source]

Return type:: Obj
Returns:: currently selected transfer function (NULL by default) used during color conversion process. A transfer function adjusts the values of color components to compensate for nonlinear response in an output device and in the human eye. Corresponds to the /TR key within the ExtGState’s dictionary.

GetTransform()[source]

Return type:: Matrix2D
Returns:: the transformation matrix for this element.

Notes: If you are looking for a matrix that maps coordinates to the initial user space see Element::GetCTM().

GetUCRFunct()[source]

Return type:: Obj
Returns:: currently selected undercolor-removal function (NULL by default) used during conversion between DeviceRGB and DeviceCMYK. Corresponds to the /UCR key within the ExtGState’s dictionary.

GetWordSpacing()[source]

Return type:: double
Returns:: currently selected word spacing

Word spacing works the same way as character spacing, but applies only to the space character (char code 32). See Section 5.2.2 in PDF Reference Manual for details.
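
Character spacing, word spacing, and horizontal scaling combine into a single glyph displacement. The plain-Python sketch below follows the horizontal-mode formula from the PDF Reference, tx = (w0 * Tfs + Tc + Tw) * Th, ignoring any per-glyph positioning adjustments. The function name and sample values are hypothetical, and the SDK is not used.

```python
def horizontal_advance(width, font_size, char_spacing=0.0, word_spacing=0.0,
                       horizontal_scale=100.0, char_code=None):
    """Horizontal displacement of one glyph in text space units.

    `width` is the advance width in glyph space (1/1000 text space unit);
    `horizontal_scale` is a percentage, with 100 meaning normal width.
    """
    w0 = width / 1000.0
    # Word spacing applies only to the space character (char code 32).
    tw = word_spacing if char_code == 32 else 0.0
    return (w0 * font_size + char_spacing + tw) * (horizontal_scale / 100.0)


space_advance = horizontal_advance(250, 12, char_spacing=1.0,
                                   word_spacing=2.0, char_code=32)
letter_advance = horizontal_advance(500, 12, char_spacing=1.0,
                                    word_spacing=2.0, char_code=65)
```

Only the space glyph picks up the extra word spacing; every glyph picks up the character spacing.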

IsTextKnockout()[source]

Return type:: boolean
Returns:: a boolean flag that determines whether text elements are considered elementary objects for purposes of color compositing in the transparent imaging model.

SetAISFlag(AIS)[source]

Specifies if the alpha is to be interpreted as a shape or opacity mask. The alpha source flag (‘alpha is shape’) specifies whether the current soft mask and alpha constant are to be interpreted as shape values (true) or opacity values (false).

Parameters:: AIS (boolean) – true for interpretation as shape values or false for opacity values

SetAutoStrokeAdjust(SA)[source]

Specify whether to apply automatic stroke adjustment. Corresponds to the /SA key within the ExtGState’s dictionary.

Parameters:: SA (boolean) – if true automatic stroke adjustment will be applied.

SetBlackGenFunct(BG)[source]

Sets black-generation function used during conversion between DeviceRGB and DeviceCMYK. Corresponds to the /BG key within the ExtGState’s dictionary.

Parameters:

BG (Obj) –

SDF/Cos black-generation function or name

SetBlendMode(BM)[source]

Sets the current blend mode to be used in the transparent imaging model. Corresponds to the /BM key within the ExtGState’s dictionary.

Parameters:

BM (int) –

New blending mode type.

// C#
gs.SetBlendMode(GState.BlendMode.e_bl_lighten);

// C++
gs->SetBlendMode(GState::e_bl_lighten);
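
To give a feel for what a blend mode does, the following is a minimal sketch in plain Python of a few separable blend modes as they are described in the PDF Reference. It does not call the SDK; the names BLEND_FUNCS and blend are illustrative only, and the numeric keys simply mirror the e_bl_* constants listed for this class.

```python
# Illustrative only: per-component formulas for a few separable blend modes.
# cb is the backdrop component and cs the source component, both in 0.0..1.0.
BLEND_FUNCS = {
    1: lambda cb, cs: cs,                 # e_bl_normal
    2: lambda cb, cs: cb * cs,            # e_bl_multiply
    3: lambda cb, cs: cb + cs - cb * cs,  # e_bl_screen
    4: lambda cb, cs: abs(cb - cs),       # e_bl_difference
    5: lambda cb, cs: min(cb, cs),        # e_bl_darken
    6: lambda cb, cs: max(cb, cs),        # e_bl_lighten
}


def blend(mode, backdrop, source):
    """Blend two colors component by component using the given mode number."""
    func = BLEND_FUNCS[mode]
    return tuple(func(cb, cs) for cb, cs in zip(backdrop, source))
```

With mode 6 (lighten), each resulting component is the larger of the backdrop and source components.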

SetCharSpacing(char_spacing)[source]

Sets character spacing.

Parameters:: char_spacing (double) – a number specified in unscaled text space units. When the glyph for each character in the string is rendered, the character spacing is added to the horizontal or vertical component of the glyph’s displacement, depending on the writing mode. See Section 5.2.1 in PDF Reference Manual for details.
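
As a rough illustration of how character spacing enters the glyph displacement, the sketch below evaluates the horizontal displacement formula from the PDF Reference, tx = (w0 * Tfs + Tc + Tw) * Th, for horizontal writing mode and without any TJ adjustment. It is plain Python and does not call the SDK; the function and parameter names are hypothetical.

```python
def horizontal_displacement(glyph_width, font_size, char_spacing=0.0,
                            word_spacing=0.0, hscale=100.0, is_space=False):
    """Return how far the text position advances after showing one glyph.

    glyph_width is the glyph's width for a font size of 1 (glyph units / 1000).
    word_spacing is only added for the space character (char code 32).
    hscale is a percentage, with 100 meaning the normal glyph width.
    """
    tw = word_spacing if is_space else 0.0
    return (glyph_width * font_size + char_spacing + tw) * (hscale / 100.0)
```

Note that both spacing values are scaled by the horizontal scale but not by the font size, which is why they are described as unscaled text space units.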

SetDashPattern(dash_array, phase)[source]

Sets the dash pattern used to stroke paths. The line dash pattern controls the pattern of dashes and gaps used to stroke paths. It is specified by a dash array and a dash phase. The elements of both the dash array and the dash phase are expressed in user space units.

Parameters:

dash_array (std::vector< double,std::allocator< double > >) – the numbers that specify the lengths of alternating dashes and gaps.
phase (double) – specifies the distance into the dash pattern at which to start the dash.
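
The interaction between the dash array and the phase can be sketched as follows. This is a plain Python illustration that does not call the SDK; dash_segments is a hypothetical helper that lists the painted stretches along a straight line of a given length.

```python
from itertools import cycle


def dash_segments(dash_array, phase, length):
    """Return (start, end) pairs for the painted dashes along a straight line."""
    length = float(length)
    if not dash_array or sum(dash_array) <= 0:
        return [(0.0, length)]  # no usable pattern: treat as a solid line
    segments = []
    pos = 0.0 - phase  # begin 'phase' units into the pattern
    on = True          # the pattern always starts with a dash
    for step in cycle(dash_array):
        end = pos + step
        if on and end > 0 and pos < length:
            # Clip the dash to the visible part of the line.
            segments.append((max(pos, 0.0), min(end, length)))
        pos = end
        on = not on
        if pos >= length:
            break
    return segments
```

For example, a dash array of [3, 2] with phase 1 on a line of length 10 starts with a shortened first dash of 2 units, because one unit of the first dash has already been consumed by the phase.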

SetFillColor(args)[source]

Overload 1:

Sets the color value/point used for filling operations.

Parameters:: c (ColorPt) – the color used for filling operations The color value must be represented in the currently selected color space used for filling.

Overload 2:

Set the fill color to the given tiling pattern.

Parameters:: pattern (PatternColor) – New pattern color.

Notes: The currently selected fill color space must be Pattern color space.

Overload 3:

Set the fill color to the given uncolored tiling pattern.

Parameters:

pattern (PatternColor) – PatternColor (PatternType = 1 and PaintType = 2) object.
c (ColorPt) – is a color in the pattern’s underlying color space.

Notes: The currently selected fill color space must be Pattern color space.

SetFillColorSpace(cs)[source]

Sets the color space used for filling operations

Parameters:: cs (ColorSpace) – ColorSpace object to use for filling operations

SetFillOpacity(ca)[source]

Sets the opacity value for painting operations other than stroking. Corresponds to the value of the /ca key in the ExtGState dictionary.

Parameters:: ca (double) – value to set fill opacity to

SetFillOverprint(op)[source]

Specifies if overprint is enabled for fill operations. Corresponds to the /op key within the ExtGState’s dictionary.

Parameters:: op (boolean) – true to enable overprint for fill, false to disable.

SetFlatness(flatness)[source]

Sets the value of flatness tolerance.

Parameters:: flatness (double) – is a number in the range 0 to 100; a value of 0 specifies the output device’s default flatness tolerance.

The flatness tolerance controls the maximum permitted distance in device pixels between the mathematically correct path and an approximation constructed from straight line segments.

SetFont(font, font_sz)[source]

Sets the font and font size used to draw text.

Parameters:

font (Font) – Font to draw the text with
font_sz (double) – size of the font to draw the text with

SetHalftone(HT)[source]

Sets currently selected halftone dictionary or stream (NULL by default). Corresponds to the /HT key within the ExtGState’s dictionary. Halftoning is a process by which continuous-tone colors are approximated on an output device that can achieve only a limited number of discrete colors.

Parameters:

HT (Obj) –

SDF/Cos halftone dictionary, stream, or name

SetHorizontalScale(hscale)[source]

Sets horizontal scale. The horizontal scaling parameter adjusts the width of glyphs by stretching or compressing them in the horizontal direction. Its value is specified as a percentage of the normal width of the glyphs, with 100 being the normal width. The scaling always applies to the horizontal coordinate in text space, independently of the writing mode. See Section 5.2.3 in PDF Reference Manual for details.

Parameters:: hscale (double) – value to set horizontal scale to.

SetLeading(leading)[source]

Sets the leading parameter.

The leading parameter is measured in unscaled text space units. It specifies the vertical distance between the baselines of adjacent lines of text. See Section 5.2.4 in PDF Reference Manual for details.

Parameters:: leading (double) – number representing vertical distance between lines of text
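
As a small illustration of what the leading value means, the sketch below computes the baseline positions of consecutive lines, assuming each new line moves down by exactly the leading. It is plain Python and does not call the SDK; baseline_positions is a hypothetical helper.

```python
def baseline_positions(first_baseline_y, leading, line_count):
    """Return the y coordinate of each line's baseline, top line first."""
    return [first_baseline_y - i * leading for i in range(line_count)]
```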

SetLineCap(cap)[source]

Sets LineCap style property.

The line cap style specifies the shape to be used at the ends of open subpaths (and dashes, if any) when they are stroked.

SetLineJoin(join)[source]

Sets LineJoin style property.

The line join style specifies the shape to be used at the corners of paths that are stroked.

SetLineWidth(width)[source]

Sets the thickness of the line used to stroke a path.

Parameters:: width (double) – a non-negative number expressed in user space units. A line width of 0 denotes the thinnest line that can be rendered at device resolution: 1 device pixel wide.

SetMiterLimit(miter_limit)[source]

Sets miter limit.

Parameters:: miter_limit (double) – A number that imposes a maximum on the ratio of the miter length to the line width. When the limit is exceeded, the join is converted from a miter to a bevel.
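
The PDF Reference relates the miter ratio to the angle between the two segments as miter length / line width = 1 / sin(angle / 2). The following plain Python sketch uses that relation to check whether a join stays mitered; it does not call the SDK, and join_is_mitered is a hypothetical helper that expects an angle greater than 0 and at most 180 degrees.

```python
import math


def join_is_mitered(angle_degrees, miter_limit):
    """Return True if two segments meeting at this angle keep a miter join."""
    half_angle = math.radians(angle_degrees) / 2.0
    ratio = 1.0 / math.sin(half_angle)  # miter length divided by line width
    return ratio <= miter_limit
```

With a miter limit of 10, a right-angle corner stays mitered, while a very sharp 10 degree corner exceeds the limit and would be beveled.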

SetOverprintMode(OPM)[source]

Sets the overprint mode. Corresponds to the /OPM key within the ExtGState’s dictionary.

Parameters:: OPM (int) – overprint mode.

SetRenderingIntent(intent)[source]: Sets the color intent to be used for rendering the Element.

SetSmoothnessTolerance(SM)[source]: Sets the smoothness tolerance used to control the quality of smooth shading. Corresponds to the /SM key within the ExtGState’s dictionary.

SetSoftMask(SM)[source]

Sets the soft mask of the extended graphics state. Corresponds to the /SMask key within the ExtGState’s dictionary.

Parameters:

SM (Obj) –

SDF/Cos soft mask dictionary or name

SetStrokeColor(args)[source]

Overload 1:

Sets the color value/point used for stroking operations.

Parameters:: c (ColorPt) – is the color used for stroking operations

Notes: The color value must be represented in the currently selected color space used for stroking.

Overload 2:

Set the stroke color to the given tiling pattern.

Parameters:: pattern (PatternColor) – SDF pattern object.

Notes: The currently selected stroke color space must be Pattern color space.

Overload 3:

Set the stroke color to the given uncolored tiling pattern.

Parameters:

pattern (PatternColor) – pattern (PatternType = 1 and PaintType = 2) object.
c (ColorPt) – is a color in the pattern’s underlying color space.

Notes: The currently selected stroke color space must be Pattern color space.

SetStrokeColorSpace(cs)[source]

Sets the color space used for stroking operations

Parameters:: cs (ColorSpace) – ColorSpace object to use for stroking operations

SetStrokeOpacity(ca)[source]

Sets opacity value for stroke painting operations for paths and glyph outlines. Corresponds to the value of the /CA key in the ExtGState dictionary.

Parameters:: ca (double) – value to set stroke opacity to

SetStrokeOverprint(OP)[source]

Specifies if overprint is enabled for stroke operations. Corresponds to the /OP key within the ExtGState’s dictionary.

Parameters:: OP (boolean) – true to enable overprint for stroke, false to disable.

SetTextKnockout(knockout)[source]

Mark the object as elementary for purposes of color compositing in the transparent imaging model.

Parameters:: knockout (boolean) – Whether an object is elementary or not.

SetTextRenderMode(rmode)[source]: Sets text rendering mode. The text rendering mode determines whether showing text causes glyph outlines to be stroked, filled, used as a clipping boundary, or some combination of the three. See Section 5.2.5 in PDF Reference Manual for details.

SetTextRise(rise)[source]

Sets text rise. Text rise specifies the distance, in unscaled text space units, to move the baseline up or down from its default location. Positive values of text rise move the baseline up.

Parameters:: rise (double) – distance to move baseline up. Negative values move baseline down.

SetTransferFunct(TR)[source]

Sets transfer function used during color conversion process. A transfer function adjusts the values of color components to compensate for nonlinear response in an output device and in the human eye. Corresponds to the /TR key within the ExtGState’s dictionary.

Parameters:

TR (Obj) –

SDF/Cos transfer function, array, or name

SetTransform(args)[source]

Overload 1:

Set the transformation matrix associated with this element.

Parameters:: mtx (Matrix2D) – The new transformation for this text element.

Notes: in PDF associating a transformation matrix with an element (‘cm’ operator) will also affect all subsequent elements.

Overload 2:

Set the transformation matrix associated with this element.

A transformation matrix in PDF is specified by six numbers, usually in the form of an array containing six elements. In its most general form, this array is denoted [a b c d h v]; it can represent any linear transformation from one coordinate system to another. For more information about PDF matrices please refer to section 4.2.2 ‘Common Transformations’ in PDF Reference Manual, and to documentation for Common::Matrix2D class.

Parameters:

a (double) –
- horizontal ‘scaling’ component of the new text matrix.
b (double) –
- ‘rotation’ component of the new text matrix.
c (double) –
- ‘rotation’ component of the new text matrix.
d (double) –
- vertical ‘scaling’ component of the new text matrix.
h (double) –
- horizontal translation component of the new text matrix.
v (double) –
- vertical translation component of the new text matrix.
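
To make the roles of the six numbers concrete, the sketch below maps a point through a matrix written as [a b c d h v], using the standard PDF convention x' = a*x + c*y + h and y' = b*x + d*y + v. It is plain Python and does not use the SDK's Matrix2D class; apply_matrix is a hypothetical helper.

```python
def apply_matrix(a, b, c, d, h, v, x, y):
    """Map the point (x, y) through the PDF matrix [a b c d h v]."""
    return (a * x + c * y + h, b * x + d * y + v)
```

For instance, [2 0 0 3 10 20] scales by 2 horizontally and 3 vertically and then translates by (10, 20), while [0 1 -1 0 0 0] rotates a point by 90 degrees counterclockwise.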

SetUCRFunct(UCR)[source]

Sets undercolor-removal function used during conversion between DeviceRGB and DeviceCMYK. Corresponds to the /UCR key within the ExtGState’s dictionary.

Parameters:

UCR (Obj) –

SDF/Cos undercolor-removal function or name

SetWordSpacing(word_spacing)[source]

Sets word spacing.

Parameters:

word_spacing (double) –

a number specified in unscaled text space units.

Word spacing works the same way as character spacing, but applies only to the space character (char code 32). See Section 5.2.2 in PDF Reference Manual for details.

e_BG_funct = 33

e_UCR_funct = 34

e_absolute_colorimetric = 0

e_alpha_is_shape = 25

e_auto_stoke_adjust = 28

e_bevel_join = 2

e_bl_color = 16

e_bl_color_burn = 8

e_bl_color_dodge = 7

e_bl_compatible = 0

e_bl_darken = 5

e_bl_difference = 4

e_bl_exclusion = 9

e_bl_hard_light = 10

e_bl_hue = 14

e_bl_lighten = 6

e_bl_luminosity = 13

e_bl_multiply = 2

e_bl_normal = 1

e_bl_overlay = 11

e_bl_saturation = 15

e_bl_screen = 3

e_bl_soft_light = 12

e_blend_mode = 22

e_butt_cap = 0

e_char_spacing = 12

e_clip_text = 7

e_dash_pattern = 11

e_fill_clip_text = 4

e_fill_color = 5

e_fill_cs = 4

e_fill_overprint = 30

e_fill_stroke_clip_text = 6

e_fill_stroke_text = 2

e_fill_text = 0

e_flatness = 9

e_font = 16

e_font_size = 17

e_halftone = 35

e_horizontal_scale = 14

e_invisible_text = 3

e_leading = 15

e_line_cap = 7

e_line_join = 8

e_line_width = 6

e_miter_join = 0

e_miter_limit = 10

e_null = 36

e_opacity_fill = 23

e_opacity_stroke = 24

e_overprint_mode = 31

e_perceptual = 3

e_relative_colorimetric = 1

e_rendering_intent = 1

e_round_cap = 1

e_round_join = 1

e_saturation = 2

e_smoothnes = 27

e_soft_mask = 26

e_square_cap = 2

e_stroke_clip_text = 5

e_stroke_color = 3

e_stroke_cs = 2

e_stroke_overprint = 29

e_stroke_text = 1

e_text_knockout = 20

e_text_pos_offset = 21

e_text_render_mode = 18

e_text_rise = 19

e_transfer_funct = 32

e_transform = 0

e_word_spacing = 13

property mp_state

property thisown: The membership flag

class apryse_sdk.Group(args)[source]

Bases: object

The OCG::Group object represents an optional-content group. This corresponds to a PDF OCG dictionary representing a collection of graphic objects that can be made visible or invisible (Section 4.10.1 ‘Optional Content Groups’ in PDF Reference). Any graphic content of the PDF can be made optional, including page contents, XObjects, and annotations. The specific content objects in the group have an OC entry in the PDF as part of the surrounding marked content or in the XObject dictionary. The group itself is a named object that can be typically manipulated through a Layers panel in a PDF viewer.

In the simplest case, the group’s ON-OFF state makes the associated content visible or hidden. The ON-OFF state of a group can be toggled for a particular context (OCG::Context), and a set of states is kept in a configuration (OCG::Config). The visibility can depend on more than one group in an optional-content membership dictionary (OCG::OCMD), and can also be affected by the context’s draw mode (OCGContext::OCDrawMode).

A group has an Intent entry, broadly describing the intended use. A group’s content is considered to be optional (that is, the group’s state is considered in its visibility) if any intent in its list matches an intent of the context. The intent list of the context is usually set from the intent list of the document configuration.
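
The intent matching rule above can be sketched as a simple set intersection. This is plain Python that does not call the SDK; is_content_optional is a hypothetical helper, and intents are represented as plain strings for illustration.

```python
def is_content_optional(group_intents, context_intents):
    """Return True if any intent of the group matches an intent of the context."""
    return bool(set(group_intents) & set(context_intents))
```

When the function returns False, the group's ON-OFF state would not be taken into account for visibility under this rule.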

A Usage dictionary entry provides more specific intended usage information than an intent entry. Possible key values are: CreatorInfo, Language, Export, Zoom, Print, View, User, PageElement. The usage value can act as a kind of metadata, describing the sort of things that belong to the group, such as text in French, fine detail on a map, or a watermark. The usage values can also be used by the AutoState mechanism to make decisions about what groups should be on and what groups should be off. The AutoState mechanism considers the usage information in the OCGs, the AS array of the configuration, and external factors; for example, the language the application is running in, the current zoom level on the page, or whether the page is being printed.

static Create(doc, name)[source]

Creates a new optional-content group (OCG) object in the document.

Parameters:

doc (PDFDoc) – The document in which the new OCG will be created.
name (string) – The name of the optional-content group.

Return type:

Group

Returns:

The newly created OCG::Group object.

GetCurrentState(context)[source]

Return type:: boolean
Returns:: true if this OCG object is in the ON state in a given context, false otherwise.
Parameters:: context (Context) – The context for which to get the group’s state.

GetInitialState(config)[source]

Return type:: boolean
Returns:: The initial state (ON or OFF) of the optional-content group (OCG) object in a given configuration.
Parameters:: config (Config) – The configuration for which to get the group’s initial state.

Notes: If the configuration has a BaseState of Unchanged, and the OCG is not listed explicitly in its ON list or OFF list, then the initial state is taken from the OCG’s current state in the document’s default context.
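
One way to read this rule is sketched below in plain Python. It does not call the SDK: initial_state is a hypothetical helper, group names stand in for OCG objects, and the precedence shown (explicit ON and OFF lists first, then BaseState) is an assumption based on the note above rather than a statement of how the SDK implements it.

```python
def initial_state(base_state, on_list, off_list, group, current_state):
    """Resolve a group's initial state for a configuration.

    base_state is "ON", "OFF", or "Unchanged"; current_state is the group's
    state in the document's default context, used only for "Unchanged".
    """
    if group in on_list:
        return True
    if group in off_list:
        return False
    if base_state == "ON":
        return True
    if base_state == "OFF":
        return False
    return current_state  # "Unchanged": fall back to the current state
```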

GetIntent()[source]

Return type:: Obj
Returns:: OCG intent. An intent is a name object (or an array of name objects) broadly describing the intended use, which can be either “View” or “Design”. A group’s content is considered to be optional (that is, the group’s state is considered in its visibility) if any intent in its list matches an intent of the context. The intent list of the context is usually set from the intent list of the document configuration.

GetName()[source]

Return type:: string
Returns:: the name of this optional-content group (OCG).

GetSDFObj()[source]

Return type:: Obj
Returns:: Pointer to the underlying SDF/Cos object.

GetUsage(key)[source]

Return type:: Obj
Returns:: The usage information associated with the given key in the Usage dictionary for the group, or NULL if the entry is not present. A Usage dictionary entry provides more specific intended usage information than an intent entry.
Parameters:: key (string) – The usage key in the usage dictionary entry. The possible key values are: CreatorInfo, Language, Export, Zoom, Print, View, User, PageElement.

HasUsage()[source]

Return type:: boolean
Returns:: true if this group is associated with a Usage dictionary, false otherwise.

IsLocked(config)[source]

Return type:: boolean
Returns:: true if this OCG is locked in a given configuration, false otherwise. The on/off state of a locked OCG cannot be toggled by the user through the user interface.
Parameters:: config (Config) – The OC configuration.

IsValid()[source]

Return type:: boolean
Returns:: True if this is a valid (non-null) OCG, false otherwise.

SetCurrentState(context, state)[source]

Sets the current ON-OFF state of the optional-content group (OCG) object in a given context.

Parameters:

context (Context) – The context for which to set the group’s state.
state (boolean) – The new state.

SetInitialState(config, state)[source]

Sets the initial state (ON or OFF) of the optional-content group (OCG) object in a given configuration.

Parameters:

config (Config) – The configuration for which to set the group’s initial state.
state (boolean) – The new initial state, true if the state is ON, false if it is OFF.

SetIntent(intent)[source]

Sets the Intent entry in an optional-content group’s SDF/Cos dictionary. For more information, see GetIntent().

Parameters:: intent (Obj) – The new Intent entry value (a name object or an array of name objects).

SetLocked(config, locked)[source]

Sets the locked state of an OCG in a given configuration. The on/off state of a locked OCG cannot be toggled by the user through the user interface.

Parameters:

config (Config) – IN/OUT The optional-content configuration.
locked (boolean) – true if the OCG should be locked, false otherwise.

SetName(name)[source]

Sets the name of this optional-content group (OCG).

Parameters:: name (string) – The new name.

property mp_obj

property thisown: The membership flag

class apryse_sdk.HTML2PDF[source]

Bases: object

‘pdftron.PDF.HTML2PDF’ is an optional PDFNet Add-On utility class that can be used to convert HTML web pages into PDF documents by using an external module (html2pdf).

The html2pdf modules can be downloaded from http://www.pdftron.com/pdfnet/downloads.html.

Users can convert HTML pages to PDF using the following operations:

- Simple one-line static method to convert a single web page to PDF.
- Convert HTML pages from URL or string, plus optional table of contents, in user-defined order.
- Optionally configure settings for proxy, images, JavaScript, and more for each HTML page.
- Optionally configure the PDF output, including page size, margins, orientation, and more.
- Optionally add table of contents, including setting the depth and appearance.

The following code converts a single web page to PDF:

using namespace pdftron;
using namespace PDF;

PDFDoc pdfdoc;
if ( HTML2PDF::Convert(pdfdoc, "http://www.gutenberg.org/wiki/Main_Page") )
    pdfdoc.Save(outputFile, SDF::SDFDoc::e_remove_unused, NULL);

The following code demonstrates how to convert multiple web pages into one PDF, excluding the background, and with lowered image quality to save space:

using namespace pdftron;
using namespace PDF;

HTML2PDF converter;
converter.SetImageQuality(25);

HTML2PDF::WebPageSettings settings;
settings.SetPrintBackground(false);

converter.InsertFromURL("http://www.gutenberg.org/wiki/Main_Page", settings);

PDFDoc pdfdoc;
if ( converter.Convert(pdfdoc) )
    pdfdoc.Save(outputFile, SDF::SDFDoc::e_remove_unused, NULL);

AddCookie(name, value)[source]

Add cookie to the HTTP Headers when converting via a URL.

Parameters:

name (string) –
- the name of the cookie.
value (string) –
- the value of the cookie.

Convert(doc)[source]

Convert HTML documents and append the results to doc.

Return type:

boolean

Returns:

true if successful, otherwise false. Use ‘GetHTTPErrorCode’ for possible HTTP errors.

Parameters:

doc (PDFDoc) –

Target PDF to which converted HTML pages will be appended.

Notes: Use ‘InsertFromURL’ and ‘InsertFromHtmlString’ to add HTML documents to be converted.

Destroy()[source]: Frees the native memory of the object.

DumpOutline(xml_file)[source]

Save outline to an XML file.

Parameters:

xml_file (string) –

Path where the XML data representing the outline of the produced PDF should be saved.

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

GetHTTPErrorCode()[source]

Return the largest HTTP error code encountered during conversion

Return type:: int
Returns:: the largest HTTP code greater than or equal to 300 encountered during loading of any of the supplied objects. If no such error code is found, 0 is returned.

Notes: This function will only return a useful result after ‘Convert’ has been called.
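
The return value rule described above amounts to the following, shown as a plain Python sketch that does not call the SDK. largest_http_error is a hypothetical helper operating on a list of status codes collected by the caller.

```python
def largest_http_error(status_codes):
    """Return the largest status code that is at least 300, or 0 if none is."""
    errors = [code for code in status_codes if code >= 300]
    return max(errors, default=0)
```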

GetLog()[source]

Get results of conversion, including errors and warnings, in human readable form.

Return type:: string
Returns:: String containing results of conversion.

InsertFromHtmlString(args)[source]

Overload 1:

Convert HTML encoded in string.

Parameters:

html (string) –

String containing HTML code.

Overload 2:

Convert HTML encoded in string.

Parameters:

html (string) –
- String containing HTML code.
settings (WebPageSettings) –
- How the HTML content described in html is loaded.

InsertFromURL(args)[source]

Overload 1:

Add a web page to be converted. A single URL typically results in many PDF pages.

Parameters:

url (string) –

HTML page, or relative path to local HTML page

Overload 2:

Add a web page to be converted. A single URL typically results in many PDF pages.

Parameters:

url (string) –
- HTML page, or relative path to local HTML page
settings (WebPageSettings) –
- How the web page should be loaded and converted

InsertTOC(args)[source]

Overload 1:

Add a table of contents to the produced PDF. Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

Overload 2:

Add a table of contents to the produced PDF.

Parameters:

settings (TOCSettings) –

Settings for the table of contents.

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

static IsModuleAvailable()[source]

Find out whether the HTML2PDF module is available (and licensed).

Return type:: boolean
Returns:: returns true if HTML2PDF operations can be performed.

SetCompatibilityMode(compatibility)[source]

Runs HTML to PDF conversion in compatibility mode, which uses altered graphics options and does not create a dedicated render process. This option may be somewhat slower than the default mode. However, it may be required in environments with limited platform dependencies, such as AWS Lambda.

Parameters:

compatibility (boolean) –

If true, compatibility mode is enabled.

SetCookieJar(path)[source]

Path of file used for loading and storing cookies.

Parameters:

path (string) –

Path to file used for loading and storing cookies.

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

SetCustomHeader(name, value)[source]

Add a custom HTTP header specified by name and value.

Parameters:

name (string) –
- the name of the custom header.
value (string) –
- the value of the custom header.

SetDPI(dpi)[source]

Change the DPI explicitly for the output PDF.

Parameters:

dpi (int) –

Dots per inch, e.g. 80.

This has no effect on X11 based systems. Notes: Results also depend on ‘SetSmartShrinking’.

SetFooter(footer)[source]

Set footer of generated PDF.

Parameters:: footer (string) – HTML string to be used as the footer

SetHeader(header)[source]

Set header of generated PDF.

Parameters:: header (string) – HTML string to be used as the header

SetImageDPI(dpi)[source]

Maximum DPI to use for images in the generated PDF.

Parameters:

dpi (int) –

Maximum dpi of images in produced PDF, e.g. 80.

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

SetImageQuality(quality)[source]

JPEG compression factor to use when generating PDF.

Parameters:

quality (int) –

Compression factor, e.g. 92.

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

SetLandscape(enable)[source]

Set page orientation for output PDF.

Parameters:

enable (boolean) –

If true, generated PDF pages will be oriented to landscape; otherwise orientation will be portrait.

SetLogFilePath(path)[source]

Sets the location of the log file to be used during conversion.

Parameters:: path (string) – Full path and filename of file to log to.

SetMargins(top, bottom, left, right)[source]

Set margins of generated PDF.

Parameters:

top (string) –
- Size of the top margin, e.g. “2cm”.
bottom (string) –
- Size of the bottom margin, e.g. “2cm”.
left (string) –
- Size of the left margin, e.g. “2cm”.
right (string) –
- Size of the right margin, e.g. “2cm”.

Notes: Supported units are mm, cm, m, in, pc(pica), px(pixel) and pt(point).
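
For readers who want to sanity-check margin strings before passing them in, the sketch below converts a size such as "2cm" into PDF points (1/72 inch). It is plain Python and does not call the SDK; to_points and POINTS_PER_UNIT are hypothetical names, and the px conversion assumes 96 pixels per inch, which may differ from what the html2pdf module actually uses.

```python
import re

# Points per unit; "px" assumes 96 pixels per inch (an assumption, see above).
POINTS_PER_UNIT = {
    "pt": 1.0,
    "pc": 12.0,
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
    "m": 7200.0 / 2.54,
    "px": 72.0 / 96.0,
}


def to_points(size):
    """Convert a size string such as '2cm' or '0.5in' to points."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(mm|cm|m|in|pc|px|pt)\s*", size)
    if match is None:
        raise ValueError("unsupported size: %r" % size)
    value, unit = match.groups()
    return float(value) * POINTS_PER_UNIT[unit]
```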

static SetModulePath(path)[source]

Set the only location that PDFNet will look for the html2pdf module.

Parameters:

path (string) –

A folder or file path. If non-empty, PDFNet will only look in path for the html2pdf module; otherwise it will search in the default locations for the module.

SetOutline(enable, depth=4)[source]

Add bookmarks to the PDF.

Parameters:

enable (boolean) –
- If true, bookmarks will be generated for the produced PDF.
depth (int, optional) –
- Maximum depth of the outline (e.g. 4).

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

SetPDFCompression(enable)[source]

Use lossless compression to create PDF.

Parameters:

enable (boolean) –

If true, lossless compression will be used to create the PDF.

Notes: This option is deprecated in the latest HTML2PDF module and may have no effect.

SetPaperSize(args)[source]

Overload 1:

Set paper size of output PDF

Parameters:

size (int) –

Paper size to use for produced PDF.

Overload 2:

Manually set the paper dimensions of the produced PDF.

Parameters:

width (string) –
- Width of the page, e.g. “4cm”.
height (string) –
- Height of the page, e.g. “12in”.

Notes: Supported units are mm, cm, m, in, pc(pica), px(pixel) and pt(point).

SetQuiet(quiet)[source]

Display HTML to PDF conversion progress, warnings, and errors, to stdout.

Parameters:

quiet (boolean) –

If false, progress information is sent to stdout during conversion.

Notes: You can get the final results using GetLog.

SetSandbox(sandbox)[source]

Provides the ability to run HTML to PDF conversion with sandbox disabled. Default is true. Note: On Linux this option has no effect, because sandbox is always disabled there.

Parameters:

sandbox (boolean) –

If false, ‘--no_sandbox’ will be applied and the sandbox will not be used.

property mp_html2pdf

property thisown: The membership flag

class apryse_sdk.HTMLOutputOptions[source]

Bases: object

A class containing options common to ToHtml and ToEpub functions

GetFootnotesSetting()[source]

Get the setting for footnotes from this options object. Notes: This option is only available for e_reflow_full mode.

Return type:: int
Returns:: The current footnote setting.

GetHeadersAndFootersSetting()[source]

Get the setting for headers and footers from this options object. Notes: This option is only available for e_reflow_full mode.

Return type:: int
Returns:: The current header and footer setting.

SetConnectHyphens(connect)[source]

Specifies whether hyphens in the PDF should be connected. Default is false. Notes: This option is only available for e_reflow_paragraphs and e_reflow_full modes.

Parameters:: connect (boolean) – if true, hyphens in the PDF will be connected.

SetContentReflowSetting(reflow)[source]

Switch between fixed (pre-paginated) and reflowable HTML generation. Default is e_fixed_position. In e_reflow_paragraphs mode (now deprecated), conversions require that the optional PDFTron HTML reflow paragraphs add-on module is available. In e_reflow_full mode, conversions require that the optional PDFTron StructuredOutput add-on module is available.

Parameters:: reflow (int) – the generated HTML will be either fixed or reflowable.

See also: ContentReflowSetting See also: StructuredOutputModule See also: PDF2HtmlReflowParagraphsModule

SetDPI(dpi)[source]

The output resolution, from 1 to 1000, in Dots Per Inch (DPI) at which to render elements which cannot be directly converted. Default is 140. Notes: This option is only available for e_fixed_position mode.

Parameters:: dpi (int) – the resolution in Dots Per Inch

SetDisableVerticalSplit(disable)[source]

Specifies whether to disable the detection of section columns. Default is false. Enable this if your tables are coming out as section columns. Notes: This option is only available for e_reflow_paragraphs mode. In e_reflow_full mode, columns are detected automatically.

Parameters:: disable (boolean) – if true, the detection of section columns is disabled.

SetEmbedImages(embed)[source]

Specifies whether images are embedded in the HTML without having to link to external files. Default is true. Notes: This option is only available for e_reflow_paragraphs and e_reflow_full modes.

Parameters:: embed (boolean) – if true, images are embedded in the HTML, otherwise, images are saved as external files.

SetExternalLinks(enable)[source]

Enable the conversion of external URL navigation. Default is false.

Parameters:: enable (boolean) – if true, links that specify external URLs are converted into HTML.

Notes: This option is only available for e_fixed_position mode.

SetFileConversionTimeoutSeconds(seconds)[source]

Specifies the amount of time in seconds after which the conversion fails. Default is 300. Very long files need more time to convert. Notes: This option is only available for e_reflow_paragraphs mode. The timeout feature is not necessary in other modes.

Parameters:: seconds (int) – the timeout in seconds.

SetFootnotesSetting(option)[source]

Specifies how footnotes should be converted. Default is e_Recover, which will include them as footnotes. Notes: This option is only available for e_reflow_full mode.

Parameters:: option (int) – The footnote setting.

SetHeadersAndFootersSetting(option)[source]

Specifies how headers and footers should be converted. Default is e_Recover, which will include them as headers and footers. Notes: This option is only available for e_reflow_full mode.

Parameters:: option (int) – The header and footer setting.

SetImageDPI(dpi)[source]

Specifies the output image resolution, from 8 to 600, in Pixels Per Inch (PPI). The higher the PPI, the larger the image. Default is 192. Notes: This option is only available for e_reflow_paragraphs mode. In other modes, image resolution is determined automatically for an optimal result.

Parameters:: dpi (int) – the resolution in Pixels Per Inch.

SetInternalLinks(enable)[source]

Enable the conversion of internal document navigation. Default is false.

Parameters:: enable (boolean) – if true, links that specify page jumps are converted into HTML.

Notes: This option is only available for e_fixed_position mode.

SetJPGQuality(quality)[source]

Specifies the compression quality to use when generating JPEG images. Notes: This option is only available for e_fixed_position and e_reflow_paragraphs modes. In e_reflow_full mode, the optimal JPEG quality is chosen automatically for best balance between size and quality.

Parameters:: quality (int) – the JPEG compression quality, from 0 (highest compression) to 100 (best quality).

SetLanguage(language)[source]

Specifies the OCR language. Default is automatic language detection. Notes: This option is only available for e_reflow_full mode.

Parameters:: language (int) – the OCR language.

SetMaximumImagePixels(max_pixels)[source]

Specifies the maximum image slice size in pixels. Default is 2000000. Notes: This setting no longer reduces the total number of image pixels. Instead, a lower value produces more slices and a higher value produces fewer. Because image compression works better with more pixels, a larger maximum should generally create smaller files. This option is only available for e_fixed_position mode.

Parameters:: max_pixels (int) – the maximum number of pixels an image can have

SetNoPageWidth(enable)[source]

Determines whether to flow contents across the entire browser window. Default is false. Notes: This option is only available for e_reflow_paragraphs mode. In e_reflow_full mode, content always flows across the entire browser window.

Parameters:: enable (boolean) – if true, content will flow across entire page.

SetPDFPassword(password)[source]

Specifies the password if the PDF requires one. Notes: This option is only available for e_reflow_paragraphs and e_reflow_full modes.

Parameters:: password (string) – the PDF password, if required; an empty string otherwise.

SetPages(page_from, page_to)[source]

Specifies a range of pages to be converted. By default all pages are converted. The first page has the page number of 1. Notes: This option is only available for e_reflow_paragraphs and e_reflow_full modes.

Parameters:

page_from (int) – the first page to be converted.
page_to (int) – the last page to be converted (inclusive). Use a negative value to specify the last page in the PDF.
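
The page-range convention described above (page numbers start at 1, the range is inclusive, and a negative page_to stands for the last page) can be illustrated with a small pure-Python helper. This is only a sketch: resolve_page_range and its page_count argument are hypothetical names used for illustration and are not part of the SDK.

```python
def resolve_page_range(page_from, page_to, page_count):
    """Return the (first, last) 1-based inclusive page numbers that a
    SetPages(page_from, page_to) call would describe for a document with
    page_count pages. Illustrative helper only; not an SDK function."""
    if page_from < 1:
        raise ValueError("page numbers start at 1")
    # A negative page_to stands for the last page of the PDF.
    last = page_count if page_to < 0 else page_to
    if last < page_from:
        raise ValueError("page_to must not precede page_from")
    return page_from, last
```

For example, resolve_page_range(3, -1, 12) describes pages 3 through 12.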

SetPreferJPG(prefer_jpg)[source]

Use JPG files rather than PNG. This will apply to all generated images. Default is true. Notes: This option is only available for e_fixed_position and e_reflow_paragraphs modes.

Parameters:: prefer_jpg (boolean) – if true, JPG images will be used whenever possible.

SetPreferredOCREngine(engine)[source]

Specifies preferred OCR engine. Notes: This option is only available for e_reflow_full mode.

Parameters:: engine (int) – The PreferredOCREngine to OCR.

SetReportFile(path)[source]

Generate an XML file that contains additional information about the conversion process. By default no report is generated.

Parameters:: path (string) – the file path to which the XML report is written.

Notes: This option is only available for e_fixed_position mode.

SetScale(scale)[source]

Set an overall scaling of the generated HTML pages. Default is 1.0.

Parameters:: scale (double) – A number greater than 0 which is used as a scale factor. For example, calling SetScale(0.5) will reduce the HTML body of the page to half its original size, whereas SetScale(2) will double the HTML body dimensions of the page and will rescale all page content appropriately.

Notes: This option is only available for e_fixed_position mode.

SetSearchableImageSetting(setting)[source]

Specifies how scanned image pages should be converted. Default is e_ocr_image_text. Notes: This option is only available for e_reflow_paragraphs and e_reflow_full modes.

Parameters:: setting (int) – the searchable image setting.

Remarks: In e_reflow_paragraphs mode, this feature does not perform OCR, but instead it relies on pre-existing text from previous OCR. Both images and pre-existing hidden text are kept by default. In e_reflow_full mode, pre-existing OCRed content is ignored and a new OCR is performed from scratch by default. e_ocr_off can be used to disable OCR. See also: SearchableImageSetting

SetSimpleLists(enable)[source]

Determines whether to use tags for list items. Default is false. Notes: This option is only available for e_reflow_paragraphs mode. In e_reflow_full mode, list items always use tags.

Parameters:: enable (boolean) – if true, tags are used for list items.

SetSimplifyText(enable)[source]

Controls whether the converter optimizes the DOM or preserves text placement accuracy. Default is false.

Parameters:: enable (boolean) – if true, the converter will try to reduce DOM complexity at the expense of text placement accuracy.

Notes: This option is only available for e_fixed_position mode.

SetTitle(title)[source]

Specifies the title for the output HTML. Notes: This option is only available for e_reflow_paragraphs mode. HTML titles are not supported in other modes at the moment.

Parameters:: title (string) – the title of the output HTML.

e_fixed_position = 0

e_ocr_always = 4

e_ocr_image = 1

e_ocr_image_text = 0

e_ocr_off = 3

e_ocr_text = 2

e_reflow_full = 2

e_reflow_paragraphs = 1
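
Because most setters above are restricted to particular content reflow modes, it can help to collect the "only available for" notes in one place. The sketch below is a plain-Python summary of the notes in this section only; SETTER_MODES and setters_for_mode are hypothetical names, and the table does not cover setters documented earlier in the class.

```python
# Mode constants, with the values listed above.
E_FIXED_POSITION = 0
E_REFLOW_PARAGRAPHS = 1
E_REFLOW_FULL = 2

# Setter name -> modes in which the notes above say the option is available.
SETTER_MODES = {
    "SetFileConversionTimeoutSeconds": {E_REFLOW_PARAGRAPHS},
    "SetFootnotesSetting": {E_REFLOW_FULL},
    "SetHeadersAndFootersSetting": {E_REFLOW_FULL},
    "SetImageDPI": {E_REFLOW_PARAGRAPHS},
    "SetInternalLinks": {E_FIXED_POSITION},
    "SetJPGQuality": {E_FIXED_POSITION, E_REFLOW_PARAGRAPHS},
    "SetLanguage": {E_REFLOW_FULL},
    "SetMaximumImagePixels": {E_FIXED_POSITION},
    "SetNoPageWidth": {E_REFLOW_PARAGRAPHS},
    "SetPDFPassword": {E_REFLOW_PARAGRAPHS, E_REFLOW_FULL},
    "SetPages": {E_REFLOW_PARAGRAPHS, E_REFLOW_FULL},
    "SetPreferJPG": {E_FIXED_POSITION, E_REFLOW_PARAGRAPHS},
    "SetPreferredOCREngine": {E_REFLOW_FULL},
    "SetReportFile": {E_FIXED_POSITION},
    "SetScale": {E_FIXED_POSITION},
    "SetSearchableImageSetting": {E_REFLOW_PARAGRAPHS, E_REFLOW_FULL},
    "SetSimpleLists": {E_REFLOW_PARAGRAPHS},
    "SetSimplifyText": {E_FIXED_POSITION},
    "SetTitle": {E_REFLOW_PARAGRAPHS},
}


def setters_for_mode(mode):
    """Return the sorted names of the setters documented as available in mode."""
    return sorted(name for name, modes in SETTER_MODES.items() if mode in modes)
```

Calling setters_for_mode(E_REFLOW_FULL) lists the options that the notes above mark as usable in e_reflow_full mode.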

property thisown: The membership flag

class apryse_sdk.Highlight(args)[source]

Bases: object

property length

property page_num

property position

property thisown: The membership flag

class apryse_sdk.HighlightAnnot(args)[source]

Bases: TextMarkup

A Highlight annotation covers a word or a group of contiguous words with partially transparent color.

static Create(doc, pos)[source]

Creates a new Highlight annotation in the specified document.

Parameters:

doc (SDFDoc) – A document to which the Highlight annotation is added.
pos (Rect) – A rectangle specifying the Highlight annotation’s bounds in default user space units.

Return type:

HighlightAnnot

Returns:

A newly created blank Highlight annotation.

static CreateAnnot(doc, pos)[source]

property thisown: The membership flag

class apryse_sdk.Highlights(args)[source]

Bases: object

Highlights is used to store the necessary information and perform certain tasks in accordance with Adobe’s Highlight standard, whose details can be found at:

In a nutshell, the Highlights class maintains a set of highlights. Each highlight contains three pieces of information:

page: the number of the page this Highlight is on; position: the start position (text offset) of this Highlight; length: the length of this Highlight.

Possible use case scenarios for Highlights include:

Load a Highlight file (in XML format) and highlight the corresponding texts in the viewer (e.g., if the viewer is implemented using PDFViewCtrl, it can be achieved simply by calling PDFViewCtrl::SelectByHighlights() method);
Save the Highlight information (e.g., constructed by the TextSearch class) to an XML file for external uses.

Note

The Highlights class does not maintain the corresponding PDF document for its highlights. It is the user’s responsibility to match them up.
The Highlights class ensures that each highlight it maintains is unique (no two highlights have the same page, position and length values).
The current implementation of Highlights only supports the ‘characters’ encoding for ‘units’ as described in the format; the ‘words’ encoding is not supported at this point.

For sample code, please take a look at the TextSearchTest sample project.
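
The iteration methods documented below (Begin, HasNext, Next, GetCurrentPageNumber) suggest a simple traversal loop. The following is a minimal sketch, assuming the loop shape "rewind, then read and advance while HasNext() is true"; highlighted_pages is a hypothetical helper, and the exact loop semantics should be checked against the installed SDK and the TextSearchTest sample.

```python
def highlighted_pages(hlts, doc):
    """Collect the page number of every highlight in hlts, in iteration order.

    hlts is expected to expose the Highlights methods documented here;
    doc is the PDFDoc passed through to Begin().
    """
    pages = []
    hlts.Begin(doc)  # rewind the internal pointer to the first highlight
    while hlts.HasNext():  # assumed loop condition, see lead-in
        pages.append(hlts.GetCurrentPageNumber())
        hlts.Next()  # move on to the next highlight
    return pages
```

Since the class does not keep track of the document its highlights belong to, the caller is responsible for passing the matching document.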

Add(hlts)[source]

Add highlights.

Parameters:: hlts (Highlights) – the Highlights instance containing the highlights to be added.

Begin(doc)[source]

Rewind the internal pointer to the first highlight.

Parameters:: doc (PDFDoc) – the PDF document to which the highlights correspond.

Notes: the PDF document can be a dummy document unless GetCurrentQuads() is to be called.

Clear()[source]: Clear the current Highlight information in the class.

static CreateInternal(impl)[source]

Destroy()[source]: Frees the native memory of the object.

GetCurrentPageNumber()[source]: Get the page number of the current highlight.

GetCurrentQuads()[source]

Get the corresponding quadrangles of the current highlight.

Parameters:: quads – the output pointer to the resulting quadrangles.
Return type:: std::vector< PDF::QuadPoint,std::allocator< PDF::QuadPoint > >
Returns:: the number of the resulting quadrangles. Each quadrangle has eight doubles (x1, y1), (x2, y2), (x3, y3), (x4, y4) denoting the four vertices in counter-clockwise order.

Notes: the ‘quads’ array is owned by the current Highlights and does not need to be explicitly released. Since a highlight may correspond to multiple quadrangles, e.g., when it crosses a line, the number of resulting quadrangles may be larger than 1.
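
Since a single highlight may map to several quadrangles, a caller often wants one enclosing rectangle per highlight. The sketch below works on plain (x, y) tuples rather than SDK QuadPoint objects; quad_bounding_box and union_boxes are hypothetical helper names, and how the four vertices are read from a QuadPoint is left to the caller.

```python
def quad_bounding_box(vertices):
    """Return (x_min, y_min, x_max, y_max) for one quadrangle.

    vertices: four (x, y) tuples, i.e. (x1, y1) ... (x4, y4).
    """
    if len(vertices) != 4:
        raise ValueError("a quadrangle has exactly four vertices")
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def union_boxes(boxes):
    """Return the smallest box enclosing every (x_min, y_min, x_max, y_max) box."""
    if not boxes:
        raise ValueError("at least one box is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )
```

For a highlight that wraps across two lines, union_boxes applied to the two per-quad boxes gives a rectangle covering both lines.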

GetCurrentTextRange()[source]: Get a TextRange object that represents the current highlight.

GetHandleInternal()[source]

HasNext()[source]: Query if there is any subsequent highlight after the current highlight.

Load(file_name)[source]

Load the Highlight information from a file. Note that the pre-existing Highlight information is discarded.

Parameters:: file_name (string) – the name of the file to load from.

Next()[source]: Move the current highlight to the next highlight.

Save(file_name)[source]

Save the current Highlight information in the class to a file.

Parameters:: file_name (string) – the name of the file to save to.

SaveToString()[source]

Save the current Highlight information in the class to an XML string.

Return type:: string
Returns:: the highlight XML file contents as a string

property mp_highlights

property thisown: The membership flag

class apryse_sdk.Image(args)[source]

Bases: object

Image class provides common methods for working with PDF images.

Notes: PDF::Element contains a similar interface used to access image data. To create the Image object from image PDF::Element, pass the Element’s SDF/Cos dictionary to the Image constructor (i.e. Image image(element->GetXObject()) )

static Create(args)[source]

Overload 1:

Create and embed an Image from an external file taking into account specified compression hints.

By default the function will either pass-through data preserving the original compression or will compress data using Flate compression. It is possible to fine tune compression or to select a different compression algorithm using ‘encoder_hints’ object.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. To obtain
SDF::Doc from PDFDoc use PDFDoc::GetSDFDoc() or Obj::GetDoc().
filename (string) –
- The name of the image file. Currently supported formats are
JPEG, PNG, GIF, TIFF, BMP, EMF, and WMF. Other raster formats can be embedded by decompressing image data and using other versions of Image::Create(…) method.
encoder_hints (Obj, optional) –
- An optional SDF::Obj containing a hint (or an SDF::Array of
  hints) that could be used to select a specific compression method and compression parameters. For a concrete example of how to create encoder hints, please take a look at JBIG2Test and AddImage sample projects. The image encoder accepts the following hints:
- /JBIG2; SDF::Name(“JBIG2”), An SDF::Name Object with value equal to “JBIG2”. If the
  image is monochrome (i.e. bpc == 1), the encoder will compress the image using JBIG2Decode filter. Note that JBIG2 compression is not recommended for use on scanned text/financial documents or equivalent
since its lossy nature can lead to similar looking numbers or characters being replaced.
- [/JBIG2 /Threshold 0.6 /SharePages 50] - Compress a monochrome image using lossy JBIG2Decode
compression with the given image threshold and by sharing segments from a specified number of pages. The threshold is a floating point number in the range from 0.4 to 0.9. Increasing the threshold value will increase image quality, but may increase the file size. The default value for threshold is 0.85. “SharePages” parameter can be used to specify the maximum number of pages sharing a common ‘JBIG2Globals’ segment stream. Increasing the value of this parameter improves compression ratio at the expense of memory usage.
- [/CCITT] - Compress a monochrome (i.e. bpc == 1) image using CCITT Group 4 compression. This algorithm typically produces
  larger output than JBIG2, but is lossless. This makes it much more suitable for scanned text documents. CCITT is the best option for more general monochrome compression use cases, since JBIG2 has potential to change image content.
- [/JPEG] - Use JPEG compression with default compression.
- [/JPEG /Quality 60] - Use JPEG compression with given quality setting. The “Quality”
  value is expressed on the 0..100 scale.
- [/JPEG2000] - Use JPEG2000 compression to compress an RGB or a grayscale image.
- [/JP2] - Use JPEG2000 compression with JP2 encoding. JP2 does not support CMYK images.
- [/Flate] - Use Flate compression with maximum compression at the expense of
  speed.
- [/Flate /Level 9] - Use Flate compression using specified compression level.
  Compression “Level” must be a number between 0 and 9: 1 gives best speed, 9 gives best compression, 0 gives no compression at all (the input data is simply copied a block at a time).
- /RAW or [/RAW] - The encoder will not use any compression method and the image
  will be stored in the raw format.

Return type:

Returns:

PDF::Image object representing the embedded image.

Notes: For C++ developers: Current document does not take the ownership of the encoder_hints object. Therefore it is a good programming practice to create encoder_hints object on the stack.
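
As a rough illustration of the [/JPEG /Quality 60] hint described above, the sketch below builds a hint array from a hint-set object. It assumes that hint_set is an ObjSet-like object with a CreateArray() method and that the returned array offers PushBackName() and PushBackNumber(); those calls are not documented in this section and are assumptions modeled on the AddImage sample project. jpeg_quality_hint is a hypothetical helper name.

```python
def jpeg_quality_hint(hint_set, quality):
    """Build the [/JPEG /Quality n] encoder hint described above.

    hint_set: assumed to be an ObjSet-like object (see lead-in).
    quality: JPEG quality on the documented 0..100 scale.
    """
    if not 0 <= quality <= 100:
        raise ValueError("Quality is expressed on the 0..100 scale")
    hint = hint_set.CreateArray()   # assumed ObjSet call
    hint.PushBackName("JPEG")       # assumed Obj array call
    hint.PushBackName("Quality")
    hint.PushBackNumber(quality)    # assumed Obj array call
    return hint
```

The returned object would then be passed as the encoder_hints argument of Create.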

Overload 2:

Create and embed an Image. Embed the raw image data taking into account specified compression hints.

By default the function will compress all images using Flate compression. It is possible to fine tune compression or to select a different compression algorithm using ‘encoder_hints’ object.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. The ‘Doc’ object
can be obtained using Obj::GetDoc() or PDFDoc::GetSDFDoc().
buf (unsigned char) –
- The stream or buffer containing image data. The image data must
not be compressed and must follow PDF format for sample representation (please refer to section 4.8.2 ‘Sample Representation’ in PDF Reference Manual for details).
width (int) –
- The width of the image, in samples.
height (int) –
- The height of the image, in samples.
bpc (int) –
- The number of bits used to represent each color component.
color_space (ColorSpace) –
- The color space in which image samples are represented.
encoder_hints (Obj, optional) –
- An optional parameter that can be used to fine tune
compression or to select a different compression algorithm. See Image::Create() for details.

Return type:

Returns:

PDF::Image object representing the embedded image.
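
When passing an uncompressed buffer, it is useful to check its length against the declared width, height, and bpc before calling Create. The sketch below assumes the PDF sample-representation convention that each row of samples starts on a byte boundary; raw_image_size and num_components are hypothetical names, and the number of components depends on the color space supplied.

```python
def raw_image_size(width, height, bpc, num_components):
    """Return the byte length of an uncompressed sample buffer.

    Each row is padded to a whole number of bytes (assumed PDF convention).
    """
    if bpc not in (1, 2, 4, 8, 16):
        raise ValueError("bpc must be 1, 2, 4, 8, or 16")
    bits_per_row = width * num_components * bpc
    bytes_per_row = (bits_per_row + 7) // 8  # round up to a byte boundary
    return bytes_per_row * height
```

A 10 x 10 DeviceRGB image at 8 bits per component therefore needs 300 bytes, while a 3 x 2 monochrome image at 1 bit per component needs 2 bytes (one padded byte per row).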

Overload 3:

Create and embed an Image. Embed the raw image data taking into account specified compression hints.

By default the function will compress all images using Flate compression. It is possible to fine tune compression or to select a different compression algorithm using ‘encoder_hints’ object.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. The ‘Doc’ object
can be obtained using Obj::GetDoc() or PDFDoc::GetSDFDoc().
buf (unsigned char) –
- The stream or buffer containing image data. The image data must
not be compressed and must follow PDF format for sample representation (please refer to section 4.8.2 ‘Sample Representation’ in PDF Reference Manual for details).
width (int) –
- The width of the image, in samples.
height (int) –
- The height of the image, in samples.
bpc (int) –
- The number of bits used to represent each color component.
color_space (ColorSpace) –
- The color space in which image samples are represented.
encoder_hints –
- An optional parameter that can be used to fine tune
compression or to select a different compression algorithm. See Image::Create() for details.

Return type:

Returns:

PDF::Image object representing the embedded image.

Overload 4:

Create and embed an Image. Embed the raw image data taking into account specified compression hints. Notes: see Image::Create for details.

Overload 5:

Create and embed an Image. Embed the raw image data taking into account specified compression hints. Notes: see Image::Create for details.

Overload 6:

Create and embed an Image. Embed the raw image data taking into account specified compression hints. Notes: see Image::Create for details.

Overload 7:

Create and embed an Image. Embed the raw image data taking into account specified compression hints. Notes: see Image::Create for details.

Overload 8:

Create and embed an Image. Embed the raw image data taking into account specified compression hints.

Notes: see Image::Create for details. PDFNet takes ownership of the filter

Overload 9:

Create and embed an Image. Embed the raw image data taking into account specified compression hints.

Notes: see Image::Create for details. PDFNet takes ownership of the filter

Overload 10:

Directly embed the image that is already compressed using the Image::InputFilter format. The function can be used to pass-through pre-compressed image data.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. The ‘Doc’ object
can be obtained using Obj::GetDoc() or PDFDoc::GetSDFDoc().
buf (string) –
- The stream or buffer containing compressed image data.
The compression format must match the input_format parameter.
width (int) –
- The width of the image, in samples.
height (int) –
- The height of the image, in samples.
bpc (int) –
- The number of bits used to represent each color component.
color_space (ColorSpace) –
- The color space in which image samples are specified.
input_format (int) –
- Image::InputFilter describing the format of pre-compressed
image data.

Return type:

Returns:

PDF::Image object representing the embedded image.

Overload 11:

Embed the raw image data taking into account specified compression hints. Notes: see the above method for details.

static CreateImageMask(args)[source]

Overload 1:

Create and embed an Image from any GDI+ Bitmap taking into account specified compression hints.

Notes: see Image::Create for details. This method is available only on Windows platforms.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. The ‘Doc’ object
can be obtained using Obj::GetDoc() or PDFDoc::GetSDFDoc().
bmp –
- GDI+ bitmap.

Return type:

Returns:

PDF::Image object representing the embedded image.

Create and embed an ImageMask. Embed the raw image data taking into account specified compression hints. The ImageMask can be used as a stencil mask for painting in the current color or as an explicit mask specifying which areas of the image to paint and which to mask out. One of the most important uses of stencil masking is for painting character glyphs represented as bitmaps.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. The ‘Doc’ object
can be obtained using Obj::GetDoc() or PDFDoc::GetSDFDoc().
buf (string) –
- The stream or buffer containing image data stored in 1 bit per
sample format. The image data must not be compressed and must follow PDF format for sample representation (please refer to section 4.8.2 ‘Sample Representation’ in PDF Reference Manual for details).
width (int) –
- The width of the image, in samples.
height (int) –
- The height of the image, in samples.
encoder_hints (Obj, optional) –
- An optional parameter that can be used to fine tune
compression or to select a different compression algorithm. See Image::Create() for details.

Return type:

Returns:

PDF::Image object representing the embedded ImageMask.

Overload 2:

Create and embed an ImageMask. Notes: see Image::CreateImageMask for details.

Overload 3:

Create and embed an ImageMask. Notes: see Image::CreateImageMask for details.

static CreateSoftMask(args)[source]

Overload 1:

Create and embed a Soft Mask. Embed the raw image data taking into account specified compression hints. A soft-mask image (see “Soft-Mask Images” in PDF Reference Manual) is used as a source of mask shape or mask opacity values in the transparent imaging model.

Parameters:

doc (SDFDoc) –
- A document to which the image should be added. The ‘Doc’ object
can be obtained using Obj::GetDoc() or PDFDoc::GetSDFDoc().
buf (string) –
- The stream or buffer containing image data represented in
DeviceGray color space (i.e. one component per sample). The image data must not be compressed and must follow PDF format for sample representation (please refer to section 4.8.2 ‘Sample Representation’ in PDF Reference Manual for details).
width (int) –
- The width of the image, in samples.
height (int) –
- The height of the image, in samples.
bpc (int) –
- The number of bits used to represent each color component.
encoder_hints (Obj, optional) –
- An optional parameter that can be used to fine tune
compression or to select a different compression algorithm. See Image::Create() for details.

Notes: this feature is available only in PDF 1.4 and higher.

Overload 2:

Create and embed a Soft Mask. Embed the raw image data taking into account specified compression hints. Notes: see Image::CreateSoftMask for details.

Overload 3:

Create and embed a Soft Mask. Embed the raw image data taking into account specified compression hints. Notes: see Image::CreateSoftMask for details.

Export(args)[source]

Overload 1:

Saves this image to a file.

The output image format (TIFF, JPEG, or PNG) will be automatically selected based on the properties of the embedded image. For example, if the embedded image is using CCITT Fax compression, the output format will be TIFF. Similarly, if the embedded image is using JPEG compression the output format will be JPEG. If your application needs to explicitly control output image format you may want to use ExportAsTiff() or ExportAsPng().

Parameters:: filename (string) – string that specifies the path name for the saved image. The filename should not include the extension which will be appended to the filename string based on the output format.
Return type:: int
Returns:: the number indicating the selected image format: (0 - PNG, 1 - TIF, 2 - JPEG).

Overload 2:

Saves this image to the output stream.

Parameters:: writer (FilterWriter) – A pointer to FilterWriter used to write to the output stream. If the parameter is null, nothing will be written to the output stream, but the function returns the format identifier.
Return type:: int
Returns:: the number indicating the selected image format: (0 - PNG, 1 - TIF, 2 - JPEG).

Notes: see the overloaded Image::Export method for more information.

ExportAsPng(args)[source]

Overload 1:

Saves this image to a PNG file.

Parameters:: filename (string) – string that specifies the path name for the saved image. The filename should include the file extension

Overload 2:

Saves this image to a PNG output stream.

Parameters:: writer (FilterWriter) – FilterWriter used to write to the output stream.

ExportAsTiff(args)[source]

Overload 1:

Saves this image to a TIFF file.

Parameters:: filename (string) – string that specifies the path name for the saved image. The filename should include the file extension

Overload 2:

Saves this image to a TIFF output stream.

Parameters:: writer (FilterWriter) – FilterWriter used to write to the output stream.

GetBitsPerComponent()[source]

Return type:: int
Returns:: the number of bits used to represent each color component. Only a single value may be specified; the number of bits is the same for all color components. Valid values are 1, 2, 4, 8, and 16.

GetComponentNum()[source]

Return type:: int
Returns:: the number of color components per sample.

GetDecodeArray()[source]

Return type:: Obj
Returns:: Decode array or NULL if the parameter is not specified. A decode object is an array of numbers describing how to map image samples into the range of values appropriate for the image’s color space. If ImageMask is true, the array must be either [0 1] or [1 0]; otherwise, its length must be twice the number of color components required by ColorSpace. The default value depends on the color space; see Table 4.36 in PDF Ref. Manual.
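
The role of a decode array can be sketched in a few lines of plain Python. The example assumes the linear mapping used by the PDF imaging model, in which a sample in the range 0 to 2^bpc - 1 is interpolated between the Dmin and Dmax pair for its component; decode_sample and expected_decode_length are hypothetical helper names, not SDK functions.

```python
def decode_sample(sample, bpc, d_min, d_max):
    """Map a raw sample to its color value using one [d_min d_max] pair."""
    max_sample = (1 << bpc) - 1
    if not 0 <= sample <= max_sample:
        raise ValueError("sample out of range for the given bpc")
    return d_min + sample * (d_max - d_min) / max_sample


def expected_decode_length(num_components, is_image_mask=False):
    """Length rule stated above: 2 for image masks, else 2 per component."""
    return 2 if is_image_mask else 2 * num_components
```

With the pair [1 0], decode_sample inverts a 1-bit image: sample 0 maps to 1 and sample 1 maps to 0.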

GetImageColorSpace()[source]

Convert PDF image to GDI+ Bitmap.

Return type:: ColorSpace
Returns:: GDI+ bitmap from this image. PDFNet creates a GDI+ bitmap that closely matches the original image in terms of the image depth and the number of color channels. PDF color spaces that do not have a counterpart in GDI+ are converted to RGB.

Notes: This method is available only on Windows platforms.

Return type:

ColorSpace

Returns:

The SDF object representing the color space in which image samples are specified or NULL if:

the image is an image mask

or is compressed using JPXDecode with missing ColorSpace entry in image dictionary.

The returned color space may be any type of color space except Pattern.

GetImageData()[source]

Return type:: Filter
Returns:: A stream (filter) containing decoded image data

GetImageDataSize()[source]

Return type:: int
Returns:: the size of image data in bytes

GetImageHeight()[source]

Return type:: int
Returns:: the height of the image, in samples.

GetImageRenderingIntent()[source]

Return type:: int
Returns:: The color rendering intent to be used in rendering the image.

GetImageWidth()[source]

Return type:: int
Returns:: the width of the image, in samples.

GetMask()[source]

Return type:: Obj
Returns:: an image XObject defining an image mask to be applied to this image (See ‘Explicit Masking’, 4.8.5), or an array specifying a range of colors to be applied to it as a color key mask (See ‘Color Key Masking’).

If IsImageMask() returns true, this method will return NULL.

GetSDFObj()[source]

Return type:: Obj
Returns:: the underlying SDF/Cos object

GetSoftMask()[source]

Return type:: Obj
Returns:: an image XObject defining a Soft Mask to be applied to this image (See section 7.5.4 ‘Soft-Mask Images’ in PDF Reference Manual), or NULL if the image does not have the soft mask.

IsImageInterpolate()[source]

Return type:: boolean
Returns:: a boolean indicating whether image interpolation is to be performed.

IsImageMask()[source]

Return type:: boolean
Returns:: a boolean indicating whether the inline image is to be treated as an image mask.

IsValid()[source]

Return type:: boolean
Returns:: whether this is a valid raster image. If the function returns false the underlying SDF/Cos object is not a valid raster image and this Image object should be treated as null.

SetMask(args)[source]

Overload 1:

Set an Explicit Image Mask.

Parameters:: image_mask (Image) – An Image object which serves as an explicit mask for the base (this) image. The base image and the image mask need not have the same resolution (Width and Height values), but since all images are defined on the unit square in user space, their boundaries on the page will coincide; that is, they will overlay each other. The image mask indicates which places on the page are to be painted and which are to be masked out (left unchanged). Unmasked areas are painted with the corresponding portions of the base image; masked areas are not.

Notes: image_mask must be a valid image mask (i.e. image_mask.IsImageMask() must return true).

Overload 2:

Set a Color Key Mask.

Parameters:: mask (Obj) – a Cos/SDF array specifying a range of colors to be masked out. Samples in the image that fall within this range are not painted, allowing the existing background to show through. The effect is similar to that of the video technique known as chroma-key. For details of the array format please refer to section 4.8.5 ‘Color Key Masking’ in PDF Reference Manual.

Notes: the current document takes the ownership of the given SDF object.
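
The color key test can be modeled in plain Python. The sketch assumes the array layout given in the PDF Reference, [min1 max1 ... minN maxN], with one pair per color component, and treats a sample as masked out only when every component lies within its pair; is_masked_out is a hypothetical helper operating on raw (undecoded) component values.

```python
def is_masked_out(sample, key_mask):
    """Return True if the sample falls inside the color key range.

    sample: sequence of N raw component values.
    key_mask: flat sequence [min1, max1, ..., minN, maxN] (assumed layout).
    """
    if len(key_mask) != 2 * len(sample):
        raise ValueError("key_mask needs one (min, max) pair per component")
    return all(
        key_mask[2 * i] <= component <= key_mask[2 * i + 1]
        for i, component in enumerate(sample)
    )
```

For an 8-bit RGB image, the mask [0, 10, 240, 255, 0, 10] would leave near-pure green samples unpainted, which is the chroma-key effect mentioned above.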

SetSoftMask(soft_mask)[source]

Set a Soft Mask.

Parameters:: soft_mask (Image) – is a subsidiary Image object defining a soft-mask image (See section 7.5.4 ‘Soft-Mask Images’ in PDF Reference Manual) to be used as a source of mask shape or mask opacity values in the transparent imaging model. The alpha source parameter in the graphics state determines whether the mask values are interpreted as shape or opacity.

e_ascii_hex = 6

e_flate = 3

e_g3 = 4

e_g4 = 5

e_jp2 = 2

e_jpeg = 1

e_none = 0

property mp_image

property thisown: The membership flag

class apryse_sdk.Image2RGB(args)[source]

Bases: Filter

Image2RGB is a filter that can decompress and normalize any PDF image stream (e.g. monochrome, CMYK, etc) into a raw RGB pixel stream.

property thisown: The membership flag

class apryse_sdk.Image2RGBA(args)[source]

Bases: Filter

Image2RGBA is a filter that can decompress and normalize any PDF image stream (e.g. monochrome, CMYK, etc) into a raw RGBA pixel stream.

property thisown: The membership flag

class apryse_sdk.ImageSettings[source]

Bases: object

A class that stores downsampling/recompression settings for color and grayscale images.

ForceChanges(force)[source]

Sets whether image changes that grow the PDF file should be kept. This is off by default.

Parameters:: force (boolean) – if true all image changes will be kept.

ForceRecompression(force)[source]

Sets whether recompression to the specified compression method should be forced when the image is not downsampled. By default the compression method for these images will not be changed.

Parameters:: force (boolean) – if true the compression method for all images will be changed to the specified compression mode

SetCompressionMode(mode)[source]

Sets the output compression mode for this type of image. The default value is e_retain.

Parameters:: mode (int) – the compression mode to set

SetDownsampleMode(mode)[source]

Sets the downsample mode for this type of image. The default value is e_default, which will allow downsampling of images.

Parameters:: mode (int) – the downsample mode to set

SetImageDPI(maximum, resampling)[source]

Sets the maximum and resampling dpi for images. By default these are set to 144 and 96 respectively.

Parameters:

maximum (double) – the highest dpi of an image before it will be resampled
resampling (double) – the image dpi to resample to if an image is encountered over the maximum dpi
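
The relationship between the two parameters can be expressed as a small decision rule. This is a simplified sketch of the behavior described above, using the documented defaults of 144 and 96; output_dpi is a hypothetical helper, and it ignores other settings such as the downsample mode or ForceChanges.

```python
def output_dpi(image_dpi, maximum=144.0, resampling=96.0):
    """Return the dpi an image would end up with under the rule above.

    Images above the maximum are resampled to the resampling dpi;
    all other images keep their original dpi.
    """
    return resampling if image_dpi > maximum else image_dpi
```

With the defaults, a 300 dpi image would be resampled to 96 dpi, while a 120 dpi image would be left alone.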

SetQuality(quality)[source]: Sets the quality for lossy compression modes, from 1 to 10, where 10 is lossless (if possible). The default value is 5.

e_default = 1

e_flate = 1

e_jpeg = 2

e_jpeg2000 = 3

e_none = 4

e_off = 0

e_retain = 0

property thisown: The membership flag

class apryse_sdk.Ink(args)[source]

Bases: Markup

An ink annotation (PDF 1.3) represents a freehand “scribble” composed of one or more disjoint paths. When opened, it shall display a pop-up window containing the text of the associated note.

static Create(doc, pos)[source]

Creates a new Ink annotation in the specified document.

Parameters:

doc (SDFDoc) – A document to which the Ink annotation is added.
pos (Rect) – A rectangle specifying the Ink annotation’s bounds in default user space units.

Return type:

Ink

Returns:

A newly created blank Ink annotation.

static CreateAnnot(doc, pos)[source]

Erase(pt1, pt2, width)[source]

Erases the rectangular area formed by the segment from pt1 to pt2 and the given eraser half-width.

Parameters:

pt1 (Point) – A point object that is one end of the eraser segment
pt2 (Point) – A point object that is the other end of the eraser segment
width (double) – The half width of the eraser

Return type:

boolean

Returns:

Whether an ink stroke was erased

GetHighlightIntent()[source]

Retrieves whether the Ink will draw like a highlighter.

Return type:: boolean
Returns:: true if the Ink will draw like a highlighter. (multiply blend mode) If false it will draw in normal mode. (normal blend mode)

GetPathCount()[source]

Returns number of paths in the annotation.

Return type:: int
Returns:: An integer representing the number of paths in the ‘InkList’ entry of the annotation dictionary.

Notes: Each path is an array of Point objects specifying points along the path. When drawn, the points shall be connected by straight lines or curves in an implementation-dependent way.

GetPoint(pathindex, pointindex)[source]

Returns the specific point in a given path.

Parameters:

pathindex (int) – index of the path for which the point is returned. Index starts at 0.
pointindex (int) – index of point in the path. Index starts at 0.

Return type:

Point

Returns:

A Point object for specified path and point index.

Notes: Each path is an array of Point objects specifying points along the path. When drawn, the points shall be connected by straight lines or curves in an implementation-dependent way.

GetPointCount(pathindex)[source]

Returns number of points in a certain stroked path in the InkList.

Parameters:: pathindex (int) – Index of the path for which the point count is returned. Index starts at 0.
Return type:: int
Returns:: An integer representing the number of points in the stroked path of the Ink list.
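GetPathCount(), GetPointCount() and GetPoint() are typically used together to walk the whole InkList. The sketch below shows that loop in Python. It relies only on the three method names documented above; `collect_ink_paths` and `FakeInk` are hypothetical helpers, and `FakeInk` is a stand-in that returns plain tuples so the example runs without the SDK.

```python
def collect_ink_paths(ink):
    """Return every path of an Ink-like object as a list of points."""
    paths = []
    for path_index in range(ink.GetPathCount()):
        point_count = ink.GetPointCount(path_index)
        paths.append([ink.GetPoint(path_index, i) for i in range(point_count)])
    return paths


class FakeInk:
    """Hypothetical stand-in for an Ink annotation, used only for this demo."""

    def __init__(self, paths):
        self._paths = paths

    def GetPathCount(self):
        return len(self._paths)

    def GetPointCount(self, pathindex):
        return len(self._paths[pathindex])

    def GetPoint(self, pathindex, pointindex):
        return self._paths[pathindex][pointindex]


demo = FakeInk([[(0, 0), (10, 5)], [(3, 3)]])
print(collect_ink_paths(demo))
```

With a real Ink annotation the same function should work unchanged, returning Point objects instead of tuples.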

|Language |Code|Language |Code |

Here are the languages supported out of the box per engine:

|Executable Name|Supported Codes |

Note: OCRModuleIRIS allows mix of a single Asian language and just English.

Parameters:: lang (string) – The new language to be added to Langs.
Return type:: OCROptions
Returns:: This object, for call chaining.

AddTextZonesForPage(regions, page_num)[source]

Adds the zones to the TextZones array.

This is an optional list of known text zones that will be used to improve OCR quality.

|Value|Origin |+Y axis dir.|Unit |Size |Expected input|

Note that the OCRModule backend has no knowledge of the SDK’s real input. Unless explicitly instructed via this call and value, it works in its default mode and reports coordinates as for false. This matters when the OCR request comes from GetOCRJsonFromPDF() or GetOCRXmlFromPDF(), where the results should be correct in JSON or XML format, so that the user does not need to make additional adjustments.

See also: AddIgnoreZonesForPage() AddTextZonesForPage() AddDPI() GetOCRJsonFromImage() GetOCRXmlFromImage().

Return type:: boolean
Returns:: The current value for UsePDFPageCoords.

SetAutoRotate(value)[source]

Sets the value for AutoRotate in the options object.

Default value is false. Setting to true will deskew the image before conducting OCR.

Note: This function doesn’t apply to IRIS OCR module.

Parameters:: value (boolean) – The new value for AutoRotate.
Return type:: OCROptions
Returns:: This object, for call chaining.

SetIgnoreExistingText(value)[source]

Sets the value for IgnoreExistingText in the options object.

Default value is false, so that areas with existing text are automatically skipped during OCR. Setting it to true causes pre-existing text to be duplicated by the OCR-ed text in the PDF document or in the GetOCRJsonFromPDF() and GetOCRXmlFromPDF() results.

Parameters:: value (boolean) – The new value for IgnoreExistingText.
Return type:: OCROptions
Returns:: This object, for call chaining.

SetOCREngine(value)[source]

Sets the value for OCREngine in the options object.

Options include ‘default’ or ‘iris’. Chosen module must be present and correctly licensed.

Parameters:: value (string) – The new value for OCREngine.
Return type:: OCROptions
Returns:: This object, for call chaining.

SetUsePDFPageCoords(value)[source]

Sets the value for UsePDFPageCoords in the options object.

Defines the coordinate system, scaling and units. SDK and OCRModule will refer to this setting while dealing with potential input zone rectangle(s) (ignorable or text) and with the output result as well. The default value is false which corresponds to raster image input.

Here are the meanings for value:

|Value|Origin |+Y axis dir.|Unit |Size |Expected input|

Note that the OCRModule backend has no knowledge of the SDK’s real input. Unless explicitly instructed via this call and value, it works in its default mode and reports coordinates as for false. This matters when the OCR request comes from GetOCRJsonFromPDF() or GetOCRXmlFromPDF(), where the results should be correct in JSON or XML format, so that the user does not need to make additional adjustments.

See also: AddIgnoreZonesForPage() AddTextZonesForPage() AddDPI() GetOCRJsonFromImage() GetOCRXmlFromImage().

Parameters:: value (boolean) – The new value for UsePDFPageCoords.
Return type:: OCROptions
Returns:: This object, for call chaining.

property thisown: The membership flag

class apryse_sdk.Obj(args)[source]

Bases: object

Obj is a concrete class for all SDF/Cos objects. Obj hierarchy implements the composite design pattern. As a result, you can invoke a member function of any ‘derived’ object through Obj interface. If the member function is not supported (e.g. if you invoke Obj::GetNumber() on a boolean object) an Exception will be thrown.

You can use GetType() or obj.Is???() member functions to find out type information at run time; however, most of the time the type can be inferred from the PDF specification. For example, when you call Doc::GetTrailer() you can assume that the returned object is a dictionary. If there is any ambiguity, use the Is???() methods.

Objects can’t be shared across documents; however, you can use Doc::ImportObj() to copy objects from one document to another.

Objects can be shared within a document provided that they are created as indirect. Indirect objects are the ones that are referenced in the cross-reference table. To create an object as indirect, use doc.CreateIndirect???() (where ??? is the object type).

static CreateInternal(impl)[source]

Erase(args)[source]

Overload 1:

Removes an element in the dictionary that matches the given key.

Parameters:: key (string) – A string representing the key value of the element to remove.
Raises:: An Exception is thrown if this is not a dictionary or a stream.

Overload 2:

Removes an element in the dictionary from specified position.

Parameters:: pos (DictIterator) – A dictionary iterator indicating the position of the element to remove.
Raises:: An Exception is thrown if this is not a dictionary or a stream.

EraseAt(pos)[source]

Checks whether the position is within the array bounds, then removes the element at that position, moves each subsequent element to the slot with the next smaller index, and decrements the array’s length by 1.

Parameters:: pos (int) – The index for the array member to remove. Array indexes start at 0.
Raises:: An Exception is thrown if this is not an Obj::Type::e_array

Find(key)[source]

Search the dictionary for a given key.

Parameters:

key (string) –

a key to search for in the dictionary

Return type:

DictIterator

Returns:

The iterator to the matching key/value pair, or an invalid iterator (i.e. itr.HasCurrent()==false) if the dictionary does not contain the given key.

Notes: A dictionary entry whose value is Obj::Null is equivalent to an absent entry.

Raises:

An Exception is thrown if this is not a dictionary or a stream.

Sample code used to search a dictionary for a given key:

DictIterator itr = info_dict.Find("Info");
if (itr.HasCurrent()) {
    Obj info = itr.Value();
    if (info.IsDict())
        info.PutString("Producer", "PDFTron PDFNet SDK");
}
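The same lookup can be sketched in Python. The function below uses only the method names that appear in this reference (Find, HasCurrent, Value, IsDict, PutString). `set_producer`, `FakeIterator` and `FakeDict` are hypothetical names: the two classes are minimal stand-ins that let the example run without the SDK and are not part of it.

```python
def set_producer(trailer, producer):
    """Write Info/Producer if the trailer holds an Info dictionary.

    Returns True when the entry was written, False otherwise.
    """
    itr = trailer.Find("Info")
    if not itr.HasCurrent():
        return False
    info = itr.Value()
    if not info.IsDict():
        return False
    info.PutString("Producer", producer)
    return True


class FakeIterator:
    """Hypothetical stand-in for DictIterator (single entry or invalid)."""

    def __init__(self, value=None):
        self._value = value

    def HasCurrent(self):
        return self._value is not None

    def Value(self):
        return self._value


class FakeDict:
    """Hypothetical stand-in for a dictionary Obj."""

    def __init__(self):
        self.entries = {}

    def IsDict(self):
        return True

    def Find(self, key):
        return FakeIterator(self.entries.get(key))

    def PutString(self, key, value):
        self.entries[key] = value


trailer = FakeDict()
trailer.entries["Info"] = FakeDict()
print(set_producer(trailer, "Example Producer"))
```

Checking HasCurrent() before calling Value() mirrors the guard in the sample above and avoids dereferencing an invalid iterator.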

FindObj(key)[source]

Search the dictionary for a given key.

Parameters:

key (string) –

a key to search for in the dictionary

Return type:

Returns:

NULL if the dictionary does not contain the specified key. Otherwise, the corresponding value.

Notes: A dictionary entry whose value is Obj::Null is equivalent to an absent entry.

Raises:

An Exception is thrown if this is not a dictionary or a stream.

Get(key)[source]

Search the dictionary for a given key and throw an exception if the key is not found.

Parameters:

key (string) –

a key to search for in the dictionary

Return type:

DictIterator

Returns:

An Obj::Null object if the value matching the specified key is an Obj::Null object; otherwise, the iterator to the matching key/value pair.

Raises:

An Exception is thrown if the dictionary does not contain the specified key.

Raises:

An Exception is thrown if this is not a Obj::Type::e_dict or a stream.

GetAsPDFText()[source]

Convert the SDF/Cos String object to ‘PDF Text String’ (a Unicode string).

PDF Text Strings are not used to represent page content, however they are used in text annotations, bookmark names, article names, document information etc. These strings are encoded in either PDFDocEncoding or Unicode character encoding. For more information on PDF Text Strings, please refer to section 3.8.1 ‘Text Strings’ in PDF Reference.

Notes: Not all SDF/Cos String objects are used to represent ‘PDF Text’. The PDF Reference indicates (on a case-by-case basis) where an SDF/Cos String object can be used as ‘PDF Text’.

Raises:: An Exception is thrown if this is not a Obj::Type::e_string.

GetAt(index)[source]

Parameters:

index (int) –

The array element to obtain. The first element in an array has an index of zero.

Raises:

An Exception is thrown if index is outside the array bounds.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array.

GetBool()[source]

Return type:: boolean
Returns:: bool value if this is Bool.
Raises:: Exception is thrown if the object is not Obj::Type::e_bool

GetBuffer()[source]

Return type:: std::vector< unsigned char,std::allocator< unsigned char > >
Returns:: a pointer to the string buffer. Please note that the string may not be NULL terminated and that it may not be represented in ASCII or Unicode encoding. For more information on SDF/Cos String objects, please refer to section 3.2.3 ‘String Objects’ in PDF Reference Manual.

Notes: if SDF/Cos String object is represented as ‘PDF Text’ (Section 3.8.1 ‘Text Strings’ in PDF Reference) you can use GetAsPDFText method to obtain Unicode representation of the string.

Use the Size() member function to obtain the number of bytes in the string buffer.

Raises:: Exception is thrown if this is not a Obj::Type::e_string.

GetDecodedStream()[source]

Return type:

Returns:

A filter to the decoded stream

Raises:

An Exception is thrown if this is not a Obj::Type::e_stream

GetDictIterator()[source]

Return type:: DictIterator
Returns:: an iterator that addresses the first element in the dictionary.
Raises:: An Exception is thrown if this is not a dictionary object (Dict).

Sample code used to traverse all entries in the dictionary:

DictIterator itr = dict.GetDictIterator();
while (itr.HasCurrent()) {
    Obj key = itr.Key();
    Obj value = itr.Value();
    // ...
    itr.Next();
}
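A Python version of the traversal might look like the sketch below. It uses the iterator methods shown in the sample (HasCurrent, Key, Value, Next) and GetDictIterator from this class. `dict_entries`, `FakeDictIterator` and `FakeDict` are hypothetical names, and the stand-ins return plain strings where the SDK would return Obj instances.

```python
def dict_entries(dict_obj):
    """Collect (key, value) pairs by walking a DictIterator-style cursor."""
    entries = []
    itr = dict_obj.GetDictIterator()
    while itr.HasCurrent():
        entries.append((itr.Key(), itr.Value()))
        itr.Next()
    return entries


class FakeDictIterator:
    """Hypothetical stand-in for DictIterator over a list of pairs."""

    def __init__(self, pairs):
        self._pairs = list(pairs)
        self._pos = 0

    def HasCurrent(self):
        return self._pos < len(self._pairs)

    def Key(self):
        return self._pairs[self._pos][0]

    def Value(self):
        return self._pairs[self._pos][1]

    def Next(self):
        self._pos += 1


class FakeDict:
    """Hypothetical stand-in for a dictionary Obj."""

    def __init__(self, pairs):
        self._pairs = pairs

    def GetDictIterator(self):
        return FakeDictIterator(self._pairs)


sample = FakeDict([("Type", "Catalog"), ("Lang", "en")])
print(dict_entries(sample))
```

Forgetting the Next() call is the usual mistake with this pattern, since the loop then never terminates.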

GetDoc()[source]

Return type:: SDFDoc
Returns:: the document to which this object belongs.

Notes: this method can be invoked on any Obj.

GetGenNum()[source]

Return type:: int
Returns:: generation number. If this is not an Indirect object, generation number of a containing indirect object is returned.

Notes: this method can be invoked on any Obj.

GetHandleInternal()[source]

GetName()[source]

Return type:: string
Returns:: string representing the Name object.
Raises:: An Exception is thrown if this is not a Obj::Type::e_name

GetNumber()[source]

Return type:: double
Returns:: value, if this is Number.
Raises:: An Exception is thrown if the object is not a Obj::Type::e_number

GetObjNum()[source]

Return type:: int
Returns:: object number. If this is not an Indirect object, object number of a containing indirect object is returned.

Notes: this method can be invoked on any Obj.

GetOffset()[source]

Return type:: int
Returns:: object offset from the beginning of the file. If this is not an Indirect object, offset of a containing indirect object is returned.

Notes: this method can be invoked on any Obj.

GetRawBuffer()[source]

Return type:: std::vector< unsigned char,std::allocator< unsigned char > >
Returns:: a vector containing the encrypted string buffer.

Notes: Similar in behaviour to GetBuffer except that no decryption is done. If the file is not encrypted the result should be the same as GetBuffer

Raises:: Exception is thrown if this is not a Obj::Type::e_string.

GetRawStream(decrypt)[source]

Parameters:

decrypt (boolean) –

If true decrypt the stream if the stream is encrypted.

Return type:

Returns:

A filter to the encoded stream

Raises:

An Exception is thrown if this is not a Obj::Type::e_stream

GetRawStreamLength()[source]

Return type:: int
Returns:: the length of the raw/encoded stream equal to the Length parameter
Raises:: An Exception is thrown if this is not a Obj::Type::e_stream

GetType()[source]

Return type:: int
Returns:: the object type.

Notes: this method can be invoked on any Obj.

Insert(pos, obj)[source]

Inserts an existing Obj in this array.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
obj (Obj) – The value to be inserted into the array. If ‘obj’ is an indirect (i.e. shared) object it will be inserted by reference; otherwise the object will be cloned and then inserted.

Return type:

Returns:

A newly inserted object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array
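The position rule shared by Insert() and the Insert???() family can be modelled on a plain Python list. This is an illustrative sketch and not SDK code: `insert_before` is a hypothetical helper, and it assumes pos is non-negative, since negative positions are not described in this reference.

```python
def insert_before(items, pos, value):
    """Model of the Insert position rule on a plain Python list."""
    if pos >= len(items):
        items.append(value)       # pos at or past the end: append
    else:
        items.insert(pos, value)  # otherwise insert before items[pos]
    return items


arr = ["a", "b", "c"]
insert_before(arr, 0, "first")   # becomes the new element 0
insert_before(arr, 2, "mid")     # lands before the old element 2
insert_before(arr, 99, "last")   # out of range, so it is appended
print(arr)
```

An out-of-range pos is therefore a convenient way to append without querying the array length first.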

InsertArray(pos)[source]

Inserts an Obj::Type::e_array object in the array.

Return type:

Returns:

A newly created array object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertBool(pos, value)[source]

Inserts an Obj::Type::e_bool object in the array.

Return type:

Returns:

A newly created boolean object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
value (boolean) – The value of the Obj::Type::e_bool object to be inserted.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertDict(pos)[source]

Inserts an Obj::Type::e_dict object in the array.

Return type:

Returns:

A newly created dictionary object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertMatrix(pos, value)[source]

Inserts an array of 6 numbers in this array.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
value (Matrix2D) – A matrix used to set the values in an array of six numbers. The resulting array will then be inserted in this array.

Return type:

Returns:

A newly created array object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertName(pos, name)[source]

Inserts an Obj::Type::e_name object in the array.

Return type:

Returns:

A newly created name object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
name (string) – The value of the Obj::Type::e_name object to be inserted.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertNull(pos)[source]

Inserts an Obj::Type::e_null object in the array.

Return type:

Returns:

A newly created null object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertNumber(pos, value)[source]

Inserts an Obj::Type::e_number object in the array.

Return type:

Returns:

A newly created number object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
value (double) – The value of the Obj::Type::e_number object to be inserted.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertRect(pos, x1, y1, x2, y2)[source]

Inserts an array of 4 numbers in this array.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
x1 (double) – The bottom left x value of the rect to be inserted
y1 (double) – The bottom left y value of the rect to be inserted
x2 (double) – The top right x value of the rect to be inserted
y2 (double) – The top right y value of the rect to be inserted

Return type:

Returns:

A newly created array object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertString(args)[source]

Overload 1:

Inserts an Obj::Type::e_string object in the array.

Return type:

Returns:

A newly created string object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
value (string) – The value of the Obj::Type::e_string object to be inserted.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

Overload 2:

Inserts an Obj::Type::e_string object in the array.

Return type:

Returns:

A newly created string object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
value (string) – The buffer used to set the value of the Obj::Type::e_string object to be inserted.
size (int) – The number of bytes to copy from the ‘value’ buffer parameter.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

InsertText(pos, value)[source]

Inserts an Obj::Type::e_string object in the array.

Return type:

Returns:

A newly created string object.

Parameters:

pos (int) – The location in the array to insert the object. The object is inserted before the specified location. The first element in an array has a pos of zero. If pos >= Array->Length(), appends obj to array.
value (string) – The value of the Obj::Type::e_string object to be inserted.

Notes: InsertText will create the string object as a ‘PDF Text’ object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

IsArray()[source]

Return type:: boolean
Returns:: true if this is an Array, false otherwise.

Notes: this method can be invoked on any Obj.

IsBool()[source]

Return type:: boolean
Returns:: true if this is a Bool object, false otherwise.

Notes: this method can be invoked on any Obj.

IsContainer()[source]

Return type:: boolean
Returns:: true if this is a Container (a dictionary, array, or a stream), false otherwise.

Notes: this method can be invoked on any Obj.

IsDict()[source]

Return type:: boolean
Returns:: true if this is a dictionary (i.e. Dict), false otherwise.

Notes: this method can be invoked on any Obj.

IsEqual(to)[source]

Return type:

boolean

Returns:

true if two Obj’s point to the same object. This method does not compare object content. For this operation use IsEqualValue() instead.

Parameters:

to (Obj) –

Obj to compare to

IsFree()[source]

Return type:: boolean
Returns:: true if the object is in use or is marked as free.

Notes: this method can be invoked on any Obj.

IsIndirect()[source]

Return type:: boolean
Returns:: true if this is Indirect object (i.e. object referenced in the cross-reference table), false otherwise.

Notes: this method can be invoked on any Obj.

IsLoaded()[source]

Return type:: boolean
Returns:: true if the object is loaded in memory.

Notes: this method can be invoked on any Obj.

IsMarked()[source]

Return type:: boolean
Returns:: true if the object is marked.

Notes: this method can be invoked on any Obj.

IsName()[source]

Return type:: boolean
Returns:: true if this is Name, false otherwise.

Notes: this method can be invoked on any Obj.

IsNull()[source]

Return type:: boolean
Returns:: true if this is a Null object, false otherwise.

Notes: this method can be invoked on any Obj.

IsNumber()[source]

Return type:: boolean
Returns:: true if this is a Number object, false otherwise.

Notes: this method can be invoked on any Obj.

IsStream()[source]

Return type:: boolean
Returns:: true if this is a Stream, false otherwise.

Notes: this method can be invoked on any Obj.

IsString()[source]

Return type:: boolean
Returns:: true if this is a Str (String) object, false otherwise.

Notes: this method can be invoked on any Obj.

IsValid()[source]

Return type:: boolean
Returns:: true if this is a valid object, false otherwise. If the function returns false then the underlying SDF/Cos object is null or is not valid.

PushBack(obj)[source]

Appends an existing Obj at the end of the array.

Parameters:: obj (Obj) – The value to be appended to the array. If ‘obj’ is an indirect (i.e. shared) object it will be inserted by reference; otherwise the object will be cloned and then appended.
Return type:: Obj
Returns:: A newly appended object.
Raises:: An Exception is thrown if this is not an Obj::Type::e_array

PushBackArray()[source]

Appends a new Obj::Type::e_array object at the end of the array.

Return type:: Obj
Returns:: The new array object.
Raises:: An Exception is thrown if this is not an Obj::Type::e_array

PushBackBool(value)[source]

Appends a new Obj::Type::e_bool object at the end of the array.

Return type:

Returns:

The new boolean object.

Parameters:

value (boolean) –

The value of the Obj::Type::e_bool object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

PushBackDict()[source]

Appends a new Obj::Type::e_dict object at the end of the array.

Return type:: Obj
Returns:: The new dictionary object.
Raises:: An Exception is thrown if this is not an Obj::Type::e_array

PushBackMatrix(value)[source]

Appends an array of 6 numbers at the end of the array.

Parameters:

value (Matrix2D) – A matrix used to set the values in an array of six numbers. The resulting array will then be appended to this array.

Return type:

Returns:

A newly appended array object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

PushBackName(name)[source]

Appends a new Obj::Type::e_name object at the end of the array.

Return type:

Returns:

The new name object.

Parameters:

name (string) –

The value of the Obj::Type::e_name object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

PushBackNull()[source]

Appends a new Obj::Type::e_null object at the end of the array.

Return type:: Obj
Returns:: The new null object.
Raises:: An Exception is thrown if this is not an Obj::Type::e_array

PushBackNumber(value)[source]

Appends a new Obj::Type::e_number object at the end of the array.

Return type:

Returns:

The new number object.

Parameters:

value (double) –

The value of the Obj::Type::e_number object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

PushBackRect(x1, y1, x2, y2)[source]

Appends an array of 4 numbers at the end of the array.

Parameters:

x1 (double) – The bottom left x value of the rect to be inserted
y1 (double) – The bottom left y value of the rect to be inserted
x2 (double) – The top right x value of the rect to be inserted
y2 (double) – The top right y value of the rect to be inserted

Return type:

Returns:

A newly appended array object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

PushBackString(args)[source]

Overload 1:

Appends a new Obj::Type::e_string object at the end of the array.

Return type:

Returns:

The new string object.

Parameters:

value (string) –

The value of the Obj::Type::e_string object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

Overload 2:

Appends a new Obj::Type::e_string object at the end of the array.

Return type:

Returns:

The new string object.

Parameters:

value (string) – The buffer used to set the value of the Obj::Type::e_string object to be inserted.
size (int) – The number of bytes to copy from the ‘value’ buffer parameter.

Raises:

An Exception is thrown if this is not an Obj::Type::e_array

PushBackText(value)[source]

Appends a new Obj::Type::e_string object at the end of the array.

Return type:: Obj
Returns:: The new string object.
Parameters:: value (string) – The value of the Obj::Type::e_string object to be inserted.

Notes: PushBackText will create the string object as a ‘PDF Text’ object.

Raises:: An Exception is thrown if this is not an Obj::Type::e_array

Put(key, obj)[source]

Inserts a <key, Obj> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
obj (Obj) – The value to be inserted into the dictionary. If ‘obj’ is indirect (i.e. is a shared) object it will be inserted by reference, otherwise the object will be cloned and then inserted into the dictionary.

Return type:

Returns:

A newly inserted object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

PutArray(key)[source]

Inserts a <key, Obj::Type::e_array> pair in the dictionary.

Parameters:: key (string) – The key of the value to set.
Return type:: Obj
Returns:: A newly created array object.
Raises:: An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutBool(key, value)[source]

Inserts a <key, Obj::Type::e_bool> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
value (boolean) – The value of the Obj::Type::e_bool object to be inserted into the dictionary.

Return type:

Returns:

A newly created boolean object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutDict(key)[source]

Inserts a <key, Obj::Type::e_dict> pair in the dictionary.

Parameters:: key (string) – The key of the value to set.
Return type:: Obj
Returns:: A newly created dictionary.
Raises:: An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutMatrix(key, value)[source]

Inserts a <key, [a,b,c,d,h,v]> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
value (Matrix2D) – A matrix used to set the values in an array of six numbers. The resulting array will be inserted into the dictionary.

Return type:

Returns:

A newly created array object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutName(key, name)[source]

Inserts a <key, Obj::Type::e_name> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
name (string) – The value of the Obj::Type::e_name object to be inserted into the dictionary.

Return type:

Returns:

A newly created name object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutNull(key)[source]

Inserts a <key, Obj::Type::e_null> pair in the dictionary.

Parameters:: key (string) – The key of the value to set.
Raises:: An Exception is thrown if this is not a dictionary or a stream object.

Notes: The effect of calling this method is essentially the same as dict.Erase(key).
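The equivalence between a null-valued entry and an absent entry can be modelled with a plain Python dict. This is a sketch under stated assumptions and not SDK code: `NULL` is a hypothetical sentinel standing in for Obj::Null, and `lookup` is a hypothetical helper that mimics a lookup which treats null values as missing.

```python
NULL = object()  # hypothetical sentinel standing in for Obj::Null


def lookup(entries, key):
    """Return the value for key, treating a null value as an absent entry."""
    value = entries.get(key, NULL)
    return None if value is NULL else value


with_null = {"Title": "Report", "Author": "A. Writer"}
erased = dict(with_null)

with_null["Author"] = NULL  # models dict.PutNull("Author")
del erased["Author"]        # models dict.Erase("Author")

# Both dictionaries now answer the lookup in the same way.
print(lookup(with_null, "Author"), lookup(erased, "Author"))
```

In this model the two dictionaries differ only in their stored representation, which is the sense in which the two calls are "essentially the same".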

PutNumber(key, value)[source]

Inserts a <key, Obj::Type::e_number> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
value (double) – The value of the Obj::Type::e_number object to be inserted into the dictionary.

Return type:

Returns:

A newly created number object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutRect(key, x1, y1, x2, y2)[source]

Inserts a <key, [x1,y1,x2,y2]> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
x1 (double) – The bottom left x value of the rect to be inserted
y1 (double) – The bottom left y value of the rect to be inserted
x2 (double) – The top right x value of the rect to be inserted
y2 (double) – The top right y value of the rect to be inserted

Return type:

Returns:

A newly created array object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutString(args)[source]

Overload 1:

Inserts a <key, Obj::Type::e_string> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
value (string) – The value of the Obj::Type::e_string object to be inserted into the dictionary.

Return type:

Returns:

A newly created string object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

Overload 2:

Inserts a <key, Obj::Type::e_string> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
value (string) – The buffer used to set the value of the Obj::Type::e_string object to be inserted into the dictionary.
size (int) – The number of bytes to copy from the ‘value’ buffer parameter.

Return type:

Returns:

A newly created string object.

Raises:

An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

PutText(key, value)[source]

Inserts a <key, Obj::Type::e_string> pair in the dictionary.

Parameters:

key (string) – The key of the value to set.
value (string) – The value of the Obj::Type::e_string object to be inserted into the dictionary.

Notes: PutText will create the string object as a ‘PDF Text’ object.

Return type:: Obj
Returns:: A newly created string object.
Raises:: An Exception is thrown if this is not a dictionary or a stream object.

Notes: If a dictionary already contains an entry with the same key, the old entry will be deleted and all DictIterators to this entry will be invalidated.

Rename(old_key, new_key)[source]

Change the key value of a dictionary entry. The key can’t be renamed if another key with the same name already exists in the dictionary. In this case Rename returns false.

Parameters:

old_key (string) – A string representing the key value to be changed.
new_key (string) – A string representing the key value that the old key is changed into.

Raises:

An Exception is thrown if this is not a dictionary or a stream.
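The collision rule for Rename() can be modelled on a plain Python dict. This is an illustrative sketch and not SDK code: `rename_key` is a hypothetical helper, and its handling of a missing old_key (returning False) is an assumption, because that case is not described above.

```python
def rename_key(entries, old_key, new_key):
    """Model of Rename on a plain dict; returns False when new_key is taken."""
    if new_key in entries:
        return False  # documented case: the target key already exists
    if old_key not in entries:
        # Assumption: a missing old_key is reported as failure in this model.
        return False
    entries[new_key] = entries.pop(old_key)
    return True


info = {"Autor": "A. Writer", "Title": "Report"}
print(rename_key(info, "Autor", "Author"))  # target free, rename succeeds
print(rename_key(info, "Author", "Title"))  # target taken, nothing changes
print(info)
```

Checking the result lets a caller decide whether to erase the conflicting entry first or keep the existing one.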

SetBool(b)[source]

Parameters:

b (boolean) –

bool value used to set Bool object.

Raises:

An Exception is thrown if this is not a Obj::Type::e_bool

SetMark(mark)[source]

Set the object mark. Mark is a boolean value that can be associated with every indirect object. This is especially useful when an object graph should be traversed and an operation should be performed on each node only once.

Parameters:: mark (boolean) – boolean value that the object’s mark should be set to.

Notes: this method can be invoked on any Obj.
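
The "visit each node only once" use of the mark can be illustrated with a small stand-alone sketch. `Node` and `visit_once` are hypothetical stand-ins written for this example; with the SDK, the mark would be set through SetMark on real objects instead of a Python attribute.

```python
class Node:
    """Hypothetical stand-in for an indirect object that carries a boolean mark."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.marked = False


def visit_once(root):
    """Depth-first walk that uses the mark to skip nodes already processed."""
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.marked:
            continue  # already handled via another path or a cycle
        node.marked = True
        order.append(node.name)
        # Reverse so children are visited in their listed order.
        stack.extend(reversed(node.children))
    return order


shared = Node("font")
page1 = Node("page1", [shared])
page2 = Node("page2", [shared])
root = Node("pages", [page1, page2])
shared.children.append(root)  # back-reference, so the graph contains a cycle
visited = visit_once(root)
```

Because the mark persists on the nodes, it has to be cleared again before a second traversal of the same graph.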

SetName(name)[source]

Parameters:

name (string) – The value used to set the Name object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_name object.

SetNumber(n)[source]

Parameters:

n (double) – The value used to set the Number object.

Raises:

An Exception is thrown if this is not an Obj::Type::e_number object.

SetStreamData(args)[source]: allows to replace the content stream with a new one without creating a new object

SetString(args)[source]

Overload 1:

Sets the string object value.

Parameters:

value (UChar) – The character buffer.
size (int) – The size of the character buffer.

Raises:

An Exception is thrown if this is not an Obj::Type::e_string object.

Overload 2:

Sets the string object value.

Parameters:

str (string) – A Unicode string value.

Raises:

An Exception is thrown if this is not an Obj::Type::e_string object.

Size()[source]

Return type:

int

Returns:

the ‘size’ of the object. The definition of ‘size’ depends on the object type. In particular:

For a dictionary or a stream object, the method will return the number of key/value pairs in the dictionary.

For an array object the method will return the number of Obj entries in the array.

For a string object the method will return the number of bytes in the string buffer.

For any other object the method will always return 1.

Notes: this method can be invoked on any Obj.
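
The four Size rules above can be restated as a short model over plain Python values. `model_size` is a hypothetical helper for illustration only; it assumes dicts stand in for dictionaries and streams, lists for arrays, and `bytes` for string buffers.

```python
def model_size(value):
    """Apply the documented Size() rules to plain Python stand-ins."""
    if isinstance(value, dict):
        return len(value)   # dictionary or stream: number of key/value pairs
    if isinstance(value, (list, tuple)):
        return len(value)   # array: number of entries
    if isinstance(value, (bytes, bytearray)):
        return len(value)   # string: number of bytes in the buffer
    return 1                # any other object type


dict_size = model_size({"Type": "Page", "Rotate": 90})
array_size = model_size([0, 0, 612, 792])
string_size = model_size("café".encode("utf-8"))  # counted in bytes, not characters
number_size = model_size(3.14)
```

Note that the string rule counts bytes, so a value containing non-ASCII characters can report a size larger than its character count.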

Write(stream)[source]

Writes the Obj to the output stream.

Parameters:

stream (FilterWriter) – The output stream to which the Obj will be written.

Notes: this method can be invoked on any Obj.

e_array = 6

e_bool = 1

e_dict = 5

e_name = 3

e_null = 0

e_number = 2

e_stream = 7

e_string = 4
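
The type constants above pair naturally with the Raises clauses of the setter methods: each setter is documented as valid only for one object type. The following sketch collects that relationship in a lookup table. The dictionaries and `setter_for` are hypothetical helpers; only the constant values and method names come from this reference.

```python
# Type codes as listed above.
OBJ_TYPES = {
    "e_null": 0, "e_bool": 1, "e_number": 2, "e_name": 3,
    "e_string": 4, "e_dict": 5, "e_array": 6, "e_stream": 7,
}

# Setter documented as valid for each type; other types have no value setter here.
SETTER_FOR_TYPE = {
    OBJ_TYPES["e_bool"]: "SetBool",
    OBJ_TYPES["e_number"]: "SetNumber",
    OBJ_TYPES["e_name"]: "SetName",
    OBJ_TYPES["e_string"]: "SetString",
}


def setter_for(type_code):
    """Return the name of the matching setter, or None if none is documented."""
    return SETTER_FOR_TYPE.get(type_code)


name_setter = setter_for(OBJ_TYPES["e_name"])
dict_setter = setter_for(OBJ_TYPES["e_dict"])
```

Checking the type before calling a setter is one way to avoid the documented exception.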

property mp_obj

property thisown: The membership flag

class apryse_sdk.ObjSet(args)[source]

Bases: object

ObjSet is a lightweight container that can hold a collection of SDF objects.

CreateArray()[source]: Create a new array object in this object set.

CreateBool(value)[source]

Create a new boolean object in this object set.

Parameters:: value (boolean) – The boolean value of the object to create

CreateDict()[source]: Create a new dictionary object in this object set.

CreateFromJson(value)[source]

Parses a JSON string to create either a dictionary or an array in this object set.

Returns:: The created object, as an Obj.

CreateName(name)[source]

Create a new name object in this object set.

Parameters:: name (string) – The name of the object to create

CreateNull()[source]: Create a new null object in this object set.

CreateNumber(value)[source]

Create a new number object in this object set.

Parameters:: value (double) – The numeric value of the number object to create.

CreateString(value)[source]: Create a new string object in this object set. The unsigned string value of the string object to create.

Destroy()[source]: Frees the native memory of the object.

property thisown: The membership flag

class apryse_sdk.ObjectIdentifier(args)[source]

Bases: object

This class represents an object identifier (OID), as defined by ITU and used in X.509.

static CreateFromDigestAlgorithm(in_digest_algorithm)[source]

Destroy()[source]

GetRawValue()[source]

Retrieves the value of the object identifier.

Return type:: std::vector< int,std::allocator< int > >
Returns:: the value of the object identifier, as a container of integer components.
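
Since the raw value is a sequence of integer components, it is often convenient to render it in the usual dotted notation. A minimal sketch, assuming the components arrive as a Python sequence of ints; `oid_to_dotted` is a hypothetical helper, and the sample values are the standard OIDs for commonName and SHA-256.

```python
def oid_to_dotted(components):
    """Join integer OID components into dotted-decimal notation."""
    return ".".join(str(c) for c in components)


common_name = oid_to_dotted([2, 5, 4, 3])                      # id-at-commonName
sha256 = oid_to_dotted([2, 16, 840, 1, 101, 3, 4, 2, 1])       # id-sha256
```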

e_MGF1 = 15

e_RIPEMD160 = 12

e_RSASSA_PSS = 14

e_RSA_encryption_PKCS1 = 13

e_SHA1 = 8

e_SHA256 = 9

e_SHA384 = 10

e_SHA512 = 11

e_commonName = 0

e_countryName = 2

e_localityName = 3

e_organizationName = 6

e_organizationalUnitName = 7

e_stateOrProvinceName = 4

e_streetAddress = 5

e_surname = 1

property m_impl

property thisown: The membership flag

class apryse_sdk.OfficeToPDFOptions[source]

Bases: ConversionOptions

GetApplyPageBreaksToSheet()[source]

Gets the value ApplyPageBreaksToSheet from the options object. Whether to split Excel worksheets into pages so that the output resembles print output. If set to false (the default), Excel sheets will be placed one per page, except in the case where the sheets are very large.

Return type:: boolean
Returns:: The current value for ApplyPageBreaksToSheet.

GetDisplayChangeTracking()[source]

Gets the value DisplayChangeTracking from the options object. If this option is true, the output displays the office change tracking markup present in the document (i.e., red strikethrough of deleted content and underlining of new content). Otherwise, it displays the resolved document content, with no markup. Defaults to true.

Return type:: boolean
Returns:: The current value for DisplayChangeTracking.

GetDisplayComments()[source]

Gets the value DisplayComments from the options object. Specifies the display of comments that are present in the document. By default, comments will not be displayed.

Return type:: int
Returns:: The current value for DisplayComments.

GetDisplayHiddenText()[source]

Gets the value DisplayHiddenText from the options object. Display any hidden text that is present in the document (i.e., text that has been marked as ‘Hidden’ in the font style). By default, hidden text will not be displayed.

Return type:: boolean
Returns:: The current value for DisplayHiddenText.

GetExcelDefaultCellBorderWidth()[source]

Gets the value ExcelDefaultCellBorderWidth from the options object. Cell border width for table cells that would normally be drawn with no border. In units of points. Can be used to achieve a similar effect to the “show gridlines” display option within Microsoft Excel.

Return type:: double
Returns:: The current value for ExcelDefaultCellBorderWidth.

GetExcelMaxAllowedCellCount()[source]

Gets the value ExcelMaxAllowedCellCount from the options object. Conversion will throw an exception if the number of cells in a Microsoft Excel document is above the set MaxAllowedCellCount. Used for early termination of resource intensive conversions. Setting this value to 250000 will allow the vast majority of Excel documents to convert without issue, while keeping RAM usage to a reasonable level. By default there is no limit to the number of allowed cells.

Return type:: int
Returns:: The current value for ExcelMaxAllowedCellCount.

GetHideTotalNumberOfPages()[source]

Gets the value HideTotalNumberOfPages from the options object. If the document has an element that displays the total number of pages and the total number of pages is unknown beforehand, remove those elements from the document.

Return type:: boolean
Returns:: The current value for HideTotalNumberOfPages.

GetIncludeBookmarks()[source]

Gets the value IncludeBookmarks from the options object. When this option is set to false, Word document bookmarks will not be converted into PDF bookmarks. However, Word headings will still be automatically converted into PDF bookmarks. By default, both Word bookmarks and headings are converted into PDF bookmarks, providing a comprehensive navigation structure within the converted PDF.

Return type:: boolean
Returns:: The current value for IncludeBookmarks.

GetIncrementalSave()[source]

Gets the value IncrementalSave from the options object. If this option is true, the document will be saved incrementally during the conversion, thus reducing the peak memory usage. Save an empty PDFDoc to the target location before the conversion so the incremental saving is done directly to the target location. Otherwise, a temporary file will be used. PDFDoc.Save still has to be called after the conversion is done to finalize the file. Doing PDFDoc.Save with e_incremental flag will reduce the saving time but increase the PDF file size.

Return type:: boolean
Returns:: The current value for IncrementalSave.

GetLayoutResourcesPluginPath()[source]

Gets the value LayoutResourcesPluginPath from the options object. The path at which the pdftron-provided font resource plugin resides.

Return type:: string
Returns:: a string, the current value for LayoutResourcesPluginPath.

GetLocale()[source]

Gets the value Locale from the options object. ISO 639-1 code of the locale to be applied during conversion. For example: ‘en-US’, ‘ar-SA’, ‘de-DE’, etc. Currently only applied during xls/xlsx conversions.

Return type:: string
Returns:: The current value for Locale.

GetPassword()[source]

Gets the value Password from the options object. Password used to decrypt password-protected office documents.

Return type:: string
Returns:: The current value for Password.

GetResourceDocPath()[source]

Gets the value ResourceDocPath from the options object. The path at which a .docx resource document resides.

Return type:: string
Returns:: a string, the current value for ResourceDocPath.

GetSmartSubstitutionPluginPath()[source]

Gets the value SmartSubstitutionPluginPath from the options object. The path at which the pdftron-provided font resource plugin resides.

Return type:: string
Returns:: a string, the current value for SmartSubstitutionPluginPath.

GetStructureTagLevel()[source]

Gets the value StructureTagLevel from the options object. Specifies the level of document structure tags to include in the PDF for accessibility purposes.

Return type:: int
Returns:: The current value for StructureTagLevel.

GetTemplateLeftDelimiter()[source]

Gets the value TemplateLeftDelimiter from the options object. Left delimiter for template tags. Defaults to ‘{{‘.

Return type:: string
Returns:: The current value for TemplateLeftDelimiter.

GetTemplateParamsJson()[source]

Gets the value TemplateParamsJson from the options object. JSON string representing the data to be merged into a PDFTron office template. For a more featureful template API, see CreateOfficeTemplate.

Return type:: string
Returns:: The current value for TemplateParamsJson.

GetTemplateRightDelimiter()[source]

Gets the value TemplateRightDelimiter from the options object. Right delimiter for template tags. Defaults to ‘}}’.

Return type:: string
Returns:: The current value for TemplateRightDelimiter.

GetTemplateStrictMode()[source]

Gets the value TemplateStrictMode from the options object. If “Strict Mode” is enabled, when a template key is missing from the json data an exception will be thrown. If “Strict Mode” is disabled (default), the tag will be replaced with no content.

Return type:: boolean
Returns:: The current value for TemplateStrictMode.

GetUpdateTableOfContents()[source]

Gets the value UpdateTableOfContents from the options object. Updates the table of contents in the document so it matches the actual locations of headings/bookmarks. By default, the table of contents is not updated. Enabling this option may negatively affect conversion speed.

Return type:: boolean
Returns:: The current value for UpdateTableOfContents.

SetApplyPageBreaksToSheet(value)[source]

Sets the value for ApplyPageBreaksToSheet in the options object. Whether to split Excel worksheets into pages so that the output resembles print output. If set to false (the default), Excel sheets will be placed one per page, except in the case where the sheets are very large.

Parameters:: value (boolean) – The new value for ApplyPageBreaksToSheet.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetDisplayChangeTracking(value)[source]

Sets the value for DisplayChangeTracking in the options object. If this option is true, the output displays the office change tracking markup present in the document (i.e., red strikethrough of deleted content and underlining of new content). Otherwise, it displays the resolved document content, with no markup. Defaults to true.

Parameters:: value (boolean) – The new value for DisplayChangeTracking.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetDisplayComments(value)[source]

Sets the value for DisplayComments in the options object. Specifies the display of comments that are present in the document. By default, comments will not be displayed.

Parameters:: value (int) – The new value for DisplayComments.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetDisplayHiddenText(value)[source]

Sets the value for DisplayHiddenText in the options object. Display any hidden text that is present in the document (i.e., text that has been marked as ‘Hidden’ in the font style). By default, hidden text will not be displayed.

Parameters:: value (boolean) – The new value for DisplayHiddenText.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetExcelDefaultCellBorderWidth(value)[source]

Sets the value for ExcelDefaultCellBorderWidth in the options object. Cell border width for table cells that would normally be drawn with no border. In units of points. Can be used to achieve a similar effect to the “show gridlines” display option within Microsoft Excel.

Parameters:: value (double) – The new value for ExcelDefaultCellBorderWidth.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetExcelMaxAllowedCellCount(value)[source]

Sets the value for ExcelMaxAllowedCellCount in the options object. Conversion will throw an exception if the number of cells in a Microsoft Excel document is above the set MaxAllowedCellCount. Used for early termination of resource intensive conversions. Setting this value to 250000 will allow the vast majority of Excel documents to convert without issue, while keeping RAM usage to a reasonable level. By default there is no limit to the number of allowed cells.

Parameters:: value (int) – The new value for ExcelMaxAllowedCellCount.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetHideTotalNumberOfPages(value)[source]

Sets the value for HideTotalNumberOfPages in the options object. If the document has an element that displays the total number of pages and the total number of pages is unknown beforehand, remove those elements from the document.

Parameters:: value (boolean) – The new value for HideTotalNumberOfPages.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetIncludeBookmarks(value)[source]

Sets the value for IncludeBookmarks in the options object. When this option is set to false, Word document bookmarks will not be converted into PDF bookmarks. However, Word headings will still be automatically converted into PDF bookmarks. By default, both Word bookmarks and headings are converted into PDF bookmarks, providing a comprehensive navigation structure within the converted PDF.

Parameters:: value (boolean) – The new value for IncludeBookmarks.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetIncrementalSave(value)[source]

Sets the value for IncrementalSave in the options object. If this option is true, the document will be saved incrementally during the conversion, thus reducing the peak memory usage. Save an empty PDFDoc to the target location before the conversion so the incremental saving is done directly to the target location. Otherwise, a temporary file will be used. PDFDoc.Save still has to be called after the conversion is done to finalize the file. Doing PDFDoc.Save with e_incremental flag will reduce the saving time but increase the PDF file size.

Parameters:: value (boolean) – The new value for IncrementalSave.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetLayoutResourcesPluginPath(value)[source]

Sets the value for LayoutResourcesPluginPath in the options object. The path at which the pdftron-provided font resource plugin resides.

Parameters:: value (string) – The new value for LayoutResourcesPluginPath.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetLocale(value)[source]

Sets the value for Locale in the options object. ISO 639-1 code of the locale to be applied during conversion. For example: ‘en-US’, ‘ar-SA’, ‘de-DE’, etc. Currently only applied during xls/xlsx conversions.

Parameters:: value (string) – The new value for Locale.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetPassword(value)[source]

Sets the value for Password in the options object. Password used to decrypt password-protected office documents.

Parameters:: value (string) – The new value for Password.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetResourceDocPath(value)[source]

Sets the value for ResourceDocPath in the options object. The path at which a .docx resource document resides.

Parameters:: value (string) – The new value for ResourceDocPath.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetSmartSubstitutionPluginPath(value)[source]

Sets the value for SmartSubstitutionPluginPath in the options object. The path at which the pdftron-provided font resource plugin resides.

Parameters:: value (string) – The new value for SmartSubstitutionPluginPath.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetStructureTagLevel(value)[source]

Sets the value for StructureTagLevel in the options object. Specifies the level of document structure tags to include in the PDF for accessibility purposes.

Parameters:: value (int) – The new value for StructureTagLevel.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetTemplateLeftDelimiter(value)[source]

Sets the value for TemplateLeftDelimiter in the options object. Left delimiter for template tags. Defaults to ‘{{‘.

Parameters:: value (string) – The new value for TemplateLeftDelimiter.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetTemplateParamsJson(value)[source]

Sets the value for TemplateParamsJson in the options object. JSON string representing the data to be merged into a PDFTron office template. For a more featureful template API, see CreateOfficeTemplate.

Parameters:: value (string) – The new value for TemplateParamsJson.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetTemplateRightDelimiter(value)[source]

Sets the value for TemplateRightDelimiter in the options object. Right delimiter for template tags. Defaults to ‘}}’.

Parameters:: value (string) – The new value for TemplateRightDelimiter.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.

SetTemplateStrictMode(value)[source]

Sets the value for TemplateStrictMode in the options object. If “Strict Mode” is enabled, when a template key is missing from the json data an exception will be thrown. If “Strict Mode” is disabled (default), the tag will be replaced with no content.

Parameters:: value (boolean) – The new value for TemplateStrictMode.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.
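
To make the interaction between the template delimiters, the JSON parameters, and strict mode concrete, here is a rough pure-Python model of the documented behaviour. It is not the SDK's template engine: `fill_template` is a hypothetical function, it handles only simple text tags, and raising `KeyError` in strict mode is an assumption standing in for the SDK's exception.

```python
import json
import re


def fill_template(text, params_json, left="{{", right="}}", strict=False):
    """Replace delimited tags with values from a JSON object (simplified model)."""
    params = json.loads(params_json)
    pattern = re.compile(re.escape(left) + r"\s*(.+?)\s*" + re.escape(right))

    def substitute(match):
        key = match.group(1)
        if key in params:
            return str(params[key])
        if strict:
            # Strict mode: a missing key is an error.
            raise KeyError(key)
        # Non-strict (default): the tag is replaced with no content.
        return ""

    return pattern.sub(substitute, text)


lenient = fill_template("Hello {{name}}, {{missing}}!", '{"name": "Ada"}')
custom = fill_template("Hello <<name>>", '{"name": "Ada"}', left="<<", right=">>")
```

Enabling strict mode during development is a convenient way to catch misspelled keys that would otherwise vanish silently from the output.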

SetUpdateTableOfContents(value)[source]

Sets the value for UpdateTableOfContents in the options object. Updates the table of contents in the document so it matches the actual locations of headings/bookmarks. By default, the table of contents is not updated. Enabling this option may negatively affect conversion speed.

Parameters:: value (boolean) – The new value for UpdateTableOfContents.
Return type:: OfficeToPDFOptions
Returns:: This object, for call chaining.
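
Every setter above returns the options object itself, which is what allows calls to be chained. The pattern can be sketched with a small stand-in class. `ChainableOptions` is hypothetical and is not the SDK class; it mirrors a few documented method names and the defaults stated in their descriptions.

```python
class ChainableOptions:
    """Hypothetical stand-in that mimics the documented setter/getter contract."""

    def __init__(self):
        # Defaults as stated in the option descriptions above.
        self._values = {
            "ApplyPageBreaksToSheet": False,
            "DisplayChangeTracking": True,
            "UpdateTableOfContents": False,
        }

    def _set(self, name, value):
        self._values[name] = value
        return self  # returning self is what makes chaining possible

    def SetApplyPageBreaksToSheet(self, value):
        return self._set("ApplyPageBreaksToSheet", value)

    def SetUpdateTableOfContents(self, value):
        return self._set("UpdateTableOfContents", value)

    def GetApplyPageBreaksToSheet(self):
        return self._values["ApplyPageBreaksToSheet"]

    def GetDisplayChangeTracking(self):
        return self._values["DisplayChangeTracking"]

    def GetUpdateTableOfContents(self):
        return self._values["UpdateTableOfContents"]


options = (
    ChainableOptions()
    .SetApplyPageBreaksToSheet(True)
    .SetUpdateTableOfContents(True)
)
```

Options that are never set keep their defaults, so a chain only needs to list the values that differ.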

e_annotations = 1: Display comments as annotations

e_default = 0: Default level of structure tags, good for accessibility.

e_none = 1: No structure tags. This can be used to get smaller file sizes.

e_off = 0: Default value. Display no comments.

property thisown: The membership flag

class apryse_sdk.Optimizer(args, kwargs)[source]

Bases: object

The Optimizer class provides functionality for optimizing/shrinking output PDF files.

‘pdftron.PDF.Optimizer’ is an optional PDFNet Add-On utility class that can be used to optimize PDF documents by reducing the file size, removing redundant information, and compressing data streams using the latest in image compression technology. PDF Optimizer can compress and shrink PDF file size with the following operations:

- Remove duplicated fonts, images, ICC profiles, and any other data stream.
- Optionally convert high-quality or print-ready PDF files to small, efficient and web-ready PDF.
- Optionally down-sample large images to a given resolution.
- Optionally compress or recompress PDF images using JBIG2 and JPEG2000 compression formats.
- Compress uncompressed streams and remove unused PDF objects.

Notes: ‘Optimizer’ is available as a separately licensable add-on to PDFNet core license.

See ‘pdftron.PDF.Flattener’ for alternate approach to optimize PDFs for fast viewing on mobile devices and the Web.

static Optimize(args)[source]

property thisown: The membership flag

class apryse_sdk.OptimizerSettings[source]

Bases: object

A class that stores settings for the optimizer

RemoveCustomEntries(should_remove)[source]

Enable or disable removal of custom entries in the PDF. By default custom entries are removed.

Parameters:: should_remove (boolean) – if true custom entries will be removed.

SetColorImageSettings(settings)[source]: updates the settings for color image processing

SetGrayscaleImageSettings(settings)[source]: updates the settings for grayscale image processing

SetMonoImageSettings(settings)[source]: updates the settings for monochrome image processing

SetTextSettings(settings)[source]: updates the settings for text processing

property m_color_image_settings

property m_grayscale_image_settings

property m_mono_image_settings

property m_remove_custom

property m_text_settings

property thisown: The membership flag

class apryse_sdk.OutputOptionsOCR[source]

Bases: object

A class containing OCR options common to the ToHtml, ToWord, ToExcel, ToPowerPoint functions

static LanguageChoiceToString(language)[source]

static PreferredOCRChoiceToString(engine)[source]

e_engine_default = 0

e_engine_tesseract = 1

e_lang_auto = 0

e_lang_catalan = 1

e_lang_danish = 2

e_lang_dutch = 9

e_lang_english = 4

e_lang_finnish = 6

e_lang_french = 7

e_lang_german = 3

e_lang_italian = 8

e_lang_norwegian = 10

e_lang_polish = 12

e_lang_portuguese = 11

e_lang_romanian = 13

e_lang_russian = 14

e_lang_slovenian = 15

e_lang_spanish = 5

e_lang_swedish = 16

e_lang_turkish = 17

property thisown: The membership flag

class apryse_sdk.PDF2HtmlReflowParagraphsModule[source]

Bases: object

The class PDF2HtmlReflowParagraphsModule. Static interface to the PDFTron SDK’s PDF to HTML functionality.

static IsModuleAvailable()[source]

Find out whether the pdf2html module is available (and licensed).

Return type:: boolean
Returns:: returns true if pdf2html operations can be performed.

property thisown: The membership flag

class apryse_sdk.PDF2WordModule[source]

Bases: object

The class PDF2WordModule. Static interface to the PDFTron SDK’s PDF to Word functionality.

static IsModuleAvailable()[source]

Find out whether the pdf2word module is available (and licensed).

Return type:: boolean
Returns:: returns true if pdf2word operations can be performed.

property thisown: The membership flag

class apryse_sdk.PDFACompliance(args)[source]

Bases: object

PDFACompliance class is used to validate PDF documents for PDF/A (ISO 19005:1/2/3) compliance or to convert existing PDF files to PDF/A compliant documents.

The conversion option analyzes the content of existing PDF files and performs a sequence of modifications in order to produce a PDF/A compliant document. Features that are not suitable for long-term archiving (such as encryption, obsolete compression schemes, missing fonts, or device-dependent color) are replaced with their PDF/A compliant equivalents. Because the conversion process applies only necessary changes to the source file, the information loss is minimal. Also, because the converter provides a detailed report for each change, it is simple to inspect changes and to determine whether the conversion loss is acceptable.

The validation option in PDF/A Manager can be used to quickly determine whether a PDF file fully complies with the PDF/A specification according to the international standard ISO 19005:1/2/3. For files that are not compliant, the validation option can be used to produce a detailed report of compliance violations as well as a list of relevant error objects.

Key Functions:

- Checks if a PDF file is compliant with PDF/A (ISO 19005:1/2/3) specification.
- Converts any PDF to a PDF/A compliant document.
- Supports PDF/A-1a, PDF/A-1b, PDF/A-2b
- Produces a detailed report of compliance violations and associated PDF objects.
- Keeps the required changes to a minimum, preserving the consistency of the original.
- Tracks all changes to allow for automatic assessment of data loss.
- Allows user to customize compliance checks or omit specific changes.
- Preserves tags, logical structure, and color information in existing PDF documents.
- Offers automatic font substitution, embedding, and subsetting options.
- Supports automation and batch operation. PDF/A Converter is designed to be used in unattended mode in high throughput server or batch environments.

Destroy()[source]: Frees the native memory of the object.

static GetDeclaredConformance(in_doc)[source]

Retrieves whether the document’s XMP metadata claims PDF/A conformance, and to which part and level.

Parameters:: in_doc (PDFDoc) – the document
Return type:: int
Returns:: Presumptive PDFA part number and conformance level, as an enumerated value.

GetError(idx)[source]

Return type:: int
Returns:: The error identifier.
Parameters:: idx (int) – The index in the array of error code identifiers. The array is indexed starting from zero.
Raises:: An Exception is thrown if the index is outside the array bounds.

GetErrorCount()[source]

Return type:: int
Returns:: The number of compliance violations.

static GetPDFAErrorMessage(id)[source]

Parameters:: id (int) – error code identifier (obtained using GetError() method).
Return type:: string
Returns:: A descriptive error message for the given error identifier.

GetRefObj(id, err_idx)[source]

Return type:

int

Returns:

A specific object reference associated with a given error type. The return value is a PDF object identifier (i.e. the object number of a ‘pdftron.SDF.Obj’) for the object that is associated with the error.

Parameters:

id (int) – error code identifier (obtained using GetError() method).
err_idx (int) – The index in the array of object references. The array is indexed starting from zero.

Raises:

An Exception is thrown if the index is outside the array bounds.

GetRefObjCount(id)[source]

Return type:: int
Returns:: The number of object references associated with a given error.
Parameters:: id (int) – error code identifier (obtained using GetError() method).
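
The error accessors above are designed to be used together: iterate over the error indices, look up each identifier, then iterate over the objects attached to it. The sketch below shows that loop against any object exposing the four documented methods. `collect_violations` and `FakeCompliance` are hypothetical; the fake exists only so the example runs without the SDK, and `describe` stands in for the static GetPDFAErrorMessage.

```python
def collect_violations(pdfa, describe):
    """Gather (error id, message, object numbers) for each reported violation."""
    report = []
    for i in range(pdfa.GetErrorCount()):
        error_id = pdfa.GetError(i)
        refs = [
            pdfa.GetRefObj(error_id, j)
            for j in range(pdfa.GetRefObjCount(error_id))
        ]
        report.append((error_id, describe(error_id), refs))
    return report


class FakeCompliance:
    """Hypothetical stand-in exposing the documented accessor names."""

    def __init__(self, errors):
        self._ids = list(errors)   # error identifiers, in report order
        self._refs = errors        # error id -> list of object numbers

    def GetErrorCount(self):
        return len(self._ids)

    def GetError(self, idx):
        return self._ids[idx]

    def GetRefObjCount(self, id):
        return len(self._refs[id])

    def GetRefObj(self, id, err_idx):
        return self._refs[id][err_idx]


# Messages copied from the error code list in this reference.
MESSAGES = {341: "The font is not embedded.", 132: "Trailer dictionary contains Encrypt."}
fake = FakeCompliance({341: [12, 57], 132: []})
report = collect_violations(fake, lambda code: MESSAGES.get(code, "unknown"))
```

Both index arguments are zero-based, and an out-of-range index is documented to raise an exception, so the loops use the matching count methods as their bounds.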

SaveAs(args)[source]

Overload 1:

Serializes the converted PDF/A document to a file on disk.

Notes: This method assumes that the first parameter passed to the PDFACompliance constructor (i.e. the convert parameter) is set to ‘true’.

Parameters:

file_path (string) – The output file name.
linearized (boolean, optional) – An optional flag used to specify whether the resulting PDF/A document should be web-optimized (linearized).

Overload 2:

Serializes the converted PDF/A document to a memory buffer.

Notes: This method assumes that the first parameter passed to the PDFACompliance constructor (i.e. the convert parameter) is set to ‘true’.

Parameters:: linearized (boolean, optional) – An optional flag used to specify whether the resulting PDF/A document should be web-optimized (linearized).
Return type:: std::vector< unsigned char,std::allocator< unsigned char > >
Returns:: The converted document saved as a memory buffer.

Overload 3:

Serializes the converted PDF/A document to a memory buffer.

Notes: This method assumes that the first parameter passed to the PDFACompliance constructor (i.e. the convert parameter) is set to ‘true’.

Parameters:: linearized – An optional flag used to specify whether the resulting PDF/A document should be web-optimized (linearized).
Return type:: std::vector< unsigned char,std::allocator< unsigned char > >
Returns:: The converted document saved as a memory buffer.

e_Level1A = 1

e_Level1B = 2

e_Level2A = 3

e_Level2B = 4

e_Level2U = 5

e_Level3A = 6

e_Level3B = 7

e_Level3U = 8

e_Level4 = 9

e_Level4E = 10

e_Level4F = 11

e_NoConformance = 0
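
GetDeclaredConformance returns one of the integer constants above, so a lookup table is a simple way to turn the result into a readable label. The labels below are an interpretation of the constant names (for example, e_Level2U read as PDF/A-2u), not text from the SDK, and `describe_conformance` is a hypothetical helper.

```python
# Constant values as listed above; labels are an assumed reading of the names.
CONFORMANCE_LABELS = {
    0: "no declared conformance",
    1: "PDF/A-1a", 2: "PDF/A-1b",
    3: "PDF/A-2a", 4: "PDF/A-2b", 5: "PDF/A-2u",
    6: "PDF/A-3a", 7: "PDF/A-3b", 8: "PDF/A-3u",
    9: "PDF/A-4", 10: "PDF/A-4e", 11: "PDF/A-4f",
}


def describe_conformance(code):
    """Map a declared-conformance value to a label, tolerating unknown values."""
    return CONFORMANCE_LABELS.get(code, f"unrecognized value {code}")


label = describe_conformance(4)
fallback = describe_conformance(99)
```

The declared value only reflects what the document's metadata claims; validating the file is still needed to confirm actual compliance.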

e_PDFA0_1_0 = 10: Invalid PDF structure.

e_PDFA0_1_1 = 11: Corrupt document.

e_PDFA0_1_2 = 12: Corrupt content stream.

e_PDFA0_1_3 = 13: Using JPEG2000 compression (PDF 1.4 compatibility).

e_PDFA0_1_4 = 14: Contains compressed object streams (PDF 1.4 compatibility).

e_PDFA0_1_5 = 15: Contains cross-reference streams (PDF 1.4 compatibility).

e_PDFA11_0_0 = 11000: Catalog contains Requirements key.

e_PDFA1_10_1 = 1101: Using LZW compression.

e_PDFA1_10_2 = 1102: Invalid use of Crypt filter.

e_PDFA1_10_3 = 1103: Bad stream Filter.

e_PDFA1_11_1 = 1111: A file specification dictionary contains a non-compliant embedded file (EF key).

e_PDFA1_11_2 = 1112: Contains the EmbeddedFiles key

e_PDFA1_12_1 = 1121: Array contains more than 8191 elements

e_PDFA1_12_10 = 11210: Bad Permission Dictionary

e_PDFA1_12_2 = 1122: Dictionary contains more than 4095 elements

e_PDFA1_12_3 = 1123: Name with more than 127 bytes

e_PDFA1_12_4 = 1124: Contains an integer value outside of the allowed range [-2^31, 2^31-1].

e_PDFA1_12_5 = 1125: Exceeds the maximum number (8,388,607) of indirect objects in a PDF file.

e_PDFA1_12_6 = 1126: The number of nested q/Q operators is greater than 28.

e_PDFA1_13_1 = 1131: Optional content (layers) not allowed.

e_PDFA1_13_5 = 1135: Page dimensions are outside of the allowed range (3-14400).

e_PDFA1_2_1 = 121: Document does not start with % character.

e_PDFA1_2_2 = 122: File header line not followed by % and 4 characters > 127.

e_PDFA1_2_3 = 123: Bad file header.

e_PDFA1_3_1 = 131: The trailer dictionary does not contain ID.

e_PDFA1_3_2 = 132: Trailer dictionary contains Encrypt.

e_PDFA1_3_3 = 133: Data after last EOF marker.

e_PDFA1_3_4 = 134: Linearized file: ID in 1st page and last trailer are different.

e_PDFA1_4_1 = 141: Subsection header: starting object number and range not separated by a single space.

e_PDFA1_4_2 = 142: ‘xref’ and cross reference subsection header not separated by a single EOL marker.

e_PDFA1_6_1 = 161: Invalid hexadecimal strings used.

e_PDFA1_7_1 = 171: The ‘stream’ token is not followed by CR and LF or a single LF.

e_PDFA1_7_2 = 172: The ‘endstream’ token is not preceded by EOL.

e_PDFA1_7_3 = 173: The value of Length does not match the number of bytes.

e_PDFA1_7_4 = 174: A stream object dictionary contains the F, FFilter, or FDecodeParms keys.

e_PDFA1_8_1 = 181: Object number and generation number are not separated by a single white-space.

e_PDFA1_8_2 = 182: Generation number and ‘obj’ are not separated by a single white-space.

e_PDFA1_8_3 = 183: Object number not preceded by EOL marker

e_PDFA1_8_4 = 184: ‘endobj’ not preceded by EOL marker

e_PDFA1_8_5 = 185: ‘obj’ not followed by EOL marker

e_PDFA1_8_6 = 186: ‘endobj’ not followed by EOL marker

e_PDFA1_8_7 = 187: Invalid UTF8 string

e_PDFA2_10_1 = 2101: Illegal operator.

e_PDFA2_10_20 = 21020: Page Group entry is missing in a document without OutputIntent.

e_PDFA2_10_21 = 21021: Invalid blend mode.

e_PDFA2_2_1 = 221: DestOutputProfile-s in OutputIntents array do not match.

e_PDFA2_3_10 = 2310: Contains DestOutputProfileRef

e_PDFA2_3_2 = 232: Not a valid ICC color profile.

e_PDFA2_3_3 = 233: The N entry does not match the number of color components in the embedded ICC profile.

e_PDFA2_3_3_1 = 2331: Device-specific color space used, but no GTS_PDFA1 OutputIntent.

e_PDFA2_3_3_2 = 2332: Device-specific color space, does not match OutputIntent.

e_PDFA2_3_4_1 = 2341: Device-specific color space used in an alternate color space.

e_PDFA2_4_1 = 241: Image with Alternates key.

e_PDFA2_4_2 = 242: Image with OPI key.

e_PDFA2_4_2_10 = 24220: OPM is 1

e_PDFA2_4_2_11 = 24221: Incorrect colorant specification in DeviceN

e_PDFA2_4_2_12 = 24222: tintTransform is different in Separations with the same colorant name.

e_PDFA2_4_2_13 = 24223: alternateSpace is different in Separations with the same colorant name.

e_PDFA2_4_3 = 243: Image with invalid rendering intent.

e_PDFA2_4_4 = 244: Image with Interpolate key set to true.

e_PDFA2_5_1 = 251: XObject with OPI key.

e_PDFA2_5_10 = 2510: HTP entry in ExtGState.

e_PDFA2_5_11 = 2511: Unsupported HalftoneType.

e_PDFA2_5_12 = 2512: Uses HalftoneName key.

e_PDFA2_5_2 = 252: PostScript XObject.

e_PDFA2_6_1 = 261: Contains a reference XObject.

e_PDFA2_7_1 = 271: Contains an XObject that is not supported (e.g. PostScript XObject).

e_PDFA2_8_1 = 281: Contains an invalid Transfer Curve in the extended graphics state.

e_PDFA2_8_3_1 = 2831: JPEG2000: Only the JPX baseline is supported.

e_PDFA2_8_3_2 = 2832: JPEG2000: Invalid number of colour channels.

e_PDFA2_8_3_3 = 2833: JPEG2000: Invalid color space.

e_PDFA2_8_3_4 = 2834: JPEG2000: The bit-depth JPEG2000 data must be in range 1-38.

e_PDFA2_8_3_5 = 2835: JPEG2000: All colour channels in the JPEG2000 data must have the same bit-depth.

e_PDFA2_9_1 = 291: Use of an invalid rendering intent.

e_PDFA3_2_1 = 321: Embedded font is damaged.

e_PDFA3_3_1 = 331: Incompatible CIDSystemInfo entries

e_PDFA3_3_2 = 332: Type 2 CIDFont without CIDToGIDMap

e_PDFA3_3_3_1 = 3331: CMap not embedded

e_PDFA3_3_3_2 = 3332: Inconsistent WMode in embedded CMap dictionary and stream.

e_PDFA3_4_1 = 341: The font is not embedded.

e_PDFA3_5_1 = 351: Embedded composite (Type0) font program does not define all font glyphs.

e_PDFA3_5_2 = 352: Embedded Type1 font program does not define all font glyphs.

e_PDFA3_5_3 = 353: Embedded TrueType font program does not define all font glyphs.

e_PDFA3_5_4 = 354: The font descriptor dictionary does not include a CIDSet stream for CIDFont subset.

e_PDFA3_5_5 = 355: The font descriptor dictionary does not include a CharSet string for Type1 font subset.

e_PDFA3_5_6 = 356: CIDSet in subset font is incomplete.

e_PDFA3_6_1 = 361: Widths in embedded font are inconsistent with /Widths entry in the font dictionary.

e_PDFA3_7_1 = 371: A non-symbolic TrueType font must use WinAnsiEncoding or MacRomanEncoding.

e_PDFA3_7_2 = 372: A symbolic TrueType font must not specify encoding.

e_PDFA3_7_3 = 373: A symbolic TrueType font does not have exactly one entry in cmap table.

e_PDFA3_8_1 = 381: The font dictionary is missing ‘ToUnicode’ entry.

e_PDFA4_1 = 41: Transparency used (ExtGState with soft mask).

e_PDFA4_2 = 42: Transparency used (XObject with soft mask).

e_PDFA4_3 = 43: Transparency used (Page or Form XObject with transparency group).

e_PDFA4_4 = 44: Transparency used (Blend mode is not ‘Normal’).

e_PDFA4_5 = 45: Transparency used (‘CA’ value is not 1.0).

e_PDFA4_6 = 46: Transparency used (‘ca’ value is not 1.0).

e_PDFA5_2_1 = 521: Unknown annotation type.

e_PDFA5_2_10 = 5210: PolyLine annotation is not permitted.

e_PDFA5_2_11 = 5211: Screen annotation is not permitted.

e_PDFA5_2_2 = 522: FileAttachment annotation is not permitted.

e_PDFA5_2_3 = 523: Sound annotation is not permitted.

e_PDFA5_2_4 = 524: Movie annotation is not permitted.

e_PDFA5_2_5 = 525: Redact annotation is not permitted.

e_PDFA5_2_6 = 526: 3D annotation is not permitted.

e_PDFA5_2_7 = 527: Caret annotation is not permitted.

e_PDFA5_2_8 = 528: Watermark annotation is not permitted.

e_PDFA5_2_9 = 529: Polygon annotation is not permitted.

e_PDFA5_3_1 = 531: An annotation dictionary contains the CA key with a value other than 1.0.

e_PDFA5_3_2_1 = 5321: An annotation dictionary is missing F key.

e_PDFA5_3_2_2 = 5322: An annotation’s ‘Print’ flag is not set.

e_PDFA5_3_2_3 = 5323: An annotation’s ‘Hidden’ flag is set.

e_PDFA5_3_2_4 = 5324: An annotation’s ‘Invisible’ flag is set.

e_PDFA5_3_2_5 = 5325: An annotation’s ‘NoView’ flag is set.

e_PDFA5_3_3_1 = 5331: An annotation’s C entry present but no OutputIntent present

e_PDFA5_3_3_2 = 5332: An annotation’s C entry present but OutputIntent has non-RGB destination profile

e_PDFA5_3_3_3 = 5333: An annotation’s IC entry present but no OutputIntent present

e_PDFA5_3_3_4 = 5334: An annotation’s IC entry present and OutputIntent has non-RGB destination profile

e_PDFA5_3_4_0 = 5340: Annotation is missing AP entry.

e_PDFA5_3_4_1 = 5341: An annotation AP dictionary has entries other than the N entry.

e_PDFA5_3_4_2 = 5342: An annotation AP dictionary does not contain N entry

e_PDFA5_3_4_3 = 5343: AP has an N entry whose value is invalid.

e_PDFA6_10_0 = 6100: PresSteps is not allowed

e_PDFA6_10_1 = 6101: AlternatePresentations not allowed

e_PDFA6_1_1 = 611: Contains an action type that is not permitted.

e_PDFA6_1_2 = 612: Contains a non-predefined Named action.

e_PDFA6_2_1 = 621: The document catalog dictionary contains AA entry.

e_PDFA6_2_11_5 = 62115: Some characters map to 0 or FFFE.

e_PDFA6_2_11_6 = 62116: Some text can’t be mapped to Unicode

e_PDFA6_2_11_7 = 62117: PUA characters are missing ActualText

e_PDFA6_2_11_8 = 62118: Use of .notdef glyph

e_PDFA6_2_2 = 622: Contains the JavaScript key.

e_PDFA6_2_3 = 623: Invalid destination.

e_PDFA6_9_1 = 69001: Optional content Missing Name entry

e_PDFA6_9_3 = 69003: Optional content Contains AS entry

e_PDFA7_11_1 = 7111: Missing PDF/A identifier

e_PDFA7_11_2 = 7112: Invalid PDF/A identifier namespace

e_PDFA7_11_3 = 7113: Invalid PDF/A conformance level.

e_PDFA7_11_4 = 7114: Invalid PDF/A part number.

e_PDFA7_11_5 = 7115: Invalid PDF/A amendment identifier.

e_PDFA7_2_1 = 721: The document catalog does not contain Metadata stream.

e_PDFA7_2_2 = 722: The Metadata object stream contains Filter key.

e_PDFA7_2_3 = 723: The XMP Metadata stream is not valid.

e_PDFA7_2_4 = 724: XMP property not predefined and no extension schema present.

e_PDFA7_2_5 = 725: XMP not included in ‘xpacket’.

e_PDFA7_3_1 = 731: Document information entry ‘Title’ not synchronized with XMP.

e_PDFA7_3_2 = 732: Document information entry ‘Author’ not synchronized with XMP.

e_PDFA7_3_3 = 733: Document information entry ‘Subject’ not synchronized with XMP.

e_PDFA7_3_4 = 734: Document information entry ‘Keywords’ not synchronized with XMP.

e_PDFA7_3_5 = 735: Document information entry ‘Creator’ not synchronized with XMP.

e_PDFA7_3_6 = 736: Document information entry ‘Producer’ not synchronized with XMP.

e_PDFA7_3_7 = 737: Document information entry ‘CreationDate’ not synchronized with XMP.

e_PDFA7_3_8 = 738: Document information entry ‘ModDate’ not synchronized with XMP.

e_PDFA7_3_9 = 739: Wrong value type for predefined XMP property.

e_PDFA7_5_1 = 751: ‘bytes’ and ‘encoding’ attributes are not allowed in the header of an XMP packet.

e_PDFA7_8_1 = 781: XMP Extension schema doesn’t have a description.

e_PDFA7_8_10 = 7810: ‘pdfaProperty:valueType’ not found.

e_PDFA7_8_11 = 7811: The required namespace prefix for extension schema is ‘pdfaExtension’.

e_PDFA7_8_12 = 7812: The required field namespace prefix is ‘pdfaSchema’.

e_PDFA7_8_13 = 7813: The required field namespace prefix is ‘pdfaProperty’.

e_PDFA7_8_14 = 7814: The required field namespace prefix is ‘pdfaType’.

e_PDFA7_8_15 = 7815: The required field namespace prefix is ‘pdfaField’.

e_PDFA7_8_16 = 7816: ‘pdfaSchema:valueType’ not found.

e_PDFA7_8_17 = 7817: ‘pdfaSchema:valueType’ is using a wrong value type.

e_PDFA7_8_18 = 7818: Required property ‘valueType’ missing in PDF/A Schema Value Type.

e_PDFA7_8_19 = 7819: ‘pdfaType :type’ not found.

e_PDFA7_8_2 = 782: XMP Extension schema is not valid. Required property ‘namespaceURI’ might be missing in PDF/A Schema value Type.

e_PDFA7_8_20 = 7820: ‘pdfaType :type’ is using a wrong value type.

e_PDFA7_8_21 = 7821: ‘pdfaType:description’ not found.

e_PDFA7_8_22 = 7822: ‘pdfaType:namespaceURI’ not found.

e_PDFA7_8_23 = 7823: ‘pdfaType:field’ is using a wrong value type.

e_PDFA7_8_24 = 7824: ‘pdfaField:name’ not found.

e_PDFA7_8_25 = 7825: ‘pdfaField:name’ is using a wrong value type.

e_PDFA7_8_26 = 7826: ‘pdfaField:valueType’ not found.

e_PDFA7_8_27 = 7827: ‘pdfaField:valueType’ is using a wrong type.

e_PDFA7_8_28 = 7828: ‘pdfaField:description’ not found.

e_PDFA7_8_29 = 7829: ‘pdfaField:description’ is using a wrong type.

e_PDFA7_8_3 = 783: ‘pdfaExtension:schemas’ not found.

e_PDFA7_8_30 = 7830: Required description for ‘pdfaField::valueType’ is missing.

e_PDFA7_8_31 = 7831: A property doesn’t match its custom schema type.

e_PDFA7_8_4 = 784: ‘pdfaExtension:schemas’ is using a wrong value type.

e_PDFA7_8_5 = 785: ‘pdfaExtension:property’ not found.

e_PDFA7_8_6 = 786: ‘pdfaExtension:property’ is using a wrong value type.

e_PDFA7_8_7 = 787: ‘pdfaProperty:name’ not found.

e_PDFA7_8_8 = 788: ‘pdfaProperty:name’ is using a wrong value type.

e_PDFA7_8_9 = 789: A description for a property is missing in ‘pdfaSchema:property’ sequence.

e_PDFA8_1 = 81: FileSpec is missing F or UF key

e_PDFA8_2_2 = 822: The PDF is not marked as Tagged PDF.

e_PDFA8_3_3_1 = 8331: Bad StructTreeRoot

e_PDFA8_3_3_2 = 8332: Each structure element dictionary in the structure hierarchy must have a Type entry with the name value of StructElem.

e_PDFA8_3_4_1 = 8341: A non-standard structure type does not map to a standard type.

e_PDFA9_1 = 91: An interactive form field contains an action.

e_PDFA9_2 = 92: The NeedAppearances flag in the interactive form dictionary is set to true.

e_PDFA9_3 = 93: AcroForms contains XFA.

e_PDFA9_4 = 94: Catalog contains NeedsRendering.

e_PDFA_3E1 = 1: Embedded file has no MIME type entry

e_PDFA_3E1_1 = 101: Embedded file Params has no ModDate entry

e_PDFA_3E2 = 2: Embedded file has no AFRelationship

e_PDFA_3E3 = 3: Doc catalog is missing AF entry

e_PDFA_4_6_1_12_1 = 461121: If the Version key is present in the document catalog dictionary, the first character in its value shall be a 2 (32h) and the second character of its value shall be a PERIOD (2Eh) (decimal point). The third character shall be a decimal digit. The number of characters of the value of the Version key shall be exactly 3.

e_PDFA_4_6_1_3_4 = 46134: The Info key shall not be present in the trailer dictionary unless there exists a PieceInfo entry in the document catalog dictionary.

e_PDFA_4_6_1_3_5 = 46135: If a document information dictionary is present, it shall only contain a ModDate entry.

e_PDFA_4_6_1_6_1_3 = 461613: 3D stream shall have a Subtype entry with a value which is either U3D or PRC.

e_PDFA_4_6_2_10_6_1 = 4621061: For all non-symbolic TrueType fonts used for rendering, the embedded TrueType font program shall contain at least Microsoft Unicode (3,1 - Platform ID=3, Encoding ID=1), or Macintosh Roman (1,0 - Platform ID=1, Encoding ID=0) ‘cmap’ subtable that all necessary glyph lookups are able to be carried out

e_PDFA_4_6_2_10_6_4 = 4621064: Symbolic TrueType fonts shall not contain an Encoding entry in the font dictionary, and the ‘cmap’ subtable in the embedded font program shall either contain the Microsoft Symbol (3,0 - Platform ID=3, Encoding ID=0) or the Mac Roman (1,0 - Platform ID=1, Encoding ID=0) encoding.

e_PDFA_4_6_2_2_3 = 46223: A content stream’s named resource not defined by a resource dictionary

e_PDFA_4_6_2_4_2_3 = 462423: An ICCBased CMYK color space is identical to the current PDF/A OutputIntent color profile or the current transparency blending color space.

e_PDFA_4_6_2_5_3 = 46253: HTO entry in ExtGState.

e_PDFA_4_6_6_3_1 = 46631: A document catalog or a page dictionary contains an AA entry and its value contains key(s) not from the following list: E, X, D, U, Fo and Bl.

e_PDFA_4_6_7_3_5 = 46735: Invalid PDF/A revision.

e_PDFA_4_6_9_5 = 4695: Doc catalog is missing EmbeddedFiles key

e_PDFA_LAST = 4621065
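When reporting validation results, it can be convenient to translate numeric codes back into the descriptions listed above. The following is a minimal sketch and not part of the SDK: `ERROR_DESCRIPTIONS` and `describe_error` are hypothetical names, and the table holds only a handful of entries copied from this list.

```python
# Hypothetical lookup table (not an SDK object). The codes and descriptions
# are copied from the enum listing above; extend it with the codes you need.
ERROR_DESCRIPTIONS = {
    132: "Trailer dictionary contains Encrypt.",
    341: "The font is not embedded.",
    41: "Transparency used (ExtGState with soft mask).",
    721: "The document catalog does not contain Metadata stream.",
    822: "The PDF is not marked as Tagged PDF.",
}


def describe_error(code):
    """Return a readable description for a numeric PDF/A error code."""
    # Fall back to a generic message for codes missing from the table.
    return ERROR_DESCRIPTIONS.get(code, "Unknown PDF/A error code %d" % code)
```

A plain dictionary keeps the sketch independent of the SDK; codes that are not in the table produce a generic message instead of raising.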

property mp_pdfac

property thisown: The membership flag

class apryse_sdk.PDFAOptions(level)[source]

Bases: object

GetConformance()[source]

Gets the value of Conformance from the options object: the PDF/A conformance level.

Return type:: int
Returns:: a PDFACompliance::Conformance, the current value for Conformance.

GetDPI()[source]

Gets the value of DPI from the options object: the DPI used for flattening.

Return type:: int
Returns:: a UInt32, the current value for DPI.

GetFirstStop()[source]

Gets the value of FirstStop from the options object: whether to stop processing after the first PDF/A error is detected.

Return type:: boolean
Returns:: a bool, the current value for FirstStop.

GetFlattenTransparency()[source]

Gets the value of FlattenTransparency from the options object: whether to flatten transparency in PDF/A-1 mode.

Return type:: boolean
Returns:: a bool, the current value for FlattenTransparency.

GetMaxRefObjs()[source]

Gets the value of MaxRefObjs from the options object: the maximum number of object references per error condition.

Return type:: int
Returns:: a UInt32, the current value for MaxRefObjs.

GetPassword()[source]

Gets the value of Password from the options object: the password to be used for encrypted PDF documents.

Return type:: string
Returns:: a string, the current value for Password.

SetConformance(value)[source]

Sets the value of Conformance in the options object: the PDF/A conformance level.

Parameters:: value – the new value for Conformance
Return type:: PDFAOptions
Returns:: this object, for call chaining

SetDPI(value)[source]

Sets the value of DPI in the options object: the DPI used for flattening.

Parameters:: value – the new value for DPI
Return type:: PDFAOptions
Returns:: this object, for call chaining

SetFirstStop(value)[source]

Sets the value of FirstStop in the options object: whether to stop processing after the first PDF/A error is detected.

Parameters:: value – the new value for FirstStop
Return type:: PDFAOptions
Returns:: this object, for call chaining

SetFlattenTransparency(value)[source]

Sets the value of FlattenTransparency in the options object: whether to flatten transparency in PDF/A-1 mode.

Parameters:: value – the new value for FlattenTransparency
Return type:: PDFAOptions
Returns:: this object, for call chaining

SetMaxRefObjs(value)[source]

Sets the value of MaxRefObjs in the options object: the maximum number of object references per error condition.

Parameters:: value – the new value for MaxRefObjs
Return type:: PDFAOptions
Returns:: this object, for call chaining

SetPassword(value)[source]

Sets the value of Password in the options object: the password to be used for encrypted PDF documents.

Parameters:: value – the new value for Password
Return type:: PDFAOptions
Returns:: this object, for call chaining

property thisown: The membership flag

class apryse_sdk.PDFDoc(args)[source]

Bases: object

PDFDoc is a high-level class describing a single PDF (Portable Document Format) document. Most applications using PDFNet will use this class to open existing PDF documents, or to create new PDF documents from scratch.

The class offers a number of entry points into the document. For example,

To access pages use pdfdoc.GetPageIterator() or pdfdoc.GetPage(page_num).
To access form fields use pdfdoc.GetFieldIterator(), pdfdoc.GetFieldIterator(name) or pdfdoc.GetField(name).
To access the document’s metadata use pdfdoc.GetDocInfo().
To access the outline tree use pdfdoc.GetFirstBookmark().
To access the low-level Document Catalog use pdfdoc.GetRoot().

…

The class also offers utility methods to split and merge PDF pages, to create new pages, to flatten forms, to change security settings, etc.

AddFileAttachment(file_key, embedded_file)[source]

Associates a file attachment with the document.

The file attachment will be displayed in the user interface of a viewer application (in Acrobat this is the File Attachment tab). The function differs from Annot.CreateFileAttachment() because it associates the attachment with the whole document instead of an annotation on a specific page.

Parameters:

file_key (string) – A key/name under which the attachment will be stored.
embedded_file (FileSpec) – Embedded file stream

Notes: Another way to associate a file attachment with the document is using SDF::NameTree:

SDF::NameTree names = SDF::NameTree::Create(doc, "EmbeddedFiles");
names.Put(file_key, file_keysz, embedded_file.GetSDFObj());

AddHighlights(hilite)[source]

AddHighlights is used to highlight text in a document using ‘Adobe’s Highlight File Format’ (Technical Note #5172). The method will parse the character offset data and modify the current document by adding new highlight annotations.

Parameters:: hilite (string) – a string representing the filename for the highlight file or a data buffer containing XML data.
Raises:: An exception will be thrown if the XML file is malformed or is out of sync with the document.

AddRootBookmark(root_bookmark)[source]

Adds/links the specified Bookmark to the root level of document’s outline tree.

Parameters:: root_bookmark (Bookmark) – Bookmark to Add/link

Notes: The ‘root_bookmark’ parameter must not be linked to (must not belong to) a bookmark tree.

AddSignatureHandler(signature_handler)[source]

Adds a signature handler to the signature manager.

Parameters:: signature_handler (SignatureHandler) – The signature handler instance to add to the signature manager.
Return type:: int
Returns:: A unique ID representing the SignatureHandler within the SignatureManager.

AddStdSignatureHandler(args)[source]

Overload 1:

Adds a standard (built-in) signature handler to the signature manager. This method will use a cryptographic algorithm based on the Adobe.PPKLite/adbe.pkcs7.detached filter to sign a PDF.

Parameters:

pkcs12_file – The private key certificate store to use.
pkcs12_pass – The passphrase for the provided private key.

Return type:

int

Returns:

A unique ID representing the SignatureHandler within the SignatureManager.

Overload 2:

Adds a standard (built-in) signature handler to the signature manager. This method will use a cryptographic algorithm based on the Adobe.PPKLite/adbe.pkcs7.detached filter to sign a PDF.

Parameters:

pkcs12_keybuffer (std::vector< unsigned char,std::allocator< unsigned char > >) – The private key certificate store to use (as a data buffer in an array of bytes).
pkcs12_pass – The passphrase for the provided private key.

Return type:

int

Returns:

A unique ID representing the SignatureHandler within the SignatureManager.

AppendTextDiff(args)[source]

Overload 1:

Imports two external pages and highlights the differences between them. This function adds two new pages to the current document. The two input pages are typically coming from two different PDF files. Note: Each contiguous block of change is considered a single difference. A deletion immediately followed by an insertion is considered a single edit.

Parameters:

page1 (Page) – is the before page, the basis of the comparison (read-only)
page2 (Page) – is the after page, to which the basis is compared (read-only)

Return type:

int

Returns:

the total number of differences found

Overload 2:

Imports two external PDFs and highlights the differences between them. This function appends alternating pages from the two input documents into the current document. Note: Each contiguous block of change is considered a single difference. A deletion immediately followed by an insertion is considered a single edit.

Parameters:

doc1 (PDFDoc) – is the before document, the basis of the comparison (read-only)
doc2 (PDFDoc) – is the after document, to which the basis is compared (read-only)

Return type:

int

Returns:

the total number of differences found

Overload 3:

Imports two external PDFs and highlights the differences between them. This function appends alternating pages from the two input documents into the current document. Note: Each contiguous block of change is considered a single difference. A deletion immediately followed by an insertion is considered a single edit.

Parameters:

doc1 (PDFDoc) – is the before document, the basis of the comparison (read-only)
doc2 (PDFDoc) – is the after document, to which the basis is compared (read-only)
options (TextDiffOptions) – processing options (optional)

Return type:

int

Returns:

the total number of differences found
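The document-level overloads of AppendTextDiff differ only in the optional options argument. A minimal sketch of a wrapper that picks between them is shown below; `append_text_diff` is a hypothetical helper, and the three documents are assumed to be PDFDoc objects that are already open.

```python
def append_text_diff(out_doc, before_doc, after_doc, options=None):
    """Append a text comparison of two documents to out_doc.

    Returns the total number of differences reported by AppendTextDiff.
    """
    if options is None:
        # Overload 2: compare using the default processing options.
        return out_doc.AppendTextDiff(before_doc, after_doc)
    # Overload 3: pass a TextDiffOptions object through unchanged.
    return out_doc.AppendTextDiff(before_doc, after_doc, options)
```

The returned count treats each contiguous block of change as one difference, as described in the notes above.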

AppendVisualDiff(p1, p2, opts)[source]

Generates a PDF diff of the given pages by overlaying and blending them on top of each other.

Parameters:

p1 (Page) – one of the two pages for comparing.
p2 (Page) – the other page for comparing.
opts (DiffOptions) – options for comparison results.

Close()[source]: Close PDFDoc

CreateDigitalSignatureField(args)[source]

Creates an unsigned digital signature form field inside the document.

Parameters:: in_sig_field_name (string, optional) – The fully-qualified name to give the digital signature field. If one is not provided, a unique name is created automatically.
Return type:: DigitalSignatureField
Returns:: A DigitalSignatureField object representing the created digital signature field.

CreateIndirectArray()[source]

This method creates an SDF/Cos indirect array object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:: Obj
Returns:: Returns a new indirect array object.

CreateIndirectBool(value)[source]

This method creates an SDF/Cos indirect boolean object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:: Obj
Returns:: Returns a new indirect boolean object.
Parameters:: value (boolean) – the value with which to create the boolean object.

CreateIndirectDict()[source]

This method creates an SDF/Cos indirect dict object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:: Obj
Returns:: Returns a new indirect dict object.

CreateIndirectName(name)[source]

This method creates an SDF/Cos indirect name object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

CreateIndirectNull()[source]

This method creates an SDF/Cos indirect null object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:: Obj
Returns:: Returns a new indirect null object.

CreateIndirectNumber(value)[source]

This method creates an SDF/Cos indirect number object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:: Obj
Returns:: Returns a new indirect number object.
Parameters:: value (double) – the value with which to create the number object.

CreateIndirectStream(args)[source]

Overload 1:

This method creates an SDF/Cos indirect stream object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:

Returns:

Returns a new indirect stream object.

Parameters:

data (FilterReader) – reference to a FilterReader object with which to create the stream object.
filter_chain (Filter, optional) – filter object with which to create the stream object. Defaults to Filters::Filter(0,false)

Overload 2:

This method creates an SDF/Cos indirect stream object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:

Returns:

Returns a new indirect stream object.

Parameters:

data (string) – a buffer from which to create the stream object.
data_size (int) – size of the buffer.
filter_chain (Filter, optional) – filter object with which to create the stream object. Defaults to Filters::Filter(0,false)

Overload 3:

This method creates an SDF/Cos indirect stream object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:

Returns:

Returns a new indirect stream object.

Parameters:

data (string) – a buffer from which to create the stream object.
data_size (int) – size of the buffer.
filter_chain – filter object with which to create the stream object. Defaults to Filters::Filter(0,false)

CreateIndirectString(args)[source]

Overload 1:

This method creates an SDF/Cos indirect string object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:

Returns:

Returns a new indirect string object.

Parameters:

value (UChar) – Unsigned char pointer with which to create the string object.
size (int) – length of string.

Overload 2:

This method creates an SDF/Cos indirect string object

Unlike direct objects, indirect objects can be referenced by more than one object (i.e. indirect objects can be shared).

Return type:: Obj
Returns:: Returns a new indirect string object.
Parameters:: str (string) – reference to string with which to create the string object.
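The point of an indirect object is that it can be shared. The sketch below creates one indirect number and stores it under the same key in two dictionaries. It assumes that dictionary objects expose a `Put(key, obj)` method, which is not documented in this section; `add_shared_number` is a hypothetical helper name.

```python
def add_shared_number(doc, dict_a, dict_b, key, value):
    """Store one indirect number under `key` in two dictionary objects."""
    # The number is created once, as an indirect object in the document.
    shared = doc.CreateIndirectNumber(value)
    # Assumption: dictionary Obj instances provide Put(key, obj).
    dict_a.Put(key, shared)
    dict_b.Put(key, shared)
    return shared
```

Both dictionaries then refer to the same object, so the value is defined in one place rather than copied into each dictionary.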

static CreateInternal(impl)[source]

FDFExtract(args)[source]

Overload 1:

Extract form data and/or annotations to FDF

Parameters:

flag (int, optional) – specifies extract options

Return type:

Returns:

a pointer to the newly created FDF file with interactive data.

Overload 2:

Extract form data and/or annotations to FDF

Parameters:

pages_to_extract (PageSet) – The set of pages for which to extract interactive data.
flag (int, optional) – specifies extract options

Return type:

Returns:

a pointer to the newly created FDF file with interactive data.

Overload 3:

Extract form data and/or annotations to FDF

Parameters:

pages_to_extract (PageSet) – The set of pages for which to extract interactive data.
flag – specifies extract options

Return type:

Returns:

a pointer to the newly created FDF file with interactive data.

Overload 4:

Extract selected annotations to FDF

Parameters:: annotations (std::vector< PDF::Annot,std::allocator< PDF::Annot > >) – the annotation(s) to extract
Return type:: FDFDoc
Returns:: a pointer to the newly created FDF file with the interactive data.

Overload 5:

Extract annotations to FDF

Parameters:

annot_added (std::vector< PDF::Annot,std::allocator< PDF::Annot > >) – specifies the array of added annotations
annot_modified (std::vector< PDF::Annot,std::allocator< PDF::Annot > >) – specifies the array of modified annotations
annot_deleted (std::vector< PDF::Annot,std::allocator< PDF::Annot > >) – specifies the array of deleted annotations

Return type:

Returns:

a pointer to the newly created FDF file with interactive data.

FDFMerge(fdf_doc)[source]

Import form data from FDF file to PDF interactive form.

Parameters:: fdf_doc (FDFDoc) – a reference to the FDF file
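FDFExtract and FDFMerge can be combined to carry form data from one document to another. The following sketch assumes both arguments are open PDFDoc objects and relies on the default extract options; `copy_form_data` is a hypothetical helper name.

```python
def copy_form_data(src_doc, dst_doc):
    """Copy interactive data from src_doc into dst_doc via an FDF document."""
    # With no arguments, FDFExtract uses its default extract options.
    fdf_doc = src_doc.FDFExtract()
    # Import the extracted form data into the destination document.
    dst_doc.FDFMerge(fdf_doc)
    return fdf_doc
```

Returning the intermediate FDF document lets the caller save or inspect it if needed.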

FDFUpdate(fdf_doc)[source]

Replace existing form and annotation data with those imported from the FDF file. It will make annotations in the PDF match those in the FDF. Since this method avoids updating annotations unnecessarily, it works well with incremental save and can sometimes preserve annotation appearances, but it requires that the annotations intended to be in the final document be in the provided FDF file.

Notes: Some PDF viewers (like Chrome) cannot display annotations that don’t already have an appearance, so it is often desirable to call PDFDoc.RefreshAnnotAppearances after this method to ensure these annotations can still be displayed in those applications. This method is not suitable for realtime collaboration.

Parameters:: fdf_doc (FDFDoc) – a pointer to the FDF file
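The note about refreshing appearances can be expressed as a small wrapper. This is a sketch under the assumption that RefreshAnnotAppearances can be called without arguments; `update_from_fdf` is a hypothetical helper name.

```python
def update_from_fdf(doc, fdf_doc, refresh_appearances=True):
    """Replace form and annotation data in doc with the contents of fdf_doc."""
    doc.FDFUpdate(fdf_doc)
    if refresh_appearances:
        # Assumption: the no-argument call is valid. It regenerates
        # appearances so that viewers which require them can still display
        # the updated annotations.
        doc.RefreshAnnotAppearances()
```

The refresh step is optional here because regenerating appearances may not be wanted when existing appearances should be preserved.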

FieldCreate(args)[source]

Overload 1:

Create a new interactive form Field.

Parameters:

field_name (string) – a string representing the fully qualified name of the field (e.g. “employee.name.first”). field_name must be either a unique name or equal to an existing terminal field name.
type (int) – field type (e.g. Field::e_text, Field::e_button, etc.)
field_value (Obj, optional) –
def_field_value (Obj, optional) –

Return type:

Returns:

the new form Field.

Raises:

if ‘field_name’ is equal to an existing non-terminal field name an exception is thrown.

Overload 2:

Create a new interactive form Field.

Parameters:

field_name (string) – a string representing the fully qualified name of the field (e.g. “employee.name.first”). field_name must be either a unique name or equal to an existing terminal field name.
type (int) – field type (e.g. Field::e_text, Field::e_button, etc.)
field_value (string) –
def_field_value (string, optional) –

Return type:

Returns:

the new form Field.

Raises:

if ‘field_name’ is equal to an existing non-terminal field name an exception is thrown.

Overload 3:

Create a new interactive form Field.

Parameters:

field_name (string) – a string representing the fully qualified name of the field (e.g. “employee.name.first”). field_name must be either a unique name or equal to an existing terminal field name.
type (int) – field type (e.g. Field::e_text, Field::e_button, etc.)
field_value (string) –
def_field_value –

Return type:

Returns:

the new form Field.

Raises:

if ‘field_name’ is equal to an existing non-terminal field name an exception is thrown.

FlattenAnnotations(forms_only=False)[source]

Flatten all annotations in the document.

Parameters:: forms_only (boolean, optional) – if false flatten all annotations, otherwise flatten only form fields.

GenerateThumbnails(size)[source]

Generates thumbnail images for all the pages in this PDF document.

Parameters:: size (int) – The maximum dimension (width or height) that thumbnails will have.

GetAcroForm()[source]

Return type:: Obj
Returns:: the AcroForm dictionary located in “/Root” or NULL if dictionary is not present.

GetDigitalSignatureFieldIterator()[source]

Retrieves an iterator that iterates over digital signature fields.

Return type:: DigitalSignatureFieldIterator
Returns:: An iterator that iterates over digital signature fields.

GetDigitalSignaturePermissions()[source]

Retrieves the most restrictive document permissions locking level from all of the signed digital signatures in the document.

Return type:: int
Returns:: An enumerated value representing the most restrictive document permission level found in the document.

GetDocInfo()[source]

Return type:: PDFDocInfo
Returns:: The class representing document information metadata. (i.e. entries in the document information dictionary).

GetDownloadedByteCount()[source]

Returns the number of bytes that have been downloaded, when HasDownloader() is True.

Return type:: int
Returns:: The number of bytes downloaded.
Raises:: if HasDownloader() returns False, calling this method will result in an exception.
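Since the method raises when no downloader is attached, callers may want to guard the call. A minimal sketch, with `downloaded_byte_count` as a hypothetical helper name:

```python
def downloaded_byte_count(doc):
    """Return the downloaded byte count, or None if doc has no downloader."""
    # GetDownloadedByteCount raises unless HasDownloader() is True.
    if not doc.HasDownloader():
        return None
    return doc.GetDownloadedByteCount()
```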

GetField(field_name)[source]

Parameters:

field_name (string) – a string representing the fully qualified name of the field (e.g. “employee.name.first”).

Return type:

Returns:

a FieldIterator referring to an interactive Field or to invalid field if the field name was not found. If a given field name was not found itr.HasNext() will return false. For example:

FieldIterator itr = pdfdoc.GetFieldIterator("name");
if (itr.HasNext()) {
  Console.WriteLine("Field name: {0}", itr.Current().GetName());
}
else { /* field was not found */ }
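The same lookup can be sketched in Python using only the calls shown in the example above (GetFieldIterator, HasNext, Current). `find_field` is a hypothetical helper name, and `doc` is assumed to be an open PDFDoc.

```python
def find_field(doc, field_name):
    """Return the matching Field, or None if the name is not found."""
    itr = doc.GetFieldIterator(field_name)
    # HasNext() is false when no field with this name exists.
    if not itr.HasNext():
        return None
    return itr.Current()
```

Returning None for a missing field lets callers use an ordinary `is None` check instead of inspecting the iterator themselves.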

GetFieldIterator(args)[source]

Overload 1:

An interactive form (sometimes referred to as an AcroForm) is a collection of fields for gathering information interactively from the user. A PDF document may contain any number of fields appearing on any combination of pages, all of which make up a single, global interactive form spanning the entire document.

The following methods are used to access and manipulate Interactive form fields (sometimes referred to as AcroForms).

Return type:: FieldIterator
Returns:: an iterator to the first Field in the document.

The list of all Fields present in the document can be traversed as follows:

FieldIterator itr = pdfdoc.GetFieldIterator();
for(; itr.HasNext(); itr.Next()) {
  Field field = itr.Current();
  Console.WriteLine("Field name: {0}", field.GetName());
}

For a sample, please refer to ‘InteractiveForms’ sample project.

Overload 2:

An interactive form (sometimes referred to as an AcroForm) is a collection of fields for gathering information interactively from the user. A PDF document may contain any number of fields appearing on any combination of pages, all of which make up a single, global interactive form spanning the entire document.

The following methods are used to access and manipulate Interactive form fields (sometimes referred to as AcroForms).

Parameters:: field_name (string) – String representing the name of the field to get.
Return type:: FieldIterator
Returns:: an iterator to the Field in the document.

For a sample, please refer to ‘InteractiveForms’ sample project.

GetFileName()[source]

Return type:: string
Returns:: The filename of the document if the document is loaded from disk, or empty string if the document is not yet saved or is loaded from a memory buffer.

GetFirstBookmark()[source]

Return type:

Bookmark

Returns:

the first Bookmark from the document’s outline tree. If the Bookmark tree is empty, the underlying SDF/Cos Object is null and the returned Bookmark is not valid (i.e. Bookmark::IsValid() returns false).

GetHandleInternal()[source]

GetOCGConfig()[source]

Return type:: Config
Returns:: the default optional-content configuration for the document from the OCProperties D entry.

GetOCGs()[source]

Return type:: Obj
Returns:: the Obj array that contains optional-content groups (OCGs) for the document, or NULL if the document does not contain any OCGs. The order of the groups is not guaranteed to be the creation order, and is not the same as the display order.

GetOpenAction()[source]

Return type:: Action
Returns:: Action that is triggered when the document is opened. The returned action can be either a destination or some other kind of Action (see Section 8.5, ‘Actions’ in PDF Reference Manual).

Notes: if the document does not have an associated action, the returned Action will be null (i.e. Action.IsValid() returns false).

GetPage(page_number)[source]

Parameters:

page_number (int) – the page number in the document’s page sequence. Page numbers in the document’s page sequence are indexed from 1.

Return type:

Page

Returns:

a Page corresponding to a given page number, or null (invalid page) if the document does not contain the given page number.

For example:

Page page = pdfdoc.GetPage(page_num);
if (page == null) return; //  Page not found

GetPageCount()[source]

Return type:: int
Returns:: the number of pages in the document.

GetPageIterator(page_number=1)[source]

Use the Next() method on the returned iterator to traverse all pages in the document. For example:

PageIterator itr = pdfdoc.GetPageIterator();
while (itr.HasNext()) { //  Read every page
   Page page = itr.Current();
   // ...
   itr.Next();
}

For full sample code, please take a look at ElementReader, PDFPageTest and PDFDraw sample projects.

Return type:: PageIterator
Returns:: an iterator to the first page in the document.
Parameters:: page_number (int, optional) – page to set the iterator on. 1 corresponds to the first page.
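
The HasNext()/Current()/Next() protocol used in the example above maps naturally onto a Python generator. The following is a minimal sketch and not part of the SDK: iterate() is a hypothetical helper, and FakeIterator is a stand-in defined only so the sketch runs without the SDK installed. It assumes the iterator exposes exactly the three methods used above.

```python
def iterate(itr):
    """Yield each item from an iterator exposing HasNext(), Current() and Next()."""
    while itr.HasNext():
        yield itr.Current()
        itr.Next()


class FakeIterator:
    """Hypothetical stand-in for a PageIterator; not part of the SDK."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def HasNext(self):
        return self._pos < len(self._items)

    def Current(self):
        return self._items[self._pos]

    def Next(self):
        self._pos += 1


# Collect every "page" by driving the iterator through the generator.
pages = list(iterate(FakeIterator(["page 1", "page 2", "page 3"])))
```

With a real document, iterate(pdfdoc.GetPageIterator()) would presumably be used the same way, provided the returned iterator behaves as documented here.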

GetPageLabel(page_num)[source]

Return type:: PageLabel
Returns:: the PageLabel that is in effect for the given page. If there is no label object in effect, this method returns an invalid page label object.
Parameters:: page_num (int) – The page number. Because PDFNet indexes pages starting from 1, page_num must be larger than 0.

GetPages()[source]

Return type:

Returns:

A dictionary representing the root of the low level page-tree

GetRoot()[source]

Return type:

Returns:

A dictionary representing the Cos root of the document (/Root entry within the trailer dictionary)

GetSDFDoc()[source]

Return type:: SDFDoc
Returns:: document’s SDF/Cos document

GetSecurityHandler()[source]

Return type:: SecurityHandler
Returns:: Currently selected SecurityHandler.

Notes: InitSecurityHandler() should be called before GetSecurityHandler() in order to initialize the handler.

Returned security handler can be modified in order to change the security settings of the existing document. Changes to the current handler will not invalidate the access to the original file and will take effect during document Save().

If the security handler is modified, document will perform a full save even if e_incremental was given as a flag in Save() method.

GetSignatureHandler(signature_handler_id)[source]

Gets the associated signature handler instance from the signature manager by looking it up with the handler name.

Parameters:: signature_handler_id (int) – The unique id of the signature handler to get.
Return type:: SignatureHandler
Returns:: The signature handler instance if found, otherwise NULL.

GetStructTree()[source]

Return type:: STree
Returns:: The document’s logical structure tree root.

GetTotalRemoteByteCount()[source]

Returns the document’s total size in bytes, when HasDownloader() is True.

Return type:: int
Returns:: The total number of bytes in the remote document.
Raises:: If HasDownloader() returns False, calling this method will result in an exception.

GetTrailer()[source]

Return type:

Returns:

A dictionary representing the Cos root of the document (document’s trailer)

GetTriggerAction(trigger)[source]

Get the Action associated with the selected Doc Trigger event.

Parameters:: trigger (int) – the type of trigger event to get
Return type:: Obj
Returns:: the Action Obj if present, otherwise NULL

GetUndoManager()[source]

Return type:: UndoManager
Returns:: The UndoManager object (one-to-one mapped to document)

GetViewPrefs()[source]

Return type:: PDFDocViewPrefs
Returns:: Viewer preferences for this document.

PDFDocViewPrefs is a high-level utility class that can be used to control the way the document is to be presented on the screen or in print.

HasDownloader()[source]

Indicates whether this document was created via the PDFViewCtrl method OpenURLAsync.

Return type:: boolean
Returns:: True if the document was created via the PDFViewCtrl method OpenURLAsync; False otherwise.

HasOC()[source]

Return type:: boolean
Returns:: true if the optional content (OC) feature is associated with the document. The document is considered to have optional content if there is an OCProperties dictionary in the document’s catalog, and that dictionary has one or more entries in the OCGs array.

HasRepairedXRef()[source]

Checks whether or not the underlying file has an XRef table that had to be repaired when the file was opened. If the document had an invalid XRef table when opened, PDFNet will have repaired the XRef table for its working representation of the document.

Return type:

boolean

Returns:

true if the document was found to be corrupted, and was repaired, during opening and has not been saved since.

Notes: - If this function returns true, it is not possible to incrementally save the document (see http://www.pdftron.com/kb_corrupt_xref)

HasSignatures()[source]

Indicates whether this document contains any digital signatures.

Return type:: boolean
Returns:: True if a digital signature is found in this PDFDoc.

static HighlightTextDiff(doc1, doc2, options)[source]

Imports two external PDFs and highlights the differences between them. This function directly adds the highlights to the two input documents. Note: Each contiguous block of change is considered a single difference. A deletion immediately followed by an insertion is considered a single edit.

Parameters:

doc1 (PDFDoc) – is the before document, the basis of the comparison
doc2 (PDFDoc) – is the after document, to which the basis is compared
options (TextDiffOptions) – processing options (optional)

Return type:

int

Returns:

the total number of differences found
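
The counting rule described for this method (each contiguous block of change is one difference, and a deletion immediately followed by an insertion is one edit) can be illustrated with Python's standard difflib module. This is only a sketch of the counting rule applied to word lists. It is not the SDK's comparison algorithm, and count_differences is a hypothetical helper name.

```python
import difflib


def count_differences(before_words, after_words):
    """Count contiguous changed blocks between two word sequences.

    difflib reports a deletion that is directly followed by an insertion
    as a single 'replace' block, so it is counted once.
    """
    matcher = difflib.SequenceMatcher(a=before_words, b=after_words, autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


before = "the quick brown fox".split()
after = "the slow brown fox jumps".split()

# One replaced word ("quick" -> "slow") plus one appended word ("jumps").
difference_count = count_differences(before, after)
```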

ImportPages(pages, import_bookmarks=False)[source]

The function imports a list of pages to this document. Although a list of pages can be imported using repeated calls to PageInsert(), ImportPages will not import duplicate copies of resources that are shared across pages (such as fonts, images, colorspaces etc.). Therefore this method is recommended when a page import list consists of several pages that share the same resources.

Parameters:

pages (std::vector< PDF::Page,std::allocator< PDF::Page > >) – A list of pages to import. All pages should belong to the same source document.
import_bookmarks (boolean, optional) – An optional flag specifying whether any bookmark items pointing to pages in the import list should be merged with the target (i.e. this) document.

Return type:

std::vector< PDF::Page,std::allocator< PDF::Page > >

Returns:

a list of imported pages. Note that imported pages are not placed in the document page sequence. This can be done using methods such as PageInsert(), PagePushBack(), etc.

InitSecurityHandler()[source]

Initializes document’s SecurityHandler. This version of InitSecurityHandler() works with Standard and Custom PDF security and can be used in situations where the password is obtained dynamically via user feedback. See EncTest sample for example code.

This function should be called immediately after an encrypted document is opened. The function does not have any side effects on documents that are not encrypted.

If the security handler was successfully initialized it can be later obtained using GetSecurityHandler() method.

Raises:: An exception is thrown if the matching handler for document’s security was not found in the global SecurityManager. In this case, you need to register additional custom security handlers with the global SecurityManager (using SecurityManagerSingleton).
Return type:: boolean
Returns:: true if the SecurityHandler was successfully initialized (this may include authentication data collection, verification etc.), false otherwise.
Parameters:: custom_data – An optional parameter used to specify custom data that should be passed in SecurityHandler::Initialize() callback.

InitStdSecurityHandler(args)[source]

Overload 1:

Initializes document’s SecurityHandler using the supplied password. This version of InitSecurityHandler() assumes that document uses Standard security and that a password is specified directly.

This function should be called immediately after an encrypted document is opened. The function does not have any side effects on documents that are not encrypted.

If the security handler was successfully initialized, it can be later obtained using GetSecurityHandler() method.

Return type:

boolean

Returns:

true if the given password successfully unlocked the document, false otherwise.

Raises:

An exception is thrown if the document’s security Filter is not ‘Standard’. In this case, you need to register additional custom security handlers with the global SecurityManager (SecurityManagerSingleton).

Parameters:

password (string) – Specifies the password used to open the document without any user feedback. If you would like to dynamically obtain the password, you need to derive a custom class from StdSecurityHandler() and use InitSecurityHandler() without any parameters. See EncTest sample for example code.
password_sz (int) – An optional parameter used to specify the size of the password buffer, in bytes. If the ‘password_sz’ is 0, or if the parameter is not specified, the function assumes that the string is null terminated.

Remarks: Deprecated. Use the versions that accept a UString or a buffer instead.

Overload 2:

Initializes document’s SecurityHandler using the supplied password. This version of InitSecurityHandler() assumes that document uses Standard security and that a password is specified directly.

This function should be called immediately after an encrypted document is opened. The function does not have any side effects on documents that are not encrypted.

If the security handler was successfully initialized, it can be later obtained using GetSecurityHandler() method.

Parameters:: password (string) – Specifies the password used to open the document without any user feedback. If you would like to dynamically obtain the password, you need to derive a custom class from StdSecurityHandler() and use InitSecurityHandler() without any parameters. See EncTest sample for example code.
Return type:: boolean
Returns:: true if the given password successfully unlocked the document, false otherwise.
Raises:: An exception is thrown if the document’s security Filter is not ‘Standard’. In this case, you need to register additional custom security handlers with the global SecurityManager (SecurityManagerSingleton).

Overload 3:

Initializes document’s SecurityHandler using the supplied password. This version of InitSecurityHandler() assumes that document uses Standard security and that a password is specified directly.

This function should be called immediately after an encrypted document is opened. The function does not have any side effects on documents that are not encrypted.

If the security handler was successfully initialized, it can be later obtained using GetSecurityHandler() method.

Parameters:: password_buf (std::vector< int,std::allocator< int > >) – Specifies the password used to open the document without any user feedback. If you would like to dynamically obtain the password, you need to derive a custom class from StdSecurityHandler() and use InitSecurityHandler() without any parameters. See EncTest sample for example code.
Return type:: boolean
Returns:: true if the given password successfully unlocked the document, false otherwise.
Raises:: An exception is thrown if the document’s security Filter is not ‘Standard’. In this case, you need to register additional custom security handlers with the global SecurityManager (SecurityManagerSingleton).

InsertPages(args)[source]

Overload 1:

Inserts a range of pages from specified PDFDoc

Parameters:

insert_before_page_number (int) – the destination of the insertion. If less than or equal to 1, the pages are added to the beginning of the document. If larger than the number of pages in the destination document, the pages are appended to the document.
src_doc (PDFDoc) – source PDFDoc to insert from
start_page (int) – start of the page number to insert
end_page (int) – end of the page number to insert
flag (int) – specifies insert options
progress – A pointer to the progress interface. NULL if progress tracking is not required.

Overload 2:

Inserts a range of pages from specified PDFDoc using PageSet

Parameters:

insert_before_page_number (int) – the destination of the insertion. If less than or equal to 1, the pages are added to the beginning of the document. If larger than the number of pages in the destination document, the pages are appended to the document.
src_doc (PDFDoc) – source PDFDoc to insert from
source_page_set (PageSet) – a collection of the page number to insert
flag (int) – specifies insert options
progress – A pointer to the progress interface. NULL if progress tracking is not required.
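
The placement rule for insert_before_page_number can be modelled on plain Python lists. This is a sketch of the documented clamping behaviour only and does not call the SDK. insert_pages_model is a hypothetical name, and the sketch assumes start_page and end_page are 1-based and inclusive, which the text above does not state explicitly.

```python
def insert_pages_model(dest_pages, insert_before_page_number, src_pages,
                       start_page, end_page):
    """Return a new page list with src_pages[start_page..end_page] inserted.

    Page numbers are 1-based. A destination of 1 or less inserts at the
    beginning; a destination past the last page appends at the end.
    """
    selected = src_pages[start_page - 1:end_page]  # assumed inclusive range
    clamped = min(max(insert_before_page_number, 1), len(dest_pages) + 1)
    index = clamped - 1  # convert to a 0-based list index
    return dest_pages[:index] + selected + dest_pages[index:]


dest = ["d1", "d2", "d3"]
src = ["s1", "s2", "s3", "s4"]

in_middle = insert_pages_model(dest, 2, src, 2, 3)
at_front = insert_pages_model(dest, 0, src, 1, 1)
at_end = insert_pages_model(dest, 99, src, 4, 4)
```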

IsEncrypted()[source]

Return type:: boolean
Returns:: true if the document is/was originally encrypted, false otherwise.

IsLinearized()[source]

Call this function to determine whether the document is represented in linearized (fast web view) format.

Return type:

boolean

Returns:

true if document is stored in fast web view format, false otherwise.

Notes: any changes to the document can invalidate linearization. The function will return ‘true’ only if the original document is linearized and if it is not modified.

In order to provide good performance over relatively slow communication links, PDFNet can generate PDF documents with linearized objects and hint tables that can allow a PDF viewer application to download and view one page of a PDF file at a time, rather than requiring the entire file (including fonts and images) to be downloaded before any of it can be viewed.

To save a document in linearized (fast web view) format you only need to pass ‘Doc.SaveOptions.e_linearized’ flag in the Save method.

IsModified()[source]

Call this function to determine whether the document has been modified since it was last saved.

Return type:

boolean

Returns:

true if document was modified, false otherwise

IsTagged()[source]

Return type:: boolean
Returns:: true if this document is marked as Tagged PDF, false otherwise.

Lock()[source]: Locks the document to prevent competing threads from accessing the document at the same time. Threads attempting to access the document will wait in suspended state until the thread that owns the lock calls doc.Unlock().

LockRead()[source]: Locks the document to prevent competing write threads (using Lock()) from accessing the document at the same time. Other reader threads however, will be allowed to access the document. Threads attempting to obtain write access to the document will wait in suspended state until the thread that owns the lock calls doc.UnlockRead(). Note: To avoid deadlocks obtaining a write lock while holding a read lock is not permitted and will throw an exception. If this situation is encountered please either unlock the read lock before the write lock is obtained or acquire a write lock (rather than read lock) in the first place.

MergeXFDF(args)[source]

Overload 1:

Merges existing form and annotation data with those imported from the XFDF file. Annotations in the PDF document are replaced with matching annotations from the XFDF. For annotations to be considered matching, the “name” of the XFDF annotation must match the “NM” of the annotation in the PDF. XFDF annotations that have no match in the PDF document are added. For regular XFDF files, no deletions are made. This method also supports the command form of XFDF; for those files, annotations listed in the “delete” section are deleted. Because this method avoids updating annotations unnecessarily, it works well with incremental save. Note: This method is suitable for realtime collaboration.

Parameters:

stream (Filter) – Input Filter which provides the xfdf contents
opts (MergeXFDFOptions, optional) – MergeXFDFOptions object for finer control

Raises:

PDFNetException

Overload 2:

Merges existing form and annotation data with those imported from the XFDF file. Annotations in the PDF document are replaced with matching annotations from the XFDF. For annotations to be considered matching, the “name” of the XFDF annotation must match the “NM” of the annotation in the PDF. XFDF annotations that have no match in the PDF document are added. For regular XFDF files, no deletions are made. This method also supports the command form of XFDF; for those files, annotations listed in the “delete” section are deleted. Because this method avoids updating annotations unnecessarily, it works well with incremental save. Note: This method is suitable for realtime collaboration.

Parameters:

xfdf (string) – xfdf contents in string form or the path to the xfdf file
opts (MergeXFDFOptions, optional) – MergeXFDFOptions object for finer control

Raises:

PDFNetException

Overload 3:

Merges existing form and annotation data with those imported from the XFDF file. Annotations in the PDF document are replaced with matching annotations from the XFDF. For annotations to be considered matching, the “name” of the XFDF annotation must match the “NM” of the annotation in the PDF. XFDF annotations that have no match in the PDF document are added. For regular XFDF files, no deletions are made. This method also supports the command form of XFDF; for those files, annotations listed in the “delete” section are deleted. Because this method avoids updating annotations unnecessarily, it works well with incremental save. Note: This method is suitable for realtime collaboration.

Parameters:

xfdf (string) – xfdf contents in string form or the path to the xfdf file
opts – MergeXFDFOptions object for finer control

Raises:

PDFNetException
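
The matching rules described for MergeXFDF (replace on a name match, add when there is no match, delete only in the command form) can be modelled with plain dictionaries keyed by annotation name. This is a conceptual sketch only. It does not parse XFDF or call the SDK, and merge_annotations_model and its arguments are hypothetical.

```python
def merge_annotations_model(pdf_annots, xfdf_annots, deleted_names=()):
    """Model the documented merge rules on dicts keyed by annotation name.

    pdf_annots:    {NM value: annotation data} currently in the PDF
    xfdf_annots:   {name: annotation data} imported from the XFDF
    deleted_names: names from a command-form "delete" section (empty for
                   regular XFDF, which never deletes anything)
    """
    merged = dict(pdf_annots)
    merged.update(xfdf_annots)      # matches are replaced, the rest are added
    for name in deleted_names:
        merged.pop(name, None)      # deleting an unknown name is a no-op here
    return merged


pdf = {"a1": "old note", "a2": "highlight"}
xfdf = {"a1": "edited note", "a3": "new stamp"}

regular = merge_annotations_model(pdf, xfdf)
command = merge_annotations_model(pdf, xfdf, deleted_names=["a2"])
```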

MovePages(args)[source]

Overload 1:

Moves a range of pages from specified PDFDoc. Pages are deleted from source document after move.

Parameters:

move_before_page_number (int) – the destination of the move. If less than or equal to 1, the pages are moved to the beginning of the document. If larger than the number of pages in the destination document, the pages are moved to the end of the document.
src_doc (PDFDoc) – source PDFDoc to move from
start_page (int) – start of the page number to move
end_page (int) – end of the page number to move
flag (int) – specifies insert options
progress – A pointer to the progress interface. NULL if progress tracking is not required.

Notes: The MovePages function does not save src_doc. It merely deletes pages in memory. For permanent changes, PDFDoc::Save should be used to save src_doc after the function exits.

Overload 2:

Moves a range of pages from specified PDFDoc. Pages are deleted from source document after move.

Parameters:

move_before_page_number (int) – the destination of the move. If less than or equal to 1, the pages are moved to the beginning of the document. If larger than the number of pages in the destination document, the pages are moved to the end of the document.
src_doc (PDFDoc) – source PDFDoc to move from
source_page_set (PageSet) – a collection of the page number to move
flag (int) – specifies insert options
progress – A pointer to the progress interface. NULL if progress tracking is not required.

Notes: The MovePages function does not save src_doc. It merely deletes pages in memory. For permanent changes, PDFDoc::Save should be used to save src_doc after the function exits.

PageCreate(args)[source]

Create a new, empty page in the document. You can use PageWriter to fill the page with new content. Finally the page should be inserted at specific place within document page sequence using PageInsert/PagePushFront/PagePushBack methods.

Return type:: Page
Returns:: A new, empty page.

Notes: the new page still does not belong to document page sequence and should be subsequently placed at a specific location within the sequence.

Parameters:: media_box (Rect, optional) – A rectangle, expressed in default user space units, defining the boundaries of the physical medium on which the page is intended to be displayed or printed. A user space unit is 1/72 of an inch. If media_box is not specified, the default dimensions of the page are 8.5 x 11 inches (or 8.5*72 x 11*72 units, i.e. 612 x 792).

The following is a listing of some standard U.S. page sizes:

Letter = Rect(0, 0, 612, 792)
Legal = Rect(0, 0, 612, 1008)
Ledger = Rect(0, 0, 1224, 792)
Tabloid = Rect(0, 0, 792, 1224)
Executive = Rect(0, 0, 522, 756)

The following is a listing of ISO standard page sizes:

4A0 = Rect(0, 0, 4768, 6741)
2A0 = Rect(0, 0, 3370, 4768)
A0 = Rect(0, 0, 2384, 3370)
A1 = Rect(0, 0, 1684, 2384)
A2 = Rect(0, 0, 1191, 1684)
A3 = Rect(0, 0, 842, 1191)
A4 = Rect(0, 0, 595, 842)
A5 = Rect(0, 0, 420, 595)
A6 = Rect(0, 0, 298, 420)
A7 = Rect(0, 0, 210, 298)
A8 = Rect(0, 0, 147, 210)
A9 = Rect(0, 0, 105, 147)
A10 = Rect(0, 0, 74, 105)
B0 = Rect(0, 0, 2835, 4008)
B1 = Rect(0, 0, 2004, 2835)
B2 = Rect(0, 0, 1417, 2004)
B3 = Rect(0, 0, 1001, 1417)
B4 = Rect(0, 0, 709, 1001)
B5 = Rect(0, 0, 499, 709)
B6 = Rect(0, 0, 354, 499)
B7 = Rect(0, 0, 249, 354)
B8 = Rect(0, 0, 176, 249)
B9 = Rect(0, 0, 125, 176)
B10 = Rect(0, 0, 88, 125)
C0 = Rect(0, 0, 2599, 3677)
C1 = Rect(0, 0, 1837, 2599)
C2 = Rect(0, 0, 1298, 1837)
C3 = Rect(0, 0, 918, 1298)
C4 = Rect(0, 0, 649, 918)
C5 = Rect(0, 0, 459, 649)
C6 = Rect(0, 0, 323, 459)
C7 = Rect(0, 0, 230, 323)
C8 = Rect(0, 0, 162, 230)
C9 = Rect(0, 0, 113, 162)
C10 = Rect(0, 0, 79, 113)
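
The values in the listings above follow from the unit definition (one user space unit is 1/72 of an inch). The sketch below shows the arithmetic for sizes given in inches or millimetres. The helper names are hypothetical, the results are plain tuples rather than SDK Rect objects, and rounding to the nearest whole unit is an assumption that happens to reproduce the listed values checked here.

```python
POINTS_PER_INCH = 72   # one user space unit is 1/72 of an inch
MM_PER_INCH = 25.4


def media_box_from_inches(width_in, height_in):
    """Return (x1, y1, x2, y2) in user space units for a size in inches."""
    return (0, 0,
            round(width_in * POINTS_PER_INCH),
            round(height_in * POINTS_PER_INCH))


def media_box_from_mm(width_mm, height_mm):
    """Return (x1, y1, x2, y2) in user space units for a size in millimetres."""
    return media_box_from_inches(width_mm / MM_PER_INCH, height_mm / MM_PER_INCH)


letter = media_box_from_inches(8.5, 11)   # US Letter
legal = media_box_from_inches(8.5, 14)    # US Legal
a4 = media_box_from_mm(210, 297)          # ISO A4
a3 = media_box_from_mm(297, 420)          # ISO A3
```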

PageInsert(where, page)[source]

Insert/Import a single page at a specific location in the page sequence.

Parameters:

where (PageIterator) – The location in the page sequence indicating where to insert the page. The page is inserted before the specified location.
page (Page) – A page to insert.

Notes: Invalidates all PageIterators pointing to the document.

PagePushBack(page)[source]

Adds a page to the end of a document’s page sequence.

Parameters:

page (Page) – a page to append to the document

Notes: Invalidates all PageIterators pointing to the document.

PagePushFront(page)[source]

Adds a page to the beginning of a document’s page sequence.

Parameters:

page (Page) – a page to prepend to the document

Notes: Invalidates all PageIterators pointing to the document.

PageRemove(page_itr)[source]

Parameters:

page_itr (PageIterator) – the PageIterator to the page that should be removed

A PageIterator for the given page can be obtained using PDFDoc::GetPageIterator(page_num) or using direct iteration through document’s page sequence.

RefreshAnnotAppearances(options=None)[source]

Generates the appearance stream for annotations in the document using the specified options. A common use case is to generate appearances only for missing annotations, which can be accomplished using the default options.

Parameters:: options (RefreshOptions, optional) – Options that can be used to adjust this generation process.

RefreshFieldAppearances()[source]: Regenerates the appearance stream for every widget annotation in the document Call this method if you modified field’s value and would like to update field’s appearances.

RemovePageLabel(page_num)[source]

Removes the page label that is attached to the specified page, effectively merging the specified range with the previous page label sequence.

Parameters:: page_num (int) – The page from which the page label is removed. Because PDFNet indexes pages starting from 1, page_num must be larger than 0.

RemoveSecurity()[source]: This function removes document security.

RemoveSignatureHandler(signature_handler_id)[source]

Removes a signature handler from the signature manager.

Parameters:: signature_handler_id (int) – The unique id of the signature handler to remove.

Save(args)[source]

Overload 1:

Saves the document to a file.

If a full save is requested to the original path, the file is saved to a file system-determined temporary file, the old file is deleted, and the temporary file is renamed to path.

A full save with the remove-unused or linearization option may rearrange objects in the cross reference table. Therefore all pointers and references to document objects and resources should be reacquired in order to continue document editing.

In order to use incremental save the specified path must match original path and e_incremental flag bit should be set.

Parameters:

path (string) – The full path name to which the file is saved.
flags (int) – A bit field composed of an OR of SDFDoc::SaveOptions values.

Raises:

if the file can’t be opened for saving or if there is a problem during Save an Exception object will be thrown.

Notes:

- Save will modify the PDFDoc object’s internal representation. As such, the user should acquire a write lock before calling Save.
- If the original PDF has a corrupt xref table (see HasRepairedXRef), then it cannot be saved using the e_incremental flag.

Overload 2:

Saves the document to a stream.

Parameters:

stream (Filter) – The output stream where to write data.
flags (int) – A bit field composed of an OR of the SDFDoc::SaveOptions values.

Raises:

if there is a problem during Save an Exception object will be thrown.

Notes:

- Save will modify the PDFDoc object’s internal representation. As such, the user should acquire a write lock before calling Save.
- If the original PDF has a corrupt xref table (see HasRepairedXRef), then it cannot be saved using the e_incremental flag.
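
The flags argument is a bit field, so options are combined with bitwise OR and tested with bitwise AND. The sketch below illustrates that mechanism together with the incremental-save conditions stated above. SaveFlag is a hypothetical stand-in, and its numeric values are placeholders chosen for illustration. Real code should use the SaveOptions constants exported by the SDK rather than these numbers.

```python
from enum import IntFlag


class SaveFlag(IntFlag):
    """Hypothetical stand-in for SDFDoc::SaveOptions; values are placeholders."""
    INCREMENTAL = 1
    REMOVE_UNUSED = 2
    LINEARIZED = 16


def can_save_incrementally(original_path, target_path, flags, has_repaired_xref):
    """Apply the documented conditions for an incremental save.

    The incremental bit must be set, the target path must match the
    original path, and the xref table must not have been repaired.
    """
    return (
        bool(flags & SaveFlag.INCREMENTAL)
        and target_path == original_path
        and not has_repaired_xref
    )


# Combine two options into one bit field with bitwise OR.
full_save_flags = SaveFlag.REMOVE_UNUSED | SaveFlag.LINEARIZED
```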

SaveCustomSignature(args)[source]

Overload 1:

Saves a custom signature Contents to a document which has been prepared to receive it. No changes should be made to the document in the meantime.

Parameters:

in_signature (std::vector< UChar,std::allocator< UChar > >) – The signature Contents to write
in_field (DigitalSignatureField) – The signature field to which to write
in_path (string) – The full path name to which the file is saved.

Raises:

if there is a problem during Save an Exception object will be thrown.

Overload 2:

Saves a custom signature Contents to a document which has been prepared to receive it. No changes should be made to the document in the meantime.

Parameters:

in_signature (std::vector< UChar,std::allocator< UChar > >) – The signature Contents to write
in_field (DigitalSignatureField) – The signature field to which to write
out_stream (Filter) – The output stream where to write data.

Raises:

if there is a problem during Save an Exception object will be thrown.

SaveViewerOptimized(args)[source]

SetOpenAction(action)[source]

Sets the Action that will be triggered when the document is opened.

Parameters:: action (Action) – A new Action that will be triggered when the document is opened. An example of such action is a GoTo Action that takes the user to a given location in the document.

SetPageLabel(page_num, label)[source]

Attaches a label to a page. This establishes the numbering scheme for that page and all following it, until another page label is encountered. This label allows PDF producers to define a page numbering system other than the default.

Parameters:: page_num (int) – The number of the page to label. If page_num is less than 1 or greater than the number of pages in the document, the method does nothing.
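
Since a label applies to its page and to all following pages until another label is encountered, finding the label in effect for a page amounts to finding the nearest labelled page at or before it. The sketch below models that lookup on a list of labelled page numbers. It is an illustration of the rule, not SDK code, and label_start_for is a hypothetical helper.

```python
import bisect


def label_start_for(labelled_pages, page_num):
    """Return the labelled page whose label is in effect for page_num.

    labelled_pages holds the 1-based page numbers that carry a label.
    Returns None when no label is in effect (no labelled page at or
    before page_num).
    """
    starts = sorted(labelled_pages)
    position = bisect.bisect_right(starts, page_num)
    return starts[position - 1] if position else None


labelled = [1, 5, 9]   # e.g. front matter, body and appendix each start a range
```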

SetSecurityHandler(handler)[source]

The function sets a new SecurityHandler as the current security handler.

Parameters:: handler (SecurityHandler) – new SecurityHandler

Notes: Setting a new security handler will not invalidate the access to the original file and will take effect during document Save().

If the security handler is modified, document will perform a full save even if e_incremental was given as a flag in Save() method.

TryLock(milliseconds=0)[source]

Try locking the document, waiting no longer than specified number of milliseconds.

Parameters:

milliseconds (int, optional) – max number of milliseconds to wait for the document to lock

Return type:

boolean

Returns:

true if the document is locked for multi-threaded access, false otherwise.

TryLockRead(milliseconds=0)[source]

Tries to obtain a read lock on the document for <milliseconds> duration, and returns true if the lock was successfully acquired.

Parameters:: milliseconds (int, optional) – duration to obtain a read lock for.
Return type:: boolean
Returns:: true if the document is locked for multi-threaded access, false otherwise.

Unlock()[source]: Removes the lock from the document.

UnlockRead()[source]: Removes the read lock from the document.
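
Because TryLock() can return false, the caller should only call Unlock() when the lock was actually obtained. The sketch below shows one way to structure that. run_if_lockable is a hypothetical helper, and FakeDoc is a stand-in whose TryLock() result is fixed so the sketch runs without the SDK.

```python
def run_if_lockable(doc, action, milliseconds=0):
    """Run action(doc) under the write lock if it can be obtained in time.

    Returns (True, result) when the lock was obtained, (False, None) otherwise.
    Unlock() is called only when TryLock() succeeded.
    """
    if not doc.TryLock(milliseconds):
        return False, None
    try:
        return True, action(doc)
    finally:
        doc.Unlock()


class FakeDoc:
    """Hypothetical stand-in; TryLock() simply returns a preset answer."""

    def __init__(self, lockable):
        self.lockable = lockable
        self.unlock_count = 0

    def TryLock(self, milliseconds=0):
        return self.lockable

    def Unlock(self):
        self.unlock_count += 1


free_doc = FakeDoc(lockable=True)
busy_doc = FakeDoc(lockable=False)

free_result = run_if_lockable(free_doc, lambda d: "edited", milliseconds=50)
busy_result = run_if_lockable(busy_doc, lambda d: "edited", milliseconds=50)
```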

VerifySignedDigitalSignatures(in_opts)[source]

Attempts to verify all signed cryptographic digital signatures in the document, ignoring unsigned signatures.

Return type:: int
Returns:: an enumeration value representing the state of the document’s signatures
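
Since the return type is a plain int, it can help to map the result back to a readable name. The sketch below uses the e_unsigned, e_failure, e_untrusted, e_unsupported and e_verified values from the constants listed for this class. Grouping these five constants into one enumeration is an inference from their names, and SignatureStatus and describe_status are hypothetical names, not SDK types.

```python
from enum import IntEnum


class SignatureStatus(IntEnum):
    """Assumed grouping of the class constants that describe signature state."""
    e_unsigned = 0
    e_failure = 1
    e_untrusted = 2
    e_unsupported = 3
    e_verified = 4


def describe_status(status_code):
    """Return the constant name for an integer status code.

    Raises ValueError for a code outside the five values above.
    """
    return SignatureStatus(status_code).name


verified_name = describe_status(4)
unsigned_name = describe_status(0)
```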

e_action_trigger_doc_did_print = 21

e_action_trigger_doc_did_save = 19

e_action_trigger_doc_will_close = 17

e_action_trigger_doc_will_print = 20

e_action_trigger_doc_will_save = 18

e_annots_only = 1

e_annots_only_no_links = 5

e_both = 2

e_failure = 1

e_forms_only = 0

e_insert_bookmark = 1

e_insert_goto_bookmark = 2

e_none = 0

e_unsigned = 0

e_unsupported = 3

e_untrusted = 2

e_verified = 4

property mp_doc

property thisown: The membership flag

class apryse_sdk.PDFDocInfo(args)[source]

Bases: object

PDFDocInfo is a high-level utility class that can be used to read and modify document’s metadata.

GetAuthor()[source]

Return type:: string
Returns:: The name of the person who created the document.

GetAuthorObj()[source]

Return type:: Obj
Returns:: SDF/Cos string object representing document’s author.

GetCreationDate()[source]

Return type:: Date
Returns:: The date and time the document was created, in human-readable form.

GetCreator()[source]

Return type:: string
Returns:: If the document was converted to PDF from another format, the name of the application that created the original document from which it was converted.

GetCreatorObj()[source]

Return type:: Obj
Returns:: SDF/Cos string object representing document’s creator.

GetKeywords()[source]

Return type:: string
Returns:: Keywords associated with the document.

GetKeywordsObj()[source]

Return type:: Obj
Returns:: SDF/Cos string object representing document’s keywords.

GetModDate()[source]

Return type:: Date
Returns:: The date and time the document was most recently modified, in human-readable form.

GetProducer()[source]

Return type:: string
Returns:: If the document was converted to PDF from another format, the name of the application (for example, Distiller) that converted it to PDF.

GetProducerObj()[source]

Return type:: Obj
Returns:: SDF/Cos string object representing document’s producer.

GetSDFObj()[source]

Return type:: Obj
Returns:: document’s SDF/Cos ‘Info’ dictionary or NULL if the info dictionary is not available.

GetSubject()[source]

Return type:: string
Returns:: The subject of the document.

GetSubjectObj()[source]

Return type:: Obj
Returns:: SDF/Cos string object representing document’s subject.

GetTitle()[source]

Return type:: string
Returns:: The document’s title.

GetTitleObj()[source]

Return type:: Obj
Returns:: SDF/Cos string object representing document’s title.

SetAuthor(author)[source]

Set the author of the document.

Parameters:: author (string) – The name of the person who created the document.

SetCreationDate(creation_date)[source]

Set document’s creation date.

Parameters:: creation_date (Date) – The date and time the document was created.

SetCreator(creator)[source]

Set document’s creator.

Parameters:: creator (string) – The name of the application that created the original document.

SetKeywords(keywords)[source]

Set keywords associated with the document.

Parameters:: keywords (string) – Keywords associated with the document.

SetModDate(mod_date)[source]

Set document’s modification date.

Parameters:: mod_date (Date) – The date and time the document was most recently modified.

SetProducer(producer)[source]

Set document’s producer.

Parameters:: producer (string) – The name of the application that generated PDF.

SetSubject(subject)[source]

Set the subject of the document.

Parameters:: subject (string) – The subject of the document.

SetTitle(title)[source]

Set document’s title.

Parameters:: title (string) – New title of the document.
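The getters and setters above mirror each other, so metadata can be read into a plain dictionary and applied to another info object. This is a minimal sketch under that assumption: read_metadata and apply_metadata are hypothetical helpers, and FakeInfo is a hypothetical stand-in exposing the same method names so the example runs without the SDK.

```python
FIELDS = ("Title", "Author", "Subject", "Keywords", "Creator", "Producer")


def read_metadata(info):
    """Collect the string metadata fields by calling info.Get<Field>()."""
    return {name.lower(): getattr(info, "Get" + name)() for name in FIELDS}


def apply_metadata(info, values):
    """Write back only the fields present in values via info.Set<Field>(value)."""
    for name in FIELDS:
        key = name.lower()
        if key in values:
            getattr(info, "Set" + name)(values[key])


class FakeInfo:
    """Hypothetical stand-in for PDFDocInfo (string fields only)."""

    def __init__(self):
        self._data = {name: "" for name in FIELDS}


def _add_accessors(name):
    # Attach Get<Field>/Set<Field> methods that read and write FakeInfo._data.
    setattr(FakeInfo, "Get" + name, lambda self: self._data[name])
    setattr(FakeInfo, "Set" + name, lambda self, v: self._data.__setitem__(name, v))


for _name in FIELDS:
    _add_accessors(_name)
```

The date fields (GetCreationDate, GetModDate) are left out on purpose, since they use the Date type rather than strings.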

property mp_info

property thisown: The membership flag

class apryse_sdk.PDFDocViewPrefs(args)[source]

Bases: object

PDFDocViewPrefs is a high-level utility class that can be used to control the way the document is to be presented on the screen or in print.

PDFDocViewPrefs class corresponds to PageMode, PageLayout, and ViewerPreferences entries in the document’s catalog. For more details please refer to section 8.1 ‘Viewer Preferences’ in PDF Reference Manual.

GetDirection()[source]

Return type:: boolean
Returns:: true if the predominant reading order for text is left to right, false otherwise. See SetDirection() for more information.

GetLayoutMode()[source]

Return type:: int
Returns:: The value of currently selected PageLayout property.

GetNonFullScreenPageMode()[source]

Return type:: int
Returns:: the PageMode used after exiting full-screen mode.

Notes: This entry is meaningful only if the value of the PageMode is set to e_FullScreen; it is ignored otherwise.

GetPageMode()[source]

Return type:: int
Returns:: The value of currently selected PageMode property.

GetPref(pref)[source]

Return type:: boolean
Returns:: the value of given ViewerPref property.
Parameters:: pref (int) – the ViewerPref property type to query.

GetPrintArea()[source]

Return type:: int
Returns:: the page boundary representing the area of a page to be rendered when printing the document.

GetPrintClip()[source]

Return type:: int
Returns:: the page boundary to which the contents of a page are to be clipped when printing the document.

GetSDFObj()[source]

Return type:: Obj
Returns:: document’s SDF/Cos ‘ViewerPreferences’ dictionary or NULL if the object is not present.

GetViewArea()[source]

Return type:: int
Returns:: the page boundary representing the area of a page to be displayed when viewing the document on the screen.

GetViewClip()[source]

Return type:: int
Returns:: the page boundary to which the contents of a page are to be clipped when viewing the document on the screen.

SetDirection(left_to_right)[source]

Sets the predominant reading order for text.

This flag has no direct effect on the document’s contents or page numbering but can be used to determine the relative positioning of pages when displayed side by side or printed n-up.

Parameters:: left_to_right (boolean) – true if the predominant reading order for text is from left to right and false if it is right to left (including vertical writing systems, such as Chinese, Japanese, and Korean). Default value: left_to_right is true.

SetInitialPage(dest)[source]

A utility method used to set the first page displayed after the document is opened. This method is equivalent to PDFDoc::SetOpenAction(goto_action).

If OpenAction is not specified the document should be opened to the top of the first page at the default magnification factor.

Parameters:: dest (Destination) – A value specifying the page destination to be displayed when the document is opened.

Example:

Destination dest = Destination::CreateFit(page);
pdfdoc.GetViewPrefs().SetInitialPage(dest);

SetLayoutMode(layout)[source]

Sets the PageLayout property and changes the value of the PageLayout key in the Catalog dictionary.

Parameters:: layout (int) – New PageLayout setting. Default value is e_SinglePage.

SetNonFullScreenPageMode(mode)[source]

Set the document’s page mode, specifying how to display the document on exiting full-screen mode.

Parameters:: mode (int) – PageMode used after exiting full-screen mode. Default value: e_UseNone.

Notes: This entry is meaningful only if the value of the PageMode is set to e_FullScreen; it is ignored otherwise.

SetPageMode(mode)[source]

Sets the PageMode property and changes the value of the PageMode key in the Catalog dictionary.

Parameters:: mode (int) – New PageMode setting. Default value is e_UseNone.

SetPref(pref, value)[source]

Sets the value of given ViewerPref property.

Parameters:

pref (int) – the ViewerPref property type to modify.
value (boolean) – The new value for the property.

SetPrintArea(box)[source]

Sets the page boundary representing the area of a page to be rendered when printing the document.

Parameters:: box (int) – printing region. The default value is page crop-box.

SetPrintClip(box)[source]

Sets the page boundary to which the contents of a page are to be clipped when printing the document.

Parameters:: box (int) – printing clip region. The default value is page crop-box.

SetViewArea(box)[source]

Sets the page boundary representing the area of a page to be displayed when viewing the document on the screen.

Parameters:: box (int) – page boundary displayed when viewing the document on the screen. By default, PDF viewers will display the crop-box.

SetViewClip(box)[source]

Sets the page boundary to which the contents of a page are to be clipped when viewing the document on the screen.

Parameters:: box (int) – screen clip region. The default value is page crop-box.

e_CenterWindow = 4

e_Default = 0

e_DisplayDocTitle = 5

e_FitWindow = 3

e_FullScreen = 3

e_HideMenubar = 1

e_HideToolbar = 0

e_HideWindowUI = 2

e_OneColumn = 2

e_SinglePage = 1

e_TwoColumnLeft = 3

e_TwoColumnRight = 4

e_TwoPageLeft = 5

e_TwoPageRight = 6

e_UseAttachments = 5

e_UseBookmarks = 2

e_UseNone = 0

e_UseOC = 4

e_UseThumbs = 1
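The constants above are listed alphabetically, so values from different enumerations share the same integers (for example e_FitWindow and e_FullScreen are both 3). The sketch below regroups them by the property each appears to belong to. The grouping into PageMode, PageLayout and ViewerPref is an assumption based on the constant names and the class description; the three IntEnum classes are illustrative and not part of the SDK.

```python
from enum import IntEnum


class PageMode(IntEnum):
    """Assumed members of the PageMode property (SetPageMode/GetPageMode)."""
    e_UseNone = 0
    e_UseThumbs = 1
    e_UseBookmarks = 2
    e_FullScreen = 3
    e_UseOC = 4
    e_UseAttachments = 5


class PageLayout(IntEnum):
    """Assumed members of the PageLayout property (SetLayoutMode/GetLayoutMode)."""
    e_Default = 0
    e_SinglePage = 1
    e_OneColumn = 2
    e_TwoColumnLeft = 3
    e_TwoColumnRight = 4
    e_TwoPageLeft = 5
    e_TwoPageRight = 6


class ViewerPref(IntEnum):
    """Assumed members of the ViewerPref property (SetPref/GetPref)."""
    e_HideToolbar = 0
    e_HideMenubar = 1
    e_HideWindowUI = 2
    e_FitWindow = 3
    e_CenterWindow = 4
    e_DisplayDocTitle = 5
```

Read this way, an int returned by GetPageMode() should be interpreted with the PageMode names only, never with the ViewerPref names that happen to share a value.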

property mp_prefs

property thisown: The membership flag

class apryse_sdk.PDFDraw(dpi=92)[source]

Bases: object

PDFDraw contains methods for converting PDF pages to images and to Bitmap objects. Utility methods are provided to export PDF pages to various raster formats as well as to convert pages to GDI+ bitmaps for further manipulation or drawing.

Notes: This class is available on all platforms supported by PDFNet.

Destroy()[source]: Frees the native memory of the object.

Export(args)[source]

Overload 1:

A utility method to export the given PDF page to an image file.

Parameters:

page (Page) – The source PDF page.
filename (string) – The name of the output image file. The filename should include the extension suffix (e.g. ‘c:/output/myimage.png’).
format – The file format of the output image. Currently supported formats are:

“RAW” : RAW format. There are four possibilities: e_rgba - if transparent and color page; e_gray_alpha - if transparent and gray page; e_rgb - if opaque and color page; e_gray - if opaque and gray page. NOTE that if the page is set to be transparent (SetPageTransparent), the output color channels are already multiplied by the alpha channel.

“BMP” : Bitmap image format (BMP)

“JPEG” : Joint Photographic Experts Group (JPEG) image format

“PNG” : 24-bit W3C Portable Network Graphics (PNG) image format

“PNG8” : 8-bit, palettized PNG format. The exported file size should be smaller than the one generated using “PNG”, possibly at the expense of some image quality.

“TIFF” : Tag Image File Format (TIFF) image format.

“TIFF8” : Tag Image File Format (TIFF) image format (with 8-bit palette).

By default, the function exports to PNG.

Parameters:

encoder_params (Obj, optional) – An optional SDF dictionary object containing key/value pairs representing optional encoder parameters. The following table lists possible parameters for the corresponding export filters:

|Parameter/Key |Output Format |Description/Value |Example
|Quality |JPEG |The value for compression ‘Quality’ must be a number between 0 and 100 specifying the tradeoff between compression ratio and loss in image quality. 100 stands for best quality. See Example 2 in the PDFDraw sample project. |hint.PutNumber(“Quality”, 60);
|Dither |PNG, PNG8, TIFF or TIFF8 |A boolean used to enable or disable dithering. Relevant only when the image is exported in palettized or monochrome mode. |hint.PutBool(“Dither”, true);
|ColorSpace |PNG or TIFF for grayscale; TIFF for CMYK; PNG, BMP, JPEG, or TIFF for Separation |A name object used to select the rendering and export color space. Currently supported values are “Gray”, “RGB”, “CMYK”, and “Separation”. The output image format must support the specified color space, otherwise the parameter will be ignored. An example of an image format that supports CMYK is TIFF. Image formats that support grayscale are PNG and TIFF. Separation output is supported in either a single N-Channel TIFF, or in separate single-channel files (either PNG, BMP, or JPEG). Output in “Separation” space implies that overprint simulation is on. By default, the image is rendered and exported in RGB color space. |hint.PutName(“ColorSpace”, “CMYK”);
|BPC |PNG or TIFF |A number used to specify ‘bits per pixel’ in the output file. Currently supported values are 1 and 8 (default is 8). To export a monochrome (1 bit per pixel) image, use 1 as the value of the BPC parameter and use TIFF or PNG as the export format for the image. By default, the image is not dithered when BPC is 1. To enable dithering, add the ‘Dither’ option to the export hint. |hint.PutNumber(“BPC”, 1);
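The encoder parameters above (Quality, Dither, ColorSpace, BPC) come with value constraints that are easy to check before the dictionary is handed to Export(). The following is a minimal sketch: fill_encoder_hints is a hypothetical helper, it relies only on the PutNumber, PutBool and PutName calls shown in the Example column, and RecordingHint is a hypothetical stand-in for the SDF dictionary so the snippet runs without the SDK.

```python
def fill_encoder_hints(hint, quality=None, dither=None, color_space=None, bpc=None):
    """Validate the documented ranges, then store each given option on hint."""
    if quality is not None:
        if not 0 <= quality <= 100:
            raise ValueError("Quality must be between 0 and 100")
        hint.PutNumber("Quality", quality)
    if dither is not None:
        hint.PutBool("Dither", bool(dither))
    if color_space is not None:
        if color_space not in ("Gray", "RGB", "CMYK", "Separation"):
            raise ValueError("unsupported ColorSpace: %r" % (color_space,))
        hint.PutName("ColorSpace", color_space)
    if bpc is not None:
        if bpc not in (1, 8):
            raise ValueError("BPC must be 1 or 8")
        hint.PutNumber("BPC", bpc)
    return hint


class RecordingHint:
    """Hypothetical stand-in for an SDF dictionary Obj; records what was stored."""

    def __init__(self):
        self.entries = {}

    def PutNumber(self, key, value):
        self.entries[key] = ("number", value)

    def PutBool(self, key, value):
        self.entries[key] = ("bool", value)

    def PutName(self, key, value):
        self.entries[key] = ("name", value)
```

Options left as None are simply not written, which keeps the library defaults (RGB, 8 bits, no dithering) in effect.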

Overload 2:

Export the given PDF page to an image stream.

Parameters:

page (Page) – The source PDF page.
stream (Filter) – The output stream.
format – The output image format. See the overloaded method for details.
encoder_params (Obj, optional) – Optional encoder parameters. See the overloaded method for details.

Overload 3:

Export the given PDF page to an image stream.

Parameters:

page (Page) – The source PDF page.
stream (Filter) – The output stream.
format – The output image format. See the overloaded method for details.
encoder_params – Optional encoder parameters. See the overloaded method for details.

Overload 4:

Export the given PDF page to an image stream.

Parameters:

page (Page) – The source PDF page.
stream (Filter) – The output stream.
format – The output image format. See the overloaded method for details.
encoder_params – Optional encoder parameters. See the overloaded method for details.

GetBitmap(args)[source]

Returns the raw rasterized image data for the given page.

Notes: This method is relatively low-level and is only available in PDFNet for C++. If you are using PDFNet for .NET, you can use the function with the same name that directly returns GDI+ Bitmap.

Return type:: BitmapInfo
Returns:: a pointer to the internal memory buffer containing the rasterized image of the given page. The buffer size is at least ‘out_height * out_stride’ bytes. The pixel data is stored in 8 bit per component, BGRA format by default.

Parameters:

page (Page) – The source PDF page.
pix_fmt (int, optional) – Optional parameter used to specify the desired pixel format. The default pixel format is e_bgra.
demult (boolean, optional) – Specifies if the alpha is de-multiplied from the resulting color components. This parameter is only used for e_rgba, e_bgra, e_gray_alpha formats.
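The buffer layout described above (height times stride bytes, 8 bits per component, BGRA order) can be illustrated with plain bytes. This is a sketch only: it assumes the stride is the number of bytes per row and may include padding after the last pixel, and bgra_to_rgba is a hypothetical helper operating on a bytes object rather than on BitmapInfo itself.

```python
def min_buffer_size(height, stride):
    """Smallest buffer that can hold height rows of stride bytes each."""
    return height * stride


def bgra_to_rgba(buf, width, height, stride):
    """Copy a BGRA buffer with row padding into a tightly packed RGBA buffer."""
    if len(buf) < min_buffer_size(height, stride):
        raise ValueError("buffer is smaller than height * stride")
    out = bytearray(width * height * 4)
    for y in range(height):
        row = y * stride  # start of this row in the source buffer
        for x in range(width):
            src = row + x * 4
            dst = (y * width + x) * 4
            b, g, r, a = buf[src:src + 4]
            out[dst:dst + 4] = bytes((r, g, b, a))
    return bytes(out)
```

Requesting a different pix_fmt from GetBitmap avoids the need for this conversion; the loop is shown only to make the byte order and the role of the stride concrete.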

GetSeparationBitmaps(page)[source]

Returns a vector of rasterized separations for the given image.

Notes: This method is relatively low-level and is only available in PDFNet for C++. If you are using PDFNet for .NET, you can use the function with the same name that directly returns GDI+ Bitmap.

Return type:: std::vector< PDF::Separation,std::allocator< PDF::Separation > >
Returns:: Separation has a pointer to the internal memory buffer containing the rasterized image of the given page. The buffer size is at least ‘out_height * out_stride’ bytes. The pixel data is stored in 8 bit per component, BGRA format.
Parameters:: page (Page) – The source PDF page.

SetAntiAliasing(enable_aa)[source]

Enable or disable anti-aliasing.

Anti-Aliasing is a technique used to improve the visual quality of images when displaying them on low resolution devices (for example, low DPI computer monitors).

Parameters:: enable_aa (boolean) – if true anti-aliasing will be enabled. Anti-aliasing is enabled by default.

SetCaching(enabled=True)[source]

Enables or disables caching. Caching can improve the rendering performance in cases where the same page will be drawn multiple times.

Parameters:: enabled (boolean, optional) – if true, PDFRasterizer will cache frequently used graphics objects.

SetClipRect(clip_rect)[source]

Clips the render region to the provided rect (in page space).

Parameters:: clip_rect (Rect) – clipping rect. By default, PDFDraw will rasterize the entire page box.

SetColorPostProcessMode(mode)[source]

Set the color post processing transformation. This transform is applied to the rasterized bitmap as the final step in the rasterization process, and is applied directly to the resulting bitmap (disregarding any color space information). Color post processing is only supported for RGBA output.

Parameters:: mode (int) – is the specific transform to be applied

SetDPI(dpi)[source]

Sets the output image resolution.

DPI stands for Dots Per Inch. This parameter is used to specify the output image size and quality. A typical screen resolution for monitors these days is 92 DPI, but printers could use 200 DPI or more.

Parameters:: dpi (double) – value to set the image resolution to. Higher value = higher resolution.

Notes: The size of the resulting image is a function of DPI and the dimensions of the source PDF page. For example, if DPI is 92 and the page is 8 inches wide, the output bitmap will have 92 * 8 = 736 pixels per line. If you know the dimensions of the destination bitmap, but don’t care about the DPI of the image, you can use pdfdraw.SetImageSize() instead.

If you would like to rasterize extremely large bitmaps (e.g. with resolutions of 2000 DPI or more) it is not practical to use PDFDraw directly because of the memory required to store the entire image. In this case, you can use PDFRasterizer directly to generate the rasterized image in stripes or tiles.
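The relationship between DPI, page size and output size can be worked through numerically. The sketch below assumes page dimensions are given in PDF points (72 points per inch) and that the output is rounded to whole pixels; the exact rounding used by the library is not stated here, and raster_size and raster_bytes are hypothetical helpers.

```python
POINTS_PER_INCH = 72.0


def raster_size(dpi, width_pts, height_pts):
    """Approximate output size in pixels for a page measured in points."""
    width_px = round(dpi * width_pts / POINTS_PER_INCH)
    height_px = round(dpi * height_pts / POINTS_PER_INCH)
    return width_px, height_px


def raster_bytes(dpi, width_pts, height_pts, bytes_per_pixel=4):
    """Approximate memory for the full bitmap, assuming 4 bytes per pixel (BGRA)."""
    width_px, height_px = raster_size(dpi, width_pts, height_pts)
    return width_px * height_px * bytes_per_pixel


# An 8 x 10 inch page (576 x 720 points) at 92 DPI.
print(raster_size(92, 576, 720))
# A US Letter page (612 x 792 points) at 2000 DPI.
print(raster_bytes(2000, 612, 792))
```

The second figure is roughly 1.5 GB for a single page, which is why rendering in stripes or tiles is suggested for very high resolutions.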

SetDefaultPageColor(r, g, b)[source]

Sets the default color of the page backdrop.

By default, the page color is white.

Parameters:

r (int) – the red component of the page backdrop color.
g (int) – the green component of the page backdrop color.
b (int) – the blue component of the page backdrop color.

Notes: The default page color is used only when the page backdrop is not set to transparent (see SetPageTransparent).

SetDrawAnnotations(render_annots)[source]

Enable or disable annotation and forms rendering. By default, all annotations and form fields are rendered.

Parameters:: render_annots (boolean) – True to draw annotations, false otherwise.

SetDrawUIElements(draw_ui_elements)[source]

Enable or disable drawing UI elements. Default is disabled.

Parameters:: draw_ui_elements (boolean) – true to draw UI elements, false otherwise.

SetErrorReportProc(instance)[source]

Sets the error handling function to be called in case an error is encountered during page rendering.

Parameters:

error_proc – Error handling callback function (or delegate in .NET)
data – Custom data to be passed as a second parameter to ‘error_proc’.

SetFlipYAxis(flip_y)[source]

Flips the vertical (i.e. Y) axis of the image.

Parameters:: flip_y (boolean) – true to flip the Y axis, false otherwise. For compatibility with most raster formats ‘flip_y’ is true by default.

SetGamma(exp)[source]

Sets the gamma factor used for anti-aliased rendering.

Parameters:: exp (double) – is the exponent value of gamma function. Typical values are in the range from 0.1 to 3.

Gamma correction can be used to improve the quality of anti-aliased image output and can (to some extent) decrease the appearance of common anti-aliasing artifacts (such as pixel width lines between polygons).

Notes: Gamma correction is used only in the built-in rasterizer.

SetHighlightFields(highlight_fields)[source]

Enable or disable highlighting form fields. Default is disabled.

Parameters:: highlight_fields (boolean) – true to highlight, false otherwise.

SetImageSize(width, height, preserve_aspect_ratio=True)[source]

SetImageSize can be used instead of SetDPI() to adjust page scaling so that image fits into a buffer of given dimensions.

If this function is used, DPI will be calculated dynamically for each page so that every page fits into the buffer of given dimensions.

Parameters:

width (int) – The width of the image, in pixels/samples.
height (int) – The height of the image, in pixels/samples.
preserve_aspect_ratio (boolean, optional) – True to preserve the aspect ratio, false otherwise. By default, preserve_aspect_ratio is true.
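One way to picture the dynamic DPI calculation is to take the largest uniform DPI at which the page still fits inside the requested buffer. The sketch below is an assumption about the geometry, not the library’s actual algorithm: fit_dpi is a hypothetical helper, page dimensions are taken to be in points (72 per inch), and only the preserve_aspect_ratio=True case is modeled.

```python
POINTS_PER_INCH = 72.0


def fit_dpi(page_w_pts, page_h_pts, width_px, height_px):
    """Largest DPI at which the page fits inside width_px x height_px."""
    dpi_x = width_px * POINTS_PER_INCH / page_w_pts  # limit imposed by the width
    dpi_y = height_px * POINTS_PER_INCH / page_h_pts  # limit imposed by the height
    return min(dpi_x, dpi_y)


# An 8 x 10 inch page (576 x 720 points) placed in a 736 x 2000 pixel buffer:
# the width is the limiting dimension.
print(fit_dpi(576, 720, 736, 2000))
```

Because the limit depends on each page’s own dimensions, pages of different sizes in the same document end up with different effective DPI values for the same buffer.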

SetImageSmoothing(smoothing_enabled=True, hq_image_resampling=False)[source]

Enable or disable image smoothing.

The rasterizer allows a tradeoff between rendering quality and rendering speed. This function can be used to indicate the preference between rendering speed and quality.

Notes: The image smoothing option has an effect only if the source image has a higher resolution than the output resolution of the image on the rasterized page. PDFNet automatically controls at what resolution/zoom factor ‘image smoothing’ needs to take effect.

Parameters:

smoothing_enabled (boolean, optional) – True to enable image smoothing, false otherwise.
hq_image_resampling (boolean, optional) – True to use a higher quality (but slower) smoothing algorithm. By default, image smoothing is enabled and hq_image_resampling is false.

SetOCGContext(ctx)[source]

Sets the Optional Content Group (OCG) context that should be used when rendering the page. This function can be used to selectively render optional content (such as PDF layers) based on the states of optional content groups in the given context.

Parameters:: ctx (Context) – Optional Content Group (OCG) context, or NULL if the rasterizer should render all content on the page.

SetOverprint(op)[source]

Enable or disable support for overprint and overprint simulation. Overprint is a device dependent feature and the results will vary depending on the output color space and supported colorants (i.e. CMYK, CMYK+spot, RGB, etc).

By default overprint is only enabled for PDF/X files.

Parameters:: op (int) – e_op_on: always enabled; e_op_off: always disabled; e_op_pdfx_on: enabled for PDF/X files only.

SetPageBox(region)[source]

Selects the page box/region to rasterize.

Parameters:: region (int) – Page box to rasterize. By default, PDFDraw will rasterize page crop box.

SetPageTransparent(is_transparent)[source]

Sets the page color to transparent.

By default, PDFDraw assumes that the page is imposed directly on an opaque white surface. Some applications may need to impose the page on a different backdrop. In this case any pixels that are not covered during rendering will be transparent.

Parameters:: is_transparent (boolean) – If true, the page’s backdrop color will be transparent. If false, the page’s backdrop will be opaque white.

Notes: If page transparency is enabled, the alpha channel will be preserved when the image is exported as PNG, TIFF(when in RGB space), or RAW.

SetPathHinting(enable_hinting)[source]

Enable or disable path hinting.

Parameters:: enable_hinting (boolean) – if true path hinting will be enabled. Path hinting is used to slightly adjust paths in order to avoid or alleviate artifacts of hair line cracks between certain graphical elements. This option is turned on by default.

SetPrintMode(is_printing)[source]

Tells the rasterizer to render the page in ‘print’ mode. Certain page elements (such as annotations or OCG-s) are meant to be visible either on the screen or on the printed paper but not both. A common example is the “Submit” button on electronic forms.

Parameters:: is_printing (boolean) – set to true if the page should be rendered in print mode. By default, print mode flag is set to false.

SetRasterizerType(type)[source]

Sets the core graphics library used for rasterization and rendering. Using this method it is possible to quickly switch between different implementations. By default, PDFDraw uses the built-in, platform independent rasterizer.

Parameters:: type (int) – Rasterizer type.

Notes: This method is deprecated, since the GDI+ rasterizer itself is deprecated and will be removed in a future version of PDFNet. It is strongly recommended to use the built-in rasterizer and to use the XPS print path where vector conversion is needed.

SetRotate(r)[source]

Sets the rotation value for this page.

Notes: This method is used only for drawing purposes and it does not modify the document (unlike Page::SetRotate()).

Parameters:: r (int) – Rotation value to be set for a given page. Must be one of the Page::Rotate values.

SetThinLineAdjustment(pixel_grid_fit, stroke_adjust)[source]

Set thin line adjustment parameters.

Parameters:

pixel_grid_fit (boolean) – if true (horizontal/vertical) thin lines will be snapped to integer pixel positions. This helps make thin lines look sharper and clearer. This option is turned off by default and it only works if path hinting is enabled.
stroke_adjust (boolean) – if true auto stroke adjustment is enabled. Currently, this would make lines with sub-pixel width to be one-pixel wide. This option is turned on by default.

SetThinLineScaling(scaling)[source]

This setting controls the thickness of zero-width lines when rendered. In a PDF, a line width of zero denotes the thinnest line that can be rendered at device resolution: 1 device pixel wide. However, on high-resolution devices, a single pixel can be nearly invisible.

Parameters:: scaling (double) – use this setting to increase the apparent thickness of these zero-width lines. Default value: 1.0 (1 pixel wide).

e_bgr = 3

e_bgra = 1

e_cmyk = 6

e_gray = 4

e_gray_alpha = 5

e_rgb = 2

e_rgba = 0

property mp_draw

property thisown: The membership flag

class apryse_sdk.PDFNet[source]

Bases: object

PDFNet contains global library initialization, registration, configuration, and termination methods.

Notes: there is only a single, static instance of PDFNet class. Initialization and termination methods need to be called only once per application session.

static AddFontSubst(args)[source]

Overload 1:

AddFontSubst functions can be used to create font substitutes that can override default PDFNet font selection algorithm.

AddFontSubst functions are useful in situations where referenced fonts are not present in the document and PDFNet font substitution algorithm is not producing desired results.

AddFontSubst(fontname, fontpath) maps the given font name (i.e. ‘BaseFont’ entry from the font dictionary) to a font file.

The following is an example of using this function to provide user defined font substitutes:

PDFNet::Initialize();
PDFNet::SetResourcesPath("c:/myapp/resources");
// Specify specific font mappings...
PDFNet::AddFontSubst("MinionPro-Regular", "c:/myfonts/MinionPro-Regular.otf");
PDFNet::AddFontSubst("Times-Roman", "c:/windows/fonts/times.ttf");
PDFNet::AddFontSubst("Times-Italic", "c:/windows/fonts/timesi.ttf");
...
PDFDoc doc("c:/my.pdf");
...

Overload 2:

AddFontSubst functions can be used to create font substitutes that can override default PDFNet font selection algorithm.

AddFontSubst functions are useful in situations where referenced fonts are not present in the document and PDFNet font substitution algorithm is not producing desired results.

AddFontSubst(ordering, fontpath) maps the given character ordering (see Ordering entry in CIDSystemInfo dictionary; Section 5.6.2 in PDF Reference) to a font file. This method is less specific than the former variant of AddFontSubst, and can be used to override a range of missing fonts (or any missing font) with a predefined substitute.

The following is an example of using this function to provide user defined font substitutes:

PDFNet::Initialize();
PDFNet::SetResourcesPath("c:/myapp/resources");

// Specify more general font mappings...
PDFNet::AddFontSubst(PDFNet::e_Identity, "c:/myfonts/arialuni.ttf");  // Arial Unicode MS
PDFNet::AddFontSubst(PDFNet::e_Japan1, "c:/myfonts/KozMinProVI-Regular.otf");
PDFNet::AddFontSubst(PDFNet::e_Japan2, "c:/myfonts/KozMinProVI-Regular.otf");
PDFNet::AddFontSubst(PDFNet::e_Korea1, "c:/myfonts/AdobeMyungjoStd-Medium.otf");
PDFNet::AddFontSubst(PDFNet::e_CNS1, "c:/myfonts/AdobeMingStd-Light.otf");
PDFNet::AddFontSubst(PDFNet::e_GB1, "c:/myfonts/AdobeSongStd-Light.otf");
...
PDFDoc doc("c:/my.pdf");
...

static AddPDFTronCustomHandler(custom_id)[source]

Adds a PDFTron custom security handler.

Parameters:: custom_id (int) – The user’s custom id. The id should match what was used to create PDFTronCustomSecurityHandler when encrypting the document.

Notes: calling this function is a requirement to load files encrypted with PDFTronCustomSecurityHandler.

static AddResourceSearchPath(path)[source]

Sets the location of PDFNet resource file.

Notes: Starting with v.4.5 PDFNet no longer requires a separate resource file, and so this function is not required for proper PDFNet initialization. The function can be used on all platforms to specify search paths for ICC profiles, fonts, and other user defined resources.

Parameters:: path (string) – The resource directory path to add to the search list.

static EnableJavaScript(enable)[source]

A switch that can be used to turn the JavaScript engine on or off.

Parameters:: enable (boolean) – true to enable JavaScript engine, false to disable.

static GetResourcesPath()[source]

Return type:: string
Returns:: the location of PDFNet resources folder. Empty string means that resources are located in your application folder.

static GetSystemFontList()[source]

Get available fonts on the system.

Return type:: string
Returns:: A JSON list of fonts accessible to PDFNet

static GetVersion()[source]

Return type:: double
Returns:: PDFNet version number.

static GetVersionString()[source]

Return type:: string
Returns:: PDFNet version as a string.

static Initialize(args)[source]

static IsJavaScriptEnabled()[source]

Test whether JavaScript is enabled

Return type:: boolean
Returns:: true if it is enabled, false otherwise

static SetColorManagement(args)[source]

Used to set a specific Color Management System (CMS) for use during color conversion operators, image rendering, etc.

Parameters:: t (int, optional) – identifies the type of color management to use.

static SetConnectionErrorHandlingMode(mode)[source]

Sets the connection error handling behaviour for Apryse SDK. The default for this method is e_continue.

Parameters:: mode (int) – Rules that Apryse SDK will follow after a connection error.

static SetConnectionErrorProc(instance)[source]

Sets the error handling function to be called when an error is encountered when connecting to PDFTron Web Services.

Parameters:

error_proc – Connection error handling callback function (or delegate in .NET)
data – Custom data to be passed as the fourth parameter to ‘error_proc’.

static SetDefaultDeviceCMYKProfile(args)[source]

Overload 1:

Sets the default ICC color profile for DeviceCMYK color space.

Notes: You can use this method to override default PDFNet settings. For more information on default color spaces please refer to section ‘Default Color Spaces’ in Chapter 4.5.4 of PDF Reference Manual.

Raises:: the function will throw Exception if the ICC profile can’t be found or if it fails to open.

Overload 2:

Sets the default ICC color profile for DeviceCMYK color space.

Notes: You can use this method to override default PDFNet settings. For more information on default color spaces please refer to section ‘Default Color Spaces’ in Chapter 4.5.4 of PDF Reference Manual.

Raises:: the function will throw Exception if the ICC profile fails to open.

static SetDefaultDeviceRGBProfile(args)[source]

Overload 1:

Sets the default ICC color profile for DeviceRGB color space.

Notes: You can use this method to override default PDFNet settings. For more information on default color spaces please refer to section ‘Default Color Spaces’ in Chapter 4.5.4 of PDF Reference Manual.

Raises:: the function will throw Exception if the ICC profile can’t be found or if it fails to open.

Overload 2:

Sets the default ICC color profile for DeviceRGB color space.

Notes: You can use this method to override default PDFNet settings. For more information on default color spaces please refer to section ‘Default Color Spaces’ in Chapter 4.5.4 of PDF Reference Manual.

Raises:: the function will throw Exception if the ICC profile fails to open.

static SetDefaultDiskCachingEnabled(use_disk)[source]

Sets the default policy on using temporary files.

Parameters:: use_disk (boolean) – if true, new documents are allowed to create temporary files; otherwise all document contents will be stored in memory.

static SetDefaultFlateCompressionLevel(level)[source]

Sets the default compression level for Flate (ZLib).

Parameters:: level (int) – An integer in the range 0-9 representing the compression value to use as a default for any Flate streams (e.g. used to compress content streams, PNG images, etc). The library normally uses the default compression level (Z_DEFAULT_COMPRESSION). For most images, compression values in the range 3-6 compress nearly as well as higher levels, and do so much faster. For on-line applications it may be desirable to have maximum speed (Z_BEST_SPEED = 1). You can also specify no compression (Z_NO_COMPRESSION = 0). Default value: Z_DEFAULT_COMPRESSION (-1).
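The effect of the level can be explored with Python’s standard zlib module, which exposes the same Flate levels and constants. This is an illustration of the trade-off only; it does not call the SDK, and the sample data (a repeated content-stream-like string) and the helper compressed_sizes are made up for the demonstration.

```python
import zlib


def compressed_sizes(data, levels=(0, 1, 6, 9)):
    """Return {level: compressed size in bytes} for each requested Flate level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


# Repetitive sample data, loosely resembling a PDF content stream.
sample = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 200
sizes = compressed_sizes(sample)

for level, size in sorted(sizes.items()):
    print("level %d: %d bytes (input %d bytes)" % (level, size, len(sample)))
```

Level 0 stores the data with a small framing overhead, so its output is slightly larger than the input, while the other levels shrink this kind of repetitive data considerably.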

static SetLogLevel(args)[source]

static SetPersistentCachePath(persistent_path)[source]

Set the location of persistent cache files.

This method is provided for applications that require tight control of the location where temporary files are created.

static SetResourcesPath(path)[source]

Sets the location of PDFNet resource file.

Notes: Starting with v.4.5 PDFNet no longer requires a separate resource file, and so this function is not required for proper PDFNet initialization. It remains available for backward compatibility. On mobile systems (iOS, Android, etc.) this method is required for proper initialization starting with version 6.0. (This helps reduce overall app size.) The function can be used on all platforms to specify a default search path for ICC profiles, fonts, and other user defined resources.

Parameters:

path (string) –

The default resource directory path.

Return type:

boolean

Returns:

true if path is found, false otherwise.

static SetTempPath(temp_path)[source]

Set the location of temporary folder.

This method is provided for applications that require tight control of the location where temporary files are created.

static SetViewerCache(max_cache_size, on_disk)[source]

Sets the default parameters for the viewer cache. Any subsequently created documents
will use these parameters.

Parameters:

max_cache_size (int) – The maximum size, in bytes, of the entire document’s page cache. Set to zero to disable the viewer cache.
on_disk (boolean) – If set to ‘true’, the cache will be stored on the local filesystem. If set to ‘false’, the cache will be stored in heap memory.

Defaults: Desktop: max_cache_size = 512 MB, on_disk = true | Mobile: max_cache_size = 100 MB, on_disk = false

static SetWriteAPIUsageLocally(write_usage)[source]

Enable writing API usage locally.

Parameters:: write_usage (boolean) – if parameter is true API usage will be written to local JSON files in the persistent cache path otherwise no API usage is saved.

static Terminate(args)[source]

e_CNS1 = 4: Chinese; Traditional

e_GB1 = 3: Chinese; Simplified

e_Identity = 0: Generic/Unicode

e_Japan1 = 1: Japanese

e_Japan2 = 2: Japanese

e_Korea1 = 5: Korean

e_LogLevel_Debug = 0

e_LogLevel_Error = 4

e_LogLevel_Fatal = 5

e_LogLevel_Info = 2

e_LogLevel_Off = -1

e_LogLevel_Trace = 1

e_LogLevel_Warning = 3

e_Z_BEST_COMPRESSION = 9

e_Z_BEST_SPEED = 1

e_Z_DEFAULT_COMPRESSION = -1

e_Z_NO_COMPRESSION = 0

e_continue = 0

e_continue_unless_switching_to_demo = 1

e_icm = 1: Use Windows ICM2 (available only on Windows platforms).

e_lcms = 0: Use LittleCMS (available on all supported platforms).

e_no_cms = 2: No ICC color management.

e_stop = 2

property thisown: The membership flag

class apryse_sdk.PDFRasterizer(args)[source]

Bases: object

PDFRasterizer is a low-level PDF rasterizer.

The main purpose of this class is to convert PDF pages to raster images (or bitmaps).

Notes: PDFRasterizer is a relatively low-level class. If you need to convert a PDF page to an image format or a Bitmap, consider using PDF::PDFDraw. Similarly, if you are building an interactive PDF viewing application you may want to use PDF::PDFView instead.

Destroy()[source]: Frees the native memory of the object.

GetColorPostProcessMode()[source]

Return type:: int
Returns:: the current color post processing mode.

GetRasterizerType()[source]

Return type:: int
Returns:: the type of current rasterizer.

Notes: This method is deprecated, since the GDI+ rasterizer itself is deprecated and will be removed in a future version of PDFNet. It is strongly recommended to use the built-in rasterizer and to use the XPS print path where vector conversion is needed.

Rasterize(page, width, height, stride, num_comps, demult, device_mtx, clip=None, scrl_clip_regions=None)[source]

Draws the page into a given memory buffer.

Notes: This method is available on all platforms and in all rasterizer implementations.

Parameters:

page (Page) – The page to rasterize.
in_out_image_buffer – A pointer to a memory buffer. The buffer must contain at least (stride * height) bytes.
width (int) – The width of the target image in pixels.
height (int) – The height of the target image in pixels (the number of rows).
stride (int) – Stride determines the physical width (in bytes) of one row in memory. If this value is negative the direction of the Y axis is inverted. The absolute value of stride is of importance, because it allows rendering in buffers where rows are padded in memory (e.g. in Windows bitmaps are padded on 4 byte boundaries). Besides allowing rendering on the whole buffer stride parameter can be used for rendering in a rectangular subset of a buffer.
num_comps (int) – The number (4 or 5) representing the number of color components in the device color space. For BGR+Alpha set this parameter to 4, and for CMYK+Alpha use 5. If other values are set, exceptions will be thrown.
demult (boolean) – Specifies if the alpha is de-multiplied from the resulting color components.
device_mtx (Matrix2D) – Device transformation matrix that maps PDF page from PDF user space into device coordinate space (e.g. pixel space). PDF user space is represented in page units, where one unit corresponds to 1/72 of an inch.
clip (Rect, optional) – Optional parameter defining the clip region for the page. If the parameter is null or is not specified, PDFRasterizer uses page’s crop box as a default clip region.
scrl_clip_regions – Optional parameter reserved for future use.
cancel – An optional variable that can be used to stop the rendering thread.

Sample code:

double drawing_scale = 2;
Common::Matrix2D mtx(drawing_scale, 0, 0, drawing_scale, 0, 0);
PDF::Rect bbox(page.GetMediaBox());
bbox.Normalize();
int width = int(bbox.Width() * drawing_scale);
int height = int(bbox.Height() * drawing_scale);

// Stride is represented in bytes and is aligned on 4 byte
// boundary so that you can render directly to GDI bitmap.
// A negative value for stride can be used to flip the image
// upside down.
int comps = 4;  // for BGRA
int stride = ((width * comps + 3) / 4) * 4;

// buf is a memory buffer containing at least (stride * height) bytes.
memset(buf, 0xFF, height * stride);  // Clear the background to opaque white paper color.

PDFRasterizer rast;
rast.Rasterize(page, buf, width, height, stride, comps, false, mtx);
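
The width, height, and stride arithmetic from the sample can be checked in plain Python. The helper below is a sketch only: buffer_layout is a hypothetical name, it does not call the SDK, and it assumes the page size is given in points and the stride is positive.

```python
def buffer_layout(page_width_pt, page_height_pt, scale, comps=4):
    """Return (width, height, stride, size_in_bytes) for a raster buffer.

    comps is 4 for BGR+Alpha or 5 for CMYK+Alpha, as described above.
    """
    if comps not in (4, 5):
        raise ValueError("num_comps must be 4 or 5")
    width = int(page_width_pt * scale)
    height = int(page_height_pt * scale)
    # Round the row size up to the next multiple of 4 bytes.
    stride = ((width * comps + 3) // 4) * 4
    return width, height, stride, stride * height

# US Letter page (612 x 792 points) rendered at 2x scale, BGRA output.
print(buffer_layout(612, 792, 2))
# A 101-pixel-wide CMYK+Alpha row needs padding: 505 bytes round up to 508.
print(buffer_layout(101, 10, 1, comps=5))
```

With four components every row is already a multiple of 4 bytes, so the rounding only adds padding in the five-component case.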

RasterizeSeparations(page, width, height, mtx, clip, cancel)[source]

Draws the page into a given memory buffer.

Notes: This method is available on all platforms and in all rasterizer implementations.

Parameters:

page (Page) – The page to rasterize.
width (int) – The width of the target image in pixels.
height (int) – The height of the target image in pixels (the number of rows).
mtx (Matrix2D) – Device transformation matrix that maps PDF page from PDF user space into device coordinate space (e.g. pixel space). PDF user space is represented in page units, where one unit corresponds to 1/72 of an inch.
clip (Rect) – Optional parameter defining the clip region for the page. If the parameter is null or is not specified, PDFRasterizer uses page’s crop box as a default clip region.
cancel (boolean) – An optional variable that can be used to stop the rendering thread.
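
To make the role of the device transformation matrix concrete, here is a small pure-Python sketch that maps user-space points (1/72 inch units) to pixels. It assumes the standard PDF matrix layout (a, b, c, d, h, v) and uses plain tuples instead of the SDK's Matrix2D. The helper names are hypothetical, and the Y flip is an illustrative choice; whether you need one depends on your target bitmap and stride sign.

```python
def apply_matrix(m, x, y):
    """Apply a PDF-style affine matrix (a, b, c, d, h, v) to the point (x, y)."""
    a, b, c, d, h, v = m
    return (a * x + c * y + h, b * x + d * y + v)

def top_down_device_matrix(page_height_pt, scale):
    """Scale user space by `scale` and flip Y so the origin is the top-left pixel."""
    return (scale, 0.0, 0.0, -scale, 0.0, page_height_pt * scale)

mtx = top_down_device_matrix(792, 2.0)
print(apply_matrix(mtx, 0, 792))   # top-left corner of the page
print(apply_matrix(mtx, 72, 720))  # one inch in from the top-left corner
```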

SetAntiAliasing(enable_aa)[source]

Enable or disable anti-aliasing.

Anti-Aliasing is a technique used to improve the visual quality of images when displaying them on low resolution devices (for example, low DPI computer monitors).

Anti-aliasing is enabled by default.

SetCaching(enabled=True)[source]

Enables or disables caching. Caching can improve the rendering performance in cases where the same page will be drawn multiple times.

Parameters:

enabled (boolean, optional) –

if true PDFRasterizer will cache frequently used graphics objects.

SetColorPostProcessMode(mode)[source]

Set the color post processing transformation. This transform is applied to the rasterized bitmap as the final step in the rasterization process, and is applied directly to the resulting bitmap (disregarding any color space information). Color post processing only supported for RGBA output.

Parameters:: mode (int) – is the specific transform to be applied
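
As an illustration of a transform that works directly on the finished bitmap, the sketch below inverts the color channels of a BGRA byte buffer in plain Python. It is an assumption that inversion means replacing each color value c with 255 - c while leaving alpha alone; the SDK's own post-processing modes are not specified here and may behave differently. invert_bgra is a hypothetical helper, not an SDK function.

```python
def invert_bgra(buf):
    """Return a copy of a BGRA byte buffer with B, G and R inverted, alpha kept."""
    out = bytearray(buf)
    for i in range(0, len(out) - 3, 4):
        out[i] = 255 - out[i]          # blue
        out[i + 1] = 255 - out[i + 1]  # green
        out[i + 2] = 255 - out[i + 2]  # red
        # out[i + 3] is alpha and is left untouched
    return bytes(out)

# Two pixels: opaque white, then opaque red (stored as B, G, R, A).
white_then_red = bytes([255, 255, 255, 255, 0, 0, 255, 255])
print(list(invert_bgra(white_then_red)))
```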

SetDrawAnnotations(render_annots)[source]

Enable or disable annotation and forms rendering. By default, annotations and forms are rendered.

Parameters:: render_annots (boolean) – True to draw annotations, false otherwise.

SetDrawUIElements(draw_ui_elements)[source]

Enable or disable drawing UI elements. Default is disabled.

Parameters:: draw_ui_elements (boolean) – true to draw UI elements, false otherwise.

SetErrorReportProc(instance)[source]

Sets the error handling function to be called in case an error is encountered during page rendering.

Parameters:

error_proc – Error handling callback function (or delegate in .NET)
data – Custom data to be passed as a second parameter to ‘error_proc’.

SetGamma(expgamma)[source]

Sets the gamma factor used for anti-aliased rendering.

Parameters:: expgamma (double) – is the exponent value of gamma function. Typical values are in the range from 0.1 to 3.

Gamma correction can be used to improve the quality of anti-aliased image output and can (to some extent) decrease the appearance of common anti-aliasing artifacts (such as pixel width lines between polygons).

Notes: Gamma correction is used only in the built-in rasterizer.
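
The effect of the exponent can be sketched numerically. The code below assumes the gamma function is a simple power curve applied to an anti-aliasing coverage value in [0, 1], which is a common choice in rasterizers; this section does not state the exact formula the built-in rasterizer uses, so treat apply_gamma as a hypothetical model.

```python
def apply_gamma(coverage, expgamma):
    """Map an anti-aliasing coverage value in [0, 1] through a power curve."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    return coverage ** expgamma

for g in (0.5, 1.0, 2.0):
    print(g, [round(apply_gamma(c, g), 3) for c in (0.0, 0.25, 0.5, 1.0)])
```

Under this model an exponent below 1 makes partially covered edge pixels heavier, an exponent above 1 makes them lighter, and fully covered or empty pixels are unchanged.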

SetHighlightFields(highlight_fields)[source]

Enable or disable highlighting form fields. Default is disabled.

Parameters:: highlight_fields (boolean) – true to highlight, false otherwise.

SetImageSmoothing(smoothing_enabled=True, hq_image_resampling=False)[source]

Enable or disable image smoothing.

The rasterizer allows a tradeoff between rendering quality and rendering speed. This function can be used to indicate the preference between rendering speed and quality.

Notes: the image smoothing option has an effect only if the source image has a higher resolution than the output resolution of the image on the rasterized page. PDFNet automatically controls at what resolution/zoom factor ‘image smoothing’ needs to take effect.

Parameters:

smoothing_enabled (boolean, optional) – True to enable image smoothing, false otherwise.
hq_image_resampling (boolean, optional) – True to use a higher quality (but slower) smoothing algorithm. By default, image smoothing is enabled and hq_image_resampling is false.

SetNightModeTuning(contrast, saturation, flipness)[source]

This setting controls the contrast, saturation and flipness of a rendered PDF when night mode is set. By default no additional tuning is done.

Parameters:

contrast (double) – change the difference in luminance or color that makes an object distinguishable from other objects.
saturation (double) – change the color intensity.
flipness (double) – controls the inversion of colors when rendered.

Notes: values range from 0.0 to 1.0

SetOCGContext(ctx)[source]

Sets the Optional Content Group (OCG) context that should be used when rendering the page. This function can be used to selectively render optional content (such as PDF layers) based on the states of optional content groups in the given context.

Parameters:: ctx (Context) – Optional Content Group (OCG) context, or NULL if the rasterizer should render all content on the page.

SetOverprint(op)[source]

Enable or disable support for overprint and overprint simulation. Overprint is a device dependent feature and the results will vary depending on the output color space and supported colorants (i.e. CMYK, CMYK+spot, RGB, etc).

By default overprint is only enabled for PDF/X files.

Parameters:: op (int) – e_op_on: always enabled; e_op_off: always disabled; e_op_pdfx_on: enabled for PDF/X files only.

SetPathHinting(enable_hinting)[source]

Enable or disable path hinting.

Parameters:: enable_hinting (boolean) – if true path hinting is enabled. Path hinting is used to slightly adjust paths in order to avoid or alleviate artifacts of hair line cracks between certain graphical elements. This option is turned on by default.

SetPrintMode(is_printing)[source]

Tells the rasterizer to render the page in ‘print’ mode. Certain page elements (such as annotations or OCGs) are meant to be visible either on the screen or on the printed paper but not both. A common example is the “Submit” button on electronic forms.

Parameters:: is_printing (boolean) – set to true if the page should be rendered in print mode. By default, the print mode flag is set to false.

SetRasterizerType(type)[source]

Sets the core graphics library used for rasterization and rendering. Using this method it is possible to quickly switch between different implementations. By default, PDFNet uses a built-in, high-quality, and platform independent rasterizer.

Parameters:: type (int) – Rasterizer type.

Notes: This method is deprecated, since the GDI+ rasterizer itself is deprecated and will be removed in a future version of PDFNet. It is strongly recommended to use the built-in rasterizer and to use the XPS print path where vector conversion is needed.

SetThinLineAdjustment(pixel_grid_fit, stroke_adjust)[source]

Set thin line adjustment parameters.

Parameters:

pixel_grid_fit (boolean) – if true (horizontal/vertical) thin lines will be snapped to integer pixel positions. This helps make thin lines look sharper and clearer. This option is turned off by default and it only works if path hinting is enabled.
stroke_adjust (boolean) – if true auto stroke adjustment is enabled. Currently, this would make lines with sub-pixel width to be one-pixel wide. This option is turned on by default.

SetThinLineScaling(scaling)[source]

This setting controls the thickness of zero-width lines when rendered. In a PDF, a line width of zero denotes the thinnest line that can be rendered at device resolution: 1 device pixel wide. However, on high-resolution devices, a single pixel can be nearly invisible.

Parameters:: scaling (double) – use this setting to increase the apparent thickness of these zero-width lines. The default is 1.0 (1 pixel wide).

UpdateBuffer()[source]: This function is typically called for progressive rendering, in which we don’t want to stop the main rendering thread. Since the rendering thread may modify separation channels, we don’t consider separations in progressive rendering.

e_BuiltIn = 0: high-quality, platform independent rasterizer.

e_GDIPlus = 1: GDI+ based rasterizer. (Deprecated)

e_op_off = 0

e_op_on = 1

e_op_pdfx_on = 2

e_postprocess_gradient_map = 2

e_postprocess_invert = 1

e_postprocess_night_mode = 3

e_postprocess_none = 0

property mp_rast

property thisown: The membership flag

class apryse_sdk.PDFTronCustomSecurityHandler(custom_id)[source]

Bases: SecurityHandler

This class represents PDFTron Custom Security handler that applies PDFTron’s custom encryption method on save.

property thisown: The membership flag

class apryse_sdk.PDFUAConformance(args)[source]

Bases: object

The PDFUAConformance class is used to process PDF documents for PDF/UA (ISO 14289-1) compliance, including converting existing PDF files to PDF/UA compliant documents.

Note: This feature is currently experimental and subject to change.

AutoConvert(args)[source]

Overload 1:

Converts the input PDF to PDF/UA. The required structure analysis JSON is generated automatically, which requires the DataExtractionModule (with the Doc Structure engine) to be properly configured.

Parameters:

src_file (string) – The path to the PDF file to convert.
dest_file (string) – The path to output the converted file.

Notes: This function is experimental and is subject to change.

Overload 2:

Converts the input PDF to PDF/UA. The required structure analysis JSON is generated automatically, which requires the DataExtractionModule (with the Doc Structure engine) to be properly configured.

Parameters:

src_file (string) – The path to the PDF file to convert.
dest_file (string) – The path to output the converted file.
options (PDFUAOptions) – The options to use when converting/validating, see PDFUAOptions for details.

Notes: This function is experimental and is subject to change.

static CreateInternal(impl)[source]

Destroy()[source]

GetHandleInternal()[source]

e_UA_Level1 = 0

property m_impl

property thisown: The membership flag

class apryse_sdk.PDFUAOptions[source]

Bases: object

GetConformanceLevel()[source]

Gets the value ConformanceLevel from the options object. The PDF/UA conformance level. By default the conformance level is PDF/UA-1.

Return type:: int
Returns:: The current value for ConformanceLevel.

GetFirstStop()[source]

Gets the value FirstStop from the options object. Whether to stop processing after the first PDF/UA error is detected. By default, processing continues.

Return type:: boolean
Returns:: The current value for FirstStop.

GetMaxRefObjs()[source]

Gets the value MaxRefObjs from the options object. The maximum number of object references per error condition. This is 10 by default.

Return type:: int
Returns:: The current value for MaxRefObjs.

GetPassword()[source]

Gets the value Password from the options object. The password to be used for encrypted PDF documents. By default, no password is used.

Return type:: string
Returns:: The current value for Password.

GetSaveLinearized()[source]

Gets the value SaveLinearized from the options object. Whether to linearize when saving converted output. By default, the output is not linearized.

Return type:: boolean
Returns:: The current value for SaveLinearized.

SetConformanceLevel(value)[source]

Sets the value for ConformanceLevel in the options object. The PDF/UA conformance level. By default the conformance level is PDF/UA-1.

Parameters:: value (int) – The new value for ConformanceLevel.
Return type:: PDFUAOptions
Returns:: This object, for call chaining.

SetFirstStop(value)[source]

Sets the value for FirstStop in the options object. Whether to stop processing after the first PDF/UA error is detected. By default, processing continues.

Parameters:: value (boolean) – The new value for FirstStop.
Return type:: PDFUAOptions
Returns:: This object, for call chaining.

SetMaxRefObjs(value)[source]

Sets the value for MaxRefObjs in the options object. The maximum number of object references per error condition. This is 10 by default.

Parameters:: value (int) – The new value for MaxRefObjs.
Return type:: PDFUAOptions
Returns:: This object, for call chaining.

SetPassword(value)[source]

Sets the value for Password in the options object. The password to be used for encrypted PDF documents. By default, no password is used.

Parameters:: value (string) – The new value for Password.
Return type:: PDFUAOptions
Returns:: This object, for call chaining.

SetSaveLinearized(value)[source]

Sets the value for SaveLinearized in the options object. Whether to linearize when saving converted output. By default, the output is not linearized.

Parameters:: value (boolean) – The new value for SaveLinearized.
Return type:: PDFUAOptions
Returns:: This object, for call chaining.
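
Because each setter returns the options object itself, calls can be chained. The sketch below shows the pattern with a small stand-in class; OptionsSketch is hypothetical and is not the SDK's PDFUAOptions. It only mirrors three of the setter names and the defaults described in this section.

```python
class OptionsSketch:
    """Stand-in for an options object; NOT the SDK's PDFUAOptions class."""

    def __init__(self):
        # Defaults as described in this section.
        self.first_stop = False
        self.max_ref_objs = 10
        self.save_linearized = False

    def SetFirstStop(self, value):
        self.first_stop = bool(value)
        return self  # returning self is what enables chaining

    def SetMaxRefObjs(self, value):
        self.max_ref_objs = int(value)
        return self

    def SetSaveLinearized(self, value):
        self.save_linearized = bool(value)
        return self

opts = OptionsSketch().SetFirstStop(True).SetMaxRefObjs(25).SetSaveLinearized(True)
print(opts.first_stop, opts.max_ref_objs, opts.save_linearized)
```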

property thisown: The membership flag

class apryse_sdk.PDFView[source]

Bases: object

PDFView is a utility class that can be used for interactive rendering of PDF documents.

In .NET environment PDFView is derived from System.Windows.Forms.Control and it can be used like a regular form (see PDFViewForm.cs in PDFView sample for C# for a concrete example).

PDFView implements some essential features such as double-buffering, multi-threaded rendering, scrolling, zooming, and page navigation that are essential in interactive rendering applications (e.g. in client PDF viewing and editing applications).

PDFView defines several coordinate spaces and it is important to understand their differences:

Page Space refers to the space in which a PDF page is defined. It is determined by a page itself and the origin is at the lower-left corner of the page. Note that Page Space is independent of how a page is viewed in PDFView and each page has its own Page Space.

Canvas Space refers to the tightest axis-aligned bounding box of all the pages given the current page presentation mode in PDFView. For example, if the page presentation mode is e_single_continuous, all the pages are arranged vertically with one page in each row, and therefore the Canvas Space is a rectangle with a possibly large height value. For this reason, Canvas Space is also, like Page Space, independent of the zoom factor. Also note that since PDFView adds gaps between adjacent pages, the Canvas Space is larger than the space occupied by all the pages. The origin of the Canvas Space is located at the upper-left corner.

Screen Space (or Client Space) is the space occupied by PDFView and its origin is at the upper-left corner. Note that the virtual size of this space can extend beyond the visible region.

Scrollable Space is the virtual space within which PDFView can scroll. It is determined by the Canvas Space and the current zoom factor. Roughly speaking, the dimensions of the Scrollable Space are the dimensions of the Canvas Space multiplied by the zoom factor. Therefore, a large zoom factor will result in a larger Scrollable region given the same Canvas region. For this reason, Scrollable Space might also be referred to as Zoomed Canvas Space. Note that since PDFView adds gaps between pages in Canvas Space and these gaps are not scaled when rendered, the scrollable range is not exactly the zoom factor times the Canvas range. For functions such as SetHScrollPos(), SetVScrollPos(), GetCanvasHeight(), and GetCanvasWidth(), it is the Scrollable Space that is involved.
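
The difference between "zoom times canvas" and the actual scrollable range can be sketched with a toy layout. Everything here is an assumption made for illustration: pages are stacked vertically, the gap between pages is a fixed 10 units, and only the pages (not the gaps) are scaled. The real layout rules of PDFView are not specified in this section.

```python
def scrollable_height(page_heights, gap, zoom):
    """Estimate scrollable height for pages stacked vertically.

    page_heights: page heights in canvas units
    gap: fixed gap between adjacent pages, assumed NOT scaled by zoom
    """
    pages = sum(page_heights) * zoom
    gaps = gap * max(len(page_heights) - 1, 0)
    return pages + gaps

def naive_height(page_heights, gap, zoom):
    """Zoom times the whole canvas height, gaps included."""
    return (sum(page_heights) + gap * max(len(page_heights) - 1, 0)) * zoom

heights = [792, 792, 792]
print(scrollable_height(heights, gap=10, zoom=2.0))
print(naive_height(heights, gap=10, zoom=2.0))
```

In this toy model the two results differ by the unscaled gaps, which is why the scrollable extent should be queried from GetCanvasHeight() and GetCanvasWidth() rather than computed by hand.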

Notes: PDFView is available on all platforms supported by PDFNet.

CanRedo()[source]

Returns whether there is a redo state available in the undo/redo chain.

Return type:: boolean
Returns:: true if redo is available, false otherwise.

CanUndo()[source]

Returns whether there is an undo state available in the undo/redo chain.

Return type:: boolean
Returns:: true if undo is available, false otherwise.

CancelAllThumbRequests()[source]

CancelFindText()[source]: Cancel the text search thread if FindText() is started in a different thread. Note that if the text search thread is currently suspended by the render thread, it will only be canceled after it is awakened by the render thread.

CancelRendering()[source]: Cancels rendering in progress. If PDFView is not busy rendering the page, the function has no side effects.

ClearSelection()[source]: Remove any text selection.

ClearThumbCache()[source]: Remove all thumbnails from the persistent disk cache.

CloseDoc()[source]: Close the associated PDF document.

ConvCanvasPtToPagePt(pt, page_num=-1)[source]

Converts a point expressed in canvas space to a point in a page space.

Parameters:: page_num (int, optional) – the page number for the page used as the origin of the destination coordinate system. Negative values are used to represent the current page. Pages are indexed starting from one.

ConvCanvasPtToScreenPt(pt)[source]: Converts a point expressed in canvas space to a point in screen space.

ConvPagePtToCanvasPt(pt, page_num=-1)[source]

Converts a point from a page space to point in canvas space.

Parameters:: page_num (int, optional) – the page number for the page used as the origin of the destination coordinate system. Negative values are used to represent the current page. Pages are indexed starting from one.

ConvPagePtToScreenPt(pt, page_num=-1)[source]

Converts a point in a page space to a point in the screen space. If PDFView is in a non-continuous page view mode, and the page is not visible, the result is undefined.

Parameters:: page_num (int, optional) – the page number for the page used as the origin of the destination coordinate system. Negative values are used to represent the current page. Pages are indexed starting from one.

ConvScreenPtToCanvasPt(pt)[source]: Converts a point expressed in screen space to a point in canvas space.

ConvScreenPtToPagePt(pt, page_num=-1)[source]

Converts a point expressed in screen space to a point in a page space.

Parameters:: page_num (int, optional) – the page number for the page used as the origin of the destination coordinate system. Negative values are used to represent the current page. Pages are indexed starting from one.

Destroy()[source]: Frees the native memory of the object.

DocLock(cancel_threads)[source]: Acquires a write lock on the currently open document, optionally canceling all threads accessing the document.

DocLockRead()[source]: Locks the currently open document to prevent competing write threads (using Lock()) from accessing the document at the same time. Other reader threads however, will be allowed to access the document. Threads attempting to obtain write access to the document will wait in suspended state until the thread that owns the lock calls doc.UnlockRead(). Note: To avoid deadlocks obtaining a write lock while holding a read lock is not permitted and will throw an exception. If this situation is encountered please either unlock the read lock before the write lock is obtained or acquire a write lock (rather than read lock) in the first place.

DocTryLock(milliseconds=0)[source]

Try acquiring a write lock on the currently open document, waiting no longer than specified number of milliseconds.

Return type:: boolean
Returns:: true if the document is locked for multi-threaded access, false otherwise.

DocTryLockRead(milliseconds=0)[source]

Try acquiring a read lock on the current document, waiting no longer than specified number of milliseconds.

Return type:: boolean
Returns:: true if the document is locked for multi-threaded access, false otherwise.

DocUnlock()[source]: Releases the write lock from the currently open document.

DocUnlockRead()[source]: Releases the read lock from the currently open document.
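
A try-lock call is usually paired with a guaranteed unlock. The sketch below shows that pattern using Python's threading.Lock as a stand-in for the document lock; it does not call the SDK, and try_lock and with_try_lock are hypothetical helper names.

```python
import threading

def try_lock(lock, milliseconds=0):
    """Try to acquire `lock`, waiting at most `milliseconds`; return True on success."""
    if milliseconds <= 0:
        return lock.acquire(blocking=False)
    return lock.acquire(timeout=milliseconds / 1000.0)

def with_try_lock(lock, work, milliseconds=0):
    """Run work() only if the lock was obtained, and always release it afterwards."""
    if not try_lock(lock, milliseconds):
        return None  # could not lock in time; the caller decides what to do
    try:
        return work()
    finally:
        lock.release()

doc_lock = threading.Lock()  # stand-in for the document's write lock
print(with_try_lock(doc_lock, lambda: "edited", milliseconds=50))
```

With the real API the same shape would apply: check the boolean returned by the try-lock call, do the work inside try, and release the matching lock in finally.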

EnableUndoRedo()[source]: Enable Undo/Redo in this PDFView.

ExecuteAction(args)[source]

FindTextAsync(search_str, match_case, match_whole_word, search_up, reg_exp)[source]

Searches for the provided search string in the documents in a secondary thread, and calls FindTextHandler with the resulting selection.

Parameters:

search_str (string) – The string to search for in the document
match_case (boolean) – Set to true for case-sensitive search
match_whole_word (boolean) – Set to true to match whole words only
search_up (boolean) – Set to true to search up through the document
reg_exp (boolean) – Set to true to interpret search_str as a regular expression
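
To show how the four boolean flags interact, here is a sketch that emulates them on a plain string with Python's re module. It is not the SDK's search engine: build_pattern and find_all are hypothetical helpers, and the SDK's exact matching rules (for example, what counts as a word boundary) may differ.

```python
import re

def build_pattern(search_str, match_case, match_whole_word, reg_exp):
    """Compile a regex that mimics the documented search flags."""
    body = search_str if reg_exp else re.escape(search_str)
    if match_whole_word:
        body = r"\b(?:" + body + r")\b"
    flags = 0 if match_case else re.IGNORECASE
    return re.compile(body, flags)

def find_all(text, pattern, search_up=False):
    """Return match start offsets, reversed when searching up."""
    starts = [m.start() for m in pattern.finditer(text)]
    return starts[::-1] if search_up else starts

text = "Cat catalog cat"
print(find_all(text, build_pattern("cat", False, False, False)))
print(find_all(text, build_pattern("cat", True, True, False), search_up=True))
```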

GetAnnotTypeUnder(x, y)[source]

Return type:

int

Returns:

annotation type at the given point

Parameters:

x (double) – x coordinate of the input point
y (double) – y coordinate of the input point

GetAnnotationAt(x, y, distanceThreshold, minimumLineWeight)[source]

Gets the annotation at the (x, y) position expressed in screen coordinates

Parameters:

x (int) – x coordinate of the screen point
y (int) – y coordinate of the screen point
distanceThreshold (double) – Maximum distance from the point (x, y) to the annotation for the annot to be considered a hit.
minimumLineWeight (double) – For very thin lines, it is almost impossible to hit the actual line. This specifies a minimum line thickness (in screen coordinates) for the purpose of calculating whether a point is inside the annotation or not.

Return type:

Annot

Returns:

the annotation at (x, y). If there is no annotation at (x, y), the returned annotation’s IsValid method will return false.

GetAnnotationListAt(x1, y1, x2, y2)[source]

Returns a vector of annotations under the line (x1, y1, x2, y2) expressed in screen coordinates. Does not include form field annotations.

Parameters:

x1 (int) – The x-coordinate of the first point of the line.
y1 (int) – The y-coordinate of the first point of the line.
x2 (int) – The x-coordinate of the second point of the line.
y2 (int) – The y-coordinate of the second point of the line.

Return type:

std::vector< PDF::Annot,std::allocator< PDF::Annot > >

Returns:

A vector of annotations under the line (x1, y1, x2, y2) expressed in screen coordinates.

GetAnnotationsOnPage(page_num)[source]

Returns a vector of all of the annotations on the given page.

Parameters:: page_num (int) – The page number for which to retrieve annotations.
Return type:: std::vector< PDF::Annot,std::allocator< PDF::Annot > >
Returns:: A vector of all of the annotations on the given page.

GetBuffer()[source]: Returns the pointer to the internal memory buffer containing the rasterized image of the given page. The buffer size is at least ‘GetBufferHeightGetBufferStride’ bytes. The pixel data is stored in 8 bit per component, BGRA format.

GetBufferHeight()[source]: Returns the width of the rendering buffer in pixels. Notes: this method is typically used only in PDFNet for C++

GetBufferStride()[source]: Returns the stride of the rendering buffer in pixels. Notes: this method is typically used only in PDFNet for C++

GetBufferWidth()[source]: Returns the width of the rendering buffer in pixels. Notes: this method is typically used only in PDFNet for C++
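
Reading a pixel out of such a buffer comes down to offset arithmetic. The sketch below works on an ordinary bytes object rather than the SDK's buffer, and assumes a positive stride measured in bytes per row, which is what the stated minimum buffer size (height times stride, in bytes) implies. pixel_at and the sample image are hypothetical.

```python
def pixel_at(buf, stride, x, y):
    """Return (r, g, b, a) for pixel (x, y) of an 8-bit-per-component BGRA buffer."""
    offset = y * stride + x * 4  # 4 bytes per pixel: B, G, R, A
    b, g, r, a = buf[offset:offset + 4]
    return r, g, b, a

# Hypothetical 2x2 image with a padded 12-byte stride (8 pixel bytes + 4 unused).
stride = 12
buf = bytes([
    255, 0, 0, 255,   0, 255, 0, 255,       0, 0, 0, 0,  # row 0: blue, green, padding
    0, 0, 255, 255,   255, 255, 255, 255,   0, 0, 0, 0,  # row 1: red, white, padding
])
print(pixel_at(buf, stride, 0, 0))  # blue pixel
print(pixel_at(buf, stride, 0, 1))  # red pixel
```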

GetCanvasHeight()[source]: Returns the height of the scrollable space.

GetCanvasWidth()[source]: Returns the width of the scrollable space.

GetColorPostProcessMode()[source]

Return type:: int
Returns:: the current color post processing mode.

GetCurrentPage()[source]

Return type:: int
Returns:: the current page displayed in the view.

GetDeviceTransform(page_num=-1)[source]

Return type:: Matrix2D
Returns:: the device transformation matrix. The device transformation matrix maps the page coordinate system to screen (or device) coordinate system.
Parameters:: page_num (int, optional) – same as for PDFView.Conv???() methods.

Notes: to obtain a transformation matrix that maps screen coordinates to page coordinates, you can invert the device matrix. For example:

Common::Matrix2D scr2page(pdfview.GetDeviceTransform());
scr2page.Inverse();
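
The same inversion can be followed step by step in plain Python. This sketch represents the matrix as a tuple (a, b, c, d, h, v) in the standard PDF layout instead of using the SDK's Matrix2D, and the page-to-screen matrix values are hypothetical.

```python
def invert_matrix(m):
    """Invert a PDF-style affine matrix given as (a, b, c, d, h, v)."""
    a, b, c, d, h, v = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return (d / det, -b / det, -c / det, a / det,
            (c * v - d * h) / det, (b * h - a * v) / det)

def apply_matrix(m, x, y):
    """Map (x, y) through the matrix: x' = a*x + c*y + h, y' = b*x + d*y + v."""
    a, b, c, d, h, v = m
    return (a * x + c * y + h, b * x + d * y + v)

# Hypothetical page-to-screen matrix: 2x zoom, flipped Y, small offset.
page2scr = (2.0, 0.0, 0.0, -2.0, 10.0, 1594.0)
scr2page = invert_matrix(page2scr)

screen_pt = apply_matrix(page2scr, 72.0, 720.0)
print(screen_pt)
print(apply_matrix(scr2page, *screen_pt))
```

Mapping a page point to the screen and back through the inverse returns the original page coordinates, which is a quick sanity check for any screen-to-page conversion.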

GetDoc()[source]

Return type:: PDFDoc
Returns:: Currently associated document with this PDFView.

GetExternalAnnotManager(args)[source]

GetHScrollPos()[source]

Return type:: double
Returns:: the current horizontal scroll position in the scrollable space.

GetLinkAt(x, y)[source]

Gets the link info at a given point, specified in client space.

Parameters:

x (int) – the x position in client space
y (int) – the y position in client space

Return type:

LinkInfo

Returns:

the LinkInfo object with the link information or null if no link is found in the queried location.

Notes: To get valid links, SetUrlExtraction(boolean) must be set to true before opening the document.

GetNextRedoInfo()[source]

Returns any meta-data associated with the next state in the redo chain.

Notes: An empty string means redo is unavailable.

Return type:: string
Returns:: The meta-data associated with the next state in the redo chain.

GetNextUndoInfo()[source]

Returns any meta-data associated with the previous state in the undo chain.

Notes: An empty string or ‘{"label":"initial"}’ means that undo is unavailable.

Return type:: string
Returns:: The meta-data associated with the previous state in the undo chain.
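
A caller can turn the returned string into a simple yes/no check. The sketch below assumes the initial-state marker is the JSON object with a single "label" key set to "initial" and treats any other non-empty value as a real undo state; undo_available is a hypothetical helper and the "add annotation" label is made up.

```python
import json

def undo_available(info):
    """Return True unless `info` is empty or the initial-state marker."""
    if not info:
        return False
    try:
        data = json.loads(info)
    except ValueError:
        return True  # non-JSON meta-data: treat as a real undo state
    return data != {"label": "initial"}

print(undo_available(""))
print(undo_available('{"label": "initial"}'))
print(undo_available('{"label": "add annotation"}'))
```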

GetOCGContext()[source]

Return type:: Context
Returns:: the Optional Content Group (OCG) context associated with this PDFView, or NULL (i.e. context.IsValid()==false) if there is no OCG context associated with the view. If an OCG context is associated with the view, optional content (such as PDF layers) will be selectively rendered based on the states of optional content groups in the given context.

GetPageCount()[source]

Return type:: int
Returns:: the total number of pages in the document.

GetPageNumberFromScreenPt(x, y)[source]

Return type:: int
Returns:: the number of the page located under the given screen coordinate. The positive number indicates a valid page, whereas number less than 1 means that no page was found.

GetPagePresentationMode()[source]

Return type:: int
Returns:: the current page presentation mode.

GetPageRefViewMode()[source]: Gets the reference page view mode. See more details about reference page view mode in SetPageRefViewMode().

GetPageViewMode()[source]

Return type:: int
Returns:: the current page viewing mode

GetPostProcessedColor(color)[source]

Converts a color based on the view’s color post processing transformation.

Parameters:: color (ColorPt) – the color to be converted
Return type:: ColorPt
Returns:: the post-processed color

GetRightToLeftLanguage()[source]

Returns the current right-to-left language mode for structural selection.

Return type:: boolean
Returns:: true if the current language mode is right-to-left.

GetRotation()[source]

Return type:: int
Returns:: The current rotation of this PDFView.

GetScreenRectForAnnot(annot, page_num=-1)[source]

Gets the annotation bounding box in screen points

Parameters:

annot (Annot) – target annotation
page_num (int, optional) – the page number that the annotation is on

Return type:

Rect

Returns:

the annotation bounding box in screen points

GetSelection(pagenum=-1)[source]

Return type:: Selection
Returns:: Current text selection for a given page

GetSelectionBeginPage()[source]

Return type:: int
Returns:: the first page number that has text selection on it. Useful when there are selections on multiple pages at the same time.

GetSelectionEndPage()[source]

Return type:: int
Returns:: the last page number that has text selection on it. Useful when there are selections on multiple pages at the same time.

GetTextSelectionMode()[source]

Return type:: int
Returns:: the current selection mode used for text highlighting.

GetThumbAsync(page_num, instance)[source]

Retrieves the specified thumbnail from the persistent thumbnail cache on disk, then calls proc on the resulting thumbnail.

Parameters:

page_num (int) – The page number of the thumbnail.
proc – A callback function that will be called after the thumbnail is retrieved, or if that retrieval fails.
data – Custom data to be passed as a parameter to ‘proc’.

GetThumbInCache(page_num, buf, out_width, out_height)[source]

Retrieves the specified thumbnail from the persistent thumbnail cache on disk if it is available.

Parameters:

page_num (int) – The page number of the thumbnail.
buf (UChar) – the buffer in which to store thumbnail data. This buffer should have space for GetThumbInCacheSize bytes.
out_width (int) – the width of the thumbnail
out_height (int) – the height of the thumbnail

Return type:

boolean

Returns:

true if the thumbnail is found in the cache and false otherwise.

GetThumbInCacheSize(page_num)[source]

Gets the data size of a cached thumbnail.

Parameters:: page_num (int) – The page number of the thumbnail.
Return type:: int
Returns:: the size of the thumbnail in bytes if the thumbnail is available, otherwise 0.

GetVScrollPos()[source]

Return type:: double
Returns:: the current vertical scroll position in the scrollable space.

GetVisiblePages()[source]

Get a vector with the pages currently visible on the screen.

Return type:: std::vector< int,std::allocator< int > >
Returns:: a vector of the pages currently visible on the screen.

GetZoom()[source]

Returns the current zoom factor.

Return type:: double
Returns:: current zoom (or scaling) component used to display the page content.

GotoFirstPage()[source]

Sets the current page to the first page in the document.

Return type:: boolean
Returns:: true if successful, false otherwise.

GotoLastPage()[source]

Sets the current page to the last page in the document.

Return type:: boolean
Returns:: true if successful, false otherwise.

GotoNextPage()[source]

Sets the current page to the next page in the document.

Return type:: boolean
Returns:: true if successful, false otherwise.

GotoPreviousPage()[source]

Sets the current page to the previous page in the document.

Return type:: boolean
Returns:: true if successful, false otherwise.

HasChangesSinceSnapshot()[source]

Returns whether the document has been modified since the last undo/redo snapshot was taken.

Return type:: boolean
Returns:: true if the document was modified since the last snapshot, false otherwise.

HasSelection()[source]

Return type:: boolean
Returns:: true if there is a selection, false otherwise.

HasSelectionOnPage(ipage)[source]

Return type:: boolean
Returns:: true if the given page number has any text selection on it. Useful when there are selections on multiple pages at the same time.

HideAnnotation(annot)[source]

Disable rendering of a particular annotation. This does not change the annotation itself, just how it is displayed in this viewer instance.

Parameters:: annot (Annot) – The annotation object to cease drawing for.

IsFinishedRendering(visible_region_only)[source]

Parameters:

visible_region_only (boolean) – Specifies if the method refers only to currently visible content.

Return type:

boolean

Returns:

true if the rendering thread finished rendering the view, false if the rendering is still in progress.

IsThereTextInRect(x1, y1, x2, y2)[source]

Return type:: boolean
Returns:: true if there is text in the given rectangle, false otherwise. Point (x2, y2) is the end selection point. The points are defined in screen space.

OnScroll(pix_dx, pix_dy)[source]

Scrolls the contents of the rendering buffer ‘pix_dx’ horizontally and ‘pix_dy’ vertically.

Parameters:

pix_dx (int) – horizontal scroll offset, in pixels
pix_dy (int) – vertical scroll offset, in pixels

OnSize(width, height)[source]

Resize rendering buffer to new dimensions.

Parameters:

width (int) – The width of the target image in pixels.
height (int) – The height of the target image in pixels (the number of rows).

Notes: this method is typically used only in PDFNet for C++

OpenUniversalDoc(conversion)[source]

Associates this PDFView with a given document conversion. The conversion will be performed page-by-page, asynchronously. The PDFView object will be updated to display the conversion result.

Parameters:

conversion – The document conversion to be displayed in the view.

PVM_SIZE = 4

PrepareAnnotsForMouse(page_num, distance_threshold, minimum_line_weight)[source]

Requests preparation of the annotations on the given page.

Notes: Annotations are prepared asynchronously.

Parameters:

page_num (int) – page number
distance_threshold (double) – Maximum distance from the point (x, y) to the annotation for the annot to be considered a hit.
minimum_line_weight (double) – For very thin lines, it is almost impossible to hit the actual line. This specifies a minimum line thickness (in screen coordinates) for the purpose of calculating whether a point is inside the annotation or not.

PrepareWords(page_num)[source]

Requests preparation of the words on the given page.

Notes: Words are prepared asynchronously.

Parameters:: page_num (int) – page number

Redo()[source]

Go to the next state in the undo/redo chain.

Return type:: string
Returns:: Meta-data associated with the current state in the undo/redo chain after redoing one state.

RefreshAndUpdate(view_change)[source]

Helper function that will refresh annotation and/or field appearances if needed, and then render modified page areas, all based on the contents of the view_change parameter.

Parameters:: view_change (ViewChangeCollection) – contains all the updated fields and rectangles.

RevertAllChanges()[source]: Returns the document to the initial state in the undo/redo chain.

RevertChangesSinceSnapshot()[source]

Restores the document to the state of the last undo/redo snapshot, if it has been modified since.

See also: IconCaptionRelation

GetMouseDownCaptionText()[source]

Returns the button down caption text of the annotation.

Return type:: string
Returns:: A string containing the button down text of the annotation.

Notes: The button down caption shall be displayed when the mouse button is pressed within its active area.

GetMouseDownIcon()[source]

Returns the Mouse Down icon associated with the annotation.

Return type:: Obj
Returns:: An SDF object that represents the Mouse Down icon associated with the annotation.

Notes: The Mouse Down icon object is a form XObject defining the annotation’s alternate (down) icon, which shall be displayed when the mouse button is pressed within its active area.

GetRolloverCaptionText()[source]

Returns the rollover caption text of the annotation.

Return type:: string
Returns:: A string containing the rollover caption text of the annotation.

Notes: The rollover caption shall be displayed when the user rolls the cursor into its active area without pressing the mouse button.

GetRolloverIcon()[source]

Returns the rollover icon associated with the annotation.

Return type:: Obj
Returns:: An SDF object that represents the rollover icon associated with the annotation.

Notes: The rollover icon object is a form XObject defining the annotation’s rollover icon, which shall be displayed when the user rolls the cursor into its active area without pressing the mouse button.

GetScaleCondition()[source]

Returns the condition under which the icon should be scaled.

Return type:: int
Returns:: A value of the “ScaleCondition” enum type. Default value: e_Always.

See also: ScaleType

GetStaticCaptionText()[source]

Returns static caption text of the annotation.

Return type:: string
Returns:: A string containing the static caption text of the annotation.

Notes: The static caption is the annotation’s normal caption, which shall be displayed when it is not interacting with the user.

GetStaticIcon()[source]

Returns the static icon associated with the annotation.

Return type:: Obj
Returns:: An SDF object that represents the static icon associated with the annotation.

Notes: The static icon object is a form XObject defining the annotation’s normal icon, which shall be displayed when it is not interacting with the user.

GetTitle()[source]

Returns the title of the annotation.

Return type:: string
Returns:: A string representing the title of the annotation.

GetVIconLeftOver()[source]

Returns the vertical leftover space of the icon within the annotation.

Return type:: double
Returns:: a number indicating the vertical leftover space of the icon within the annotation.

Notes: the vertical leftover space is a number that shall be between 0.0 and 1.0 indicating the fraction of leftover space to allocate at the bottom of the icon. A value of 0.0 shall position the icon at the bottom of the annotation rectangle. A value of 0.5 shall center it in the vertical direction within the rectangle. This entry shall be used only if the icon is scaled proportionally. Default value: 0.5.

SetAction(action)[source]

Sets the action of the Screen annotation. (Optional; PDF 1.1)

Parameters:: action (Action) – An action object representing the action of the annotation.

Notes: The action is an action that shall be performed when the annotation is activated.

SetBackgroundColor(col, numcomp)[source]

Sets the background color of the annotation. (Optional)

Parameters:

col (ColorPt) – A color point that denotes the color of the screen background.
numcomp (int) – An integer whose value indicates the color space used for the parameter col.

SetBorderColor(col, numcomp)[source]

Sets the border color of the annotation. (Optional)

Parameters:

col (ColorPt) – A color object that denotes the color of the screen border.
numcomp (int) – An integer whose value indicates the color space used for the parameter col.

SetFitFull(ff)[source]

Sets the “fit full” flag. (Optional)

Parameters:: ff (boolean) – A boolean value indicating the “fit full” flag value.

Notes: the fit full flag, if true, indicates that the button appearance shall be scaled to fit fully within the bounds of the annotation without taking into consideration the line width of the border. Default value: false.

SetHIconLeftOver(hl)[source]

Sets the horizontal leftover space of the icon within the annotation. (Optional)

Parameters:: hl (double) – A number indicating the horizontal leftover space of the icon within the annotation.

Notes: the horizontal leftover space is a number that shall be between 0.0 and 1.0 indicating the fraction of leftover space to allocate at the left. A value of 0.0 shall position the icon at the left of the annotation rectangle. A value of 0.5 shall center it in the horizontal direction within the rectangle. This entry shall be used only if the icon is scaled proportionally. Default value: 0.5.

SetIconCaptionRelation(icr)[source]

Sets the Icon and caption relationship of the annotation. (Optional; pushbutton fields only)

Parameters:: icr (int) – A value of the “IconCaptionRelation” enum type. Default value: e_NoIcon.

See also: IconCaptionRelation

SetMouseDownCaptionText(contents)[source]

Sets the button down caption text of the annotation. (Optional; button fields only)

Parameters:: contents (string) – A string containing the button down text of the annotation.

Notes: The button down caption shall be displayed when the mouse button is pressed within its active area.

SetMouseDownIcon(icon)[source]

Sets the Mouse Down icon associated with the annotation. (Optional; button fields only)

Parameters:: icon (Obj) – An SDF object that represents the Mouse Down icon associated with the annotation.

Notes: The Mouse Down icon object is a form XObject defining the annotation’s alternate (down) icon, which shall be displayed when the mouse button is pressed within its active area.

SetRolloverCaptionText(contents)[source]

Sets the rollover caption text of the annotation. (Optional; button fields only)

Parameters:: contents (string) – A string containing the rollover caption text of the annotation.

Notes: The rollover caption shall be displayed when the user rolls the cursor into its active area without pressing the mouse button.

SetRolloverIcon(icon)[source]

Sets the rollover icon associated with the annotation. (Optional; button fields only)

Parameters:: icon (Obj) – An SDF object that represents the rollover icon associated with the annotation.

Notes: The rollover icon object is a form XObject defining the annotation’s rollover icon, which shall be displayed when the user rolls the cursor into its active area without pressing the mouse button.

SetScaleCondition(sc)[source]

Sets the condition under which the icon should be scaled. (Optional)

Parameters:: sc (int) – A value of the “ScaleCondition” enum type. Default value: e_Always.

SetScaleType(st)[source]

Sets the Scale Type of the annotation. (Optional)

Parameters:: st (int) – An entry of the “ScaleType” enum which represents the Scale Type of the annotation. Default value: P.

See also: ScaleType

SetStaticCaptionText(contents)[source]

Sets static caption text of the annotation. (Optional; button fields only)

Parameters:: contents (string) – A string containing the static caption text of the annotation.

Notes: The static caption is the annotation’s normal caption, which shall be displayed when it is not interacting with the user.

SetStaticIcon(icon)[source]

Sets the static icon associated with the annotation. (Optional; button fields only)

Parameters:: icon (Obj) – An SDF object that represents the static icon associated with the annotation.

Notes: The static icon object is a form XObject defining the annotation’s normal icon, which shall be displayed when it is not interacting with the user.

SetTitle(title)[source]

Sets the title of the Annotation. (Optional)

Parameters:: title (string) – A string representing the title of the annotation.

SetVIconLeftOver(vl)[source]

Sets the vertical leftover space of the icon within the annotation. (Optional)

Parameters:: vl (double) – A number indicating the vertical leftover space of the icon within the annotation.

Notes: the vertical leftover space is a number that shall be between 0.0 and 1.0 indicating the fraction of leftover space to allocate at the bottom of the icon. A value of 0.0 shall position the icon at the bottom of the annotation rectangle. A value of 0.5 shall center it in the vertical direction within the rectangle. This entry shall be used only if the icon is scaled proportionally. Default value: 0.5.
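The leftover-space fractions described for the horizontal and vertical setters reduce to one formula: the icon's offset from the left (or bottom) edge is the unused space multiplied by the fraction. The sketch below is plain arithmetic on assumed rectangle and icon sizes; `icon_offset` is a hypothetical helper, not an SDK call.

```python
def icon_offset(rect_size: float, icon_size: float, leftover: float) -> float:
    """Offset of the icon from the left (or bottom) edge of the annotation rectangle.

    rect_size and icon_size are measured along the same axis; leftover is the
    fraction passed to SetHIconLeftOver() or SetVIconLeftOver().
    """
    if not 0.0 <= leftover <= 1.0:
        raise ValueError("leftover must be between 0.0 and 1.0")
    # The unused space along this axis is split according to the fraction.
    return (rect_size - icon_size) * leftover


# A 40-unit icon in a 100-unit rectangle:
left_aligned = icon_offset(100.0, 40.0, 0.0)   # icon sits at the left/bottom edge
centered = icon_offset(100.0, 40.0, 0.5)       # default value centers the icon
```

With the default value of 0.5 the 60 units of unused space are split evenly, so the icon starts 30 units from the edge.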

e_Always = 0: Always scale

e_Anamorphic = 0

e_CAboveI = 3

e_CBelowI = 2

e_CLeftIRight = 5

e_COverlayI = 6

e_CRightILeft = 4

e_Never = 3: Never scale

e_NoCaption = 1

e_NoIcon = 0

e_Proportional = 1

e_WhenBigger = 1: Scale only when the icon is bigger than the annotation rectangle

e_WhenSmaller = 2: Scale only when the icon is smaller than the annotation rectangle

property thisown: The membership flag

class apryse_sdk.SearchResult(args)[source]

Bases: object

The result of running PDF::TextSearch::Run()

GetAmbientString()[source]

Return type:: string
Returns:: the ambient string of the found string (computed only if ‘e_ambient_string’ is set).

GetHighlights()[source]

Return type:: Highlights
Returns:: The Highlights info associated with the match (computed only if ‘e_highlight’ is set).

GetMatch()[source]

Return type:: string
Returns:: the string that matches the search pattern.

GetPageNumber()[source]

Return type:: int
Returns:: the number of the page with the match.

IsDocEnd()[source]

Return type:: boolean
Returns:: true if finished searching the entire document.

IsFound()[source]

Return type:: boolean
Returns:: true if a match was found.

IsPageEnd()[source]

Return type:: boolean
Returns:: true if finished searching a page.
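The three status methods suggest a simple consumption loop: record a match when IsFound() is true, stop when IsDocEnd() is true, and otherwise keep going. The sketch below illustrates that control flow only. `FakeResult` and `collect_matches` are hypothetical stand-ins so the example runs without the SDK; `run_search` is assumed to be any zero-argument callable that returns an object with the SearchResult method names.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in that mimics the SearchResult method names."""
    match: str = ""
    page: int = 0
    found: bool = False
    page_end: bool = False
    doc_end: bool = False

    def IsFound(self):
        return self.found

    def IsPageEnd(self):
        return self.page_end

    def IsDocEnd(self):
        return self.doc_end

    def GetMatch(self):
        return self.match

    def GetPageNumber(self):
        return self.page


def collect_matches(run_search):
    """Call run_search() until the document end is reported; return (page, match) pairs."""
    matches = []
    while True:
        result = run_search()
        if result.IsFound():
            matches.append((result.GetPageNumber(), result.GetMatch()))
        elif result.IsDocEnd():
            return matches
        # Otherwise (for example IsPageEnd()), there is nothing to record yet.
```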

property thisown: The membership flag

class apryse_sdk.SecurityHandler(args)[source]

Bases: object

Standard Security Handler is a standard password-based security handler.

Authorize(p)[source]

The method is called when a user tries to set security for an encrypted document and when a user tries to open a file. It must decide, based on the contents of the authorization data structure, whether or not the user is permitted to open the file, and what permissions the user has for this file.

Notes: This callback must not obtain the authorization data (e.g. by displaying a user interface into which a user can type a password). This is handled by the security handler’s GetAuthorizationData(), which must be called before this callback. Instead, Authorize() should work with authorization data it has access to.

Parameters:

p (int) – permission to authorize

AuthorizeFailed()[source]: A callback method indicating repeated failed authorization. Override this callback in order to provide a UI feedback for failed authorization. Default implementation returns immediately.

ChangeMasterPassword(args)[source]

Overload 1:

Sets the new master password to a binary string.

Parameters:: password (string) – the new master password

Remarks: Deprecated. Use the versions that accept UString or buffer instead.

Overload 2:

Sets the new master/owner password.

Parameters:: password (string) – The new master/owner password.

Overload 3:

Sets the new master/owner password.

Parameters:: password_buf (std::vector< int,std::allocator< int > >) – The new master/owner password.

ChangeMasterPasswordASCII(password)[source]

Sets the new master password to an ASCII text string.

Parameters:: password (string) – the new master/owner password

Remarks: Deprecated. Use the versions that accept UString or buffer instead.

ChangeRevisionNumber(rev_num)[source]

Change the revision number and the encryption algorithm of the standard security handler.

Parameters:

rev_num (int) –

the new revision number of the standard security algorithm. Currently allowed values for the revision number are (see Table 3.18 in PDF Reference Manual v1.6 for more details):

2 : Encryption using 40-bit RC4 algorithm.

3 : Encryption using 128-bit RC4 algorithm. Available in PDF 1.4 and above.

4 : Encryption using Crypt filters and 128-bit AES (Advanced Encryption Standard) algorithm. Available in PDF 1.6 and above.
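Because only a few revision numbers are listed as allowed, it can help to validate the value before passing it on. The following is a minimal sketch: the `REVISIONS` table restates the list above, and `describe_revision` is a hypothetical helper rather than an SDK function.

```python
# Revision numbers listed for ChangeRevisionNumber() and what they select.
REVISIONS = {
    2: "40-bit RC4",
    3: "128-bit RC4 (PDF 1.4 and above)",
    4: "Crypt filters with 128-bit AES (PDF 1.6 and above)",
}


def describe_revision(rev_num: int) -> str:
    """Return the algorithm description, or raise if the revision is not listed."""
    if rev_num not in REVISIONS:
        raise ValueError(f"unsupported revision number: {rev_num}")
    return REVISIONS[rev_num]
```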

ChangeUserPassword(args)[source]

Overload 1:

Sets the new user password to a binary string.

Parameters:: password (string) – the new user password

Remarks: Deprecated. Use the versions that accept UString or buffer instead.

Overload 2:

Sets the new user password.

Parameters:: password (string) – The new user password.

Overload 3:

Sets the new user password.

Parameters:: password_buf (std::vector< int,std::allocator< int > >) – The new user password.

ChangeUserPasswordASCII(password)[source]

Sets the new user password to an ASCII text string.

Parameters:: password (string) – the new user password

Remarks: Deprecated. Use the versions that accept UString or buffer instead.

Clone(base)[source]

Return type:: SecurityHandler
Returns:: A new, cloned instance of SecurityHandler.

Notes: this method must be implemented in any derived class from SecurityHandler.

EditSecurityData(doc)[source]

Called when the security handler should activate a dialog box with the current security settings that may be modified.

Return type:: boolean
Returns:: true if the operation was successful, false otherwise.

FillEncryptDict(doc)[source]

Called when an encrypted document is saved. Fills the document’s Encryption dictionary with whatever information the security handler wants to store in the document.

The sequence of events during creation of the encrypt_dict is as follows:

encrypt_dict is created (if it does not exist)
Filter attribute is added to the dictionary
call this method to allow the security handler to add its own attributes
call the GetCryptKey to get the algorithm version, key, and key length
checks if the V attribute has been added to the dictionary and, if not, then sets V to the algorithm version
set the Length attribute if V is 2 or greater
add the encrypt_dict to the document

Parameters:: doc (SDFDoc) – The document to save.
Returns:: encrypt_dict

Warning: Unlike all other strings and streams, direct object elements of the encrypt_dict are not encrypted automatically. If you want them encrypted, you must encrypt them before inserting them into the dictionary.
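The sequence of events listed for FillEncryptDict() can be sketched with plain dictionaries. This is an illustration of the ordering only, not the SDK's implementation: `build_encrypt_dict` and both callbacks are hypothetical, the document is modelled as a `dict`, and the conversion of the key length from bytes to bits for the Length entry is an assumption of this sketch.

```python
def build_encrypt_dict(doc, filter_name, fill_encrypt_dict, get_crypt_key):
    """Walk through the documented steps on a dict-based stand-in document."""
    # 1. encrypt_dict is created (if it does not exist).
    encrypt_dict = doc.get("Encrypt")
    if encrypt_dict is None:
        encrypt_dict = {}
    # 2. The Filter attribute is added to the dictionary.
    encrypt_dict["Filter"] = filter_name
    # 3. The handler adds its own attributes.
    fill_encrypt_dict(encrypt_dict)
    # 4. The algorithm version, key, and key length are obtained.
    version, _key, key_len_bytes = get_crypt_key()
    # 5. V is set only if the handler has not already added it.
    encrypt_dict.setdefault("V", version)
    # 6. Length is set if V is 2 or greater (assumption: stored in bits).
    if encrypt_dict["V"] >= 2:
        encrypt_dict["Length"] = key_len_bytes * 8
    # 7. The encrypt_dict is added to the document.
    doc["Encrypt"] = encrypt_dict
    return encrypt_dict
```

Note that step 5 never overwrites a V entry written by the handler in step 3, which is why the handler callback runs before the key information is queried.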

GetAuthorizationData(req_opr)[source]

This method is invoked in case Authorize() failed. The callback must determine the user’s authorization properties for the document by obtaining authorization data (e.g. a password through a GUI dialog).

The authorization data is subsequently used by the security handler’s Authorize() to determine whether or not the user is authorized to open the file.

Parameters:: req_opr (int) – the permission for which authorization data is requested.
Return type:: boolean
Returns:: false if the operation was canceled, true otherwise.

GetDerived()[source]

Return type:: SecurityHandler
Returns:: The derived class or NULL for standard security handler.

GetEncryptionAlgorithmID()[source]

Return type:: int
Returns:: The encryption algorithm identifier. A code specifying the algorithm to be used in encrypting and decrypting the document. Returned number corresponds to V entry in encryption dictionary. Currently allowed values are from 0-4. See PDF Reference Manual for more details.

GetHandlerDocName()[source]

Return type:: string
Returns:: The name of the security handler as it appears in the serialized file as the value of /Filter key in /Encrypt dictionary.

GetKeyLength()[source]

Return type:: int
Returns:: The length of the encryption key in bytes.

Notes: The returned key length is given in bytes.

GetMasterPassword()[source]

Return type:: string
Returns:: Current master (owner) password.

GetMasterPasswordSize()[source]

Return type:: int
Returns:: Length of the current owner password string. This has to be used when the password is a non-ASCII string that may contain ‘\0’ bytes.

GetPermission(p)[source]

Return type:: boolean
Returns:: true if the SecurityHandler permits the specified action (Permission p) on the document, or false if the permission was not granted.
Parameters:: p (int) – A Permission to be granted.

Notes: To check for a permission, the method repeatedly (up to three times) calls GetAuthorizationData() and then Authorize(). If the permission is not granted, the AuthorizeFailed() callback is called. This callback allows a derived class to provide UI feedback for failed authorization.
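The retry behaviour in the note can be sketched as a loop over the three callbacks. This is one possible reading, not the SDK's code: `get_permission` and its callable arguments are hypothetical, and both the initial Authorize() attempt with existing data and the early exit on cancellation are assumptions.

```python
def get_permission(p, authorize, get_authorization_data, authorize_failed,
                   max_attempts=3):
    """Return True if permission p is granted within max_attempts rounds."""
    # Assumption: authorization data already held by the handler is tried first.
    if authorize(p):
        return True
    for _ in range(max_attempts):
        # GetAuthorizationData() returns false if the user canceled.
        if not get_authorization_data(p):
            return False
        if authorize(p):
            return True
    # All rounds failed: give the derived class a chance to show UI feedback.
    authorize_failed()
    return False
```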

GetRevisionNumber()[source]

Return type:: int
Returns:: the revision number of the standard security algorithm.

GetUserPassword()[source]

Return type:: string
Returns:: Current user password.

GetUserPasswordSize()[source]

Return type:: int
Returns:: Length of the current user password string. This has to be used when the password is a non-ASCII string that may contain ‘\0’ bytes.

InitPassword(args)[source]

Overload 1:

The method can be called in GetAuthorizationData() callback to specify user supplied non-ASCII password.

Remarks: Deprecated. Use the versions that accept UString or buffer instead.

Overload 2:

This method can be called in GetAuthorizationData() callback to specify user supplied password.

Overload 3:

This method can be called in GetAuthorizationData() callback to specify user supplied password.

InitPasswordASCII(password)[source]

The method can be called in GetAuthorizationData() callback to specify user supplied ASCII password.

Remarks: Deprecated. Use the versions that accept UString or buffer instead.

IsAES(args)[source]

Overload 1:

Return type:: boolean
Returns:: true if this security handler uses the 128-bit AES (Advanced Encryption Standard) algorithm to encrypt strings or streams.

Overload 2:

The following function can be used to verify whether a given stream is encrypted using AES.

Return type:: boolean
Returns:: true if the given stream is encrypted using AES encryption.
Parameters:: stream (Obj) – A pointer to an SDF::Stream object

IsMasterPasswordRequired()[source]

Return type:: boolean
Returns:: true if the SecurityHandler requires a master (owner) password.

IsModified()[source]

Return type:

boolean

Returns:

true if the SecurityHandler was modified (by calling SetModified()) or false otherwise.

If the user changes SecurityHandler’s settings (e.g. by changing a password), IsModified() should return true.

IsRC4()[source]

Return type:: boolean
Returns:: true if this security handler uses the RC4 algorithm to encrypt strings or streams.

IsUserPasswordRequired()[source]

Return type:: boolean
Returns:: true if the SecurityHandler requires a user password.

IsValid()[source]

Return type:

boolean

Returns:

true if the SecurityHandler is valid.

SetDerived(overloaded_funct)[source]

This method informs base security handler which methods are overridden in the derived class. The only place this method needs to be invoked is in Create(name, key_len, enc_code) static factory method in the derived class.

Parameters:: overloaded_funct (int) – A flag that specifies which functions are overloaded in the derived class. For example: SetDerived(SecurityHandler::has_CloneProc + SecurityHandler::has_FillEncDictProc);
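The flag passed to SetDerived() is a sum of the has_* constants listed for this class, each of which is a distinct power of two. The sketch below mirrors those values in a local `IntFlag` so the arithmetic can be checked without the SDK; the `DerivedProcs` class and the `overridden_names` helper are hypothetical.

```python
from enum import IntFlag


class DerivedProcs(IntFlag):
    """Local mirror of the has_* constants documented for SecurityHandler."""
    has_CloneProc = 1
    has_AuthProc = 2
    has_AuthFailedProc = 4
    has_GetAuthDataProc = 8
    has_EditSecurDataProc = 16
    has_FillEncDictProc = 32


def overridden_names(flags: int) -> list:
    """List which callbacks a combined flag value declares as overridden."""
    return [member.name for member in DerivedProcs if member & flags]


# Equivalent of the documented example: Clone plus FillEncryptDict.
flags = DerivedProcs.has_CloneProc | DerivedProcs.has_FillEncDictProc
```

Because each constant occupies its own bit, adding the values and combining them with bitwise OR give the same result as long as no constant is repeated.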

SetEncryptMetadata(encrypt_metadata)[source]

Indicates whether the document-level metadata stream is to be encrypted.

Parameters:: encrypt_metadata (boolean) – true if metadata stream should be encrypted, false otherwise.

Notes: EncryptMetadata flag affects only Crypt filters available in PDF 1.5 (Acrobat 6) and later. By default, metadata stream will be encrypted.

SetModified(is_modified=True)[source]

The method allows derived classes to set the SecurityHandler’s is-modified flag. This method should be called whenever there are changes (e.g. a password change) to the SecurityHandler.

Parameters:: is_modified (boolean, optional) – Value to set the SecurityHandler’s is modified flag to

SetPermission(perm, value)[source]

Set the permission setting of the StdSecurityHandler.

Parameters:

perm (int) – indicates a permission to set or clear. It can be any of the following values:

e_print // print the document.
e_doc_modify // edit the document more than adding or modifying text notes.
e_extract_content // enable content extraction
e_mod_annot // allow modifications to annotations
e_fill_forms // allow changes to fill in forms
e_access_support // content access for the visually impaired.
e_assemble_doc // allow document assembly
e_print_high // high resolution print.

value (boolean) – true if the permission(s) should be granted, false otherwise.

e_AES = 3: Use Crypt filters with 128-bit AES (Advanced Encryption Standard) algorithm.

e_AES_256 = 4: Use Crypt filters with 256-bit AES (Advanced Encryption Standard) algorithm.

e_RC4_128 = 2: 128-bit RC4 algorithm.

e_RC4_40 = 1: 40-bit RC4 algorithm.

e_access_support = 9: content access for the visually impaired.

e_assemble_doc = 10: allow document assembly

e_doc_modify = 3: edit the document more than adding or modifying text notes.

e_doc_open = 2: open and decrypt the document.

e_extract_content = 6: enable content extraction

e_fill_forms = 8: allow changes to fill in forms

e_mod_annot = 7: allow modifications to annotations

e_owner = 1: the user has ‘owner’ rights (e.g. rights to change the document’s security settings).

e_print = 4: print the document.

e_print_high = 5: high resolution print.

has_AuthFailedProc = 4

has_AuthProc = 2

has_CloneProc = 1

has_EditSecurDataProc = 16

has_FillEncDictProc = 32

has_GetAuthDataProc = 8

property m_derived_procs

property m_owner

property mp_handler

property thisown: The membership flag

class apryse_sdk.Selection(args)[source]

Bases: object

A class representing the current text selection.

GetAsHtml()[source]

Return type:: string
Returns:: the current text selection in HTML format. HTML text will contain styling information such as text color, font size, style etc.

Notes: this function can be used to implement clipboard copy and paste that preserves text formatting.

GetAsUnicode()[source]

Return type:: string
Returns:: the current text selection represented as a Unicode string.

GetPageNum()[source]

Return type:: int
Returns:: the page number containing the selected text.

GetQuads()[source]

Returns the list of tight bounding quads in the current text selection.

The quads are given as an array of vertices representing a list of bounding quads for the selected text. Each bounding quad is represented using 8 numbers in an array of doubles. Each two consecutive values represent the x and y coordinates of a quad vertex, and the four vertices are arranged counter-clockwise:

3--------2
|        |
|        |
|        |
0--------1

For example, (quad[0], quad[1]) is the coordinate of vertex 0, and (quad[4], quad[5]) is the coordinate of vertex 2. Note that it is only ensured that the four vertices are arranged sequentially; it is possible in practice that (quad[0], quad[1]) is the coordinate of any vertex.

Return type:: std::vector< QuadPoint,std::allocator< PDF::QuadPoint > >
Returns:: the number of quads in ‘quads’ array.

Notes: the ‘quads’ array is owned by the current selection and does not need to be explicitly released.
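The 8-numbers-per-quad layout described above can be unpacked with a few lines of Python. This is a sketch on a flat list of doubles; the Python binding lists a vector of QuadPoint objects as the return type, so real code may need to read the vertices from those objects instead. `split_quads` and `bounding_box` are hypothetical helpers.

```python
def split_quads(flat):
    """Group a flat [x0, y0, x1, y1, ...] list into quads of four (x, y) vertices."""
    if len(flat) % 8 != 0:
        raise ValueError("expected 8 numbers per quad")
    return [
        [(flat[i + 2 * k], flat[i + 2 * k + 1]) for k in range(4)]
        for i in range(0, len(flat), 8)
    ]


def bounding_box(quad):
    """Axis-aligned (x_min, y_min, x_max, y_max) of one quad.

    Uses min/max so it does not depend on which vertex comes first.
    """
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return (min(xs), min(ys), max(xs), max(ys))
```

Taking the minimum and maximum over all four vertices sidesteps the caveat that the first coordinate pair may belong to any vertex.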

property thisown: The membership flag

class apryse_sdk.Separation(args)[source]

Bases: object

This class is used to store separations in PDFRasterize and PDFDraw

C()[source]

GetData()[source]

GetDataSize()[source]

GetSeparationName()[source]

K()[source]

M()[source]

Y()[source]

property m_separation_name

property thisown: The membership flag

class apryse_sdk.Shading(args)[source]

Bases: object

Shading is a class that represents a flat interface around all PDF shading types:

In Function-based (type 1) shadings, the color at every point in the domain is defined by a specified mathematical function. The function need not be smooth or continuous. This is the most general of the available shading types, and is useful for shadings that cannot be adequately described with any of the other types.

Axial shadings (type 2) define a color blend along a line between two points, optionally extended beyond the boundary points by continuing the boundary colors.

Radial shadings (type 3) define a color blend that varies between two circles. Shadings of this type are commonly used to depict three-dimensional spheres and cones.

Free-form Gouraud-shaded triangle mesh shadings (type 4) and lattice Gouraud shadings (type 5) are commonly used to represent complex colored and shaded three-dimensional shapes. The area to be shaded is defined by a path composed entirely of triangles. The color at each vertex of the triangles is specified, and a technique known as Gouraud interpolation is used to color the interiors. The interpolation functions defining the shading may be linear or nonlinear.

Coons patch mesh shadings (type 6) are constructed from one or more color patches, each bounded by four cubic Bezier curves.

A Coons patch generally has two independent aspects:

- Colors are specified for each corner of the unit square, and bilinear interpolation is used to fill in colors over the entire unit square.
- Coordinates are mapped from the unit square into a four-sided patch whose sides are not necessarily linear. The mapping is continuous: the corners of the unit square map to corners of the patch and the sides of the unit square map to sides of the patch.
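The color aspect of a Coons patch can be sketched in a few lines. The function below is a minimal illustration of bilinear interpolation over the unit square, not SDK code; the name `bilinear_color` and the corner naming (c00 at (0, 0), c10 at (1, 0), c01 at (0, 1), c11 at (1, 1)) are assumptions made for this example.

```python
def bilinear_color(c00, c10, c01, c11, u, v):
    """Bilinearly interpolate four corner colors at (u, v) in the unit square.

    Each color is a sequence of components (for example, three for RGB).
    """
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("u and v must lie in [0, 1]")
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(c00, c10, c01, c11)
    )
```

At a corner the result equals that corner's color, and at the center of the square each corner contributes one quarter.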

Tensor-product patch mesh shadings (type 7) are identical to type 6 (Coons mesh), except that they are based on a bicubic tensor-product patch defined by 16 control points, instead of the 12 control points that define a Coons patch. The shading Patterns dictionaries representing the two patch types differ only in the value of the Type entry and in the number of control points specified for each patch in the data stream. Although the Coons patch is more concise and easier to use, the tensor-product patch affords greater control over color mapping.

Destroy()[source]: Frees the native memory of the object.

GetAntialias()[source]

Return type:: boolean
Returns:: A flag indicating whether to filter the shading function to prevent aliasing artifacts. See Table 4.25

GetBBox()[source]

Return type:: Rect
Returns:: a rectangle giving the left, bottom, right, and top coordinates, respectively, of the shading’s bounding box. The coordinates are interpreted in the shading’s target coordinate space. If present, this bounding box is applied as a temporary clipping boundary when the shading is painted, in addition to the current clipping path and any other clipping boundaries in effect at that time.

Notes: Use HasBBox() method to determine whether the shading has a bounding box.

GetBackground()[source]

A color point represented in base color space specifying a single background color value. If present, this color is used before any painting operation involving the shading, to fill those portions of the area to be painted that lie outside the bounds of the shading object itself. In the opaque imaging model, the effect is as if the painting operation were performed twice: first with the background color and then again with the shading.

Notes: The background color is applied only when the shading is used as part of a shading pattern, not when it is painted directly with the sh operator.

Use HasBackground() method to determine whether the shading has a background color.

GetBaseColorSpace()[source]

Return type:: ColorSpace
Returns:: The color space in which color values are expressed. This may be any device, CIE-based, or special color space except a Pattern space.

GetColor(args)[source]

Overload 1:

Return type:: ColorPt
Returns:: a color point for the given value of parametric variable t.

Notes: for shadings other than Axial or Radial this method throws an exception.

Overload 2:

Return type:: ColorPt
Returns:: a color point for the given value of parametric variable (t1, t2).

Notes: for shadings other than Function this method throws an exception.

GetCoordsAxial()[source]

Return type:: std::vector< double,std::allocator< double > >
Returns:: for Axial shading returns four numbers (out_x0, out_y0, out_x1, out_y1) specifying the starting and ending coordinates of the axis, expressed in the shading’s target coordinate space.

Notes: for shadings other than Axial this method throws an exception.

GetCoordsRadial()[source]

Return type:: std::vector< double,std::allocator< double > >
Returns:: for Radial shading returns six numbers (x0 y0 r0 x1 y1 r1) specifying the centers and radii of the starting and ending circles, expressed in the shading’s target coordinate space. The radii r0 and r1 must both be greater than or equal to 0. If one radius is 0, the corresponding circle is treated as a point; if both are 0, nothing is painted.

Notes: for shadings other than Radial this method throws an exception.
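To make the six radial coordinates more concrete, the sketch below computes an intermediate blend circle between the starting and ending circles by linear interpolation of center and radius. It assumes the (x0, y0, r0, x1, y1, r1) ordering given above; `radial_circle` and its parameter `s` (0 at the starting circle, 1 at the ending circle) are hypothetical names for this example, not SDK calls.

```python
def radial_circle(coords, s):
    """Return (cx, cy, r) of the blend circle at position s along the blend.

    coords is the six-number sequence (x0, y0, r0, x1, y1, r1).
    """
    x0, y0, r0, x1, y1, r1 = coords
    if r0 < 0 or r1 < 0:
        raise ValueError("radii must be greater than or equal to 0")
    return (
        x0 + s * (x1 - x0),
        y0 + s * (y1 - y0),
        r0 + s * (r1 - r0),
    )
```

A starting radius of 0 makes the first circle a point, so the blend grows outward from (x0, y0).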

GetDomain()[source]

Return type:: std::vector< double,std::allocator< double > >
Returns:: An array of four numbers [xmin xmax ymin ymax] specifying the rectangular domain of coordinates over which the color function(s) are defined. If the function does not contain /Domain entry the function returns: [0 1 0 1].

Notes: for shadings other than Function this method throws an exception.

GetMatrix()[source]

Return type:: Matrix2D
Returns:: a matrix specifying a mapping from the coordinate space specified by the Domain entry into the shading’s target coordinate space.

Notes: for shadings other than Function this method throws an exception.

GetParamEnd()[source]

Return type:: double
Returns:: a number specifying the limiting value of a parametric variable t. The variable is considered to vary linearly between GetParamStart() and GetParamEnd() as the color gradient varies between the starting and ending points of the axis for Axial shading or circles for Radial shading. The variable t becomes the input argument to the color function(s).

Notes: the returned value corresponds to the second value in Domain array.

for shadings other than Axial or Radial this method throws an exception.

GetParamStart()[source]

Return type:: double
Returns:: a number specifying the limiting value of a parametric variable t. The variable is considered to vary linearly between GetParamStart() and GetParamEnd() as the color gradient varies between the starting and ending points of the axis for Axial shading or circles for Radial shading. The variable t becomes the input argument to the color function(s).

Notes: the returned value corresponds to the first value in Domain array.

for shadings other than Axial or Radial this method throws an exception.
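The linear variation of t between GetParamStart() and GetParamEnd() can be illustrated for the axial case. The sketch below projects a point onto the axis and maps the result into the parametric range; it is an assumption-laden illustration rather than the SDK's implementation. The name `axial_parameter` is hypothetical, `coords` is taken to be the four numbers from GetCoordsAxial(), and the two extend flags stand in for IsExtendStart() and IsExtendEnd().

```python
def axial_parameter(coords, t_start, t_end, x, y,
                    extend_start=False, extend_end=False):
    """Return the parametric value t for point (x, y), or None if unpainted.

    coords is (x0, y0, x1, y1); t_start and t_end are the limits of t.
    """
    x0, y0, x1, y1 = coords
    dx, dy = x1 - x0, y1 - y0
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("axis start and end points must differ")
    # Fractional position of the point's projection along the axis.
    s = ((x - x0) * dx + (y - y0) * dy) / length_sq
    if s < 0:
        if not extend_start:
            return None  # before the axis and not extended
        s = 0.0
    elif s > 1:
        if not extend_end:
            return None  # past the axis and not extended
        s = 1.0
    return t_start + s * (t_end - t_start)
```

The returned t would then be the input to the color function(s), for example through GetColor(t) on an Axial shading.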

GetSDFObj()[source]

Return type:: Obj
Returns:: the underlying SDF/Cos object

GetType()[source]

Return type:: int
Returns:: The shading type

HasBBox()[source]

Return type:: boolean
Returns:: true if shading has a bounding box, false otherwise.

HasBackground()[source]

Return type:: boolean
Returns:: true if the shading has a background color or false otherwise.

IsExtendEnd()[source]

Return type:: boolean
Returns:: a flag specifying whether to extend the shading beyond the ending point of the axis for Axial shading or ending circle for Radial shading.

Notes: for shadings other than Axial or Radial this method throws an exception.

IsExtendStart()[source]

Return type:: boolean
Returns:: a flag specifying whether to extend the shading beyond the starting point of the axis for Axial shading or starting circle for Radial shading.

Notes: for shadings other than Axial or Radial this method throws an exception.

e_axial_shading = 1

e_coons_shading = 5

e_free_gouraud_shading = 3

e_function_shading = 0

e_lattice_gouraud_shading = 4

e_null = 7

e_radial_shading = 2

e_tensor_shading = 6
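Note that these constants are zero-based, while the shading types in the class description are numbered 1 through 7. The sketch below maps a value such as the one returned by GetType() to a readable label. The integer values are copied from the constants listed here; the dictionary and the function `describe_shading_type` are hypothetical helpers, not part of the SDK.

```python
# Values copied from the constants listed in this reference.
SHADING_TYPE_NAMES = {
    0: "function-based",
    1: "axial",
    2: "radial",
    3: "free-form Gouraud triangle mesh",
    4: "lattice-form Gouraud triangle mesh",
    5: "Coons patch mesh",
    6: "tensor-product patch mesh",
    7: "null",
}


def describe_shading_type(type_value):
    """Return a readable label for a shading type constant."""
    name = SHADING_TYPE_NAMES.get(type_value)
    if name is None:
        raise ValueError(f"unknown shading type: {type_value}")
    if type_value == 7:
        return name  # e_null has no PDF shading type number
    # The PDF type number is one higher than the constant.
    return f"{name} (PDF type {type_value + 1})"
```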

property mp_shade

property thisown: The membership flag

class apryse_sdk.ShapedText(args)[source]

Bases: object

The class ShapedText. A sequence of positioned glyphs – the visual representation of a given text string

Destroy()[source]

GetFailureReason()[source]

In the case where GetShapingStatus() returns something other than FullShaping, this method will return a more detailed reason behind the failure.

Return type:

int

Returns:

One of e_NoFailure, e_UnsupportedFontType, e_NotIndexEncoded, or e_FontDataNotFound.

GetGlyph(index)[source]

Get the glyph ID at the indicated place in the shaped sequence. This number is specific to the font file used to generate the shaping results, and does not always have a clean mapping to a particular Unicode codepoint in the original string.

Parameters:: index (int) – The index of the glyph to be retrieved. Must be less than GetNumGlyphs().
Return type:: int
Returns:: returns the glyph ID for the indicated place in the shaped result.

GetGlyphXPos(index)[source]

The X position of the glyph at the requested index. This number has been scaled by GetScale().

Parameters:: index (int) – The index of the glyph position to be retrieved. Must be less than GetNumGlyphs().
Return type:: double
Returns:: returns the X position for the glyph at the specified index.

GetGlyphYPos(index)[source]

The Y position of the glyph at the requested index. This number has been scaled by GetScale().

Parameters:: index (int) – The index of the glyph position to be retrieved. Must be less than GetNumGlyphs().
Return type:: double
Returns:: returns the Y position for the glyph at the specified index.

GetNumGlyphs()[source]

Number of glyphs present in the shaped text. Might be different from the .

Return type:: int
Returns:: returns the number of utf32 codepoints in this shaped text.

GetScale()[source]

Scaling factor of this shaped text relative to the em size. A scaling factor of 1 means that all units are relative to the em size. PDF scaling is typically 1000 units per em.

Return type:: double
Returns:: returns the scaling factor for the glyph positions.
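The glyph accessors are index-based, so reading a shaped run usually means looping from 0 up to GetNumGlyphs(). The sketch below collects glyph IDs and positions using only the method names documented here. `collect_glyphs` is a hypothetical helper, and `FakeShapedText` is a stand-in object invented so that the example runs without the SDK; a real ShapedText instance would be passed instead.

```python
def collect_glyphs(shaped):
    """Return a list of (glyph_id, x, y) tuples for a shaped text object."""
    return [
        (shaped.GetGlyph(i), shaped.GetGlyphXPos(i), shaped.GetGlyphYPos(i))
        for i in range(shaped.GetNumGlyphs())
    ]


class FakeShapedText:
    """Hypothetical stand-in that mimics the documented accessors."""

    def __init__(self, glyphs):
        self._glyphs = glyphs  # list of (glyph_id, x, y)

    def GetNumGlyphs(self):
        return len(self._glyphs)

    def GetGlyph(self, index):
        return self._glyphs[index][0]

    def GetGlyphXPos(self, index):
        return self._glyphs[index][1]

    def GetGlyphYPos(self, index):
        return self._glyphs[index][2]


glyphs = collect_glyphs(FakeShapedText([(36, 0.0, 0.0), (37, 600.0, 0.0)]))
```

The positions are returned as reported; how they relate to the font size depends on GetScale() and is left to the caller.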

GetShapingStatus()[source]

Get the state of the shaping operation. Even if the shaping did not fully succeed, this object can be added to an ElementBuilder, and will fall back to placing unshaped text. See GetFailureReason() in the case this method returns something other than FullShaping.

Return type:

int

Returns:

One of e_FullShaping, e_PartialShaping, or e_NoShaping.

GetText()[source]

The original source text string.

Return type:: string
Returns:: returns the source text string.

e_FontDataNotFound = 3

e_FullShaping = 0

e_NoFailure = 0

e_NoShaping = 2

e_NotIndexEncoded = 2

e_PartialShaping = 1

e_UnsupportedFontType = 1
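The constants above share integer values, which suggests two separate enumerations: one for the shaping status and one for the failure reason. The sketch below assumes that grouping, inferred from the constant names and from the GetShapingStatus() and GetFailureReason() descriptions, and turns a pair of values into a readable message. The dictionaries and `describe_shaping` are hypothetical helpers, not SDK API.

```python
# Assumed grouping of the constants listed in this reference.
SHAPING_STATUS = {
    0: "e_FullShaping",
    1: "e_PartialShaping",
    2: "e_NoShaping",
}
FAILURE_REASON = {
    0: "e_NoFailure",
    1: "e_UnsupportedFontType",
    2: "e_NotIndexEncoded",
    3: "e_FontDataNotFound",
}


def describe_shaping(status, reason):
    """Combine a shaping status and failure reason into one label."""
    status_name = SHAPING_STATUS.get(status, f"unknown status {status}")
    if status == 0:
        # Full shaping: the failure reason carries no extra information.
        return status_name
    reason_name = FAILURE_REASON.get(reason, f"unknown reason {reason}")
    return f"{status_name} ({reason_name})"
```

In practice the two arguments would come from GetShapingStatus() and GetFailureReason() on the same ShapedText object.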

property m_impl

property thisown: The membership flag

class apryse_sdk.SignatureHandler[source]

Bases: object

A base class for SignatureHandler. SignatureHandler instances are responsible for defining the digest and cipher algorithms to create and/or validate a signed PDF document. SignatureHandlers are added to PDFDoc instances by calling the PDFDoc::AddSignatureHandler method.

AppendData(data)[source]

Adds data to be signed. This data will be the raw serialized byte buffer as the PDF is being saved to any stream.

Parameters:: data (std::vector< int,std::allocator< int > >) – A chunk of data to be signed.

Clone()[source]

This method returns a cloned copy of the current instance.

Return type:: SignatureHandler
Returns:: A new, cloned instance of SignatureHandler.

Notes: this method must be implemented in any derived class from SignatureHandler.

CreateSignature()[source]

Calculates the actual signature using client implemented signing methods. The returned value (byte array) will be written as the /Contents entry in the signature dictionary.

Return type:: std::vector< int,std::allocator< int > >
Returns:: The calculated signature data.

GetName()[source]

Gets the name of this SignatureHandler. The name is what distinguishes this SignatureHandler from all others. This name is also added to the PDF as the value of the /Filter entry in the signature dictionary.

Return type:: string
Returns:: The name of this SignatureHandler.

Reset()[source]

Resets any data appending and signature calculations done so far. This method should allow PDFNet to restart the whole signature calculation process. It is important that when this method is invoked, any data processed with the AppendData method should be discarded.

Return type:: boolean
Returns:: True if there are no errors, otherwise false.
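The methods above form a small lifecycle: data arrives in chunks through AppendData, Reset discards it, and CreateSignature produces the final bytes. The sketch below mirrors that lifecycle in a standalone class so that it runs without the SDK. It is only an illustration: it does not subclass apryse_sdk.SignatureHandler, the name it returns is made up, and a SHA-256 digest stands in for a real signature, which would normally be produced by a cryptographic signing library.

```python
import hashlib


class DigestHandlerSketch:
    """Standalone sketch of the SignatureHandler method lifecycle."""

    def __init__(self):
        self._hash = hashlib.sha256()

    def GetName(self):
        # Hypothetical name; a real handler returns its /Filter value.
        return "Example.DigestSketch"

    def AppendData(self, data):
        # data is treated as a sequence of byte values (0-255).
        self._hash.update(bytes(data))

    def Reset(self):
        # Discard everything processed so far.
        self._hash = hashlib.sha256()
        return True

    def CreateSignature(self):
        # A digest is NOT a valid PDF signature; it is a placeholder.
        return list(self._hash.digest())

    def Clone(self):
        clone = DigestHandlerSketch()
        clone._hash = self._hash.copy()
        return clone
```

Keeping the running state in a single hash object makes Reset and Clone straightforward, which matters because the calculation may be restarted while the document is saved.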

property thisown: The membership flag

class apryse_sdk.SignatureWidget(args)[source]

Bases: Widget

An object representing a Signature used in a PDF Form. These Widgets can be signed directly, or signed using a DigitalSignatureField.

static Create(args)[source]

Overload 1:

Creates a new SignatureWidget annotation in the specified document, and adds an associated signature form field to the document.

Parameters:

doc (PDFDoc) – The document to which the widget is to be added.
pos (Rect) – A rectangle specifying the widget’s bounds in default user space units.
field_name (string, optional) – The name of the digital signature field to create. Optional - autogenerated by default.

Return type:

SignatureWidget

Returns:

A newly-created blank SignatureWidget annotation.

Overload 2:

Creates a new SignatureWidget annotation associated with a particular form field in the specified document.

Parameters:

doc (PDFDoc) – The document to which the widget is to be added.
pos (Rect) – A rectangle specifying the widget’s bounds in default user space units.
field (Field) – The digital signature field for which to create a signature widget.

Return type:

SignatureWidget

Returns:

A newly-created blank SignatureWidget annotation.

Overload 3:

Creates a new SignatureWidget annotation associated with a particular DigitalSignatureField object (representing a signature-type form field) in the specified document.

Parameters:

doc (PDFDoc) – The document to which the widget is to be added.
pos (Rect) – A rectangle specifying the widget’s bounds in default user space units.
field (DigitalSignatureField) – The digital signature field for which to create a signature widget.

Return type:

SignatureWidget

Returns:

A newly-created blank SignatureWidget annotation.

CreateSignatureAppearance(img)[source]

A function that will create and add an appearance to this widget by centering an image within it.

Parameters:: img (Image) – A PDF::Image object representing the image to use.

GetDigitalSignatureField()[source]

Retrieves the DigitalSignatureField associated with this SignatureWidget.

Return type:: DigitalSignatureField
Returns:: A DigitalSignatureField object representing the digital signature form field associated with this signature widget annotation.

property thisown: The membership flag

class apryse_sdk.Sound(args)[source]

Bases: Markup

A Sound annotation represents a sound recording attached to a point in the PDF document. When closed, this annotation appears as an icon; when open and activated, a sound recorded from the computer’s microphone or imported from a file associated with this annotation is played. By default, the icon of this annotation is a speaker.

static Create(args)[source]

Overload 1:

Creates a new Sound annotation in the specified document.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Rect) – A rectangle specifying the annotation’s bounds in default user space units.

Return type:

Returns:

A newly created blank Sound annotation.

Overload 2:

Creates a new Sound annotation in the specified document.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Point) – A point specifying the annotation’s location in default user space units.

Return type:

Returns:

A newly created blank Sound annotation.

Overload 3:

Creates a new Sound annotation in the specified document.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Point) – A point specifying the annotation’s location in default user space units.

Return type:

Returns:

A newly created blank Sound annotation.

static CreateAnnot(args)[source]

static CreateWithData(args)[source]

Creates a new Sound annotation in the specified document. Accepts raw audio data, along with a few parameters describing the format of that data.

Parameters:

doc (SDFDoc) – A document to which the annotation is added.
pos (Rect) – A rectangle specifying the annotation’s bounds in default user space units.
source_data (Filter) – The raw sound data for the newly created annot
bits_per_sample (int) – The number of bits per sample in source data
sample_freq (int) – The number of samples per second present in source data
num_channels (int) – The number of audio channels in source_data
icon (int, optional) – A value of the “Icon” enumeration type specifying the icon to display.

Return type: