Class TextExtractor.Word

A TextExtractor.Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor.Style).

Inheritance
object
TextExtractor.Word
Implements
IDisposable
Inherited Members
object.Equals(object, object)
object.GetHashCode()
object.GetType()
object.MemberwiseClone()
object.ReferenceEquals(object, object)
object.ToString()
Namespace: pdftron.PDF
Assembly: PDFTronDotNet.dll
Syntax
public class TextExtractor.Word : IDisposable

Constructors

Word()

Declaration
public Word()

Methods

Dispose()

Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.

Declaration
public void Dispose()

Dispose(bool)

Declaration
protected virtual void Dispose(bool disposing)
Parameters
Type Name Description
bool disposing

Equals(object)

Checks whether this Word object is the same as the specified object.

Declaration
public override bool Equals(object o)
Parameters
Type Name Description
object o

The object to compare with this Word.

Returns
Type Description
bool

true if this Word is equal to the specified object, false otherwise.

Overrides
object.Equals(object)

~Word()

Releases all resources used by the Word.

Declaration
~Word()

GetBBox()

Gets the bounding box of this word.

Declaration
public Rect GetBBox()
Returns
Type Description
Rect

The bounding box for this word (in unrotated page coordinates).

Remarks

To account for the effect of the page's '/Rotate' attribute, transform all points using page.GetDefaultMatrix().
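
The remark above can be sketched as a small helper. This is a minimal illustration, not PDFNet's own code. It assumes the six coefficients of the page's default matrix have already been read into plain doubles named a, b, c, d, h, v (how to read them from the object returned by page.GetDefaultMatrix() is not covered on this page), and that the matrix follows the usual PDF convention x' = a*x + c*y + h, y' = b*x + d*y + v. The class and method names are hypothetical.

```csharp
using System;

public static class PageSpace
{
    // Applies a PDF-style affine matrix [a b c d h v] to one point:
    // x' = a*x + c*y + h,  y' = b*x + d*y + v.
    public static (double X, double Y) Apply(
        double a, double b, double c, double d, double h, double v,
        double x, double y)
    {
        return (a * x + c * y + h, b * x + d * y + v);
    }

    // Transforms all four corners of the box (x1, y1)-(x2, y2) and returns
    // the axis-aligned box that encloses the transformed corners.
    public static (double X1, double Y1, double X2, double Y2) TransformBox(
        double a, double b, double c, double d, double h, double v,
        double x1, double y1, double x2, double y2)
    {
        var corners = new[]
        {
            Apply(a, b, c, d, h, v, x1, y1),
            Apply(a, b, c, d, h, v, x2, y1),
            Apply(a, b, c, d, h, v, x2, y2),
            Apply(a, b, c, d, h, v, x1, y2),
        };

        double minX = double.MaxValue, minY = double.MaxValue;
        double maxX = double.MinValue, maxY = double.MinValue;
        foreach (var p in corners)
        {
            minX = Math.Min(minX, p.X);
            minY = Math.Min(minY, p.Y);
            maxX = Math.Max(maxX, p.X);
            maxY = Math.Max(maxY, p.Y);
        }
        return (minX, minY, maxX, maxY);
    }
}
```

All four corners are transformed, rather than only two opposite ones, because a rotation can swap which corner ends up as the minimum or maximum.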

GetCharStyle(int)

Gets the style of the character at the given index.

Declaration
public TextExtractor.Style GetCharStyle(int char_idx)
Parameters
Type Name Description
int char_idx

The index of a character in this word.

Returns
Type Description
TextExtractor.Style

The style associated with a given character.
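
Since a word can mix several styles, one way to use the per-character lookup is to split the word into runs of characters that share a style. The sketch below is a generic helper with hypothetical names, written against a delegate so it does not depend on the PDFNet assembly. It assumes valid character indices run from 0 to GetStringLen()-1 and that style objects compare equal through Equals, neither of which this page states explicitly.

```csharp
using System;
using System.Collections.Generic;

public static class StyleRuns
{
    // Splits the index range [0, length) into maximal runs of consecutive
    // indices whose style compares equal. Returns (start, count) pairs.
    public static List<(int Start, int Count)> Split<TStyle>(
        int length, Func<int, TStyle> styleAt)
    {
        var runs = new List<(int Start, int Count)>();
        if (length <= 0) return runs;

        var comparer = EqualityComparer<TStyle>.Default;
        int start = 0;
        TStyle current = styleAt(0);

        for (int i = 1; i < length; i++)
        {
            TStyle s = styleAt(i);
            if (!comparer.Equals(current, s))
            {
                // Style changed: close the run that ended at i - 1.
                runs.Add((start, i - start));
                start = i;
                current = s;
            }
        }

        // Close the final run.
        runs.Add((start, length - start));
        return runs;
    }
}
```

With a Word instance named word, a call might look like StyleRuns.Split(word.GetStringLen(), i => word.GetCharStyle(i)).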

GetCurrentNum()

Gets the index of this word within the current line. A word that starts the line returns 0, whereas the last word in the line returns (line.GetNumWords()-1).

Declaration
public int GetCurrentNum()
Returns
Type Description
int

The index of this word within the current line.

GetGlyphQuad(int)

Gets the quadrilateral bounding the glyph at the given index.

Declaration
public double[] GetGlyphQuad(int glyph_idx)
Parameters
Type Name Description
int glyph_idx

The index of a glyph in this word.

Returns
Type Description
double[]

The quadrilateral representing a tight bounding box for a given glyph in the word (in unrotated page coordinates).

GetNextWord()

Gets the next word.

Declaration
public TextExtractor.Word GetNextWord()
Returns
Type Description
TextExtractor.Word

The next Word object.
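
GetNextWord() and IsValid() suggest a linked traversal: start from a word, keep asking for the next one, and stop when the result is no longer valid. The helper below is a generic sketch of that loop with hypothetical names. It assumes the end of the sequence is signalled by a Word whose IsValid() returns false (rather than by null), which this page does not state explicitly, and it leaves open how the first word is obtained.

```csharp
using System;
using System.Collections.Generic;

public static class Chain
{
    // Yields first, next(first), next(next(first)), ... for as long as
    // isValid returns true for the current item.
    public static IEnumerable<T> Walk<T>(
        T first, Func<T, bool> isValid, Func<T, T> next)
    {
        for (T item = first; isValid(item); item = next(item))
        {
            yield return item;
        }
    }
}
```

Given a starting Word named firstWord, usage might look like Chain.Walk(firstWord, x => x.IsValid(), x => x.GetNextWord()), with GetString() called on each yielded word.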

GetNumGlyphs()

Gets the number of glyphs in this word.

Declaration
public int GetNumGlyphs()
Returns
Type Description
int

The number of glyphs in this word.

GetQuad()

Gets the quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).

Declaration
public double[] GetQuad()
Returns
Type Description
double[]

The quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).
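
A quadrilateral is often reduced to an axis-aligned rectangle for hit testing or highlighting. The sketch below shows one way to do that. It assumes the returned array holds eight values laid out as four (x, y) pairs, x1, y1, x2, y2, x3, y3, x4, y4, which this page does not spell out. The class and method names are hypothetical.

```csharp
using System;

public static class QuadMath
{
    // Returns the smallest axis-aligned rectangle that contains all four
    // points of a quad stored as x1, y1, x2, y2, x3, y3, x4, y4.
    public static (double MinX, double MinY, double MaxX, double MaxY) Bounds(
        double[] quad)
    {
        if (quad == null || quad.Length != 8)
        {
            throw new ArgumentException(
                "Expected 8 values (4 points).", nameof(quad));
        }

        double minX = quad[0], maxX = quad[0];
        double minY = quad[1], maxY = quad[1];

        // Visit the remaining three points, two array slots at a time.
        for (int i = 2; i < 8; i += 2)
        {
            minX = Math.Min(minX, quad[i]);
            maxX = Math.Max(maxX, quad[i]);
            minY = Math.Min(minY, quad[i + 1]);
            maxY = Math.Max(maxY, quad[i + 1]);
        }
        return (minX, minY, maxX, maxY);
    }
}
```

For rotated text the resulting rectangle is larger than the quad itself, so the quad should be kept when a tight outline matters.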

GetString()

Gets the content of this word as a Unicode string.

Declaration
public string GetString()
Returns
Type Description
string

The content of this word represented as a Unicode string.

GetStringLen()

Gets the number of characters in this word.

Declaration
public int GetStringLen()
Returns
Type Description
int

The number of characters in this word.

GetStyle()

Gets the predominant style for this word.

Declaration
public TextExtractor.Style GetStyle()
Returns
Type Description
TextExtractor.Style

The predominant style for this word.

IsValid()

Checks whether this is a valid word.

Declaration
public bool IsValid()
Returns
Type Description
bool

true if this is a valid word, false otherwise.

Set(Word)

Sets the value of this Word to that of the given Word object.

Declaration
public void Set(TextExtractor.Word r)
Parameters
Type Name Description
TextExtractor.Word r

a given Word object

op_Assign(Word)

Assignment operator

Declaration
public TextExtractor.Word op_Assign(TextExtractor.Word r)
Parameters
Type Name Description
TextExtractor.Word r

a given Word object

Returns
Type Description
TextExtractor.Word

A Word object equal to the given Word object.

Operators

operator ==(Word, Word)

Equality operator. Checks whether two Word objects are the same.

Declaration
public static bool operator ==(TextExtractor.Word l, TextExtractor.Word r)
Parameters
Type Name Description
TextExtractor.Word l

Word object at the left of the operator

TextExtractor.Word r

Word object at the right of the operator

Returns
Type Description
bool

true if both Word objects are equal, false otherwise

operator !=(Word, Word)

Inequality operator. Checks whether two Word objects are different.

Declaration
public static bool operator !=(TextExtractor.Word l, TextExtractor.Word r)
Parameters
Type Name Description
TextExtractor.Word l

Word object at the left of the operator

TextExtractor.Word r

Word object at the right of the operator

Returns
Type Description
bool

true if the two Word objects are not equal, false otherwise
