Package org.apache.pdfbox.contentstream
Class PDFStreamEngine
java.lang.Object
org.apache.pdfbox.contentstream.PDFStreamEngine
- Direct Known Subclasses:
PDFGraphicsStreamEngine, PDFMarkedContentExtractor, PDFTextStripper
Processes a PDF content stream and executes certain operations.
Provides callback methods that clients can override to react to the operations in the stream.
- Author:
- Ben Litchfield
-
Constructor Summary
Modifier | Constructor | Description
protected | PDFStreamEngine() | Creates a new PDFStreamEngine.
-
Method Summary
Modifier and Type | Method | Description
--- | --- | ---
final void | addOperator(OperatorProcessor op) | Adds an operator processor to the engine.
protected void | applyTextAdjustment(float tx, float ty) | Applies a text position adjustment from the TJ operator.
void | beginMarkedContentSequence(COSName tag, COSDictionary properties) | Called when a marked content group begins.
void | beginText() | Called when the BT operator is encountered.
void | decreaseLevel() | Decrease the level.
void | endMarkedContentSequence() | Called when a marked content group ends.
void | endText() | Called when the ET operator is encountered.
PDAppearanceStream | getAppearance(PDAnnotation annotation) | Returns the appearance stream to process for the given annotation.
PDPage | getCurrentPage() |
int | getGraphicsStackSize() |
PDGraphicsState | getGraphicsState() |
Matrix | getInitialMatrix() | Gets the stream's initial matrix.
int | getLevel() | Get the current level.
PDResources | getResources() |
Matrix | getTextLineMatrix() |
Matrix | getTextMatrix() |
void | increaseLevel() | Increase the level.
boolean | isShouldProcessColorOperators() | Tells whether color operators should be processed.
protected void | operatorException(Operator operator, List<COSBase> operands, IOException e) | Called when an exception is thrown by an operator.
protected void | processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance) | Process the given annotation with the specified appearance stream.
protected void | processChildStream(PDContentStream contentStream, PDPage page) | Process a child stream of the given page.
void | processOperator(String operation, List<COSBase> arguments) | This is used to handle an operation.
protected void | processOperator(Operator operator, List<COSBase> operands) | This is used to handle an operation.
void | processPage(PDPage page) | This will initialize and process the contents of the stream.
protected void | processSoftMask(PDTransparencyGroup group) | Processes a soft mask transparency group stream.
protected final void | processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace) | Process the given tiling pattern.
protected final void | processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix) | Process the given tiling pattern.
protected void | processTransparencyGroup(PDTransparencyGroup group) | Processes a transparency group stream.
protected void | processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix) | Processes a Type 3 character stream.
void | registerOperatorProcessor(String operator, OperatorProcessor op) | Deprecated. Use addOperator(OperatorProcessor) instead.
protected final void | restoreGraphicsStack(Deque<PDGraphicsState> snapshot) | Restores the entire graphics stack.
void | restoreGraphicsState() | Pops the current graphics state from the stack.
protected final Deque<PDGraphicsState> | saveGraphicsStack() | Saves the entire graphics stack.
void | saveGraphicsState() | Pushes the current graphics state to the stack.
void | setLineDashPattern(COSArray array, int phase) |
void | setTextLineMatrix(Matrix value) |
void | setTextMatrix(Matrix value) |
void | showAnnotation(PDAnnotation annotation) | Shows the given annotation.
protected void | showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) | Deprecated. Use showFontGlyph(Matrix, PDFont, int, Vector) instead.
protected void | showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) | Called when a glyph is to be processed.
void | showForm(PDFormXObject form) | Shows a form from the content stream.
protected void | showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) | Deprecated. Use showGlyph(Matrix, PDFont, int, Vector) instead.
protected void | showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) | Called when a glyph is to be processed.
protected void | showText(byte[] string) | Process text from the PDF Stream.
void | showTextString(byte[] string) | Called when a string of text is to be shown.
void | showTextStrings(COSArray array) | Called when a string of text with spacing adjustments is to be shown.
void | showTransparencyGroup(PDTransparencyGroup form) | Shows a transparency group from the content stream.
protected void | showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement) | Deprecated. Use showType3Glyph(Matrix, PDType3Font, int, Vector) instead.
protected void | showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement) | Called when a glyph is to be processed.
Point2D.Float | transformedPoint(float x, float y) | Transforms a point using the CTM.
protected float | transformWidth(float width) | Transforms a width using the CTM.
protected void | unsupportedOperator(Operator operator, List<COSBase> operands) | Called when an unsupported operator is encountered.
-
Constructor Details
-
PDFStreamEngine
protected PDFStreamEngine()
Creates a new PDFStreamEngine.
-
-
Method Details
-
registerOperatorProcessor
Deprecated. Use addOperator(OperatorProcessor) instead.
Register a custom operator processor with the engine.
- Parameters:
operator - The operator as a string.
op - Processor instance.
-
addOperator
Adds an operator processor to the engine.
- Parameters:
op - operator processor
-
processPage
This will initialize and process the contents of the stream.
- Parameters:
page - the page to process
- Throws:
IOException - if there is an error accessing the stream
-
showTransparencyGroup
Shows a transparency group from the content stream.
- Parameters:
form - transparency group (form) XObject
- Throws:
IOException - if the transparency group cannot be processed
-
showForm
Shows a form from the content stream.
- Parameters:
form - form XObject
- Throws:
IOException - if the form cannot be processed
-
processSoftMask
Processes a soft mask transparency group stream.
- Parameters:
group - the transparency group.
- Throws:
IOException
-
processTransparencyGroup
Processes a transparency group stream.
- Parameters:
group - the transparency group.
- Throws:
IOException
-
processType3Stream
protected void processType3Stream(PDType3CharProc charProc, Matrix textRenderingMatrix) throws IOException
Processes a Type 3 character stream.
- Parameters:
charProc - Type 3 character procedure
textRenderingMatrix - the Text Rendering Matrix
- Throws:
IOException - if there is an error reading or parsing the character content stream.
-
processAnnotation
protected void processAnnotation(PDAnnotation annotation, PDAppearanceStream appearance) throws IOException
Process the given annotation with the specified appearance stream.
- Parameters:
annotation - The annotation containing the appearance stream to process.
appearance - The appearance stream to process.
- Throws:
IOException - If there is an error reading or parsing the appearance content stream.
-
processTilingPattern
protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace) throws IOException
Process the given tiling pattern.
- Parameters:
tilingPattern - the tiling pattern
color - color to use, if this is an uncoloured pattern, otherwise null.
colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
- Throws:
IOException - if there is an error reading or parsing the tiling pattern content stream.
-
processTilingPattern
protected final void processTilingPattern(PDTilingPattern tilingPattern, PDColor color, PDColorSpace colorSpace, Matrix patternMatrix) throws IOException
Process the given tiling pattern. Allows the pattern matrix to be overridden for custom rendering.
- Parameters:
tilingPattern - the tiling pattern
color - color to use, if this is an uncoloured pattern, otherwise null.
colorSpace - color space to use, if this is an uncoloured pattern, otherwise null.
patternMatrix - the pattern matrix, may be overridden for custom rendering.
- Throws:
IOException - if there is an error reading or parsing the tiling pattern content stream.
-
showAnnotation
Shows the given annotation.
- Parameters:
annotation - An annotation on the current page.
- Throws:
IOException - If an error occurred reading the annotation
-
getAppearance
Returns the appearance stream to process for the given annotation. May be used to render a specific appearance such as "hover".
- Parameters:
annotation - The current annotation.
- Returns:
- The stream to process.
-
processChildStream
Process a child stream of the given page. Cannot be used with processPage(PDPage).
- Parameters:
contentStream - the child content stream
page - the current page
- Throws:
IOException - if there is an exception while processing the stream
-
beginText
Called when the BT operator is encountered. This method is intended for overriding in subclasses; the default implementation does nothing.
- Throws:
IOException - if there was an error processing the text
-
endText
Called when the ET operator is encountered. This method is intended for overriding in subclasses; the default implementation does nothing.
- Throws:
IOException - if there was an error processing the text
-
showTextString
Called when a string of text is to be shown.
- Parameters:
string - the encoded text
- Throws:
IOException - if there was an error showing the text
-
showTextStrings
Called when a string of text with spacing adjustments is to be shown.
- Parameters:
array - array of encoded text strings and adjustments
- Throws:
IOException - if there was an error showing the text
-
applyTextAdjustment
Applies a text position adjustment from the TJ operator. May be overridden in subclasses.
- Parameters:
tx - x-translation
ty - y-translation
- Throws:
IOException - if something went wrong
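The following sketch illustrates where the tx and ty values could come from. It follows the general PDF rule for the TJ operator: each number in the array is expressed in thousandths of a text space unit and is subtracted from the current position. The class and method names are hypothetical, and the code is an illustration of that rule under those assumptions, not the engine's own implementation.

```java
// Hypothetical helper, not part of PDFBox: converts one numeric TJ entry
// into a translation in text space.
class TextAdjustmentSketch {

    /** Translation along x for horizontal writing mode. */
    static float horizontalAdjustment(float tj, float fontSize, float horizontalScalingPercent) {
        // Thousandths of a unit, scaled by font size and horizontal scaling;
        // the sign is inverted because the value is subtracted.
        return -tj / 1000f * fontSize * (horizontalScalingPercent / 100f);
    }

    /** Translation along y for vertical writing mode (no horizontal scaling). */
    static float verticalAdjustment(float tj, float fontSize) {
        return -tj / 1000f * fontSize;
    }

    public static void main(String[] args) {
        // [(A) -250 (B)] TJ with an 8 pt font and 100% horizontal scaling:
        // the second string starts 2 units further to the right.
        float tx = horizontalAdjustment(-250f, 8f, 100f);
        System.out.println("tx=" + tx + ", ty=0");
    }
}
```

A subclass that tracks glyph positions would typically receive the resulting pair through applyTextAdjustment(tx, ty), with one of the two values being zero depending on the writing mode.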
-
showText
Process text from the PDF Stream. You should override this method if you want to perform an action when encoded text is being processed.
- Parameters:
string - the encoded text
- Throws:
IOException - if there is an error processing the string
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showGlyph(Matrix, PDFont, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showFontGlyph
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showFontGlyph(Matrix, PDFont, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showFontGlyph
protected void showFontGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showType3Glyph
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, String unicode, Vector displacement) throws IOException
Deprecated. Use showType3Glyph(Matrix, PDType3Font, int, Vector) instead.
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
showType3Glyph
protected void showType3Glyph(Matrix textRenderingMatrix, PDType3Font font, int code, Vector displacement) throws IOException
Called when a glyph is to be processed. This method is intended for overriding in subclasses; the default implementation does nothing.
- Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
IOException - if the glyph cannot be processed
-
beginMarkedContentSequence
Called when a marked content group begins.
- Parameters:
tag - indicates the role or significance of the sequence
properties - optional properties
-
endMarkedContentSequence
public void endMarkedContentSequence()
Called when a marked content group ends.
-
processOperator
This is used to handle an operation.
- Parameters:
operation - The operation to perform.
arguments - The list of arguments.
- Throws:
IOException - If there is an error processing the operation.
-
processOperator
This is used to handle an operation.
- Parameters:
operator - The operation to perform.
operands - The list of arguments.
- Throws:
IOException - If there is an error processing the operation.
-
unsupportedOperator
Called when an unsupported operator is encountered.
- Parameters:
operator - The unknown operator.
operands - The list of operands.
- Throws:
IOException - if something went wrong
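The relationship between addOperator, processOperator and unsupportedOperator can be pictured as a lookup table keyed by operator name. The sketch below is a simplified model under that assumption. The Handler interface, the class name and the log list are hypothetical stand-ins, and the real engine works with OperatorProcessor, Operator and COSBase instances rather than plain objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of name-based operator dispatch.
class OperatorDispatchSketch {

    /** Hypothetical stand-in for an operator processor. */
    interface Handler {
        String getName();
        void process(List<Object> operands);
    }

    private final Map<String, Handler> handlers = new HashMap<>();
    final List<String> log = new ArrayList<>();

    /** Registers a handler under the operator name it reports. */
    void addOperator(Handler handler) {
        handlers.put(handler.getName(), handler);
    }

    /** Dispatches to the registered handler, or to the fallback callback. */
    void processOperator(String name, List<Object> operands) {
        Handler handler = handlers.get(name);
        if (handler != null) {
            handler.process(operands);
        } else {
            unsupportedOperator(name, operands);
        }
    }

    /** Fallback callback; a subclass could log, count or ignore here. */
    void unsupportedOperator(String name, List<Object> operands) {
        log.add("unsupported:" + name);
    }

    static List<String> demo() {
        OperatorDispatchSketch engine = new OperatorDispatchSketch();
        engine.addOperator(new Handler() {
            public String getName() { return "BT"; }
            public void process(List<Object> operands) { engine.log.add("handled:BT"); }
        });
        engine.processOperator("BT", new ArrayList<>());
        engine.processOperator("xx", new ArrayList<>());
        return engine.log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [handled:BT, unsupported:xx]
    }
}
```

Keying the table by name means a later registration for the same operator replaces the earlier one, which is one way a client could swap in custom behaviour for a single operator.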
-
operatorException
protected void operatorException(Operator operator, List<COSBase> operands, IOException e) throws IOException
Called when an exception is thrown by an operator.
- Parameters:
operator - The operator that threw the exception.
operands - The list of operands.
e - the thrown exception.
- Throws:
IOException - if something went wrong
-
saveGraphicsState
public void saveGraphicsState()
Pushes the current graphics state to the stack.
-
restoreGraphicsState
public void restoreGraphicsState()
Pops the current graphics state from the stack.
-
saveGraphicsStack
Saves the entire graphics stack.
- Returns:
- the saved graphics state stack.
-
restoreGraphicsStack
Restores the entire graphics stack.
- Parameters:
snapshot - the graphics state stack to be restored.
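The difference between saving a single graphics state and saving the entire stack can be shown with a small model. In the sketch below, State and all method names are hypothetical. Seeding the fresh stack with a copy of the current state is an assumption about one reasonable design, chosen so that a nested stream starts from the current settings but cannot disturb the saved stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of state-level versus stack-level save and restore.
class GraphicsStackSketch {

    /** Minimal stand-in for a graphics state: only a line width. */
    static final class State {
        float lineWidth = 1f;

        State copy() {
            State s = new State();
            s.lineWidth = lineWidth;
            return s;
        }
    }

    private Deque<State> stack = new ArrayDeque<>();

    GraphicsStackSketch() {
        stack.push(new State());
    }

    State current() { return stack.peek(); }
    int size() { return stack.size(); }

    /** Pushes a copy of the current state (comparable to the q operator). */
    void save() { stack.push(stack.peek().copy()); }

    /** Pops the current state (comparable to the Q operator). */
    void restore() { stack.pop(); }

    /** Hands out the whole stack and continues on a fresh one. */
    Deque<State> saveStack() {
        Deque<State> snapshot = stack;
        stack = new ArrayDeque<>();
        stack.push(snapshot.peek().copy());
        return snapshot;
    }

    /** Puts a previously saved stack back in place. */
    void restoreStack(Deque<State> snapshot) { stack = snapshot; }

    static float[] demo() {
        GraphicsStackSketch engine = new GraphicsStackSketch();
        engine.save();
        engine.current().lineWidth = 3f;
        Deque<State> snapshot = engine.saveStack();
        engine.current().lineWidth = 9f;   // change made inside a nested stream
        int nestedSize = engine.size();
        engine.restoreStack(snapshot);
        // nested stack size, restored stack size, restored line width
        return new float[] { nestedSize, engine.size(), engine.current().lineWidth };
    }

    public static void main(String[] args) {
        float[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 1.0 2.0 3.0
    }
}
```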
-
getGraphicsStackSize
public int getGraphicsStackSize()
- Returns:
- the size of the graphics stack.
-
getGraphicsState
- Returns:
- Returns the graphicsState.
-
getTextLineMatrix
- Returns:
- Returns the textLineMatrix.
-
setTextLineMatrix
- Parameters:
value- The textLineMatrix to set.
-
getTextMatrix
- Returns:
- Returns the textMatrix.
-
setTextMatrix
- Parameters:
value- The textMatrix to set.
-
setLineDashPattern
- Parameters:
array - dash array
phase - dash phase
-
getResources
- Returns:
- the stream's resources. This is mainly to be used by the OperatorProcessor classes.
-
getCurrentPage
- Returns:
- the current page.
-
getInitialMatrix
Gets the stream's initial matrix.
- Returns:
- the initial matrix.
-
transformedPoint
Transforms a point using the CTM.
- Parameters:
x - x-coordinate of the point to be transformed.
y - y-coordinate of the point to be transformed.
- Returns:
- the transformed point.
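As an illustration of what transforming a point using the CTM means, the sketch below applies a PDF-style matrix [a b c d e f] to a point with the JDK's java.awt.geom.AffineTransform. The class name and the standalone static method are hypothetical. In the engine the matrix would come from the current graphics state rather than from a parameter.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical standalone version of a CTM point transformation.
class CtmPointSketch {

    /** Maps (x, y) through the matrix: x' = a*x + c*y + e, y' = b*x + d*y + f. */
    static Point2D.Float transformedPoint(AffineTransform ctm, float x, float y) {
        float[] p = { x, y };
        ctm.transform(p, 0, p, 0, 1); // transforms one point in place
        return new Point2D.Float(p[0], p[1]);
    }

    public static void main(String[] args) {
        // PDF matrix [2 0 0 2 10 20]: scale by 2, then translate by (10, 20).
        AffineTransform ctm = new AffineTransform(2.0, 0.0, 0.0, 2.0, 10.0, 20.0);
        Point2D.Float p = transformedPoint(ctm, 3f, 4f);
        System.out.println(p.x + ", " + p.y); // prints 16.0, 28.0
    }
}
```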
-
transformWidth
protected float transformWidth(float width)
Transforms a width using the CTM.
- Parameters:
width - the width value to be transformed.
- Returns:
- the transformed width value.
-
getLevel
public int getLevel()
Get the current level. This can be used to decide whether a recursion has gone too deep and an operation should be skipped to avoid a stack overflow.
- Returns:
- the current level.
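getLevel(), increaseLevel() and decreaseLevel() are meant to be used together around a potentially recursive operation. The sketch below models that pattern with a plain counter. The class name, the visit method and the MAX_LEVEL limit of 10 are hypothetical and chosen only for illustration.

```java
// Hypothetical model of guarding a recursive operation with a level counter.
class RecursionLevelSketch {

    static final int MAX_LEVEL = 10; // illustrative limit, not a library value

    private int level = 0;

    int getLevel() { return level; }
    void increaseLevel() { ++level; }
    void decreaseLevel() { --level; }

    /** Recurses up to requestedDepth times; returns how many visits actually ran. */
    int visit(int requestedDepth) {
        if (getLevel() >= MAX_LEVEL) {
            return 0; // too deep: skip the operation instead of recursing further
        }
        increaseLevel();
        try {
            return 1 + (requestedDepth > 1 ? visit(requestedDepth - 1) : 0);
        } finally {
            decreaseLevel(); // always balanced, even if an exception is thrown
        }
    }

    public static void main(String[] args) {
        RecursionLevelSketch engine = new RecursionLevelSketch();
        System.out.println(engine.visit(3));    // prints 3
        System.out.println(engine.visit(1000)); // prints 10
        System.out.println(engine.getLevel());  // prints 0
    }
}
```

Placing the decrease in a finally block keeps the counter balanced, so a later top-level call starts again at level 0.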
-
increaseLevel
public void increaseLevel()
Increase the level. Call this before running a potentially recursive operation.
-
decreaseLevel
public void decreaseLevel()
Decrease the level. Call this after running a potentially recursive operation, e.g. in a "finally" block so that it always runs. A log message is shown if the level is below 0, which indicates that the increase and decrease calls are not balanced.
-
isShouldProcessColorOperators
public boolean isShouldProcessColorOperators()
Tells whether color operators should be processed. To be used in some OperatorProcessor classes.
- Returns:
- true if color operators should be processed, false if not, e.g. in type3 charprocs with d1 or in uncolored tiling patterns.
-