public class TextExtractor extends DefaultElementVisitor
A subclass of DefaultElementVisitor that extracts display text
from an ODF element. For example, to get all of the text content in a slide's notes,
call getOdfElement() to get the ODF element of the notes, then pass it to
newOdfTextExtractor to create a TextExtractor. Finally, call getText(); all of the
text content is returned as a string. A simpler alternative is to pass the ODF
element directly to the static method TextExtractor.getText(OdfElement).
If you pass the content root, which you can get with Document.getContentRoot(), as the parameter, the
whole document content is returned without any tag information.
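The two call styles described above can be sketched as follows. This is a minimal usage sketch, not an official sample: the import packages for TextExtractor and TextDocument are assumptions (this page does not state them), as are the TextDocument.newTextDocument() and addParagraph(String) calls used to build a throwaway document. The helper names viaInstance, viaStatic and extractFromNewDocument are hypothetical.

```java
// Assumed packages: only org.odftoolkit.odfdom.pkg.OdfElement is named on this page.
import org.odftoolkit.odfdom.pkg.OdfElement;
import org.odftoolkit.simple.TextDocument;
import org.odftoolkit.simple.common.TextExtractor;

public class TextExtractorUsage {

    /** Two-step form: create an extractor for the element, then ask it for the text. */
    static String viaInstance(OdfElement element) {
        TextExtractor extractor = TextExtractor.newOdfTextExtractor(element);
        return extractor.getText();
    }

    /** One-step form: the static convenience method. */
    static String viaStatic(OdfElement element) {
        return TextExtractor.getText(element);
    }

    /** Builds a small in-memory document (assumed helper calls) and extracts its whole content. */
    static String extractFromNewDocument(String paragraph) {
        try {
            TextDocument doc = TextDocument.newTextDocument();
            doc.addParagraph(paragraph);
            // Passing the content root returns the text of the whole document.
            OdfElement root = doc.getContentRoot();
            return viaStatic(root);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build or read the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractFromNewDocument("Hello, ODF"));
    }
}
```

Any other OdfElement, such as the element returned by getOdfElement() for a slide's notes, can be passed to viaInstance or viaStatic in the same way.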
This extractor implements part of the ODF white space handling. The elements concerned are
text:p, text:h, text:s, text:tab and text:line-break, whose visit() methods are overridden to
process white space according to the ODF specification.
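To illustrate the idea, here is a simplified, self-contained model of that white space handling built on the JDK's DOM parser. It is not the library's implementation: the class WhitespaceSketch and its methods are hypothetical, and the exact placement of line breaks around text:p and text:h is an assumption. It also does not model the collapsing of white space inside text nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class WhitespaceSketch {
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Parses an XML fragment and returns its display text. */
    static String extract(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            StringBuilder out = new StringBuilder();
            visit(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fragment", e);
        }
    }

    /** Appends the text of one node, mapping white space elements to characters. */
    static void visit(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments, processing instructions, etc. carry no display text
        }
        Element e = (Element) node;
        String name = TEXT_NS.equals(e.getNamespaceURI()) ? e.getLocalName() : "";
        switch (name) {
            case "s": { // text:s stands for text:c spaces (one if the attribute is absent)
                String count = e.getAttributeNS(TEXT_NS, "c");
                int n = count.isEmpty() ? 1 : Integer.parseInt(count);
                for (int i = 0; i < n; i++) {
                    out.append(' ');
                }
                break;
            }
            case "tab":
                out.append('\t');
                break;
            case "line-break":
                out.append('\n');
                break;
            case "p":
            case "h": // assumption: each paragraph or heading ends with a new line
                visitChildren(e, out);
                out.append('\n');
                break;
            default:
                visitChildren(e, out);
        }
    }

    static void visitChildren(Node parent, StringBuilder out) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            visit(c, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<text:p xmlns:text=\"" + TEXT_NS + "\">a<text:s text:c=\"3\"/>b"
                + "<text:tab/>c<text:line-break/>d</text:p>";
        System.out.print(extract(xml));
    }
}
```

For the fragment in main, the sketch yields "a", three spaces, "b", a tab, "c", a line break, "d" and a final line break.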
See Also: OdfElement

| Modifier and Type | Class and Description |
|---|---|
| protected static class | TextExtractor.ExtractorStringBuilder: Provides string builder functions to the extractor. |
| Modifier and Type | Field and Description |
|---|---|
| protected TextExtractor.ExtractorStringBuilder | mTextBuilder |
| protected static char | NewLineChar |
| protected static char | TabChar |
| Modifier | Constructor and Description |
|---|---|
| protected | TextExtractor(): Default constructor. |
| protected | TextExtractor(OdfElement element): Constructor with an ODF element as parameter. |
| Modifier and Type | Method and Description |
|---|---|
| protected void | appendElementText(OdfElement ele): Appends the text content of this element to the string buffer. |
| String | getText(): Returns the text content of the specified ODF element as a string. |
| static String | getText(OdfElement ele): Returns the text content of an element as a String. |
| static TextExtractor | newOdfTextExtractor(OdfElement element): Creates a TextExtractor instance for the specified ODF element, whose text content can be extracted by getText(). |
| void | visit(OdfElement element): End users can ignore this method unless they want to override the text content handling strategy of OdfElement. |
| void | visit(TextHElement ele): End users can ignore this method unless they want to override the text content handling strategy of text:h. |
| void | visit(TextLineBreakElement ele): End users can ignore this method unless they want to override the text content handling strategy of text:line-break. |
| void | visit(TextPElement ele): End users can ignore this method unless they want to override the text content handling strategy of text:p. |
| void | visit(TextSElement ele): End users can ignore this method unless they want to override the text content handling strategy of text:s. |
| void | visit(TextTabElement ele): End users can ignore this method unless they want to override the text content handling strategy of text:tab. |
Methods inherited from class DefaultElementVisitor: visit (remaining overloads)

protected static final char NewLineChar

protected static final char TabChar

protected final TextExtractor.ExtractorStringBuilder mTextBuilder

protected TextExtractor()

protected TextExtractor(OdfElement element)
Parameters: element - the ODF element whose text will be extracted.

public static String getText(OdfElement ele)
Parameters: ele - the ODF element

public static TextExtractor newOdfTextExtractor(OdfElement element)
Creates a TextExtractor instance for the specified ODF element, whose text content can be extracted by getText().
Parameters: element - the ODF element whose text will be extracted.

public String getText()

public void visit(OdfElement element)
End users can ignore this method unless they want to override the text content handling strategy of OdfElement.
Specified by: visit in interface ElementVisitor
Overrides: visit in class DefaultElementVisitor
See Also: DefaultElementVisitor.visit(org.odftoolkit.odfdom.pkg.OdfElement)

public void visit(TextPElement ele)
Overrides: visit in class DefaultElementVisitor
See Also: DefaultElementVisitor.visit(org.odftoolkit.odfdom.dom.element.text.TextPElement)

public void visit(TextHElement ele)
Overrides: visit in class DefaultElementVisitor
See Also: DefaultElementVisitor.visit(org.odftoolkit.odfdom.dom.element.text.TextHElement)

public void visit(TextSElement ele)
Overrides: visit in class DefaultElementVisitor
See Also: DefaultElementVisitor.visit(org.odftoolkit.odfdom.dom.element.text.TextSElement)

public void visit(TextTabElement ele)
Overrides: visit in class DefaultElementVisitor
See Also: DefaultElementVisitor.visit(org.odftoolkit.odfdom.dom.element.text.TextTabElement)

public void visit(TextLineBreakElement ele)
Overrides: visit in class DefaultElementVisitor
See Also: DefaultElementVisitor.visit(org.odftoolkit.odfdom.dom.element.text.TextLineBreakElement)

protected void appendElementText(OdfElement ele)
Parameters: ele - the ODF element whose text will be appended.

Copyright © 2010–2018 Apache Software Foundation; Copyright © 2018–2020 The Document Foundation. All rights reserved.