Package org.apache.commons.csv
Class ExtendedBufferedReader
java.lang.Object
java.io.Reader
org.apache.commons.io.input.UnsynchronizedReader
org.apache.commons.io.input.UnsynchronizedBufferedReader
org.apache.commons.csv.ExtendedBufferedReader
All Implemented Interfaces:
Closeable, AutoCloseable, Readable
final class ExtendedBufferedReader
extends org.apache.commons.io.input.UnsynchronizedBufferedReader
A special buffered reader which supports sophisticated read access.
In particular, the reader supports a look-ahead option, which allows you to see the next char returned by read(). This reader also tracks how many characters have been read with getPosition().
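The look-ahead idea can be illustrated with the JDK alone. The sketch below is not this class's implementation: ExtendedBufferedReader is package-private, so the example assumes a plain java.io.BufferedReader and uses mark/reset to inspect the next char without consuming it. The names LookAheadSketch and peek are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper, for illustration only; not part of Commons CSV.
final class LookAheadSketch {

    /** Returns the next char without consuming it, or -1 at end of stream. */
    static int peek(BufferedReader reader) throws IOException {
        reader.mark(1);           // remember the current position
        int next = reader.read(); // consume one char (or hit end of stream)
        reader.reset();           // rewind so the char is returned again later
        return next;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("a,b"));
        System.out.println((char) peek(reader));   // look ahead: a
        System.out.println((char) reader.read());  // the same char is still returned: a
    }
}
```

The point of the sketch is only that a look-ahead leaves the stream position unchanged, so the following read() returns the same char.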
-
Field Summary
Fields
Modifier and Type | Field | Description
private long | bytesRead | The number of bytes read so far.
private long | bytesReadMark |
private final CharsetEncoder | encoder | Encoder for calculating the number of bytes for each character read.
private int | lastChar | The last char returned
private int | lastCharMark |
private long | lineNumber | The count of EOLs (CR/LF/CRLF) seen so far
private long | lineNumberMark |
private long | position | The position, which is the number of characters read so far
private long | positionMark |
-
Constructor Summary
Constructors
Constructor | Description
ExtendedBufferedReader(Reader reader) | Constructs a new instance using the default buffer size.
ExtendedBufferedReader(Reader reader, Charset charset, boolean trackBytes) | Constructs a new instance with the specified reader, character set, and byte tracking option.
-
Method Summary
Modifier and Type | Method | Description
void | close() | Closes the stream.
(package private) long | getBytesRead() | Gets the number of bytes read by the reader.
private int | getEncodedCharLength(int current) | Gets the byte length of the given character based on the original Unicode specification, which defined characters as fixed-width 16-bit entities.
(package private) int | getLastChar() | Returns the last character that was read as an integer (0 to 65535).
(package private) long | getLineNumber() | Returns the current line number
(package private) long | getPosition() | Gets the character position in the reader.
void | mark(int readAheadLimit) |
int | read() |
int | read(char[] buf, int offset, int length) |
String | readLine() | Gets the next line, dropping the line terminator(s).
void | reset() |
Methods inherited from class org.apache.commons.io.input.UnsynchronizedBufferedReader
markSupported, peek, peek, ready, skip
Methods inherited from class org.apache.commons.io.input.UnsynchronizedReader
isClosed, setClosed
-
Field Details
-
lastChar
private int lastChar
The last char returned
-
lastCharMark
private int lastCharMark
-
lineNumber
private long lineNumber
The count of EOLs (CR/LF/CRLF) seen so far
-
lineNumberMark
private long lineNumberMark
-
position
private long position
The position, which is the number of characters read so far
-
positionMark
private long positionMark
-
bytesRead
private long bytesRead
The number of bytes read so far.
-
bytesReadMark
private long bytesReadMark
-
encoder
private final CharsetEncoder encoder
Encoder for calculating the number of bytes for each character read.
-
-
Constructor Details
-
ExtendedBufferedReader
ExtendedBufferedReader(Reader reader)
Constructs a new instance using the default buffer size.
-
ExtendedBufferedReader
ExtendedBufferedReader(Reader reader, Charset charset, boolean trackBytes)
Constructs a new instance with the specified reader, character set, and byte tracking option. Initializes an encoder if byte tracking is enabled and a character set is provided.
Parameters:
reader - the reader to wrap with look-ahead support.
charset - the character set for encoding, or null if not applicable.
trackBytes - true to enable byte tracking; false to disable it.
-
-
Method Details
-
close
public void close() throws IOException
Closes the stream.
Specified by:
close in interface AutoCloseable
Specified by:
close in interface Closeable
Overrides:
close in class org.apache.commons.io.input.UnsynchronizedBufferedReader
Throws:
IOException - If an I/O error occurs
-
getBytesRead
long getBytesRead()
Gets the number of bytes read by the reader.
Returns:
the number of bytes read by the reader
-
getEncodedCharLength
private int getEncodedCharLength(int current) throws CharacterCodingException
Gets the byte length of the given character based on the original Unicode specification, which defined characters as fixed-width 16-bit entities.
The Unicode characters are divided into two main ranges:
- U+0000 to U+FFFF (Basic Multilingual Plane, BMP):
  - Represented using a single 16-bit char.
  - Includes UTF-8 encodings of 1-byte, 2-byte, and 3-byte characters.
- U+10000 to U+10FFFF (Supplementary Characters):
  - Represented as a pair of chars:
  - The first char is from the high-surrogates range (U+D800 to U+DBFF).
  - The second char is from the low-surrogates range (U+DC00 to U+DFFF).
  - Includes UTF-8 encodings of all 4-byte characters.
Parameters:
current - the current character to process.
Returns:
the byte length of the character.
Throws:
CharacterCodingException - if the character cannot be encoded.
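The relationship between chars and encoded bytes described above can be checked with the standard java.nio.charset API. This is a minimal sketch, not the method's actual implementation: it assumes UTF-8 and encodes whole strings in one call, and the names EncodedLengthSketch and encodedLength are hypothetical.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of Commons CSV.
final class EncodedLengthSketch {

    /** Returns the number of bytes needed to encode the text in the given charset. */
    static int encodedLength(String text, Charset charset) throws CharacterCodingException {
        CharsetEncoder encoder = charset.newEncoder();
        // encode(CharBuffer) returns a buffer whose remaining bytes are the encoded form;
        // it throws CharacterCodingException for input that cannot be encoded.
        return encoder.encode(CharBuffer.wrap(text)).remaining();
    }

    public static void main(String[] args) throws CharacterCodingException {
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(encodedLength("A", utf8));            // BMP, 1 byte
        System.out.println(encodedLength("\u00e9", utf8));       // BMP, 2 bytes
        System.out.println(encodedLength("\u20ac", utf8));       // BMP, 3 bytes
        System.out.println(encodedLength("\uD83D\uDE00", utf8)); // surrogate pair, 4 bytes
    }
}
```

A supplementary character only encodes correctly when both surrogate chars are presented together; a lone high surrogate passed to this helper raises a CharacterCodingException.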
-
getLastChar
int getLastChar()
Returns the last character that was read as an integer (0 to 65535). This will be the last character returned by any of the read methods. This will not include a character read using the UnsynchronizedBufferedReader.peek() method. If no character has been read then this will return Constants.UNDEFINED. If the end of the stream was reached on the last read then this will return IOUtils.EOF.
Returns:
the last character that was read
-
getLineNumber
long getLineNumber()
Returns the current line number
Returns:
the current line number
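The lineNumber field is documented as the count of EOLs (CR/LF/CRLF) seen so far. One way to count such terminators, treating CRLF as a single EOL, is sketched below. This is an assumption about how the counting could work, not a copy of the class's code; the names EolCountSketch and countEols are hypothetical.

```java
// Hypothetical helper, for illustration only; not part of Commons CSV.
final class EolCountSketch {

    /** Counts line terminators, treating CR, LF, and CRLF each as one EOL. */
    static long countEols(CharSequence text) {
        long count = 0;
        int last = -1; // no previous char yet
        for (int i = 0; i < text.length(); i++) {
            char current = text.charAt(i);
            // A CR always starts a terminator; an LF counts only when it does not complete a CRLF.
            if (current == '\r' || (current == '\n' && last != '\r')) {
                count++;
            }
            last = current;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countEols("a\r\nb\nc\rd")); // 3
    }
}
```

Remembering the previous char is what prevents the LF of a CRLF pair from being counted a second time.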
-
getPosition
long getPosition()
Gets the character position in the reader.
Returns:
the current position in the reader (counting characters, not bytes, since this is a Reader)
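The difference between a character position and a byte count can be seen with a small counting wrapper. The sketch below is built on java.io.FilterReader and is not this class's implementation; CountingReaderSketch and its getPosition method are hypothetical names, and skip() is deliberately left uncounted to keep the example short.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper, for illustration only; not part of Commons CSV.
final class CountingReaderSketch extends FilterReader {

    private long position; // number of chars returned by read calls so far

    CountingReaderSketch(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) {
            position++;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int offset, int length) throws IOException {
        int n = super.read(buf, offset, length);
        if (n > 0) {
            position += n;
        }
        return n;
    }

    long getPosition() {
        return position;
    }

    public static void main(String[] args) throws IOException {
        String text = "h\u00e9llo";
        try (CountingReaderSketch reader = new CountingReaderSketch(new StringReader(text))) {
            while (reader.read() != -1) {
                // drain the reader
            }
            System.out.println(reader.getPosition());                          // 5 chars
            System.out.println(text.getBytes(StandardCharsets.UTF_8).length);  // 6 bytes in UTF-8
        }
    }
}
```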
-
mark
public void mark(int readAheadLimit) throws IOException
Overrides:
mark in class org.apache.commons.io.input.UnsynchronizedBufferedReader
Throws:
IOException
-
read
public int read() throws IOException
Overrides:
read in class org.apache.commons.io.input.UnsynchronizedBufferedReader
Throws:
IOException
-
read
public int read(char[] buf, int offset, int length) throws IOException
Overrides:
read in class org.apache.commons.io.input.UnsynchronizedBufferedReader
Throws:
IOException
-
readLine
public String readLine() throws IOException
Gets the next line, dropping the line terminator(s). This method should only be called when processing a comment; otherwise information can be lost.
Increments lineNumber and updates position.
Sets lastChar to Constants.EOF at EOF, otherwise to the last EOL character.
Overrides:
readLine in class org.apache.commons.io.input.UnsynchronizedBufferedReader
Returns:
the line that was read, or null if EOF was reached.
Throws:
IOException
-
reset
public void reset() throws IOException
Overrides:
reset in class org.apache.commons.io.input.UnsynchronizedBufferedReader
Throws:
IOException
-