Class CsvParser


public final class CsvParser extends AbstractParser<CsvParserSettings>
A very fast CSV parser implementation.
Author:
Univocity Software Pty Ltd - parsers@univocity.com
  • Constructor Details

    • CsvParser

      public CsvParser(CsvParserSettings settings)
      The CsvParser supports all settings provided by CsvParserSettings, and requires this configuration to be properly initialized.
      Parameters:
      settings - the parser configuration
  • Method Details

    • parseRecord

      protected final void parseRecord()
      Description copied from class: AbstractParser
      Parser-specific implementation for reading a single record from the input.

      The AbstractParser handles the initialization and processing of the input until it is ready to be parsed.

      It then delegates the input to the parser-specific implementation defined by AbstractParser.parseRecord(). In general, an implementation of AbstractParser.parseRecord() will perform the following steps:

      • Test the character stored in ch and take some action on it (e.g. while (ch != '\n') { doSomething(); })
      • Request more characters by calling ch = input.nextChar();
      • Append the desired characters to the output by executing, for example, output.appender.append(ch)
      • Notify that a value of the record has been fully read by executing output.valueParsed(). This clears the output appender (CharAppender), so the next call to output.appender.append(ch) stores the first character of the next parsed value
      • Rinse and repeat until all values of the record are parsed
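
      The steps above can be sketched as a small, self-contained loop. This is only an illustration under simplifying assumptions (no quotes, no escapes, '\n' as the only line terminator). RecordLoopSketch, its parseRecord method and the pos cursor are hypothetical names and are not part of the library; a StringBuilder stands in for the output appender.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser-specific loop; not univocity-parsers code.
final class RecordLoopSketch {

    // Reads one record starting at pos[0] and leaves pos[0] just past the line terminator.
    static List<String> parseRecord(String input, int[] pos, char delimiter) {
        List<String> values = new ArrayList<>();
        StringBuilder appender = new StringBuilder();  // plays the role of output.appender
        while (pos[0] < input.length()) {
            char ch = input.charAt(pos[0]++);          // request the next character
            if (ch == '\n') {
                break;                                 // end of the record
            }
            if (ch == delimiter) {
                values.add(appender.toString());       // value fully read: store it
                appender.setLength(0);                 // and clear the appender
            } else {
                appender.append(ch);                   // character belongs to the current value
            }
        }
        values.add(appender.toString());               // last value of the record
        return values;
    }

    public static void main(String[] args) {
        int[] pos = {0};
        String input = "a,b\nc,,d";
        while (pos[0] < input.length()) {
            System.out.println(parseRecord(input, pos, ','));
        }
    }
}
```

      Calling the method repeatedly with the same cursor mirrors the cycle in which control returns to the record-reading code once per record.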

      Once the AbstractParser.parseRecord() returns, the AbstractParser takes over and handles the information (generally, reorganizing it and passing it on to a RowProcessor).

      After the record processing, the AbstractParser reads the next characters from the input, delegating control again to the parseRecord() implementation for processing of the next record.

      This cycle repeats until the reading process is stopped by the user, the input is exhausted, or an error happens.

      In case of errors, the unchecked exception TextParsingException will be thrown and all resources in use will be closed automatically unless CommonParserSettings.isAutoClosingEnabled() evaluates to false. The exception should contain the cause and more information about where in the input the error happened.
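
      The error-handling contract described above can be illustrated with a minimal sketch. It assumes a simplified reader loop that treats a NUL character as a parsing error. AutoCloseSketch, ParsingFailure and the autoClosingEnabled parameter are hypothetical names chosen for this example; they only mimic the roles of TextParsingException and the auto-closing setting and are not the library's implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration of "wrap the error, report its position, close resources".
final class AutoCloseSketch {

    // Unchecked exception that records where in the input the failure happened.
    static final class ParsingFailure extends RuntimeException {
        final long charIndex;

        ParsingFailure(String message, long charIndex, Throwable cause) {
            super(message + " (char index " + charIndex + ")", cause);
            this.charIndex = charIndex;
        }
    }

    // Consumes the input and returns the number of characters read.
    // A NUL character simulates a parsing error.
    static long consume(Reader input, boolean autoClosingEnabled) {
        long index = 0;
        try {
            int ch;
            while ((ch = input.read()) != -1) {
                if (ch == 0) {
                    throw new IllegalStateException("unexpected NUL character");
                }
                index++;
            }
            return index;
        } catch (IOException | IllegalStateException e) {
            if (autoClosingEnabled) {
                try {
                    input.close();                 // release the resource before rethrowing
                } catch (IOException ignored) {
                    // nothing more can be done if closing fails
                }
            }
            throw new ParsingFailure("Error parsing input", index, e);
        }
    }

    public static void main(String[] args) {
        try {
            consume(new StringReader("ab" + (char) 0 + "c"), true);
        } catch (ParsingFailure e) {
            System.out.println(e.getMessage());
        }
    }
}
```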

      Specified by:
      parseRecord in class AbstractParser<CsvParserSettings>
    • getInputAnalysisProcess

      protected final InputAnalysisProcess getInputAnalysisProcess()
      Description copied from class: AbstractParser
      Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
      Overrides:
      getInputAnalysisProcess in class AbstractParser<CsvParserSettings>
      Returns:
      a custom implementation of InputAnalysisProcess. By default, null is returned and no special input analysis will be performed.
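
      To give an idea of what a pre-parse pass over the input buffer could look like, the sketch below guesses a delimiter by counting candidate characters in a sample. This is a deliberately naive frequency heuristic written for illustration only. It is not the detection algorithm used by the library, and DelimiterGuessSketch and guessDelimiter are hypothetical names.

```java
// Hypothetical illustration of analyzing a buffered sample before parsing starts.
final class DelimiterGuessSketch {

    // Returns the candidate that occurs most often in the sample.
    // On a tie, the candidate listed first wins.
    static char guessDelimiter(String sample, char... candidates) {
        char best = candidates[0];
        int bestCount = -1;
        for (char candidate : candidates) {
            int count = 0;
            for (int i = 0; i < sample.length(); i++) {
                if (sample.charAt(i) == candidate) {
                    count++;
                }
            }
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String sample = "id;name;price\n1;apple;0,50\n";
        System.out.println(guessDelimiter(sample, ',', ';', '\t'));
    }
}
```

      A realistic detector would also need to account for quoted values and for consistency of the column count across rows, which this sketch ignores.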
    • getDetectedFormat

      public final CsvFormat getDetectedFormat()
      Returns the CSV format detected when one of the automatic format detection settings is enabled. The detected format becomes available once the parsing process is initialized.
      Returns:
      the detected CSV format, or null if no detection has been enabled or if the parsing process has not been started yet.
    • consumeValueOnEOF

      protected final boolean consumeValueOnEOF()
      Description copied from class: AbstractParser
      Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.
      Overrides:
      consumeValueOnEOF in class AbstractParser<CsvParserSettings>
      Returns:
      a flag indicating whether the parser was processing a value when the end of the input was reached.
    • updateFormat

      public final void updateFormat(CsvFormat format)
      Allows changing the format of the input on the fly.
      Parameters:
      format - the new format to use.