Class BinaryCasSerDes4.Serializer

java.lang.Object
org.apache.uima.cas.impl.BinaryCasSerDes4.Serializer
Enclosing class:
BinaryCasSerDes4

private class BinaryCasSerDes4.Serializer extends Object
Class instantiated once per serialization. Multiple serializations in parallel are supported, with multiple instances of this class.
  • Field Details

    • serializedOut

      private final DataOutputStream serializedOut
    • baseCas

      private final CASImpl baseCas
    • bcsd

      private final BinaryCasSerDes bcsd
    • mark

      private final MarkerImpl mark
    • sm

      private final SerializationMeasures sm
    • baosZipSources

      private final ByteArrayOutputStream[] baosZipSources
    • dosZipSources

      private final DataOutputStream[] dosZipSources
    • heapStart

      private int heapStart
      start of heap, in v2 pseudo-addr coordinates
    • heapEnd

      private int heapEnd
      end of heap, in v2 pseudo-addr coordinates = addr of last + length of last
    • isDelta

      private final boolean isDelta
    • isTsi

      private final boolean isTsi
    • doMeasurement

      private final boolean doMeasurement
    • os

      private final OptimizeStrings os
    • compressLevel

      private final BinaryCasSerDes4.CompressLevel compressLevel
    • compressStrategy

      private final BinaryCasSerDes4.CompressStrat compressStrategy
    • prevFsByType

      private final TOP[] prevFsByType
      For differencing when reading and writing. Also used for arrays to difference the 0th element.
    • prevFs

      private TOP prevFs
    • only1CommonString

      private boolean only1CommonString
    • byte_dos

      private final DataOutputStream byte_dos
    • typeCode_dos

      private final DataOutputStream typeCode_dos
    • strOffset_dos

      private final DataOutputStream strOffset_dos
    • strLength_dos

      private final DataOutputStream strLength_dos
    • float_Mantissa_Sign_dos

      private final DataOutputStream float_Mantissa_Sign_dos
    • float_Exponent_dos

      private final DataOutputStream float_Exponent_dos
    • double_Mantissa_Sign_dos

      private final DataOutputStream double_Mantissa_Sign_dos
    • double_Exponent_dos

      private final DataOutputStream double_Exponent_dos
    • fsIndexes_dos

      private final DataOutputStream fsIndexes_dos
    • control_dos

      private final DataOutputStream control_dos
    • strSeg_dos

      private final DataOutputStream strSeg_dos
    • csds

      private final CommonSerDesSequential csds
    • fs2seq

      private final Obj2IntIdentityHashMap<TOP> fs2seq
      Converts between FSs and "sequential" numbers. This is for compression efficiency, and is also needed for backwards compatibility with v2 serialization forms, where index information was written using "sequential" numbers. Note: this may be an identity map, but may not be in the case of V3, where some FSs are GC'd. Contrast with fs2addr and addr2fs in csds, which use the pseudo v2 addresses as the int.
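      The numbering idea can be sketched as follows. This is a minimal illustration, not the UIMA implementation: the class SeqNumberSketch, its method assignSeq, the use of java.util.IdentityHashMap in place of Obj2IntIdentityHashMap, and the choice to start numbering at 1 are all assumptions made for the example. A null list entry stands in for an FS that was GC'd.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assigns "sequential" numbers to the objects that still exist.
final class SeqNumberSketch {

    // byId is ordered by source id; null marks an object that no longer exists.
    // Numbering from 1 is an assumption of this sketch.
    static Map<Object, Integer> assignSeq(List<Object> byId) {
        Map<Object, Integer> seq = new IdentityHashMap<>(); // keyed by identity, not equals()
        int next = 1;
        for (Object fs : byId) {
            if (fs != null) {
                seq.put(fs, next++); // gaps in the source ids are squeezed out
            }
        }
        return seq;
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object c = new Object();
        // The middle entry is missing, as if that object had been collected.
        Map<Object, Integer> seq = assignSeq(Arrays.asList(a, null, c));
        System.out.println("a -> " + seq.get(a) + ", c -> " + seq.get(c));
    }
}
```

      With a gap in the source ids, the third object receives sequence number 2, which is why the mapping is not always the identity.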
    • uimaSerializableSavedToCas

      private PositiveIntSet uimaSerializableSavedToCas
      Set of FSes on which UimaSerializable _save_to_cas_data has already been called.
  • Constructor Details

  • Method Details

    • serialize

      private void serialize() throws IOException
      Form 4 serialization is tied to the layout of V2 Feature Structures in heaps. It does not walk the indexes to serialize just those FSs that are reachable. For V3, it scans the CASImpl.id2fs information and serializes those (except those which have been GC'd). The seq numbers of the target incrementing sequentially will be different from the source id's if some FSs were GC'd. To determine for delta what new strings and new
      Throws:
      IOException
    • writeStringInfo

      private void writeStringInfo() throws IOException
      Write the compressed string table(s)
      Throws:
      IOException
    • writeFs

      private void writeFs(TOP fs) throws IOException
      Throws:
      IOException
    • serializeIndexedFeatureStructures

      private void serializeIndexedFeatureStructures(CommonSerDesSequential csds) throws IOException
      Throws:
      IOException
    • compressFsxPart

      private int compressFsxPart(int[] fsIndexes, int fsNdxStart, CommonSerDesSequential csds) throws IOException
      Throws:
      IOException
    • serializeArray

      private void serializeArray(TOP fs) throws IOException
      Throws:
      IOException
    • getPrevArray0HeapRef

      private int getPrevArray0HeapRef()
    • getPrevArray0Int

      private int getPrevArray0Int()
    • isNoPrevArrayValue

      private boolean isNoPrevArrayValue(CommonArrayFS prevCommonArray)
    • serializeByKind

      private void serializeByKind(TOP fs, FeatureImpl feat) throws IOException
      Throws:
      IOException
    • serializeArrayLength

      private int serializeArrayLength(TOP fs) throws IOException
      Throws:
      IOException
    • collectAndZip

      private void collectAndZip() throws IOException
      Method:
        write with deflation into a single byte array stream
          skip if not worth deflating
          skip the Slot_Control stream
          record in the Slot_Control stream, for each deflated stream:
            the Slot index
            the number of compressed bytes
            the number of uncompressed bytes
        add to header:
          nbr of compressed entries
          the Slot_Control stream size
          the Slot_Control stream
          all the zipped streams
      Throws:
      IOException - passthru
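      The per-stream step (deflate, then record the slot index with compressed and uncompressed sizes, or skip when deflating does not pay off) might look like the sketch below. It uses only java.util.zip; the class DeflateSlotSketch, its methods, the "not worth deflating" test (compressed size not smaller than raw size), and the sample data are assumptions of this example, not the logic of collectAndZip itself.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of deflating one slot's bytes and checking whether it paid off.
final class DeflateSlotSketch {

    // Deflate raw bytes at the given compression level into a new byte array.
    static byte[] deflate(byte[] raw, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate back to rawLength bytes; the reader needs the recorded uncompressed size.
    static byte[] inflate(byte[] zipped, int rawLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(zipped);
        byte[] raw = new byte[rawLength];
        try {
            int filled = 0;
            while (filled < rawLength) {
                int n = inflater.inflate(raw, filled, rawLength - filled);
                if (n == 0) {
                    break; // no progress: input exhausted or malformed
                }
                filled += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return raw;
    }

    public static void main(String[] args) {
        byte[][] slots = { new byte[1000], {1, 2, 3} }; // sample data, made up for the sketch
        for (int slot = 0; slot < slots.length; slot++) {
            byte[] zipped = deflate(slots[slot], Deflater.BEST_SPEED);
            if (zipped.length >= slots[slot].length) {
                System.out.println("slot " + slot + ": not worth deflating");
                continue;
            }
            // The three values recorded per deflated stream in the description above.
            System.out.println("slot " + slot + ": compressed=" + zipped.length
                    + " uncompressed=" + slots[slot].length);
        }
    }
}
```

      Recording the uncompressed size alongside the compressed size lets a reader allocate the output buffer before inflating.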
    • writeLong

      private void writeLong(long v, long prev) throws IOException
      Throws:
      IOException
    • writeString

      private void writeString(String s) throws IOException
      String encoding:
        Length = 0 - used for null, no offset written
        Length = 1 - used for "", no offset written
        Length > 0 (subtract 1): used for actual string length
        Length < 0 - use (-length) as slot index (minimum is 1, slot 0 is NULL)
      For length > 0, write also the offset.
      Throws:
      IOException - passthru
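      The length codes described for writeString can be restated as a small sketch. The class StringLengthCodeSketch and its three methods are hypothetical names for this example; the sketch covers only the code values, not how the method writes them to its streams, and it reads "no offset written" for "" as taking precedence over "for length > 0, write also the offset".

```java
// Hypothetical sketch of the length codes described for writeString.
final class StringLengthCodeSketch {

    // Code for a string stored by value: 0 for null, 1 for "", otherwise length + 1.
    static int lengthCode(String s) {
        if (s == null) {
            return 0;
        }
        return s.length() + 1;
    }

    // Code for a string referenced by slot index: the negated index (minimum index is 1).
    static int slotCode(int slotIndex) {
        if (slotIndex < 1) {
            throw new IllegalArgumentException("slot index must be >= 1");
        }
        return -slotIndex;
    }

    // An offset accompanies the code only for non-empty strings stored by value.
    static boolean writesOffset(int code) {
        return code > 1;
    }

    public static void main(String[] args) {
        String[] samples = { null, "", "abc" };
        for (String s : samples) {
            int code = lengthCode(s);
            System.out.println(s + " -> code " + code + ", offset written: " + writesOffset(code));
        }
        System.out.println("slot 3 -> code " + slotCode(3));
    }
}
```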
    • writeFloat

      private void writeFloat(int raw) throws IOException
      Need to support NAN sets:
        0x7fc.... for NAN
        0xff8.... for NAN, negative infinity
        0x7f8 for NAN, positive infinity
      Because 0 occurs frequently, we reserve exp of 0 for the value 0.
      Parameters:
      raw - the number to write
      Throws:
      IOException
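      One way to read the note about reserving an exponent of 0 is sketched below: the raw IEEE 754 bits are split into exponent, mantissa, and sign, and the exponent is shifted up by one so that code 0 can stand for the value 0 alone. The class FloatSplitSketch, the array layout it returns, and the join helper are assumptions of this example; how the method itself lays out the mantissa and sign on its streams is not described here.

```java
// Hypothetical sketch: split raw float bits so that exponent code 0 means "the value 0".
final class FloatSplitSketch {

    // Returns {exponentCode, mantissa, sign}. raw == 0 maps to exponentCode 0.
    static int[] split(int raw) {
        if (raw == 0) {
            return new int[] {0, 0, 0};
        }
        int exponentCode = ((raw >>> 23) & 0xff) + 1; // shift by 1: code 0 is reserved
        int mantissa = raw & 0x007fffff;              // low 23 bits
        int sign = raw >>> 31;                        // 1 = negative
        return new int[] {exponentCode, mantissa, sign};
    }

    // Inverse of split, used here to show that NaN and infinity bit patterns survive.
    static int join(int[] parts) {
        if (parts[0] == 0) {
            return 0;
        }
        return (parts[2] << 31) | ((parts[0] - 1) << 23) | parts[1];
    }

    public static void main(String[] args) {
        float[] samples = { 0f, 1f, -2.5f, Float.NaN, Float.NEGATIVE_INFINITY };
        for (float f : samples) {
            int raw = Float.floatToRawIntBits(f);
            int[] p = split(raw);
            System.out.println(f + ": exp=" + p[0] + " mantissa=" + p[1] + " sign=" + p[2]
                    + " roundTrip=" + (join(p) == raw));
        }
    }
}
```

      Because the split works on raw bits, NaN payloads and both infinities pass through unchanged.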
    • writeVnumber

      private void writeVnumber(int kind, int v) throws IOException
      Throws:
      IOException
    • writeVnumber

      private void writeVnumber(int kind, long v) throws IOException
      Throws:
      IOException
    • writeVnumber

      private void writeVnumber(DataOutputStream s, int v) throws IOException
      Throws:
      IOException
    • writeVnumber

      private void writeVnumber(DataOutputStream s, long v) throws IOException
      Throws:
      IOException
    • writeUnsignedByte

      private void writeUnsignedByte(DataOutputStream s, int v) throws IOException
      Throws:
      IOException
    • writeDouble

      private void writeDouble(long raw) throws IOException
      Throws:
      IOException
    • encodeIntSign

      private int encodeIntSign(int v)
    • writeDiff

      private void writeDiff(int kind, int v, int prev) throws IOException
      Encoding:
        bit 6 = sign: 1 = negative
        bit 7 = delta: 1 = delta
      Parameters:
      kind - the kind of slot
      i - runs from iHeap + 3 to end of array
      Throws:
      IOException - passthru
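      A sketch of difference encoding with a sign flag and a delta flag follows. It is not the UIMA implementation. It assumes that "bit 6" and "bit 7" are counted from the most significant bit of the low byte, so the delta flag is the lowest bit and the sign flag the one above it; it also assumes that a delta is used only when v and prev have the same sign and the difference is smaller in magnitude than v. The class DiffEncodeSketch and its methods are hypothetical.

```java
// Hypothetical sketch: encode v either directly or as a difference from prev.
final class DiffEncodeSketch {

    private static final long DELTA_FLAG = 1L; // assumed position: lowest bit
    private static final long SIGN_FLAG = 2L;  // assumed position: next bit up

    static long encode(int v, int prev) {
        long absV = Math.abs((long) v);  // widen first so Integer.MIN_VALUE is safe
        long diff = (long) v - prev;
        long absDiff = Math.abs(diff);
        boolean sameSign = (v > 0 && prev > 0) || (v < 0 && prev < 0);
        if (sameSign && absDiff < absV) {
            // The difference is smaller: store its magnitude, its sign, and the delta flag.
            return (absDiff << 2) | (diff < 0 ? SIGN_FLAG : 0L) | DELTA_FLAG;
        }
        // Otherwise store the value's own magnitude and sign.
        return (absV << 2) | (v < 0 ? SIGN_FLAG : 0L);
    }

    static int decode(long code, int prev) {
        long magnitude = code >>> 2;
        long signed = (code & SIGN_FLAG) != 0 ? -magnitude : magnitude;
        return (int) ((code & DELTA_FLAG) != 0 ? prev + signed : signed);
    }

    public static void main(String[] args) {
        int[][] pairs = { {1005, 1000}, {995, 1000}, {-7, 3}, {0, 42} };
        for (int[] p : pairs) {
            long code = encode(p[0], p[1]);
            System.out.println("v=" + p[0] + " prev=" + p[1] + " code=" + code
                    + " decoded=" + decode(code, p[1]));
        }
    }
}
```

      Values close to their predecessor produce small codes, which is what makes a following variable-length or deflate step effective.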
    • extractStrings

      private void extractStrings(TOP fs)
      Adds strings to the OptimizeStrings object. If delta, only FSs that are new are processed; modified string values are picked up when scanning FsChange items.
      Parameters:
      fs - feature structure
    • extractStringsFromModifications

      private void extractStringsFromModifications(CASImpl.FsChange fsChange)
      For delta, for each fsChange element, extract any strings
      Parameters:
      fsChange -
    • fs2seq

      private int fs2seq(TOP fs)