Package org.apache.uima.cas.impl
Class CasSerializerSupport.CasDocSerializer
java.lang.Object
org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
- Enclosing class:
CasSerializerSupport
An inner class that holds the data for serializing a CAS. Each call to serialize() creates its
own instance.
The class is package private so that a test case can access it, and not static so that it can share the logger and the
initializing values (this could be changed).
-
Field Summary
Fields (Modifier and Type, Field, Description):
final CASImpl cas
private final CasSerializerSupport.CasSerializerSupportSerialize csss
enqueued_multiRef_arrays_or_lists - Set of array or list FSs referenced from features marked as multipleReferencesAllowed, which have previously been serialized "inline" and which now need to be serialized as separate items. Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS is already processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written.
private final ErrorHandler errorHandler2
indexedFSs - Array of Lists of all FS that are indexed in some view (other than sofas).
final boolean isDelta - Whether the serializer needs to serialize only the deltas, that is, new FSs created after mark represented by Marker object and preexisting FSs and Views that have been modified.
final boolean isDynamicMultiRef - Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists.
final boolean isFiltering - Whether the serializer needs to check for filtered-out types/features.
final boolean isFormattedOutput_inner
final MarkerImpl marker - Used to tell if a FS was created before or after mark.
multiRefFSs - Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON which is computing the multi-refs, not depending on the setting in a feature.
boolean needNameSpaces
nsPrefixesUsed - The set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (XMI serialization) being merged back in.
nsUriToPrefixMap - Map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string.
queue - FSs not in an index, but only being serialized because they're referenced.
For delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization.
private TypeImpl[] sortedUsedTypes
final Comparator<TOP> sortFssByType - Called for JSON serialization. Sorts a view by type, then by begin/end (ascending/descending) for subtypes of Annotation, then by id.
final TypeSystemImpl tsi
private final BitSet typeUsed
visited_not_yet_written - Set of FSs that have been visited and enqueued to be serialized. Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
-
Constructor Summary
Constructors (Constructor, Description):
CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
Method Summary
Methods (Modifier and Type, Method, Description):
void encodeFS - Encode an individual FS.
private void encodeFSs
void encodeIndexed
void encodeQueued
(package private) int enqueueCommon(TOP fs)
private int enqueueCommon(TOP fs, boolean doDeltaAndFilteringCheck)
(package private) int enqueueCommonWithoutDeltaAndFilteringCheck
private void enqueueFeatures(TOP fs) - Enqueue all FSs reachable from features of the given FS.
private void enqueueFeaturesOfFSs(List<TOP> fss)
private void enqueueFeaturesOfIndexed - Enqueue everything reachable from features of indexed FSs.
private void enqueueFsAndMaybeFeatures - Enqueue an FS, and everything reachable from it.
private void enqueueFSArrayElements(FSArray fsArray) - Enqueues all FS reachable from an FSArray.
private void enqueueFSListElements(FSList<TOP> node) - Enqueues all Head values of FSList reachable from an FSList.
private void enqueueIncoming() - Enqueues all FS that are stored in the sharedData's id map.
private void enqueueIndexed() - Add the indexed FSs onto the indexedFSs by view.
(package private) void enqueueIndexedFs_only_not_features(int viewNumber, TOP fs)
private void - When serializing Delta CAS, enqueue encompassing FS of nonshared multivalued FS that have been modified.
(package private) int getNameSpacePrefix(String uimaTypeName, String nsUri, int lastDotIndex)
getSofa(int sofaNum)
TypeImpl[] getSortedUsedTypes
getXmiId - Get the XMI ID to use for an FS.
int getXmiIdAsInt(TOP fs)
private boolean isListElementsMultiplyReferenced(TOP listNode) - For lists, see if this is a plain list (no loops, no other refs to list elements from outside the list); if so, return false. Add all the elements of the list to visited_not_yet_written, noting if they've already been added; this indicates either a loop or another ref from outside, and in either case, return true.
private boolean isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat) - Ordinary FSs referenced as features are not checked by this routine; this is only called for FS lists of various kinds, and FS arrays of various kinds. Not all featValues should be enqueued: list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line; for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them; for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue".
boolean isStaticMultiRef
private void reportMultiRefWarning
void serialize - Starts serialization.
void writeViewsCommons
-
Field Details
-
cas
-
tsi
-
visited_not_yet_written
Set of FSs that have been visited and enqueued to be serialized. Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
FSs are added to this set during the "enqueue" phase, prior to encoding.
Uses:
  - for Arrays and Lists, used to detect multi-refs
  - for Lists, used to detect loops
  - during the enqueuing phase, to prevent multiple enqueuings
  - during the encoding phase, to prevent multiple encodings
Public for use by JsonCasSerializer.
-
enqueued_multiRef_arrays_or_lists
Set of array or list FSs referenced from features marked as multipleReferencesAllowed,
  - which have previously been serialized "inline"
  - which now need to be serialized as separate items
Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS is already processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written. It has array or list elements where the item needs to be enqueued onto the "queue" list.
Use: limit the put-onto-queue list to one time.
-
multiRefFSs
Set of FSs that have multiple references.
Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation.
This is for JSON which is computing the multi-refs, not depending on the setting in a feature. This is also for XMI, to enable adding to "queue" (once) for each FS of this kind.
Used:
  - limit the number of times this is put onto the queue to 1.
  - skip encoding of items on "queue" if not in this Set (maybe not needed? 8/2017 mis)
  - serialize if not in indexed set, dynamic ref == true, and in this set (otherwise serialize only from ref)
-
isDynamicMultiRef
public final boolean isDynamicMultiRef
Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists.
-
previouslySerializedFSs
-
modifiedEmbeddedValueFSs
-
indexedFSs
Array of Lists of all FS that are indexed in some view (other than sofas). Array indexed by view. -
queue
FSs not in an index, but only being serialized because they're referenced. Exception: the sofas are here.
-
typeCode2namespaceNames
-
typeUsed
-
needNameSpaces
public boolean needNameSpaces -
nsUriToPrefixMap
Map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string.
-
nsPrefixesUsed
The set of all namespace prefixes used, to disallow some if they are already in use in set-aside data (XMI serialization) being merged back in.
-
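The interplay of nsUriToPrefixMap and nsPrefixesUsed can be illustrated with a minimal sketch. It is not the UIMA implementation: the class NamespacePrefixSketch, its methods, and the numeric-suffix scheme for resolving a collision are all assumptions made for illustration. It only shows the idea of reusing a prefix already assigned to a namespace and rejecting candidates that are already taken, for example by set-aside data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of UIMA.
final class NamespacePrefixSketch {
    // namespace URI -> prefix already assigned to it
    private final Map<String, String> nsUriToPrefix = new HashMap<>();
    // every prefix that may no longer be handed out
    private final Set<String> prefixesUsed = new HashSet<>();

    // Marks a prefix as taken, e.g. one that already occurs in set-aside data.
    void reserve(String prefix) {
        prefixesUsed.add(prefix);
    }

    // Returns the prefix for nsUri. An earlier assignment is reused; otherwise
    // a numeric suffix is appended (an assumed scheme) until the candidate is free.
    String prefixFor(String nsUri, String preferred) {
        String existing = nsUriToPrefix.get(nsUri);
        if (existing != null) {
            return existing;
        }
        String candidate = preferred;
        int suffix = 2;
        while (prefixesUsed.contains(candidate)) {
            candidate = preferred + suffix++;
        }
        prefixesUsed.add(candidate);
        nsUriToPrefix.put(nsUri, candidate);
        return candidate;
    }
}
```

Keeping the two collections separate lets a prefix be blocked without any namespace being mapped to it, which is the situation the field description mentions for set-aside data.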
marker
Used to tell if a FS was created before or after mark. -
isDelta
public final boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
-
isFiltering
public final boolean isFiltering
Whether the serializer needs to check for filtered-out types/features. Set to true if the type system of the CAS does not match the type system that was passed to the constructor of the serializer.
-
sortedUsedTypes
-
errorHandler2
-
filterTypeSystem_inner
-
uniqueStrings
-
isFormattedOutput_inner
public final boolean isFormattedOutput_inner -
csss
-
sortFssByType
Called for JSON serialization. Sorts a view by type, then by begin/end (ascending/descending) for subtypes of Annotation, then by id.
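The ordering described for sortFssByType can be sketched with a chained Comparator. This is an illustration under assumptions, not the UIMA code: FsStub is a hypothetical stand-in for an annotation-like FS, types are ordered by name here (the description only says "by type"), and every element is treated as a subtype of Annotation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an annotation-like FS; not a UIMA class.
final class FsStub {
    final String typeName;
    final int begin;
    final int end;
    final int id;

    FsStub(String typeName, int begin, int end, int id) {
        this.typeName = typeName;
        this.begin = begin;
        this.end = end;
        this.id = id;
    }
}

final class SortFssByTypeSketch {
    // Type first (by name, an assumption), then begin ascending,
    // then end descending, then id ascending as the final tie-breaker.
    static final Comparator<FsStub> BY_TYPE_THEN_SPAN_THEN_ID =
        Comparator.comparing((FsStub fs) -> fs.typeName)
            .thenComparingInt(fs -> fs.begin)
            .thenComparing(Comparator.comparingInt((FsStub fs) -> fs.end).reversed())
            .thenComparingInt(fs -> fs.id);

    // Sorts a copy of the view and returns the ids in sorted order.
    static List<Integer> sortedIds(List<FsStub> view) {
        List<FsStub> copy = new ArrayList<>(view);
        copy.sort(BY_TYPE_THEN_SPAN_THEN_ID);
        List<Integer> ids = new ArrayList<>();
        for (FsStub fs : copy) {
            ids.add(fs.id);
        }
        return ids;
    }
}
```

With end sorted in descending order, an enclosing span comes before the shorter spans that start at the same offset, and the id comparison makes the order deterministic for identical spans.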
-
-
Constructor Details
-
Method Details
-
reportMultiRefWarning
- Throws:
SAXException
-
serialize
Starts serialization.
- Throws:
Exception
- -
-
getSofa
- Parameters:
sofaNum - starts at 1
- Returns:
the sofa FS, or null
-
writeViewsCommons
- Throws:
Exception
-
getSortedUsedTypes
-
getUsedTypesIterable
-
enqueueIncoming
private void enqueueIncoming()
Enqueues all FS that are stored in the sharedData's id map. This map is populated during the previous deserialization. This method is used to make sure that all incoming FS are echoed in the next serialization. It is required if there are out-of-type FSs that are being merged back into the serialized form; those might reference some of these.
-
enqueueIndexed
private void enqueueIndexed()
Add the indexed FSs onto the indexedFSs by view. Add the SofaFSs onto the by-ref queue.
-
enqueueFeaturesOfIndexed
Enqueue everything reachable from features of indexed FSs.
- Throws:
SAXException
-
enqueueFeaturesOfFSs
- Throws:
SAXException
-
enqueueCommon
-
enqueueCommonWithoutDeltaAndFilteringCheck
-
enqueueCommon
- Parameters:
fs -
doDeltaAndFilteringCheck -
- Returns:
true to have enqueue put onto "queue" and enqueue features
-
enqueueIndexedFs_only_not_features
-
enqueueFsAndMaybeFeatures
Enqueue an FS, and everything reachable from it. This call is recursive with enqueueFeatures, and an arbitrarily long chain can cause a stack overflow error. Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
- Parameters:
addr - The FS address.
- Throws:
SAXException
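The stack overflow risk noted for enqueueFsAndMaybeFeatures comes from mutual recursion along a chain of references. A common way to avoid it is an explicit work list. The sketch below shows that general technique only; it is not how the serializer is written and not a proposed patch. NodeStub and IterativeEnqueueSketch are hypothetical names, and the delta, filtering, and multi-ref checks of the real method are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an FS with reference-valued features.
final class NodeStub {
    final int id;
    final List<NodeStub> refs = new ArrayList<>();

    NodeStub(int id) {
        this.id = id;
    }
}

final class IterativeEnqueueSketch {
    // Collects every node reachable from root exactly once, using an
    // explicit stack instead of recursion, so chain length is bounded
    // by heap size rather than by thread stack depth.
    static List<NodeStub> enqueueReachable(NodeStub root) {
        Set<NodeStub> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<NodeStub> pending = new ArrayDeque<>();
        List<NodeStub> queue = new ArrayList<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            NodeStub fs = pending.pop();
            if (!visited.add(fs)) {
                continue; // already enqueued; also stops on reference loops
            }
            queue.add(fs);
            for (NodeStub ref : fs.refs) {
                if (ref != null) {
                    pending.push(ref);
                }
            }
        }
        return queue;
    }
}
```

The visited set is identity-based, so two distinct nodes are never merged even if a class were to define equals.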
-
isListElementsMultiplyReferenced
For lists, see if this is a plain list:
  - no loops
  - no other refs to list elements from outside the list
If so, return false. Add all the elements of the list to visited_not_yet_written, noting if they've already been added; this indicates either a loop or another ref from outside, and in either case, return true.
- Parameters:
curNode -
featCode -
- Returns:
false if no list element is multiply-referenced, true if there is a loop or another ref from outside the list, for one or more list element nodes
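The check described for isListElementsMultiplyReferenced can be sketched as a single walk over the list that adds each node to a visited set. The code below is a simplified illustration, not the UIMA method: ListNodeStub is a hypothetical stand-in with a null tail marking the end of the list, and stopping at the first node that was already present is an assumption (it is what guarantees termination on a loop).

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for a non-empty list node; tail == null ends the list.
final class ListNodeStub {
    ListNodeStub tail;
}

final class ListRefCheckSketch {
    // Identity-based set playing the role of visited_not_yet_written.
    static Set<ListNodeStub> newVisitedSet() {
        return Collections.newSetFromMap(new IdentityHashMap<>());
    }

    // Walks the list from head, adding each node to visited.
    // Returns true if some node was already in the set, which means either
    // a loop inside the list or an earlier reference from outside the list.
    static boolean isListElementsMultiplyReferenced(ListNodeStub head, Set<ListNodeStub> visited) {
        for (ListNodeStub node = head; node != null; node = node.tail) {
            if (!visited.add(node)) {
                return true; // stop here; continuing could cycle forever on a loop
            }
        }
        return false; // plain list: every node was new
    }
}
```

The same set serves both purposes named in the description: a node reached twice within one walk is a loop, and a node reached in a later walk is a second reference into the list.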
-
isMultiRef_enqueue
private boolean isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat) throws SAXException
Ordinary FSs referenced as features are not checked by this routine; this is only called for FS lists of various kinds, and FS arrays of various kinds.
Not all featValues should be enqueued:
  - list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line
  - for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them
  - for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue"
(Not handled here: ordinary FSs are serialized in-line in JSON with isDynamicMultiRef.)
- Parameters:
fi - the feature, to look up the multiRefAllowed flag
featVal - the List or array element
alreadyVisited - true if visited_not_yet_written contains the featVal
isListNode -
isListFeat -
- Returns:
false if should skip enqueue because this array or list is being serialized inline
- Throws:
SAXException
- -
-
enqueueFeatures
Enqueue all FSs reachable from features of the given FS.
- Parameters:
addr - address of an FS
typeCode - type of the FS
insideListNode - true iff the enclosing FS (addr) is a list type
- Throws:
SAXException
-
enqueueFSArrayElements
Enqueues all FS reachable from an FSArray.
- Parameters:
addr - Address of an FSArray
- Throws:
SAXException
-
enqueueFSListElements
Enqueues all Head values of FSList reachable from an FSList. This does NOT include the list nodes themselves.
- Parameters:
addr - Address of an FSList
- Throws:
SAXException
-
encodeIndexed
- Throws:
Exception
-
encodeFSs
- Throws:
Exception
-
encodeQueued
- Throws:
Exception
-
encodeFS
Encode an individual FS. JSON has 2 encodings.
For type:
  "typeName" : [ { "@id" : 123, feat : value .... }, { "@id" : 456, feat : value .... }, ... ], ...
For id:
  "nnnn" : {"@type" : typeName ; feat : value ...}
For cases where the top level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
- Parameters:
fs - the FS to be encoded.
- Throws:
SAXException
- passthruException
-
getXmiId
Get the XMI ID to use for an FS.
- Parameters:
fs - the FS
- Returns:
XMI ID or null
-
getXmiIdAsInt
-
getNameSpacePrefix
-
getUniqueString
-
getTypeNameFromXmlElementName
-
isStaticMultiRef
-