Users of XML parsers are probably already familiar with the concepts of SAX. Significant events are defined that occur during the parsing of a message. As a parser works through a message, these events are ‘fired’ as they occur by invoking user-defined callback functions, also known as event handler functions. A diagram illustrating this parsing process is as follows:
The events are defined to be significant actions that occur during the parsing process. We will define the following events that will be passed to the user when an ASN.1 message is parsed:
startElement – This event occurs when the parser moves into a new element. For example, if we have a SEQUENCE { a, b, c } construct (type names omitted), this event will fire when we begin parsing a, b, and c. The name of the element is passed to the event handling callback function.
endElement – This event occurs when the parser leaves a given element space. Using the example above, it would fire after the parsing of each of a, b, and c is complete. The name of the element is once again passed to the event handling callback function.
contents methods – A series of virtual methods are defined to pass all of the different types of primitive values that might be encountered when parsing a message (see the event handler class definition below for a complete list).
error – This event will be fired when a parsing error occurs. It will provide fault-tolerance to the parsing process as it will give the user the opportunity to fix or ignore errors on the fly to allow the parsing process to continue.
In C++, these events are defined as unimplemented virtual methods in two base classes: Asn1NamedEventHandler (the first 3 events) and Asn1ErrorHandler (the error event). These classes are defined in the asn1CppEvtHndlr.h header file.
In C, the first 3 event types are contained within a struct, Asn1NamedCEventHandler, defined in asn1CEvtHndlr.h, as consisting of function pointers. The error event, however, is not part of this struct and must be defined separately.
The start and end element methods are invoked when an element is parsed within a constructed type. The start method is invoked as soon as the tag/length is parsed in a BER message or the preamble/length is parsed in a PER message. The end method is invoked after the contents of the field are processed. The signatures of these methods, in C++, are as follows:
   virtual void startElement (const char* name, int index) = 0;
   virtual void endElement (const char* name, int index) = 0;
and in C:
   typedef void (*rtxStartElement) (const char* name, int idx);
   typedef void (*rtxEndElement) (const char* name, int idx);
The name argument is used to pass the element name. The index argument is used for SEQUENCE OF/SET OF constructs only, where it passes the index of the item in the array. This argument is set to -1 for all other constructs.
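As an illustration of how these two methods might be used together, the following is a minimal, self-contained sketch of a handler that records an indented trace of element events. It does not include the ASN1C headers: ElementEventHandler is a stand-in that declares only the two methods shown above, TraceHandler and simulateParse are hypothetical names, and the sequence of events (including the element name "item" for array entries) is written by hand rather than produced by a real decoder.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Stand-in for the element-event part of the named event handler base
// class. The real class in asn1CppEvtHndlr.h also declares the contents
// methods; this sketch only reproduces the two signatures shown above.
class ElementEventHandler {
public:
   virtual ~ElementEventHandler() {}
   virtual void startElement (const char* name, int index) = 0;
   virtual void endElement (const char* name, int index) = 0;
};

// Hypothetical handler: builds an indented trace of the events it receives.
class TraceHandler : public ElementEventHandler {
public:
   TraceHandler() : mDepth(0) {}

   virtual void startElement (const char* name, int index) {
      writeLine("start ", name, index);
      ++mDepth;                      // nested elements are indented one level
   }

   virtual void endElement (const char* name, int index) {
      assert(mDepth > 0);            // every end must match an earlier start
      --mDepth;
      writeLine("end ", name, index);
   }

   std::string trace() const { return mOut.str(); }

private:
   void writeLine (const char* prefix, const char* name, int index) {
      for (int i = 0; i < mDepth; i++) mOut << "  ";
      mOut << prefix << name;
      // An index of -1 means the element is not an array item.
      if (index >= 0) mOut << "[" << index << "]";
      mOut << "\n";
   }

   std::ostringstream mOut;
   int mDepth;
};

// Fires a hand-written event sequence: element a, then element b holding
// two array items. A real decoder would generate these calls itself.
std::string simulateParse() {
   TraceHandler handler;
   handler.startElement("a", -1);
   handler.endElement("a", -1);
   handler.startElement("b", -1);
   handler.startElement("item", 0);
   handler.endElement("item", 0);
   handler.startElement("item", 1);
   handler.endElement("item", 1);
   handler.endElement("b", -1);
   return handler.trace();
}
```

Tracking the nesting depth in the handler itself is one simple way to recover the structure of the message, since the events carry only the element name and index.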
There is one contents method for passing each of the ASN.1 data types. Some methods are used to handle several different types. For example, the charValue method is used for values of all of the different character string types (IA5String, NumericString, PrintableString, etc.) as well as for big integer values. Note that this method is overloaded. The second implementation is for 16-bit character strings. These strings are represented as an array of unsigned short integers in ASN1C. All of the other contents methods correspond to a single equivalent ASN.1 primitive type.
The C++ error handler base class has a single virtual method, error, that must be implemented. It has the following signature:
virtual int error (OSCTXT* pCtxt, ASN1CCB* pCCB, int stat) = 0;
The C error handler function, unlike the other events in C, is not contained within a struct. Its signature is as follows:
typedef int (*rtErrorHandler) (OSCTXT *pctxt, ASN1CCB *pCCB, int stat);
In these definitions, pCtxt and pctxt are pointers to the standard ASN.1 context block that should already be familiar. The pCCB structure is known as a “Context Control Block”. This can be thought of as a sub-context used to control the parsing of nested constructed types within a message. It is included as a parameter to the error method mainly to allow access to the “seqx” field. This is the sequence element index used when parsing a SEQUENCE construct. If parsing a particular element is to be retried, this item must be decremented within the error handler.
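The retry technique described above can be sketched as follows. This is a self-contained illustration under stated assumptions, not the ASN1C implementation: OSCTXT and ASN1CCB are reduced to stand-in structs (the real ones are defined by the run-time headers and contain more fields), ErrorHandlerBase stands in for Asn1ErrorHandler, and RetryOnceHandler and simulateErrors are hypothetical names. The meaning of the return value (zero to continue, the original status to stop) is also an assumption, since it is not specified here.

```cpp
#include <cassert>

// Stand-ins for the run-time types. Only the seqx field discussed above
// is modeled; the real structures contain additional members.
struct OSCTXT { int unused; };
struct ASN1CCB { int seqx; };

// Stand-in for the C++ error handler base class, using the error
// method signature shown above.
class ErrorHandlerBase {
public:
   virtual ~ErrorHandlerBase() {}
   virtual int error (OSCTXT* pCtxt, ASN1CCB* pCCB, int stat) = 0;
};

// Hypothetical handler: asks for one retry of the failing element,
// then gives up on any further error.
class RetryOnceHandler : public ErrorHandlerBase {
public:
   RetryOnceHandler() : mRetries(0) {}

   virtual int error (OSCTXT* /*pCtxt*/, ASN1CCB* pCCB, int stat) {
      assert(pCCB != 0);
      if (mRetries < kMaxRetries) {
         ++mRetries;
         pCCB->seqx--;   // step the sequence element index back for a retry
         return 0;       // assumption: zero tells the parser to continue
      }
      return stat;       // assumption: passing the status back stops parsing
   }

private:
   enum { kMaxRetries = 1 };
   int mRetries;
};

// Invokes the handler by hand 'count' times with the same status, as a
// decoder might after repeated failures, and returns the last result.
int simulateErrors (ASN1CCB& ccb, int stat, int count) {
   OSCTXT ctxt = { 0 };
   RetryOnceHandler handler;
   int result = 0;
   for (int i = 0; i < count; i++) {
      result = handler.error(&ctxt, &ccb, stat);
   }
   return result;
}
```

Limiting the number of retries keeps a handler like this from looping indefinitely when the same element fails every time it is parsed.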