A Problem with JAXB (Source: "An analysis of interface specification in XML using Design by Contract", Author: Yichong Zhou)


The helper classes generated by JAXB are not a full representation of the XML schema: they omit most of the schema's constraint information, so the schema itself is still needed whenever data has to be validated. This leads to a further problem. Imagine the following situation: we use an XML schema to generate a set of helper classes, then modify the schema and use the modified version to validate the data in the content objects and documents, while the helper classes are still those generated from the original schema. What will happen?

 

What if we change the structure definitions in the XML schema, or strengthen or weaken the constraints on some data or variables? The helper classes and the schema may become incompatible. Some data, or some structures, may be valid under the original version of the schema but invalid under the modified version. Keeping the schema separate from the classes is flexible, but it may also lead to disaster.
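
The incompatibility described above can be reproduced with a small sketch. It uses the JDK's `javax.xml.validation` API rather than JAXB itself, so that it runs on a plain JDK. The two schemas, the `age` element and the class name `SchemaDriftDemo` are assumptions made up for illustration and do not come from the original paper.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

class SchemaDriftDemo {
    // Hypothetical original schema: <age> may be any xs:int.
    static final String ORIGINAL_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='age' type='xs:int'/>"
      + "</xs:schema>";

    // Hypothetical modified schema: the constraint is strengthened to 0..150.
    static final String MODIFIED_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='age'>"
      + "<xs:simpleType><xs:restriction base='xs:int'>"
      + "<xs:minInclusive value='0'/><xs:maxInclusive value='150'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the XML document validates against the given schema text.
    // Any failure (invalid document, unreadable schema) is reported as false.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Data that was acceptable when the helper classes were generated.
        String document = "<age>200</age>";
        System.out.println("original schema: " + isValid(ORIGINAL_XSD, document));
        System.out.println("modified schema: " + isValid(MODIFIED_XSD, document));
    }
}
```

The same document is accepted by the original schema and rejected by the modified one, although nothing in the application code or in the generated classes has changed.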

 

We may regard modifying an XML schema as a re-declaration, and there may be several modified versions of the schema. The helper classes generated from the original version can be bound to any of them, much like dynamic binding. According to DBC (Design by Contract) theory, re-declaration and dynamic binding amount to the ability to subcontract. Subcontractors must honor the prime contractor's promises in the original contract in order to prevent misuse. A re-declaration must not produce an effect that is incompatible with the semantics of the original version: it may keep or weaken the precondition, and keep or strengthen the postcondition. So what are the preconditions and postconditions here?

 

According to DBC theory, the contract governs the interactions between a software element and the rest of the world. The preconditions govern the input, and the postconditions govern the output. We use the XML schema to validate the input, and we also use it to validate the output. For example, we may unmarshal an XML document into a tree of content objects: here the XML schema validates the input. After that, we may modify the content objects and marshal them back into an XML document: here the XML schema validates the output. Thus, if the same XML schema is used to validate both input and output, the conditions in the schema are preconditions and postconditions at the same time. A re-declaration can neither weaken nor strengthen them; they must be kept as they are. If different XML schemas are used to validate the input and the output, the situation changes: the schema that validates the input expresses the preconditions, which must be kept or weakened, and the schema that validates the output expresses the postconditions, which must be kept or strengthened. From the point of view of the software we design, unmarshalling is the input and marshalling is the output. But this is relative: from the perspective of the XML document and its users, unmarshalling can be seen as the output and marshalling as the input.
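
The rule above can be illustrated with a deliberately simplified model in which a schema constraint is reduced to an integer range. This is only a sketch under that assumption: the names `RedeclarationCheck`, `Range`, `safeForInput`, `safeForOutput` and `safeForBoth` are hypothetical, and a real XML schema has many kinds of constraints that a single range does not capture.

```java
class RedeclarationCheck {
    // A constraint modelled as an inclusive integer range (a simplification).
    static class Range {
        final int min;
        final int max;

        Range(int min, int max) {
            if (min > max) {
                throw new IllegalArgumentException("min must not exceed max");
            }
            this.min = min;
            this.max = max;
        }

        // True if this range accepts every value that the other range accepts,
        // i.e. this constraint is weaker than or equal to the other.
        boolean contains(Range other) {
            return min <= other.min && max >= other.max;
        }
    }

    // Precondition rule: the modified constraint may be kept or weakened.
    static boolean safeForInput(Range original, Range modified) {
        return modified.contains(original);
    }

    // Postcondition rule: the modified constraint may be kept or strengthened.
    static boolean safeForOutput(Range original, Range modified) {
        return original.contains(modified);
    }

    // Same schema for input and output: the constraint must stay unchanged.
    static boolean safeForBoth(Range original, Range modified) {
        return safeForInput(original, modified) && safeForOutput(original, modified);
    }

    public static void main(String[] args) {
        Range original = new Range(0, 150);
        Range widened = new Range(-10, 200);
        Range narrowed = new Range(18, 65);
        System.out.println("widened as input schema:  " + safeForInput(original, widened));
        System.out.println("widened as output schema: " + safeForOutput(original, widened));
        System.out.println("narrowed as output schema: " + safeForOutput(original, narrowed));
        System.out.println("narrowed for both: " + safeForBoth(original, narrowed));
    }
}
```

In this model a widened range is acceptable only for the schema that validates input, a narrowed range only for the schema that validates output, and neither is acceptable when one schema serves both purposes.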

 

In conclusion, if all the constraint information were put into the helper classes when the XML schema is compiled, making them a full representation of the schema instead of leaving validation dependent on the schema, this problem would no longer exist. As long as the XML schema is required to validate data, the constraint information is separated from the helper classes and can be modified arbitrarily, which may be unsafe. This is flexible, since the constraints can be modified without changing any application code, but the flexibility comes at the price of safety.
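
As a rough sketch of what a helper class carrying its own constraints might look like, consider the hand-written class below. It is not what JAXB generates; the class name `Age`, its bounds and its methods are assumptions chosen for illustration.

```java
class Age {
    // Bounds copied from a hypothetical schema at generation time.
    // Editing the schema later cannot silently change them.
    static final int MIN = 0;
    static final int MAX = 150;

    private int value;

    Age(int value) {
        setValue(value);
    }

    // The constraint check lives in the class, not in an external schema.
    static boolean isAllowed(int candidate) {
        return candidate >= MIN && candidate <= MAX;
    }

    int getValue() {
        return value;
    }

    void setValue(int value) {
        if (!isAllowed(value)) {
            throw new IllegalArgumentException(
                "age must be between " + MIN + " and " + MAX + ": " + value);
        }
        this.value = value;
    }

    public static void main(String[] args) {
        Age age = new Age(42);
        System.out.println("accepted: " + age.getValue());
        try {
            age.setValue(200);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this design a change of constraint requires regenerating and recompiling the class, which is exactly the trade-off described above: less flexibility in exchange for classes and constraints that cannot drift apart.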