Tuesday, June 26, 2007

EMF tips and tricks: Don't save if it hasn't changed.

For a while I have been thinking about writing up some EMF tips that most users may not know about. What has kept me from doing it is a lack of inspiration to come up with a nice introduction and to explain the rationale behind each design. This is probably silly, though. I am sure I would be far more useful if I simply focused on showing the "what" and "how", leaving the long discussions for the cases when someone is actually interested ;-)

So here we go...

We've added a few new options to the EMF Resource in EMF 2.3 (the one available in Europa). One of them is Resource.OPTION_SAVE_ONLY_IF_CHANGED. When this option is used, the save method writes out the bytes of the serialized objects only if they differ from the existing persisted form of the resource. Behind the scenes, the objects in the resource are serialized to either a temporary file or a memory buffer, and the result is compared with the bytes read from the resource's URI. Clearly this slows down the save operation, but it can be useful when, for example, the resource serializes to a file that is versioned by CVS, because it avoids "zero delta" changes.
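To make the "serialize, compare, then write" idea concrete, here is a minimal sketch in plain Java. It is not EMF's actual implementation: the class SaveOnlyIfChangedSketch and its saveOnlyIfChanged method are hypothetical names made up for illustration, and a plain String stands in for the serialized objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Illustration only: hypothetical names, not part of EMF.
class SaveOnlyIfChangedSketch {

    /**
     * Writes content to target only if its bytes differ from what is
     * already stored there. Returns true if the file was written.
     */
    public static boolean saveOnlyIfChanged(Path target, String content) throws IOException {
        // Serialize into a memory buffer instead of the real destination.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(content.getBytes(StandardCharsets.UTF_8));
        byte[] newBytes = buffer.toByteArray();

        // Compare with the persisted form, if there is one.
        if (Files.exists(target)) {
            byte[] oldBytes = Files.readAllBytes(target);
            if (Arrays.equals(oldBytes, newBytes)) {
                return false; // identical bytes: leave the file untouched
            }
        }

        // The bytes differ (or nothing was persisted yet), so write them out.
        Files.write(target, newBytes);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sketch", ".xml");
        Files.delete(file); // start without a persisted form
        System.out.println(saveOnlyIfChanged(file, "<a/>")); // nothing persisted yet
        System.out.println(saveOnlyIfChanged(file, "<a/>")); // same bytes as before
        System.out.println(saveOnlyIfChanged(file, "<b/>")); // different bytes
        Files.deleteIfExists(file);
    }
}
```

The extra read and comparison is where the slowdown mentioned above comes from; the payoff is that an unchanged file keeps its timestamp and never shows up as modified in version control.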

This snippet shows how to use this option:

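import java.util.HashMap;
import java.util.Map;
import org.eclipse.emf.ecore.resource.Resource;

// Write the resource only if its serialized bytes differ from the persisted ones.
Map<Object, Object> saveOptions = new HashMap<Object, Object>();
saveOptions.put(
    Resource.OPTION_SAVE_ONLY_IF_CHANGED,
    Resource.OPTION_SAVE_ONLY_IF_CHANGED_MEMORY_BUFFER);
resource.save(saveOptions);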

Using OPTION_SAVE_ONLY_IF_CHANGED_FILE_BUFFER instead of OPTION_SAVE_ONLY_IF_CHANGED_MEMORY_BUFFER serializes the objects to a temporary file instead of a buffer in memory. This subtle difference matters when dealing with big resources, because a memory buffer has to hold the entire serialized form at once.
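The file-buffer flavor can be sketched the same way in plain Java. Again, this is only an illustration under my own assumptions, not what EMF does internally: FileBufferSaveSketch, its Serializer interface, and both methods are hypothetical names. The point is that the serialized form is streamed to a temporary file and compared byte by byte, so it never has to fit in memory as a whole.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustration only: hypothetical names, not part of EMF.
class FileBufferSaveSketch {

    /** Stand-in for whatever streams the serialized objects out. */
    public interface Serializer {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Compares two files through buffered streams, without loading either fully. */
    public static boolean sameContent(Path a, Path b) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false; // different lengths can never be equal
        }
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            int byteA;
            while ((byteA = inA.read()) != -1) {
                if (byteA != inB.read()) {
                    return false;
                }
            }
            return inB.read() == -1;
        }
    }

    /** Returns true if target was (re)written, false if it was left untouched. */
    public static boolean saveOnlyIfChanged(Path target, Serializer serializer) throws IOException {
        Path temp = Files.createTempFile("save-sketch", ".tmp");
        try {
            // Stream the serialized form to the temporary file.
            try (OutputStream out = Files.newOutputStream(temp)) {
                serializer.writeTo(out);
            }
            if (Files.exists(target) && sameContent(temp, target)) {
                return false;
            }
            Files.copy(temp, target, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } finally {
            Files.deleteIfExists(temp); // the temporary file is only a buffer
        }
    }
}
```

The trade-off is the usual one: the temporary file costs extra disk I/O, but memory use stays flat no matter how large the resource is.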

An experienced EMF user may be asking how this relates to the Resource.setTrackingModification(boolean) method and to the "dirty" state of the editor generated by EMF. The main difference is that the new option looks at the actual bytes of the serialized resource, while the other two approaches look at the state of the objects in memory. For example, the new option doesn't rewrite the serialized resource when a "dummy" modification is made to an object (such as setting an attribute to its current value), but it does rewrite it if the serialization format changes (which could be caused by using a different value for the XMLResource.OPTION_LINE_WIDTH option, for instance). The other two approaches do the exact opposite in these examples.
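The two examples above can be played out with a toy model in plain Java. Nothing here is an EMF class: ToyResource, its "modified" flag, and the line-width wrapping rule are all made up to mimic the behavior described, so take it as a sketch of the distinction rather than of the real API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: a toy model, not EMF classes.
class DirtyFlagVersusBytesSketch {

    /** Toy stand-in for a tracked resource holding a single attribute. */
    public static class ToyResource {
        private String name;
        private boolean modified;

        public ToyResource(String name) {
            this.name = name;
        }

        public void setName(String newName) {
            name = newName;
            modified = true; // flagged on every set, even if the value is the same
        }

        public boolean isModified() {
            return modified;
        }

        /** Made-up format: wraps onto two lines when the element exceeds lineWidth. */
        public byte[] serialize(int lineWidth) {
            String oneLine = "<note name=\"" + name + "\"/>";
            String text = oneLine.length() <= lineWidth
                ? oneLine
                : "<note\n  name=\"" + name + "\"/>";
            return text.getBytes(StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        ToyResource resource = new ToyResource("a");
        byte[] persisted = resource.serialize(80);

        // "Dummy" modification: the flag flips, the bytes stay the same.
        resource.setName("a");
        System.out.println("modified flag: " + resource.isModified());
        System.out.println("bytes changed: " + !Arrays.equals(persisted, resource.serialize(80)));

        // Format change only: no object was touched, yet the bytes differ.
        ToyResource untouched = new ToyResource("a");
        System.out.println("modified flag: " + untouched.isModified());
        System.out.println("bytes changed: " + !Arrays.equals(persisted, untouched.serialize(10)));
    }
}
```

In the first case an approach based on in-memory state would want to save while a byte comparison would not; in the second case it is the other way around.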
