
A few months ago I blogged about a bug in the serialization code in Windows Workflow Foundation[1]. The bug is that when you change a custom workflow activity that is being used in a persisted workflow, you will get weird errors like ArgumentOutOfRangeException or an error telling you an IObjectReference is missing something. This happens when you use the default SqlWorkflowPersistenceService. This service uses a BinaryFormatter to serialize the workflow and save the bytes to the database. When you load the workflow again, the service deserializes the content from the database and returns the results to the runtime, so that execution can continue.
My third solution to this problem (yep, I had a couple more, but this one sounded realistic enough to actually work) was to customize the procedure that Windows Workflow Foundation uses to deserialize the workflow. To do that, I had to implement the IDeserializationCallback interface on my activity and fix up any missing fragments or references in its OnDeserialization method. That way the errors should disappear and the workflow should deserialize.
Well, as it turns out, my third solution isn't going to work at all. The Activity type is not serializable in the usual sense: it doesn't implement ISerializable and has no [Serializable] attribute, and neither do any of its base classes. So how does Windows Workflow Foundation serialize the workflow when persisting it to the database?
Windows Workflow Foundation uses ISurrogateSelector instances to provide surrogate objects (implementations of ISerializationSurrogate) that serialize the various parts of a workflow instance. When the BinaryFormatter serializes an activity, it asks its surrogate selector for a surrogate for the type being serialized. The surrogate selector builds a new surrogate object and passes it to the formatter. The surrogate then serializes the object in its own GetObjectData method, which takes the place of ISerializable.GetObjectData. The reverse happens when the activity is deserialized: the surrogate's SetObjectData method restores the object.
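To make the division of labour concrete, here is a minimal sketch of the selector/surrogate pattern. It is a simplified Python model, not .NET code and not how Windows Workflow Foundation actually implements it; the class names (Activity, ActivitySurrogate, SurrogateSelector, Formatter) and their members are made up for illustration.

```python
class Activity:
    """Stand-in for a type the formatter cannot serialize by itself."""
    def __init__(self, name="", retries=0):
        self.name = name
        self.retries = retries


class ActivitySurrogate:
    """Flattens and rebuilds an Activity on the type's behalf."""
    def get_object_data(self, obj):
        return {"name": obj.name, "retries": obj.retries}

    def set_object_data(self, obj, data):
        obj.name = data["name"]
        obj.retries = data["retries"]
        return obj


class SurrogateSelector:
    """Maps a type to the surrogate that handles it (hypothetical API)."""
    def __init__(self):
        self._surrogates = {}

    def add_surrogate(self, cls, surrogate):
        self._surrogates[cls] = surrogate

    def get_surrogate(self, cls):
        return self._surrogates.get(cls)


class Formatter:
    """Asks the selector for a surrogate instead of inspecting the type."""
    def __init__(self, selector):
        self.selector = selector

    def _surrogate_for(self, cls):
        surrogate = self.selector.get_surrogate(cls)
        if surrogate is None:
            raise TypeError(f"{cls.__name__} is not serializable")
        return surrogate

    def serialize(self, obj):
        surrogate = self._surrogate_for(type(obj))
        return {"type": type(obj), "data": surrogate.get_object_data(obj)}

    def deserialize(self, record):
        cls = record["type"]
        surrogate = self._surrogate_for(cls)
        obj = cls.__new__(cls)  # bare instance; the constructor is not run
        return surrogate.set_object_data(obj, record["data"])


selector = SurrogateSelector()
selector.add_surrogate(Activity, ActivitySurrogate())
formatter = Formatter(selector)
restored = formatter.deserialize(formatter.serialize(Activity("approve", 3)))
```

The point of the pattern is that Activity itself needs no serialization support at all: everything the formatter needs to know lives in the surrogate that the selector hands out.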
There's a special surrogate for the Activity class that works using reflection. During serialization it selects all members of an activity using reflection and writes them out; during deserialization it selects all members again and reads them back. Here comes the fun part. When I add a new member to my custom activity, it isn't yet stored in the serialized version, so when the workflow gets deserialized the surrogate hits an empty spot where it expected data for my new member. Instead of skipping it, the surrogate creates a new IObjectReference to temporarily replace the missing object. (This is normal, because the missing object might simply not have been deserialized yet.) After the complete object graph (representing the workflow instance) is deserialized, the formatter calls IObjectReference.GetRealObject() on all members of the activities to retrieve the actual objects, and blam, there you have it: a very nice exception, because the empty spot is still there.
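The versioning problem can be reproduced in miniature. The sketch below is again a simplified Python model under my own assumptions, not the real surrogate: ReflectionSurrogate, ApprovalV1 and ApprovalV2 are hypothetical names, and the model fails as soon as it hits the missing member, whereas the real surrogate first puts in a placeholder and only fails later in GetRealObject().

```python
class ReflectionSurrogate:
    """Serializes whatever members the class declares, found by reflection."""

    def get_object_data(self, obj):
        # Write out every member the class declares at serialization time.
        return {name: getattr(obj, name) for name in type(obj).__annotations__}

    def set_object_data(self, obj, data):
        # Read back every member the class declares at deserialization time.
        for name in type(obj).__annotations__:
            if name not in data:
                raise LookupError(f"member '{name}' not found in serialized data")
            setattr(obj, name, data[name])
        return obj


class ApprovalV1:
    """The activity as it looked when the workflow was persisted."""
    approver: str

    def __init__(self, approver):
        self.approver = approver


class ApprovalV2:
    """The same activity after a new member was added."""
    approver: str
    escalation: str  # added after workflows were already persisted

    def __init__(self, approver, escalation):
        self.approver = approver
        self.escalation = escalation


surrogate = ReflectionSurrogate()

# Data written by the old build contains only the old members.
persisted = surrogate.get_object_data(ApprovalV1("alice"))

# The new build expects one member more than the data contains.
try:
    surrogate.set_object_data(ApprovalV2.__new__(ApprovalV2), persisted)
    error = None
except LookupError as exc:
    error = str(exc)
```

Because the member list comes from the class as it exists at load time, any difference between the persisted shape and the current shape surfaces only during deserialization, long after the data was written.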
The exception only happens when you add new activities to a custom composite activity. Adding a new DependencyProperty doesn't cause this exception to occur and can be done without any consequences for the already running workflows.
So the best solution I have so far is not to base your custom activities on composite activities, but to derive them directly from Activity. This is a bit more work, as you can't drag and drop the custom activity together in the designer, but the result is more durable and in some cases faster.
[1] https://blogs.infosupport.com/blogs/willemm/archive/2007/05/16/Serialization-bug-in-Windows-Workflow-Foundation.aspx