blog community

Marcel de Vries, MVP Team System

.NET Technologies, Architecture and Web Development

  • Recovering workflows that did not complete work

    Last week I was working on our workflow solution when someone discovered that, in some cases, certain workflows did not complete and would never wake up again.

    It took me a while to figure out why. We host our workflows in IIS and use the ManualWorkflowSchedulerService to schedule them. A WCF call comes in, we persist some data in the database and return an ID that the customer can use for future reference. Then we use a thread pool (a custom implementation, since we needed to prioritize the initial requests over the background work done later) to schedule the remaining work that needs to complete in the background. We use the SqlWorkflowPersistenceService to make sure the workflows can be recovered if the IIS worker process is recycled.

    We did a lot of testing previously (as you might have read before), and there we saw that the workflows did get recovered.

    So what happened…?

    <edit> 
    During performance testing we found that the configuration section for workflow contained a wrong name for the workflow configuration. This caused a second workflow runtime to be started with default settings. The default settings use the default scheduler, and that caused a fight for threads in the host process (lots of context switching). This made us decide to do two things: first, remove the wrong configuration so we would only have one workflow runtime, and second, write a custom thread pool implementation where we could throttle the number of requests we want to process.
    </edit>

    What we did not realize when we removed the erroneous entry was that this also killed our automatic recovery of workflows. The second workflow runtime got loaded with the same services as the main workflow runtime, except that it used the default scheduler. As you might recall, this scheduler uses a thread pool to schedule its workflows, and that was where the workflow recovery took place.

    In our main workflow runtime we use the ManualWorkflowSchedulerService and the SqlWorkflowPersistenceService, but this combination does not take care of the actual recovery of the workflows.

    It appears (after some reflection work :-)) that the SqlWorkflowPersistenceService does do recovery of instances, but it assumes these will be automatically scheduled. Unfortunately this is not documented, but when you dig around in the implementation (using Reflector of course) you can see that the only thing the recovery does is get all running instances from the database and call WorkflowRuntime.GetWorkflow(ID).Load().

    This means it will only load the workflow into the workflow runtime memory, but not schedule the workflow to actually proceed with its work!

    As a matter of fact, the whole implementation for recovery does not take into account, for example, that we have multiple nodes running in a load-balancing scenario, where you may want to recover the workflows in a balanced fashion as well. So to get around this problem I wrote a custom scheduler that derives from the manual scheduler and does nothing more than that, except that it also takes care of the recovery of workflows. Since this might be a problem you will run into yourself, I have pasted the code below for you to reuse :-)

    The only thing you need to do is copy the code, compile it and then configure your workflow runtime to use this scheduler service.

    It behaves the same as the manual scheduler, with the only difference that it recovers your workflows when needed on the thread pool in the background. You can limit the number of workflows that are recovered in one batch, and you can configure the wait time after startup before recovery begins (normally you want to start after a minute or so, since the first request needs to be serviced before you start doing recovery). The poll time is configurable as well.
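
    If you build the runtime in code rather than from a configuration section, wiring up the scheduler could look roughly like the sketch below. It assumes the RecoveringWorkflowSchedulerService class from this post is compiled into your project. The WorkflowHost class, the CreateRuntime helper and the connection string parameter are made up for illustration, and the timing values are simply the defaults used by the class.

    ```csharp
    using System;
    using System.Workflow.Runtime;
    using System.Workflow.Runtime.Hosting;

    // Hypothetical host helper, not part of the original solution.
    public static class WorkflowHost
    {
        public static WorkflowRuntime CreateRuntime(string persistenceConnectionString)
        {
            var runtime = new WorkflowRuntime();

            // Arguments: useActiveTimers, recoveryDuePeriod (ms),
            // recoveryPollPeriod (ms), recoveryBatchSize.
            // First recovery pass after 1 minute, then every 5 minutes,
            // at most 5 instances per pass.
            runtime.AddService(
                new RecoveringWorkflowSchedulerService(true, 60000, 300000, 5));

            // The recovery code looks this service up to find persisted instances.
            runtime.AddService(
                new SqlWorkflowPersistenceService(persistenceConnectionString));

            // Services have to be added before the runtime is started.
            runtime.StartRuntime();
            return runtime;
        }
    }
    ```

    Only one scheduler service can be registered per runtime, so this replaces the plain ManualWorkflowSchedulerService instead of being added next to it.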

    Hope you find it useful.

    Cheers,

    Marcel

        // Namespaces required by the class below.
        using System;
        using System.Collections.Specialized;
        using System.Linq;
        using System.Threading;
        using System.Workflow.Runtime;
        using System.Workflow.Runtime.Hosting;

        [Serializable]
        public class RecoveringWorkflowSchedulerService : ManualWorkflowSchedulerService, IDisposable
        {
            private Timer _recoveryPollTimer;
            private int _recoveryDuePeriod;
            private int _recoveryPollPeriod;
            private int _recoveryBatchSize;
            private bool _disposed = false;

            // default to one minute, so we don't frustrate the initial request that activates the service
            private static readonly int DEFAULTRECOVERYDUEPERIOD = 60000;

            // default to every 5 minutes to check if there are workflows to recover
            private static readonly int DEFAULTRECOVERYPOLLPERIOD = 300000;

            // default to recovery batch size of 5
            private static readonly int DEFAULTRECOVERYBATCHSIZE = 5;

            public RecoveringWorkflowSchedulerService(bool useActiveTimers) : base(useActiveTimers)
            {
                _recoveryPollPeriod = DEFAULTRECOVERYPOLLPERIOD;
                _recoveryDuePeriod = DEFAULTRECOVERYDUEPERIOD;
                _recoveryBatchSize = DEFAULTRECOVERYBATCHSIZE;
            }

            public RecoveringWorkflowSchedulerService(bool useActiveTimers, int recoveryDuePeriod,
                                                      int recoveryPollPeriod, int recoveryBatchSize)
                : base(useActiveTimers)
            {
                _recoveryPollPeriod = recoveryPollPeriod;
                _recoveryDuePeriod = recoveryDuePeriod;
                _recoveryBatchSize = recoveryBatchSize;
            }

            // Used when the service is created from configuration; any value that is
            // missing or not a valid integer falls back to its default.
            public RecoveringWorkflowSchedulerService(NameValueCollection parameters) : base(parameters)
            {
                string recoveryDuePeriodString = parameters["RecoveryDuePeriod"];
                string recoveryPollPeriod = parameters["RecoveryPollPeriod"];
                string recoveryBatchSize = parameters["RecoveryBatchSize"];

                if (!int.TryParse(recoveryDuePeriodString, out _recoveryDuePeriod))
                {
                    _recoveryDuePeriod = DEFAULTRECOVERYDUEPERIOD;
                }

                if (!int.TryParse(recoveryPollPeriod, out _recoveryPollPeriod))
                {
                    _recoveryPollPeriod = DEFAULTRECOVERYPOLLPERIOD;
                }

                if (!int.TryParse(recoveryBatchSize, out _recoveryBatchSize))
                {
                    _recoveryBatchSize = DEFAULTRECOVERYBATCHSIZE;
                }
            }

            protected override void OnStarted()
            {
                base.OnStarted();

                // initialize a recovery timer, where we pick up any instances that got terminated by
                // system failure like power outage, IISReset commands or other process recycles.
                _recoveryPollTimer = new Timer(RecoveryThreadCallback, this.Runtime,
                                               _recoveryDuePeriod, _recoveryPollPeriod);
            }

            protected override void OnStopped()
            {
                base.OnStopped();

                // cleanup the polling timer
                if (_recoveryPollTimer != null)
                {
                    _recoveryPollTimer.Dispose();
                }
            }

            /// <summary>
            /// Timer callback: looks up persisted workflows that are running and not blocked,
            /// and queues at most one batch of them on the threadpool for recovery.
            /// </summary>
            /// <param name="stateInfo"></param>
            void RecoveryThreadCallback(object stateInfo)
            {
                if (!_disposed)
                {
                    try
                    {
                        RecoveringWorkflowSchedulerService schedulerService =
                            this.Runtime.GetService<RecoveringWorkflowSchedulerService>();

                        if (schedulerService != null)
                        {
                            SqlWorkflowPersistenceService persistenceService =
                                this.Runtime.GetService<SqlWorkflowPersistenceService>();

                            // there is a persistence service so get workflows that need recovery
                            if (persistenceService != null)
                            {
                                if (Runtime.IsStarted)
                                {
                                    var allPersistedWorkflows = persistenceService.GetAllWorkflows();

                                    var allRecoverableWorkflows =
                                        (from persistedWorkflow in allPersistedWorkflows
                                         where !persistedWorkflow.IsBlocked &&
                                               persistedWorkflow.Status == WorkflowStatus.Running
                                         select persistedWorkflow.WorkflowInstanceId)
                                        .Take<Guid>(_recoveryBatchSize);

                                    foreach (Guid workflowInstanceID in allRecoverableWorkflows)
                                    {
                                        // process the recovery work on the threadpool thread
                                        ThreadPool.QueueUserWorkItem(WorkflowWaitCallback,
                                                                     workflowInstanceID);
                                    }
                                }
                            }
                        }
                    }
                    catch (System.ObjectDisposedException)
                    {
                        // appdomain is being torn down, gracefully swallow exception
                        // and exit by clearing the timer
                        _recoveryPollTimer.Dispose();
                    }
                }
            }

            /// <summary>
            /// This is where the execution continues after we queue a recovery workflow
            /// on the threadpool.
            /// </summary>
            /// <param name="stateInfo"></param>
            void WorkflowWaitCallback(object stateInfo)
            {
                // here we get the threadpool thread donated, now use it to schedule the requested
                // workflow
                this.RunWorkflow((Guid)stateInfo);
            }

            #region IDisposable Members

            public void Dispose()
            {
                _disposed = true;

                if (_recoveryPollTimer != null)
                {
                    _recoveryPollTimer.Dispose();
                }
            }

            #endregion
        }

     

  • How the transaction property for WF tracking can make a huge performance difference

    The past two days I have been working on a performance problem we were having with a workflow implementation. The workflow was quite simple: a WCF call comes in, we register the request data in a SQL database and return a ticket confirming that the request was received. People can use the ticket later to get information on the status of their request. (A pretty common SOA pattern.)

    The workflow used a TransactionScope activity to guard the persistence of the workflow together with the registration call of the custom activity in the database. The workflow runtime had the SqlWorkflowPersistenceService loaded, and we use the SqlTrackingService to enable administrators of the system to monitor the workflow progress of certain requests.

    The time to register a request appeared to be around 18 seconds per request with 20 concurrent users?!? We expected this to be around 1 second at most, depending on the server load.

    After some digging around we detected that if we removed the tracking service, the response times were back to what we expected, somewhere around 0.6 seconds. So how can this happen?

    The answer is one very important property that can be set in the tracking service configuration, called IsTransactional. This property was set to false by one of the developers. This simple setting added around 17 seconds of overhead in SQL communication (with the load we already had on SQL, so it can be much less in other cases). The property determines whether each tracking record is saved to SQL in its own call, or whether all work is queued and persisted in one work batch. As you can imagine, batching reduces the number of round trips to SQL significantly, and when the workflow and database servers are as busy as they are with the work we do, times can add up pretty fast.

    So when you see slow performance in workflow and you have tracking enabled, look at the IsTransactional property and check that it is not set to false (the default is true, so someone would have had to change it explicitly).
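
    For a runtime that is assembled in code, the setting can be made explicit as in the sketch below. It assumes the property in question is the one exposed as IsTransactional on SqlTrackingService. The TrackingSetup class, the AddBatchedTracking helper and the connection string parameter are hypothetical names used only for this example.

    ```csharp
    using System.Workflow.Runtime;
    using System.Workflow.Runtime.Tracking;

    // Hypothetical helper, shown only to illustrate the setting.
    public static class TrackingSetup
    {
        // Call this before the runtime is started.
        public static void AddBatchedTracking(WorkflowRuntime runtime,
                                              string trackingConnectionString)
        {
            var tracking = new SqlTrackingService(trackingConnectionString);

            // true: tracking records are queued and written as part of the work batch.
            // false: every tracking record costs its own call to SQL Server.
            tracking.IsTransactional = true;

            runtime.AddService(tracking);
        }
    }
    ```

    Setting the value explicitly, even though true is the default, makes it harder for the batching to be switched off by accident.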

    Cheers,
    Marcel

  • Announcing: Team System Web Access Translation Project

    Brian Harry announced this a while ago on his blog (here), and now the time has come to give you the first pointers to where we are in the Team System Web Access Translation effort. The last few weeks I have been working with MS folks on this project. The project aims to provide additional translations of the Visual Studio Team System Web Access web site that Microsoft provides to enable access to TFS from an internet location.

    I think it was about 3 months ago that I ran into a customer who did not want to use TSWA because it was not available in Dutch. I knew Microsoft did not have the resources to make this translation, so I asked them if it would be possible to create a community effort around translating the site into languages other than those Microsoft provides out of the box. While I thought it would be a long shot, I was very happily surprised by the reaction to my mail. Brian Harry responded with the following mail:

    “Thanks for the request Marcel.  After discussing this internally for a bit, we have decided to enable community translations of TSWA.  If you’d like, you are welcome to be our guinea pig, um, I mean Beta tester :-)”

     

    Hakan Eskici (lead for TSWA) created what we call the Translation Toolkit for this effort. The toolkit contains a VS solution with all the resource files we need to start creating our own translated versions. He will also create some videos on how you can create a localized version based on the toolkit. This will all be posted on the CodePlex site once we have the goods available :-)

    They asked me if I would like to manage the community effort and I agreed. So from now on you can contact me if you would like to start a localized version for your native language that is not provided by MS. We have created a CodePlex project to host the sources and the work items; you can find it here: http://www.codeplex.com/TSWAL

    Before you start writing me mails to contribute, I want to note that we are currently in the first phase of the project, where we test drive the toolkit and experiment with the deployment models (batch files, installers, etc.). We have started with the Dutch and Turkish languages, and I have some fellow VSTS MVPs who are going to take care of at least the Portuguese and Norwegian versions to get us started.

    I must say it is quite a lot of work to make a translation. There are more than 2000 entries in the resource files that need to be translated, and it can be quite hard to come up with an appropriate translation of terms like Work Item or Change Set.

    If you would like to contribute after we have finished the first phase, please send me an email using the email option on this site. State [TSWAL] in the subject together with the language you are willing to work on, e.g. nl-nl (that is Dutch). I will keep a list of people who volunteered and I will contact them as soon as we have finished the first languages and have a clear model for how we can involve you (send resource files via mail, add you to CodePlex as a contributor, etc.).

    Hope you all like the effort and I hope a lot of people volunteer.

    Cheers,

    Marcel

    p.s. for the Dutch people who read my blog, what do you think translates best for Work Item? Until now I came up with "werkpakket". I also got suggestions like "actiepunt" or "werkopdracht". What do you think translates best into Dutch?

  • Creating more elaborate work item descriptions with a HTML field

    One of the things only a few people know about work item tracking is its ability to have HTML descriptions in your work item. By HTML I mean that you get rich editing capabilities, like colors, bold, underline, different fonts, bulleted lists and even hyperlinks.

    For this to work, you only need to change your work item type definition to contain a field that is of the type HTML. If you show this field in your work item form, you can activate the formatting toolbar to enable the text formatting in the text box.

    We use this with our change request work items so we can create a richer description of what we need to get done for a certain change. I thought I would mention it here on my blog, because only a few people actually know this is possible :-)

    Cheers,
    Marcel

     

     

  • Developer days 2008 Presentations on VSTS 2008 and VSTS Rosario

    Last week I did two sessions on Visual Studio Team System. The first session was about Team System 2008 and all the new features that are in the product. Not only did I discuss the new features in the 2008 release, I also showed the new features found in SP1 of VSTS 2008 and the new features in the Power Tools, which have been available for download for some time now. Here is a picture of the Thursday morning session.

     

    The second session was about Team System Rosario and what we can expect in the future. This was a no-slides, demo-only session, where I tried to pack most of the user stories available in the current CTP 12 release into one hour. I must say it was quite fun to do, and based on the responses I have received so far, everybody was quite pleased with the features I showed.

    You can download the presentations here:

    What is new in Visual Studio 2008, SP1 and Power tools

    Visual Studio Team System Code name Rosario

    Like I said, the second session was more demos than slides. But at least it gives you an impression of all the demos I showed in only one hour :-)
    Hope you find them useful.

    Cheers,
    Marcel

  • How to share Work Item Templates in your team

    After you install the Team Foundation Power Tools, you get a new node in Team Explorer called Work Item Templates.

    It appears that Work Item Templates have been around for a while, only almost nobody knew they existed since they were nicely tucked away in the Team menu. Now that the templates have become visible in Team Explorer, they also caught my attention, and I must say I really like them a lot.

    So what can you do with Work Item Templates? Well, as the name implies, you can set up a pre-populated version of a work item that serves as a template for work items you want to create. This is very useful when you have a work item form that requires you to populate certain fields the right way to get them into your development process. For example, we currently have 5 versions of the product we work on. The product is also divided into different product areas. So when I want to file a Change Request, I at least need to make sure I fill in the right Area and Iteration so the work item shows up in the work item queries that the project manager and Configuration Item owners run on a daily basis. If I forget to set the appropriate fields, my work item ends up in WIT but nobody will ever notice it or take action.

    After creating the template you can right-click it in Team Explorer and choose Create Work Item. You will then see a new work item form with the pre-populated fields.

    After working with these templates I wanted to share the template definitions with my team mates. Unfortunately the templates are currently not stored on the server, but in a folder in your local profile (default: <SystemDrive>:\users\<username>\Documents\Work Item Templates).

    When you create a template, it is saved as a .wt file on your local hard drive. But it turns out there is a way to share the templates, and that is to set up a shared folder on your network. You create a share that allows all team members to create and update files and folders, and set the work item template location to that share. You can change the template location using the Team Foundation Power Tools options menu. There you can enter the UNC path of the share you use.

    Today we have set this up for the whole team and we are pretty happy with the way this works. One thing to keep in mind is that everybody has all rights to the templates, including modification and deletion. So your team members must be aware of which templates they should maintain and which ones belong to their team members. To ease this issue, we decided to set up a set of folders that are specific to each developer and one shared folder that contains the templates we all can use, but only a few maintain.

    I would have liked to see this notion of template sharing in the tool as a first-class citizen, but I think this will work quite well for us at this moment. Who knows, the team might decide to store it in TFS someday.

    Cheers,
    Marcel

  • TFS Power tool TfsServerManager connecting from a non domain joined computer

    Last week at the MVP summit we got a great demo of the features that are in the just released Power Tools. One of them is the new TfsServerManager tool that you can use to look at your TFS server and its overall health and performance.

    The tool can be found after installation in the "Program Files\Microsoft Team Foundation Server 2008 Power Tools" folder and is called TfsServerManager.

    I did run into one issue, caused by the fact that I am not working on a machine that is joined to our domain. Therefore I always need to supply credentials to gain access to the TFS server. This also happens with the tool, only after providing credentials I was confronted with a message: Login failed for user 'marcelv'

    After some digging around it appears that the tool not only connects to the web services of the TFS server, but also connects directly to the TFS database. While I am an admin on the database, it was not possible to connect to it. This was caused by the fact that the application just runs as my local user, and that user is unknown in the database. To resolve this issue, I created a small command script that runs the application with the network credentials of my domain user. This way the tool authenticates as the domain user marcelv and is then able to connect to the SQL server.

    C:\Windows\System32\runas.exe /netonly /user:[yourDomain]\[yourUserName] TfsServerManager.exe

    After I was connected I took a look at the statistics page. There I found, to my surprise, that the TFS server had been up and running for only a few hours. I asked our TFS admin, and as far as he knew the server had been up and running for several days; the last reboot was a month ago, to install the latest updates.

    It appears the time reported here is a value in TFS and refers to how long TFS has been up, not the server. (So last restart does not refer to a reboot of the server!) There are many things that can cause TFS to shut down. E.g. the app pool may be configured to recycle after a number of minutes, after being idle for a certain time, or when certain memory thresholds are hit.

    One other feature I find particularly useful is the option to view the workspaces, shelvesets and labels.  This way you can see if there are stale workspaces and shelvesets that can be cleaned from the server. My first run already turned up 50 workspaces and about 20 shelvesets that had not been accessed in the past year. So it is good practice to run this tool once in a while and see what you can clean up on your server.

    Cheers,
    Marcel

  • How to remove a toolbar header from a SharePoint web part

    Yes, you read that correctly, I did some SharePoint development last week. And I must say it was quite a challenge. I was just helping out on a project where we created a feature that combines a custom list deployment with the deployment of a custom page. On this page we place web parts that provide a view of the custom list.

    Now the challenge I faced was to get the deployment working in such a way that the web parts we add to the page do not show the toolbar. At first I just laughed and said that I would fix that for them in a few minutes, assuming I could just use the object model, set a toolbar property to false and from that point on get a big hug from my colleague.

    Well, I was wrong. Removing a toolbar from a web part using the object model is something that is just not supported. You can read more about this in e.g. this forum thread.

    After some digging around I finally got the problem solved. I thought I would provide the code snippet here, so you can leverage it.

     

    private static void DisableToolbar(ListViewWebPart lv)
    {
       // Extract the view through reflection, since the View property is not public
       System.Reflection.PropertyInfo ViewProp = lv.GetType().GetProperty("View",
          System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);

       SPView spView = ViewProp.GetValue(lv, null) as SPView;

       string txt = spView.SchemaXml;

       // The XML node behind the view is not public either
       System.Reflection.PropertyInfo NodeProp = spView.GetType().GetProperty("Node",
          System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);

       XmlNode node = NodeProp.GetValue(spView, null) as XmlNode;
       XmlNode tBarNode = node.SelectSingleNode("Toolbar");

       if (tBarNode != null)
       {
          XmlAttribute typeNode = tBarNode.Attributes["Type"];

          // make the contents empty so we really remove the toolbar,
          // otherwise you might get a different type of toolbar popup when we have a
          // site migrated from 2.0
          tBarNode.RemoveAll();

          // re-add the type attribute
          tBarNode.Attributes.Append(typeNode);

          // finally set the toolbar to not show
          typeNode.Value = "None";
       }