K2 Host server process crashes

14 years ago
28 September 2009
7 replies
108 views

SanderKooij
2 replies

We are experiencing stability problems with our K2 installation. For some reason the K2 host server service suddenly stops (crashes). No error or warning entries are written to the event log when this happens. I have enabled the logging feature in HostServerLogging.config. When the K2 process stops, an error is written to the log file (the specific error can be found at the bottom of this post). The process instance on which the K2 service stops is different every time. That process instance is in an error state after the service restarts, but when I retry it manually it continues without a problem. We have been experiencing this problem for a few weeks now. Some days it occurs only once, sometimes more than twenty times! We are using K2 Blackpearl version 0807 Update (4.8210.2.450).

The error seems to occur in two different unrelated situations:

High workload:
When there is a high workload on one of our workflows the problem seems to occur more often. During this ‘high’ workload the processor never reaches 100% and no abnormal amounts of memory are consumed. There is only an above-normal volume of workflow processing.

Open TCP Connections:
The second situation in which the host process crashes is when it has been running without any problems for quite a while (a couple of days). In the performance monitor, the K2 counter ‘TCP Connections opened’ shows a high number of open TCP connections. Usually the process stops when it hits 15,000 open connections.

As a workaround I have written a process monitor which monitors the K2 host process and restarts the service after it has stopped. Although this currently allows our server to keep functioning it’s not a permanent solution.
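For anyone who needs a similar stopgap, here is a minimal sketch of such a watchdog. It is not the monitor described above, only an illustration that polls the Windows service control manager with `sc query` and calls `sc start` when the service is not running. The service name, the polling interval, and all function names are assumptions; check the real service name in services.msc.

```python
import subprocess
import time

# Assumption: the display name of the K2 service on your machine may differ.
SERVICE_NAME = "K2 blackpearl Server"


def is_running(service, run=subprocess.run):
    """Return True when `sc query` reports the service state as RUNNING."""
    result = run(["sc", "query", service], capture_output=True, text=True)
    return "RUNNING" in result.stdout


def start_service(service, run=subprocess.run):
    """Ask the service control manager to start the service."""
    result = run(["sc", "start", service], capture_output=True, text=True)
    return result.returncode == 0


def watch_once(service, check=is_running, start=start_service, log=print):
    """Do one check and restart if needed. Returns True if a restart was attempted."""
    if check(service):
        return False
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    log(f"{stamp} {service} is not running, attempting restart")
    start(service)
    return True


def watch(service=SERVICE_NAME, interval_seconds=30.0):
    """Poll forever; run this from an elevated prompt or as a scheduled task."""
    while True:
        watch_once(service)
        time.sleep(interval_seconds)
```

Calling `watch()` starts the loop. The check and start functions are passed in as parameters so the logic can be tried out without touching a real service. Like any restart loop, this only hides the crash; it does not fix it.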

Has anyone encountered the same problem? Or knows a solution to this problem? Any help is highly appreciated.

Here is the section of the log file containing the error. The username has been replaced by <domain>\<UserName> for security purposes.

----------------

"5437453751","2009-09-28 10:08:02","Error","EnvironmentServer","15100","Generic","SourceCode.Workflow.Runtime.Management [SendArchiveX [string[] names]]","15100 Error occurred, ERROR: 26023 Process instance 20272 not found for K2:<domain>\<UserName> at 192.168.200.14:10","anonymous","0.0.0.0","K2SRV01:c:\program files (x86)\k2 blackpearl\Host Server\Bin","5437453751","8ea18d89f7ae4390bb62b8c7c8200c7e",""

"5437453752","2009-09-28 10:08:02","Error","EnvironmentServer","15101","Generic","SourceCode.Workflow.Runtime.Management [GotoActivity [string[] names]]","15101 Error occurred, ERROR: 26023 Process instance 20272 not found for K2:<domain>\<UserName> at 192.168.200.14:10","anonymous","0.0.0.0","K2SRV01:c:\program files (x86)\k2 blackpearl\Host Server\Bin","5437453752","98ca976ddb924590bbfcacf927d79c48",""

"5437453753","2009-09-28 10:08:02","Error","System","2025","InternalMarshalError","SourceCode.Hosting.Server.Runtime.HostServerBroker.InternalMarshal","2025 Error Marshalling SourceCode.Workflow.Runtime.Management.WorkflowManagementHostServer.GotoActivity, 26023 Process instance 20272 not found for K2:<domain>\<UserName> at 192.168.200.14:10","system","192.168.200.14","K2SRV01:c:\program files (x86)\k2 blackpearl\Host Server\Bin","5437453753","18e7a173c01e41f9866a8ca8631611e7",""

"5437453754","2009-09-28 10:08:02","Error","System","2025","InternalMarshalError","SourceCode.Hosting.Server.Services.TCPClientSocket.InternalMarshal","2025 Error Marshalling SourceCode.Workflow.Runtime.Management.WorkflowManagementHostServer.GotoActivity, 26023 Process instance 20272 not found for K2:<domain>\<UserName> at 192.168.200.14:10","system","192.168.200.14","K2SRV01:c:\program files (x86)\k2 blackpearl\Host Server\Bin","5437453754","286926ddf56b468bac2f19f49090815f",""
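Because the log lines are quoted, comma-separated records, they can be summarised with a few lines of script when the file gets large. The sketch below is only an illustration: the column names are guesses based on the four sample lines above (K2 does not document them in this thread), and the function names are made up.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical column names, inferred from the sample lines; not an official schema.
COLUMNS = [
    "id", "timestamp", "severity", "category", "code", "name", "source",
    "message", "user", "client_ip", "origin", "id_repeat", "guid", "extra",
]

INSTANCE_RE = re.compile(r"Process instance (\d+) not found")


def parse_log(text):
    """Parse quoted CSV log text into dicts, skipping blank or separator lines."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != len(COLUMNS):
            continue  # not a log record (empty line, dashes, truncated line)
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def count_by_source(rows, severity="Error"):
    """Count records of the given severity per reporting source."""
    return Counter(r["source"] for r in rows if r["severity"] == severity)


def process_instance_ids(rows):
    """Count how often each process instance ID appears in 'not found' messages."""
    ids = Counter()
    for r in rows:
        match = INSTANCE_RE.search(r["message"])
        if match:
            ids[match.group(1)] += 1
    return ids
```

To use it, read the log file into a string, pass it to `parse_log`, and feed the result to the two counting helpers. That makes it easier to see whether the same process instance or the same source keeps showing up right before each crash.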

PYao
Rookie
619 replies
14 years ago
28 September 2009

Not sure if this will help with the excessive TCP connections, but try reducing the TIME_WAIT timeout to 30 seconds: HKLM\System\CurrentControlSet\Services\Tcpip\Parameters\TCPTimedWaitDelay

http://blogs.msdn.com/dgorti/archive/2005/09/18/470766.aspx
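To see why the timed-wait delay matters, here is a rough back-of-the-envelope sketch. A closed connection keeps its local port reserved for the length of the delay, so the number of available ports divided by the delay gives an upper bound on how many new outbound connections per second a machine can sustain. The port range (49152 to 65535) and the 240-second starting value are assumptions for illustration; check the actual values on your own server.

```python
def sustainable_connection_rate(ephemeral_ports, timed_wait_seconds):
    """Rough upper bound on new outbound connections per second.

    Each closed connection holds its local port for `timed_wait_seconds`,
    so at most `ephemeral_ports` connections can be closed per delay period.
    """
    return ephemeral_ports / timed_wait_seconds


# Assumed dynamic port range; older Windows versions use a smaller range.
PORTS = 65535 - 49152 + 1  # 16384 ports

for delay in (240, 30):  # assumed current value vs. the suggested 30 seconds
    rate = sustainable_connection_rate(PORTS, delay)
    print(f"TcpTimedWaitDelay={delay:>3}s -> about {rate:.0f} new connections/s")
```

With these assumed numbers the bound goes from roughly 68 to roughly 546 new connections per second. Whether port exhaustion is really what stops the K2 service here is not established in this thread.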

If what Peter mentioned doesn't work, can you provide the following details?

I noticed that a Goto Activity is being called just before the server crash. Could you give us more information about the Goto? Is it being called via the API? Are you doing a Goto from the Workspace? Or is it an Escalation Goto?

Also, do you see the same behaviour when attempting to Delegate an item?

vernon

Thanks for your reply Vernon.

To put it more into context:

We have built our own web application which uses K2 as the runtime environment for various workflows. We don’t use the default K2 Workspace for showing and processing tasks. The workflow that is most likely to cause the crash is a relatively simple workflow which processes a credit claim. A user enters a claim, and this claim is routed to a claim administrator who approves or denies it. After the approval or denial, the workflow pushes the outcome to our internal system for further processing by calling a function through a SmartObject. The flow is a little more complicated, but in a nutshell that's what it does.

It's possible to reassign a task from our application, but as far as I can see that did not happen in the cases where K2 crashed. I have tried to delegate a task manually, but this did not lead to any unexpected behavior. There are also no escalations defined. When looking at the workflow that is in the error state after the service crashes, the workflow has stopped at the activity containing the SmartObject. I can understand that there is a problem with (or bug in) the SmartObject, but it seems to me that K2 should not crash when a simple error occurs in it. Or is it possible that, when an exception occurs in the SmartObject, the K2 host server process crashes because of this exception?

So it’s definitely the SmartObject that makes the server crash, for an unknown reason. Do you use a connection string in the SmartObject, and which service do you use: Dynamic SQL from BlackMarket? You can also see what happens when you test the SmartObject with the SmartObject tester, which you can find in the BlackPearl installation folder under service broker\SmartObject service tester.exe

Vernon

The SmartObject where the workflow is in error is a custom SmartObject developed in-house. It calls a web service which submits the outcome of the claim workflow to our back-end system. When an error occurs in the SmartObject, the error is caught and rethrown to alert K2 that something went wrong. This construction has always worked nicely until now.

I have also done some more research. Because a workflow instance was in an error state almost every time K2 crashed, I assumed that this was the instance causing the error. Yesterday I took a deeper look and saw that the process instance reported in the log file was working again after a restart, and after resubmitting the task the workflow continued without a problem. So it seems that there is no direct relation between the workflow in the error state and the workflow instance reported in the log file.

However, I have discovered another strange phenomenon. I decided to open a performance monitor and watch the SQL Server error counters. The first strange thing I noticed was that a lot of errors are generated while K2 is functioning normally (about 20 per second). I am not an expert in SQL Server, but that seems a lot to me. Just before K2 crashed there was a sudden burst of errors, and the performance monitor counted 1400+ errors per second for a couple of seconds. I have not been able to identify these errors yet, but it seems like it could be relevant.
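If the counter values are exported from the performance monitor (for example as one errors-per-second sample per line), a burst like this can be flagged automatically instead of being watched for by eye. The following is a minimal sketch, not a tuned detector: it compares each sample with the median of the samples just before it. The window size, the threshold factor, and the function name are assumptions.

```python
from statistics import median


def find_bursts(samples, window=30, factor=10.0, min_history=5):
    """Return indexes of samples that exceed `factor` times the recent median.

    `samples` is a list of errors-per-second readings in time order.
    The first `min_history` readings are never flagged because there is
    not enough history to form a baseline.
    """
    bursts = []
    for i, value in enumerate(samples):
        history = samples[max(0, i - window):i]
        if len(history) < min_history:
            continue
        baseline = median(history)
        if baseline > 0 and value > factor * baseline:
            bursts.append(i)
    return bursts
```

With a steady baseline of about 20 errors per second, readings around 1400 are far above the threshold and get flagged. The timestamps of the flagged samples can then be lined up with the crash times and with the host server log.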

Another thing I remembered is that a couple of weeks ago (about a week before the problems started) we had a hard disk storage problem. The transaction log of the K2ServerLog database had grown to over 280 GB and we had to purge and shrink it. A lot of workflows were in an error state, but after retrying them, 99% continued to work without problems. Maybe this also has something to do with it...

Hi there!

I am having the same problem with similar symptoms (the workload and the TCP connections), even though the memory and CPU I have provisioned for the server far exceed the recommended amounts for the K2 Blackpearl Server service. It is currently affecting our production environment and is urgent. Could you share your workaround for restarting the server when it is down? Thank you so much!

This is an old thread, but seeing that there is at least one new entry in it, I will post what we have found.
If you are using a now-old version of K2 Blackpearl, you will probably be using .NET 4.5.0. K2 4.7 uses .NET 4.6.

.NET 4.5.0 has a threading error which is resolved in .NET 4.5.1 and later versions. It is described in the link below and pertains to a change of thread context and its TransactionScope, which is exactly what happens when a given server is taxed.

https://particular.net/blog/transactionscope-and-async-await-be-one-with-the-flow
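The general mechanism behind this kind of problem can be shown with a small analogy. This is Python, not .NET, and it is not K2 or TransactionScope code; it only illustrates that "ambient" state bound to one thread is not visible when the work continues on a different thread, unless the context is carried along explicitly. All names here are made up for the illustration.

```python
import contextvars
import threading
from concurrent.futures import ThreadPoolExecutor

# Ambient state tied to a single thread (analogous to a thread-bound transaction).
_thread_bound = threading.local()
# Ambient state that can be copied and carried to another thread on request.
_flowing = contextvars.ContextVar("ambient_tx", default=None)


def read_ambient():
    """Return (thread-bound value, context value) as seen by the current thread."""
    return getattr(_thread_bound, "tx", None), _flowing.get()


def demo():
    """Set ambient state on the calling thread, then read it from a worker thread."""
    _thread_bound.tx = "tx-1"
    token = _flowing.set("tx-1")
    try:
        ctx = contextvars.copy_context()
        with ThreadPoolExecutor(max_workers=1) as pool:
            # Work resumes on another thread: the thread-bound value is not visible there.
            plain = pool.submit(read_ambient).result()
            # Same worker thread, but the caller's context is carried along explicitly.
            flowed = pool.submit(ctx.run, read_ambient).result()
        return plain, flowed
    finally:
        _flowing.reset(token)
        del _thread_bound.tx
```

In `demo()`, the thread-bound value comes back as `None` from the worker in both cases, while the context value is visible in the second call because the context was passed along. As far as I know, .NET 4.5.1 added an explicit option on TransactionScope for the same purpose, which is what the linked article discusses.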
