We’re seeing a very random issue when using the busy plugin with Servoy v6.0.4. What happens is we call block(), the screen grays out with Cancel button displayed fine, the background process runs and completes and unblock() is called but the grey screen/cancel button stay up on screen. The user is then forced to cancel which seems to call the cancel callback (but of course the process has actually already completed). We haven’t been able to readily reproduce this ourselves so we don’t know exactly all the details in how it manifests itself. It does sound like some sort of concurrency/visibility issue across threads. We opened case #408 a few days ago but haven’t heard anything yet from Patrick/Scott.
Is anyone else using the Busy plugin with Webclient on Servoy 6.0.4? Have you seen issues like this?
I’ve seen your issue at the time but completely forgot about it, sorry. Unfortunately I have little answer to give you and no known workaround.
The problem is that, as you say yourself, the issue is totally random. And I agree with you that it seems like a threading issue.
Since Servoy 5.2.x (I don’t remember exactly which version, but it dates from the start of last year), Servoy updated the Wicket framework, which changed the way callbacks are made to the server. Before that change, more than one thread could call back to the server (the first call being the one launching your process, the second being the one cancelling it), but since that change all callbacks to the server are queued. This meant I could no longer do it the way I had been doing it, because the cancel call was only processed after the previous process had finished.
We discussed this with Johan and I found a way to work around this limitation, and we concluded with Johan that it was viable (although a little convoluted).
What I did was to expose a simple servlet that was accepting a clientID and holding a ConcurrentHashMap of Boolean per clientID, so instead of calling the server web service back I call this servlet instead which updates a boolean (set to true = process cancelled). Now the server process asks about whether this boolean is true or false in your loop (this is what plugins.busy.isCanceled() is doing), but the client needs to call an RMI service to get access to that boolean, even though it is located on the server-side just like the web-client.
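To make the mechanism described above concrete, here is a minimal sketch of a per-client cancel flag held in a ConcurrentHashMap. It is not the plugin’s actual code: the class name CancelRegistry and the methods begin(), cancel() and end() are hypothetical, and only isCanceled() mirrors a name mentioned in this thread. It also assumes a newer Java version than the plugin targets.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical registry; the real plugin's classes and method names may differ. */
final class CancelRegistry {
    // One flag per client ID. ConcurrentHashMap guarantees that a put() made by
    // one thread (e.g. the servlet) is visible to a later get() on another thread.
    private final ConcurrentMap<String, Boolean> cancelled = new ConcurrentHashMap<>();

    /** Called when a process starts for this client: clears any stale flag. */
    void begin(String clientId) {
        cancelled.put(clientId, Boolean.FALSE);
    }

    /** Called by the servlet when the user presses Cancel. */
    void cancel(String clientId) {
        cancelled.put(clientId, Boolean.TRUE);
    }

    /** Polled by the running process inside its loop. */
    boolean isCanceled(String clientId) {
        return cancelled.getOrDefault(clientId, Boolean.FALSE);
    }

    /** Called when the process ends, so no entry lingers for the next run. */
    void end(String clientId) {
        cancelled.remove(clientId);
    }
}
```

Note that if the request to the servlet never arrives, cancel() is simply never called and the map keeps reporting false, so a lost request and a thread-safety bug would look the same from the outside.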
Somewhere in between the call to the servlet and the call to the RMI service to get that boolean state, something seems to not be happening and I don’t know why… The Map is a ConcurrentHashMap, so it is safe to use from multiple threads (the servlet thread and the RMI service), so I don’t believe that’s the problem, but I’m not 100% sure that both the servlet and the RMI service are receiving the calls at all times.
So that’s where I’m at. You can have a look at the sources on ServoyForge https://www.servoyforge.net/projects/bu … /revisions, the changes I’m talking about are mostly in revision 17.
It’s been working like a charm, or so it seems, but I have already heard of random issues like yours, except that I have never been able to reproduce them reliably… If I could get my hands on a sample capable of reproducing this reliably, I might be able to fix it.
Thanks a lot for the detailed background information. Not sure if this helps narrow down the problem but we were previously using the busy plugin with Servoy v5.2.8 with no known issues. We then jumped to v6.0.4 where we started seeing this issue (we did test each of the v6.0.x point releases but we never have seen this issue in our QA environment so it may well exist in earlier v6.0.x releases as well).
If I read your message correctly you believe the issue is somewhere in how the ‘cancelled’ boolean is either set or read? I haven’t actually seen this problem occur myself but the problem (as described to me) seems to be just that the glass overlay screen doesn’t close when the process completes (leading the user to believe that the process has hung) so I’m not so sure that it’s directly related to the ‘cancelled’ boolean. Is it possible for there to be some kind of failure due to variable visibility across threads in the unblock() method?
Another bit of strangeness is that at the end of one of our processes we were calling showURL() and when this problem occurred the user had to cancel. Then they tried the same operation again and the showURL() from the first process ended up getting executed?!? Very strange behaviour.
While we’ve pretty much made the decision to remove the busy plugin and use headlessclient instead, we have a number of processes that take a foundset as input, so headlessclient isn’t much help in those cases. I’ll maybe check out the plugin source code and see about adding at least a bit of debug logging in some spots so we have a better idea of what’s going on and when. I may have more questions for you regarding the best places to do that.
The overlay in client is using the jQuery.blockUI plugin.
If this overlay is not closed properly, it would mean that the plugins.busy.unblock() call in your finally block isn’t being sent back to the client, where it would normally trigger $j.unblockUI() in the browser.
It may be that the callback to your showURL() is coming from this unblock() call being executed the next time the plugin is accessed, because the boolean is not correctly set, so the process is still on for this clientID as far as the plugin is concerned.
It doesn’t make much sense to me how or why this is happening. I would need to find a way to reproduce it reliably, to debug in Eclipse and see the exact sequence of events when it happens. If you find out more on your end, please keep me posted.
I don’t know enough about the plugin to understand exactly which threads are accessing the various objects, but at a minimum the busy variable should be declared final to ensure safe publication of that reference. Alternatively, it doesn’t look like you’re using any of the other methods of AtomicBoolean, so you could just declare it as private volatile boolean busy. If you keep the AtomicBoolean, maybe it would make sense to use compareAndSet() so you can catch and log cases where the busy variable was unexpectedly set to the opposite value (true when you expect false)? Even if this doesn’t solve the problem, the added logging may provide extra hints as to what and where things are going wrong.
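As a rough illustration of the suggestion above, the following sketch keeps the flag in a final AtomicBoolean and uses compareAndSet() to log unexpected transitions. The class BusyState and its method names are made up for this example and are not taken from the plugin source.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

/** Hypothetical wrapper around the plugin's busy flag; names are illustrative only. */
final class BusyState {
    private static final Logger LOG = Logger.getLogger(BusyState.class.getName());

    // final: the reference is safely published to every thread that sees this object.
    private final AtomicBoolean busy = new AtomicBoolean(false);

    /** Intended for block(): expects false -> true, logs if the flag was already true. */
    boolean markBusy() {
        boolean changed = busy.compareAndSet(false, true);
        if (!changed) {
            LOG.warning("markBusy() called while already busy");
        }
        return changed;
    }

    /** Intended for unblock(): expects true -> false, logs if the flag was already false. */
    boolean markIdle() {
        boolean changed = busy.compareAndSet(true, false);
        if (!changed) {
            LOG.warning("markIdle() called while not busy");
        }
        return changed;
    }

    boolean isBusy() {
        return busy.get();
    }
}
```

The return value lets a caller decide what to do with an unexpected state, while the log line records that it happened at all.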
About final on the busy AtomicBoolean, yes that would be safer, I’ll change that.
The calls to busy.set() are hard-coded in the plugin, so I don’t think using compareAndSet() will give us more info. What I could do is systematically call a method to update the value instead of using busy.set() directly, and log these calls.
Ok, looks like the issue isn’t really in the busy plugin, but maybe there could be some extra error checking added to check the output of the cancel servlet?
Spot the problem? The /servoy-service URLs are all being blocked (returning 404 error codes). I guess the Webclient code doesn’t notice the bad return code, assumes the cancel completed and removes the glasspane. The user is none the wiser, but things are messed up. The reason those URLs are failing is that we’re configured with Apache/mod_jk in front of Servoy/Tomcat and didn’t realize we needed to route that URL into Servoy. I had also forgotten that we did upgrade to the latest busy plugin version between v5.2.8 and v6.0.4.
Anyway, this doesn’t completely explain the issue as explained to me by the users (no mention was made of having tried to cancel the process prior to it having completed) but users do tend to miss such important little details sometimes and I’m getting this all secondhand. At any rate, I’m hopeful that this is the cause of all our problems.
Thanks again for your help. I’ll update the servoyforge ticket and maybe open a new one about adding some error-checking to the ajax request.
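For anyone else running behind a proxy, the kind of check that would have caught this can be sketched as a small probe that treats any non-2xx response as a failure. This is only an illustration: the class CancelEndpointProbe is hypothetical, the exact sub-path and HTTP method the cancel servlet expects are not known from this thread, and the plugin’s real request is made from the browser via ajax rather than from Java. It needs Java 11 or later.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical smoke test for a URL that must be routed through the proxy. */
final class CancelEndpointProbe {

    /** Returns true only if the URL answers with a 2xx status; a 404 yields false. */
    static boolean isReachable(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        // The body is irrelevant here; only the status code is inspected.
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        int status = response.statusCode();
        return status >= 200 && status < 300;
    }
}
```

Pointing such a probe at a URL under /servoy-service on the public (Apache) address would show whether the request actually reaches Servoy or is rejected by the proxy.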
Yes, if the servoy-service call is blocked, the boolean is never set to true, and it is also used to check if a process is finished, so that would explain it.
Thanks again for your detailed replies. If you hadn’t mentioned that servlet used for cancellation I probably never would have thought to check the Apache server logs for possible errors there.