Random Disconnect from OPC server
Thanks for the information. I actually have something which I think is "good" news: the stress test application has now crashed. It took over a day. The not-so-good part is that I will have to do this repeatedly, each time obtaining a bit more information. The problem is some kind of memory corruption, which means that the failure does not show when the corruption happens, but later, at a seemingly unrelated moment. Please trust me that we are working on this; it will just take time. Assuming that these crashes keep happening here, in the end we should be able to find and fix the cause. The fact that it now happens here makes it much easier to address.
With regard to your question: I think the timeouts that you get may be the initial symptom of something going wrong on our side. They are probably not genuine timeouts, but you can handle them on the item level. At some later point, however, the problem on our side causes the ReadMultipleItems method itself to fail, and that causes a problem on your side, because the failure gets transformed into an exception that your current code does not handle. I think that the generated C++ wrapper throws _com_error (msdn.microsoft.com/en-us/libra...) on any failed HRESULT, so that's the exception you should be catching.
Best regards
I have worked on the problem you reported. The information you have provided makes it clear that there is something wrong in the component, but even after close analysis, it was not sufficient to allow me to find the cause. I will really need to reproduce it here in order to be able to fix it. I am now using a stress test program that (to my knowledge) does things similar to yours: it reads 1-1000 items randomly, waits a little, and repeats this in a loop. Unfortunately, no crash so far (since yesterday).
I can now see the following options; I can do one of them or several at once:
1. I can continue running the test, and possibly write even more demanding tests, in an attempt to reproduce the problem.
2. Modify the code in suspected spots (in a somewhat "blind" way) to put in more checks, safeguards etc., even though no clear bug could be identified so far. Then I can deliver the modified binaries to you for a re-test.
3. Or would you be able to figure out how to cause the crash in an environment that I can reproduce? Do you think you can cause the crash with our simulation server? And, by the way, can you send me the details of the system you are using (OS version & service pack, bitness, number of CPUs)? I will deploy my test on a similar computer and let it run there.
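For illustration, the kind of stress loop described above (pick 1-1000 items at random, read them, wait a little, repeat) might look roughly like the sketch below. It is not the actual test program: ReadBatchFn and RunStressTest are hypothetical names, and the read itself is a callback that you would bind to your real read call.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <thread>
#include <vector>

// Hypothetical callback: reads the given item IDs, returns true on success.
// In a real test this would wrap the actual multi-item read call.
using ReadBatchFn = std::function<bool(const std::vector<std::string>&)>;

// Runs `iterations` rounds. Each round picks 1..maxItems random item IDs
// from `pool` (repeats allowed), reads them, then pauses.
// Returns the number of rounds in which the read reported a failure.
std::size_t RunStressTest(const std::vector<std::string>& pool,
                          const ReadBatchFn& readBatch,
                          std::size_t iterations,
                          std::size_t maxItems = 1000,
                          std::chrono::milliseconds pause = std::chrono::milliseconds(100),
                          unsigned seed = std::random_device{}())
{
    if (pool.empty() || maxItems == 0) return 0;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> countDist(1, maxItems);
    std::uniform_int_distribution<std::size_t> indexDist(0, pool.size() - 1);

    std::size_t failures = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        // Build a batch of random size with randomly chosen item IDs.
        std::vector<std::string> batch(countDist(rng));
        for (auto& id : batch) id = pool[indexDist(rng)];

        if (!readBatch(batch)) ++failures;

        // "Waits a little" before the next round.
        std::this_thread::sleep_for(pause);
    }
    return failures;
}
```

Passing a fixed seed makes a run repeatable, which may help when a crash only shows up after many hours.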
Regards
Here is some explanation of my previous post:
In general, for methods like ReadMultipleItems, we try to report all errors through the elements in the result array, that is, through the Exception property of the DAVtq object that you are already testing. Almost all errors are reported in this way; specifically, anything that has to do with problems communicating with the target OPC server is reported in this way. The advantage is that you have full control over how to test for the errors, and all errors are on a per-item level, allowing you to test them one by one.
As it turns out, however, there are some errors that cannot be reported in this way. In such a case, the method itself (ReadMultipleItems or a similar method) returns a failed HRESULT. There are, very roughly, 4 main areas where this can happen:
1. An invalid argument passed to the method. Note that this is not about "benign" things such as an invalid OPC item ID: as far as the method is concerned, any string is a valid argument. This case is about programming errors, such as passing in a NULL argument where it is not allowed, or, let's say, a float number instead of an array of strings, etc.
2. System errors that prevent the method from being executed or finalized. For example, an "Out of memory" condition.
3. Errors reported by the system for the communication between your program and the component. Since the component lives in a separate process, there are some things that can go wrong, and some of them cannot be fully prevented. For example, an administrator can terminate the component's process. In such a case, an RPC error will be returned for any further attempts to communicate with already existing instances of the component's objects.
4. A bug in the component (such as a crash of it, which is what you have actually observed).
If you do everything right, you can prevent #1 from happening. And, if we do everything right, we can prevent #4 from happening. But there are still edge cases (#2 and #3) that cannot be fully prevented. For this reason, your code should test the HRESULT of the method, and act appropriately. I do not know the logic of your application, but one possible approach is to treat such an error as if it were an error reported for all items involved in the operation.
What has happened in your case is that the wrapper classes generated by the C++ compiler turn any failed HRESULT of a method into an exception. You can see that in the pictures you have sent me: there is a line in the IEasyDAClient::ReadMultipleItems wrapper that reads:
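One way to picture the "treat it as an error on all items" approach is the sketch below. It is only an illustration under stated assumptions: ItemResult, NormalizeResults, HResult and Failed are hypothetical stand-ins (for your own per-item result type and for the Windows HRESULT type and FAILED macro), chosen so that the code compiles without the Windows headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins for the Windows HRESULT type and the FAILED() macro, so that the
// sketch compiles anywhere. On Windows you would use the real ones.
using HResult = std::int32_t;
constexpr bool Failed(HResult hr) { return hr < 0; }

// Hypothetical per-item outcome as seen by the application logic.
struct ItemResult {
    bool ok = false;       // true if the item was read successfully
    HResult error = 0;     // error code when ok == false
    std::string message;   // human-readable description
    double value = 0.0;    // the value read (simplified to a double here)
};

// If the method call as a whole failed, produce one failed result per
// requested item, all carrying the method-level error. Otherwise pass the
// per-item results through unchanged.
std::vector<ItemResult> NormalizeResults(HResult methodHr,
                                         std::size_t itemCount,
                                         std::vector<ItemResult> perItem)
{
    if (!Failed(methodHr)) return perItem;

    ItemResult failed;
    failed.ok = false;
    failed.error = methodHr;
    failed.message = "The whole read operation failed";
    return std::vector<ItemResult>(itemCount, failed);
}
```

With this shape, the rest of the application only ever deals with per-item errors, regardless of whether the failure was reported per item or by the method itself.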
if (FAILED(_hr)) _com_issue_errorex(_hr, this, __uuidof(this));
_com_issue_errorex throws an exception in the error cases described above. You need to either catch that exception type, or not use the wrapper at all (or use raw_ReadMultipleItems in the wrapper) and simply test the HRESULT returned from the method call. If done this way, no exception will be thrown and none needs to be caught.
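The two options can be sketched as follows. To keep the example compilable outside Windows, it uses stand-ins that are assumptions of this sketch, not the real API: ComError plays the role of _com_error (whose Error() member returns the failed HRESULT), and FakeClient imitates the shape of the generated wrapper with the method arguments and results omitted.

```cpp
#include <cstdint>

// Stand-ins for HRESULT and FAILED(); use the real ones on Windows.
using HResult = std::int32_t;
constexpr bool Failed(HResult hr) { return hr < 0; }

// Stand-in for _com_error from <comdef.h>.
class ComError {
public:
    explicit ComError(HResult hr) : hr_(hr) {}
    HResult Error() const { return hr_; }
private:
    HResult hr_;
};

// Hypothetical client imitating a generated wrapper class.
struct FakeClient {
    HResult nextResult = 0;  // what the simulated raw call will return

    // Raw method: just returns the HRESULT, never throws.
    HResult raw_ReadMultipleItems() { return nextResult; }

    // Wrapper method: turns any failed HRESULT into an exception.
    void ReadMultipleItems() {
        HResult hr = raw_ReadMultipleItems();
        if (Failed(hr)) throw ComError(hr);
    }
};

// Option 1: call the throwing wrapper and catch the exception type.
HResult ReadWithCatch(FakeClient& client) {
    try {
        client.ReadMultipleItems();
        return 0;
    } catch (const ComError& e) {
        return e.Error();  // method-level failure, handled here
    }
}

// Option 2: call the raw method and test the HRESULT; nothing is thrown.
HResult ReadWithRaw(FakeClient& client) {
    HResult hr = client.raw_ReadMultipleItems();
    if (Failed(hr)) {
        // Handle the method-level failure here (log, retry, mark items bad).
    }
    return hr;
}
```

Both functions end up with the same information, namely the method-level HRESULT; which one to use is mostly a matter of how the rest of your code prefers to handle errors.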
On a separate note, I have some results from the testing; I will post them here shortly.
Thanks for this.
"The RPC server is not available." should not cause an ASSERT like it did, but otherwise it is quite a common error, usually caused by disrupted network communication, OPC server crashes or unexpected terminations, etc.
I am finishing a different task, but should be able to switch to your issue today or tomorrow.
Best regards
Anyway, I'd rather focus on the original post for now, because that is something that should not be happening, and it is probably the same issue that you have encountered before.
Regards,