Threading/Periodic Worker Processes

08 Apr 2024 08:25 #12718 by support
Hello,
and thank you for all the details.

For clarification from my side, the two PeriodicWorker.Process threads that you are seeing look precisely as expected.
I am somewhat surprised by the threads near the bottom of your picture (Managed Ids 15, 16, 17), as they seem to have something to do with Visual Studio design-time mode - I have not seen them before. But I suppose you know what you are doing/what is happening there.

I think it would definitely be worth testing .NET Framework 4.8.4110.0 or 4.8.3928.0 (those from the good machines) on the bad machine.

I also suggest making absolutely sure that the problem is not actually related to the instrumentation (which is in part put in place to troubleshoot it, but probably it was there before as well). I have seen a situation where a lock somewhere inside the .NET tracing infrastructure caused mysterious problems. Note that we have some tracing calls in our code as well - although the traces normally do not end up anywhere - so there can be contention. Try to remove all existing external-facing instrumentation except for some simple test of the occurrence of the problem, and check if the problem persists.

The ideas above are just that - ideas, wild guesses. But I think they are quite important to test, before going any further. Plus, it is not clear to me how to proceed afterwards. If I had a reproducible scenario here, I could perhaps turn on/off parts of the code that gets executed as a consequence of "new EasyUAClient()", and search for the part that triggers the issue. That would be very tedious, but in principle doable. But I cannot imagine doing that remotely. In addition, it is possible that the outcome of such a troubleshooting process would be that the problem is triggered by some call to a lower-level library - a call that we need to make, but where we cannot influence what is happening inside.

Unfortunately, Windows/.NET do not provide real-time execution guarantees, so what you are observing still falls under "normal behavior", seen from that perspective. Normally it is managed by making the necessary intermediate queues long enough to cope with situations encountered in the wild. I am not sure whether that is a possibility in your case.
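To illustrate what I mean by an intermediate queue, here is a minimal sketch, assuming the application can place its own bounded buffer between the event handler and the slower consumer. The `Pulse` and `PulseBuffer` names are hypothetical (the real application's types are not shown in this thread), and this does not change the depth of the card's own event status queue.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical record of one Sync Pulse; not a type from the customer's code.
public struct Pulse
{
    public long Counter;
    public long TimestampTicks;
}

public sealed class PulseBuffer
{
    private readonly BlockingCollection<Pulse> _queue;
    private long _dropped;

    public PulseBuffer(int capacity)
    {
        // Bounded FIFO: capacity decides how long a consumer stall can be absorbed.
        _queue = new BlockingCollection<Pulse>(new ConcurrentQueue<Pulse>(), capacity);
    }

    // Number of pulses rejected because the buffer was full.
    public long Dropped
    {
        get { return Interlocked.Read(ref _dropped); }
    }

    // Intended for the event handler: never blocks, counts a drop when full.
    public bool TryPost(Pulse pulse)
    {
        if (_queue.TryAdd(pulse)) return true;
        Interlocked.Increment(ref _dropped);
        return false;
    }

    // Intended for the consumer thread: waits up to timeoutMs for an item.
    public bool TryTake(out Pulse pulse, int timeoutMs)
    {
        return _queue.TryTake(out pulse, timeoutMs);
    }
}
```

The point of the design is that the handler side only does a non-blocking `TryPost`, so a slow logger or a pause elsewhere shows up as a growing buffer (or a counted drop) rather than as a stalled handler.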

02 Apr 2024 13:29 #12706 by KPersyn33
Hello! I will attach the response from the customer that answers the below questions.

File Attachment: SoftwareTo...5C9.docx (1,615 KB)

20 Mar 2024 18:30 #12661 by support
Hello.

If it is not related to garbage collection, and is not something we can reproduce locally, it will be difficult to diagnose. So, I will have some more questions and perhaps some suggestions, but they are just meant to obtain some more clarity and perhaps trigger an idea that could lead us in the right direction.

1. Why precisely is the issue titled "Threading/Periodic Worker Processes"? In the original report, there is only one sentence that appears related to this, saying "Ultimately, we are looking to get information on the possible threading/periodic worker processes that could be having an effect on this.". We have a class (that runs a thread in its instance) called PeriodicWorker. Are you referring to this, i.e. do you see some (or many) threads named "OpcLabs.PeriodicWorker.Process"? If so, how does it look in a "normal" situation, and how does it look when the problem occurs (if it can be determined at that time)?

2. Is the general CPU usage on the computer, and CPU usage by the application, "good" (low)?

3. The architecture ("block diagram") of the whole system is not fully clear to me. But I assume that whatever dequeues the entries from the "event status queue" in the .NET application must be running on some thread. How is this thread created? Is it with "new Thread()"? Or is it a thread from a .NET thread pool, perhaps a Task? Or something else?

4. Can you obtain the actual full .NET Framework version being used by the application on the "good" and "bad" machines? My suggestion is to read and capture System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription from inside the app.

5. Have you tried to break into the debugger when the problem occurs, and look for suspicious symptoms? Mainly, does the number and "structure" of managed threads significantly differ, when the problem occurs? (this is more or less repeating a part of question #1, but is more general here)

6. Does the issue possibly appear multiple times during a single run of the application, if you let it run long enough - or just once? (If it happens just once, it may indicate an issue with .NET JIT compilation.)

7. In the original report, you wrote "When they removed the call to client = new EasyUAClient(), the problems stopped occurring.". Does this mean that adding "client = new EasyUAClient()", *without any further operation on that 'client' object* , is enough to introduce the problem? Or did you actually mean that besides "client = new EasyUAClient()", a whole set of new functionality has also been enabled inside the app, consequently?
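For questions 1, 3 and 5, it may help to log the identity of the thread from inside the event handler and from the code that dequeues the entries. A minimal sketch follows; the `ThreadInfo` helper name is made up for illustration, and it only uses standard `System.Threading.Thread` properties.

```csharp
using System.Threading;

public static class ThreadInfo
{
    // Describes the thread this method is called on, so that log lines can show
    // whether a handler runs on a dedicated thread or on a thread pool thread.
    public static string Describe()
    {
        Thread current = Thread.CurrentThread;
        return string.Format(
            "id={0} name={1} pool={2} background={3} priority={4}",
            current.ManagedThreadId,
            current.Name ?? "(unnamed)",
            current.IsThreadPoolThread,
            current.IsBackground,
            current.Priority);
    }
}
```

Logging `ThreadInfo.Describe()` once from each place of interest should be enough; it does not need to be logged on every pulse.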

Regards

19 Mar 2024 15:41 #12648 by KPersyn33
The customer has gotten back to us and has done some testing to determine if the Garbage Collector is possibly causing the Sync Pulse issue. They monitored the Garbage Collection by registering for Full GC Notification, then waiting for Full GC Approach and reading and logging the Collection Count for generations 0, 1 and 2. After the application is running, I receive 3 notifications within the first 30 seconds and then no others. The Sync Pulse issue still occurs.
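For reference, the kind of monitoring described above can be sketched roughly as follows. This is not the customer's actual code (which was not shared); the `GcMonitor` name, the threshold values of 10, and the log format are assumptions. As far as I understand the documentation, these notifications cover blocking full collections only, and registration can fail with InvalidOperationException when concurrent garbage collection is enabled.

```csharp
using System;
using System.Threading;

public static class GcMonitor
{
    // Current collection counts per generation, for correlating with the overflow log.
    public static string Counts()
    {
        return string.Format("gen0={0} gen1={1} gen2={2}",
            GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2));
    }

    // Starts a background thread that logs full GC approach/completion.
    // Returns null when notifications cannot be registered.
    public static Thread Start(Action<string> log)
    {
        try
        {
            // Thresholds (1..99) are illustrative values, not tuned ones.
            GC.RegisterForFullGCNotification(10, 10);
        }
        catch (InvalidOperationException)
        {
            log("Full GC notifications unavailable (concurrent GC enabled?)");
            return null;
        }

        var worker = new Thread(() =>
        {
            while (true)
            {
                // Blocks until a full GC is approaching, or registration is cancelled.
                if (GC.WaitForFullGCApproach() != GCNotificationStatus.Succeeded) break;
                log(DateTime.UtcNow.ToString("o") + " full GC approaching, " + Counts());

                if (GC.WaitForFullGCComplete() != GCNotificationStatus.Succeeded) break;
                log(DateTime.UtcNow.ToString("o") + " full GC completed, " + Counts());
            }
        });
        worker.IsBackground = true;
        worker.Name = "GcMonitor";
        worker.Start();
        return worker;
    }
}
```

Logging `GcMonitor.Counts()` next to each queue overflow entry gives a second, independent way to see whether any collection happened around that time.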

I also monitored Garbage Collection using Perfmon and see no issues with it.

Do you have any other suggestions?

07 Mar 2024 14:55 #12625 by support
Hello.

There can be various reasons, but the main suspect is the .NET Garbage Collector (GC). In its default behavior, it kicks in at unpredictable times, and it stops all other threads in the running process, performs memory cleanup, and then resumes the normal execution of the program. Pulling in our library (well, any library, or adding any code) can change the memory usage patterns and therefore the GC invocations - which is nobody's fault, but can still have a negative effect.

Please determine which .NET or .NET Framework version they are running the program on (because the GC behavior depends on that).
It is important to know the version that is truly running, and not just the version they are targeting. Example: They may be targeting .NET Framework 4.7.2, but run under .NET Framework 4.8. Or, they can be targeting .NET 6, but run under .NET 7.
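One way to capture the running version (as opposed to the targeted one) is to log it from inside the application at startup. A minimal sketch, assuming the app targets .NET Framework 4.7.1 or later (where `RuntimeInformation` is available); the `RuntimeReport` helper name is made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RuntimeReport
{
    // One-line description of the runtime the process is actually executing on.
    public static string Describe()
    {
        return string.Format(
            "{0} (Environment.Version {1}, {2}-bit process)",
            RuntimeInformation.FrameworkDescription,
            Environment.Version,
            Environment.Is64BitProcess ? 64 : 32);
    }
}
```

Writing `RuntimeReport.Describe()` to the application log on both the "good" and the "bad" machines makes the two directly comparable.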

In order to determine whether the above suspicion is correct, please ask the customer to correlate the occurrences of the problem with the times when .NET GC runs. There are tools such as PerfMon (and others) that can help with that. Microsoft has all the docs. Based on the outcome, we can decide on further steps. If the culprit is truly the GC, there are some (limited) ways to influence its behavior (and they depend greatly on the .NET Runtime version).
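As an example of the kind of (limited) influence I mean, the GC latency mode can be inspected and changed at run time through `System.Runtime.GCSettings`. This is only a sketch under the assumption that the GC turns out to be the culprit; the `GcTuning` name is hypothetical, and whether the requested mode is honored depends on the runtime version and on the GC configuration (for example the gcConcurrent and gcServer settings).

```csharp
using System;
using System.Runtime;

public static class GcTuning
{
    // Current GC flavor and latency mode, useful to log on each machine.
    public static string Describe()
    {
        return string.Format("serverGC={0} latencyMode={1}",
            GCSettings.IsServerGC, GCSettings.LatencyMode);
    }

    // Requests SustainedLowLatency and returns the previous mode so it can be restored.
    // The runtime may keep a different mode if the request does not apply.
    public static GCLatencyMode RequestLowLatency()
    {
        GCLatencyMode previous = GCSettings.LatencyMode;
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
        return previous;
    }
}
```

Logging `GcTuning.Describe()` before and after the call shows whether the request actually took effect.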

Best regards

07 Mar 2024 12:09 - 07 Mar 2024 12:09 #12622 by support
Hello,
I am currently travelling, please allow some time before I can answer.

Best regards
Last edit: 07 Mar 2024 12:09 by support.

06 Mar 2024 14:50 #12621 by KPersyn33
A customer has reached out to our team because they were able to identify an issue with their project that contains OPC Data Client. Unfortunately, this is not something that they are able to record or provide a replicable project for, so I have provided all of the information that they have provided to us below.

Ultimately, we are looking to get information on the possible threading/periodic worker processes that could be having an effect on this.

A PCIe-C1553 card from Avionics Interface Technologies (AIT) is installed. This PCIe card is connected to the customer’s Time Server to read IRIG time and to receive Sync Pulses at 100Hz. The PCIe card receives a pulse every 10 ms and communicates to the application via an interrupt-driven event handler. The event handler reads the IRIG time and updates a pulse counter to be used for application synchronization. If the events are not handled in a timely manner the event status queue will overflow. The event status queue has a depth of 256.

Version is OPC Data Client v2022.1, though the user is in the process of testing with the latest release. The user has been successfully running their application on 5 different PCs, using Windows 10 Enterprise version 1607 (OS Build 14393.1198 and 14393.447) and Enterprise version 1703 (OS build 15063.726 and 15063.502).

When they tried to run the application on a new PC running Windows 10 Enterprise version 22H2 (OS Build 19045.3996), that is where they ran into problems. After a variable amount of time the event status queue for the incoming Sync Pulses overflows. After approximately 2.5 seconds, the event handler resumes receiving Sync Pulse events. They were able to determine this because the Sync Pulse counter and the IRIG time are sent to a queue in the event handler to be logged to a file in another task. They also log when the status queue overflows.
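As a rough sanity check on the numbers above (this is our own arithmetic, not something the customer stated): with a queue depth of 256 and pulses arriving at 100 Hz, the queue fills in about 2.56 seconds if nothing is dequeued. That is in the same range as the roughly 2.5 second gap, though whether this is more than a coincidence has not been established. The `QueueBudget` helper name is made up for illustration.

```csharp
using System;

public static class QueueBudget
{
    // Seconds a queue of the given depth can absorb events arriving at rateHz
    // while nothing is being dequeued.
    public static double SecondsUntilOverflow(int depth, double rateHz)
    {
        if (depth <= 0) throw new ArgumentOutOfRangeException("depth");
        if (rateHz <= 0) throw new ArgumentOutOfRangeException("rateHz");
        return depth / rateHz;
    }
}
```

For the values in this report, `QueueBudget.SecondsUntilOverflow(256, 100)` evaluates to 2.56.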

To determine the root cause of the Sync Pulse interrupt problem, the user has tried to isolate the various components of the application code. Having only the AIT IRIG and Sync Pulse portion of the code running along with some basic logging does not present any problems.

When they removed the call to client = new EasyUAClient(), the problems stopped occurring. They were able to confirm that when using ClientAce in place of OPC Data Client, the application does not exhibit any problems either.
