Balluff - BVS CA-GT Technical Documentation
While acquiring data from the device, image data is lost: e.g. the counter for incomplete or lost images increases, or request objects inside an application are returned with errors.
There are many potential causes for this, so neither a general cause nor a general resolution can be given:
General information about potential resolutions for network related transmission problems that are not specific to the Balluff Multi-Core Acquisition Optimizer can be found here.
The Balluff Multi-Core Acquisition Optimizer makes heavy use of the RSS (Receive Side Scaling) features offered by NICs. Especially the combination of mvMultiCoreAcquisitionCoreCount and mvMultiCoreAcquisitionCoreSwitchInterval needs to be mentioned here. Internally these two properties configure a connected device so that after every mvMultiCoreAcquisitionCoreSwitchInterval network packets sent out by the device, the packets are modified in such a way that, when received, they will be processed by another CPU core. The algorithm will use at most mvMultiCoreAcquisitionCoreCount different CPU cores.
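As a conceptual sketch (not the driver's actual implementation), the round-robin behavior described above can be modeled as a simple mapping from packet index to CPU core:

```python
def core_for_packet(packet_index, core_count, switch_interval):
    """Model of the described core-switching scheme: every
    `switch_interval` packets the stream moves on to the next of
    `core_count` cores, wrapping around at the end."""
    return (packet_index // switch_interval) % core_count

# With 4 cores and a switch interval of 2, consecutive packets map to:
# 0, 0, 1, 1, 2, 2, 3, 3, 0, 0, ...
print([core_for_packet(i, 4, 2) for i in range(10)])
```

This is only meant to illustrate why mvMultiCoreAcquisitionCoreSwitchInterval controls how often the receiving core changes; the real packet steering happens inside the device and the NIC's RSS hash queues.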
Be aware of the impact of using multiple CPUs for receiving data! If possible, always favor using a single, dedicated CPU core for processing the data stream of a single camera while moving the load of the application away from this core. When dealing with multiple cameras, try to configure the system so that each camera transmits its network data to its own dedicated core while the application uses the remaining cores. This results in the best performance, and in that case the parameter mvMultiCoreAcquisitionCoreSwitchInterval does not matter since switching the CPU core for an individual network stream is not necessary.
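The recommended layout can be sketched as a small planning helper (a hypothetical illustration, not an SDK function): one dedicated core per camera stream, with the application confined to the remaining cores.

```python
def plan_core_usage(total_cores, camera_count):
    """Illustrative sketch: dedicate one CPU core per camera stream
    and leave all remaining cores to the application."""
    if camera_count >= total_cores:
        raise ValueError("not enough cores to dedicate one per camera")
    camera_cores = {f"camera{i}": i for i in range(camera_count)}
    app_cores = list(range(camera_count, total_cores))
    return camera_cores, app_cores

cams, app = plan_core_usage(8, 3)
print(cams)  # {'camera0': 0, 'camera1': 1, 'camera2': 2}
print(app)   # [3, 4, 5, 6, 7]
```

How the streams and the application are actually pinned to those cores is platform-specific (e.g. thread affinity APIs of the operating system) and outside the scope of this section.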
If this is not possible for whatever reason, try to find a good balance between the number of CPU cores used and the CPU core switch interval! As described here, a network stream is usually bound to a specific CPU anyway. This is done for performance reasons: switching from one core to another consumes additional CPU cycles and is therefore avoided by the operating system. So regarding this troubleshooting section, be aware that the smaller the value selected for mvMultiCoreAcquisitionCoreSwitchInterval, the higher the additional overhead will become. Extensive testing has shown that values around 64 - 128 result in a good compromise between improved reliability and introduced overhead, which then becomes almost undetectable.
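To get a feel for the numbers, the rate of core switches can be estimated from the packet rate. The link speed and packet size below are assumptions for illustration only (a fully loaded 10 Gbit/s link with 8192-byte packets), not values from this manual:

```python
# Rough estimate of how often the receiving CPU core changes.
# Assumed: 10 Gbit/s link at full rate, 8192-byte network packets.
link_bit_rate = 10_000_000_000
packet_size_bytes = 8192
packets_per_second = link_bit_rate / (packet_size_bytes * 8)  # ~152588

for switch_interval in (64, 128, 256):
    switches_per_second = packets_per_second / switch_interval
    print(switch_interval, round(switches_per_second))
```

Under these assumptions an interval of 64 already keeps the switch rate in the low thousands per second, which matches the observation that the overhead becomes almost undetectable in the 64 - 128 range.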
Switching the CPU core every now and then in combination with parallel processing, however, introduces another aspect to the incoming traffic that is usually only encountered with LAG (Link Aggregation) configurations: it is no longer guaranteed that the driver receives the network packets in order, so a driver needs to be aware of this!
Selecting many CPU cores for processing in combination with a large switch interval (256 or greater) will reduce the overhead to a minimum but might cause the packets to arrive far out of order, since several RSS queues might raise interrupts and the order in which these are served is not defined.
Using a very small value for mvMultiCoreAcquisitionCoreSwitchInterval in combination with an interrupt moderation scheme that results in a very low number of interrupts might also have a negative effect: all processors will then wait until their queues are almost full before signaling an interrupt. With a small switch interval this will likely happen on all queues at roughly the same time, and certain queues might then overrun since every core wants to process data at the same moment. Another aspect of this is that the number of receive descriptors allocated by the NIC divided by the number of RSS queues should always be larger than the value of mvMultiCoreAcquisitionCoreSwitchInterval, since otherwise the card might actually run out of receive descriptors before getting rid of its data.
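The descriptor rule above can be expressed as a simple sanity check. The function name and the example numbers are hypothetical; consult the NIC driver settings for the actual descriptor and queue counts:

```python
def descriptors_sufficient(rx_descriptors, rss_queues, switch_interval):
    """Check the rule sketched above: each RSS queue's share of the
    NIC's receive descriptors should exceed the switch interval,
    otherwise the NIC may run out of descriptors before the data
    has been fetched."""
    return rx_descriptors / rss_queues > switch_interval

# Example: 2048 descriptors shared by 8 queues gives 256 per queue,
# which is fine for a switch interval of 128 but not for 512.
print(descriptors_sufficient(2048, 8, 128))  # True
print(descriptors_sufficient(2048, 8, 512))  # False
```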