Performance

Vortex OpenSplice is the fastest ultra-low-latency messaging middleware, enabling several million data updates per second while retaining stability and reliability on commodity multi-core PC hardware and 1 Gigabit Ethernet.

The rich set of supported Qualities of Service (QoS) enables Vortex OpenSplice to be tuned for maximum performance across an extremely wide range of use cases and deployment scenarios, from embedded systems to systems of systems.
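As an illustration, the following minimal sketch shows how a Data Writer might be tuned for reliable, low-latency delivery using the classic DDS C API. The publisher and topic handles, and the particular QoS values chosen, are assumptions for the example rather than the settings used in these benchmarks.

    /* Sketch: tuning a DataWriter for reliable, low-latency delivery
     * with the classic DDS C API. 'publisher' and 'topic' are assumed
     * to have been created beforehand. */
    DDS_DataWriterQos *wqos = DDS_DataWriterQos__alloc();
    DDS_Publisher_get_default_datawriter_qos(publisher, wqos);

    wqos->reliability.kind = DDS_RELIABLE_RELIABILITY_QOS;  /* no sample loss   */
    wqos->history.kind     = DDS_KEEP_LAST_HISTORY_QOS;     /* latest value only */
    wqos->history.depth    = 1;
    wqos->latency_budget.duration.sec     = 0;              /* deliver with no   */
    wqos->latency_budget.duration.nanosec = 0;              /* added delay       */

    DDS_DataWriter writer = DDS_Publisher_create_datawriter(
        publisher, topic, wqos, NULL, DDS_STATUS_MASK_NONE);
    DDS_free(wqos);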

Vortex OpenSplice also features negligible inter-core and inter-processor communication latency thanks to its shared memory architecture.

The following benchmark data demonstrates the exceptional performance of Vortex OpenSplice when compared to other implementations and alternative messaging technologies.

Test Environment

  • Vortex OpenSplice Enterprise v6.4.x
  • 2 * Intel(R) Xeon(R) CPU E3-1270 V2 @ 3.50GHz (4 cores with hyperthreading)
    • Machine names: perftest8, perftest9
    • 16 GB RAM
    • Disk: WDC WD5003ABYX-1 Rev 01.0
  • Gigabit Ethernet
    • Switch: Dell PowerConnect 2816
  • OS
64-bit Linux
    • Linux 3.8.13-rt14.20.el6rt.x86_64 #1 SMP PREEMPT RT Mon Aug 19 23:09:43 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

Latency Benchmarks

The following curve shows the latency, measured in microseconds, to send a message between a DDS Data Writer (Producer) on one node and a DDS Data Reader on another node. To avoid time-synchronization issues, latency was measured by sending the message from one node to the other and echoing it immediately back; one-way latency was then calculated by dividing the round-trip time by two. The test was repeated up to a maximum payload of 64 Kbytes. In this test Vortex OpenSplice's C API was used with the standard DDSI interoperability protocol, and the message exchanges were reliable.
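A sketch of this measurement loop is shown below; write_ping() and take_echo() are hypothetical helpers standing in for the actual DDS write on the ping topic and blocking take on the echo topic.

    #include <stddef.h>
    #include <time.h>

    void write_ping(size_t payload_size);  /* hypothetical: DDS write, ping topic  */
    void take_echo(void);                  /* hypothetical: blocking take, echo topic */

    static double one_way_latency_us(size_t payload_size)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        write_ping(payload_size);          /* node A -> node B                */
        take_echo();                       /* node B echoes the sample back   */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double rtt_us = (t1.tv_sec - t0.tv_sec) * 1e6
                      + (t1.tv_nsec - t0.tv_nsec) / 1e3;
        return rtt_us / 2.0;               /* one-way latency = round trip / 2 */
    }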

[Figure: DDS latency benchmarks]

Both the average and minimum latency values were measured for each payload size. The curves show that excellent latency can be achieved even when the payload size is significant: for payloads up to 2 Kbytes, the latency was less than 100 microseconds.

The test was also repeated using Vortex OpenSplice's proprietary RTNetworking Service, with very similar results.

On a zoomed-in scale it can be seen that the difference between the median and minimum values across the range of data points is very small, demonstrating that data delivery times are highly predictable and that Vortex OpenSplice is suitable for use in systems requiring high levels of determinism.

[Figure: End-to-end latencies, C API]

The following curve illustrates the latency difference between Vortex OpenSplice's Java and C APIs. The Java APIs impose only a very small additional overhead and can therefore be used efficiently even in systems where performance is critical.

[Figure: End-to-end latency benchmarks, C and Java APIs]

Throughput Benchmarks

The following curves demonstrate the sustainable application-level throughput, measured in Mbit/s, that can be achieved between a DDS Data Writer (Producer) on one node and a DDS Data Reader on another node over a Gigabit Ethernet connection. Taking into account the overhead of Ethernet and UDP, the approximate bandwidth available to the application is 950 Mbit/s. The test was repeated using both the standard default DDSI protocol and Vortex OpenSplice's optional RTNetworking Service. The curves show that even for reasonably small packet sizes, throughput is limited by the network and not by Vortex OpenSplice.
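The ~950 Mbit/s figure can be checked with a back-of-the-envelope calculation, assuming full 1500-byte MTU frames; the per-frame overheads below are standard Ethernet/IPv4/UDP values, not measurements from this test.

    /* Back-of-the-envelope check of the ~950 Mbit/s usable bandwidth,
     * assuming full 1500-byte MTU frames over 1 Gbit/s Ethernet. */
    #include <stdio.h>

    int main(void)
    {
        const double link_mbps  = 1000.0;           /* raw line rate              */
        const double eth_extra  = 8 + 14 + 4 + 12;  /* preamble + header + FCS + gap */
        const double ip_udp_hdr = 20 + 8;           /* IPv4 + UDP headers         */
        const double mtu        = 1500.0;

        double payload    = mtu - ip_udp_hdr;            /* 1472 bytes per packet */
        double efficiency = payload / (mtu + eth_extra); /* 1472 / 1538 ~ 0.957   */
        printf("usable bandwidth ~ %.0f Mbit/s\n", link_mbps * efficiency);
        /* prints ~957 Mbit/s; protocol headers above UDP (e.g. DDSI)
         * bring this down to roughly 950 Mbit/s */
        return 0;
    }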

[Figure: Un-batched throughput benchmarks]

The following curve shows throughput measured in Kmsg/s. For small messages, the Vortex OpenSplice RTNetworking Service delivers better throughput than the default DDSI protocol.

Vortex OpenSplice provides a Streams API that supports transparent packing and queuing of data samples into auto-generated 'containers' (batching), thus minimizing the overhead normally associated with the management and distribution of individual DDS samples. The Streams API is a particularly useful feature for any system where small topics are published at high frequency, such as 'periodic updates' or streaming data.
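The following sketch illustrates the batching idea in plain C; the Sample and Batch types and the flush_batch() helper are purely illustrative and do not reflect the actual Streams API signatures.

    /* Conceptual illustration of sample batching: many small samples are
     * packed into one container and written as a single, larger DDS
     * sample, amortizing the per-sample overhead. Illustrative only. */
    #define BATCH_SIZE 256

    struct Sample { long id; double value; };   /* small, high-rate update */
    struct Batch  { struct Sample items[BATCH_SIZE]; int count; };

    void flush_batch(struct Batch *b);          /* hypothetical: writes the
                                                   container as one DDS sample */
    static struct Batch batch;

    static void append(struct Sample s)
    {
        batch.items[batch.count++] = s;
        if (batch.count == BATCH_SIZE) {        /* one network write per    */
            flush_batch(&batch);                /* BATCH_SIZE small samples */
            batch.count = 0;
        }
    }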

The following curve illustrates the batched throughput performance of Vortex OpenSplice using the Streams API in comparison to the un-batched performance. It can be seen that for both the DDSI and RTNetworking protocols the batched throughput is 5x the achievable un-batched performance. The curves also show that the network bandwidth limit can be reached for sample sizes as low as 16 bytes.

[Figure: Batched messaging throughput performance]

For a more complete set of Vortex OpenSplice performance data, please contact PrismTech.

Scalability Benchmarks

Vortex OpenSplice provides excellent scalability for systems where many applications need to share data. In particular, when scaling the number of applications reading data on a single node, OpenSplice's shared memory architecture offers significant advantages.

The following curve shows the aggregate throughput that can be achieved on a single node. In a test with 20 applications running on a single node, an aggregate throughput of 19 Gbit/s was achieved with near-linear scalability (20 readers each consuming data at roughly the ~950 Mbit/s network-limited rate, and 20 x 950 Mbit/s = 19 Gbit/s). The curves demonstrate the advantage of Vortex OpenSplice's unique shared memory architecture, in which only one copy of a sample is physically present in shared memory regardless of the number of subscribers for that data, so data arriving from the network needs to be deserialized only once.

[Figure: DDS scalability benchmarks]