(mysql.info) performance-figures
15.7.2 Understanding the Impact of Cluster Interconnects
--------------------------------------------------------
The `ndbd' process has a number of simple constructs which are used to
access the data in a MySQL Cluster. We have created a very simple
benchmark to check the performance of each of these and the effects
which various interconnects have on their performance.
There are four access methods:
* *Primary key access*
This is access of a record through its primary key. In the
simplest case, only one record is accessed at a time, which means
that the full cost of setting up a number of TCP/IP messages and of
the accompanying context switches is borne by this single request.
When multiple primary key accesses are sent in one batch, those
accesses share the cost of setting up the necessary TCP/IP messages
and context switches. If the TCP/IP messages are for different
destinations, additional TCP/IP messages need to be set up.
* *Unique key access*
Unique key accesses are similar to primary key accesses, except
that a unique key access is executed as a read on an index table
followed by a primary key access on the table. However, only one
request is sent from the MySQL Server, and the read of the index
table is handled by `ndbd'. Such requests also benefit from
batching.
* *Full table scan*
When no indexes exist for a lookup on a table, a full table scan
is performed. This is sent as a single request to the `ndbd'
process, which then divides the table scan into a set of parallel
scans on all cluster `ndbd' processes. In future versions of MySQL
Cluster, an SQL node will be able to filter some of these scans.
* *Range scan using ordered index*
When an ordered index is used, a scan is performed in the same
manner as a full table scan, except that only those records in the
range used by the query transmitted by the MySQL server (SQL node)
are scanned. All partitions are scanned in parallel when all bound
index attributes include all attributes in the partitioning key.
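The effect of batching described for key accesses can be sketched with a
simple amortization model. This is only an illustration: the function
name `cost_per_access' and the two cost constants are hypothetical and
are not measurements from MySQL Cluster. The model assumes one fixed
setup cost per request (messages and context switches) that is shared
evenly by every access in the batch.

```python
def cost_per_access(setup_cost_us, work_cost_us, batch_size):
    """Average cost of one key access when batch_size accesses share one setup."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # The setup cost is paid once per request and divided across the batch;
    # the per-record work is paid by every access.
    return setup_cost_us / batch_size + work_cost_us


# Hypothetical figures, chosen only for illustration.
SETUP_US = 380.0  # messaging and context-switch cost paid once per request
WORK_US = 20.0    # per-record work done in the data node

if __name__ == "__main__":
    for batch in (1, 10, 100):
        print(batch, cost_per_access(SETUP_US, WORK_US, batch))
```

As the batch grows, the average cost approaches the per-record work, so
the setup cost stops dominating.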
To check the base performance of these access methods, we have
developed a set of benchmarks. One such benchmark, `testReadPerf',
tests simple and batched primary and unique key accesses. This
benchmark also measures the setup cost of range scans by issuing scans
returning a single record. There is also a variant of this benchmark
which uses a range scan to fetch a batch of records.
In this way, we can determine the cost of both a single key access and
a single record scan access, and measure the impact of the
communication media on the base access methods.
In our tests, we ran the base benchmarks for both a normal transporter
using TCP/IP sockets and a similar setup using SCI sockets. The figures
reported in the following table are for small accesses of 20 records
per access. The difference between serial and batched access decreases
by a factor of 3 to 4 when using 2KB records instead. SCI Sockets were
not tested with 2KB records. Tests were performed on a cluster with 2
data nodes running on 2 dual-CPU machines equipped with AMD MP1900+
processors.
*Access Type*        *TCP/IP Sockets*      *SCI Sockets*
Serial pk access     400 microseconds      160 microseconds
Batched pk access    28 microseconds       22 microseconds
Serial uk access     500 microseconds      250 microseconds
Batched uk access    70 microseconds       36 microseconds
Indexed eq-bound     1250 microseconds     750 microseconds
Index range          24 microseconds       12 microseconds
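The relative gains implied by these figures can be worked out directly.
The following sketch uses only the numbers reported in the table; the
names `FIGURES_US', `speedup' and `batching_gain' are introduced here
for illustration.

```python
# Latencies in microseconds as (TCP/IP sockets, SCI sockets),
# copied from the table above.
FIGURES_US = {
    "Serial pk access": (400, 160),
    "Batched pk access": (28, 22),
    "Serial uk access": (500, 250),
    "Batched uk access": (70, 36),
    "Indexed eq-bound": (1250, 750),
    "Index range": (24, 12),
}


def speedup(tcp_us, sci_us):
    """How many times faster SCI sockets are than TCP/IP sockets."""
    return tcp_us / sci_us


def batching_gain(serial_us, batched_us):
    """How many times cheaper a batched access is than a serial one."""
    return serial_us / batched_us


if __name__ == "__main__":
    for name, (tcp_us, sci_us) in FIGURES_US.items():
        print(f"{name:18s} SCI speedup: {speedup(tcp_us, sci_us):.2f}x")

    # Gain from batching primary key accesses, per interconnect.
    serial_pk = FIGURES_US["Serial pk access"]
    batched_pk = FIGURES_US["Batched pk access"]
    print("pk batching gain, TCP/IP:", round(batching_gain(serial_pk[0], batched_pk[0]), 2))
    print("pk batching gain, SCI:   ", round(batching_gain(serial_pk[1], batched_pk[1]), 2))
```

Batching narrows the gap between the two interconnects for primary key
accesses, since the per-request communication cost is already shared
across the batch.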
We also performed another set of tests to check the performance of SCI
Sockets vis-à-vis that of the SCI transporter, and both of these as
compared with the TCP/IP transporter. All these tests used primary key
accesses either serially and multi-threaded, or multi-threaded and
batched.
The tests showed that SCI sockets were about 100% faster than TCP/IP.
The SCI transporter was faster in most cases compared to SCI sockets.
One notable case occurred with many threads in the test program, which
showed that the SCI transporter did not perform very well when used for
the `mysqld' process.
Our overall conclusion was that, for most benchmarks, using SCI sockets
improves performance by approximately 100% over TCP/IP, except in rare
instances when communication performance is not an issue. This can
occur when scan filters make up most of the processing time or when
very large batches of primary key accesses are used. In such cases,
the CPU processing in the `ndbd' processes becomes a fairly large part
of the overhead.
Using the SCI transporter instead of SCI Sockets is only of interest in
communicating between `ndbd' processes. It is also only of interest if
a CPU can be dedicated to the `ndbd' process, because the SCI
transporter ensures that this process never goes to sleep. It is also
important to ensure that the `ndbd' process priority is set in such a
way that the process does not lose priority due to running for an
extended period of time; in Linux 2.6, this can be done by locking
processes to CPUs. If such a configuration is possible, the `ndbd'
process will benefit by 10-70% as compared with using SCI sockets.
(The larger figures will be seen when performing updates and probably
on parallel scan operations as well.)
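As an illustration of locking a process to a CPU on Linux, the
following sketch uses Python's `os.sched_setaffinity', which is
available only on platforms that support it. The helper name
`pin_to_cpu' is hypothetical, and the demonstration pins the calling
process rather than a real `ndbd' process. It shows CPU affinity only
and does not change scheduling priority.

```python
import os


def pin_to_cpu(pid, cpu):
    """Restrict process `pid` to a single CPU; pid 0 means the calling process."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not available on this platform")
    os.sched_setaffinity(pid, {cpu})
    # Return the resulting affinity mask so the caller can verify it.
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        original = os.sched_getaffinity(0)
        cpu = min(original)  # pick a CPU this process is already allowed to use
        print("pinned to:", pin_to_cpu(0, cpu))
        # Restore the original mask so the demonstration has no lasting effect.
        os.sched_setaffinity(0, original)
    else:
        print("CPU affinity is not supported on this platform")
```

To pin another process, its process ID would be passed in place of 0;
doing so generally requires appropriate privileges.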
There are several other optimized socket implementations for computer
clusters, including Myrinet, Gigabit Ethernet, InfiniBand and the VIA
interface. We have tested MySQL Cluster so far only with SCI sockets.
See sci-sockets for information on how to set up SCI sockets
using ordinary TCP/IP for MySQL Cluster.