Accord: A high-performance coordination service for write-intensive workloads
Accord is a high-performance coordination service, similar to Apache ZooKeeper, that uses the Corosync Cluster Engine as its total-order messaging infrastructure.
- Accord is a distributed, transactional, and fully-replicated (No SPoF) Key-Value Store with strong consistency.
- Accord can scale out to tens of nodes.
- Accord servers can handle tens of thousands of clients.
- Any client can issue I/O requests, and any server can accept and process them.
- Other clients can be notified of the changes made by a client's write request.
- Accord detects when clients join or leave, and notifies the other clients of which client joined or left.
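The role of total-order messaging in keeping full replicas consistent can be illustrated with a toy model. The Python sketch below is not Accord's implementation or API: `Replica`, `TotalOrderBus`, and `watch` are hypothetical names, and the in-process bus only stands in for what Corosync provides across machines. It shows that when every replica applies the same writes in the same order, all replicas end in the same state, and that registered watchers can be told about each change as it is applied.

```python
class Replica:
    """One server holding a full copy of the key-value data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.watchers = []  # callbacks invoked for every applied write

    def watch(self, callback):
        self.watchers.append(callback)

    def apply(self, seq, key, value):
        # Each replica applies messages in the order the bus delivers them.
        self.data[key] = value
        for callback in self.watchers:
            callback(seq, key, value)


class TotalOrderBus:
    """In-process stand-in for a total-order messaging layer (assumption)."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.next_seq = 0

    def broadcast(self, key, value):
        # Assign one global sequence number, then deliver to every replica.
        seq = self.next_seq
        self.next_seq += 1
        for replica in self.replicas:
            replica.apply(seq, key, value)
        return seq


replicas = [Replica("a"), Replica("b"), Replica("c")]
bus = TotalOrderBus(replicas)

events = []
replicas[2].watch(lambda seq, key, value: events.append((seq, key, value)))

# Writes from different clients are ordered globally by the bus.
bus.broadcast("/lock/1", b"client-7")
bus.broadcast("/lock/1", b"client-9")
bus.broadcast("/conf/x", b"42")

# Every replica holds identical data after applying the same ordered stream.
assert all(r.data == replicas[0].data for r in replicas)
```

In this model the sequence number is what makes the outcome deterministic: two conflicting writes to `/lock/1` are resolved the same way on every replica because all of them see the writes in the same order.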
The differences between Accord and ZooKeeper are:
- Unlike ZooKeeper, Accord focuses on write-intensive workloads. ZooKeeper forwards all write requests to a master server, which can become a bottleneck under write-intensive workloads. The benchmark below shows that Accord's write throughput is much higher than ZooKeeper's (up to 18 times higher in persistent mode, and up to 20 times higher in in-memory mode).
- More flexible transactions are supported. A transaction can contain not only write, del, cmp, and copy operations, but also read operations.
- Both in-memory mode and persistent mode are supported.
- Message size is unbounded, and partial updates are supported.
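To make the idea of a transaction that mixes write, del, cmp, copy, and read operations more concrete, here is a minimal single-process Python sketch. It does not use Accord's client API: `ToyStore`, `TxAborted`, and the tuple-based operation format are hypothetical, and the exact semantics (a failed `cmp` or a `copy` from a missing key aborts the whole transaction) are assumptions made for illustration.

```python
import threading


class TxAborted(Exception):
    """Raised when a transaction cannot be applied; no changes are kept."""


class ToyStore:
    """Hypothetical in-memory store with all-or-nothing transactions."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def transaction(self, ops):
        """Apply all ops atomically, or none of them.

        Returns the values collected by 'read' operations, in order.
        """
        with self._lock:
            staged = dict(self._data)  # work on a copy, commit at the end
            reads = []
            for op in ops:
                kind = op[0]
                if kind == "write":
                    _, key, value = op
                    staged[key] = value
                elif kind == "del":
                    _, key = op
                    staged.pop(key, None)
                elif kind == "cmp":
                    _, key, expected = op
                    if staged.get(key) != expected:
                        raise TxAborted(f"cmp failed on {key!r}")
                elif kind == "copy":
                    _, src, dst = op
                    if src not in staged:
                        raise TxAborted(f"copy source {src!r} is missing")
                    staged[dst] = staged[src]
                elif kind == "read":
                    _, key = op
                    reads.append(staged.get(key))
                else:
                    raise ValueError(f"unknown operation {kind!r}")
            self._data = staged  # commit only if every op succeeded
            return reads


store = ToyStore()
store.transaction([("write", "/lock/owner", b"free")])

# Acquire the lock only if it is currently free, then read back the result.
reads = store.transaction([
    ("cmp", "/lock/owner", b"free"),
    ("write", "/lock/owner", b"client-1"),
    ("copy", "/lock/owner", "/lock/last_owner"),
    ("read", "/lock/last_owner"),
])

# A second client's attempt fails at the cmp step and changes nothing.
try:
    store.transaction([
        ("cmp", "/lock/owner", b"free"),
        ("write", "/lock/owner", b"client-2"),
    ])
    second_acquired = True
except TxAborted:
    second_acquired = False
```

The compare-then-write pattern is what makes such transactions useful for coordination: the write takes effect only if the compared value is still what the client expected.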
You can use the read/write/del/transaction APIs to build high-performance distributed systems. API documentation is available from here
- The benchmark sends 8-byte write requests from 16 threads using asynchronous I/O.
- This data size may seem too small, but the data typically kept in a coordination service is small. If the application is a distributed lock manager, for instance, each value paired with a key is 64 bits.
This data characteristic is also documented on the ZooKeeper project site.
- This micro benchmark shows that Accord achieves up to 18 times higher throughput in persistent mode, and up to 20 times higher throughput in in-memory mode.
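As an illustration of why 8 bytes is a realistic payload, a lock manager could store just the 64-bit identifier of the lock holder as the value. The Python sketch below assumes a hypothetical key name and a little-endian encoding; neither is prescribed by Accord.

```python
import struct


def encode_owner(client_id: int) -> bytes:
    """Pack a client identifier into an unsigned 64-bit little-endian value."""
    return struct.pack("<Q", client_id)


def decode_owner(value: bytes) -> int:
    """Unpack the 8-byte value back into the client identifier."""
    return struct.unpack("<Q", value)[0]


key = "/locks/resource-42"  # hypothetical key name
value = encode_owner(7)     # the whole value stored under the key

assert len(value) == 8
assert decode_owner(value) == 7
```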
- BerkeleyDB (Backend storage)
- Corosync (Total-order messaging infrastructure)
- Accord (Server Daemon)
- Accord-client (a binary linked with libbdr)
Accord is designed for write-intensive workloads, so the intended applications include:
- A distributed lock manager whose lock operations arrive at a high frequency from thousands of clients.
- A metadata management service for distributed storage such as Sheepdog or HDFS.
- A replicated message queue or logger (for instance, a replicated RabbitMQ).
and so on.
- One or more x86-64 machines.
- Linux kernel 2.6.27 or later
- glibc 2.9 or later
- The corosync and corosync lib package.
- The BerkeleyDB lib package.
- Compile-time dependencies
- GNU Autotools
- corosync devel package
- git (when compiling from source repo)
- svn (when compiling from source repo)
- nss devel packages (when compiling corosync from source)
- groff packages (when compiling corosync from source)
- accord dependencies
- BerkeleyDB (libdb) 4.8 or later
- Test-time dependencies (For running make check)
For Debian package based systems:
$ sudo aptitude install corosync libcorosync-dev libdb4.8-dev
For RPM package based systems:
$ sudo yum install corosynclib-devel db4-devel
If your distribution doesn't provide the corosync packages, or you prefer to compile from source:
$ svn co http://svn.fedorahosted.org/svn/corosync/branches/flatiron
$ cd flatiron
$ sudo make install
$ cd ..
See also: https://github.com/collie/sheepdog/wiki/Corosync-config
NOTE: If you run a Sheepdog cluster in the same network segment, specify a "mcastport" parameter different from the one Sheepdog uses.
Download and build Accord
$ git clone git://github.com/collie/accord.git
$ cd accord
- API documentation is here
- The tests in $ACCORD_HOME/test are very useful for understanding how to use the Accord APIs.
- Support RHEL5.
- Supporting ZooKeeper compatibility.
- Adding servers without reboot.
- Rewriting the key-value store backend for Accord. BDB seems to be a bottleneck not only in disk-sync mode, but also in in-memory mode.
- Improving write performance by creating new cpg groups.
- Adding language bindings (Python, Java, Erlang, etc.)
- Reducing notification latency.
- Read-only node (Observer in ZooKeeper) support.
- More benchmarks.
- Source code of Accord is available from github.
- This software is experimental and under development. Therefore, this software is provided without support and without any obligation on the part of NTT Laboratories to assist in its use, correction, modification or enhancement. There is no guarantee that this software will be included in future software releases, and it probably will not be included.
THIS SOFTWARE IS PROVIDED "AS IS" WITH NO WARRANTIES OF ANY KIND INCLUDING THE WARRANTIES OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, OR ARISING FROM A COURSE OF DEALING, USAGE OR TRADE PRACTICE.