In this blog post we share some benchmarking results for OpenStack Swift. Cloudwatt, a company we work with, lent us some servers to run a benchmark on a Swift installation. We’ll describe the hardware, the tool and the methodology we used.
Swiftstack develops a great tool called ssbench, designed to benchmark OpenStack Swift; you can find it on GitHub. Ssbench is based on scenario files that let the user describe which kinds of operations ssbench’s workers will perform against the Swift cluster. The architecture of ssbench is really handy, as it is composed of a master process and one or more worker processes. Workers are connected to the master through a message queue bus, so you can spread workers across many hosts, allowing you to evaluate large Swift clusters.
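As a sketch of this master/worker split, a typical run starts workers first and then drives a scenario from the master. The hostnames, account and credentials below are placeholders, and the exact flags should be checked against `ssbench-master --help` for your version of the tool:

```shell
# On each load-generation host: start a worker, pointing it at the
# master's message queue (the trailing argument is a worker id).
ssbench-worker --zmq-host master.example.com 1

# On the master host: run a scenario against the cluster via tempauth
# (auth URL, account:user and key are placeholders for your own setup).
ssbench-master run-scenario -f create-24k.scenario \
    -A http://proxy.example.com/auth/v1.0 -U benchacct:benchuser -K benchpass
```

Because the workers only need to reach the master’s queue and the Swift proxy, adding more load-generation hosts is just a matter of repeating the `ssbench-worker` line elsewhere.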
The cluster architecture we used is composed of 7 DELL R720xd servers (2 x Intel(R) Xeon(R) CPU E5-2630L 0 @ 2.00GHz, i.e. 24 threads) with 32GB of RAM each. The servers are connected to a 10Gb switch through 10Gb NICs. Each host has 25 internal hard drives and 12 external ones (in a DELL MD1200 enclosure); each drive is 1TB.
The Linux distribution on each host is Debian Wheezy and the OpenStack Swift version is 1.7.5. The Swift configuration is almost the default, using tempauth and one memcache server.
We ran some tests to reach the best performance with those hosts and concluded that dedicating two of them to the swift-proxy process works best. HAProxy is installed on one of them to spread requests across both proxies. The five remaining hosts are used for account, container and object storage.
The Swift ring has been configured with 3 replicas and the following devices:
– 24 devices for storing objects on each server
– 9 devices for storing containers on each server
– 3 devices for storing accounts on each server
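For reference, rings with that device layout would be built with swift-ring-builder along these lines. This is a command sketch, not our exact build script: the part power of 18 and min_part_hours of 1 are example values, and the zone, IP and device names are placeholders to be repeated for every device on every storage node:

```shell
# Object ring: example part power 18, 3 replicas, min_part_hours 1
swift-ring-builder object.builder create 18 3 1
# One "add" per device; z1-10.0.0.1:6000/sdb is a placeholder entry
swift-ring-builder object.builder add z1-10.0.0.1:6000/sdb 100
swift-ring-builder object.builder rebalance

# Container and account rings follow the same pattern,
# conventionally on ports 6001 and 6002
swift-ring-builder container.builder create 18 3 1
swift-ring-builder account.builder create 18 3 1
```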
And each storage server has been sized this way:
– 17 workers for the object server
– 8 workers for the container server
– 3 workers for the account server
Swift proxies are configured to run 24 proxy workers, and the HAProxy policy is a simple round robin.
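These worker counts map onto the standard Swift server configuration files. A sketch of the relevant lines follows; the paths and section names are the stock Swift 1.7.5 defaults, so adjust them to your own layout:

```
# /etc/swift/object-server.conf
[DEFAULT]
workers = 17

# /etc/swift/container-server.conf
[DEFAULT]
workers = 8

# /etc/swift/account-server.conf
[DEFAULT]
workers = 3

# /etc/swift/proxy-server.conf  (on each proxy host)
[DEFAULT]
workers = 24
```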
As mentioned above, ssbench needs a scenario file that describes the operations to perform on Swift. We have written various simple scenarios to benchmark a specific object size and, for each object size, the main C.R.U.D. operations (Create, Read, Update, Delete). Below is one of our scenarios, targeted at performing only create operations on 24KB objects:
{
    "name": "Pure create scenario",
    "sizes": [{
        "name": "Small files (24K)",
        "size_min": 24576,
        "size_max": 24576
    }],
    "initial_files": {
        "Small files (24K)": 100
    },
    "container_count": 100,
    "operation_count": 5000,
    "crud_profile": [100, 0, 0, 0],
    "user_count": 20
}
One of the main problems was to correctly adjust the user concurrency (user_count). A value of 20 simulates 20 clients fairly spread over the ssbench worker processes, but how do we know whether our Swift cluster can handle more? To find this concurrency limit we created a small bash wrapper that manages multiple ssbench runs, increasing the concurrency value at each run. Once the operations/sec count remains the same across two runs, we stop and keep the last concurrency value as the reference. The wrapper uses a specific scenario whose C.R.U.D. profile is 25/25/25/25. We then run our specific CREATE, READ, UPDATE and DELETE scenarios at this concurrency over 30000 operations.
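The control flow of that wrapper can be sketched as below. This is a minimal illustration, not our actual script: `run_bench` is a stub that returns fake ops/sec numbers so the ramp-up logic can be followed; in the real wrapper it would invoke ssbench-master and parse the reported throughput.

```shell
#!/bin/sh
# Sketch of the concurrency-ramp wrapper (names and values hypothetical).

# Stub standing in for "run ssbench at $1 users and print the ops/sec":
# pretend throughput grows linearly, then saturates at 80 users.
run_bench() {
    users=$1
    if [ "$users" -lt 80 ]; then echo "$((users * 50))"; else echo 4000; fi
}

users=20   # starting user_count, as in the scenario file
step=20    # how much to raise the concurrency between runs
prev=0     # ops/sec achieved by the previous run
while :; do
    ops=$(run_bench "$users")
    # Stop once raising user_count no longer improves ops/sec.
    if [ "$ops" -le "$prev" ]; then
        users=$((users - step))   # last value that still improved throughput
        break
    fi
    prev=$ops
    users=$((users + step))
done
echo "reference concurrency: $users"
```

With this stub the loop stops at the saturation point and prints `reference concurrency: 80`; the real wrapper keeps that value as the user_count for the per-operation scenarios.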
The chart below shows the performance reached, in operations per second, for different object sizes: 24KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB and 2800KB. The last object size corresponds to the average object size stored in Cloudwatt’s object store (take a look at the swift-account-stats project). For each object size there are 4 bars, one for each kind of operation. Note that the Y axis is on a logarithmic scale.
Performance remains similar up to 64KB objects and then begins to decrease. The cluster can handle 7000 read ops/second and 3800 create/update ops/second for objects up to 64KB.
This chart shows the same results as above but displays the bandwidth in MB/second, revealing that we quickly reached the network bandwidth limit for objects of 512KB and above. The network architecture used for this benchmark was undersized: we should have used at least 2 NICs on the proxy hosts, one for client inbound/outbound data (the ssbench workers) and one for access to the storage network.
During high load, three components are heavily solicited: CPU, disk I/O and network bandwidth. Our benchmark tests showed us that the proxy processes are really CPU intensive, so you need to size your proxy hosts carefully when designing your cluster. On storage nodes, disk performance is crucial, and having some disks that perform badly can significantly decrease the overall performance of the cluster. Be sure to evaluate hard disks before integrating them into your cluster (fio is a good tool to benchmark storage devices).
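As an example of such a pre-integration check, a short random-write fio run gives a quick read on a candidate drive. The target path below is a placeholder, and the run writes to it, so point it at a scratch file on the drive under test, never at live data:

```shell
# Quick 60s random-write test of a candidate drive (writes to the target file).
fio --name=randwrite --filename=/mnt/sdb1/fio.test --size=1G \
    --rw=randwrite --bs=4k --ioengine=libaio --iodepth=16 \
    --direct=1 --runtime=60 --time_based --group_reporting
```

Running the same command on every drive and comparing the reported IOPS makes outliers easy to spot before they drag down the whole ring.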
There’s a slight contradiction in the blog. You said:
“Below this is one of our scenario targeted to only perform read operation on 24KB objects:”
however the CRUD profile of [100, 0, 0, 0] in the blog indicates that only PUTs will be performed. There will be no reads, updates, or deletes performed.
The “crud_profile”: [100, 0, 0, 0] indicates 100% CREATE operations, not READs as stated in the text.
What HDD models did you use?
Hello Andrey!
Sorry for the late response. As you read in the article, each storage node had 25 internal hard drives and 12 external disks behind a DELL MD1200 enclosure.
The internal disks were SEAGATE ST91000640SS; the external disks were SEAGATE ST1000NM0001.
Sorry, I’m new to the ssbench tool. I have a simple topology with just one storage node and one ssbench machine (Red Hat Linux):
storage node————–ssbench
I should be able to run the ssbench worker process and the ssbench master on the same ssbench machine. Correct?
Thanks
Hello,
Yes, that is usually the first way to run ssbench: the ssbench master and one or more workers running on the same host. Running workers on one or more other hosts is useful when you cannot reach the request-handling capacity limit of your Swift cluster. This can occur when the outbound bandwidth or CPU power of your ssbench host is limited, but with your architecture this will probably not happen.
You can also have a look at https://github.com/openstack/swift-bench, which is a bit simpler to start benchmarking Swift with.
Cheers,
Fabien
I don’t see any charts in the post, only ‘this image failed to load’.