<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Comparing Key/Value Stores</title>
	<atom:link href="http://pl.atyp.us/wordpress/?feed=rss2&#038;p=2417" rel="self" type="application/rss+xml" />
	<link>http://pl.atyp.us/wordpress/?p=2417</link>
	<description>Making the world better, one byte at a time.</description>
	<lastBuildDate>Sun, 05 Sep 2010 02:56:27 -0400</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: kunthar</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141994</link>
		<dc:creator>kunthar</dc:creator>
		<pubDate>Sat, 31 Oct 2009 13:02:59 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141994</guid>
		<description>I really would like to see second round benchmarks with a little help from owners.
Just pinning this message to keep fire alive...</description>
		<content:encoded><![CDATA[<p>I really would like to see second round benchmarks with a little help from owners.<br />
Just pinning this message to keep fire alive&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Darcy</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141575</link>
		<dc:creator>Jeff Darcy</dc:creator>
		<pubDate>Wed, 28 Oct 2009 11:31:41 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141575</guid>
		<description>Yes, Justin, I used the JSON/HTTP interface because the descriptions of the lower-level interface were Erlang-specific and inadequate.  If you like, in the next round I&#039;ll reverse engineer enough of that interface to use it.  Would that make you happy?  Would you like to place any bets on how much that will change the overall picture?  It&#039;s worth noting that Riak trailed even tabled in the initial results, even though tabled also uses an HTTP interface and offers more functionality.  If your HTTP implementation is so bad that it&#039;s the only culprit, then that&#039;s useful information all by itself.  I&#039;ve seen people play the &quot;benchmark this, recommend that&quot; game for twenty years and I have no tolerance for it.

As for sharing more detailed results, I&#039;ve already said I&#039;ll provide them when I&#039;ve run what we all seem to agree are the more meaningful tests.  I&#039;m just providing a first look here, not a last word.  How are impressions now and details later worse than just details later?  How are they worse than nothing now and nothing later, which seems to be what most of the developers for these systems are providing?  For example, this is from your own FAQ.
&lt;blockquote&gt;Performance depends on many factors, including hardware and network parameters as well as a great many tunable parameters in the way you set up your cluster and the way that you use Riak from an application. We&#039;ve found it to be fast enough for our purposes, and our goal is not to be &quot;fastest&quot; but rather to stay &quot;fast enough&quot; as the system grows, as hosts fail, and so on. That said, as soon as we get a chance to produce a general, reproducible benchmarking suite, we&#039;ll share it with you.&lt;/blockquote&gt;
So, how&#039;s that effort going?  Would you like to collaborate on developing such tests?  You clearly have some expertise to share, and that would be valuable.</description>
		<content:encoded><![CDATA[<p>Yes, Justin, I used the JSON/HTTP interface because the descriptions of the lower-level interface were Erlang-specific and inadequate.  If you like, in the next round I&#8217;ll reverse engineer enough of that interface to use it.  Would that make you happy?  Would you like to place any bets on how much that will change the overall picture?  It&#8217;s worth noting that Riak trailed even tabled in the initial results, even though tabled also uses an HTTP interface and offers more functionality.  If your HTTP implementation is so bad that it&#8217;s the only culprit, then that&#8217;s useful information all by itself.  I&#8217;ve seen people play the &#8220;benchmark this, recommend that&#8221; game for twenty years and I have no tolerance for it.</p>
<p>As for sharing more detailed results, I&#8217;ve already said I&#8217;ll provide them when I&#8217;ve run what we all seem to agree are the more meaningful tests.  I&#8217;m just providing a first look here, not a last word.  How are impressions now and details later worse than just details later?  How are they worse than nothing now and nothing later, which seems to be what most of the developers for these systems are providing?  For example, this is from your own FAQ.</p>
<blockquote><p>Performance depends on many factors, including hardware and network parameters as well as a great many tunable parameters in the way you set up your cluster and the way that you use Riak from an application. We&#8217;ve found it to be fast enough for our purposes, and our goal is not to be &#8220;fastest&#8221; but rather to stay &#8220;fast enough&#8221; as the system grows, as hosts fail, and so on. That said, as soon as we get a chance to produce a general, reproducible benchmarking suite, we&#8217;ll share it with you.</p></blockquote>
<p>So, how&#8217;s that effort going?  Would you like to collaborate on developing such tests?  You clearly have some expertise to share, and that would be valuable.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Justin Sheehy</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141541</link>
		<dc:creator>Justin Sheehy</dc:creator>
		<pubDate>Wed, 28 Oct 2009 04:08:18 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141541</guid>
		<description>(disclaimer: I work on Riak)

Jeff, I entirely agree with you that measurements that aren’t shared are effectively no measurements.  Can you please share your measurements and your benchmarking code?  I see no measurements here.

You are also very right that an efficient single-node data store is an essential building block for any distributed data store.  However, you&#039;re comparing basic building blocks with entire distributed systems, which is a little confusing to me.  Your point is a good one -- if you set aside TC&#039;s problems with recovery, it is among the many things whose shape makes it a possible underlying storage container for Riak.

You mentioned that this is an &quot;out of the box&quot; comparison, but Riak&#039;s &quot;out of the box&quot; configuration is optimized for its more common use cases... which are deployments on three or more machines.  Among other things, your test was almost certainly writing and reading three replicas of each object on Riak while the other systems were likely only using a single copy.  If you only wanted single-system behavior this is easy to set up in Riak, but the default behavior was almost certainly costing a great deal of wasted extra work in this situation.

You said that you used the best available protocols for each system, choosing native interfaces instead of HTTP, etc... but it sounds like you only did that for some systems.  It seems that you used Riak&#039;s HTTP interface instead of the native client.  While the HTTP interface is very nice and provides some useful features (and excellent interoperability) it certainly isn&#039;t very objective to choose it in this situation.  In addition to the protocol marshaling, that interface must do additional computation to produce headers such as Last-Modified and Etag as it goes out of its way to be very well-behaved and cache-friendly HTTP.

Without you providing any of your benchmark code, system specifications, or your output values and statistical methods, it is impossible to have any useful discussion about the &quot;results&quot;.  Just saying &quot;8-24x faster&quot; isn&#039;t just apples-to-oranges, it is also entirely vacuous without both data and repeatable methods backing it up.</description>
		<content:encoded><![CDATA[<p>(disclaimer: I work on Riak)</p>
<p>Jeff, I entirely agree with you that measurements that aren’t shared are effectively no measurements.  Can you please share your measurements and your benchmarking code?  I see no measurements here.</p>
<p>You are also very right that an efficient single-node data store is an essential building block for any distributed data store.  However, you&#8217;re comparing basic building blocks with entire distributed systems, which is a little confusing to me.  Your point is a good one &#8212; if you set aside TC&#8217;s problems with recovery, it is among the many things whose shape makes it a possible underlying storage container for Riak.</p>
<p>You mentioned that this is an &#8220;out of the box&#8221; comparison, but Riak&#8217;s &#8220;out of the box&#8221; configuration is optimized for its more common use cases&#8230; which are deployments on three or more machines.  Among other things, your test was almost certainly writing and reading three replicas of each object on Riak while the other systems were likely only using a single copy.  If you only wanted single-system behavior this is easy to set up in Riak, but the default behavior was almost certainly costing a great deal of wasted extra work in this situation.</p>
<p>You said that you used the best available protocols for each system, choosing native interfaces instead of HTTP, etc&#8230; but it sounds like you only did that for some systems.  It seems that you used Riak&#8217;s HTTP interface instead of the native client.  While the HTTP interface is very nice and provides some useful features (and excellent interoperability) it certainly isn&#8217;t very objective to choose it in this situation.  In addition to the protocol marshaling, that interface must do additional computation to produce headers such as Last-Modified and Etag as it goes out of its way to be very well-behaved and cache-friendly HTTP.</p>
<p>Without you providing any of your benchmark code, system specifications, or your output values and statistical methods, it is impossible to have any useful discussion about the &#8220;results&#8221;.  Just saying &#8220;8-24x faster&#8221; isn&#8217;t just apples-to-oranges, it is also entirely vacuous without both data and repeatable methods backing it up.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Darcy</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141511</link>
		<dc:creator>Jeff Darcy</dc:creator>
		<pubDate>Tue, 27 Oct 2009 22:57:24 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141511</guid>
		<description>Well, Dave, look at it this way: most of the comparisons out there are &lt;b&gt;entirely&lt;/b&gt; measurement-free.  I know this is just a tiny little step toward something more quantitative, but the longest journey begins with a single step.  Instead of sitting around doing nothing, or devising some test plan that would take more time and resources than I have to execute, I decided to start walking and talking as I go instead of waiting.  I seriously hope that more steps will be taken, either by myself or by others, but the important thing is that even lame measurements are better than no measurements (and measurements that aren&#039;t shared are effectively no measurements as far as anyone else is concerned).

As for how difficult it is to implement a more robust distributed system, I&#039;m pretty well aware.  There are enough bits of my background all over the site to make any convenient assumption to the contrary quite risible.  After all, you say you&#039;ve replicated Dynamo.  Why do you assume I couldn&#039;t as well, and BTW is your version available anywhere?  The 24:1 ratio is just a single observation, not the be-all and end-all of the observations I intend to make or expect others to make.  The fact is that an efficient single-node data store is an essential building block for any distributed data store.  If one alternative is 24x as fast as another, it&#039;s not at all unreasonable to consider that it might be worthwhile to make it properly distributed despite the difficulty.  In this particular case, it might well be possible to combine the upper layer from one of the already-distributed stores with a more efficient single-node store to get something better than either as it is now.  As I said in a &lt;a href=&quot;/wordpress/?p=2166&quot; rel=&quot;nofollow&quot;&gt;previous article&lt;/a&gt;, scalability is not the same as performance but it&#039;s no excuse for ignoring performance either.</description>
		<content:encoded><![CDATA[<p>Well, Dave, look at it this way: most of the comparisons out there are <b>entirely</b> measurement-free.  I know this is just a tiny little step toward something more quantitative, but the longest journey begins with a single step.  Instead of sitting around doing nothing, or devising some test plan that would take more time and resources than I have to execute, I decided to start walking and talking as I go instead of waiting.  I seriously hope that more steps will be taken, either by myself or by others, but the important thing is that even lame measurements are better than no measurements (and measurements that aren&#8217;t shared are effectively no measurements as far as anyone else is concerned).</p>
<p>As for how difficult it is to implement a more robust distributed system, I&#8217;m pretty well aware.  There are enough bits of my background all over the site to make any convenient assumption to the contrary quite risible.  After all, you say you&#8217;ve replicated Dynamo.  Why do you assume I couldn&#8217;t as well, and BTW is your version available anywhere?  The 24:1 ratio is just a single observation, not the be-all and end-all of the observations I intend to make or expect others to make.  The fact is that an efficient single-node data store is an essential building block for any distributed data store.  If one alternative is 24x as fast as another, it&#8217;s not at all unreasonable to consider that it might be worthwhile to make it properly distributed despite the difficulty.  In this particular case, it might well be possible to combine the upper layer from one of the already-distributed stores with a more efficient single-node store to get something better than either as it is now.  As I said in a <a href="/wordpress/?p=2166" rel="nofollow">previous article</a>, scalability is not the same as performance but it&#8217;s no excuse for ignoring performance either.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Smith</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141505</link>
		<dc:creator>Dave Smith</dc:creator>
		<pubDate>Tue, 27 Oct 2009 22:32:44 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141505</guid>
		<description>Fair enough re: # of client connections and duration -- rookie mistake on my part. :)

I think we&#039;ll have to agree to disagree about whether or not VMs are good testing grounds. At the root of my dislike is the hypervisor-introduced latency, particularly for disk access; I&#039;ve seen enough in production to make me very leery of using it for low-latency situations. Perhaps that&#039;s acceptable in some use cases.

I&#039;m not fully sure how to respond to your last comment block. Perhaps I should state my biases for starters. I&#039;ve tried using tokyo-cabinet in production and found it to be fast in the short term and then increasingly slow/latency-unhappy as the amount of data grew. In addition, there were also some significant issues with recovery of TC data in the scenario where the server or host fails; speed (for me) is pointless if you can&#039;t guarantee (most of) my data will be present in the event of a power failure. FWIW, I&#039;m now using embedded Inno and have found that to exceed 2x+ what TC could do in raw requests/sec, while providing decent data integrity.

I&#039;ve also benchmarked both Voldemort and Riak. Yes, V is ~4x as fast as Riak, but it crashes when you cross a threshold of data storage/memory usage. I&#039;ve built my own version of the Dynamo system and have been able to approach V in terms of speed. All this to say that it&#039;s harder, maybe far harder, than you seem to think to build something that provides the sort of data integrity, speed and multi-replica storage on top of Tyrant. So there&#039;s a trade-off there of time to implementation vs. raw perf -- that 24:1 number dwindles quickly when faced with the harsh realities of actually implementing a solution. Tyrant isn&#039;t a valid solution for distributed data stores right now; maybe it can be with time. To compare it against existing data stores and say it&#039;s 24 times faster is misleading and unfair, in my opinion. I appreciate your comment re: comparing first impressions with actual experience, but would argue that it&#039;s equally important to ensure that the tests are providing meaningful comparisons.

It&#039;s not my intention to be an ass about this -- but perhaps I&#039;m succeeding in spite of myself. :) I wish more people put the time and effort into benchmarking, and I thank you for the effort.</description>
		<content:encoded><![CDATA[<p>Fair enough re: # of client connections and duration &#8212; rookie mistake on my part. <img src='http://pl.atyp.us/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>I think we&#8217;ll have to agree to disagree about whether or not VMs are good testing grounds. At the root of my dislike is the hypervisor-introduced latency, particularly for disk access; I&#8217;ve seen enough in production to make me very leery of using it for low-latency situations. Perhaps that&#8217;s acceptable in some use cases.</p>
<p>I&#8217;m not fully sure how to respond to your last comment block. Perhaps I should state my biases for starters. I&#8217;ve tried using tokyo-cabinet in production and found it to be fast in the short term and then increasingly slow/latency-unhappy as the amount of data grew. In addition, there were also some significant issues with recovery of TC data in the scenario where the server or host fails; speed (for me) is pointless if you can&#8217;t guarantee (most of) my data will be present in the event of a power failure. FWIW, I&#8217;m now using embedded Inno and have found that to exceed 2x+ what TC could do in raw requests/sec, while providing decent data integrity.</p>
<p>I&#8217;ve also benchmarked both Voldemort and Riak. Yes, V is ~4x as fast as Riak, but it crashes when you cross a threshold of data storage/memory usage. I&#8217;ve built my own version of the Dynamo system and have been able to approach V in terms of speed. All this to say that it&#8217;s harder, maybe far harder, than you seem to think to build something that provides the sort of data integrity, speed and multi-replica storage on top of Tyrant. So there&#8217;s a trade-off there of time to implementation vs. raw perf &#8212; that 24:1 number dwindles quickly when faced with the harsh realities of actually implementing a solution. Tyrant isn&#8217;t a valid solution for distributed data stores right now; maybe it can be with time. To compare it against existing data stores and say it&#8217;s 24 times faster is misleading and unfair, in my opinion. I appreciate your comment re: comparing first impressions with actual experience, but would argue that it&#8217;s equally important to ensure that the tests are providing meaningful comparisons.</p>
<p>It&#8217;s not my intention to be an ass about this &#8212; but perhaps I&#8217;m succeeding in spite of myself. <img src='http://pl.atyp.us/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I wish more people put the time and effort into benchmarking, and I thank you for the effort.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Darcy</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141499</link>
		<dc:creator>Jeff Darcy</dc:creator>
		<pubDate>Tue, 27 Oct 2009 21:30:54 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141499</guid>
		<description>&lt;blockquote&gt;I respectfully posit that load testing on a VM is something of an oxymoron. You’d never (or at least I wouldn’t) run these sorts of apps in a VM for production purposes due to the unpredictable latency introduced.&lt;/blockquote&gt;
I don&#039;t think that&#039;s true at all.  These kinds of stores are quite popular in the cloud crowd, where they are often deployed in virtualized environments.  Since most of them lack any kind of multi-user security, running them as a non-virtualized service is untenable and anyone who wants to use them in the cloud has little choice but to keep them on-instance behind firewalls.
&lt;blockquote&gt;The warm vs. cold measurements may be misleading due to the caching of the VM and the underlying O/S. &lt;/blockquote&gt;
If that were the case, I&#039;d expect the warm vs. cold numbers to end up being similar, but instead they are quite different.  More significantly, they are &lt;b&gt;consistently&lt;/b&gt; different.  Hypervisors actually do less caching than you&#039;d think, because they generally want to devote as much memory as possible to guests which do their own caching.
&lt;blockquote&gt;IMNSHO, latency at some # of requests per second are the two dimensions that are most informative for this type of performance testing.&lt;/blockquote&gt;
IMNSHO and IMX that&#039;s true sometimes.  Other times it&#039;s requests/second that matters, still other times it&#039;s MB/second.  I picked a value and measured it.  Maybe I&#039;ll measure another value some time, with different workloads and all sorts of other methodological twists.  Gotta start somewhere.
&lt;blockquote&gt;Another big question is how long your load tests ran.&lt;/blockquote&gt;
A minute, as I said above.  I&#039;m aware that this doesn&#039;t measure effects that kick in when the database is large, and that those can be very important, but it&#039;s also important to measure the starting-line performance and since my time is limited that&#039;s what I did.
&lt;blockquote&gt;What steps did you take to ensure your load generator did not introduce a bottleneck?&lt;/blockquote&gt;
Not much.  OTOH, there seemed to be little sign of it.  I do have some experience measuring both network and storage performance, and I know where to look for excess load.  There just didn&#039;t seem to be any on the client side, though there was plenty - CPU utilization, I/O wait time, etc. - visible on the server side in pretty much all cases.  If I were trying to measure latency I&#039;d need a more sophisticated test program (part of the reason I didn&#039;t go that route) but for testing throughput multiple instances of a simple test program seem to work fine.
&lt;blockquote&gt;Did you use multiple connections to each server?&lt;/blockquote&gt;
Ten, as I said above.
&lt;blockquote&gt;What protocol was used?&lt;/blockquote&gt;
Generally &quot;best available&quot; - e.g. native instead of memcached for Tyrant, plain TCP instead of HTTP for some others, etc.  I think the quality of the available interfaces is part of the performance equation.  Yes, you can do better with some of these by rolling your own hyper-tweaked interface, just as you can do better by using features unique to each, but then the comparisons between them tend to become meaningless.  Generally, &quot;out of the box&quot; and &quot;tweaked to the max&quot; are the only configurations worth comparing.  This is an &quot;out of the box&quot; comparison and I doubt that I&#039;ll ever have time to compare &quot;tweaked to the max&quot; for more than a couple of alternatives.
&lt;blockquote&gt;Saying that Tyrant is 8-24 times faster than Riak is a bit apples-to-oranges. Tyrant is basically a btree w/ sockets on front. Riak is an eventually-consistent, multiple-replica data store. It’s like comparing a bullet-proof vest to a tank — I know which one I’d prefer to be using when things go south.&lt;/blockquote&gt;
Bear in mind that these are single-node tests so they&#039;re more directly comparable than you seem to want.  As I discussed above (I see a pattern here), a 24:1 difference in single-node performance strongly implies that a distributed and eventually consistent layer on top of the faster alternative might yield a significantly faster solution than a similar layer on top of the slower one - whether or not such a layer already exists.  At that level of difference, one could say to hell with eventual consistency and do synchronous mirroring between a half-dozen Tyrant instances, and &lt;b&gt;still&lt;/b&gt; be faster than Riak.  I actually kind of wanted Riak to do better, because I like its feature set and what I&#039;ve read about its architecture passes the &quot;how I would have done it&quot; test, but I ran the tests and it fared very poorly even compared to other distributed and eventually consistent stores like Cassandra and Voldemort.  I know which one of those I&#039;d prefer to be using when things go south.  This is why we do tests, to see whether first impressions survive actual experience, and so far the results seem to suggest that Cassandra might be a better choice.  Maybe the results will be different when I run multi-server tests, but I sure as hell wouldn&#039;t assume any such thing.</description>
		<content:encoded><![CDATA[<blockquote><p>I respectfully posit that load testing on a VM is something of an oxymoron. You’d never (or at least I wouldn’t) run these sorts of apps in a VM for production purposes due to the unpredictable latency introduced.</p></blockquote>
<p>I don&#8217;t think that&#8217;s true at all.  These kinds of stores are quite popular in the cloud crowd, where they are often deployed in virtualized environments.  Since most of them lack any kind of multi-user security, running them as a non-virtualized service is untenable and anyone who wants to use them in the cloud has little choice but to keep them on-instance behind firewalls.</p>
<blockquote><p>The warm vs. cold measurements may be misleading due to the caching of the VM and the underlying O/S. </p></blockquote>
<p>If that were the case, I&#8217;d expect the warm vs. cold numbers to end up being similar, but instead they are quite different.  More significantly, they are <b>consistently</b> different.  Hypervisors actually do less caching than you&#8217;d think, because they generally want to devote as much memory as possible to guests which do their own caching.</p>
<blockquote><p>IMNSHO, latency at some # of requests per second are the two dimensions that are most informative for this type of performance testing.</p></blockquote>
<p>IMNSHO and IMX that&#8217;s true sometimes.  Other times it&#8217;s requests/second that matters, still other times it&#8217;s MB/second.  I picked a value and measured it.  Maybe I&#8217;ll measure another value some time, with different workloads and all sorts of other methodological twists.  Gotta start somewhere.</p>
<blockquote><p>Another big question is how long your load tests ran.</p></blockquote>
<p>A minute, as I said above.  I&#8217;m aware that this doesn&#8217;t measure effects that kick in when the database is large, and that those can be very important, but it&#8217;s also important to measure the starting-line performance and since my time is limited that&#8217;s what I did.</p>
<blockquote><p>What steps did you take to ensure your load generator did not introduce a bottleneck?</p></blockquote>
<p>Not much.  OTOH, there seemed to be little sign of it.  I do have some experience measuring both network and storage performance, and I know where to look for excess load.  There just didn&#8217;t seem to be any on the client side, though there was plenty &#8211; CPU utilization, I/O wait time, etc. &#8211; visible on the server side in pretty much all cases.  If I were trying to measure latency I&#8217;d need a more sophisticated test program (part of the reason I didn&#8217;t go that route) but for testing throughput multiple instances of a simple test program seem to work fine.</p>
<blockquote><p>Did you use multiple connections to each server?</p></blockquote>
<p>Ten, as I said above.</p>
<blockquote><p>What protocol was used?</p></blockquote>
<p>Generally &#8220;best available&#8221; &#8211; e.g. native instead of memcached for Tyrant, plain TCP instead of HTTP for some others, etc.  I think the quality of the available interfaces is part of the performance equation.  Yes, you can do better with some of these by rolling your own hyper-tweaked interface, just as you can do better by using features unique to each, but then the comparisons between them tend to become meaningless.  Generally, &#8220;out of the box&#8221; and &#8220;tweaked to the max&#8221; are the only configurations worth comparing.  This is an &#8220;out of the box&#8221; comparison and I doubt that I&#8217;ll ever have time to compare &#8220;tweaked to the max&#8221; for more than a couple of alternatives.</p>
<blockquote><p>Saying that Tyrant is 8-24 times faster than Riak is a bit apples-to-oranges. Tyrant is basically a btree w/ sockets on front. Riak is an eventually-consistent, multiple-replica data store. It’s like comparing a bullet-proof vest to a tank — I know which one I’d prefer to be using when things go south.</p></blockquote>
<p>Bear in mind that these are single-node tests so they&#8217;re more directly comparable than you seem to want.  As I discussed above (I see a pattern here), a 24:1 difference in single-node performance strongly implies that a distributed and eventually consistent layer on top of the faster alternative might yield a significantly faster solution than a similar layer on top of the slower one &#8211; whether or not such a layer already exists.  At that level of difference, one could say to hell with eventual consistency and do synchronous mirroring between a half-dozen Tyrant instances, and <b>still</b> be faster than Riak.  I actually kind of wanted Riak to do better, because I like its feature set and what I&#8217;ve read about its architecture passes the &#8220;how I would have done it&#8221; test, but I ran the tests and it fared very poorly even compared to other distributed and eventually consistent stores like Cassandra and Voldemort.  I know which one of those I&#8217;d prefer to be using when things go south.  This is why we do tests, to see whether first impressions survive actual experience, and so far the results seem to suggest that Cassandra might be a better choice.  Maybe the results will be different when I run multi-server tests, but I sure as hell wouldn&#8217;t assume any such thing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Smith</title>
		<link>http://pl.atyp.us/wordpress/?p=2417&#038;cpage=1#comment-141493</link>
		<dc:creator>Dave Smith</dc:creator>
		<pubDate>Tue, 27 Oct 2009 20:46:50 +0000</pubDate>
		<guid isPermaLink="false">http://pl.atyp.us/wordpress/?p=2417#comment-141493</guid>
		<description>Interesting analysis. I have a random assortment of observations and questions, if you don&#039;t mind:

* I respectfully posit that load testing on a VM is something of an oxymoron. You&#039;d never (or at least I wouldn&#039;t) run these sorts of apps in a VM for production purposes due to the unpredictable latency introduced. Even more so when both VMs are competing for the same hardware. 

* The warm vs. cold measurements may be misleading due to the caching of the VM and the underlying O/S.  To get an accurate picture (assuming VM testing is acceptable), you&#039;d need to reboot both the VMs AND the host machine to get anything meaningful.

* There are a lot of numbers in your analysis, but no breakdown of actual throughput or latency. IMNSHO, latency and request rate (i.e. latency at some # of requests per second) are the two dimensions that are most informative for this type of performance testing. You also would need a breakdown of read vs. write requests, since a light write load could be significantly faster.

* Without knowing how big your key space (assuming this is all key/value pairs) is, it&#039;s hard to determine how much load was actually exerted on the data stores. Also, using a fixed-size key can be misleading as it permits some structures unfair advantages since they try to match up similar block sizes. 

* Another big question is how long your load tests ran. Most data stores slow down as the amount of data grows (typically logarithmically, IIRC). The severity of that curve is tightly coupled to the data structures used. Implementation details matter, a lot, when you start talking more than a few gigabytes of data. 

* What steps did you take to ensure your load generator did not introduce a bottleneck? My personal experience is that writing the load generator is as hard as, or harder than, writing the original server. 

* Did you use multiple connections to each server? What protocol was used? The client protocol (and number of conns) can make a HUGE dent in performance measurements. 

* Saying that Tyrant is 8-24 times faster than Riak is a bit apples-to-oranges. Tyrant is basically a btree w/ sockets on front. Riak is an eventually-consistent, multiple-replica data store. It&#039;s like comparing a bullet-proof vest to a tank -- I know which one I&#039;d prefer to be using when things go south. :)</description>
		<content:encoded><![CDATA[<p>Interesting analysis. I have a random assortment of observations and questions, if you don&#8217;t mind:</p>
<p>* I respectfully posit that load testing on a VM is something of an oxymoron. You&#8217;d never (or at least I wouldn&#8217;t) run these sorts of apps in a VM for production purposes due to the unpredictable latency introduced. Even more so when both VMs are competing for the same hardware. </p>
<p>* The warm vs. cold measurements may be misleading due to the caching of the VM and the underlying O/S.  To get an accurate picture (assuming VM testing is acceptable), you&#8217;d need to reboot both the VMs AND the host machine to get anything meaningful.</p>
<p>* There are a lot of numbers in your analysis, but no breakdown of actual throughput or latency. IMNSHO, latency and request rate (i.e. latency at some # of requests per second) are the two dimensions that are most informative for this type of performance testing. You also would need a breakdown of read vs. write requests, since a light write load could be significantly faster.</p>
<p>* Without knowing how big your key space (assuming this is all key/value pairs) is, it&#8217;s hard to determine how much load was actually exerted on the data stores. Also, using a fixed-size key can be misleading as it permits some structures unfair advantages since they try to match up similar block sizes. </p>
<p>* Another big question is how long your load tests ran. Most data stores slow down as the amount of data grows (typically logarithmically, IIRC). The severity of that curve is tightly coupled to the data structures used. Implementation details matter, a lot, when you start talking more than a few gigabytes of data. </p>
<p>* What steps did you take to ensure your load generator did not introduce a bottleneck? My personal experience is that writing the load generator is as hard as, or harder than, writing the original server. </p>
<p>* Did you use multiple connections to each server? What protocol was used? The client protocol (and number of conns) can make a HUGE dent in performance measurements. </p>
<p>* Saying that Tyrant is 8-24 times faster than Riak is a bit apples-to-oranges. Tyrant is basically a btree w/ sockets on front. Riak is an eventually-consistent, multiple-replica data store. It&#8217;s like comparing a bullet-proof vest to a tank &#8212; I know which one I&#8217;d prefer to be using when things go south. <img src='http://pl.atyp.us/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
</channel>
</rss>
