@awetzel awetzel commented Jul 8, 2014

Hello all,
We have used riak_search for 3 years, and merge_index suffers from very poor performance in high-write-load environments.
This is mainly due to the compaction strategy, which cannot keep the number of segments stable under high throughput when you also need to decrease buffer_rollover_size to keep the memory used by the buffer ETS tables stable. When the number of segments grows too large, the whole merge_index server becomes unusable because it needs to keep track of locks per segment.

So this weekend I rewrote the merge_index compaction strategy. It started as just a test, but judging by the performance I have observed since, the result is much better than the previous merge_index. All the issues we faced with riak_search are solved, because they were direct or indirect consequences of the poor write performance of merge_index.

To do that so quickly, I simply copied Cassandra.

I gave the new parameters the names of their Cassandra equivalents:

  • segment_similarity_ratio defines the size ratio used to group segments of similar size for compaction (defaults to 50%, so a segment belongs to a group if 0.5*avg_group_seg_size < seg_size < 1.5*avg_group_seg_size).
  • min_segment_size defines the minimum segment size which is targeted by segment grouping for compaction.
  • compaction_throughput_mb_per_sec defines the target throughput used to adjust the throttling of compaction.
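
To make the grouping rule concrete, here is a minimal Python sketch of size-tiered bucketing and throughput throttling. It is not the Erlang code from this pull request: the function names are hypothetical, and two behaviours are assumptions borrowed from how Cassandra's size-tiered strategy is usually described. The first is that segments smaller than min_segment_size are grouped together regardless of the ratio. The second is that throttling is done by sleeping long enough to stay under the configured rate.

```python
def group_segments_by_size(sizes, similarity_ratio=0.5, min_segment_size=0):
    """Group segment sizes into buckets of similar size (hypothetical helper).

    A segment joins a bucket when its size lies strictly between
    (1 - ratio) and (1 + ratio) times the bucket's average size.
    Assumption: segments below min_segment_size share one bucket.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            low = (1 - similarity_ratio) * avg
            high = (1 + similarity_ratio) * avg
            both_small = size < min_segment_size and avg < min_segment_size
            if low < size < high or both_small:
                bucket.append(size)
                break
        else:
            # No existing bucket is similar enough: start a new one.
            buckets.append([size])
    return buckets


def throttle_delay(bytes_written, elapsed_sec, throughput_mb_per_sec):
    """Seconds to sleep so the average rate stays under the limit.

    Assumption: 1 MB is 1024 * 1024 bytes and throttling is sleep-based.
    """
    min_duration = bytes_written / (throughput_mb_per_sec * 1024 * 1024)
    return max(0.0, min_duration - elapsed_sec)


if __name__ == "__main__":
    print(group_segments_by_size([10, 12, 11, 100, 110, 400]))
    print(group_segments_by_size([1, 5, 30, 100], min_segment_size=50))
    print(throttle_delay(2 * 1024 * 1024, 0.5, 1))
```

Each resulting bucket is a candidate for one compaction, so many small segments collapse into a few large ones instead of being merged pairwise.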

I understand Riak Search is no longer maintained because of the upcoming Riak 2.0. But Riak Search fits one particular use case well: when you need a simple full-text engine with term-based distribution, for instance as a building block for another kind of search. This is the case at the company I founded. Even if a SOLR/Yokozuna migration is a possibility, it is not an easy one for us, and I expect the same is true for other companies.

So please take my pull request into consideration: we cannot use Riak Search without this improvement, and I expect that many of your users have had the same issues and that this fix would help them. SOLR/Yokozuna are great, but very different from Riak Search, which is great in its own field. We would love to keep this code upstream even if merge_index is no longer maintained.

Best regards.

Arnaud.

@jonmeredith

Thanks for the PR @awetzel. As you noted, we are focusing our efforts on Search 2.0 at the moment, so it may be a long time (possibly a very long time) until we are able to review and, just as importantly, load test this PR.

Are you supporting your own build currently?

@awetzel awetzel commented Jul 17, 2014

Yes, we are supporting our own build. I will let you know in this thread if any bugs appear in this merge_index branch, and I will push fix commits to the branch if necessary.
