[OMID-102] Support for user Filter when using coprocessor for snapshot filtering #41

yonigottesman · 2018-07-25T06:45:56Z

No description provided.

…o HBase. In many cases, as in the Phoenix case, these attributes are required and should be propagated to the server side. In Phoenix, for example, attributes mark data that should propagate to the secondary index. This commit propagates the attributes to the HBase side.

… in conflict analysis. The purpose of this feature is to let the user decide, for each write, whether it should take part in the conflict analysis. The motivation comes from Apache Phoenix, which uses this feature when writing to the secondary index and also when writing to the data table for immutable tables (each key is added once and is never modified).

…write as a write that was done by a specific transaction. However, due to the lack of shadow cells, the commit timestamp of the transaction can be obtained only by accessing the commit table. The motivation for this feature is to add the shadow cells during the write and save the commit table access. This feature is required by Apache Phoenix, which during index creation adds to the index the data table entries that existed before the index was created. In this case, the version and the commit timestamp should be the fence id, and therefore a direct write to HBase with the addition of shadow cells is required.

…dule.

incorrect since the family deletion qualifier needs to be added instead of a row marker. Therefore, this commit fixes this case. This is crucial since the write set information is needed for adding shadow cells when the transaction successfully commits, and for garbage collection when the transaction aborts.

metadata (shadow cells) to an existing mutation. This feature is required by Apache Phoenix both for local index population and update.

This commit changes the visibility of several functions in order to run Omid in testing mode from the Phoenix testing environment.

…on. This comes in addition to the option of marking each column family using HBase metadata.

https://issues.apache.org/jira/browse/OMID-100 including OmidCompactor and OmidSnapshotFilter

…ger also for conflict free writes. This is because fences should also force conflict free transactions to abort.

…ng to commit table

found. We need to continue looking until we either find a committed one in the past or find no committed family deletion marker for this column. Otherwise, we might miss committed family deletion markers that exist in a transaction snapshot.

family deletion marker. This is noticeable after a checkpoint.

ohadshacham · 2018-08-01T11:11:05Z

hbase-client/src/test/java/org/apache/omid/transaction/TestCellUtils.java

        // Create the required data
        final byte[] validShadowCellQualifier =
-                com.google.common.primitives.Bytes.concat(qualifier, SHADOW_CELL_SUFFIX);
+                com.google.common.primitives.Bytes.concat(new byte[1], qualifier, SHADOW_CELL_SUFFIX);


Define a SHADOW_CELL_PREFIX constant equal to "new byte[1]" and use it in place of every occurrence of new byte[1].
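A minimal sketch of what this suggestion could look like, written in plain Java so it runs without Guava or HBase. The class name `ShadowCellQualifiers` and the helper `toShadowCellQualifier` are hypothetical; only the single zero-byte prefix and the "\u0080" suffix come from the diff.

```java
import java.nio.charset.StandardCharsets;

public class ShadowCellQualifiers {
    // Assumed constant: a single zero byte, mirroring the `new byte[1]` in the diff.
    static final byte[] SHADOW_CELL_PREFIX = new byte[1];
    // Same value as the SHADOW_CELL_SUFFIX shown in CellUtils.
    static final byte[] SHADOW_CELL_SUFFIX = "\u0080".getBytes(StandardCharsets.UTF_8);

    // Hypothetical helper: builds prefix + qualifier + suffix in one place,
    // so callers never spell out `new byte[1]` themselves.
    static byte[] toShadowCellQualifier(byte[] qualifier) {
        int prefixLen = SHADOW_CELL_PREFIX.length;
        int suffixLen = SHADOW_CELL_SUFFIX.length;
        byte[] out = new byte[prefixLen + qualifier.length + suffixLen];
        System.arraycopy(SHADOW_CELL_PREFIX, 0, out, 0, prefixLen);
        System.arraycopy(qualifier, 0, out, prefixLen, qualifier.length);
        System.arraycopy(SHADOW_CELL_SUFFIX, 0, out, prefixLen + qualifier.length, suffixLen);
        return out;
    }
}
```

Keeping the prefix in one named constant also makes it easier to change its length later without hunting for literal `+ 1` offsets.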

ohadshacham · 2018-08-01T11:27:53Z

hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java

                        cell.getQualifierLength());
                if (!matchingQualifier(otherCell,
-                        cell.getQualifierArray(), cell.getQualifierOffset(), qualifierLength)) {
+                        cell.getQualifierArray(), cell.getQualifierOffset() + 1, qualifierLength)) {


Will this work when the shadow cell prefix is absent, as in legacy data?

ohadshacham · 2018-08-01T11:28:21Z

hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java

                qualifierLength = qualifierLengthFromShadowCellQualifier(cell.getQualifierArray(),
                        cell.getQualifierOffset(),
                        cell.getQualifierLength());
+                qualifierOffset = qualifierOffset + 1;


Will this work when the shadow cell prefix is absent, as in legacy data?
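One way to address the legacy-data concern is to test for the prefix before shifting the offset. The sketch below is an assumption about how that could be done, not the code in this patch; `LegacyShadowCells` and `userQualifierRange` are hypothetical names. Note the inherent ambiguity: a legacy qualifier whose user part itself begins with a zero byte would be misread as prefixed.

```java
public class LegacyShadowCells {
    static final byte[] SHADOW_CELL_PREFIX = {0};                         // assumed: one zero byte
    static final byte[] SHADOW_CELL_SUFFIX = {(byte) 0xC2, (byte) 0x80};  // UTF-8 bytes of "\u0080"

    // Returns {offset, length} of the user qualifier inside a shadow cell qualifier,
    // or null if the range does not end with the shadow cell suffix.
    static int[] userQualifierRange(byte[] buf, int offset, int length) {
        int suffixLen = SHADOW_CELL_SUFFIX.length;
        if (length < suffixLen) {
            return null;
        }
        for (int i = 0; i < suffixLen; i++) {
            if (buf[offset + length - suffixLen + i] != SHADOW_CELL_SUFFIX[i]) {
                return null;
            }
        }
        int prefixLen = SHADOW_CELL_PREFIX.length;
        boolean hasPrefix = length >= prefixLen + suffixLen && buf[offset] == SHADOW_CELL_PREFIX[0];
        if (hasPrefix) {
            // New format: skip the prefix byte.
            return new int[] {offset + prefixLen, length - prefixLen - suffixLen};
        }
        // Legacy format: the qualifier starts at the original offset.
        return new int[] {offset, length - suffixLen};
    }
}
```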

ohadshacham · 2018-08-01T11:48:57Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/OmidSnapshotFilter.java

-            }
+        HBaseTransaction hbaseTransaction = getHBaseTransaction(get.getAttribute(CellUtils.TRANSACTION_ATTRIBUTE));
+        SnapshotFilterImpl snapshotFilter = getSnapshotFilter(e);
+        get.setMaxVersions();


I would add a comment explaining that we set max versions because the Omid filtering is done in the VisibilityFilter.

ohadshacham · 2018-08-01T11:54:08Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;


Different Nullable usage

ohadshacham · 2018-08-01T12:33:06Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+    private final SnapshotFilterImpl snapshotFilter;
+    private final Map<Long ,Long> shadowCellCache;
+    private final HBaseTransaction hbaseTransaction;
+    private final Map<String, Long> familyDeletionCache;


I would add a comment that the row info is redundant here, since reset is called between rows and we clear this map in reset.

ohadshacham · 2018-08-01T12:38:22Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+                if (snapshotFilter.getTSIfInTransaction(v, hbaseTransaction).isPresent()) {
+                    return runUserFilter(v, ReturnCode.INCLUDE);
+                } else {
+                    return runUserFilter(v, ReturnCode.INCLUDE_AND_NEXT_COL);


I would add a comment that, since v is in the snapshot and not in the transaction, it is the last result for SNAPSHOT_ALL.

ohadshacham · 2018-08-01T12:40:45Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+        if (isCellInSnapshot(v)) {
+
+            if (CellUtils.isTombstone(v)) {
+                return ReturnCode.NEXT_COL;


What about tombstones in SNAPSHOT_ALL? I assume the return code should be SKIP.

ohadshacham · 2018-08-01T12:52:47Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+
+
+    private boolean isCellInSnapshot(Cell v) throws IOException {
+        if (shadowCellCache.containsKey(v.getTimestamp()) &&


Two collection calls (containsKey followed by get) where a single get would do.
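A small sketch of the single-lookup form, assuming the cache maps a cell timestamp to its commit timestamp and that the check is the start-timestamp comparison visible in another hunk of this review. `SingleLookup` and `committedBefore` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleLookup {
    // One hash lookup instead of containsKey() followed by get().
    static boolean committedBefore(Map<Long, Long> shadowCellCache,
                                   long cellTimestamp,
                                   long startTimestamp) {
        Long commitTimestamp = shadowCellCache.get(cellTimestamp);
        // A null result means the timestamp was never cached.
        return commitTimestamp != null && startTimestamp >= commitTimestamp;
    }

    public static void main(String[] args) {
        Map<Long, Long> cache = new HashMap<>();
        cache.put(5L, 7L);
        System.out.println(committedBefore(cache, 5L, 10L));
        System.out.println(committedBefore(cache, 99L, 10L));
    }
}
```

This relies on the cache never storing null values, which holds for a map that is only populated with decoded commit timestamps.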

ohadshacham · 2018-08-01T12:58:36Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+    @Override
+    public boolean filterRowKey(byte[] buffer, int offset, int length) throws IOException {
+        if (userFilter != null) {
+            return userFilter.filterRowKey(buffer, offset, length);


Is this function called before each filterRowKey(Cell)? That's what the documentation says. If so, the whole implementation should be different...

"Filters a row based on the row key. If this returns true, the entire row will be excluded. If false, each KeyValue in the row will be passed to filterCell(Cell) below"

I think this is ok (this is what Tephra does).

JamesRTaylor · 2018-08-01T15:31:24Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+
+            if (shadowCellCache.containsKey(v.getTimestamp()) &&
+                    hbaseTransaction.getStartTimestamp() >= shadowCellCache.get(v.getTimestamp())) {
+                familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), shadowCellCache.get(v.getTimestamp()));


Avoid allocating memory within this call. Either use TreeMap<byte[],Cell> and don't do the copy or use HashMap<ImmutableBytesPtr,Cell>. You can copy/paste ImmutableBytesPtr from Phoenix - it's a wrapper around byte[] that handles equality and hashcode without doing any copying.
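To illustrate the idea of a byte-range key, here is a minimal sketch of a wrapper that compares and hashes a slice of a backing array in place. `BytesRef` is a hypothetical stand-in, not Phoenix's ImmutableBytesPtr, and its details are assumptions. Constructing the view is still a small allocation, but the family bytes are neither copied nor decoded into a String.

```java
public class FamilyKeyDemo {
    // Hypothetical view over a byte range; equality and hashing read the backing array directly.
    static final class BytesRef {
        private final byte[] buf;
        private final int offset;
        private final int length;
        private final int hash;

        BytesRef(byte[] buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
            int h = 1;
            for (int i = offset; i < offset + length; i++) {
                h = 31 * h + buf[i];
            }
            this.hash = h; // computed once, since the view is treated as immutable
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof BytesRef)) return false;
            BytesRef other = (BytesRef) o;
            if (length != other.length) return false;
            for (int i = 0; i < length; i++) {
                if (buf[offset + i] != other.buf[other.offset + i]) return false;
            }
            return true;
        }
    }
}
```

A key like this is only safe while the backing array is not modified or reused, so a real implementation would need to confirm that holds for the lifetime of the cache entry.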

JamesRTaylor · 2018-08-01T15:54:02Z

hbase-coprocessor/src/test/java/org/apache/omid/transaction/TestSnapshotFilter.java

    }

    @Test(timeOut = 60_000)
-    public void testGetSecondResult() throws Throwable {


Would be good to have a test that uses a scan with FirstKeyOnlyFilter that would get incorrect results without this new implementation.

Is there a test which would have failed without this patch (i.e. one that demonstrates the need for having the visibility filtering done as a pure HBase filter)?

JamesRTaylor · 2018-08-01T15:57:34Z

hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java


    private static final Logger LOG = LoggerFactory.getLogger(CellUtils.class);
    static final byte[] SHADOW_CELL_SUFFIX = "\u0080".getBytes(Charsets.UTF_8); // Non printable char (128 ASCII)
    static byte[] DELETE_TOMBSTONE = Bytes.toBytes("__OMID_TOMBSTONE__");


Will Cells end up with these constant values and if so can we make them shorter?

JamesRTaylor · 2018-08-01T15:59:50Z

This is great, @yonigottesman! Do the Phoenix unit tests FlappingTransactionIT.testInflightUpdateNotSeen() and testInflightDeleteNotSeen() pass with this change? You can try running them from the omid2 feature branch.

ohadshacham

Great work!

JamesRTaylor · 2018-08-06T14:47:35Z

hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java

        return result == 0;
    }

+    private static boolean startsWith(byte[] value, int offset, int length, byte[] prefix) {


Can you just use Bytes.startsWith() instead?

No, because Bytes.startsWith doesn't work with an offset.
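The body of the new helper is not shown in the diff. The following is one plausible implementation matching the signature above, offered as a sketch; the enclosing class name `RangeStartsWith` is hypothetical.

```java
public class RangeStartsWith {
    // Checks whether value[offset, offset + length) begins with prefix,
    // without copying the range into a new array.
    static boolean startsWith(byte[] value, int offset, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (value[offset + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```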

JamesRTaylor · 2018-08-06T14:48:51Z

hbase-client/src/main/java/org/apache/omid/transaction/HTableAccessWrapper.java

-import org.apache.hadoop.hbase.client.HTableInterface;
-import org.apache.hadoop.hbase.client.Put;
-import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.*;


Was the use of wildcarding here intentional? Not sure about Omid, but in Phoenix we always explicitly specify all the imports.

JamesRTaylor · 2018-08-06T14:54:45Z

hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java

+            Result shadowCell = snapshotFilter.getTableAccessWrapper().get(get);
+
+            if (!shadowCell.isEmpty() &&
+                    Bytes.toLong(CellUtil.cloneValue(shadowCell.rawCells()[0] )) <= hbaseTransaction.getStartTimestamp()){


Don't repeat Bytes.toLong(CellUtil.cloneValue(shadowCell.rawCells()[0] )) here. You want to minimize heap allocation during filtering.
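One way to avoid the clone is to decode the timestamp once, directly from the cell's backing array. The sketch below shows the decoding in plain Java, assuming the big-endian layout that HBase's Bytes.toLong uses; `CommitTimestampReader` and `readLong` are hypothetical names. In HBase terms the inputs would roughly correspond to a cell's getValueArray() and getValueOffset().

```java
public class CommitTimestampReader {
    // Decodes a big-endian long from buf starting at offset, without copying the bytes.
    static long readLong(byte[] buf, int offset) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (buf[offset + i] & 0xFF);
        }
        return value;
    }
}
```

Storing the decoded value in a local variable lets the same long be used both for the comparison against the start timestamp and for any later cache update.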

JamesRTaylor · 2018-08-06T15:07:05Z

Nice work, @yonigottesman. I made a few minor comments, @ohadshacham. My main question is do the Phoenix unit tests FlappingTransactionIT.testInflightUpdateNotSeen() and testInflightDeleteNotSeen() pass with this change? You can try running them from the omid2 feature branch in Phoenix against the phoenix-integration branch in omid2 with your patch applied.

yonigottesman · 2018-08-07T09:28:49Z

All FlappingTransactionIT tests pass, and I added a test that passes only if snapshot filtering is done in the coprocessor before the user filter is called.

Ohad Shacham and others added 18 commits February 7, 2018 12:44

[OMID-70] - reopen in order to bind WorldClockOracleImpl in TSOMockMo…

f3d980c

…dule.

[OMID-93] This commit adds an option to add commit

0515a94

metadata (shadow cells) to an existing mutation. This feature is required by Apache Phoenix both for local index population and update.

[OMID-94] Tune Omid for Phoenix testing environment.

a4ab246

This commit changes the visibility of several functions in order to run Omid in testing mode from the Phoenix testing environment.

[OMID-96] Enable compactor on all column families during initializati…

76400f6

…on. This comes in addition to the option of marking each column family using HBase metadata.

[OMID-99] Change TestNG version to 6.10.

5ab09f3

[OMID-100] James Taylor's patch to:

d4dab87

https://issues.apache.org/jira/browse/OMID-100 including OmidCompactor and OmidSnapshotFilter

[OMID-72] bug fix, accessed tables should be sent to transaction mana…

0f14b27

…ger also for conflict free writes. This is because fences should also force conflict free transactions to abort.

Support for user Filter when using coprocessor for snapshot filtering

41554ec

In coprocessor filtering, get shadow cell of delete family before goi…

ba57538

…ng to commit table

add inMemoryCommitTable client option in omid coprocessor for testing

26469a0

Fix visibilityFilter to check if delete family is in current TX

8d63267

[OMID-106] Delete should use write timestamp when writing

1fdd260

family deletion marker. This is noticeable after a checkpoint.

Merge OMID-105 changes

b2980c7

yonigottesman changed the title ~~Support for user Filter when using coprocessor for snapshot filtering~~ [OMID-102] Support for user Filter when using coprocessor for snapshot filtering Aug 1, 2018

yonigottesman closed this Aug 1, 2018

yonigottesman reopened this Aug 1, 2018

ohadshacham reviewed Aug 1, 2018

JamesRTaylor reviewed Aug 1, 2018

yonigottesman added 2 commits August 6, 2018 09:46

fix server filtering

575c437

fix snapshot filter to work with SNAPSHOT_ALL in client

c890c00

ohadshacham approved these changes Aug 6, 2018

JamesRTaylor reviewed Aug 6, 2018

Add server side filtering test. Improve server filtering memory usage

2071125

yonigottesman added 3 commits August 7, 2018 13:58

remove wildcard imports

46b0c81

Fix coprocessor scanning bug. add scan with filter test

a79d375

Add CellSkipFilter

a86afc5

asfgit force-pushed the phoenix-integration branch from 30c49d3 to 1e5ee3f Compare November 14, 2018 13:59


