-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29368 [Precursor] Add API surface and refactoring for key management feature #7584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…ement feature This commit prepares the codebase for the upcoming key management feature (HBASE-29368) by introducing the necessary API definitions, protocol buffer changes, and infrastructure refactoring. No functional changes are included; all implementation will follow in the feature PR. This precursor PR essentially extracts the API surface definitions and infrastructure refactoring from the main feature PR (apache#7421) to facilitate easier review. By separating the ~15k line feature PR into a smaller precursor containing interface definitions, protocol changes, and method signature updates, the subsequent feature PR will focus purely on implementation logic. API Surface Additions: * New interfaces: - KeymetaAdmin: Admin API for key management operations - Server methods for cache management (getManagedKeyDataCache, getSystemKeyCache) * Protocol buffer definitions: - ManagedKeys.proto: Definitions for managed key data and operations - Admin.proto: RPC methods for key management admin operations - Procedure.proto: Key rotation procedure support Infrastructure Refactoring: * Encryption context creation: - Moved createEncryptionContext from EncryptionUtil (client) to SecurityUtil (server) where it properly belongs, as it requires server-side resources - Added overloads to support future key encryption key (KEK) parameters * Method signature updates: - Added ManagedKeyDataCache and SystemKeyCache parameters to encryption-related methods throughout HRegion, HStore, HStoreFile, and HFile classes - Updated constructors and factory methods to thread cache references - All cache parameters are currently null/unused, enabling gradual feature rollout * New utility methods: - Encryption.encryptWithGivenKey() / decryptWithGivenKey(): Extract method refactoring to support both subject-based and KEK-based encryption - EncryptionUtil.wrapKey() / unwrapKey() overloads with KEK parameter - Bytes.add() 4-argument overload for concatenation Stub Infrastructure: * Blank place holder shells for some public data classes such as ManagedKeyData and KeymetaAdminClient * Stub implementations for key management services and caches that return null or throw UnsupportedOperationException, clearly documented as placeholders * New package org.apache.hadoop.hbase.keymeta for key management classes * Mock services updated to support new cache getter methods for testing Code Organization: * Procedure framework: Added support for region-level server name tracking to support future key rotation procedures * Testing infrastructure updated to support new constructor signatures All stub implementations clearly document they are placeholders for the upcoming feature PR. Existing encryption functionality remains unchanged and continues to work as before. Testing: * All existing tests pass (precursor introduces no functional changes) * Build completes successfully with new API surface * Backward compatibility maintained for non-key-management code paths
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
There are 4 test failures in (TestGzipFilter,TestNamespacesInstanceResource,TestScannersWithFilters,TestSchemaResource) but I can't repro them locally so they appear to have flapped. The 4 checkstyle errors can be ignored, they will go away in my next PR. There are apparently 28 errors from spotless:check, but when I ran spotless:apply, only the 2 unused imports were removed nothing else, so CI seems to be overreporting it (I have observed this previously too). |
I also think sometimes it does overreport. |
virajjasani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing, so far looks good overall, left few comments
| * @param d fourth fourth | ||
| * @return New array made from a, b, c, and d | ||
| */ | ||
| public static byte[] add(final byte[] a, final byte[] b, final byte[] c, final byte[] d) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding more byte[], how about we let core logic to keep using existing utilities? e.g. if we have to concatenate 4 byte[] (a, b, c, d), we can first concatenate c + d, and then concatenate (a + b + result of c+d)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestion is possible, in fact from this angle the current 3 param variant can also be done using the 2 param variant, but I would assume that a dedicated method was created to make it more efficient and produces less garbage, so I extended the same philosophy to the new 4 param variant. IMHO, the duplication is worth having from the point of squeezing out additional performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, for now we can add note in the javadoc that if any use case needs more than 4 bytes, divide and use any of the existing utilities. So that in future, no one keeps adding more variants.
| * STUB IMPLEMENTATION - Feature not yet complete. This class represents encryption key data. Full | ||
| * implementation will be in HBASE-29368 feature PR. | ||
| */ | ||
| @InterfaceAudience.Public |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we expecting downstreamers to directly use ManagerKeyData class? If no, we should keep IA.Private. If yes, we should still add IS.Unstable with IA.Public for now.
These are not final consumable APIs anyway so we can't only keep IA.Public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ManagedKeyData is exposed to the hbase clients so I think it qualifies as a public interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add IS.Unstable per your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't IS.Evolving a better fit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Evolving is also fine
| * KeymetaAdmin is an interface for administrative functions related to managed keys. It handles the | ||
| * following methods: | ||
| */ | ||
| @InterfaceAudience.Public |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, applies to all new interfaces and apis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used outside the hbase-common module so I marked it as public, is that not a good enough reason to make it public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this too is exposed to the client applications so seems appropriate. I will double check any other usages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Phoenix will directly use the class? If so, yes we can expose it but still we should mark it IS.Unstable because it is in dev phase and not stabilitized. We can also change signatures in future and after a couple years and few big releases, we can mark it stable.
| rpc RefreshSystemKeyCache(EmptyMsg) | ||
| returns(EmptyMsg); | ||
|
|
||
| rpc EjectManagedKeyDataCacheEntry(ManagedKeyEntryRequest) | ||
| returns(BooleanMsg); | ||
|
|
||
| rpc ClearManagedKeyDataCache(EmptyMsg) | ||
| returns(EmptyMsg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we expecting more apis in future?
We might want to consider having ManagedKeyAdmin interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a need to add more APIs. Adding these here made it simpler to take advantage of the existing async framework and the primitives for parallel execution (across all the RS) for the admin API. The higher level KeymetaAdmin RPC interface is blocking and makes use of these primitives.
| } catch (Throwable e) { | ||
| // Try the old signature for the sake of test code. | ||
| return createInstance(conf, ctorArgTypes.subList(0, ctorArgTypes.size() - 1), | ||
| ctorArgs.subList(0, ctorArgs.size() - 1)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this fallback necessary? We can catching any type of Throwable here. Or is it going to be a lot of changes in tests to use new constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if you look for the usage of HConstants.REGION_IMPL, you would see many test classes each with their own test/mock class so I wanted to reduce the change. However, I think I can be more specific on the exception types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the existing createInstance is already doing a catch all and convert it to IllegalStateException, so there is technically no difference between catching Throwable or more specific IllegalStateException in this catch block. I will still change it to be clear about the intention and avoid any potential confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, IllegalStateException will be better
| public static Region openHRegion(final Region other, final CancelableProgressable reporter) | ||
| throws IOException { | ||
| return openHRegion((HRegion) other, reporter); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While HRegion is IA.Private, downstreamers with coproc still tend to use some utilities. It would be better to keep this. It is more generic Region instead of HRegion anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I can restore this method.
| decryptWithGivenKey(key, out, in, outLen, alterCipher, iv); | ||
| } else { | ||
| throw new IOException(e); | ||
| throw e; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want to retain throw new IOException(e)? unless this is easier to catch later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, e is already of type IOException so I didn't think wrapping it again with another IOException is adding any value. Do you see any issue here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought that it should not be a problem, i was just curious if it breaks any exception catching and validation of the error message based on the current logic. If everything is fine, we are good.
| @InterfaceStability.Evolving | ||
| public class SecurityUtil { | ||
| public final class SecurityUtil { | ||
| private static final Logger LOG = LoggerFactory.getLogger(SecurityUtil.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do we need this later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is being used in the feature PR.
This commit prepares the codebase for the upcoming key management feature (HBASE-29368) by introducing the necessary API definitions, protocol buffer changes, and infrastructure refactoring. No functional changes are included; all implementation will follow in the feature PR.
This precursor PR essentially extracts the API surface definitions and infrastructure refactoring from the main feature PR (#7421) to facilitate easier review. By separating the ~15k line feature PR into a smaller precursor containing interface definitions, protocol changes, and method signature updates, the subsequent feature PR will focus purely on implementation logic.
API Surface Additions:
New interfaces:
Protocol buffer definitions:
Infrastructure Refactoring:
Encryption context creation:
Method signature updates:
New utility methods:
Stub Infrastructure:
Code Organization:
All stub implementations clearly document they are placeholders for the upcoming feature PR. Existing encryption functionality remains unchanged and continues to work as before.
Testing: