HBASE-29776: Log filtering in IncrementalBackupManager can lead to data loss #7582
Contributor
Hi @hgromer, I agree HBASE-29776 is an issue (sorry for not responding sooner there, I was on vacation the past weeks), but I'm not yet convinced this is the right approach to fix it. It feels very complex to reason about, so I wonder if there isn't a simpler approach. I already wanted to give some intermediate feedback while I think a bit more about it.
Contributor
My suggestion is to simplify all changes in this file to … (the fix for the excluded log files is also needed). Then when finding which logs to include, these are the options: …
This approach will keep … Similar code suffices in the …
Contributor
Author
No worries, I appreciate you taking the time to look here. We've found a slew of bugs in the WAL retention system and I think it's important to get this right, so I'm happy to iterate on feedback.
Agree with this, it's something we should take a look at. To your point about WAL retention and boundaries: conceptually, I've been trying to think about it from the perspective of "which WAL files have been backed up". Otherwise you run into issues when a host goes offline. For example, in the case where we have a Server A with row X …
So we've resurfaced dead data that shouldn't be included. It's problematic to back up WALs that are very old, and that is the main culprit for the added complexity here. Additionally, I'm wary of comparing timestamps across hosts, which is why I avoided something like generating a boundary timestamp in the backup process (which happens client side) and instead opted to compare WAL timestamps, which are generated by the same host.
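To make that concrete with a hypothetical timeline (the thread's own Server A example is truncated in this copy, so these steps are illustrative only): Server A writes a Put for row X into a WAL; a backup captures that WAL; row X is later deleted and Server A goes offline, leaving its old WAL in the archive. If a later backup re-includes that ancient WAL because the host's boundary was never advanced, a restore can replay the old Put after the delete marker has been compacted away, and row X comes back.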
> If I understand correctly, … run into this issue

Agreed: on the first backup this happens, but we never update the host TS, so we'll continue to back up the WAL files and run into the issue mentioned above. I'd be more than happy to find a simpler solution; I really don't love how complex this WAL retention boundary logic is, but I struggled to simplify it while also avoiding corrupting the data.
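A minimal sketch of that per-host comparison. The class and method names below (WalBackupBoundaries, shouldInclude, advance) are illustrative, not the PR's API; the only point is that a WAL's creation timestamp is compared against a boundary recorded for the same host, so clocks are never compared across machines.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: tracks, per host, the newest WAL creation time already
// covered by a backup. Not the PR's actual API.
public class WalBackupBoundaries {
  private final Map<String, Long> boundaryByHost = new HashMap<>();

  // Include a WAL only if it is newer than the last backed-up boundary for
  // the host that wrote it; both timestamps come from that host's clock.
  public boolean shouldInclude(String host, long walCreationTs) {
    Long boundary = boundaryByHost.get(host);
    return boundary == null || walCreationTs > boundary;
  }

  // After a successful backup, advance the boundary for each backed-up WAL.
  public void advance(String host, long walCreationTs) {
    boundaryByHost.merge(host, walCreationTs, Math::max);
  }
}
```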
Contributor
Author
Map<String, Long> newTimestamps =
  rollTimestamps.entrySet().stream()
    // Region servers that are offline since the last backup will have old roll timestamps,
    // prune their information here, as it is not relevant to be stored or used for finding
    // the relevant logs.
    .filter(entry -> entry.getValue() > rollStartTs)
    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

I'm not sure I necessarily agree with this logic, because the RS not being rolled this go-around doesn't mean we've backed up all the files from that RS that we need to back up. It just means the host doesn't exist on the cluster at the moment. We need to back up the files that were generated between the last backup and this backup.
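To illustrate the objection with made-up numbers (none of these values come from the thread): suppose the previous backup finished at t=100, rsA rolled at t=150 and then went offline, and this backup's roll started at t=200. The proposed filter prunes rsA even though its WALs from (100, 150] were never backed up:

```java
import java.util.Map;
import java.util.stream.Collectors;

public class FilterConcern {
  public static void main(String[] args) {
    // Hypothetical roll results; rsA is offline, rsB rolled for this backup.
    Map<String, Long> rollTimestamps = Map.of("rsA", 150L, "rsB", 210L);
    long rollStartTs = 200L;

    Map<String, Long> newTimestamps = rollTimestamps.entrySet().stream()
      .filter(entry -> entry.getValue() > rollStartTs)
      .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

    // Prints {rsB=210}: rsA has been pruned, so nothing records that its
    // WALs written after the previous backup still need to be backed up.
    System.out.println(newTimestamps);
  }
}
```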
@@ -58,7 +58,6 @@ public IncrementalBackupManager(Connection conn, Configuration conf) throws IOEx
   */
  public Map<String, Long> getIncrBackupLogFileMap() throws IOException {
    List<String> logList;
-   Map<String, Long> newTimestamps;
    Map<String, Long> previousTimestampMins;

    String savedStartCode = readBackupStartCode();
@@ -83,12 +82,48 @@ public Map<String, Long> getIncrBackupLogFileMap() throws IOException {
    LOG.info("Execute roll log procedure for incremental backup ...");
    BackupUtils.logRoll(conn, backupInfo.getBackupRootDir(), conf);

-   newTimestamps = readRegionServerLastLogRollResult();
+   Map<String, Long> newTimestamps = readRegionServerLastLogRollResult();
Contributor
This method does an unnecessary scan, since you override all entries in the code you add below.
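If the remark above is right that the loop overwrites every entry, the second scan could be avoided by seeding the map empty. A sketch of that idea, as an assumption about intent rather than a verified change (fragment in the context of getIncrBackupLogFileMap):

```java
// Start from an empty map rather than a second readRegionServerLastLogRollResult()
// scan; the loop below puts an entry for every host it visits anyway.
Map<String, Long> newTimestamps = new HashMap<>();
Map<String, Long> latestLogRollByHost = readRegionServerLastLogRollResult();
```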
+   Map<String, Long> latestLogRollByHost = readRegionServerLastLogRollResult();
+   for (Map.Entry<String, Long> entry : latestLogRollByHost.entrySet()) {
+     String host = entry.getKey();
+     long latestLogRoll = entry.getValue();
+     Long earliestTimestampToIncludeInBackup = previousTimestampMins.get(host);
+
+     boolean isInactive = earliestTimestampToIncludeInBackup != null
+       && earliestTimestampToIncludeInBackup > latestLogRoll;
+
+     long latestTimestampToIncludeInBackup;
+     if (isInactive) {
+       LOG.debug("Avoided resetting latest timestamp boundary for {} from {} to {}", host,
+         earliestTimestampToIncludeInBackup, latestLogRoll);
+       latestTimestampToIncludeInBackup = earliestTimestampToIncludeInBackup;
+     } else {
+       latestTimestampToIncludeInBackup = latestLogRoll;
+     }
+     newTimestamps.put(host, latestTimestampToIncludeInBackup);
+   }
    logList = getLogFilesForNewBackup(previousTimestampMins, newTimestamps, conf, savedStartCode);
+   logList = excludeProcV2WALs(logList);
    backupInfo.setIncrBackupFileList(logList);

+   // Update boundaries based on WALs that will be backed up
Contributor
For my understanding, is this code block an optimization, or a necessary fix for a specific case of appearing/disappearing region servers?
Contributor
Figured it out, it is to update the newTimestamps entries for region servers that have since gone offline, but for which the logs are now backed up.
+   for (String logFile : logList) {
+     Path logPath = new Path(logFile);
+     String logHost = BackupUtils.parseHostFromOldLog(logPath);
+     if (logHost == null) {
+       logHost = BackupUtils.parseHostNameFromLogFile(logPath.getParent());
Contributor
This method seems to support parsing old log names as well; is it possible to merge it with the parsing two lines above? Though I am confused as to why the former uses …
+     }
+     if (logHost != null) {
+       long logTs = BackupUtils.getCreationTime(logPath);
+       Long latestTimestampToIncludeInBackup = newTimestamps.get(logHost);
+       if (latestTimestampToIncludeInBackup == null || logTs > latestTimestampToIncludeInBackup) {
+         LOG.debug("Updating backup boundary for inactive host {}: timestamp={}", logHost, logTs);
+         newTimestamps.put(logHost, logTs);
+       }
+     }
+   }
    return newTimestamps;
  }
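On the parsing question above, my understanding of the WAL layout (an assumption worth verifying against BackupUtils) is that the two helpers target different path shapes, which would explain why one helper reads the file name and the other the parent directory:

```java
// Archived WAL (in the old-logs dir): the server name is URL-encoded into the
// file name itself, e.g.
//   .../oldWALs/host1.example.com%2C16020%2C1700000000000.1700000123456
//   -> BackupUtils.parseHostFromOldLog(logPath) parses the file name
// Active WAL: the server name is the parent directory, e.g.
//   .../WALs/host1.example.com,16020,1700000000000/<wal file>
//   -> BackupUtils.parseHostNameFromLogFile(logPath.getParent()) parses the directory
```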
@@ -228,15 +263,6 @@ private List<String> getLogFilesForNewBackup(Map<String, Long> olderTimestamps,
      } else if (currentLogTS > oldTimeStamp) {
        resultLogFiles.add(currentLogFile);
      }

-     // It is possible that a host in .oldlogs is an obsolete region server
Contributor
I think removing this block entirely is wrong. I believe the semantics of … So I think this block should be kept, but adjusted to: … I also think a similar issue is present for the …
Contributor
From your comment in HBASE-29776: …

Your comment is correct, but I think the better fix is to ensure the newTimestamps are correctly updated (as you do in your other changes). Removing this block would result in too many logs being included in the backup.
Contributor
Author
if (newTimestamp != null && currentLogTS > newTimestamp) {
  newestLogs.add(currentLogFile);
}

I don't think so, this would exclude all WAL files between the last backup (previousTimestamps) and the current log roll (newTimestamp). Unless I'm misunderstanding.
Contributor
Author
I think the relevant filtering out of very old log files happens here: … So this logic is safe to remove, imo.
-     // so newestTimestamps.get(host) here can be null.
-     // Even if these logs belong to an obsolete region server, we still need
-     // to include them to avoid loss of edits for backup.
-     Long newTimestamp = newestTimestamps.get(host);
-     if (newTimestamp == null || currentLogTS > newTimestamp) {
-       newestLogs.add(currentLogFile);
-     }
    }
    // remove newest log per host because they are still in use
    resultLogFiles.removeAll(newestLogs);
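For readers following the newestLogs mechanics above, a toy sketch (hypothetical data and names, not the HBase implementation) of "drop the newest log per host because it may still be in use":

```java
import java.util.*;
import java.util.stream.Collectors;

public class NewestPerHost {
  record Wal(String host, long ts, String path) {}

  public static void main(String[] args) {
    List<Wal> candidates = List.of(
      new Wal("rsA", 100, "walA1"), new Wal("rsA", 160, "walA2"),
      new Wal("rsB", 120, "walB1"), new Wal("rsB", 210, "walB2"));

    // The newest WAL per host may still be open for writes, so exclude it.
    Map<String, Wal> newestByHost = candidates.stream()
      .collect(Collectors.toMap(Wal::host, w -> w,
        (a, b) -> a.ts() >= b.ts() ? a : b));

    List<Wal> toBackup = candidates.stream()
      .filter(w -> !newestByHost.get(w.host()).equals(w))
      .collect(Collectors.toList());

    System.out.println(toBackup); // walA1 and walB1 remain
  }
}
```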
@@ -0,0 +1,129 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.hbase.backup;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;

import java.util.List;
import java.util.Map;
import org.apache.hadoop.hbase.HBaseClassTestRule;
import org.apache.hadoop.hbase.HBaseTestingUtil;
import org.apache.hadoop.hbase.SingleProcessHBaseCluster;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.backup.impl.BackupSystemTable;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.regionserver.HRegionServer;
import org.apache.hadoop.hbase.testclassification.LargeTests;
import org.junit.BeforeClass;
import org.junit.ClassRule;
import org.junit.Test;
import org.junit.experimental.categories.Category;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import org.apache.hbase.thirdparty.com.google.common.collect.Lists;

/**
 * Tests that WAL files from offline/inactive RegionServers are handled correctly during backup.
 * Specifically verifies that WALs from an offline RS are:
 * <ol>
 * <li>Backed up once in the first backup after the RS goes offline</li>
 * <li>NOT re-backed up in subsequent backups</li>
 * </ol>
 */
@Category(LargeTests.class)
public class TestBackupOfflineRS extends TestBackupBase {

  @ClassRule
  public static final HBaseClassTestRule CLASS_RULE =
    HBaseClassTestRule.forClass(TestBackupOfflineRS.class);

  private static final Logger LOG = LoggerFactory.getLogger(TestBackupOfflineRS.class);

  @BeforeClass
  public static void setUp() throws Exception {
    TEST_UTIL = new HBaseTestingUtil();
    conf1 = TEST_UTIL.getConfiguration();
    conf1.setInt("hbase.regionserver.info.port", -1);
    autoRestoreOnFailure = true;
    useSecondCluster = false;
    setUpHelper();
    // Start an additional RS so we have at least 2
    TEST_UTIL.getMiniHBaseCluster().startRegionServer();
    TEST_UTIL.waitTableAvailable(table1);
  }

  /**
   * Tests that when a full backup is taken while an RS is offline (with WALs in oldlogs), the
   * offline host's timestamps are recorded so subsequent incremental backups don't re-include
   * those WALs.
   */
  @Test
  public void testBackupWithOfflineRS() throws Exception {
    LOG.info("Starting testBackupWithOfflineRS");

    SingleProcessHBaseCluster cluster = TEST_UTIL.getMiniHBaseCluster();
    List<TableName> tables = Lists.newArrayList(table1);

    if (cluster.getNumLiveRegionServers() < 2) {
      cluster.startRegionServer();
      Thread.sleep(2000);
    }

    LOG.info("Inserting data to generate WAL entries");
    try (Connection conn = ConnectionFactory.createConnection(conf1)) {
      insertIntoTable(conn, table1, famName, 2, 100);
    }

    int rsToStop = 0;
    HRegionServer rsBeforeStop = cluster.getRegionServer(rsToStop);
    String offlineHost =
      rsBeforeStop.getServerName().getHostname() + ":" + rsBeforeStop.getServerName().getPort();
    LOG.info("Stopping RS: {}", offlineHost);

    cluster.stopRegionServer(rsToStop);
    // Wait for WALs to be moved to oldlogs
    Thread.sleep(5000);

    LOG.info("Taking full backup (with offline RS WALs in oldlogs)");
    String fullBackupId = fullTableBackup(tables);
    assertTrue("Full backup should succeed", checkSucceeded(fullBackupId));

    try (BackupSystemTable sysTable = new BackupSystemTable(TEST_UTIL.getConnection())) {
      Map<TableName, Map<String, Long>> timestamps = sysTable.readLogTimestampMap(BACKUP_ROOT_DIR);
      Map<String, Long> rsTimestamps = timestamps.get(table1);
      LOG.info("RS timestamps after full backup: {}", rsTimestamps);

      Long tsAfterFullBackup = rsTimestamps.get(offlineHost);
      assertNotNull("Offline host should have timestamp recorded in trslm after full backup",
        tsAfterFullBackup);

      LOG.info("Taking incremental backup (should NOT include offline RS WALs)");
      String incrBackupId = incrementalTableBackup(tables);
      assertTrue("Incremental backup should succeed", checkSucceeded(incrBackupId));

      timestamps = sysTable.readLogTimestampMap(BACKUP_ROOT_DIR);
      rsTimestamps = timestamps.get(table1);
      assertFalse("Offline host should not have a boundary",
        rsTimestamps.containsKey(offlineHost));
    }
  }
}
Haven't gotten around to processing the changes in this file, but can you sketch why they are needed? Your original ticket only discusses an issue with incremental backups.
Figured it out, it's to ensure the newTimestamps for no-longer-active region servers are updated.