cmd/geth: Implement freezer truncation as a subcommand (fixes #31135) #31351

@sivaratrisrinivas sivaratrisrinivas commented Mar 11, 2025

Hi, I have implemented a new subcommand, geth db truncate-freezer, that truncates the freezer at the merge block, keeping headers but removing bodies. This addresses issue #31135.

Implementation details:

  • Added a new subcommand truncate-freezer to the db command
  • Consolidated flags into reusable variables
  • Implemented the truncation logic that:
    • Finds the merge block using binary search
    • Preserves headers and hashes before truncation
    • Truncates all tables
    • Re-inserts the preserved headers and hashes

This implementation follows the same pattern as the existing prune-history command and uses the same underlying truncateAncientStore method.

Fixes #31135

Comment on lines +1018 to +1035

for low <= high {
	mid := (low + high) / 2
	header := rawdb.ReadHeader(db, rawdb.ReadCanonicalHash(db, mid), mid)
	if header == nil {
		return fmt.Errorf("header %d not found", mid)
	}

	if header.Difficulty.Sign() == 0 {
		// This is a post-merge block, look earlier
		high = mid - 1
		mergeBlock = mid
		found = true
	} else {
		// This is a pre-merge block, look later
		low = mid + 1
	}
}
Member

You could do something like this here

Suggested change
- for low <= high {
- 	mid := (low + high) / 2
- 	header := rawdb.ReadHeader(db, rawdb.ReadCanonicalHash(db, mid), mid)
- 	if header == nil {
- 		return fmt.Errorf("header %d not found", mid)
- 	}
- 	if header.Difficulty.Sign() == 0 {
- 		// This is a post-merge block, look earlier
- 		high = mid - 1
- 		mergeBlock = mid
- 		found = true
- 	} else {
- 		// This is a pre-merge block, look later
- 		low = mid + 1
- 	}
- }
+ sort.Search(*headNumber, func(index int) bool {
+ 	header := rawdb.ReadHeader(db, rawdb.ReadCanonicalHash(db, index), index)
+ 	if header == nil {
+ 		panic(fmt.Sprintf("header %d not found", index))
+ 	}
+ 	return header.Difficulty.Sign() == 0
+ })

Contributor

@jwasinger jwasinger Mar 11, 2025

Can't we just assume the merge block is hardcoded to the Ethereum mainnet value, and assume that we are running on a pre-merged chain?


// First, read all headers up to the merge block
log.Info("Reading headers to preserve them", "count", mergeBlock)
headers := make([][]byte, mergeBlock)
Contributor

So this will be 15 GB of memory on mainnet.

}

// Re-insert the headers and hashes
log.Info("Re-inserting headers", "count", len(headers))
Contributor

I think this will not work. Yes, you can re-insert the headers, but the freezer assumes that all of its tables have the same length. So upon next boot it will truncate the headers to match the length of the blocks, receipts, etc.
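
To illustrate the concern, here is a toy model of the behaviour described in this comment. It is an assumption-laden sketch, not go-ethereum's freezer code: toyFreezer and repair are hypothetical names, and each table is reduced to an item count. The point is only that a startup step which aligns all tables to the shortest one would discard re-inserted headers as soon as the other tables are empty.

```go
package main

import "fmt"

// toyFreezer is a hypothetical model in which each table is represented
// only by the number of items it holds.
type toyFreezer struct {
	tables map[string]uint64
}

// repair truncates every table to the length of the shortest table and
// returns that length. This mimics the "all tables have the same length"
// assumption described in the review comment.
func (f *toyFreezer) repair() uint64 {
	first := true
	var shortest uint64
	for _, n := range f.tables {
		if first || n < shortest {
			shortest, first = n, false
		}
	}
	for name := range f.tables {
		f.tables[name] = shortest
	}
	return shortest
}

func main() {
	// Headers and hashes were re-inserted; bodies and receipts were truncated.
	f := &toyFreezer{tables: map[string]uint64{
		"headers":  1000,
		"hashes":   1000,
		"bodies":   0,
		"receipts": 0,
	}}
	fmt.Println(f.repair(), f.tables["headers"]) // prints "0 0"
}
```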

Contributor

It seems we have two options:

  • Relax this assumption in the freezer
  • Move the headers and hashes to a fresh freezer

Author

Thank you for pointing out the issues.

@sivaratrisrinivas
Author

I've updated the PR with optimizations that address all the concerns raised:

1. Memory Usage Issue

  • Implemented batch processing with a configurable batch size (default: 10,000)
  • Added progress reporting to show status during processing
  • Extracted batch processing logic into dedicated helper functions
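
The batching approach in this list might look roughly like the sketch below. The updated PR code is not shown in this conversation, so forEachBatch and its callback signature are hypothetical. The idea is to visit the range [0, total) in fixed-size chunks so that only one batch of headers has to be held in memory at a time.

```go
package main

import (
	"errors"
	"fmt"
)

// forEachBatch calls fn with consecutive half-open ranges [start, end)
// that together cover [0, total). It stops at the first error.
func forEachBatch(total, batchSize uint64, fn func(start, end uint64) error) error {
	if batchSize == 0 {
		return errors.New("batch size must be positive")
	}
	for start := uint64(0); start < total; {
		// Compute the end without risking overflow of start+batchSize.
		end := total
		if total-start > batchSize {
			end = start + batchSize
		}
		if err := fn(start, end); err != nil {
			return fmt.Errorf("batch [%d, %d): %w", start, end, err)
		}
		start = end
	}
	return nil
}

func main() {
	// 25 items in batches of 10 yields [0,10), [10,20), [20,25).
	err := forEachBatch(25, 10, func(start, end uint64) error {
		fmt.Printf("processing items %d..%d\n", start, end-1)
		return nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Progress reporting fits naturally inside the callback, since it is invoked once per batch with the current position.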

2. Freezer Table Length Assumption

  • Created a temporary freezer specifically for headers and hashes
  • Truncated the original freezer completely (to zero) before re-inserting headers
  • Added proper error handling when copying between freezers

3. Hardcoded Merge Block Values

  • Added useHardcodedMergeFlag flag
  • Created knownMergeBlocks map with network-specific merge block numbers
  • Extracted merge block detection into a dedicated function
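
One possible shape for the lookup described above is sketched here. The PR's actual map layout and function names are not shown in this thread, so the string network key, resolveMergeBlock, and the detect callback are assumptions. The mainnet value is given as the first proof-of-stake block and should be checked against the client's own chain configuration before being relied on.

```go
package main

import "fmt"

// knownMergeBlocks maps a network name to its first post-merge block.
// Keying by name is an assumption made for this sketch.
var knownMergeBlocks = map[string]uint64{
	"mainnet": 15537394,
}

// resolveMergeBlock returns the hardcoded merge block when useHardcoded is
// set and the network is known. Otherwise it falls back to detect, which
// stands in for a search over the chain's headers.
func resolveMergeBlock(network string, useHardcoded bool, detect func() (uint64, error)) (uint64, error) {
	if useHardcoded {
		if n, ok := knownMergeBlocks[network]; ok {
			return n, nil
		}
	}
	n, err := detect()
	if err != nil {
		return 0, fmt.Errorf("detecting merge block for %q: %w", network, err)
	}
	return n, nil
}

func main() {
	detect := func() (uint64, error) { return 42, nil }
	n, err := resolveMergeBlock("mainnet", true, detect)
	fmt.Println(n, err)
	n, err = resolveMergeBlock("devnet", true, detect)
	fmt.Println(n, err)
}
```

Falling back to detection for unknown networks keeps the command usable on private chains, at the cost of reading headers.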

Additional Improvements

  • Better Code Organization: Broke down the large function into smaller, focused helper functions
  • Enhanced User Experience: Added progress reporting for long-running operations
  • Improved Error Handling: Used %w for error wrapping
  • Configurability: Made batch size configurable via command-line flag

Please let me know your thoughts on it.

@sivaratrisrinivas
Author

Hi @s1na, I fixed the issues you mentioned and made some modifications to the code to make it more readable and maintainable. Could you please review it and share your feedback?

@fjl
Contributor

fjl commented Mar 14, 2025

We decided to work on our own version of this. Thank you for your contribution!

@fjl fjl closed this Mar 14, 2025