Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[15/n] Make MoveVM stateless, remove environment, refactor sessions #16088

Merged
merged 5 commits into from
Mar 17, 2025

Conversation

georgemitenkov
Copy link
Contributor

@georgemitenkov georgemitenkov commented Mar 10, 2025

Description

Refactoring:

  1. MoveVM is now fully stateless. Removed VMRuntime moving methods to (de) serialize arguments to the move_vm.rs file. Wrapper calls to execute things are removed, and sessions call execute_loaded_function directly.
  2. Session no longer owns MoveVM (it is stateless anyway), this allows to get rid of some lifetimes throughout the code. IIRC, the VM was added there purely for module access, which is now moved out.
  3. As a result, MoveVMExt no longer owns MoveVM: it just uses its session creation methods.

With this change the Move VM is now completely stateless, and we should be able to swap the implementation easily.

Also, some basic renaming.

How Has This Been Tested?

Existing tests.

Key Areas to Review

Type of Change

  • Refactoring

Which Components or Systems Does This Change Impact?

  • Move/Aptos Virtual Machine

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Mar 10, 2025

⏱️ 38m total CI duration on this PR
Job Cumulative Duration Recent Runs
check-dynamic-deps 17m 🟩🟩🟩🟩🟩 (+4 more)
rust-cargo-deny 12m 🟩🟩🟩🟩🟩 (+2 more)
semgrep/ci 4m 🟩🟩🟩🟩🟩 (+5 more)
general-lints 3m 🟩🟩🟩🟩🟩 (+2 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+2 more)
permission-check 25s 🟩🟩🟩🟩🟩 (+4 more)
permission-check 22s 🟩🟩🟩🟩🟩 (+4 more)
check-branch-prefix 1s 🟩🟩

settingsfeedbackdocs ⋅ learn more about trunk.io

Copy link
Contributor Author

georgemitenkov commented Mar 10, 2025

@georgemitenkov georgemitenkov force-pushed the george/remove-feature-flag-3.13 branch from 8f6c66e to 08749a2 Compare March 13, 2025 17:51
Base automatically changed from george/remove-feature-flag-3.12 to main March 13, 2025 19:27
@georgemitenkov georgemitenkov force-pushed the george/remove-feature-flag-3.13 branch from 08749a2 to 482fb3a Compare March 14, 2025 10:34
@@ -38,7 +38,7 @@ use aptos_types::{
vm_status::StatusCode,
AptosCoinType, CoinType,
};
use aptos_vm::{AptosSimulationVM, AptosVM};
use aptos_vm::{AptosSimulationVM, AptosVm};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think VM makes sense, since it's not one word but abbreviation

Copy link
Contributor Author

@georgemitenkov georgemitenkov Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think main motivation is when you have AptosVmBlockExecutor this reads nicer than AptosVMBlockExecutor so hence the change, I think in one of @igor-aptos's PR we discussed that some time ago. But I do not have an opinion here, @gelash would you prefer VM over Vm?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it's logical because it's an abbreviation, there are cases also where it looks arguably better but yeah in VMBlock something a bit worse. Sorry, didn't remember or follow the other discussion, if we had decided that way I'm totally okay w. it. But since we are doing it now, we can also tag @zekun000 :) see what everyone thinks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also things like VMError etc, we should most likely do a separate PR even if we do change to Vm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, rolled back VM --> Vm for now - we can decide here and rename separately, so that this PR is not blocked on naming :)

.map_err(|e| e.finish(Location::Undefined))?;

// locals should not be dropped until all return values are serialized
drop(dummy_locals);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any benefit to explicitly dropping here?

Also, it's a lot cleaner now - great refactor. it's moved so a bit harder to review side by side (nothing two windows can't fix). But please do point out when things aren't just copy-paste (e.g. add a comment inline) so reviewers don't miss.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method was full copy paste (hence not addressing some minor nits initially, so it is easier to compare). Same for the drop - kept it like in the original code.

I think here the intention was to make sure locals are live past serialization, but here rust should drop them there anyway, so not sure

@georgemitenkov georgemitenkov force-pushed the george/remove-feature-flag-3.13 branch from 482fb3a to 151745f Compare March 15, 2025 09:39
@georgemitenkov georgemitenkov force-pushed the george/remove-feature-flag-3.13 branch from 151745f to 25be810 Compare March 17, 2025 17:22
@georgemitenkov georgemitenkov requested a review from gelash March 17, 2025 17:23
@georgemitenkov georgemitenkov enabled auto-merge (squash) March 17, 2025 17:56

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 5051d3d476fa47906a2968c74159b621a61defbc

two traffics test: inner traffic : committed: 12528.01 txn/s, latency: 3175.72 ms, (p50: 2700 ms, p70: 3300, p90: 4800 ms, p99: 7200 ms), latency samples: 4763480
two traffics test : committed: 100.00 txn/s, latency: 2836.05 ms, (p50: 2600 ms, p70: 3400, p90: 4500 ms, p99: 9300 ms), latency samples: 1780
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.692, avg: 0.701", "ConsensusProposalToOrdered: max: 0.319, avg: 0.310", "ConsensusOrderedToCommit: max: 0.474, avg: 0.405", "ConsensusProposalToCommit: max: 0.782, avg: 0.714"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.79s no progress at version 45520 (avg 0.21s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.97s no progress at version 1796582 (avg 0.88s) [limit 16].
Test Ok

Copy link
Contributor

✅ Forge suite compat success on 82240c9c7137087c575bf5d670abfa0dddc3ae9f ==> 5051d3d476fa47906a2968c74159b621a61defbc

Compatibility test results for 82240c9c7137087c575bf5d670abfa0dddc3ae9f ==> 5051d3d476fa47906a2968c74159b621a61defbc (PR)
1. Check liveness of validators at old version: 82240c9c7137087c575bf5d670abfa0dddc3ae9f
compatibility::simple-validator-upgrade::liveness-check : committed: 4267.33 txn/s, submitted: 4268.17 txn/s, expired: 0.84 txn/s, latency: 4048.37 ms, (p50: 3800 ms, p70: 4500, p90: 5300 ms, p99: 8300 ms), latency samples: 279185
2. Upgrading first Validator to new version: 5051d3d476fa47906a2968c74159b621a61defbc
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 2755.83 txn/s, latency: 10474.22 ms, (p50: 11300 ms, p70: 13000, p90: 13500 ms, p99: 13600 ms), latency samples: 61920
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 2792.73 txn/s, latency: 11881.10 ms, (p50: 13300 ms, p70: 14100, p90: 14200 ms, p99: 14200 ms), latency samples: 104300
3. Upgrading rest of first batch to new version: 5051d3d476fa47906a2968c74159b621a61defbc
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 2736.61 txn/s, latency: 10686.71 ms, (p50: 11400 ms, p70: 13300, p90: 13700 ms, p99: 13800 ms), latency samples: 60960
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 2734.63 txn/s, latency: 12147.08 ms, (p50: 13800 ms, p70: 14200, p90: 14500 ms, p99: 14500 ms), latency samples: 102880
4. upgrading second batch to new version: 5051d3d476fa47906a2968c74159b621a61defbc
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 5006.23 txn/s, latency: 5976.05 ms, (p50: 5400 ms, p70: 7300, p90: 8100 ms, p99: 8400 ms), latency samples: 98080
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 1851.99 txn/s, submitted: 1852.11 txn/s, expired: 0.12 txn/s, latency: 7283.34 ms, (p50: 8100 ms, p70: 8300, p90: 8600 ms, p99: 8800 ms), latency samples: 168209
5. check swarm health
Compatibility test for 82240c9c7137087c575bf5d670abfa0dddc3ae9f ==> 5051d3d476fa47906a2968c74159b621a61defbc passed
Test Ok

Copy link
Contributor

✅ Forge suite framework_upgrade success on 82240c9c7137087c575bf5d670abfa0dddc3ae9f ==> 5051d3d476fa47906a2968c74159b621a61defbc

Compatibility test results for 82240c9c7137087c575bf5d670abfa0dddc3ae9f ==> 5051d3d476fa47906a2968c74159b621a61defbc (PR)
Upgrade the nodes to version: 5051d3d476fa47906a2968c74159b621a61defbc
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1749.67 txn/s, submitted: 1754.89 txn/s, failed submission: 5.01 txn/s, expired: 5.23 txn/s, latency: 1701.82 ms, (p50: 1500 ms, p70: 1800, p90: 2400 ms, p99: 3400 ms), latency samples: 153701
Upgrade the remaining nodes to version: 5051d3d476fa47906a2968c74159b621a61defbc
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1753.99 txn/s, submitted: 1761.14 txn/s, failed submission: 7.15 txn/s, expired: 7.15 txn/s, latency: 1699.34 ms, (p50: 1600 ms, p70: 1800, p90: 2400 ms, p99: 3600 ms), latency samples: 157063
5. check swarm health
Compatibility test for 82240c9c7137087c575bf5d670abfa0dddc3ae9f ==> 5051d3d476fa47906a2968c74159b621a61defbc passed
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1639.23 txn/s, submitted: 1642.33 txn/s, failed submission: 3.10 txn/s, expired: 3.10 txn/s, latency: 1772.77 ms, (p50: 1800 ms, p70: 2000, p90: 2400 ms, p99: 3400 ms), latency samples: 147901
Test Ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants