test: add a stress test for the https outcalls feature #4449

Open
mihailjianu1 wants to merge 13 commits into master

Conversation

@mihailjianu1 (Contributor) commented Mar 20, 2025

This test adds an update method to the existing proxy_canister that performs a configurable number (count) of concurrent outcalls.

However, since the number of concurrent canister messages is limited to 500, the requests have to be split into batches.

All requests are lightweight in terms of both request size and response size.
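
A minimal sketch of that batching logic, assuming the ic-cdk management-canister API; the function name, cycle amount, and constants below are illustrative and not the actual proxy_canister code:

use ic_cdk::api::management_canister::http_request::{
    http_request, CanisterHttpRequestArgument,
};

// Illustrative only: split `count` outcalls into batches so that at most
// MAX_CONCURRENT calls (the 500 canister-message limit) are in flight at once.
const MAX_CONCURRENT: u64 = 500;
const CYCLES_PER_CALL: u128 = 1_000_000_000; // placeholder cycle amount

async fn run_outcalls(count: u64, request: CanisterHttpRequestArgument) {
    let mut remaining = count;
    while remaining > 0 {
        let batch = remaining.min(MAX_CONCURRENT);
        let calls = (0..batch).map(|_| http_request(request.clone(), CYCLES_PER_CALL));
        // Wait for the whole batch to complete before starting the next one.
        futures::future::join_all(calls).await;
        remaining -= batch;
    }
}

A consequence of waiting on each full batch is that the in-flight window drains to zero at the tail of every batch, which is what future-work item 2 further below addresses.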

The test sets up two testnets, with 13 and 40 nodes respectively. For each of them it sends 200, 500, and 1000 concurrent requests and measures the average QPS for each level (averaged over 3 experiments).
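
A minimal sketch of how the per-level measurement could be driven; run_experiment is a hypothetical stand-in for the actual system-test helper and is assumed to submit n concurrent outcalls through the proxy canister and return the elapsed wall-clock time:

use std::time::Duration;

const EXPERIMENTS: u32 = 3;

// Hypothetical stand-in for the real test logic.
fn run_experiment(_n: u64) -> Duration {
    unimplemented!("submit n concurrent outcalls and measure the elapsed time")
}

fn average_qps(n: u64) -> f64 {
    let total: f64 = (0..EXPERIMENTS)
        .map(|_| n as f64 / run_experiment(n).as_secs_f64())
        .sum();
    total / f64::from(EXPERIMENTS)
}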

Results are as follows:

QPS:
node# \ concurrency_level | 200 | 500 | 1000
13:                       | 89  | 155 | 144
40:                       | 52  | 70  | 68

Interpretation:

  • The canister measures the time of each http_request call to the management canister. This includes ingress message processing, consensus, block making, and so on (a rough sketch of the timing follows this list).
  • The 40-node subnet is slower mainly because consensus takes longer and there is a higher chance that some adapters are slightly slower.
  • Not fully saturating the 500-message concurrency limit yields a worse QPS.
  • In the current setup, the QPS is also correlated with the response delay of the target server.
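
A minimal sketch of that per-call timing, reusing the illustrative names from the batching sketch above (again, not the actual canister code). ic_cdk::api::time() returns nanoseconds since the Unix epoch and advances across the await, because the reply is delivered in a later round:

async fn timed_outcall(request: CanisterHttpRequestArgument) -> u64 {
    let start = ic_cdk::api::time();
    let _response = http_request(request, CYCLES_PER_CALL).await;
    // Elapsed nanoseconds as observed by the canister; this spans the full
    // round trip through the management canister, consensus, and the adapters.
    ic_cdk::api::time() - start
}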

Colocating nodes does not seem to affect the average QPS much (it is sometimes ~10% higher or lower, with no clear indication that it is significantly better or worse), as the main bottleneck here is the 500-message limit on concurrent canister messages.

Future attempts at stressing the feature may include:

  1. Installing multiple proxy canisters on the same subnet to bypass the 500-message limit.
  2. Fully saturating the bottleneck by continuously enqueueing new requests as soon as earlier ones complete, instead of waiting for all 500 to return (a rough sketch follows below).
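
A minimal sketch of option 2, again with the illustrative names from the earlier sketches and not part of this PR: a new request is started every time one completes, so the outgoing-call window stays full instead of draining between batches:

use futures::stream::{FuturesUnordered, StreamExt};

async fn saturate(total: u64, request: CanisterHttpRequestArgument) {
    let mut in_flight = FuturesUnordered::new();
    let mut started = 0u64;
    // Fill the window up to the 500-message limit.
    while started < total.min(MAX_CONCURRENT) {
        in_flight.push(http_request(request.clone(), CYCLES_PER_CALL));
        started += 1;
    }
    // Each completion immediately frees a slot for the next request.
    while let Some(_result) = in_flight.next().await {
        if started < total {
            in_flight.push(http_request(request.clone(), CYCLES_PER_CALL));
            started += 1;
        }
    }
}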

@github-actions github-actions bot added the test label Mar 20, 2025
@mihailjianu1 mihailjianu1 marked this pull request as ready for review March 28, 2025 10:33
@mihailjianu1 mihailjianu1 requested a review from a team as a code owner March 28, 2025 10:33
@Sawchord (Contributor) left a comment

Awesome! Thanks!

@mihailjianu1 mihailjianu1 requested review from gbrel and a team March 28, 2025 13:48
@kpop-dfinity (Contributor) commented Mar 28, 2025

Great stuff!

Non-blocking comment: I haven't looked at the code yet but two more things to consider would be:

  1. Run the test on the performance cluster to simulate the production environment more closely (see for example). I'm not entirely sure whether this will work, because your setup needs 53+ nodes and I don't know how big the performance cluster is.
  2. Impose some artificial network conditions (example) to see how the QPS correlates with the latency between nodes. If you want to be even fancier, you could use ProductionSubnetTopology::UZR34 to simulate a subnet that carries heavy HTTPS outcalls traffic.

url: String,
logger: &Logger,
concurrent_requests: u64,
) -> Result<u64, anyhow::Error> {
Contributor

maybe it would be easier to use if we return std::time::Duration here?

Suggested change
) -> Result<u64, anyhow::Error> {
) -> Result<Duration, anyhow::Error> {
