Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reader: optimize BackoffTimer using float #156

Merged
merged 1 commit into from
Jun 6, 2016

Conversation

ploxiln
Copy link
Member

@ploxiln ploxiln commented Jun 4, 2016

my version of #155

performance tested with @virtuald's rawperf.py on a plugged-in 2013 macbook pro retina 13"

  • master: about 16s
  • optim_backofftimer: about 12s

Maybe a refactor - to not call these methods so much - is what the library really needs, but this was a fun easy exercise 😁

@mreiferson
Copy link
Member

I still think this is silly, the culprit is clearly (at least) this line https://github.com/nsqio/pynsq/blob/master/nsq/reader.py#L358, if that performs no decimal calculations how much does that buy us?

FWIW this PR might be solving exactly that, since it caches the value, but do we need all the other noise?

@mreiferson mreiferson changed the title optimize BackoffTimer using float reader: optimize BackoffTimer using float Jun 4, 2016
@ploxiln
Copy link
Member Author

ploxiln commented Jun 4, 2016

I think using Decimal for this purpose is silly. This code doesn't really care what kind of float. Isn't it a simplification to not use Decimal? To not convert float -> string -> Decimal -> float?

It just happens that the numbers used for short_length and long_length have both 2 and 5 as prime factors instead of just 2, so base 10 avoids a rounding errors in short_unit and long_unit that compound, that cause you to be an iota off from 0 after a test does the same number of failure() and success() calls. (Also the fact that the test uses min_interval=0.1, another contributor to a 5 factor in the denominator.)

I'll run the simple benchmark with just the get_interval() caching just to see. But @virtuald's profiling suggested that success()` was also called a lot more often than we would expect (not just when in backoff state).

@mreiferson
Copy link
Member

I don't disagree about decimal/float, but we should first make sure it's called only when necessary and then optimize it if necessary.

This code is ancient Bitly code pre-dating pynsq.

@ploxiln ploxiln force-pushed the optim_backofftimer branch from b02ada4 to d739920 Compare June 4, 2016 20:35
@ploxiln
Copy link
Member Author

ploxiln commented Jun 4, 2016

Putting everything back, and then caching just get_interval(), benchmark run-time was about 14.5 seconds. Adding short-circuiting of success() brought that down to 12 seconds. So right now, success() is actually more significant because it has more Decimal operations. (I think this reader benchmark is constantly on the edge of max_in_flight, which I'm guessing is related to the "continue" event.)

I also think it's weird that the test uses min_interval=0.1, but actual pynsq reader always uses min_interval=0, and that's actually a significant number to a branch in the reader code. So if I were to do the "big cleanup" option for this code ... I would also get rid of min_interval :)

@jehiah jehiah added performance and removed feature labels Jun 5, 2016
@ploxiln
Copy link
Member Author

ploxiln commented Jun 6, 2016

let me know if/when to squash

@mreiferson
Copy link
Member

🔨

cache the calculation+conversion for get_interval()
if get_interval() is 0, then success() can be short-circuited
@ploxiln ploxiln force-pushed the optim_backofftimer branch from d739920 to f632b56 Compare June 6, 2016 06:06
@virtuald
Copy link
Contributor

virtuald commented Jun 6, 2016

I'll run this locally later today and see how it shows up in the profiling output.

@mreiferson mreiferson merged commit 4cdb4eb into nsqio:master Jun 6, 2016
@mreiferson
Copy link
Member

thanks!

@ploxiln ploxiln deleted the optim_backofftimer branch May 1, 2017 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants