
Add IQN implementation #1784

Merged · 67 commits into awslabs:dev from iqn · Oct 25, 2022

Conversation

@kashif kashif (Contributor) commented Nov 21, 2021

Issue #, if available:

Description of changes:

Implemented the IQN distribution output head from the paper https://arxiv.org/abs/2107.03743

fixes #1643
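As background for the change, a minimal sketch of an IQN-style head (hypothetical class and parameter names, not the GluonTS implementation): sample quantile levels tau uniformly, embed them with cosine features, modulate the network's feature vector with that embedding, and train the predicted quantiles with the pinball loss.

```python
import math
import torch
import torch.nn as nn

class QuantileEmbedding(nn.Module):
    """Cosine embedding of quantile levels tau (illustrative sketch)."""
    def __init__(self, embed_dim: int, n_cos: int = 64):
        super().__init__()
        freqs = torch.arange(1, n_cos + 1, dtype=torch.float32) * math.pi
        self.register_buffer("freqs", freqs)
        self.proj = nn.Linear(n_cos, embed_dim)

    def forward(self, taus: torch.Tensor) -> torch.Tensor:
        # taus: (batch, 1) -> cosine features (batch, n_cos) -> (batch, embed_dim)
        return torch.relu(self.proj(torch.cos(taus * self.freqs)))

class IQNHead(nn.Module):
    """Predicts the tau-quantile by modulating features with the tau embedding."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.tau_embed = QuantileEmbedding(embed_dim)
        self.out = nn.Linear(embed_dim, 1)

    def forward(self, features: torch.Tensor, taus: torch.Tensor) -> torch.Tensor:
        # elementwise modulation, then a linear projection to a scalar quantile
        return self.out(features * self.tau_embed(taus)).squeeze(-1)

def pinball_loss(obs, pred, taus):
    # penalize under-prediction by tau and over-prediction by 1 - tau
    return ((taus - (obs < pred).float()) * (obs - pred)).mean()
```

A call might look like `pinball_loss(targets, head(feats, taus), taus.squeeze(-1))` with `taus = torch.rand(batch, 1)` resampled each step.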

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Please tag this PR with at least one of these labels to make our release process faster: BREAKING, new feature, bug fix, other change, dev setup

@kashif kashif requested a review from a team as a code owner November 22, 2021 18:18
@kashif kashif requested a review from a team as a code owner December 3, 2021 13:06
@lostella lostella added the new feature (one of pr required labels) label Feb 17, 2022
@lostella lostella (Contributor) left a comment

@kashif looks good to me, I would ask just for a few minor changes (see also the comment about pinball loss)

Comment on lines 132 to 136
# penalize by tau for over-predicting
# and by 1-tau for under-predicting
return (self.taus - (self.outputs < value).float()) * (
    self.outputs - value
)
@lostella lostella (Contributor) Oct 25, 2022

Sorry, got confused: it should be

return (self.taus - (value < self.outputs).float()) * (value - self.outputs)

Proof:

import torch
import numpy as np

def pinball_loss(obs, pred, alpha):
    # tau-weighted absolute error; its expectation is minimized
    # at the alpha-quantile of obs
    return (alpha - (obs < pred).float()) * (obs - pred)

def subgradient_method(f, p, maxit):
    # minimize f by subgradient descent with diminishing steps 1/(k+1)
    for k in range(maxit):
        v = f(p)
        v.backward()
        with torch.no_grad():
            p -= 1 / (k + 1) * p.grad
            p.grad.zero_()
    return p

data = np.random.normal(size=(1000,))
f = lambda q: pinball_loss(torch.from_numpy(data), q, 0.1).mean()
q = subgradient_method(f, torch.tensor(0.0, requires_grad=True), 1000)
print(q)  # converges towards the empirical 0.1-quantile of data
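A quick cross-check of the same point (a sketch, not part of the PR): evaluating the corrected loss over a grid of candidate predictions, the minimizer lands near the empirical 0.1-quantile, which for standard normal data is about -1.28.

```python
import numpy as np

def pinball_loss(obs, pred, alpha):
    # corrected sign convention: nonnegative, minimized at the alpha-quantile
    return np.mean((alpha - (obs < pred)) * (obs - pred))

rng = np.random.default_rng(0)
data = rng.normal(size=100_000)
alpha = 0.1

# evaluate the loss on a grid of candidate predictions
grid = np.linspace(-3.0, 3.0, 601)
losses = [pinball_loss(data, q, alpha) for q in grid]
q_star = grid[np.argmin(losses)]

print(q_star, np.quantile(data, alpha))  # both close to -1.28
```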

@kashif kashif (Contributor, Author) commented Oct 25, 2022

Awesome, thanks @lostella! Fixed.

@lostella lostella (Contributor) left a comment

🚢 thanks @kashif!

@lostella lostella changed the title IQN implementation Add IQN implementation Oct 25, 2022
@lostella lostella enabled auto-merge (squash) October 25, 2022 13:20
@kashif kashif (Contributor, Author) commented Oct 25, 2022

I thank you!

@lostella lostella merged commit ab911fe into awslabs:dev Oct 25, 2022
@kashif kashif deleted the iqn branch October 25, 2022 14:27
Labels: new feature (one of pr required labels)

Successfully merging this pull request may close this issue: Add IQN probabilistic head to the repo