[FEATURE REQUEST]: StratifiedStandardize for multi-output models #2739

Open
Hrovatin opened this issue Feb 10, 2025 · 3 comments
Labels
enhancement New feature or request

@Hrovatin

Motivation

Would it be possible to make StratifiedStandardize more general so that it works with multi-output models, as Standardize does? This would be needed for MultiTaskGP, which requires stratified standardisation and can work with multiple outputs.

Describe the solution you'd like to see implemented in BoTorch.

The same interface as Standardize, which has the multi-output parameters m and outputs; a rough sketch is included below.
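
For reference, here is a minimal sketch of what the requested interface might look like, assuming the multi-output version keeps Standardize's m argument alongside StratifiedStandardize's existing task_values and stratification_idx arguments. The m parameter on StratifiedStandardize is hypothetical and does not exist yet:

import torch
from botorch.models.transforms.outcome import Standardize, StratifiedStandardize

# Standardize already supports multiple outputs via `m` (and optionally `outputs`).
standardize = Standardize(m=2)

# Current StratifiedStandardize: single output, standardized separately per task.
stratified = StratifiedStandardize(
    task_values=torch.tensor([0, 1]), stratification_idx=-1
)

# Requested extension (hypothetical signature, not implemented):
# stratified_mo = StratifiedStandardize(
#     task_values=torch.tensor([0, 1]), stratification_idx=-1, m=2
# )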

Describe any alternatives you've considered to the above solution.

No response

Is this related to an existing issue in BoTorch or another repository? If so please include links to those Issues here.

No response

Pull Request

None

Code of Conduct

  • I agree to follow BoTorch's Code of Conduct
@Hrovatin Hrovatin added the enhancement New feature or request label Feb 10, 2025
@sdaulton
Contributor

Hi @Hrovatin,

This would be needed for MultiTaskGP, which requires stratified standardisation and can work with multiple outputs.

StratifiedStandardize works with MultiTaskGP if you include the task values in X when calling MultiTaskGP.posterior, since that makes the posterior a single-output posterior. I agree that it would be good to support the case where there are multiple outputs too. Would you be willing to put up a PR? Here is an example of the current single-output support with MTGP:

import math

import matplotlib.pyplot as plt
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models.multitask import MultiTaskGP
from botorch.models.transforms.outcome import StratifiedStandardize
from gpytorch.mlls.exact_marginal_log_likelihood import ExactMarginalLogLikelihood

tkwargs = {"dtype": torch.double}
torch.manual_seed(0)
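# Two tasks: the last column of X is the task index (0 for the first 20 points, 1 for the rest).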
X_task = torch.zeros(40, 1, **tkwargs)
X_task[(X_task.shape[0] // 2) :] = 1
X = torch.cat(
    [torch.rand_like(X_task), X_task],
    dim=-1,
)
Y = torch.sin(2 * math.pi * X[..., :1]) + 5 * X_task + X[..., :1]


task0 = (X_task == 0).view(-1)
plt.plot(X[task0, 0], Y[task0], ".", ms=10)
task1 = (X_task == 1).view(-1)
plt.plot(X[task1, 0], Y[task1], ".", ms=10)
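# StratifiedStandardize standardizes Y separately for each task value; column 1 of X is the task feature.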
model = MultiTaskGP(
    X,
    Y,
    task_feature=1,
    outcome_transform=StratifiedStandardize(
        task_values=X_task.unique().long(), stratification_idx=1
    ),
)

mll = ExactMarginalLogLikelihood(model.likelihood, model)
_ = fit_gpytorch_mll(mll)

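# Test inputs for each task: append the task index (0 or 1) as the last column.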
test_X = torch.linspace(0, 1, 101, **tkwargs).unsqueeze(1)
test_X = torch.cat(
    [
        test_X,
        torch.zeros(101, 1, **tkwargs),
    ],
    dim=-1,
)
test_X2 = torch.linspace(0, 1, 101, **tkwargs).unsqueeze(1)
test_X2 = torch.cat(
    [
        test_X2,
        torch.ones(101, 1, **tkwargs),
    ],
    dim=-1,
)
test_X_all = torch.cat([test_X, test_X2], dim=0)
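# Because the task column is included in test_X_all, posterior() returns a single-output posterior at each test point.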

with torch.no_grad():
    posterior = model.posterior(test_X_all)

plt.plot(test_X_all[:101, 0], posterior.mean[:101], label="task 0 posterior mean")
plt.plot(test_X_all[101:, 0], posterior.mean[101:], label="task 1 posterior mean")

@Hrovatin
Author

Thank you very much for the response. I think I wasn't clear: I meant that using StratifiedStandardize would prevent using the multi-output option of MultiTaskGP, if I am not mistaken.

We are currently testing MultiTaskGP with StratifiedStandardize; depending on how that affects performance, we will decide whether to use it. If we do, we would need multi-output support, so in that case I would probably implement it.

Another question: what is your opinion on using StratifiedStandardize in a transfer learning setting where one domain may have far fewer measured data points, which may also be biased towards a specific region of the parameter space/output space? Did you ever observe that stratified output scaling could in fact be problematic when tasks do not cover the same regions of the output space?

@sdaulton
Contributor

Did you ever observe that stratified output scaling could in fact be problematic when tasks do not cover the same regions of the output space?

I haven't observed that. The task-task covariance should be able to capture tasks being on different scales. It seems like tasks not covering the same region of the output space would have the same effect with or without standardizing each task independently.
