[feat] Optimize HSTU training and sampling process #93
base: master
Conversation
…e similarity calculation in MatchModel; refine HardNegativeSampler documentation
…fig in model.proto
… handling and configuration
…STU integration tests
@@ -40,3 +40,12 @@ docs/source/intro.md
docs/source/proto.html

.vscode/
graphlearn*
remove these
@@ -201,8 +201,8 @@ def _get_dataloader(
dataloader = DataLoader(
dataset=dataset,
batch_size=None,
pin_memory=data_config.pin_memory if mode != Mode.PREDICT else False,
collate_fn=lambda x: x,
# pin_memory=data_config.pin_memory if mode != Mode.PREDICT else False,
remove the commented-out code instead of keeping it as a comment
@@ -237,6 +240,7 @@ def launch_sampler_cluster(
multival_sep=self._fg_encoded_multival_sep
if self._fg_mode == data_pb2.FgMode.FG_NONE
else chr(29),
seq_str_delim=self._seq_str_delim,
rename seq_str_delim to item_id_delim, and make it a parameter of sampler_config instead of plumbing it through here
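A minimal sketch of that suggestion (class name, constructor signature, and the config field name are placeholders, not the repo's actual symbols): the sampler reads the delimiter from its own config rather than receiving a separate seq_str_delim argument.

# Sketch only: assumes a string field `item_id_delim` is added to the
# sampler config message; all names below are hypothetical.
class HSTUNegativeSampler(BaseSampler):
    def __init__(self, config, fields, batch_size, is_training=True):
        super().__init__(config, fields, batch_size, is_training)
        # Fall back to the ASCII group separator used elsewhere in the
        # dataset code when the config field is left empty.
        self._item_id_delim = config.item_id_delim or chr(29)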
features = self._parse_nodes(nodes)
result_dict = dict(zip(self._attr_names, features))
return result_dict
# ids = np.pad(ids, (0, self._batch_size - len(ids)), "edge")
revert this change; restore the np.pad call instead of leaving it commented out
@@ -338,6 +338,8 @@ def _train_and_evaluate(
ckpt_path: Optional[str] = None,
eval_result_filename: str = "train_eval_result.txt",
) -> None:
torch.backends.cuda.matmul.allow_tf32 = True
why should we allow TF32 here?
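For reference, one way to address that question is to gate the switch on config rather than enabling it unconditionally inside _train_and_evaluate. A minimal sketch, assuming a hypothetical allow_tf32 flag in the train config (no such field exists in the repo today):

import torch


def _maybe_enable_tf32(train_config) -> None:
    # `allow_tf32` is a hypothetical config field, used only for illustration.
    if getattr(train_config, "allow_tf32", False):
        # Real PyTorch switches: trade a little matmul/conv precision on
        # Ampere+ GPUs for higher throughput.
        torch.backends.cuda.matmul.allow_tf32 = True
        torch.backends.cudnn.allow_tf32 = True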
self._loss_collection, self.item_tower.group_variational_dropout_loss
)
batch_sparse_features = batch.sparse_features["__BASE__"]
nonzero_indices = torch.where(
why do we need nonzero_indices?
)[0]
default_value = torch.tensor([-1]).to(nonzero_indices.device)
batch_size = torch.cat([nonzero_indices, default_value]).max() + 1
neg_sample_size = batch_sparse_features.lengths()[-1] - 1
add detailed comments explaining this design
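One possible commented version of the added lines. The condition inside torch.where() is not visible in this hunk, so both the condition and the intent described below are a reading of the design, not a statement of fact:

# Assumed condition: rows with a nonzero sparse-feature length are the "real"
# user rows, so the largest such index marks the end of the effective batch.
nonzero_indices = torch.where(batch_sparse_features.lengths() > 0)[0]
# Appending -1 keeps max() well defined when every length is zero; in that
# case batch_size falls back to 0 instead of raising on an empty tensor.
default_value = torch.tensor([-1]).to(nonzero_indices.device)
batch_size = torch.cat([nonzero_indices, default_value]).max() + 1
# The last bag holds the positive item plus the shared negatives, so the
# negative sample count is its length minus one.
neg_sample_size = batch_sparse_features.lengths()[-1] - 1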
@@ -181,18 +181,28 @@ def sim(
user_emb: torch.Tensor,
item_emb: torch.Tensor,
neg_for_each_sample: bool = False,
is_hstu: bool = False,
do not modify sim here; override sim in hstu.py instead, and explain the logic in comments
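A minimal sketch of that suggestion (the class name, import path, and the override body are assumptions; the point is the pattern of overriding instead of branching on is_hstu in the base method):

import torch

from tzrec.models.match_model import MatchModel  # import path is an assumption


class HSTUMatch(MatchModel):  # hypothetical class in hstu.py
    def sim(
        self,
        user_emb: torch.Tensor,
        item_emb: torch.Tensor,
        neg_for_each_sample: bool = False,
    ) -> torch.Tensor:
        # Put the HSTU-specific similarity here (whatever the removed
        # `is_hstu` branch computed in the base sim) and explain in comments
        # how and why it differs from the base implementation.
        return torch.matmul(user_emb, item_emb.t())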
@@ -317,6 +321,92 @@ def _build_batch(self, input_data: Dict[str, pa.Array]) -> Batch:
input_data = _expand_tdm_sample(
input_data, pos_sampled, neg_sampled, self._data_config
)
elif self._enable_hstu:
seq_attr = self._sampler._item_id_field
if pa.types.is_string(input_data[seq_attr].type):
Make lines 326-376 a function and move it to datasets/utils.py. Add comments to explain the logic and add a unit test for the function.
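A sketch of the shape such a helper could take (the function name is made up, and the body shown is only a stand-in for whatever lines 326-376 currently do inside _build_batch):

from typing import Dict

import pyarrow as pa
import pyarrow.compute as pc


def split_hstu_seq_column(
    input_data: Dict[str, pa.Array], seq_attr: str, delim: str
) -> Dict[str, pa.Array]:
    """Split a delimiter-joined item-id sequence column before HSTU sampling.

    Placeholder body: the real implementation would be the code moved
    verbatim out of _build_batch, with comments explaining each step.
    """
    if pa.types.is_string(input_data[seq_attr].type):
        input_data[seq_attr] = pc.split_pattern(input_data[seq_attr], pattern=delim)
    return input_data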
pa.array(input_data_k_split.offsets.to_numpy()[1:] - 1)
)
sampled = self._sampler.get(input_data)
for k, v in sampled.items():
Make lines 377-409 a function and move it to datasets/utils.py. Add comments to explain the logic and add a unit test for the function.
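And a sketch of the kind of unit test the extracted helpers could get (the module path and helper name follow the hypothetical sketch above):

import unittest

import pyarrow as pa

# Hypothetical import, matching the sketch above.
from tzrec.datasets.utils import split_hstu_seq_column


class SplitHstuSeqColumnTest(unittest.TestCase):
    def test_splits_string_sequence(self) -> None:
        input_data = {"item_id": pa.array(["1|2|3", "4|5"])}
        out = split_hstu_seq_column(input_data, seq_attr="item_id", delim="|")
        self.assertEqual(
            out["item_id"].to_pylist(), [["1", "2", "3"], ["4", "5"]]
        )


if __name__ == "__main__":
    unittest.main()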
No description provided.