
[feat] Optimize HSTU training and sampling process #93

Open · wants to merge 11 commits into base: master

Conversation

iWelkin-coder (Collaborator)

No description provided.

@@ -40,3 +40,12 @@ docs/source/intro.md
docs/source/proto.html

.vscode/
graphlearn*

remove these

@@ -201,8 +201,8 @@ def _get_dataloader(
dataloader = DataLoader(
dataset=dataset,
batch_size=None,
pin_memory=data_config.pin_memory if mode != Mode.PREDICT else False,
collate_fn=lambda x: x,
# pin_memory=data_config.pin_memory if mode != Mode.PREDICT else False,

remove the commented-out lines
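For context, the line being commented out selects pinned host memory per run mode. A minimal sketch of that selection logic (`Mode` and the `pin_memory` flag follow the diff; `resolve_pin_memory` is a hypothetical helper for illustration only):

```python
from enum import Enum

class Mode(Enum):
    TRAIN = 1
    EVAL = 2
    PREDICT = 3

def resolve_pin_memory(mode: Mode, pin_memory: bool) -> bool:
    # Pinned (page-locked) host memory speeds up host-to-device copies
    # during training/evaluation; prediction skips it.
    return pin_memory if mode != Mode.PREDICT else False
```

Restoring the original expression (rather than keeping it as a comment) preserves faster GPU transfers during training.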

@@ -237,6 +240,7 @@ def launch_sampler_cluster(
multival_sep=self._fg_encoded_multival_sep
if self._fg_mode == data_pb2.FgMode.FG_NONE
else chr(29),
seq_str_delim=self._seq_str_delim,

seq_str_delim -> item_id_delim; it should be a param of sampler_config
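A minimal sketch of the suggested change, assuming a dataclass-style config (`SamplerConfig` and the default delimiter are hypothetical; the project's real sampler config is defined elsewhere):

```python
from dataclasses import dataclass

@dataclass
class SamplerConfig:
    # Suggested rename: expose the delimiter that separates item ids
    # on the sampler config instead of a dataset-level seq_str_delim.
    item_id_delim: str = ";"
```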

features = self._parse_nodes(nodes)
result_dict = dict(zip(self._attr_names, features))
return result_dict
# ids = np.pad(ids, (0, self._batch_size - len(ids)), "edge")

revert it

@@ -338,6 +338,8 @@ def _train_and_evaluate(
ckpt_path: Optional[str] = None,
eval_result_filename: str = "train_eval_result.txt",
) -> None:
torch.backends.cuda.matmul.allow_tf32 = True

why should we allow TF32?
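For background on the question: TF32 keeps FP32's 8 exponent bits but truncates the mantissa from 23 to 10 bits, trading precision for much faster matmuls on Ampere-and-newer GPUs. A pure-Python sketch of that mantissa truncation (`to_tf32` is illustrative only, not the hardware path):

```python
import struct

def to_tf32(x: float) -> float:
    # Round-trip through float32 bits and zero the low 13 mantissa
    # bits, leaving the 10 explicit mantissa bits that TF32 retains.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~((1 << 13) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Values whose mantissa fits in 10 bits survive unchanged; finer increments are dropped, which is why the setting deserves a justifying comment in the code.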

self._loss_collection, self.item_tower.group_variational_dropout_loss
)
batch_sparse_features = batch.sparse_features["__BASE__"]
nonzero_indices = torch.where(

why do we need nonzero_indices?

)[0]
default_value = torch.tensor([-1]).to(nonzero_indices.device)
batch_size = torch.cat([nonzero_indices, default_value]).max() + 1
neg_sample_size = batch_sparse_features.lengths()[-1] - 1

add comments that explain the design in detail
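A plain-Python sketch of what such a comment could document, mirroring the torch.where / default `-1` logic in the diff above (`infer_sizes` is illustrative; the real code operates on tensor lengths):

```python
def infer_sizes(lengths):
    # Rows with at least one sparse value; the highest such row index
    # plus one recovers the effective batch size even when trailing
    # rows are empty.
    nonzero = [i for i, n in enumerate(lengths) if n > 0]
    # The -1 default makes an all-empty batch yield size 0.
    batch_size = (max(nonzero) if nonzero else -1) + 1
    # The last row holds the positive item plus its negatives.
    neg_sample_size = lengths[-1] - 1
    return batch_size, neg_sample_size
```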

@@ -181,18 +181,28 @@ def sim(
user_emb: torch.Tensor,
item_emb: torch.Tensor,
neg_for_each_sample: bool = False,
is_hstu: bool = False,

do not modify sim; override it in hstu.py instead, and explain the logic in comments
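A sketch of the suggested structure: override `sim` in an HSTU subclass rather than branching on an `is_hstu` flag in the shared implementation (class names and the temperature scaling are hypothetical stand-ins for the real model code):

```python
class MatchModel:
    def sim(self, user_emb, item_emb):
        # Base similarity: per-pair dot product.
        return [
            sum(u * i for u, i in zip(ue, ie))
            for ue, ie in zip(user_emb, item_emb)
        ]

class HSTUMatch(MatchModel):
    temperature = 0.5  # hypothetical HSTU-specific scaling

    def sim(self, user_emb, item_emb):
        # HSTU-specific behaviour lives here, with its own explanatory
        # comments, leaving the shared sim signature untouched.
        return [s / self.temperature for s in super().sim(user_emb, item_emb)]
```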

@@ -317,6 +321,92 @@ def _build_batch(self, input_data: Dict[str, pa.Array]) -> Batch:
input_data = _expand_tdm_sample(
input_data, pos_sampled, neg_sampled, self._data_config
)
elif self._enable_hstu:
seq_attr = self._sampler._item_id_field
if pa.types.is_string(input_data[seq_attr].type):

Make lines 326-376 a function and move it to datasets/utils.py. Add comments to explain the logic, and add a unit test for the function.
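A sketch of the requested shape: the sequence-splitting logic extracted into a standalone helper plus a unit test (the `chr(29)` delimiter follows the diff context; `split_id_sequence`, its module placement, and the test class name are hypothetical):

```python
import unittest

def split_id_sequence(value, delim=chr(29)):
    # Split an encoded item-id sequence into individual ids;
    # an empty string yields an empty list rather than [""].
    return value.split(delim) if value else []

class SplitIdSequenceTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(split_id_sequence("a\x1db\x1dc"), ["a", "b", "c"])

    def test_empty(self):
        self.assertEqual(split_id_sequence(""), [])

if __name__ == "__main__":
    unittest.main()
```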

pa.array(input_data_k_split.offsets.to_numpy()[1:] - 1)
)
sampled = self._sampler.get(input_data)
for k, v in sampled.items():

Make lines 377-409 a function and move it to datasets/utils.py. Add comments to explain the logic, and add a unit test for the function.
