PartialOrd/Ord implementation possibility #35

3Hren · 2024-06-24T17:45:45Z

Hi again!

There is the final step to take full advantage of your great library.

Sometimes it is necessary to deduplicate bitsets or to build a mapping between bitsets and some values. There are two standard approaches: using HashMap or BTreeMap.

The first one requires to calculate bitset's hash, which, in case of large bitsets, can be painful. However, for BTreeMap it seems that a multilevel approach to comparing two bitsets will result in a very fast comparison, also lazy bitsets seems to perfectly fit in this case.
Since the PartialEq is already implemented it seems that it will no harm to also implement PartialOrd/Ord. Or is there something that prevents this from being implemented?

The text was updated successfully, but these errors were encountered:

tower120 · 2024-06-24T18:39:33Z

If you want to "deduplicate bitsets", why not substract? S1 - S2 will give you S1 without S2 elements.

If you want to map indices to values, you need https://github.com/tower120/hi_sparse_array . It is somewhat like hi_sparse_bitset, but last level points to the actual data. It is still in research phase.

And what is the meaning of sets order? When S1>S2?

3Hren · 2024-06-24T18:54:25Z

By deduplicating/mapping I meant that suppose we have the following mapping:

{
  0b10101 -> 0,
  0b10101 -> 1,
  0b10001 -> 2,
  ...
  0b10001 -> 100500,
}

I would like to transform it in a such way:

{
  0b10101 -> [0, 1],
  0b10001 -> [2, 100500],
  ...
}

Using BTreeMap this is trivial.

By order I mean that in fact each bitset can be represented as a very large unsigned integer. So why not allow to comparing them?

tower120 · 2024-06-24T19:24:12Z

Looks like you need hi_sparse_array's multi_merge. There is a version of it now in repository, but I rewriting it to new more space efficient data structure.

Ord is probably possible. But looks like what you want to do will be inefficient (since you'll need to process each bitset one-by-one). Again, looks like you need merge. And I don't clearly see how to implement Ord... Especially for non TRUSTED_HIERARCHY.
It's not trivial algorithm (maybe I'm wrong).

tower120 · 2024-06-24T19:29:42Z

Well, if you "just" need Ord, and don't care about speed - you probably can compare block_iter() between two sets. (Don't forget to take block starts into count as well).

3Hren · 2024-06-24T19:35:20Z

Isn't it enough to just to compare levels one-by-one? First try level0 indices, if they are equal then level1, and finally, if again equal - block-by-block?
I'm not familiar with TRUSTED_HIERARCHY internals, will take a look.

Well, if you "just" need Ord, and don't care about speed - you probably can compare block_iter() between two sets. (Don't forget to take block starts into count as well).

Of course I care, that's why I am here ;)

Anyway, thanks for the answer!

tower120 · 2024-06-24T19:49:14Z

For TRUSTED_HIERARCHY maybe that is enough (I can't see that far without practical implementation).

But if you work, for example, with lazy(!) result of intersection - raised bits in level bitmasks may lead to EMPTY data blocks. That case is called non-trusted-hierarchy. You can't trust hierarchy and just compare level bitmasks. From Eq implementation experience thou, there is a "special case" when you compare TRUSTED_HIERARCHY to !TRUSTED_HIERARCHY - then some optimization possible. Otherwise - (as far as I remember) it ends up with plain data block comparison one-by-one.

tower120 · 2024-06-24T19:53:31Z

It is just with multi_merge/fold_merge you basically OR bitmasks of all maps/arrays at once, and then just apply resolve function to intersected data items in the very last level. At least on paper that should be super-linearly faster then what you trying to do.

3Hren · 2024-06-24T20:06:47Z

Ah, I see it's more complicated than I thought, interesting nonetheless. Thanks for the clarification!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PartialOrd/Ord implementation possibility #35

PartialOrd/Ord implementation possibility #35

3Hren commented Jun 24, 2024 •

edited

Loading

tower120 commented Jun 24, 2024

3Hren commented Jun 24, 2024 •

edited

Loading

tower120 commented Jun 24, 2024 •

edited

Loading

tower120 commented Jun 24, 2024

3Hren commented Jun 24, 2024 •

edited

Loading

tower120 commented Jun 24, 2024

tower120 commented Jun 24, 2024

3Hren commented Jun 24, 2024

PartialOrd/Ord implementation possibility #35

PartialOrd/Ord implementation possibility #35

Comments

3Hren commented Jun 24, 2024 • edited Loading

tower120 commented Jun 24, 2024

3Hren commented Jun 24, 2024 • edited Loading

tower120 commented Jun 24, 2024 • edited Loading

tower120 commented Jun 24, 2024

3Hren commented Jun 24, 2024 • edited Loading

tower120 commented Jun 24, 2024

tower120 commented Jun 24, 2024

3Hren commented Jun 24, 2024

3Hren commented Jun 24, 2024 •

edited

Loading

3Hren commented Jun 24, 2024 •

edited

Loading

tower120 commented Jun 24, 2024 •

edited

Loading

3Hren commented Jun 24, 2024 •

edited

Loading