
Fix support for sparse labels in MatthewsCorrelationCoefficient #2495

Closed
lgeiger wants to merge 1 commit

Conversation

@lgeiger (Contributor) commented Jun 2, 2021

Description

This PR adds support for sparse labels and predictions in MatthewsCorrelationCoefficient. This re-enables the use of num_classes=1 which was broken in #2406.

Fixes #2494

Type of change

Checklist:

  • I've properly formatted my code according to the guidelines
    • By running Black + Flake8
    • By running pre-commit hooks
  • This PR addresses an already submitted issue for TensorFlow Addons
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • This PR contains modifications to C++ custom-ops

@bot-of-gabrieldemarmiesse

@autoih @marload

You are owners of some files modified in this pull request.
Would you kindly review the changes whenever you have the time to?
Thank you very much.

Comment on lines +39 to +47
def test_binary_classes_sparse():
gt_label = tf.constant([[1.0], [1.0], [1.0], [0.0]], dtype=tf.float32)
preds = tf.constant([[1.0], [0.0], [1.0], [1.0]], dtype=tf.float32)
# Initialize
mcc = MatthewsCorrelationCoefficient(1)
# Update
mcc.update_state(gt_label, preds)
# Check results
check_results(mcc, [-0.33333334])
@lgeiger (Contributor Author) commented Jun 2, 2021

This restores the behaviour of v0.12 which was removed here

@jonpsy (Contributor) commented Jun 3, 2021

I'm not sure the above example is sparse; can you report the result of the following input?

 gt_label = tf.constant([1, 1, 1, 0], dtype=tf.float32)
 preds = tf.constant([1, 0, 1, 1], dtype=tf.float32)

@lgeiger (Contributor Author) commented Jun 3, 2021

Yeah I don't think this would work correctly with the current code. But the motivation of this PR was to restore the behaviour of v0.12 and earlier. I'd be happy to change it to support the above example though. Is there a place where the conventions of the metrics are documented? /cc @seanpmorgan

Contributor

In TensorFlow (and in most libraries), sparse labels are tensors of rank 1. That means their dimensionality is (num_samples,) and not (num_samples, 1).
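
For illustration, a minimal shape comparison (hypothetical tensors, not from this PR):

import tensorflow as tf

sparse_labels = tf.constant([1, 0, 1, 1])          # rank 1, shape (4,)
column_labels = tf.constant([[1], [0], [1], [1]])  # rank 2, shape (4, 1)
print(sparse_labels.shape, column_labels.shape)    # (4,) (4, 1)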

As for conventions, you can check the README.md in metrics to get a brief idea of how things work. You can also check metrics.py from official TensorFlow for code conventions, as we follow the same standard. There's also an exquisite set of metrics available in our metrics folder which will give you a good idea.

Let me know if this helps, 👍

Comment on lines 82 to 83
y_true = tf.cast(y_true, dtype=self.dtype)
y_pred = tf.cast(y_pred, dtype=self.dtype)
Contributor

Instead of rounding later on, I suggest casting these to tf.int64.
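
A minimal sketch of this suggestion, assuming the same y_true/y_pred variables as in the diff above (not the merged code):

# Cast to an integer dtype up front instead of rounding floats later.
y_true = tf.cast(y_true, dtype=tf.int64)
y_pred = tf.cast(y_pred, dtype=tf.int64)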

@jonpsy (Contributor) commented Jun 3, 2021

The test doesn't "break" per se; it wasn't intended to be used for sparse labels in the first place. See this comment. I gather that you're trying to add sparse support with multi and binary labels? This code will serve as a good reference for library convention.

Most importantly, it doesn't make sense to have num_classes=1; even if you have one class, it is still represented with two, one indicating its presence and the other its absence (think of it like Yes/No). So the current code rightly "breaks" your edge case.
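
To illustrate the two-class view with the inputs from the test above, a hypothetical confusion-matrix computation (not code from this PR):

import tensorflow as tf

# Even a single "class of interest" yields a 2x2 confusion matrix: absent vs. present.
cm = tf.math.confusion_matrix([1, 1, 1, 0], [1, 0, 1, 1], num_classes=2)
print(cm.numpy())  # [[0 1]
                   #  [1 2]]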

Comment on lines 69 to 75
    ):
        """Creates a Matthews Correlation Coefficient instance."""
        super().__init__(name=name, dtype=dtype)
-       self.num_classes = num_classes
+       self.num_classes = max(2, num_classes)
        self.conf_mtx = self.add_weight(
            "conf_mtx",
            shape=(self.num_classes, self.num_classes),
Contributor

Instead of max(2, num_classes), add a check that detects whether num_classes is less than 2 and throws an appropriate error. See cohen_kappa.py for this.
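
A hedged sketch of what such a check could look like in __init__ (the exact error message is an assumption; see cohen_kappa.py for the reference):

# Reject num_classes < 2 up front instead of silently clamping it.
if num_classes < 2:
    raise ValueError("Number of classes must be greater than or equal to two")
self.num_classes = num_classes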

@jneeven commented Jun 3, 2021

Most importantly, it doesn't make sense to have num_classes=1; even if you have one class, it is still represented with two, one indicating its presence and the other its absence (think of it like Yes/No). So the current code rightly "breaks" your edge case.

@jonpsy I strongly disagree. If there is just a single class, it is very common to have only a single (sigmoid) output representing the confidence that the input is a positive. It's the reason the BinaryCrossentropy loss exists, and it's a totally valid use case. In fact, I'd argue that it doesn't make sense to have two outputs if you only have a single class, because you can represent the same prediction with just a single number without losing any information. Additionally, num_classes=1 was actually the "default" used in the original docstring, so saying that this update has rightly broken that use case seems strange to me. Changing the concept of binary predictions from num_classes=1 to num_classes=2 unnecessarily breaks backward compatibility, as both concepts were previously supported (ignoring the fact that the multi-class case was apparently bugged).
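
For illustration, a minimal single-output binary setup of the kind described above (hypothetical model, not code from this PR):

import tensorflow as tf

# One sigmoid unit per sample fully determines both P(positive) and P(negative).
model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss=tf.keras.losses.BinaryCrossentropy())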

@lgeiger (Contributor Author) commented Jun 3, 2021

The test doesn't "break" per se; it wasn't intended to be used for sparse labels in the first place. See this comment. I gather that you're trying to add sparse support with multi and binary labels?

To be honest, I am not very familiar with this code and the conventions for TFA metrics. But I disagree with your assessment that this behaviour was never intended. It has been the behaviour in all releases between v0.7 and v0.12, including being tested in the unittest, and the functionality was removed in #2406 while changing the unittest.
I think there is definitely a point to discuss about how to handle this in a consistent way across the other metrics. But as far as I can tell this functionality was removed in #2406 without additional discussion mentioning that this is a breaking change for users, or am I missing something here?

I don't want to advocate for never making breaking changes (in fact semver definitely allows for that when moving from 0.12 to 0.13), but then it needs to be documented in the release notes and ideally the code should throw an actionable error message. That currently isn't the case: the release notes only mention that there was a "fix for MCC", and the metric will now return 0 for num_classes=1, which can be tricky to debug in userspace.

I also agree with the point @jneeven raises about supporting num_classes=1 being a sensible thing.

For me, there are two main questions about this PR:

  • Do we want to restore the behaviour of v0.12 and earlier, or do we want to support the behaviour mentioned in #2495 (comment)?
  • Do we want to support num_classes=1 or not? I do think it makes sense, since it is easy to support. But if we are breaking backwards compatibility anyway, we could also make the choice of only supporting num_classes=2 with sparse labels, which would be an easy code change in user space but would break with the conventions of e.g. the BinaryCrossentropy loss, which @jneeven detailed above.

@lgeiger lgeiger changed the title Add support for sparse labels in MatthewsCorrelationCoefficient Fix support for sparse labels in MatthewsCorrelationCoefficient Jun 4, 2021
@jonpsy (Contributor) commented Jun 5, 2021

@jonpsy I strongly disagree. If there is just a single class, it is very common to have only a single (sigmoid) output representing the confidence that the input is a positive. It's the reason the BinaryCrossentropy loss exists, and it's a totally valid use case. In fact, I'd argue that it doesn't make sense to have two outputs if you only have a single class, because you can represent the same prediction with just a single number without losing any information.

It looks like you have confused the probability of a class occurring with the number of distinguishable classes. In your example itself, please note the word "Binary", which itself suggests a two-class problem. Label representation (via sparse or OHE) and the probability of a class occurring are two different things; the former is what's used in metric implementations. For instance, consider the following example:

import tensorflow as tf
m = tf.keras.metrics.MeanIoU(num_classes=1)
m.update_state([1.0, 1.0, 1.0, 1.0], [0.3, 0.7, 0.8, 0.9])
m.result().numpy()

As per your line of reasoning, this is perfectly fine since we're using a "single number to represent the class". As expected, this throws an error. I believe this should suffice, unless of course the official TensorFlow implementation is wrong.

But what about backward compatibility?

But I disagree with your assessment that this behaviour was never intended. It has been the behaviour in all releases between v0.7 and v0.12 including it being tested in the unittest. And the functionality was removed in #2406 while changing the unittest.

There's no "backward compatibility" issue here; the previous implementation was invalid in the doctest, simple as that. It's not an assessment, it's a fact. Bluntly put, you've restored an invalid case and changed the current code to "work" on that case. Although I concede that we should've discussed it in our PR.

I also agree that we should've thrown an error when this invalid case is invoked; for example, cohen_kappa.py does this:

if num_classes == 2:
    self._update = self._update_binary_class_model
elif num_classes > 2:
    self._update = self._update_multi_class_model
else:
    raise ValueError(
        """Number of classes must be
        greater than or equal to two"""
    )

Which I've already suggested in a previous comment on this PR.

I've done my best to explain this, and if you're still having trouble understanding, I suppose @bhack could help you out here, or you can post it to StackOverflow.

If you want to add support for sparse labels, I would be happy to help you in that regard 👍

@seanpmorgan (Member)

Thank you for your contribution. We sincerely apologize for any delay in reviewing, but TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision:
TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with similar charters to TFA:
Keras
Keras-CV
Keras-NLP

@seanpmorgan seanpmorgan closed this Mar 1, 2023