
Reset node button does not update green status icon #104

Open
pmrv opened this issue Mar 8, 2025 · 33 comments

@pmrv
Contributor

pmrv commented Mar 8, 2025

I'd like that it goes back to grey after the output cache is reset.

@Tara-Lakshmipathy
Contributor

For this to happen, I think node.ready has to be set to False. But the ready property of a node has no setter. So, instead of clearing the cache using node._cached_inputs = {}, maybe you could use wf.replace_child to reinstantiate the node and replace it in the workflow?

@pmrv
Contributor Author

pmrv commented Mar 10, 2025

Yeah, some variation of node.replace_with(type(node)()) that carries over manually set input values may work.

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 10, 2025

Hang on, when I look here, I see that both node.ready and node.cache_hit are used.

Then we could have cache_hit in the node dictionary:

return {
        'id': node.label,
        'data': {
            'label': label,
            'source_labels': list(node.outputs.channel_dict.keys()),
            'target_labels': list(node.inputs.channel_dict.keys()),
            'import_path': get_import_path(node),
            'target_values': get_node_values(node.inputs.channel_dict),
            'target_types': get_node_types(node.inputs),
            'source_values': get_node_values(node.outputs.channel_dict),
            'source_types': get_node_types(node.outputs),
            'failed': str(node.failed),
            'running': str(node.running),
            'ready': str(node.outputs.ready),
            'cache_hit': str(node.cache_hit), # --------> Add this line here
            'python_object_id': id(node),
        }
    }

Then CustomNode.jsx can be changed to something like this:

const renderLabel = (label, failed, running, ready, cache_hit) => {
        let status = '';

        if (failed === "True") {
            status = '🟥   ';
        } else if (running === "True") {
            status = '🟨   ';
        } else if ((ready === "True") && (cache_hit === "True")) {
            status = '🟩   ';
        } else {
            status = '⬜   ';
        }

        return status + label;
};

This could probably also be used to somehow address #105; I haven't yet figured out how.

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 11, 2025

I came so close to fixing it with the above code, but there is an annoying bug: the status now seems to update randomly when the node is reset and run again. No idea whether it is coming from the Python or JS side.

(GIF attachment)

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 12, 2025

The above gif seems to be a pyiron_workflow issue. This is the pure code version:

wf = Workflow("test")
wf.add = example_nodes.Addition(1, 3)
wf.subtract = example_nodes.Subtraction(wf.add, 1)

wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> True

wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> False
#---------- So far so good -------------

wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> False             
#---------- Why is cache_hit still False? -------------

wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> False       
#---------- Reset and try again -------------

wf()
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> True
#---------- Why is cache hit true now but was false with pull? -------------

wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> False       
#---------- Reset and try again -------------

wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
>> True
>> True
#---------- Why is cache hit true now even with pull? -------------

@liamhuber, not sure what is going on here...

@liamhuber
Member

It's happening because we, ultimately, are still running the parent tree:

https://github.com/pyiron/pyiron_workflow/blob/90755291892ab79eb52a128f27fc712627d61291/pyiron_workflow/node.py#L678-L679

And you haven't actually changed any input, so you're just getting the cached result. You're not even re-running Subtract in this case. I would argue this is not a bug, this is exactly the right behaviour. You've gone in and manually fiddled with the cache for wf.add, but you haven't actually changed any upstream input or upstream operations, so why should wf.subtract.pull() actually run anything? It's completely valid to return the cached result.

I show how we could do it here: pyiron/pyiron_workflow#618, but as stated I don't actually think we should.

Also, consider your knuckles rapped for posting a non-running example 😝 This example will make it easier to see what's happening, both with and without the above PR:

import pyiron_workflow as pwf

@pwf.as_function_node("add")
def Add(obj, other):
    print("Adding", obj, other)
    return obj + other

@pwf.as_function_node("subtract")
def Subtract(obj, other):
    print("Subtracting", obj, other)
    return obj - other

wf = pwf.Workflow("test")
# wf.add = pwf.standard_nodes.Add(1, 3)
# wf.subtract = pwf.standard_nodes.Subtract(wf.add, 1)
wf.add = Add(1, 3)
wf.subtract = Subtract(wf.add, 1)

# wf.subtract.use_cache = False
# We could force individual children to run if we wanted, but that's not the point

print("wf.subtract.pull()")
wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True

print("wf.add._cached_inputs = {}")
wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- So far so good -------------

print("wf.subtract.pull()")
wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Why is cache_hit still False? -------------

print("wf.add._cached_inputs = {}")
wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Reset and try again -------------

print("wf()")
wf()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True
#---------- Why is cache hit true now but was false with pull? -------------

print("wf.add._cached_inputs = {}")
wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Reset and try again -------------

print("wf.subtract.pull()")
wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True

@liamhuber
Member

There is perhaps a more graceful solution that gets your desired behaviour more robustly than the PR I linked: pyiron/pyiron_workflow#618 (comment)

@liamhuber
Member

Yeah, the second solution here I am open to, and it also gets the behaviour you are wanting:

pyiron/pyiron_workflow#619

@Tara-Lakshmipathy
Contributor

I think it comes down to what "clearing the cache" or "resetting a node" means. For me it means recompute this node (irrespective of whether the input has changed or not) and all downstream nodes that are affected. So when I hit pull on a downstream node, I expect any upstream node which has been reset to be recomputed irrespective of inputs.
If clearing the cache is not the right operation for this, maybe we should instead be replacing the node.

@Tara-Lakshmipathy
Contributor

The solution in pyiron/pyiron_workflow#619 also seems fine to me. I don't mind either this or replacing a node.

@liamhuber
Member

No, I think it's more meaningful than that. For instance in pyiron/pyiron_workflow#619 we really take a different stance on the following question: can a graph exploit its cache if any of its child nodes are not cached? This has serious implications for nodes that, e.g., exploit RNG and always set their class-level use_cache = False. This is worth thinking about in more depth.

In terms of solving the problem at hand, I realized that I'm not sure why running the data tree needs to go via the parent to begin with. Simply brute-force executing the nodes in the data tree is more elegant and gets the behaviour you're looking for: pyiron/pyiron_workflow#620. However, it made a couple of tests fail, and -- as I just said -- I'm not sure why running the data tree needs to go via the parent. But I clearly used to have an opinion on this, because that's how it's coded. So I don't really want to break that until I can conclusively say whether past-me or right-now-me has a better take on this issue.

@Tara-Lakshmipathy
Contributor

One of the main motivations for clearing the cache is for internally stochastic nodes, right? So, I think what is needed for them is to really break the philosophy of caching in pyiron_workflow. The previous example was probably a bad one. But imagine if the first node being reset were a random-integer-generating node based on a random seed. Then when I hit pull on the next node (let's say a subtraction node), I would want the random integer node to rerun and the subtraction node also to rerun based on the new integer that it spits out.

But there could be occasions where I don't want the random integer node to generate a new number. Instead I just want to view the results of nodes.

In the first case, hitting pull should actually rerun the node, and this is what I expect from pull. In the second case, I just want to view the result, and for this we can have a different button called "view" or something.

Basically, the current behaviour for RNG situations is more like the view option. So, I think something somewhere has to change to account for these two different scenarios.

Sorry, I can't provide a code snippet right now, travelling somewhere today.

@Tara-Lakshmipathy
Contributor

The issue I have with setting use_cache=False in the node definition is that all downstream nodes will always recompute if I hit pull on a downstream node. But for some post processing steps, maybe I don't want to rerun expensive stochastic stuff again. But I still need to run some downstream nodes because I added some additional post processing nodes downstream.

Basically, give the user the power to decide when the node has to be rerun and when it fetches from a cache.

@Tara-Lakshmipathy
Contributor

Example with schematic:

expensive_stochastic_node -> post_processing_node

Here expensive_stochastic_node had use_cache=False. So far so good.

But if I add a node:
expensive_stochastic_node -> post_processing_node -> another_node

Then when I hit pull on another_node, expensive_stochastic_node gets rerun. How can I avoid that?

@liamhuber
Member

If you have a serious stochastic node, I would always recommend explicitly passing a seed variable

@liamhuber
Member

For inexpensive analysis following expensive data, it should be possible to run from the start of the cheap part instead of pulling from the end of the entire thing. What stands in the way of this in the GUI now is that there is no output data view except on the node in which the run was invoked (and that the run button actually couples to pull, but that's a simple rename and an easy button addition).

@Tara-Lakshmipathy
Contributor

Then you are placing restrictions on node design. Should every MD or MC node come with a seed input? And there may be some software that implicitly assumes randomness and does not provide easy mechanisms for setting a seed.

In any case, I don't think we should make assumptions on what nodes will look like and what are the capabilities of the nodes.

I am strongly against the philosophy of the workflow manager forcing users and node designers into decisions. I think it's ok for a recommendation to be made, but the user should be able to make decisions that don't follow the recommendation for advanced and exceptional situations. If not, we would be back in the situation with classic pyiron where power users really struggled with flexibility.

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 12, 2025

For inexpensive analysis following expensive data, it should be possible to run from the start of the cheap part instead of pulling from the end of the entire thing. What stands in the way of this in the GUI now is that there is no output data view except on the node in which the run was invoked (and that the run button actually couples to pull, but that's a simple rename and an easy button addition).

But then we would be differentiating between run and pull which is not very intuitive for the user without doing a deep dive of the documentation. Especially for the GUI, and especially for users without coding experience, I would really like that most things are self-explanatory without having to learn the syntax of pyiron_workflow, it's underlying philosophies or know about advanced workflow management.

For the current issue, the straightforward answer for all this seems to be just replace the node itself. So I don't know why we should bother with the cache.

EDIT:

Playing around with the cache needs 5 things:

  • a pull button for general usage
  • a reset button for clearing cache for nodes that do use the cache e.g., failed nodes like in Failure and input readiness entangled #116
  • knowledge that a node could be using use_cache=False
  • a run button for individual nodes after nodes that do not use cache
  • a view button for nodes that do not use cache

Replacing the node needs only 2 things:

  • a pull button for general usage
  • a reset button for replacing the node

The only drawback, as far as I can tell, is that the executor has to be recreated, but the necessary information can be extracted before the replacement and everything can be accomplished in the background.

And to be frank, brute forcing the node as you proposed in your pull request (pyiron/pyiron_workflow#620) seems functionally similar to just replacing the node anyway. The advantage is that stuff attached to the node e.g., the executor need not be recreated.

@liamhuber
Member

Then you are placing restrictions on node design. Should every MD or MC node come with a seed input? And there may be some software that implicitly assume randomness and do not provide easy mechanisms for setting a seed.

In any case, I don't think we should make assumptions on what nodes will look like and what are the capabilities of the nodes.

I am strongly against the philosophy of the workflow manager forcing users and node designers into decisions. I think it's ok for a recommendation to be made, but the user should be able to make decisions that don't follow the recommendation for advanced and exceptional situations. If not, we would be back in the situation with classic pyiron where power users really struggled with flexibility.

I don't know if I understand your objection here. Caching is very fundamentally the idea that "if my input is the same, don't rerun, just give my existing result". If running generates random information, then you either need to explicitly include the seed for that random information in order to avoid a cache hit making you skip over it, or you need to disable caching. There's no strong-arming of node designers here; this is just a fundamental part of how the information management works.
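The caching contract described here can be sketched in plain Python (a toy illustration; CachedNode and its attributes are hypothetical stand-ins, not the pyiron_workflow implementation):

```python
import random

class CachedNode:
    """Toy stand-in for an input-keyed node cache (not the pyiron_workflow implementation)."""

    def __init__(self, func, use_cache=True):
        self.func = func
        self.use_cache = use_cache
        self._cached_inputs = None
        self._cached_output = None

    def run(self, **inputs):
        if self.use_cache and inputs == self._cached_inputs:
            return self._cached_output  # cache hit: same input, same stored output
        output = self.func(**inputs)
        self._cached_inputs = inputs
        self._cached_output = output
        return output

# Internal RNG breaks the caching contract: identical inputs no longer
# imply identical outputs, so the cache silently freezes the first draw.
rng_node = CachedNode(lambda lo, hi: random.randint(lo, hi))
assert rng_node.run(lo=0, hi=10**9) == rng_node.run(lo=0, hi=10**9)

# Exposing the seed as input restores the contract: same input, same output,
# so the cache hit is now genuinely harmless.
seeded = CachedNode(lambda seed: random.Random(seed).random())
assert seeded.run(seed=42) == seeded.run(seed=42)
```

The two escape hatches in the discussion map directly onto this sketch: either pass `use_cache=False`, or make the seed part of the cached input.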

@liamhuber
Member

But then we would be differentiating between run and pull which is not very intuitive for the user without doing a deep dive of the documentation. Especially for the GUI, and especially for users without coding experience, I would really like that most things are self-explanatory without having to learn the syntax of pyiron_workflow, it's underlying philosophies or know about advanced workflow management.

Right now there is a local "run" button, which means "run everything upstream and then run this", and a global "run" button which means "run everything from the start".

That's fine, but it is fundamentally more restrictive than the back-end supports.

I think it would be worthwhile for the node-specific buttons to be something like

  • pull: "run everything before this then run this"
  • push: "run this then run everything after it"

We don't have to call it pull and push, we can call it whatever is most understandable for users. But IMO it is not beyond our user base to grasp those two modes, and I think it would be useful to offer both to them.
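The two modes can be sketched on a toy three-node chain (illustrative only; this Node class and its pull/push methods are hypothetical stand-ins, and caching is omitted for clarity):

```python
class Node:
    """Minimal DAG node; pull walks upstream first, push walks downstream after."""

    def __init__(self, label, func, *upstream):
        self.label = label
        self.func = func
        self.upstream = list(upstream)
        self.downstream = []
        for u in upstream:
            u.downstream.append(self)
        self.output = None

    def run(self):
        self.output = self.func(*(u.output for u in self.upstream))
        return self.output

    def pull(self):
        # "run everything before this, then run this"
        for u in self.upstream:
            u.pull()
        return self.run()

    def push(self):
        # "run this, then run everything after it"
        result = self.run()
        for d in self.downstream:
            d.push()
        return result

a = Node("a", lambda: 1 + 3)
b = Node("b", lambda x: x - 1, a)
c = Node("c", lambda x: x * 2, b)

assert b.pull() == 3   # runs a (4), then b (4 - 1); c is untouched
assert a.push() == 4   # runs a, then b, then c downstream
assert c.output == 6   # (4 - 1) * 2
```

A real implementation would run nodes in topological order rather than by naive recursion (a diamond-shaped graph would otherwise run shared ancestors twice), but the traversal direction is the whole conceptual difference between the two buttons.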

@liamhuber
Member

For the current issue, the straightforward answer for all this seems to be just replace the node itself. So I don't know why we should bother with the cache.

Replacing the entire node seems like massive back-end overkill just to get the colour status working, but you can resolve it how you like.

@Tara-Lakshmipathy
Contributor

It's not just the colors as this gif demonstrates:

(GIF attachment)

It's just that the "run" button on a node usually executes all previous nodes on a fresh workflow. But when results are cached, it does not run upstream nodes, even if those nodes have been reset (irrespective of change in inputs); it reruns upstream nodes only if their input changes. This makes the "reset" and "run" buttons a bit misleading IMO when used together. A reset should mean just reset -- not "clear the cache, but don't rerun upstream nodes on clicking run downstream, because inputs didn't change on upstream nodes".

@Tara-Lakshmipathy
Contributor

Also, doing node._cached_inputs = {} feels like mucking around in pyiron_workflow's private stuff which the GUI should have no business changing. BTW, is there any other way to clear the cache of a node (not during node definition but in the workflow execution)?

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 12, 2025

I think it would be worthwhile for the node-specific buttons to be something like

  • pull: "run everything before this then run this"
  • push: "run this then run everything after it"

We don't have to call it pull and push, we can call it whatever is most understandable for users. But IMO it is not beyond our user base to grasp those two modes, and I think it would be useful to offer both to them.

Now "push" is an idea I can totally get behind. This was the first thing I tried to implement myself in pyiron_workflow about 8 months back, since it was mentioned by some continuum level people as a desirable feature. Run a node and all affected downstream nodes on change. Makes all the sense in the world. Didn't manage to get it working because I couldn't figure out how the topology module worked 😅 (in my defense, I was still very new to workflows in general as a developer back then, and pyiron_workflow is not the easiest package to learn 😝). Is it now a feature in the package?

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 12, 2025

I don't know if I understand your objection here. Caching is very fundamentally the idea that "if my input is the same, don't rerun, just give my existing result". If running generates random information, then you either need to explicitly include the seed for that random information in order to avoid a cache hit making you skip over it, or you need to disable caching. There's no strong-arming of node designers here; this is just a fundamental part of how the information management works.

No, what I mean is that some software is set up for statistical work that performs iterations based on internally generated seeds. E.g., in mesoscopic modelling, one could perform many iterations of a simulation, each iteration having a seed for some stochastic algorithm within. Then the series of stochastic results is analysed together to identify trends and scaling laws. There can be 100s of iterations for a single type of simulation, so specifying a seed for each of them would be difficult. Usually, the programs generate seeds internally and just log them in the output, so you could reproduce any single iteration you wanted to. But in the normal situation, the seed would be generated by the program itself (e.g., based on a combination of the system time and the MAC address or something) for convenience.

@liamhuber
Member

No, what I mean is that some software is set up for statistical work that performs iterations based on internally generated seeds. E.g., in mesoscopic modelling, one could perform many iterations of a simulation, each iteration having a seed for some stochastic algorithm within. Then the series of stochastic results is analysed together to identify trends and scaling laws. There can be 100s of iterations for a single type of simulation, so specifying a seed for each of them would be difficult. Usually, the programs generate seeds internally and just log them in the output, so you could reproduce any single iteration you wanted to. But in the normal situation, the seed would be generated by the program itself (e.g., based on a combination of the system time and the MAC address or something) for convenience.

Automatically supplying random seeds is very convenient, but to be frank, if there is a tool that doesn't allow us to control the seed then I am not interested in using it. If we are serious about reproducible workflows, then we need to at least allow the seed to be specified.

I also don't think it's a problem at all to provide 100 seeds when looping over 100 copies of a stochastic node -- just expose the seed as looped input on the node and pass it a list of seeds.

I don't see any fundamental limitation or problem with how the back-end handles randomness:

import numpy as np
import pyiron_workflow as pwf

@pwf.as_function_node(use_cache=False)
def NumpyRandom(size, seed=None):
    rng = np.random.default_rng(seed)
    random = rng.random(size)
    return random

print("New instances always get their own seed")
for i in range(3):
    n = NumpyRandom()
    print(n(4))

print("Repeated runs _can_ get their own seed (with default seed=None)")
n = NumpyRandom()
for i in range(3):
    print(n(4))

print("Or can always get the same thing for the same seed")
n = NumpyRandom()
for i in range(3):
    print(n(4, seed=42))

print("If the seed were _required_, we would be able to safely turn caching back on")

@liamhuber
Member

liamhuber commented Mar 12, 2025

Now "push" is an idea I can totally get behind. This was the first thing I tried to implement myself in pyiron_workflow about 8 months back since it was mentioned by some continuum level people as a desirable feature. Run a node and all affected downstream nodes on change. Makes all the sense in the world. Didn't manage to get it working because I couldn't figure out how the toplogy module worked 😅(in my defense, I was still very new to workflows in general as a developer back then, and pyiron_workflow is not the easiest package to learn 😝). Is it now a feature in the package?

Ahhhh, this is 100% my fault. I thought it was a feature! But you're right: (at least!) since the changes so that the parent manages the signal transmission, it does not propagate -- run in the context of a parent runs exactly that node and nothing downstream. The feature should have been there, because it's just the commented bit here:

        if emit_ran_signal:
            if self.parent is None:  # or not self.parent.running:
                self.emit()
            else:
                self.parent.register_child_emitting(self)

🤦‍♂

I opened a PR to fix this and there you get behaviour that I'm perfectly happy with:

import numpy as np
import pyiron_workflow as pwf

@pwf.as_function_node(use_cache=False)
def NumpyRandom(seed=None):
    rng = np.random.default_rng(seed)
    random = rng.random(1)[0]
    print("Random number:", random)
    return random

@pwf.as_function_node
def Add(a, b):
    print(f"Adding {a} and {b}")
    sum = a + b
    return sum

wf = pwf.Workflow("push_pull_example")
wf.rng = NumpyRandom()
wf.first = Add(wf.rng, 1)
wf.second = Add(wf.first, -2)

print("\twf() = ", wf())
print("----------------")
print("This will cache hit")
print("\twf.first.run() = ", wf.first.run())
print("----------------")
print("\twf.first.pull() = ", wf.first.pull())
print("----------------")
print("\twf.first.run(b=10) = ", wf.first.run(b=10))

produces

Random number: 0.8291983693373216
Adding 0.8291983693373216 and 1
Adding 1.8291983693373215 and -2
	wf() =  {'second__sum': -0.1708016306626785}
----------------
This will cache hit
	wf.first.run() =  1.8291983693373215
----------------
Random number: 0.9303779096926914
Adding 0.9303779096926914 and 1
	wf.first.pull() =  1.9303779096926914
----------------
Adding 0.9303779096926914 and 10
Adding 10.930377909692691 and -2
	wf.first.run(b=10) =  10.930377909692691

Edit: I hadn't copied the full cell output on my first paste

@liamhuber
Member

Also, doing node._cached_inputs = {} feels like mucking around in pyiron_workflow's private stuff which the GUI should have no business changing. BTW, is there any other way to clear the cache of a node (not during node definition but in the workflow execution)?

Yeah, I mentioned to @pmrv somewhere that I should probably supply a public cache clearing method. Honestly though, this deep in our conversation, my overall sentiment is that you are mucking around in the innards, and you indeed shouldn't be. I hope with the subsequent conversation on RNG management and with patching so both pull and push are possible that my perspective is at least increasingly persuasive even if you're not yet totally sold.

@liamhuber
Member

It's just that the "run" button on a node usually executes all previous nodes on a fresh workflow. But when results are cached, it does not run upstream nodes, even if upstream nodes are reset (irrespective of change in inputs). So, it runs upstream nodes only if the input changes. This makes the "reset" and the definition of the "run" buttons a bit misleading IMO when used together. A reset should mean just reset, not clear cache but don't rerun upstream nodes on clicking "run" on a downstream node because inputs didn't change on upstream nodes.

I've been addressing these in reverse order, so perhaps this is a little redundant now, but... yes, IMO the upstream stuff should only get rerun if there's some (any!) change in upstream data. I.e., whether or not "run" and "reset" are misleading is more of an issue for pyironflow than for pyiron_workflow.

@liamhuber
Member

The feature should have been there because it's just the commented bit here:

Ok, unfortunately I'm very wrong about it being this straightforward: pyiron/pyiron_workflow#621 (comment)

@Tara-Lakshmipathy
Contributor

I will respond by reversing your reverse order to restore balance to the force 😝

Ok, unfortunately I'm very wrong about it being this straightforward: pyiron/pyiron_workflow#621 (comment)

I have left a comment in the pull request based on my understanding of the developments.

whether or not "run" and "reset" are misleading is more of an issue for pyironflow than for pyiron_workflow

Agreed, that's why my suggestion was to replace the node here to make things clearer -- but it's not necessary if we have a push feature.

I hope with the subsequent conversation on RNG management and with patching so both pull and push are possible that my perspective is at least increasingly persuasive even if you're not yet totally sold.

With push and pull, I think we have pretty much everything needed. Even if the caching behaved similar to replacing a node by brute forcing stuff, the push feature, IMO, would have still been desirable.

I thought it was a feature

I thought the feature did not exist till now because it was tricky and not a priority to implement. Then I felt really stupid for not understanding something so simple when you mentioned that just uncommenting a line would have done the job. Now I feel much better because it is not straightforward. Sorry that my relief comes at the cost of your additional effort 😂

Automatically supplying random seeds is very convenient, but to be frank, if there is a tool that doesn't allow us to control the seed then I am not interested in using it. If we are serious about reproducible workflows, then we need to at least allow the seed to be specified

Of course we need to allow the seed to be specified, never had anything against that. But if a software has a way to generate random seeds and logs it into the output (which is sufficient for reproducibility), then we should also allow for the possibility for the seed not to be specified (e.g., a field with default None) and yet allow the user to reset the cache and run things. If someone really wants to reproduce one iteration in a statistical study, then the seed input field can be filled from the output logs of a previous run. It is also important to me that running nodes after clearing a cache is not done individually since this would be very tedious in the GUI. This is addressed by the push feature, and I would be happy to use it, or an equivalent run if you get it to work when the parent is a Workflow.

@liamhuber
Member

But if a software has a way to generate random seeds and logs it into the output (which is sufficient for reproducibility), then we should also allow for the possibility for the seed not to be specified (e.g., a field with default None) and yet allow the user to reset the cache and run things.

I think my point is rather that individual node designers who allow for internal RNG within a node ought to make non-caching the default for that node. It shouldn't be up to individual users to remember to clear caches. However, I think it's ok to leave it up to individual users to unlock performance enhancements by turning caching back on on a case-by-case basis where they're then responsible to remember that they need to provide a seed explicitly. Basically I'd like to make it so that users never need to clear the cache, and I think we have the tools for that now.

@Tara-Lakshmipathy
Contributor

Tara-Lakshmipathy commented Mar 14, 2025

I think your proposed solution of toggling use_cache would be totally fine for the GUI from a conceptual perspective. We could replace the "reset" button with a checkbox for the boolean. This would be a special checkbox that is not part of the input arguments of the function, but is displayed in a different way to distinguish itself (not yet sure how).

But the issue is that setting use_cache=False seems to not store anything in the cache at all. I would prefer that the latest run is always stored in the cache after the node completes, irrespective of the setting. Then, the cache is either used or ignored based on the setting. The reason for this is that a user may inadvertently forget to switch the setting before the node is run.

E.g., I have an expensive stochastic node with use_cache=False. I add some post-processing nodes, forget to toggle the setting and hit pull. This would mean that the expensive stochastic node gets rerun.

You could say that this is the user's fault, which it is, but I think that is being too brutal. At least making sure that the latest run is cached and can be either used or ignored seems much "kinder" to me.

EDIT: Oh wait, I just remembered that pull won't trigger the upstream expensive node. Nevermind. I think having a checkbox would be fine for these situations.

But now my question: do we still need a reset button for nodes with use_cache=True when they fail, like in #116? If yes, then wouldn't that make the checkbox redundant?

EDIT 2: Actually, I think the upstream node is indeed running when I hit pull.

from pyiron_workflow import Workflow, as_function_node

@as_function_node("number", use_cache=False)
def Rand(lower_bound: int, upper_bound: int) -> int:
    import random
    n = random.randint(lower_bound, upper_bound)
    return n

@as_function_node("sum", use_cache=True)
def AddOne(x: int) -> int:
    y = x + 1
    return y

wf = Workflow("test")
wf.rand = Rand(0, 100)

wf.rand.pull()
# >> 4

wf.add_child(AddOne(wf.rand), "add")
wf.add.pull()
# >> 21

Then my previous issue stands: use_cache=False is currently too brutal, since outputs are not stored in the cache at all. A simple misclick of pull instead of push would trigger a rerun of the previous node. It would be "kinder" to store them anyway and ignore them.
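The "kinder" behaviour being asked for -- write the cache on every run, but only consult it when caching is enabled -- can be sketched in plain Python (a hypothetical design sketch, not current pyiron_workflow behaviour; the Node class here is a stand-in):

```python
class Node:
    """Sketch: the cache is written on every run, but only read when use_cache is on."""

    def __init__(self, func):
        self.func = func
        self.use_cache = True        # toggled by the user (e.g., a GUI checkbox)
        self._cached_inputs = None
        self._cached_output = None

    def run(self, **inputs):
        # Consult the cache only when caching is enabled...
        if self.use_cache and inputs == self._cached_inputs:
            return self._cached_output
        output = self.func(**inputs)
        # ...but store unconditionally, so flipping use_cache back on
        # after a run does not force an expensive recompute.
        self._cached_inputs = inputs
        self._cached_output = output
        return output

calls = []
expensive = Node(lambda x: calls.append(x) or x * 10)

expensive.use_cache = False
expensive.run(x=3)               # recomputes every time while caching is off
expensive.run(x=3)
assert calls == [3, 3]

expensive.use_cache = True       # the user flips the checkbox after the run...
assert expensive.run(x=3) == 30  # ...and still gets the stored result, no rerun
assert calls == [3, 3]
```

Under this design, a misclicked pull on a downstream node would hit the stored result of the expensive node instead of triggering a rerun, which is exactly the scenario described above.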
