Reset node button does not update green status icon #104
For this to happen, I think ...

Yeah, some variation of ...
Hang on, when I look here, I see that both ... Then we could have:

```python
return {
    'id': node.label,
    'data': {
        'label': label,
        'source_labels': list(node.outputs.channel_dict.keys()),
        'target_labels': list(node.inputs.channel_dict.keys()),
        'import_path': get_import_path(node),
        'target_values': get_node_values(node.inputs.channel_dict),
        'target_types': get_node_types(node.inputs),
        'source_values': get_node_values(node.outputs.channel_dict),
        'source_types': get_node_types(node.outputs),
        'failed': str(node.failed),
        'running': str(node.running),
        'ready': str(node.outputs.ready),
        'cache_hit': str(node.cache_hit),  # --------> Add this line here
        'python_object_id': id(node),
    },
}
```

Then CustomNode.jsx can be changed to something like this (note that `cache_hit` also needs to be added to the parameter list):

```js
const renderLabel = (label, failed, running, ready, cache_hit) => {
  let status = '';
  if (failed === "True") {
    status = '🟥 ';
  } else if (running === "True") {
    status = '🟨 ';
  } else if ((ready === "True") && (cache_hit === "True")) {
    status = '🟩 ';
  } else {
    status = '⬜ ';
  }
  // ...
```

This could probably also be used to somehow address #105; I haven't yet figured out how.
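For what it's worth, a back-end mirror of that status logic could look roughly like the sketch below. This is hypothetical (the helper name is made up); it assumes the node exposes `failed`, `running`, `outputs.ready`, and `cache_hit` as booleans, i.e. the same flags that get serialized into the dict above.

```python
def status_emoji(node):
    """Hypothetical helper mirroring the CustomNode.jsx logic above."""
    if node.failed:
        return '🟥'  # failed
    if node.running:
        return '🟨'  # running
    if node.outputs.ready and node.cache_hit:
        return '🟩'  # finished and the cached result is still valid
    return '⬜'      # not yet run, or reset
```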
The above gif seems to be a ...

```python
wf = Workflow("test")
wf.add = example_nodes.Addition(1, 3)
wf.subtract = example_nodes.Subtraction(wf.add, 1)

wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True

wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- So far so good -------------

wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Why is cache_hit still False? -------------

wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Reset and try again -------------

wf()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True
#---------- Why is cache hit true now but was false with pull? -------------

wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Reset and try again -------------

wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True
#---------- Why is cache hit true now even with pull? -------------
```

@liamhuber, not sure what is going on here...
It's happening because we, ultimately, are still running the parent tree. And you haven't actually changed any input, so you're just getting the cached result. You're not even re-running ...

I show how we could do it here: pyiron/pyiron_workflow#618, but as stated I don't actually think we should.

Also, consider your knuckles rapped for posting a non-running example 😝 This example will make it easier to see what's happening, both with and without the above PR:

```python
import pyiron_workflow as pwf

@pwf.as_function_node("add")
def Add(obj, other):
    print("Adding", obj, other)
    return obj + other

@pwf.as_function_node("subtract")
def Subtract(obj, other):
    print("Subtracting", obj, other)
    return obj - other

wf = pwf.Workflow("test")
# wf.add = pwf.standard_nodes.Add(1, 3)
# wf.subtract = pwf.standard_nodes.Subtract(wf.add, 1)
wf.add = Add(1, 3)
wf.subtract = Subtract(wf.add, 1)
# wf.subtract.use_cache = False
# We could force individual children to run if we wanted, but that's not the point

print("wf.subtract.pull()")
wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True

print("wf.add._cached_inputs = {}")
wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- So far so good -------------

print("wf.subtract.pull()")
wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Why is cache_hit still False? -------------

print("wf.add._cached_inputs = {}")
wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Reset and try again -------------

print("wf()")
wf()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True
#---------- Why is cache hit true now but was false with pull? -------------

print("wf.add._cached_inputs = {}")
wf.add._cached_inputs = {}
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> False
#---------- Reset and try again -------------

print("wf.subtract.pull()")
wf.subtract.pull()
print(wf.add.ready)
print(wf.add.cache_hit)
# >> True
# >> True
```
There is perhaps a more graceful solution that gets your desired behaviour more robustly than the PR I linked: pyiron/pyiron_workflow#618 (comment)

Yeah, the second solution here I am open to, and it also gets the behaviour you are wanting:
I think it comes down to what "clearing the cache" or "resetting a node" means. For me it means: recompute this node (irrespective of whether the input has changed or not) and all downstream nodes that are affected. So when I hit pull on a downstream node, I expect any upstream node which has been reset to be recomputed, irrespective of its inputs.
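For illustration, a recursive "reset" along those lines might look like the sketch below. This is hypothetical, not an existing pyiron_workflow API: it reuses the `_cached_inputs` trick from earlier in the thread, and the `connections`/`owner` attribute names are assumptions about how output channels link to downstream nodes.

```python
def reset_recursively(node):
    """Hypothetical sketch: clear this node's cache and that of every node
    downstream of it, so the next pull/run recomputes them all regardless of
    whether their input values changed. Assumes an acyclic data graph."""
    node._cached_inputs = {}  # same trick used interactively above
    for channel in node.outputs.channel_dict.values():
        for connection in channel.connections:   # assumed attribute name
            reset_recursively(connection.owner)  # assumed attribute name
```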
The solution in pyiron/pyiron_workflow#619 also seems fine to me. I don't mind either this or replacing a node.
No, I think it's more meaningful than that. For instance, in pyiron/pyiron_workflow#619 we really take a different stance on the following question: can a graph exploit its cache if any of its child nodes are not cached? This has serious implications for nodes that, e.g., exploit RNG and always set their class-level `use_cache = False`.

In terms of solving the problem at hand, I realized that I'm not sure why running the data tree needs to go via the parent to begin with. Simply brute-force executing the nodes in the data tree is more elegant and gets the behaviour you're looking for: pyiron/pyiron_workflow#620. However, it made a couple of tests fail and -- as I just said -- I'm not sure why running the data tree needs to go via the parent. But I clearly used to have an opinion on this, because that's how it's coded. So I don't really want to break that until I can conclusively say whether past-me or right-now-me has a better take on this issue.
One of the main motivations for clearing the cache is for internally stochastic nodes, right? So, I think what is needed for them is to really break the philosophy of caching in `pyiron_workflow`. But there could be occasions where I don't want the random integer node to generate a new number; instead I just want to view the results of nodes.

In the first case, hitting pull should actually rerun the node, and this is what I expect from pull. In the second case, I just want to view the result, and for this we can have a different button called "view" or something. Basically, the current behaviour for RNG situations is more like the view option. So, I think something somewhere has to change to account for these two different scenarios. Sorry, I can't provide a code snippet right now, travelling somewhere today.
The issue I have with setting `use_cache = False`: basically, give the user the power to decide when the node has to be rerun and when it fetches from a cache.
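A minimal sketch of that kind of per-node control with the existing `use_cache` flag (single-node behaviour only, ignoring the pull-through-the-parent question discussed above; the node itself is made up):

```python
import pyiron_workflow as pwf

@pwf.as_function_node("y")
def AddOne(x):
    print("running AddOne")  # visible side effect, so reruns are easy to spot
    return x + 1

n = AddOne(0)
n.run()              # first run: computes and caches
n.run()              # same input: cache hit, nothing printed

n.use_cache = False  # user decides: this node must genuinely rerun
n.run()              # reruns even though the input is unchanged

n.use_cache = True   # user decides: back to fetching from the cache
n.run()              # cache hit again
```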
Example with schematic (schematic images not preserved in this extract):

Here ... But if I add a node ... Then when I hit pull on ...
If you have a serious stochastic node, I would always recommend explicitly passing a seed variable.
For inexpensive analysis following expensive data, it should be possible to ...
Then you are placing restrictions on node design. Should every MD or MC node come with a seed input? And there may be some software that implicitly assumes randomness and does not provide an easy mechanism for setting a seed. In any case, I don't think we should make assumptions about what nodes will look like or what their capabilities are. I am strongly against the philosophy of the workflow manager forcing users and node designers into decisions. I think it's ok for a recommendation to be made, but the user should be able to make decisions that don't follow the recommendation in advanced and exceptional situations. If not, we would be back in the situation with classic pyiron, where power users really struggled with flexibility.
But then we would be differentiating between run and pull, which is not very intuitive for the user without doing a deep dive into the documentation. Especially for the GUI, and especially for users without coding experience, I would really like most things to be self-explanatory without having to learn the syntax of `pyiron_workflow`.

For the current issue, the straightforward answer to all this seems to be to just replace the node itself. So I don't know why we should bother with the cache.

EDIT: Playing around with the cache needs 5 things: ...

Replacing the node needs only 2 things: ...
The only drawback as far as I can tell is that the executor has to be recreated, but the necessary information can be extracted before the replacement and everything can be accomplished in the background. And to be frank, brute-forcing the node as you proposed in your pull request (pyiron/pyiron_workflow#620) seems functionally similar to just replacing the node anyway. The advantage is that stuff attached to the node, e.g., the executor, need not be recreated.
I don't know if I understand your objection here. Caching is very fundamentally the idea that "if my input is the same, don't rerun, just give my existing result". If running would generate random information, then you either need to explicitly include the seed for that random information in order to avoid a cache making you skip over it, or you need to disable caching. There's no strong-arming of node designers here; this is just a fundamental part of how the information management works.
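Stripped of all the workflow machinery, that contract is just the following (plain Python illustration, nothing pyiron_workflow-specific):

```python
class CachedNode:
    """Toy illustration of input-based caching: rerun only when the inputs change."""

    def __init__(self, func):
        self.func = func
        self._cached_inputs = None
        self._cached_output = None

    def run(self, **inputs):
        if self._cached_inputs == inputs:
            return self._cached_output             # cache hit: same input, reuse old result
        self._cached_output = self.func(**inputs)  # cache miss: recompute and remember
        self._cached_inputs = inputs
        return self._cached_output


add = CachedNode(lambda a, b: a + b)
add.run(a=1, b=2)  # computes
add.run(a=1, b=2)  # cache hit: same result without recomputing
add.run(a=1, b=3)  # input changed, so it recomputes
```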
Right now there is a local "run" button, which means "run everything upstream and then run this", and a global "run" button, which means "run everything from the start". That's fine, but it is fundamentally more restrictive than the back-end supports. I think it would be worthwhile for the node-specific buttons to be something like "pull" (run everything upstream and then run this) and "push" (run this and then everything downstream).

We don't have to call it pull and push; we can call it whatever is most understandable for users. But IMO it is not beyond our user base to grasp those two modes, and I think it would be useful to offer both to them.
Replacing the entire node seems like massive back-end overkill just to get the colour status working, but you can resolve it how you like.
It's not just the colors, as this gif demonstrates:

It's just that the "run" button on a node usually executes all previous nodes on a fresh workflow. But when results are cached, it does not run upstream nodes, even if upstream nodes are reset (irrespective of change in inputs). So, it runs upstream nodes only if the input changes. This makes the "reset" and "run" buttons a bit misleading IMO when used together. A reset should mean just reset, not "clear the cache but don't rerun upstream nodes when clicking run on a downstream node, because inputs didn't change on the upstream nodes".
Also, doing ...
Now "push" is an idea I can totally get behind. This was the first thing I tried to implement myself in |
No, what I mean is that some software is set up for statistical studies that perform iterations based on internally generated seeds. E.g., in mesoscopic modelling, one could perform many iterations of a simulation, each iteration having a seed for some stochastic algorithm within. Then the series of stochastic results is analysed together to identify trends and scaling laws. There can be hundreds of iterations for a single type of simulation, so specifying a seed for each of them would be difficult. Usually, the programs generate seeds internally and just log them in the output, so you could reproduce any single iteration you wanted to. But in the normal situation, the seed would be generated by the program itself (e.g., based on a combination of the system time and the MAC address or something) for convenience.
Automatically supplying random seeds is very convenient, but to be frank, if there is a tool that doesn't allow us to control the seed, then I am not interested in using it. If we are serious about reproducible workflows, then we need to at least allow the seed to be specified. I also don't think it's a problem at all to provide 100 seeds when looping over 100 copies of a stochastic node -- just expose the seed as looped input on the node and pass it a list of seeds. I don't see any fundamental limitation or problem with how the back-end handles randomness:

```python
import numpy as np
import pyiron_workflow as pwf

@pwf.as_function_node(use_cache=False)
def NumpyRandom(size, seed=None):
    rng = np.random.default_rng(seed)
    random = rng.random(size)
    return random

print("New instances always get their own seed")
for i in range(3):
    n = NumpyRandom()
    print(n(4))

print("Repeated runs _can_ get their own seed (with default seed=None)")
n = NumpyRandom()
for i in range(3):
    print(n(4))

print("Or can always get the same thing for the same seed")
n = NumpyRandom()
for i in range(3):
    print(n(4, seed=42))

print("If the seed were _required_, we would be able to safely turn caching back on")
```
Ahhhh, this is 100% my fault. I thought it was a feature! But you're right: (at least!) since changes so that the parent manages the signal transmission, it does not propagate --

```python
if emit_ran_signal:
    if self.parent is None:  # or not self.parent.running:
        self.emit()
    else:
        self.parent.register_child_emitting(self)
```

🤦♂ I opened a PR to fix this, and there you get behaviour that I'm perfectly happy with:

```python
import numpy as np
import pyiron_workflow as pwf

@pwf.as_function_node(use_cache=False)
def NumpyRandom(seed=None):
    rng = np.random.default_rng(seed)
    random = rng.random(1)[0]
    print("Random number:", random)
    return random

@pwf.as_function_node
def Add(a, b):
    print(f"Adding {a} and {b}")
    sum = a + b
    return sum

wf = pwf.Workflow("push_pull_example")
wf.rng = NumpyRandom()
wf.first = Add(wf.rng, 1)
wf.second = Add(wf.first, -2)

print("\twf() = ", wf())
print("----------------")
print("This will cache hit")
print("\twf.first.run() = ", wf.first.run())
print("----------------")
print("\twf.first.pull() = ", wf.first.pull())
print("----------------")
print("\twf.first.run(b=10) = ", wf.first.run(b=10))
```

produces

Edit: I hadn't copied the full cell output on my first paste
Yeah, I mentioned to @pmrv somewhere that I should probably supply a public cache-clearing method. Honestly though, this deep into our conversation, my overall sentiment is that you are mucking around in the innards, and you indeed shouldn't be. I hope that, with the subsequent conversation on RNG management and with patching so both pull and push are possible, my perspective is at least increasingly persuasive, even if you're not yet totally sold.
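Such a public method could plausibly be as thin as the sketch below (hypothetical name; it just wraps the private-attribute trick used interactively earlier in this thread):

```python
def clear_cache(node):
    """Hypothetical public counterpart to `node._cached_inputs = {}`:
    forget the cached inputs so the next run recomputes regardless of input."""
    node._cached_inputs = {}
```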
I've been addressing these in reverse order, so perhaps this is a little redundant now, but... yes, IMO the upstream stuff should only get rerun if there's some (any!) change in upstream data. I.e., whether or not "run" and "reset" are misleading is more of an issue for ...
Ok, unfortunately I'm very wrong about it being this straightforward: pyiron/pyiron_workflow#621 (comment)
I will respond by reversing your reverse order to restore balance to the force 😝
I have left a comment in the pull request based on my understanding of the developments.
Agreed, that's why my suggestion was to replace the node here to make things clearer, but not necessary if we have a ...

With ...
I thought the feature did not exist till now because it was tricky and not a priority to implement. Then I felt really stupid for not understanding something so simple when you mentioned that just uncommenting a line would have done the job. Now I feel much better because it is not straightforward. Sorry that my relief comes at the cost of your additional effort 😂
Of course we need to allow the seed to be specified; I never had anything against that. But if a software has a way to generate random seeds and log them in the output (which is sufficient for reproducibility), then we should also allow for the possibility of the seed not being specified (e.g., a field with default None) and yet allow the user to reset the cache and run things. If someone really wants to reproduce one iteration in a statistical study, then the seed input field can be filled from the output logs of a previous run. It is also important to me that running nodes after clearing a cache is not done individually, since this would be very tedious in the GUI. This is addressed by the ...
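As an illustration of that kind of node, here is a sketch of a self-seeding stochastic node that stays reproducible by reporting the seed it used. The node and its outputs are invented for illustration, and it assumes `as_function_node` accepts multiple output labels alongside `use_cache`, in line with the snippets above.

```python
import numpy as np
import pyiron_workflow as pwf

@pwf.as_function_node("result", "seed", use_cache=False)
def SelfSeedingSimulation(n_steps: int, seed=None):
    # If no seed is supplied, draw one internally -- mimicking codes that
    # generate their own seed and merely log it in the output.
    if seed is None:
        seed = int(np.random.SeedSequence().entropy % 2**32)
    rng = np.random.default_rng(seed)
    result = float(rng.random(n_steps).sum())  # stand-in for the real simulation
    # Returning the seed alongside the result keeps every run reproducible:
    # feed it back in as `seed=...` to repeat a specific iteration exactly.
    return result, seed
```

Running it with the default `seed=None` gives a fresh realisation each time, while passing back a previously logged seed reproduces that iteration.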
I think my point is rather that individual node designers who allow for internal RNG within a node ought to make non-caching the default for that node. It shouldn't be up to individual users to remember to clear caches. However, I think it's ok to leave it up to individual users to unlock performance enhancements by turning caching back on on a case-by-case basis, where they're then responsible to remember that they need to provide a seed explicitly. Basically I'd like to make it so that users never need to clear the cache, and I think we have the tools for that now.
I think your proposed solution of toggling `use_cache` ... But the issue is that setting ... E.g., I have an expensive stochastic node with ...

You could say that this is the user's fault, which it is, but I think that is being too brutal. At least making sure that the latest run is cached and can be either used or ignored seems much "kinder" to me.

EDIT: Oh wait, I just remembered that pull won't trigger the upstream expensive node. Nevermind. I think having a checkbox would be fine for these situations. But now my question: do we still need a reset button for nodes with ...?

EDIT 2: Actually, I think the upstream node is indeed running when I hit pull.

```python
from pyiron_workflow import Workflow, as_function_node  # imports added; the original snippet omitted them

@as_function_node("number", use_cache=False)
def Rand(lower_bound: int, upper_bound: int) -> int:
    import random
    n = random.randint(lower_bound, upper_bound)
    return n

@as_function_node("sum", use_cache=True)
def AddOne(x: int) -> int:
    y = x + 1
    return y

wf = Workflow("test")
wf.rand = Rand(0, 100)
wf.rand.pull()
# >> 4

wf.add_child(AddOne(wf.rand), "add")
wf.add.pull()
# >> 21
```

Then my previous issue stands: ...
I'd like it to go back to grey after the output cache is reset.