Why `withRvcbm = true` is not compatible with `lsuL1Coherency = true`? #49

petrmikheev · 2025-03-24T10:17:59Z

I noticed that a few days ago CMO clean/flush/invalidate instruction support was added (Thank you! I was waiting for this functionality).
But why is it incompatible with lsuL1Coherency = true (assert at LsuL1Plugin.scala:961)? If I am not mistaken, lsuL1Coherency only means that the core can accept external requests to flush specific cache lines. It is not clear why CMO conflicts with it.

Dolu1990 · 2025-03-24T10:40:53Z

> I noticed that a few days ago CMO clean/flush/invalidate instruction support was added (Thank you! I was waiting for this functionality)

Thanks to NLnet, they funded it ^^

> But why is it incompatible with lsuL1Coherency = true (assert at LsuL1Plugin.scala:961)? If I am not mistaken, lsuL1Coherency only means that the core can accept external requests to flush specific cache lines. It is not clear why CMO conflicts with it.

Let's say you have an L2 cache; then the CMO should instead ask the L2 to clean / flush / invalidate the address, not the L1 anymore (as far as I understand it).

Also, if you have a coherent L2, you kinda expect the DMA to be coherent, and you no longer need any software flush. It is all done in hardware.

petrmikheev · 2025-03-24T11:05:46Z

I have Tilelink CacheFiber as the L2 cache. But if I remove lsuL1Coherency = true, I get this error:

[error] java.lang.IllegalArgumentException: requirement failed: Min transfer 1 > max transfer 0
[error] 	at scala.Predef$.require(Predef.scala:281)
[error] 	at spinal.lib.bus.tilelink.SizeRange.<init>(Parameters.scala:68)
[error] 	at spinal.lib.bus.tilelink.SizeRange$.upTo(Parameters.scala:61)
[error] 	at spinal.lib.bus.tilelink.coherent.CacheFiber$$anon$2.<init>(CacheFiber.scala:57)
[error] 	at spinal.lib.bus.tilelink.coherent.CacheFiber.$anonfun$logic$1(CacheFiber.scala:50)

I guess the cache expects at least one master to use the B, C, and E channels.

> Let's say you have an L2 cache; then the CMO should instead ask the L2 to clean / flush / invalidate the address, not the L1 anymore (as far as I understand it).

I don't understand. I have an L2 cache (CacheFiber) between main RAM and everything else. There is no point in flushing from L2 to RAM, because nothing is connected to RAM directly.

I'd expect CMO instructions to flush data from the L1 cache to whatever is below it. Could you please explain in more detail how it is supposed to work?
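The message in the trace reads like a plain min/max sanity check that fires when the computed maximum is 0. Below is a minimal Python sketch of that kind of check, reconstructed only from the error text. The real `SizeRange` lives in SpinalHDL and is written in Scala; the field names `lo`/`hi`, the `up_to` helper, and the `probe_range` function are assumptions for illustration, not the library's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeRange:
    """Toy stand-in for a transfer-size range (assumed shape, not the SpinalHDL source)."""
    lo: int
    hi: int

    def __post_init__(self):
        # Same wording as the message in the stack trace.
        if self.lo > self.hi:
            raise ValueError(f"Min transfer {self.lo} > max transfer {self.hi}")

    @staticmethod
    def up_to(max_size: int) -> "SizeRange":
        # Assumption: "upTo(x)" means the range 1..x, as the message suggests.
        return SizeRange(1, max_size)


def probe_range(master_probe_sizes):
    """Hypothetical: size the probe range from the largest block any master accepts probes for.

    With no coherent master the list is empty, so the maximum falls back to 0.
    """
    largest = max(master_probe_sizes, default=0)
    return SizeRange.up_to(largest)
```

Under these assumptions, `probe_range([64])` builds a valid range, while `probe_range([])` raises with the same text as the trace, which would be consistent with the guess that the cache needs at least one coherent master.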

petrmikheev · 2025-03-24T11:17:03Z

> Also, if you have a coherent L2, you kinda expect the DMA to be coherent, and you no longer need any software flush. It is all done in hardware.

Let's consider a simple example without DMA. There are only lsuL1Bus and iBus, both connected to the Tilelink Hub.
We change something at address X, then do fence.i and jump to X.
fence.i clears the L1 instruction cache, but the modified data is still in the LSU L1 cache.
When the Hub gets a read request for X via iBus, it asks the LSU L1 cache to flush this cache line (as I understand it, this is what lsuL1Coherency is needed for), and iBus gets the correct data.

How can it work without lsuL1Coherency?
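The scenario above can be played through with a toy model. This is only a sketch of the reasoning in this comment, assuming a write-back data L1 and a hub that probes it on instruction fetches when coherency is enabled. The class names, the `probe` method, and the `coherent` flag are hypothetical and do not correspond to VexiiRiscv or SpinalHDL code.

```python
class DataL1:
    """Toy write-back data cache: stores stay here until written back."""

    def __init__(self, memory):
        self.memory = memory
        self.dirty = {}

    def store(self, addr, value):
        self.dirty[addr] = value  # not yet visible below the L1

    def probe(self, addr):
        # What a coherent hub would request: write the line back and drop it.
        if addr in self.dirty:
            self.memory[addr] = self.dirty.pop(addr)


class Hub:
    """Toy interconnect sitting between the caches and memory."""

    def __init__(self, memory, data_l1, coherent):
        self.memory = memory
        self.data_l1 = data_l1
        self.coherent = coherent

    def ifetch(self, addr):
        if self.coherent:
            self.data_l1.probe(addr)  # pull the dirty line out of the data L1 first
        return self.memory[addr]


def run(coherent):
    memory = {0x100: "old code"}
    data_l1 = DataL1(memory)
    hub = Hub(memory, data_l1, coherent)
    data_l1.store(0x100, "new code")
    # fence.i is modeled as only emptying the instruction cache,
    # so the next fetch of 0x100 goes straight to the hub.
    return hub.ifetch(0x100)
```

In this model `run(True)` fetches the new code because the hub probes the data L1, while `run(False)` fetches the stale copy, which is exactly the gap the question is asking about.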

petrmikheev · 2025-03-24T19:24:23Z

Ah, I misunderstood what it is. I was mostly interested in the cbo.zero instruction, which belongs to a separate extension, Zicboz. For Zicbom, lsuL1Coherency indeed makes no sense.

But still, is it normal that tilelink.coherent.CacheFiber doesn't work if lsuL1Coherency is disabled? Did you mean some other implementation of L2 cache?

> We change something at address X, then do fence.i and jump to X.
> ...
> How can it work without lsuL1Coherency?

I guess that, in addition to fence.i, software should explicitly call cbo.flush on every cache line when loading executable code into memory? Or will fence.i now flush the whole LSU L1 cache automatically?

Dolu1990 · 2025-03-25T09:54:57Z

Hi ^^

> But still, is it normal that tilelink.coherent.CacheFiber doesn't work if lsuL1Coherency is disabled?

Yes, the L2 cache design doesn't support having no memory-coherent masters. Supporting that would require some more work to remove a bunch of unused logic.

> I guess that, in addition to fence.i, software should explicitly call cbo.flush on every cache line when loading executable code into memory? Or will fence.i now flush the whole LSU L1 cache automatically?

"This chapter defines the "Zifencei" extension, which includes the FENCE.I instruction that provides explicit synchronization between writes to instruction memory and instruction fetches on the same hart. Currently, this instruction is the only standard mechanism to ensure that stores visible to a hart will also be visible to its instruction fetches."

=> If you are single core, you don't need cbo.flush after fence.i.
In non-coherent VexiiRiscv, fence.i flushes the data cache automatically.
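As a rough illustration of that last point, here is a toy single-core model in which fence.i both writes back the data cache and empties the instruction cache. It only mirrors the behavior described in this reply. The class and method names are made up for the example, and the real core works on cache lines in hardware rather than on a Python dict.

```python
class NonCoherentCore:
    """Toy single-core model with separate, non-coherent I and D caches."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.dcache_dirty = {}  # stores not yet written back
        self.icache = {}        # instructions already fetched

    def store(self, addr, value):
        self.dcache_dirty[addr] = value

    def fetch(self, addr):
        # The instruction cache refills from memory and never looks into the data cache.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def fence_i(self):
        # Modeled after the reply: write back the data cache,
        # then drop every instruction cache entry.
        self.memory.update(self.dcache_dirty)
        self.dcache_dirty.clear()
        self.icache.clear()


core = NonCoherentCore({0x100: "old code"})
before = core.fetch(0x100)        # warms the instruction cache
core.store(0x100, "new code")
stale = core.fetch(0x100)         # still the old copy, no fence.i yet
core.fence_i()
after = core.fetch(0x100)         # refetched after the write-back
```

In this sketch no cbo.flush is needed on a single core, since fence.i alone makes the store visible to the next fetch.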

Dolu1990 · 2025-03-25T09:55:42Z

> Did you mean some other implementation of L2 cache?

There is currently only one implementation, which requires the CPUs to be memory coherent.

petrmikheev closed this as completed Mar 26, 2025
