Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compatibility Issue with Medallion.Threading.Redis and Garnet #1098

Open
xsoheilalizadeh opened this issue Mar 13, 2025 · 11 comments · Fixed by #1075
Open

Compatibility Issue with Medallion.Threading.Redis and Garnet #1098

xsoheilalizadeh opened this issue Mar 13, 2025 · 11 comments · Fixed by #1075
Assignees

Comments

@xsoheilalizadeh
Copy link

xsoheilalizadeh commented Mar 13, 2025

Describe the bug

We attempted to use Garnet as a Redis replacement for our distributed locking mechanism using Medallion.Threading.Redis (version 1.0.3). However, we encountered an issue when trying to acquire a lock.

The same code works fine when using Redis, but fails when using Garnet.

NuGet Packages Used:

  • DistributedLock.Redis 1.0.3
  • StackExchange.Redis latest stable version

Observed Behavior (Error Logs)

14:13:03.2434: Connecting (sync) on .NET 9.0.2 (StackExchange.Redis: v2.7.27.49176)
14:13:03.2559: localhost
14:13:03.2587: localhost:6379/Interactive: Connecting...
14:13:03.2634: localhost:6379: BeginConnectAsync
14:13:03.2680: 1 unique nodes specified (with tiebreaker)
14:13:03.2688: localhost:6379: OnConnectedAsync init (State=Connecting)
14:13:03.2698: Allowing 1 endpoint(s) 00:00:05 to respond...
14:13:03.2711: Awaiting 1 available task completion(s) for 5000ms, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=3,Free=32764,Min=16,Max=32767), POOL: (Threads=5,QueuedItems=0,CompletedItems=5,Timers=0)
14:13:03.2720: localhost:6379/Interactive: Connected 
14:13:03.2737: localhost:6379: Server handshake
14:13:03.2798: localhost:6379: Setting client name: Mac-FXGL266(SE.Redis-v2.7.27.49176)
14:13:03.2878: localhost:6379: Setting client lib/ver
14:13:03.2894: localhost:6379: Auto-configuring...
14:13:03.2902: localhost:6379: Requesting tie-break (Key="__Booksleeve_TieBreak")...
14:13:03.2909: localhost:6379/Interactive: Writing: GET __Booksleeve_TieBreak
14:13:03.2911: localhost:6379: Sending critical tracer (handshake): ECHO
14:13:03.2911: localhost:6379/Interactive: Writing: ECHO
14:13:03.2911: localhost:6379: Flushing outbound buffer
14:13:03.2912: localhost:6379: OnEstablishingAsync complete
14:13:03.2912: localhost:6379: Starting read
14:13:03.2961: localhost:6379: Auto-configured (CLIENT) connection-id: 38
14:13:03.2996: localhost:6379: Auto-configured (CONFIG) read-only replica: false
14:13:03.2997: localhost:6379: Auto-configured (CONFIG) databases: 1
14:13:03.3001: localhost:6379: Auto-configured (INFO) role: primary
14:13:03.3006: localhost:6379: Auto-configured (INFO) version: 7.2.5
14:13:03.3012: localhost:6379: Auto-configured (INFO) server-type: Standalone
14:13:03.3013: Response from localhost:6379/Interactive / GET __Booksleeve_TieBreak: (null)
14:13:03.3015: Response from localhost:6379/Interactive / ECHO: BulkString: 16 bytes
14:13:03.3017: localhost:6379: OnConnectedAsync completed (From command: ECHO)
14:13:03.3021: All 1 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=3,Free=32764,Min=16,Max=32767), POOL: (Threads=7,QueuedItems=0,CompletedItems=53,Timers=0)
14:13:03.3022: Endpoint summary:
14:13:03.3027:   localhost:6379: Endpoint is (Interactive: ConnectedEstablished, Subscription: ConnectedEstablished)
14:13:03.3027: Task summary:
14:13:03.3028:   localhost:6379: Returned with success as Standalone primary (Source: From command: ECHO)
14:13:03.3036: Election summary:
14:13:03.3037:   Election: localhost:6379 had no tiebreaker set
14:13:03.3037:   Election: Single primary detected: localhost:6379
14:13:03.3037: localhost:6379: Clearing as RedundantPrimary
14:13:03.3039: Endpoint Summary:
14:13:03.3043:   localhost:6379: Standalone v7.2.5, primary; 1 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: ConnectedEstablished, 1 active
14:13:03.3054:   localhost:6379: int ops=13, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=6, qu=0, qs=0, qc=0, wr=0, subs=1, socks=1
14:13:03.3123:   localhost:6379: Circular op-count snapshot; int: 0+13=13 (1,30 ops/s; spans 10s); sub: 0+6=6 (0,60 ops/s; spans 10s)
14:13:03.3127: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago
14:13:03.3129: Starting heartbeat...
14:13:03.3134: Total connect time: 70 ms
Lock acquired
Unhandled exception. System.AggregateException: One or more errors occurred. (Timeout performing EVAL (5000ms), next: SCRIPT, inst: 0, qu: 0, qs: 2, aw: False, bw: Inactive, rs: ReadAsync, ws: Idle, in: 0, in-pipe: 0, out-pipe: 0, last-in: 2, cur-in: 0, sync-ops: 2, async-ops: 0, serverEndpoint: localhost:6379, conn-sec: 5,09, aoc: 1, mc: 1/1/0, clientName: Mac-FXGL266(SE.Redis-v2.7.27.49176), IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=1,Free=32766,Min=16,Max=32767), POOL: (Threads=7,QueuedItems=0,CompletedItems=75,Timers=1), v: 2.7.27.49176 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts))
 ---> StackExchange.Redis.RedisTimeoutException: Timeout performing EVAL (5000ms), next: SCRIPT, inst: 0, qu: 0, qs: 2, aw: False, bw: Inactive, rs: ReadAsync, ws: Idle, in: 0, in-pipe: 0, out-pipe: 0, last-in: 2, cur-in: 0, sync-ops: 2, async-ops: 0, serverEndpoint: localhost:6379, conn-sec: 5,09, aoc: 1, mc: 1/1/0, clientName: Mac-FXGL266(SE.Redis-v2.7.27.49176), IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=1,Free=32766,Min=16,Max=32767), POOL: (Threads=7,QueuedItems=0,CompletedItems=75,Timers=1), v: 2.7.27.49176 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server, T defaultValue) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 2105
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server, T defaultValue) in /_/src/StackExchange.Redis/RedisBase.cs:line 62
   at StackExchange.Redis.RedisDatabase.ScriptEvaluate(String script, RedisKey[] keys, RedisValue[] values, CommandFlags flags) in /_/src/StackExchange.Redis/RedisDatabase.cs:line 1519
   at StackExchange.Redis.LuaScript.Evaluate(IDatabase db, Object ps, Nullable`1 withKeyPrefix, CommandFlags flags) in /_/src/StackExchange.Redis/LuaScript.cs:line 151
   at StackExchange.Redis.RedisDatabase.ScriptEvaluate(LuaScript script, Object parameters, CommandFlags flags) in /_/src/StackExchange.Redis/RedisDatabase.cs:line 1536
   at Medallion.Threading.Redis.Primitives.RedisScript`1.Execute(IDatabase database, TArgument argument, Boolean fireAndForget) in /_/src/DistributedLock.Redis/Primitives/RedisScript.cs:line 13
   at Medallion.Threading.Redis.Primitives.RedisMutexPrimitive.Release(IDatabase database, Boolean fireAndForget) in /_/src/DistributedLock.Redis/Primitives/RedisMutexPrimitive.cs:line 30
   at Medallion.Threading.Redis.RedLock.RedLockRelease.ReleaseAsync() in /_/src/DistributedLock.Redis/RedLock/RedLockRelease.cs:line 56
   --- End of inner exception stack trace ---
   at Medallion.Threading.Redis.RedLock.RedLockRelease.ReleaseAsync() in /_/src/DistributedLock.Redis/RedLock/RedLockRelease.cs:line 66
   at Medallion.Threading.Redis.RedLock.RedLockHandle.DisposeAsync() in /_/src/DistributedLock.Redis/RedLock/RedLockHandle.cs:line 46
   at Medallion.Threading.Internal.SyncViaAsync.<>c__3`1.<<Run>b__3_0>d.MoveNext() in /_/src/DistributedLock.Core/Internal/SyncViaAsync.cs:line 31
--- End of stack trace from previous location ---
   at Medallion.Threading.Internal.SyncViaAsync.Run[TState,TResult](Func`2 action, TState state) in /_/src/DistributedLock.Core/Internal/SyncViaAsync.cs:line 61
   at Medallion.Threading.Internal.SyncViaAsync.Run[TState](Func`2 action, TState state) in /_/src/DistributedLock.Core/Internal/SyncViaAsync.cs:line 28
   at Medallion.Threading.Internal.SyncViaAsync.DisposeSyncViaAsync[TDisposable](TDisposable disposable) in /_/src/DistributedLock.Core/Internal/SyncViaAsync.cs:line 100
   at Medallion.Threading.Redis.RedLock.RedLockHandle.Dispose() in /_/src/DistributedLock.Redis/RedLock/RedLockHandle.cs:line 35
   at Medallion.Threading.Redis.RedisDistributedLockHandle.Dispose() in /_/src/DistributedLock.Redis/RedisDistributedLockHandle.cs:line 26

Steps to reproduce the bug

Garnet Configuration

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: garnet
  namespace: garnet
spec:
  interval: 5m
  chart:
    spec:
      chart: garnet
      version: '0.2.0'
      sourceRef:
        kind: HelmRepository
        name: garnet
        namespace: flux-system
      interval: 60m
  values:
    image:
      tag: 1
    config:
      garnetConf: |
        {
          "EnableLua": true,
        }

Client

using Medallion.Threading;
using Medallion.Threading.Redis;
using StackExchange.Redis;

var config = ConfigurationOptions.Parse("localhost");
config.SocketManager = SocketManager.ThreadPool;

var connection = ConnectionMultiplexer.Connect(config, Console.Out);
var db = connection.GetDatabase();

var dLock = new RedisDistributedSynchronizationProvider(db);

var aLock = dLock.AcquireLock("123_lock");

Console.WriteLine("Lock acquired");

aLock.Dispose();

Console.WriteLine("Lock released");

Console.ReadKey();

Garnet Version

Garnet 1.0.59
@xsoheilalizadeh
Copy link
Author

xsoheilalizadeh commented Mar 18, 2025

@kevin-montrose, I tested it locally with your branch and it seems to resolve the issue.

#1075

@badrishc badrishc linked a pull request Mar 18, 2025 that will close this issue
14 tasks
@xsoheilalizadeh
Copy link
Author

xsoheilalizadeh commented Mar 18, 2025

I have tested it on main branch and it still works locally (Mac OS)! it seems that the issue appears when I run the server on a container.

@xsoheilalizadeh
Copy link
Author

xsoheilalizadeh commented Mar 18, 2025

I wouldn't close the issue because the issue persists when running garnet on a container.
@kevin-montrose

@badrishc
Copy link
Collaborator

Once the container for latest main auto-releases in at most an hour, can you retry?

@xsoheilalizadeh
Copy link
Author

I still have the issue with latest version running on a container.

$ docker run -p 6379:6379 ghcr.io/microsoft/garnet:1.0.60 --lua
    _________
   /_||___||_\      Garnet 1.0.60 64 bit; standalone mode
   '. \   / .'      Listening on: 0.0.0.0:6379
     '.\ /.'        https://aka.ms/GetGarnet
       '.'

* Ready to accept connections

@badrishc
Copy link
Collaborator

I still have the issue with latest version running on a container.

can you confirm you used latest and not 1.0.60? https://github.com/microsoft/garnet/pkgs/container/garnet/377009078?tag=latest

@kevin-montrose
Copy link
Contributor

Worth noting I cannot reproduce this on main.

docker run -d -p 8000:6379 ghcr.io/microsoft/garnet --lua

And a linqpad script of

void Main()
{
	var config = ConfigurationOptions.Parse("localhost:8000");
	config.SocketManager = SocketManager.ThreadPool;

	var connection = ConnectionMultiplexer.Connect(config, Console.Out);
	var db = connection.GetDatabase();

	var dLock = new RedisDistributedSynchronizationProvider(db);

	var aLock = dLock.AcquireLock("123_lock");

	Console.WriteLine("Lock acquired");

	aLock.Dispose();

	Console.WriteLine("Lock released");
}

Produces:

19:02:52.6634: Connecting (sync) on .NET 7.0.20 (StackExchange.Redis: v2.8.31.52602)
19:02:52.6635: localhost:8000
19:02:52.6636: localhost:8000/Interactive: Connecting...
19:02:52.6638: localhost:8000: BeginConnectAsync
19:02:52.6744: 1 unique nodes specified (with tiebreaker)
19:02:52.6745: localhost:8000: OnConnectedAsync init (State=Connecting)
19:02:52.6745: Allowing 1 endpoint(s) 00:00:05 to respond...
19:02:52.6746: Awaiting 1 available task completion(s) for 5000ms, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=2,Free=32765,Min=12,Max=32767), POOL: (Threads=4,QueuedItems=1,CompletedItems=413,Timers=1)
19:02:52.6901: localhost:8000/Interactive: Connected 
19:02:52.7071: localhost:8000: Server handshake
19:02:52.7147: localhost:8000: Setting client name: DESKTOP-ELRJVBQ(SE.Redis-v2.8.31.52602)
19:02:52.7273: localhost:8000: Setting client lib/ver
19:02:52.7302: localhost:8000: Auto-configuring...
19:02:52.7316: localhost:8000: Requesting tie-break (Key="__Booksleeve_TieBreak")...
19:02:52.7324: localhost:8000/Interactive: Writing: GET __Booksleeve_TieBreak
19:02:52.7328: localhost:8000: Sending critical tracer (handshake): ECHO
19:02:52.7328: localhost:8000/Interactive: Writing: ECHO
19:02:52.7328: localhost:8000: Flushing outbound buffer
19:02:52.7329: localhost:8000: OnEstablishingAsync complete
19:02:52.7329: localhost:8000: Starting read
19:02:52.7918: localhost:8000: Auto-configured (CLIENT) connection-id: 1
19:02:52.7929: localhost:8000: Auto-configured (CONFIG) read-only replica: false
19:02:52.7931: localhost:8000: Auto-configured (CONFIG) databases: 1
19:02:52.7941: localhost:8000: Auto-configured (INFO) role: primary
19:02:52.7947: localhost:8000: Auto-configured (INFO) version: 7.2.5
19:02:52.7957: localhost:8000: Auto-configured (INFO) server-type: Standalone
19:02:52.7961: Response from localhost:8000/Interactive / GET __Booksleeve_TieBreak: (null)
19:02:52.7965: Response from localhost:8000/Interactive / ECHO: BulkString: 16 bytes
19:02:52.7967: localhost:8000: OnConnectedAsync completed (From command: ECHO)
19:02:52.7967: All 1 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=2,Free=32765,Min=12,Max=32767), POOL: (Threads=7,QueuedItems=0,CompletedItems=449,Timers=0)
19:02:52.7967: Endpoint summary:
19:02:52.7968:   localhost:8000: Endpoint is (Interactive: ConnectedEstablished, Subscription: ConnectedEstablished)
19:02:52.7968: Task summary:
19:02:52.7970:   localhost:8000: Returned with success as Standalone primary (Source: From command: ECHO)
19:02:52.7971: Election summary:
19:02:52.7971:   Election: localhost:8000 had no tiebreaker set
19:02:52.7971:   Election: Single primary detected: localhost:8000
19:02:52.7971: localhost:8000: Clearing as RedundantPrimary
19:02:52.7971: Endpoint Summary:
19:02:52.7972:   localhost:8000: Standalone v7.2.5, primary; 1 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: ConnectedEstablished, 1 active
19:02:52.7972:   localhost:8000: int ops=12, qu=0, qs=1, qc=0, wr=0, socks=1; sub ops=6, qu=0, qs=0, qc=0, wr=0, subs=1, socks=1
19:02:52.7974:   localhost:8000: Circular op-count snapshot; int: 0+13=13 (1.30 ops/s; spans 10s); sub: 0+6=6 (0.60 ops/s; spans 10s)
19:02:52.7974: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago
19:02:52.7975: Starting heartbeat...
19:02:52.7982: Total connect time: 134 ms
Lock acquired
Lock released

@xsoheilalizadeh
Copy link
Author

I'm running it on another device and still have the same issue

Screen.Recording.2025-03-19.at.20.51.47.mp4

@badrishc
Copy link
Collaborator

badrishc commented Mar 20, 2025

Can you retry with server running on --logger-level Trace and post the server-side logs?

@xsoheilalizadeh
Copy link
Author

$ docker run -p 6379:6379 ghcr.io/microsoft/garnet:latest --lua --logger-level Trace
    _________
   /_||___||_\      Garnet 1.0.59 64 bit; standalone mode
   '. \   / .'      Listening on: 0.0.0.0:6379
     '.\ /.'        https://aka.ms/GetGarnet
       '.'

07::07::56 info: GarnetServer[0] Garnet 1.0.59 64 bit; standalone mode; Endpoint: 0.0.0.0:6379
07::07::56 info: ArgParser[0] Configuration import from embedded resource succeeded. Path: defaults.conf.
07::07::56 info: ArgParser[0] Configuration file path not specified. Using default values with command-line switches.
07::07::56 info: Options[0] [Store] Using page size of 32m
07::07::56 info: Options[0] [Store] Using log memory size of 16g
07::07::56 info: Options[0] [Store] There are 512 log pages in memory
07::07::56 info: Options[0] [Store] Using disk segment size of 1g
07::07::56 info: Options[0] [Store] Using hash index size of 128m (2m cache lines)
07::07::56 info: Options[0] [Store] Hash index size is optimized for up to ~8m distinct keys
07::07::56 info: Options[0] [Store] Using log mutable percentage of 90%
07::07::56 info: Options[0] [Store] Not using Revivification
07::07::56 trce: TsavoriteKV [main][0] KV Initialize size:2097152, sizeBytes:134217728 sectorSize:512 alignedSizeBytes:134218240
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Using page size of 1m
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Each page can hold ~43690 key-value pairs of objects
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Using log memory size of 32m
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] This can hold ~1398101 key-value pairs of objects in memory total
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] There are 32 log pages in memory
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Using disk segment size of 32m
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Using hash index size of 16m (256k cache lines)
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Hash index size is optimized for up to ~1m distinct keys
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Using log mutable percentage of 90%
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Heap memory size is unlimited
07::07::56 info: TsavoriteKV  [obj][0] [Object Store] Not using Revivification
07::07::56 trce: GarnetServer[0] TLS is disabled
07::07::56 info: GarnetServer[0] Total configured memory limit: 17364418560
* Ready to accept connections
07::08::05 dbug: GarnetServer[0] Accepted TCP connection from 192.168.65.1:45693
07::08::05 dbug: GarnetServer[0] Accepted TCP connection from 192.168.65.1:23439
07::08::05 dbug: Session[0] [192.168.65.1:23439] [03DD3515] Starting RespServerSession Id=1
07::08::05 dbug: Session[0] [192.168.65.1:45693] [01F9ED70] Starting RespServerSession Id=2
07::08::05 dbug: Session[0] [] [01F731FC] Starting RespServerSession Id=0
07::08::05 dbug: Session[0] [] [009A2F17] Starting RespServerSession Id=0
07::08::10 dbug: Session[0] [192.168.65.1:45693] [01F9ED70] Disposing RespServerSession Id=2
07::08::10 dbug: Session[0] [192.168.65.1:23439] [03DD3515] Disposing RespServerSession Id=1
07::08::10 dbug: Session[0] [] [01F731FC] Disposing RespServerSession Id=0
07::08::10 dbug: Session[0] [] [009A2F17] Disposing RespServerSession Id=0

@badrishc
Copy link
Collaborator

Not seeing anything of note in the logs at this point to make progress on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants