[Bug]: Error occurred when upgrading CN #20747

Closed
1 task done
DanielZhangQD opened this issue Dec 13, 2024 · 7 comments
Assignees
Labels
kind/bug Something isn't working phase/testing severity/s0 Extreme impact: Cause the application to break down and seriously affect the use
Milestone

Comments

@DanielZhangQD
Contributor

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

2.0-dev

Commit ID

v2.0.1-64e4751dc-2024-12-12

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

The application reports the following errors while CN is being upgraded:

context canceled; sql: transaction has already been committed or rolled back
Error 1105 (HY000): EOF

Detailed application logs are available at https://grafana.matrixonecloud.cn/goto/ahr8LkIHg?orgId=1

Expected Behavior

No errors are returned.

Steps to Reproduce

NA

Additional information

Cloud prod

@DanielZhangQD added the kind/bug, needs-triage, and severity/s0 labels on Dec 13, 2024
@DanielZhangQD added this to the 2.0.2 milestone on Dec 13, 2024
@LeftHandCold
Contributor

LeftHandCold commented Dec 13, 2024

Where are the kernel logs? @DanielZhangQD

@DanielZhangQD
Contributor Author

@LeftHandCold
Contributor

@daviszhen, could you please take a look?

@qingxinhome
Contributor

qingxinhome commented Dec 23, 2024

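The error message `sql: transaction has already been committed or rolled back` comes from the database/sql package in the Go standard library; specifically, it is the ErrTxDone error. It means that an operation (such as a call to Exec, Commit, or Rollback) was attempted on a transaction that has already been committed or rolled back. The transaction object is in the "done" state and can no longer be used.

Cause of the error:
In Go's database/sql package, once you begin a transaction (BeginTx), you can execute any number of SQL operations in it until you call Commit or Rollback. Once the transaction finishes (that is, Commit or Rollback has been called), the transaction object enters the "done" state. Any operation attempted on that transaction object after it has been committed or rolled back returns ErrTxDone.

Common scenarios in which the error occurs:

  1. Executing a DML operation after Commit or Rollback
  2. Operating on the same transaction object concurrently
  3. Accessing the transaction object again after calling Commit or Rollback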

@qingxinhome
Contributor

qingxinhome commented Dec 23, 2024

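Analysis shows that the `transaction has already been committed or rolled back` error was caused by an error in MO's internal cn-service.lockservice.
The specific error message is EOF, which may be the result of an interrupted internal connection.

  1. Cloud logs: capture a statement_id
    image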

https://grafana.matrixonecloud.cn/explore?panes=%7B%22AqZ%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-cloud%5C%22%7D%20%7C%3D%20%60ERROR%60%20%21%3D%20%60cronjob%20backupTask%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221734051600000%22,%22to%22:%221734055199000%22%7D%7D%7D&schemaVersion=1&orgId=1

CN logs: use the statement_id to find the corresponding CN SQL log and obtain the corresponding transaction ID
image

https://grafana.matrixonecloud.cn/explore?panes=%7B%22rangeQuery%22:%7B%22datasource%22:%22fe9f0f46-57bd-4343-8755-8d67d0429b65%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bmatrixorigin_io_component%3D%5C%22CNSet%5C%22%7D%20%7C%3D%20%600193bdb8-4640-78b0-b5dc-12cf998017f4%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22fe9f0f46-57bd-4343-8755-8d67d0429b65%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221734051263000%22,%22to%22:%221734054863000%22%7D%7D%7D&schemaVersion=1&orgId=1

Get the CN transaction operation logs:
image
https://grafana.matrixonecloud.cn/explore?panes=%7B%22rangeQuery%22:%7B%22datasource%22:%22fe9f0f46-57bd-4343-8755-8d67d0429b65%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bmatrixorigin_io_component%3D%5C%22CNSet%5C%22%7D%20%7C%3D%20%608583bdabc2279bab181098dbf02baa6b%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22fe9f0f46-57bd-4343-8755-8d67d0429b65%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221734051263000%22,%22to%22:%221734054863000%22%7D%7D%7D&schemaVersion=1&orgId=1

@qingxinhome
Contributor

@iamlinjunhong, could you please take a look?

@aressu1985
Collaborator

fixed

7 participants