LayerNorm behaves abnormally during model training #111

Open
JeffTianZy opened this issue Feb 12, 2025 · 1 comment

Comments

@JeffTianZy

First of all, thank you for your work; it is truly groundbreaking.

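I ran my own reproduction on the CASIA-CHINESE dataset and can essentially reproduce the results reported in the paper. However, during the first stage (content encoder pre-training), using the model as you originally provided it, I found that the LayerNorm at feature_ext.encoder.layers.2.norm2 ends up with very abnormal values. The statistics of its parameters are as follows: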
| Parameter | min | max | mean | std | sparsity |
| --- | --- | --- | --- | --- | --- |
| Weight | -6.181015422259622e-40 | 6.090155229832801e-40 | 2.0480070111083348e-41 | 5.035285077071834e-40 | 0.0% |
| Bias | -6.128326600001009e-40 | 6.282567521969242e-40 | 2.722732221680734e-41 | 5.035820970065177e-40 | 0.0% |
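
For context on these numbers: every reported magnitude is below the smallest normal float32 value (2^-126, about 1.18e-38). If the parameters are stored as float32, they therefore sit in the subnormal range, which would be consistent with a sparsity of 0.0% even though the values are effectively zero. Below is a minimal sketch in plain Python that checks this, assuming float32 storage. The helper names (`to_float32`, `is_float32_subnormal`, `summarize`) are hypothetical and not part of the repository, and the exact definitions of std and sparsity used to produce the figures above are not known.

```python
import math
import struct

# Smallest positive *normal* float32 value (2**-126, about 1.1755e-38).
FLOAT32_MIN_NORMAL = 2.0 ** -126


def to_float32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def is_float32_subnormal(x):
    """True if x is nonzero but smaller in magnitude than the smallest normal float32."""
    return x != 0.0 and abs(x) < FLOAT32_MIN_NORMAL


def summarize(values):
    """Return min/max/mean/std/sparsity plus the fraction of subnormal entries."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (assumption; a library may use the sample version).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": std,
        "sparsity": sum(v == 0.0 for v in values) / n,
        "subnormal_fraction": sum(is_float32_subnormal(v) for v in values) / n,
    }


# Only the reported min and max of the weight, not the full parameter tensor.
reported = [to_float32(-6.181015422259622e-40), to_float32(6.090155229832801e-40)]
stats = summarize(reported)
print(stats["sparsity"], stats["subnormal_fraction"])
```

On a real checkpoint the same summary could be applied to the flattened parameter values, for example after converting the tensor to a Python list.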

Have you observed a similar phenomenon? What also puzzles me is that this behavior does not affect the final model performance 🤣

One more question: have you tried improving the style features for better style control? Thanks.

@dailenson
Owner

dailenson commented Feb 18, 2025

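Thank you for the very careful experimental observation! My guess is that this may be caused by this small bug: #95. As for why it does not affect performance: perhaps as long as the values are not all exactly 0, the features can still be computed in the forward pass? Also, which aspect do you mean by improvements in style control?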
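
The "not all exactly 0" conjecture can be illustrated with a toy example. The sketch below is a hand-written layer normalization in plain Python, not the repository's code; `layer_norm`, `to_float32`, the input vector, and the `eps` value are assumptions chosen for illustration. It only shows that a tiny nonzero scale keeps the relative structure of the normalized features, while an all-zero scale erases it. It does not establish that this is what happens inside the actual model.

```python
import math
import struct


def to_float32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def layer_norm(x, gamma, beta, eps=1e-5):
    """Toy layer normalization over one feature vector.

    The arithmetic runs in Python double precision; only the final outputs
    are rounded to float32 to mimic float32 storage.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [to_float32((v - mean) * inv_std * g + b)
            for v, g, b in zip(x, gamma, beta)]


x = [0.5, -1.0, 2.0, 0.25]  # hypothetical input features

# Scale of roughly the magnitude reported in the issue, bias set to 0 for clarity.
tiny = layer_norm(x, [6e-40] * 4, [0.0] * 4)
# Scale that is exactly zero.
zero = layer_norm(x, [0.0] * 4, [0.0] * 4)

print(all(v != 0.0 for v in tiny))  # tiny scale: outputs stay nonzero
print(all(v == 0.0 for v in zero))  # zero scale: all information is lost
```

With the tiny scale the outputs keep the same ordering as the inputs, so in principle a later stage that rescales its input could still use them. Whether the downstream layers of the model actually do so is a separate question.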
