Attention map gets the same value in each row when using scale factor 0.01

When I trained a model with the same attention structure and used 0.01 as the scale factor, as in the code, I found that the attention map ends up with the same value in each row. Is the scale factor different between the training phase and the testing phase?
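
To illustrate the symptom, here is a minimal sketch. It assumes the attention weights come from a softmax over the scaled similarity scores; the `softmax` helper and the example scores are hypothetical and are not taken from the repository code. With a scale of 0.01 the logits are squeezed so close together that the softmax output is nearly uniform, which may be what I am seeing:

```python
import math


def softmax(scores, scale):
    """Softmax over one row of raw scores after multiplying by `scale`."""
    scaled = [s * scale for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw similarity scores for a single attention row.
scores = [2.0, -1.0, 0.5, 3.0]

row_small = softmax(scores, 0.01)  # scale factor from the question
row_unit = softmax(scores, 1.0)    # unscaled, for comparison

# Gap between the largest and smallest weight in each row.
spread_small = max(row_small) - min(row_small)
spread_unit = max(row_unit) - min(row_unit)

print("scale=0.01:", [round(w, 4) for w in row_small])
print("scale=1.0: ", [round(w, 4) for w in row_unit])
```

If the raw scores in my model are small in magnitude, this flattening would happen regardless of whether the model is in training or testing mode.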