cpu: rv64: add support for f32 layer normalization using RVV #4453

krishnasai-mcw · 2025-12-12T11:20:16Z

Description

This PR introduces optimized layer normalization primitive for RV64 architectures using RVV (RISC-V Vector) intrinsics.

Features and Limitations

Supported:

Data type: f32
Input: plain and dense layouts, with the last dimension contiguous
Stats modes:
- Global stats (mean and variance provided externally)
- Calculate stats (kernel computes mean and variance internally)
Scale & shift
Forward propagation - FWD_I & FWD_D

Not supported:

Post-ops
RMS layer normalization
Backward propagation

Implementation Details

The implementation uses an LMUL=1 configuration with 4-way loop unrolling to maximize throughput while maintaining register pressure. For statistics calculation, the kernel uses double-precision (f64) accumulators via widening operations to maintain numerical accuracy across large reduction operations. The normalization pass applies the standard formula ((x - mean) / std) × scale + shift using fused multiply-accumulate instructions to efficiently compute the final output.

Checklist

General

Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
Have you formatted the code using clang-format?

BenchDNN test log: test_lnorm_all.log

Performance improvements

Have you submitted performance data that demonstrates performance improvements?

Performance Results

Test Platform: Banana Pi F3

Test Case	Vector	Scalar	Speedup
option_set_all	86.46	1055.51	12.21
option_set_fwks_ext_gpu	62.36	765.01	12.27
shapes_ci	2.67	29.36	11.00

zhangjian29 · 2025-12-13T05:49:05Z

Hi @krishnasai-mcw ,

I noticed that you're using f64 data type. Do you think it is necessary to add a check to ensure the platform supports f64?

src/cpu/rv64/rvv_layer_normalization.cpp

krishnasai-mcw · 2025-12-15T06:02:46Z

Hi @krishnasai-mcw ,

I noticed that you're using f64 data type. Do you think it is necessary to add a check to ensure the platform supports f64?

Hi @zhangjian29 ,
No extra check for f64 is needed. The RISC-V G extension itself includes the D extension, which provides full double-precision (f64) support. In our CMake setup, we already target G by default, ensuring f64 is supported by default.

Thanks

krishnasai-mcw requested review from a team as code owners December 12, 2025 11:20

github-actions bot added platform:cpu-rv64 RISC-V component:common labels Dec 12, 2025

krishnasai-mcw force-pushed the rv64-layernorm branch 2 times, most recently from 2645bb4 to e0fcf42 Compare December 12, 2025 14:26

krishnasai-mcw marked this pull request as draft December 12, 2025 15:01

cpu: rv64: add support for f32 layer normalization using RVV

59da588

krishnasai-mcw force-pushed the rv64-layernorm branch from e0fcf42 to 59da588 Compare December 12, 2025 17:48

krishnasai-mcw marked this pull request as ready for review December 12, 2025 18:45

zhangjian29 approved these changes Dec 13, 2025

View reviewed changes

zhangfeiv0 reviewed Dec 15, 2025

View reviewed changes

src/cpu/rv64/rvv_layer_normalization.cpp Outdated Show resolved Hide resolved

cpu: rv64: lnorm: clean up and reorder includes

e7c9dfc

krishnasai-mcw force-pushed the rv64-layernorm branch from 69d161f to e7c9dfc Compare December 15, 2025 05:12

krishnasai-mcw requested a review from zhangfeiv0 December 15, 2025 17:20

zhangfeiv0 approved these changes Dec 16, 2025

View reviewed changes

zhangfeiv0 mentioned this pull request Dec 23, 2025

cpu: rv64: add support for f32 group normalization using RVV #4480

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

cpu: rv64: add support for f32 layer normalization using RVV #4453

cpu: rv64: add support for f32 layer normalization using RVV #4453

Uh oh!

krishnasai-mcw commented Dec 12, 2025 •

edited

Loading

Uh oh!

zhangjian29 commented Dec 13, 2025

Uh oh!

Uh oh!

krishnasai-mcw commented Dec 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

cpu: rv64: add support for f32 layer normalization using RVV #4453

Are you sure you want to change the base?

cpu: rv64: add support for f32 layer normalization using RVV #4453

Uh oh!

Conversation

krishnasai-mcw commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Features and Limitations

Implementation Details

Checklist

General

Performance improvements

Performance Results

Uh oh!

zhangjian29 commented Dec 13, 2025

Uh oh!

Uh oh!

krishnasai-mcw commented Dec 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

krishnasai-mcw commented Dec 12, 2025 •

edited

Loading