Conversation

@YWHyuk YWHyuk commented Apr 28, 2022

CC-lock is based on the flat-combining lock algorithm.
In this lock, only one thread, called the combiner
thread, executes the requests of the critical sections.
The combiner thread can therefore exploit locality and
avoid high contention on the lock variable.

When each CPU uses only one node, assume the lock
holds node A. In this case, node A's
(wait, completed) status should be (false, false).

Lock
 |
 A
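The (wait, completed) pair can be modeled with a minimal struct. This is an illustrative sketch only: the field names mirror the status bits described above, but the struct layout and the `holds_lock` helper are assumptions, not the actual patch code.

```c
#include <stdbool.h>

/* Illustrative sketch of a per-CPU combining node with the
 * (wait, completed) status bits described above. */
struct cc_node {
    bool wait;       /* true while this CPU spins, waiting on the combiner */
    bool completed;  /* true once the combiner has executed the request    */
};

/* The node currently holding the lock is neither waiting nor finished,
 * i.e. its status is (false, false). */
static bool holds_lock(const struct cc_node *n)
{
    return !n->wait && !n->completed;
}
```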

When CPUs A and B race, assume that B wins.
Then B will try to spin on A's wait status.

A   ->   B
w:F      w:T

At the same time, A was enqueued again, so A's wait
status was set to true, as below.

A   ->   B   ->   A
w:T      w:T      w:T

This leads to a deadlock.

To avoid this node-reuse problem, each CPU has two
cc_nodes, which are used alternately.

A_0 ->   B_0 ->   A_1
w:F      w:T      w:T
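The alternation can be sketched as a two-element node array with a toggling index per CPU. The `percpu_cc` and `pick_node` names are hypothetical; the real patch may arrange this differently, but the effect is the same: a node that is still queued is never handed out a second time.

```c
#include <stdbool.h>

struct cc_node {
    bool wait;
    bool completed;
};

/* Hypothetical per-CPU state: two cc_nodes used alternately, so a node
 * whose previous enqueue is still outstanding (wait == true) is never
 * re-enqueued. */
struct percpu_cc {
    struct cc_node node[2];
    int idx;                 /* index of the node to use next */
};

static struct cc_node *pick_node(struct percpu_cc *p)
{
    struct cc_node *n = &p->node[p->idx];
    p->idx ^= 1;             /* alternate 0, 1, 0, 1, ... */
    return n;
}
```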

Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>

YWHyuk added 10 commits April 28, 2022 19:24
Testing reported that there is a deadlock. The
situation is below.

Node(0, 1) {
	req = 00000000d0495726,
	params = 000000002f36f5ac,
	wait = 0, completed = 1,
	refcount = 0,
	Next (2, 0)
	Prev (0, 0)
}

Node(2, 0) {
	req = 00000000d0495726,
	params = 000000002f36f5ac,
	wait = 1, completed = 0,
	refcount = 0,
	Next (2, 1)
	Prev (0, 1)
}

Node(0, 1)'s request has been handled, so its (wait,
completed) status is (0, 1). But its next node,
Node(2, 0), still has wait = 1. The combiner thread
should set Node(2, 0)'s wait = 0. The previous logic
set wait = 0 when DECODE_CPU(pending_cpu) != NR_CPUS.

But there can be a race between the combiner thread
and a normal thread. The combiner thread checks
node->req first and then checks node->next, so the
situation below can occur:

Node(0, 1)			Node(2, 0)
				prev->req = req
if(pending->req)
...
DECODE_CPU(pending->next)
				prev->next = this_cpu

To fix this, the combiner thread now checks node->next first.
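A sketch of the corrected check order, with invented names (`combiner_handover` is not from the patch): because the enqueuer publishes prev->req first and prev->next second, loading next before looking at req means the combiner never acts on a half-published successor.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative node layout; the enqueuing CPU writes ->req first and
 * ->next second, as in the interleaving shown above. */
struct cc_node {
    void *req;               /* non-NULL once a request is posted  */
    struct cc_node *next;    /* written after req by the enqueuer  */
    bool wait;
};

/* Fixed order: read next FIRST. If next is not yet visible, bail out
 * and retry later instead of concluding from req alone. */
static struct cc_node *combiner_handover(struct cc_node *pending)
{
    struct cc_node *next = pending->next;
    if (next == NULL || pending->req == NULL)
        return NULL;         /* publication incomplete */
    next->wait = false;      /* let the successor stop spinning */
    return next;
}
```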

Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Previously, the test thread used jiffies to measure the
elapsed time. But its resolution is low, so all the
results were zero or one. Use sched_clock() instead.

Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
To keep the ordering between reading node->next and
writing node->wait and node->completed, a full barrier,
smp_mb(), should be used instead of a weaker barrier.
So fix it.
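In C11 terms (an analogue only; the kernel uses its own smp_* barriers, and `finish_and_handover` is an invented name), the combiner's load of node->next must not be reordered past its later stores to wait/completed. Ordering a load against later stores requires a full fence; a store-store (release-style) fence would not constrain the load.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct cc_node {
    struct cc_node *_Atomic next;
    atomic_bool wait;
    atomic_bool completed;
};

/* The relaxed load of ->next must be ordered before the stores below.
 * atomic_thread_fence(memory_order_seq_cst) plays the role of smp_mb();
 * a write-only barrier would leave the load free to move down. */
static struct cc_node *finish_and_handover(struct cc_node *node)
{
    struct cc_node *next = atomic_load_explicit(&node->next,
                                                memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* smp_mb() analogue */
    atomic_store_explicit(&node->completed, true, memory_order_relaxed);
    if (next)
        atomic_store_explicit(&next->wait, false, memory_order_relaxed);
    return next;
}
```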

Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Using "echo 2 > trigger", the spinlock-based
benchmark can be run.

Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>