DistributedAlbumStorage

This is a simple Spring Boot album storage project with RabbitMQ and MySQL.

Ideas and Construction

  • I have implemented a new Review Servlet and a corresponding GET review mechanism to acquire the like and dislike counters. The client is extended to launch 3 additional threads (dedicated to GET /review operations) after the completion of the first group of POST threads, ensuring that a sufficient number of album IDs have been generated.

  • Spring Boot Setup
    The server is set up using Spring Boot. The RESTful API endpoints are implemented via Servlets and/or Spring MVC controllers.

    • POST and GET endpoints for the album profile and album image (an example image) as the base information
      • /albums
        • Profile: Artist, Title, Year
        • Image: BLOB format
      • /albums/{albumID}
    • Review receives requests in the format /review/{likeOrNot}/{albumID} (with message content in JSON), converts the review selection to a JSON payload, and publishes this message to RabbitMQ.
  • RabbitMQ Integration
    The server and RabbitMQ are deployed on two separate AWS EC2 instances. This separation significantly improved the RabbitMQ client connection, processing rate, and overall throughput. On the server side, a RabbitMQ channel pool is configured to efficiently manage channels across multiple threads. When a review message is processed by the consumer, it sends an acknowledgment (ACK) back to RabbitMQ, confirming that the message has been handled.
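The channel-pool-plus-publish flow above can be sketched in plain Java. This is a minimal illustration, not the project's actual code: the `Channel` interface below is a simplified stand-in for `com.rabbitmq.client.Channel` (the real `basicPublish` also takes an `AMQP.BasicProperties` argument), and the class and queue names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Simplified stand-in for com.rabbitmq.client.Channel so the sketch is
// self-contained; the real method also takes AMQP.BasicProperties.
interface Channel {
    void basicPublish(String exchange, String routingKey, byte[] body) throws Exception;
}

// Fixed-size channel pool: threads borrow a channel, publish, and return it,
// instead of opening a new channel per request.
class ChannelPool {
    private final BlockingQueue<Channel> pool;

    ChannelPool(List<Channel> channels) {
        this.pool = new ArrayBlockingQueue<>(channels.size(), false, channels);
    }

    Channel borrow() throws InterruptedException { return pool.take(); }
    void release(Channel ch) { pool.offer(ch); }
    int available() { return pool.size(); }
}

// Converts a review selection into a JSON payload and publishes it to RabbitMQ.
class ReviewPublisher {
    private final ChannelPool pool;
    ReviewPublisher(ChannelPool pool) { this.pool = pool; }

    void publish(String likeOrNot, String albumId) throws Exception {
        String json = String.format(
            "{\"albumID\":\"%s\",\"reviewType\":\"%s\"}", albumId, likeOrNot);
        Channel ch = pool.borrow();
        try {
            ch.basicPublish("", "review_queue", json.getBytes(StandardCharsets.UTF_8));
        } finally {
            pool.release(ch); // always return the channel, even on failure
        }
    }
}
```

Borrowing from a `BlockingQueue` naturally throttles publishers to the pool size, which is why the channel pool helps under many concurrent threads.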

  • Database Configuration
    The backend uses an AWS RDS MySQL database (a db.t4g.micro instance with 1 GB RAM), while the application server runs on a free-tier t2.micro instance. The database schema has been redesigned to split the Profile, Image Blob, and Review data into three separate tables, with albumID as the primary key connecting them. This normalization helps maintain data integrity and makes queries and updates more efficient.
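The three-table split could look like the DDL below. This is a hypothetical sketch inferred from the description; the actual table and column names in the project may differ.

```sql
-- Hypothetical schema: one row of profile data per album, with image and
-- review counters split into their own tables keyed by the same albumID.
CREATE TABLE album_profile (
  album_id     BIGINT AUTO_INCREMENT PRIMARY KEY,
  artist       VARCHAR(255) NOT NULL,
  title        VARCHAR(255) NOT NULL,
  release_year INT
);

CREATE TABLE album_image (
  album_id BIGINT PRIMARY KEY,
  image    MEDIUMBLOB,
  FOREIGN KEY (album_id) REFERENCES album_profile(album_id)
);

CREATE TABLE album_review (
  album_id BIGINT PRIMARY KEY,
  likes    INT NOT NULL DEFAULT 0,
  dislikes INT NOT NULL DEFAULT 0,
  FOREIGN KEY (album_id) REFERENCES album_profile(album_id)
);
```

Keeping the BLOB out of the profile table means profile reads and review-counter updates never touch image data.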

  • RabbitMQ Consumers:
    The RabbitMQ Consumers are responsible for consuming messages from the queue and performing the corresponding database updates (i.e., incrementing like or dislike counters). Multiple consumer threads are spawned via a fixed thread pool, with each thread using its own dedicated channel to interact with RabbitMQ. This setup ensures that messages are processed concurrently and efficiently. Once a message is successfully processed, the consumer sends an ACK back to the RabbitMQ server.
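The consumer side can be sketched as follows. This is a simplified, self-contained illustration: a `BlockingQueue<String>` stands in for the RabbitMQ subscription, in-memory counters stand in for the database update, and the `acked` counter stands in for `channel.basicAck(deliveryTag, false)`; class and message-format names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Worker-pool consumer sketch: each worker blocks on the queue, applies the
// "DB update" (here an in-memory counter), then acknowledges the message.
class ReviewConsumer {
    final Map<String, AtomicInteger> likes = new ConcurrentHashMap<>();
    final Map<String, AtomicInteger> dislikes = new ConcurrentHashMap<>();
    final AtomicInteger acked = new AtomicInteger();

    // Messages use a toy format: "like:albumID" or "dislike:albumID".
    void consume(BlockingQueue<String> queue, int workers, int expected)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch done = new CountDownLatch(expected);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String msg = queue.take();            // blocking receive
                        String[] parts = msg.split(":", 2);
                        Map<String, AtomicInteger> target =
                            parts[0].equals("like") ? likes : dislikes;
                        target.computeIfAbsent(parts[1], k -> new AtomicInteger())
                              .incrementAndGet();             // "DB update" step
                        acked.incrementAndGet();              // ACK after success
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();       // pool shutdown
                }
            });
        }
        done.await();
        pool.shutdownNow();
    }
}
```

Acknowledging only after the update succeeds is what lets RabbitMQ redeliver a message if a consumer dies mid-processing.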

  • Client Construction:

    • The client remains the original Java client, which is responsible for initiating POST requests (for album creation and posting of reviews).
    • In addition, an extra runnable thread (or three threads) is activated to continuously perform GET /review requests. These GET threads operate by selecting a random albumID from a shared collection populated by successful POST operations. This design guarantees that the GET threads always work with valid album IDs.
    • The client records metrics (including latency, throughput, and success/failure counts) for both POST and GET operations and outputs detailed statistics upon completion of the test.
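The shared-collection trick the GET threads rely on can be sketched in a few lines. This is an illustrative snippet (the class name `AlbumIdPool` is hypothetical), assuming a thread-safe list written by POST threads and read by GET threads:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadLocalRandom;

// Shared collection of album IDs produced by successful POSTs; GET threads
// sample from it so they only ever request albums that actually exist.
class AlbumIdPool {
    private final List<String> ids = new CopyOnWriteArrayList<>();

    // Called by a POST thread after a successful album creation.
    void recordPosted(String albumId) { ids.add(albumId); }

    // Returns a random known albumID, or null if nothing has been posted yet.
    String randomId() {
        if (ids.isEmpty()) return null;
        return ids.get(ThreadLocalRandom.current().nextInt(ids.size()));
    }
}
```

A `CopyOnWriteArrayList` suits this access pattern (many reads, append-only writes); GET threads simply skip their iteration when `randomId()` returns null.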

Experiment:

  • Phase 1: Warm-up
    • Launch 10 threads, each issuing 100 POST requests.
  • Phase 2: Stress Test
    • 10 threads per thread group, tested with different loads: thread pools of 10, 20, and 30.
    • 100 POST requests per thread.
    • Each posted album, along with its profile information, receives 2 like reviews and 1 dislike review, posted asynchronously.
    • Finally, after the first thread group finishes, 3 GET threads are started to fetch reviews for random existing albums, running until all thread groups finish, so as to perform as many GETs as possible.
  • Phase 3: Try different settings
    • Gradually increase the client’s thread pool for a higher concurrent rate of messages being written to MySQL and pushed to RabbitMQ for the consumers to process.
    • Gradually increase the consumer’s thread count to find the maximum achievable throughput.
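The warm-up gate in Phase 1 (all warm-up threads must finish before the stress phase starts) can be sketched with a `CountDownLatch`. This is an illustrative driver, not the project's client; `sendPost` is a placeholder for the real HTTP call:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Warm-up driver sketch: N threads each issue R requests, and the caller
// blocks until every thread finishes before moving on to the stress phase.
class WarmupPhase {
    static int run(int threads, int requestsPerThread, Runnable sendPost)
            throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                for (int r = 0; r < requestsPerThread; r++) {
                    sendPost.run();               // placeholder for the HTTP POST
                    completed.incrementAndGet();
                }
                done.countDown();
            }).start();
        }
        done.await();   // Phase 2 starts only after the warm-up fully completes
        return completed.get();
    }
}
```

With the Phase 1 parameters (10 threads × 100 requests), `run` returns 1000 completed requests before the stress test begins.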

Test Analysis:

  • Comparison across 30 groups with different numbers of consumer core threads and different client thread pool sizes.
    • The highest POST throughput was achieved at:
      • corePoolSize=160, maxPoolSize=190, with consumer connections 30/60, reaching 1381.82 req/s.
      • core=200 with conn=60/80 also reached 1368.08 req/s.
    • POST throughput peaks around corePoolSize=230 and maxPoolSize=260 with consumer connections at 50/70 or 60/80, but it begins to decline slightly when pushed to core=230, possibly due to more DB connections and data being loaded, or thread contention.
    • GET throughput shows a decreasing trend as POST load increases, since the DB connections are a shared resource.
    • Latency increases with larger pool sizes; for POST p99 in particular, queuing delays or DB bottlenecks grow under high concurrency.
    • There have been no failed requests yet, which is good but a bit concerning, since I experienced many failed attempts in my last homework.
  • Comparison with the same thread pool and consumer connections but different numbers of jobs.
    • The highest throughput is in the 20-group test, which suggests a bottleneck is hit at higher concurrent request rates.
  • Since RabbitMQ was moved to another instance, it is no longer the largest bottleneck; the RDS connection limit still is.

Test Data:

  • Comparison across 30 groups with different numbers of consumer core threads and different client thread pool sizes.

Test Process One: increasing core pool size and RabbitMQ consumer size

| Core/Max Pool Size       | 70-100 | 100-130 | 130-160 | 160-190 | 160-190* | 200-230 | 200-230* | 230-260 | 230-260 |
|--------------------------|--------|---------|---------|---------|----------|---------|----------|---------|---------|
| Consumer Connection Size | 30-50  | 30-50   | 30-50   | 30-50   | 30-60    | 30-60   | 50-70    | 50-70   | 60-80   |
| POST Throughput (req/s)  | 1255.2 | 1156.6  | 1200.7  | 1182.7  | 1381.8   | 1198.0  | 1338.1   | 1165.5  | 1368.1  |
| POST Total               | 12000  | 12000   | 12000   | 12000   | 12000    | 12000   | 12000    | 12000   | 12000   |
| GET Throughput (req/s)   | 79.19  | 53.93   | 48.72   | 47.46   | 55.91    | 43.29   | 49.01    | 37.66   | 51      |
| GET Success              | 7571   | 5595    | 4869    | 4815    | 4855     | 4336    | 4395     | 3877    | 4473    |

Notes: the RabbitMQ queue reached 6k in one run and 2K in another.

Data That Brought More Confusion

  • I tried increasing the core size further to see whether the queue would accumulate again, but the accumulation didn't happen.
| Core/Max Pool Size     | 230-260 | 230-260 | 330-380 | 350-400 | 350-500 |
|------------------------|---------|---------|---------|---------|---------|
| Client Connection Size | 50-70   | 60-80   | 60-80   | 60-80   | 60-80   |
| Wall Time (s)          | 102.95  | 87.71   | 89.17   | 84.69   | 86.65   |
| POST Throughput        | 1165.59 | 1368.08 | 1345.8  | 1416.98 | 1384.9  |
| POST Total Success     | 120000  | 120000  | 120000  | 120000  | 120000  |
| GET Throughput         | 37.66   | 51      | 33.05   | 52.43   | 45.59   |
| GET Total Success      | 3877    | 4473    | 2947    | 4440    | 3950    |

Comparison Between Jobs with Different Numbers of POST Requests

| Metric                  | 120K POST + 4.3K GET | 80K POST + 4K GET | 40K POST + 3.6K GET |
|-------------------------|----------------------|-------------------|---------------------|
| Wall Time (s)           | 100.16               | 60.67             | 34.9                |
| POST Throughput (req/s) | 1198.08              | 1318.52           | 1146.2              |
| GET Throughput (req/s)  | 43.29                | 65.73             | 103.1               |
| POST Total              | 120000               | 80000             | 40000               |
| GET Total               | 4336                 | 3988              | 3598                |
| POST Max Latency (ms)   | 4121                 | 688               | 438                 |
| POST Mean Latency (ms)  | 88.7                 | 62.18             | 44.17               |
| POST p99 Latency (ms)   | 484                  | 277               | 175                 |
| GET Mean Latency (ms)   | 67.31                | 43.7              | 27.13               |
| GET p99 Latency (ms)    | 328                  | 189               | 93                  |
| GET Max Latency (ms)    | 801                  | 416               | 198                 |

Problems to be solved

  • Why doesn't increasing the core size cause more messages to accumulate in RabbitMQ, even though the consumer connection size was not increased in the second part of the test?
  • Why are there no failed requests as the core size increases while the RDS connection size stays capped at a max of 60?
