DistributedAlbumStorage

This is a simple Spring Boot album storage project with RabbitMQ and MySQL.

Ideas and Construction

  • I have implemented a new Review Servlet and a corresponding GET review mechanism to acquire the like and dislike counters. The client is extended to launch 3 additional threads (dedicated to GET /review operations) after the completion of the first group of POST threads, ensuring that a sufficient number of album IDs have been generated.

  • Spring Boot Setup
    The server is set up using Spring Boot. The RESTful API endpoints are implemented via Servlets and/or Spring MVC controllers.

    • POST and GET endpoints for the album profile and album image (an example image) as the base information
      • /albums
        • Profile: Artist, Title, Year
        • Image: BLOB format
      • /albums/{albumID}
    • Review receives requests in the format /review/{likeOrNot}/{albumID} (with message content in JSON), converts the review selection to a JSON payload, and publishes this message to RabbitMQ.
  • RabbitMQ Integration
    The server and RabbitMQ are deployed on two separate AWS EC2 instances. This separation significantly improved the RabbitMQ client connection, processing rate, and overall throughput. On the server side, a RabbitMQ channel pool is configured to efficiently manage channels across multiple threads. When a review message is processed by the consumer, it sends an acknowledgment (ACK) back to RabbitMQ, confirming that the message has been handled.
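The channel-pool-plus-publish flow above can be sketched in plain Java. This is a minimal illustration, not the project's actual code: the `Channel` interface below is a simplified stand-in for `com.rabbitmq.client.Channel` (the real `basicPublish` also takes an `AMQP.BasicProperties` argument), and the class and queue names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Simplified stand-in for com.rabbitmq.client.Channel so the sketch is
// self-contained; the real method also takes AMQP.BasicProperties.
interface Channel {
    void basicPublish(String exchange, String routingKey, byte[] body) throws Exception;
}

// Fixed-size channel pool: threads borrow a channel, publish, and return it,
// instead of opening a new channel per request.
class ChannelPool {
    private final BlockingQueue<Channel> pool;

    ChannelPool(List<Channel> channels) {
        this.pool = new ArrayBlockingQueue<>(channels.size(), false, channels);
    }

    Channel borrow() throws InterruptedException { return pool.take(); }
    void release(Channel ch) { pool.offer(ch); }
    int available() { return pool.size(); }
}

// Converts a review selection into a JSON payload and publishes it to RabbitMQ.
class ReviewPublisher {
    private final ChannelPool pool;
    ReviewPublisher(ChannelPool pool) { this.pool = pool; }

    void publish(String likeOrNot, String albumId) throws Exception {
        String json = String.format(
            "{\"albumID\":\"%s\",\"reviewType\":\"%s\"}", albumId, likeOrNot);
        Channel ch = pool.borrow();
        try {
            ch.basicPublish("", "review_queue", json.getBytes(StandardCharsets.UTF_8));
        } finally {
            pool.release(ch); // always return the channel, even on failure
        }
    }
}
```

Borrowing from a `BlockingQueue` naturally throttles publishers to the pool size, which is why the channel pool helps under many concurrent threads.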

  • Database Configuration
    The backend uses an AWS RDS MySQL database (a db.t4g.micro instance with 1 GB RAM), while the application server runs on a free-tier t2.micro instance. The database schema has been redesigned to split the Profile, Image Blob, and Review data into three separate tables, with albumID as the primary key connecting them. This normalization helps maintain data integrity and makes queries and updates more efficient.
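The three-table split could look like the DDL below. This is a hypothetical sketch inferred from the description; the actual table and column names in the project may differ.

```sql
-- Hypothetical schema: one row of profile data per album, with image and
-- review counters split into their own tables keyed by the same albumID.
CREATE TABLE album_profile (
  album_id     BIGINT AUTO_INCREMENT PRIMARY KEY,
  artist       VARCHAR(255) NOT NULL,
  title        VARCHAR(255) NOT NULL,
  release_year INT
);

CREATE TABLE album_image (
  album_id BIGINT PRIMARY KEY,
  image    MEDIUMBLOB,
  FOREIGN KEY (album_id) REFERENCES album_profile(album_id)
);

CREATE TABLE album_review (
  album_id BIGINT PRIMARY KEY,
  likes    INT NOT NULL DEFAULT 0,
  dislikes INT NOT NULL DEFAULT 0,
  FOREIGN KEY (album_id) REFERENCES album_profile(album_id)
);
```

Keeping the BLOB out of the profile table means profile reads and review-counter updates never touch image data.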

  • RabbitMQ Consumers:
    The RabbitMQ Consumers are responsible for consuming messages from the queue and performing the corresponding database updates (i.e., incrementing like or dislike counters). Multiple consumer threads are spawned via a fixed thread pool, with each thread using its own dedicated channel to interact with RabbitMQ. This setup ensures that messages are processed concurrently and efficiently. Once a message is successfully processed, the consumer sends an ACK back to the RabbitMQ server.
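The consumer side can be sketched as follows. This is a simplified, self-contained illustration: a `BlockingQueue<String>` stands in for the RabbitMQ subscription, in-memory counters stand in for the database update, and the `acked` counter stands in for `channel.basicAck(deliveryTag, false)`; class and message-format names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Worker-pool consumer sketch: each worker blocks on the queue, applies the
// "DB update" (here an in-memory counter), then acknowledges the message.
class ReviewConsumer {
    final Map<String, AtomicInteger> likes = new ConcurrentHashMap<>();
    final Map<String, AtomicInteger> dislikes = new ConcurrentHashMap<>();
    final AtomicInteger acked = new AtomicInteger();

    // Messages use a toy format: "like:albumID" or "dislike:albumID".
    void consume(BlockingQueue<String> queue, int workers, int expected)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch done = new CountDownLatch(expected);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String msg = queue.take();            // blocking receive
                        String[] parts = msg.split(":", 2);
                        Map<String, AtomicInteger> target =
                            parts[0].equals("like") ? likes : dislikes;
                        target.computeIfAbsent(parts[1], k -> new AtomicInteger())
                              .incrementAndGet();             // "DB update" step
                        acked.incrementAndGet();              // ACK after success
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();       // pool shutdown
                }
            });
        }
        done.await();
        pool.shutdownNow();
    }
}
```

Acknowledging only after the update succeeds is what lets RabbitMQ redeliver a message if a consumer dies mid-processing.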

  • Client Construction:

    • The client remains the original Java client, which is responsible for initiating POST requests (for album creation and posting of reviews).
    • In addition, an extra runnable thread (or three threads) is activated to continuously perform GET /review requests. These GET threads operate by selecting a random albumID from a shared collection populated by successful POST operations. This design guarantees that the GET threads always work with valid album IDs.
    • The client records metrics (including latency, throughput, and success/failure counts) for both POST and GET operations and outputs detailed statistics upon completion of the test.
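The shared-collection trick the GET threads rely on can be sketched in a few lines. This is an illustrative snippet (the class name `AlbumIdPool` is hypothetical), assuming a thread-safe list written by POST threads and read by GET threads:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadLocalRandom;

// Shared collection of album IDs produced by successful POSTs; GET threads
// sample from it so they only ever request albums that actually exist.
class AlbumIdPool {
    private final List<String> ids = new CopyOnWriteArrayList<>();

    // Called by a POST thread after a successful album creation.
    void recordPosted(String albumId) { ids.add(albumId); }

    // Returns a random known albumID, or null if nothing has been posted yet.
    String randomId() {
        if (ids.isEmpty()) return null;
        return ids.get(ThreadLocalRandom.current().nextInt(ids.size()));
    }
}
```

A `CopyOnWriteArrayList` suits this access pattern (many reads, append-only writes); GET threads simply skip their iteration when `randomId()` returns null.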

Experiment:

  • Phase 1: Warm-up
    • Launch 10 threads, each issuing 100 POST requests.
  • Phase 2: Stress Test
    • 10 threads per thread group, tested with different loads: thread pools of 10, 20, and 30.
    • 100 POST requests per thread.
    • Each posted album, along with its profile information, receives 2 like reviews and 1 dislike review, posted asynchronously.
    • Finally, after the first thread group finishes, 3 GET threads are started to fetch reviews for random existing albums, running until all thread groups finish, so as to perform as many GETs as possible.
  • Phase 3: Try different settings
    • Gradually increase the client’s thread pool for a higher concurrent rate of messages being written to MySQL and pushed to RabbitMQ for the consumers to process.
    • Gradually increase the consumer’s thread count to find the maximum achievable throughput.
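The warm-up gate in Phase 1 (all warm-up threads must finish before the stress phase starts) can be sketched with a `CountDownLatch`. This is an illustrative driver, not the project's client; `sendPost` is a placeholder for the real HTTP call:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Warm-up driver sketch: N threads each issue R requests, and the caller
// blocks until every thread finishes before moving on to the stress phase.
class WarmupPhase {
    static int run(int threads, int requestsPerThread, Runnable sendPost)
            throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                for (int r = 0; r < requestsPerThread; r++) {
                    sendPost.run();               // placeholder for the HTTP POST
                    completed.incrementAndGet();
                }
                done.countDown();
            }).start();
        }
        done.await();   // Phase 2 starts only after the warm-up fully completes
        return completed.get();
    }
}
```

With the Phase 1 parameters (10 threads × 100 requests), `run` returns 1000 completed requests before the stress test begins.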

Test Analysis:

  • Comparison across 30 groups with different numbers of consumer core threads and different client thread pool sizes.
    • The highest POST throughput was achieved at:
      • corePoolSize=160, maxPoolSize=190, with consumer connections 30/60, reaching 1381.82 req/s.
      • core=200 with conn=60/80 also reached 1368.08 req/s.
    • POST throughput peaks around corePoolSize=230 and maxPoolSize=260 with consumer connections at 50/70 or 60/80, but it begins to decline slightly when pushed to core=230, possibly due to more DB connections and data being loaded, or thread contention.
    • GET throughput shows a decreasing trend as POST load increases, since the DB connections are a shared resource.
    • Latency increases with larger pool sizes; for POST p99 in particular, queuing delays or DB bottlenecks grow under high concurrency.
    • There have been no failed requests yet, which is good but a bit concerning, since I experienced many failed attempts in my last homework.
  • Comparison with the same thread pool and consumer connections but different numbers of jobs.
    • The highest throughput is in the 20-group test, which suggests a bottleneck is hit at higher concurrent request rates.
  • Since RabbitMQ was moved to another instance, it is no longer the largest bottleneck; the RDS connection limit still is.

Test Data:

  • Comparison across 30 groups with different numbers of consumer core threads and different client thread pool sizes.

Test Process One: increasing core pool size and RabbitMQ consumer size

| Core/Max Pool Size       | 70-100 | 100-130 | 130-160 | 160-190 | 160-190* | 200-230 | 200-230* | 230-260 | 230-260 |
|--------------------------|--------|---------|---------|---------|----------|---------|----------|---------|---------|
| Consumer Connection Size | 30-50  | 30-50   | 30-50   | 30-50   | 30-60    | 30-60   | 50-70    | 50-70   | 60-80   |
| POST Throughput (req/s)  | 1255.2 | 1156.6  | 1200.7  | 1182.7  | 1381.8   | 1198.0  | 1338.1   | 1165.5  | 1368.1  |
| POST Total               | 12000  | 12000   | 12000   | 12000   | 12000    | 12000   | 12000    | 12000   | 12000   |
| GET Throughput (req/s)   | 79.19  | 53.93   | 48.72   | 47.46   | 55.91    | 43.29   | 49.01    | 37.66   | 51      |
| GET Success              | 7571   | 5595    | 4869    | 4815    | 4855     | 4336    | 4395     | 3877    | 4473    |

Notes: the RabbitMQ queue reached 6k in one run and 2K in another.

Data That Brought More Confusion

  • I tried increasing the core size further to see whether the queue would accumulate again, but the accumulation didn't happen.
| Core/Max Pool Size     | 230-260 | 230-260 | 330-380 | 350-400 | 350-500 |
|------------------------|---------|---------|---------|---------|---------|
| Client Connection Size | 50-70   | 60-80   | 60-80   | 60-80   | 60-80   |
| Wall Time (s)          | 102.95  | 87.71   | 89.17   | 84.69   | 86.65   |
| POST Throughput        | 1165.59 | 1368.08 | 1345.8  | 1416.98 | 1384.9  |
| POST Total Success     | 120000  | 120000  | 120000  | 120000  | 120000  |
| GET Throughput         | 37.66   | 51      | 33.05   | 52.43   | 45.59   |
| GET Total Success      | 3877    | 4473    | 2947    | 4440    | 3950    |

Comparison Between Jobs with Different Numbers of POST Requests

| Metric                  | 120K POST + 4.3K GET | 80K POST + 4K GET | 40K POST + 3.6K GET |
|-------------------------|----------------------|-------------------|---------------------|
| Wall Time (s)           | 100.16               | 60.67             | 34.9                |
| POST Throughput (req/s) | 1198.08              | 1318.52           | 1146.2              |
| GET Throughput (req/s)  | 43.29                | 65.73             | 103.1               |
| POST Total              | 120000               | 80000             | 40000               |
| GET Total               | 4336                 | 3988              | 3598                |
| POST Max Latency (ms)   | 4121                 | 688               | 438                 |
| POST Mean Latency (ms)  | 88.7                 | 62.18             | 44.17               |
| POST p99 Latency (ms)   | 484                  | 277               | 175                 |
| GET Mean Latency (ms)   | 67.31                | 43.7              | 27.13               |
| GET p99 Latency (ms)    | 328                  | 189               | 93                  |
| GET Max Latency (ms)    | 801                  | 416               | 198                 |

Problems to be solved

  • Why doesn't increasing the core size cause more messages to accumulate in RabbitMQ, even though the consumer connection size was not increased in the second part of the test?
  • Why are there no failed requests as the core size increases while the RDS connection size stays capped at a max of 60?
