Annotation Lab in Amazon Elastic Kubernetes Service (EKS)

10.08.2022

Pranab Rajbhandari

A new generation of the NLP Lab is now available: the Generative AI Lab. Check details here https://www.johnsnowlabs.com/nlp-lab/

The goal of this study is to run a series of performance tests that check the responsiveness of the Annotation Lab web application under heavy workloads, both in terms of a high number of concurrent users and in terms of database scalability. The most demanding pages are the projects page and the tasks page, because they handle large amounts of data, so this study focuses on loading those two pages.
The tests were conducted with the automated load testing tool JMeter. The database was populated with synthetic user profiles and projects as described below.

Annotation Lab Configuration

• 20 NER projects with taxonomies of 30 labels each were created, and 125 tasks were imported to each project.
• Each task consisted of 600,000 characters of text (around 200 pages).
• 7000 labeled tokens/chunks were added to each task.
• 20 users were assigned to all 20 projects as Annotator and Manager.
• 6 floating licenses were imported.
• 5 preannotation servers, as well as 1 training server, were run in parallel. The deployed model was: ner_jsl with embeddings_clinical.
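
To give a sense of scale, the figures in the list above can be multiplied out. This is a minimal sketch that uses only the numbers stated in the list; the variable names are illustrative, and the page total relies on the approximate "around 200 pages" figure.

```python
# Synthetic workload figures taken from the configuration list above.
PROJECTS = 20
TASKS_PER_PROJECT = 125
CHARS_PER_TASK = 600_000
PAGES_PER_TASK = 200        # approximate, as stated in the list
LABELS_PER_TASK = 7_000

total_tasks = PROJECTS * TASKS_PER_PROJECT          # tasks across all projects
total_chars = total_tasks * CHARS_PER_TASK          # characters of text stored
total_pages = total_tasks * PAGES_PER_TASK          # rough page equivalent
total_annotations = total_tasks * LABELS_PER_TASK   # labeled tokens/chunks

print(f"tasks:       {total_tasks:,}")
print(f"characters:  {total_chars:,}")
print(f"pages:       ~{total_pages:,}")
print(f"annotations: {total_annotations:,}")
```

In other words, the database held 2,500 tasks, 1.5 billion characters of text, and 17.5 million labeled chunks during the tests.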

Cluster Specification:

• Number of EKS nodes: 3
• Instance Type: t2.2xlarge
• Specification: 8-core CPU with 32 GB RAM per node
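
The total capacity available to the application, the preannotation servers, and the training server follows directly from this specification. The short sketch below only multiplies the listed figures; it does not account for resources reserved by Kubernetes itself, which the post does not quantify.

```python
# Cluster figures taken from the specification above.
NODES = 3
CPU_CORES_PER_NODE = 8
RAM_GB_PER_NODE = 32

total_cores = NODES * CPU_CORES_PER_NODE   # upper bound on CPU cores
total_ram_gb = NODES * RAM_GB_PER_NODE     # upper bound on memory in GB

print(f"{total_cores} cores, {total_ram_gb} GB RAM across {NODES} nodes")
```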

The preannotation and training servers were first run with the minimal configuration (1 replica, 1 web server worker, and 1 thread) to check how many preannotation and training servers could be deployed at the same time on the 3-node EKS cluster. As the number of replicas, web server workers, and threads was gradually increased, the system could handle more users, at the cost of more resources.
Hence, to allow teams of over 100 users to work in parallel on the same Annotation Lab instance deployed on the above cluster specification, the system should not be preannotating or training at the same time.

Performance/Stress Test:

A JMeter script was used to call the API that retrieves the project list page and the task list of a project, once per thread. In JMeter, the number of threads corresponds to the number of users using the web application concurrently.
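
The tests themselves were run with JMeter, but the idea of "one thread per user, each loading both pages" can be illustrated with a small Python sketch. Everything below is an assumption made for illustration: the two paths are hypothetical (the post does not give the real API routes), and `fake_fetch` is a stand-in for an HTTP client so that the example runs without a server.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical paths: the real Annotation Lab API routes are not given in this post.
PROJECTS_PAGE = "/api/projects"
TASKS_PAGE = "/api/projects/demo/tasks"


def simulate_user(fetch: Callable[[str], int]) -> bool:
    """One simulated user: load the project list, then one project's task list.

    Returns True only if both requests come back with HTTP status 200.
    """
    try:
        return all(fetch(path) == 200 for path in (PROJECTS_PAGE, TASKS_PAGE))
    except Exception:
        # Timeouts and connection errors count as a failed user session.
        return False


def run_load_test(fetch: Callable[[str], int], users: int) -> float:
    """Run `users` simulated users concurrently and return the error percentage."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(lambda _: simulate_user(fetch), range(users)))
    return 100.0 * results.count(False) / users


def fake_fetch(path: str) -> int:
    """Stand-in for an HTTP client; always reports success."""
    return 200


if __name__ == "__main__":
    print(f"error rate: {run_load_test(fake_fetch, users=10):.1f}%")
```

To point the sketch at a real deployment, `fake_fetch` would be replaced by a function that issues an authenticated GET request and returns the status code. A non-zero error percentage corresponds to the "error %" entries in the results table.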

With the base Annotation Lab configuration (1 replica, 1 web server worker, and 8 web server threads), the application can support 10 concurrent users.
To improve the performance of Annotation Lab and better handle a higher number of users, several parameters were tuned.
First, the number of web server threads was increased from 8 to 10 and then to 12. This change improved the performance of the application to support about 30 users. In the table below you can see that the performance boost from 8 to 10 threads was much higher than the boost from 10 to 12 web server threads.
Second, the number of web server workers was increased from 1 to 2 and then to 3. The application was then able to support about 50 concurrent users (JMeter threads). You can notice that the performance boost was better from 2 workers to 3 workers than from 1 worker to 2 workers.
Finally, the number of replicas was increased from 1 to 2 and then to 3. The application was then able to support about 110 concurrent users. Again, the performance boost was better from 1 replica to 2 replicas than from 2 replicas to 3 replicas.

No.	Replicas	Web Server Workers	Web Server Threads	No. of Users (JMeter Threads)	Result
1	1	1	8	10	pass
2	1	1	8	12	%
3	1	1	10	12	pass
4	1	1	10	15	pass
5	1	1	10	20	pass
6	1	1	10	25	slight error %
7	1	1	12	25	pass
8	1	1	12	30	error %
9	1	2	12	30	pass
10	1	2	12	35	pass
11	1	2	12	40	error %
12	1	3	12	45	pass
13	1	3	12	50	error %
14	2	1	8	22	pass
15	2	1	8	24	slight error %
16	2	1	10	35	pass
17	2	1	10	40	error %
18	2	1	12	43	pass
19	2	1	12	45	slight error %
20	2	2	12	50	pass
21	2	2	12	55	slight error %
22	2	3	12	80	pass
23	2	3	12	90	slight error %
24	3	3	12	100	pass
25	3	3	12	110	error %
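
The table can be condensed into the highest user count that passed for each configuration. The sketch below copies the rows as recorded (including the incomplete result in row 2, which is treated as not passing) and is only one way of summarizing them; the function name is illustrative.

```python
# (replicas, workers, threads, users, result), copied from the table above.
ROWS = [
    (1, 1, 8, 10, "pass"),
    (1, 1, 8, 12, "%"),  # result recorded as-is; treated as not passing
    (1, 1, 10, 12, "pass"),
    (1, 1, 10, 15, "pass"),
    (1, 1, 10, 20, "pass"),
    (1, 1, 10, 25, "slight error %"),
    (1, 1, 12, 25, "pass"),
    (1, 1, 12, 30, "error %"),
    (1, 2, 12, 30, "pass"),
    (1, 2, 12, 35, "pass"),
    (1, 2, 12, 40, "error %"),
    (1, 3, 12, 45, "pass"),
    (1, 3, 12, 50, "error %"),
    (2, 1, 8, 22, "pass"),
    (2, 1, 8, 24, "slight error %"),
    (2, 1, 10, 35, "pass"),
    (2, 1, 10, 40, "error %"),
    (2, 1, 12, 43, "pass"),
    (2, 1, 12, 45, "slight error %"),
    (2, 2, 12, 50, "pass"),
    (2, 2, 12, 55, "slight error %"),
    (2, 3, 12, 80, "pass"),
    (2, 3, 12, 90, "slight error %"),
    (3, 3, 12, 100, "pass"),
    (3, 3, 12, 110, "error %"),
]


def max_passing_users(rows):
    """Map each (replicas, workers, threads) configuration to its highest passing user count."""
    best = {}
    for replicas, workers, threads, users, result in rows:
        if result == "pass":
            key = (replicas, workers, threads)
            best[key] = max(best.get(key, 0), users)
    return best


best = max_passing_users(ROWS)
for (replicas, workers, threads), users in sorted(best.items()):
    print(f"replicas={replicas} workers={workers} threads={threads}: {users} users")
```

Read this way, a single replica with 3 workers and 12 threads passed at 45 users, 2 replicas passed at 80, and 3 replicas passed at 100.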

Conclusion:

The results of this performance test show that the EKS Annotation Lab configuration with 2 replicas, 3 webserver workers, and 12 web server threads is optimal in terms of resource cost to performance ratio.
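
One way to read this conclusion is through the extra users gained per added replica at 3 workers and 12 threads. The sketch below is an interpretation of the results table (rows 12, 22, and 24), not a calculation given by the author.

```python
# Highest passing user count at 3 workers and 12 threads, keyed by replica count
# (rows 12, 22 and 24 of the results table).
MAX_USERS = {1: 45, 2: 80, 3: 100}


def marginal_gains(max_users):
    """Extra users supported by each additional replica."""
    counts = sorted(max_users)
    return {nxt: max_users[nxt] - max_users[prev] for prev, nxt in zip(counts, counts[1:])}


gains = marginal_gains(MAX_USERS)
print(gains)  # {2: 35, 3: 20}
```

The second replica adds 35 users while the third adds only 20, which is consistent with 2 replicas being the point where added resources still pay off well.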

Get & install it HERE.

Full feature set HERE.


Pranab Rajbhandari serves as the Project Manager for the Generative AI Lab (formerly NLP Lab) at John Snow Labs. With over eight years of experience in Software Quality Assurance—specializing in performance evaluation for web-based applications—he brings a strong foundation in delivering reliable, high-performing systems. For the past two years, he has led project management efforts in the fast-evolving landscape of Generative AI, ensuring the development and deployment of scalable, high-quality software solutions.
