Table 1 NLPaaS pilot usage metrics from 01 May 2019 through 30 September 2019—cluster statistics and resulting workstation estimates are determined based on a calculated average of 256 executor threads (16 executor nodes × 16 cores).

From: Desiderata for delivering NLP to accelerate healthcare AI advancement and a Mayo Clinic NLP-as-a-service implementation

Metric Name

Value

Number of projects

61.0

Number of jobs

246.0

Number of pilot users

13.0

Number of unique concepts (across all projects)

269.0

Average number of unique concepts per project

5.0

Average number of documents per job

6,624,651.1

Average number of jobs ran per project

4.0

Average job runtime (cluster)

1.0 h

Average project runtime (cluster; avg job runtime × avg number of jobs per project)

3.9 h

Average document throughput (cluster)

6,896,784.1 documents per hour

Total job runtime (cluster)

236.3 h (9.8 days)

Estimated equivalent average job runtime (quad-core workstation)

61.5 h (2.6 days)

Estimated equivalent average project runtime (quad-core workstation)

247.9 h (10.3 days)

Estimated equivalent total job runtime (quad-core workstation)

15,122.8 h (630.1 days)