Introduction
A customer asked me for the maximum IOPS a StorageGRID system can deliver, and for advice on how heavily utilized their current system is. Essentially, a question of headroom.
From Google AI: IOPS (Input/Output Operations Per Second) measures the read and write performance of a physical storage device, like an SSD or HDD, while S3 operations relate to the performance of Amazon S3, a cloud storage service.
The above essentially says to use S3 operations instead of IOPS when talking about the performance of an S3 object storage system. Of course, if you have physical hardware with spinning disks, you can always come up with a rough IOPS figure from typical per-drive values like the ones below, but relating that to S3 operations is not directly possible (at least, I don't think so; please correct me if I'm wrong).
- 5,400 RPM HDDs: 50-60 IOPS
- 7,200 RPM SATA HDDs: 75-100 IOPS
- 10,000 RPM SATA HDDs: 125-150 IOPS
- 10,000 RPM SAS HDDs: 140 IOPS
- 15,000 RPM SAS HDDs: 175-210 IOPS
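To show what a "rough IOPS figure" might look like, here is a minimal Python sketch that simply multiplies the per-drive ranges above by a drive count. The dictionary keys and the function name are my own labels, and the calculation deliberately ignores RAID or erasure-coding overhead, caching, and controller limits, so treat the result as a raw upper-level estimate rather than anything a StorageGRID node would actually deliver.

```python
# Per-drive IOPS ranges (low, high) copied from the list above.
# The keys are hypothetical labels chosen for this sketch.
HDD_IOPS = {
    "5400_hdd": (50, 60),
    "7200_sata": (75, 100),
    "10000_sata": (125, 150),
    "10000_sas": (140, 140),
    "15000_sas": (175, 210),
}


def raw_iops_range(drive_type: str, drive_count: int) -> tuple[int, int]:
    """Return the (low, high) raw IOPS for a number of identical drives.

    Assumes IOPS simply add up across drives, which ignores RAID
    penalties, caching, and controller bottlenecks.
    """
    if drive_count < 0:
        raise ValueError("drive_count must not be negative")
    low, high = HDD_IOPS[drive_type]
    return low * drive_count, high * drive_count


if __name__ == "__main__":
    # Example: twelve 7,200 RPM SATA drives.
    print(raw_iops_range("7200_sata", 12))  # (900, 1200)
```

Even with a number like this in hand, there is no fixed conversion to S3 operations, because a single S3 request can translate into a varying number of disk and metadata operations.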
With S3 storage, we also need to consider throughput.
- Quote:
- "Normally, your grid is sized to achieve a required throughput, defined in terms of S3 operations per second or bytes per second. For example, you might have a requirement that your grid handle 1,000 S3 operations per second, or 2,000 MB per second, of object ingests and retrievals.
- "For example, if your grid was sized to achieve a throughput of 2,000 MB/second, and your average object size is 2 MB, then your grid was sized to be able to handle 1,000 S3 operations per second (2,000 MB / 2 MB)."
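The arithmetic in the quote is easy to reproduce. The following Python sketch converts between throughput and S3 operations per second for a given average object size; the function names are my own and not part of any NetApp tool.

```python
def ops_per_second(throughput_mb_per_s: float, avg_object_size_mb: float) -> float:
    """S3 operations per second implied by a throughput and an average object size."""
    if avg_object_size_mb <= 0:
        raise ValueError("average object size must be positive")
    return throughput_mb_per_s / avg_object_size_mb


def throughput_mb_per_s(ops_per_s: float, avg_object_size_mb: float) -> float:
    """Throughput in MB/s implied by an operation rate and an average object size."""
    if avg_object_size_mb <= 0:
        raise ValueError("average object size must be positive")
    return ops_per_s * avg_object_size_mb


if __name__ == "__main__":
    # The example from the quote: 2,000 MB/s with 2 MB objects.
    print(ops_per_second(2000, 2))       # 1000.0
    # And the reverse direction: 1,000 operations/s with 2 MB objects.
    print(throughput_mb_per_s(1000, 2))  # 2000
```

The same grid therefore has a very different operations-per-second ceiling depending on the average object size, which is why both figures matter.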
I also want to present some data using NetApp DII (Data Infrastructure Insights) dashboards.
Fusion Tool
In NetApp's Fusion Tool (as of 27.05.2025), for Workloads we have:
- Basic Workload Input
- Usable Capacity (TiB)
- Average Object Size (32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 2MB, 4MB, 8MB, 16MB, 32MB, 64MB, 128MB, 256MB, 512MB)
- Throughput: Req/s or MB/s
- Workload Profile (Read/Write/Delete)
- Mixed Workload 1 (50/25/25)
- Mixed Workload 2 (25/50/25)
- 100% Reads (100/0/0)
- 100% Writes (0/100/0)
- We need to consider the performance of both appliances:
- Storage Appliance
- Services Appliance (gateway/load balancer)
- There are a lot of factors that can influence performance:
- Number of storage nodes at a site
- Storage node deployment type
- Type of StorageGRID appliance
- Client application infrastructure
- Workload (read, write, delete, concurrency*, object sizes)
- Storage node object data and metadata capacity use
- Information lifecycle management (ILM) configuration
- Number of sites and latency between sites
- Network infrastructure and configuration (i.e. LACP, 4 x 10GbE, 4 x 25GbE ...)
- Platform services
- Cross-grid replication
- Stored object encryption
- Measurement of performance:
- S3 requests per second (requests/sec)
- "For smaller objects, the main factor that influences performance is the transactional rate at which objects are processed by the system. Depending on the context, the object transactional rate can vary based on the operation that is performed, such as ingest rate, retrieval rate, or delete rate".
- Throughput (megabytes per second or MBps)
- "For larger objects, the payload size is the main factor that affects the performance of the system. Therefore, the throughput or the bandwidth of the system is the best indicator of performance."
- Test results for various StorageGRID storage node appliances are given as -
- 100% PUT per-node performance (using 2-copy replicated ILM)
- 100% GET per-node performance (using 2-copy replicated ILM)
- 100% PUT per-node performance (using EC 2+1 ILM)
- 100% GET per-node performance (using EC 2+1 ILM)
- with all object sizes (KB): 32, 64 ... 524288
- threads (1024 for 32KB, 64KB, 128KB and either 8 or 4 for the rest)
- requests per second
- throughput (MBps)
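Coming back to the headroom question: one rough way to use per-node test results like these is to scale them by the number of storage nodes and compare the result with the peak rate you actually observe. The Python sketch below does that. It assumes performance scales linearly with node count, which the factors listed above (ILM, network, number of sites, and so on) can easily break, and every number in the example is made up for illustration. The function name and the returned keys are my own.

```python
def headroom(observed_peak_rate: float, tested_max_per_node: float, node_count: int) -> dict:
    """Estimate utilization and headroom against a scaled per-node test result.

    Both rates must use the same unit (requests/sec or MBps) and the same
    workload type (for example 100% PUT at a comparable object size).
    Linear scaling across nodes is an assumption of this sketch.
    """
    capacity = tested_max_per_node * node_count
    if capacity <= 0:
        raise ValueError("tested_max_per_node and node_count must be positive")
    utilization = observed_peak_rate / capacity
    return {
        "capacity": capacity,
        "utilization_percent": round(utilization * 100, 1),
        "headroom_percent": round((1 - utilization) * 100, 1),
    }


if __name__ == "__main__":
    # Hypothetical figures: 6 nodes, 500 requests/sec per node in testing,
    # and an observed peak of 1,200 requests/sec.
    print(headroom(observed_peak_rate=1200, tested_max_per_node=500, node_count=6))
```

With these made-up inputs the grid would be at 40% of its estimated ceiling, leaving 60% headroom. For small objects it makes sense to run this on requests per second, and for large objects on MBps.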
- netapp_storagegrid.cluster
- agent_version
- cluster_id
- cluster_ip
- cluster_name
- cluster_oid
- usable_percent
- utilization_percent
- netapp_storagegrid.node
- agent_version
- cluster_id
- cluster_ip
- cluster_name
- cluster_oid
- code
- content_buckets_and_containers
- content_objects
- content_objects_lost
- content_objects_lost_rate
- cpu_seconds_total
- http_sessions_incoming_attempted_rate
- http_sessions_incoming_currently_established_rate
- http_sessions_incoming_failed_rate
- http_sessions_incoming_successful_rate
- identity_service_failed_authorize_by_uuid_requests_rate
- identity_service_failed_authorize_requests_rate
- identity_service_failed_change_password_requests_rate
- identity_service_failed_get_group_requests_rate
- identity_service_failed_list_groups_requests_rate
- identity_service_failed_new_connection_dials_rate
- identity_service_failed_new_connections_rate
- identity_service_failed_schedule_synchronization_requests_rate
- identity_service_failed_search_group_rate
- identity_service_failed_search_groups_rate
- identity_service_failed_search_user_rate
- identity_service_failed_search_users_rate
- identity_service_failed_synchronization_scans_rate
- identity_service_failed_tenant_synchronizations_rate
- identity_service_failed_validate_requests_rate
- identity_service_schedule_synchronization_time_ms_bucket_rate
- identity_service_schedule_synchronization_time_ms_count_rate
- identity_service_schedule_synchronization_time_ms_sum_rate
- identity_service_total_schedule_synchronization_requests_rate
- ilm_awaiting_background_objects
- ilm_awaiting_client_evaluation_objects_per_second (sec)
- ilm_awaiting_client_objects
- ilm_awaiting_total_objects
- ilm_scan_objects_per_second (sec)
- ilm_scan_period_estimated_minutes
- job
- metadata_queries_average_latency_milliseconds (ms)
- network_received_bytes_rate (B/s)
- network_transmitted_bytes_rate (B/s)
- node_name
- node_uuid
- operation
- platform_services_failed_raft_requests
- platform_services_failed_raft_requests_rate
- platform_services_failed_replications
- platform_services_failed_replications_rate
- platform_services_failed_s3_notifications
- platform_services_failed_s3_notifications_rate
- platform_services_permanently_failed_requests
- platform_services_permanently_failed_requests_rate
- platform_services_total_raft_requests
- platform_services_total_raft_requests_rate
- platform_services_total_s3_notifications
- platform_services_total_s3_notifications_rate
- s3_data_transfers_bytes_ingested_rate
- s3_data_transfers_bytes_retrieved_rate
- s3_operations_failed
- s3_operations_successful
- s3_operations_failed_rate
- s3_operations_successful_rate
- s3_operations_unauthorized
- s3_operations_unauthorized_rate
- s3_requests_cancelled_total
- s3_requests_cancelled_total_rate
- s3_requests_total
- s3_requests_total_rate
- servercertificate_management_interface_cert_expiry_days
- servercertificate_storage_api_endpoints_cert_expiry_days
- service_cpu_seconds (sec)
- service_cpu_seconds_rate
- service_load
- service_memory_usage_bytes (B)
- service_memory_usage_bytes_rate (B/s)
- service_network_received_bytes (B)
- service_network_received_bytes_rate (B/s)
- service_network_transmitted_bytes (B)
- service_network_transmitted_bytes_rate (B/s)
- service_restarts
- service_restarts_rate
- service_runtime_seconds (sec)
- service_uptime_seconds (sec)
- site_id
- site_name
- storage_state_current_maintenance
- storage_state_current_offline
- storage_state_current_online
- storage_state_current_read_only
- storage_status_no_error
- storage_status_no_free_space
- storage_status_transition
- storage_status_unknownerr
- storage_status_vols_unavail
- storage_utilization_data_bytes (B)
- storage_utilization_data_bytes_rate (B/s)
- storage_utilization_metadata_allowed_bytes (B)
- storage_utilization_metadata_bytes (B)
- storage_utilization_metadata_bytes_rate (B/s)
- storage_utilization_total_space_bytes (B)
- storage_utilization_total_space_bytes_rate (B/s)
- storage_utilization_usable_space_bytes (B)
- storage_utilization_usable_space_bytes_rate (B/s)
- storage_utilization_used_space
- swift_data_transfers_bytes_ingested
- swift_data_transfers_bytes_ingested_rate
- swift_data_transfers_bytes_retrieved
- swift_data_transfers_bytes_retrieved_rate
- swift_operations_failed
- swift_operations_failed_rate
- swift_operations_successful
- swift_operations_successful_rate
- swift_operations_unauthorized
- swift_operations_unauthorized_rate
- usable_percent (%)
- utilization_percent (%)
- netapp_storagegrid.tenant
- agent_version
- cluster_id
- cluster_ip
- cluster_name
- cluster_oid
- storagepool_key
- tenant_id
- tenant_name
- usage_data_bytes (B)
- usage_data_bytes_rate (B/s)
- usage_object_count
- usage_object_count_rate
- usage_quota_bytes (B)
- usage_quota_bytes_rate (B/s)
- netapp_storagegrid.bucket
- agent_version
- bucket_id
- bucket_name
- cluster_id
- cluster_ip
- cluster_name
- cluster_oid
- code_type
- internal_volume_key
- method_type
- policy_id
- storagepool_key
- tenant_id
- tenant_name
- usage_data_bytes (B)
- usage_data_bytes_rate (B/s)
- usage_object_count
- usage_object_count_rate
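The bucket and tenant metrics also give a way to estimate the average object size, which is the input the Fusion Tool asks for. The Python sketch below divides usage_data_bytes by usage_object_count and then picks the closest Fusion object size on a log scale. The function names are my own, the sample values are invented, and the average size of stored objects is only a proxy for the average size of objects in the live request workload.

```python
import math

# Fusion average object sizes in KB (32KB ... 512MB), using 1 MB = 1024 KB.
FUSION_SIZES_KB = [32 * 2**i for i in range(15)]


def average_object_size_kb(usage_data_bytes: float, usage_object_count: int) -> float:
    """Average stored object size in KB from the two usage metrics."""
    if usage_object_count <= 0:
        raise ValueError("usage_object_count must be positive")
    return usage_data_bytes / usage_object_count / 1024


def nearest_fusion_size_kb(size_kb: float) -> int:
    """Closest Fusion object size, compared on a log2 scale since sizes double."""
    if size_kb <= 0:
        raise ValueError("size_kb must be positive")
    return min(FUSION_SIZES_KB, key=lambda s: abs(math.log2(s) - math.log2(size_kb)))


if __name__ == "__main__":
    # Invented sample values for one bucket.
    avg_kb = average_object_size_kb(usage_data_bytes=4_096_000_000_000,
                                    usage_object_count=2_000_000)
    print(avg_kb)                          # 2000.0
    print(nearest_fusion_size_kb(avg_kb))  # 2048, i.e. the 2MB option
```

Comparing on a log scale is a design choice: because the Fusion sizes double at each step, it treats "halfway between 1MB and 2MB" proportionally rather than in absolute kilobytes.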