AWS CSA Notes '22: Part 7 of 8 - Analytics

** WORK IN PROGRESS **  

7 Analytics

7.1 AWS Glue

AWS Glue is a serverless, fully managed ETL service that enables cloud users to discover, prepare, and combine data for machine learning, analytics, and application development.

  • Supported source data stores include:
    • Aurora, DynamoDB, RDS, S3, Redshift
  • Supported target data stores include:
    • Aurora, DynamoDB, RDS, S3, Redshift, Elasticsearch
  • AWS Glue features:
    • Automatic schema discovery, job scheduler, developer endpoint, automatic code generation, and integrated data catalog
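
To make the integrated data catalog more concrete, here is a minimal Python sketch that lists the tables and column types Glue has discovered in one catalog database. It assumes a boto3 Glue client is passed in (for example `boto3.client("glue")`); the function name `list_catalog_tables` and the database name used in the comment are hypothetical, and error handling is left out.

```python
from typing import Any, Dict, List


def list_catalog_tables(glue_client: Any, database: str) -> Dict[str, List[str]]:
    """Return {table_name: ["column:type", ...]} for one Data Catalog database.

    glue_client is expected to behave like boto3.client("glue").
    """
    tables: Dict[str, List[str]] = {}
    kwargs: Dict[str, Any] = {"DatabaseName": database}
    while True:
        # get_tables returns one page of tables plus an optional NextToken.
        page = glue_client.get_tables(**kwargs)
        for table in page.get("TableList", []):
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            tables[table["Name"]] = [
                f'{col["Name"]}:{col.get("Type", "?")}' for col in columns
            ]
        token = page.get("NextToken")
        if not token:
            return tables
        kwargs["NextToken"] = token


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   print(list_catalog_tables(boto3.client("glue"), "sales"))
```

Passing the client in as a parameter keeps the function easy to try out with a stub before pointing it at a real account.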

7.2 Amazon Kinesis

  • Amazon Kinesis is a durable and scalable service that enables you to collect, process, and analyze real-time streaming data from different sources.
  • Amazon Kinesis comprises four services:
    • Kinesis Video Streams
    • Kinesis Data Streams
    • Kinesis Data Firehose
    • Kinesis Data Analytics
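
As an illustration of the "collect" step with Kinesis Data Streams, the sketch below writes a single JSON event to a stream. It assumes a boto3 Kinesis client is passed in (for example `boto3.client("kinesis")`); the function name `send_event`, the stream name, and the `device_id` field used as the partition key are hypothetical choices for this example.

```python
import json
from typing import Any, Dict


def send_event(kinesis_client: Any, stream_name: str, event: Dict[str, Any]) -> str:
    """Write one JSON event to a Kinesis data stream; return its sequence number.

    kinesis_client is expected to behave like boto3.client("kinesis").
    """
    response = kinesis_client.put_record(
        StreamName=stream_name,
        # Kinesis stores the payload as opaque bytes.
        Data=json.dumps(event).encode("utf-8"),
        # Records with the same partition key go to the same shard.
        # "device_id" is an assumed field of the event, not an AWS name.
        PartitionKey=str(event["device_id"]),
    )
    return response["SequenceNumber"]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   send_event(boto3.client("kinesis"), "sensor-stream",
#              {"device_id": 7, "temperature": 21.5})
```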

7.3 Amazon Athena

  • Athena is an interactive, serverless query service that enables cloud users to analyze structured, semi-structured, and unstructured data in S3 using standard SQL.
  • Athena also integrates with AWS Glue Data Catalog.
  • Athena can be accessed in three ways:
    • the AWS Console, the AWS CLI, or a JDBC driver
  • Compressing data and converting it to a columnar format such as Apache ORC or Apache Parquet saves costs and improves query performance.
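
One way to do the columnar conversion is a CREATE TABLE AS SELECT (CTAS) statement run in Athena itself. The sketch below builds such a statement and submits it through a boto3 Athena client that is passed in (for example `boto3.client("athena")`). The helper names, table names, and S3 locations are hypothetical, and the identifier check is a deliberately strict assumption for this example rather than Athena's full naming rules.

```python
import re
from typing import Any

# Conservative pattern for table names in this sketch (an assumption).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_parquet_ctas(source_table: str, target_table: str, s3_location: str) -> str:
    """Build a CTAS statement that copies source_table into Parquet files."""
    for name in (source_table, target_table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported table name: {name!r}")
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (format = 'PARQUET', external_location = '{s3_location}')\n"
        f"AS SELECT * FROM {source_table}"
    )


def start_query(athena_client: Any, sql: str, database: str, results_location: str) -> str:
    """Submit a query and return its execution id (the query runs asynchronously)."""
    response = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": results_location},
    )
    return response["QueryExecutionId"]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   sql = build_parquet_ctas("logs_csv", "logs_parquet",
#                            "s3://example-bucket/logs_parquet/")
#   start_query(boto3.client("athena"), sql, "weblogs",
#               "s3://example-bucket/athena-results/")
```

Because Athena bills by the amount of data scanned, later queries against the Parquet table that read only a few columns should scan less data than the same queries against the original row-oriented files.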

7.4 Amazon Elasticsearch

  • Elasticsearch is an open-source, highly scalable, full-text search and analytics engine that lets you store, search, correlate, analyze, and visualize large volumes of application and infrastructure data in near real time.
  • It integrates with Logstash.
  • It supports Kibana, the Elasticsearch APIs, SQL querying, and built-in alerts.
  • A single Elasticsearch cluster can store up to 3 petabytes of data.
  • It can collect metrics from your VMs, routers, servers, and switches.
  • It is mostly used for monitoring and alerting.
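
For the monitoring use case, a full-text search over recent log entries might look like the sketch below, which builds a request body in the Elasticsearch Query DSL. The field names `message` and `@timestamp` follow common Logstash conventions but are assumptions about the index mapping, and the function name `build_log_search` is hypothetical. The resulting body would be sent as JSON to the `_search` endpoint of an index.

```python
import json
from typing import Any, Dict


def build_log_search(text: str, since: str = "now-15m", size: int = 20) -> Dict[str, Any]:
    """Build a search body: full-text match on log messages within a time window."""
    return {
        "size": size,
        # Newest entries first.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Scored full-text match on the (assumed) "message" field.
                "must": [{"match": {"message": text}}],
                # Unscored time filter; "now-15m" is Elasticsearch date math.
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
    }


body = build_log_search("connection timeout")
print(json.dumps(body, indent=2))
```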

7.5 Amazon QuickSight

  • ... is an embeddable, serverless, scalable, machine learning-powered business intelligence (BI) service for creating and publishing interactive, device-agnostic BI dashboards.
  • ... is mainly used for embedded analytics, machine learning insights, building end-to-end BI solutions, and predictive dashboards.


Glossary:
  • AES-256 = 256-bit Advanced Encryption Standard
  • AZ = Availability Zone
  • KMS = Key Management Service {AWS Key Management Service}
  • MFA = Multi-Factor Authentication
