Overall explanation
Correct option:
Leverage Amazon Kinesis Data Streams to capture the data from the website and feed it into Amazon Kinesis Data Analytics which can query the data in real time. Lastly, the analyzed feed is output into Kinesis Data Firehose to persist the data on Amazon S3
You can use Kinesis Data Streams to build custom applications that process or analyze streaming data for specialized needs. Kinesis Data Streams manages the infrastructure, storage, networking, and configuration needed to stream your data at the level of your data throughput. You don't have to worry about provisioning, deployment, or ongoing maintenance of hardware, software, or other services for your data streams.
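As a rough illustration of the capture step, the sketch below shows how a website event might be shaped into a Kinesis Data Streams record. The stream name `clickstream-demo`, the `session_id` field, and both helper functions are assumptions made for this example and do not come from the question. `put_record` is the boto3 call for writing a single record; running `send_event` would additionally require boto3, AWS credentials, and an existing stream.

```python
import json


def build_record(event: dict, key_field: str = "session_id") -> dict:
    """Turn one website event into keyword arguments for a PutRecord call."""
    return {
        # Kinesis carries an opaque blob, so the event is serialized to JSON bytes.
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": str(event[key_field]),
    }


def send_event(event: dict, stream_name: str = "clickstream-demo") -> None:
    """Send one event to the stream (hypothetical stream name, see above)."""
    import boto3  # imported lazily so build_record works without the SDK

    client = boto3.client("kinesis")
    client.put_record(StreamName=stream_name, **build_record(event))


record = build_record({"session_id": "abc123", "page": "/checkout"})
print(record["PartitionKey"])  # abc123
```

Using the session identifier as the partition key keeps each visitor's events in order on one shard; a different key may suit other access patterns better.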
For the given use case, you can use Kinesis Data Analytics to transform and analyze incoming streaming data from Kinesis Data Streams in real time. Kinesis Data Analytics takes care of everything required to run streaming applications continuously, and scales automatically to match the volume and throughput of your incoming data. With Kinesis Data Analytics, there are no servers to manage, no minimum fee or setup cost, and you only pay for the resources your streaming applications consume.
Amazon Kinesis Data Analytics:
via - https://aws.amazon.com/kinesis/
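To make "query the data in real time" more concrete, here is a minimal sketch of the kind of windowed aggregation a streaming analytics application typically runs. It is plain Python that mimics a tumbling-window count and is not the Kinesis Data Analytics API; the function name, the `(timestamp, page)` event shape, and the 60-second window are assumptions for illustration only.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window start, page) over fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, page in events:
        # Round the timestamp down to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, page)] += 1
    return dict(counts)


events = [(0, "/home"), (30, "/home"), (59, "/cart"), (60, "/home"), (125, "/cart")]
print(tumbling_window_counts(events))
```

In the managed service, the equivalent logic would be expressed in the application's own query or stream-processing code, and the results would be emitted continuously rather than returned once at the end.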
Amazon Kinesis Data Firehose is an extract, transform, and load (ETL) service that reliably captures, transforms and delivers streaming data to data lakes, data stores, and analytics services.
For the given use case, once the real-time analysis is complete, the output from Kinesis Data Analytics is sent to Kinesis Data Firehose, which reliably delivers the data to Amazon S3.
Amazon Kinesis Data Firehose:
via - https://aws.amazon.com/kinesis/
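The delivery step can be pictured with a toy model. Kinesis Data Firehose buffers incoming records and writes them to the destination in batches once a size or time threshold is reached. The class below is a simplified, in-memory sketch of that idea and not the Firehose API: the class name, the thresholds, and the explicit `now` argument are assumptions, and the real service flushes on its own timer rather than only when a record arrives.

```python
class BufferedDelivery:
    """Toy model of size-or-time buffering in front of an object store."""

    def __init__(self, max_bytes, max_seconds):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.buffer = []
        self.buffered_bytes = 0
        self.opened_at = None
        self.delivered = []  # each entry stands in for one delivered object

    def add(self, record: bytes, now: float) -> None:
        if self.opened_at is None:
            self.opened_at = now  # first record opens a new buffer
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        too_big = self.buffered_bytes >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.delivered.append(b"".join(self.buffer))
        self.buffer = []
        self.buffered_bytes = 0
        self.opened_at = None


sink = BufferedDelivery(max_bytes=10, max_seconds=60)
sink.add(b"aaaa\n", now=0)   # 5 bytes, stays buffered
sink.add(b"bbbb\n", now=1)   # 10 bytes, size threshold reached
sink.add(b"cc\n", now=2)     # opens a new buffer
sink.add(b"dd\n", now=70)    # buffer is 68 s old, time threshold reached
print(len(sink.delivered))  # 2
```

Batching in this way is what lets the delivery stage write fewer, larger objects to Amazon S3 instead of one object per record.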
Incorrect options:
Leverage Amazon Kinesis Data Streams to capture the data from the website and feed it into Amazon QuickSight which can query the data in real time. Lastly, the analyzed feed is output into Kinesis Data Firehose to persist the data on Amazon S3 - QuickSight cannot use Kinesis Data Streams as a source. In addition, QuickSight cannot be used for real-time streaming data analysis from its source. Therefore this option is incorrect.
Leverage Amazon Kinesis Data Streams to capture the data from the website and feed it into Kinesis Data Firehose to persist the data on Amazon S3. Lastly, use Amazon Athena to analyze the data in real time - Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. However, Athena queries data that has already been stored in S3, so it cannot analyze the stream in real time. Therefore, this option is incorrect.
Leverage Amazon SQS to capture the data from the website. Configure a fleet of EC2 instances under an Auto scaling group to process messages from the SQS queue and trigger the scaling policy based on the number of pending messages in the queue. Perform real-time analytics using a third party library on the EC2 instances - Although using SQS with EC2 instances can decouple the architecture, performing real-time analytics with a third-party library on the EC2 instances is not the best fit for the given use case. The Kinesis family of services is the better fit for this scenario because it provides streaming data ingestion, real-time analysis, and reliable data delivery to the data sink.
References:
https://aws.amazon.com/kinesis/
https://aws.amazon.com/quicksight/resources/faqs/