Estudio Examen Data

2nd Grade

49 Qs

Assessment • Quiz • Computers • 2nd Grade • Easy

Created by Yuliana Lopez

Used 4+ times • 49 questions

1.

MULTIPLE CHOICE QUESTION

1 min • 2 pts

You are designing a data processing pipeline. The pipeline must be able to scale automatically as load increases. Messages must be processed at least once and must be ordered within windows of 1 hour. How should you design the solution?

A. Use Cloud Pub/Sub for message ingestion and Cloud Dataproc for streaming analysis.

B. Use Apache Kafka for message ingestion and Cloud Dataproc for streaming analysis.

C. Use Apache Kafka for message ingestion and Cloud Dataflow for streaming analysis.

D. Use Cloud Pub/Sub for message ingestion and Cloud Dataflow for streaming analysis.
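The ordering requirement in this question can be made concrete with a small stand-in: a real pipeline would use Apache Beam on Cloud Dataflow, but the core guarantee — group messages into fixed 1-hour windows, then order within each window — can be sketched in plain Python (timestamps and payloads below are made up for illustration):

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # 1-hour windows, per the requirement

def window_and_order(messages):
    """Group (timestamp, payload) pairs into fixed 1-hour windows and
    sort each window by timestamp -- the guarantee the question asks
    Dataflow to provide on top of Pub/Sub's at-least-once delivery."""
    windows = defaultdict(list)
    for ts, payload in messages:
        windows[ts // WINDOW_SECONDS].append((ts, payload))
    return {w: sorted(msgs) for w, msgs in windows.items()}

msgs = [(3601, "b"), (10, "a"), (3600, "c"), (20, "a2")]
print(window_and_order(msgs))
```

In the managed solution, Dataflow's fixed windowing plays the role of the `ts // WINDOW_SECONDS` bucketing, and autoscaling of workers covers the "scale as load increases" condition.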

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You have historical data covering the last three years in BigQuery and a data pipeline that delivers new data to BigQuery daily. You have noticed that when the Data Science team runs a query filtered on a date column and limited to 30–90 days of data, the query scans the entire table. You also noticed that your bill is increasing more quickly than you expected. You want to resolve the issue as cost-effectively as possible while maintaining the ability to conduct SQL queries. What should you do?

A. Recommend that the Data Science team export the table to a CSV file on Cloud Storage and use Cloud Datalab to explore the data by reading the files directly.

B. Re-create the tables using DDL. Partition the tables by a column containing a TIMESTAMP or DATE type.

C. Modify your pipeline to maintain the last 30–90 days of data in one table and the longer history in a different table to minimize full table scans over the entire history.

D. Write an Apache Beam pipeline that creates a BigQuery table per day. Recommend that the Data Science team use wildcards on the table name suffixes to select the data they need.
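The effect behind option B can be sketched briefly. The dataset, table, and column names in the DDL string below are hypothetical, and the cost model is a deliberate simplification: partition pruning means a date-filtered query touches only the matching partitions instead of the whole table:

```python
# Sketch of option B: re-create the table partitioned on a DATE column so
# queries filtered on that column scan only the matching partitions.
# Names (mydataset.events, event_ts) are hypothetical.
ddl = (
    "CREATE TABLE mydataset.events_partitioned "
    "PARTITION BY DATE(event_ts) AS "
    "SELECT * FROM mydataset.events"
)

def days_scanned(total_days, filter_days, partitioned):
    """Rough cost model: an unpartitioned table scans every day of data;
    a date-partitioned one scans only the filtered date range."""
    return filter_days if partitioned else total_days

print(days_scanned(3 * 365, 90, partitioned=False))  # full scan
print(days_scanned(3 * 365, 90, partitioned=True))   # pruned
```

Since BigQuery bills on bytes scanned, pruning three years of data down to a 90-day window is what makes this the cost-effective choice while keeping plain SQL access.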

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt


Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects. What should you do?

A. Create an aggregated log export at the organization level, capture the BigQuery job execution logs, create a custom metric based on totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric.

B. Create a Stackdriver Monitoring dashboard based on the BigQuery metric query/scanned_bytes.

C. Create a log export for each project, capture the BigQuery job execution logs, create a custom metric based on totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric.

D. Create a Stackdriver Monitoring dashboard based on the BigQuery metric slots/allocated_for_project.
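The custom metric in options A and C is derived from the `totalSlotMs` field of BigQuery job-completion audit log entries. As a minimal sketch, the extraction step looks like the following — the exact field nesting shown here is an assumption about the shape of a job-completion entry, trimmed to only the relevant keys:

```python
# A single BigQuery job-completion audit log entry, reduced to the fields
# relevant here (the nesting is an assumed/simplified shape).
entry = {
    "protoPayload": {
        "serviceData": {
            "jobCompletedEvent": {
                "job": {"jobStatistics": {"totalSlotMs": "125000"}}
            }
        }
    }
}

def total_slot_ms(log_entry):
    """Pull totalSlotMs out of a job-completion log entry; this is the
    value a log-based custom metric would aggregate per project."""
    stats = (log_entry["protoPayload"]["serviceData"]
             ["jobCompletedEvent"]["job"]["jobStatistics"])
    return int(stats["totalSlotMs"])

print(total_slot_ms(entry))  # 125000
```

Because each team needs visibility into its *own* project, a per-project log export (option C) rather than an organization-level aggregate is what scopes the resulting dashboard correctly.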

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You work for a shipping company that uses handheld scanners to read shipping labels. Your company has strict data privacy standards: transmitting recipients' personally identifiable information (PII) to analytics systems violates user privacy rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems. What should you do?

A. Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.

B. Install a third-party data validation tool on Compute Engine virtual machines to check the incoming data for sensitive information.

C. Use Stackdriver logging to analyze the data passed through the total pipeline to identify transactions that may contain sensitive information.

D. Create an authorized view in BigQuery to restrict access to tables with sensitive data.
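The pass-or-quarantine routing in option A can be illustrated without the Cloud DLP API itself (which needs credentials and a live project). As a local stand-in, a single regex check — here a US-style SSN pattern, an assumed example of PII — plays the role of the DLP inspection result:

```python
import re

# Stand-in for a Cloud DLP inspection call: a regex for US-style SSNs
# (an assumed PII pattern chosen purely for illustration).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def route(message):
    """Return the destination for a scanned-label payload: messages
    flagged as containing PII go to a quarantine bucket for review,
    everything else passes through to analytics."""
    return "quarantine" if SSN_RE.search(message) else "analytics"

print(route("pkg 42, recipient id 123-45-6789"))  # quarantine
print(route("pkg 43, zone B"))                    # analytics
```

In the managed solution, the DLP API's finding types and confidence levels replace this single pattern, but the routing decision downstream of the inspection is the same.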

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You are designing a cloud-native historical data processing system to meet the following conditions:

✑ The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Cloud Dataproc, BigQuery, and Compute Engine.

✑ A streaming data pipeline stores new data daily.

✑ Performance is not a factor in the solution.

✑ The solution design should maximize availability.

What should you do?

A. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.

B. Store the data in BigQuery. Access the data using the BigQuery Connector on Cloud Dataproc and Compute Engine.

C. Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.

D. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You need to create a new transaction table in Cloud Spanner that stores product sales data. You are deciding what to use as a primary key. From a performance perspective, which strategy should you choose?

A. A random universally unique identifier number (version 4 UUID)

B. The original order identification number from the sales system, which is a monotonically increasing integer

C. The current epoch time

D. A concatenation of the product name and the current epoch time
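The performance concern in this question is hotspotting: Cloud Spanner splits its keyspace by primary-key ranges, so monotonically increasing keys (options B and C) send every insert to the tail split, while random v4 UUIDs scatter writes. A rough sketch — bucketing keys by their first character as a crude proxy for key ranges — makes the difference visible:

```python
import uuid

def first_char_buckets(keys):
    """Crude proxy for Spanner key-range splits: which leading
    characters (i.e., regions of the keyspace) do writes land in?"""
    return {k[0] for k in keys}

# Sequential order IDs, as in option B: all share the same prefix.
monotonic = [f"{n:08d}" for n in range(1000, 1100)]
# Random v4 UUIDs, as in option A: spread across the hex keyspace.
random_keys = [uuid.uuid4().hex for _ in range(100)]

print(len(first_char_buckets(monotonic)))    # 1 -- every write hits one range
print(len(first_char_buckets(random_keys)))  # ~16 -- writes spread out
```

The bucketing here is only illustrative, but the underlying point matches option A: a random UUID distributes insert load across splits instead of concentrating it on the last one.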

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You have data stored in BigQuery. The data in the BigQuery dataset must be highly available. You need to define a storage, backup, and recovery strategy for this data that minimizes cost. How should you configure the BigQuery table?

A. Set the BigQuery dataset to be regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.

B. Set the BigQuery dataset to be multiregional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.

C. Set the BigQuery dataset to be regional. In the event of an emergency, use a point-in-time snapshot to recover the data.

D. Set the BigQuery dataset to be multiregional. In the event of an emergency, use a point-in-time snapshot to recover the data.
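The scheduled-query backup in options A and B copies the table to a name "suffixed with the time of the backup"; the only piece worth sketching is that naming convention. The dataset and table names below are hypothetical, and the daily date suffix is an assumed granularity:

```python
from datetime import datetime, timezone

def backup_table_id(table="mydataset.sales", when=None):
    """Build the time-suffixed destination name a scheduled query would
    write its backup copy to (names and suffix format are illustrative)."""
    when = when or datetime.now(timezone.utc)
    return f"{table}_backup_{when:%Y%m%d}"

print(backup_table_id(when=datetime(2024, 1, 15, tzinfo=timezone.utc)))
# mydataset.sales_backup_20240115
```

In an emergency, recovery is then just querying (or copying back from) the most recent suffixed table.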
