1. You are designing a data processing pipeline. The pipeline must scale automatically as load increases. Messages must be processed at least once and must be ordered within windows of 1 hour. How should you design the solution?

A. Use Cloud Pub/Sub for message ingestion and Cloud Dataproc for streaming analysis.
B. Use Apache Kafka for message ingestion and Cloud Dataproc for streaming analysis.
C. Use Apache Kafka for message ingestion and Cloud Dataflow for streaming analysis.
D. Use Cloud Pub/Sub for message ingestion and Cloud Dataflow for streaming analysis.
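For context, here is a minimal Apache Beam sketch of the Pub/Sub-plus-Dataflow pattern that options A and D describe (the project, subscription, and handler names are invented). Pub/Sub provides at-least-once delivery, and fixed one-hour windows scope the ordering requirement to each window:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import window

def process_message(msg):
    # Placeholder at-least-once handler; real processing goes here.
    return msg

# Streaming mode; the Dataflow runner would be selected via flags.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (p
     | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
           subscription="projects/my-project/subscriptions/ingest")  # hypothetical
     | "Decode" >> beam.Map(lambda b: b.decode("utf-8"))
     | "HourlyWindows" >> beam.WindowInto(window.FixedWindows(60 * 60))
     | "Process" >> beam.Map(process_message))
```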
2. You have historical data covering the last three years in BigQuery and a data pipeline that delivers new data to BigQuery daily. You have noticed that when the Data Science team runs a query filtered on a date column and limited to 30–90 days of data, the query scans the entire table. You also noticed that your bill is increasing more quickly than you expected. You want to resolve the issue as cost-effectively as possible while maintaining the ability to conduct SQL queries. What should you do?
A. Recommend that the Data Science team export the table to a CSV file on Cloud Storage and use Cloud Datalab to explore the data by reading the files directly.
B. Re-create the tables using DDL. Partition the tables by a column containing a TIMESTAMP or DATE type.
C. Modify your pipeline to maintain the last 30–90 days of data in one table and the longer history in a different table to minimize full table scans over the entire history.
D. Write an Apache Beam pipeline that creates a BigQuery table per day. Recommend that the Data Science team use wildcards on the table name suffixes to select the data they need.
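As a hedged illustration of the DDL approach option B describes, re-creating the table with a partitioning clause looks roughly like this (project, dataset, table, and column names are invented):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Re-create the table partitioned on the date column so that queries
# filtered on that column scan only the matching partitions.
ddl = """
CREATE TABLE `my-project.analytics.events_partitioned`
PARTITION BY DATE(event_ts)
AS SELECT * FROM `my-project.analytics.events`
"""
client.query(ddl).result()  # wait for the DDL job to finish
```

After this, a query with `WHERE DATE(event_ts) BETWEEN ...` touches only the relevant partitions instead of the whole table, which is what brings the bill down.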
3. Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects. What should you do?
A. Create an aggregated log export at the organization level, capture the BigQuery job execution logs, create a custom metric based on the totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric.
B. Create a Stackdriver Monitoring dashboard based on the BigQuery metric query/scanned_bytes.
C. Create a log export for each project, capture the BigQuery job execution logs, create a custom metric based on the totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric.
D. Create a Stackdriver Monitoring dashboard based on the BigQuery metric slots/allocated_for_project.
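A rough sketch of the per-project log-based-metric idea in options A and C, using the google-cloud-logging client (project and metric names are invented; the audit-log field path for totalSlotMs is an assumption, and the value extractor that turns the counter into a slot-milliseconds metric is typically attached in the console or via gcloud rather than through this helper):

```python
from google.cloud import logging

client = logging.Client(project="analytics-team-a")  # hypothetical team project

# Log-based metric over BigQuery job-completion audit logs. totalSlotMs
# lives under jobStatistics in the audit-log payload; pair this metric
# with a value extractor on that field for a slot-usage dashboard.
metric = client.metric(
    "bq-total-slot-ms",  # hypothetical metric name
    filter_=(
        'resource.type="bigquery_resource" AND '
        'protoPayload.methodName="jobservice.jobcompleted"'
    ),
    description="BigQuery job completions, for slot-usage dashboards",
)
metric.create()
```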
4. You work for a shipping company that uses handheld scanners to read shipping labels. Your company has strict data privacy standards that prohibit scanners from transmitting recipients' personally identifiable information (PII) to analytics systems, as doing so would violate user privacy rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems. What should you do?
A. Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.
B. Install a third-party data validation tool on Compute Engine virtual machines to check the incoming data for sensitive information.
C. Use Stackdriver Logging to analyze the data passed through the total pipeline to identify transactions that may contain sensitive information.
D. Create an authorized view in BigQuery to restrict access to tables with sensitive data.
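Option A's shape, sketched as a Pub/Sub-triggered Cloud Function calling the Cloud DLP API (the project name, infoType selection, and the quarantine/forward helpers are illustrative assumptions, not the exam's reference implementation):

```python
import base64
import google.cloud.dlp_v2 as dlp

dlp_client = dlp.DlpServiceClient()

def quarantine(text):   # hypothetical: write to a review bucket
    print("quarantined")

def forward(text):      # hypothetical: pass along to analytics
    print("forwarded")

def scan_label(event, context):
    """Pub/Sub-triggered Cloud Function entry point."""
    text = base64.b64decode(event["data"]).decode("utf-8")
    response = dlp_client.inspect_content(
        request={
            "parent": "projects/my-project",  # hypothetical project
            "inspect_config": {
                "info_types": [{"name": "PERSON_NAME"}, {"name": "PHONE_NUMBER"}],
                "min_likelihood": dlp.Likelihood.POSSIBLE,
            },
            "item": {"value": text},
        }
    )
    # Route on DLP findings: anything flagged goes to review, clean
    # messages continue to the analytics systems.
    if response.result.findings:
        quarantine(text)
    else:
        forward(text)
```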
5. You are designing a cloud-native historical data processing system to meet the following conditions:
✑ The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Cloud Dataproc, BigQuery, and Compute Engine.
✑ A streaming data pipeline stores new data daily.
✑ Performance is not a factor in the solution.
✑ The solution design should maximize availability.
What should you do?
A. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.
B. Store the data in BigQuery. Access the data using the BigQuery Connector on Cloud Dataproc and Compute Engine.
C. Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.
D. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.
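For reference, reading the bucket directly, as options A and D propose, is a few lines with the Cloud Storage client (the bucket name and object layout are invented):

```python
from google.cloud import storage

client = storage.Client()

# Every tool (Dataproc jobs, BigQuery external tables, Compute Engine
# workloads) can read the same objects in place; no copies to keep in sync.
for blob in client.list_blobs("historical-data", prefix="avro/"):  # hypothetical
    payload = blob.download_as_bytes()
    print(blob.name, len(payload))
```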
6. You need to create a new transaction table in Cloud Spanner that stores product sales data. You are deciding what to use as a primary key. From a performance perspective, which strategy should you choose?
A. A random universally unique identifier number (version 4 UUID)
B. The original order identification number from the sales system, which is a monotonically increasing integer
C. The current epoch time
D. A concatenation of the product name and the current epoch time
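To make the trade-off concrete, here is a sketch of option A: inserting a row keyed by a version-4 UUID, which spreads writes across Spanner's key space instead of concentrating them at one end as the monotonically increasing keys in options B and C would (instance, database, table, and column names are invented):

```python
import uuid
from google.cloud import spanner

client = spanner.Client()
database = client.instance("sales-instance").database("sales-db")  # hypothetical

with database.batch() as batch:
    batch.insert(
        table="ProductSales",                              # hypothetical schema
        columns=("SaleId", "ProductName", "AmountCents"),
        values=[(str(uuid.uuid4()), "widget", 1299)],      # random key avoids hotspots
    )
```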
7. You have data stored in BigQuery. The data in the BigQuery dataset must be highly available. You need to define a storage, backup, and recovery strategy for this data that minimizes cost. How should you configure the BigQuery table?
A. Set the BigQuery dataset to be regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.
B. Set the BigQuery dataset to be multiregional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.
C. Set the BigQuery dataset to be regional. In the event of an emergency, use a point-in-time snapshot to recover the data.
D. Set the BigQuery dataset to be multiregional. In the event of an emergency, use a point-in-time snapshot to recover the data.
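The "point-in-time snapshot" wording in options C and D maps to BigQuery time travel. A hedged sketch of recovering a table as of an earlier timestamp (the names are invented, and time travel only reaches into the recent past, seven days by default):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Restore the table to its state one hour ago using time travel.
sql = """
CREATE OR REPLACE TABLE `my-project.sales.orders_recovered` AS
SELECT * FROM `my-project.sales.orders`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
"""
client.query(sql).result()
```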