Estudio Examen Data

2nd Grade

49 Qs

Assessment • Quiz • Computers • 2nd Grade • Easy

Created by Yuliana Lopez

Used 4+ times • 49 questions

1.

MULTIPLE CHOICE QUESTION

1 min • 2 pts

You are designing a data processing pipeline. The pipeline must be able to scale automatically as load increases. Messages must be processed at least once and must be ordered within windows of 1 hour. How should you design the solution?

A. Use Cloud Pub/Sub for message ingestion and Cloud Dataproc for streaming analysis.

B. Use Apache Kafka for message ingestion and Cloud Dataproc for streaming analysis.

C. Use Apache Kafka for message ingestion and Cloud Dataflow for streaming analysis.

D. Use Cloud Pub/Sub for message ingestion and Cloud Dataflow for streaming analysis.
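The ordering requirement in this question can be made concrete with a small stand-in: a real pipeline would use Apache Beam on Cloud Dataflow, but the core guarantee — group messages into fixed 1-hour windows, then order within each window — can be sketched in plain Python (timestamps and payloads below are made up for illustration):

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # 1-hour windows, per the requirement

def window_and_order(messages):
    """Group (timestamp, payload) pairs into fixed 1-hour windows and
    sort each window by timestamp -- the guarantee the question asks
    Dataflow to provide on top of Pub/Sub's at-least-once delivery."""
    windows = defaultdict(list)
    for ts, payload in messages:
        windows[ts // WINDOW_SECONDS].append((ts, payload))
    return {w: sorted(msgs) for w, msgs in windows.items()}

msgs = [(3601, "b"), (10, "a"), (3600, "c"), (20, "a2")]
print(window_and_order(msgs))
```

In the managed solution, Dataflow's fixed windowing plays the role of the `ts // WINDOW_SECONDS` bucketing, and autoscaling of workers covers the "scale as load increases" condition.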

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You have historical data covering the last three years in BigQuery and a data pipeline that delivers new data to BigQuery daily. You have noticed that when the Data Science team runs a query filtered on a date column and limited to 30–90 days of data, the query scans the entire table. You also noticed that your bill is increasing more quickly than you expected. You want to resolve the issue as cost-effectively as possible while maintaining the ability to conduct SQL queries. What should you do?

A. Recommend that the Data Science team export the table to a CSV file on Cloud Storage and use Cloud Datalab to explore the data by reading the files directly.

B. Re-create the tables using DDL. Partition the tables by a column containing a TIMESTAMP or DATE type.

C. Modify your pipeline to maintain the last 30–90 days of data in one table and the longer history in a different table to minimize full table scans over the entire history.

D. Write an Apache Beam pipeline that creates a BigQuery table per day. Recommend that the Data Science team use wildcards on the table name suffixes to select the data they need.
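The effect behind option B can be sketched briefly. The dataset, table, and column names in the DDL string below are hypothetical, and the cost model is a deliberate simplification: partition pruning means a date-filtered query touches only the matching partitions instead of the whole table:

```python
# Sketch of option B: re-create the table partitioned on a DATE column so
# queries filtered on that column scan only the matching partitions.
# Names (mydataset.events, event_ts) are hypothetical.
ddl = (
    "CREATE TABLE mydataset.events_partitioned "
    "PARTITION BY DATE(event_ts) AS "
    "SELECT * FROM mydataset.events"
)

def days_scanned(total_days, filter_days, partitioned):
    """Rough cost model: an unpartitioned table scans every day of data;
    a date-partitioned one scans only the filtered date range."""
    return filter_days if partitioned else total_days

print(days_scanned(3 * 365, 90, partitioned=False))  # full scan
print(days_scanned(3 * 365, 90, partitioned=True))   # pruned
```

Since BigQuery bills on bytes scanned, pruning three years of data down to a 90-day window is what makes this the cost-effective choice while keeping plain SQL access.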

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt


Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects. What should you do?

A. Create an aggregated log export at the organization level, capture the BigQuery job execution logs, create a custom metric based on totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric.

B. Create a Stackdriver Monitoring dashboard based on the BigQuery metric query/scanned_bytes.

C. Create a log export for each project, capture the BigQuery job execution logs, create a custom metric based on totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric.

D. Create a Stackdriver Monitoring dashboard based on the BigQuery metric slots/allocated_for_project.
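The custom metric in options A and C is derived from the `totalSlotMs` field of BigQuery job-completion audit log entries. As a minimal sketch, the extraction step looks like the following — the exact field nesting shown here is an assumption about the shape of a job-completion entry, trimmed to only the relevant keys:

```python
# A single BigQuery job-completion audit log entry, reduced to the fields
# relevant here (the nesting is an assumed/simplified shape).
entry = {
    "protoPayload": {
        "serviceData": {
            "jobCompletedEvent": {
                "job": {"jobStatistics": {"totalSlotMs": "125000"}}
            }
        }
    }
}

def total_slot_ms(log_entry):
    """Pull totalSlotMs out of a job-completion log entry; this is the
    value a log-based custom metric would aggregate per project."""
    stats = (log_entry["protoPayload"]["serviceData"]
             ["jobCompletedEvent"]["job"]["jobStatistics"])
    return int(stats["totalSlotMs"])

print(total_slot_ms(entry))  # 125000
```

Because each team needs visibility into its *own* project, a per-project log export (option C) rather than an organization-level aggregate is what scopes the resulting dashboard correctly.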

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You work for a shipping company that uses handheld scanners to read shipping labels. Your company has strict data privacy standards: transmitting recipients' personally identifiable information (PII) to analytics systems violates user privacy rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems. What should you do?

A. Build a Cloud Function that reads the topics and makes a call to the Cloud Data Loss Prevention API. Use the tagging and confidence levels to either pass or quarantine the data in a bucket for review.

B. Install a third-party data validation tool on Compute Engine virtual machines to check the incoming data for sensitive information.

C. Use Stackdriver logging to analyze the data passed through the total pipeline to identify transactions that may contain sensitive information.

D. Create an authorized view in BigQuery to restrict access to tables with sensitive data.
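The pass-or-quarantine routing in option A can be illustrated without the Cloud DLP API itself (which needs credentials and a live project). As a local stand-in, a single regex check — here a US-style SSN pattern, an assumed example of PII — plays the role of the DLP inspection result:

```python
import re

# Stand-in for a Cloud DLP inspection call: a regex for US-style SSNs
# (an assumed PII pattern chosen purely for illustration).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def route(message):
    """Return the destination for a scanned-label payload: messages
    flagged as containing PII go to a quarantine bucket for review,
    everything else passes through to analytics."""
    return "quarantine" if SSN_RE.search(message) else "analytics"

print(route("pkg 42, recipient id 123-45-6789"))  # quarantine
print(route("pkg 43, zone B"))                    # analytics
```

In the managed solution, the DLP API's finding types and confidence levels replace this single pattern, but the routing decision downstream of the inspection is the same.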

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You are designing a cloud-native historical data processing system to meet the following conditions:

✑ The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Cloud Dataproc, BigQuery, and Compute Engine.

✑ A streaming data pipeline stores new data daily.

✑ Performance is not a factor in the solution.

✑ The solution design should maximize availability.

What should you do?

A. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.

B. Store the data in BigQuery. Access the data using the BigQuery Connector on Cloud Dataproc and Compute Engine.

C. Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.

D. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You need to create a new transaction table in Cloud Spanner that stores product sales data. You are deciding what to use as a primary key. From a performance perspective, which strategy should you choose?

A. A random universally unique identifier number (version 4 UUID)

B. The original order identification number from the sales system, which is a monotonically increasing integer

C. The current epoch time

D. A concatenation of the product name and the current epoch time
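The performance concern in this question is hotspotting: Cloud Spanner splits its keyspace by primary-key ranges, so monotonically increasing keys (options B and C) send every insert to the tail split, while random v4 UUIDs scatter writes. A rough sketch — bucketing keys by their first character as a crude proxy for key ranges — makes the difference visible:

```python
import uuid

def first_char_buckets(keys):
    """Crude proxy for Spanner key-range splits: which leading
    characters (i.e., regions of the keyspace) do writes land in?"""
    return {k[0] for k in keys}

# Sequential order IDs, as in option B: all share the same prefix.
monotonic = [f"{n:08d}" for n in range(1000, 1100)]
# Random v4 UUIDs, as in option A: spread across the hex keyspace.
random_keys = [uuid.uuid4().hex for _ in range(100)]

print(len(first_char_buckets(monotonic)))    # 1 -- every write hits one range
print(len(first_char_buckets(random_keys)))  # ~16 -- writes spread out
```

The bucketing here is only illustrative, but the underlying point matches option A: a random UUID distributes insert load across splits instead of concentrating it on the last one.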

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You have data stored in BigQuery. The data in the BigQuery dataset must be highly available. You need to define a storage, backup, and recovery strategy for this data that minimizes cost. How should you configure the BigQuery table?

A. Set the BigQuery dataset to be regional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.

B. Set the BigQuery dataset to be multiregional. Create a scheduled query to make copies of the data to tables suffixed with the time of the backup. In the event of an emergency, use the backup copy of the table.

C. Set the BigQuery dataset to be regional. In the event of an emergency, use a point-in-time snapshot to recover the data.

D. Set the BigQuery dataset to be multiregional. In the event of an emergency, use a point-in-time snapshot to recover the data.
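The scheduled-query backup in options A and B copies the table to a name "suffixed with the time of the backup"; the only piece worth sketching is that naming convention. The dataset and table names below are hypothetical, and the daily date suffix is an assumed granularity:

```python
from datetime import datetime, timezone

def backup_table_id(table="mydataset.sales", when=None):
    """Build the time-suffixed destination name a scheduled query would
    write its backup copy to (names and suffix format are illustrative)."""
    when = when or datetime.now(timezone.utc)
    return f"{table}_backup_{when:%Y%m%d}"

print(backup_table_id(when=datetime(2024, 1, 15, tzinfo=timezone.utc)))
# mydataset.sales_backup_20240115
```

In an emergency, recovery is then just querying (or copying back from) the most recent suffixed table.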
