PDE-2022-4

Professional Development

36 Qs

Similar activities

Amazon AWS Module 1-5 Test (Professional Development, 41 Qs)

File Transfer Service (Professional Development, 35 Qs)

Terik's Practice Test 6 (Professional Development, 40 Qs)

Microsoft 70-773: Analyzing Big Data with Microsoft R Exam (Professional Development, 38 Qs)

Associate Practice Test (Professional Development, 39 Qs)

Database Questions (KG - Professional Development, 31 Qs)

AWS Practitioner 06 (Professional Development, 39 Qs)

AZ-900 Exam Q&A Series (11th Grade - Professional Development, 33 Qs)

PDE-2022-4

Quiz

Professional Development

Medium

Created by Balamurugan R

Used 53+ times

36 questions

1.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

A shipping company has live package-tracking data that is sent to an Apache Kafka stream in real time. This is then loaded into BigQuery. Analysts in your company want to query the tracking data in BigQuery to analyze geospatial trends in the lifecycle of a package. The table was originally created with ingest-date partitioning. Over time, the query processing time has increased. You need to implement a change that would improve query performance in BigQuery. What should you do?

Implement clustering in BigQuery on the ingest date column.

Implement clustering in BigQuery on the package-tracking ID column.

Tier older data onto Cloud Storage files and create a BigQuery table using Cloud Storage as an external data source.

Re-create the table using data partitioning on the package delivery date.
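
For reference, clustering can be added to an existing BigQuery table without re-creating it. A minimal sketch with the Python client library, assuming hypothetical project, dataset, table, and column names:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table and column names.
table = client.get_table("example-project.logistics.package_tracking")
table.clustering_fields = ["tracking_id"]  # cluster on the package-tracking ID
client.update_table(table, ["clustering_fields"])
```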

2.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

Your company currently runs a large on-premises cluster using Spark, Hive, and HDFS in a colocation facility. The cluster is designed to accommodate peak usage on the system; however, many jobs are batch in nature, and usage of the cluster fluctuates quite dramatically. Your company is eager to move to the cloud to reduce the overhead associated with on-premises infrastructure and maintenance and to benefit from the cost savings. They are also hoping to modernize their existing infrastructure to use more serverless offerings in order to take advantage of the cloud. Because of the timing of their contract renewal with the colocation facility, they have only 2 months for their initial migration. How would you recommend they approach their upcoming migration strategy so they can maximize their cost savings in the cloud while still executing the migration in time?

Migrate the workloads to Dataproc plus HDFS; modernize later.

Migrate the workloads to Dataproc plus Cloud Storage; modernize later.

Migrate the Spark workload to Dataproc plus HDFS, and modernize the Hive workload for BigQuery.

Modernize the Spark workload for Dataflow and the Hive workload for BigQuery.
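
For context, the "lift-and-shift now, modernize later" path relies on the Cloud Storage connector that ships preinstalled on Dataproc, which lets existing Spark jobs read gs:// paths in place of HDFS. A minimal PySpark sketch, with hypothetical bucket and path names:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-etl").getOrCreate()

# On Dataproc, a lift-and-shift is often just a path change,
# e.g. hdfs:///data/orders becomes gs://example-bucket/data/orders.
orders = spark.read.parquet("gs://example-bucket/data/orders")
(orders.groupBy("status").count()
    .write.mode("overwrite")
    .parquet("gs://example-bucket/output/order_counts"))
```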

3.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

You work for a financial institution that lets customers register online. As new customers register, their user data is sent to Pub/Sub before being ingested into BigQuery. For security reasons, you decide to redact your customers' government-issued identification numbers (Social Security numbers, SSNs) while allowing customer service representatives to view the original values when necessary. What should you do?

Use BigQuery's built-in AEAD encryption to encrypt the SSN column. Save the keys to a new table that is only viewable by permissioned users.

Use BigQuery column-level security. Set the table permissions so that only members of the Customer Service user group can see the SSN column.

Before loading the data into BigQuery, use Cloud Data Loss Prevention (DLP) to replace input values with a cryptographic hash.

Before loading the data into BigQuery, use Cloud Data Loss Prevention (DLP) to replace input values with a cryptographic format-preserving encryption token.
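
For context, Cloud DLP can tokenize values with deterministic, format-preserving encryption, which authorized users holding the key can reverse. A rough sketch with the Python client; the project, KMS key name, and input value are hypothetical, and the KMS-wrapped key bytes are elided:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()
parent = "projects/example-project/locations/global"  # hypothetical project

ssn_info_type = {"name": "US_SOCIAL_SECURITY_NUMBER"}
deidentify_config = {
    "info_type_transformations": {
        "transformations": [{
            "info_types": [ssn_info_type],
            "primitive_transformation": {
                "crypto_replace_ffx_fpe_config": {
                    # Key wrapped by Cloud KMS; key name is hypothetical and
                    # the wrapped key bytes are elided.
                    "crypto_key": {"kms_wrapped": {
                        "wrapped_key": b"...",
                        "crypto_key_name": ("projects/example-project/locations/"
                                            "global/keyRings/pii/cryptoKeys/ssn"),
                    }},
                    "common_alphabet": "NUMERIC",
                }
            },
        }]
    }
}

response = dlp.deidentify_content(request={
    "parent": parent,
    "inspect_config": {"info_types": [ssn_info_type]},
    "deidentify_config": deidentify_config,
    "item": {"value": "Customer SSN: 123-45-6789"},
})
print(response.item.value)  # digits replaced by an FPE token, format preserved
```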

4.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

You are migrating a table to BigQuery and are deciding on the data model. Your table stores information related to purchases made across several store locations and includes information like the time of the transaction, items purchased, the store ID, and the city and state in which the store is located. You frequently query this table to see how many of each item were sold over the past 30 days and to look at purchasing trends by state, city, and individual store. How would you model this table for the best query performance?

Partition by transaction time; cluster by state first, then city, then store ID.

Partition by transaction time; cluster by store ID first, then city, then state.

Top-level cluster by state first, then city, then store ID.

Top-level cluster by store ID first, then city, then state.
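
For reference, the partition-plus-cluster pattern the options describe looks like this in BigQuery DDL, issued here through the Python client with hypothetical dataset and column names. Clustering columns are ordered coarsest to finest so that filters on any prefix of that order can prune storage blocks:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical dataset, table, and column names.
ddl = """
CREATE TABLE retail.purchases (
  transaction_time TIMESTAMP,
  item_id STRING,
  store_id STRING,
  city STRING,
  state STRING
)
PARTITION BY DATE(transaction_time)  -- prunes 30-day queries to 30 partitions
CLUSTER BY state, city, store_id     -- coarsest to finest
"""
client.query(ddl).result()
```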

5.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

You are updating the code for a subscriber to a Pub/Sub feed. You are concerned that upon deployment the subscriber may erroneously acknowledge messages, leading to message loss. Your subscriber is not set up to retain acknowledged messages. What should you do to ensure that you can recover from errors after deployment?

Set up the Pub/Sub emulator on your local machine. Validate the behavior of your new subscriber logic before deploying it to production.

Create a Pub/Sub snapshot before deploying new subscriber code. Use a Seek operation to re-deliver messages that became available after the snapshot was created.

Use Cloud Build for your deployment. If an error occurs after deployment, use a Seek operation to locate a timestamp logged by Cloud Build at the start of the deployment.

Enable dead-lettering on the Pub/Sub topic to capture messages that aren't successfully acknowledged. If an error occurs after deployment, re-deliver any messages captured by the dead-letter queue.
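
For context, a Pub/Sub snapshot captures a subscription's unacknowledged backlog at a point in time, and a later seek rewinds the subscription to that point so those messages are re-delivered. A minimal sketch with the Python client, assuming hypothetical project, subscription, and snapshot names:

```python
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
project = "projects/example-project"
subscription = f"{project}/subscriptions/tracking-sub"
snapshot = f"{project}/snapshots/pre-deploy"

# Before deploying: capture the subscription's unacked backlog.
subscriber.create_snapshot(request={"name": snapshot,
                                    "subscription": subscription})

# After a bad deploy: rewind so Pub/Sub re-delivers every message that was
# unacknowledged when the snapshot was taken, plus those published since.
subscriber.seek(request={"subscription": subscription, "snapshot": snapshot})
```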

6.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

You work for a large real estate firm and are preparing 6 TB of home sales data to be used for machine learning. You will use SQL to transform the data and use BigQuery ML to create a machine learning model. You plan to use the model for predictions against a raw dataset that has not been transformed. How should you set up your workflow in order to prevent skew at prediction time?

When creating your model, use BigQuery's TRANSFORM clause to define preprocessing steps. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any transformations on the raw input data.

When creating your model, use BigQuery's TRANSFORM clause to define preprocessing steps. Before requesting predictions, use a saved query to transform your raw input data, and then use ML.EVALUATE.

Use a BigQuery view to define your preprocessing logic. When creating your model, use the view as your model training data. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any transformations on the raw input data.

Preprocess all data using Dataflow. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any further transformations on the input data.
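
For reference, BigQuery ML's TRANSFORM clause stores the preprocessing steps inside the model itself, so the same transformations are applied automatically to raw input at prediction time. A sketch with hypothetical dataset, table, and column names:

```python
from google.cloud import bigquery

client = bigquery.Client()

# The TRANSFORM clause bakes scaling and bucketizing into the model, so raw
# rows can be passed straight in later without repeating the preprocessing.
create_model = """
CREATE OR REPLACE MODEL housing.price_model
TRANSFORM(
  ML.STANDARD_SCALER(square_feet) OVER () AS square_feet_scaled,
  ML.QUANTILE_BUCKETIZE(year_built, 10) OVER () AS year_built_bucket,
  sale_price
)
OPTIONS (model_type = 'linear_reg', input_label_cols = ['sale_price']) AS
SELECT square_feet, year_built, sale_price FROM housing.sales
"""
client.query(create_model).result()
```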

7.

MULTIPLE CHOICE QUESTION

2 mins • 1 pt

You are analyzing the price of a company's stock. Every 5 seconds, you need to compute a moving average of the past 30 seconds' worth of data. You are reading data from Pub/Sub and using Dataflow to conduct the analysis. How should you set up your windowed pipeline?

Use a fixed window with a duration of 5 seconds. Emit results by setting the following trigger: AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(30))

Use a fixed window with a duration of 30 seconds. Emit results by setting the following trigger: AfterWatermark.pastEndOfWindow().plusDelayOf(Duration.standardSeconds(5))

Use a sliding window with a duration of 5 seconds. Emit results by setting the following trigger: AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(30))

Use a sliding window with a duration of 30 seconds and a period of 5 seconds. Emit results by setting the following trigger: AfterWatermark.pastEndOfWindow()
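
For context, a sliding window with a 30-second size and a 5-second period is the standard way to emit a 30-second moving average every 5 seconds; each element then belongs to six overlapping windows, and the default trigger fires once per window when the watermark passes its end. A minimal Beam Python sketch with a hypothetical topic name:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

opts = PipelineOptions(streaming=True)
with beam.Pipeline(options=opts) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(
            topic="projects/example-project/topics/stock-prices")
        | "Parse" >> beam.Map(lambda msg: float(msg.decode("utf-8")))
        # 30-second windows starting every 5 seconds.
        | "Window" >> beam.WindowInto(
            beam.window.SlidingWindows(size=30, period=5))
        # Default trigger: fire once per window at the watermark.
        | "Mean" >> beam.CombineGlobally(
            beam.combiners.MeanCombineFn()).without_defaults()
        | "Print" >> beam.Map(print)
    )
```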
