Data 268-277

Similar activities

Unit 12 - Big data (12th Grade - University, 10 Qs)
SQL (11th - 12th Grade, 12 Qs)
Data Engineer 288-297 (12th Grade, 10 Qs)
Tools y preguntas específicas (12th Grade, 10 Qs)
SQL (10th - 12th Grade, 11 Qs)
Database (8th - 12th Grade, 10 Qs)
RDBMS & SQL QUERIES (12th Grade, 15 Qs)
Data 211-220 (12th Grade, 10 Qs)

Data 268-277

Assessment

Quiz

Computers

12th Grade

Hard

Created by Academia Google


10 questions


1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

[Scenario provided as an image in the original quiz]

What should you do?

Group the data by using a tumbling window in a Dataflow pipeline, and write the aggregated data to Memorystore.

Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.

Group the data by using a session window in a Dataflow pipeline, and write the aggregated data to BigQuery.

Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to BigQuery.
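For context on the windowing terms in these options, here is a minimal Apache Beam (Python) sketch of tumbling, hopping, and session windows; the data, timestamps, and window sizes are illustrative assumptions, not part of the question.

```python
import apache_beam as beam
from apache_beam.transforms import window

with beam.Pipeline() as p:
    events = (
        p
        | beam.Create([("sensor-1", 1), ("sensor-1", 1), ("sensor-2", 1)])
        # Attach an event timestamp so the windowing below has something to group on.
        | beam.Map(lambda kv: window.TimestampedValue(kv, 10))
    )

    # Tumbling (fixed) windows: non-overlapping, one aggregate per key per window.
    (
        events
        | "TumblingWindow" >> beam.WindowInto(window.FixedWindows(60))
        | "SumPerKey" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )

    # Hopping windows overlap:            beam.WindowInto(window.SlidingWindows(60, 15))
    # Session windows close after a gap:  beam.WindowInto(window.Sessions(600))
```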

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You are designing a Dataflow pipeline for a batch processing job. You want to mitigate multiple zonal failures at job submission time. What should you do?

Submit duplicate pipelines in two different zones by using the --zone flag.

Set the pipeline staging location as a regional Cloud Storage bucket.

Specify a worker region by using the --region flag.

Create an Eventarc trigger that resubmits the job in the event of a zonal failure at submission time.
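For reference on the flags mentioned in the options, here is a minimal sketch of launching a Beam pipeline on the Dataflow runner with only a worker region specified; the project, bucket, and region values are placeholders.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Specifying --region (and no --zone) lets the Dataflow service place workers
# in any healthy zone within that region. All names below are placeholders.
options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=example-project",
    "--region=us-central1",
    "--temp_location=gs://example-bucket/tmp",
])

with beam.Pipeline(options=options) as p:
    p | beam.Create([1, 2, 3]) | beam.Map(lambda x: x * 2)
```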

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

[Scenario provided as an image in the original quiz]

What should you do?

Import the ORC files to Bigtable tables for the data scientist team.

Import the ORC files to BigQuery tables for the data scientist team.

Copy the ORC files to Cloud Storage, then deploy a Dataproc cluster for the data scientist team.

Copy the ORC files to Cloud Storage, then create external BigQuery tables for the data scientist team.
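For context on the external-table option, here is a minimal google-cloud-bigquery sketch that defines an external BigQuery table over ORC files in Cloud Storage; the project, dataset, and bucket names are placeholders.

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

# External table definition pointing at ORC files already in Cloud Storage.
external_config = bigquery.ExternalConfig("ORC")
external_config.source_uris = ["gs://example-bucket/exports/*.orc"]

table = bigquery.Table("example-project.analytics.orc_external")
table.external_data_configuration = external_config
client.create_table(table, exists_ok=True)
```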

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You have a BigQuery table that ingests data directly from a Pub/Sub subscription. The ingested data is encrypted with a Google-managed encryption key. You need to meet a new organization policy that requires you to use keys from a centralized Cloud Key Management Service (Cloud KMS) project to encrypt data at rest. What should you do?

Use Cloud KMS encryption key with Dataflow to ingest the existing Pub/Sub subscription to the existing BigQuery table.

Create a new BigQuery table by using customer-managed encryption keys (CMEK), and migrate the data from the old BigQuery table.

Create a new Pub/Sub topic with CMEK and use the existing BigQuery table by using a Google-managed encryption key.

Create a new BigQuery table and Pub/Sub topic by using customer-managed encryption keys (CMEK), and migrate the data from the old BigQuery table.
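For reference on the CMEK options, here is a minimal sketch that creates a BigQuery table and a Pub/Sub topic encrypted with a key from a centralized Cloud KMS project; every resource name below is a placeholder.

```python
from google.cloud import bigquery, pubsub_v1

# Key held in a centralized Cloud KMS project (placeholder path).
kms_key = (
    "projects/central-kms-project/locations/us/"
    "keyRings/data-keyring/cryptoKeys/bq-ingest-key"
)

# BigQuery table encrypted with the customer-managed key.
bq = bigquery.Client(project="example-project")
table = bigquery.Table(
    "example-project.analytics.events_cmek",
    schema=[bigquery.SchemaField("payload", "STRING")],
)
table.encryption_configuration = bigquery.EncryptionConfiguration(kms_key_name=kms_key)
bq.create_table(table)

# Pub/Sub topic encrypted with the same key.
publisher = pubsub_v1.PublisherClient()
publisher.create_topic(
    request={
        "name": publisher.topic_path("example-project", "events-cmek"),
        "kms_key_name": kms_key,
    }
)
```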

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You are creating the CI/CD cycle for the code of the directed acyclic graphs (DAGs) running in Cloud Composer. Your team has two Cloud Composer instances: one instance for development and another instance for production. Your team is using a Git repository to maintain and develop the code of the DAGs. You want to deploy the DAGs automatically to Cloud Composer when a certain tag is pushed to the Git repository. What should you do?

1. Use Cloud Build to copy the code of the DAG to the Cloud Storage bucket of the development instance for DAG testing.

2. If the tests pass, use Cloud Build to copy the code to the bucket of the production instance.

1. Use Cloud Build to build a container with the code of the DAG and the KubernetesPodOperator to deploy the code to the Google Kubernetes Engine (GKE) cluster of the development instance for testing.

2. If the tests pass, use the KubernetesPodOperator to deploy the container to the GKE cluster of the production instance.

1. Use Cloud Build to build a container and the KubernetesPodOperator to deploy the code of the DAG to the Google Kubernetes Engine (GKE) cluster of the development instance for testing.

2. If the tests pass, copy the code to the Cloud Storage bucket of the production instance.

1. Use Cloud Build to copy the code of the DAG to the Cloud Storage bucket of the development instance for DAG testing.

2. If the tests pass, use Cloud Build to build a container with the code of the DAG and the KubernetesPodOperator to deploy the container to the Google Kubernetes Engine (GKE) cluster of the production instance.
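For context, the "copy the DAG code to the Cloud Storage bucket" step from these options could be scripted roughly as below and run from a Cloud Build step; the bucket names, the dags/ layout, and the test hook are assumptions.

```python
from pathlib import Path
from google.cloud import storage

def deploy_dags(bucket_name: str, dag_dir: str = "dags") -> None:
    """Upload every DAG file in dag_dir to a Composer environment's bucket."""
    bucket = storage.Client().bucket(bucket_name)
    for path in Path(dag_dir).glob("*.py"):
        bucket.blob(f"dags/{path.name}").upload_from_filename(str(path))

# Deploy to development first; promote to production only if the DAG tests pass.
deploy_dags("us-central1-dev-composer-bucket")    # placeholder bucket name
# ... run DAG validation / unit tests against the development environment ...
deploy_dags("us-central1-prod-composer-bucket")   # placeholder bucket name
```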

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You have a BigQuery dataset named “customers”. All tables will be tagged by using a Data Catalog tag template named “gdpr”. The template contains one mandatory field, “has_sensitive_data”, with a boolean value. All employees must be able to do a simple search and find tables in the dataset that have either true or false in the “has_sensitive_data” field. However, only the Human Resources (HR) group should be able to see the data inside the tables for which “has_sensitive_data” is true. You give the all employees group the bigquery.metadataViewer and bigquery.connectionUser roles on the dataset. You want to minimize configuration overhead. What should you do next?

Create the “gdpr” tag template with private visibility. Assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.

Create the “gdpr” tag template with private visibility. Assign the datacatalog.tagTemplateViewer role on this tag to the all employees group, and assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.

Create the “gdpr” tag template with public visibility. Assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.

Create the “gdpr” tag template with public visibility. Assign the datacatalog.tagTemplateViewer role on this tag to the all employees group, and assign the bigquery.dataViewer role to the HR group on the tables that contain sensitive data.
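For reference, here is a minimal google-cloud-datacatalog sketch of creating the “gdpr” tag template with its mandatory boolean field; the project and location are placeholders, and the visibility setting and IAM grants from the options are configured separately.

```python
from google.cloud import datacatalog_v1

client = datacatalog_v1.DataCatalogClient()

# Template with the single mandatory boolean field described in the question.
template = datacatalog_v1.TagTemplate()
template.display_name = "gdpr"
template.fields["has_sensitive_data"] = datacatalog_v1.TagTemplateField()
template.fields["has_sensitive_data"].display_name = "Has sensitive data"
template.fields["has_sensitive_data"].type_.primitive_type = (
    datacatalog_v1.FieldType.PrimitiveType.BOOL
)
template.fields["has_sensitive_data"].is_required = True

client.create_tag_template(
    parent="projects/example-project/locations/us-central1",  # placeholder
    tag_template_id="gdpr",
    tag_template=template,
)
```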

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You are monitoring your organization’s data lake hosted on BigQuery. The ingestion pipelines read data from Pub/Sub and write the data into tables on BigQuery. After a new version of the ingestion pipelines is deployed, the daily stored data increased by 50%. The volumes of data in Pub/Sub remained the same and only some tables had their daily partition data size doubled. You need to investigate and fix the cause of the data increase. What should you do?

1. Check for duplicate rows in the BigQuery tables that have the daily partition data size doubled.

2. Schedule daily SQL jobs to deduplicate the affected tables.

3. Share the deduplication script with the other operational teams to reuse if this occurs to other tables.

1. Check for code errors in the deployed pipelines.

2. Check whether the pipeline writes to the BigQuery sink multiple times.

3. Check for errors in Cloud Logging during the day of the release of the new pipelines.

4. If no errors, restore the BigQuery tables to their content before the last release by using time travel.

1. Check for duplicate rows in the BigQuery tables that have the daily partition data size doubled.

2. Check the BigQuery Audit logs to find job IDs.

3. Use Cloud Monitoring to determine when the identified Dataflow jobs started and the pipeline code version.

4. When more than one pipeline ingests data into a table, stop all versions except the latest one.

1. Roll back the last deployment.

2. Restore the BigQuery tables to their content before the last release by using time travel.

3. Restart the Dataflow jobs and replay the messages by seeking the subscription to the timestamp of the release.
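For context, the duplicate-row check that several of these options begin with could look like the query below; the table, key column, and partition column are illustrative assumptions.

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

# Count rows per business key in today's partition; any key with more than
# one copy points at duplicate ingestion.
query = """
SELECT event_id, COUNT(*) AS copies
FROM `example-project.lake.events`
WHERE DATE(ingest_time) = CURRENT_DATE()
GROUP BY event_id
HAVING COUNT(*) > 1
ORDER BY copies DESC
LIMIT 100
"""

for row in client.query(query).result():
    print(row.event_id, row.copies)
```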
