Data 151-175

Authored by Michael Caponpon

Professional Development


25 questions


1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You work for an advertising company, and you've developed a Spark ML model to predict click-through rates on advertisement blocks. You've been developing everything in your on-premises data center, and now your company is migrating to Google Cloud. Your data center will be closing soon, so a rapid lift-and-shift migration is necessary. However, the data you've been using will be migrated to BigQuery. You periodically retrain your Spark ML models, so you need to migrate your existing training pipelines to Google Cloud. What should you do?

Use Vertex AI for training existing Spark ML models

Rewrite your models in TensorFlow, and start using Vertex AI

Use Dataproc for training existing Spark ML models, but start reading data directly from BigQuery

Spin up a Spark cluster on Compute Engine, and train Spark ML models on the data exported from BigQuery

Answer explanation

Correct Answer:

Use Dataproc for training existing Spark ML models, but start reading data directly from BigQuery

Why This Works

  • Dataproc is Google's managed Hadoop and Spark service, which allows you to run your existing Spark ML pipelines without major code changes.

  • BigQuery Connector for Spark lets Spark read directly from BigQuery, avoiding the need to export data manually.

  • This approach maintains continuity with your existing Spark ML pipeline while ensuring a smooth migration to Google Cloud.

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You work for a global shipping company. You want to train a model on 40 TB of data to predict which ships in each geographic region are likely to cause delivery delays on any given day. The model will be based on multiple attributes collected from multiple sources. Telemetry data, including location in GeoJSON format, will be pulled from each ship and loaded every hour. You want to have a dashboard that shows how many and which ships are likely to cause delays within a region. You want to use a storage solution that has native functionality for prediction and geospatial processing. Which storage solution should you use?

BigQuery

Cloud Bigtable

Cloud Datastore

Cloud SQL for PostgreSQL

Answer explanation

Correct Answer:

BigQuery

Why BigQuery?

BigQuery is the best choice because:

  • Scalability: It can efficiently handle 40 TB of data and continuous hourly updates.

  • Geospatial Processing: It has native GIS functions (ST_GEOGPOINT, ST_INTERSECTS, ST_DISTANCE, etc.), which are essential for analyzing ship locations in GeoJSON format.

  • Machine Learning Integration: BigQuery ML lets you train ML models directly without moving data to another platform.

  • Dashboards & Visualization: It integrates well with Looker, Data Studio, and third-party BI tools for real-time dashboards.
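To make the "native functionality" point concrete, here is the shape of a query the dashboard could run, combining BigQuery GIS and BigQuery ML. Every project, dataset, table, model, and column name below is hypothetical; the SQL is held in a Python string only so it can sit next to a client call.

```python
# Hypothetical dashboard query: score the last hour of telemetry with a
# BigQuery ML model and aggregate at-risk ships per region. All names
# are illustrative placeholders.
delay_risk_sql = """
SELECT
  region,
  COUNT(*) AS ships_at_risk,
  ARRAY_AGG(ship_id) AS ship_ids
FROM ML.PREDICT(
  MODEL `my-project.shipping.delay_model`,
  (
    SELECT
      ship_id,
      region,
      ST_GEOGFROMGEOJSON(location_geojson) AS position,
      speed_knots
    FROM `my-project.shipping.telemetry`
    WHERE ts > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
  )
)
WHERE predicted_delay = TRUE
GROUP BY region
"""

# The string would be submitted via the BigQuery client, e.g.
# google.cloud.bigquery.Client().query(delay_risk_sql)
```

`ST_GEOGFROMGEOJSON` parses the GeoJSON locations natively, and `ML.PREDICT` runs inference in place, which is exactly the combination the question asks for.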

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You operate an IoT pipeline built around Apache Kafka that normally receives around 5000 messages per second. You want to use Google Cloud Platform to create an alert as soon as the moving average over 1 hour drops below 4000 messages per second. What should you do?

Consume the stream of data in Dataflow using Kafka IO. Set a sliding time window of 1 hour every 5 minutes. Compute the average when the window closes and send an alert if the average is less than 4000 messages.

Consume the stream of data in Dataflow using Kafka IO. Set a fixed time window of 1 hour. Compute the average when the window closes, and send an alert if the average is less than 4000 messages.

Use Kafka Connect to link your Kafka message queue to Pub/Sub. Use a Dataflow template to write your messages from Pub/Sub to Bigtable. Use Cloud Scheduler to run a script every hour that counts the number of rows created in Bigtable in the last hour. If that number falls below 4000, send an alert.

Use Kafka Connect to link your Kafka message queue to Pub/Sub. Use a Dataflow template to write your messages from Pub/Sub to BigQuery. Use Cloud Scheduler to run a script every five minutes that counts the number of rows created in BigQuery in the last hour. If that number falls below 4000, send an alert.

Answer explanation

Correct Answer:

Consume the stream of data in Dataflow using Kafka IO. Set a sliding time window of 1 hour every 5 minutes. Compute the average when the window closes and send an alert if the average is less than 4000 messages.

Why This is the Best Option?

  • Streaming Analytics with Dataflow:

    • Google Cloud Dataflow (Apache Beam) can process real-time Kafka streams efficiently.

  • Sliding Window for Continuous Monitoring:

    • A sliding window (1-hour window, advancing every 5 minutes) ensures near-real-time detection of a drop in the message rate, rather than waiting for a fixed window to close.

  • Immediate Alerts:

    • Dataflow can trigger an alert as soon as the moving average falls below 4000 messages per second.
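The sliding-window semantics can be simulated in a few lines of plain Python. This is a toy sketch of the logic, not a Beam pipeline: each element is the measured message rate for one 5-minute bucket, and a window "closes" every bucket once an hour of data has accumulated.

```python
from collections import deque

WINDOW_BUCKETS = 12   # 12 x 5 minutes = 1 hour
THRESHOLD = 4000      # messages per second

def alerts(rates_per_bucket):
    """Return (bucket_index, average) pairs where the 1-hour moving
    average of the message rate fell below the threshold."""
    window = deque(maxlen=WINDOW_BUCKETS)
    fired = []
    for i, rate in enumerate(rates_per_bucket):
        window.append(rate)
        if len(window) == WINDOW_BUCKETS:       # a full hour of data
            avg = sum(window) / WINDOW_BUCKETS
            if avg < THRESHOLD:
                fired.append((i, avg))
    return fired

# Steady 5000 msg/s, then traffic collapses to 1000 msg/s: the alert
# fires a few buckets into the drop, once the hourly average dips
# below 4000, rather than waiting a full fixed hour.
alerts([5000] * 12 + [1000] * 12)
```

This is why the sliding window beats a fixed window for alerting: each 5-minute advance re-evaluates the last full hour, so a sustained drop is caught within minutes of crossing the threshold.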

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You plan to deploy Cloud SQL using MySQL. You need to ensure high availability in the event of a zone failure. What should you do?

Create a Cloud SQL instance in one zone, and create a failover replica in another zone within the same region.

Create a Cloud SQL instance in one zone, and create a read replica in another zone within the same region.

Create a Cloud SQL instance in one zone, and configure an external read replica in a zone in a different region.

Create a Cloud SQL instance in a region, and configure automatic backup to a Cloud Storage bucket in the same region.

Answer explanation

Correct Answer:

Create a Cloud SQL instance in one zone, and create a failover replica in another zone within the same region.

Why This is the Best Option?

  • Cloud SQL High Availability (HA) Setup:

    • Cloud SQL HA uses a primary instance with a failover replica in a different zone in the same region.

    • In the event of a zone failure, automatic failover occurs to the replica.

  • Automatic Failover Mechanism:

    • Cloud SQL HA relies on regional persistent disks to replicate data synchronously.

    • Failover is automated and minimizes downtime.

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Your company is selecting a system to centralize data ingestion and delivery. You are considering messaging and data integration systems to address the requirements. The key requirements are: the ability to seek to a particular offset in a topic, possibly back to the start of all data ever captured; support for publish/subscribe semantics on hundreds of topics; and retention of per-key ordering. Which system should you choose?

Apache Kafka

Cloud Storage

Dataflow

Firebase Cloud Messaging

Answer explanation

Correct Answer:

Apache Kafka

Why Apache Kafka?

Apache Kafka is the best choice because it meets all the key requirements:

  1. Seek to a particular offset

    • Kafka allows consumers to seek to a specific offset in a topic.

    • You can rewind to the start of all data ever captured if retention policies allow.

  2. Publish/Subscribe Semantics

    • Kafka natively supports pub/sub with multiple topics and partitions.

    • Can easily handle hundreds of topics efficiently.

  3. Per-Key Ordering

    • Kafka guarantees ordering within a partition for messages with the same key.

    • Ensures that events for the same key (e.g., a specific user or device) are processed in order.
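The per-key ordering guarantee follows from how the producer assigns keys to partitions, which this toy sketch illustrates. Kafka itself hashes keys with murmur2; `crc32` here is just a stand-in, and the keys and event names are made up.

```python
import zlib

# Toy model of Kafka's key-based partitioning: every event for a given
# key is routed to the same partition, and each partition is an
# append-only log, so per-key order is preserved.
NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    # Kafka's default partitioner uses murmur2; crc32 is a stand-in.
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

partitions = {p: [] for p in range(NUM_PARTITIONS)}
events = [("user-7", "login"), ("device-2", "ping"),
          ("user-7", "purchase"), ("device-2", "pong")]
for key, event in events:
    partitions[partition_for(key)].append((key, event))

# All of user-7's events sit in one partition, in the order sent.
```

Consumers reading a partition sequentially therefore see each key's events in production order, even though there is no ordering guarantee across partitions.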

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

You are planning to migrate your current on-premises Apache Hadoop deployment to the cloud. You need to ensure that the deployment is as fault-tolerant and cost-effective as possible for long-running batch jobs. You want to use a managed service. What should you do?

Deploy a Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://

Deploy a Dataproc cluster. Use an SSD persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://

Install Hadoop and Spark on a 10-node Compute Engine instance group with standard instances. Install the Cloud Storage connector, and store the data in Cloud Storage. Change references in scripts from hdfs:// to gs://

Install Hadoop and Spark on a 10-node Compute Engine instance group with preemptible instances. Store data in HDFS. Change references in scripts from hdfs:// to gs://

Answer explanation

Correct Answer:

Deploy a Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs://.

Why This Answer?

1. Managed Service → Google Cloud Dataproc

  • Dataproc is a fully managed service for running Apache Hadoop and Spark.

  • It automatically handles scaling, fault tolerance, and cluster lifecycle.

2. Cost Optimization → Preemptible Workers

  • Using 50% preemptible workers reduces costs significantly.

  • Preemptible VMs are much cheaper than standard instances but can be terminated anytime.

  • Dataproc handles failures gracefully by rescheduling failed jobs.

3. Storage Optimization → Cloud Storage (gs://) instead of HDFS

  • Cloud Storage is more cost-effective and durable than HDFS.

  • Eliminates the need for managing an HDFS cluster.

  • Dataproc natively integrates with Cloud Storage using the Cloud Storage Connector.

  • Simply update Hadoop/Spark scripts to reference gs:// instead of hdfs://.

4. Standard Persistent Disk is Sufficient

  • Standard persistent disks provide enough performance for batch workloads.

  • SSD persistent disks (as in the second option) increase costs unnecessarily unless the workload is very I/O-intensive.
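A quick back-of-the-envelope calculation shows why 50% preemptible workers cuts cost so sharply. The hourly prices below are hypothetical placeholders, not current GCP list prices.

```python
# Illustrative cost sketch for a 10-worker Dataproc cluster.
# Prices are assumed placeholders, not real GCP pricing.
STANDARD_PRICE = 0.20     # $/hour per standard worker (assumed)
PREEMPTIBLE_PRICE = 0.04  # $/hour per preemptible worker (assumed)

def cluster_cost(workers: int, preemptible_fraction: float) -> float:
    preemptible = int(workers * preemptible_fraction)
    standard = workers - preemptible
    return standard * STANDARD_PRICE + preemptible * PREEMPTIBLE_PRICE

all_standard = cluster_cost(10, 0.0)   # 10 standard workers
half_preempt = cluster_cost(10, 0.5)   # 5 standard + 5 preemptible
```

With these assumed prices, moving half the workers to preemptible VMs drops the hourly worker cost by 40%, while Dataproc's job rescheduling absorbs any preemptions for long-running batch work.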

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Your team is working on a binary classification problem. You have trained a support vector machine (SVM) classifier with default parameters and received an area under the curve (AUC) of 0.87 on the validation set. You want to increase the AUC of the model. What should you do?

Perform hyperparameter tuning

Train a classifier with deep neural networks, because neural networks would always beat SVMs

Deploy the model and measure the real-world AUC; it's always higher because of generalization

Scale predictions you get out of the model (tune a scaling factor as a hyperparameter) in order to get the highest AUC

Answer explanation

Correct Answer:

Perform hyperparameter tuning

Why This Answer?

Hyperparameter tuning is a crucial step in improving the performance of machine learning models, including Support Vector Machines (SVMs). Since the current model has an AUC of 0.87, optimizing hyperparameters can help boost performance further.

Key hyperparameters to tune for an SVM:

  • Kernel type (linear, polynomial, radial basis function (RBF), sigmoid)

  • C (Regularization Parameter) → Controls the trade-off between maximizing the margin and minimizing classification errors.

  • Gamma (for RBF and polynomial kernels) → Controls the influence of a single training example.

  • Degree (for polynomial kernel) → Determines the complexity of the decision boundary.

Tuning these using Grid Search or Bayesian Optimization can maximize AUC by finding the best combination of hyperparameters.
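A grid search over exactly these hyperparameters can be sketched with scikit-learn. The synthetic dataset and the grid values below are illustrative only, not a recommendation for any particular problem.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the team's validation data.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Illustrative grid over C and gamma for an RBF-kernel SVM.
param_grid = {
    "C": [0.1, 1, 10],             # regularization strength
    "gamma": ["scale", 0.1, 1.0],  # RBF kernel width
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      scoring="roc_auc", cv=3)
search.fit(X, y)

# search.best_params_ holds the AUC-maximizing combination;
# search.best_score_ is the cross-validated AUC it achieved.
```

Scoring with `roc_auc` makes the search optimize the metric the question cares about directly, rather than plain accuracy.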
