Training Data Exam

Professional Development

99 Qs

Similar activities

Berpikir Komputasi Kelas X YSKI • University - Professional Development • 100 Qs

PRESENT PERFECT • Professional Development • 99 Qs

501-600 IFRC • Professional Development • 100 Qs

601-700 IFRC • Professional Development • 100 Qs

Prueba Cris • Professional Development • 94 Qs

ENGLISH TEST - MINING TRUCK • Professional Development • 100 Qs

Training Data Exam

Assessment • Quiz • Instructional Technology • Professional Development • Hard

Created by Stefy MZ

99 questions

1.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

Your company built a TensorFlow neural-network model with a large number of neurons and layers. The model fits the training data well. However, when tested against new data, it performs poorly. What method can you employ to address this?
Threading
Serialization
Dropout Methods
Dimensionality Reduction
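
Of the listed options, dropout is the standard remedy for this kind of overfitting: it randomly disables a fraction of neurons during training so the network cannot simply memorize the training set. A minimal Keras sketch; the layer sizes and the 0.5 rate are illustrative, not part of the question:

```python
import tensorflow as tf

# Dropout layers randomly zero out activations during training only,
# which regularizes the network and improves generalization to new data.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```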

2.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

An external customer provides you with a daily dump of data from their database. The data flows into Google Cloud Storage (GCS) as comma-separated values (CSV) files. You want to analyze this data in Google BigQuery, but the data could have rows that are formatted incorrectly or corrupted. How should you build this pipeline?
Use federated data sources, and check data in the SQL query.
Enable BigQuery monitoring in Google Stackdriver and create an alert.
Import the data into BigQuery using the gcloud CLI and set max_bad_records to 0.
Run a Google Cloud Dataflow batch pipeline to import the data into BigQuery, and push errors to another dead-letter table for analysis.
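
The Dataflow option describes the dead-letter pattern: parse each CSV row in the pipeline and divert rows that fail parsing to a separate error table instead of failing the load. A hedged Apache Beam (Python SDK) sketch; the bucket, table names, and two-column row format are assumptions:

```python
import csv
import apache_beam as beam

class ParseCsvRow(beam.DoFn):
    """Parse one CSV line; route unparseable rows to a dead-letter output."""
    def process(self, line):
        try:
            fields = next(csv.reader([line]))
            yield {"id": int(fields[0]), "value": fields[1]}
        except Exception as err:
            # Corrupt or badly formatted rows go to the side output for analysis.
            yield beam.pvalue.TaggedOutput(
                "dead_letter", {"raw_line": line, "error": str(err)})

with beam.Pipeline() as p:
    results = (
        p
        | "ReadCsv" >> beam.io.ReadFromText("gs://my-bucket/daily_dump/*.csv")
        | "Parse" >> beam.ParDo(ParseCsvRow()).with_outputs("dead_letter", main="good")
    )
    # Both target tables are assumed to already exist with matching schemas.
    results.good | "WriteGood" >> beam.io.WriteToBigQuery(
        "my-project:analytics.daily_data",
        create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER)
    results.dead_letter | "WriteErrors" >> beam.io.WriteToBigQuery(
        "my-project:analytics.daily_data_errors",
        create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER)
```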

3.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

Your weather app queries a database every 15 minutes to get the current temperature. The frontend is powered by Google App Engine and serves millions of users. How should you design the frontend to respond to a database failure?
Issue a command to restart the database servers.
Retry the query with exponential backoff, up to a cap of 15 minutes.
Retry the query every second until it comes back online to minimize staleness of data.
Reduce the query frequency to once every hour until the database comes back online.
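
Retrying with exponential backoff capped at 15 minutes avoids hammering a failing database while keeping data no staler than the normal refresh interval. A minimal sketch; query_current_temperature is a placeholder for the app's real database call:

```python
import random
import time

MAX_BACKOFF_SECONDS = 15 * 60  # cap matches the 15-minute refresh interval

def query_with_backoff(query_current_temperature):
    """Retry the database call, doubling the wait after each failure."""
    delay = 1
    while True:
        try:
            return query_current_temperature()
        except Exception:
            # Sleep with a little jitter, then double the delay up to the cap.
            time.sleep(delay + random.uniform(0, 1))
            delay = min(delay * 2, MAX_BACKOFF_SECONDS)
```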

4.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

You are building a new real-time data warehouse for your company and will use Google BigQuery streaming inserts. There is no guarantee that data will only be sent once, but you do have a unique ID for each row of data and an event timestamp. You want to ensure that duplicates are not included while interactively querying data. Which query type should you use?
Include ORDER BY DESC on timestamp column and LIMIT to 1.
Use GROUP BY on the unique ID column and timestamp column and SUM on the values.
Use the LAG window function with PARTITION by unique ID along with WHERE LAG IS NOT NULL.
Use the ROW_NUMBER window function with PARTITION by unique ID along with WHERE row equals 1.
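
ROW_NUMBER() OVER (PARTITION BY the unique ID) keeps exactly one row per ID at query time, so duplicates from streaming inserts are filtered without rewriting the table. A sketch using the google-cloud-bigquery client; the project, dataset, table, and column names are assumptions:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Keep only the newest row per unique_id when querying interactively.
query = """
    SELECT * EXCEPT (row_num)
    FROM (
      SELECT
        *,
        ROW_NUMBER() OVER (
          PARTITION BY unique_id
          ORDER BY event_timestamp DESC
        ) AS row_num
      FROM `my-project.warehouse.events`
    )
    WHERE row_num = 1
"""
for row in client.query(query).result():
    print(dict(row))
```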

5.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules:
• No interaction by the user on the site for 1 hour
• Has added more than $30 worth of products to the basket
• Has not completed a transaction
You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you design the pipeline?
Use a fixed-time window with a duration of 60 minutes.
Use a sliding time window with a duration of 60 minutes.
Use a session window with a gap time duration of 60 minutes.
Use a global window with a time-based trigger with a delay of 60 minutes.
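
A session window groups a user's events into one window that closes only after a gap of inactivity, which matches the "no interaction for 1 hour" rule. A small Apache Beam (Python SDK) sketch with in-memory stand-in events; the real pipeline would read from the site's clickstream:

```python
import apache_beam as beam
from apache_beam.transforms import window

# (user_id, event_time_in_seconds) test events standing in for the clickstream.
events = [
    ("user-1", 0),
    ("user-1", 1200),    # 20 minutes later: same session
    ("user-1", 10000),   # more than 60 minutes later: starts a new session
]

with beam.Pipeline() as p:
    _ = (
        p
        | beam.Create(events)
        | "AddTimestamps" >> beam.Map(lambda e: window.TimestampedValue(e, e[1]))
        | "SessionWindow" >> beam.WindowInto(window.Sessions(60 * 60))
        | "GroupPerSession" >> beam.GroupByKey()
        | "Print" >> beam.Map(print)
    )
```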

6.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

Your company is migrating their 30-node Apache Hadoop cluster to the cloud. They want to reuse the Hadoop jobs they have already created and minimize the management of the cluster as much as possible. They also want to be able to persist data beyond the life of the cluster. What should you do?
Create a Google Cloud Dataflow job to process the data.
Create a Google Cloud Dataproc cluster that uses persistent disks for HDFS.
Create a Hadoop cluster on Google Compute Engine that uses persistent disks.
Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
Create a Hadoop cluster on Google Compute Engine that uses Local SSD disks.
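
A Cloud Dataproc cluster with the Cloud Storage connector lets existing Hadoop and Spark jobs read and write gs:// paths directly, so the data outlives the cluster while cluster management stays minimal. A hedged PySpark sketch; the bucket and path names are illustrative:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migrated-hadoop-job").getOrCreate()

# Input and output live in Cloud Storage rather than cluster-local HDFS,
# so deleting the Dataproc cluster does not delete the data.
logs = spark.read.text("gs://my-company-data/raw/*.log")
logs.write.mode("overwrite").parquet("gs://my-company-data/processed/logs/")
```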

7.

MULTIPLE CHOICE QUESTION

30 sec • 5 pts

Your company's on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration. What should you do?
Put the data into Google Cloud Storage.
Use preemptible virtual machines (VMs) for the Cloud Dataproc cluster.
Tune the Cloud Dataproc cluster so that there is just enough disk for all data.
Migrate some of the cold data into Google Cloud Storage, and keep only the hot data in Persistent Disk.
