
EXAM PREPARATION PART 3

Quiz • Other • Professional Development • Hard
licibeth delacruz
26 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which of the following describes how Databricks Repos can help facilitate CI/CD workflows
on the Databricks Lakehouse Platform?
A. Databricks Repos can facilitate the pull request, review, and approval process before
merging branches
B. Databricks Repos can merge changes from a secondary Git branch into a main Git
branch
C. Databricks Repos can be used to design, develop, and trigger Git automation
pipelines
D. Databricks Repos can store the single-source-of-truth Git repository
E. Databricks Repos can commit or push code changes to trigger a CI/CD process
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
A data engineering team needs to query a Delta table to extract rows that all meet the same
condition. However, the team has noticed that the query is running slowly. The team has
already tuned the size of the data files. Upon investigating, the team has concluded that the
rows meeting the condition are sparsely located throughout each of the data files.
Based on the scenario, which of the following optimization techniques could speed up the
query?
A. Data skipping
B. Z-Ordering
C. Bin-packing
D. Write as a Parquet file
E. Tuning the file size
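The scenario in this question can be illustrated with a small plain-Python sketch. It is a simplified model, not Databricks code: each "file" is a list of values, a file is read only if its minimum/maximum range could contain the target value, and sorting on a single column stands in for the co-location that clustering techniques such as Z-Ordering aim for. The helper names (`files_to_scan`, `chunk`) and the data are assumptions made up for illustration.

```python
import random


def files_to_scan(files, target):
    """Count files whose [min, max] range could contain target."""
    return sum(1 for f in files if min(f) <= target <= max(f))


def chunk(values, size):
    """Split a list of values into consecutive 'files' of the given size."""
    return [values[i:i + size] for i in range(0, len(values), size)]


random.seed(0)
# 1000 made-up rows: each value from 0 to 99 appears 10 times.
values = [v for v in range(100) for _ in range(10)]

shuffled = values[:]
random.shuffle(shuffled)

# Matching rows are sparsely spread across the files.
scattered_files = chunk(shuffled, 100)
# Matching rows are co-located (simplified stand-in for clustering).
clustered_files = chunk(sorted(values), 100)

print("scattered:", files_to_scan(scattered_files, 42))
print("clustered:", files_to_scan(clustered_files, 42))
```

In this model the clustered layout needs a single file for the value 42, while the scattered layout can need many more, because file-level statistics only help when similar values sit in the same files.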
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
A junior data engineer needs to create a Spark SQL table my_table for which Spark
manages both the data and the metadata. The metadata and data should also be stored in
the Databricks Filesystem (DBFS).
Which of the following commands should a senior data engineer share with the junior data
engineer to complete this task?
A. CREATE TABLE my_table (id STRING, value STRING) USING
org.apache.spark.sql.parquet OPTIONS (PATH "storage-path");
B. CREATE MANAGED TABLE my_table (id STRING, value STRING) USING
org.apache.spark.sql.parquet OPTIONS (PATH "storage-path");
C. CREATE MANAGED TABLE my_table (id STRING, value STRING);
D. CREATE TABLE my_table (id STRING, value STRING) USING DBFS;
E. CREATE TABLE my_table (id STRING, value STRING);
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
A data engineer wants to create a relational object by pulling data from two tables. The
relational object must be used by other data engineers in other sessions. In order to save on
storage costs, the data engineer wants to avoid copying and storing physical data.
Which of the following relational objects should the data engineer create?
A. View
B. Temporary view
C. Delta Table
D. Database
E. Spark SQL Table
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
A junior data engineer has ingested a JSON file into a table raw_table with the following
schema:
cart_id STRING,
items ARRAY<item_id:STRING>
The junior data engineer would like to unnest the items column in raw_table to result in a
new table with the following schema:
cart_id STRING,
item_id STRING
Which of the following commands should the junior data engineer run to complete this
task?
A. SELECT cart_id, filter(items) AS item_id FROM raw_table;
B. SELECT cart_id, flatten(items) AS item_id FROM raw_table;
C. SELECT cart_id, reduce(items) AS item_id FROM raw_table;
D. SELECT cart_id, explode(items) AS item_id FROM raw_table;
E. SELECT cart_id, slice(items) AS item_id FROM raw_table;
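The target transformation, one output row per array element, can be sketched in plain Python. This is only a model of the row-level behavior, not Spark code: `raw_table` here is a hypothetical list of dictionaries with made-up values, and `unnest` is an illustrative helper name. Spark SQL's `explode` generator follows the same one-row-per-element pattern and produces no rows for an empty array.

```python
# Plain-Python stand-in for raw_table; the rows are made up for illustration.
raw_table = [
    {"cart_id": "c1", "items": ["i1", "i2"]},
    {"cart_id": "c2", "items": ["i3"]},
    {"cart_id": "c3", "items": []},
]


def unnest(rows):
    """Emit one (cart_id, item_id) row per element of the items array."""
    return [
        {"cart_id": row["cart_id"], "item_id": item}
        for row in rows
        for item in row["items"]
    ]


unnested = unnest(raw_table)
for row in unnested:
    print(row)
```

The cart with an empty `items` array contributes no output rows, so the result has three rows for the three items.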
6.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
A data engineer has ingested a JSON file into a table raw_table with the following schema:
transaction_id STRING,
payload ARRAY<customer_id:STRING, date:TIMESTAMP, store_id:STRING>
The data engineer wants to efficiently extract the date of each transaction into a table with
the following schema:
transaction_id STRING,
date TIMESTAMP
Which of the following commands should the data engineer run to complete this task?
A. SELECT transaction_id, explode(payload) FROM raw_table;
B. SELECT transaction_id, payload.date FROM raw_table;
C. SELECT transaction_id, date FROM raw_table;
D. SELECT transaction_id, payload[date] FROM raw_table;
E. SELECT transaction_id, date from payload FROM raw_table;
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
A data analyst has provided a data engineering team with the following Spark SQL query:
SELECT district,
avg(sales)
FROM store_sales_20220101
GROUP BY district;
The data analyst would like the data engineering team to run this query every day. The date
at the end of the table name (20220101) should automatically be replaced with the current
date each time the query is run.
Which of the following approaches could be used by the data engineering team to
efficiently automate this process?
A. They could wrap the query using PySpark and use Python’s string variable system to
automatically update the table name.
B. They could manually replace the date within the table name with the current day’s
date.
C. They could request that the data analyst rewrites the query to be run less frequently.
D. They could replace the string-formatted date in the table with a
timestamp-formatted date.
E. They could pass the table into PySpark and develop a robustly tested module on the
existing query.
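One way to picture the date substitution described in this question is to build the query text in Python. The sketch below is a minimal illustration under assumptions: the function name `daily_sales_query` is hypothetical, the date suffix is assumed to be in YYYYMMDD form as in `store_sales_20220101`, and running the string against a cluster (for example via `spark.sql`) is left out so the block stays self-contained.

```python
from datetime import date


def daily_sales_query(run_date: date) -> str:
    """Build the analyst's query for the table suffixed with run_date (YYYYMMDD)."""
    table_name = f"store_sales_{run_date:%Y%m%d}"
    return (
        "SELECT district, avg(sales) "
        f"FROM {table_name} "
        "GROUP BY district"
    )


# Query for the current day; in a scheduled job this string would be
# handed to the SQL engine on each run.
query = daily_sales_query(date.today())

# Fixed date shown for a reproducible example.
print(daily_sales_query(date(2022, 1, 1)))
# prints: SELECT district, avg(sales) FROM store_sales_20220101 GROUP BY district
```

Because the table name is derived from the run date, the same code can be scheduled daily without manual edits.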