PySpark Quiz Round

Authored by Ankita Chatterjee

11 questions

1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Which of the following is a transformation operation in PySpark?

count()

filter()

reduce()

collect()

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Which of the following is true of an RDD?

An RDD is a programming paradigm

RDD in Apache Spark is an immutable collection of objects

It is a database

None of the above

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the output of the following code?

words_list = sc.parallelize(["pyspark", "quiz", "questions", "at", "quiz.com"])

filtered_words = words_list.filter(lambda x: 'quiz' in x)

matched_words = filtered_words.collect()

print(matched_words)

['quiz', 'quiz.com']

['quiz']

['quiz.com']

Error
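
The snippet in question 3 needs a running SparkContext (`sc`). As a rough way to check the answer without a cluster, the same predicate can be applied to a plain Python list. This is only an analogue of the RDD code: the list comprehension is assumed to stand in for `filter` followed by `collect`.

```python
# Plain-Python analogue of question 3; no Spark required.
# Assumption: the list comprehension stands in for RDD.filter() + collect().
words = ["pyspark", "quiz", "questions", "at", "quiz.com"]

# Same predicate as in the question: keep strings that contain 'quiz'.
contains_quiz = lambda x: 'quiz' in x

matched_words = [w for w in words if contains_quiz(w)]
print(matched_words)
```

Running it prints the list that the quiz code would be expected to collect, which can be compared against the options above.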

4.

MULTIPLE CHOICE QUESTION

30 sec • 2 pts

Consider a data frame "df". What does the pattern '[.]{2,}' match in the following transformation?

df = df.withColumn('var_addrss', sf.regexp_replace('var_addrss', '[.]{2,}', ''))

A single dot (".") followed by 2 integers

A single dot (".") followed by the integer '2'

A dot (".") appearing two or more times consecutively

None of these
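
One way to explore the pattern in question 4 without Spark is Python's `re` module. Spark's `regexp_replace` uses Java regular expressions, but for a simple pattern like '[.]{2,}' the two engines should behave the same. The helper name `strip_dot_runs` and the sample strings are made up for illustration.

```python
import re

# Hypothetical helper mirroring regexp_replace(col, '[.]{2,}', '').
def strip_dot_runs(value: str) -> str:
    # [.] is a literal dot; {2,} means "two or more in a row".
    return re.sub(r'[.]{2,}', '', value)

# Illustrative inputs, not taken from the quiz.
for sample in ["12 Main St.", "Apt.. 4", "Suite...9", "a.b.c"]:
    print(repr(sample), '->', repr(strip_dot_runs(sample)))
```

Comparing inputs with isolated dots against inputs with runs of dots shows which of the options describes the pattern.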

5.

MULTIPLE CHOICE QUESTION

30 sec • 2 pts

Consider a data frame "df". What does the pattern '^[0]*' match in the following transformation?

df = df.withColumn('var_addrss', sf.regexp_replace('var_addrss', '^[0]*', ''))

The value starts with a single 0 or with a sequence of 0s

The value starts with 0 and ends with 0

The value starts with 0 and is followed by a sequence of 0s

The value starts with anything other than 0
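
The pattern in question 5 can be tried the same way with Python's `re` module, again assuming that this simple pattern behaves identically under Spark's Java regex engine. The helper name `strip_leading_zeros` and the sample values are illustrative only.

```python
import re

# Hypothetical helper mirroring regexp_replace(col, '^[0]*', '').
def strip_leading_zeros(value: str) -> str:
    # ^ anchors at the start of the string; [0]* is zero or more '0' characters.
    return re.sub(r'^[0]*', '', value)

# Illustrative inputs, not taken from the quiz.
for sample in ["00120", "120", "0x0", "000"]:
    print(repr(sample), '->', repr(strip_leading_zeros(sample)))
```

Note that `*` also allows an empty match, so values without a leading 0 pass through unchanged rather than failing to match.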

6.

MULTIPLE SELECT QUESTION

45 sec • 1 pt

Assume we have a data frame "df" with an 'age' column.

Which of the following display the data frame sorted by the 'age' column in descending order?

display(df.orderBy(df.age.desc()))

display(df.sort(df.age.desc()))

display(df.orderBy(df.age, sort = desc()))

None of these

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What will be the data types of the columns of the following PySpark data frame "df"?

df = spark.read.format("csv").option("header", "true").option("inferSchema", "false").option("delimiter", ",").load("/mnt/temp/test.csv")

Data types of columns will be int

Data types of columns will be read as per the data types defined in the file

Data types of all columns will be string

None of the above
