Deep Learning: Generative Models


Similar activities

ANS Sympathetic drugs · University · 15 Qs
Radiações · University · 15 Qs
Points Angles Lines Planes · 10th Grade - University · 14 Qs
Geometry Content · 11th Grade - University · 15 Qs
Segments Angles · 11th Grade - University · 15 Qs
TestyVAEs3 · University · 10 Qs
[IMV] Future Edge, Generative AI · University · 10 Qs
Rumus-rumus Trigonometri 1 · 12th Grade - University · 10 Qs

Deep Learning: Generative Models

Assessment · Quiz

Mathematics, Science, Computers · University · Hard

Created by Josiah Wang · Used 36+ times

10 questions


1.

MULTIPLE SELECT QUESTION

15 mins • 1 pt

Which of the following statements justify the Maximum Likelihood approach?

It returns a model that assigns high probability to observed data

It minimises the KL divergence KL[p_data || p_model]

It minimises the KL divergence KL[p_model || p_data]

It minimises the reconstruction error of the data

Answer explanation

The likelihood function is defined as "the likelihood of the model parameters, as an explanation of how the data were generated", so MLE corresponds to finding the "best explanation", i.e. a model that assigns high probability to the observed data.

Regarding the two KL options, refer to the definition of KL divergence: maximising the likelihood is equivalent to minimising KL[p_data || p_model], not KL[p_model || p_data] (the divergence is not symmetric).

Maximum Likelihood minimises the reconstruction error only if the model likelihood itself describes a reconstruction process (think of the cross-entropy loss).
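For reference, the textbook identity behind the two KL options (not part of the original explanation):

KL[p_data || p_model] = E_{x ~ p_data}[log p_data(x)] - E_{x ~ p_data}[log p_model(x)]

The first term does not depend on the model parameters, so maximising the expected log-likelihood E_{x ~ p_data}[log p_model(x)] is exactly minimising KL[p_data || p_model]; the reverse divergence KL[p_model || p_data] is a different objective.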

2.

MULTIPLE SELECT QUESTION

15 mins • 1 pt

Which of the following statements, when combined together, explain why we cannot train VAEs using Maximum Likelihood Estimation?

The decoder is parameterised by a neural network so it is highly non-linear

The latent variable is continuous

MLE requires evaluating the marginal distribution on data

There are too many datapoints in the dataset

Answer explanation

MLE requires evaluating the marginal likelihood

p(x) = ∫ p(x|z) p(z) dz

to marginalise out the latent variable. This integral is intractable because p(x|z) is parameterised by a highly non-linear neural network, so it has no closed form.

The option "The latent variable is continuous" is not sufficient on its own: consider probabilistic PCA, where the latent variable is Gaussian and the "decoder" is linear, yet the marginal likelihood is tractable.

Regarding the option about there being too many datapoints: that concerns intractability due to large-scale data, not the intractability of the marginal likelihood on each individual datapoint.
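To make the intractability concrete, here is a minimal NumPy sketch (the toy one-layer "decoder", the dimensions, and the sample count are all invented for illustration) of the naive Monte Carlo estimator one would need in place of a closed-form marginal; for a deep decoder and realistic dimensions this estimator is hopeless, which is why VAEs resort to a variational bound instead.

import numpy as np

# Naive Monte Carlo estimate of the marginal likelihood that MLE would need:
#   p(x) = ∫ p(x|z) p(z) dz  ≈  (1/S) Σ_s p(x|z_s),   z_s ~ p(z).
# With a non-linear "decoder" there is no closed form, and as the latent and
# data dimensions grow, almost all prior samples give negligible p(x|z_s),
# so the estimator needs impractically many samples.

rng = np.random.default_rng(0)

def decoder_mean(z, W, b):
    # stand-in for a neural-network decoder (a single tanh layer)
    return np.tanh(z @ W) + b

def log_p_x_given_z(x, z, W, b, sigma=0.1):
    # Gaussian observation model p(x|z) = N(x; decoder_mean(z), sigma^2 I)
    mu = decoder_mean(z, W, b)
    d = x.shape[-1]
    return (-0.5 * np.sum((x - mu) ** 2, axis=-1) / sigma ** 2
            - 0.5 * d * np.log(2 * np.pi * sigma ** 2))

def naive_log_marginal(x, W, b, num_samples=10_000, latent_dim=2):
    z = rng.standard_normal((num_samples, latent_dim))   # z_s ~ p(z) = N(0, I)
    log_p = log_p_x_given_z(x, z, W, b)                  # log p(x|z_s)
    # log of the Monte Carlo average, computed stably
    return np.logaddexp.reduce(log_p) - np.log(num_samples)

W, b = rng.standard_normal((2, 5)), rng.standard_normal(5)
x = decoder_mean(rng.standard_normal(2), W, b)           # a point the decoder can produce
print(naive_log_marginal(x, W, b))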

3.

MULTIPLE SELECT QUESTION

15 mins • 1 pt

Which of the following statements are true for the VAE objective?

It is a lower-bound to the maximum likelihood objective

The gap between the VAE objective and the maximum likelihood objective is KL[p(z)||q(z|x)]

The KL term can always be viewed as a regulariser for the VAE encoder

The optimum of the VAE decoder is also the MLE optimum

Answer explanation

The gap between the VAE and Maximum Likelihood objectives is not KL[p(z) || q(z|x)]; it is KL[q(z|x) || p(z|x)], the divergence between the approximate and the true posterior (check the definitions).


The KL term acts as a regulariser when the prior is fixed with no learnable parameters. If the prior is learnable, it can be pulled towards the q distribution, so the regularisation effect is unclear.


The optimum of the VAE decoder equals the MLE optimum only if q is the true posterior, so the correctness of this statement depends on the form of q.
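For reference, the standard decomposition behind the first and last points (not part of the original explanation): for any approximate posterior q(z|x),

log p(x) = E_{q(z|x)}[log p(x|z)] - KL[q(z|x) || p(z)] + KL[q(z|x) || p(z|x)]
         = ELBO(x) + KL[q(z|x) || p(z|x)]

so the VAE objective (the ELBO) is indeed a lower bound on log p(x), and the gap is KL[q(z|x) || p(z|x)] ≥ 0, which vanishes exactly when q matches the true posterior; this is also why the decoder optimum coincides with the MLE optimum only in that case.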

4.

MULTIPLE SELECT QUESTION

15 mins • 1 pt

In the famous "Chinese room" Turing test example, a man sits inside a room doing English-to-Chinese translation, and volunteers outside the room are asked to guess, based on the translation results, whether the man in the room understands Chinese or not. You are one of the volunteers. You know the man is English, so a priori you assume he does not understand Chinese with probability 0.8. Now, given that the translation result is correct, how would you guess whether he understands Chinese or not?

I'm sure he definitely understands Chinese

He probably doesn’t understand Chinese (with probability 0.8)

Give me more info about the correct translation rates for those who only speak English

Give me more info about the correct translation rates for those who speak both English and Chinese

Answer explanation

The goal of this question is to guide students to think about Bayes’ optimal classifier. This requires information about p(translation is correct | the man only speaks English) and p(translation is correct | the man speaks both English and Chinese).
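As a worked illustration of the Bayes computation (the two conditional rates below are invented purely for the example; the quiz deliberately does not supply them): suppose p(correct translation | only English) = 0.1 and p(correct translation | English and Chinese) = 0.9. With the prior p(only English) = 0.8,

p(only English | correct) = 0.8 × 0.1 / (0.8 × 0.1 + 0.2 × 0.9) = 0.08 / 0.26 ≈ 0.31

so the prior of 0.8 alone does not settle the question; the posterior swings either way depending on the two conditional rates, which is why asking for those rates is the sensible response.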

5.

MULTIPLE CHOICE QUESTION

15 mins • 1 pt

Which best represents the reparameterisation trick?

y = μ + σε, where ε ~ N(0, I)

y ~ N(μ, σ)

y ~ N(E(x), ε)

None of the above

Answer explanation

You cannot backpropagate through a stochastic node. The reparameterisation trick lets you emulate sampling from a distribution while keeping the main computational graph (μ and σ) deterministic and therefore differentiable.
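A minimal PyTorch sketch of the trick (the shapes and the dummy downstream loss are arbitrary, chosen only for illustration):

import torch

# Reparameterisation: instead of sampling y ~ N(mu, sigma^2) directly (a
# stochastic node that blocks gradients), sample eps ~ N(0, I) and compute
# y = mu + sigma * eps, so gradients flow through mu and sigma.
mu = torch.randn(4, 2, requires_grad=True)         # stand-in for an encoder mean
log_sigma = torch.randn(4, 2, requires_grad=True)  # stand-in for an encoder log-std

eps = torch.randn_like(mu)             # noise, independent of the parameters
y = mu + torch.exp(log_sigma) * eps    # reparameterised sample

loss = (y ** 2).mean()                 # any downstream loss
loss.backward()                        # gradients reach mu and log_sigma
print(mu.grad.shape, log_sigma.grad.shape)

Treating the sample itself as a raw draw from N(μ, σ), as in the second option, gives gradients no path back to μ and σ, which is exactly the problem the trick avoids.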

6.

MULTIPLE SELECT QUESTION

15 mins • 1 pt

Which of the following statements are true for the encoder in a Variational Autoencoder?

It is an approximation function which outputs likely latent representations for a given input.

It is equivalent to the true posterior

It is an approximation of the true posterior

It is still required during the generation process

Answer explanation

VAEs are latent variable models, in that they use a latent variable z to describe the generation process. In order to calculate p_model(x), rather than integrating over all values of z (which is intractable), the encoder is introduced as an approximate posterior that narrows down the latent space and suggests likely latent codes for a given x.
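A minimal PyTorch sketch of these roles (the tiny architecture and dimensions are made up for illustration): the encoder parameterises the approximate posterior q(z|x), while generation samples z from the prior and uses only the decoder.

import torch
from torch import nn

latent_dim, data_dim = 2, 8

# Encoder: maps x to the parameters of q(z|x), an approximation of the
# (intractable) true posterior p(z|x).
encoder = nn.Sequential(nn.Linear(data_dim, 16), nn.ReLU(),
                        nn.Linear(16, 2 * latent_dim))
# Decoder: maps a latent code z to (the parameters of) p(x|z).
decoder = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(),
                        nn.Linear(16, data_dim))

x = torch.randn(5, data_dim)
mu, log_var = encoder(x).chunk(2, dim=-1)   # q(z|x) = N(mu, diag(exp(log_var)))

# Generation: sample z from the prior p(z) = N(0, I) and decode.
# The encoder plays no role at this stage.
z = torch.randn(5, latent_dim)
x_generated = decoder(z)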

7.

MULTIPLE CHOICE QUESTION

15 mins • 1 pt

[Image: plots of the two candidate generator loss curves]

Heuristically, which of the two plots is the better loss for the Generator in a Generative Adversarial Network?

-log(D(G(z)))

log(1 - D(G(z)))

Answer explanation

This is heuristically motivated. Maximising the probability that the discriminator makes a mistake, rather than minimising the probability that the discriminator is correct, keeps the derivatives of the generator's loss with respect to the discriminator's logits large even when the discriminator easily rejects the generator's samples.
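For reference, the usual calculation behind this heuristic, written in terms of the discriminator output a = D(G(z)) = σ(l) with logit l (not part of the original explanation): early in training the discriminator wins easily, so a ≈ 0, and

d/dl [log(1 - σ(l))] = -σ(l) = -a → 0 as a → 0
d/dl [-log σ(l)] = σ(l) - 1 = a - 1 → -1 as a → 0

so the saturating loss log(1 - D(G(z))) gives the generator vanishing gradients exactly when its samples are easily rejected, while the non-saturating loss -log(D(G(z))) keeps a gradient of roughly constant size.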
