NVIDIA-Certified Associate AI Infrastructure and Operations

Authored by Edgar Cruz

Computers

Vocational training


251 questions

1.

MULTIPLE SELECT QUESTION

30 mins • 1 pt

In your AI data center, you need to ensure continuous performance and reliability across all operations. Which two strategies are most critical for effective monitoring? (Select two)

A. Conducting weekly performance reviews without real-time monitoring

B. Using manual logs to track system performance daily

C. Disabling non-essential monitoring to reduce system overhead

D. Deploying a comprehensive monitoring system that includes real-time metrics on CPU, GPU, and memory usage

E. Implementing predictive maintenance based on historical hardware performance data

2.

MULTIPLE CHOICE QUESTION

30 mins • 1 pt

A financial institution is deploying two different machine learning models to predict credit defaults. The models are evaluated using Mean Squared Error (MSE) as the primary metric. Model A has an MSE of 0.015, while Model B has an MSE of 0.027. Additionally, the institution is considering the complexity and interpretability of the models. Given this information, which model should be preferred and why?

A. Model A should be preferred because it has a more complex architecture, leading to better long-term performance.

B. Model B should be preferred because it has a higher MSE, indicating it is less likely to overfit.

C. Model A should be preferred because it is more interpretable than Model B.

D. Model A should be preferred because it has a lower MSE, indicating better performance.
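
Study note: the question compares two MSE values without showing how the metric is computed. The sketch below works through the calculation on a tiny made-up data set. The labels, the predictions, and the names `pred_a` and `pred_b` are illustrative assumptions; they are not the institution's data and will not reproduce the 0.015 and 0.027 figures quoted above.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels (1 = default) and predicted default probabilities.
y_true = [0, 1, 0, 1]
pred_a = [0.1, 0.9, 0.2, 0.8]  # assumed outputs of a first model
pred_b = [0.3, 0.6, 0.4, 0.7]  # assumed outputs of a second model

mse_a = mean_squared_error(y_true, pred_a)
mse_b = mean_squared_error(y_true, pred_b)

# The model whose predictions sit closer to the labels gets the smaller MSE.
print(f"MSE A = {mse_a:.3f}, MSE B = {mse_b:.3f}")
```

Because each error is squared before averaging, a lower MSE means the predictions were, on average, closer to the observed outcomes on the evaluation set.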

3.

MULTIPLE CHOICE QUESTION

30 mins • 1 pt

You are designing a data center platform for a large-scale AI deployment that must handle unpredictable spikes in demand for both training and inference workloads. The goal is to ensure that the platform can scale efficiently without significant downtime or performance degradation. Which strategy would best achieve this goal?

A. Deploy a fixed number of high-performance GPU servers with auto-scaling based on CPU usage.

B. Implement a round-robin scheduling policy across all servers to distribute workloads evenly.

C. Migrate all workloads to a single, large cloud instance with multiple GPUs to handle peak loads.

D. Use a hybrid cloud model with on-premises GPUs for steady workloads and cloud GPUs for scaling during demand spikes.

4.

MULTIPLE CHOICE QUESTION

30 mins • 1 pt

Your organization runs multiple AI workloads on a shared NVIDIA GPU cluster. Some workloads are more critical than others. Recently, you've noticed that less critical workloads are consuming more GPU resources, affecting the performance of critical workloads. What is the best approach to ensure that critical workloads have priority access to GPU resources?

A. Implement GPU Quotas with Kubernetes Resource Management

B. Use CPU-based Inference for Less Critical Workloads

C. Upgrade the GPUs in the Cluster to More Powerful Models

D. Implement Model Optimization Techniques
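
Study note: option A refers to GPU quotas in Kubernetes. As a rough sketch of what such a quota can look like, the snippet below builds a `ResourceQuota` manifest that caps how many GPUs a namespace may request. It assumes the cluster exposes GPUs as the extended resource `nvidia.com/gpu` (as the NVIDIA device plugin does); the namespace name `batch-experiments`, the quota name `gpu-quota`, and the limit of 2 are hypothetical values chosen for illustration.

```python
import json


def gpu_quota_manifest(namespace, max_gpus):
    """Build a ResourceQuota limiting total GPU requests in one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        # "gpu-quota" is a hypothetical object name.
        "metadata": {"name": "gpu-quota", "namespace": namespace},
        # Extended resources are limited through the "requests." prefix.
        "spec": {"hard": {"requests.nvidia.com/gpu": str(max_gpus)}},
    }


# Hypothetical namespace that holds the less critical workloads.
manifest = gpu_quota_manifest("batch-experiments", 2)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`. A quota of this kind only caps a namespace's total GPU requests, so GPUs outside the cap remain available to workloads in other namespaces.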

5.

MULTIPLE CHOICE QUESTION

30 mins • 1 pt

Your AI team notices that the training jobs on your NVIDIA GPU cluster are taking longer than expected. Upon investigation, you suspect underutilization of the GPUs. Which monitoring metric is the most critical to determine if the GPUs are being underutilized?

A. GPU Utilization Percentage

B. Memory Bandwidth Utilization

C. Network Latency

D. CPU Utilization
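
Study note: per-GPU utilization can be read on the command line with `nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits`. The sketch below parses output in that shape and flags GPUs below a threshold. The sample rows in `SAMPLE` and the 30% threshold are made-up assumptions for illustration, not measurements from a real cluster.

```python
import csv
import io

# Hypothetical output: index, GPU utilization %, memory used MiB, memory total MiB.
SAMPLE = (
    "0, 97, 38012, 40960\n"
    "1, 12, 3521, 40960\n"
    "2, 88, 36900, 40960\n"
    "3, 5, 1024, 40960\n"
)


def underutilized_gpus(csv_text, threshold_pct=30):
    """Return the indices of GPUs whose utilization is below threshold_pct."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [int(row[0]) for row in rows if row and int(row[1]) < threshold_pct]


print(underutilized_gpus(SAMPLE))
```

A single sample can mislead, since utilization fluctuates during a training step, so in practice the query would be repeated at intervals and averaged.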

6.

MULTIPLE SELECT QUESTION

30 mins • 1 pt

A large enterprise is deploying a high-performance AI infrastructure to accelerate its machine learning workflows. They are using multiple NVIDIA GPUs in a distributed environment. To optimize the workload distribution and maximize GPU utilization, which of the following tools or frameworks should be integrated into their system? (Select two)

A. NVIDIA CUDA

B. NVIDIA NGC (NVIDIA GPU Cloud)

C. TensorFlow Serving

D. NVIDIA NCCL (NVIDIA Collective Communications Library)

E. Keras

7.

MULTIPLE CHOICE QUESTION

30 mins • 1 pt

Your AI training jobs are consistently taking longer than expected to complete on your GPU cluster, despite having optimized your model and code. Upon investigation, you notice that some GPUs are significantly underutilized. What could be the most likely cause of this issue?

A. Insufficient power supply to the GPUs

B. Inefficient data pipeline causing bottlenecks

C. Inadequate cooling leading to thermal throttling

D. Outdated GPU drivers
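
Study note: to see how a slow input pipeline can leave a GPU idle, consider a simplified steady-state model of one training step. The function below is a back-of-the-envelope sketch, not a profiler: it assumes each step needs a fixed data-loading time and a fixed GPU compute time, and that loading either blocks the GPU or is fully overlapped with compute through prefetching. All timings are hypothetical.

```python
def gpu_busy_fraction(load_s, compute_s, overlapped):
    """Fraction of each training step during which the GPU is computing."""
    if overlapped:
        # With prefetching, the slower of the two stages sets the step time.
        step_s = max(load_s, compute_s)
    else:
        # Without overlap, the GPU waits for every batch to finish loading.
        step_s = load_s + compute_s
    return compute_s / step_s


# Hypothetical timings in seconds per batch.
serial = gpu_busy_fraction(load_s=0.30, compute_s=0.10, overlapped=False)
prefetch = gpu_busy_fraction(load_s=0.30, compute_s=0.10, overlapped=True)
fast_io = gpu_busy_fraction(load_s=0.08, compute_s=0.10, overlapped=True)

print(f"serial: {serial:.0%}, prefetch: {prefetch:.0%}, faster loading: {fast_io:.0%}")
```

In this model the GPU stays fully busy only when loading a batch takes no longer than computing on it; otherwise prefetching helps but the loader still limits throughput.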
