
NVIDIA-Certified Associate AI Infrastructure and Operations
Authored by Edgar Cruz
Computers
Vocational training
251 questions
1.
MULTIPLE SELECT QUESTION
30 mins • 1 pt
In your AI data center, you need to ensure continuous performance and reliability across all
operations. Which two strategies are most critical for effective monitoring? (Select two)
A. Conducting weekly performance reviews without real-time monitoring
B. Using manual logs to track system performance daily
C. Disabling non-essential monitoring to reduce system overhead
D. Deploying a comprehensive monitoring system that includes real-time metrics on CPU,
GPU, and memory usage
E. Implementing predictive maintenance based on historical hardware performance data
2.
MULTIPLE CHOICE QUESTION
30 mins • 1 pt
A financial institution is deploying two different machine learning models to predict credit
defaults. The models are evaluated using Mean Squared Error (MSE) as the primary metric.
Model A has an MSE of 0.015, while Model B has an MSE of 0.027. Additionally, the
institution is considering the complexity and interpretability of the models. Given this
information, which model should be preferred and why?
A. Model A should be preferred because it has a more complex architecture, leading to better
long-term performance.
B. Model B should be preferred because it has a higher MSE, indicating it is less likely to
overfit.
C. Model A should be preferred because it is more interpretable than Model B.
D. Model A should be preferred because it has a lower MSE, indicating better performance.
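For reference, the Mean Squared Error named in this question is the average of the squared differences between predicted and actual values. A minimal sketch in Python, assuming made-up labels and predictions (the helper name `mean_squared_error` and the sample data are illustrative, not taken from the question):

```python
def mean_squared_error(y_true, y_pred):
    """Return the mean of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical default labels (1 = default) and predicted probabilities.
labels = [0.0, 1.0, 0.0, 1.0]
preds = [0.1, 0.9, 0.2, 0.8]

mse = mean_squared_error(labels, preds)
print(f"MSE = {mse:.3f}")
```

The metric is computed on held-out data in practice; it measures prediction error only and says nothing by itself about a model's complexity or interpretability.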
3.
MULTIPLE CHOICE QUESTION
30 mins • 1 pt
You are designing a data center platform for a large-scale AI deployment that must handle
unpredictable spikes in demand for both training and inference workloads. The goal is to
ensure that the platform can scale efficiently without significant downtime or performance
degradation. Which strategy would best achieve this goal?
A. Deploy a fixed number of high-performance GPU servers with auto-scaling based on CPU
usage.
B. Implement a round-robin scheduling policy across all servers to distribute workloads
evenly.
C. Migrate all workloads to a single, large cloud instance with multiple GPUs to handle peak
loads.
D. Use a hybrid cloud model with on-premises GPUs for steady workloads and cloud GPUs
for scaling during demand spikes.
4.
MULTIPLE CHOICE QUESTION
30 mins • 1 pt
Your organization runs multiple AI workloads on a shared NVIDIA GPU cluster. Some
workloads are more critical than others. Recently, you've noticed that less critical workloads
are consuming more GPU resources, affecting the performance of critical workloads. What is
the best approach to ensure that critical workloads have priority access to GPU resources?
A. Implement GPU Quotas with Kubernetes Resource Management
B. Use CPU-based Inference for Less Critical Workloads
C. Upgrade the GPUs in the Cluster to More Powerful Models
D. Implement Model Optimization Techniques
5.
MULTIPLE CHOICE QUESTION
30 mins • 1 pt
Your AI team notices that the training jobs on your NVIDIA GPU cluster are taking longer
than expected. Upon investigation, you suspect underutilization of the GPUs. Which monitoring
metric is the most critical to determine if the GPUs are being underutilized?
A. GPU Utilization Percentage
B. Memory Bandwidth Utilization
C. Network Latency
D. CPU Utilization
6.
MULTIPLE SELECT QUESTION
30 mins • 1 pt
A large enterprise is deploying a high-performance AI infrastructure to accelerate its machine
learning workflows. They are using multiple NVIDIA GPUs in a distributed environment. To
optimize the workload distribution and maximize GPU utilization, which of the following tools
or frameworks should be integrated into their system? (Select two)
A. NVIDIA CUDA
B. NVIDIA NGC (NVIDIA GPU Cloud)
C. TensorFlow Serving
D. NVIDIA NCCL (NVIDIA Collective Communications Library)
E. Keras
7.
MULTIPLE CHOICE QUESTION
30 mins • 1 pt
Your AI training jobs are consistently taking longer than expected to complete on your GPU
cluster, despite having optimized your model and code. Upon investigation, you notice that
some GPUs are significantly underutilized. What could be the most likely cause of this issue?
A. Insufficient power supply to the GPUs
B. Inefficient data pipeline causing bottlenecks
C. Inadequate cooling leading to thermal throttling
D. Outdated GPU drivers