A. All datasets will be updated once and the pipeline will shut down. The compute
resources will be terminated.
B. All datasets will be updated at set intervals until the pipeline is shut down. The
compute resources will be deployed for the update and terminated when the
pipeline is stopped.
C. All datasets will be updated at set intervals until the pipeline is shut down. The
compute resources will persist after the pipeline is stopped to allow for additional
testing.
D. All datasets will be updated once and the pipeline will shut down. The compute
resources will persist to allow for additional testing.
E. All datasets will be updated continuously and the pipeline will not shut down. The
compute resources will persist with the pipeline.
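The options above contrast triggered and continuous pipeline modes. A minimal sketch of the relevant Delta Live Tables setting, expressed here as Python dicts standing in for the pipeline-settings JSON (pipeline names are hypothetical; the `continuous` flag is the real DLT setting):

```python
# Triggered mode: each run updates every dataset once, then the pipeline
# shuts down and its compute resources are terminated.
triggered_pipeline = {
    "name": "example_triggered_pipeline",  # hypothetical name
    "continuous": False,
}

# Continuous mode: the pipeline keeps updating datasets as new data arrives
# and does not shut down, so its compute resources persist with it.
continuous_pipeline = {
    "name": "example_continuous_pipeline",  # hypothetical name
    "continuous": True,
}
```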
Question 37
A data engineer has a Job with multiple tasks that runs nightly. One of the tasks
fails unexpectedly in about 10 percent of the runs.
Which of the following actions can the data engineer perform to ensure the Job completes
each night while minimizing compute costs?
A. They can institute a retry policy for the entire Job
B. They can observe the task as it runs to try and determine why it is failing
C. They can set up the Job to run multiple times, ensuring that at least one run will
complete
D. They can institute a retry policy for the task that periodically fails
E. They can utilize a Jobs cluster for each of the tasks in the Job
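A task-level retry policy re-runs only the failing task rather than the whole Job, which is why it is the cost-efficient fix. A minimal sketch of a Jobs API 2.1 payload, assuming hypothetical job, task, and notebook names (`max_retries` and `min_retry_interval_millis` are the real task-level retry fields):

```python
# Sketch: retry policy attached to the single flaky task, not the whole Job.
job_settings = {
    "name": "nightly_job",  # hypothetical
    "tasks": [
        {
            "task_key": "stable_task",
            "notebook_task": {"notebook_path": "/Jobs/stable"},  # hypothetical path
        },
        {
            "task_key": "flaky_task",
            "depends_on": [{"task_key": "stable_task"}],
            "notebook_task": {"notebook_path": "/Jobs/flaky"},  # hypothetical path
            "max_retries": 2,                     # retry only this task on failure
            "min_retry_interval_millis": 60_000,  # wait a minute between attempts
        },
    ],
}
```

Because the retry settings live on the task, a transient failure re-executes only that task's compute work; the already-successful tasks are not re-run.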
Question 38
A data engineer has set up two Jobs that each run nightly. The first Job starts at 12:00 AM,
and it usually completes in about 20 minutes. The second Job depends on the first Job, and
it starts at 12:30 AM. Sometimes, the second Job fails when the first Job does not complete
by 12:30 AM.
Which of the following approaches can the data engineer use to avoid this problem?
A. They can utilize multiple tasks in a single job with a linear dependency
B. They can use cluster pools to help the Jobs run more efficiently
C. They can set up a retry policy on the first Job to help it run more quickly
D. They can limit the size of the output in the second Job so that it will not fail as easily
E. They can set up the data to stream from the first Job to the second Job
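Combining the two Jobs into one Job with a linear task dependency removes the timing guesswork: the second task starts only when the first finishes successfully, however long that takes. A minimal sketch of such a Jobs API 2.1 payload, with hypothetical job, task, and notebook names (`depends_on`, `schedule`, and `quartz_cron_expression` are the real fields):

```python
# Sketch: one nightly Job whose second task depends on the first.
job_settings = {
    "name": "nightly_pipeline",  # hypothetical
    "schedule": {
        "quartz_cron_expression": "0 0 0 * * ?",  # 12:00 AM every night
        "timezone_id": "UTC",
    },
    "tasks": [
        {
            "task_key": "first_task",
            "notebook_task": {"notebook_path": "/Jobs/first"},  # hypothetical path
        },
        {
            "task_key": "second_task",
            # Runs only after first_task succeeds, whether that takes
            # 20 minutes or 40 -- no fixed 12:30 AM start time needed.
            "depends_on": [{"task_key": "first_task"}],
            "notebook_task": {"notebook_path": "/Jobs/second"},  # hypothetical path
        },
    ],
}
```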