Domain 4 Answer Key

Network Condition Alert Rules

4.1 Question 1

Which of the following metrics can be used to configure an alert rule for Endpoint Agent HTTP Server tests? (Choose two)

A) Response Time
B) BGP Reachability
C) Error Type
D) Interface Throughput

Explanation

The correct answers are A) Response Time and C) Error Type, both of which are valid alert rule metrics for Endpoint Agent HTTP Server tests. BGP Reachability and Interface Throughput are not applicable to this test type. Response Time measures time to first byte, while Error Type allows alerting on specific HTTP errors.

4.1 Question 2

The alert shown in the exhibit is designed to detect which of the following network security issues?

A) Route poisoning
B) DNS poisoning
C) BGP hijacking
D) DNS hijacking

4.1 Question 3

Refer to the exhibit. The alert rule is set up as shown, but didn't trigger. Why?

A) Alert conditions weren't met and won't trigger with current setup
B) Alert needs two consecutive agent failures to trigger
C) Response code is set up incorrectly
D) All of the above

Explanation

The correct answer is A) Alert conditions weren't met and won't trigger with current setup. The condition requires 2 agents to generate an error, and the current setup does not satisfy it.

Incorrect Options:

B) The alert triggers as soon as the condition is met for a single test round; two consecutive failures are not required.
C) The response code is correctly configured and matches the HTTP 401 code shown in the exhibit.
D) Does not apply, since options B and C are incorrect.

4.1 Question 4

Refer to the exhibit. A network engineer is tasked with configuring an alert that will trigger if the HTTP server responds with a server error. What alert conditions should be configured to meet the specified requirements?

A) Error type is any
B) Wait Time is Dynamic (New) with Medium sensitivity
C) Response Time ≥ Static 500ms
D) Response Code is server error(5XX)

Explanation

The correct answer is D) Response Code is server error(5XX). This is the most specific and relevant condition for the given scenario.

Incorrect Options:

A) Error type is any: Too broad. It captures any error in the HTTP process (HTTP, Receive, Wait, SSL, etc.). While it includes HTTP server errors, it is not specific enough.
B) Wait Time is Dynamic (New) with Medium sensitivity: Wait Time (time to first byte) measures the duration between completing the request and receiving the first byte of the server's response. It is not related to server error codes.
C) Response Time ≥ Static 500ms: Response Time measures the overall request-response duration. It is not directly related to HTTP response codes.
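
The difference between a 5XX-only condition and a broader error condition can be sketched in Python. This is only an illustration of the status-code ranges: the function names are hypothetical and not part of any ThousandEyes API, and the broad check mirrors the "any error (>= 400 or no response)" wording quoted in 4.4 Question 1.

```python
from typing import Optional


def is_server_error(status: Optional[int]) -> bool:
    """Hypothetical helper: True only for 5XX responses."""
    return status is not None and 500 <= status <= 599


def is_any_error(status: Optional[int]) -> bool:
    """Hypothetical helper: True for status >= 400, or no response (None)."""
    return status is None or status >= 400


# None stands in for "no response received" (an assumption of this sketch).
for code in (200, 401, 404, 500, 503, None):
    print(code, is_server_error(code), is_any_error(code))
```

A 401 or 404 satisfies the broad check but not the 5XX check, which is why only the 5XX condition matches the "server error" requirement exactly.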

End-User Experience Alert Rules

4.2 Question 1

Refer to the exhibit. A network engineer is tasked with configuring an alert that will trigger if the Endpoint Agent path ASN changes on a specific hop. What is the alert type and condition needed to meet the requirement?

A) Scheduled tests, Hop#
B) Real User Test, Hop#
C) Scheduled tests, Any Hop
D) Real User Test, Path Length

Explanation

Option A) Scheduled tests, Hop# is correct because:

Scheduled tests are suitable for monitoring consistent changes in the network path, like an ASN change on a specific hop. Real User Tests are better suited for monitoring real-time user experience fluctuations.
Hop# is the appropriate condition as the requirement explicitly mentions monitoring a specific hop for the ASN change.

Incorrect Answers:

B) Real User Test, Hop#: Real User Tests are not the best choice for this scenario as they are designed for monitoring real-time user experience, not network path changes.
C) Scheduled tests, Any Hop: This option would trigger the alert if the ASN changes on any hop, not just a specific one.
D) Real User Test, Path Length: This option would trigger the alert if the total number of hops in the path changes, not if the ASN changes on a specific hop.
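
The three conditions compared above (Hop#, Any Hop, Path Length) can be contrasted with a small Python sketch. The path data and function names are made up for illustration and do not reflect how ThousandEyes evaluates alerts internally; each path is modeled simply as a list of ASNs, one per hop.

```python
from typing import List, Optional

# Sample paths (assumed data): the ASN differs at hop 3 only.
baseline = [64500, 64501, 64502, 64503]
current = [64500, 64501, 64510, 64503]


def asn_at(path: List[int], hop: int) -> Optional[int]:
    """Return the ASN at a 1-based hop number, or None if the hop is absent."""
    return path[hop - 1] if 1 <= hop <= len(path) else None


def asn_changed_at_hop(old: List[int], new: List[int], hop: int) -> bool:
    """Hop# style check: compare one specific hop."""
    return asn_at(old, hop) != asn_at(new, hop)


def asn_changed_at_any_hop(old: List[int], new: List[int]) -> bool:
    """Any Hop style check: true if any hop position differs."""
    longest = max(len(old), len(new))
    return any(asn_changed_at_hop(old, new, h) for h in range(1, longest + 1))


def path_length_changed(old: List[int], new: List[int]) -> bool:
    """Path Length style check: only the hop count matters."""
    return len(old) != len(new)


print(asn_changed_at_hop(baseline, current, 3))   # True
print(asn_changed_at_hop(baseline, current, 2))   # False
print(asn_changed_at_any_hop(baseline, current))  # True
print(path_length_changed(baseline, current))     # False
```

Only the hop-specific check isolates the change to the hop of interest; the path-length check misses it entirely because the number of hops is unchanged.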

4.2 Question 2

A company is noticing sporadic slowdowns in their web application performance, impacting user experience. They suspect it might be related to high CPU utilization on employee laptops, potentially caused by background processes. Which ThousandEyes alert type and condition combination would be most effective in identifying if endpoint CPU performance is contributing to this issue?

A) Real User Tests > Network Tests and Path Trace, End-to-End Packet Loss
B) Scheduled Tests > Endpoint Path Trace, Path length > #
C) Real User Tests > Endpoint, CPU utilization ≥ %
D) Scheduled Tests > Endpoint End-to-End (server), Memory load ≥ %

Explanation

Option C) Real User Tests > Endpoint, CPU utilization ≥ % is correct because:

This combination directly targets the user's device (Endpoint) where the suspected CPU issue resides.
It utilizes Real User Tests, which gather real-time performance data during user activity, providing the most accurate representation of the issue's impact.
It specifically monitors for CPU utilization exceeding a defined threshold (≥ %), allowing for alerts to be triggered when CPU usage reaches problematic levels.

Dashboard Configuration

The following sample questions require you to analyze data presented in two ThousandEyes dashboards that monitor the application service at https://thousandeyes.com:

Executive Dashboard: This dashboard provides a high-level overview of application performance.
IT Operations Dashboard: This dashboard offers granular insights for troubleshooting and performance optimization.

Refer to the data in these dashboards to answer the questions below.

4.3 Question 1

Which type of test are we using for these dashboards?

A) HTTP server
B) Page Load
C) Agent to server
D) FTP

Explanation

Observe the widgets on both dashboards to determine the test type.

4.3 Question 2

Which type of widgets were used in the executive dashboard? (Select all that apply)

Explanation

The executive dashboard uses the map and number widgets.

4.3 Question 3

Analyzing the IT operations dashboard, which agent has a better HTTP Connect Time?

A) San Jose CA (AT&T)
B) Mexico City Mexico (TelMex)

Explanation

The correct answer is B) Mexico City Mexico (TelMex). This agent displays a Connect Time of 0.51 ms.

4.3 Question 4

In the IT operations dashboard, what is the alert trigger reason?

A) Page Load Packet Loss
B) Network jitter
C) Network packet loss
D) Page Load Latency

Explanation

The alert rule is displayed at the beginning of the dashboard, indicating the trigger reason.

4.3 Question 5

In the executive dashboard, what is the page completion time for the Mexico City agent?

A) 100%
B) 83.4%
C) 15.2%
D) 99.67%

Explanation

Move your cursor over the map widget to the Mexico City agent to view the page completion time.

4.3 Question 6

In the executive dashboard, what is the total error count for ThousandEyes web page in the last 15 days?

A) 520
B) 1.58
C) 4610
D) 4805

Explanation

The number widget displays the total TE error count for the last 15 days.

4.3 Question 7

In the IT operations dashboard, while comparing the latest metrics, what is the time difference between Page Load time and DOM time?

A) 120.6 ms
B) 125.3 ms
C) 100 ms
D) 150.4 ms

Explanation

Hover over the latest data points for Page Load time and DOM Load time, then subtract the DOM Load time from the Page Load time: 1479.6 ms - 1354.3 ms = 125.3 ms, which is option B.
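
The subtraction can be checked with a few lines of Python. The two values are the ones quoted in the explanation; the variable names are illustrative only.

```python
# Latest values read from the dashboard, as quoted in the explanation.
page_load_ms = 1479.6
dom_load_ms = 1354.3

# Round to one decimal place to match the precision shown in the options.
difference_ms = round(page_load_ms - dom_load_ms, 1)
print(f"{difference_ms} ms")  # 125.3 ms
```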

4.3 Question 8

A network monitoring engineer is tasked with creating a widget that displays the average packet loss from an agent installed as a Linux package. What is the data source and measure that should be selected?

A) Endpoint Agents and Median
B) Cloud & Enterprise Agents and Mean
C) Routing and Standard Deviation
D) Devices and nth Percentile

Alert Configuration and Functionality

4.4 Question 1

An alert rule for a Web - HTTP Server test is not triggering when the HTTP response code is 500 Internal Server Error. The alert conditions are configured with "Response Code" set to "any error (>= 400 or no response)". What could be causing the alert to not fire?

A) The alert rule is disabled
B) The test is not enabled on any Enterprise Agents
C) The alert rule's "Settings" section does not have the correct test selected
D) The HTTP server is returning a 200 OK response code

Explanation

The correct answer is:

C) The alert rule's "Settings" section does not have the correct Web - HTTP Server test selected. This means the alert conditions are not being evaluated against the test's data, so the 500 errors are not triggering the alert.

Incorrect options:

A) A disabled alert rule would prevent alerts, but this would be obvious in the Alert Rules page.
B) If the test was not assigned to any agents, it would never generate data to trigger alerts. However, this would likely be noticed when viewing the test.
D) A 200 OK response would not trigger the ">= 400" alert condition, but the question states 500 errors are occurring, so this is not the issue.

4.4 Question 2

A CPU utilization alert for Endpoint Agents is triggering too frequently, creating alert noise. Which of the following steps would help reduce the sensitivity of the alert rule? (Select two)

A) Increase the number of agents that must exceed the CPU threshold to trigger the alert
B) Lower the CPU utilization percentage in the alert condition
C) Adjust the alert rule to require more rounds of data to exceed the threshold
D) Enable the alert rule on more Endpoint Agents

Explanation

To make a CPU utilization Endpoint Agent alert less sensitive and reduce noise, the correct options are:

A) Increasing the number/percentage of agents that must exceed the CPU threshold will prevent a single agent from triggering the alert.
C) Requiring more rounds of data to be above the threshold (e.g. 2 of 3 rounds instead of 1 of 1) will filter out brief CPU spikes.

The incorrect options that would not reduce alert sensitivity are:

B) Lowering the CPU utilization percentage would make the alert more sensitive and trigger more frequently.
D) Enabling the alert on more agents would potentially trigger it more often, not less.
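
The effect of options A and C can be illustrated with a simplified model of alert evaluation. This is a sketch under stated assumptions, not ThousandEyes's actual logic: the function names, the 90% threshold, and the sample CPU readings are all hypothetical.

```python
from typing import List


def round_violates(cpu_by_agent: List[float], threshold: float, min_agents: int) -> bool:
    """A round violates when at least min_agents agents meet or exceed the threshold."""
    return sum(1 for cpu in cpu_by_agent if cpu >= threshold) >= min_agents


def alert_fires(rounds: List[List[float]], threshold: float,
                min_agents: int, required: int, window: int) -> bool:
    """Fire when at least `required` of the last `window` rounds violate."""
    recent = rounds[-window:]
    hits = sum(round_violates(r, threshold, min_agents) for r in recent)
    return hits >= required


def ever_fires(rounds: List[List[float]], **rule) -> bool:
    """Evaluate the rule after each round and report whether it fired at any point."""
    return any(alert_fires(rounds[:i], **rule) for i in range(1, len(rounds) + 1))


# Three rounds of CPU % for three agents; one agent spikes briefly in round 2.
rounds = [[40, 45, 50], [95, 42, 48], [41, 44, 47]]

print(ever_fires(rounds, threshold=90, min_agents=1, required=1, window=1))  # True
print(ever_fires(rounds, threshold=90, min_agents=2, required=1, window=1))  # False
print(ever_fires(rounds, threshold=90, min_agents=1, required=2, window=3))  # False
```

With the 1-of-1, single-agent rule the brief spike fires an alert. Requiring two agents, or 2 of 3 rounds, suppresses it.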

Network Capacity Planning

4.5 Question 1

You're analyzing NetFlow data for a network supporting voice and video traffic. The data shows consistent spikes in delay and jitter during peak hours. Which optimization would you recommend?

A) Implement a complete QoS redesign
B) Increase bandwidth on all network links
C) Tune the existing QoS configuration to prioritize voice and video traffic
D) Replace all network hardware with newer models

Explanation

The correct answer is C) Tune the existing QoS configuration to prioritize voice and video traffic.

This option directly addresses the observed issues (delay and jitter spikes) during peak hours.
It aligns with the scope of the exam, which includes QoS tuning but not complete redesigns.
This solution is targeted and likely more cost-effective than other options.

Incorrect options:

A) A complete QoS redesign is out of scope for this exam and may be unnecessary.
B) Increasing bandwidth on all links is a costly solution that may not specifically address the voice and video traffic issues.
D) Replacing all network hardware is an extreme and costly solution that may not directly solve the problem.

4.5 Question 2

SNMP data indicates that a wireless access point is experiencing high channel utilization and increased retransmissions. What optimization would you recommend to improve voice call quality for users on this access point?

A) Increase the transmit power of the access point
B) Change the access point to a different, less congested channel
C) Disable all non-voice traffic on the wireless network
D) Implement strict admission control for all wireless clients

Explanation

The correct answer is B) Change the access point to a different, less congested channel.

This directly addresses the high channel utilization issue.
Reducing channel congestion can decrease retransmissions and improve overall voice call quality.
This solution is a targeted optimization based on the SNMP data provided.

Incorrect options:

A) Increasing transmit power may exacerbate interference issues and doesn't address channel congestion.
C) Disabling all non-voice traffic is an extreme measure that could negatively impact other necessary network functions.
D) Strict admission control for all clients doesn't specifically target the channel utilization issue and may be too restrictive.

4.5 Question 3

CLI outputs show that a router's egress queue for voice traffic is consistently full, leading to increased latency. Based on this data, which optimization would you recommend?

A) Increase the queue size for voice traffic
B) Implement traffic shaping on non-voice traffic
C) Disable QoS on the router to allow all traffic equal priority
D) Replace the router with a higher-capacity model

Explanation

The correct answer is B) Implement traffic shaping on non-voice traffic.

This solution addresses the root cause by managing non-voice traffic to prevent it from overwhelming the voice queue.
It's a targeted optimization that can reduce latency for voice traffic without major hardware changes.
This approach aligns with QoS tuning, which is within the scope of the exam.

Incorrect options:

A) Increasing queue size may delay packets further and doesn't address the underlying issue of queue saturation.
C) Disabling QoS would likely worsen the situation for voice traffic, which requires prioritization.
D) Replacing the router is an expensive solution that may not be necessary if the issue can be resolved through configuration changes.
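
As a rough illustration of what traffic shaping does to bursty non-voice traffic, the following Python sketch simulates a token-bucket shaper. The token bucket is a common shaping technique chosen here as an assumption; the question does not specify a mechanism, and the function name, rate, burst size, and arrival pattern are all hypothetical.

```python
from typing import List


def shape(arrivals: List[int], rate: int, burst: int) -> List[int]:
    """Token-bucket shaper: return bytes sent per tick.

    Excess traffic is held in a backlog (delayed), not dropped.
    """
    tokens = burst
    backlog = 0
    sent_per_tick = []
    for arriving in arrivals:
        tokens = min(burst, tokens + rate)  # refill, capped at the burst size
        backlog += arriving
        sent = min(backlog, tokens)         # send only what the tokens allow
        tokens -= sent
        backlog -= sent
        sent_per_tick.append(sent)
    return sent_per_tick


# A 500-byte burst followed by a 300-byte burst of non-voice traffic.
print(shape([500, 0, 0, 300, 0], rate=100, burst=200))
# [200, 100, 100, 100, 100]
```

The bursts are spread over several ticks at a bounded rate, which leaves link capacity available for voice packets in every tick.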

4.5 Question 4

The following exhibit shows the Capacity Planning results for a router interface connected to an ISP, which provides a 1Gbps connection. Based on the evidence, which action is most likely to fix the observed behavior?

A) Request a link increase from the ISP
B) Reconfigure maximum capacity for the interface
C) Restrict the Web Sites that can be visited from the site
D) Reconfigure business hours settings

Explanation

This is a tricky question: is there really a problem with the traffic being reported, or is something misconfigured on the platform?

Our exhibit shows that our highest consumption, although marked at 97%, is merely 48Mbps, certainly not enough to be making use of the entire 1Gbps connection from the ISP, so option A would be incorrect. Even though our top traffic is indeed HTTP, nothing in the exhibit indicates that pruning some specific HTTP traffic could fix how data is being presented, so option C is incorrect. The exhibit also fails to provide any reason as to how changing business hours could provide some benefit in this case, so option D is incorrect.

Finally, putting the available data together (the ISP connection is 1Gbps, yet capacity planning marks 48Mbps as 97% of maximum capacity), we can conclude that the maximum capacity for this interface is misconfigured. It should be set to 1Gbps instead of its current value, so option B is correct.
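
The reasoning can be verified with quick arithmetic. The sketch below uses only the figures quoted in the explanation (48Mbps, 97%, 1Gbps); the variable names are illustrative, and the implied capacity is derived from those figures rather than read from the exhibit.

```python
observed_mbps = 48.0          # peak consumption shown in the exhibit
reported_utilization = 0.97   # 97% as marked by capacity planning
isp_link_mbps = 1000.0        # 1 Gbps ISP connection

# Capacity the platform must be assuming for 48 Mbps to equal 97%.
implied_capacity_mbps = observed_mbps / reported_utilization

# Utilization if the interface capacity were set to the real link speed.
actual_utilization = observed_mbps / isp_link_mbps

print(f"Implied configured capacity: {implied_capacity_mbps:.1f} Mbps")  # 49.5 Mbps
print(f"Utilization against 1 Gbps: {actual_utilization:.1%}")           # 4.8%
```

The configured maximum works out to roughly 49.5 Mbps, far below the 1 Gbps link, while the true utilization is under 5%.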