Introducing Intelligent Waits for Spidering

We are excited to announce a technical upgrade to our crawling engine: Intelligent Waits for Spidering. This feature replaces our previous fixed-time delay with an adaptive waiting mechanism, improving performance and reliability on dynamic web content, a persistent challenge with modern web applications that render on the client side.

In the past, our crawler relied on a fixed wait time: a 1000 ms delay implemented via Thread.sleep in Java. This was a necessary workaround because the standard page load event that Selenium relies on is often unreliable in today's complex web environments.

While the page load event works well for traditional web apps, it falls short for modern sites, especially Single Page Applications (SPAs) and those with heavy dynamic content. The browser may register the page as "loaded" long before all the necessary JavaScript has executed, the API calls have completed, and the final content has been rendered. As a consequence, the spider could fail to discover important application functionality.
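
To make the trade-off concrete, here is a small illustrative model in Java. It is only a sketch: the class and method names are hypothetical, and the "settle time" (how long after the load event a page actually finishes rendering) is an assumed input, not something the old crawler could observe. It shows the two ways a fixed 1000 ms delay can go wrong: moving on too early, or idling on a page that was ready long ago.

```java
/** Hypothetical model of a fixed post-load delay; names and inputs are illustrative only. */
final class FixedWaitModel {
    /** The fixed delay described in the post. */
    static final long FIXED_DELAY_MS = 1000;

    /**
     * Returns true when the crawler would inspect the page before it has settled.
     * settleMs is the (assumed) time after the load event at which rendering finishes.
     */
    static boolean missesContent(long settleMs) {
        return settleMs > FIXED_DELAY_MS;
    }

    /** Milliseconds spent idle after the page had already settled (0 if it settled late). */
    static long wastedMs(long settleMs) {
        return Math.max(0, FIXED_DELAY_MS - settleMs);
    }
}
```

Under this model, a page that settles 200 ms after the load event wastes 800 ms of the fixed delay, while a page that settles at 1500 ms is inspected 500 ms too early.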

The Solution: Leveraging the WebDriver BiDi (Bidirectional) Protocol for True Readiness

Our new Intelligent Waits feature solves this problem with a data-driven approach. Instead of guessing with a fixed delay, we now use the WebDriver BiDi protocol to monitor the browser in real time.

Specifically, the crawler now dynamically determines the appropriate wait time by monitoring two critical signals:

  1. Network Activity: We track all outgoing requests and incoming responses to ensure no critical assets are still being fetched.
  2. DOM Mutation: We monitor changes to the Document Object Model (DOM) to confirm that the page structure is stable and that no significant new elements are being added or modified.

By monitoring network and DOM mutation events, the crawler proceeds only once both signals indicate that the page is fully rendered and interactive, so scans capture complete and accurate data from even the most dynamic web applications.
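
The two signals can be combined into a single "quiet window" check. The Java sketch below is one possible way to do that, not the crawler's actual implementation: the class name, the callback methods, the quiet-window length, and the hard timeout are all assumptions made for illustration. It also assumes something else (for example, BiDi network events and a mutation observer injected into the page) calls the `on...` methods as events arrive; that wiring is not shown.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch: the page counts as settled when no requests are in flight
 * and neither signal has fired for quietWindowMs. Timestamps are passed in by the caller.
 */
final class PageQuiescenceTracker {
    private final long quietWindowMs;                 // assumed tuning parameter
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicLong lastActivityMs;

    PageQuiescenceTracker(long quietWindowMs, long nowMs) {
        this.quietWindowMs = quietWindowMs;
        this.lastActivityMs = new AtomicLong(nowMs);
    }

    /** Signal 1: a request left the browser. */
    void onRequestStarted(long nowMs) {
        inFlight.incrementAndGet();
        touch(nowMs);
    }

    /** Signal 1: a response completed or the request failed. */
    void onRequestFinished(long nowMs) {
        // Clamp at zero in case a completion arrives without a matching start.
        inFlight.updateAndGet(n -> Math.max(0, n - 1));
        touch(nowMs);
    }

    /** Signal 2: the DOM changed. */
    void onDomMutation(long nowMs) {
        touch(nowMs);
    }

    /** True when nothing is in flight and the quiet window has elapsed. */
    boolean isSettled(long nowMs) {
        return inFlight.get() == 0 && nowMs - lastActivityMs.get() >= quietWindowMs;
    }

    /** Proceed when settled, or when an assumed hard cap is hit so the crawl cannot stall. */
    boolean shouldProceed(long nowMs, long startedMs, long maxWaitMs) {
        return isSettled(nowMs) || nowMs - startedMs >= maxWaitMs;
    }

    private void touch(long nowMs) {
        // Keep the latest timestamp even if events are delivered out of order.
        lastActivityMs.accumulateAndGet(nowMs, Math::max);
    }
}
```

The hard cap in `shouldProceed` is a design choice added for this sketch: pages that poll an API or stream updates indefinitely would otherwise never look quiet.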

Intelligent Waits diagram

Automated test suites for web applications, typically built on frameworks such as Selenium or Playwright, are commonplace. That raises a question: why does waiting for page elements remain a significant challenge for a web spider?

In automated test suites, Selenium or Playwright can wait for the visibility or interactability of a specific web element that is part of the test flow. A web spider, however, operates without such foresight: it crawls without prior knowledge of the elements that will render on a page. This blind operation is precisely what makes intelligent waiting a considerable technical hurdle.

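For instance, consider a product page on an e-commerce site where the "Add to Cart" button is only rendered after a slow API call fetches the real-time stock status. A test script can wait explicitly for that button because its author knows it will appear. A spider has no such expectation, so if it moves on at the page load event, it never discovers the button or the cart functionality behind it.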
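
The kind of wait a test author can write looks roughly like the Java sketch below, in the spirit of explicit waits such as Selenium's WebDriverWait. It is a simplified stand-in rather than any framework's API: `KnownConditionWait` and `waitUntil` are hypothetical names, and the condition is a plain `BooleanSupplier` so the example stays self-contained.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: poll a condition the caller already knows to look for. */
final class KnownConditionWait {
    /**
     * Polls until the condition holds or the timeout expires.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long startNs = System.nanoTime();
        long timeoutNs = timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - startNs >= timeoutNs) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A test could call `waitUntil(() -> addToCartIsVisible(), 10_000, 100)`, where `addToCartIsVisible` is a hypothetical check written by someone who knows the button exists. A spider has nothing to put in that lambda, which is why it has to infer readiness from generic signals instead.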

Key Benefits for web application scans

Intelligent Waits is now active and available for scans with longer durations (2 hours or more).
