Introducing Enhanced Crawl

Enhanced Crawl will drastically change how you work with Simply Static: it’s smarter, faster, and more reliable than our old crawling solution, and it’s easy to switch to.

We made a little promotional video, so if you’re more into visual content, there you go:

What is Enhanced Crawl?

Enhanced Crawl uses native WordPress functions to locate all pages and files relevant to your WordPress website:

  • Pages, Posts, and CPTs
  • Uploads, Plugins, Themes
  • Archives, Pagination, Taxonomies
  • REST API, RSS Feeds, XML Sitemaps

By using dedicated crawlers for each area of WordPress, we eliminate the guesswork when using Simply Static – “Is my page included?” is no longer an issue you have to deal with.

How to use it?

Easy – if you install Simply Static for the first time, it’s activated by default. To use it on an existing project, head over to Simply Static -> Settings -> General -> Enhanced Crawl and enable it.

How to fine-tune it?

We made the solution adaptable to your specific website and use case. All crawlers are enabled by default, but you can easily switch individual crawlers on or off until the combination best fits your project.

How to extend it?

We made the solution fully extensible, so if you feel the need to build your own crawler, go for it.

To add a custom crawler, you need to:

  1. Create a class that extends the Simply_Static\Crawler\Crawler abstract class and implements the required methods and properties
  2. Add your crawler to the list of crawlers using the simply_static_crawlers filter
  3. Make sure your crawler class is loaded before the filter is applied

Step 1: Create a Custom Crawler Class

Here’s an example of a custom crawler that detects URLs for a specific plugin:

<?php

namespace My_Plugin\Crawlers;

// Exit if accessed directly.
if ( ! defined( 'ABSPATH' ) ) {
    exit;
}

/**
 * Custom crawler for My Plugin
 */
class My_Custom_Crawler extends \Simply_Static\Crawler\Crawler {

    /**
     * Crawler ID.
     * @var string
     */
    protected $id = 'my-custom-crawler';

    /**
     * Constructor
     */
    public function __construct() {
        $this->name = __( 'My Custom Crawler', 'my-plugin' );
        $this->description = __( 'Detects URLs for My Plugin.', 'my-plugin' );
        
        // Optional: Set to false if you want this crawler to be disabled by default
        // $this->active_by_default = false;
    }

    /**
     * Detect URLs for this crawler type.
     *
     * @return array List of URLs
     */
    public function detect() : array {
        $urls = [];
        
        // Your custom logic to detect URLs.
        // For example, build absolute URLs from this site's home URL:
        $urls[] = home_url( '/my-plugin/page1' );
        $urls[] = home_url( '/my-plugin/page2' );
        
        return $urls;
    }
}
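
The example class returns a fixed list of URLs. In a real plugin, detect() would more likely build that list from content stored in WordPress. Here is a minimal sketch of one way to do it, assuming your plugin registers its own post type. The post type name my_plugin_item and the helper my_plugin_detect_item_urls() are made up for illustration; get_posts() and get_permalink() are standard WordPress functions, so this only runs inside WordPress.

```php
<?php
/**
 * Collect the permalinks of all published posts of one post type.
 *
 * 'my_plugin_item' is a hypothetical post type used for illustration only.
 *
 * @param string $post_type Post type to look up.
 * @return string[] List of absolute URLs without duplicates.
 */
function my_plugin_detect_item_urls( string $post_type = 'my_plugin_item' ): array {
    $post_ids = get_posts( [
        'post_type'      => $post_type,
        'post_status'    => 'publish',
        'posts_per_page' => -1,    // Fetch all matching posts.
        'fields'         => 'ids', // IDs only, to keep memory usage low.
    ] );

    $urls = [];
    foreach ( $post_ids as $post_id ) {
        $permalink = get_permalink( $post_id );

        // get_permalink() returns false if the post does not exist.
        if ( is_string( $permalink ) && '' !== $permalink ) {
            $urls[] = $permalink;
        }
    }

    return array_values( array_unique( $urls ) );
}
```

Inside the crawler class, detect() could then simply return \my_plugin_detect_item_urls().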

Step 2: Add Your Crawler to the List

Add your crawler to the list of crawlers using the simply_static_crawlers filter:

/**
 * Add custom crawler to Simply Static
 */
function add_my_custom_crawler( $crawlers ) {
    // Make sure the Simply Static plugin is active
    if ( class_exists( '\Simply_Static\Crawler\Crawler' ) ) {
        // Add your custom crawler to the list
        $crawlers[] = new \My_Plugin\Crawlers\My_Custom_Crawler();
    }
    
    return $crawlers;
}
add_filter( 'simply_static_crawlers', 'add_my_custom_crawler' );

Step 3: Load Your Crawler Class

Make sure your crawler class is loaded before the filter is applied. You can do this in your plugin’s main file:

/**
 * Load custom crawler class
 */
function load_my_custom_crawler() {
    // Only load if Simply Static is active
    if ( class_exists( '\Simply_Static\Crawler\Crawler' ) ) {
        require_once plugin_dir_path( __FILE__ ) . 'includes/crawlers/class-my-custom-crawler.php';
    }
}
add_action( 'plugins_loaded', 'load_my_custom_crawler' );
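
If your plugin ends up with several crawlers, an autoloader can take the place of individual require_once calls. The following is a rough sketch using PHP’s spl_autoload_register(), assuming the namespace and file layout from the examples above. The helper name my_plugin_crawler_file() is hypothetical, and __DIR__ is assumed to be your plugin’s main directory.

```php
<?php
/**
 * Map a class in the My_Plugin\Crawlers namespace to a WordPress-style
 * file name, e.g. My_Custom_Crawler becomes class-my-custom-crawler.php.
 *
 * @param string $class Fully qualified class name.
 * @return string|null File name, or null if the class is not one of ours.
 */
function my_plugin_crawler_file( string $class ): ?string {
    $prefix = 'My_Plugin\\Crawlers\\';

    if ( strpos( $class, $prefix ) !== 0 ) {
        return null; // Not in our namespace, let other autoloaders handle it.
    }

    $short_name = substr( $class, strlen( $prefix ) );

    return 'class-' . strtolower( str_replace( '_', '-', $short_name ) ) . '.php';
}

// Load crawler classes on first use instead of requiring each file by hand.
spl_autoload_register( function ( string $class ): void {
    $file = my_plugin_crawler_file( $class );
    if ( null === $file ) {
        return;
    }

    $path = __DIR__ . '/includes/crawlers/' . $file;
    if ( is_readable( $path ) ) {
        require_once $path;
    }
} );
```

With this approach the class file is only read when the filter callback calls new on your crawler. At that point Simply Static should already be loaded, since it is the one applying the filter.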

How It Works

When Simply Static runs the URL discovery task, it will:

  1. Load all built-in crawlers from the src/crawler directory
  2. Apply the simply_static_crawlers filter, allowing your plugin to add custom crawlers
  3. Get the active crawlers (based on user settings)
  4. Run each active crawler to discover URLs

Your custom crawler will appear in the list of available crawlers on the Simply Static settings page, so users can enable or disable it as needed.
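
To make the four steps more tangible, here is a simplified, standalone model of that flow in plain PHP. It is not Simply Static’s actual code: every class and function name prefixed with Sketch_ or sketch_ is invented for this illustration, the filter is modelled as a plain list of callbacks, and the de-duplication at the end is an assumption of the sketch.

```php
<?php
// Hypothetical stand-in for a crawler: an ID, a default state, and detect().
abstract class Sketch_Crawler {
    public $id = '';
    public $active_by_default = true;

    abstract public function detect(): array;
}

class Sketch_Post_Crawler extends Sketch_Crawler {
    public $id = 'post';

    public function detect(): array {
        return [ 'https://example.com/', 'https://example.com/hello-world/' ];
    }
}

class Sketch_Feed_Crawler extends Sketch_Crawler {
    public $id = 'feed';

    public function detect(): array {
        return [ 'https://example.com/feed/', 'https://example.com/' ];
    }
}

/**
 * @param Sketch_Crawler[] $built_in Step 1: crawlers shipped with the plugin.
 * @param callable[]       $filters  Step 2: callbacks hooked into the filter.
 * @param array            $settings Step 3: crawler ID => enabled (bool).
 * @return string[] Step 4: all discovered URLs, each listed once.
 */
function sketch_discover_urls( array $built_in, array $filters, array $settings ): array {
    $crawlers = $built_in;

    // Each callback receives the current list and returns the extended list.
    foreach ( $filters as $callback ) {
        $crawlers = $callback( $crawlers );
    }

    $urls = [];
    foreach ( $crawlers as $crawler ) {
        // A saved setting wins, otherwise fall back to the crawler's default.
        $enabled = $settings[ $crawler->id ] ?? $crawler->active_by_default;
        if ( ! $enabled ) {
            continue;
        }

        foreach ( $crawler->detect() as $url ) {
            $urls[ $url ] = true; // Array keys drop duplicate URLs.
        }
    }

    return array_keys( $urls );
}

// A plugin adds its own crawler through the (modelled) filter.
$add_feed_crawler = function ( array $crawlers ): array {
    $crawlers[] = new Sketch_Feed_Crawler();
    return $crawlers;
};

$urls = sketch_discover_urls( [ new Sketch_Post_Crawler() ], [ $add_feed_crawler ], [ 'feed' => true ] );
```

Passing [ 'feed' => false ] as the settings instead would skip the added crawler, which mirrors a user disabling it on the settings page.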

Best Practices

  1. Give your crawler a unique ID to avoid conflicts with other crawlers
  2. Provide a clear name and description so users understand what your crawler does
  3. Make your crawler efficient by only detecting URLs that are relevant to your plugin
  4. Consider setting active_by_default to false if your crawler is for a specific use case that not all users will need
  5. Use proper namespacing to avoid conflicts with other plugins
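
As a small illustration of point 3, a crawler can tidy up its own list before returning it. The helper below is only a sketch: the name my_plugin_clean_urls() is hypothetical, and whether Simply Static already filters or de-duplicates URLs later on is not something this example relies on. It keeps absolute http(s) URLs on the given host and drops everything else.

```php
<?php
/**
 * Keep only absolute http(s) URLs on the given host, without duplicates.
 *
 * @param array  $urls      Candidate URLs collected by a crawler.
 * @param string $site_host Host name of the site, e.g. 'example.com'.
 * @return string[] Cleaned list of URLs.
 */
function my_plugin_clean_urls( array $urls, string $site_host ): array {
    $clean = [];

    foreach ( $urls as $url ) {
        if ( ! is_string( $url ) ) {
            continue;
        }

        $url   = trim( $url );
        $parts = parse_url( $url );

        // Skip malformed and relative URLs.
        if ( false === $parts || empty( $parts['scheme'] ) || empty( $parts['host'] ) ) {
            continue;
        }

        // Skip anything that is not plain http or https.
        if ( ! in_array( strtolower( $parts['scheme'] ), [ 'http', 'https' ], true ) ) {
            continue;
        }

        // Skip URLs that point to another host.
        if ( strcasecmp( $parts['host'], $site_host ) !== 0 ) {
            continue;
        }

        $clean[ $url ] = true; // Array keys drop duplicates.
    }

    return array_keys( $clean );
}
```

In a crawler, the last line of detect() could then be return \my_plugin_clean_urls( $urls, 'example.com' ); with the host replaced by your own site’s host name.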

Conclusion

That’s a wrap, folks.

I hope you enjoyed our in-depth look at the new Enhanced Crawl feature.

It will be available in the 3.4 release of Simply Static. I also highly recommend updating to version 1.7 of Simply Static Pro, as we are building some accompanying features specifically for Enhanced Crawl.