Architecture · 10 min read

How I Indexed 1M+ Pages for Sub-100ms Search Without Elasticsearch

Everyone reaches for Elasticsearch when search gets hard. I didn't — and the custom FULLTEXT index I built handles 1M+ listings at under 100ms per query.

Everyone reaches for Elasticsearch the moment search gets complicated. Add a managed Solr cluster. Spin up Algolia. Throw money at the problem. I’ve watched teams add six-figure infrastructure costs to solve something MySQL was already capable of handling — if you’re willing to stop relying on WP_Query and think like a database engineer for an afternoon.

The Problem

The site was a national directory — tens of thousands of listings, each tied to a complex location taxonomy (country → state → city). The search feature was a basic WP_Query with an s parameter: WordPress’s built-in search, which runs LIKE '%keyword%' scans against wp_posts.

At 5,000 listings it was slow. At 50,000 it was painful. At scale — with over a million indexed pages across taxonomy archives and listing detail pages — it was completely unusable. Search requests were timing out, the database was locking up during peak traffic, and the server was sweating through every query.

The client’s first instinct: “Can we add Elasticsearch?” My first instinct: let me check if we actually need to.

The Approach

Before adding infrastructure, I wanted to understand exactly what the search needed to do:

  • Match listing titles against a keyword query
  • Optionally filter by location taxonomy
  • Return results fast enough to feel instant — sub-150ms at the API layer
  • Scale without requiring a separate search service to maintain

That’s a narrow problem. Elasticsearch solves it, but it also solves fifty other problems we didn’t have. What we actually needed was a fast, keyword-indexed table with the right query pattern.

MySQL’s FULLTEXT search in InnoDB has been production-grade since 5.6. Boolean mode gives you prefix matching, relevance ranking, and exclusion operators. The catch is that it doesn’t work well against wp_posts out of the box — not because of MySQL limitations, but because wp_posts is structured for general content storage, not optimised for this specific read pattern.
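To make those boolean-mode operators concrete, here’s a small helper of my own (the function name and exact operator handling are illustrative, not from the original setup) that turns a raw user keyword into a boolean-mode search term — required terms with prefix matching, with any FULLTEXT operators the user typed stripped out first so they can’t change the query’s meaning:

```php
<?php
// Sketch: normalise a raw user keyword into a BOOLEAN mode search term.
// "+" marks each word as required, "*" enables prefix matching, and any
// FULLTEXT operators typed by the user are stripped first.
function build_boolean_term(string $keyword): string {
    $clean = trim(preg_replace('/[+\-<>()~*"@]+/', ' ', $keyword));
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);
    $terms = array_map(fn($w) => '+' . $w . '*', $words);
    return implode(' ', $terms);
}
```

So a query like "plumber austin" becomes "+plumber* +austin*": every word required, each matched as a prefix.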

The decision: build a dedicated denormalized table, keep it in sync, and point search at it instead of wp_posts.

The Implementation

Step 1 — Design the search table

The table is purpose-built. It stores only what search needs: the listing ID, title, and taxonomy associations. One row per listing per taxonomy, so a single listing can be found across multiple location filters.

---

TL;DR

You don’t always need Elasticsearch, Algolia, or a managed search cluster to make WordPress search scale.

By treating MySQL like a real search engine and designing a purpose-built read model, you can get sub-100ms search on tens of thousands of listings using infrastructure you already have.

---

Core Idea

Stop asking wp_posts to do everything.

Instead of throwing WP_Query with s at a million-row table and hoping MySQL figures it out, build a dedicated, denormalized search table that matches your read pattern:

  • One row per listing per taxonomy
  • Only the fields search actually needs (ID, title, taxonomy, term)
  • FULLTEXT index on the title
  • Simple B-tree index for taxonomy filters

Then:

  • Populate it via WP-CLI in batches
  • Keep it fresh on save_post and with a periodic cron rebuild
  • Query it directly with FULLTEXT in BOOLEAN mode

No joins. No WP_Query. No extra services.

---

Why This Works

The original problem

  • National directory with tens of thousands of listings
  • Complex location taxonomy (country → state → city)
  • Default WordPress search (WP_Query + s) doing LIKE '%keyword%' on wp_posts
  • At 50k listings: 2–4 second queries, timeouts under load, DB lockups

The instinct was to add Elasticsearch. The reality: the search requirements were narrow:

  • Match listing titles by keyword
  • Optional filter by location taxonomy
  • Sub-150ms responses
  • No new infrastructure to maintain

MySQL’s InnoDB FULLTEXT search already solves this, as long as you:

  1. Give it a table shaped for search, not for general content
  2. Query it directly instead of going through WordPress’s generic abstractions

---

Implementation Overview

1. Purpose-built search table

  • wp_custom_search holds:
    • post_id
    • post_title
    • taxonomy
    • term_id
  • Indexes:
    • FULLTEXT on post_title for keyword search
    • (taxonomy, term_id) for fast filtering

This avoids scanning wp_posts and avoids runtime joins.

2. Index build via WP-CLI

  • Use a custom WP-CLI command to:
    • Fetch listing posts in batches (e.g. 500 at a time)
    • Call index_single_listing($post_id) per listing
  • Runs outside HTTP, so no timeouts
  • Scales to tens of thousands of posts in a few minutes
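The per-listing indexer that the CLI command calls isn’t shown here, so this is a minimal sketch of my own — it assumes the location taxonomy is registered as 'location' (adjust to your setup), and uses delete-then-insert so re-indexing stays idempotent:

```php
<?php
// Sketch of the per-listing indexer the WP-CLI command calls.
// Assumes the location taxonomy is registered as 'location'.
function index_single_listing(int $post_id): void {
    global $wpdb;
    $table = $wpdb->prefix . 'custom_search';
    $title = get_the_title($post_id);

    // Delete-then-insert keeps re-indexing idempotent.
    $wpdb->delete($table, ['post_id' => $post_id]);

    $terms = wp_get_post_terms($post_id, 'location');
    if (is_wp_error($terms) || empty($terms)) {
        // Still index the title so keyword search finds unclassified listings.
        $wpdb->insert($table, [
            'post_id'    => $post_id,
            'post_title' => $title,
            'taxonomy'   => '',
            'term_id'    => 0,
        ]);
        return;
    }

    // One row per listing per taxonomy term, as described above.
    foreach ($terms as $term) {
        $wpdb->insert($table, [
            'post_id'    => $post_id,
            'post_title' => $title,
            'taxonomy'   => $term->taxonomy,
            'term_id'    => $term->term_id,
        ]);
    }
}
```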

3. Fast search query

  • Direct $wpdb query against wp_custom_search
  • FULLTEXT in BOOLEAN mode with prefix matching:
    • +keyword* → required term, prefix match
  • Optional taxonomy filter via WHERE taxonomy = %s
  • LIMIT 50 to cap result set

No WP_Query, no LIKE '%...%', no table scans.

4. Keeping the index fresh

Two mechanisms:

  1. On write: save_post_listing hook
    • Upsert the listing into wp_custom_search whenever it’s saved
  2. On schedule: biweekly cron job
    • Full rebuild to catch bulk imports or anything that bypassed hooks

Together, they keep search results accurate without manual babysitting.
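Wiring both mechanisms up is straightforward. A sketch, assuming index_single_listing() exists as described above — the cron hook name is mine, and I schedule it 'daily' here since WordPress has no built-in biweekly interval (you’d register one via the cron_schedules filter):

```php
<?php
// Sketch: keep the index fresh. Assumes index_single_listing() exists.

// 1) Real-time: re-index a listing whenever it is saved.
add_action('save_post_listing', function (int $post_id): void {
    if (wp_is_post_revision($post_id) || wp_is_post_autosave($post_id)) {
        return; // skip revisions and autosaves
    }
    index_single_listing($post_id);
});

// 2) Scheduled: periodic full rebuild to catch anything that bypassed hooks.
add_action('listing_search_rebuild', function (): void {
    $ids = get_posts([
        'post_type'      => 'listing',
        'post_status'    => 'publish',
        'posts_per_page' => -1,
        'fields'         => 'ids',
    ]);
    foreach ($ids as $post_id) {
        index_single_listing($post_id);
    }
});

if (!wp_next_scheduled('listing_search_rebuild')) {
    // No built-in biweekly interval: register a custom one via the
    // cron_schedules filter, or use a built-in one like 'daily'.
    wp_schedule_event(time(), 'daily', 'listing_search_rebuild');
}
```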

5. AJAX endpoint

  • Simple admin-ajax.php handler:
    • Validate keyword length
    • Call search_listings()
    • Map IDs to formatted results

End-to-end latency stays under 100ms on a normal managed MySQL instance.
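A sketch of that handler — the action name, length limits, and response shape are illustrative choices of mine; the article only specifies the three steps:

```php
<?php
// Sketch of the admin-ajax.php handler: validate, search, format.
add_action('wp_ajax_listing_search', 'handle_listing_search');
add_action('wp_ajax_nopriv_listing_search', 'handle_listing_search');

function handle_listing_search(): void {
    $keyword  = sanitize_text_field(wp_unslash($_GET['keyword'] ?? ''));
    $taxonomy = sanitize_key($_GET['taxonomy'] ?? '');

    // 1) Validate keyword length before touching the database.
    if (mb_strlen($keyword) < 1 || mb_strlen($keyword) > 100) {
        wp_send_json_error(['message' => 'Invalid keyword'], 400);
    }

    // 2) Run the FULLTEXT-backed search.
    $post_ids = search_listings($keyword, $taxonomy);

    // 3) Map IDs to formatted results.
    $results = array_map(fn($id) => [
        'id'    => (int) $id,
        'title' => get_the_title($id),
        'url'   => get_permalink($id),
    ], $post_ids);

    wp_send_json_success($results);
}
```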

---

Gotchas & How to Handle Them

  1. innodb_ft_min_token_size (default 3)
    • Queries under 3 characters silently return nothing
    • Fix: for 1–2 character queries, fall back to LIKE on the small search table
  2. Index drift from bulk imports
    • Direct SQL imports bypass save_post
    • Fix: explicitly call index_single_listing() in import scripts and keep the cron rebuild
  3. FULLTEXT on tiny tables
    • InnoDB FULLTEXT can behave oddly on very small datasets
    • In staging with few rows, search may look broken
    • Fix: document this and seed enough data in non-prod
  4. Boolean mode ordering
    • BOOLEAN mode doesn’t auto-rank results
    • Fix: select MATCH(...) AGAINST(...) AS relevance_score and ORDER BY relevance_score DESC

---

Results

After switching from WP_Query + s to a dedicated FULLTEXT table:

  • Query time: 15–40ms consistently
  • AJAX response: <100ms including network
  • Infra cost: $0 additional
  • Index build: ~4 minutes for 50,000 listings via WP-CLI
  • Maintenance: one cron job + one save_post hook

All with the database that was already there.

---

Takeaways

  • Don’t reach for Elasticsearch just because search feels slow. First, ask if MySQL can do it with the right schema and query.
  • Denormalize for reads. wp_posts is a general-purpose write model; your search table should be a purpose-built read model.
  • Use WP-CLI for heavy lifting. Never build large indexes inside HTTP requests.
  • Handle short queries explicitly; FULLTEXT will silently ignore them.
  • Combine real-time hooks with periodic rebuilds to keep your index trustworthy.

Most “we need Elasticsearch” conversations are actually “we need to respect how MySQL wants to be queried.”

The search table schema — note the unique key, which the save_post upsert relies on to avoid duplicate rows:

```sql
CREATE TABLE wp_custom_search (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    post_id     BIGINT UNSIGNED NOT NULL,
    post_title  VARCHAR(255)    NOT NULL,
    taxonomy    VARCHAR(100)    NOT NULL,
    term_id     BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    -- Unique key so save_post upserts can't create duplicate rows
    UNIQUE KEY uniq_post_tax_term (post_id, taxonomy, term_id),
    FULLTEXT KEY ft_post_title (post_title),
    KEY idx_taxonomy_term (taxonomy, term_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```
The WP-CLI indexing command:

```php
<?php
if (defined('WP_CLI') && WP_CLI) {
    // Registers "wp listing_search index_listings", with an optional
    // --batch-size flag (default 500).
    WP_CLI::add_command('listing_search index_listings', function ($args, $assoc_args) {
        $batch_size = (int) ($assoc_args['batch-size'] ?? 500);
        $offset     = 0;

        do {
            // Fetch only IDs to keep each batch cheap.
            $posts = get_posts([
                'post_type'      => 'listing',
                'post_status'    => 'publish',
                'posts_per_page' => $batch_size,
                'offset'         => $offset,
                'fields'         => 'ids',
            ]);

            foreach ($posts as $post_id) {
                index_single_listing($post_id);
            }

            $offset += count($posts);
            WP_CLI::log("Indexed $offset posts...");

        } while (count($posts) === $batch_size);

        WP_CLI::success('Index complete.');
    });
}
```
The search query, including the short-keyword fallback (gotcha #1) and relevance ordering (gotcha #4). Note there is no esc_sql() on the keyword: $wpdb->prepare() already escapes %s placeholders, and double-escaping corrupts the search term.

```php
<?php
function search_listings(string $keyword, string $taxonomy = ''): array {
    global $wpdb;
    $table = $wpdb->prefix . 'custom_search';

    // Short keywords fall below innodb_ft_min_token_size, so fall back
    // to LIKE against the small search table (gotcha #1).
    if (strlen($keyword) < 3) {
        $sql = $wpdb->prepare(
            "SELECT DISTINCT post_id FROM $table WHERE post_title LIKE %s LIMIT 50",
            '%' . $wpdb->esc_like($keyword) . '%'
        );
        return wp_list_pluck($wpdb->get_results($sql), 'post_id');
    }

    // "+keyword*" → required term, prefix match. prepare() handles escaping.
    $search_term = '+' . $keyword . '*';

    // BOOLEAN mode doesn't rank by default, so select the relevance
    // score and order by it (gotcha #4).
    if ($taxonomy) {
        $sql = $wpdb->prepare(
            "SELECT DISTINCT post_id,
                    MATCH(post_title) AGAINST(%s IN BOOLEAN MODE) AS relevance_score
             FROM $table
             WHERE MATCH(post_title) AGAINST(%s IN BOOLEAN MODE)
               AND taxonomy = %s
             ORDER BY relevance_score DESC
             LIMIT 50",
            $search_term, $search_term, $taxonomy
        );
    } else {
        $sql = $wpdb->prepare(
            "SELECT DISTINCT post_id,
                    MATCH(post_title) AGAINST(%s IN BOOLEAN MODE) AS relevance_score
             FROM $table
             WHERE MATCH(post_title) AGAINST(%s IN BOOLEAN MODE)
             ORDER BY relevance_score DESC
             LIMIT 50",
            $search_term, $search_term
        );
    }

    return wp_list_pluck($wpdb->get_results($sql), 'post_id');
}
```
