The importance of feature flags and reducing the impact on user experience

For modern e-commerce web applications, competition is fierce. Users are saturated with content and options, and marketers are always looking for the next big thing that will give them an edge. This is where A/B testing and feature flags come in.

How to optimize mature applications

An e-commerce company moves through several stages, and development priorities change with each one:

  1. MVP - the focus is on fast development and fast results.
  2. Growth - with a more mature product, the focus shifts to app performance, stability, and scalability.
  3. Maturity - at this point, you likely have a well-performing application that is stable and scalable. Now the focus shifts to optimizing revenue.

This tutorial will focus on the Maturity stage. As companies reach a market share where growth becomes hard, they need to make careful decisions about how to best deploy their engineering efforts, avoiding features that would have low impact or, worse, drive customers away through added complexity or degraded performance.

We'll explore how a typical mature Next.js e-commerce application can leverage feature flags and A/B testing to optimize its revenue. We'll establish a few facts about our example application to frame the discussion:

  • The codebase is 4-5 years old with a mix of Pages and App Router.
  • Some marketing landing pages are converting well, but there is low consistency across the site.
  • The application scores well on Lighthouse metrics, but keeping it performant is a constant challenge due to increasing complexity and team size.

This is a common scenario for many companies. Margins are squeezed and the focus shifts from growth to optimizing revenue. This is where server-side, feature-flag-driven development comes in.

Server-side feature flag playbook

This playbook works in both the Pages Router and the App Router. Here is a high-level overview of the steps we will be taking:

  1. Setting up a feature flag provider.
  2. Using Edge Middleware to evaluate the flag and segment users before they reach the page.
  3. Directing the traffic to the appropriate version of the page.
  4. Analyzing the results and making data-driven decisions.

Setting up a feature flag provider

There are many providers to choose from, but for this example, we will be using a mock provider. This will allow us to focus on the feature flagging process itself and not get bogged down by the details of a specific provider.

// lib/mockFeatureFlags.ts
type User = {
  key: string;
  // Add other user attributes as needed.
};

class MockFeatureFlagClient {
  private flags: Record<string, boolean> = {
    "page-b-experiment": false,
  };

  async initialize(): Promise<void> {
    // Simulate initialization delay.
    await new Promise((resolve) => setTimeout(resolve, 100));
  }

  getVariation(flagKey: string, user: User, defaultValue: boolean): boolean {
    // Simple bucketing: even numeric user keys see variant B, odd keys see A.
    // Non-numeric keys (e.g. "anonymous") parse to NaN and fall back to A.
    if (flagKey === "page-b-experiment") {
      return parseInt(user.key, 10) % 2 === 0;
    }
    return this.flags[flagKey] ?? defaultValue;
  }

  track(eventName: string, data: Record<string, any>): void {
    console.log("Event tracked:", eventName, data);
  }
}

let client: MockFeatureFlagClient | null = null;

export async function initFeatureFlagClient(): Promise<MockFeatureFlagClient> {
  if (!client) {
    client = new MockFeatureFlagClient();
    await client.initialize();
  }
  return client;
}

This mock feature flag client mimics the common usage of feature flag providers like LaunchDarkly. These providers serve two main purposes:

  1. Serve the flag value to the client.
  2. Track the results of the flag.

The goal of running this experiment is to gradually test the performance and conversion rate of page B against page A. To measure the impact accurately, users must be assigned to one of the pages randomly across the population but consistently for each individual, and that assignment is exactly what the feature flag provider handles.
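
Our mock buckets users by the parity of a numeric key, which is fine for a demo but breaks down for arbitrary IDs. Real providers typically hash the user key together with the flag key, so assignment looks random across the population yet stays deterministic per user. Here is a rough sketch of that idea; the djb2-style hash and the isInVariantB helper are illustrative, not part of any provider's SDK.

// lib/bucketing.ts
// Illustrative deterministic bucketing, not a real provider API.
function hashString(input: string): number {
  // djb2-style hash; real providers use stronger hashes (e.g. murmur3).
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash * 33) ^ input.charCodeAt(i)) >>> 0;
  }
  return hash;
}

export function isInVariantB(flagKey: string, userKey: string): boolean {
  // The same user and flag always yield the same bucket.
  return hashString(`${flagKey}:${userKey}`) % 2 === 0;
}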

Let's see how we can segment users from Edge Middleware.

// middleware.ts
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";
import { initFeatureFlagClient } from "./lib/mockFeatureFlags";

export async function middleware(request: NextRequest) {
  const client = await initFeatureFlagClient();
  const user = {
    key: request.cookies.get("user_id")?.value || "anonymous",
  };

  const shouldShowPageB = client.getVariation("page-b-experiment", user, false);

  if (shouldShowPageB) {
    return NextResponse.rewrite(new URL("/page-b", request.url));
  }

  return NextResponse.rewrite(new URL("/page-a", request.url));
}

export const config = {
  matcher: "/experiment-page",
};
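
One caveat with the middleware above: it reads a user_id cookie but never sets one, so every first-time visitor falls back to the "anonymous" key and lands in the same bucket. A common pattern, sketched below under that assumption, is to mint a stable anonymous ID in the middleware itself so the assignment stays sticky across requests; the cookie name and one-year lifetime are illustrative choices.

// middleware.ts (extended sketch)
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";
import { initFeatureFlagClient } from "./lib/mockFeatureFlags";

export async function middleware(request: NextRequest) {
  const client = await initFeatureFlagClient();

  // Reuse the visitor's existing ID, or mint one on their first visit.
  const userId = request.cookies.get("user_id")?.value ?? crypto.randomUUID();

  const shouldShowPageB = client.getVariation("page-b-experiment", { key: userId }, false);

  const response = shouldShowPageB
    ? NextResponse.rewrite(new URL("/page-b", request.url))
    : NextResponse.rewrite(new URL("/page-a", request.url));

  // Persist the ID so the assignment stays sticky on subsequent requests.
  response.cookies.set("user_id", userId, { path: "/", maxAge: 60 * 60 * 24 * 365 });
  return response;
}

export const config = {
  matcher: "/experiment-page",
};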

The beauty of this approach is that it works with both the Pages Router and the App Router. If we were starting a brand-new project on the App Router, we could use React Server Components to do this at the component level. However, since we are dealing with existing static pages, we aim to keep the changes to a minimum. Experiments are typically short-lived and act as a data point on the performance of a feature or page. By duplicating the page and modifying just part of it, like the copy or the layout, we can run these experiments without impacting the performance of the application, since both of our variants remain static pages!
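
To make this concrete, here is a minimal sketch of what the two variants might look like in the Pages Router. The file names match the rewrites in the middleware; the copy itself is purely illustrative.

// pages/page-a.tsx - the control variant, statically generated by default.
export default function PageA() {
  return (
    <main>
      <h1>Summer Sale</h1>
      <p>Free shipping on all orders.</p>
    </main>
  );
}

// pages/page-b.tsx - identical structure; only the copy changes.
export default function PageB() {
  return (
    <main>
      <h1>Summer Sale Ends Soon</h1>
      <p>Free shipping on all orders. Today only.</p>
    </main>
  );
}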

Analyzing the results

Now that we have our two variants of the page, we can analyze the results to see which one performs better. We will be using the data from the feature flag provider to see which variant is being shown to which user.

A typical way to analyze results is by emitting events that track the performance. In an e-commerce application, this could be something like a purchase or a view of the basket. By tracking these events, we can see which variant is performing better.

// components/AddToCartButton.tsx
// In the App Router, this component would need the "use client" directive.
// getExperimentVariation and trackEvent are thin wrappers around the feature
// flag provider's SDK (reading the assigned variant and emitting events).
type Product = {
  id: string;
  name: string;
  price: number;
};

const AddToCartButton = ({ product, userId }: { product: Product; userId: string }) => {
  const handleAddToCart = () => {
    const experimentId = "page-b-experiment";
    const variation = getExperimentVariation(experimentId, userId);

    // Emit the add-to-cart event with experiment information attached.
    trackEvent("add-to-cart", {
      productId: product.id,
      productName: product.name,
      price: product.price,
      experimentId,
      experimentVariation: variation,
    });

    // You could also add the actual cart functionality here.
    alert(`Added ${product.name} to cart!`);
  };

  return (
    <button onClick={handleAddToCart}>
      Add to Cart - ${product.price.toFixed(2)}
    </button>
  );
};

This example ties each add-to-cart action to the experiment variant the user saw. We can then use this data to compare how the variants convert.
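
As a rough sketch of the downstream math, which in practice usually happens in the provider's dashboard rather than by hand, the conversion rate per variant is simply conversions divided by exposures. The ExperimentEvent shape below is an assumption mirroring the payload emitted above.

// analysis/conversion.ts
// Hypothetical offline aggregation of the tracked events.
type ExperimentEvent = {
  experimentVariation: string; // e.g. "page-a" or "page-b"
};

function conversionRate(
  exposures: ExperimentEvent[], // one event per user who saw a variant
  conversions: ExperimentEvent[], // one event per add-to-cart action
  variation: string
): number {
  const shown = exposures.filter((e) => e.experimentVariation === variation).length;
  const bought = conversions.filter((e) => e.experimentVariation === variation).length;
  return shown === 0 ? 0 : bought / shown;
}

// Example: a higher rate for "page-b" than "page-a" would favor rolling out variant B.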

Conclusion

With the results in, we can make a data-driven decision: either roll out the winning variant to all users, or keep the experiment running to gather more data.

While we've explored the process, the most important aspect of this experiment is that we tested without impacting performance. In most applications, even a 100ms delay in loading a page can hurt our SEO scores. Vercel uses the tagline "framework-defined infrastructure," and Edge Middleware is a great example of it. The App Router gives us an even broader toolset to work with, but even for mature applications, the Vercel platform gives you the tooling to be data-driven without sacrificing performance.