Sergio Xalambrí

Load only the data you need in Remix

If you are used to building SPAs against REST APIs, you have probably run into over-fetching. It is a common problem, usually solved either by creating custom endpoints (and so no longer using REST) or by using GraphQL, with all the work that takes: setting up a GraphQL endpoint with proper batching of data, and then loading a GraphQL client library like Relay or Apollo in the client.

So you normally have one of the following snippets.

If you are using React Query you may have something like this:

function UserList() {
  let { data, error, status } = useQuery("users", () => api.get("/api/users"));
  // use status, data and error here to render your component
}

If you are using Apollo or another GraphQL library you may have something like this:

function UserList() {
  let { data, error, loading } = useQuery(gql`
    query GetUserList {
      users {
        id
        name
      }
    }
  `);
  // use loading, data and error here to render your component
}

In the React Query case you fetched the whole list of users with all the data the generic /api/users endpoint returned, so you are overfetching, but React Query is super small so your JS is small.

In the GraphQL case you avoided the overfetch but included a probably huge JS library.

The best way to solve a problem is to eliminate it. Remix does this for us.

The Remix approach

In Remix we do our data fetching server-side instead of client-side. This gives us a lot of benefits: we can query our DB directly, we can use Redis for caching instead of an in-memory cache, so even if the user moves to another computer the request may still hit the cache, and we can read from the file system or do basically anything a server can do and a browser can't.

To do this, a route in Remix exports a function called loader, which receives the request and returns a response with the data that route needs.

export let loader: LoaderFunction = async ({ request }) => {
  // this should return the list of users
  let users = await getAllUsers();
  return json({ users });
};

export default function View() {
  let { users } = useRouteData();
  // use the data here to render the list of users
  // no need to handle loading states, the browser does that for us
}

This makes the request really simple, but we are still sending everything to the user because the loader is directly returning the whole list of users while we only need their ID and name. Let's fix that.

export let loader: LoaderFunction = async ({ request }) => {
  // this should return the list of users
  let users = await getAllUsers();
  // reduce the data to what we actually need in this route
  users = users.map((user) => {
    return { id: user.id, name: user.name };
  });
  // return the data as always
  return json({ users });
};

Doing this, we reduced the data of each user to only their ID and name. If we were using an ORM we might be able to query only the columns we need; if we are fetching our data from an API we may need this Array#map, since a REST API most likely will not let us fetch less data.
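The Array#map step can also be wrapped in a small typed helper, so each loader states which fields it exposes. This is only a sketch: the `User` type, its extra fields, and the `pick` and `toListItems` helpers are hypothetical and not part of Remix.

```typescript
// Hypothetical shape of a full user record; the extra fields are assumptions.
type User = {
  id: string;
  name: string;
  email: string;
  passwordHash: string;
};

// Copy only the listed keys from an object into a new one.
function pick<T extends object, K extends keyof T>(
  source: T,
  keys: readonly K[]
): Pick<T, K> {
  const result = {} as Pick<T, K>;
  for (const key of keys) {
    result[key] = source[key];
  }
  return result;
}

// Reduce every user to the fields the list route actually renders.
function toListItems(users: User[]): Pick<User, "id" | "name">[] {
  return users.map((user) => pick(user, ["id", "name"]));
}
```

Inside the loader this would replace the inline map with something like `users = toListItems(users)`.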

What if the API is already GraphQL? In that case you could be tempted to load Apollo and skip loaders, but then we would have to add a whole JS lib for something Remix already does (load data), and the page would not work without JS.

What we could do instead is run our GraphQL query directly in the loader and return the data.

export let loader: LoaderFunction = async ({ request }) => {
  // we run our query server side
  let users = await runQuery(gql`
    query GetUserList {
      users {
        id
        name
      }
    }
  `);
  // and return the data as always
  return json({ users });
};

And because we run our GraphQL query server-side we don't need to ship any JS code related to handling GraphQL requests, parsing queries or storing a cache of the resources.
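The loader above relies on a `runQuery` function that this post does not define. One possible shape is sketched below. It assumes a GraphQL server that accepts the usual JSON POST body (`query` and `variables`) and answers with `data` and `errors`. The `createRunQuery` factory, the `FetchLike` type and the endpoint URL are hypothetical, and this version takes the query as a plain string rather than a `gql`-tagged document.

```typescript
// Minimal subset of the fetch API this sketch depends on, so any compatible
// implementation (such as the global fetch, or a test double) can be passed in.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

type GraphQLResponse<T> = { data?: T; errors?: { message: string }[] };

// Build a runQuery function bound to one endpoint (the URL is an assumption).
function createRunQuery(endpoint: string, fetchFn: FetchLike) {
  return async function runQuery<T>(
    query: string,
    variables: Record<string, unknown> = {}
  ): Promise<T> {
    const response = await fetchFn(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query, variables }),
    });
    if (!response.ok) {
      throw new Error(`GraphQL request failed with status ${response.status}`);
    }
    const payload = (await response.json()) as GraphQLResponse<T>;
    // GraphQL servers usually report query errors in the body, not the status.
    if (payload.errors && payload.errors.length > 0) {
      throw new Error(payload.errors.map((error) => error.message).join("; "));
    }
    if (payload.data === undefined) {
      throw new Error("GraphQL response contained no data");
    }
    return payload.data;
  };
}
```

Since this only runs on the server, none of it ends up in the JS bundle sent to the browser.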

Caching in Remix

The tools we mentioned, React Query and Apollo or basically any GraphQL client, come with a client-side cache for our data. This allows them to avoid querying things they already have, so you can run the same query two or more times and only run one request.

In Remix we don't have that issue, because Remix knows everything about your routes, including the layout routes. You only need to return what each route needs, and on client-side navigation Remix will only fetch the data of the new route. This means that if we have a /users route with a list of users and a nested /users/:id route which keeps the list from /users on the left, we can fetch the list with reduced data in our routes/users route and fetch the detail of the user in the routes/users/$id route.

Every time the user clicks another user on the list, Remix will only fetch the detail of that user and not the list. If we submit a form doing a non-GET request, it will refetch the loaders of all the currently active routes, that is, the list and the detail. This is not a problem because our list only has a reduced amount of data (id and name) and not the whole user, as you may have had to load if you fetched the data client-side.
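To make that behaviour concrete, here is a deliberately simplified model of the decision. It is not Remix's actual implementation: `RouteMatch` and `loadersToRun` are hypothetical names, and the rule used (run a loader when its route is new or its matched URL changed, run all of them after a non-GET submission) is only my reading of the behaviour described here.

```typescript
// One matched route for a URL: the route file and the URL portion it matched.
type RouteMatch = { routeId: string; pathname: string };

// Decide which route loaders need to run when moving from `current` to `next`.
// After a non-GET form submission every active loader is called again.
function loadersToRun(
  current: RouteMatch[],
  next: RouteMatch[],
  afterMutation = false
): string[] {
  if (afterMutation) {
    return next.map((match) => match.routeId);
  }
  return next
    .filter((match) => {
      const previous = current.find((item) => item.routeId === match.routeId);
      // Run the loader for new routes and for routes whose matched URL changed.
      return previous === undefined || previous.pathname !== match.pathname;
    })
    .map((match) => match.routeId);
}

const onUser1: RouteMatch[] = [
  { routeId: "routes/users", pathname: "/users" },
  { routeId: "routes/users/$id", pathname: "/users/1" },
];
const onUser2: RouteMatch[] = [
  { routeId: "routes/users", pathname: "/users" },
  { routeId: "routes/users/$id", pathname: "/users/2" },
];

// Clicking another user only needs the detail loader:
// ["routes/users/$id"]
const onNavigation = loadersToRun(onUser1, onUser2);

// Submitting a form on /users/2 reloads both the list and the detail:
// ["routes/users", "routes/users/$id"]
const afterSubmit = loadersToRun(onUser2, onUser2, true);
```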

So the overfetching problem is eliminated thanks to Remix loaders, and the extra JS for data fetching libs is also eliminated because you fetch server-side in those loaders.

Okay, but I want to cache the data

If you know you have a performance issue because of those fetches, or if you have identified how your users actually use your app and found places where a cache would improve performance, you can do that server-side too.

export let loader: LoaderFunction = async ({ request }) => {
  // if we have the list of users cached return it
  if (await cache.has("users"))
    return json({ users: await cache.get("users") });
  // if we don't have it cached, fetch it
  let users = await getAllUsers();
  // reduce the data to what we actually need in this route
  users = users.map((user) => {
    return { id: user.id, name: user.name };
  });
  // store the list of users in the cache
  await cache.set("users", users);
  // return the data as always
  return json({ users });
};

This cache object could be an in-memory cache such as an LRU cache, or it could be Redis or anything else we can use to cache data.
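As an illustration of the in-memory option, here is a minimal LRU sketch exposing the same `has`, `get` and `set` methods the loader calls. The `LruCache` class and its `maxEntries` limit are assumptions made for this example. The methods are async only so the loader's `await` calls would keep working if the cache were later swapped for a remote one such as Redis.

```typescript
// Small LRU cache: when it is full, the least recently used entry is evicted.
class LruCache<V> {
  private entries = new Map<string, V>();
  private maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  async has(key: string): Promise<boolean> {
    return this.entries.has(key);
  }

  async get(key: string): Promise<V | undefined> {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  async set(key: string, value: V): Promise<void> {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // A Map iterates in insertion order, so its first key is the oldest.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}
```

With a remote cache, calling `has` and then `get` costs two round trips; a single `get` followed by a check for `undefined` would halve that.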

By caching this way the browser will still send a request to get the data but the response will be almost instantaneous thanks to our cache.

If you want to keep the browser from making the request at all, you could send a Cache-Control header in the response.

export let loader: LoaderFunction = async ({ request }) => {
  let users = await getAllUsers();
  users = users.map((user) => {
    return { id: user.id, name: user.name };
  });
  // return the data as always plus the cache control headers
  return json(
    { users },
    {
      headers: {
        "Cache-Control": "some cache control value to cache for a few minutes",
      },
    }
  );
};

Note that if you use Cache-Control to cache in the user's browser, you will have no way to invalidate that cache after an action runs, so the user will keep receiving cached data until it expires.

If you want to be able to purge the cache, you need to use Cache-Control to cache only in a CDN, and use a CDN that allows you to purge the cache manually by sending a request to an API, like Cloudflare or Fastly.
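The browser and CDN strategies differ only in the header value. As a sketch, a small helper could build both. The `cacheControl` helper and the five-minute durations are assumptions chosen for this example, not recommendations. In the header, `max-age` applies to the user's browser, while `s-maxage` applies to shared caches such as a CDN and takes precedence over `max-age` there.

```typescript
type CacheOptions = {
  // How long the user's browser may reuse the response, in seconds.
  browserSeconds: number;
  // How long a shared cache (a CDN) may reuse the response, in seconds.
  cdnSeconds?: number;
};

// Build a Cache-Control header value from the options above.
function cacheControl({ browserSeconds, cdnSeconds }: CacheOptions): string {
  const directives = ["public", `max-age=${browserSeconds}`];
  if (cdnSeconds !== undefined) {
    directives.push(`s-maxage=${cdnSeconds}`);
  }
  return directives.join(", ");
}

// Cache in the browser for five minutes (cannot be purged after an action):
// "public, max-age=300"
const browserCached = cacheControl({ browserSeconds: 300 });

// Cache only in the CDN: the browser asks again every time, while the CDN
// keeps the response for five minutes and can be purged through its API:
// "public, max-age=0, s-maxage=300"
const cdnOnly = cacheControl({ browserSeconds: 0, cdnSeconds: 300 });
```

Either string can be used as the "Cache-Control" value in the loader's `headers` object.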

A final option would be to set up a Service Worker, but be aware that an incorrectly configured SW can be really hard to update if it caches itself, and you may leave some of your users on an old version of your web app almost forever.