Student of Life

Paging Large Data Sets With A LazyList

By Roger Keays, 30 April 2007

As you might have noticed from reading my previous blogs, I'm a big fan of simple solutions to problems, especially where they reduce the amount of plumbing which needs to be implemented. For some time, my problem has been finding an efficient way to page through large data sets using JPA and JSF UIData components such as <h:dataTable/>.

I had read about a lot of complicated ways to solve the problem, and tried using OpenJPA's Large Result Set (LRS) extensions to JPA, but none of these solutions satisfied me in terms of efficiency and simplicity. What I came up with instead is a basic List implementation which uses the standard JPA API to load the results of a query on demand.

Here is an example of how the class might be used:

/**
 * Get all the items posted by the given user. The query is put into a
 * LazyList for efficient paging.
 */
public Collection getPagedItems(String user, int pageSize) {
    EntityManager em = FacesFunctions.evaluate("${em}", EntityManager.class);
    return new LazyList(
        // the alias "i" is declared in the FROM clause and used throughout
        em.createQuery("SELECT i FROM Items i WHERE i.author = :author")
                     .setParameter("author", user),
        pageSize,
        // the total is counted up front and handed to the list
        (Long) em.createQuery("SELECT COUNT(i) FROM Items i WHERE i.author = :author")
                     .setParameter("author", user)
                     .getSingleResult());
}

As shown above, when the list is instantiated, the total number of expected results must be provided. This is required because the Query interface doesn't have a function to calculate this value (e.g. by executing an equivalent COUNT(1) query).

Using the LazyList makes working with JSF components such as data tables more efficient, without having to build custom pager components or DataModels. Results from the query are cached in the list, which makes the list efficient when reused in the request scope, or when placed in the session or application scopes.

Here is the LazyList class, in its entirety:

import java.util.AbstractList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.persistence.Query;

/**
 * This is a list backed by a JPA Query, but only loading the results of the
 * query one page at a time. It loads one page *ahead* for each fetch miss,
 * so if you are iterating through query results backwards you will get poor
 * performance.
 */
public class LazyList extends AbstractList {
  
    /** backing query */
    Query query;
    
    /** cache of loaded items */
    Map<Integer, Object> loaded;

    /** total number of results expected */
    long numResults;
    
    /** number of results to fetch on cache miss */
    int pageSize;
    
    /** default constructor */
    public LazyList() {
        loaded = new HashMap<Integer, Object>();
    }
    
    /**
     * Create a LazyList backed by the given query, using pageSize results
     * per page, and expecting numResults from the query.
     */
    public LazyList(Query query, int pageSize, long numResults) {
        this();
        this.query = query;
        this.pageSize = pageSize;
        this.numResults = numResults;
    }
    
    /**
     * Fetch an item, loading it from the query results if it hasn't already
     * been loaded.
     */
    public Object get(int i) {
        // honour the List contract for indexes outside the expected range
        if (i < 0 || i >= size()) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + size());
        }
        if (!loaded.containsKey(i)) {
            // cache miss: load one page of results starting at index i
            List results = query.setFirstResult(i).setMaxResults(pageSize)
                    .getResultList();
            for (int j = 0; j < results.size(); j++) {
                loaded.put(i + j, results.get(j));
            }
        }
        return loaded.get(i);
    }

    /**
     * Return the total number of items in the list. This is the count passed
     * to the constructor or to setNumResults(); the list never runs a COUNT
     * query itself.
     */
    public int size() {
        return (int) numResults;
    }

    /** update the number of results expected in this list */
    public void setNumResults(long numResults) {
        this.numResults = numResults;
    }
}
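
The class comment warns that iterating backwards performs poorly, because every miss loads a page starting at the requested index. One possible way around that, shown here only as a sketch, is to load the whole page that contains the index instead. To keep the example runnable without a database, the JPA Query is replaced by a hypothetical PageSource interface, and PageAlignedLazyList, PageAlignedDemo and the fetches counter are illustrative names, not part of the original class. It assumes pageSize is greater than zero.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for query.setFirstResult(..).setMaxResults(..).getResultList(). */
interface PageSource<T> {
    List<T> fetch(int firstResult, int maxResults);
}

/** Variant of the lazy list that loads the whole page containing the requested index. */
class PageAlignedLazyList<T> extends AbstractList<T> {
    private final PageSource<T> source;
    private final int pageSize;
    private final int numResults;
    private final Map<Integer, T> loaded = new HashMap<>();
    int fetches = 0; // exposed only so the demo can count round trips

    PageAlignedLazyList(PageSource<T> source, int pageSize, int numResults) {
        this.source = source;
        this.pageSize = pageSize;
        this.numResults = numResults;
    }

    @Override
    public T get(int i) {
        if (i < 0 || i >= numResults) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + numResults);
        }
        if (!loaded.containsKey(i)) {
            // round down to the first index of the page that contains i
            int pageStart = (i / pageSize) * pageSize;
            List<T> results = source.fetch(pageStart, pageSize);
            fetches++;
            for (int j = 0; j < results.size(); j++) {
                loaded.put(pageStart + j, results.get(j));
            }
        }
        return loaded.get(i);
    }

    @Override
    public int size() {
        return numResults;
    }
}

class PageAlignedDemo {
    /** Build a list over an in-memory "table" of 100 rows, 10 rows per page. */
    static PageAlignedLazyList<Integer> newList() {
        List<Integer> rows = new ArrayList<>();
        for (int n = 0; n < 100; n++) {
            rows.add(n);
        }
        return new PageAlignedLazyList<Integer>(
            (first, max) -> new ArrayList<>(
                rows.subList(first, Math.min(first + max, rows.size()))),
            10, rows.size());
    }

    public static void main(String[] args) {
        PageAlignedLazyList<Integer> list = newList();
        for (int i = list.size() - 1; i >= 0; i--) {
            list.get(i);
        }
        // prints "fetches walking backwards: 10"
        System.out.println("fetches walking backwards: " + list.fetches);
    }
}
```

With 100 rows and a page size of 10, walking backwards costs 10 fetches here, whereas loading from the requested index would miss on every element. The trade-off is that a miss in the middle of a page re-reads rows before the requested index, which may be wasted work if you only ever move forwards from an arbitrary starting point.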

The LazyList class works nicely for me, although in the future I might look at addressing the following issues:

  • It can only be used with Query objects, and not ORM-mapped lists. OpenJPA's LRS can be used in this problem-space, but this isn't a part of the JPA standard.
  • Adding or deleting items from the list requires updating the number of results in the set. This isn't usually a problem when the list is in the request scope, since its instantiation will often occur after the update, but for session and application scopes it is counter-intuitive to have to do this manually.
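
For the second issue, one option is to let the list re-run the count itself and drop its cached pages in a single call. The following is only a sketch under some assumptions: RefreshableLazyList, refresh() and RefreshDemo are hypothetical names, and the JPA queries are replaced by a BiFunction (standing in for the paged query) and a LongSupplier (standing in for the COUNT query) so that the example runs without a database.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.LongSupplier;

/** Lazy list that can re-count and clear its cache after the data changes. */
class RefreshableLazyList<T> extends AbstractList<T> {
    /** (firstResult, maxResults) -> rows; stands in for the paged query */
    private final BiFunction<Integer, Integer, List<T>> pageFetcher;
    /** stands in for the COUNT query */
    private final LongSupplier counter;
    private final int pageSize;
    private final Map<Integer, T> loaded = new HashMap<>();
    private long numResults;

    RefreshableLazyList(BiFunction<Integer, Integer, List<T>> pageFetcher,
                        LongSupplier counter, int pageSize) {
        this.pageFetcher = pageFetcher;
        this.counter = counter;
        this.pageSize = pageSize;
        this.numResults = counter.getAsLong();
    }

    /** Call after adding or deleting rows: drops cached pages and re-counts. */
    public void refresh() {
        loaded.clear();
        numResults = counter.getAsLong();
    }

    @Override
    public T get(int i) {
        if (i < 0 || i >= size()) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + size());
        }
        if (!loaded.containsKey(i)) {
            // cache miss: load one page starting at index i, as LazyList does
            List<T> results = pageFetcher.apply(i, pageSize);
            for (int j = 0; j < results.size(); j++) {
                loaded.put(i + j, results.get(j));
            }
        }
        return loaded.get(i);
    }

    @Override
    public int size() {
        return (int) Math.min(numResults, Integer.MAX_VALUE);
    }
}

class RefreshDemo {
    /** Wrap an in-memory "table" so it can be paged two rows at a time. */
    static RefreshableLazyList<String> newList(List<String> table) {
        return new RefreshableLazyList<String>(
            (first, max) -> new ArrayList<>(table.subList(
                Math.min(first, table.size()),
                Math.min(first + max, table.size()))),
            () -> table.size(),
            2);
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>(Arrays.asList("a", "b", "c"));
        RefreshableLazyList<String> list = newList(table);
        System.out.println(list.get(0) + " of " + list.size()); // prints "a of 3"
        table.remove(0);   // the underlying data changes
        list.refresh();    // one call updates both the count and the cache
        System.out.println(list.get(0) + " of " + list.size()); // prints "b of 2"
    }
}
```

Something still has to call refresh() after each update, so this only removes the need to compute the new count by hand. For a list held in the session or application scope, that call would probably belong in whatever code performs the insert or delete.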

The LazyList is a part of the Furnace Webapp Framework. Please feel free to use it in your own applications!

 

About Roger Keays

I am an artist, an engineer, and a student of life. Some projects I'm working on at the moment are The Game Of Your Life, Memory Genius and Money Talks. I write every Sunday about travel, psychology and technology. Click here to subscribe, or follow me via RSS, Twitter, Facebook, and even Google+.
