Paging Large Data Sets With A LazyList

By , 30 April 2007

Paging Large Data Sets With A LazyList
Paging Large Data Sets With A LazyList

As you might have noticed from reading my previous blogs, I'm a big fan of simple solutions to problems, especially where they reduce the amount of plumbing which needs to be implemented. For some time, my problem has been finding an efficient way to page through large data sets using JPA and JSF UIData components such as <h:dataTable/>.

I had read a lot of difficult ways to solve the problem, and tried using OpenJPA's Large Result Set (LRS) extensions to JPA, but none of these solutions satisfied me in terms of efficiency and simplicity. What I came up with instead, is a basic List implementation which uses the standard JPA API to load the results of a query on-demand.

Paging Large Data Sets With A LazyList

Here is an example of how the class might be used:

/**
 * Get all the items of posted by the given user. The query is put into a 
 * LazyList for efficient paging.
 */
public Collection getPagedItems(String user, int pageSize) {
    EntityManager em = FacesFunctions.evaluate("${em}", EntityManager.class);
    return new LazyList(
        em.createQuery("SELECT i FROM Items WHERE i.author = :author ")
                     .setParameter("author", user), 
        pageSize,
        (Long) em.createQuery("SELECT COUNT(i) FROM Items WHERE e.author = :author")
                     .setParameter("author", user)
                     .getSingleResult());
}

As shown above, when the list is instantied, the total number of expected results must be provided. It is required since the Query interface doesn't have a function to calculate this value (e.g. by executing an equivalent COUNT(1) query).

Using the LazyList makes working with JSF components such as data tables more efficient, without having to build custom pager components or DataModels. Results from the query are cached in the list, which makes the list efficient when reused in the request scope, or when placed in the session or application scopes.

Here is the LazyList class, in its entirety:

/**
 * This is a list backed by a JPA Query, but only loading the results of the
 * query one page at a time. It loads one page *ahead* for each fetch miss,
 * so if you are iterating through query results backwards you will get poor
 * performance.
 */
public class LazyList extends AbstractList {
  
    /** backing query */
    Query query;
    
    /** cache of loaded items */
    Map<Integer, Object> loaded;

    /** total number of results expected */
    long numResults;
    
    /** number of results to fetch on cache miss */
    int pageSize;
    
    /** default constructor */
    public LazyList() {
        loaded = new HashMap<Integer, Object>();
    }
    
    /**
     * Create a LazyList backed by the given query, using pageSize results
     * per page, and expecting numResults from the query.
     */
    public LazyList(Query query, int pageSize, long numResults) {
        this();
        this.query = query;
        this.pageSize = pageSize;
        this.numResults = numResults;
    }
    
    /**
     * Fetch an item, loading it from the query results if it hasn't already 
     * been.
     */
     public Object get(int i) {
         if (!loaded.containsKey(i)) {
             List results = query.setFirstResult(i).setMaxResults(pageSize)
                    .getResultList();
             for (int j = 0; j < results.size(); j++) {
                  loaded.put(i + j, results.get(j));
             }
         }
         return loaded.get(i);
     }

     /**
      * Return the total number of items in the list. This is done by
      * using an equivalent COUNT query for the backed query.
      */
     public int size() {
         return (int) numResults;
     }
     
     /** update the number of results expected in this list */
     public void setNumResults(long numResults) {
         this.numResults = numResults;
     }
}

The LazyList class works nicely for me although in the future, I might look at addressing the following issues:

  • It can only be used with Query objects, and not ORM-mapped lists. OpenJPA's LRS can be used in this problem-space, but this isn't a part of the JPA standard.
  • Adding or deleting items from the list requires updating the number of results in the set. This isn't usually problem when the list is in the request scope, since its instantiation will often occur after the update, but for session and application scopes it is counter-intuitive to have to do this manually.

The LazyList is a part of the Furnace Webapp Framework. Please feel free to use it in your own applications!

About Roger Keays

Paging Large Data Sets With A LazyList

Roger Keays is an artist, an engineer, and a student of life. He has no fixed address and has left footprints on 40-something different countries around the world. Roger is addicted to surfing. His other interests are music, psychology, languages, the proper use of semicolons, and finding good food.

Leave a Comment

Please visit https://rogerkeays.com/blog/paging-large-data-sets-with-a-lazylist to add your comments.

Comment posted by: Edson, 8 years ago

Hello Roger!
After instantiating the class LazyList in my DAO and return the list to manage bean, I get the error:

javax.persistence.PersistenceException: org.hibernate.SessionException: Session is closed!

How to keep an open session in LazyList?

Thanks!

Comment posted by: , 8 years ago

You use it like an ordinary List.

Comment posted by: Edson, 8 years ago

Hi Roger Keays!

Congratulations for the excellent tutorial!

Can you provide the code sample how to use LazyList in JSF?

My application uses JSF 2.0 with RichFAces 4.

Comment posted by: Justin, 13 years ago

I am trying to use tomahawk t:dataScroller for pagination on my jsf page. Data is rendered properly for the first page but whenever I try to click any button to go to next page, java script error is thrown as below

form is undefined
var oldTarget = form.target;

Any solution to this will be quite helpful.

I am using tomahawk12-1.1.9 lib with JSF2.0

Comment posted by: ravi, 15 years ago

Thank you for your response. In my application I put LazyList in a request-scoped bean only,... even I am getting this same error.

Can you provide any working sample...it would very helpful for me...

your response is appreciated.

Comment posted by: , 15 years ago

Sounds like you have your LazyList in a session-scoped bean. The lifetime of the list should match the lifetime of the EntityManager, so for most webapps that would mean using a request-scoped bean.

Comment posted by: ravi kumar, 15 years ago

Hey!!!This example is very useful for me...

I am getting following error when I use LazyList in <h:dataTable....

javax.persistence.PersistenceException: org.hibernate.SessionException: Session is closed!

Can you send me the code sample how to use LazyList in JSF.

or can you provide any sample JSF appication code which uses the LazyList

Comment posted by: ravi, 15 years ago

I am getting following error in get mathod.

: org.hibernate.SessionException: Session is closed!

Can you send me the code snippet to use LazyList in JSF

javax.persistence.PersistenceException

Comment posted by: , 15 years ago

http://www.ninthavenue.com.au/products/furnace/example

Comment posted by: Peter, 15 years ago

Hi.

is possible to attach a simple war project showing this functionality. ??

 

Thanks

Comment posted by: frank, 15 years ago

I have an application with ejbs for business and jsf for the presentation logic. The lazyList shoud be shown in a dataTable. The getPagedItems method is implemented in a session bean. The method is called from a managedbean in the web-container. My problem is: when the query from the LazyList executs i get an exception: 

 

java.lang.IllegalStateException: Attempting to execute an operation on a closed EntityManager

I know that the entitymanager is only open in the ejb-container. How can i Implement your LazyList pattern with session beans

Comment posted by: Remon van Vliet, 15 years ago

Nice solution to a problem I've been struggling with. That said, wouldn't it be better to cache the page the element is on rather than the page of which the requests element would be the first element? In other words, shouldnt

 public Object get(int i) {
         if (!loaded.containsKey(i)) {
             List results = query.setFirstResult(i).setMaxResults(pageSize)
                    .getResultList();
             for (int j = 0; j < results.size(); j++) {
                  loaded.put(i + j, results.get(j));
             }
         }
         return loaded.get(i);
     }

be

 public Object get(int i) {
         if (!loaded.containsKey(i)) {
             int pageIndex = i / pageSize;
             List results = query.setFirstResult(pageIndex * pageSize).setMaxResults(pageSize)
                    .getResultList();
             for (int j = 0; j < results.size(); j++) {
                  loaded.put(pageIndex * pageSize + j, results.get(j));
             }
         }
         return loaded.get(i);
     }

That way you'll always cache the entire page the element is on. In your current version a single page display could trigger two queries (granted, it is unlikely since it's highly probably the implementation will request the first element of the page first, but still). Anyway, thanks a lot ;)

Comment posted by: , 15 years ago

Depends where you keep the LazyList. In the request scope would be fine, but if you needed a memory-sensitive implementation you could replace the HashMap with a ReferenceMap.

Comment posted by: Billy Bob Bain, 15 years ago

Does this still have the potential to OOM since you are keeping the loaded data in a hashmap?

Comment posted by: , 15 years ago

Hey Mathias, the query you pass into the lazy list should already be sorted. At work, we have encapsulated sorting so that all our data access objects extend a base class which provides the properties and methods we need (currently for Tomahawk's dataTable). Works okay for us :)

Comment posted by: Mathias Walter, 15 years ago

I like the idea of the LazyList and there simplicity. But is there a simple way to add sorting?

Comment posted by: Patrick Linskey, 15 years ago

 Bear in mind that there are query isolation issues with this approach. There's nothing wrong with that, but it's important to consider the impact of isolation in your queries when evaluating what approach to use.

 

Also, it's generally a good idea to make sure you use an ORDER BY clause in your queries when doing paged queries. Queries against a relational database typically do not have any inherent ordering.

Comment posted by: , 15 years ago

For those of you who are interested, here is our current version of the LazyList which uses OpenJPA-specific features to automatically build the COUNT query for you.

/**
 * This is a list backed by a JPA Query, but only loading the results of the
 * query one page at a time. The same Query object is used for each page, but
 * starting at a different index. Results are cached and pages are loaded one 
 * page *ahead* for each fetch miss. If you are iterating through query results
 * backwards you will get poor performance and might prefer to change the
 * ORDER BY of your query instead.
 */
public class LazyList extends AbstractList {
  
    /** backing query */
    Query query;
    
    /** cache of loaded items */
    Map loaded;

    /** total number of results expected */
    long numResults;
    
    /** number of results to fetch on cache miss */
    int pageSize;
    
    /** default constructor */
    public LazyList() {
        loaded = new HashMap();
    }
    
    /**
     * Create a LazyList backed by the given query, using pageSize results
     * per page. The number of results expected is calculated by
     * reconstructing and executing an equivalent COUNT query when the list 
     * is created.
     */
    public LazyList(Query query, int pageSize) {
        this();
        this.query = query;
        this.pageSize = pageSize;
        
        /* create a query to see how many results there are */
        OpenJPAQuery jpaQuery = (OpenJPAQuery) query;
        EntityManager em = jpaQuery.getEntityManager();
        String queryStr = jpaQuery.getQueryString();
        Query count = em.createQuery(queryStr
                .replaceFirst("(?i)SELECT (.*?) FROM", "SELECT COUNT($1) FROM")
                .replaceFirst("(?i)ORDER BY .*", ""));
        
        /* reset all the parameters */
        if (jpaQuery.hasPositionalParameters()) {
            Object[] posParams = jpaQuery.getPositionalParameters();
            for (int i = 0; i < posParams.length; i++) {
                if (posParams[i] != null) {
                    count.setParameter(i + 1, posParams[i]);
                }
            }
        } else {
            ((OpenJPAQuery) count).setParameters(jpaQuery.getNamedParameters());
        }
        
        this.numResults = (Long) count.getSingleResult();
    }
    
    /**
     * Fetch an item, loading it from the query results if it hasn't already 
     * been.
     */
     public Object get(int i) {
         if (!loaded.containsKey(i)) {
             
             /* fetch the page required starting from this index */
             List results = query.setFirstResult(i).setMaxResults(pageSize)
                        .getResultList();
             for (int j = 0; j < results.size(); j++) {
                  loaded.put(i + j, results.get(j));
             }    
         }
         return loaded.get(i);
     }

     /** 
      * Return the total number of items in the list. This is calculated
      * using an equivalent COUNT query when the list is created.
      */
     public int size() {
         return (int) numResults;
     }
     
     /** update the number of results expected in this list */
     public void setNumResults(long numResults) {
         this.numResults = numResults;
     }
}
Comment posted by: , 15 years ago

Hmmm... I'd probably have a look at what you are using for the pageSize parameter. Also make sure your JPA vendor implements setMaxResults() properly. OpenJPA had a bug with this once upon a time.

Comment posted by: Wilson Aquino, 15 years ago

Hi, That´s very good man,

But I have a issue im my way...

First load of a table, we had all records??

In debug I saw the table get all records, no limits...

what´s hapenning?

Thanks,  use netbeans 6.1 whith table woodstockand JPA

Comment posted by: , 15 years ago

Hey Jónsi, glad you found this idea useful. Rog.

Comment posted by: Jónsi, 15 years ago

Here's the link to the post: www.icefaces.org/JForum/posts/list/8549.page

Comment posted by: Jónsi, 15 years ago

M8, can't thank you enough. Im gonna put a link to this blog at IceFaces forums where it's somewhat a hot issue. Greetings from Iceland!

Comment posted by: , 16 years ago

Thanks Yosh, I fixed that up.

Comment posted by: Yosh, 16 years ago

Hi!

Just found your blog entry... as though your link to the Seamless Library is not working (anymore)... can it be found at

http://www.ninthavenue.com.au/products/furnace

probably? ...

Comment posted by: Hicks, 16 years ago

Looks interesting. Will give it a try..