Dec 16, 2015

Paging Data with DataGridView in VirtualMode


This article illustrates an example of paged data retrieval and displaying a large amount of data on theDataGridView control in VirtualMode.


Displaying millions of rows of data on a Form consumes lots of memory and can easily make the UI unresponsive. In order for the UI to display large amounts of data and remain responsive, the UI should only load and display data that it needs to show. Paging data makes use of memory more efficiently since additional data is only loaded when needed.
There are other things that must be considered to improve the performance of the DataGridView control. This article does not cover optimization. For more information on optimization, I recommend reading the MSDN article - “Performance Tuning in the Windows Forms DataGridView Control.”
For this article, I will present an example of a Form with only one DataGridView loaded with over a million rows of names. The database script for creating the table and stored procedure is included in the zipped solution. You need to load data to the tables.

Using the Code

There are two parts in the example:
  1. The use of DataGridView VirtualMode, and
  2. Paged data retrieval

How to Use DataGridView VirtualMode

There are three main things that must be done to use DataGridView in VirtualMode:
  1. Set the VirtualMode property to true
  2. Attach an event handler for DataGridView's CellValueNeeded event, and
  3. Set the DataGridView's RowCount property
public partial class Form1 : Form
    const int PAGE_SIZE = 5000;

    NameListCache _cache = null;

    public Form1()

        _cache = new NameListCache( PAGE_SIZE );

        dataGridView1.CellValueNeeded +=
            new DataGridViewCellValueEventHandler( dataGridView1_CellValueNeeded );

        dataGridView1.VirtualMode = true;

        dataGridView1.RowCount = (int)_cache.TotalCount;

    private void dataGridView1_CellValueNeeded
        ( object sender, DataGridViewCellValueEventArgs e )
        _cache.LoadPage( e.RowIndex );

        int rowIndex = e.RowIndex % _cache.PageSize;

        e.Value = _cache.CachedData[rowIndex][e.ColumnIndex];
On the code above, the VirtualMode property of dataGridView1 is set to true and an event handler forCellValueNeeded event is added. To get the row count, the NameListCache instance TotalCount is assigned to dataGridView1.RowCount. The code for NameListCache is explained in the Paged Data Retrieval section.
In Form1's constructor, RowCount is set to the total number of rows. The DataGridView needs to know the total number of rows ahead of time even before all rows are retrieved. If the data retrieval is set to get 5000 rows per page and the total number of rows is 100,000 rows, the DataGridView must already know that it is supposed to display 100,000 rows even when only the first 5000 rows are loaded. The DataGridView control uses the total number of rows in order to know how to set the vertical scrollbar grip.
The DataGridView control requests only data that it needs to display when VirtualMode is set to true. TheDataGridView control raises the CellValueNeeded event every time cells are loaded for display. Keep in mind that this happens for each cell. If the table displayed has 5 columns, the CellValueNeeded event is raised 5 times for every row added to the DataGridView. In the CellValueNeeded event handler, the cell requesting for its content is identified using DataGridViewCellValueEventArgsRowIndex and ColumnIndexproperties. The cell's value is assigned to the Value property of DataGridViewCellValueEventArg. The example above shows how a cell's value is taken from _cache.CachedData (the CachedData property refers to an instance of a DataTable). RowIndex and ColumnIndex are used as indexers on CachedData to get the value that the requesting cell must display.

Paged Data Retrieval

Paged data retrieval requires only part of the result set returned. The UI must be able to make multiple calls and specify which part of the total result set is currently requested. Typically, UIs also need to know the total number of rows at the beginning of the sequence of requests.
For this example, I wrote a stored procedure that returns part of the list of names retrieved and an output parameter for the total row count.
create proc [dbo].[GetNames]
    @lastRow bigint,
    @pageSize bigint,
    @totalRowCount bigint output

select @totalRowCount = count(*) from _firstNames, _lastNames

        fn.[FirstName] as FirstName,
        ln.[Name] as LastName,
        row_number() over( order by FirstName ) as RowNumber
        _firstNames fn, _lastNames ln
) as data
    RowNumber between ( @lastRow + 1 ) and ( @lastRow + @pageSize )

The use of a cross-join of first names and last names shown above is for illustration purposes only. This is so I can get a large amount of data for the DataGridView. The stored procedure takes the row number of the last row retrieved and the page size. The main point that must be taken from the stored procedure code shown is that we can call the procedure and retrieve only part of the query result and that the stored procedure's output parameter tells us how many total rows the query returns.
I created a typed-dataset called NameList with a DataAdapter for data access. The NameListCache class encapsulates the paging and caching code.
public class NameListCache
    public int PageSize = 5000;
    public long TotalCount;
    public NameList.GetNamesDataTable CachedData = null;

    NameListTableAdapters.GetNamesTableAdapter _adapter =
        new DataGridViewVirtualModePaging.NameListTableAdapters.GetNamesTableAdapter();
    int _lastRowIndex = -1;

    public NameListCache( int pageSize )
        PageSize = pageSize;
        LoadPage( 0 );

    public void LoadPage( int rowIndex )
        int lastRowIndex = rowIndex - ( rowIndex % PageSize );
        if( lastRowIndex == _lastRowIndex ) return;
        _lastRowIndex = lastRowIndex;
        if( CachedData == null ) CachedData = new NameList.GetNamesDataTable();
        long? totalCount = 0;
        _adapter.Fill( CachedData, _lastRowIndex, PageSize, ref totalCount );
        TotalCount = totalCount.Value;
The NameListCache exposes a LoadPage method that takes a row index. The UI can call this method and pass it an integer value of the requested data row's index. The LoadPage method determines if the requested data is already in memory by checking if the lastRowIndex is the same as the _lastRowIndex. The _lastRowIndexmember stores the value of the lastRowIndex of the previous request. If it is determined that the previouslastRowIndex is the same as the current lastRowIndex, this means the requested data is currently inCachedData and the function returns without retrieving data from the data source. CachedData stores a reference to the DataTable retrieved from the DataAdapter. Since CachedData is a DataTable, the UI can access CachedData to get cell values for display by using a row and column index.

Points of Interest

I observed (using Windows Task Manager) how much memory the DataGridView consumes with different number of page sizes. When I set the page size between a number as little as 50 and up to 5,000 rows, the application uses about 40 MB. In my case the total number of rows is over 1 million. With 5,000 rows page size, start-up is very fast and there is no noticeable delay when scrolling. I can immediately scroll to the bottom of the list and not see any indication of delay. At 50,000 rows page size, the application uses about 72 MB and performance is not much different compared to having only a 5,000 row page size. At 500,000, application start-up takes over 20 seconds and when moving pages the UI becomes unresponsive for over 20 seconds too. Memory used for 500,000 page size is about 172 MB. When I just load 1 million rows in the grid and not do any paging, it takes more than a minute for the UI to load and the application consumes over 300 MB of memory.


Post a Comment

Nam Le © 2014 - Designed by, Distributed By Templatelib