Jump to content
Larry Hengen

Best Practices for FireDAC FetchMode/RecordCount Settings

Recommended Posts

I have recently encountered a situation where code was explicitly setting queries to use FetchOptions.RecordCountMode := cmTotal.  The query involved was rather complex and therefore expensive to execute and due to the RecordCountMode set, was being executed twice as per the FireDAC Documentation.  As a result, together with all the other workload it was bringing SQL Server to it's knees.  That got me thinking about what are appropriate (read least resource utilization with best performance) settings to use in the following scenarios:

 

1) You open a query to gather data and iterate over each record not updating the record itself but perhaps issuing other SQL Statements., or generating a report

-such a query could be ReadOnly, ForwardOnly

 

2) You open a query with RequestLive = True and iterate over it updating values which FireDAC posts back to the database

 

In both of the scenarios above, to display progress information you want the total record count.

 

What do developers usually use for FireDAcC FetchMode/RecordCountMode and other settings in such scenarios and what if any combination of settings are done globally with a TFDManager vs. locally at the query level?

 

Share this post


Link to post
1 minute ago, Hans J. Ellingsgaard said:

You can use an extra query with a count(*) to get the number of records.

That is exactly what FireDAC does for you so why would I write additional code to do the same thing?  I want to avoid a second query because it can be detrimental to your SQL back end's performance. Imagine doubling it's workload for every user...

 

A less intensive approach is to fetch all records, but that can cause a delay in processing while they are streamed to the client, and if the query is in the main thread, the application will "freeze".

 

Share this post


Link to post

You may either:

  • Refine your SQL query, e.g. by adding some proper indexes to the DB, to make it faster on the server;
  • Actually retrieve all data at once (maybe into an in-memory list/array), but use Application.ProcessMessages within the loop to avoid UI freezing;
  • Don't care about exact record count: just fetch the first 1000 for instance, and write '>1000' on the UI if there are more rows.
Edited by Arnaud Bouchez
  • Like 1

Share this post


Link to post

To fetch all the records from a table to get the record count is in most cases a bad praxis. If it's a table with a lot of records, you will just put a lot unnessisary strain on the database server. A count(*) query, on the other hand,  has a very limited impact on the server. 

 

Share this post


Link to post
22 minutes ago, Hans J. Ellingsgaard said:

A count(*) query, on the other hand,  has a very limited impact on the server. 

I don't agree. This is highly depending on the DB itself.
A count(*) could take a lot of time for a huge database, or with a complex query with no proper indexes.

 

You could have a count(*) of a few rows (e.g. one or two rows) which takes a lot of time on big tables with a SQL request bypassing the indexes.
In this case, it is better to retrieve the few result rows instead of making a select count(*) following by a select *

Edited by Arnaud Bouchez
  • Like 1

Share this post


Link to post
4 hours ago, Hans J. Ellingsgaard said:

To fetch all the records from a table to get the record count is in most cases a bad praxis. If it's a table with a lot of records, you will just put a lot unnessisary strain on the database server. A count(*) query, on the other hand,  has a very limited impact on the server. 

 

I beg to differ.  The situation I encountered was that a very expensive SQL Server query wrapped in a select count(*) from() was causing major performance issues.  Result set was about 1500 rows.  In this case I think it's far easier to bring back the entire result set.  What I was looking for was guidelines other devs are using to make such decisions since the actual row counts are not well known.

Share this post


Link to post

If you make a select count query on an indexed field, it should only take a ms to execute. The resultset will only have one record. 

If your table only have 1500 records, the dataset.RecordCount should also respond quckly (not as quick as the record count thoug). 

There must be something else that goes wrong.

 

What database are you using?

Have you tried to run the queries directly in a db manager?

Share this post


Link to post

@Hans J. Ellingsgaard  I am using SQL Server.  The query uses an inner join on about 5 tables, a left outer on one and a cross apply with a group by and order by on the results  Pretty much a worst case scenario.  Some of the criteria is not indexed, and due to the data layout requires a large # of reads.  Some work has been done on the query to optimize it, and it's now better, but the question remains; what are the best settings to use for FireDAC when the cost and row count of a particular query is not generally known at design-time, as is the case with many dynamically built SQL queries.

Share this post


Link to post
8 hours ago, Larry Hengen said:

@Hans J. Ellingsgaard  I am using SQL Server.  The query uses an inner join on about 5 tables, a left outer on one and a cross apply with a group by and order by on the results  Pretty much a worst case scenario.  Some of the criteria is not indexed, and due to the data layout requires a large # of reads.  Some work has been done on the query to optimize it, and it's now better, but the question remains; what are the best settings to use for FireDAC when the cost and row count of a particular query is not generally known at design-time, as is the case with many dynamically built SQL queries.

To give you one more idea to try: Use a sql window function to get the total count of all records as an additional column. Syntax for the column should something like this "count(*) over () as total".

 

SELECT - OVER Clause (Transact-SQL)

 

But I don't tried it by my self 🙂

Share this post


Link to post
On 9/13/2020 at 12:07 AM, Larry Hengen said:

Some of the criteria is not indexed,

Is there anything that prevent you from indexing all the fields in the join and where clause? There is a good chance that it would give you a significiant boost on your query performance.

Edited by Hans J. Ellingsgaard

Share this post


Link to post

Indexing and doing index maintenance is crucial to maintain performance. 

If the queries can be done completely free form, that will be a challenge. 

IMO, there are few settings to FireDAC that really affects the performance once the query has been passed to the server.

Share this post


Link to post
On 9/12/2020 at 6:23 AM, Hans J. Ellingsgaard said:

A count(*) query, on the other hand,  has a very limited impact on the server. 

That depends on the database and the version. For instance, in PostgreSQL < 9.2, a count(*) query would have to perform a sequential scan through the database:

 

Quote

PostgreSQL uses multiversion concurrency control (MVCC) to ensure consistency between simultaneous transactions. This means each transaction may see different rows – and different numbers of rows – in a table. There is no single universal row count that the database could cache, so it must scan through all rows counting how many are visible. Performance for an exact count grows linearly with table size.

MS SQL Server, and now PostgreSQL, can  often do index-only scans in situations like this, for much quicker results.

Share this post


Link to post

I don't think indexes will help to avoid joins in this situation. They will only speed up counting but not preparing data array to select from.

In general, I'd advice to not use total record count at all. Just output count of currently fetched records so user will see that his request is in progress.

Share this post


Link to post
Guest
On 9/11/2020 at 11:21 PM, Larry Hengen said:

What do developers usually use for FireDAcC FetchMode/RecordCountMode and other settings in such scenarios and what if any combination of settings are done globally with a TFDManager vs. locally at the query level?

Interesting, nobody answered the actual question.

 

I could give you all the parameters for the DAC that i use and have the sources to.

This should (In My Humble Experience) be clear from the code - answers like:

"This DAC will issue a count(*) query derived from the user query if ... and if ... but not if ..."

Taking that information and then moving on to what new "magic" the current version of your RDBMS has inside it's query optimizer, looking at the SQL-monitor, and you will be getting closer to figuring out a good solution.

 

Deriving that count query can fail sometimes, especially performance-wise, if the query is advanced enough. The DAC should have an array of "SQL" properties where you can kick in your own hand-made profiled version for specific cases. It is difficult for code to know what joins, subselects, CTE's and whatnot are actually useful in the count since the optimizer can work much better when a field-list is supplied.

 

I wish FireDAC would come with sources in Professional.

 

Are there any people delivering to sensitive businesses, medical, human-rights et. al. using FireDAC in professional? I would not. I'm asking because i never seen anyone commenting a FireDAC question here and referring to the sources.

Share this post


Link to post

Ahm, what kind of bull*** is calling a count(*) prior to the actual selection? Who guarantees that the two results will be the same?

Share this post


Link to post
Guest

@Attila Kovacs, another valid point, in deed. The DACs sometimes does. Might be a "hack" to prepare a full grid/chart or some such.

Granted it is a rather base idiom. However it's "built" into a lot of TDataSet compatible DACs.

I would recommend to avoid it. For beginners, it may happen if you visually connect a grid to a dataset without bothering more.

Depends on what the other components ask the DAC to do. Look-see at DX "Smart refresh" and similar "thingies", idioms that has become legacy and for starters (In My Humble Guessing Universe) was workarounds.

 

And i am not sure it's "prior to the selection". As you know rowsets are fetched from the server and short of fetching all of them you cannot have a count locally.

The DACs try to avoid fetching all (expensive) but at the same time serve the controls with info that will make them work better (pre-allocation and such). And as we all now, that might take longer.

 

In a more modern setting (i have implemented a GUI based on knockout.js recently, full-blown MVVM) you will face similar problems but attack the in a completely different way.

The end result, IMHO, is up to the final implementer (me, you the OP) and we will ALWAYS have to think specifically. That is why i am a bit sceptical about LINQ-solutions, how do you post-performance tune those beasts? Me, i start writing a query by hand and expand from there. So if i see that the query probably never will be "economically responsibly" optimized in a good way, i can, at a very early stage, go for an alternate solution.

 

Also never forget the SQL-monitor and the PLANs. Look-se at them!

Edited by Guest

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×