Larry Hengen 39 Posted September 11, 2020 I have recently encountered a situation where code was explicitly setting queries to use FetchOptions.RecordCountMode := cmTotal. The query involved was rather complex and therefore expensive to execute and due to the RecordCountMode set, was being executed twice as per the FireDAC Documentation. As a result, together with all the other workload it was bringing SQL Server to it's knees. That got me thinking about what are appropriate (read least resource utilization with best performance) settings to use in the following scenarios: 1) You open a query to gather data and iterate over each record not updating the record itself but perhaps issuing other SQL Statements., or generating a report -such a query could be ReadOnly, ForwardOnly 2) You open a query with RequestLive = True and iterate over it updating values which FireDAC posts back to the database In both of the scenarios above, to display progress information you want the total record count. What do developers usually use for FireDAcC FetchMode/RecordCountMode and other settings in such scenarios and what if any combination of settings are done globally with a TFDManager vs. locally at the query level? Share this post Link to post
Hans J. Ellingsgaard 21 Posted September 11, 2020 You can use an extra query with a count(*) to get the number of records. Share this post Link to post
Larry Hengen 39 Posted September 11, 2020 1 minute ago, Hans J. Ellingsgaard said: You can use an extra query with a count(*) to get the number of records. That is exactly what FireDAC does for you so why would I write additional code to do the same thing? I want to avoid a second query because it can be detrimental to your SQL back end's performance. Imagine doubling it's workload for every user... A less intensive approach is to fetch all records, but that can cause a delay in processing while they are streamed to the client, and if the query is in the main thread, the application will "freeze". Share this post Link to post
Arnaud Bouchez 407 Posted September 12, 2020 (edited) You may either: Refine your SQL query, e.g. by adding some proper indexes to the DB, to make it faster on the server; Actually retrieve all data at once (maybe into an in-memory list/array), but use Application.ProcessMessages within the loop to avoid UI freezing; Don't care about exact record count: just fetch the first 1000 for instance, and write '>1000' on the UI if there are more rows. Edited September 12, 2020 by Arnaud Bouchez 1 Share this post Link to post
Hans J. Ellingsgaard 21 Posted September 12, 2020 To fetch all the records from a table to get the record count is in most cases a bad praxis. If it's a table with a lot of records, you will just put a lot unnessisary strain on the database server. A count(*) query, on the other hand, has a very limited impact on the server. Share this post Link to post
Arnaud Bouchez 407 Posted September 12, 2020 (edited) 22 minutes ago, Hans J. Ellingsgaard said: A count(*) query, on the other hand, has a very limited impact on the server. I don't agree. This is highly depending on the DB itself. A count(*) could take a lot of time for a huge database, or with a complex query with no proper indexes. You could have a count(*) of a few rows (e.g. one or two rows) which takes a lot of time on big tables with a SQL request bypassing the indexes. In this case, it is better to retrieve the few result rows instead of making a select count(*) following by a select * Edited September 12, 2020 by Arnaud Bouchez 1 Share this post Link to post
Hans J. Ellingsgaard 21 Posted September 12, 2020 A count(*) should always be done an an index field, or the point of using it is gone in the first place. Share this post Link to post
Larry Hengen 39 Posted September 12, 2020 4 hours ago, Hans J. Ellingsgaard said: To fetch all the records from a table to get the record count is in most cases a bad praxis. If it's a table with a lot of records, you will just put a lot unnessisary strain on the database server. A count(*) query, on the other hand, has a very limited impact on the server. I beg to differ. The situation I encountered was that a very expensive SQL Server query wrapped in a select count(*) from() was causing major performance issues. Result set was about 1500 rows. In this case I think it's far easier to bring back the entire result set. What I was looking for was guidelines other devs are using to make such decisions since the actual row counts are not well known. Share this post Link to post
Hans J. Ellingsgaard 21 Posted September 12, 2020 If you make a select count query on an indexed field, it should only take a ms to execute. The resultset will only have one record. If your table only have 1500 records, the dataset.RecordCount should also respond quckly (not as quick as the record count thoug). There must be something else that goes wrong. What database are you using? Have you tried to run the queries directly in a db manager? Share this post Link to post
Attila Kovacs 629 Posted September 12, 2020 (edited) . Edited September 12, 2020 by Attila Kovacs Share this post Link to post
Attila Kovacs 629 Posted September 12, 2020 (edited) . Edited September 12, 2020 by Attila Kovacs Share this post Link to post
Larry Hengen 39 Posted September 12, 2020 @Hans J. Ellingsgaard I am using SQL Server. The query uses an inner join on about 5 tables, a left outer on one and a cross apply with a group by and order by on the results Pretty much a worst case scenario. Some of the criteria is not indexed, and due to the data layout requires a large # of reads. Some work has been done on the query to optimize it, and it's now better, but the question remains; what are the best settings to use for FireDAC when the cost and row count of a particular query is not generally known at design-time, as is the case with many dynamically built SQL queries. Share this post Link to post
HeZa 3 Posted September 13, 2020 8 hours ago, Larry Hengen said: @Hans J. Ellingsgaard I am using SQL Server. The query uses an inner join on about 5 tables, a left outer on one and a cross apply with a group by and order by on the results Pretty much a worst case scenario. Some of the criteria is not indexed, and due to the data layout requires a large # of reads. Some work has been done on the query to optimize it, and it's now better, but the question remains; what are the best settings to use for FireDAC when the cost and row count of a particular query is not generally known at design-time, as is the case with many dynamically built SQL queries. To give you one more idea to try: Use a sql window function to get the total count of all records as an additional column. Syntax for the column should something like this "count(*) over () as total". SELECT - OVER Clause (Transact-SQL) But I don't tried it by my self 🙂 Share this post Link to post
Hans J. Ellingsgaard 21 Posted September 14, 2020 (edited) On 9/13/2020 at 12:07 AM, Larry Hengen said: Some of the criteria is not indexed, Is there anything that prevent you from indexing all the fields in the join and where clause? There is a good chance that it would give you a significiant boost on your query performance. Edited September 14, 2020 by Hans J. Ellingsgaard Share this post Link to post
Lars Fosdal 1792 Posted September 15, 2020 Indexing and doing index maintenance is crucial to maintain performance. If the queries can be done completely free form, that will be a challenge. IMO, there are few settings to FireDAC that really affects the performance once the query has been passed to the server. Share this post Link to post
Joseph MItzen 251 Posted September 21, 2020 On 9/12/2020 at 6:23 AM, Hans J. Ellingsgaard said: A count(*) query, on the other hand, has a very limited impact on the server. That depends on the database and the version. For instance, in PostgreSQL < 9.2, a count(*) query would have to perform a sequential scan through the database: Quote PostgreSQL uses multiversion concurrency control (MVCC) to ensure consistency between simultaneous transactions. This means each transaction may see different rows – and different numbers of rows – in a table. There is no single universal row count that the database could cache, so it must scan through all rows counting how many are visible. Performance for an exact count grows linearly with table size. MS SQL Server, and now PostgreSQL, can often do index-only scans in situations like this, for much quicker results. Share this post Link to post
Fr0sT.Brutal 900 Posted September 21, 2020 I don't think indexes will help to avoid joins in this situation. They will only speed up counting but not preparing data array to select from. In general, I'd advice to not use total record count at all. Just output count of currently fetched records so user will see that his request is in progress. Share this post Link to post
Guest Posted September 21, 2020 On 9/11/2020 at 11:21 PM, Larry Hengen said: What do developers usually use for FireDAcC FetchMode/RecordCountMode and other settings in such scenarios and what if any combination of settings are done globally with a TFDManager vs. locally at the query level? Interesting, nobody answered the actual question. I could give you all the parameters for the DAC that i use and have the sources to. This should (In My Humble Experience) be clear from the code - answers like: "This DAC will issue a count(*) query derived from the user query if ... and if ... but not if ..." Taking that information and then moving on to what new "magic" the current version of your RDBMS has inside it's query optimizer, looking at the SQL-monitor, and you will be getting closer to figuring out a good solution. Deriving that count query can fail sometimes, especially performance-wise, if the query is advanced enough. The DAC should have an array of "SQL" properties where you can kick in your own hand-made profiled version for specific cases. It is difficult for code to know what joins, subselects, CTE's and whatnot are actually useful in the count since the optimizer can work much better when a field-list is supplied. I wish FireDAC would come with sources in Professional. Are there any people delivering to sensitive businesses, medical, human-rights et. al. using FireDAC in professional? I would not. I'm asking because i never seen anyone commenting a FireDAC question here and referring to the sources. Share this post Link to post
Attila Kovacs 629 Posted September 21, 2020 Ahm, what kind of bull*** is calling a count(*) prior to the actual selection? Who guarantees that the two results will be the same? Share this post Link to post
Guest Posted September 21, 2020 (edited) @Attila Kovacs, another valid point, in deed. The DACs sometimes does. Might be a "hack" to prepare a full grid/chart or some such. Granted it is a rather base idiom. However it's "built" into a lot of TDataSet compatible DACs. I would recommend to avoid it. For beginners, it may happen if you visually connect a grid to a dataset without bothering more. Depends on what the other components ask the DAC to do. Look-see at DX "Smart refresh" and similar "thingies", idioms that has become legacy and for starters (In My Humble Guessing Universe) was workarounds. And i am not sure it's "prior to the selection". As you know rowsets are fetched from the server and short of fetching all of them you cannot have a count locally. The DACs try to avoid fetching all (expensive) but at the same time serve the controls with info that will make them work better (pre-allocation and such). And as we all now, that might take longer. In a more modern setting (i have implemented a GUI based on knockout.js recently, full-blown MVVM) you will face similar problems but attack the in a completely different way. The end result, IMHO, is up to the final implementer (me, you the OP) and we will ALWAYS have to think specifically. That is why i am a bit sceptical about LINQ-solutions, how do you post-performance tune those beasts? Me, i start writing a query by hand and expand from there. So if i see that the query probably never will be "economically responsibly" optimized in a good way, i can, at a very early stage, go for an alternate solution. Also never forget the SQL-monitor and the PLANs. Look-se at them! Edited September 21, 2020 by Guest Share this post Link to post