David Schwartz

General DB access question -- paging query results


SELECT <your query>
OFFSET 19900 ROWS FETCH FIRST 100 ROWS ONLY

That is for SQL Server 2012 and higher. Different SQL will be generated for older versions.
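For what it's worth, a minimal sketch of how such a query might be primed on the FireDAC side, assuming FetchOptions.RecsSkip / RecsMax are what drive the generated clause (the query and column names are made up):

// uses FireDAC.Stan.Option, FireDAC.Comp.Client;
// Skip the first 19900 rows and return at most 100; with the SQL Server
// driver on 2012+ this should surface as the OFFSET ... FETCH clause above.
qryCustomers.FetchOptions.RecsSkip := 19900;
qryCustomers.FetchOptions.RecsMax  := 100;
qryCustomers.SQL.Text := 'select CustNo, CustName from Customers order by CustName';
qryCustomers.Open;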


ORDER BY x,y,z OFFSET 19900 ROWS FETCH NEXT 100 ROWS ONLY
It's new to me too, so I don't know the caveats, but it looks interesting and promising. SQL Server 2012+.
 
 
 


Thank you, @Dmitry Arefiev!  It seems I have to withdraw that critical statement and explore the FireDAC options again.

How did you prime the FDQuery for that specific result?

 

We have a mix of 2008, 2012, 2016 and 2017 servers, so we are still held back by the 2008.

I will have to push for an upgrade of the handful of 2008 servers still remaining.

 

19 minutes ago, Attila Kovacs said:

@Dmitry Arefiev Since you are already here, do you have any Sql Server Schema/Data sync tool in your drawer?

We use the ApexSQL tools (https://www.apexsql.com/). Diff and DataDiff are very flexible and powerful, although ApexSQL redoes its UIs from time to time with varying success - probably when upgrading their MSVS 😛



Guys? Topic?

 

@David Schwartz I am a classic Delphi developer and I can't follow your step from Client/Server to ... well, what, really? All I'm reading is that the client no longer communicates with the database server directly, but via something like REST. So there is an extra layer added. But multi-tier architectures have been around for quite some time, and Delphi copes just fine. Am I missing something? It is still Client/Server, only with 1 to n middlemen. Of course you can toss all the DB connectivity components, but who cares? It's just work, and that is what we are paid for. With every new layer between the client and the rest (no pun intended), there comes change to the client, and any change to that last layer will result in yet more changes to the client. Again, just work. So what, in essence, are you getting at? Does Visual Studio support total immersion of every tier into the IDE? Is that what this was about?

7 hours ago, David Schwartz said:

If you want to build a REST-based server in Delphi, they offer a solution by providing a pre-built server framework that can be extended by creating plug-in modules (DLLs basically), but the IDE and components still have no knowledge of the dynamics inherent in such designs. But there's no particular benefit of one server over another -- the IDE sees them all as foreign objects that require the programmer to manipulate directly.

If so, wow. Just wow. Gotta look into that sometime.

But I would not build DLLs, and I would not use Delphi, nor C#. I would build Python modules that would run under Apache or whatever. Also work, I know. But do you really want to put us out of work? 😉

On 11/25/2018 at 11:30 AM, David Schwartz said:

My question refers to a use-case that's similar to thousands of others like it: suppose you have an app built in, say, D7 that was originally designed as a client/server app, using standard Delphi components available at the time, and it's now exhibiting poor performance after replacing the local file server with a DB server (MS SQL Server, Oracle, MySQL, whatever) because the 243 SQL queries on the 178 forms are all set to do "select * from xxx" in order to populate the controls on the form. So loading up some forms can literally take 10-15 minutes. 

 

It's a very general question: how does one usually deal with this?

 

(In my experience, Management usually says, "We're going to rebuild the app in C#/.NET because Delphi clearly can't handle this workload." But I'm asking for a more TECHNICAL explanation that doesn't just throw the baby out with the bathwater and start over from scratch.)

Your problem has nothing to do with Delphi, and C#/.NET cannot handle it any better. If you are using the BDE, you will need to switch to FireDAC. FireDAC datasets have parameters for how many records to load at a time. Good advice is also not to use 'select *', but instead 'select fieldname from ...'. FireDAC is really fast and can easily handle databases with millions of records.
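For illustration, a minimal sketch of the kind of FireDAC setup meant here, assuming a TFDQuery wired to a grid (the query, table and column names are hypothetical):

// uses FireDAC.Stan.Option, FireDAC.Comp.Client;
// Fetch rows in small batches as they are needed instead of loading everything.
qryOrders.FetchOptions.Mode := fmOnDemand;   // fetch only when more rows are requested
qryOrders.FetchOptions.RowsetSize := 100;    // rows per round trip to the server
qryOrders.SQL.Text := 'select OrderID, OrderDate, CustomerName from Orders';  // named columns, no select *
qryOrders.Open;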

On 11/25/2018 at 10:45 AM, David Schwartz said:

Also, how do you NOT load up things like BLOBs unless you actually want to see them? (Since I pre-defined the tables with the fields so I can refer to field vars in the code rather than using FieldByName for everything, if I don't do a 'select * from...' then an error arises saying a data field is missing.)

If you get an error saying a data field is missing, it's because some of the fields that you excluded from your query are connected to a data-aware component. The easiest way to find that component is to search for the field directly in the .dfm file, where you can see which component it is assigned to.
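For example, a .dfm entry for such a binding might look like the snippet below (component and field names are made up); searching the .dfm for the field name takes you straight to the offending control:

object imgPhoto: TDBImage
  DataSource = dsCustomers
  DataField = 'PHOTO'
end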

On 11/25/2018 at 10:45 AM, David Schwartz said:

Do LiveBindings help solve this problem vs. regular data-aware controls? (I haven't really played with them yet.)

LiveBindings will not make any difference here. It's the design of your queries, and your database components, that will make all the difference.

Guest

A more general answer:

 

You should not fetch all columns, run a full table scan, and transfer all rows to your application. There is nothing wrong with a full table scan as such. Access was pretty good at "SELECT everything from everywhere", but SQL databases simply don't work that way.

 

a) select dedicated columns - list the columns you want to retrieve

b) limit the number of rows

c) a) + b): retrieve a well-defined result set (see the sketch below)
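A rough sketch of a) + b) combined, i.e. named columns plus a restriction that keeps the result set well defined (table, columns and parameter names are hypothetical):

// uses System.SysUtils, FireDAC.Comp.Client;
qryInvoices.SQL.Text :=
  'select InvoiceNo, InvoiceDate, Amount from Invoices ' +
  'where InvoiceDate >= :FromDate and InvoiceDate < :ToDate ' +
  'order by InvoiceDate';
qryInvoices.ParamByName('FromDate').AsDateTime := EncodeDate(2018, 1, 1);
qryInvoices.ParamByName('ToDate').AsDateTime   := EncodeDate(2019, 1, 1);
qryInvoices.Open;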

 

Introducing an index on a table emulates a record-oriented database. When it comes to online transactions, retrieve columns using indices. Single rows should be treated as single rows, and well-defined result sets as well-defined result sets. In general it's not wise to transfer data that is not required.

 

Otherwise you simply start to lock more data than should be touched - or just retrieve the key values and requery.

 

NexusDB is one example of the best of both worlds.

 

In a '4GL'-like environment, data is retrieved from the server, put into a local buffer (maybe on the app-server side), and the application logic refers to the data stored in those buffers. We would call those buffers in-memory datasets. Kind of a LUW (logical unit of work). In such cases a) also makes a lot of sense. In such environments you will have to live with a limited cache per user, which is actually pretty helpful.

 

If you want to reduce the traffic, use a middleware or let the DB access components handle the pain for you.

 

FireDAC and other components allow you to fine-tune what you want to retrieve (FetchOptions). Those settings lessen the pain, and as you pointed out, that works.

 

The problem behind the scenes is entity-relationship modelling. Entity-relationship is about the insane idea of imposing semantics on master data and treating it as anything other than lookup tables. There is no such thing as 'master data'; it is nothing but time-independent transactional data. I can hardly remember a data model where what was once called master data still holds anything but time-dependent data. By that very logic, maintaining transactional data where there is no transaction makes no sense.

 

If you have a server-side process, keep the data retrieval on the server side, unless you get the data from another source anyway.

 

Paginating results: fetch what you need and fetch on demand. That works no matter what implementation technology is used, because it's the way things are intended to work.

 

 

On 11/27/2018 at 12:56 AM, Lars Fosdal said:

How does the generated SQL look for retrieving the top 100 results at row 19.900 for SQL Server?

Hi, Lars. I'm not that fluent in databases, so could you please help me understand your question above? What does the generated SQL look like?

You imply that the query has to read through the first 19,900 rows to get to the data. Are you saying that SQL Server can't use an index so that WHERE (Row > 19,900) AND (Row < 20,000) can be fast? If rows aren't added or deleted (or require persistence of their row number), couldn't a column named Row be introduced? Or perhaps the DB has internal row numbers? (I believe SQLite does, for example.)


Adding a paging mechanism by using restrictions in the SQL WHERE clause certainly doesn't need extra indexing. The base query itself may use indexes, if well designed and fitting.

I think Lars was just surprised that components today really do that ~"offset trick", i.e. make use of more modern SQL features.
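As an aside, one common way to let an index do the work instead of skipping rows is "seek" paging on the key of the previous page; a hedged sketch, with made-up table, key column and variable names:

// uses FireDAC.Comp.Client;
// Remember the last key the user saw and ask for the next 100 rows after it;
// the index on OrderID is used directly, nothing has to be skipped.
qryOrders.SQL.Text :=
  'select top 100 OrderID, OrderDate from Orders ' +
  'where OrderID > :LastID order by OrderID';
qryOrders.ParamByName('LastID').AsInteger := LastSeenID;
qryOrders.Open;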

 

 


See my previous comments in this thread. SQL Server 2008 does not have built-in support for pagination, while 2012 and newer do.
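For completeness, the usual workaround on 2008 is to number the rows yourself with ROW_NUMBER(); a hedged sketch of such a query held in Delphi (table and column names are made up):

// uses FireDAC.Comp.Client;
procedure OpenPage2008(Qry: TFDQuery);
const
  // Page rows 19901..20000 on SQL Server 2008, which lacks OFFSET ... FETCH.
  PAGED_SQL =
    'with Numbered as ( ' +
    'select OrderID, OrderDate, row_number() over (order by OrderDate) as rn ' +
    'from Orders) ' +
    'select OrderID, OrderDate from Numbered where rn between 19901 and 20000';
begin
  Qry.SQL.Text := PAGED_SQL;
  Qry.Open;
end;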

Guest

So FireDAC, IBObjects and other DACs all have the capability to serve a component in "gridMode". This means that the key index is fetched first and then the rows are filled in as requested. The key here is that you can present what looks like a huge dataset to the user, and it will be fetched as needed. Users can see rows 1-10 and then jump to rows 4951-5000. These datasets can be writable.

 

Since people are interested in paging here (I'm not), I have a question. Imagine you have a set of 5000 sorted rows that users can mark, for example checking records and confirming them. Let's say there are summaries too. Say user one sees rows 1-100 and user two sees rows 101-200, both working along, and user one inserts or deletes a row... when user two then looks at rows 201-300, s/he will either miss one row (delete) or see one row again (insert). No?

 

If yes, how would you tackle this in a multiuser read-write environment?

 

Or perhaps you are working with fuzzy data like google results where veracity is not required?

 

Regards,

 

/Dany

On 11/26/2018 at 10:57 AM, David Schwartz said:

These queries ran very quickly when the file server was in the same building and they only had a year or two of historical data present. But today they literally take several minutes (some up to 15!) to run on the remote server, processing over a decade of data (they're really hairy queries that apparently aren't time-boxed) and many return over 100,000 rows of data to the client before the user is allowed to view the first 10 and do anything.

I don't think you can fix this at the DAC/client end. You may need to redesign the app to allow selecting a time period first, and only then running the heavy SQL on a (hopefully) relatively narrow timespan/low amount of data to speed up creating the result set. You may also consider calculating and permanently storing intermediate data (presumably, the 10 year old data doesn't change all that much anymore).

On 1/28/2019 at 6:09 PM, eivindbakkestuen said:

I don't think you can fix this at the DAC/client end. You may need to redesign the app to allow selecting a time period first, and only then running the heavy SQL on a (hopefully) relatively narrow timespan/low amount of data to speed up creating the result set. You may also consider calculating and permanently storing intermediate data (presumably, the 10 year old data doesn't change all that much anymore).

This has to do entirely with how CLOBs are stored vs. VARCHAR2's. The data is new -- it happens with five records created yesterday. And changing the architecture is not going to help because if you need to store >4k then you need to use LOBs. Period. 

 

Apparently Oracle saves CLOBs and BLOBs directly to disk in their own files, not in the DB. So it's way slower loading them. I don't know why other DBs don't seem to have this problem with huge text entities (like SQLServer's Memo fields).

On 1/31/2019 at 7:47 PM, David Schwartz said:

This has to do entirely with how CLOBs are stored vs. VARCHAR2's. The data is new -- it happens with five records created yesterday. And changing the architecture is not going to help because if you need to store >4k then you need to use LOBs. Period. 

Oracle 12 extended the maximum length of VARCHAR2 to 32KB. And there has always been LONG.

On 1/31/2019 at 7:47 PM, David Schwartz said:

Apparently Oracle saves CLOBs and BLOBs directly to disk in their own files, not in the DB.

CLOB / BLOB are stored in tablespaces. BFILE are stored in external files.

On 2/1/2019 at 12:18 PM, Dmitry Arefiev said:

Oracle 12 extended maximum length of VARCHAR2 to 32Kb. And there always was LONG.

We're using Oracle 10g.

