Lars Fosdal

Best site/source for SQL Server questions?


The joys of inherited problems.


I am looking for a pattern or advice on how to best allocate a set of records to a specific requestor when there are multiple parallel requests, and there are multiple parallel request viewers.

In practice: Allocate a set of deliveries to a user, when there can be multiple users requesting deliveries in parallel, and multiple users viewing the deliveries in parallel.

 

The current SQL can give deadlocks both on allocations and on viewing, so something is definitely not right, and I need to improve on it.

 

 

 

There is SO, where it's all an exercise in asking the question right...

Are there other good discussion sites?

2 minutes ago, Attila Kovacs said:

WITH (NOLOCK) ?

Well, I've added that where it seems safe for the viewers, but it is a short way between being a viewer and being a requestor, so for certain lookups, the data needs to be accurate.

The entire allocation part is a very heavy SQL statement with joins, functions, conditions and sorting - and the actual grabbing is totally void of transactions.
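For the viewer side, the per-table hint can be limited to queries whose results are display-only. A minimal sketch, assuming a hypothetical `Deliveries` table (table and column names are made up for illustration):

```sql
-- Read-only dashboard query; dirty reads are acceptable here
-- because the result is only displayed, never acted upon.
-- (dbo.Deliveries and its columns are hypothetical names.)
SELECT d.DeliveryID, d.Status, d.AssignedTo
FROM   dbo.Deliveries AS d WITH (NOLOCK)
WHERE  d.Status = 'PENDING';
```

The hint is applied per table reference, so a join can mix NOLOCK on stable lookup tables with normal locking on the contended one.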

 


I see. I'd check "Row Versioning" then. I've never used it yet, but I'll try it myself when I find some time.
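For reference, row versioning is switched on at the database level; a sketch, assuming a hypothetical database name:

```sql
-- Row versioning in SQL Server: readers see the last committed
-- version of a row instead of blocking on writers' locks.
-- Switching requires no other active connections to the database.
-- (MyWarehouseDb is a hypothetical name.)
ALTER DATABASE MyWarehouseDb SET READ_COMMITTED_SNAPSHOT ON;

-- Optionally also allow explicit SNAPSHOT-isolation transactions:
ALTER DATABASE MyWarehouseDb SET ALLOW_SNAPSHOT_ISOLATION ON;
```

The price is extra tempdb usage for the version store, which is worth measuring under load before rolling it out.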

Edited by Attila Kovacs


Yes, I can see that from your link. It came as a surprise to me, as I am mostly used to working with IB and Firebird databases.

 

There is a NOLOCK isolation level that will minimize the risk of locking, but it will not be suitable if there is a risk of data being changed during the execution of the query. Another thing to do is to make sure that all the fields in the where clause and the joins are indexed, to avoid table scans.
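The indexing advice can be sketched like this, assuming a hypothetical allocation query that filters on `Status` and joins on `ArticleID` (all names are made up):

```sql
-- Covering index for a hypothetical allocation lookup that
-- filters on Status and joins on ArticleID; without it, the
-- query falls back to scanning the whole table and holding
-- locks far longer than necessary.
CREATE NONCLUSTERED INDEX IX_Deliveries_Status_Article
    ON dbo.Deliveries (Status, ArticleID)
    INCLUDE (Quantity, PickPosition);
```

Shorter scans mean shorter lock durations, which by itself reduces the deadlock surface.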

On 3/12/2019 at 12:44 PM, Hans J. Ellingsgaard said:

There is a NOLOCK isolation level that will minimize the risk of locking, but it will not be suitable if there is a risk of data being changed during the execution of the query. Another thing to do is to make sure that all the fields in the where clause and the joins are indexed, to avoid table scans.


I usually limit NOLOCK to queries for
- data that is used as viewing info (i.e. not acted upon)
- data that is currently in the custody of the client (i.e. not likely to be changed by other processes)


Some thoughts and questions on this:

- the only requests like this that I know of come from MS SQL "users"

- so some systems seem to be picky about row-level locking, others not

- the SQL handling problem itself here is bound very closely to the real-world problem of demanding a physically limited thing

- row-level locking in MS SQL Server under load could cause "lock escalation", i.e. a much more general locking like page locking

- maybe one has to think about some denormalisation, which of course has to be accompanied by adequate constraints, keys, triggers, etc.

- handling the "grabbing" might be easier, quicker, and more robust if handled by a single, atomic Grab Stored Procedure (considering the denormalisation)

 

- which version is in use?

- could you elaborate «..the actual grabbing is totally void of transactions..» please?

- or describe the workflow a little?
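The atomic Grab Stored Procedure idea above can be sketched with the classic queue-grab hint combination, where UPDLOCK serializes competing grabbers and READPAST lets them skip rows another requestor has already claimed instead of blocking (table, column, and procedure names are all hypothetical):

```sql
-- Claims up to @BatchSize unallocated deliveries for one requestor
-- in a single atomic statement. UPDLOCK serializes competing
-- grabbers on each row; READPAST makes them skip already-locked
-- rows rather than wait or deadlock; ROWLOCK discourages
-- escalation to page locks.
CREATE PROCEDURE dbo.GrabDeliveries
    @RequestorID INT,
    @BatchSize   INT
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE d
    SET    d.AssignedTo = @RequestorID,
           d.Status     = 'ALLOCATED'
    OUTPUT inserted.DeliveryID          -- return what was grabbed
    FROM  (SELECT TOP (@BatchSize) *
           FROM   dbo.Deliveries WITH (UPDLOCK, READPAST, ROWLOCK)
           WHERE  Status = 'PENDING'
           ORDER  BY Priority, DeliveryID) AS d;
END
```

Because select, claim, and result all happen in one statement, there is no window between "see it" and "take it" for a second requestor to exploit.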

 

On 3/16/2019 at 2:40 PM, jobo said:

- handling the «grabbing» might be easier, quicker, more robust, if handled by a single, atomic Grab Stored Procedure (considering the denormalisation)»


That's why I'd like to find an SQL group to get some hints on best practices.

 

- which version is in use?

2008r2, 2012, 2016
 

- could you elaborate «..the actual grabbing is totally void of transactions..» please?

Some of the current code does not do explicit transaction handling at all.
 

 

- or describe the workflow a little?


That would become very detailed, very quickly

 


 

On 3/18/2019 at 1:59 PM, Lars Fosdal said:

Some of the current code does not do explicit transaction handling at all.

From my point of view, this is the best way to make use of database transaction control when you are on the client side.

The server's transaction handling is the server-side implementation that guarantees a complete, error-free transaction, considering each and every rule known to the server.

So-called client-side "transaction handling" tries to achieve the same without being the server. Of course you can do this, but it f..s up the server principle and comes with a price. It uses more resources, and by doing so it leads more quickly to the situation you'd like to avoid: timeouts, lock escalation (MSSQL), locked tables, deadlocks.


 

So the first thing to do is to make the process as lean as possible. Prepare everything possible before the final transaction.

Use highly specific SPs if possible (the RDBMS offers strong SP features), which could mean, for example, avoiding parameter-driven "smartness" in the SP and choosing the best possible SP (and processing path) from the client.

If necessary, model the "processing path" more explicitly in your database, which could be a table for "reservations", or a distinction between a view mode (of the data to be processed) and an "I'm really buying now" mode.

Leave the developer comfort zone and stop using client transactions; instead, make tiny, meaningful steps in the backend (transactions) and frontend, and try to handle resource problems that arise gracefully.


We do all data change via SPs of which some do not do explicit transaction handling.

The current SP call handling will detect if the call was a deadlock victim and rerun it if necessary - but that basically escalates the risk of yet another deadlock.
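A retry-on-deadlock wrapper typically keys on error 1205 and backs off briefly before each attempt, since retrying immediately just re-enters the same contention. A server-side sketch (the called procedure is hypothetical; TRY/CATCH with THROW requires SQL Server 2012 or later, so on 2008R2 RAISERROR would be needed instead):

```sql
-- Retry a grab call when it was chosen as a deadlock victim
-- (error 1205). Anything else is re-raised immediately.
DECLARE @Attempts INT = 0;

WHILE @Attempts < 3
BEGIN
    BEGIN TRY
        EXEC dbo.GrabSomething;            -- hypothetical SP
        BREAK;                             -- success: stop retrying
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() <> 1205
            THROW;                         -- not a deadlock: re-raise
        SET @Attempts += 1;
        WAITFOR DELAY '00:00:00.200';      -- back off before retrying
    END CATCH
END
```

The short delay is what keeps the retry from simply re-colliding with the transaction that won the first round.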

 

The bigger sites have nearly a hundred pickers - so concurrent access is inevitable.
We need to identify where we do not need exclusive access and explicitly use NoLock where appropriate.
We need to identify where transactions will be feasible / required.

We need to identify a best practice for resource allocation. 

 

The challenge is that the rules are complex and the dataset is dynamic (pallets are emptied and replaced many, many times during a day).

E.g. get the best pick position for n items of article x with date requirement y which has the best fit for our pick route and least incoming traffic.
To further complicate it, different clients have different rules for date requirements due to longer transports, or specific rules per article for shorter requirements due to high turnover. The list of these varying parameters goes on and on.

If two clients make the same request at the same time - one has to be the winner of what may be a scarce resource, while the other needs a viable plan b.
The pick request may consume the remaining items on one pallet, and then grab the rest from the next pallet - if there is one.

To further complicate it, the allocation may need to be split onto multiple deliveries to multiple clients within the pick order.
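The "drain one pallet, then take the rest from the next" step can be expressed as a running total, so a single statement decides how many pallets are touched and how much comes off the last one. A sketch under assumed names (`dbo.Pallets`, `Quantity`, etc. are hypothetical; the windowed SUM with a ROWS frame needs SQL Server 2012 or later):

```sql
-- Plan a pick of @Needed items of one article across pallets in
-- order: the running total per pallet determines the take quantity,
-- including the partial take from the last pallet touched.
DECLARE @Needed INT = 30, @ArticleID INT = 42;  -- example values

WITH Ranked AS (
    SELECT PalletID, Quantity,
           SUM(Quantity) OVER (ORDER BY PalletID
                               ROWS UNBOUNDED PRECEDING) AS RunningTotal
    FROM   dbo.Pallets WITH (UPDLOCK, HOLDLOCK)  -- keep the plan stable
    WHERE  ArticleID = @ArticleID AND Quantity > 0
)
SELECT PalletID,
       -- take everything except the amount exceeding the requirement
       Quantity - CASE WHEN RunningTotal > @Needed
                       THEN RunningTotal - @Needed
                       ELSE 0 END AS TakeQty
FROM   Ranked
WHERE  RunningTotal - Quantity < @Needed;  -- only pallets still needed
```

If the total available falls short of `@Needed`, the result simply covers what exists, which is the point where the losing requestor's "plan B" logic has to kick in.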

 

What is blatantly clear is that we currently are not doing it the optimal way.

 


I'm not an MS SQL specialist. In fact, I don't like it. So I have no specific points here.
 

From your description, it sounds like everything is mixed up a little. I don't know if this is just an impression from the description, or if it describes the present situation.

The problem I mentioned above occurs on selling limited real world articles (not like software license keys). And selling things which are already sold is really a problem.

But this is different from optimizing picking paths, keeping pallets available etc.
 

So without specific problems in question I can just give some final hints:

clean up and separate issues (maybe even separate systems)

do application and processing optimization on every single step (having a few lazy spots is sufficient to break the system*)

in fact sql tuning (single statements) could speed up things by factors you can't afford in hardware

tuning could be achieved by

- denormalization for speed (space/precalculation is the price for speed, speed avoids growing locks)

- normalization for speed (find old flaws, which call for double work and endanger data accuracy within the time critical process)

- index optimization (mostly indexing the proper columns)

- using the most advanced features of the system (which could be e.g. an SQL feature like CTEs, or a really cool new index technology; I'm not up to date with SQL Server here)

- avoid local (client side) processing in favour of SP (the famous "if loop" and so on)

- avoid costly deduction of states, use modelled, explicit, simple states

- consider additional technologies to notify clients on state changes etc. (decreasing stock count, ..)

- consider specialized use cases to ease the situation, and include data model changes if necessary (see also: de-normalization)


 

*Finally healing a few smaller flaws could add up to more than the sum of the single effects.


Is it a two-tier (multiple clients/one server) or three-tier system (with some middleware or website between the clients and the server)?   

 

In the first case, consider using a CTE (it runs atomically).  In the latter case, consider using a mutex. 
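The atomic CTE approach, for example, folds the select-then-claim into one statement, so there is no gap for a second client to race into. A minimal sketch with hypothetical names:

```sql
-- One atomic statement: the top pending delivery is claimed and
-- returned in the same operation, leaving no SELECT-then-UPDATE
-- window for a concurrent client. All names are hypothetical.
DECLARE @ClientID INT = 7;  -- example requestor

WITH NextDelivery AS (
    SELECT TOP (1) *
    FROM   dbo.Deliveries WITH (UPDLOCK, READPAST)
    WHERE  Status = 'PENDING'
    ORDER  BY Priority, DeliveryID
)
UPDATE NextDelivery
SET    Status     = 'ALLOCATED',
       AssignedTo = @ClientID
OUTPUT inserted.DeliveryID;  -- tells the client what it won
```

A client that gets an empty result set simply lost the race and can fall back to its plan B.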

