We regularly have problems with the performance of our Agresso system. I’ll rephrase that – we are regularly asked to look into the performance of our Agresso system.
(Disclaimer: We may not be running the most up-to-date version of the software; however, there are always lessons that can be learned from examining the behaviour of applications.)
The focus of my investigation this time was not the same as that of my predecessors; it was more about understanding how the application accessed the database. So I fired up SQL Server Profiler and aimed it at the Agresso database.
But first, a couple of observations:
- Firstly, as is documented elsewhere, there are no stored procedures in this version of Agresso’s database, which merits a WTF all to itself.
- Secondly, the good folks at Agresso have decided that the normal SQL-Server-provided TempDB isn’t good enough, so they use their own. It’s busy, I’ll grant you, but even so… Apparently it is possible to reconfigure the Agresso system to use the real TempDB, but this hasn’t been done here, for whatever reason. (I suppose there is an argument for a dedicated temporary database in an environment where one database server serves many databases, but it still feels a bit wrong. This server is dedicated to Agresso.)
And now to the performance monitoring. I took an afternoon’s activity from the middleware / web application server as my baseline, as most of our users go through this and this was feeling particularly slow. I made the following notes:
Two components of the system run this query against a particular table every 15 seconds, and six other processes run it every five minutes. It’s not necessarily the most efficient way of getting the row count (though at least in this instance it uses an index scan rather than a table scan).
I did some performance comparisons against this 18-row table. The screenshot below shows the results:
As you can see, the sysindexes query was responsible for just 8% of the query load. I never did get round to encapsulating both behaviours in stored procedures and going for the foursome! However, given the number of times this particular query runs, that technique should deliver a worthwhile performance improvement.
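Even without touching the server, the polling load could be cut on the client side by caching the count for a few seconds. Here is a minimal sketch of the idea in Python against SQLite (purely illustrative: the class, the TTL, and the stand-in table name are all my inventions, not anything Agresso does):

```python
import sqlite3
import time

class CachedRowCount:
    """Cache a table's row count briefly instead of re-issuing
    SELECT COUNT(*) on every poll."""

    def __init__(self, conn, table, ttl_seconds=15.0):
        self.conn = conn
        self.table = table  # hypothetical, trusted table name
        self.ttl = ttl_seconds
        self._value = None
        self._fetched_at = 0.0

    def get(self):
        now = time.monotonic()
        if self._value is None or now - self._fetched_at > self.ttl:
            # Table names cannot be bound as parameters, so this
            # relies on the name coming from trusted code.
            cur = self.conn.execute(f"SELECT COUNT(*) FROM {self.table}")
            self._value = cur.fetchone()[0]
            self._fetched_at = now
        return self._value

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_polled_table (id INTEGER)")  # stand-in 18-row table
conn.executemany("INSERT INTO some_polled_table VALUES (?)",
                 [(i,) for i in range(18)])
counter = CachedRowCount(conn, "some_polled_table")
print(counter.get())  # 18 -- later calls within the TTL never hit the database
```

Eight processes polling the same count would then cost one query per TTL window each, rather than one per poll.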
I ran the monitor for just under three hours. During that time, over 27,000 of the 57,000-plus captured queries were calls by CS_SERVICE to update the cwTK_Task table, and the vast majority of those were updates to the tk_status field by 16384 (ah, the joys of Magic Numbers).
One. Record. At. A. Time. (Or “RBAR” – “Row By Agonizing Row” as I’ve seen elsewhere.)
And the list of rows to be updated? From a relatively straightforward query.
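For comparison, here is the shape of the RBAR pattern next to its set-based equivalent, sketched in Python against SQLite. The table and column names and the 16384 flag come from the trace; the two-column schema and the selection predicate are assumptions, since the original query was only ever seen in a screenshot:

```python
import sqlite3

TK_FLAG = 16384  # the magic number observed in the trace

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cwTK_Task (cwTK_guid TEXT PRIMARY KEY, tk_status INTEGER)")
conn.executemany("INSERT INTO cwTK_Task VALUES (?, ?)",
                 [(f"guid-{i}", 0) for i in range(1000)])

def update_rbar(conn, guids):
    """One round trip per row, as observed from CS_SERVICE."""
    for guid in guids:
        conn.execute(
            "UPDATE cwTK_Task SET tk_status = tk_status + ? WHERE cwTK_guid = ?",
            (TK_FLAG, guid))

def update_set_based(conn):
    """One statement updates every qualifying row at once. The WHERE
    clause is a placeholder for whatever the original 'relatively
    straightforward query' selected."""
    conn.execute(
        "UPDATE cwTK_Task SET tk_status = tk_status + ? WHERE tk_status = 0",
        (TK_FLAG,))

update_rbar(conn, [f"guid-{i}" for i in range(10)])  # 10 round trips
update_set_based(conn)                               # the rest, in one go
flagged = conn.execute("SELECT COUNT(*) FROM cwTK_Task WHERE tk_status = ?",
                       (TK_FLAG,)).fetchone()[0]
print(flagged)  # 1000
```

The end state is identical either way; the difference is 1,000 statements versus one, plus all the per-statement parse and network overhead that goes with them.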
Things get a bit odder when we look at the use of temporary tables. Roughly 2% of the queries recorded during my monitoring period were DDL creating or dropping temporary tables. The process was a little unusual:
- Drop table
- Create table
- Do stuff
- Drop table
Now, a few thoughts come to mind here, and I noticed a couple of other interesting “features”:
- As expected, these are all created in the agrtempdb rather than in the standard MSSQL TempDB. See above.
- The table names are hardcoded, rather than using the normal # prefix that would make them genuine temporary tables.
- The first DROP statement is issued without first checking whether the table exists.
- There are still tables in the agrtempdb that were created nine months ago, so the tidy-up process isn’t fool-proof. If this had been the “real” tempdb, those tables would have been long gone – the server was, until fairly recently, being rebooted every five weeks (don’t ask).
- The tables themselves were not consistent. I noticed a few names coming and going regularly – HAGRTPS121, for example – and I filtered out the DDL for this table. It seems that there are (at least) six different versions of the table being created at various times; and there are other tables that are similarly “blessed”.
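The drop/create/do-stuff/drop dance could be made both safer and self-cleaning with a guarded drop and a genuine temporary table. A sketch in Python against SQLite (the principle carries over to T-SQL, where #tables are likewise scoped to the session; the worker-table name and columns here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Guarded drop: no error if a previous run left nothing behind.
conn.execute("DROP TABLE IF EXISTS HAGRTPS121")

# A real temporary table is private to this connection and is cleaned
# up automatically when the connection closes -- no nine-month-old
# leftovers, and no name clashes between concurrent sessions.
conn.execute("CREATE TEMP TABLE work_rows (id INTEGER, payload TEXT)")
conn.execute("INSERT INTO work_rows VALUES (1, 'stuff')")
rows = conn.execute("SELECT COUNT(*) FROM work_rows").fetchone()[0]
print(rows)  # 1

conn.close()  # the temp table vanishes; no explicit final DROP needed
```

With session-scoped temporary tables, the six competing versions of HAGRTPS121 would also stop treading on one another, since each session would see only its own copy.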
Round-trip SELECTs on cwTK_Task
There’s a query run against the cwMQ_message_queue table that selects a load of data. Then, for each record returned from that query, another SELECT statement is issued to retrieve further data. Why?
The original select isn’t too complicated:
And for each record / cwTK_guid returned in the above query, run the following:
Once again, I have to ask – why not do it at the server and save all those round trips? Why not wrap it up in a stored procedure? Why not use a single set-based query to get the whole lot?
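Without the original SQL to hand I can only sketch the shape of the fix, but the N+1 pattern and its single-query replacement look roughly like this in Python against SQLite. Only the two table names and the cwTK_guid key come from the trace; every other column is invented for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cwMQ_message_queue (mq_id INTEGER PRIMARY KEY, cwTK_guid TEXT);
    CREATE TABLE cwTK_Task (cwTK_guid TEXT PRIMARY KEY, tk_detail TEXT);
""")
for i in range(3):
    conn.execute("INSERT INTO cwTK_Task VALUES (?, ?)", (f"g{i}", f"detail-{i}"))
    conn.execute("INSERT INTO cwMQ_message_queue VALUES (?, ?)", (i, f"g{i}"))

def fetch_n_plus_one(conn):
    """One query for the queue, then one round trip per cwTK_guid --
    the pattern observed in the trace."""
    out = []
    for (guid,) in conn.execute("SELECT cwTK_guid FROM cwMQ_message_queue"):
        (detail,) = conn.execute(
            "SELECT tk_detail FROM cwTK_Task WHERE cwTK_guid = ?",
            (guid,)).fetchone()
        out.append((guid, detail))
    return out

def fetch_joined(conn):
    """The same data in a single set-based round trip."""
    return conn.execute("""
        SELECT q.cwTK_guid, t.tk_detail
        FROM cwMQ_message_queue AS q
        JOIN cwTK_Task AS t ON t.cwTK_guid = q.cwTK_guid
    """).fetchall()

n_plus_one = fetch_n_plus_one(conn)
joined = fetch_joined(conn)
print(len(joined))  # 3
```

For a queue of N messages that is one round trip instead of N+1, and the server gets a chance to optimise the whole retrieval as one plan.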
AgrDWS.exe – More SELECT COUNT(*) goodness
Roughly half the queries observed from AgrDWS.exe were of the form
Why not replace that little lot with a function call that takes the table name as a parameter, and save a bit more of that network bandwidth?
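Failing a server-side function, the counts could at least be batched: a single UNION ALL query can return the counts for every table of interest in one round trip. A sketch in Python against SQLite (the helper and the stand-in table names are mine; they are not anything AgrDWS.exe provides):

```python
import sqlite3

def batched_counts(conn, tables):
    """Return {table: row_count} in a single round trip instead of one
    SELECT COUNT(*) query per table. Table names cannot be bound as
    parameters, so they must come from a trusted list."""
    sql = " UNION ALL ".join(
        f"SELECT '{t}' AS tbl, COUNT(*) AS n FROM {t}" for t in tables
    )
    return dict(conn.execute(sql).fetchall())

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER);
    CREATE TABLE table_b (id INTEGER);
    INSERT INTO table_a VALUES (1), (2), (3);
    INSERT INTO table_b VALUES (1);
""")
counts = batched_counts(conn, ["table_a", "table_b"])
print(counts)  # {'table_a': 3, 'table_b': 1}
```

Half of AgrDWS.exe’s traffic collapsing into one query per polling cycle would be a decent saving for a one-line change of shape.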
The SQL Server was not stressed at any time during the making of this post. The application / web server similarly wasn’t stressed. The main problem is the sheer number of queries, each doing its own tiny little thing; with all the overhead that goes along with each individual query, it adds up…
Application performance enhancement is not just about index tuning. It’s about designing and writing the application properly.
And a note to the team at Agresso – you’re not the only ones who do this, and I know there’s a new version of your software coming out soon / out now, and I hope that you’ve already addressed these issues…