Tuesday, October 6, 2009

The Automatic Workload Repository

1.0 Introduction
Here’s a game to play on a wet Wednesday afternoon. Take a 9i database. Write a script which will schedule the taking of a Statspack snapshot every hour -indeed, let’s just simply run Oracle’s own script (spauto.sql) to do precisely that. Whilst we’re at it, maybe we should rename the schema into which the snapshots are written -after all, PERFSTAT is a bit of a weird name, isn’t it? In fact, why don’t we just let SYS own the relevant tables: why create a whole new user anyway? Of course, we’ll have to think up a brand new name for this ‘automated Statspack’, otherwise nobody will think it’s anything very new. Hmmm... well, it would be good if we could get the word ‘repository’ in the new name somewhere, because it’s a word that is used everywhere else in the database for no good reason; it always sounds imposing and impressive; and its meaning is sufficiently obscure that paying customers won’t quite know what they’re getting. Also: the thing is scheduled by way of our script, so it runs without manual intervention... sounds like we could squeeze the word ‘automatic’ in there in that case. People pay good money for automation, after all. I know: let’s call it the ‘Automatic Workload Repository’!
OK, OK! I’m being cynical, and I’m over-stating the case a little. But one of 10g’s new features is indeed called the Automatic Workload Repository; and, as a name, that does indeed sound pretty sexy and impressive. But when you boil it down to its basics, it is in fact very much what you would have achieved in 9i merely by scheduling Statspack snapshots every hour. Sure: the quality of the snapshot statistics collected in 10g has improved enormously; for a start, they are more detailed than their 9i cousins ever were. It is also true that the collection of the snapshots in 10g is done by a new background process called MMON, rather than by a server process as with Statspack in 9i. And I will concede that you control and manage the collection of statistics by executing ‘dbms_workload_repository.create_snapshot’ instead of logging on as PERFSTAT and executing ‘statspack.snap’ (though I think you might agree that the old command had a distinct advantage in brevity and ease of typing!) But despite all those acknowledged differences, Automatic Workload Repository (AWR) remains, when all is said and done, scheduled/automated Statspack on Steroids.
I ought to just mention at this point, however, that AWR doesn’t physically enhance or improve on Statspack: it is truly a replacement for it, meaning that the unadorned Statspack is still there to be run (exactly as you would have done in 9i) if you want to. The strong recommendation from Oracle, however, is that you don’t bother: AWR does what Statspack used to do, for sure; but it does it better and with less fuss.
So, in this article, I’ll try and explain how AWR works, what it does when it works, and how you can use its work to help improve the performance of your database. But I’d like you to start out thinking of AWR not as some new, revolutionary feature or capability, but merely as a better and different way of achieving something you were probably doing perfectly well in 9i. You’ll find, I think, that whilst the name might have gotten a lot sexier, the principle of taking snapshots of the dynamic performance views to assess database performance hasn’t.
2.0 A short history of performance tuning...
Oracle uses in-memory structures to record the current state and condition of the database. You can view the contents of these memory structures by querying various dynamic performance views -for example, V$SYSSTAT and V$WAITSTAT, to name but two. The contents of such views are constantly updated, because the system keeps the underlying memory structures up-to-date, in more or less real time.
Unfortunately, a lot of the memory structures (and hence the views) merely accumulate statistics. That’s a problem, because -for example- knowing merely that your database has experienced 3 million buffer busy waits is meaningless. Three million waits ...since when? If your instance has been running uninterruptedly for three years, then that many waits is probably trivial. But if your instance has been running for three days, then you’ve got serious problems! A cumulative total of events is no good to anyone, in other words: you need to know a count of events over a defined period of time.
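To make that concrete, here’s a minimal sketch you could run on any 9i or 10g database (the ‘physical reads’ statistic is picked purely for illustration). Run the query once, note the value, run it again later, and the difference between the two results is your count of events over a known period:

SQL> -- run this twice, some minutes apart, and subtract the results
SQL> select name, value from v$sysstat
  2  where name = 'physical reads';

Everything discussed below, from the ancient utlbstat script right through to AWR itself, is at heart just machinery for doing that subtraction for you, across hundreds of statistics at once.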
The other problem with merely looking at the V$ views to assess database performance is that they are based only on memory structures -which are inevitably wiped when the instance is shut down, and re-created from scratch next time you start the instance up. So if you want to know how the database performed three months ago when you last performed that quarterly archiving and loading bulk process... well, tough luck: the information’s not there.
To resolve the problem of cumulative and timeless (and hence hopeless) statistics, Oracle gave us -way back at least as far as version 7- the utlbstat.sql and utlestat.sql scripts. These let you capture the V$ statistics at one time with utlbstat, capture them again at another time with utlestat, and then compare the two sets of numbers (something utlestat did intrinsically just after it captured the second set of statistics). The number of buffer busy waits at one time might have been captured as 3,000,000 and at the second time as 3,000,105 -and suddenly you know that the database has suffered 105 buffer busy waits in the time period spanning the two captures: a much more meaningful piece of information.
These ancient scripts did not, however, resolve the second problem I mentioned: of a lack of history of statistics, and hence an inability to do trend analysis. That’s because each run of utlestat.sql wiped the entire set of statistics, so each run of utlbstat started with a clean slate. Thus, in 8i, Oracle felt compelled to introduce Statspack. Statspack also did the capturing of V$ statistics at different times, thus allowing you to do comparisons over a defined period of time. But the results of the captures (the Statspack ‘snapshots’) were not simply discarded after each run, but were physically stored in ordinary, normal database tables -owned by a specially-created user called PERFSTAT. To take the actual statistics snapshots, though, you had to log onto the database as PERFSTAT and execute a special package and procedure: statspack.snap. If you wanted automatic collection of statistics every hour, for example, then you either had to schedule your own job to do the deed -or you could simply run the spauto.sql script supplied by Oracle, because that did indeed schedule snapshots to be taken every hour. If you logged on to take a manual snapshot, then it was your server process which did the actual deed. If you scheduled a job to do it, then a Job Queue process did it. When you wanted to pick which set of stored statistics snapshots you wanted to compare to assess database performance, you ran another Oracle-supplied script called spreport.sql. This output a text file which you could then pore over for hours, trying to puzzle out what it all meant... or (if you were sensible!) you could upload it to www.oraperf.com, where some scripted cleverness would analyse it into something meaningful for you in moments, and all for free.
And in this way did the DBAs of the early years of the third millennium tune their databases. Here endeth the lesson...
3.0 What’s new in AWR
All of the above is claimed to be ancient history in 10g, but as I was hinting in my introduction to this article, it is still very much reality. You still need to grab V$ statistics at different times to convert the cumulative numbers into something that is actually meaningful. And you still need to store those ‘grabs’ in a set of tables somewhere if you want to do historical trend-analysis of performance over time.
It is true, however, that we no longer have to create a new user to own the tables into which these sorts of statistics snapshots are written: SYS is now the owner of the relevant tables. And they get created automatically for us inside the new SYSAUX tablespace, with names that all begin with WRx$, where x can be ‘H’ or ‘M’:
SQL> select table_name from dba_tables
  2  where owner='SYS' and tablespace_name='SYSAUX'
  3  and table_name like 'WR%'
  4  and table_name not like 'WRI%'
  5  order by table_name;

TABLE_NAME
------------------------------
WRH$_ACTIVE_SESSION_HISTORY_BL
WRH$_BG_EVENT_SUMMARY
WRH$_BUFFER_POOL_STATISTICS
WRH$_CLASS_CACHE_TRANSFER_BL
WRH$_CR_BLOCK_SERVER
WRH$_CURRENT_BLOCK_SERVER

[and so on for 67 rows!]
We also don’t have to perform our own snapshots any more (though, as I’ll go on to show, you can still do so if you wish), and we don’t have to create or run any scripts to schedule the automatic taking of snapshots. Instead, we get a whole new background process called Manageability Monitor (MMON) to take automatic snapshots for us, provided only that an initialisation parameter is set to an appropriate value (and I’ll explain which parameter and what value in just a moment).
By default, MMON will take statistics snapshots for us every 60 minutes -though I note that the Oracle Press book Oracle 10g New Features by Robert Freeman incorrectly asserts the collection period to be every 30 minutes by default (see, for example, page 54). I guess this is one of those times when using a beta product as the basis for assessing a new product becomes a bit awkward -and, to be fair to Mr. Freeman, the 30 minutes figure is liberally sprinkled around the Internet, too, as a quick look through Google will confirm (one site I found, picked entirely at random, was still saying it as at February 2005). Trouble is, you can prove the 60-minute default just by issuing this query (this comes from a freshly-created template 10g database):
select snap_interval, retention
from dba_hist_wr_control;

SNAP_INTERVAL       RETENTION
------------------- -------------------
+00000 01:00:00.0   +00007 00:00:00.0
Here, the ‘snap interval’ is shown to be, indeed, 1 hour. But this report doesn’t just prove the ‘30 minute default’ statements to be wrong. It gives a clue to the existence of a whole bunch of new DBA_HIST_xxx data dictionary views which you can use to monitor the entire statistics collection mechanism. And it also shows that the statistics MMON captures are automatically purged -by default, as my report’s RETENTION period shows, after 7 days. Statspack snapshots, by contrast, just piled up forever unless you remembered to clear them out -so that’s another genuinely new feature.
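Incidentally, if you want to see just how many of those DBA_HIST_xxx views you’ve acquired, a quick (and entirely illustrative) poke at the data dictionary will list them all:

SQL> select view_name from dba_views
  2  where view_name like 'DBA_HIST%'
  3  order by view_name;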
Some statistics that are important for performance tuning are session-specific ones, rather than instance-wide ones. The trouble is that to really understand what sessions are up to, and the performance issues they might be encountering, you need to sample these sorts of statistics an awful lot more frequently than once an hour. In fact, you should probably be thinking of grabbing these sorts of statistics every second or so -which is precisely what now happens in 10g. Every second, the system looks at V$SESSION to see which sessions are actually busy doing something (idle sessions are ignored). For those sessions, the system then sees if they are currently waiting on anything in the database (effectively looking at V$SESSION_WAIT). If sessions are indeed waiting, then that information is stored in a new memory buffer called the Active Session History (ASH) -whose contents you can view at any time by querying the new V$ACTIVE_SESSION_HISTORY view. The ASH buffer has its contents sampled (1 row in 10) and flushed to disk at each MMON statistics snapshot (that is, every 60 minutes by default). If there are too many sessions doing too many waits, however, the buffer will fill before the next MMON snapshot, so another new background process called Manageability Monitor Light (MMNL) flushes a sample of the ASH records direct to the AWR tables as needed. MMON will still flush its sample of the ASH statistics to the AWR when its next snapshot interval falls due.
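And just as a quick sketch of the sort of thing ASH contains (the view and columns here do exist in 10g, but treat the exact predicate as illustrative), you can peek at the most recent in-memory samples of waiting sessions like this:

SQL> -- a few recent samples of sessions caught waiting
SQL> select session_id, sample_time, event
  2  from v$active_session_history
  3  where session_state = 'WAITING'
  4  and rownum <= 10;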
The statistics MMON captures at each snapshot interval are greatly improved over the kind of thing Statspack managed to collect. For a start, there are more base statistics -operating system ones, for example, along with better SQL and optimiser ones. And, brand new to 10g, there are time-based statistics that record how long activities have run for.
But base statistics are simply the raw count of things happening. Metrics are statistics derived from the base statistics, and usually indicate the rate at which things are happening. For example, a base statistic might tell me the database has performed 1,000,000 physical reads since instance startup. A metric based on that statistic might be that the database has performed 5000 physical reads in the last hour, meaning an average of 1.4 physical reads per second. The base statistics are only of use, as I mentioned earlier, if you’ve got something from an earlier period to compare them with -say, 995,000. Then you can see that the difference yields you the ‘5000 since last measured’ metric.
One of the main jobs that MMON performs, aside from the periodic capturing of statistics snapshots, is to calculate and re-calculate numerous system metrics almost continuously (for most of them, about once every minute). The computed metrics are visible in another bunch of new views -such as V$SYSMETRIC- as well as being stored in the AWR proper when the usual snapshot interval comes round. Having the metrics recomputed continuously in this way means you can assess the effect of running a particular job practically instantaneously: you just run the job, then query the new V$ views. Compare that to 9i, where you would have had to arrange for the calculation of the appropriate metrics by manually taking a Statspack snapshot just before and just after you ran the job, and then by running a new Statspack report to compare the two snapshots.
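As a purely illustrative sketch (‘Physical Reads Per Sec’ is a genuine 10g metric name, though your numbers will obviously differ), you can ask for the current physical read rate directly:

SQL> select metric_name, value, metric_unit
  2  from v$sysmetric
  3  where metric_name = 'Physical Reads Per Sec';

Expect two rows back, incidentally: one computed over the long (60-second) measurement interval and one over the short (15-second) one.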
Finally, I should mention that AWR is fully RAC-aware, in a way that Statspack never was. Certainly, you could get Statspack to capture statistics for every instance in the cluster. But you had to do that on each instance in turn: there was no way to ask for all instances in the cluster to simultaneously snapshot their individual statistics. What’s more, within the PERFSTAT tables, there was no way to distinguish which instance a particular statistics snapshot had come from, so picking the right snapshots to compare was more grief than it should have been. With AWR, by contrast, MMON in one instance co-ordinates with MMON in all the others, so that they all (virtually) simultaneously take a snapshot of their own instance -and then each of those separate snapshots is written into the AWR tables with the same Snapshot ID, though with different instance numbers. That makes picking the right snapshots to compare extremely easy; it also makes understanding what one instance was doing at precisely the same time as another a lot simpler, too.
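On a RAC database, you can see that co-ordination for yourself with a sketch along these lines: the same SNAP_ID should appear once for each instance, distinguished only by its INSTANCE_NUMBER:

SQL> select snap_id, instance_number, begin_interval_time
  2  from dba_hist_snapshot
  3  order by snap_id, instance_number;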
3.1 Summary
Putting all of that together into one brief summary, therefore, we can say of AWR:
·    AWR is simply the physical and process infrastructure to collect instance and database statistics needed for proper performance tuning.
·    Statistics collection is automated (once per hour, by default)
·    Statistics purging is automated (statistics expire after a week, by default)
·    MMON is a new background process that does the collecting and purging
·    MMNL may assist in the collection of session statistics
·    Session statistics are collected every second for active sessions, but only in memory
·    One tenth of session statistics are written to the Workload Repository at the usual MMON collection
·    MMON collects more and better-quality statistics
·    Every minute, MMON re-computes metrics based on the statistics
·    AWR is fully RAC-enabled
So it really is fair to say that AWR brings new features and new qualities to the job of statistics collection that are a distinct improvement on the sort of mechanisms we had available to us in earlier Oracle versions. Whilst it is therefore possible to run traditional Statspack in a 10g database, it is rather pointless to do so, because MMON and the AWR achieve the same goal far more effectively.
4.0 Impact of AWR
You might reasonably pause at this point and ask: AWR sounds very convenient and clever, but it also sounds as though there’s an awful lot of automatic shenanigans going on in the background to make it all work... are you sure it won’t clobber my database’s performance to death? To which the semi-official answer is ‘Stop Worrying! It’s all referencing in-memory structures, so it’s fast and has been optimised’. The slightly more realistic answer is that yes, of course this puts a load on your database, and that load might be unacceptable.
The Active Session History memory buffer, for example, consumes a fixed 2MB of RAM per CPU (though with an absolute ceiling of no more than 5% of your total shared pool size). If you simply migrated from vanilla 9i to equally vanilla 10g, therefore, without also remembering to increase your shared pool size by that sort of figure to compensate, you’d be operating your database with smaller library and dictionary caches than you probably expected to have at your disposal -and that can definitely cause performance problems. Similarly, the AWR tables will probably end up consuming about 300MB of disk space if you stick with the default retention period of 7 days, and whilst disk space is always said to be cheap, can you afford that much space even so? And never mind the megabytes: disk space has a habit of turning into physical I/O, especially when MMON starts writing to it every hour... can you afford that much additional I/O?
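You can at least check what the ASH buffer is costing you on your own system. On my database the relevant V$SGASTAT entry is labelled ‘ASH buffers’, though treat the exact name as something to verify on your own release -hence the deliberately loose predicate here:

SQL> select pool, name, bytes
  2  from v$sgastat
  3  where name like '%ASH%';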
The point is, these new structures and snapshot-taking behaviours definitely do impose a load on the database, and that load might be unacceptable. On the other hand, what you can do with the information being collected by this process by way of tuning and enhancing your database’s performance might well be thought to outweigh such considerations. And there’s no doubt that Oracle has designed the whole process so that the impact on the database’s normal operations is minimised. But ‘minimised’ is not the same thing as ‘minimal’! It really boils down to careful testing and evaluation of how it behaves in your own environment, of course, and whether the benefits do outweigh the costs. My gut feel is that for most people, most of the time, they will... but it’s early days to know that for sure.
5.0 Managing AWR
Given that AWR is, as its very name tells us, ‘Automatic’, it’s perhaps not surprising that there’s not actually a lot of management to do to get it working! You’ll certainly need to know how to switch it on and off, and you might want to look at moderating the load it places on the database when it’s on. But after that, it’s all down to MMON just getting on with it. It is, however, possible that you’ll want to take manual statistics snapshots without waiting for MMON’s next wakeup call. And finally, you may need to mark particular snapshots as being typical of some ‘baseline’ level of database performance. So I’ll now examine how you go about doing each of those four things in turn.
5.1 Switching MMON on and off
MMON is switched on by setting the initialisation parameter STATISTICS_LEVEL. That parameter takes three possible values: ALL, TYPICAL and BASIC. If BASIC is set, then MMON is off. If either of the other values is set, then MMON is on and the Workload Repository will be populated. The default value for STATISTICS_LEVEL is TYPICAL, therefore MMON is actually on by default.
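Checking and changing the setting is straightforward, and you can also query the V$STATISTICS_LEVEL view to see exactly which pieces of functionality each setting controls (worth doing before you touch anything, for reasons I’ll come to in a moment):

SQL> show parameter statistics_level
SQL> alter system set statistics_level = typical;
SQL> -- what does each setting actually switch on and off?
SQL> select statistics_name, activation_level
  2  from v$statistics_level;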
Be careful about setting the parameter to ALL. Very detailed statistics related to the operating system and SQL statements are collected at this setting, and the impact of doing so on the database is quite severe. Oracle’s official line seems to be that you should never set the parameter to ALL unless Oracle support tells you to, or unless you are doing some extremely rare tuning of specific SQL statements. And even then, you wouldn’t leave it set like that for long!
Bear in mind that STATISTICS_LEVEL is a global switch that affects an awful lot more in the database than just MMON. For example, switching it off (by setting it to BASIC) will stop 10g doing automatic, dynamic SGA memory re-allocations, and will similarly switch off the various advisories that exist to inform you of the best possible manual settings for things like the shared pool size, or the right size for the PGA aggregate target. If you want all of those bits of functionality, but don’t want MMON automatically taking statistics snapshots, then you’ll have to go about disabling MMON in another way entirely (see section 5.2 below). Oracle Corporation, however, strongly suggests that you don’t even think about switching off MMON: the essence of a self-tuning, self-managing, self-automated database is that you let it do its automated stuff, and stop trying to get choosy about it. That’s a new paradigm for most Oracle DBAs to swallow, and some are going to find it harder to feel comfortable about than others. But that’s 10g all over!
5.2 Configuring MMON
There’s not a lot to configure regarding MMON, of course: it will just do its thing without your guidance or intervention, taking a new snapshot, as we’ve already seen, every 60 minutes and purging ones older than 7 days. The only thing you might want to do is alter those timings, and you do that by using the new DBMS_WORKLOAD_REPOSITORY package as follows:
exec dbms_workload_repository.modify_snapshot_settings (43200, 20)
The two parameters in the brackets there are, respectively, the AWR record retention period and the MMON snapshot interval time, both measured in minutes. In this example, therefore, I’m asking for repository records to be retained for 43,200 minutes -which people good at maths will immediately recognise as being equivalent to 30 days- and for MMON to take fresh snapshots every 20 minutes.
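A quick re-run of the earlier query against DBA_HIST_WR_CONTROL confirms the change has taken effect; with the values I used above, you should see something like this:

SQL> select snap_interval, retention
  2  from dba_hist_wr_control;

SNAP_INTERVAL       RETENTION
------------------- -------------------
+00000 00:20:00.0   +00030 00:00:00.0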
As a rough rule of thumb, allow 300MB for the repository tables if they are asked to retain snapshot data for the default period of 7 days -so a change in that retention period to 30 days implies my repository tables will end up occupying about 1.2GB of physical disk space. The retention period can be a minimum of 1440 (that is, one day) and a maximum of a somewhat ambitious 52,560,000 -a mere 100 years!! Incidentally, at 300MB per week, the AWR will consume about 1.5TB of data after 100 years. Just in case you were needing to budget a hardware purchase, you understand...
The snapshot interval in its turn can be as short as 10 minutes, and as ridiculously silly as 100 years. The shorter the interval, the finer-grained your view of database performance will be: mini-performance spikes will be detectable in the collected snapshots. But equally, the background work the database has to do will shoot up, and that will probably start imposing a measurable and unacceptable performance burden on your database. Conversely, the longer the interval, the less work MMON is doing, and the less the impact on your database... but the more blurred your view of database performance will be. If you smudge all the little ups and downs of performance that take place during a normal day, at intervals of a few minutes at a time, into (say) a single 10 hour-broad average, then you’ll really just have wasted your time, and the statistical reports will likely be of no use to anyone.
Although the technical minimum for the snapshot interval is, as I mentioned, 10 minutes, you are allowed to set it to 0. This setting has a special meaning: it disables MMON completely, and no automatic snapshots at all are taken. If you therefore wish to switch off MMON, but don’t want to set STATISTICS_LEVEL to BASIC because of all the other things that would disable at the same time, this is how you would go about doing it.
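Since the package accepts named parameters, a minimal sketch for doing just that (leaving the retention period untouched) would be:

SQL> exec dbms_workload_repository.modify_snapshot_settings(interval => 0)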
5.3 Taking Manual Snapshots
If you want to take your own manual statistics snapshot, instead of waiting for MMON to get around to it when the next snapshot interval elapses, then you can do so by using the CREATE_SNAPSHOT procedure of the DBMS_WORKLOAD_REPOSITORY package like this:
SQL> exec dbms_workload_repository.create_snapshot()

PL/SQL procedure successfully completed.
I’ve specified no arguments here at all, which is why there’s a pair of empty brackets at the end of that command, but actually you can specify there something called the flush_level as being either TYPICAL or ALL. You can therefore probably guess that this is a way of specifying the statistics level for each snapshot, regardless of what statistics level may have been configured for the instance as a whole by setting the STATISTICS_LEVEL initialisation parameter. If you don’t specify a flush_level, then you get the TYPICAL one, which you might have guessed would be the default, since that is equally the default value for the instance-wide STATISTICS_LEVEL parameter.
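So, if you wanted one extra-detailed snapshot without touching the instance-wide parameter, something like this would do the trick:

SQL> exec dbms_workload_repository.create_snapshot(flush_level => 'ALL')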
You can see the existence of the new snapshot by looking in the new data dictionary view DBA_HIST_SNAPSHOT, though it won’t yield an awful lot of particularly useful information:
SQL> select snap_id, to_char(begin_interval_time,'HH:MI:SS') as "START",
  2  to_char(end_interval_time,'HH:MI:SS') as "FINISH",
  3  snap_level from dba_hist_snapshot
  4  order by snap_id;

   SNAP_ID START    FINISH   SNAP_LEVEL
---------- -------- -------- ----------
         1 04:40:10 06:00:46          1
         2 06:00:46 07:00:12          1
         3 12:00:06 12:11:44          1
         4 04:27:28 04:39:15          1
         5 05:54:28 06:05:59          1
         6 12:09:09 12:20:31          1
         7 12:20:31 01:00:49          1
         8 01:00:49 02:00:26          1
         9 02:00:26 03:00:54          1
        10 03:00:54 04:00:29          1
        11 09:01:12 09:12:27          1
        12 09:12:27 10:00:45          1
        13 10:45:48 10:45:53          1
        14 10:47:22 10:47:29          2
        15 11:04:10 11:35:09          1
        16 11:35:09 11:35:10          1
        17 11:35:10 11:35:17          2
The main thing of note here, I think, is the SNAP_LEVEL column: you get a ‘1’ if the collection was a typical one, and a ‘2’ if it was done at the ALL level. My data is not particularly good here: it appears as though my ALL snapshots took just 2 seconds or so longer to complete than my TYPICAL ones. But that’s true only for the sad and puny test database I’m using right now. In a real production database, you would expect the differential to be much larger than that.
Incidentally, if you set the snapshot interval to zero as I described above, so that MMON is not performing automatic snapshotting, then you’ll actually not be able to take manual snapshots either. You’ll get this sort of error message if you try:
SQL> exec dbms_workload_repository.create_snapshot()
BEGIN dbms_workload_repository.create_snapshot(); END;

*
ERROR at line 1:
ORA-13516: SWRF Operation failed: INTERVAL Setting is ZERO
ORA-06512: at "SYS.DBMS_WORKLOAD_REPOSITORY", line 8
ORA-06512: at "SYS.DBMS_WORKLOAD_REPOSITORY", line 31
ORA-06512: at line 1
This tells you that although it is you and your session issuing the command, it cannot be your server process that actually performs the statistics collection. In fact, it tells you that even manual snapshots are performed by MMON: all your server process does in these manual situations is tell MMON to get on with it. And, of course, if MMON has been switched off, then there’s nothing and no-one around to respond to the instruction... and hence the error you see here.
5.4 Taking a Baseline
Assuming you let MMON do its thing and capture snapshots automatically, there are going to be pairs of snapshots which can be considered representative of the start and end of, say, ‘normal running’ or ‘monthly bulk load’ or ‘end-of-year reports running’. You can mark such pairs of snapshots and declare them to be either end of a named baseline. You do so like this:
SQL> exec dbms_workload_repository.create_baseline(13,14,'Nightly Batch Load')

PL/SQL procedure successfully completed.
Here, I’m marking snapshots 13 and 14 as having been taken at the beginning and end of a nightly batch load (and yes, if you look back to section 5.3, you’ll realise that my nightly batch load took all of about one minute to process, so it can’t have been much of a load! True enough, but then I’m only interested in getting the theory right here. Realism can wait!)
The point of creating a baseline in this way is that the baseline’s contributing snapshots are now exempt from the normal purging activity of MMON. They do not, in other words, ‘age out’ of the repository and in fact will be kept permanently -or, at least, until you decide to get rid of them yourself. Until then, you have a permanent record of how a nightly batch load behaved one night in January 2005 (in my case). If batch loads by July or August start running incredibly slowly, I can compare them to the way things were behaving at the start of the year, and thus (hopefully!) identify what has gone wrong in between the two periods.
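You can confirm that the baseline exists, and see exactly which snapshots it pins, by querying yet another of the new dictionary views:

SQL> select baseline_id, baseline_name, start_snap_id, end_snap_id
  2  from dba_hist_baseline;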
Remembering that if you create a baseline, it and its constituent snapshots will be forever immune to the automatic purging activity of MMON, the only way to get rid of an unwanted baseline is to delete it yourself, and you can do that by using the DROP_BASELINE procedure of the DBMS_WORKLOAD_REPOSITORY package, like so:
SQL> exec dbms_workload_repository.drop_baseline('Nightly Batch Load',TRUE)
PL/SQL procedure successfully completed.
The DROP_BASELINE procedure takes as its arguments the ‘friendly’ name of the baseline to be cleared out, and an optional second parameter which says whether the drop should ‘cascade’ down to clearing out the two snapshots which together define the baseline. The default for that argument is FALSE, so by default only the baseline ‘wrapper’ around the snapshots is itself deleted. In my example, though, I’ve wiped the participating snapshots as well.
6.0 Using AWR
With Statspack in 9i, you ran spreport.sql to output the results of comparing one snapshot with another. With AWR, you run awrrpt.sql -either that, or you use the web-based Enterprise Manager to achieve the same sort of thing. I’ll stick with running the manual report for now, and deal with Enterprise Manager in a later paper. The awrrpt.sql script is found in the place traditionally used for this sort of thing: $ORACLE_HOME/rdbms/admin.
Take a moment to have a look at the contents of this script, and you’ll find at the beginning of it this little give-away:
Rem NOTES
Rem Run as select_catalog privileges.
Rem This report is based on the Statspack report.
Rem
Rem If you want to use this script in an non-interactive fashion,
Rem see the 'customer-customizable report settings' section in
Rem awrrpti.sql
First, note that the script must be run by a user to whom the SELECT_CATALOG role has been granted. Second, marvel at the grammatical incompetence that allows anyone to tell you to ‘run as privileges’! Third, wonder at the inability of whoever wrote this to understand the difference between a privilege and a role. And fourth, and the bit I find most interesting, pay attention to the line that confesses ‘This report is based on the Statspack report’!! I know there are some people on the planet who tend to roll their eyes whenever I say ‘Oh, new feature X is really just old feature A on steroids’, but here we have the horse’s mouth, as it were, confirming my diagnosis!
Anyway, I digress! Here’s how you run that report:
SQL> show user
USER is "SYS"
SQL> @?\rdbms\admin\awrrpt

Current Instance
~~~~~~~~~~~~~~~~

   DB Id    DB Name      Inst Num Instance
----------- ------------ -------- ------------
  973646425 DBNEW               1 dbnew

Specify the Report Type
~~~~~~~~~~~~~~~~~~~~~~~
Would you like an HTML report, or a plain text report?
Enter 'html' for an HTML report, or 'text' for plain text
Defaults to 'html'
Enter value for report_type: html

Type Specified: html
I’ll break in at this point to draw your attention to the rather nice fact that the ‘new Statspack’ report can be output in html instead of just plain text.
Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   DB Id     Inst Num DB Name      Instance     Host
------------ -------- ------------ ------------ ------------
* 973646425         1 DBNEW        dbnew        KOALA
Using 973646425 for database Id
Using 1 for instance number

Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing <return> without
specifying a number lists all completed snapshots.
Listing the last 3 days of Completed Snapshots

                                                        Snap
Instance     DB Name        Snap Id Snap Started       Level
------------ ------------ --------- ------------------ -----
dbnew        DBNEW               11 11 Jan 2005 09:12      1
dbnew        DBNEW               12 11 Jan 2005 10:00      1
dbnew        DBNEW               15 11 Jan 2005 11:35      1
dbnew        DBNEW               16 11 Jan 2005 11:35      1
dbnew        DBNEW               17 11 Jan 2005 11:35      2
dbnew        DBNEW               18 11 Jan 2005 12:00      1

Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enter value for begin_snap: 11
Begin Snapshot Id specified: 11

Enter value for end_snap: 18
End Snapshot Id specified: 18
So, just as in the old Statspack days, you get shown which snapshots actually exist, and then asked to choose which one should be compared to which other one. You can pick any two you like, provided the first one is from a time before the second (hopefully for obvious reasons!)
Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is awrrpt_1_11_18.html. To use this name,
press <return> to continue, otherwise enter an alternative.
awr_rpt1.html

[snip a huge amount of screen-scrolling activity]

End of Report

Report written to awr_rpt1.html
And at the end, as it says, your report is complete -written, incidentally, into the same directory you were in when you first launched SQL*Plus. The start of my report looks like this:

[screenshot: the opening section of the HTML AWR report, showing the general instance details]
...and if you’re at all familiar with the start of a Statspack report, you’ll recognise the general instance statistics this is showing you. And, just like an old Statspack report, you can scroll on to discover what your top 5 wait events were, what your most expensive SQL statements were, and so on. At which point, I hope you are an experienced Statspack analyst with vast experience in immediately drawing the right tuning conclusions from the various statistical oddities the report will have buried within its details... because you’re going to need to be! For whilst the report just produced might look a bit like an old Statspack report, it isn’t one... and hence useful websites like Oraperf, which can analyse your Statspack reports for free in just a few minutes, don’t (as yet, anyway) allow you to upload their 10g equivalents. So if you’re not a Statspack Report analysis expert, it’s highly questionable whether you should bother to produce an ‘awrrpt’ at all!
Which would be a bit of a depressing note to conclude on, if it was the whole story. But, of course, it isn’t at all -because in 10g there are a lot of automated, in-built clients of the Workload Repository, which draw upon the statistical information it contains to suggest all manner of performance tuning tweaks you might want to apply to your database by hand, and sometimes to carry out performance tuning tweaks totally automatically. For example, the Automatic Database Diagnostic Monitor (ADDM) uses the AWR statistics to detect and resolve database configuration and performance problems. Or, again: there’s a new set of server-generated alerts which similarly draw upon the AWR data to predict when, for example, space problems are going to arise on the database. And, as one final example, the new SQL Tuning Advisor can also use the AWR to work out how SQL statements could be better written.
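Just to give a taste of the first of those examples (I’ll cover ADDM properly in a later article): there’s an addmrpt.sql script living alongside awrrpt.sql, and it walks you through choosing a pair of snapshots in much the same interactive way, except that what comes out the other end is a list of ADDM’s findings and recommendations rather than raw statistics:

SQL> @?\rdbms\admin\addmrpt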
In short, there are many AWR clients, and you as a DBA don’t actually have to be one of them in order to get benefit from having a Workload Repository. Gurus and techie types will certainly want the control that perusing your own AWR reports will give you: they will be able to use it to make informed, correct diagnoses and to implement appropriate fixes. But the vast majority of DBAs, who lack navel-gazing skills of that sort and intensity, will simply be able to let other database components use the AWR to fix things up automatically and can concentrate on being more productive in other areas.
I’ll be discussing some of these AWR clients in other articles in this series.
7.0 Conclusion
I have, I’m afraid, been deliberately provocative in this article in the way I have cast AWR as nothing much more than a modestly-improved Statspack capability. My main reason for being so, however, is that it is all-too easy to get thoroughly confused -indeed, intimidated- when told that 10g comes with AWR, ADDM, ASM, and who knows what other acronyms and abbreviations besides. So many new things! So much to learn! So much to re-learn! At which point, it can become rather tempting just to give up and avoid all the new-ness altogether!
Yet, if you could put a lot of these new features into some sort of context; if you could see them as being a development of, and an enhancement to, features with which you are already familiar, then the reason for that sense of bewilderment or intimidation disappears. You can see through to those aspects which are genuinely new very quickly, because you understand the foundation they’ve been built on. You won’t, for example, be thinking there’s a huge intellectual mountain to climb before you can even grasp what the new features are actually for.
We had much the same thing in 9i with Real Application Clusters: “Thou shalt not think of this brand-new, never-before seen product as an enhanced version of Oracle Parallel Server”, said Oracle when it was first released. But in fact, that’s what it was! Truthfully, the technical difference between OPS and RAC was really rather slight (block transfer via the network rather than the hard disk being the main one -and even that had been introduced in basic form in 8i) -though it is fair to say that slight difference makes a huge difference in the way DBAs, Application Designers and Developers approach the topics of clustered database application design and administration. And whilst it was definitely important to get users to make that huge leap in approach, I’m not convinced that marketing a product as something it isn’t is actually the most helpful way to go about it.
So, in a similar way, with AWR. Yes, of course the statistics are better, and the use to which they can be put is dramatically new in 10g compared with ye olde Statspack. But does it help people grasp the essential point of AWR better and more quickly if it is explained to them in Statspack-like terms? I think so. I certainly hope so, anyway, because otherwise this article has been a waste of time!
So to conclude: yes, you will almost certainly let MMON automatically populate your Workload Repository. But it is unlikely you will be much interested in what’s inside the Repository yourself. Instead, you’ll probably be happy to let the internal, automated AWR clients like ADDM and the SQL Tuning Advisor make use of it as they see fit. AWR, in short, will probably be a vital but practically invisible 10g new feature as far as most real, human DBAs are concerned.
