
Options for gathering statistics for datawarehouse tables


Subject: Options for gathering statistics for datawarehouse tables
Author: Lauri, Netherlands
Date: Nov 28, 2018, 13:31
OS info: All
Oracle info: 12.1 and higher
Error info: Not applicable
Message: Hi,

I am trying to set up a proper table statistics gathering scheme for reasonably big tables in a datawarehouse environment.

I use this template:

dbms_stats.gather_table_stats(
  ownname          => [my user name],
  tabname          => [my table name],
  partname         => [my table subpartition if it exists, otherwise my table partition],
  estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  block_sample     => true,
  method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
  granularity      => [ 'SUBPARTITION' or 'PARTITION' or 'GLOBAL' ],
  cascade          => true,
  force            => true
);

But I have some doubts:
1) Should I use estimate_percent = DBMS_STATS.AUTO_SAMPLE_SIZE or a different value?
2) Should I use method_opt = FOR ALL COLUMNS SIZE AUTO or method_opt = FOR ALL COLUMNS SIZE SKEWONLY? (A way to inspect the resulting histograms is sketched after this list.)
3) Should I prefer block_sample = true instead of block_sample = false? I have a preference for block sampling, as the number of rows does not necessarily give an idea of how "big" a table is.
4) Should I use force = true? I tend to favor this option.
5) Should I set degree or not, and why?
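
One way to evaluate these options is to gather with a candidate setting and then inspect what was actually produced. A minimal check, assuming the placeholder names MY_USER and MY_TABLE:

-- Sketch: inspect per-column statistics after a gathering run.
-- MY_USER and MY_TABLE are placeholders for your own schema and table.
SELECT column_name, num_distinct, histogram, sample_size, last_analyzed
FROM   dba_tab_col_statistics
WHERE  owner      = 'MY_USER'
AND    table_name = 'MY_TABLE'
ORDER  BY column_name;

The HISTOGRAM column shows whether SIZE AUTO or SIZE SKEWONLY actually created histograms on the columns you expected, and SAMPLE_SIZE shows how much data the gather really looked at.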

Thanks in advance for sharing your experience with datawarehouses.

Kind Regards

Subject: Re: Options for gathering statistics for datawarehouse tables
Author: Jan Schnackenberg, Germany
Date: Nov 29, 2018, 09:23
Message: Hi,

1) If you don't have any concrete evidence that a specific value works better, go with 'AUTO_SAMPLE_SIZE'. This allows some optimizations in the gathering process.

2) First: I have very little know-how in this particular area. But in general: no one will be able to tell you what you should use without in-depth knowledge of the data in the affected table.

3) Same answer as for 1): stay with the default unless you have proven (by testing on your own data) that the default does not yield the best results.

4) For what reason do you want to force gathering of statistics on a table with locked stats? If you locked the stats for this table, you will have had a reason for that. Why do you now want to override it? (A way to check for locked stats is sketched below.)

5) Same as 1) and 3)
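
On point 4, this is how one might check whether a table's statistics are locked before reaching for force => true. A sketch; DWH and SALES are hypothetical names:

-- Show tables in the DWH schema whose statistics are locked.
SELECT owner, table_name, stattype_locked
FROM   dba_tab_statistics
WHERE  owner = 'DWH'
AND    stattype_locked IS NOT NULL;

-- If a lock turns out to be obsolete, unlocking once is usually
-- cleaner than forcing every gather past it:
BEGIN
  dbms_stats.unlock_table_stats(ownname => 'DWH', tabname => 'SALES');
END;
/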

Subject: Re: Options for gathering statistics for datawarehouse tables
Author: Philip Wisse, Netherlands
Date: Nov 30, 2018, 09:31
Message: Hi Lauri,

Are you aware of automatic statistics gathering (the auto_stats_job)?
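
Whether that automatic task is enabled can be checked via the DBA_AUTOTASK_CLIENT view (available from 11g onwards); a quick sketch:

-- Show the status of the automatic optimizer statistics task.
SELECT client_name, status
FROM   dba_autotask_client
WHERE  client_name = 'auto optimizer stats collection';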

My 'favourite' statistics gathering call looks like this:
BEGIN
  dbms_stats.gather_schema_stats(
    ownname          => 'DWH',
    estimate_percent => 100,
    cascade          => TRUE,
    options          => 'GATHER AUTO');
END;
/


So this is per schema.

Even big tables are worth scanning in full IMO.
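
One thing to keep in mind with OPTIONS => 'GATHER AUTO': it only (re)gathers statistics for objects whose stats are missing or stale. Which tables currently qualify can be checked with a query along these lines (a sketch against the DWH schema):

-- List DWH tables whose statistics are stale or missing.
SELECT owner, table_name, partition_name, stale_stats, last_analyzed
FROM   dba_tab_statistics
WHERE  owner = 'DWH'
AND    (stale_stats = 'YES' OR last_analyzed IS NULL);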

HTH Philip