
Options for gathering statistics for datawarehouse tables



Subject: Options for gathering statistics for datawarehouse tables
Author: Lauri, Netherlands
Date: Nov 28, 2018, 13:31, 17 days ago
Os info: All
Oracle info: 12.1 and higher
Error info: Not applicable
Message: Hi,

I am trying to set up a proper table statistics gathering scheme for reasonably big tables in a datawarehouse environment.

I use this template:

dbms_stats.gather_table_stats(
  ownname          => [my user name],
  tabname          => [my table name],
  partname         => [my table subpartition if one exists; otherwise my table partition],
  estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  block_sample     => true,
  method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
  granularity      => [ 'SUBPARTITION' or 'PARTITION' or 'GLOBAL' ],
  cascade          => true,
  force            => true
)

But I have some hesitations:
1) Should I use estimate_percent = DBMS_STATS.AUTO_SAMPLE_SIZE or a different value?
2) Should I use method_opt = FOR ALL COLUMNS SIZE AUTO or method_opt = FOR ALL COLUMNS SIZE SKEWONLY?
3) Should I prefer block_sample = true instead of block_sample = false? I have a preference for block sampling, as the number of rows does not necessarily give an idea of how "big" a table is.
4) Should I use force = true? I tend to favor this option.
5) Should I set degree or not, and why?
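For question 1), one way to judge what AUTO_SAMPLE_SIZE actually did is to compare the sample size it used against the row count after gathering. A minimal sketch (the DWH schema and SALES table names are hypothetical):

```sql
-- After gathering, compare the sample size used with the row count
-- (schema and table names are placeholders for illustration):
SELECT table_name, num_rows, sample_size, last_analyzed
  FROM dba_tab_statistics
 WHERE owner = 'DWH'
   AND table_name = 'SALES';
```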

Thanks in advance for sharing your experience with datawarehouses.

Kind Regards

Subject: Re: Options for gathering statistics for datawarehouse tables
Author: Jan Schnackenberg, Germany
Date: Nov 29, 2018, 09:23, 17 days ago
Message: Hi,

1) If you don't have any concrete evidence that a specific value works better, go with 'AUTO_SAMPLE_SIZE'. It allows some optimizations in the gathering process.

2) First: I have extremely little know-how in this particular area. But in general: no one can tell you what you should use without in-depth knowledge of the data in the affected table.

3) Same answer as for 1): stay with the default unless you have proven (by testing on your own data) that the default does not yield the best results.

4) Why would you want to FORCE statistics gathering on a table with locked stats? If you locked the stats for this table, you presumably had a reason for doing so. Why override it now?

5) Same as 1) and 3)
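To illustrate the situation point 4) refers to: locked statistics can be inspected and managed explicitly rather than overridden with force => true. A sketch (the DWH schema and SALES table names are hypothetical):

```sql
-- See whether statistics are locked for a table; STATTYPE_LOCKED is
-- 'ALL', 'DATA', 'CACHE', or NULL when the stats are not locked.
SELECT table_name, stattype_locked
  FROM dba_tab_statistics
 WHERE owner = 'DWH'            -- hypothetical schema
   AND table_name = 'SALES';    -- hypothetical table

-- Lock or unlock statistics deliberately instead of forcing past the lock:
BEGIN
  dbms_stats.lock_table_stats(ownname => 'DWH', tabname => 'SALES');
  -- dbms_stats.unlock_table_stats(ownname => 'DWH', tabname => 'SALES');
END;
/
```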

Subject: Re: Options for gathering statistics for datawarehouse tables
Author: Philip Wisse, Netherlands
Date: Nov 30, 2018, 09:31, 16 days ago
Message: Hi Lauri,

Are you aware of automatic statistics gathering (the auto_stats_job)?
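For reference, whether that automatic maintenance task is enabled can be checked like this (a sketch; requires the DBA_AUTOTASK_CLIENT view, available from 11g onwards):

```sql
-- Check the status of the automatic optimizer statistics collection task:
SELECT client_name, status
  FROM dba_autotask_client
 WHERE client_name = 'auto optimizer stats collection';
```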

My 'favourite' statistics gathering call looks like this:
BEGIN
  dbms_stats.gather_schema_stats(
    ownname          => 'DWH',
    estimate_percent => 100,
    cascade          => TRUE,
    options          => 'GATHER AUTO');
END;
/

So this is per schema.

Even big tables are worth scanning in full IMO.

HTH Philip