Options for gathering statistics for datawarehouse tables


Subject: Options for gathering statistics for datawarehouse tables
Author: Lauri, Netherlands
Date: Nov 28, 2018, 13:31, 201 days ago
Os info: All
Oracle info: 12.1 and higher
Error info: Not applicable
Message: Hi,

I am trying to set up a proper table statistics gathering scheme for reasonably big tables in a datawarehouse environment.

I use this template:

dbms_stats.gather_table_stats(
  ownname          => [my user name],
  tabname          => [my table name],
  partname         => [my table subpartition if it exists, otherwise my table partition],
  estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  block_sample     => true,
  method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
  granularity      => [ 'SUBPARTITION' or 'PARTITION' or 'GLOBAL' ],
  cascade          => true,
  force            => true
);

But I have some hesitations:
1) Should I use estimate_percent = DBMS_STATS.AUTO_SAMPLE_SIZE or a different value?
2) Should I use method_opt = FOR ALL COLUMNS SIZE AUTO or method_opt = FOR ALL COLUMNS SIZE SKEWONLY?
3) Should I prefer block_sample = true instead of block_sample = false? I have a preference for block sampling, as the number of rows does not necessarily give an idea of how big a table is.
4) Should I use force = true? I tend to favor this option.
5) Should I set degree or not, and why?
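Regarding 4): before deciding on force => true, a quick way to check whether statistics on a table are actually locked (a sketch; the schema and table names are placeholders):

```sql
-- STATTYPE_LOCKED is ALL, DATA, CACHE, or NULL when not locked.
SELECT table_name, partition_name, stattype_locked, last_analyzed
FROM   dba_tab_statistics
WHERE  owner      = 'MY_DWH_USER'    -- placeholder schema name
AND    table_name = 'MY_BIG_TABLE';  -- placeholder table name
```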

Thanks in advance for sharing your experience with datawarehouses.

Kind Regards

Subject: Re: Options for gathering statistics for datawarehouse tables
Author: Jan Schnackenberg, Germany
Date: Nov 29, 2018, 09:23, 200 days ago
Message: Hi,

1) If you don't have any concrete evidence that a different value works better, go with 'AUTO_SAMPLE_SIZE'. It allows some optimizations in the gathering process.

2) First: I have extremely little know-how in this particular area. But in general: no one will be able to tell you what you should use without in-depth knowledge of the data in the affected table.

3) Same answer as for 1). Stay with the default unless you have proven (by testing on your own data) that the default does not yield the best results.

4) For what reason do you want to FORCE gathering statistics on a locked table? If you locked the stats for this table, you presumably had a reason for that. Why do you now want to override it?

5) Same as 1) and 3)
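If the lock turns out to be a leftover, explicitly unlocking the statistics is cleaner than passing force => true on every gather. A sketch (schema and table names are placeholders):

```sql
BEGIN
  -- Unlock statistics once, then gather normally without force => true.
  dbms_stats.unlock_table_stats(
    ownname => 'MY_DWH_USER',    -- placeholder
    tabname => 'MY_BIG_TABLE');  -- placeholder
END;
/
```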

Subject: Re: Options for gathering statistics for datawarehouse tables
Author: Philip Wisse, Netherlands
Date: Nov 30, 2018, 09:31, 199 days ago
Message: Hi Lauri,

Are you aware of automatic statistics gathering (auto_stats_job)?

My 'favourite' statistics-gathering call looks like this:

BEGIN
  dbms_stats.gather_schema_stats(
    ownname          => 'DWH',
    estimate_percent => 100,
    cascade          => TRUE,
    options          => 'GATHER AUTO');
END;
/


So this is per schema.

Even big tables are worth scanning in full IMO.
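For very large partitioned DWH tables, incremental statistics may also be worth a look: global stats are then derived from per-partition synopses, so only changed partitions need re-gathering. A hedged sketch with placeholder names, not part of the advice above:

```sql
BEGIN
  -- Enable incremental statistics for one table (placeholder names).
  dbms_stats.set_table_prefs('MY_DWH_USER', 'MY_BIG_TABLE',
                             'INCREMENTAL', 'TRUE');
  -- With GRANULARITY => 'AUTO', global stats are maintained incrementally.
  dbms_stats.gather_table_stats(
    ownname     => 'MY_DWH_USER',
    tabname     => 'MY_BIG_TABLE',
    granularity => 'AUTO');
END;
/
```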

HTH Philip