Help needed to delete huge volume of data




Subject: Help needed to delete huge volume of data
Author: Venu Madhav, India
Date: Apr 17, 2022, 10:02, 225 days ago
Os info: Hpux
Oracle info: 12
Message: Hi Experts

Please help me.

I have a table with 5 columns. Four of these 5 columns hold a fixed value across the entire table, and the remaining column holds 80 million distinct values.

Now, out of these 80 million rows, I want to keep only 8 million rows and delete the rest.

I was trying to do this by taking an export backup of the table based on the QUERY clause. I cannot work out how to select those 8 million records, as the 4 other columns have a constant value across the entire table.

Table ABC - 80 million records
COL A - constant value for all 80 million records
COL B - constant value for all 80 million records
COL C - constant value for all 80 million records
COL D - constant value for all 80 million records
COL E - 80 million distinct values for 80 million records

Please help me select 8 million records from this table.

Thank you very much in advance.

THANKS
VENU
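The export-with-QUERY-clause approach attempted above might look like the sketch below. The schema, directory object, dump file name, and the WHERE condition on COL E are all placeholders; at this point in the thread no keep-criterion had been identified yet.

```shell
# Hypothetical sketch: export only the rows to keep, using Data Pump's QUERY clause.
# Schema, directory object, and the subquery in WHERE are placeholders.
expdp scott@orcl \
  tables=ABC \
  directory=DATA_PUMP_DIR \
  dumpfile=abc_keep.dmp \
  logfile=abc_keep.log \
  query='ABC:"WHERE col_e IN (SELECT key FROM keep_list)"'
```

The QUERY clause is applied per table at export time, so the subsequent impdp would load only the retained rows.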

Subject: Re: Help needed to delete huge volume of data
Author: Michel Cadot, France
Date: Apr 17, 2022, 14:26, 225 days ago
Score: 200 Pts
Message:
How do you determine those 8 million rows?
What is the criterion?

Regards
Michel


Subject: Re: Help needed to delete huge volume of data
Author: Venu Madhav, India
Date: Apr 17, 2022, 17:56, 225 days ago
Message: Hi Michel,

Many thanks for your reply.
I don't have a selection criterion on that table itself.

But after some more coordination with the requesters, they provided one more table, which has the same column as our problematic table's column E.

I have tried to fetch the matching records from both tables based on column E and a date range that is in the other table.

The first (problematic) table has 80 million records and the second table has 50 million records. Due to the huge volume, the comparison query has been running for the last 6-7 hours with no output yet.

Any thoughts on this?

Is expdp possible with 2 tables in the QUERY clause?

Thanks
Venu

Subject: Re: Help needed to delete huge volume of data
Author: Michel Cadot, France
Date: Apr 17, 2022, 18:35, 224 days ago
Score: 200 Pts
Message:
Still not clear how you retrieve 8 million rows from the first table using the second table of 50 million rows.
Post the query you execute.

Regards
Michel

Subject: Re: Help needed to delete huge volume of data
Author: Venu Madhav, India
Date: Apr 18, 2022, 12:35, 224 days ago
Message: Hi Michel,

Thanks for your assistance.


Table 1 - DOC_Control
Name Null? Type
----------------------------------------- -------- ----------------------------
MANDT NOT NULL VARCHAR2(9)
RIGHT_ID NOT NULL VARCHAR2(60)
OBJECT_GUID NOT NULL RAW(16)
PRC_TYPE_ID NOT NULL NUMBER(3)
TASK_ID NOT NULL VARCHAR2(24)


Table 2 - DOC_Capture
Name Null? Type
----------------------------------------- -------- ----------------------------
CLIENT NOT NULL VARCHAR2(9)
GUID NOT NULL RAW(16)
OBJECT_ID NOT NULL VARCHAR2(30)
PROCESS_TYPE NOT NULL VARCHAR2(12)
POSTING_DATE NOT NULL VARCHAR2(24)
DESCRIPTION NOT NULL VARCHAR2(120)







Common field in both tables: DOC_Control(OBJECT_GUID) and DOC_Capture(GUID)

Number of records: DOC_Control - 80M, DOC_Capture - 50M

I want to keep only 8M records in Table Doc_Control.

Since I am not able to find any selection criterion on Doc_Control, I have involved another table, Doc_Capture.

Below is the query :


select /*+ parallel */ B.GUID, B.POSTING_DATE
from   Doc_Control A, Doc_Capture B
where  B.POSTING_DATE >= '20200101'
and    A.OBJECT_GUID = B.GUID
and    A.RIGHT_ID = 'ABC8923BL00U';



Since the volume of data is huge, the above query runs for a long time.

The goal is to keep only 8M records in table Doc_Control, based on OBJECT_GUID.

How can this be achieved?

Is this possible with expdp and impdp using a QUERY clause on 2 tables?

Please help.

Thanks
Venu

Subject: Re: Help needed to delete huge volume of data
Author: Michel Cadot, France
Date: Apr 18, 2022, 17:34, 224 days ago
Score: 300 Pts
Message:
The number of rows is not so huge; I have tables on my laptop with more rows. :)

What is the execution plan of the query?

After the execution use
select * from table(dbms_xplan.display_cursor(format=>'allstats last +cost'));

If you don't want to execute the query and it is still in the SGA, use:
select * from table(dbms_xplan.display_cursor(sql_id=>'<the query sql id>',format=>'allstats last +cost'));
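To find the sql_id of a statement still cached in the shared pool, a lookup against V$SQL along these lines might help (the LIKE pattern is just an example matching the query posted earlier):

```sql
-- Sketch: locate the cached statement's sql_id (the pattern is an example).
SELECT sql_id, child_number, elapsed_time / 1e6 AS elapsed_seconds
FROM   v$sql
WHERE  sql_text LIKE 'select /*+ parallel */ B.GUID%';
```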

Regards
Michel

Subject: Re: Help needed to delete huge volume of data
Author: Venu Madhav, India
Date: Apr 18, 2022, 19:31, 223 days ago
Message: Thanks Michel,

I will update you.

Thanks,
Venu

Subject: Re: Help needed to delete huge volume of data
Author: Bruno Vroman, Belgium
Date: Apr 19, 2022, 11:20, 223 days ago
Score: 400 Pts
Message: Hello Venu,

my understanding (please correct the mistakes):
- you have a table doc_control with 80 million rows
- you want to keep 8 million rows of this table
- rows to keep: any of them, with the condition that doc_control.object_guid must exist in doc_capture.guid
(from the last point I guess that there are doc_control.object_guid values that have NO corresponding doc_capture.guid)
I tend to guess that the "foreign key" relationship between the 2 tables is not declared; this might be the source of issues (orphan rows...). Maybe you should review the data model.

Important: I assume that this is NOT a "production" database and that you want to decrease the data volume for some reason, and that deleting 90% of the table is not an issue.

The fact that 4 columns out of 5 have a constant value is not a problem (well, it shows that something must be wrong somewhere from a business point of view, but technically it is not a problem).

Would we run:
SELECT co.*
FROM   doc_control co
WHERE  EXISTS( SELECT NULL
               FROM   doc_capture ca
               WHERE  ca.guid = co.object_guid
             )
we would have all the rows of "co" having a correspondence in "ca".
If we want only 8 million rows, we can do
SELECT co.*
FROM   doc_control co
WHERE  EXISTS( SELECT NULL
               FROM   doc_capture ca
               WHERE  ca.guid = co.object_guid
             )
AND    rownum <= 8E6
But if the query implies scanning the 50 million rows of "ca" for each candidate row of "co", this would be prohibitive (especially if many rows of "co" have no corresponding row in "ca").

It is not clear if the condition on "posting_date" is important or just a "nice to have", and if the following query
SELECT co.*
FROM   doc_control co
WHERE  EXISTS( SELECT NULL
               FROM   doc_capture ca
               WHERE  ca.guid = co.object_guid
               AND    ca.posting_date >= '20200101'
             )
AND    rownum <= 8E6
would retrieve 8 million rows or possibly fewer... (remark: dates stored as VARCHAR2(24)? Awful!)

To derive a better way I would like to know if some assumptions are correct...
 a) Please tell us what indexes exist on both tables and if they are UNIQUE
 b) For each table: is there a primary key? Is it declared in the database? On which column(s)?
 c) Is there a foreign key joining the 2 tables? If yes, which one exactly? Is it declared in the database?
 d) for "b)" and "c)", maybe constraints are not declared in the database, but tell us if they make sense from a business point of view...
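Questions a) to c) can be answered from the data dictionary; a sketch along these lines (the USER_* views assume the tables belong to the connected schema):

```sql
-- Sketch: list indexes (and their uniqueness) on both tables.
SELECT i.table_name, i.index_name, i.uniqueness,
       c.column_name, c.column_position
FROM   user_indexes i, user_ind_columns c
WHERE  c.index_name = i.index_name
AND    i.table_name IN ('DOC_CONTROL', 'DOC_CAPTURE')
ORDER  BY i.table_name, i.index_name, c.column_position;

-- Sketch: list declared primary key (P) and foreign key (R) constraints.
SELECT table_name, constraint_name, constraint_type, r_constraint_name
FROM   user_constraints
WHERE  table_name IN ('DOC_CONTROL', 'DOC_CAPTURE')
AND    constraint_type IN ('P', 'R');
```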

And, while waiting for your replies, just an idea: maybe we don't really care about which rows of "co" to keep, and we might truncate the table and then insert into "co" 8 million ca.guid values picked for "co.object_guid", plus the constant values for the 4 other columns...
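That idea might be sketched as follows (the four constant values are placeholders; a direct-path insert is reasonable since the table has just been truncated):

```sql
-- Sketch of the truncate-and-repopulate idea (constant values are placeholders).
TRUNCATE TABLE doc_control;

INSERT /*+ append */ INTO doc_control
       (mandt, right_id, object_guid, prc_type_id, task_id)
SELECT 'mandt', 'ABC8923BL00U', ca.guid, 123, 'task_id'
FROM   doc_capture ca
WHERE  rownum <= 8E6;

COMMIT;
```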

Best regards,

Bruno Vroman.

Subject: Re: Help needed to delete huge volume of data
Author: Venu Madhav, India
Date: Apr 20, 2022, 12:26, 222 days ago
Message: Hi Bruno

Thanks for the help and the analysis of different approaches.

The issue is that some business logic was triggered and caused mishaps in the environment. To correct the issue we are doing this exercise.

So ideally table DOC_Control contains 0 records, but due to the issue this table now has 80M records.

And we don't have any selection criterion to identify the 8M correct records.

So another table, DOC_Capture, came into the picture with one field in common, which helps to identify the 8M records.

We have now successfully tested the method below in Pre-Prod:

1. Created one temp table and inserted the common 8M records into it.
2. Took an expdp backup of the DOC_Control table with 80M records.
3. Truncated the DOC_Control table.
4. Inserted the 8M records from the temp table into DOC_Control.
5. Activated the business logic jobs.
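Steps 1 to 4 above might be sketched in SQL as follows (table and column names as posted earlier in the thread; the temp table name is assumed):

```sql
-- 1. Build the keep-set of common records (temp table name is assumed).
CREATE TABLE doc_control_keep AS
SELECT co.*
FROM   doc_control co
WHERE  EXISTS (SELECT NULL
               FROM   doc_capture ca
               WHERE  ca.guid = co.object_guid);

-- 2. The expdp backup of DOC_Control is taken at this point (outside SQL).

-- 3. Empty the table.
TRUNCATE TABLE doc_control;

-- 4. Repopulate from the keep-set.
INSERT /*+ append */ INTO doc_control
SELECT * FROM doc_control_keep;

COMMIT;
```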

Now validation is in progress.


Many thanks for your help.

Thanks,
Venu

Subject: Re: Help needed to delete huge volume of data
Author: Venu Madhav, India
Date: Apr 20, 2022, 12:27, 222 days ago
Message: Correction -

So ideally table DOC_Control contains 0 records, but due to the issue this table now has 80M records.

Subject: Re: Help needed to delete huge volume of data
Author: Michel Cadot, France
Date: Apr 20, 2022, 12:33, 222 days ago
Score: 400 Pts
Message:
The performance you can get depends on the indexes and constraints... hence the execution plan I asked you for, in order to determine what you have to do to improve this performance.

Data Pump and a plain SELECT take the same path, so first improve the SELECT.

Regards
Michel

Subject: Re: Help needed to delete huge volume of data
Author: Bruno Vroman, Belgium
Date: Apr 20, 2022, 17:54, 222 days ago
Score: 200 Pts
Message: Hi Venu,

thank you for the feedback (although I don't understand much of it).

It seems that you have fixed your issue and are now "happy" so all is fine.

But still:

 - you might, in the long term, find some benefits in declaring constraints like PK and FK at database level.

 - you might also review the design of doc_control; the fact that all rows have the same value for four fields seems to show that those fields are useless...
Maybe you might even drop the table and simply use a view:
DROP TABLE doc_control;

CREATE VIEW doc_control AS
SELECT 'mandt' mandt, 'ABC8923BL00U' right_id, ca.guid object_guid,
       123 prc_type_id, 'task_id' task_id
FROM   doc_capture ca;
(with the "correct" values for the constants; I've just picked examples)

Best regards,

Bruno.