Showing posts with label SQL. Show all posts
Showing posts with label SQL. Show all posts

Friday, November 25, 2011

Delete Duplicate Records – Rows

There are different way to do this.

Solution - 1

Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3. (The table must have identity column, which will be used to identify the duplicate records.)


DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)


Solution - 2

set nocount on
drop table tbl1
create table tbl1
(
col1 int
)
insert into tbl1 values(1)
insert into tbl1 values(1)
insert into tbl1 values(1)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(2)
insert into tbl1 values(3)
insert into tbl1 values(3)

set rowcount 1
select ‘start’
while @@rowcount > 0 delete a from tbl1 a where (select count(*) from tbl1 b where a.col1 = b.col1)>1
set rowcount 0
select * from tbl1
set nocount off

Note: This will be a time-consuming method. Alternates are move distinct data to temp table or use row_number() function.

Solution - 3

If you are looking for deleting multiple records (duplicate with more than 2 records like 3,4,5 duplicate records, etc) and without key column then the following method can be adopted.

Lets say your table is tab1.

SELECT DISTINCT * INTO tab2 FROM tab1 — distinct of all columns

DROP TABLE tab1

EXEC sp_rename ‘tab2′,’tab1′

Note:
Take a back up of your original table before using dropping for your reference. You may need to grant permission of tab2 as given for tab1.

Triggers and indexes defined on that table tab1 will have to recreate. To resolve this issue, you may follow Solution - 1 or 4.

Solution - 4

Here is a single query which will answer all your questions but this still needs Primary key or identity column to be added to the table though.
DELETE FROM MyTable
WHERE EXISTS (
SELECT * FROM MyTable AS b
WHERE
b.[col1] = MyTable.[col1]
AND b.[col2] = MyTable.[col2]
AND b.[col3] = MyTable.[col3]
GROUP BY
b.[col1], b.[col2], b.[col3]
HAVING
MyTable.[ID] > MIN(b.[ID])
)
If you want most recent records to be present in your DB and delete multiple Old duplicate records, You can make use of
DELETE FROM MyTable
WHERE EXISTS (
SELECT * FROM MyTable AS b
WHERE
b.[col1] = MyTable.[col1]
AND b.[col2] = MyTable.[col2]
AND b.[col3] = MyTable.[col3]
GROUP BY
b.[col1], b.[col2], b.[col3]
HAVING
MyTable.[ID] MIN(b.[ID])
)
Hope this answers all your questions.
If you dont have Primary key or Identity Column then you can always create it and run this query and delete the Primary Key or identity column.

Solution - 5

This can be easily achieved without crating unique identifier by using CTE and ROW_NUMBER (), the modified query for sql server 2005 goes here.

WITH T1 AS (SELECT ROW_NUMBER ( ) OVER ( PARTITION BY ID, FNAME, LNAME ORDER BY ID ) AS RNUM FROM DUPLICATE )
DELETE FROM T1 WHERE RNUM > 1

Solution - 6

Delete record in a single query its OK. But if you can write subquery, but without use of temp table, without use cursor, and without use of insert identity column, here is the solution.

CREATE TABLE #Table1 (col1 int, col2 int)
INSERT INTO #Table1 VALUES (1, 1000)
INSERT INTO #Table1 VALUES (2, 2000)
INSERT INTO #Table1 VALUES (2, 2000)
INSERT INTO #Table1 VALUES (3, 3000)
INSERT INTO #Table1 VALUES (3, 3000)
INSERT INTO #Table1 VALUES (4, 4000)

SELECT * FROM #Table1;

WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1)) AS RNum FROM #Table1)
DELETE FROM T1 WHERE RNum IN (SELECT a.RNum FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)

SELECT * FROM #Table1

DROP TABLE #Table1

ROW_NUMBER() will solve your purpose.

Monday, November 22, 2010

@@IDENTITY vs SCOPE_IDENTITY() vs IDENT_CURRENT – Retrieve Last Inserted Identity of Record

SELECT @@IDENTITY
It returns the last IDENTITY value produced on a connection, regardless of the table that produced the value, and regardless of the scope of the statement that produced the value.
@@IDENTITY will return the last identity value entered into a table in your current session. While @@IDENTITY is limited to the current session, it is not limited to the current scope. If you have a trigger on a table that causes an identity to be created in another table, you will get the identity that was created last, even if it was the trigger that created it.

SELECT SCOPE_IDENTITY()
It returns the last IDENTITY value produced on a connection and by a statement in the same scope, regardless of the table that produced the value.
SCOPE_IDENTITY(), like @@IDENTITY, will return the last identity value created in the current session, but it will also limit it to your current scope as well. In other words, it will return the last identity value that you explicitly created, rather than any identity that was created by a trigger or a user defined function.

SELECT IDENT_CURRENT(‘tablename’)
It returns the last IDENTITY value produced in a table, regardless of the connection that created the value, and regardless of the scope of the statement that produced the value.
IDENT_CURRENT is not limited by scope and session; it is limited to a specified table. IDENT_CURRENT returns the identity value generated for a specific table in any session and any scope.

To avoid the potential problems associated with adding a trigger later on, always use SCOPE_IDENTITY() to return the identity of the recently added row in your T SQL Statement or Stored Procedure.

Tuesday, August 24, 2010

SQL Date Function

Following query will run respective to today’s date. It will return Last Day of Previous Month, First Day of Current Month, Today, Last Day of Previous Month and First Day of Next Month respective to current month.


DECLARE @mydate DATETIME
SELECT @mydate = GETDATE()
SELECT CONVERT(VARCHAR(25),DATEADD(dd,-(DAY(@mydate)),@mydate),101) ,'Last Day of Previous Month'
UNION
SELECT CONVERT(VARCHAR(25),DATEADD(dd,-(DAY(@mydate)-1),@mydate),101) AS Date_Value,'First Day of Current Month' AS Date_Type
UNION
SELECT CONVERT(VARCHAR(25),@mydate,101) AS Date_Value, 'Today' AS Date_Type
UNION
SELECT CONVERT(VARCHAR(25),DATEADD(dd,-(DAY(DATEADD(mm,1,@mydate))),DATEADD(mm,1,@mydate)),101) ,'Last Day of Current Month'
UNION
SELECT CONVERT(VARCHAR(25),DATEADD(dd,-(DAY(DATEADD(mm,1,@mydate))-1),DATEADD(mm,1,@mydate)),101) ,'First Day of Next Month'


First Day of Next Month


DECLARE @datum DATETIME
SET @datum=GETDATE()

DECLARE @datFuture DATETIME
SET @datFuture = dateadd(mm,1,@datum)

DECLARE @dateFirstDayNewMonth datetime
SET @dateFirstDayNewMonth = cast(year(@datFuture) as varchar(50)) + '-' + cast(month(@datFuture) as varchar(50)) + '-1'

print @dateFirstDayNewMonth


OR


DECLARE @datum DATETIME
SET @datum=getdate()
select dateadd(month,datediff(month,0,@datum)+1,0)


print all the months between 2 dates, eg. 01 Aug 2006 and 31 March 2007


declare @date1 datetime,@date2 datetime
select @date1='01 Aug 2006', @date2 ='31 mar 2007'

select dateadd(month,number,@date1) from master..spt_values
where type='p' and number between 0 and datediff(month,@date1,@date2)


OR


DECLARE @date1 DATETIME,
@date2 DATETIME

SELECT @date1 = '01 Aug 2006', @date2 = '31 mar 2007';

WITH months(monthnames)
AS (SELECT @date1 AS monthnames
UNION ALL
SELECT DATEADD(mm,1,monthnames)
FROM months
WHERE monthnames < @date2) SELECT STUFF(CONVERT(VARCHAR(12),monthnames,106),1,3,'') AS monthname FROM months OPTION (MAXRECURSION 0) first date of last month


select cast(cast(datepart(month, dateadd(mm, -1, getdate())) as varchar) +
'/' +
cast(datepart(day, dateadd(day, -datepart(day, getdate())+1, getdate()))
as varchar) + '/' +
cast(datepart(year, dateadd(mm, -1, getdate())) as varchar) as datetime)


This week start date Monday


select DATEADD(wk, DATEDIFF(wk,0,getdate()),0)


This week enddate


select DATEADD(wk, DATEDIFF(wk,-7,getdate()),-1)


Last week start date


select DATEADD(wk, DATEDIFF(wk,0,getdate()),-7)


Last week end date


select DATEADD(wk, DATEDIFF(wk,0,getdate()),-1)


First day of this year


select DATEADD(yy, DATEDIFF(yy,0,getdate()), 0)


Last day of the prior year


select cast(floor(cast(dateadd(mi,-1,DATEADD(yy, DATEDIFF(yy,0,getdate()), 0)) as float)) as datetime)


First day of next year


select DATEADD(yy, DATEDIFF(yy,-1,getdate()), 0)


First day of prior year


SELECT DATEADD(yy, DATEDIFF(yy,0,dateadd(mi,-1,DATEADD(yy, DATEDIFF(yy,0,getdate()), 0))), 0)


This week last Year start date Monday


select cast(floor(cast(dateadd(wk,datediff(wk,0,dateadd(yy,-1,getdate()) ),0)as float)) as datetime)


This day last year


select dateadd(yy,-1,DATEADD(wk, DATEDIFF(wk,0,getdate()),0))


First day of the current month


select DATEADD(mm, DATEDIFF(mm,0,getdate()), 0)


Last day of the current month


select cast(floor(cast(dateadd(ms,-3,DATEADD(mm, DATEDIFF(m,0,getdate() )+1, 0))as float)) as datetime)


First day of the current month Last year


select dateadd(yy,-1,DATEADD(mm, DATEDIFF(mm,0,getdate()), 0))


Last day of the current month Last Year


select cast(floor(cast(dateadd(yy,-1,dateadd(ms,-3,DATEADD(mm, DATEDIFF(m,0,getdate() )+1, 0)))as float)) as datetime)


Thursday, August 19, 2010

Delete Dulicate Records

Let's create some duplicate records. Then we'll delete duplicate records.

/* Create Table with 7 entries - 3 are duplicate entries */
CREATE TABLE DuplicateRcordTable (Col1 INT, Col2 INT)
INSERT INTO DuplicateRcordTable
SELECT 1, 1
UNION ALL
SELECT 1, 1 --duplicate
UNION ALL
SELECT 1, 1 --duplicate
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 2 --duplicate
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 1, 4
GO

/* It should give you 7 rows */
SELECT *
FROM DuplicateRcordTable
GO

/* Delete Duplicate records */
WITH CTE (COl1,Col2, DuplicateCount)
AS
(
SELECT COl1,Col2,
ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) AS DuplicateCount
FROM DuplicateRcordTable
)
DELETE
FROM CTE
WHERE DuplicateCount > 1
GO

/* It should give you Distinct 4 records */
SELECT *
FROM DuplicateRcordTable
GO