Typical. You answer a question only to find that the answer you provided isn’t what the original poster was looking for by reading the dozen or so comments that arrived on the question while you were figuring out your answer… So I’ve taken my answer out and will put it here instead. :-)
Question
How can I deduplicate data?
My Answer
Use a CTE (Common Table Expression), and the Row_Number() function, and delete from the CTE where that row number is greater than 1.
In more detail…
Let’s work this through with a sample. A ludicrously simple sample.
DECLARE @DeDupe TABLE (f1 VARCHAR(100), i1 INT) INSERT INTO @DeDupe SELECT 'Fred', 1 UNION ALL SELECT 'Fred', 1 UNION ALL SELECT 'John', 1 UNION ALL SELECT 'John', 2 SELECT * FROM @DeDupe

At this point, we have a (very) small table with a duplicate row. So now we’ll use a CTE and that ROW_NUMBER function mentioned above and add the following to our query above:
;WITH DeDupeTemp AS (
SELECT f1, i1, ROW_NUMBER() OVER (
PARTITION BY f1,i1 ORDER BY f1,i1) AS RowNum
FROM @DeDupe)
SELECT * FROM DeDupeTemp
As you can see, that final SELECT gives us an extra column, the row number of the line within that particular combination of fields.

We now have an easy way of deleting the data we don’t need. Replace the final line of the above with:
DELETE FROM DeDupeTemp WHERE RowNum > 1 SELECT * FROM @DeDupe

Hurrah! No duplicate data. Manipulating the CTE like this has directly affected the base table.
Homework
The next step is to wrap that code up into a stored procedure that takes the name of a table as a parameter, prepares dynamic T-SQL statements and executes them to deduplicate the named table.


Surely the homework is to add a unique constraint to prevent duplicates from being added in the first place?! *{;-)
That’d be my favourite, yes.