SQL deduplication in databases

1 Deduplication

1.1 Queries

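1.1.1 Records that share some field values but have a unique primary key ID

This is the most common case. distinct cannot filter these rows out, because they still differ in at least the id column. Instead, rely on the uniqueness of the primary key id together with group by:

select * from table where id in (select max(id) from table group by [list of columns to deduplicate on, ...])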
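The query above can be tried out with a small in-memory table. This is only a sketch using Python's built-in sqlite3 module; the people table, its columns and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample table: `id` is the unique primary key,
# while (name, sex) may repeat across rows.
conn = sqlite3.connect(":memory:")
conn.execute("create table people (id integer primary key, name text, sex text)")
conn.executemany(
    "insert into people (id, name, sex) values (?, ?, ?)",
    [(1, "Ann", "F"), (2, "Ann", "F"), (3, "Bob", "M"), (4, "Ann", "F")],
)

# Keep one row per (name, sex) group: the one with the largest id.
rows = conn.execute(
    "select * from people "
    "where id in (select max(id) from people group by name, sex) "
    "order by id"
).fetchall()
print(rows)  # [(3, 'Bob', 'M'), (4, 'Ann', 'F')]
```

Swapping max(id) for min(id) would keep the oldest row of each group instead of the newest.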

1.1.2 Completely identical records can be removed with the distinct keyword

select distinct id from table where <condition>

Here id stands for any column and table for the table name.
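As a rough illustration of the fully identical case, the sketch below uses a hypothetical visits table with no primary key, so two rows can match in every column. The table and its data are assumptions, not part of the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table without a key: rows may be identical in every column.
conn.execute("create table visits (name text, city text)")
conn.executemany(
    "insert into visits values (?, ?)",
    [("Ann", "Paris"), ("Ann", "Paris"), ("Bob", "Rome")],
)

# distinct compares the whole select list, so the repeated row collapses to one.
rows = conn.execute(
    "select distinct name, city from visits order by name"
).fetchall()
print(rows)  # [('Ann', 'Paris'), ('Bob', 'Rome')]
```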

1.1.3 Find the duplicated rows in a table, judged by a single field (id)

select * from table where id in (select id from table group by id having count(id) > 1)

1.1.4 Find the rows that are not duplicated, judged by a single field (id)

select * from table where id not in (select id from table group by id having count(id) > 1)
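The subquery with having count(id) > 1 selects the values that occur more than once, so in returns the duplicated rows and not in returns the rows whose value is unique. A minimal sketch, assuming a hypothetical records table in which id is not a key and may repeat:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table without a primary key, so `id` values can repeat.
conn.execute("create table records (id integer, note text)")
conn.executemany(
    "insert into records values (?, ?)",
    [(1, "a"), (2, "b"), (2, "c"), (3, "d"), (3, "e"), (4, "f")],
)

# Values of `id` that appear more than once.
dup_ids = "select id from records group by id having count(id) > 1"

duplicated = conn.execute(
    f"select * from records where id in ({dup_ids}) order by id, note"
).fetchall()
unique = conn.execute(
    f"select * from records where id not in ({dup_ids}) order by id, note"
).fetchall()

print(duplicated)  # [(2, 'b'), (2, 'c'), (3, 'd'), (3, 'e')]
print(unique)      # [(1, 'a'), (4, 'f')]
```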

1.1.5 Query all duplicated records (every row of every group that has duplicates)

select * from people where id not in (select min(id) from people group by name, sex having count(*) < 2)

1.1.6 Query the redundant duplicates (every duplicated row except the one with the smallest ID in its group)

select * from table where id not in (select min(id) from table group by name, sex)
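The two not in queries differ only in the having clause, which changes what they return. With having count(*) < 2 the subquery lists the ids of rows that have no duplicate, so the outer query returns every member of every duplicated group. Without it the subquery lists the smallest id of each group, so the outer query returns only the surplus copies. A small sketch with made-up people data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (id integer primary key, name text, sex text)")
conn.executemany(
    "insert into people values (?, ?, ?)",
    [
        (1, "Ann", "F"), (2, "Ann", "F"), (3, "Bob", "M"),
        (4, "Ann", "F"), (5, "Cid", "M"), (6, "Cid", "M"),
    ],
)

# Every row that belongs to a (name, sex) group with more than one member.
all_dups = [r[0] for r in conn.execute(
    "select id from people where id not in "
    "(select min(id) from people group by name, sex having count(*) < 2) "
    "order by id"
)]

# Only the surplus copies: duplicated rows other than the smallest id per group.
redundant = [r[0] for r in conn.execute(
    "select id from people where id not in "
    "(select min(id) from people group by name, sex) "
    "order by id"
)]

print(all_dups)   # [1, 2, 4, 5, 6]
print(redundant)  # [2, 4, 6]
```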

1.1.7 Delete the redundant duplicates, keeping only the row with the smallest ID

delete from table where id not in (select min(id) from table group by name, sex)
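The delete can be checked on throwaway data before running it against a real table. The sketch below runs it in SQLite through Python's sqlite3 module; the people table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (id integer primary key, name text, sex text)")
conn.executemany(
    "insert into people values (?, ?, ?)",
    [
        (1, "Ann", "F"), (2, "Ann", "F"), (3, "Bob", "M"),
        (4, "Ann", "F"), (5, "Cid", "M"), (6, "Cid", "M"),
    ],
)

# Remove every row that is not the smallest id of its (name, sex) group.
conn.execute(
    "delete from people "
    "where id not in (select min(id) from people group by name, sex)"
)
conn.commit()

remaining = conn.execute("select * from people order by id").fetchall()
print(remaining)  # [(1, 'Ann', 'F'), (3, 'Bob', 'M'), (5, 'Cid', 'M')]
```

SQLite accepts this statement as written. MySQL typically rejects a delete whose subquery reads the same table (error 1093); the usual workaround there is to wrap the subquery in a derived table, for example not in (select id from (select min(id) as id from ... group by name, sex) as keep).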

1. distinct

select distinct column_name from table_name

2. row_number

select * from (

select *, row_number() over (partition by column_to_deduplicate order by column_name) as row_num

from table_name

) t

where row_num = 1
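A window-function alias such as row_num cannot be referenced in the where clause of the same query level, so the numbering has to be computed in a subquery and filtered outside it. A minimal sketch, assuming SQLite 3.25 or later (the first version with window functions) and a made-up people table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (id integer primary key, name text, sex text)")
conn.executemany(
    "insert into people values (?, ?, ?)",
    [(1, "Ann", "F"), (2, "Ann", "F"), (3, "Bob", "M"), (4, "Ann", "F")],
)

# Number the rows inside each (name, sex) group by ascending id,
# then keep only the first row of every group.
rows = conn.execute(
    """
    select id, name, sex
    from (
        select *, row_number() over (partition by name, sex order by id) as row_num
        from people
    ) as ranked
    where row_num = 1
    order by id
    """
).fetchall()
print(rows)  # [(1, 'Ann', 'F'), (3, 'Bob', 'M')]
```

The order by inside over(...) decides which row of each group survives; order by id desc would keep the newest row instead.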

3. group by

select column_name from table_name group by column_name

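When the data contains many duplicates, GROUP BY is generally more efficient than DISTINCT; when there are few duplicates, DISTINCT is slightly faster. The gap becomes more noticeable as the overall data volume grows. The actual difference depends on the database engine and version, so check the execution plan on your own data.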
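Whatever the performance difference, the two forms return the same set of values for a single column. A quick check in SQLite, using a hypothetical people table (this demonstrates equivalence only, not speed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (id integer primary key, name text)")
conn.executemany(
    "insert into people (name) values (?)",
    [("Ann",), ("Bob",), ("Ann",), ("Ann",)],
)

by_distinct = conn.execute(
    "select distinct name from people order by name"
).fetchall()
by_group = conn.execute(
    "select name from people group by name order by name"
).fetchall()

print(by_distinct)              # [('Ann',), ('Bob',)]
print(by_distinct == by_group)  # True
```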

