postgresql - SQL duplicate elimination for tables with columns using UNION or INTERSECT in Postgres -
in postgres simplest possible query is
postgres=# select; -- (1 row)
the result of query relation single tuple (row), i.e., empty tuple. relation {()}
.
this semantically correct.
now assume have relation foo 1 attribute (value) , @ least 2 tuples (rows). example,
foo ---- value ------ 1 2
now issue following query
postgres=# select foo; -- (2 rows)
this bag {(),()}
, makes sense since projection on empty set of attributes , there 2 tuples in foo.
what want coerce bag set , obtain {()}
. can done group follows:
postgres=# select foo group by(); -- (1 row)
and correct well.
another possibility, in general, remove duplicates use union
or intersect
operators of sql because implicitly coerce bags sets.
so example, if r
relation duplicates, query
(select * r) union (select * r)
or query
(select * r) intersect (select * r)
will trick.
now here observation (and question).
consider query
(select foo) union (select foo)
one expect answer query relation {()}
, i.e., relation in duplicate empty tuples removed. however, when query in postgresql not answer.
postgres=# (select foo) union (select foo); -- (4 rows)
so, appears in case union
operator behaves union all
operator.
postgres=# (select foo) union (select foo); -- (4 rows)
but stranger happens intersect
operator. consider query
postgres=# (select foo) intersect (select foo); -- (4 rows)
so again 4 empty tuples expect 1 (or @ 2).
the same happens intersect all
operator
postgres=# (select foo) intersect (select foo); -- (4 rows)
conclusion: appear there bug in postgresql when comes down union
, intersect
, intersect all
operators concerns table no attributes.
remark:
1 may wonder why 1 consider such relations. useful because relation 0 columns can used mechanism in query processing reason true , false
the relation {()}
stands true
, relation {}
stands false
.
the fact both union
, intersect
return 4 rows seems self-contradictory – either empty tuples equal (in case union
should have fewer elements) or empty tuples not equal (in case intersect
should have fewer elements).
and intersect
having more result rows either input relation insane.
you should take pgsql-hackers list or report bug.
but have doubt statement these questions of practical value:
empty
select
lists not standard compliant. introduced implementation artifact in postgresql. keeps using (sql standard compliant) variantselect 1 ...
?how can interpret result of 1 empty tuple
true
, empty resultfalse
? meaning of result 2 empty tuple?union
,intersect
mean in context?
Comments
Post a Comment