|
Home > Archive > PostgreSQL Discussion > August 2005 > GROUP BY requirement
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
GROUP BY requirement
|
|
| Bill Moseley 2005-08-26, 8:23 pm |
| I'm wondering if adding a GROUP BY (as required by Postgres) will
change the results of a select on a view.
I have the following view which joins a "class" with a teacher. A
teacher is a "person" and I have an "instructors" link table.
CREATE VIEW class_list (id, class_time, instructor )
AS
SELECT DISTINCT ON(class.id)
class.id, class.class_time, person.first_name
FROM class, instructors, person
WHERE instructors.person = person.id
AND class.id = instructors.class;
I also have a table "registration" that links students with a class.
The registration table has a "reg_status" column to say if they are
confirmed or on the wait_list. So when showing the above I'd also
like to see how many students are confirmed and on the wait_list.
DROP VIEW cl;
CREATE VIEW cl (id, class_time, instructor,
confirmed_cnt, wait_list_cnt)
AS
SELECT DISTINCT ON(class.id)
class.id, class.class_time, person.first_name,
sum (CASE WHEN registration.reg_status = 1 THEN 1 ELSE 0 END) as confirmed_cnt,
sum (CASE WHEN registration.reg_status = 2 THEN 1 ELSE 0 END) as wait_list_cnt,
FROM class, instructors, person, registration
WHERE instructors.person = person.id
AND class.id = instructors.class
AND class.id = registration.class
GROUP BY class.id, class.class_time, person.first_name;
PostgreSQL requires the GROUP BY. But, I'm not clear how the GROUP BY
might change the results between the two views above.
http://www.postgresql.org/docs/8.0/...tml#SQL-GROUPBY
says:
When GROUP BY is present, it is not valid for the SELECT list
expressions to refer to ungrouped columns except within aggregate
functions, since there would be more than one possible value to
return for an ungrouped column.
Frankly, I cannot see how it might change results of a select between
the two views. Am I missing something?
--
Bill Moseley
moseley@hank.org
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
| |
| Scott Marlowe 2005-08-30, 8:24 pm |
| On Fri, 2005-08-26 at 14:39, Bill Moseley wrote:
> I'm wondering if adding a GROUP BY (as required by Postgres) will
> change the results of a select on a view.
>
> I have the following view which joins a "class" with a teacher. A
> teacher is a "person" and I have an "instructors" link table.
>
> CREATE VIEW class_list (id, class_time, instructor )
> AS
> SELECT DISTINCT ON(class.id)
> class.id, class.class_time, person.first_name
>
> FROM class, instructors, person
> WHERE instructors.person = person.id
> AND class.id = instructors.class;
>
> I also have a table "registration" that links students with a class.
> The registration table has a "reg_status" column to say if they are
> confirmed or on the wait_list. So when showing the above I'd also
> like to see how many students are confirmed and on the wait_list.
>
> DROP VIEW cl;
> CREATE VIEW cl (id, class_time, instructor,
> confirmed_cnt, wait_list_cnt)
> AS
> SELECT DISTINCT ON(class.id)
> class.id, class.class_time, person.first_name,
> sum (CASE WHEN registration.reg_status = 1 THEN 1 ELSE 0 END) as confirmed_cnt,
> sum (CASE WHEN registration.reg_status = 2 THEN 1 ELSE 0 END) as wait_list_cnt,
>
> FROM class, instructors, person, registration
> WHERE instructors.person = person.id
> AND class.id = instructors.class
> AND class.id = registration.class
>
> GROUP BY class.id, class.class_time, person.first_name;
>
> PostgreSQL requires the GROUP BY. But, I'm not clear how the GROUP BY
> might change the results between the two views above.
>
> http://www.postgresql.org/docs/8.0/...tml#SQL-GROUPBY
>
> says:
>
> When GROUP BY is present, it is not valid for the SELECT list
> expressions to refer to ungrouped columns except within aggregate
> functions, since there would be more than one possible value to
> return for an ungrouped column.
>
> Frankly, I cannot see how it might change results of a select between
> the two views. Am I missing something?
OK, distinct on suffers from this problem. Given the following simple
dataset:
mytable:
a | b
------
1 | 0
1 | 1
select distinct on (a) a,b from mytable;
One can see how the possible results are:
1,0 and 1,1, right? All depending on the order in which they are
fetched.
The same would be true if you could do a group by on a and select b:
select a,b from mytable group by a;
Right?
Now, if it's impossible for your dataset to return such sets, due to the
way it's built, it is likely not fully normalized. I.e. you have data
like this:
classid | instructorname | moreinfo...
--------------------------------------
1 | 'John Smith' | 'information'
1 | 'John Smith' | 'even more information'
and so on. Or your join is creating such a data set.
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
|
|
|
|
|