l i n u x - u s e r s - g r o u p - o f - d a v i s
L U G O D
 
Next Meeting:
November 4: Social gathering
Next Installfest:
TBD
Latest News:
Oct. 24: LUGOD election season has begun!
Page last updated:
2004 Jul 21 00:47

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Report this post as spam:

(Enter your email address)
Re: [vox-tech] SQL help - unions
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [vox-tech] SQL help - unions



On Tue, 20 Jul 2004, Peter Jay Salzman wrote:

> Total newbie here.  I know less about SQL than I do about xkb.   ;-)
> 
> I have two tables, with the same structure.  Call them [Fall 2003] and
> [Spring 2004].  They each hold data about enrolled students.  One of the
> fields of these tables is SSN (social security number).

The existence of separate tables with identical structures and disjoint
data in your schema is a red flag to me.

I highly recommend putting all the data into one table with a "Season"
column to distinguish the rows... it simplifies these kinds of queries and
makes the data much more useful for mining.

> I can get an unduplicated headcount for the academic year with:
> 
>    SELECT SSN from [Fall 2003]
>    UNION
>    SELECT SSN from [Spring 2004];
> 
> There's also a field called "race".  If the race field is 0, the student
> is African American.
> 
> How would I modify the above SQL statement to list the unduplicated
> headcount of all African Americans?  I would like to do something like:
> 
>    SELECT SSN from [Fall 2003]
>    UNION
>    SELECT SSN from [Spring 2004]
>    and race == 1;
> 
> but of course that doesn't work.  :)

As the other respondents suggested, different SQL engines have different
levels of support for the language.

    SELECT SSN from [Fall 2003] WHERE race == 1
    UNION
    SELECT SSN from [Spring 2004] WHERE race == 1

gets you a little bit closer, but there are a couple of problems with 
this... it isn't really obtaining a headcount, and unions don't
eliminate duplicates.

Imagine taking this step by step in a sequential batch, something like:

    CREATE TEMPORARY TABLE ssntbl1 (
        ssn char(9)
    );

    INSERT INTO ssntbl1 (ssn)
    SELECT SSN FROM [Fall 2003] WHERE race == 1;

    INSERT INTO ssntbl1 (ssn)
    SELECT SSN FROM [Spring 2004] WHERE race == 1;

    SELECT COUNT( DISTINCT ssn ) FROM ssntbl1;

or, in more efficient fashion:

    CREATE TEMPORARY TABLE ssntbl2 (
        season varchar(20)
      , headcount number
    );

    INSERT INTO ssntbl2 ( season, headcount )
    SELECT 'Fall 2003', COUNT(SSN) FROM [Fall 2003]
    WHERE race == 1;

    INSERT INTO ssntbl2 ( season, headcount )
    SELECT 'Spring 2004', COUNT(SSN) FROM [Spring 2004]
    WHERE race == 1;

    SELECT SUM(headcount) FROM ssntbl2;

The UNION clause in the first possible solution above is generally
implemented under the hood much like the ssntbl1 solution... that is, if a
student participates in both terms, their SSN will show up twice in the
output of the UNION.

If you had the data all in one table, the select statement for the
headcount per season for multiple seasons...

    SELECT season, COUNT( DISTINCT SSN )
    FROM UnifiedEnrollment
    WHERE season IN ( 'Fall 2003', 'Spring 2004' ) AND race=1
    GROUP BY season;

... would be about as easy as obtaining the headcount for multiple seasons
combined:

    SELECT COUNT( DISTINCT SSN )
    FROM UnifiedEnrollment
    WHERE season IN ( 'Fall 2003', 'Spring 2004' ) AND race=1;
    
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...2k
---------------------------------------------------------------------------


_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech



LinkedIn
LUGOD Group on LinkedIn
Sign up for LUGOD event announcements
Your email address:
facebook
LUGOD Group on Facebook
'Like' LUGOD on Facebook:

Hosting provided by:
Sunset Systems
Sunset Systems offers preconfigured Linux systems, remote system administration and custom software development.

LUGOD: Linux Users' Group of Davis
PO Box 2082, Davis, CA 95617
Contact Us

LUGOD is a 501(c)7 non-profit organization
based in Davis, California
and serving the Sacramento area.
"Linux" is a trademark of Linus Torvalds.

Sponsored in part by:
EDGE Tech Corp.
For donating some give-aways for our meetings.