Higher Specimen Coursework

  • Mark Hay
    Participant

    For anyone doing the H specimen coursework – noticed a possible little niggle in the DB task where the marking scheme doesn’t seem to match.

    I think the SQL was meant to say: SELECT forename, surname, patientID FROM Patient WHERE address LIKE ‘%EH12 %’; ( the space was missing) this means that it does miss out some of the results.

    As when we ran the original it did find all of the 5 results?

    Steve Close
    Participant

    Yeah I found this too. Perhaps we’re just missing something? Or is it indeed a mistake?

    James Brennan
    Participant

    Yeah, I’m sure I found the query results didn’t match the scheme. Looks like a mistake to me.

    Lucinda
    Participant

    It is definitely a mistake.

    Richard Scott
    Participant

    Maybe someone can help me simplify my answer to this before I go through it with the class..

    I am using SQLite in Python for this and have produced the following output:

    Task 1, 1C(i)
    Number of times each drug has been prescribed
    drug ADSIJD8244244 prescribed 12 times
    drug HG2383474SKD prescribed 10 times
    drug 20BHDS235723 prescribed 9 times
    drug DSJKDSIOEISBG prescribed 9 times
    drug 19BHDS239475 prescribed 8 times
    drug 95LSJCVAH864 prescribed 8 times
    drug 103BSJGS836S prescribed 7 times
    drug MASJBCBEUI900 prescribed 7 times
    drug CPSVS700347A prescribed 6 times
    drug KASJBCBEUIHE2 prescribed 6 times
    drug 16343SHSAR3D prescribed 5 times
    drug 72BSJCJAK468 prescribed 5 times
    drug 93763SHSAG3D prescribed 5 times
    drug SDAIJD9364534 prescribed 5 times
    drug 104BSHGS350R prescribed 4 times
    drug 94HSJ378BSDJS prescribed 4 times
    drug SAJGSKD048SS prescribed 4 times
    drug PNXVS895347W prescribed 3 times
    drug PHASHVEHSB2 prescribed 2 times
    drug PAK946120BSV prescribed 1 times

    Task 1, 1C(ii)
    Patients prescribed drug with highest dosage
    Patient ID 17 Date prescribed: 12/05/2018
    Patient ID 38 Date prescribed: 13/05/2018
    Patient ID 53 Date prescribed: 15/05/2018
    Patient ID 58 Date prescribed: 15/05/2018
    Patient ID 75 Date prescribed: 14/05/2018
    Patient ID 88 Date prescribed: 14/05/2018
    Patient ID 92 Date prescribed: 13/05/2018
    Patient ID 99 Date prescribed: 16/05/2018

    Task 1, 1D
    People in EH12 postcode area
    8 Gifty Adisa
    9 Lewis Robinson
    11 Morgane Dubios

    (Yes, I added in the missing space to the %EH12% query)

    Richard Scott
    Participant

    Maybe someone can help me simplify my answer to this before I go through it with the class..

    I am using SQLite in Python for this and have produced the following output:

    Task 1, 1C(i)
    Number of times each drug has been prescribed
    drug ADSIJD8244244 prescribed 12 times
    drug HG2383474SKD prescribed 10 times
    drug 20BHDS235723 prescribed 9 times
    drug DSJKDSIOEISBG prescribed 9 times
    drug 19BHDS239475 prescribed 8 times
    drug 95LSJCVAH864 prescribed 8 times
    drug 103BSJGS836S prescribed 7 times
    drug MASJBCBEUI900 prescribed 7 times
    drug CPSVS700347A prescribed 6 times
    drug KASJBCBEUIHE2 prescribed 6 times
    drug 16343SHSAR3D prescribed 5 times
    drug 72BSJCJAK468 prescribed 5 times
    drug 93763SHSAG3D prescribed 5 times
    drug SDAIJD9364534 prescribed 5 times
    drug 104BSHGS350R prescribed 4 times
    drug 94HSJ378BSDJS prescribed 4 times
    drug SAJGSKD048SS prescribed 4 times
    drug PNXVS895347W prescribed 3 times
    drug PHASHVEHSB2 prescribed 2 times
    drug PAK946120BSV prescribed 1 times

    Task 1, 1C(ii)
    Patients prescribed drug with highest dosage
    Patient ID 17 Date prescribed: 12/05/2018
    Patient ID 38 Date prescribed: 13/05/2018
    Patient ID 53 Date prescribed: 15/05/2018
    Patient ID 58 Date prescribed: 15/05/2018
    Patient ID 75 Date prescribed: 14/05/2018
    Patient ID 88 Date prescribed: 14/05/2018
    Patient ID 92 Date prescribed: 13/05/2018
    Patient ID 99 Date prescribed: 16/05/2018

    Task 1, 1D
    People in EH12 postcode area
    8 Gifty Adisa
    9 Lewis Robinson
    11 Morgane Dubios

    (Yes, I added in the missing space to the %EH12% query)

    My question is about c(ii), here is my solution:

    SELECT patientID, datePrescribed
    FROM prescribedDrug
    WHERE prescribedDrug.drugID = (
    SELECT drugID FROM (SELECT drugID, MAX(dosage) FROM drug)
    )

    I am having to use THREE select statements, can anyone simplify this for me? I suspect this is because I am using Subqueries, does anyone know how to (if at all possible) chain the results of queries together without having to create temporary tables?

    Paul Gardner
    Participant

    Hi Richard,

    There are a few ways to answer Q1 C(ii)

    If you were tackling this in Access you’d probably create and save a named query that finds the drug with the highest dose and then use this query in a second query. The SQL equivalent to this is to create a view and then use this view in another query, e.g.

    — Q1 c(ii) patients prescribed drug with the highest dosage
    — a) use a view for highest dosage and simple table joins

    create view highestDosage as
    select max(dosage) maxDosage from drug;

    select patientID, datePrescribed
    from prescribedDrug, drug, highestDosage
    where prescribedDrug.drugID = drug.drugID
    and drug.dosage = highestDosage.maxDosage;

    Or you can use an ‘inline view’ which does the same but without saving the view definition, e.g.

    — b) use an inline view
    select patientID, datePrescribed
    from prescribedDrug, drug,
    (select max(dosage) maxDosage from drug) as highestDosage
    where prescribedDrug.drugID = drug.drugID
    and drug.dosage = highestDosage.maxDosage;

    Or you can use a subquery in the where clause, e.g.

    — c) use subqueries
    select patientID, datePrescribed
    from prescribedDrug, drug
    where prescribedDrug.drugID = drug.drugID
    and drug.dosage = (select max(dosage) from drug);

    =================

    Your approach works in this scenario but demonstrates what is arguably a bug in SQLite, as the expression

    SELECT drugID, MAX(dosage) FROM drug

    shouldn’t work. MAX(dosage) is an aggregate function that returns only one value.
    If more than one drug has the same (maximum) dosage then SQLite appears to only return the first row that matches the MAX(dosage) value.

    You can see this with a simple test, e.g.

    create table test (name text, age integer);
    insert into test values (‘a’,21);
    insert into test values (‘b’,22);
    insert into test values (‘c’,22);
    insert into test values (‘d’,21);

    The way to get the name with the highest age is to

    select name, age from test
    where age = (select max(age) from test);

    (two rows returned)

    but SQLite will allow you to

    select name, max(age) from test;

    but will only return one row. max(age) will be correct but the name value will only be one of the two possible.

    Richard Scott
    Participant

    Many thanks for that Paul.

    I like your solution with the subquery:

    SELECT patientID, datePrescribed
    FROM prescribedDrug, drug
    WHERE prescribedDrug.drugID = drug.drugID
    AND drug.dosage = (SELECT MAX(dosage) FROM drug);

    It does indeed simplify my answer. I was aware of the limitations of my solution at the time and realise it only gave the correct result dues to the data in the system.

    Your solution here is great and much easier to understand and explain than my slightly convoluted and incorrect one (given a different set of data) – but I think it is what I was trying to get at!

    SELECT patientID, datePrescribed
    FROM prescribedDrug
    WHERE prescribedDrug.drugID = (
    SELECT drugID FROM (SELECT drugID, MAX(dosage) FROM drug));

    Thanks again.

    Richard Scott
    Participant

    Equi-join between tables.

    I cannot help thinking that I could improve on the solution to Qu1 C(ii) by implementing a join between the tables as follows:

    ‘SELECT patientID, datePrescribed
    FROM prescribedDrug
    INNER JOIN drug
    ON prescribedDrug.drugID = drug.drugID
    WHERE drug.dosage = (SELECT MAX(dosage) FROM drug);’

    However the Higher course does not mention using JOINS but the N5 course does; the N5 course lists ‘equi-joins’ in the list of SQL commands required. It could be argued that one could implement a join using WHERE as in Paul’s answer below but would the above method not be the ‘more correct’ way of doing this?

    ‘SELECT patientID, datePrescribed
    FROM prescribedDrug, drug
    WHERE prescribedDrug.drugID = drug.drugID
    AND drug.dosage = (SELECT MAX(dosage) FROM drug);’

    SQLite has natural joins which are an incredibly easy way to achieve this without the join condition and simplifies the answer to:

    ‘SELECT patientID, datePrescribed
    FROM prescribedDrug
    NATURAL JOIN drug
    WHERE drug.dosage = (SELECT MAX(dosage) FROM drug);’

    Should I be teaching this at Higher? It is mentioned (or hinted at) in the N5 arrangements but not in the Higher arrangements.

    Lee Murray
    Participant

    I’m only just getting a chance to work through the specimen assignment now, but wanted to mention my views on the same question (1d).

    Not only is the first part incorrect, but the second mark suggests that the problem is fixed by splitting the address field and having a required postcode field. This doesn’t solve the problem at all because all 5 patients did provide a postcode; it’s just that some of them included only the first part of it. Ensuring a match would require either fixing the query in the first place (not including the space) or setting a bit more validation on the new postcode field.

    Lee Murray
    Participant

    I’ve tried posting multiple times since completing the specimen, but I think this forum has problems with code being typed in. It blocked my posts as it probably suspected an injection attack or something, ha ha.

    Anyway, I have an issue with question 2bii. I wonder if the design is intentionally insufficient or if it was a mistake. Following the design as best as I could and only using code that was taught, I came up with the following:

    following design using only higher code

    Since pupils have to follow the design, I came up with this solution:

    following the design but with extra code

    Obviously this uses code that isn’t taught at higher level (and may not exist in other languages), so not only is it wrong to expect pupils to come up with it, but it also renders the entire design pointless:

    not following the design, using extra code

    I know pupils may be expected to follow inefficient designs, but what about designs that just can’t work? The marking instructions show that no marks are awarded for following the design, so does that mean that the third solution would garner full marks?

    Hopefully the real assignment isn’t this bad.

    Lee Murray
    Participant

    Or an even simpler way to solve the entire problem:

    print(beach.name.split(" ")[0])

    Paul Gardner
    Participant

    Hi Richard,

    I think the SQA specifies equi-joins (only) so that it’s clear that pupils aren’t expected to know about outer joins (left join, right join etc.).

    I think you’re right about syntax, the newer style for select, e.g.

    select patientID, datePrescribed
    from prescribedDrug inner join drug
    on prescribedDrug.drugID = drug.drugID
    where …

    is better as it separates the table join conditions from any row selection conditions. It’s also a consistent style to use if/when you need to introduce outer joins. I still use the old style out of habit but try and remember to teach the new.

    I avoid using NATURAL JOINs because the written SQL doesn’t show the explicit join condition; you need to know the database schema to know which fields match between the tables. It’s also easy to miss that NATURAL JOINs use *all* matching field names (regardless of primary and foreign key definitions), e.g.

    create table doctor (
    doctorID integer primary key,
    name text);

    create table patient (
    patientID integer primary key,
    name text,
    doctorID integer,
    foreign key (doctorID) references doctor(doctorID));

    A typical inner join works as expected:

    select * from
    doctor inner join patient
    on doctor.doctorID = patient.doctorID;

    but a NATURAL JOIN:

    select * from
    doctor natural join patient;

    matches doctor.doctorID = patient.doctorID *and* doctor.name = patient.name

    Happy coding.

    Paul

    Richard Scott
    Participant

    Hi Paul,

    I can see your point about the flexibility of INNER JOIN and avoiding NATURAL JOIN – but – it is so much simpler provided, as you say, the schema supports it. If I knew that H tasks could always be solved using NATURAL JOIN then it would be tempting to teach only that – I suspect they may be!

    😁

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic.