Arrays of Records in Python

  • David Jack
    Participant

    As many of you know, I am putting the finishing touches to a Higher Python resource, and before I release it I wanted to gather opinions on the best way to implement “Arrays of Records”.

    After attending the CodeClan course and spending some of my own time looking into it, I have settled on creating a List of Lists to represent an “Array of Records”.

    There are many other ways of doing it; tuples and classes are two.

    Of course, I am looking for the way that best suits the course, is understandable for the pupils and is in line with the SQA's expectations in the exams and practical assessments.

    Does anyone expect there to be any issues with teaching it as a “List of Lists”? I still plan to use the SQA jargon “Array of Records”. If anyone teaches it another way, what is your reasoning for it? Should a standardised way be delivered throughout Scotland, or are the SQA leaving it up to the judgement of the teacher/faculty? If it is left to teachers, this may cause a few issues at exam marking if the marker is not used to, say, tuples.
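
    For readers who want to see the idea concretely, here is a minimal sketch of a list of lists standing in for an array of records. The field names, the index constants and the sample data are all made up for illustration; they are not taken from the resource being discussed.

```python
# Hypothetical field positions; the names are illustrative only.
FORENAME, SURNAME, AGE, HEIGHT = 0, 1, 2, 3

# Each inner list plays the role of one record; the outer list is the "array".
pupils = [
    ["Ada", "Lovelace", 16, 1.65],
    ["Alan", "Turing", 17, 1.78],
]

# Fields are reached by position, so this updates the age in the second record.
pupils[1][AGE] = 18

# Traverse the "array" and read individual fields from each "record".
for pupil in pupils:
    print(pupil[FORENAME], pupil[SURNAME], pupil[AGE])
```

    Note that nothing in the structure itself says which position holds which field; that knowledge lives only in the constants or in the programmer's head.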

    David Jack
    Participant

    Also, thank you to everyone who has sent me feedback and suggestions to the resource so far.

    Gordon Milne
    Participant

    I think there is a slight issue with using a “list of lists” for an array of records. In languages that implement them, records are distinct from arrays in that they can contain a mixture of data types: integer, real, string, etc.

    Arrays contain a single data type, either a primitive data type (e.g. array[10] of integer) or a declared data type (e.g. array[100] of studentrecordtype).

    Using the flexible Python list structure for both of these can cause confusion. I would use either a class or a named tuple as the best substitute for a record.

    Having taught Python to Higher students for 10 years now, I have always found this to be one of the few areas where Python’s looser structure can be a problem rather than an advantage.

    Lee Murray
    Participant

    I will personally be using a Class to create a record, then have an array of (instances of) objects.

    The reason for this is that with a class you can use dot notation which was accepted as an answer in the 2016 paper (question 15b), but also the creation of an instance is almost identical.

    For example, if I want to create a ‘record’ structure called athleteData (as per the 2016 paper Q15), I’d do this in Python like so:

    class athleteData:
        def __init__(self, forename, surname, runnerNumber, professional, seasonBest, weight):
            self.forename = forename #string
            self.surname = surname #string
            self.runnerNumber = runnerNumber #integer
            self.professional = professional #boolean
            self.seasonBest = seasonBest #real
            self.weight = weight #real

    That’s obviously nothing like the SQA solutions or examples, but the creation of an instance is pretty spot on:

    myRecord = athleteData("Salma", "Hussain", 324, True, 45.12, 67.5)

    Adding this to a list (‘array’) is also really intuitive:

    athletes = []
    athletes.append(myRecord)
    
    # adding another one
    athletes.append(athleteData("Lee", "Murray", 100, False, 80.37, 95.7))

    I found that there are issues with dictionaries and even more so with lists of lists, so I’ll be sticking with lists of objects.

    Hope this helps and sorry for teaching you to suck eggs with code, but it helped me explain my thoughts 🙂
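
    As a follow-on sketch, reading and updating fields in a list of objects could look like the code below. The class is repeated from the post above so the snippet runs on its own; the new season best value and the `professionals` list are made up for illustration.

```python
class athleteData:
    def __init__(self, forename, surname, runnerNumber, professional, seasonBest, weight):
        self.forename = forename
        self.surname = surname
        self.runnerNumber = runnerNumber
        self.professional = professional
        self.seasonBest = seasonBest
        self.weight = weight

# The "array of records": a list of athleteData instances.
athletes = [
    athleteData("Salma", "Hussain", 324, True, 45.12, 67.5),
    athleteData("Lee", "Murray", 100, False, 80.37, 95.7),
]

# Index into the array, then use dot notation to read a field.
print(athletes[0].forename)

# A single field of a single record can be updated in place.
athletes[1].seasonBest = 79.5

# Traverse the array and pick out the surnames of the professional runners.
professionals = [athlete.surname for athlete in athletes if athlete.professional]
print(professionals)
```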

    Marc McWhirter
    Participant

    I still think NamedTuple from the typing module (Python 3.6 onwards) is the best way to go at the moment, implementing a list of those. I am still not entirely comfortable with this area, as the SQA always ask how to create an array of a given size, which doesn’t really work in Python.

    Here’s an example of Python 3 named tuples from the typing module rather than collections; as you can see, you have to define a type for each field.

    https://repl.it/@mcwhim/Arrays-of-Records-Example
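
    For anyone who cannot open the link, a typed named tuple might be sketched as follows. This is not the code from the linked example; the `Athlete` record and its fields are assumptions chosen to echo the athlete example earlier in the thread.

```python
from typing import NamedTuple

# The record definition: each field is declared with a type.
# The annotations document the intended types; Python does not enforce them at run time.
class Athlete(NamedTuple):
    forename: str
    surname: str
    runner_number: int
    season_best: float

# The "array of records": a list of Athlete tuples.
athletes = [
    Athlete("Salma", "Hussain", 324, 45.12),
    Athlete(forename="Lee", surname="Murray", runner_number=100, season_best=80.37),
]

# Fields are read with dot notation, as with a class.
for athlete in athletes:
    print(athlete.forename, athlete.season_best)
```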

    Mrs Janet McDonald
    Participant

    Hi

    We use python classes for this because the coding can look a lot like what the SQA seem to be looking for in answers (as Lee has said in his reply).

    We tried using namedtuples, but found that pupils couldn’t replace the value in a single “field” of the “record”; they had to overwrite the entire “record”. This was counterintuitive! It probably didn’t cause problems at Higher, because the pupils rarely, if ever, had to replace a value in a single field in practice. But when pupils went on to use “arrays of records” in Advanced Higher projects it did become an issue (e.g. updating scores in a game where they were storing the player’s name and score in an “array of records”).

    Janet McDonald
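
    The behaviour described above can be reproduced with a short sketch. The `Player` record and the scores are invented for illustration; `_replace` is the standard namedtuple method for producing a modified copy.

```python
from collections import namedtuple

# A hypothetical record type holding a player's name and score.
Player = namedtuple("Player", "name score")
players = [Player("Ann", 10), Player("Bob", 20)]

# Assigning to a single field fails, because named tuples are immutable.
try:
    players[0].score = 15
    assignment_worked = True
except AttributeError:
    assignment_worked = False

# The workaround: build a new record with one field changed
# and overwrite the whole entry in the list.
players[0] = players[0]._replace(score=15)
print(players[0])
```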

    Marc McWhirter
    Participant

    @Janet McDonald you should check this out for the problems your students had with arrays of records for their projects: https://realpython.com/python-data-classes/

    These function very similarly to namedtuples, BUT they are mutable, so you can change the values of fields. They will be available from Python 3.7 onwards. Hope that helps.

    Cheers,
    Marc
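
    A minimal data class sketch, assuming Python 3.7 or later, might look like this. The `Player` record is a made-up example in the spirit of the game-score case mentioned above, not code from the linked article.

```python
from dataclasses import dataclass

# The decorator generates __init__, __repr__ and __eq__ from the field declarations.
@dataclass
class Player:
    name: str
    score: int = 0  # default value used when no score is given

# The "array of records": a list of Player instances.
players = [Player("Ann", 10), Player("Bob")]

# Unlike a named tuple, a single field can be changed in place.
players[1].score += 5
print(players[1])
```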

    Scott Leiper
    Participant

    @MarcMcWhirter that’s the first thing I have seen that makes me want to move from 2.7 to 3.7 in my classroom. I had previously discounted named tuples as too much faff.

    I had Raymond Simpson visiting me a few years back at school and I mentioned Arrays of Records being unsupported in Python and that generally I used 2D lists, and he said fine. As a result any practical assessment that looks for a record structure we use lists and I teach records as theory only for answering in the exam.

    Lee Murray
    Participant

    Data classes look VERY nice!

    Marc McWhirter
    Participant

    @sleiper @leemurray 3.7 is out at the end of June according to the python website https://www.python.org/dev/peps/pep-0537/

    Darren Brown
    Participant

    We found exactly the same issues. We are still using Python 2.7, so we have been using an array of named tuples, and realised we could not edit individual fields, which may well be required in the new Higher. We managed to get this working with the _replace method (see the second procedure below), but are seriously thinking about changing N5 and H to Python 3. I believe lists can also be set out like arrays in SQA format.

    # Python 2.7 code (print statement and raw_input)
    from collections import namedtuple

    def GetHandlesandScores():
        player_info = open("Pinball Testers.txt", "r")
        PinballPlayers = []

        # Record structure: one named tuple per tester
        tester = namedtuple('tester', 'Handle, Basic_Score, Level, Bonus, Final_Score')

        for line in player_info:
            row_data = line.split(',')
            # Level, Bonus and Final_Score start at 0 and are filled in later
            current_record = tester(Handle=row_data[0], Basic_Score=int(row_data[1]), Level=0, Bonus=0, Final_Score=0)
            PinballPlayers.append(current_record)

        player_info.close()
        return PinballPlayers

    def LevelandBonusSet(PinballPlayers):
        # The loop is hard-coded to 8 records
        for x in range(8):
            print "Gamer: ", PinballPlayers[x].Handle, " ",
            level_on = int(raw_input("Please enter the level reached in the Pinball App - "))
            while not (level_on >= 1 and level_on <= 25):
                level_on = int(raw_input("Levels between 1 and 25! Please re-enter the level reached in the Pinball App - "))

            # Named tuples are immutable: _replace returns a new record,
            # which then overwrites the old one in the list
            PinballPlayers[x] = PinballPlayers[x]._replace(Level=level_on)

            if PinballPlayers[x].Level > 20:
                PinballPlayers[x] = PinballPlayers[x]._replace(Bonus=50)
            elif PinballPlayers[x].Level >= 15:
                PinballPlayers[x] = PinballPlayers[x]._replace(Bonus=25)
            elif PinballPlayers[x].Level >= 10:
                PinballPlayers[x] = PinballPlayers[x]._replace(Bonus=10)
            else:
                PinballPlayers[x] = PinballPlayers[x]._replace(Bonus=5)

        return PinballPlayers

    David Jack
    Participant
    Lee Murray
    Participant

    @DJack have you used that code? How do you populate the 2-D array? Wouldn’t that just create a list of the same list repeated 5 times?

    I tried adding the following code to your example:

    city_array[0][0] = "Aberdeen"
    city_array[0][1] = "Scotland"
    city_array[0][2] = 0.212125
    city_array[0][3] = "Doric"
    
    city_array[1][0] = "Edinburgh"
    city_array[1][1] = "Scotland"
    city_array[1][2] = 0.495360
    city_array[1][3] = "English"
    
    city_array[2][0] = "Tokyo"
    city_array[2][1] = "Japan"
    city_array[2][2] = 13.617444
    city_array[2][3] = "Japanese"

    What that does is change the city_records array to the Aberdeen version 5 times, then to the Edinburgh version 5 times, then finally the Tokyo version 5 times. That’s because it’s always the same list that it’s referring to, regardless of where in the city_array that list appears.

    I don’t know if I’m using it in a completely different way to you and maybe I’m missing something, but 2-D arrays really don’t look fit for purpose in any way for an array of records.
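
    The code being replied to is not shown in this thread, so the following is only a sketch of how that symptom can arise, assuming the 2-D array was built by repeating one inner list. The names `blank_record`, `aliased` and `independent` are hypothetical.

```python
# One blank "record": city, country, population in millions, language.
blank_record = ["", "", 0.0, ""]

# Multiplying the outer list repeats a reference to the SAME inner list 5 times.
aliased = [blank_record] * 5
aliased[0][0] = "Aberdeen"

# Every row is the same object, so every row now shows "Aberdeen".
all_same = all(row is aliased[0] for row in aliased)
print(aliased[4][0])

# A list comprehension builds a fresh inner list for each row instead.
independent = [["", "", 0.0, ""] for _ in range(5)]
independent[0][0] = "Aberdeen"
independent[1][0] = "Edinburgh"
print(independent[0][0], independent[1][0])
```

    With the comprehension version each row can be filled in separately, which is one way to get a fixed-size array of blank records in Python.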

    David Jack
    Participant
    Lee Murray
    Participant

    @DJack I see. Is the initial creation of the array just to get pupils into the mind-set of setting out the size of the array and data types in the ‘record’?

    I still wouldn’t use this method, personally, as retrieving the data is quite different from what the SQA would use or expect in an exam (from what I’ve seen in past papers – the only thing we have to go by).

