Suprtool

Getting the Top Ten Records

If you have a dataset/file that stores "scores" (e.g., movie ratings), and you want to pick the top ten scores, the obvious solution doesn't work:

    >get       d-movies
    >sort      rating,desc
    >sort      movie-title
    >extract   rating, movie-title
    >numrecs   10
    >list      standard
    >xeq

    RATING MOVIE-TITLE

    8  The Rock
    8  The Truth About Cats & Dogs
    7  Courage Under Fire
    7  G.I. Jane
    7  Leaving Las Vegas
    7  Return of the Jedi
    6  Dragonheart
    6  The American President
    2  Batman & Robin
    2  From Dusk Till Dawn
    IN=11,OUT=10.CPU-Sec=1.Wall-Sec=1.

These are not the highest-rated movies. The problem is a misunderstanding of what the Numrecs command does.

The IN=11 count at the end shows that Suprtool did not read the entire dataset of more than 500 records. Numrecs limits the input operation to the number of records specified, so Suprtool reads only the first ten records it finds and sorts just those ten, whatever their ratings happen to be.
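
To make the effect concrete outside Suprtool, here is a minimal Python sketch of "limit the input, then sort". It is only an analogy: the `records` list and the `limit_then_sort` helper are hypothetical stand-ins, not Suprtool features, and the limit is 3 rather than 10 to keep the data small.

```python
# Hypothetical stand-in data: (rating, title) pairs in arbitrary stored order.
records = [
    (7, "G.I. Jane"),
    (2, "Batman & Robin"),
    (10, "Casablanca"),
    (8, "The Rock"),
    (9, "Star Wars"),
    (6, "Dragonheart"),
]


def limit_then_sort(rows, limit):
    """Mimic the one-pass task: cap the input first, then sort what was read."""
    read = rows[:limit]  # the record limit applies to input, before any sorting
    # Sort by rating descending, then by title ascending.
    return sorted(read, key=lambda r: (-r[0], r[1]))


one_pass = limit_then_sort(records, 3)
print(one_pass)
```

Only the first three stored records are ever considered, so higher-rated entries that sit later in the data (such as "Star Wars" here) can never appear in the result.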

The proper approach to this problem is to use two passes:

    >get       d-movies
    >sort      rating,desc
    >sort      movie-title
    >extract   rating, movie-title
    >output    foo,link,temp
    >xeq
    IN=519,OUT=519.CPU-Sec=1.Wall-Sec=1.

    >input     foo
    >numrecs   10
    >list      standard
    >xeq

    RATING MOVIE-TITLE

    10 Casablanca
    10 The Usual Suspects
    9  Citizen Kane
    9  One Flew Over the Cuckoo's Nest
    9  Saving Private Ryan
    9  Schindler's List
    9  Star Wars
    9  The Godfather
    9  The Shawshank Redemption
    8  Raiders of the Lost Ark
    Warning:  NUMRECS exceeded; some records not processed.
    IN=11,OUT=10.CPU-Sec=1.Wall-Sec=1.

This method lets Suprtool read all the records and sort them by score in the first pass, then re-read the sorted file in the second pass and keep only the first ten records.
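
The same two-step idea can be sketched in Python, purely as an analogy. The `records` list and the `sort_then_limit` helper are hypothetical, and the limit is 3 instead of 10 so the sample data stays short.

```python
# Hypothetical stand-in data: (rating, title) pairs in arbitrary stored order.
records = [
    (7, "G.I. Jane"),
    (2, "Batman & Robin"),
    (10, "Casablanca"),
    (8, "The Rock"),
    (9, "Star Wars"),
    (6, "Dragonheart"),
]


def sort_then_limit(rows, limit):
    """Mimic the two-pass task: sort everything, then keep the first few."""
    # Pass 1: read every record and sort by rating descending, title ascending.
    ordered = sorted(rows, key=lambda r: (-r[0], r[1]))
    # Pass 2: re-read the sorted result and stop after `limit` records.
    return ordered[:limit]


top = sort_then_limit(records, 3)
print(top)
```

Because the limit is applied only after the full sort, the result really is the top of the list rather than a sorted sample of whatever happened to be stored first.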

[Mike Shumko]

Note: Please don't flame us about the actual movie ratings shown above. It's just test data!
