Download the updated files here. Note that the 2010 data has not been modified.
Tonight I released the Study Score Archive for 2010. Download it here.
Unfortunately, as I noted in the previous post, the Herald Sun did not publish school names alongside student names online this year. For this reason, the 2010 release is much narrower in scope than those of previous years. Update: school data now available.
This year brings the introduction of estimated scaled scores to the archive. There are a few other changes, too – I hope you will find them useful. As always, please let me know if you spot any errors.
The source code for the program that generates the lists is currently not available, but I will aim to publish it soon.
Early this morning the Herald Sun published its annual Top Scorers section online. In previous years the website simply served a set of static pages corresponding to VCE subjects; this year it offers only an interactive search tool, which can search by surname and by study name. In general, the new format is far superior to the old static pages and reduces, at least somewhat, the need for the Study Score Archive. The CSV files used to store the student names and scores are also readily accessible: there is one arranged by surname and one arranged by study name. Having the data already sorted neatly into CSV files is good news for me, since it means there is no longer any need to manually copy and paste huge chunks of text from 100 separate web pages.
Updated: school data now available.
As I previously mentioned, this year’s release of the Study Score Archive will feature estimated scaled scores alongside raw scores. Helpfully, VTAC provides scaling data in the annual Scaling Report, but correspondences between raw scores and scaled scores for each subject are provided only for raw scores of 20 or more that are multiples of 5 (that is, for the raw scores of 20, 25, 30, 35, 40, 45 and 50). It is therefore necessary to estimate the correspondences for the remaining raw scores. Estimating values that lie between known data points in this way is called interpolation.
A simple method of interpolation is linear interpolation. This is the approach used by Daniel15’s VCE ATAR Calculator, and it is the approach that I will be using in the 2010 release of the Study Score Archive. Other methods might produce more accurate results, but they are harder to implement and without more data it’s not possible to verify which method of interpolation produces the best results.
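As a rough sketch of what linear interpolation between the published points could look like: the snippet below is not the archive's actual code, the function name `interpolate_scaled` is hypothetical, and the anchor values are made up for illustration rather than taken from any Scaling Report.

```python
from bisect import bisect_right


def interpolate_scaled(raw, anchors):
    """Estimate a scaled score for `raw` by linear interpolation.

    `anchors` maps the raw scores for which a scaled score is known
    (e.g. 20, 25, ..., 50) to the corresponding scaled score.
    """
    known = sorted(anchors)
    if raw < known[0] or raw > known[-1]:
        raise ValueError("raw score lies outside the known range")
    if raw in anchors:
        # A published point: no estimation needed.
        return float(anchors[raw])
    # Find the two known raw scores that bracket `raw`.
    i = bisect_right(known, raw)
    x0, x1 = known[i - 1], known[i]
    y0, y1 = anchors[x0], anchors[x1]
    # Straight line between (x0, y0) and (x1, y1), evaluated at `raw`.
    return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)


# Hypothetical anchor values for illustration only; NOT real scaling data.
EXAMPLE_ANCHORS = {20: 18.0, 25: 23.0, 30: 28.0, 35: 33.0,
                   40: 38.0, 45: 44.0, 50: 50.0}

if __name__ == "__main__":
    for raw in (40, 41, 42, 43, 44, 45):
        print(raw, round(interpolate_scaled(raw, EXAMPLE_ANCHORS), 2))
```

With these made-up anchors, every raw score between 40 and 45 is placed on the straight line joining the two published points, so each extra raw mark adds the same amount to the estimate.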
To illustrate the process, let’s take a look at the scaling of Further Mathematics in 2010 (for scores of 40 and above).
There will be one major additional feature this year: scaling of scores. Newspapers only publish raw scores, which can be misleading: a raw score of 40 in Latin contributes much more to one’s ATAR than a raw score of 40 in Business Management does. This year I will be incorporating data from the Scaling Report published by VTAC to generate a separate set of lists made with scaled scores.
As always, this sort of data should be interpreted cautiously. Consider:
- Calculated scaled scores are estimates only.
- A raw score of 39 in Latin scales to somewhere between 50 and 52, but it will be missing from the data because only raw scores of 40 and above are published. Meanwhile, a score of 40 in Business Management (which scales to around 37) will be present.
Lists using scaled scores will be included in addition to the standard raw score lists, not in place of them.
I don’t have access to the scaling reports for all the years included in the archive. Lists using scaled scores will only be present for 2006 onwards (unless someone can send me the reports for earlier years).