Welcome to the last of our series on extracting data from old crossplots. This is the most challenging of the four. We ask you to extract the axis coordinates of every marker on the plot and identify each marker's label from the legend. Good luck, and we look forward to seeing what you all build.
Note: Due to the number of points that need to be evaluated on each test image, the scoring algorithm will take >2 minutes to run.
The test dataset contains a labels file, which can be uploaded to get an RMSE score of 0. Could the team remove the test labels file from the competition page and reset the leaderboard?
These are a few observations related to scoring:
When I submit a file with constant coordinates for all the samples, I get a score of around 657.3851.
When I submit the file sample-submission.csv, which was provided at the start of the competition, I get a score of 289.0781 (I believe the participants ranked 2, 3, and 4 did the same). This file doesn't contain any of the test sample names.
When I submit a file with reasonably good predictions, I get a score of >1000.
Please check how the website is scoring. If the website is scoring correctly, please share the scoring function so that we know what we are trying to improve. If not, please reset the leaderboard.
Thank you for your note. We’ve corrected the Leaderboard and have rigorously tested the scoring algorithm.
Your score of 657 is correct according to the RMSE metric. Have you seen any other abnormal behavior?
Thanks for the reply. This is the information provided on the website regarding scoring:
Submissions will be scored against the test answer key using RMSE. For each graph, the reported coordinates are matched to key coordinates by minimizing euclidean distance. Values are normalized with min/max transformation. The coordinates are scored based on RMSE. Lower scores are considered more successful.
Can you provide more information on how the scoring algorithm handles edge cases such as the following?
- How does it handle no data points for a given label?
- How does it handle data points for an extra label (if the legend has only three label integers but the submission has a point with the label as 4)?
The description says that, for each graph, the reported coordinates are matched to key coordinates by minimizing Euclidean distance. Is this done label-wise?
It would be great if you could share some parts of the scoring function!
For each graph, the min/max transformation (0-1) is computed from the answer key and then applied to the submission file. Missing or extra values are penalized (+1, the maximum distance under the min/max transformation). We are not explicitly scoring values based on the legend (z). However, a solution for the legend category will get extra points in the final evaluation.
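For anyone trying to reproduce the score locally, here is a rough sketch of how I read the description above. It is not the organizers' scoring code. The function names, the greedy closest-pair matching, and the choice to average the squared errors over matched plus penalized points are all my assumptions; the real scorer may use an optimal assignment instead and may aggregate across graphs differently.

```python
import math


def normalize(points, x_range, y_range):
    """Apply a min/max transform (derived from the answer key) to (x, y) points."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    # Guard against a zero span (all key points share one coordinate).
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    return [((x - x_min) / x_span, (y - y_min) / y_span) for x, y in points]


def score_graph(key_points, submitted_points, penalty=1.0):
    """Hypothetical per-graph RMSE; assumes key_points is non-empty."""
    xs = [p[0] for p in key_points]
    ys = [p[1] for p in key_points]
    x_range, y_range = (min(xs), max(xs)), (min(ys), max(ys))

    key = normalize(key_points, x_range, y_range)
    sub = normalize(submitted_points, x_range, y_range)

    # Assumption: greedy matching, closest remaining pair first.
    pairs = sorted(
        (math.dist(k, s), i, j)
        for i, k in enumerate(key)
        for j, s in enumerate(sub)
    )
    used_key, used_sub, errors = set(), set(), []
    for distance, i, j in pairs:
        if i in used_key or j in used_sub:
            continue
        used_key.add(i)
        used_sub.add(j)
        errors.append(distance)

    # Missing key points and extra submitted points each count as `penalty`.
    unmatched = (len(key) - len(used_key)) + (len(sub) - len(used_sub))
    errors.extend([penalty] * unmatched)

    return math.sqrt(sum(e * e for e in errors) / len(errors))


key = [(0, 0), (10, 100)]
print(score_graph(key, [(0, 0), (10, 100)]))  # exact match
print(score_graph(key, [(0, 0)]))             # one missing point
```

Under these assumptions an exact match scores 0, and each missing or extra point adds one error of 1 to the per-graph RMSE. The legend label (z) is ignored, matching the statement that it is not explicitly scored.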
Thank you for an interesting challenge. I noticed there are previous competitions available. Would it be possible to open source (make public) the winning solutions of those previous challenges?