Welcome to the Find a Legend Challenge! This is the third in our series of challenges around getting the most out of legacy crossplots. Please ask, share, and conjecture on this discussion board about how to tackle this problem. Best of luck to you!
Can I annotate the challenge’s training dataset, or at least part of it?
Yes, you can add additional annotations. However, please make sure you include these data if you are asked to supply a final submission, as we need to be able to repeat your training steps.
How freely can I use open-source models? There are many different tasks involved, from cleaning the data to the final text prediction, and many pretrained models are available. Do I have to build my own models and train them on my own annotated data? What are the rules on external data and on using pretrained and non-pretrained models?
… I still need an answer to that. There are many public OCR datasets and models, and similarly there are pretrained weights for the ImageNet and COCO datasets that are used heavily in image classification and object detection tasks. I want to confirm that all of these can be used. On Kaggle, where I normally compete, most external datasets and models are allowed, but sometimes Kaggle states that only datasets and models with specific license permissions may be used. Please provide this information, as transfer learning will be extremely valuable here given the very low number of samples in this dataset.
You are welcome to use whatever model you wish, as long as it can be called from a Python environment and you have a license that permits sharing the model (i.e., open-source). As for annotating your data: you may use these annotations, but you must provide them to Xeek if you are nominated as a finalist for the competition.
Should the predictions be case-sensitive?