Dear all @here,
The lithology prediction competition is nearing its end, and it will soon be time to declare a winner. As stated in the rules, the final score will be determined using a hidden test dataset, not the test data used for the leaderboard. Because the competition at the top of the leaderboard is so close, we have decided to invite the top 20 teams (rather than the top 10) to submit their models for final scoring. An email will be sent to the teams that are in the top 20 at the end of the competition, but since there have been some questions about the requirements for a submission, here are some more details:
- The submission should be Python code that takes as input a dataframe in exactly the same format as the open test data (test.csv) and returns a list of predictions (as in the leaderboard submission), one for every row in the input dataframe
- Any pre-processing and feature engineering must be included in the code
- Any persisted data, such as model weights, scalers, etc., can be included as files and must be loaded by the Python code
- The model must make a prediction for every row of the test data
- Python dependencies should be included in the form of a requirements.txt, pipenv file, conda environment file, or equivalent
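To make the points above concrete, here is a minimal sketch of what such a submission could look like. It is only an illustration, not the starter notebook's template: the column name `GR`, the file name `artifacts.pkl`, the function names, and the threshold "model" are all hypothetical placeholders standing in for your own features, persisted files, and trained model.

```python
import pickle

import pandas as pd

# Hypothetical placeholders: swap in your own feature columns and artifact file.
FEATURE_COLUMNS = ["GR"]
ARTIFACT_PATH = "artifacts.pkl"


def load_artifacts(path):
    """Load persisted data (model weights, scalers, fill values, ...) from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def preprocess(df, fill_values):
    """All pre-processing and feature engineering happens inside the submission."""
    # Select the feature columns; any column missing from the input becomes NaN.
    features = df.reindex(columns=FEATURE_COLUMNS).astype(float)
    # Fill gaps so that no row has to be dropped before prediction.
    return features.fillna(value=fill_values)


def predict(df, artifact_path=ARTIFACT_PATH):
    """Return a list with exactly one prediction per row of the input dataframe."""
    artifacts = load_artifacts(artifact_path)
    features = preprocess(df, artifacts["fill_values"])

    # Stand-in for a real model: a single threshold on one column.
    above = features["GR"] > artifacts["threshold"]
    predictions = [artifacts["labels"][int(flag)] for flag in above]

    # Guard against silently dropping rows during pre-processing.
    if len(predictions) != len(df):
        raise ValueError("expected one prediction per input row")
    return predictions
```

With a layout like this, scoring only needs something along the lines of `predict(pd.read_csv("test.csv"))`, and the length check catches the common mistake of dropping rows with missing values during pre-processing.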
The starter notebook contains an example of how a model can be packaged for prediction. You do not have to follow that template as long as the points above are satisfied. Let me know if anything is unclear, and good luck!