Baselines
We offer several baselines and, more importantly, scripts to submit results in the iterative way required by the competition. Now that the competition is over, an off-line version of the data and the baselines is also available.
Python API
A Python API is available here to help you submit your results. You need to modify it by providing your user number, the path to the competition samples, the problem number, and the name of your submission (which must be different for each of your submissions on a problem). The only parts then missing are the one that learns the model and the one that ranks the possible next symbols given a prefix and a learned model.
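The two missing parts can be sketched as follows. This is only an illustration, not the code of the submission script: the function names (parse_sequences, learn, rank_next_symbols), the -1 end-of-sequence marker, and the default ranking length k=5 are assumptions made for the example. The parser assumes the PAutomaC-style layout, where the first line gives the number of sequences and the alphabet size, and each following line gives a length followed by that many integer symbols. The placeholder model simply ranks symbols by overall frequency and ignores the prefix.

```python
from collections import Counter

END = -1  # assumed marker meaning "the sequence ends here"


def parse_sequences(text):
    """Parse sequences from text in the assumed PAutomaC-style format.

    First line: "<number of sequences> <alphabet size>".
    Each following line: "<length> <symbol_1> ... <symbol_length>".
    """
    lines = [line.split() for line in text.strip().splitlines()]
    alphabet_size = int(lines[0][1])
    # Drop the leading length field of every sequence line.
    sequences = [[int(tok) for tok in line[1:]] for line in lines[1:]]
    return sequences, alphabet_size


def learn(sequences):
    """Placeholder 'model': how often each symbol (and END) occurs."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
        counts[END] += 1
    return counts


def rank_next_symbols(model, prefix, k=5):
    """Return up to k symbols, most frequent first.

    This placeholder ignores the prefix; a real model would not.
    Ties are broken by symbol value so the output is deterministic.
    """
    return sorted(model, key=lambda sym: (-model[sym], sym))[:k]


SAMPLE = """3 3
3 0 1 0
3 0 2 0
1 1
"""

sequences, alphabet_size = parse_sequences(SAMPLE)
model = learn(sequences)
ranking = rank_next_symbols(model, prefix=[0, 1])
print(ranking)
```

Replacing learn and rank_next_symbols with a real model (spectral, n-gram, neural, ...) keeps the rest of the pipeline unchanged.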
Spectral Learning Baseline
A spectral learning baseline in Python was developed for the SPiCe competition. Here are the steps to use it:
- You first need to get the Sp2Learning (for SPiCe Spectral Learning) package. The easiest way is to use pip: just type
pip install Sp2Learning
in a terminal.
- You can now directly import the modules in your Python code: how cool is that? We even provide a Python program to compete using it: submission_spectral.py. All you need to do is add your user number and the path on your system to the train and test files. Of course, you can modify the parameters of the learning function (we do not claim that the ones we used to obtain the scores of this baseline on the problems are the best possible).
Of course, there is documentation for this toolbox, and you can directly download the sources (it is released under the FreeBSD license, so feel free to use it for whatever you want).
If you do not know about spectral learning of weighted automata, this paper gives the main ideas to understand what the baseline is doing. You can also look at this tutorial.
3-gram
A simple baseline that computes 3-gram statistics on the train file and uses them to rank the possible next symbols is available here.
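For readers who want to see the idea before downloading the script, here is a minimal sketch of a 3-gram ranker. It is not the baseline's actual code: the function names, the START and END markers, and the ranking length k are assumptions chosen for the illustration, and the sketch does no smoothing or back-off to shorter contexts.

```python
from collections import Counter, defaultdict

START, END = -2, -1  # hypothetical padding and end-of-sequence markers


def train_trigram(sequences):
    """Count, for every pair of consecutive symbols, what follows it."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pad so the first symbols also have a two-symbol context,
        # and append END so that stopping is counted as an event.
        padded = [START, START] + list(seq) + [END]
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            counts[context][padded[i]] += 1
    return counts


def rank_next(counts, prefix, k=5):
    """Rank up to k next symbols given the last two symbols of the prefix.

    Returns an empty list for a context never seen in training
    (a real baseline would back off to a shorter context).
    """
    padded = [START, START] + list(prefix)
    context = (padded[-2], padded[-1])
    following = counts.get(context, Counter())
    # Most frequent first; ties broken by symbol value.
    return sorted(following, key=lambda sym: (-following[sym], sym))[:k]


train = [[0, 1, 0], [0, 1, 2], [0, 1, 2], [1]]
trigram_counts = train_trigram(train)
print(rank_next(trigram_counts, [0, 1]))  # symbols seen after "0 1"
print(rank_next(trigram_counts, []))      # symbols seen at sequence start
```

Here the context "0 1" was followed twice by 2 and once by 0, so 2 is ranked first; END can appear in a ranking whenever training sequences stopped after the given context.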
Other Baselines
You will find a lot of toolboxes on the web that can be used for the competition. For instance, if you plan to compete using a deep learning or a Bayesian approach, we are confident you know which toolbox to use and how to modify it to obtain the required rankings.
We just want to point out this page, which contains links to some toolboxes developed in the context of grammar learning. In particular, Mans Hulden coded the algorithms used by the best participants in the PAutomaC competition and made them available here. As the train file format was the same as for SPiCe, it should be easy to use this work.