Python implementation of a Shift-Reduce semantic parser: http://ceur-ws.org/Vol-1180/CLEF2014wn-QA-XuEt2014.pdf
Run each script with the -h parameter to see a list of its required parameters. These usually consist of the path to the data files, the training/testing mode (trn/tst), the size of the dataset and the operating mode. Scripts that train a model usually also take the number of iterations. The operating modes of each script are described below. Words in brackets in file names are variable parameters. Files not mentioned here are imported by other scripts; their main methods exist only for testing. All scripts require the Free917 dataset with its training and testing split.
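The parameter layout described above could look roughly like the following argparse sketch. This is only an illustration: the argument names (`fpath`, `mode`, `size`, `operation`, `--iterations`) are assumptions, and the actual scripts may name and order their parameters differently, so check each script's -h output.

```python
import argparse


def build_parser():
    # Hypothetical parser: names and order are illustrative, not the scripts' real ones.
    parser = argparse.ArgumentParser(
        description="Example of the parameter layout described in this README"
    )
    parser.add_argument("fpath", help="path to the data files")
    parser.add_argument("mode", choices=["trn", "tst"], help="training or testing split")
    parser.add_argument("size", type=int, help="number of questions in the dataset")
    parser.add_argument("operation", help="operating mode, a single letter such as l, t or e")
    parser.add_argument(
        "-i", "--iterations", type=int, default=None,
        help="number of training iterations (only for modes that train a model)",
    )
    return parser


# argparse adds the -h/--help option automatically.
args = build_parser().parse_args(["data/", "trn", "641", "t", "-i", "10"])
print(args.mode, args.size, args.operation, args.iterations)
```

The sizes 641 and 276 used in this README are the training and testing set sizes that appear in the file names below.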
- l – annotates all words in the dataset with phrase detection labels; creates labels_(trn/tst)_(size).pickle and questions_(trn/tst)_(size).pickle files.
- b – labels all words in the dataset using bootstrapping; creates labels_(trn/tst)_(size).pickle and questions_(trn/tst)_(size).pickle files.
- p – tags all questions in the dataset with part-of-speech tags; creates the pos_tagged_(trn/tst).pickle file.
- n – tags all questions in the dataset with NER tags; creates the ner_tagged_(trn/tst).pickle file.
- i – creates features for phrase detection for all questions in the dataset. Requires pos_tagged_*.pickle, ner_tagged_*.pickle and questions_(trn/tst)_(size).pickle files; creates the phrase_detect_features_(trn/tst)_(size)_arr.pickle file.
- l – creates training examples for phrase detection model training. Requires labels_(trn/tst)_(size).pickle and phrase_detect_features_(trn/tst)_(size).pickle files; creates phr_detect_examples_(trn/tst)_(size).pickle and empty_weights_(trn/tst)_(size).pickle files.
- t – trains phrase detection model. Requires phr_detect_examples_(trn/tst)_(size).pickle and empty_weights_(trn/tst)_(size).pickle files; creates w_(size)_(iterations).pickle file.
- e – computes error of a model on a testing set. Requires w_641_(iterations).pickle, labels_tst_276.pickle, questions_tst_276.pickle and phrase_detect_features_tst_276_arr.pickle files.
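The file names above follow a prefix_(trn/tst)_(size).pickle pattern. A minimal sketch of building such names and round-tripping a pickle file is shown below. The helpers `pickle_name`, `save` and `load` are hypothetical and not part of the repository, and the stored value is a placeholder, since the real structure of these files is not described here.

```python
import os
import pickle
import tempfile


def pickle_name(prefix, mode, size=None, suffix=""):
    """Build a name such as labels_trn_641.pickle or pos_tagged_tst.pickle."""
    parts = [prefix, mode]
    if size is not None:
        parts.append(str(size))
    return "_".join(parts) + suffix + ".pickle"


def save(obj, directory, name):
    # Pickle files must be opened in binary mode.
    with open(os.path.join(directory, name), "wb") as handle:
        pickle.dump(obj, handle)


def load(directory, name):
    with open(os.path.join(directory, name), "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as workdir:
    name = pickle_name("labels", "trn", 641)
    # Placeholder content only; not the real label format.
    save(["placeholder"], workdir, name)
    restored = load(workdir, name)

print(name)
print(pickle_name("phrase_detect_features", "tst", 276, suffix="_arr"))
```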
Requires labels_(trn/tst)_(size).pickle, questions_(trn/tst)_(size).pickle and pos_tagged_(trn/tst).pickle files
- c – creates training examples for shift-reduce model training. Requires gold_dags_(trn/tst)_(size).pickle file; creates dag_examples_(trn/tst)_(size).pickle, gold_sequences_(trn/tst)_(size).pickle and empty_weights_dag_(trn/tst)_(size).pickle files.
- t – trains shift-reduce model. Requires dag_examples_(trn/tst)_(size).pickle and empty_weights_dag_(trn/tst)_(size).pickle files; creates w_dag(size)_(iterations).pickle file.
- b – computes error of a model on a testing set. Requires w_dag641_(iterations).pickle, gold_dags_tst_276.pickle and gold_sequences_tst_276.pickle files.
Requires questions_(trn/tst)_(size).pickle
- e – obtain candidates for entity linking through Google Freebase API; creates candidates_(trn/tst)_(size).pickle file.
- g – obtain correct entities for linking. Requires candidates_(trn/tst)_(size).pickle and query_gold_ent_(trn/tst).pickle files; creates gold_entities_(trn/tst)_(size).pickle file.
- f – construct features for entity linking. Requires gold_entities_(trn/tst)_(size).pickle and candidates_(trn/tst)_(size).pickle files; creates candidates_features_(trn/tst)_(size).pickle and ent_labels_(trn/tst)_(size).pickle files.
- t – train model for entity linking. Requires candidates_features_(trn+tst)_(size).pickle and ent_labels_(trn+tst)_(size).pickle files (4 total); creates ent_lr_trn_641.pickle file.
- r – construct features for relation linking and train model. Requires query_gold_rel_trn.pickle file; creates relation_lr_trn_641.pickle file.
- u – evaluate model for relation linking. Requires relation_lr_trn_641.pickle, query_gold_rel_tst.pickle and rel_dict.pickle files.
- l – construct features for edge linking. Requires query_gold_edges_(trn/tst).pickle and gold_dags_(trn+tst)_(size).pickle files. Creates edge_features_(trn/tst).pickle and edge_labels_(trn/tst).pickle files.
- d – train model for edge linking. Requires edge_features_(trn+tst).pickle and edge_labels_(trn+tst).pickle files (4 total); creates edge_lr_trn.pickle file.
- a – link all questions to KB. Requires all linking models and candidates_(trn/tst)_(size).pickle file.
- q – parse logic formulas to linked DAGs; creates query_gold_rel_(trn/tst).pickle, query_gold_ent_(trn/tst).pickle, query_gold_dags_(trn/tst).pickle and query_gold_edges_(trn/tst).pickle files.
- c – create vocabularies for edges and relations
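Because several of the linking modes above consume files that other modes create, they have to be run in a compatible order. The sketch below derives one such order from the requires/creates lists in this section. It is an illustration and not a script from the repository: file names are shortened (the trn/tst, size and .pickle parts are dropped), "all linking models" is taken to mean the three model files created by modes t, r and d, and mode c is left out because its output files are not named here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# mode -> (files required, files created), transcribed from the list above
MODES = {
    "q": ([], ["query_gold_rel", "query_gold_ent", "query_gold_dags", "query_gold_edges"]),
    "e": (["questions"], ["candidates"]),
    "g": (["candidates", "query_gold_ent"], ["gold_entities"]),
    "f": (["gold_entities", "candidates"], ["candidates_features", "ent_labels"]),
    "t": (["candidates_features", "ent_labels"], ["ent_lr"]),
    "r": (["query_gold_rel"], ["relation_lr"]),
    "u": (["relation_lr", "query_gold_rel", "rel_dict"], []),
    "l": (["query_gold_edges", "gold_dags"], ["edge_features", "edge_labels"]),
    "d": (["edge_features", "edge_labels"], ["edge_lr"]),
    "a": (["ent_lr", "relation_lr", "edge_lr", "candidates"], []),
}


def run_order(modes):
    """Return the modes in an order where every producer runs before its consumers."""
    producer = {name: mode for mode, (_, created) in modes.items() for name in created}
    # Files without a producer in this table (e.g. questions) come from other scripts.
    graph = {
        mode: {producer[name] for name in required if name in producer}
        for mode, (required, _) in modes.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = run_order(MODES)
print(" -> ".join(order))
```

Several valid orders exist; the only guarantee is that a mode appears after every mode whose output it needs.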
Requires all models, dictionaries, free917_(trn/tst)_answers.txt and pos_tagged_*.pickle files
- i – answers questions input by user
- f – answers questions from file
- a – answers questions from dataset and evaluates them on gold standard answers
- https://github.com/pks/rebol - provides the answers for the Free917 questions