1. Single-word verbs (lexical units) data files:

   swv_T.pkl: contains 79584 records in 8 columns. The column 'masked_sent' holds the sentence with the target word highlighted by double underscores, e.g. __targetword__

   swv_gold_dataset.pkl: file with a format similar to swv_T.pkl, but containing the gold clusters. The column 'gold_clusters_processed' should be used for evaluation.

   The following two files contain the indices that split the swv data file into dev and test sets (as used in the original ELMo experiments):
   swv_gold_dataset_dev_split.json
   swv_gold_dataset_test_split.json

2. Single-word frame roles data files:

   swr_T.pkl: file for the experiment, contains 191252 records in 8 columns. The column 'masked_sent' holds the sentence with the target role highlighted by double underscores, e.g. __targetrole__

   swr_gold_dataset.pkl: contains the gold clusters. The column 'gold_cluster_patternlemmatized' should be used for evaluation.

   The following two files contain the indices that split the swr data file into dev and test sets (as used in the original ELMo experiments):
   swr_gold_dataset_dev_split.json
   swr_gold_dataset_test_split.json

3. Vocabulary file for verbs:

   verbs_list.txt: list of verbs used to filter lexical_unit predictions during the postprocessing step. Use this file for the vocabulary_path variable in run_postprocessing_predictions.py.
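
The files described above could be read along the following lines. This is only a minimal sketch, not part of the released code. The function names (extract_target, load_split_indices, load_split) are hypothetical. It assumes that the .pkl files are pickled pandas DataFrames, that each split file is a flat JSON list of integer row indices, and that those indices are positional; none of this is confirmed by the description above, so check the files before relying on it.

```python
import json
import re

# Matches a target wrapped in double underscores, e.g. "__ran__".
TARGET_PATTERN = re.compile(r"__(.+?)__")


def extract_target(masked_sent):
    """Return the text between the double underscores, or None if no marker is found."""
    match = TARGET_PATTERN.search(masked_sent)
    return match.group(1) if match else None


def load_split_indices(path):
    """Read a split file, ASSUMING it holds a flat JSON list of integer row indices."""
    with open(path, encoding="utf-8") as handle:
        return [int(i) for i in json.load(handle)]


def load_split(pickle_path, split_path):
    """Return the rows of a gold dataset that belong to one split.

    ASSUMES the .pkl file is a pickled pandas DataFrame and that the indices
    are positional; use .loc instead of .iloc if they are index labels.
    """
    import pandas as pd  # imported here so the helpers above work without pandas

    frame = pd.read_pickle(pickle_path)
    return frame.iloc[load_split_indices(split_path)]
```

Under these assumptions, load_split("swv_gold_dataset.pkl", "swv_gold_dataset_dev_split.json") would give the dev portion of the verb data, whose 'gold_clusters_processed' column is the one named above for evaluation. The same call with the swr files and the 'gold_cluster_patternlemmatized' column would cover the frame roles.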