Probabilistic Context-Free Grammar Parser
Objective
Build a PCFG parser and increase its baseline accuracy.
What I did
- Built a PCFG parser by implementing the CKY algorithm and training it on the ATIS portion of the Penn Treebank.
- Improved the baseline F1 score by 5 percentage points using the following methods:
  - Horizontal Markovization
  - Vertical Markovization
  - NP tag splitting
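
To illustrate the CKY step mentioned in the first bullet, here is a minimal sketch of probabilistic CKY over a grammar that is already binarized. It is not the project's actual code: the function name `cky_parse`, the dictionary formats for `lexicon` and `binary_rules`, and the toy grammar are all assumptions made for this example, and unary rules are left out to keep it short.

```python
import math
from collections import defaultdict


def cky_parse(words, lexicon, binary_rules, start="S"):
    """Return (log_prob, tree) for the most probable parse, or None.

    lexicon:      word -> {tag: P(word | tag)}            (assumed format)
    binary_rules: (parent, left, right) -> P(rule | parent) (assumed format)
    """
    n = len(words)
    # chart[(i, j)] maps a label to (log_prob, backpointer) for words[i:j].
    chart = defaultdict(dict)

    # Width-1 spans: fill in part-of-speech tags from the lexicon.
    for i, word in enumerate(words):
        for tag, prob in lexicon.get(word, {}).items():
            chart[(i, i + 1)][tag] = (math.log(prob), word)

    # Wider spans: try every split point and every binary rule,
    # keeping only the best-scoring analysis per label.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left_cell, right_cell = chart[(i, k)], chart[(k, j)]
                for (parent, left, right), prob in binary_rules.items():
                    if left in left_cell and right in right_cell:
                        score = (math.log(prob)
                                 + left_cell[left][0]
                                 + right_cell[right][0])
                        if parent not in cell or score > cell[parent][0]:
                            cell[parent] = (score, (k, left, right))

    if start not in chart[(0, n)]:
        return None

    def build(i, j, label):
        # Follow backpointers to rebuild the tree as nested tuples.
        _, back = chart[(i, j)][label]
        if isinstance(back, str):
            return (label, back)
        k, left, right = back
        return (label, build(i, k, left), build(k, j, right))

    return chart[(0, n)][start][0], build(0, n, start)


# Toy grammar, invented for illustration only.
lexicon = {"list": {"VB": 1.0}, "the": {"DT": 1.0}, "flights": {"NNS": 1.0}}
binary_rules = {("S", "VB", "NP"): 1.0, ("NP", "DT", "NNS"): 0.5}
result = cky_parse(["list", "the", "flights"], lexicon, binary_rules)
```

Scores are kept as log probabilities so that long sentences do not underflow. Looping over every rule at every split point is the simplest formulation; indexing rules by their child labels would be the usual speed-up.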
Results
- The best precision (98.6%) was observed when I used Horizontal and Vertical Markovization together, but because the resulting grammar was very strict, recall was low and the F1 score came down to 87.6%.
- The best F1 score (89.6%) was observed when I used Horizontal Markovization with binarization. This configuration gave a precision of 94.6%.
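
The Markovization settings compared in these results can be illustrated with a minimal sketch on trees stored as nested tuples. The source does not state which Markov orders or label formats were used, so the function names, the `^` and `@...|` label conventions, the default order `h=1`, and the example tree are all assumptions. Vertical Markovization is shown as parent annotation, which splits one label into several more specific ones (for example `NP^S`); this is the sense in which the grammar becomes stricter.

```python
def is_preterminal(tree):
    # A preterminal is (tag, word), e.g. ("DT", "the").
    return len(tree) == 2 and isinstance(tree[1], str)


def parent_annotate(tree, parent=None):
    """Vertical Markovization (order 2): append the parent label to each
    nonterminal. POS tags are left unsplit in this sketch."""
    if is_preterminal(tree):
        return tree
    label = tree[0]
    new_label = f"{label}^{parent}" if parent else label
    return (new_label,) + tuple(parent_annotate(c, label) for c in tree[1:])


def binarize(tree, h=1):
    """Right-binarize a tree. Each intermediate symbol remembers only the
    last h sibling labels already generated (horizontal Markovization)."""
    if is_preterminal(tree):
        return tree
    label = tree[0]
    children = [binarize(c, h) for c in tree[1:]]
    if len(children) <= 2:
        return (label,) + tuple(children)

    def build(idx):
        # Returns the node covering children[idx:].
        if idx == len(children) - 1:
            return children[idx]
        history = [c[0] for c in children[max(0, idx - h):idx]]
        new_label = f"@{label}|" + "_".join(history)
        return (new_label, children[idx], build(idx + 1))

    return (label, children[0], build(1))


# Hypothetical example tree, not taken from the treebank.
np_tree = ("NP", ("DT", "the"), ("JJ", "early"),
           ("NN", "morning"), ("NNS", "flights"))
sentence = ("S", np_tree, ("VP", ("VBP", "leave")))

binarized_h1 = binarize(np_tree, h=1)
binarized_h2 = binarize(np_tree, h=2)
annotated = parent_annotate(sentence)
```

With a smaller `h`, intermediate symbols from different rules collapse into the same label, so the grammar generalizes to child sequences it has not seen. A larger `h` or added parent annotation keeps more context in each label, which makes rules more specific and is consistent with the higher-precision, lower-recall pattern reported above.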
Technologies Used: