Description

Implements the generate() method for NLTK’s probabilistic context-free grammar (PCFG), so that valid sentences can be generated according to the probabilities of the grammar's rules. (NLTK stands for Natural Language Toolkit.)

Installation

The source code for pcfg is hosted on GitHub.

pip install pcfg

Example usage

A PCFG can be initialized in the same way that an NLTK probabilistic context-free grammar is initialized:

>>> from pcfg import PCFG
>>> grammar = PCFG.fromstring("""
... S -> Subject Action [1.0]
... Subject -> "a cow" [0.7] | "some guy" [0.1] | "the woman" [0.2]
... Action -> "eats lunch" [0.5] | "was here" [0.5]
... """)
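
NLTK rejects a probabilistic grammar whose alternatives for a left-hand side do not sum to 1, so it can help to check the numbers before building the grammar. The sketch below is only an illustration: it mirrors the grammar above in a plain dictionary, and the names RULES and unnormalized_symbols are hypothetical, not part of pcfg or NLTK.

```python
import math

# Hypothetical plain-dict mirror of the grammar above
# (an assumption for illustration, not the pcfg package's data model).
RULES = {
    "S": [("Subject Action", 1.0)],
    "Subject": [("a cow", 0.7), ("some guy", 0.1), ("the woman", 0.2)],
    "Action": [("eats lunch", 0.5), ("was here", 0.5)],
}


def unnormalized_symbols(rules, tolerance=1e-9):
    """Return the left-hand sides whose probabilities do not sum to 1."""
    return [
        lhs
        for lhs, alternatives in rules.items()
        if not math.isclose(
            sum(prob for _, prob in alternatives), 1.0, abs_tol=tolerance
        )
    ]
```

An empty list means every left-hand side is normalized within the tolerance.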

To generate sentences, call the generate() method with the number of sentences you want:

>>> for sentence in grammar.generate(3):
...     print(sentence)

The output could be the following:

the woman eats lunch
the woman was here
a cow was here

Of course, your output may be different because the sentences are generated probabilistically.
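
To make the idea of probabilistic generation concrete, here is a minimal standard-library sketch of weighted top-down expansion. It is not the pcfg package's implementation: the dictionary layout and the names GRAMMAR, sample, and generate are assumptions chosen for illustration.

```python
import random

# Hypothetical dict form: nonterminal -> list of (expansion, probability).
# Each expansion is a tuple of symbols; symbols that are not keys of the
# dictionary are treated as terminals.
GRAMMAR = {
    "S": [(("Subject", "Action"), 1.0)],
    "Subject": [(("a cow",), 0.7), (("some guy",), 0.1), (("the woman",), 0.2)],
    "Action": [(("eats lunch",), 0.5), (("was here",), 0.5)],
}


def sample(symbol, grammar, rng):
    """Recursively expand one symbol into a list of terminal strings."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit as-is
    expansions = [rhs for rhs, _ in grammar[symbol]]
    weights = [prob for _, prob in grammar[symbol]]
    # Pick one expansion, weighted by its rule probability.
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    words = []
    for child in chosen:
        words.extend(sample(child, grammar, rng))
    return words


def generate(n, grammar=GRAMMAR, start="S", seed=None):
    """Yield n sentences sampled from the grammar's start symbol."""
    rng = random.Random(seed)
    for _ in range(n):
        yield " ".join(sample(start, grammar, rng))
```

Passing a fixed seed makes a run repeatable, which is convenient in tests; with seed=None each run may produce different sentences.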

License

MIT