Wikipoet is a program I wrote in Python that generates simple, iterative poems using the raw text from Wikipedia articles (retrieved via the MediaWiki API) and NLTK.
Wikipoet begins with a single word and uses the Wikipedia article for that word to find likely noun and adjective combinations. It then prints a stanza with the following structure:
[word]
[adjective] [word]
[adjective], [adjective] [word]
[adjective], [adjective], [adjective] [word]
[adjective], [adjective], [adjective], [adjective] [word]
[word] [noun], [word] [noun], [word] [noun]
[word] [noun], [word] [noun], [word] [noun]
[word] [noun / new word]
[new word]
Wikipoet then repeats this operation for the new word. The stanzas can continue indefinitely.
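The layout above can be sketched as a small function. This is a minimal illustration rather than Wikipoet's actual code: the name format_stanza is hypothetical, and it assumes the adjectives and nouns have already been chosen (10 adjectives and 7 nouns per stanza). It leaves out the closing [new word] line, because in the example poem that line also serves as the opening line of the next stanza.

```python
def format_stanza(word, adjectives, nouns):
    """Lay out one stanza, minus the closing [new word] line.

    Hypothetical helper: expects at least 10 adjectives (1 + 2 + 3 + 4)
    and 7 nouns (3 + 3 + 1). The seventh noun becomes the next word.
    """
    if len(adjectives) < 10 or len(nouns) < 7:
        raise ValueError("need at least 10 adjectives and 7 nouns")

    lines = [word]

    # Four adjective lines, growing from one adjective to four.
    start = 0
    for count in range(1, 5):
        group = adjectives[start:start + count]
        lines.append(", ".join(group) + " " + word)
        start += count

    # Two lines of three "[word] [noun]" pairs each.
    for row in (nouns[0:3], nouns[3:6]):
        lines.append(", ".join(word + " " + noun for noun in row))

    # The final pair introduces the word for the next stanza.
    new_word = nouns[6]
    lines.append(word + " " + new_word)
    return lines, new_word


# Usage: rebuild one stanza from word lists picked by hand.
stanza, next_word = format_stanza(
    "job",
    ["principal", "risky", "creative", "critical", "national",
     "many", "lowly", "steady", "poor", "primary"],
    ["satisfaction", "reference", "preparation",
     "system", "look", "retention", "want"],
)
print("\n".join(stanza))
```

Calling the function again with next_word as the new word would continue the poem.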
Here’s an example:
computer
former computer
flash, military computer
many, full, best computer
all, later, more, earlier computer
computer design, computer help, computer say
computer reference, computer voice, computer central processing unit
computer job
job
principal job
risky, creative job
critical, national, many job
lowly, steady, poor, primary job
job satisfaction, job reference, job preparation
job system, job look, job retention
job want
want
noble want
four, like want
more, strong, most want
human, some, american, many want
want can, want production, want protection
want level, want story, want item
want character
character
classical character
novel, new character
other, written, first character
greek, various, practical, set character
character construction, character actor, character words
character see, character page, character volume
character pick
pick
game pick
original, american pick
used, all, first pick
bay, star, early, specific pick
pick brand, pick use, pick set
pick title, pick people, pick peter
pick page
page
side page
modern, all page
other, past, early page
south, worldwide, beginning, electronic page
page format, page declaration, page band
page technology, page business, page address
page stop
stop
three stop
full, former stop
total, black, used stop
top, safe, international, white stop
stop code, stop nation, stop destruction
stop period, stop frank, stop part
stop closure
closure
prompt closure
epistemic, tight closure
early, short, social closure
transitive, deductive, other, cognitive closure
closure operator, closure process, closure rule
closure operation, closure law, closure map
closure series
series
kind series
systematic, sequential series
geologic, former, odd series
world, fixed, ordered, funny series
series flora, series movie, series sequence
series tone, series world, series step
series year
year
actual year
received, minor year
mass, cultural, done year
scheduled, united, martian, keen year
year consultation, year master, year trend
year personal, year level, year lord
year high
Depending on the length of the poem desired and the speed of one’s internet connection, Wikipoet can take a relatively long time to produce its output. The poem above took approximately 30 minutes to produce with a standard broadband connection.
While creating Wikipoet, I realized that I could improve the quality of its adjective-noun pairings by first producing a set of candidate combinations, then searching Wikipedia for each combination as an exact phrase and removing those that return fewer than 10 search results.
Here is the code that does this using the MediaWiki API and the Python Requests library, where y is a list of adjectives and word is the current word:
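    # s, url, headers, and word are defined elsewhere in the program.
    # Iterate over a copy (y[:]) so that items can be removed from y safely.
    for i in y[:]:
        # Quote the phrase so that the search matches the exact pairing.
        search_string = "\"" + i + ' ' + word + "\""
        payload = {'action': 'query',
                   'list': 'search',
                   'format': 'json',
                   'srsearch': search_string,
                   'srlimit': 1,
                   'srprop': 'snippet',
                   'srwhat': 'text'}
        r = s.get(url, params=payload, headers=headers)
        json_obj = r.json()
        hits = int(json_obj['query']['searchinfo']['totalhits'])
        # Drop adjectives whose pairing appears fewer than 10 times.
        if hits < 10:
            y.remove(i)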
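The snippet above depends on names defined elsewhere in the program. For anyone who wants to try the idea in isolation, a self-contained variant might look something like the following sketch. The endpoint URL, the User-Agent string, and the function names count_hits and filter_adjectives are assumptions made for illustration, and the function returns a new list instead of modifying y in place.

```python
# Assumed endpoint: the public English Wikipedia API.
API_URL = "https://en.wikipedia.org/w/api.php"
# Placeholder User-Agent; replace it with one that identifies your script.
HEADERS = {"User-Agent": "wikipoet-sketch/0.1 (example)"}


def count_hits(phrase, session):
    """Return the total number of search hits for an exact phrase."""
    payload = {
        "action": "query",
        "list": "search",
        "format": "json",
        "srsearch": '"' + phrase + '"',  # quotes request an exact match
        "srlimit": 1,
        "srprop": "snippet",
        "srwhat": "text",
    }
    response = session.get(API_URL, params=payload, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return int(response.json()["query"]["searchinfo"]["totalhits"])


def filter_adjectives(word, adjectives, min_hits=10, session=None):
    """Keep only adjectives whose pairing with word has enough hits."""
    if session is None:
        # Imported here so the function can also be driven by a stand-in
        # session object that offers the same get() method.
        import requests
        session = requests.Session()
    return [
        adjective
        for adjective in adjectives
        if count_hits(adjective + " " + word, session) >= min_hits
    ]
```

Calling filter_adjectives("job", ["steady", "purple"]) would send one live search request per adjective, so a long list adds noticeably to the running time.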
The primary utility of Wikipoet is its ability to find meaningful adjectives to pair with nouns and nouns to pair with adjectives. I plan to integrate this process into future projects.
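As a rough illustration of that pairing step, here is a simplified sketch rather than the code Wikipoet runs. It assumes the article text has already been tokenized and tagged with Penn Treebank part-of-speech tags, for example with nltk.pos_tag(nltk.word_tokenize(text)), and it only counts adjectives that directly precede the word and nouns that directly follow it. The function name find_pairs is hypothetical.

```python
from collections import Counter


def find_pairs(tagged_tokens, word):
    """Count adjectives right before word and nouns right after it.

    tagged_tokens is a list of (token, tag) tuples using Penn Treebank
    tags, where adjective tags start with "JJ" and noun tags with "NN".
    """
    adjectives = Counter()
    nouns = Counter()
    # Walk over each adjacent pair of tagged tokens.
    for (prev_tok, prev_tag), (tok, tag) in zip(tagged_tokens, tagged_tokens[1:]):
        if tok.lower() == word and prev_tag.startswith("JJ"):
            adjectives[prev_tok.lower()] += 1
        if prev_tok.lower() == word and tag.startswith("NN"):
            nouns[tok.lower()] += 1
    return adjectives, nouns


# Usage with a hand-tagged sentence fragment.
tagged = [
    ("The", "DT"), ("first", "JJ"), ("computer", "NN"), ("design", "NN"),
    ("was", "VBD"), ("a", "DT"), ("military", "JJ"), ("computer", "NN"),
]
adjectives, nouns = find_pairs(tagged, "computer")
print(adjectives.most_common(), nouns.most_common())
```

Ranking candidates by count is only one possible choice; the search-hit filter shown earlier is what actually weeds out unlikely pairings.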