I must admit, I'm really struggling to wrap my head around this Topic Modeling business. The biggest question I haven't yet answered (and I do hope that some of this week's readings I haven't yet tackled address it) is what topic modeling might be able to offer on the level of an individual text, rather than on the scale of a whole archive/corpus/canon.
Looking at the various tools on the table (and trying my own experiment on The Sun Also Rises, as I happen to have it sitting around in .txt form after last week's word cloud experiment), I couldn't make heads or tails of this question. The Java TopicModelingTool Gabe dug up returned these results on my text:
topicId | words
1 | room wanted man read thing stood coming fight san friends
2 | time back big ll long morning money bar thought talking
3 | street crowd back bulls square turned door stairs white dark
4 | bull brett romero hand gave front place told close knew
5 | jake put ve table drink hell count drunk didn english
6 | mike cohn brett robert road made fine damned eat paris
7 | looked head face side water girl standing glass brought sleep
8 | bill don asked brett make nice montoya called matter boy
9 | people wine hotel town started trees bottle car stopped country
10 | good night bed sat sitting felt talk things great isn
Contractions seem to foil the work the program does (you can see "isn," "didn," "don," "ll," and "ve" appear here as...words?...topics? Even the terminology is still confusing me, and Matt Burton's piece, try as it might, only confused me more).
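One plausible explanation for those fragments, sketched in Python: if a tokenizer treats every non-letter character (apostrophes included) as a word boundary, "didn't" breaks into "didn" and "t" before any topics are computed. This is an assumption about how the tool behaves, not something confirmed from its documentation, and the function names and sample sentence below are made up for illustration.

```python
import re

def split_on_non_letters(text):
    # ASSUMED behavior of the tool: any non-letter character,
    # including an apostrophe, ends the current token.
    return re.findall(r"[a-z]+", text.lower())

def keep_contractions(text):
    # Alternative: allow a straight or curly apostrophe inside a token,
    # so contractions survive as single words.
    return re.findall(r"[a-z]+(?:['\u2019][a-z]+)*", text.lower())

# Hypothetical sample sentence, not a quotation from the novel.
sample = "I didn't think so. Isn't it nice? I've said we'll go."

naive = split_on_non_letters(sample)
kept = keep_contractions(sample)

print(naive)
print(kept)
```

Under the first function the contractions come apart into pieces like "didn," "isn," "ve," and "ll"; under the second they stay whole. If this is what is going on, cleaning the .txt file beforehand (or adding the fragments to a stopword list) might keep them out of the topics.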
Much more comprehensible was the work of Andrew Goldstone's DFR-Browser on PMLA--if I could input my texts and churn out something that looked like this:
I think I might be able to get my head around what's going on here!