A Picture Worth 140,000 Words

17 October 2008
2:01 PM

4 Comments

This week some classmates introduced me to Wordle, a site that creates amazing displays of words based on their frequency. In addition to producing beautiful results, the program is fun to use. You can adjust all the settings: choose different typefaces, word arrangements, and color schemes.

I decided to give it a whirl, but with my entire website. I used my blogging software to create a text file with everything I have written here—140,000 words—and I uploaded it for Wordle to process. Here are the major themes in all that text:

Word chart of my site produced by Wordle.

Once you start making wordles, it can be addicting. I kept looking for things I had written to put into the system. I uploaded my senior philosophy thesis, emails, and source code for computer programs. Get some text and try it out for yourself.

Comments

You are what you write about? Taking behaviorism into account, this is an interesting glimpse at your psyche :)

I see Roy, Emily, Caitlin.

What I don’t see is: Michael, Late Night, Diablo 2, Ping-Pong.

I guess you have decided the world does not want to hear about our long hours plugging through Act Five.

Michael Greenberg

on October 20, 2008 11:47 AM

Hey, how did you extract this info from your database…you’ve piqued my interest…

My blogging software is MovableType, which uses templates to let you publish your entries to any kind of file you want. I just created a template that listed all of my entries, stripped out the HTML, and saved it as a text file. Then I pasted it (yes, pasted 600k) into the dialog box at Wordle.

I know you use Drupal, but it still shouldn’t be difficult to export the text from your database. You could throw together a short PHP script that reads from your database directly and saves the text.

Start with this query (works with Drupal 6 databases; I don’t know if they changed the schema significantly from Drupal 5):

SELECT node.title, node_revisions.body FROM node LEFT JOIN node_revisions ON node.vid = node_revisions.vid WHERE node.type = 'story'

Your script just has to connect to the database (mysql_connect()), run the query (mysql_query()), read each row (mysql_fetch_object()), and append it to a file (file_put_contents() with the FILE_APPEND flag). You’ll probably also want to use PHP’s strip_tags() function to remove the HTML tags.
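If it helps to see the whole flow in one place, here is a rough sketch of those steps (run the query, read each row, strip the tags, write a text file). Note the assumptions: it is written in Python rather than PHP, it takes any DB-API style connection whose execute() returns rows (Python's sqlite3 works as a stand-in for a real Drupal MySQL database), and the names TagStripper, strip_tags, and export_text are made up for this example. Treat it as an illustration of the idea, not a tested Drupal exporter.

```python
import sqlite3
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are ignored.
        self.parts.append(data)


def strip_tags(html):
    """Hypothetical helper: return the HTML fragment with tags removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()
    return "".join(stripper.parts)


# Same query as above; table and column names follow the Drupal 6 schema
# mentioned in the comment.
QUERY = """
SELECT node.title, node_revisions.body
FROM node
LEFT JOIN node_revisions ON node.vid = node_revisions.vid
WHERE node.type = 'story'
"""


def export_text(conn, path):
    """Write each story's title and tag-free body to a text file.

    Assumes conn.execute() returns an iterable of (title, body) rows,
    as sqlite3 connections do. Returns the number of entries written.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for title, body in conn.execute(QUERY):
            out.write(title + "\n")
            # LEFT JOIN can yield a NULL body, so fall back to "".
            out.write(strip_tags(body or "") + "\n\n")
            count += 1
    return count
```

Calling export_text(conn, "site.txt") with an open connection would leave you with one plain-text file that you can paste into Wordle.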
