INFORMATION TECHNOLOGY
http://chronicle.com/infotech
Computers Track the Elusive Metaphor
BY LISA GUERNSEY
Aristotle was famous for his
love of metaphors and applauded writers who could
harness their power. Having command over metaphors could not be
taught or “imparted by another,” he
wrote. “It is the mark of genius.”
More than 2,000 years later, computers may not be able to master poetics like Aristotle, but they have become smart enough to know a metaphor when they see one.
An online database called The
Mind Is a Metaphor, created by Brad
Pasanek, an assistant professor of
English at the University of Virginia, is a searchable bank of phrases,
verses, and lines from literature that
encapsulate metaphors of the mind.
Mr. Pasanek hopes it will help literary and intellectual historians gain
insights into how people’s language
reflects their understanding of the
world around them.
But the database also has another role: to test how much a piece of
software can gather about language
with a little training from humans.
The answer, Mr. Pasanek said, is:
much more than he thought.
If Aristotle thought that metaphor could not be learned, “it was
nice to find that not only can it be
learned, it can be learned by a machine,” says D. Sculley, a computer
scientist in Pittsburgh who works
for Google and who, on his own,
has run projects on the database for
the past few years.
The database, which can be found
at http://mind.textdriven.com, has
more than 8,700 metaphors found in
18th-century literature. Type in the
keyword “rose,” for example, and
among the results you’ll get is a line
from Keats: One soul may be snared
and labyrinthed in another “like the
hid scent in an unbudded rose.”
Mr. Pasanek, a scholar of 18th-
century literature, started building
the database when he was a graduate
student at Stanford University. His
original goal was to document every
moment when a metaphor was used
to describe how the mind works, so
that he could uncover connections
between intellectual movements
and the way people used words to
describe their thinking.
Today, for example, people often liken brains to computers. We
get the download, need rebooting,
commit fatal errors. By contrast,
Mr. Pasanek says, 300 years ago, as
the modern novel was taking shape,
metaphors of the mind commonly
evoked paper and books: “It’s a very
literate moment. People were thinking of themselves as paper or blank
paper.”
Other scholars are similarly interested in metaphors as signals of
thought. George P. Lakoff, a linguist at the University of California at Berkeley and author of Moral
Politics, has written widely about
metaphors and theories of cognitive
science.
Another database, with about
1,100 examples from modern-day
usage, is run by John Barnden, a professor of artificial intelligence at the
University of Birmingham, in England. (It is on the Web at http://www.cs.bham.ac.uk/~jab/ATT-Meta/Databank/.)
INDEX CARDS AND PENCILS
Mr. Pasanek started with Shakespeare, Milton, and the King James
Version of the Bible, using index
cards and colored pencils to note
each metaphor he came across. Soon
an information-technology adviser
at Stanford suggested that he create
a database, with retrieval via keyword searching. “I’d start searching
for ‘mind’ and ‘blank slate’ within
100 words of each other,” the professor says.
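A search of that kind, two terms within a set number of words of each other, can be sketched in a few lines of Python. This is only an illustration: the function name `near`, the simple tokenizer, and the sample sentence are assumptions and are not taken from Mr. Pasanek's database or software.

```python
import re

def near(text, first, second, window=100):
    """Return True if `first` and `second` occur within `window` words of each other.

    Either term may be a multi-word phrase; matching ignores case.
    """
    words = re.findall(r"[a-z']+", text.lower())
    a, b = first.lower().split(), second.lower().split()

    def starts(phrase):
        # Word positions at which the phrase begins.
        n = len(phrase)
        return [i for i in range(len(words) - n + 1) if words[i:i + n] == phrase]

    return any(abs(i - j) <= window for i in starts(a) for j in starts(b))

# Invented sample sentence, not a line from the database.
sample = "The mind of a child is a blank slate, ready to be written upon."
print(near(sample, "mind", "blank slate"))            # True
print(near(sample, "mind", "blank slate", window=3))  # False
```

A check like this finds only passages that contain the exact search terms, which is the limitation that pattern-recognition software is meant to overcome.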
Eventually he found he could be
even more efficient by training a
computer program to recognize patterns in the way words came together. Instead of just plugging in keywords to get word-for-word matches,
Mr. Pasanek can train his software
to recognize what a particular type
of metaphor might look like (using, say, 100 examples) and then ask
it to search large text databases for
more.
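The article does not say which learning method the software uses. As a rough sketch of the general idea, training on labeled examples and then classifying new text, here is a small naive Bayes classifier in Python. The choice of naive Bayes, the class name `NaiveBayes`, and the six training sentences are all assumptions made for illustration; a real run would use far more examples, such as the 100 mentioned above.

```python
import math
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier with add-one smoothing."""

    def fit(self, examples):
        # examples: list of (text, label) pairs
        self.word_counts = {}
        self.label_counts = Counter()
        for text, label in examples:
            self.label_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior for this label
            for w in tokens(text):
                if w in self.vocab:  # skip words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training sentences, labeled by hand for this sketch.
training = [
    ("the mind is a blank sheet of paper", "metaphor"),
    ("her memory was a book with torn pages", "metaphor"),
    ("his thoughts were written on a clean slate", "metaphor"),
    ("the coach arrived in town at noon", "other"),
    ("she bought bread and cheese at the market", "other"),
    ("the rain fell on the road all night", "other"),
]
model = NaiveBayes().fit(training)
print(model.predict("the mind is a book of blank pages"))  # metaphor
print(model.predict("the coach arrived at the market"))    # other
```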
This process, known as machine
learning, was brought to his attention by Mr. Sculley, a schoolmate
from the sixth grade who happened
to catch up with him at a wedding.
“From my perspective,” says Mr.
Sculley, “the thought of doing this
by hand struck me as insane—in the
most admirable way.”
The two are now collaborators,
running machine-learning programs
to test software programs and theories about metaphors. They recently
wrote two articles about their work,
to be published by the online journal Literary and Linguistic Computing.
TOM COGILL FOR THE CHRONICLE
Metaphor-tracking software devised by Brad Pasanek, of the U. of Virginia,
has shown changes in how people think about the mind.
But for Mr. Pasanek, the database
is primarily a literary endeavor. He
likes to combine close readings of
a work of literature with data-mining explorations of bits and pieces
of hundreds of texts. “I find myself
intensely scrutinizing one metaphor
against a background of 400 metaphors,” he says.
He is also writing a dictionary of
metaphor, using the database to help
him store and retrieve examples. In other words, the database is a mind crutch.
Metaphorically speaking, that is.
Online-College Rankings
Use Distinct Formula
The Online Education Database,
or OEDb, released its third annual
ranking of online educational institutions in January, prompting announcements and news releases from
many of those that appeared near the
top of the list.
The Houston-based OEDb, a subsidiary of DomainDev, is a for-profit
company that makes money by referring visitors to the many online
colleges and universities that advertise with it.
But the annual rankings are
a completely separate service,
OEDb’s founder, Andy Hagans,
told The Chronicle. Mr. Hagans
conceived the idea when he noticed that there were no mainstream
rankings for online institutions.
Any perceived conflict of interest,
he said, should be dispelled by the
site’s thoroughly transparent methodology.
OEDb ranks the colleges according to eight separate metrics. Two
of the metrics—retention rate and
graduation rate—are among those
U.S. News & World Report uses to
formulate its yearly ranking of traditional colleges and universities.
Several others bear some similarity
to U.S. News metrics without being
identical: The OEDb rankings do
not use peer assessments of colleges,
but they do score how many times a
college’s Web site is linked to from
other colleges’ sites. The metric, referred to as “peer Web citations,”
appears to use those links as a proxy
for a college’s reputation among other institutions of higher learning.
The remaining OEDb metrics are
acceptance rate, student-faculty ratio, percentage of students receiving financial aid, scholarly citations
(how often outside scholars have
cited the institution’s research), and
how many years the institution has
been accredited by an agency recognized by the U.S. Education Department’s Office of Postsecondary
Education.
The most distinctive aspect of
OEDb’s formula is that it weights
all the metrics equally. For instance,
the quality of each college’s financial-aid program is given the same
consideration as how many times
that college is linked to from other
colleges’ Web sites.
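Equal weighting can be illustrated with a short Python sketch. It assumes, hypothetically, that each college is ranked on every metric and that the final order comes from the plain average of those ranks; the article does not spell out how OEDb combines its metrics. The college names and figures are invented, only four of the eight metrics are shown, and treating a lower student-faculty ratio as better is also an assumption.

```python
def equal_weight_ranking(scores, lower_is_better=()):
    """Order colleges by the unweighted mean of their per-metric ranks.

    Ties within a metric are broken by input order, which is arbitrary.
    """
    colleges = list(scores)
    metrics = list(next(iter(scores.values())))
    mean_rank = {c: 0.0 for c in colleges}
    for m in metrics:
        reverse = m not in lower_is_better  # higher is better unless listed
        ordered = sorted(colleges, key=lambda c: scores[c][m], reverse=reverse)
        for position, c in enumerate(ordered, start=1):
            mean_rank[c] += position / len(metrics)  # every metric counts the same
    return sorted(colleges, key=lambda c: mean_rank[c])

# Invented figures for three hypothetical colleges.
scores = {
    "College A": {"retention_rate": 0.80, "graduation_rate": 0.55,
                  "peer_web_citations": 120, "student_faculty_ratio": 25},
    "College B": {"retention_rate": 0.70, "graduation_rate": 0.60,
                  "peer_web_citations": 300, "student_faculty_ratio": 20},
    "College C": {"retention_rate": 0.65, "graduation_rate": 0.40,
                  "peer_web_citations": 90, "student_faculty_ratio": 15},
}
ranking = equal_weight_ranking(scores, lower_is_better={"student_faculty_ratio"})
print(ranking)  # ['College B', 'College A', 'College C']
```

In this made-up case College A leads on retention but still finishes second, because its lead on one metric counts no more than College B's lead on two others.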
Mr. Hagans said he decided to assign equal weight to all criteria because “any weighting we could give
them would be arbitrary.” He pointed out that visitors are able to view
the rankings by individual metric,
in effect allowing viewers to ascribe
their own weight to each one. “We
also publish all the raw data,” he
said, “so people could produce their
own rankings if they want.”
—Steve Kolowich
Number of Data Breaches
at Colleges Is Still Rising
The number of data-security incidents across the country rose 47
percent between 2007 and 2008, and
more educational institutions reported incidents last year than ever before, according to a new report from
the Identity Theft Resource Center,
a nonprofit organization.
The center identified 656 breaches in 2008, with 131 of them taking place at colleges or secondary
schools. In 2007 there were 446
breaches reported, with 111 of them
at schools or colleges.
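The percentages can be checked against the counts in the report. The short Python sketch below uses only the four figures cited above; the helper name `percent_change` is an assumption made for illustration.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

all_2007, all_2008 = 446, 656  # breaches reported in all sectors
edu_2007, edu_2008 = 111, 131  # breaches at schools or colleges

overall = round(percent_change(all_2007, all_2008))
education = round(percent_change(edu_2007, edu_2008))
print(overall)    # 47
print(education)  # 18
```

The overall count rose 47 percent, while the count at schools and colleges rose about 18 percent.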
“Some universities and colleges over the years have had multiple
breaches—that’s dismaying,” said
Linda Foley, the center’s founder.
“You would think after the first or
second breach that there would be a
change in policy and a change in the
IT procedures where this would not
continue to happen.”
Still, colleges are not suffering
any greater increase in breaches
than other sectors, according to the
report. —Jeffrey R. Young
Professors May Have to Pay
Taxes on College Laptops
Professors lucky enough to get
laptops from their institutions may
want to watch out—the taxman
could come knocking.
Manchester College has announced that employees with university-owned laptops will now
have to list those laptops on their tax
forms as taxable items. This means
a $1,600 laptop with a four-year life
expectancy would add $400 per year
to an employee’s taxable income,
wrote Michael Case, Manchester’s
information-technology director, in
a message sent to an e-mail list for
campus technology officials hosted
by Educause.
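Mr. Case's figure is consistent with spreading the purchase price evenly over the laptop's expected life. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the even yearly split is an assumption inferred from his example, not a statement of IRS rules.

```python
def annual_taxable_amount(purchase_price, life_years):
    """Spread a laptop's price evenly over its expected life."""
    if life_years <= 0:
        raise ValueError("life_years must be positive")
    return purchase_price / life_years

print(annual_taxable_amount(1600, 4))  # 400.0
```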
At Manchester, the news has
caused “significant pushback from
employees” who want to avoid
paying additional taxes, Mr. Case
wrote.
“Many want to exchange their
laptop for a desktop to avoid the tax
liability in the future,” he wrote. “I
can’t blame them because I’m thinking the same thing.”
Because university laptops are
often used for personal purposes, the Internal Revenue Service
counts them as taxable “fringe
benefits,” said Bertrand Harding,
a tax lawyer specializing in nonprofit institutions, in an interview.
If a college is able to provide documentation showing such laptops
were used for business purposes
every time they were turned on, it
would not have to pay tax on the
machines. But few institutions
could offer any such proof, Mr.
Harding said.
In recent years, the IRS has started to crack down on fringe benefits
like laptops and cellphones, and it
has asked institutions to pay taxes
that were not withheld from employees, Mr. Harding said.
That has resulted in an increasing
number of colleges, like Manchester,
listing laptops as taxable items, Mr.
Harding said.
“Some colleges just put their
heads in the sand and say we’re just
going to wait for the IRS to come,”
Mr. Harding said. “Others are
changing their policies so they don’t
get hammered when the IRS comes
in.” —David Shieh