Monday, May 23, 2011

Exciting Esoterica

It’s been a while since Brian Cowen stepped aside for Mr. Enda Kenny, and boy have I been productive. My old supervisor, Mark Keane, of University College infamy, and I have three papers to appear in two conferences this summer: IJCAI-11 and CogSci-11. IJCAI is an amazing conference, one of a couple of top-tier computer-science conferences (as rated by the ERA and these guys). IJCAI stands for “International Joint Conference on Artificial Intelligence.” It dates back to the Cold War, the “Joint” implying a Soviet-West collaboration. Cool, eh? CogSci is also a very good conference, though the acceptance rate was pretty high (~72%) and Mark thinks “it’s gone down-hill a bit [now that I’m less involved...].” It’s still an A-rated conference according to the ERA guys, and to be honest, I’d be overjoyed with the two papers in CogSci were they not overshadowed by the IJCAI paper. Plus, once I find some money willing to be spent on me, I’ll get to present in Barcelona for IJCAI and Boston for CogSci!

The IJCAI paper is about tracking stock-market bubbles using changes in the number and kinds of verbs in financial reporting. For some of you, the abstract + intro + discussion will be interesting. The method will bore everyone, and the model and analysis might be of technical interest to the more...patient...reader.

The CogSci papers are a bit more technical and less widely exciting. They split up and bolster some bits of my master’s thesis on metaphor. One focuses on clustering metaphoric phrases (found in financial text) based on the arguments they take. It’s a neat methodology, and I think we’re onto some good, real results. The second paper focuses on isolating metaphoric antonyms (soar-plummet, gain-lose) in financial text. Again, it investigates using argument distributions to generate results that correlate with those of a human study. The abstracts are worth everyone’s time; the rest is a bit dense.

Below are the citations and abstracts, with non-publisher links to the papers.

Gerow, Aaron and Keane, Mark T. (2011) Mining the Web for the "Voice of the Herd" to Track Stock Market Bubbles. To appear in Proc. of the 22nd Intl. Joint Conf. on A.I. (IJCAI '11), Barcelona, Spain, 16-22 July, 2011.

Abstract We show that power-law analyses of financial commentaries from newspaper web-sites can be used to identify stock market bubbles, supplementing traditional volatility analyses. Using a four-year corpus of 17,713 online, finance-related articles (10M+ words) from the Financial Times, the New York Times, and the BBC, we show that week-to-week changes in power-law distributions reflect market movements of the Dow Jones Industrial Average (DJI), the FTSE-100, and the NIKKEI-225. Notably, the statistical regularities in language track the 2007 stock market bubble, showing emerging structure in the language of commentators, as progressively greater agreement arose in their positive perceptions of the market. Furthermore, during the bubble period, a marked divergence in positive language occurs as revealed by a Kullback-Leibler analysis.

Available here.
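
For anyone curious what a Kullback-Leibler analysis looks like in practice, here is a minimal sketch. It is not the code or the data from the paper: the verb list, the two toy "weeks" of tokens, the add-one smoothing, and the function names are all assumptions made up for illustration. It just shows how the divergence between two word distributions can be computed.

```python
import math
from collections import Counter

def distribution(tokens, vocabulary, alpha=1.0):
    """Smoothed relative frequency of each vocabulary word in tokens.

    alpha is an add-alpha smoothing constant (an assumption here), used so
    that no probability is zero and the divergence stays finite.
    """
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values()) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}

def kl_divergence(p, q):
    """D_KL(P || Q) in bits, for two distributions over the same words."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)

# Toy data, invented for this example only.
vocab = ["rise", "soar", "climb", "gain"]
week_1 = ["rise", "gain", "rise", "climb", "soar", "gain"]
week_2 = ["soar", "soar", "soar", "rise", "soar", "gain"]

p = distribution(week_1, vocab)
q = distribution(week_2, vocab)

# A larger value means the two weeks use these verbs more differently.
print(kl_divergence(p, q))
```

Note that the measure is not symmetric: kl_divergence(p, q) and kl_divergence(q, p) generally differ, so the direction of the comparison matters.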


Gerow, Aaron and Keane, Mark T. (2011) Identifying Metaphor Hierarchies in a Corpus Analysis of Finance Articles. To appear in Proc. of the 33rd Ann. Meeting of the Cog. Sci. Soc. (CogSci '11), Boston, MA, USA, 20-23 July, 2011.

Abstract Using a corpus of over 17,000 financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks, and shares. Using measures of the overlap in the argument distributions of these verbs and k-means clustering of their distributions, we advance evidence for the proposal that the metaphors referred to by these verbs are organised into hierarchical structures of superordinate and subordinate groups.

Available here.


Gerow, Aaron and Keane, Mark T. (2011) Identifying Metaphoric Antonyms in a Corpus Analysis of Finance Articles. To appear in Proc. of the 33rd Ann. Meeting of the Cog. Sci. Soc. (CogSci '11), Boston, MA, USA, 20-23 July, 2011.

Abstract Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks and shares. In Study 1 people identified antonyms of these verb sets in a free-generation task and a match-the-opposite task and the most commonly identified antonyms were compiled. In Study 2, we determined whether the argument-distributions for the verbs in these antonym-pairs were sufficiently similar to predict the most frequently-identified antonym. It was found that cosine similarity correlates moderately with the proportions of antonym-pairs identified by people (r = 0.31). More impressively, 87% of the time the most frequently-identified antonym is either the first- or second-most similar pair in the set of alternatives. The implications of these results for distributional approaches to determining metaphoric knowledge are discussed.

Available here.
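
To give a rough feel for the cosine-similarity idea in that last abstract, here is a small sketch. It is an illustration only, not the paper's pipeline: the argument counts for each verb are invented, and the function and variable names are my own assumptions. Each verb is represented by how often it takes each argument, and candidate antonyms are ranked by how similar their argument distributions are to the target verb's.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (dict-like)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Invented argument counts: how often each verb takes each argument.
soar = Counter({"shares": 5, "prices": 3, "index": 2})
candidates = {
    "plummet": Counter({"shares": 4, "prices": 4, "index": 1}),
    "gain": Counter({"investors": 6, "shares": 1}),
}

# Rank the candidate verbs by similarity to "soar", most similar first.
ranked = sorted(
    candidates,
    key=lambda verb: cosine_similarity(soar, candidates[verb]),
    reverse=True,
)
print(ranked)
```

With these made-up counts, "plummet" ranks above "gain" because it shares nearly all of its arguments with "soar".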