Monday, May 23, 2011

Exciting Esoterica

It’s been a while since Brian Cowen stepped aside for Mr. Enda Kenny, and boy have I been productive. My old supervisor, Mark Keane, of University College infamy, and I have three papers to appear in two conferences this summer: IJCAI-11 and CogSci-11. IJCAI is an amazing conference, one of a couple of top-tier computer-science conferences (as rated by the ERA and these guys). IJCAI stands for “International Joint Conference on Artificial Intelligence.” It dates back to the Cold War, the “Joint” implying a Soviet-West collaboration. Cool, eh? CogSci is also a very good conference, though the acceptance rate was pretty high (~72%) and Mark thinks “it’s gone down-hill a bit [now that I’m less involved...].” It’s still an A-rate conference according to the ERA guys, and to be honest, I’d be overjoyed with the two papers in CogSci were they not overshadowed by the IJCAI paper. Plus, once I find some money willing to be spent on me, I’ll get to present in Barcelona for IJCAI and Boston for CogSci!

The IJCAI paper is about tracking stock-market bubbles using changes in the number and kinds of verbs in financial reporting. For some of you, the abstract + intro + discussion will be interesting. The method will bore everyone, and the model and analysis might be of technical interest to the more...patient...reader.

The CogSci papers are a bit more technical, and less widely exciting. They are a splitting and bolstering of some bits of my master’s thesis on metaphor. One focuses on clustering metaphoric phrases (found in financial text) based on the arguments they take. It’s a neat methodology, and I think we’re onto some good, real results. The second paper focuses on isolating metaphoric antonyms (soar-plummet, gain-lose) in financial text. Again, it investigates using argument distributions to generate results that correlate with those of a human study. The abstracts are worth everyone’s time; the rest is a bit dense.

Below are the citations and abstracts, with non-publisher links to the papers.

Gerow, Aaron and Keane, Mark T. (2011) Mining the Web for the "Voice of the Herd" to Track Stock Market Bubbles. To appear in Proc. of the 22nd Intl. Joint Conf. on A.I. (IJCAI '11), Barcelona, Spain, 16-22 July, 2011.

Abstract We show that power-law analyses of financial commentaries from newspaper web-sites can be used to identify stock market bubbles, supplementing traditional volatility analyses. Using a four-year corpus of 17,713 online, finance-related articles (10M+ words) from the Financial Times, the New York Times, and the BBC, we show that week-to-week changes in power-law distributions reflect market movements of the Dow Jones Industrial Average (DJI), the FTSE-100, and the NIKKEI-225. Notably, the statistical regularities in language track the 2007 stock market bubble, showing emerging structure in the language of commentators, as progressively greater agreement arose in their positive perceptions of the market. Furthermore, during the bubble period, a marked divergence in positive language occurs as revealed by a Kullback-Leibler analysis.

Available here.

Gerow, Aaron and Keane, Mark T. (2011) Identifying Metaphor Hierarchies in a Corpus Analysis of Finance Articles. To appear in Proc. of the 33rd Ann. Meeting of the Cog. Sci. Soc. (CogSci '11), Boston, MA, USA, 20-23 July, 2011.

Abstract Using a corpus of over 17,000 financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks, and shares. Using measures of the overlap in the argument distributions of these verbs and k-means clustering of their distributions, we advance evidence for the proposal that the metaphors referred to by these verbs are organised into hierarchical structures of superordinate and subordinate groups.

Available here.

Gerow, Aaron and Keane, Mark T. (2011) Identifying Metaphoric Antonyms in a Corpus Analysis of Finance Articles. To appear in Proc. of the 33rd Ann. Meeting of the Cog. Sci. Soc. (CogSci '11), Boston, MA, USA, 20-23 July, 2011.

Abstract Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP and DOWN-verbs used to describe movements of indices, stocks and shares. In Study 1 people identified antonyms of these verb sets in a free-generation task and a match-the-opposite task and the most commonly identified antonyms were compiled. In Study 2, we determined whether the argument-distributions for the verbs in these antonym-pairs were sufficiently similar to predict the most frequently-identified antonym. It was found that cosine similarity correlates moderately with the proportions of antonym-pairs identified by people (r = 0.31). More impressively, 87% of the time the most frequently-identified antonym is either the first- or second-most similar pair in the set of alternatives. The implications of these results for distributional approaches to determining metaphoric knowledge are discussed.

Available here.

Thursday, March 10, 2011

The follow-up

Here's how it worked out:
Fine Gael: 76 [Centre-right]
Labour: 37 [Centre-left]
Fianna Fáil: 20 [Centrist]
Sinn Féin: 14 [Quasi-Marxist, Nationalist]
Socialist / People Before Profit / ULA: 5 (Includes Seamus Healy) [Leftist]
Green: 0 [Environmental]
Independents: 14 (4 stated leftists, 1 stated conservative)

Right. So. Fine Gael / Labour coalition recently finalised. Enda Kenny, head of Fine Gael, is Taoiseach. Fine Gael takes 10 cabinet positions (similar to the U.S. cabinet) and Labour takes 5. The full cabinet is described here. Independents are working to form an ad-hoc alliance. The five ULA seats, along with at least three other like-mindeds, appear headed for cahoots-dom. Sinn Féin will probably vote with, but not be aligned with, an independent lefty alliance. However, there are a number of independents with centre-right sympathies (a la Fine Gael). So it seems the majority government won't have too much trouble passing legislation. They will, though, be tempered by Labour's large involvement--so we're likely to see a productive, centrist government. And! Renegotiating the EU/IMF bailout is on the table (thank goodness...). Sad to see the Greens go but happy to see a number of leftists come in.

Ah. And as for collecting election posters: our house has two Fianna Fáils, a Fine Gael, Sinn Féin, Labour, and an independent -- all from the Dublin Southeast constituency.

Wednesday, January 12, 2011

Saving Space in OS X

Do you ever miss that old Microsofty feeling you used to get when deleting Temporary Internet files or some dumb folder in Application Data and finding you have about five gigs more space? Well, touching the almost-out-of-swap-space 95% disk-usage on my laptop has rekindled my fondness for those cute temporary files and that warm fuzzy feeling you get when you zap them.

Now then, aside from some standard Unixy business in /var, some long-gone junk in /usr/local, and running the fink cleanup business, I couldn’t find anything much to delete! So what did I do, you ask...actually you didn't ask...sorry... But! I figured out a way to get my respectably large, unequivocally hip 72 gigs of music down to a lovable 48G. Yep. No joke. Here’s the deal.

I rip a lot of music from oulde fashioned CDs, and upon looking at my import settings, in iTunes->Preferences->Import Settings, I found I was using some nonsense called an AIFF encoder. This, as they say, sounds a bit dodgy, so I switched to the MP3 encoder. Yes, even though most of my music is OGG (and yours should be too!) I grudgingly switched from one oppressive format to another. And here’s why.

Now! I can order songs in iTunes by bit-rate (in list view, right-click in the headings bar to add a bit-rate column). At the top of the list was a whole bunch of seemingly random songs encoded at a staggeringly useless 1411kbps! Whoa. Rumor has it that some people can tell 256kbps from 192kbps but I think that’s rubbish. So I changed my encoder (back in Import Settings) to 192, which is as awesome as it will go. Then! I selected all the too-well-encoded-to-be-reasonable songs and went to Advanced->Create MP3 Version and iTunes converted the daylights outa them songs, and then I cunningly deleted the old ones. If you’re tight on space, not unlike myself, you’ll have to pull this maneuver in batches. Oh, and iTunes croaked when I selected over 1000 songs for conversion...bummer. But in the end, I’ve reclaimed a good bit of disk-space.
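
For the curious, here's the back-of-the-envelope arithmetic behind that saving, as a rough Python sketch. It assumes constant bit-rates, uncompressed CD audio at 44.1 kHz x 16 bits x 2 channels (about 1411 kbps), and a 192 kbps MP3 target (a standard MP3 rate). The function names are made up for illustration, and nothing here talks to iTunes.

```python
# Rough bit-rate arithmetic; all names here are hypothetical helpers.

def cd_bitrate_kbps(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Bit-rate of uncompressed CD-style audio, in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def megabytes_per_minute(kbps):
    """Size of one minute of audio at a constant bit-rate, in MB (10^6 bytes)."""
    bits = kbps * 1000 * 60
    return bits / 8 / 1_000_000


def uncompressed_share_gb(saved_gb, old_kbps, new_kbps):
    """Estimate how many GB were stored at old_kbps, given the space freed
    by re-encoding them at new_kbps (assumes size scales with bit-rate)."""
    return saved_gb / (1 - new_kbps / old_kbps)


if __name__ == "__main__":
    aiff = cd_bitrate_kbps()
    mp3 = 192  # assumed target bit-rate in kbps
    print("AIFF MB per minute:", megabytes_per_minute(aiff))
    print("MP3 MB per minute:", megabytes_per_minute(mp3))
    # 72G before, 48G after: 24G freed
    print("Estimated AIFF share (GB):", uncompressed_share_gb(72 - 48, aiff, mp3))
```

By this estimate, freeing 24G means something like 28G of the library had been sitting in AIFF files, each taking a bit over seven times the space of its MP3 replacement.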