Sunday, January 15, 2012

Download and run Apple Hardware Test (AHT) from a USB drive.

Background (why I’m in this state)

I have a MacBook Pro 2,2 which came with Tiger. Then I upgraded to Leopard, then Snow Leopard and most recently, to Lion. For each of these upgrades I did a clean install, because I’d heard bad things about Ruby, CPAN, fink, MySQL and stuff getting munged. Effectively, I would wipe the old system and copy (not “Restore”) my old files.

The Situation (what didn’t work)

It seems like my GPU may be on its way out, so I wanted to run Apple’s Hardware Test (AHT). I don’t have my original system disks nearby, from which AHT could be run rather easily. Booting while holding ‘D’ didn’t work, and neither did F2 or Option+’D’, as some forums claimed. So I poked around and found that my /System/Library/CoreServices/ didn’t have a .diagnostics directory (where AHT reportedly should be). This is probably because of all my clean OS installs. I found a place to download it (see below), but copying AHT into that .diagnostics directory still didn’t allow me to boot into AHT using the normal steps. I think this is because, having cleanly installed Lion, the system expects to use the new fancy internet-AHT like the MacBook Airs -- but the system ROM doesn’t trap Option+’D’ at start-up. So I still couldn’t get AHT to run. Here’s what I did to get it to boot from a USB stick.

The Method (what did work)

1) Download the AHT for your computer (see downloads below for specific models).

1b) My copy required converting the downloaded .dmg from some “old” image type using Disk Utility. (Just open the .dmg in Disk Utility, “Convert” it to a new target, then mount the target.)

2) Mount and completely wipe a USB stick.

3) From the AHT image, copy /System to the root folder of your USB stick:

cd /Volumes/USB_STICK/ && cp -r ~/AHT_ARCHIVE/System .

4) Now, from the USB drive, copy the /System/Library/CoreServices/.diagnostics/diags.efi to the root directory:

cd /Volumes/USB_STICK/ && cp ./System/Library/CoreServices/.diagnostics/diags.efi .

5) Shut down all applications.

6) “bless” the USB drive in mount-mode, with the EFI file, and immediately reboot:

cd /Volumes/USB_STICK/ && sudo bless --mount /Volumes/USB_STICK --setBoot --file diags.efi && sudo reboot

7) You should now be booting into AHT -- don’t hold down any keys.

8) Run the tests, and yank the USB key after AHT reboots you.
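For reference, steps 3 and 4 can be rolled into one little script. This is a hedged sketch: `stage_aht` is a name I made up, and the /Volumes mount points are assumptions -- adjust them to match your machine. The bless-and-reboot step is left as comments so a dry run only stages files.

```shell
#!/bin/sh
# Steps 3-4 above as one function: stage the AHT image onto the USB stick.
stage_aht() {
    img="$1"   # mounted AHT image, e.g. /Volumes/AHT_ARCHIVE (assumption)
    usb="$2"   # wiped USB stick,   e.g. /Volumes/USB_STICK   (assumption)

    # Step 3: copy the /System tree to the stick's root.
    cp -R "$img/System" "$usb/" || return 1

    # Step 4: surface diags.efi at the root so bless can point at it.
    cp "$usb/System/Library/CoreServices/.diagnostics/diags.efi" "$usb/" || return 1
}

# Steps 5-6: quit your apps, then bless the stick and reboot, e.g.:
#   stage_aht /Volumes/AHT_ARCHIVE /Volumes/USB_STICK
#   cd /Volumes/USB_STICK && sudo bless --mount /Volumes/USB_STICK --setBoot --file diags.efi
#   sudo reboot
```

Nothing clever going on -- it just saves re-typing the cd-and-cp dance if the first attempt doesn’t take.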

You can download the AHT package for your computer using this URL:[MODEL NUMBER]-A.dmg

where [MODEL NUMBER] is the four-number ID below:

3282 for Mac-F4208AC8, Mac-F42289C8 Xserve1,1 and Xserve2,1
3259 for Mac-F42C8CC8 MacBookAir1,1
3273 for Mac-F42C88C8 MacPro3,1
3254 for F4238CC8, F42386C8, F4218EC8, F4208EAA, F4208DC8, F4208DA9, F4238BC8, F42388C8 and F22788C8 inclusively.

or more specifically:
3085 for Mac-F22788C8 MacBook3,1
2886 for Mac-F4208EAA Macmini2,1
2845 for Mac-F42386C8 iMac7,1
2833 for Mac-F42388C8 MacBookPro3,1
2770 for Mac-F4238BC8 MacBookPro3,1
2769 for Mac-F4208DC8 MacPro1,1
2667 for Mac-F4208DA9 MacPro2,1
2766 for Mac-F4208CAA MacBook2,1
2592 for Mac-F42189C8 MacBookPro2,1
2591 for Mac-F42187C8 MacBookPro2,2
2590 for Mac-F4208CA9 MacBook2,1
2579 for Mac-F4218FC8 iMac6,1
2535 for Mac-F4218EC8 iMac5,2
2534 for Mac-F4228EC8 iMac5,1
2533 for Mac-F42786A9 iMac5,1
And these are there, but too old to identify: 2418, 2405, 2398, 2393, 2392, 2342, 2216, 2215, 2158, 2120, 2079, 2056, 1880, 1879, 1680 and 1594.

(Thanks to mkincaid at the macnn forum for that post.)


Monday, May 23, 2011

Exciting Esoterica

It’s been a while since Brian Cowen stepped aside for Enda Kenny, and boy have I been productive. My old supervisor, Mark Keane, of University College infamy, and I have three papers to appear in two conferences this summer: IJCAI-11 and CogSci-11. IJCAI is an amazing conference, one of a handful of top-tier computer-science conferences (as rated by the ERA and these guys). IJCAI stands for “International Joint Conference on Artificial Intelligence.” It dates back to the cold war, the “Joint” implying a Soviet-West collaboration. Cool, eh? CogSci is also a very good conference, though the acceptance rate was pretty high (~72%) and Mark thinks “it’s gone down-hill a bit [now that I’m less involved...].” It’s still an A-rated conference according to the ERA guys, and to be honest, I’d be overjoyed with the two CogSci papers were they not overshadowed by the IJCAI one. Plus, once I find some money willing to be spent on me, I’ll get to present in Barcelona for IJCAI and Boston for CogSci!

The IJCAI paper is about tracking stock-market bubbles using changes in the number and kinds of verbs in financial reporting. For some of you, the abstract + intro + discussion will be interesting. The method will bore everyone, and the model and analysis might be of technical interest to the more...patient...reader.

The CogSci papers are a bit more technical, and less widely exciting. They are a splitting and bolstering of some bits of my master’s thesis on metaphor. One focuses on clustering metaphoric phrases (found in financial text) based on the arguments they take. It’s a neat methodology, and I think we’re onto some good, real results. The second paper focuses on isolating metaphoric antonyms (soar-plummet, gain-lose) in financial text. Again, it investigates using argument distributions to generate results that correlate with those of a human study. The abstracts are worth everyone’s time; the rest is a bit dense.

Below are the citations and abstracts, with non-publisher links to the papers.

Gerow, Aaron and Keane, Mark T. (2011) Mining the Web for the "Voice of the Herd" to Track Stock Market Bubbles. To appear in Proc. of the 22nd Intl. Joint Conf. on A.I. (IJCAI '11), Barcelona, Spain, 16-22 July, 2011.

Abstract We show that power-law analyses of financial commentaries from newspaper web-sites can be used to identify stock market bubbles, supplementing traditional volatility analyses. Using a four-year corpus of 17,713 online, finance-related articles (10M+ words) from the Financial Times, the New York Times, and the BBC, we show that week-to-week changes in power-law distributions reflect market movements of the Dow Jones Industrial Average (DJI), the FTSE-100, and the NIKKEI-225. Notably, the statistical regularities in language track the 2007 stock market bubble, showing emerging structure in the language of commentators, as progressively greater agreement arose in their positive perceptions of the market. Furthermore, during the bubble period, a marked divergence in positive language occurs as revealed by a Kullback-Leibler analysis.

Available here.

Gerow, Aaron and Keane, Mark T. (2011) Identifying Metaphor Hierarchies in a Corpus Analysis of Finance Articles. To appear in Proc. of the 33rd Ann. Meeting of the Cog. Sci. Soc. (CogSci '11), Boston, MA, USA, 20-23 July, 2011.

Abstract Using a corpus of over 17,000 financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks, and shares. Using measures of the overlap in the argument distributions of these verbs and k-means clustering of their distributions, we advance evidence for the proposal that the metaphors referred to by these verbs are organised into hierarchical structures of superordinate and subordinate groups.

Available here.

Gerow, Aaron and Keane, Mark T. (2011) Identifying Metaphoric Antonyms in a Corpus Analysis of Finance Articles. To appear in Proc. of the 33rd Ann. Meeting of the Cog. Sci. Soc. (CogSci '11), Boston, MA, USA, 20-23 July, 2011.

Abstract Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP and DOWN-verbs used to describe movements of indices, stocks and shares. In Study 1 people identified antonyms of these verb sets in a free-generation task and a match-the-opposite task and the most commonly identified antonyms were compiled. In Study 2, we determined whether the argument-distributions for the verbs in these antonym-pairs were sufficiently similar to predict the most frequently-identified antonym. It was found that cosine similarity correlates moderately with the proportions of antonym-pairs identified by people (r = 0.31). More impressively, 87% of the time the most frequently-identified antonym is either the first- or second-most similar pair in the set of alternatives. The implications of these results for distributional approaches to determining metaphoric knowledge are discussed.

Available here.

Thursday, March 10, 2011

The follow-up

Here's how it worked out:
Fine Gael: 76 [Centre-right]
Labour: 37 [Centre-left]
Fianna Fáil: 20 [Centrist]
Sinn Féin: 14 [Quasi-Marxist, Nationalist]
Socialist / People Before Profit / ULA: 5 (Includes Seamus Healy) [Leftist]
Green: 0 [Environmental]
Independents: 14 (4 stated leftists, 1 stated conservative)

Right. So. The Fine Gael / Labour coalition was recently finalised. Enda Kenny, head of Fine Gael, is Taoiseach. Fine Gael takes 10 cabinet positions (similar to the U.S. cabinet) and Labour takes 5. The full cabinet is described here. Independents are working to form an ad-hoc alliance. The five ULA seats, along with at least three other like-mindeds, appear headed for cahoots-dom. Sinn Féin will probably vote with, but not be aligned with, an independent lefty alliance. However, there are a number of independents with centre-right sympathies (a la Fine Gael). So it seems the majority government won't have too much trouble passing legislation. They will, though, be tempered by Labour's large involvement -- so we're likely to see a productive, centrist government. And! Renegotiating the EU/IMF bailout is on the table (thank goodness...). Sad to see the Greens go, but happy to see a number of leftists come in.

Ah. And as for collecting election posters: our house has two Fianna Fáils, a Fine Gael, Sinn Féin, Labour, and an independent -- all from the Dublin Southeast constituency.

Wednesday, January 12, 2011

Saving Space in OS X


Do you ever miss that old Microsofty feeling you used to get when deleting Temporary Internet Files or some dumb folder in Application Data and finding you had about five gigs more space? Well, hitting the almost-out-of-swap-space 95% disk-usage mark on my laptop has rekindled my fondness for those cute temporary files and that warm fuzzy feeling you get when you zap them.

Now then, aside from some standard Unixy business in /var, some long-gone junk in /usr/local, and running the fink cleanup business, I couldn’t find much of anything to delete! So what did I do, you ask... actually you didn't ask... sorry... But! I figured out a way to get my respectably large, unequivocally hip 72 gigs of music down to a lovable 48G. Yep. No joke. Here’s the deal.

I rip a lot of music from oulde fashioned CDs, and upon looking at my import settings, in iTunes->Preferences->Import Settings, I found I was using some nonsense called an AIFF encoder. This, as they say, sounds a bit dodgy, so I switched to the MP3 encoder. Yes, even though most of my music is OGG (and yours should be too!) I grudgingly switched from one oppressive format to another. And here’s why.

Now! I can order songs in iTunes by bit-rate (in list view, right-click the headings bar to add a bit-rate column). At the top of the list was a whole bunch of seemingly random songs encoded at a staggeringly useless 1411kbps! Whoa. Rumor has it that some people can tell 256kbps from 192kbps, but I think that’s rubbish. So I changed my encoder (back in Import Settings) to 192, which is as awesome as it needs to go. Then! I selected all the too-well-encoded-to-be-reasonable songs, went to Advanced->Create MP3 Version, and iTunes converted the daylights outa them songs, and then I cunningly deleted the old ones. If you’re tight on space, not unlike myself, you’ll have to pull this maneuver in batches. Oh, and iTunes croaked when I selected over 1000 songs for conversion... bummer. But in the end, I’ve reclaimed a good bit of disk-space.
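If you’d rather hunt the offenders from the Terminal, here’s a rough command-line analogue of the bit-rate-column trick: flag the big lossless files directly. A sketch only -- `big_audio` is a made-up helper, and the directory and size cutoff are yours to pick.

```shell
# Flag oversized lossless audio files from the shell (a rough stand-in
# for sorting by bit-rate in iTunes). Directory and cutoff are assumptions.
big_audio() {
    dir="$1"; cutoff="$2"   # e.g. big_audio ~/Music +20M
    find "$dir" -type f \( -name '*.aiff' -o -name '*.aif' -o -name '*.wav' \) \
        -size "$cutoff" -exec du -k {} + | sort -n
}
```

For example, `big_audio ~/Music +20M` lists the worst offenders with their sizes in kilobytes, biggest last.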

Tuesday, October 19, 2010

Noun-Noun Compounds

Here’s my latest mini-project.

Noun-noun compounds are a relatively rare syntactic occurrence where two nouns are paired together to blend or attribute separate concepts.  Some examples include “island bungalow”, “race car”, and “stock market”.  Interested yet?  When you think about it, it may seem like the first noun is actually an adjective, but it’s really not.  Technically, adjectives need a form different from their noun counterparts.  Nouns and verbs in English often share form: “run for the door” and “I’ll go for a run”.  Adjectives, however, have a distinct form--forms that are often generalised and over-extended to create fake words.  Cramming our examples into an adjective-noun structure we would get “island-esque bungalow”, “race-ish car”, and “stocky market”.  None of these really sound all that great, and it’s not just the made-up-ness of the words.  These examples are overly constrained by the interface between lexical and syntactic boundaries.  An adjective is not what we want, despite it being syntactically a little more common.  So!  People make up noun-noun compounds instead of breaking lexical rules.  I think Ray Jackendoff takes this as evidence that lexical constraints apply in parallel to syntactic constraints, not in sequence.

Linguistically: We can explode the syntax of noun-noun compounds as such: [n1 + n2] => “the [n1] type of [n2]”, or inverted: “an [n2] of the [n1] type”.  Poetically, these options might be neat... but that’s about it.  They are a more explicit, boring form of concept combination.  In fact, it’s so explicit that some of the metaphorical shading present in the concise noun-noun form is lost.  Replacing the fluidity of metaphor, we get boring old category assumptions, such as Q: what type of bungalow? A: the island type.  This supposes there is a discrete set of types-of-bungalows, one of which is island.  Dumb!  Noun-nouns are better.

So, why do we care?  Psychologically, noun-noun compounds are interesting because they are a case of analogy and concept blending.  But beyond that, they are a wonderfully small and easy example of conceptual genesis.  Now there’s a buzz-word for ya.  But check it out.  New concepts are created every day.  Often words are used to characterize, describe, or understand concepts.  But sometimes words enable the creation of new concepts.  Terms, different from concepts, are particularly interesting when they are new and isolated.  A good example is the term “moral hazard” in recent financial and political talk.  The term refers to a situation where someone behaves differently than they would have had they been aware of the full extent of risk.  So we have this noun-noun compound used to characterise a circumstance that is recently relevant to the financial and political arenas.  The term is introduced without much explanation, but it catches on regardless.  Metaphor aside, this is an example of conceptual genesis, enabled and largely inspired by the creative use of language.

What we want to do is see what noun-noun compounds say about language and people.

Step 0) Make sure they’re interesting.  See above.
Step 1) Find them.  See below.
Step 2) Figure out what it means.  ...

Step 1:

Below is a script for finding noun-noun compounds in arbitrary text.  It uses TreeTagger (free but separate) to tag a text file with part-of-speech tags (Noun, verb, etc...).  And then it finds the compounds and their lemmas, collapses them into their frequency of occurrence and gives you a spreadsheet.

It needs a Unix environment and TreeTagger.  TreeTagger takes the majority of the time, but run on 10,829,875 words it’s not bad:

real    3m32.609s
user    5m24.708s
sys     0m20.861s
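For the curious: TreeTagger emits one token per line as word, tag, and lemma (tab-separated), and the script below simply walks that list looking for two NN-tagged tokens in a row.  Here’s a throwaway awk sketch of the same pairing-and-counting idea, run on a made-up six-token input (not real TreeTagger output):

```shell
# Toy version of the pairing step: adjacent NN-tagged tokens become a compound.
printf 'the\tDT\tthe\nstock\tNN\tstock\nmarket\tNN\tmarket\nrose\tVBD\trise\nrace\tNN\trace\ncar\tNN\tcar\n' |
awk -F'\t' '
    substr($2,1,2) == "NN" {           # noun-ish tag (NN, NNS, ...)
        if (prev != "") print prev "," $1
        prev = $1
        next
    }
    { prev = "" }                      # any other tag breaks the chain
' | sort | uniq -c | sort -n           # collapse into counts
```

That spits out counts for "stock,market" and "race,car"; the Perl below does the same thing plus lemmas, exclusions, and the CSV header.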

Step 2:

Preliminarily: I’ve looked at the results from a corpus of financial texts and get this! 91 of the top 100 noun-noun compounds are distinctly financial in nature.  Compare this with the 45 / 100 for the top bare nouns.  This makes me think that noun-noun compounds might be even MORE interesting...

More later!

The Script:

#! /usr/bin/perl -W
use strict;
$| = 1; 

## A script to find noun-noun compounds in a text file.
## Uses TreeTagger to lemmatise and POS-tag the corpus.
## Then we count and sort the instances into a CSV.
##   A. Gerow, 2010-10-14:

unless (-r $ARGV[0] && $ARGV[1] && -x $ARGV[2]) {
    print "Usage: ./get_NNs [input file] [output file] [path to tree-tagger command]\n";
    exit 1;
}

if (-e "./temp") {
    print "Error: Please remove ./temp before running.\n\n";
    exit 1;
}

sub INT_handler {
    unlink("./temp") if (-e "./temp");
    print "\nCaught SIG_INT, exiting\n\n";
    exit 0;
}

$SIG{'INT'} = 'INT_handler';

my $INFILE  = $ARGV[0]; # input file
my $OUTFILE = $ARGV[1]; # output file
my $TT_PATH = $ARGV[2]; # path to language-specific tree-tagger binary

my $prev_type  = "";
my $prev_word  = "";
my $prev_lemma = "";
my @results;

# Words to exclude (added ad hoc.)
my @exclude = qw'zero one two three four five six seven eight nine ten eleven twelve
                 the an a of per cent for in ? and his her has from its their to these 
                 this that were out were new after whose began before them last with
                 sent rose take first second third fourth fifth sixth seventh eighth
                 more garner over also formal into down up strong all hit far day week
                 year decade highest lowest sure hard other recent said our abouta abouthe';

# First cat the corpus through tr to delete some characters before TreeTagger gets it.
print "Tagging...";
qx+cat $INFILE | tr -d '=\`"<>/@#?$%^&(){}[]' | $TT_PATH > ./temp 2> /dev/null +;
print "done\nSearching for noun-noun compounds...";

# For every word as tagged by TreeTagger:
open(IN, "<./temp") or die "Cannot open ./temp: $!";
while (<IN>) {
    my ($word, $type, $lemma) = split(" ");

    # Lower-case, remove '?' and spaces.
    $word = lc($word);
    $word =~ s/\s*//g;    
    $type = substr($type, 0, 2); # Includes NNS, NNX, ..., NN*

    # Skip if 1) the word is in the exclude list
    #         2) the word is contracted 'is' (ie. 's)
    #         3) the word contains digits
    #         4) the word is shorter than 3 letters
    if (grep($_ eq lc($word), @exclude) || lc($word) eq "'s" ||
        lc($word) =~ m/[0-9]/ || length($word) < 3) {
        $prev_word = "";
        $prev_type = "";
        $prev_lemma = "";
        next;
    }
    elsif ($type eq 'NN' && $type eq $prev_type) { 
        push(@results, "$prev_word,$prev_lemma,$word,$lemma");
    }

    $prev_word = $word;
    $prev_type = $type;
    $prev_lemma = $lemma;
}
close(IN);
print "done\nWriting CSV...";

# Count and collapse like instances:
my %count;
map { $count{$_}++ } @results;

# Write to file and sort it, ascending by number of occurrences:
open(OUT, ">temp") or die "Cannot write ./temp: $!";
map { print OUT "${count{$_}},$_\n" } keys(%count);
close(OUT);
qx/echo 'count,word_1,lemma_1,word_2,lemma_2' > $OUTFILE/; # CSV header
qx/sort -n temp >> $OUTFILE && rm temp/; # UNIX numerical sort

print "done\n\n";

Wednesday, July 14, 2010

Spike Activity

Right then.  Since I last spouted:

- Finished those pesky classes.  All wrapped up rather nicely.  This year I successfully tricked the philosophy guys into thinking I was a philosopher, the psychologists I was a psychologist, the linguists a linguist, and computer scientists are still unclear about how what I do has anything to do with computers.  I win.

- Worked through writing a respectable article which my supervisor and I are submitting to be published.  The process basically amounts to finding and jumping through various technical hoops.  More on this later.

- Was accepted to a couple of PhD programmes and finally accepted an offer at Trinity College Dublin.  It’s renowned for its humanities division, its progressive acceptance of women, its deplorable exclusion of Catholics and, of course, its impeccable lawns.  I’ll be studying under these two jokers.  I plan on calling the second “Carl Stallman” (long hair + emacs obsession = rms).  The other has so far eluded a nick-name; however, his demeanor is reminiscent of a certain loud-mouthed PLU faculty member... except the Trinity version's dress-code belies his capitalist sympathies...

- So.  Right now I’m finishing the minor thesis.  The idea: metaphors applied to common financial objects (stock, market, economy, etc...) will be of a few domains (spatial, physical, war, etc...).  But!  The linguistic instances of these metaphors will take a number of forms, many of which are antonymic (fall & rise, soar & plummet, rally & retreat, etc...)  My job is to compare the collocates (words that occur nearby) these various antonym pairs.  The hypothesis: those antonyms that are less agreed upon (experimentally based) will exemplify a more diverse set of collocates in the corpus even though they statistically occur most-often as describing the action of a small set of objects.  Got it?  This is important because 1) it reaffirms that metaphor is cognitive not linguistic but comes at this from the top-down as a corpus-based study, 2) it suggests that some objects make more metaphorical sense when used to describe either positive or negative events, but maybe not both, and 3) the method is an example of a corpus approach to cognitive-linguistics, and more specifically, metaphor.  Bing bang boom.  A+.

- Was lent a Roland electronic drum-set for a month.  Strong points include more-than-real rebound on the toms, a very sensitive snare complete with rim-shot, quiet, and lots of “styles” of drum-set (my favorite was named “power-house-fusion.” Hah!)  Weak points include a very soft hi-hat that offered nearly no rebound, and that its compact size would likely become a crutch, making it hard to go back to bigger set.  In the end it was an absolute joy to play for the first time in a year.  Thanks P.C..  Huzzah.

- Will be moving from the UCD campus to Ranelagh (still in Dublin) -- living with architecture students.  Lord help me.

- Reading Nabokov.  Lovely.

- New bands include The Riot Before, The Flatliners, The Menzingers, Turin Brakes, Austin Lucas, and old motown stuff.  New albums include Against Me! - White Crosses, Murder by Death - Good Morning Magpie, and Gaslight Anthem - American Slang.  All well worth buying.

- Someone mistook me for a messenger.  Two Bicycle Points!  Two people asked me directions, questions I could answer.  Two Dublin Points!  And one person on the phone thought I was from Galway.  One Irish Point!

Points abound.

Sunday, March 21, 2010

A Network Model of Biological Evolution?

Finishing Tim Ingold's piece and the book Linked by Albert-László Barabási in the same day was more circumstance than anything, but the overlap was unavoidable.  Barabási, a Hungarian physicist at the University of Notre Dame, has been forging a new line of research known as Applied Network Theory.  Seminal work in the field includes The Strength of Weak Ties, The Small World Problem and, more recently on power-laws, Power Laws, Pareto Distributions and Zipf's Law.

Inspired partly by computer networks, applied network theory takes the basic data structure of a network of interconnected nodes and applies it as a model to various naturally occurring phenomena.  Some particularly good examples revealed in Barabási's book include social networks, cell biology and genetics, international financial markets, and everyone's favorite network, the world wide web*.  Interesting properties of varying degrees of complexity fall out of simple network structure, like "hubs" of proportionally large inter-connectivity, "islands" of relatively segregated sub-networks, and of course weak links such as those found abundantly in social networks.

The connection to Tim Ingold's call for an organism-centric biology is not hard to see.  Network theory offers a simple and scalable model for organism-culture interaction.  A scale-free network (one in which the nodes' connectivity distribution follows a power-law) could explain memetic / culturgen heritability and expansion.  Networks have been used to show how companies rise to monopolies, how youtube videos go viral, and why child-naming patterns exhibit momentum.  Networks offer what could prove to be an elegant reconciliation of implicate organismal traits and their relationship to culture.

*One emergent feature of directed networks is sub-structures referred to as "tubes": segments in which elongated, one-directional flow occurs.  Thus, the U.S. Senator from Alaska, Ted Stevens, wasn't completely full of crap when he so eloquently characterised the internet as a series of tubes.