[Noisebridge-discuss] [HAIRSPLITTING] Re: 5 geek fallacies

Mikael Vejdemo-Johansson mik at stanford.edu
Sat Feb 27 03:36:45 UTC 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Feb 26, 2010, at 3:39 PM, Jesse Zbikowski wrote:
> On Thu, Feb 25, 2010 at 11:56 PM, Sai Emrys <noisebridge at saizai.com>  
> wrote:
>> Unlike your search, Seth's is not attempting to use popularity to
>> determine historical facts, but to determine the *popularity* of a
>> collocation.
>
> Well, the idea was to determine the *correctness* of the collocation,
> not how many times it's been repeated on the Internet. Trial By Google
> represents a familiar kind of confirmation bias: you form a
> hypothesis, and test it in a way that can only turn up supporting
> evidence. People may well employ this collocation, but that does not
> preclude the possibility that there is a different and preferred
> construction which doesn't reveal itself in such a search.
>
> All kidding aside, there are a number of forums (mainly geared toward
> non-native English speakers) which can offer much more satisfying
> analyses of grammatical questions, for the truly curious. However I
> suppose meta-grammatical discussions along the lines of "does grammar
> have a logical and prescriptive component, or does it merely describe
> how people use language" are more or less par for this list.

Grammatical analysis and descriptive linguistics are different things,
though. Certainly, one could seek out an appropriate forum for language
geeks and sit down for an erudite analysis of what the current models of
English grammar or word usage make of a given phrase; but that ends up
being a comparatively prescriptive approach, relying on the precision of
past analyses.

What Seth was doing is something I've often seen linguistics
researchers do (and even cite in research reports): namely, use Google
to get a feeling for the relative popularity of different
collocations. It's a cheap, low-labour alternative to digging through
a couple of thousand newspapers (or some other textual corpus source)
by hand or, preferably, using an already established corpus. Such
corpora tend to come with licenses that make them less than accessible
for casual research.

Sure, it is subject to confirmation bias. But on the other hand, is
[Noisebridge-discuss] really a venue you expect peer-review-grade
research from?


Mikael Vejdemo-Johansson, Dr.rer.nat
Postdoctoral researcher
mik at math.stanford.edu






-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (Darwin)

iEYEARECAAYFAkuIk1MACgkQtUmpDMB8zM2w7QCcDUStDbYAg0Ft9auMT+Hm4ncB
ACoAn2gUFhhrsHa8fcV2DGyWtr/MFJfy
=UlpF
-----END PGP SIGNATURE-----


