Hey, don't worry about it. Nest full of vampires, you come get me, okay. Box full of puppies, that's more of a judgement call.

Jonathan, 'Lies My Parents Told Me'


Bureaucracy 2: Like Sartre, Only Longer  

A thread to discuss naming threads, board policy, new thread suggestions, and anything else that has to do with board administration and maintenance. Guaranteed to include lively debate and polls. Natter discouraged, but not deleted.

Current Stompy Feet: ita, Jon B, DXMachina, P.M. Marcontell, Liese S., amych


erinaceous - Jan 12, 2004 2:17:11 pm PST #6440 of 10005
A fellow makes himself conspicuous when he throws soft-boiled eggs at the electric fan.

hey Bureaucracy Folks!

I'm never in here, but something has come up that I want to Propose to the Buffistas.

As many of you know, I work for mumble-mumble Dictionary, and I am also involved with an organization that is collecting examples of modern American speech and writing for dictionary editors, people who write ESL textbooks, computational and corpus linguists, etc., to compile a corpus for research purposes.

They are looking for web data. I instantly thought about the Buffistas, because we are overwhelmingly American (I won't mention Nilly, Angus, Moonlit, the Nova Scotians, Am Chau et al.), we are highly literate, and our discussions touch on a great many topics (unlike those on many other discussion boards). Most importantly, the Buffistas OWN Buffistaland, which means that, unlike with some other sites like WX, we haven't given away the rights to what we've written.

The data can be anonymized to remove usernames, and the entire board would not be included. They would take, at most, several hundred thousand words at random. (The entire corpus is going to be 200 million words, just to let you know scale.)

I said I would ask if the Buffista discussions (although I didn't actually say "Buffista") could be corporized. I am asking here, but I think that this is something that the Buffistas should vote on.

Plusses: Our writing would influence the creation of dictionaries and teaching materials for certainly at least the next ten years, and realistically for the next forty or more years. We would really be helping a wonderful non-profit scholarly endeavor. It would also push the word 'foamy.'

Cons: Although the selection of words would be random and not connected to user names, it is stuff that would be seen by perhaps as many as a thousand researchers over the next decade. The site might have to be identified by URL in the documentation of corpus sources (I'm not sure). The Buffista Stompies would have to sign a release allowing the corpus researchers to use the data under a restricted license (which says that only people who agree to use the data for scholarly purposes can see it, no commercial exploitation, and that researchers must follow certain ethical rules).

I am happy to answer any questions, either here (I will try to check this thread daily) or by profile email. I promise no salesperson will call. :-)


Katie M - Jan 12, 2004 2:21:25 pm PST #6441 of 10005
I was charmed (albeit somewhat perplexed) by the fannish sensibility of many of the music choices -- it's like the director was trying to vid Canada. --loligo on the Olympic Opening Ceremonies

So... it'd be words, not sentences/full posts?


Sophia Brooks - Jan 12, 2004 2:23:41 pm PST #6442 of 10005
Cats to become a rabbit should gather immediately now here

That sounds interesting, although I am not sure that all the words I use are even remotely real! But does this mean "foamy" might be in the dictionary?


DCJensen - Jan 12, 2004 2:25:18 pm PST #6443 of 10005
All is well that ends in pizza.

does this mean "foamy" might be in the dictionary?

It otter be.


Laura - Jan 12, 2004 2:25:28 pm PST #6444 of 10005
Our wings are not tired.

Perhaps we could otter up lightbulbs to keep discussion together although a vote may or may not actually occur. Very interesting project and I am sure that many of us will have questions.


Jesse - Jan 12, 2004 2:26:16 pm PST #6445 of 10005
Sometimes I trip on how happy we could be.

But Sophia, once they're in the dictionary, they're totally real! Like bootylicious!


Cass - Jan 12, 2004 2:27:15 pm PST #6446 of 10005
Bob's learned to live with tragedy, but he knows that this tragedy is one that won't ever leave him or get better.

We can make new words? Oh yeah.


Aims - Jan 12, 2004 2:45:56 pm PST #6447 of 10005
Shit's all sorts of different now.

I offer up 'constructability' as a new word.


DXMachina - Jan 12, 2004 3:00:38 pm PST #6448 of 10005
You always do this. We get tipsy, and you take advantage of my love of the scientific method.

I love the idea. Before we open up lightbulbs, I think we probably want to get a little more info, and then someone can craft a proposal that we can vote on.

So... it'd be words, not sentences/full posts?

This is the question that I have. It would be the words without the context?


amych - Jan 12, 2004 3:12:44 pm PST #6449 of 10005
Now let us crush something soft and watch it fountain blood. That is a girlish thing to want to do, yes?

I am not sure that all the words I use are even remotely real!

You use 'em, right?