Whoa. Good myth.

Wash ,'Our Mrs. Reynolds'


Buffistas Building a Better Board  

Do you have problems, concerns or recommendations about the technical side of the Phoenix? Air them here. Compliments also welcome.

To-do list


John H - Jan 01, 2003 10:24:43 pm PST #2530 of 10000

the database is doing it, and so it's faster than us having to handle a PHP data structure

I hope I'm not being a pain, but I honestly don't get this.

Leaving aside the fact that the DB doesn't seem to do Random properly anyway, what are the efficiency issues that concern you?

If we get PHP to pick a random number between x and the number of quotes, then get it to select the quote that's greater than or equal to that number, that's somehow always less efficient than getting the DB to do it? Even though we know that DB does it in a very inefficient way (according to some dude on USENET anyway)?

There's something about back- and and front-end efficiencies that I obviously don't get.


§ ita § - Jan 01, 2003 10:42:53 pm PST #2531 of 10000
Well not canonically, no, but this is transformative fiction.

What do you mean "the quote that's greater than or equal to that number"?

I'm missing something. You want to pick a random number between 1 and what? And then select the quote with that ID, or something near it?


John H - Jan 01, 2003 10:49:57 pm PST #2532 of 10000

select the quote with that ID, or something near it

Well I only said that because you said the ids weren't consecutive, so I was thinking we'd select quote ID $x, and if there wasn't one, it could find the next one up -- is that feasible? Pseudocode: "select where equal to or greater than $x, limit of one"?

Or just go back and rewrite the numbers so they are consecutive.


§ ita § - Jan 01, 2003 10:55:17 pm PST #2533 of 10000
Well not canonically, no, but this is transformative fiction.

Or just go back and rewrite the numbers so they are consecutive.

Every time something's deleted? I dunno ...

Okay -- so we pick a random number -- what's the upper ceiling, or do we hit the database for that beforehand?

Then we do select charactername, quotation, quotes.season, quotes.episode, ep_title from quotes left join episodes on episodes.show_name=quotes.show_name and episodes.episode=quotes.episode and episodes.season=quotes.season where is_approved='Y' where quotes.id >= randomnumber limit 1, you're saying?

Personally, for something as trivial as quote frequency, I *suspect* the overhead in connecting, counting, returning, tearing down, connecting, selecting, returning, tearing down is more than the ORDER BY RAND() LIMIT 1.

But I admit I have no benchmarks.


John H - Jan 01, 2003 11:02:28 pm PST #2534 of 10000

John H - Jan 01, 2003 11:02:33 pm PST #2535 of 10000

what's the upper ceiling, or do we hit the database for that beforehand?

My quick fix is to say "how many do we have now? hard-code that in and remember to change it when we put some new quotes in" because only you are going to be putting new quotes in anyway.

select charactername, quotation, quotes.season, quotes.episode, ep_title from quotes left join episodes on episodes.show_name=quotes.show_name and episodes.episode=quotes.episode and episodes.season=quotes.season where is_approved='Y' where quotes.id >= randomnumber limit 1

That seems incredibly long to me. What's all that "show_name" and "season" stuff doing in there? If we don't display it? I'm puzzled, sorry.


§ ita § - Jan 01, 2003 11:11:10 pm PST #2536 of 10000
Well not canonically, no, but this is transformative fiction.

Corrected.

select charactername, quotation, quotes.season, quotes.episode, ep_title
from quotes
left join episodes
on episodes.show_name=quotes.show_name and episodes.episode=quotes.episode and episodes.season=quotes.season
where is_approved='Y' and quotes.id >= randomnumber limit 1

Showname is in the join, not the fields returned -- a join on episode 4, season 2 in the episode database will return Inca Mummy Girl and Untouched. We need to know which show a quote is from too. The season is in the fields returned because that's a field in the quote data structure, as is the episode number. It's not mandatory for the display, but puts little additional drain on the system, so I didn't remove it after we started displaying titles instead of ep numbers.

I have to repeat my position against hardcoding. The only reason I'm the only one massaging the table is because the code hasn't been written yet.

Admittedly I'm absolutely not bothered by having one quote come up much more than the other. I don't read them. And I really don't want to write kludgey code or double database hits for them either.


John H - Jan 01, 2003 11:16:18 pm PST #2537 of 10000

Showname is in the join, not the fields returned -- a join on episode 4, season 2 in the episode database will return Inca Mummy Girl and Untouched.

D'oh. Sorry about that. Staring me in the face.

I have to repeat my position against hardcoding. The only reason I'm the only one massaging the table is because the code hasn't been written yet.

I only meant that as a quick and dirty fix.

So the next best thing is to just select a COUNT from quotes and make that the randomnumber. Can we put that into the select, or do we have to do two selects, one to get the count and one to get the quote? it's been a while since I've thought about SQL.


§ ita § - Jan 01, 2003 11:18:23 pm PST #2538 of 10000
Well not canonically, no, but this is transformative fiction.

We'd have to do two hits.

Is this *really* worth coding? That's my skepticism. I think the quotes are cute, and that's about it.


John H - Jan 01, 2003 11:21:06 pm PST #2539 of 10000

I don't care about the quotes, or the less-than-truly-random-ness of them, I'm just interested in the tech. I'd really like to know why it's not being very random. I certainly didn't see a whole bunch of posts in usenet saying "that damn ORDER BY RANDOM() is broken, everyone knows that", just a few saying it wasn't very efficient.