Never goes smooth. How come it never goes smooth?

Mal ,'Safe'


Buffistas Building a Better Board  

Do you have problems, concerns or recommendations about the technical side of the Phoenix? Air them here. Compliments also welcome.

To-do list


Tom Scola - Jan 08, 2004 12:10:20 pm PST #6494 of 10000
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.

OK, first of all, I was supposed to set up the mailing lists, but I've been slacking. I apologize for that.

Here's the short summary about the performance:

  • The good news is that as far as I can see, the performance problem is strictly a matter of fixing the code. Given our current usage patterns, there's no reason why we can't go back to a cheaper shared hosting solution at the end of the year. Nobody needs to change their behavior.

  • The bad news is that because of the way the code is structured, much of it will have to be thrown out and rewritten from scratch to get the most benefit.

Basically, every time the code queries the database, it uses a "connectAndQuery" function. This gets called around 3 or 4 times per page view. The first problem is that database connects are extremely expensive. Every time we connect to the database, mysqld forks a copy of itself, which then goes away when the connection is closed. The second problem is that for every query, the entire table gets loaded into memory, and then the table gets written out as HTML. This is especially bad with threadsucks, where the size of a httpd process can grow to over 40 megabytes. A far better approach is to fetch the result one row at a time, print out a line of HTML, and then fetch the next row.

Unfortunately, the connectAndQuery idiom is used everywhere in the code.


§ ita § - Jan 08, 2004 12:19:04 pm PST #6495 of 10000
Well not canonically, no, but this is transformative fiction.

A far better approach is to fetch the result one row at a time, print out a line of HTML, and then fetch the next row

How do you think we can avoid the totals being different at the end of the page than they are at the top? Or should we just try and leave all totals or cumulative info until the bottom?

Also -- is the forking shutting down correctly, or are we sitting on connections that never disappear (according to what Rob found in the MySQL code?).


Tom Scola - Jan 08, 2004 12:30:26 pm PST #6496 of 10000
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.

To be perfectly honest, I'm not exactly sure why mysqld is forking. Up to now, I had assumed that mysqld was a single-instance, multithreaded process. That's the way I've seen it work in other places. Maybe it's a configuration issue; I haven't seen anything about it in the mysql manual. Maybe it's just how threads show up in a Linix ps listing.

is the forking shutting down correctly, or are we sitting on connections that never disappear (according to what Rob found in the MySQL code?).

It wasn't that the connections weren't going away, it was that an internal variable that counted the connections was wrong. I haven't looked into that. I don't see any evidence of actual connections that are sticking around.


Tom Scola - Jan 08, 2004 12:33:08 pm PST #6497 of 10000
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.

Cereal:

How do you think we can avoid the totals being different at the end of the page than they are at the top? Or should we just try and leave all totals or cumulative info until the bottom?

I don't know. It would still be better, as a compromise, to write the HTML to a string and keep the string in memory than to keep the SQL table in memory as an array.


§ ita § - Jan 08, 2004 12:34:41 pm PST #6498 of 10000
Well not canonically, no, but this is transformative fiction.

Sorry. Phrased poorly -- I meant is it what Rob found, or something "legit".

I'm apprehensive of the line by line redesign -- I'd thought the tradeoff would lean towards one large transaction as opposed to setting up and tearing down many small ones, but mostly it was about preserving the integrity of the pages.

If it will save much money, however (what the hell are those MySQL slobs doing! Fix the product!) it should be considered.

I'd prefer to switch DB backends, honestly, to something with stored procedures and triggers.


Tom Scola - Jan 08, 2004 12:43:40 pm PST #6499 of 10000
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.

I'd thought the tradeoff would lean towards one large transaction as opposed to setting up and tearing down many small ones,

It's not the transactions that are a problem, it's the opening and closing of connections to the database. You can have many transactions per connection, we just don't currently do it that way, and that's what we need to fix.


§ ita § - Jan 08, 2004 12:46:45 pm PST #6500 of 10000
Well not canonically, no, but this is transformative fiction.

Is this a serious enough issue that you'd recommend changing the flow even if we moved to Postgres? I really want to move, because the triggers and stored procedures, etc. look like they should clear up post numbering issues.


Tom Scola - Jan 08, 2004 12:52:09 pm PST #6501 of 10000
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.

In my mind, the code is a lot more malleable than the database is. There's nothing inherently wrong with the database. In fact, none of our SQL queries will need to change, just how they're executed in the code.


§ ita § - Jan 08, 2004 12:56:08 pm PST #6502 of 10000
Well not canonically, no, but this is transformative fiction.

There's nothing inherently wrong with the database

Well, except it's not as full featured as I'd prefer, and there's that bug Rob found.

I've wanted to change to Postgres since shortly after we went live. It's completely unrelated to the connections issue.

What I'm wondering -- if the back end change is a given, how does that affect your urgency on the page model change?


Tom Scola - Jan 08, 2004 1:00:42 pm PST #6503 of 10000
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.

if the back end change is a given, how does that affect your urgency on the page model change?

The new-connection-for-each-transaction thing needs to be fixed, no matter what database we're using. There should be one sql connection per page view (and if those connections can be cached across page views, even better).

If the loading-the-entire-query-into-memory thing could be fixed for just threadsuck, that would probably be good enough.