Buffistechnology 2: You Made Her So She Growls?
Got a question about technology? Ask it here. Discussion of hardware, software, TiVos, multi-region DVDs, Windows, Macs, LINUX, hand-helds, iPods, anything tech related. Better than any helpdesk!
Can anyone explain Unicode to me? I have 2 requirements for the explanation: (1) pretend I'm a complete moron -- as in, I don't really know *what* Unicode IS (I thought it was a font); and (2) how does it apply to typesetting?
I will be eternally grateful. Every website I've pulled up about Unicode -- including Wikipedia -- has just made my head hurt.
A Unicode character has two bytes, as opposed to one byte for each ASCII character. This allows many more possible Unicode characters (256²?), in order to accommodate the many characters of the many languages of the world.
That's all I know.
Very basically, since computers fundamentally deal with (binary) numbers, computers have had to assign a number to each of the characters on the keyboard so that they can process them.
Back in the Dark Ages of computing, different computer manufacturers used different sets of numbers to encode letters. In the US, they finally put an end to this madness by specifying a standard encoding for all the letters and symbols, called ASCII.
But when computers around the world started connecting to each other on the Internet, another problem became apparent: computers in different countries each used their own encoding for their own alphabets, even if their alphabets used the same symbols.
Unicode is an attempt to provide a single encoding for all the alphabet systems in the world. It's a really big, complicated subject.
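To make the "characters are just numbers" idea concrete, here is a minimal sketch, assuming Python 3. The helper name `code_points` is made up for illustration; `ord` is the built-in that returns a character's Unicode number (its code point).

```python
def code_points(text):
    """Return the Unicode code point (a plain integer) for each character."""
    return [ord(ch) for ch in text]

# Plain English letters get the same numbers ASCII gave them.
ascii_numbers = code_points("Hi")

# An accented e (U+00E9) and the euro sign (U+20AC) fall outside ASCII's
# 0-127 range, but Unicode still assigns each one a single agreed-upon number.
non_ascii_numbers = code_points("\u00e9\u20ac")

print(ascii_numbers)      # [72, 105]
print(non_ascii_numbers)  # [233, 8364]
```

The point is only that every character, in any alphabet, gets one agreed-upon number; how those numbers are stored in a file is a separate question.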
A Unicode character has two bytes, as opposed to one byte for each ASCII character. This allows many more possible Unicode characters (256²?), in order to accommodate the many characters of the many languages of the world.
Unicode characters can be as long as 21 bits. [link]
Unicode characters can be as long as 21 bits.
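For anyone who wants to check the bit count rather than take it on faith, here is a small sketch, assuming Python 3. `sys.maxunicode` is the largest code point Python supports (U+10FFFF, the top of the Unicode range); the variable names are just for illustration.

```python
import sys

# Largest code point in the Unicode range: U+10FFFF.
largest_code_point = sys.maxunicode

# Number of bits needed to write that number in binary.
bits_needed = largest_code_point.bit_length()

# A character beyond the two-byte range (an emoji at U+1F600) shows why
# "every Unicode character is two bytes" cannot be the whole story.
emoji = "\U0001F600"
fits_in_two_bytes = ord(emoji) <= 0xFFFF

print(hex(largest_code_point))  # 0x10ffff
print(bits_needed)              # 21
print(fits_in_two_bytes)        # False
```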
Oh.
Now I know I knew less than I thought I knew.
Unicode is an attempt to provide a single encoding for all the alphabet systems in the world
How does/would/will that affect typesetting? We're switching to a paperless system of editing (which might be the straw that breaks my back and gets me to quit), and my boss keeps saying "At the seminar, they said Unicode was important -- do we *have* Unicode? Or do we need to tell the authors *they* need Unicode?"
She has no idea what it is, and seems to think I should.
Basically, we're going to give ourselves carpal tunnel syndrome and eyestrain by editing everything on the computer, in Word. Then the Word files get dumped into Quark (or InDesign, because we might as well change EVERYTHING ALL AT ONCE AND OH NO THAT WON'T CAUSE ANY SNAGS NOT ONE BIT) for layout.
How does Unicode come into play there? Or does it even?
The very, very short answer to your question, Steph.
Whatever tool you're using to edit your documents ought to give you the option to save your documents using a Unicode encoding. If you're using XML, they're probably already Unicode to begin with.
There are several different ways of storing Unicode files; the most typical ones are UTF-8 and UTF-16. It probably doesn't matter which one you pick, as long as each stage of your workflow is aware of which encoding you're using.
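A rough sketch of why every stage needs to agree on the encoding, assuming Python 3. The sample word and variable names are made up for illustration, and Latin-1 stands in for "a tool that guessed the wrong encoding".

```python
text = "caf\u00e9"  # "café", with an accented e (U+00E9)

# The same four characters become different bytes under each encoding.
utf8_bytes = text.encode("utf-8")       # accented e takes 2 bytes
utf16_bytes = text.encode("utf-16-le")  # every character here takes 2 bytes

# A stage that knows the file is UTF-8 gets the original text back.
read_correctly = utf8_bytes.decode("utf-8")

# A stage that wrongly assumes Latin-1 turns the accented e into two
# junk characters instead of reporting an error.
read_wrongly = utf8_bytes.decode("latin-1")

print(len(utf8_bytes), len(utf16_bytes))  # 5 8
print(read_correctly == text)             # True
print(read_wrongly == text)               # False
```

That garbling of accented letters and special symbols is the usual symptom when one program in the chain assumes a different encoding than the one the file was saved in.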
Whatever tool you're using to edit your documents ought to give you the option to save your documents using a Unicode encoding. If you're using XML, they're probably already Unicode to begin with.
Um. So a Word document has the option of being saved as Unicode?
So a Word document has the option of being saved as Unicode?
If you're saving it as a .doc file, then no. But your tools that read .doc files ought to be able to make sure that all the characters are encoded properly.
If you're saving it as a .txt file, then Unicode would be one of the options for text files.
If you're saving it as a WordML (XML) file, then you get Unicode for free.
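For the plain .txt case, this is roughly what "saving as Unicode" means under the hood. It is only a sketch, assuming Python 3; the helper `save_and_reload` is hypothetical and has nothing to do with how Word itself does it. The text is written with a named encoding and read back with the same one.

```python
import os
import tempfile

def save_and_reload(text, encoding="utf-8"):
    """Write text to a temporary .txt file, then read it back, using one encoding."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)  # clean up the temporary file

# Curly quotes and an i with a diaeresis: typical typesetting characters.
sample = "na\u00efve \u201cquotes\u201d"

print(save_and_reload(sample, "utf-8") == sample)   # True
print(save_and_reload(sample, "utf-16") == sample)  # True
```

Either encoding round-trips the text intact, which matches the advice above: the choice matters less than using the same one at both ends.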