File formats – what are they really?

So, you’re ready to write your novel; or you’ve already written it and are about to upload it to a website for publication. We state on our submission page that we accept DOC, RTF (rich text format), and PDF (portable document format) files, but that we suggest DOC as the best format for conversion to the other reading formats (TXT, HTML, ePub, and Mobipocket). And you see all that and wonder what in hell we’re trying to say.

It’s confusing, if you haven’t worked in the computer technology field, to understand what all those acronyms mean (and in fact, they aren’t really acronyms – just a reference to the file extension that your file ends up with when saved on disk). The only one you might be familiar with in all that is DOC – because who doesn’t know that Microsoft Word documents carry a .DOC suffix – right? Well, I am going to attempt to clear up much of that right now for you. So bear with me while I wax technical for a minute.

Back when there wasn’t as much hype about online technology, about the only way to produce a manuscript was to either write it by hand and have someone type it up for you on a typewriter, or experiment with a computer. In the early days, your only options for word processors were Microsoft Word, WordPerfect, or WordStar (and I know I am dating myself when I include WordStar in that list – who even remembers that application?). A lot of people still rely on handwritten or typewritten manuscripts, preferring the sounds of the typewriter keys or pencil on paper to the near-silent clack of a computer key.

Thankfully, technology has come a long way since then. Now we have an almost infinite number of text editors for use on the computer and online when we’re ready to write. Personally, I think I have found the best possible combination of tools for my own writing. (In case you are interested, I work on an Apple computer, using a combination of a desktop Simplenote application called nvALT, and Byword.) I like distraction-free writing because I struggle with distractions ordinarily: Facebook, instant messages, Skype messages, email … and even that pesky spelling/grammar underline mode that shows me when my fingers miss a letter or add one where it ought not to be. So word processors have never been a favorite program for me. (I utterly despise writing in Word – there’s usually far more going on in Word than just accepting keystroke input – though I depend on it heavily if I am editing other people’s manuscripts. That “Track changes” mode is undeniably vital to the editing process.)

So I use a simple text editor that will take certain codes to add in things like links, basic formatting (bold, italics, etc.), and has the ability to publish online if I choose. This leaves me with either a .TXT or a .RTF file as end-product. Using Word will net you a .DOC file. Other advanced programs use file formats that are too complex for simple conversion programs to use. (e.g. Scrivener project files contain far more information than just your words).

Now, from a technical standpoint, .DOC files hold a lot more information than just your words as well. They contain directives which control formatting, margins, page printing mechanisms, typesetting, font control, header and footer information (even when there are no headers or footers), and other extra stuff. But most conversion tools are able to recognize this information, since Word has been around since the very early days of word processing. This is why we say that Word files are the best way to get the best formatting in conversions.

“Why not PDF?”, you might ask. “It’s been around for about as long as Word, hasn’t it?”, you might add. Yes, Adobe’s PDF format has been around for ages as well. The thing is, PDFs contain fixed instructions on how to display text, and with today’s eReaders, that just doesn’t work out very well. Open a PDF on a Kindle or a Nook and you’ll see exactly what I am talking about. It can be practically unreadable (most eReaders now do a fairly decent job of interpreting and displaying PDFs, but it still isn’t ideal).

The Kindles and Nooks and other eReaders of nowadays work off a basic standard of file formatting. Kindle uses a proprietary format derived from Mobipocket, and most other eReaders use a format based on ePub (an open standard developed by the International Digital Publishing Forum, not by Adobe). Both are tailored to be displayed within the confines of the eReaders’ display capabilities. In order to get a file into one of those formats, we need a more pliable file format than the PDF (remember, the PDF has strict instructions on how a file ought to be displayed).

If you’ve written a book laden with images, you might also notice that your eBook looks pretty awful on an eReader as well. That is because most basic eInk eReaders don’t display images very well, or display them only in black and white. They are designed to handle text-only eBooks and don’t expect a book with lots of images to be a part of the plan. Of course, this is not to say that you should not design your eBook with advanced devices in mind (iPad, Kindle Fire, Galaxy tablet, etc.). All I’m saying is that not everyone is likely to have one of those devices to read on and that we ought not to leave the basic eInk eReaders out in the cold.

I may be rambling on a bit too long now, so let me wrap this up by noting the following points:

  • If you’re writing a novel, think about making the words be the art in your work instead of adorning it with multiple colorful images or fanciful fonts. A good number of readers will actually miss out on all that because most conversions end up simplifying what the text looks like so that it will read best on an eInk eReader.
  • Forget about headers and footers because they are items that either disappear when converted, or end up breaking your text up in odd places on an eReader.
  • Page numbers too – because most eReaders redefine the size of a page based on the text size that the reader chooses. (e.g. sometimes when my eyes are tired, I increase the font size in the book I am reading on my Kindle so that the words are larger – doing that changes how much text fits on each page.)
  • When in doubt, save (or export) your book as a text document and see how it looks there – that will give you an extremely basic idea of what it will eventually look like on an eInk eReader (just remember to re-add your basic font and page-break formatting, then save as an .RTF or a .DOC file when you’re done). Which then brings me to …
  • Font choices: look up (using your favorite search engine) “web safe fonts” for ideas and stick to those choices as much as possible. (And incidentally, Comic Sans MS might be on the list of web-safe fonts, but it is actually a font much reviled in the writing community).
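
For the technically curious, the point about page numbers can be shown with a tiny Python sketch. This is only a rough illustration, not how any real eReader lays out text: the function name `paginate` and the numbers used for line width and lines per page are made up for the example. It wraps the same text to two different "screen" settings and counts the resulting pages.

```python
import textwrap

def paginate(text, chars_per_line, lines_per_page):
    """Wrap text to a line width, then group the wrapped lines into pages.

    Hypothetical helper for illustration only; real eReaders use far
    more sophisticated layout engines.
    """
    lines = textwrap.wrap(text, width=chars_per_line)
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]

# The same 200-word "manuscript" in both cases.
sample = " ".join(["word"] * 200)

# Small type: more characters per line and more lines per page (assumed values).
small = paginate(sample, chars_per_line=40, lines_per_page=10)

# Large type: fewer characters per line and fewer lines per page (assumed values).
large = paginate(sample, chars_per_line=20, lines_per_page=5)

print("pages at small type:", len(small))
print("pages at large type:", len(large))
```

The text never changes, but the page count does, which is why a fixed "page 57" in your manuscript means nothing once a reader bumps up the type size.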

When you’re done, feel proud that you have safely navigated the somewhat messy waters that are file formats.
