If you missed the first two parts, you can find them here: Part 1 | Part 2.
When we last left off, we had just wiped out all of the formatting from our Word document and copied it into a text editor, such as Notepad. Here's what my document looks like in the text editor:
I'm using a code editor called 'Sublime Text 2' for my text editor, which is more high-powered than you need right here. (I use it because it's what I use for other things, like website development. You will be fine at this stage if you're using Notepad or TextEdit.)
Now that we've got the text in a text editor, we've effectively removed all of the crud that we managed to fill it with when we first wrote our story in Word. But we still need to tidy it up, and then, if we're going to upload a Word document to Smashwords or Amazon, or if we're going to create a PDF version of our ebook, we're going to have to reformat it all.
That means going back to Word again.
Just a glance at the story I'm formatting shows that it's still a bit messy. For instance, there's a space before the © symbol, and a blank line after the copyright line. We don't want those.
At which point, you might be saying, "Huh? Don't want blank lines? Won't that make everything look crap?*" (*You may choose not to swear, of course.)
Well, it would. Except blank lines can make a real mess of ebooks. Smashwords suggest that you don't have more than four paragraph returns (four presses of the 'Enter' key) to create space. I'm going to go further: don't have any at all. We have a far better way of creating the space we need, and we'll come to that later.
Right now, we're going to get rid of all the unnecessary stuff like extra spaces and blank lines, and do some other tidying-up and conversion at the same time.
At this point, I should note that Smashwords has an excellent and thorough guide for formatting documents for Smashwords. If you are going to upload a Word document to Smashwords, you should familiarise yourself thoroughly with it.
But we want to do more here. We want Word versions for Smashwords and Amazon, and we also want epub and mobi versions, so we'll go about things a little differently.
First thing, though, is to get back to Word.
Open a brand-new, blank document in Word.
Copy all the text from your text editor, and paste it into the blank Word document. Now save it.
It should look something like this, depending on how your blank template is set up in Word (mine has double line spacing and indented paragraphs; yours may be different; it doesn't really matter at this stage):
Right now, it probably doesn't look much different from the way it did when you last had it in Word, which might make the process seem a little redundant. But you can be absolutely sure now that there are no lingering, ugly bits of formatting that would mess you up at a later stage.
Some people indent their paragraphs using the 'tab' key. Even if you don't usually, there might still be a few places where you've done that.
The process we've followed so far won't have got rid of your tabs, and we don't want them in our ebooks, so let's get rid of them first.
Open up the 'Replace' function in Word. In the Find box, type exactly this:
^t
(That's Word's code for a tab character.)
The Replace box should be completely empty.
Make sure you don't have any formatting options, like bold or italic, still set in these boxes from earlier in the process. If you do, clear that formatting first.
Your Find and Replace should look like this:
Click 'Replace All'.
Your tabs should now all be gone.
Next up, we want to get rid of any extra spaces that we've got. These might be at the beginning or end of lines, or you may be in the habit of typing two spaces after a full-stop (period). Maybe you use spaces to indent your paragraphs. Or you might just have stuck some in by mistake or to space things out.
Whichever, we don't want them. Not in ebooks.
First, we'll get rid of any multiple spaces.
Again, go to 'Replace' in Word.
Make sure there's nothing at all in either the Find or Replace boxes. No formatting. No typing. Nothing. Put your cursor in each of the boxes in turn and delete anything in there.
Now, go to the Find box and hit the space bar twice.
Then, go to the Replace box and hit the space bar once.
Click on 'Replace All'.
You may well have to do this several times. Keep going until Word finds nothing more to replace. I had to do it three times with the document I'm working with before Word said it found 0 replacements.
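If you're wondering why one click isn't enough, here is a rough sketch of the same repeated replace in Python. It is only an illustration, not part of the process: the function name collapse_spaces and the sample sentence are made up, and it just imitates what 'Replace All' appears to do on each pass.

```python
def collapse_spaces(text: str) -> tuple[str, int]:
    """Replace double spaces with single spaces until none remain.

    Returns the cleaned text and the number of passes that changed it,
    which corresponds to the number of useful 'Replace All' clicks.
    """
    passes = 0
    while "  " in text:
        # One pass only halves each run of spaces, so long runs need more passes.
        text = text.replace("  ", " ")
        passes += 1
    return text, passes


sample = "One.  Two.     Three."  # hypothetical text: a run of 2 and a run of 5 spaces
cleaned, passes = collapse_spaces(sample)
print(repr(cleaned), passes)
```

Each pass shrinks a run of spaces rather than removing it outright, which is why a long run takes several goes before nothing is left to replace.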
Spaces at the end of lines
In general, spaces at the end of lines won't cause you too many problems, because they'll be ignored. But to be really sure, let's get rid of them anyway. It won't take long.
'Replace' is your friend again.
As before, make sure there's absolutely nothing typed in the Find or Replace boxes, and no formatting.
In the Find box, hit the spacebar once, then type the following exactly:
^p
In the Replace box, type the following exactly (without any spaces):
^p
Then click 'Replace All'. The ^p that you typed is the symbol for a paragraph return in Word. We are searching the Word document for any paragraph return that is preceded by a space, and then we are replacing it with just the paragraph return, sans spaces.
Spaces at the beginning of lines
More important are the spaces at the beginning of lines, like the one I've got at the beginning of the copyright statement. These will make a mess of your formatting. The procedure is very similar.
Go to 'Replace'.
Clear out everything that's in there, including spaces.
In the Find box, type ^p and hit the spacebar once.
In the Replace box, type ^p without any spaces.
Click 'Replace All'.
I had a lot of these spaces at the beginning of lines. (Tut-tut!). Now I have none. (Yay!)
This time, we are searching for any paragraph returns that are immediately followed by a space (at the beginning of the new paragraph), and we are replacing them with just the paragraph return. No space.
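For anyone who finds it easier to read code than dialog boxes, the two replacements can be sketched in Python like this. It is an illustration under assumptions: a paragraph return is modelled as "\n", and the function name strip_edge_spaces and the sample text are invented for the example.

```python
def strip_edge_spaces(text: str) -> str:
    """Remove spaces directly before and after each paragraph return."""
    # " \n" -> "\n" mirrors Find: space + ^p, Replace: ^p (end of line).
    # "\n " -> "\n" mirrors Find: ^p + space, Replace: ^p (start of line).
    # The loop repeats in case several spaces sit together.
    while " \n" in text or "\n " in text:
        text = text.replace(" \n", "\n").replace("\n ", "\n")
    return text


sample = "Title \n \u00a9 2013 Author\nChapter One \n"  # hypothetical front matter
print(repr(strip_edge_spaces(sample)))
```

Note that a space at the very start of the document has no paragraph return in front of it, so neither this sketch nor the Word replace will catch it; delete that one by hand.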
Remember we don't want blank lines either. So, we're going to get rid of them.
The way we do this is, as usual, with 'Replace'. Open up the Replace function in Word. In the Find box, type exactly this:
^p^p
In the Replace box, type exactly this:
^p
Click 'Replace All'.
Remember, the ^p that you typed is the symbol for a paragraph return. Twice means that there are two paragraph returns, one directly after the other, which means a blank line. We replace the two paragraph returns with a single one.
You need to repeat this process until there are no more results of the Find and Replace. If you've used a lot of paragraph returns at various points, this might take a few goes.
When you're done, have a quick scroll through your document. There shouldn't be any blank lines anywhere.
If there are any, and you've followed all of this, that probably means there's some other formatting in the way. Click on the show / hide formatting symbol in the toolbar on Word. It's the button marked with a pilcrow (¶).
Scroll through and look at the formatting. Paragraph returns are shown with that same ¶ symbol. If you have two together anywhere, just manually delete one.
Hopefully, you won't have to do that. I find that my version of Word makes me do this if there are two paragraph returns right at the end of the document. Who knows why?
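As a rough picture of what this step does, and of why a stubborn blank line can survive it, here is a small Python sketch. It assumes a paragraph return behaves like "\n"; the function names and the sample text are hypothetical.

```python
def remove_blank_lines(text: str) -> str:
    """Collapse consecutive paragraph returns into one (the ^p^p to ^p step)."""
    while "\n\n" in text:
        text = text.replace("\n\n", "\n")
    return text


def leftover_blank_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that still look blank.

    A line holding only a tab or a non-breaking space is not two returns
    in a row, so the replace above skips it. These are the lines you
    would spot with show / hide formatting switched on.
    """
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if line.strip() == ""]


sample = "Chapter One\n\n\n\nIt was a dark night.\n\t\nThe end.\n"
cleaned = remove_blank_lines(sample)
print(repr(cleaned))
print(leftover_blank_lines(cleaned))
```

In the sample, the line containing only a tab is left behind, because there is something between its two paragraph returns.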
Sorting out italics
Now, you'll recall that we replaced our italics with placeholders, so that we could easily reinsert them. Everything that should be italic starts with <em> and ends with </em> (or whatever other placeholder you chose). We need to reverse that.
But first, we're going to tidy up any poor formatting around the italics.
We want to make sure that the text, and only the text (plus any spaces between words), is italicised. For example, we don't want spaces at the beginning or end of the italicised text to be included, and we don't want paragraph returns inside the italics either. Doing this will mean that, when we convert to epub later on, there are fewer issues to deal with.
We're going to use Find and Replace again, of course.
First up, in the Find box type the following:
<em> and then hit the spacebar.
In the Replace box, hit the spacebar then type the following:
<em>
Click 'Replace All'.
Next, in the Find box hit the spacebar then type:
</em>
In the Replace box type:
</em> then hit the spacebar.
Click 'Replace All'.
Now the line breaks.
In the Find box, type:
<em>^p
In the Replace box, type:
^p<em>
Click 'Replace All'.
In the Find box, type:
^p</em>
In the Replace box, type:
</em>^p
Click 'Replace All'.
Phew. That was kind of exhausting.
Now do exactly the same with bold. (Remembering that we used <strong> and </strong> to wrap the bold text.)
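To show what this tidying is aiming for, here is a minimal Python sketch that moves spaces and paragraph returns outside the placeholders. Treat it as my reading of the steps rather than a record of them: the four find/replace pairs, the function name tidy_placeholders and the sample sentences are all assumptions, and a paragraph return is modelled as "\n".

```python
def tidy_placeholders(text: str, tag: str = "em") -> str:
    """Move spaces and paragraph returns from inside a placeholder pair to outside it."""
    open_tag, close_tag = f"<{tag}>", f"</{tag}>"
    swaps = [
        (open_tag + " ", " " + open_tag),      # space just after the opening marker
        (" " + close_tag, close_tag + " "),    # space just before the closing marker
        (open_tag + "\n", "\n" + open_tag),    # return just after the opening marker
        ("\n" + close_tag, close_tag + "\n"),  # return just before the closing marker
    ]
    for find, replace in swaps:
        text = text.replace(find, replace)
    return text


print(tidy_placeholders("She read<em> The Times </em>daily."))
print(repr(tidy_placeholders("<strong>Chapter One\n</strong>It began.", tag="strong")))
```

The same function handles bold by passing tag="strong", which is all "do exactly the same with bold" amounts to.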
Getting back the italics
We've tidied up the italics, but now we actually have to turn them into italics again. This may appear a little more complex, but it's really quite easy. Just type exactly what I show, and it'll work.
As usual, go to 'Replace' in Word.
In the Find box, type exactly this:
\<em\>*\</em\>
Click on the down arrow (if you haven't already) to show the further options. Click on the 'Use wildcards' checkbox, to make sure it is checked. (Remember, your version of Word may do this differently, but you'll be able to select 'Use wildcards' as an option somehow. Use your 'help' button if necessary.)
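In the Replace box, type exactly this:
^&
(That's Word's code for "whatever text was found", so the text itself stays as it is and only the formatting changes.)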
Still in the Replace box, choose 'Font' from the 'Format' dropdown, and then click on Italic and OK.
The appropriate parts of the manuscript should now be italicised. You'll note, though, that we still have the <em> and </em> markers:
Good old Find and Replace will sort that out.
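In the Find box, type the following (untick 'Use wildcards' first, otherwise Word treats the angle brackets as special characters):
<em>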
In the Replace box, clear the formatting (click the 'No formatting' button, or similar) and delete everything that is in the box, so it is blank (not even spaces).
Do the same, but with </em> in the Find box.
Sorted. You now have italics back, without any errors at all in the formatting.
You need to do the same with bold. You can probably figure out how. (Hint: in the Find box, you'll be typing: \<strong\>*\</strong\> ).
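If the wildcard search looks like magic, it may help to see a rough equivalent as a Python regular expression. This is a sketch, not what Word does internally: it assumes the * wildcard stops at the first closing marker it meets (a non-greedy match), and the function names and sample sentence are invented.

```python
import re


def marked_spans(text: str, tag: str = "em") -> list[str]:
    """Return the pieces of text sitting between <tag> and </tag> markers."""
    # (.*?) is non-greedy, so each match ends at the nearest closing marker.
    pattern = re.compile(rf"<{tag}>(.*?)</{tag}>")
    return pattern.findall(text)


def strip_markers(text: str, tag: str = "em") -> str:
    """Delete the markers themselves, as the final two replaces do."""
    return text.replace(f"<{tag}>", "").replace(f"</{tag}>", "")


sample = "He read <em>Dune</em> and <em>Emma</em> twice."  # hypothetical sentence
print(marked_spans(sample))
print(strip_markers(sample))
```

The first function finds what Word would italicise; the second shows the text once the markers have been removed.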
There's some more tidying up to do before we style this document, but this has been a long entry, so let's leave it for next time.
If you're getting stuck or confused, feel free to ask questions in the comments!
Part 4 is now available: read part 4 here.
(Note: if you are interested in hiring me for ebook cover design or ebook formatting, you can see samples of my work here: http://www.50secondsnorth.com/ebooks/ and see details (including cost) of my services here: http://www.50secondsnorth.com/ebooks/details-rates.html)