GUIDELINES UPDATE: As of 16th November 2011, new guidelines for dashes and punctuation marks have been implemented for both the online text correction and online moderation projects. Please see below for further information or contact us if you have any queries.



Accents – Advertisements – Christmas Numbers – Currency symbols – Dashes – Font sizes – Footnotes – Headings – Household Narrative: currency symbols – Household Narrative: first pages – Household Narrative: fractions – Household Narrative: tables – Household Words Almanac: images – Household Words Almanac: calendar days of the week – Household Words Almanac: textflow – Hyphenation – Images – Italics – Line-length – Line-breaks (double) – Line-breaks (enforcing) – Logged in? – Moderation – Missing text – Paragraphing – Poems/poetry – Punctuation – Saving pages (glitch) – Selecting a 2nd magazine – Spacing – Spelling mistakes – Symbols – Tables and charts – Titles and Headings – Unusual characters


Accents, how to insert?

See under ‘Unusual characters’ etc.


Advertisement(s), do I retain?

Sometimes advertisements appear on the final page of a magazine. Please do not remove these but retain the text as it appears in the original (it is not necessary to replicate the original layout: just place the entire text of each advertisement in a single paragraph box).


Christmas Numbers, should I retain the masthead?

Yes, please retain the masthead, title and index on the cover page of Extra Christmas Numbers.


Currency symbols, how to present?

Dickens’s journals quote British currency in pounds, shillings and pence, using the symbols/abbreviations l. or ₤ for pounds, s. for shillings, and d. for pence. The abbreviations ‘s.’ and ‘d.’ should be reproduced in italics, and the ‘₤’ symbol (from the unusual characters sub-menu) should always be placed in front of a numerical sum of money, rather than an ‘l.’ in front of or following it. TTS rendition will make sense of the pound symbol. The same applies to dollar sums. (This is particularly important when working on the final page of issues of the Household Narrative, which abounds in currency abbreviations.)
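
To illustrate the rule above, here is a minimal sketch of how a trailing ‘l.’ could be turned into a leading pound symbol. It is only an illustration and not part of the DJO editor: the function name `pounds_to_symbol` is hypothetical, and the sketch assumes the ‘l.’ always follows the figure directly (as in ‘5l.’). It does not handle an ‘l.’ placed in front of the sum, and it cannot apply italics to ‘s.’ and ‘d.’.

```python
import re

POUND = "₤"  # the symbol offered in the unusual characters sub-menu

def pounds_to_symbol(text):
    """Move a trailing 'l.' after a figure to a leading pound symbol.

    Assumption: the abbreviation follows the number, e.g. '5l.' or '5 l.'.
    """
    return re.sub(r"\b(\d[\d,]*)\s?l\.", lambda m: POUND + m.group(1), text)

print(pounds_to_symbol("paid 5l. 10s. 6d."))
```

Note that the full stop of the ‘l.’ is consumed along with the abbreviation, so a sum at the very end of a sentence would need its closing full stop checked by hand.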


Dashes (double hyphens, ‘em’ dash, ‘en’ dash etc.), why missing and how to insert?

IMPORTANT NOTE: As of 16th November 2011, the guidelines for dashes have changed. Please read the guidelines below carefully. Corrections begun BEFORE 16th November should continue to follow the old rule.

 

NEW RULE (after 16th Nov): OCR very seldom, if ever, reads horizontal lines – you will notice therefore that most divisions between articles are missing, and also most dashes. There is no need to reproduce divisions between articles, but please do reproduce/insert a long (‘em') dash (—) with no spaces on either side as appears in the body of the original text. You can find a long ('em') dash from the Symbol/Unusual characters menu (Ω) or by clicking the em dash icon at the bottom of the editing panel. Once you have inserted one, copy it (by highlighting it and pressing Ctrl + c) and paste it (by pressing Ctrl + v) to repeat insertions. Where the page image shows an even longer dash, if necessary, use a double (or even triple) long ('em') dash.


OLD RULE (before 16th Nov.): OCR very seldom, if ever, reads horizontal lines – you will notice therefore that most divisions between articles are missing, and also most dashes (double hyphens, ‘em’ dashes, ‘en’ dashes, etc.). Please insert a double hyphen with a space on either side ( -- ), in place of a long ('em') dash (—). Alternatively, you can insert an 'em' dash (with no spaces on either side) or 'en' (–) dash (with spaces on either side) from the Symbol/Unusual characters menu (and copy+paste to repeat insertions). Where the page image shows an even longer dash, if necessary, use a triple or quadruple hyphen. In due course, we shall automate the replacement of the double hyphens with 'em' dashes.

Note: the compositors of Dickens’s journals made very extensive use of dashes, long and short, to help them adjust and even out the relatively short lines of the Household Words and All the Year Round columns. Do not worry about trying to reproduce end-of-article division lines.
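
The old rule mentions that double hyphens will eventually be replaced with 'em' dashes by automated means. The following is only a rough sketch of what such a replacement might look like, not the project's actual script. The function name `hyphens_to_em_dashes` is hypothetical, and mapping a triple or quadruple hyphen to a double 'em' dash is an assumption based on the new rule's advice for longer dashes.

```python
import re

EM_DASH = "\u2014"

# A run of 2-4 hyphens, with an optional space on either side.
# Longer runs (e.g. a ruled line of hyphens) are left untouched.
HYPHEN_RUN = re.compile(r"(?<!-) ?(-{2,4}) ?(?!-)")

def hyphens_to_em_dashes(text):
    """Replace old-rule hyphen runs with unspaced 'em' dashes."""
    def replace(match):
        run_length = len(match.group(1))
        # Assumption: 2 hyphens -> one em dash; 3 or 4 -> a double em dash.
        return EM_DASH * (1 if run_length == 2 else 2)
    return HYPHEN_RUN.sub(replace, text)

print(hyphens_to_em_dashes("reads horizontal lines -- you will notice"))
```

Single hyphens, as in road-mender, are not matched, so ordinary hyphenated words are unaffected.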


Font sizes, and families: reproduce or not?

Given that readers who can view the page image will have access to an exact replica, there is no need to attempt to style fonts according to size or family. A plain text in a uniform size of sans serif font is all that is aimed at in manual text correction. So, for example, the Gothic font used on the masthead of All the Year Round should appear in the same font as everything else.


Footnotes

Occasionally you will come across footnotes (maximum one or two per magazine), which, if they come at the foot of column 1, break the flow of text in the main body of a page. For the time being, we suggest relocating the footnote to the bottom of the paragraph of body text in which the footnote symbol (usually '*') is inserted, using the 'cut and paste' keys: select the footnote text, press Control+X to cut it, then Control+V to insert it in its new position.

If the text of a footnote runs across two pages, please move the text from the latter page onto the previous page, so that the entire text of the footnote runs on from the bottom of the paragraph of the body text in which the footnote symbol is inserted.


Headings

(see ‘Titles and Headings’)


Household Narrative: first pages

For the first twenty-four months of its publication, the Household Narrative ran as its opening articles an overview essay on the month's most significant events in England, Scotland and Wales, called 'THE THREE KINGDOMS'. These articles (probably penned for the most part by Dickens's friend, the journalist and historian John Forster) run across the whole width of the page, rather than in two narrow columns, and they occupy the first few pages of each of these 24 issues. For text correctors with moderator status (the correction of Narratives is restricted to the latter), the correction of these articles does not require any different set of procedures, but we recommend you take a little time to optimise the page view, both in 'Edit' mode, and again, in 'Read' mode. Use the corner icons in either the facsimile or the edit panel, or both, to adjust the width of each so that the lines do not wrap in an unnatural way. Please take extra care with lineation, so that each line in the transcript consists of the same set of words as in the facsimile. When you reach the end of 'THE THREE KINGDOMS', the page will revert to the usual 2-column format with which you are familiar.

Household Narrative: fractions

The final pages of the Household Narrative abound in fractions, the majority of which cannot simply be inserted by means of the Unusual Characters / Symbols menu Ω. For example, here is a line from the table headed 'RAILWAY' in the second column of p. 24 of the first number of Volume I of the Narrative:

Fractions1
And here is how it was represented at first in the uncorrected transcript:

3318    South Eastern            205/8               185/8       201/4

To create the required fractions, simply add { } around the relevant numbers, with no spaces, thus:

33{1/8} South Eastern            20{5/8}            18{5/8}    20{1/4}

On pressing 'Save Now', each group of numbers within curly brackets will be replaced by the same numbers shaded grey, thus:

Fractions2

 On pressing 'Exit', these shaded groups will convert to fractions, thus:

Fractions3
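
For anyone curious how the curly-bracket markers might be processed, here is a small sketch. It is not the code used by the site, whose conversion step is not documented here. The function names are hypothetical, and rendering with HTML `<sup>`/`<sub>` tags is an assumption chosen purely for illustration. The check for unbalanced brackets may be handy before saving, since a marker typed as {1/8) will not be recognised.

```python
import re

# A marker is a numerator and denominator inside curly brackets, no spaces.
FRACTION_MARKER = re.compile(r"\{(\d+)/(\d+)\}")

def find_fraction_markers(line):
    """Return (numerator, denominator) pairs for each well-formed marker."""
    return [(int(n), int(d)) for n, d in FRACTION_MARKER.findall(line)]

def has_unclosed_marker(line):
    """Crude check: opening and closing curly brackets should pair up."""
    return line.count("{") != line.count("}")

def render_fractions(line):
    """Illustrative rendering only: superscript/subscript HTML fractions."""
    return FRACTION_MARKER.sub(
        lambda m: f"<sup>{m.group(1)}</sup>&frasl;<sub>{m.group(2)}</sub>",
        line,
    )

row = "33{1/8} South Eastern 20{5/8} 18{5/8} 20{1/4}"
print(find_fraction_markers(row))
print(has_unclosed_marker("33{1/8) South Eastern"))
```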


Household Narrative: table template: STOCKS

 

| STOCKS                         | Highest | Lowest | Latest |
| Three per Cent. Consols        |         |        |        |
| Three per Cent. Reduced        |         |        |        |
| Three and a quarter per Cents. |         |        |        |
| Long Annuities,                |         |        |        |
| Bank Stock                     |         |        |        |
| Exchequer Bills                |         |        |        |
| India Bonds, ₤1000             |         |        |        |

Household Words Almanac: calendar days of the week

The first page of each month in the 1856 Almanac presents, in the middle of the page, a table containing key events occurring on each day of the week, surrounded by a decorative border. Please reproduce this table in a single column. For reasons of economy, the compositors abbreviate the days of the week: M for Monday, Tu for Tuesday, W for Wednesday etc. As we have no limits on line length for this feature, it makes sense (for TTS etc.) to spell out the days of the week in full. Spaces can be inserted to even up the columns or, if you prefer, a three-column table can be inserted, so long as the end result is not ragged to the eye. For consistency, please also embolden 'SUNDAY.'

Household Words Almanac: textflow

As you will see from consulting the image pages for the 1856 Almanac, the left and right hand pages for each month are, in terms of layout and textflow, designed as a spread. For better or worse, we have decided that we cannot reproduce this, so the text transcript has, from the outset, some dead ends and flow issues. Where, in the 1856 Almanac, the double column at the foot of the verso/LH page flows direct to the foot of the recto/RH page, we have followed this in laying out the blocks of OCR text for correction. This means that the transcript for rectos almost always starts at the foot of the page, moves to the top, then flows down to the end of the section headed 'SERVICEABLE INFORMATION.' Please do not adjust or re-arrange the order of text blocks in the Almanac pages: these have been placed deliberately. Markers in curly brackets, such as '{continued from foot of previous page}', have been inserted to signal the ordering, and to assist audio listeners to make slightly better sense of the movement. Given that the content of the Almanac consists of an aggregation of factual information, the non sequiturs matter a good deal less than they would, say, in discursive prose.

 

Hyphenation, should I remove?

Yes, where a single word has been broken in two, across two lines. With their page layout of 2 narrow columns, Dickens's journals frequently needed to hyphenate in order to balance line lengths. The corrected text need not preserve this, and it will make TTS rendition, word searches and various other kinds of text-mining run better if it is removed. As a rule of thumb, take UP into the preceding line the half-word from the next line IF the number of letters in it is less than half of the total number of characters in the complete word; move DOWN into the next line the half-word from the preceding line IF that half is the shorter one. If it is evenly balanced, then you may decide! Where you see a hyphenated double word broken over two lines (e.g. road-mender, hand-picked, organ-grinder) you should not remove the hyphen, and may leave the structure as it is.

Also, if a single word is broken in two and hyphenated across two pages then please apply the above rule and move the entire word onto whichever page holds the majority of its letters.

Example 1:
...ran off at once in a north-
erly direction, thinking of...
>>
...ran off at once in a northerly
direction, thinking of...
Example 2:
undaunted, he went to the in-
firmary and requested
>>
undaunted, he went to the
infirmary and requested
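
The rule of thumb above can also be expressed as a short sketch, shown here only to make the logic explicit; correction itself is done by hand in the editor. The function name `rejoin_hyphenated` and the `is_compound` flag are hypothetical. Sending an evenly balanced word UP is an assumption, since the guideline leaves that case to the corrector, and the sketch counts any punctuation attached to the second half as letters.

```python
def rejoin_hyphenated(upper, lower, is_compound=False):
    """Rejoin a word split by a hyphen across two lines.

    upper: the line ending in a hyphen, e.g. '...in a north-'
    lower: the following line, e.g. 'erly direction, ...'
    is_compound: True for double words such as 'road-mender',
                 which keep their hyphen and are left as they are.
    """
    stripped = upper.rstrip()
    if is_compound or not stripped.endswith("-"):
        return upper, lower
    # Split off the half-word before the hyphen and the half-word after it.
    head, _, first_half = stripped[:-1].rpartition(" ")
    second_half, _, tail = lower.lstrip().partition(" ")
    word = first_half + second_half
    if 2 * len(second_half) <= len(word):
        # The second half is the shorter (or equal): take it UP.
        return (head + " " + word).strip(), tail
    # The first half is the shorter: move it DOWN.
    return head, (word + " " + tail).strip()

print(rejoin_hyphenated("...ran off at once in a north-",
                        "erly direction, thinking of..."))
print(rejoin_hyphenated("undaunted, he went to the in-",
                        "firmary and requested"))
```

Run on the two examples above, the sketch reproduces the corrected line pairs shown after each '>>'.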

Images

Images are rare across Dickens's suite of publications, but are very common in the Household Words Almanac. For obvious reasons, they cannot be reproduced in the transcript, but it is important to signal their presence. A marker should have already been inserted for you, beginning {image:} followed by a category description, e.g. {image: 'historiated initial letter 'X', showing ...?}, and we would like you to insert a four- to ten-word description (i.e. as brief as possible) after the colon, e.g. {image: 'historiated initial letter 'T', showing a mare with her foal}. If the image contains text, e.g. handwriting, use {image: text: contents}, replacing contents with the actual wording, or an approximation of the wording. Thus, when the page is being listened to (using TTS), the listener will be able to acquire the gist of what the image signifies, and anyone searching the archive will be able to find all the images by searching under {image:.
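
As an illustration of why a consistent marker helps searching, here is a minimal sketch that pulls every image marker out of a transcript. It does not describe the archive's own search. The function name `image_descriptions` is hypothetical, and the sketch assumes that markers never contain nested curly brackets.

```python
import re

# Everything between '{image:' and the closing curly bracket.
IMAGE_MARKER = re.compile(r"\{image:\s*([^{}]*)\}")

def image_descriptions(transcript):
    """Return the text of each {image: ...} marker, in page order."""
    return [found.strip() for found in IMAGE_MARKER.findall(transcript)]

page = (
    "JANUARY. {image: 'historiated initial letter 'T', "
    "showing a mare with her foal} he first month "
    "{image: text: Merry Christmas}"
)
for description in image_descriptions(page):
    print(description)
```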


Italics, when to reproduce?

Italics should be reproduced whenever possible. To insert italics, either press Ctrl+i before commencing to type the requisite word(s) and press Ctrl+i again afterwards; or type normally, then go back and select the word(s) to be italicised, and press Ctrl+i.


Lines, length of

Please reproduce the line length as it appears in the original scanned copy (with the exception of words split over two lines/pages by a hyphen — see above).  This can be done by pressing SHIFT + ENTER after the appropriate word.


Linebreaks, double, how to reproduce

Please remove double linebreaks that are not present in the body text of the page image: for example, after the column break at the foot of the first column of a page of text, or within the stanzas of a poem (stanzas should be separated by paragraphs, however). The way OCR makes sense of line breaks, and how the transcript displays them, can appear arbitrary, and often at variance with the page image. On the one hand, this is unlikely to affect TTS rendition, but on the other such variance, if unexplained, can look random and unnecessary.


Linebreaks, how to enforce?

Often, in attempting to remove an unnecessary double or single linebreak, you will find the lower line is taken back up into the line before. To enforce a single linebreak, place your cursor where the line ought to end (check against page image) and press Shift + Return (or Shift + Enter). Repeat, to insert a forced double linebreak.


Logged in? or Logged out?

As of 08 August 2011, you will not be logged out of your session, and an Autosave (set to kick in after 5 minutes) has been installed. Nevertheless, our advice remains simple: use the 'Save Now' button frequently during page correction, and always after returning from any mid-session breaks.


Missing text, what to do about?

Occasionally (rarely) the OCR will simply miss text: perhaps a word at the start of a line in col. 1 of a recto or at the end of a line in col. 2 of a verso – i.e. close to the gutter. Simply insert the word or words as appropriate. If a page has lots of ‘missing’ words, it may be that the OCR has misread whole chunks of the 2-column structure of the page – if so, text will not be missing so much as mixed up. See the advice about content order on the main ‘How do I Select and Correct a Magazine?’ page. Using the cut and paste shortcuts (Control + X to cut selected text; Control + V to paste it), it is possible to rearrange the page, but it takes a little time, and is rather like working on a jigsaw, with the page image acting as the box lid!


Moderation, how does it work?

The submission and moderation process for your magazine is described on the 'Getting Started' page for Online Text Correction. If your magazine correction is accepted on first submission, congratulations! If it is returned for some further corrections, these will be clearly indicated on the 'Correction Record', to allow you to focus on exactly which pages need fixing. If 'Assorted, on various pages' is flagged as a response from your moderator, then they will have come across assorted kinds of correction that need implementing. These will have been noted carefully on the 'Correction Record' for the first few pages examined. As these will have been quite consistent, not every page will have been consulted, so if you are willing to implement these changes, please check THROUGHOUT the magazine, even on pages where there is no note beneath the thumbnail on the 'Correction Record.' Thank you very much!


Paragraphs, should I rejoin when they are split across pages? How should I reproduce them?

Please rejoin paragraphs that have been split across columns within a page, but not across two pages. Put each article title, including those split over multiple columns, into one paragraph. Each paragraph you see in the original should appear within its own paragraph box in the Text Editor, even if only a short line of dialogue. New paragraph boxes are created simply by pressing ENTER.


Poetry and verse: how to set?

A potentially complex question! However, here are some simple rules of thumb.

  1. Where the poem/verse has mainly short lines (short relative to the width of a column of text), you should try to reproduce the indented margin. It will look and read better as poetry in the transcript.
  2. Where the poem/verse has a long line – about as long as a column of text – there isn’t room, so each new line should be left justified to the same position as normal text.
  3. Complications begin when, within a verse or poem, the original typesetting offers a pattern of tabs and indents for new lines. YOU CAN REPRODUCE THESE IF YOU WISH BUT IT IS NOT ESSENTIAL.
  4. There is room in the DJO transcript for longer lines than were possible in the columns of Dickens’s journals. Sometimes the original typesetting is forced to break a line unnaturally, and a word or so has to stand on its own in the line below, WITHOUT A CAPITAL LETTER. These breaks should not be reproduced, so take the word or words back up into the previous line.
  5. Please insert a new paragraph for each new verse.


Punctuation, spacing between words and

See ‘Spacing, between words’


Saving pages function — not working

A number of our text editors have reported that occasionally a page refuses to save corrections: on pressing the 'Save Corrections and Exit' button or the 'Save Corrections and Return' button, the page freezes and refuses to accept the corrections. This is annoying, and we apologise! At present, however, this particular glitch is hard to resolve from the front end of the website, so we must ask you please to send a note of the journal, volume number and page which has the glitch to [e-mail address hidden by spam protection]; it will be reported to our webmaster, and fixed in due course. The problem stems from some corrupt code in the database cell for the page in question, and our advice is simply to report the page and move on to the next, rather than attempting to re-correct the glitchy page and losing the corrections again. Eventually, on completion of the magazine, please drop a line to the same e-mail address, and the magazine can be approved. At present it is unclear whether there is a generic solution that can be implemented, or if each page needs to be fixed individually: we will update this FAQ as soon as there is more information available.


Selecting a second magazine to correct – why is this not working?

Currently correctors can only select another magazine to correct after their first magazine has been both submitted and approved.


Spacing, between words and punctuation: reproduce or not

IMPORTANT NOTE: As of 16th November 2011, the guidelines for spacing have changed. Please read the guidelines below carefully. Corrections begun BEFORE 16th November should continue to follow the old rule.

NEW RULE (after 16th Nov.): Throughout the magazines, there are what the modern eye recognises as unnecessary spaces between final letters of sentences and closing punctuation (e.g. a question mark, or exclamation mark). Most of these have been automatically closed-up. However, some spaces still remain which need to be manually closed-up by the corrector. Also, spaces between quotations/words of direct speech and single ( ' ' ) and double ( " " ) quotation marks remain and should also be manually closed-up by the corrector.

OLD RULE (before 16th Nov.): There is no urgent need to remove what the modern eye recognises as unnecessary spaces between final letters of sentences and closing punctuation (e.g. a question mark, or exclamation mark). At present it looks as though we will be able to automate the removal of such spaces. TTS rendition will not be disturbed by them, either. However, it will be useful to remove unnecessary spaces between words of direct speech or quotation, and their opening/closing quotation marks.

PLEASE CORRECT “ I have no idea why he went, ” retorted the Sailor, roughly. TO “I have no idea why he went,” retorted the Sailor, roughly.
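
The closing-up described in the new rule can be sketched as follows. This is only an illustration, not the automated process the project has run. The function name `close_up_spacing` is hypothetical, and the sketch deliberately handles only curly double quotation marks, on the assumption that single quotation marks cannot safely be told apart from apostrophes (as in ’tis) without checking by eye.

```python
import re

def close_up_spacing(text):
    """Close up stray spaces before closing punctuation and inside quotes."""
    # Space(s) between the last letter of a word and ? ! ; or :
    text = re.sub(r"(?<=\w) +(?=[?!;:])", "", text)
    # Space(s) just after an opening curly double quotation mark.
    text = re.sub(r"“ +", "“", text)
    # Space(s) just before a closing curly double quotation mark.
    text = re.sub(r" +”", "”", text)
    return text

print(close_up_spacing(
    "“ I have no idea why he went, ” retorted the Sailor, roughly."
))
```

Applied to the example above, this gives the corrected sentence shown after 'TO'.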

Spelling mistakes/errors

You will come across typographical errors and mistakes occasionally in Dickens’s journals, which should be silently corrected. However, it is worth checking in a good dictionary whether the spelling given is an acceptable Victorian variation on modern spelling, or not. Dickens’s journals often give ‘recal’ for ‘recall’, ‘befal’ for ‘befall’ etc. – these variants should not be modernised. Nor need any factual errors you may espy be corrected – we are reproducing the original content, warts and all!


Symbols

See under ‘Unusual characters’ etc.


Tables and charts: how to set

Occasionally, Household Words and All the Year Round will feature tables and charts. If you feel confident about using the ‘Insert Table’ function on the JCE Editor (similar to the Insert Table function in Word), and patiently styling the table to look like the original, then please go ahead. If not, there are two alternatives.

  1. If the table/chart is simple, it can be reproduced, without lines, simply as text on the page, so try this first.
  2. If complex, it is probably best to leave AS IS, and send a note to [e-mail address hidden by spam protection] giving the volume, magazine number and page. It will get dealt with in-house (eventually!).

Tables are very common in the Household Narrative of Current Events and they are, generally, very complex and time consuming to reproduce. It is best to use the Insert Table function on the JCE Editor throughout, so please do not undertake the correction of a Household Narrative if this is a feature you feel uncertain about using. The layout of the tables does not have to replicate the original exactly (this is actually quite difficult to achieve using the JCE Editor), so don't be worried if your table isn't an exact copy of the original.


Titles and Headings – how to style?

As the introductory OTC tutorial explains, we ask you to remove the standard Masthead, and the running headers of internal pages (the exception being the Masthead of Extra Christmas numbers): these will be replaced by automated means in due course. However, please retain titles of articles, advertisements, and announcements.

Titles, sub-titles, and chapter headings etc. should each be in a separate paragraph box (press Return/Enter to create a new paragraph box; SHIFT + Return/Enter only forces a linebreak).

Reproduce/leave what look like unnecessary full stops, as these will improve text-to-speech (TTS) rendition. Leave block capitals where this reproduces the original, but do not attempt to reproduce different font sizes or families. A plain rendition in a uniform size of sans serif font is all that is aimed at with manual text correction.


Unusual characters, accents, symbols etc., how to insert?

The JCE Editor has a sub-menu for inserting the most common of these; its button has the Greek ‘Omega’ symbol (in capital), thus Ω, as its icon. If a character or symbol used in the page image is not available via this sub-menu (for example, an ancient runic character) then the best option, to assist TTS rendition, is to use the nearest modern English equivalent (if there is one), or, as a last resort, to put in square brackets the briefest description imaginable.

