Review:*
Manitoba Free Press On-line
by Gordon Goldsborough

January 2003
A shorter version of this article appeared in the Keywords newsletter.

Jump to:
Summary | Overview | How does it work? | Is it any good? | The reviewer

Summary

What is it:

Web site database containing a searchable name and subject index to the Manitoba Daily Free Press (1874-1893)
www.ancestry.lycos.com/search/rectype/inddbs/6545.htm

Cost:

US$29.95 for three months access,
US$79.95 for one year

Assessment:

A useful site for those doing research on early life in Winnipeg and Manitoba, but you must be mindful of the database limitations and you will need patience, a reasonably powerful computer, and high-speed Internet access to use it effectively.

Overview

The Manitoba Free Press, an MHS Centennial Business, began publication in 1872. Aside from a name change to Winnipeg Free Press, it is the longest running of the province's newspapers, and the oldest daily (since 1874) west of Toronto. The Free Press is an invaluable source of information about early life in Manitoba, and microfilm copies at libraries around the province are often the cornerstone of research by local historians. Marriage, birth, and death announcements provide useful data for genealogists, and local news can provide a more intimate look at events than is typically found in books and government documents. However, unless you are prepared to spend long hours hunched over a microfilm reader, you must know the exact date of an event in which you are interested. Scanning for information on a subject over a range of time demands a keen eye and patience. In this day of easily searched computer databases and Internet search engines, it takes far more work and dedication than many people are willing to commit.

A hi-tech solution to this problem appeared recently, as the genealogy web site www.ancestry.com created a name and subject index to the Manitoba Daily Free Press, covering the period from 1874 to 1893. The database, apparently provided by the firm Heritage Microfilm, is one of dozens in a "Historical Newspaper Collection," with the Free Press as the only Canadian representative. Access to the entire collection, including a large number of American papers, costs US$29.95 for three months or US$79.95 for one year. Given the focus on family records, it is understandable that the search facility provides only three fields: given name, surname, and keyword. Enter a word into one or more of these fields, and you are shown a list of pages where they appear from among the 31,000 in the database. Click on a page and it is displayed with your search word(s) highlighted. The page can be saved on your computer in JPEG format.

The database is a good first effort but it is far from perfect. It was created by capturing an image of an entire page using a computer scanner, then using Optical Character Recognition (OCR) software to search the image for recognized words. The process was done automatically, without human supervision, so not all words appearing on a page were identified correctly. Likewise, poor-quality scans of faded pages or hyphenated words result in incomplete indexing, and some dates, often long sequences, were inexplicably not included in the database. Nevertheless, the database would be useful to anyone doing research on the early history of Winnipeg and Manitoba, especially if a specific date is not known.

Another web site offering a similar service, with several versions of the Manitoba Free Press, is available at www.newspaperarchive.com.

How does it work?

Go to the Historical Newspaper Collection at www.ancestry.com by clicking the link on their home page, or go directly to the Manitoba Daily Free Press by clicking here.

If you go to the entire collection, you can click on one of the American states to see a list of the newspapers available for that state. For example, there are currently (as of January 2003) five newspapers for North Dakota, all in Bismarck, some covering a period of less than a year. Click on one of the titles to search that paper. There does not appear to be a way to navigate to the Manitoba Free Press this way, because there is no clickable map for Canada - the Free Press is the only Canadian paper in the collection.

When you get to the page for the Free Press (click here to go there), there are three fields in which you can enter text for searching. Given that the web site is focused on genealogy, it is not surprising that two of the fields relate to names: given name and surname. But don't be fooled. I found that you can enter a surname in the given name field, and vice versa, and the search works just the same. You can enter the entire name in one field, also with the same results. And you can enter a name into the keyword field, or a keyword into either of the name fields. In other words, you can use the three fields interchangeably.

Entering a single word into a single field searches for all occurrences of that word in the database. Entering two words into one field, or one word into each of two fields will search for pages on which both words occur. But beware! Pages found this way will contain both your search words but not necessarily consecutively. So if you enter "William" and "Henderson," there is no assurance that you will find pages in which the full name "William Henderson" appears. It is just as likely that you will find a page with "William Smith" and "John Henderson." A search will often produce large numbers of "hits" but many of them, perhaps even all of them, will not actually contain the full phrase that you are looking for.
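The difference between the two kinds of matching can be sketched in a few lines of Python. This is a toy illustration only: the three sample "pages" and both function names are invented here, and nothing is known about how the ancestry.com search is actually implemented. The first function mimics the behaviour described above (all words somewhere on the page), while the second shows the consecutive-word match that the site does not appear to offer.

```python
# Toy illustration only: the page texts and the matching logic are assumptions,
# not the actual ancestry.com implementation.
import re

pages = {
    "page-1": "Mr. William Smith met John Henderson at the station",
    "page-2": "William Henderson signed the petition",
    "page-3": "Henderson and Sons, grocers",
}

def words(text):
    """Return the lower-cased words of a page, in order."""
    return re.findall(r"[a-z]+", text.lower())

def and_search(pages, terms):
    """Pages containing every term anywhere on the page."""
    terms = [t.lower() for t in terms]
    return [pid for pid, text in pages.items()
            if all(t in words(text) for t in terms)]

def phrase_search(pages, terms):
    """Pages where the terms appear consecutively and in order."""
    terms = [t.lower() for t in terms]
    n = len(terms)
    hits = []
    for pid, text in pages.items():
        w = words(text)
        if any(w[i:i + n] == terms for i in range(len(w) - n + 1)):
            hits.append(pid)
    return hits

print(and_search(pages, ["William", "Henderson"]))     # ['page-1', 'page-2']
print(phrase_search(pages, ["William", "Henderson"]))  # ['page-2']
```

Here "page-1" is the kind of false hit described above: it mentions a William and a Henderson, but not William Henderson.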

Notice the line "View images of the original document" below the three search fields. Click it to browse through the entire Free Press database manually by selecting a specific date. This is what you will see:

Although the Free Press commenced publication in 1872, it only became a daily in 1874 so the database starts there. For reasons unknown, the database ends in 1893. And don't get your hopes up too much. The database does not include every day of every month for the entire period. Some months are fairly complete whereas others have few or no dates. (And remember that some dates are "missing" because the Free Press did not publish on Sundays.) Let's look at a month that is complete: September 1888.

If we select the 21st, we see the first of eight pages scanned for that date. There are two ways to view a page. The default viewer has rudimentary capabilities. Instead, you can choose to download a free viewer from ancestry.com as a plug-in for your web browser. The plug-in has many features not found in the basic viewer, so I definitely recommend it. It lets you enlarge, pan, and select areas of the page. If you have conducted a search, it will highlight your search word(s) in color. Believe me, this is a BIG help if you are scanning a dense page of text for a single, small word. This is our page as seen with the enhanced viewer:

Click the "Zoom In" button, then select the amount by which you wish to enlarge the page. The default is "Fit Width" which fits the entire page to display across the browser window, making the text all but unreadable. The zoom options are 25%, 50%, 75%, 100%, 125%, 150% and 200%. You can also choose "Fit Image" to see the entire page at once, or "Fit Height" to see the entire page length (but good luck seeing anything meaningful in these latter two options). Here is the 50% zoomed view of our page. The text is easily read although some of the individual letters are not clear. This causes problems with the OCR technology used by the computer to "read" the text.

OCR stands for Optical Character Recognition and it was the method used to create the Free Press index. No, the database was not created by sitting a human down in front of a microfilm reader and having them read and transcribe the text. That would have taken FAR too long! Instead, each page was scanned to create a graphic image then the OCR software examined the pixels in the image to locate ones that formed characters. Characters were strung together to form words, then these were compared to a dictionary to exclude meaningless groups of random letters - hope that the dictionary was fairly complete or some legitimate words would have been dropped. Hyphenated words that break across two lines will probably not be recognized either.
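A rough sketch may make the dictionary step and the hyphenation problem more concrete. Everything in it is hypothetical: the tiny dictionary, the two sample lines, and the function name are invented for illustration, and the software actually used to build the index is not documented.

```python
# Hypothetical sketch: the dictionary, sample lines, and function name are
# invented for illustration; the real indexing software is not documented.
import re

DICTIONARY = {"the", "jobbers", "hockey", "club", "was", "organized", "yesterday"}

def index_words(ocr_lines, dictionary):
    """Keep only the groups of letters that match a dictionary entry."""
    kept = set()
    for line in ocr_lines:
        for token in re.findall(r"[a-z]+", line.lower()):
            if token in dictionary:
                kept.add(token)
    return kept

ocr_lines = [
    "The Jobbers Hoc-",                   # "Hockey" hyphenated across a line break
    "key Club was organ1zed yesterday",   # "organized" misread, with "1" for "i"
]

indexed = index_words(ocr_lines, DICTIONARY)
print(sorted(indexed))  # ['club', 'jobbers', 'the', 'was', 'yesterday']
```

Neither "hockey" nor "organized" reaches the index, even though a human reader would see both words on the page. A search for either word would miss this page.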

If a letter in the scanned page cannot be "read" by the OCR software, chances are that the word will be interpreted incorrectly. Hence, "false positives" - pages that are purported to contain your search word(s) but do not - occur fairly often. For example, I found there were hits on the word "Internet" which, if true, would mean the Internet was actually invented almost a century before everyone believes. It is more likely that a similar word was incorrectly identified by the OCR software. But false positives do not occur nearly as often as "false negatives" - you get no hits on pages that you know contain your word(s) because they were not recognized by the OCR process. Newspaper pages that were damaged or unclear in the original microfilm version were especially prone to this problem.

OCR technology is being used increasingly to automate the process of indexing large amounts of printed text. For example, the National Archives of Canada recently posted a searchable database for the diaries of William Lyon Mackenzie King that was created the same way. It cannot be used to read handwriting and it has trouble with some typefaces.

Browsing the Free Press in this way is really not very efficient, unless you do not have ready access to the original microfilm. Each scanned page is a large file that must be downloaded from the ancestry.com server to your computer. Even with high-speed Internet access, there is a perceptible delay from the time that you select a page (or click the Previous or Next buttons in the enhanced viewer) to the time that it displays, even when you are viewing at a time when Internet traffic is light, such as late at night. Try it during the day, and expect to wait a considerable time to see your page. And don't even think of using a dialup connection unless you have lots of time on your hands.

Entire pages can be saved as graphic files on your computer, in JPEG format, but they are very large depending on the amount of information on the page. I saved files that averaged about 6 megabytes. The JPEG format compresses into a smaller file than many other graphic formats; the same files when decompressed grew to over 100 megabytes. So, in addition to needing a fast Internet connection, you would be well advised to have a computer with lots of memory and a huge hard disk if you intend to save, open and manipulate the pages outside of the browser.
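Some back-of-envelope arithmetic shows why the connection speed matters so much. The 6 megabyte and 100 megabyte figures come from the paragraph above. The connection speeds (a 56 kbit/s dial-up modem and a 1 Mbit/s high-speed line) are assumptions chosen for illustration, as is the idea that a page viewed in the browser is about as large as the saved file.

```python
# Figures marked "assumed" are not from the review.
JPEG_MB = 6            # average saved page, from the review
DECOMPRESSED_MB = 100  # "over 100 megabytes", from the review (a lower bound)

def download_seconds(size_mb, link_kbps):
    """Transfer time, treating 1 MB as 1,000,000 bytes and ignoring overhead."""
    return size_mb * 1_000_000 * 8 / (link_kbps * 1000)

compression_ratio = DECOMPRESSED_MB / JPEG_MB
dialup = download_seconds(JPEG_MB, 56)       # assumed 56 kbit/s modem
broadband = download_seconds(JPEG_MB, 1000)  # assumed 1 Mbit/s connection

print(f"compression ratio: at least {compression_ratio:.1f}:1")
print(f"dial-up: about {dialup / 60:.0f} minutes per page")
print(f"high-speed: about {broadband:.0f} seconds per page")
```

Under these assumptions a single page takes roughly 14 minutes over dial-up and about 48 seconds on the faster line, before allowing for a busy server or a modem running below its rated speed.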

The power of this database is really not in the ability to browse individual pages. Instead, it comes from the ability to search the entire database, in one step, for words or phrases. So let's take an example. Say that you want to find pages on which the name "William Henderson" appears. Bear in mind a couple of caveats: 1) Learn to expect that a large number of hits will not contain your full name for the reason explained above, and 2) Be sure to try variations on spelling, especially for names that can be abbreviated, such as "Wm" for William and "Jno" for John.
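If you have many names to look up, it can help to write out the spelling variations before you start. The following is a minimal sketch of that idea in Python. "Wm" and "Jno" are the abbreviations mentioned above; "Thos" and "Chas" are further period abbreviations added here as assumed examples, and the function name is invented.

```python
# "Wm" and "Jno" come from the review; the other abbreviations are assumed examples.
ABBREVIATIONS = {
    "william": ["Wm"],
    "john": ["Jno"],
    "thomas": ["Thos"],   # assumed
    "charles": ["Chas"],  # assumed
}

def name_variants(given, surname):
    """List the full name plus any abbreviated forms of the given name."""
    forms = [given] + ABBREVIATIONS.get(given.lower(), [])
    return [f"{g} {surname}" for g in forms]

for variant in name_variants("William", "Henderson"):
    print(variant)
# William Henderson
# Wm Henderson
```

Each variant would then be entered as a separate search, since the database has no way of knowing that "Wm" and "William" are the same name.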

We learn there are apparently 26 newspaper pages containing the name "Wm Henderson." The first ten are shown at one time. Notice that there are often pairs of "hits" for the same date. I found that the first of these was usually a real hit whereas the second one produced an error saying that the page could not be found in the database. So I generally surfed every second hit to save time. But be careful, because sometimes both pages produced hits. Go figure...

Sifting through the false positives (pages containing "Wm" and "Henderson" somewhere on the page but not together) produced a few legitimate hits. Be patient. Here is one of them for 6 April 1876, showing that William Henderson was a signatory on a petition requesting Francis Cornish to run for the position of alderman in the upcoming civic elections.

As I said above, do not feel constrained to enter just names into the search fields. For example, I wanted to see if the database contained any references to the Jobbers Hockey Club, which was organized by employees of Winnipeg grocery brokerage companies. So I entered the search like this:

I could have just as readily entered "Jobbers" as the given name and "hockey" as the surname, and got the same results:

There were 29 hits but a lot were false positives because they occurred before the Club was founded in 1893. Here is one of these, with "Jobbers" and "hockey" on the same page but clearly not consecutively (hits were marked in green or yellow).

It can get incredibly frustrating to wade through a large number of hits, finding none of them actually containing your search phrase. But persevere and you sometimes find gold, such as I got on page 5 of the paper published on 14 January 1893, announcing the formation of the club:

Interestingly, I had tried searching on the name "E. Nicholson" earlier but this page was not listed as a hit, presumably because the letters of the surname were not sufficiently clear to be recognized correctly by the OCR software - a "false negative."

Is it any good?

The database is great, if you are prepared to accept its limitations:

  1. It covers only the period from 1874 to 1893, and there are many missing pages within this range.
  2. The OCR technology used to create the index is far from perfect so search results contain false positives and some pages that should be found are not.
  3. It is not possible to constrain searches to find two or more words only if they appear consecutively.
  4. The files of scanned pages are very large so they download slowly, even on high-speed Internet connections.

The database is especially good if you are searching across a long period of time and do not know a specific date.

Is it worth the money? That depends on how many search terms you have and how much time you are prepared to commit to the task. For the work that I was doing, it was definitely worthwhile, as I found information that I would have never found otherwise. And there are lots of interesting advertisements and other historical tidbits that you see along the way so that even pages with false positives are usually worth the wait. And a subscription to the Historical Newspaper Collection provides access to all the newspapers, not just the Free Press, so there are potentially huge numbers of pages that can be searched. So, all things considered, I recommend it as a good way to cover a lot of ground quickly.

The reviewer

Gordon Goldsborough is First Vice-President and Webmaster of the Manitoba Historical Society, and an Associate Professor at the University of Manitoba. He is working on two books of early Manitoba history, one of which might even be published within the next couple of years. (Feel free to bug him about it.) He can be reached by email at ggoldsb@cc.umanitoba.ca.


* Disclaimer: The views expressed in this article are those of the reviewer. They do not necessarily represent those of the Manitoba Historical Society. Mention of services or products does not constitute endorsement by the Manitoba Historical Society.