Monday, May 08, 2006

Sections RL, NM and U

Well things certainly seem to be speeding up somewhat...

Three new sections were photographed over the weekend, two large, one little, (N,M / U and RL). Transcription is going really well with a lot of sections being done recently (means I have to speed up my transcriptions to keep up!).

I've also managed to get a prototype search function and a database built to which I need to add a whole section of data to give it a good testing.

It will only take until 2008 at this rate to finish! Then on to the cemetery on Western Avenue, a mere half the size of the Cathays one! ;)

Wednesday, May 03, 2006

Section O1 images uploaded

Section O1 images have been uploaded for transcription.

Friday, April 21, 2006

Section F1, F2 and C2

This section is completed.

I will be creating a test database and putting this section on-line shortly, as it is a nice small section (only 64 images) that can be used to test the functionality of the database.

Any ideas for functionality to be included can be emailed to me.

Cheers
Richard

Tuesday, April 18, 2006

New batches of photographs

Several new batches of photographs are now available for transcription.

Please have a look at the south map to see the areas in red which have been done recently. These are areas O1, J1, F1 and F2, D1 and C1. All of these with the exception of O1 are online and ready.

Just for information, F is a conservation area, and D1 is a Catholic section with many more ornate grave markers than the other sections so far.

Grab your keyboards!

Thursday, April 06, 2006

Section 5 images/spreadsheet ready for transcription

Section 5 is ready to go - I've made sure that there is only one headstone per image, and that there are no images of the stonemasons' marks or the plot numbers.

There is a new spreadsheet for section 5 on the site, with some updated instructions. Please have a look at the instructions as these will help the post-transcription process of re-formatting the data into a consistent format.

The status area of the page tells you which images are assigned to which transcriber.

As soon as one of the sections is completed, I will be transferring that section into a database which we then need to test to see if it can do what we want it to do for searching and delivering the image data to the end-user.

Thanks for all the help.

Thursday, March 30, 2006

Latest batch of images

Latest batch of photos done for the next section of the graveyard (660 images in total)... these can be found under the area05 section.

Next section available

Hi all

The next section of photographs is now available... please have a look at the team status section for the allocations.

Cheers
Richard

Wednesday, March 29, 2006

Glamorgan Record Office

I had a very interesting mail today from the Glamorgan Record Office, who in principle think it's a good idea to index the graveyard. I think we should attempt to cross reference the burial index with the plot numbers, and hence the gravestone images, and publish the whole lot online... assuming we aren't all six feet under before we finish that is...

Here is the mail:

>We were very interested to hear of your project to document the burial plots at
>Cathays Cemetery. The Glamorgan Record Office holds a duplicate set of burial
>registers for the cemetery, dating from 1859-1951, and we received regular
>enquires from members of the public wishing to consult these registers for
>family history research purposes. As far as we are concerned, you would be
>welcome to visit the Record Office in order to index these volumes during our
>normal opening hours, which can be found on our website www.glamro.gov.uk.
>
>Please note, though, that these volumes are not the property of the Record
>Office. We merely hold them on behalf of the owner, Cardiff City Council. As a
>result, you would need to contact the Council in order to obtain permission to
>publish your index, either in hard copy or over the internet. The department
>which is responsible for the cemetery registers is Bereavement Services. They
>can be contacted by e-mail at ThornhillReception@cardiff.gov.uk.

New photographs

I've photographed another section of the graveyard (615 photos in total) and these are currently being uploaded to the holding area for transcription.

The main page will be updated shortly and the photographs allocated for transcription. I've also cropped the pictures to remove any uncertainty about which image is to be transcribed and also to reduce the size of the images.

Friday, March 24, 2006

Database required

I think the best way to publish this online is via a database as there's simply too much data to produce static HTML pages.

If this is done correctly, we can add other municipal cemeteries to the data at will without changing the setup. I'm currently investigating mysql as the backend as this is a free resource offered on many webservers, is widely supported and most importantly, can be accessed using SQL. This will make publishing pages from the database pretty straightforward.
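To make the "add other cemeteries at will" idea a bit more concrete, here is a minimal sketch of one possible layout. It uses Python's built-in sqlite3 module as a stand-in for mysql purely so it runs anywhere, and the table names, column names and sample rows are all invented for illustration, not a settled design.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL backend.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cemetery (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE headstone (
    id            INTEGER PRIMARY KEY,
    cemetery_id   INTEGER NOT NULL REFERENCES cemetery(id),
    section       TEXT NOT NULL,
    image_file    TEXT NOT NULL,
    surname       TEXT NOT NULL,
    other_names   TEXT,
    date_of_death TEXT,
    inscription   TEXT
);
CREATE INDEX idx_headstone_surname ON headstone(surname);
""")

def add_cemetery(conn, name):
    """Adding a new cemetery is one row; the schema itself does not change."""
    cur = conn.execute("INSERT INTO cemetery (name) VALUES (?)", (name,))
    return cur.lastrowid

def search_surname(conn, surname):
    """Return every headstone with this surname, across all cemeteries."""
    rows = conn.execute(
        "SELECT c.name, h.section, h.surname, h.other_names, h.image_file "
        "FROM headstone h JOIN cemetery c ON c.id = h.cemetery_id "
        "WHERE h.surname = ? ORDER BY c.name, h.section",
        (surname.upper(),),
    )
    return rows.fetchall()

# Made-up sample rows, only to show the query working.
cathays = add_cemetery(conn, "Cathays")
western = add_cemetery(conn, "Western Avenue")
insert = ("INSERT INTO headstone "
          "(cemetery_id, section, image_file, surname, other_names) "
          "VALUES (?, ?, ?, ?, ?)")
conn.execute(insert, (cathays, "F1", "f1_0001.jpg", "JONES", "Mary"))
conn.execute(insert, (western, "A", "a_0001.jpg", "JONES", "Thomas"))

results = search_surname(conn, "jones")
for row in results:
    print(row)
```

Keeping the cemetery in its own table is the part that matters here: a second graveyard becomes extra rows rather than a second set of pages.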

However, publication of the data as a DVD is another issue - given the first two sections are around a CD's worth each - and that you can get about 8 CDs' worth on a DVD... you will see that the entire cemetery will not fit on a single DVD. Of course the new DVD formats in the pipeline may well take care of this as they can fit up to 50 gigabytes of data on them. Even so, to publish the full cemetery in this way would be costly and time-consuming.

Anyway, I'm progressing with the database at the moment - as soon as I have one section up and running it will be available from the main CATHAYS.HTM page we are using for the project.

Cheers...
Richard

Thursday, March 23, 2006

Update

I've updated the main page http://www.genebooks.com/cathays/cathays.htm with a few bits and pieces today.

Mainly, the changes are to the status page, and the addition of some more links which can be used to access the master copies of the updated spreadsheets. As each section is transcribed, I'll tweak the formats etc and load the data into the master spreadsheet and publish it here. Also, when we complete a section I'll put up an HTML page so we can see how the finished project will look - I need to think about this one though because it might need to be a database driven page.

More photography to come next week - I've got a week off and want to complete several more sections of photography.

Cheers!
Richard

Tuesday, March 21, 2006

Update

The project has kicked off well: of the two sections photographed (about 1100 photographs), only 200 photos are not allocated for transcription yet. We've had a good response from people wanting to help with the transcription, and others with suggestions about data layouts etc.

At this rate I'm going to need to be back in the cemetery this week to photograph another section or two...

And yes, I seem to be unable to correctly spell cemetery, as has been pointed out several times now... oops. I suppose I should double check the transcripts I'm doing then as well ;)

Cheers for now

Monday, March 20, 2006

Status page

I've added an overall status page (see link in the title). This will help give an overall view of how we are doing.

Images are grouped into 100s as this is a good amount to transcribe. I'll start distributing batches of them today.
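For anyone curious how the grouping works out, here is a small Python sketch of the batching. The file names are hypothetical (the real naming scheme may differ); the count of 507 is the number of shots in the first section.

```python
def make_batches(image_names, batch_size=100):
    """Split a list of image file names into consecutive batches."""
    return [image_names[i:i + batch_size]
            for i in range(0, len(image_names), batch_size)]

# Hypothetical file names standing in for the 507 shots of section 1.
images = [f"img_{n:04d}.jpg" for n in range(1, 508)]
batches = make_batches(images)

for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {batch[0]} to {batch[-1]} ({len(batch)} images)")
```

The last batch simply holds whatever is left over, so no image is dropped.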

Carolanne - can you contact me as I don't have your email address and can't send you an update of the images which need transcribing.

Cheers all
Richard

Sunday, March 19, 2006

Cathays Cemetery - second tranche of photos

Well it's early Sunday evening now and I've photographed the second area of the cemetery as defined by the map outline I put together over the weekend. I'll upload the images to the website as I did with section 1.

Back later...

Sample spreadsheet for layout

Here's a spreadsheet with a proposed layout for the transcription - any suggestions on changes or additions??

http://www.genebooks.com/cathays/cathays_section_01.xls

Overview page to follow so we can track who is working on what...

Cheers
Richard

Saturday, March 18, 2006

Map of Cathays Cemetery

I've started sectioning the graveyard up into more manageable plots. Using the link on the title of this post you can download an xml file which defines a Google Earth data layer. If you open this file with Google Earth it will show you the Cathays cemetery sectioned into smaller areas. Area 1 is highlighted in red because I photographed it today.

As each section gets completed, the appropriate area will be highlighted on the map - an easy way of tracking what is going on and how much has been completed.
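For the technically minded, this is roughly what such a data layer looks like when generated from a script. It is only a sketch under assumptions: the function names are made up, the corner coordinates are placeholders rather than surveyed boundaries, and it writes Google Earth's KML flavour of XML with a red fill for completed areas.

```python
from xml.sax.saxutils import escape

def area_placemark(name, corners, done=False):
    """Build one KML Placemark polygon.

    corners is a list of (longitude, latitude) pairs. These are
    placeholder values in this sketch, not real section boundaries.
    """
    ring = corners + [corners[0]]  # a KML LinearRing must end where it starts
    coords = " ".join(f"{lon},{lat},0" for lon, lat in ring)
    # KML colours are aabbggrr: half-transparent red if done, else white.
    colour = "7f0000ff" if done else "7fffffff"
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<Style><PolyStyle><color>{colour}</color></PolyStyle></Style>"
        "<Polygon><outerBoundaryIs><LinearRing>"
        f"<coordinates>{coords}</coordinates>"
        "</LinearRing></outerBoundaryIs></Polygon>"
        "</Placemark>"
    )

def kml_document(placemarks):
    """Wrap the placemarks in a complete KML document."""
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            + "".join(placemarks) + "</Document></kml>")

# Placeholder triangle somewhere near Cardiff, marked as photographed.
area1 = area_placemark(
    "Area 1", [(-3.18, 51.5), (-3.179, 51.5), (-3.179, 51.501)], done=True)
kml = kml_document([area1])
print(kml)
```

Marking an area as completed then just means regenerating the file with done=True for that area.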

The photographs for the first section are being uploaded to an area of my website for distribution to all you nice transcribing people who want to help with the transcription... there are 507 shots in the first batch - some of these have two headstones per image if the inscription quality is sufficient to allow it. Some of the headstones have multiple shots per inscription where the headstone was difficult to see or the lighting was poor. You can get at the raw data here:
http://www.genebooks.com/cathays/area01/
I'll make an html page to coordinate this in the morning.

But before we get stuck in we need to decide how we are going to store and publish the data. I would suggest a spreadsheet initially as this can easily be used to transfer the data to HTML format or a database. However, the question is do we want to simply take the inscriptions like I've done on the OGRE website (ie., date of death, SURNAME, other names, inscription) or try to make it more structured by indicating family relationships.
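As a starting point for discussion, the simple (unstructured) option could look something like the Python sketch below. The four inscription fields follow the OGRE layout mentioned above; the extra "image" column, the field names and the sample row are my own assumptions for illustration.

```python
import csv
import io

# One row per headstone; "image" is an assumed extra column linking
# the row back to its photograph.
FIELDS = ["image", "date_of_death", "surname", "other_names", "inscription"]

def normalise(record):
    """Trim stray spaces and force the surname into capitals."""
    clean = {key: (record.get(key) or "").strip() for key in FIELDS}
    clean["surname"] = clean["surname"].upper()
    return clean

def to_csv(records):
    """Write the records as CSV text that any spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(normalise(record))
    return buffer.getvalue()

# Invented sample row, not a real transcription.
sample = [{"image": "img_0001.jpg", "surname": " evans ", "other_names": "David"}]
print(to_csv(sample))
```

Recording family relationships would need more than this flat layout, for example a column grouping the people named on the same stone, which is exactly the trade-off to decide on.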

Ok, I'm off to watch the new Willy Wonka movie as my brain is now officially zapped and this requires no thinking power whatsoever.

Cheers
Richard

Site Visit

I'm off to Cardiff today so I will visit the site to get a feel of the next 15 years worth of work :)

I've also started to chop up the cemetery into more manageable chunks by using Google Earth's satellite imagery. This seems like a good way to go as I can easily identify the natural boundaries to sections (ie., paths, walls, trees etc).

I'll post the data file online sometime this weekend when I finish it. Anyone wanting to see the sections / volunteer to transcribe one etc should get Google Earth installed. It's free and is an amazing bit of technology. For those who don't want or can't run Google Earth I'll post some static image files showing the areas.

There are also some interesting posts/ideas on the GLA mailing list so I'll grab them and re-post them here.

Cheers
Richard

Friday, March 17, 2006

The Cathays Cemetery Project

This blog has been set up to allow discussion on a potential transcription project for the Cathays Cemetery in Cardiff, Wales.

The proposal is to photograph and transcribe the contents of the cemetery, no mean task considering the size of it, and publish those results so that researchers can benefit in locating their ancestors.

I'll start by posting my interest in the project and thoughts on how we should proceed, how the data can be accessed etc.


For a couple of years now I've been photographing the graveyards in Monmouthshire and Glamorgan and publishing them on the OGRE website (which can be found at http://www.cefnpennar.com). For the usual churchyard sized cemetery this format is adequate and easily maintainable, as there are usually no more than a hundred or so graves to be catalogued.

For larger municipal graveyards, this format becomes unwieldy as you cannot list 10,000 graves on a single HTML page as it would be impossible to maintain and would take ages to load when viewed.

Thus, the first question is how to store and access the data? The two obvious options are an on-line database and query pages, or a set of static, linked HTML pages in alphabetical order, for example.

The next obvious issue would be organising and managing the photography of the headstones. From experience I know that I can do an average graveyard in a couple of hours. By the time you've waded through brambles and all sorts of obstructions (probably not an issue in the Cathays cemetery), got the camera to focus and fiddled with the zoom etc you are looking at perhaps a couple of shots a minute maximum.

Now a number of those shots will be duds due to lighting and focus issues, therefore we need to be able to re-shoot images on demand for those that do not come out properly. Obviously the graveyard needs to be sectioned and organised. Some of this may be possible to review given the level of detail in the Google Map satellite imagery, but I suspect a fairly detailed site visit would be required to examine how to use natural boundaries to give small, workable areas.

The volunteers who can help photograph the graves will obviously need access to the kind of equipment which can give a minimum image resolution, and the ability and technical knowledge to be able to crop and rotate those images and submit them to a central repository (such dreary things as file naming conventions to prevent overwriting images, and backups come to mind here).
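To show what I mean by a naming convention, here is one possible scheme sketched in Python. The pattern (area number, photographer initials, running number) and both function names are hypothetical; any scheme works as long as two volunteers can never produce the same file name.

```python
from pathlib import Path

def image_name(area, sequence, photographer):
    """Build a name such as area01_rm_0007.jpg (hypothetical pattern)."""
    return f"area{area:02d}_{photographer.lower()}_{sequence:04d}.jpg"

def safe_destination(folder, name):
    """Return a path inside folder that does not clash with an existing file.

    If the name is already taken, a _v1, _v2, ... suffix is added
    instead of overwriting the earlier image.
    """
    original = Path(name)
    target = Path(folder) / original.name
    counter = 1
    while target.exists():
        target = Path(folder) / f"{original.stem}_v{counter}{original.suffix}"
        counter += 1
    return target

print(image_name(1, 7, "RM"))
```

A re-shoot of a dud image would then arrive as a _v1 copy alongside the original rather than silently replacing it.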

Now comes the hard part, transcribing the images. This is the most tricky part, and requires many cups of black coffee and late nights... Welsh transcriptions should really be handled by Welsh speakers as I know when I transcribe Welsh data it slows me to a snail's pace checking the spelling and translating the dates etc to English.

All transcriptions need to be checked for accuracy, as it is very easy to transpose the data when you've typed in thousands of surnames.

And this brings us back to the data format. Two methods strike me as ways forward: an on-line form to submit and review data, or a simple spreadsheet.

The online form will be the harder way to do it, and would require an on-line database to provide access. This would require an amount of effort to set up and check before it could be used.

A spreadsheet is easier to handle and mail, we can use it to generate HTML or a set of data for a database, and it can be up and running fairly swiftly.
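Generating HTML from spreadsheet rows really is only a few lines of work. Here is a minimal Python sketch, assuming the rows have already been read out of the spreadsheet into lists; the function name and the sample row are made up for illustration.

```python
from html import escape

def rows_to_html(headers, rows):
    """Turn spreadsheet-style rows into a plain HTML table."""
    head = "".join(f"<th>{escape(h)}</th>" for h in headers)
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>\n<tr>{head}</tr>\n{body}\n</table>"

# Invented sample row; escape() keeps characters like & from breaking the page.
headers = ["Surname", "Other names", "Inscription"]
rows = [["JONES", "Mary", "In loving memory of Tom & Ann"]]
page = rows_to_html(headers, rows)
print(page)
```

The same rows could just as easily be fed to a database loader later, which is why the spreadsheet looks like the safer first step.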

Finally, I would prefer to see this offered to the end user as a free resource that can be accessed without charge. Something to consider for this is that a free website is usually rubbish due to the small bandwidth provided and the lack of service. If this is to be published on the web it needs to have minimal downtime (all computers crash sometimes), it needs to be reliably backed up by the service provider in case they have a disaster, and it needs a good bandwidth allowance to allow people to get at the site to see the data.

So, having said all that, let's get a discussion going and see where it takes us.

Cheers
Richard, aka The OGRE.