LOST VISIONS

Lost Visions: A Computational Overview

29/7/2014

0 Comments

 
The "Lost Visions" project is creating an internet-based system that lets users interact with digitized content in a variety of ways. A key objective is to make digitized content searchable using crowdsourced, bibliographic and content-based features. These features are captured as keywords of three kinds: text provided by a user (as part of a crowdsourcing activity), keywords extracted from key positions in the bibliographic data (e.g. name of illustrator, engraver, photographer, book title etc.), and keywords produced by an image-processing algorithm. A website has been implemented for the project which links into a high-performance computing environment at Cardiff University to support image storage and processing.

A key novelty of this project is the combined use of these features to facilitate search.
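
As a rough illustration of how keywords from the three sources might be combined in one search, here is a minimal Python sketch. It is not the project's implementation: the source labels (`crowd`, `bibliographic`, `content`), the image ids and the ranking rule (images supported by more sources come first) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical labels for the three kinds of keyword described above.
SOURCES = ("crowd", "bibliographic", "content")


def build_index(records):
    """Map each lower-cased keyword to {image_id: set of sources}."""
    index = defaultdict(lambda: defaultdict(set))
    for image_id, source, keyword in records:
        if source not in SOURCES:
            raise ValueError(f"unknown source: {source}")
        index[keyword.strip().lower()][image_id].add(source)
    return index


def search(index, keyword):
    """Return matching image ids, best-supported first."""
    matches = index.get(keyword.strip().lower(), {})
    # Rank by how many independent sources agree, then by id for stability.
    return sorted(matches, key=lambda image_id: (-len(matches[image_id]), image_id))


# Invented example data: (image id, keyword source, keyword).
records = [
    ("img-001", "crowd", "Castle"),
    ("img-001", "content", "castle"),
    ("img-002", "bibliographic", "castle"),
    ("img-003", "crowd", "ship"),
]
index = build_index(records)
print(search(index, "castle"))
```

Here `img-001` is listed before `img-002` because both a crowdsourced tag and an image-derived keyword agree on it.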

The system enables users:

(i) to view and interact with digitized images;

(ii) to tag images using keywords or to make use of a pre-defined taxonomy;

(iii) to add images of interest to a "personal" archive. This archive does not allow images to be downloaded; its primary purpose is to record them in a "personal space" on the website.

It also enables comparison between images to be carried out using image-processing algorithms. These algorithms originate from work in content-based image retrieval, enabling full-image features to be recorded and compared across images. A "geo" element has also been included in the project, which enables search to be carried out based on a geographic bounding box, or text search for place-names via a gazetteer.
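
The bounding-box search can be pictured with a small Python sketch. In the prototype this kind of query would presumably be handled by PostGIS; the in-memory version below only shows the idea. The `BoundingBox` class, the image ids and the approximate coordinates are invented for the example, and the sketch ignores boxes that cross the 180° meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A latitude/longitude rectangle (hypothetical helper, degrees)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon, lat):
        # Inclusive test on both axes; no antimeridian handling.
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def images_in_box(images, box):
    """Return ids of images whose associated place lies inside the box."""
    return [image_id for image_id, lon, lat in images if box.contains(lon, lat)]


# Invented sample data: (image id, longitude, latitude), roughly located.
images = [
    ("img-cardiff", -3.18, 51.48),
    ("img-london", -0.13, 51.51),
    ("img-paris", 2.35, 48.86),
]
wales = BoundingBox(min_lon=-5.5, min_lat=51.3, max_lon=-2.6, max_lat=53.5)
print(images_in_box(images, wales))
```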

The current prototype stores its data in PostgreSQL (with the PostGIS add-on). The user interface is implemented in Python, developed using the PyCharm IDE.

Various image processing libraries have been investigated to discover what data can be retrieved from images in order to compare and sort them automatically. So far, we have used a "Bag of Words" method of processing the SIFT descriptors of images. This uses code found at:
https://github.com/shackenberg/Minimal-Bag-of-Visual-Words-Image-Classifier
which we modified to pick training images at random from a pool of candidates and to produce a confusion matrix showing how well the classifier performs. The SIFT descriptors only need to be calculated once, and can be stored in a separate file per image. Subsequently, K-means histograms are calculated, and these are also saved to disk for reuse.
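
To make the histogram step concrete, here is a minimal Python sketch of the general "bag of visual words" idea: each descriptor is assigned to its nearest cluster centre (a "visual word"), and the image is summarised by how often each word occurs. It is not the code from the repository above. The function names are hypothetical, the codebook is hand-written rather than learned by K-means, and the descriptors are 2-dimensional toys, whereas real SIFT descriptors have 128 dimensions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook centre closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Normalised count of descriptors assigned to each visual word."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)  # image with no descriptors
    return [count / total for count in counts]


# Toy data: three "visual words" and four descriptors from one image.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 11.0), (0.5, 9.0), (10.5, 9.5)]
print(bow_histogram(descriptors, codebook))  # [0.25, 0.5, 0.25]
```

Because the histogram depends only on the descriptors and the codebook, it can be cached per image and reused until the codebook changes.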

A Support Vector Machine (SVM) is then trained and is also saved to a file. However, this training needs to be repeated every time the training set changes, and it is by far the most processor-intensive (and so time-consuming) step. Additionally, as the SVM categorizes images according to their position relative to a decision boundary in a multi-dimensional feature space, a threshold will be placed on the distance from this boundary. This distance can be treated as a "confidence" value: images that fall close to the boundary are the ones that should be analysed further, either via crowdsourcing or with different descriptor processors.
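
The thresholding idea can be sketched in Python as follows. The sketch assumes a linear, two-class SVM, whose decision boundary is a hyperplane w·x + b = 0, and takes the distance from that hyperplane as the confidence value. The weights, bias, threshold and feature vectors are invented for the example and are not taken from the project's trained model.

```python
import math


def signed_distance(features, weights, bias):
    """Signed distance from the hyperplane w.x + b = 0 (linear model assumed)."""
    norm = math.sqrt(sum(w * w for w in weights))
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return score / norm


def triage(histograms, weights, bias, threshold):
    """Split images into confident labels and ones needing further analysis."""
    confident, uncertain = {}, []
    for image_id, features in histograms.items():
        distance = signed_distance(features, weights, bias)
        if abs(distance) >= threshold:
            confident[image_id] = 1 if distance > 0 else -1
        else:
            uncertain.append(image_id)  # too close to the boundary
    return confident, uncertain


# Hypothetical model and per-image feature vectors.
weights, bias = [3.0, -4.0], 0.0
histograms = {
    "img-a": [1.0, 0.0],
    "img-b": [0.0, 1.0],
    "img-c": [0.5, 0.4],
}
confident, uncertain = triage(histograms, weights, bias, threshold=0.5)
print(confident)
print(uncertain)
```

In this toy run `img-c` lies very close to the boundary, so it is the one that would be passed on to crowdsourcing or to a different descriptor.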

Current work on the image analysis has not yet made full use of our high-end computing capability: this is expected to be the next step in our implementation.

Consultation Workshop II: Big Data and the Digital Humanities

11/7/2014

6 Comments

 
On 3rd July we hosted a workshop for members of the Lost Visions Advisory Panel. External delegates in attendance were Felicity Bazell (Capture), Giles Bergel (Oxford University), Abbie Ennock (Capture), Simon Mahony (University College London) and Mike Pidd (HRI, Sheffield University).

The workshop focused on the computational and technical aspects of the project, beginning with presentations from Omer Rana on ‘Big Data and the Digital Humanities’ and Paul Rosin on ‘Image Recognition’. Ian Harvey then gave a demonstration of the illustration archive, which was followed by questions and a stimulating roundtable session. Points of discussion included how we might enhance the existing metadata, recent developments in content-based image retrieval and the politics of keywording.   

It was agreed that the crowdsourcing aspect of the project has wide implications for both the computer sciences and the humanities. Crowdsourced data is highly significant within an anthropological context and can be more fully understood through the use of sentiment analysis algorithms. This data also has much to tell us about the interaction between word and image and how we must understand the illustrated text as a bimedial work of art.

There were a number of suggestions that we are planning to implement as the database develops, including allowing users to create their own image collections, tailoring images to taggers’ specific fields of interest and allowing tagging of multiple images at a time.

We ended the day with dinner at a local restaurant where the discussions continued over wine and Italian food. The workshop was both stimulating and productive and has given us a wealth of ideas to take forward for the next stage of the project.



Consultation Workshop I: Illustration and Teaching

10/7/2014

1 Comment

 
Last week we held two consultation workshops to demonstrate the online database and to obtain feedback and advice on the future development of the archive.   

The first – which took place on 1st July – was attended by teachers from local secondary schools and FE colleges, with representatives from Cathays High School, Merthyr Tydfil College and St David’s Catholic College. Our focus in the workshop was the possible application of the image collection for teaching English literature, although the breadth of the dataset means that the images in the archive will be significant for a wide range of disciplines including history, art, geography and media studies.

The morning session consisted of a demonstration of the archive and its search functionality. The teachers offered us valuable input relating to the usability of the archive and how they might use the images in their own teaching. Suggestions ranged from using material to contextualise the study of Renaissance drama and First World War poetry to examining literary illustrations of fairy tales and Gothic fiction. The educational practitioners also saw considerable potential for using images within the creative writing element of their teaching, which is especially relevant in the FE sector with the new Creative Writing A Level.

After a hearty lunch, we moved on to the afternoon session on image tagging. After an interactive tagging session we had a discussion about how best to engage pupils in the tagging process. There were many valuable suggestions about modifying the interface, improving links to social media and emphasising the gamification aspect, all of which will be of considerable help to us as we move forward with the project. It is clear that there are many mutual benefits to engaging pupils in this kind of technological community participation and there are some exciting possibilities for incorporating this within the Welsh Baccalaureate assessment. In the final roundtable discussion, the teachers offered advice about creating bespoke teaching resources and a discussion forum for teachers as part of the database.

The day provided us with a range of ideas about how to move forward with the Lost Visions project and convinced us that the images in the collection have vast possibility for incorporation within the existing curriculum and exciting potential to promote the study of illustration.

