

The application of these techniques helps empower experienced researchers, novice researchers and the wider community to discover, interrogate and analyse the cultural heritage resources. Figure 3 shows one example of the recommended content shown to users who browse the 1641 depositions using CULTURA. In this example, green text links to specific depositions are listed under four headings (Place, Occupation, Person Type and Nature/Crime). These links are generated for each heading by comparing the most prominent terms in a user’s browsing history (displayed under “Influencing Terms”) with the metadata of all depositions, and rendering the most relevant.
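The matching algorithm itself is not spelled out here, so the following is only a minimal sketch of term-based recommendation, assuming metadata is held as lists of terms per heading; the function name, record layout and sample deposition ids are all invented for illustration.

```python
def recommend_links(influencing_terms, depositions, heading, top_n=3):
    """Rank depositions under one heading (e.g. "Place") by how many of
    the user's most prominent browsing terms match that heading's
    metadata values. Hypothetical structures, not CULTURA's actual API."""
    user_terms = {t.lower() for t in influencing_terms}
    scored = []
    for dep in depositions:
        values = {v.lower() for v in dep["metadata"].get(heading, [])}
        overlap = len(user_terms & values)  # simple term-overlap score
        if overlap:
            scored.append((overlap, dep["id"]))
    # the highest-scoring depositions become the green text links
    return [dep_id for _, dep_id in sorted(scored, reverse=True)[:top_n]]

# terms drawn from the user's browsing history ("Influencing Terms")
links = recommend_links(
    ["Dublin", "farmer", "robbery"],
    [
        {"id": "dep-001", "metadata": {"Place": ["Dublin"], "Occupation": ["farmer"]}},
        {"id": "dep-002", "metadata": {"Place": ["Cork"]}},
    ],
    heading="Place",
)
print(links)  # -> ['dep-001']
```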

Performing document analysis techniques (i.e. information extraction) on historical texts, which contain non-standard spelling, historical grammar and many old word forms, is a non-trivial challenge requiring normalisation of word spelling and entity extraction. The primary purpose of the normalisation process is to produce documents without historical variation at the letter level. This normalised text enables better identification of entities, e.g. people, places, events and dates, as well as facilitating improved search across the collection by taking account of spelling variants of a search term. The statistical model built to automatically normalise the historical text utilised manually normalised documents. These manually normalised documents were randomly selected and accounted for approximately 6% of the documents from the 1641 depositions. The Translation Model was developed on top of the previously developed OCR correction methodology.
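The internals of the Translation Model are not described here; as a heavily simplified stand-in, the sketch below learns word-level substitutions by frequency from aligned pairs such as a manually normalised subset would provide. The real model is statistical and handles letter-level variation; every name and training pair below is hypothetical.

```python
from collections import Counter, defaultdict

def train_normaliser(aligned_pairs):
    """Learn the most frequent modern form for each historical variant
    from manually normalised (historical word, modern word) pairs."""
    counts = defaultdict(Counter)
    for hist, modern in aligned_pairs:
        counts[hist.lower()][modern.lower()] += 1
    return {hist: c.most_common(1)[0][0] for hist, c in counts.items()}

def normalise(text, lexicon):
    """Replace known historical variants; leave unknown words untouched."""
    return " ".join(lexicon.get(word.lower(), word) for word in text.split())

# invented training pairs standing in for the manually normalised 6%
pairs = [("sayd", "said"), ("sayd", "said"), ("towne", "town"), ("maiestie", "majesty")]
lexicon = train_normaliser(pairs)
print(normalise("the sayd towne", lexicon))  # -> "the said town"
```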

Data modelling is one of the crucial aspects of designing a data-centric system. In the context of digital humanities, the data modelling challenge has two specific characteristics. The first requirement is to allow the incorporation of new concepts which augment the original data during the research process, e.g. the detection of new types of entities, such as “murder” events within the 1641 depositions, which are not explicitly encoded a priori. The other requirement is to support a layer of services that allow a range of user actions. These actions could include the manual manipulation of existing data, the user referencing of specific data elements or the interactions between sets of users.
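A minimal sketch of what these two requirements imply for the data layer, assuming depositions are held as free-form records (the record layout, field names and values are invented for illustration):

```python
# A deposition as initially ingested (hypothetical record layout).
deposition = {
    "id": "dep-812-119",
    "text": "Examination of John Smith of the county of Armagh ...",
    "people": ["John Smith"],
    "places": ["Armagh"],
}

# Requirement 1: a researcher identifies a "murder" event that was not
# encoded a priori -- the new concept is attached with no migration step.
deposition["events"] = [{"type": "murder", "place": "Armagh"}]

# Requirement 2: user actions -- manual edits, references to specific
# elements, links between users' work -- extend the record the same way.
deposition["annotations"] = [
    {"user": "researcher_1", "refers_to": "people[0]",
     "note": "same deponent as dep-812-120?"}
]
```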

In both cases, the system schema must evolve over time to reflect the work of researchers and others. Therefore, the data management part of the system must support easy on-the-fly modifications of the underlying schema. This requirement rules out relational databases, since modifying the columns of a populated relational database table is a costly task. However, conceptual modelling based on the Entity-Relationship model, which is commonly used in the process of relational database schema definition, is an effective methodology for capturing data requirements. Hence, this meta-model has been chosen by CULTURA to define the digital cultural archives using its environment, e.g. the IPSA collection and the 1641 depositions.
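As a rough illustration of capturing such data requirements in Entity-Relationship terms, a conceptual-model fragment might look like the following; every entity, attribute and relationship name here is hypothetical, not CULTURA’s actual model.

```python
# Hypothetical ER-style fragment for a deposition archive.
# Entity types and their attributes:
entities = {
    "Person": ["name", "occupation"],
    "Place": ["name", "county"],
    "Event": ["type", "date"],  # e.g. the "murder" events added during research
}

# Relationship types as (source entity, label, target entity):
relationships = [
    ("Person", "DEPOSED_AT", "Place"),
    ("Person", "WITNESSED", "Event"),
    ("Event", "OCCURRED_AT", "Place"),
]
```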
Recently, a growing number of systems which allow schema evolution have emerged, labelled under the generic term “NoSQL”. There are three prominent types of NoSQL systems: Key-Value Stores, Document Stores and Graph Databases, which differ greatly in their meta-model definitions. In Key-Value Stores only pairs of key and value are allowed, thus no structure can be defined. In Document Stores the basic element is a document which contains a set of fields. In contrast, in Graph Databases the data is managed in the form of nodes with properties, and edges (which can be labelled) connecting the nodes. These three NoSQL types vary in their suitability for handling Entity-Relationship data.
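To make the contrast concrete, here is the same Entity-Relationship fact, a deponent linked to an event, rendered under each of the three meta-models; the structures are purely illustrative and not tied to any particular NoSQL product.

```python
# The same ER fact -- a deponent linked to an event -- in each meta-model.

# Key-Value Store: opaque values under keys; the store sees no structure,
# so the client must encode and decode any relationships itself.
kv = {
    "person:1": '{"name": "John Smith"}',
    "event:7": '{"type": "murder"}',
    "person:1:witnessed": "event:7",
}

# Document Store: a document groups fields; the relationship is embedded
# as a field of the document, referencing the event by id.
doc = {
    "_id": "person:1",
    "name": "John Smith",
    "witnessed": [{"event_id": "event:7", "type": "murder"}],
}

# Graph Database: nodes with properties joined by labelled edges --
# structurally the closest match to an Entity-Relationship diagram.
nodes = {
    "person:1": {"label": "Person", "name": "John Smith"},
    "event:7": {"label": "Event", "type": "murder"},
}
edges = [("person:1", "WITNESSED", "event:7")]
```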
