From Text to Political Positions

This post is from project team member Maarten Marx:

The book From Text to Political Positions: Text analysis across disciplines, edited by Bertie Kaal, Isa Maks and Annemarie van Elfrinkhof, has recently appeared.
It contains chapters by DiLiPaD researchers Graeme Hirst and Maarten Marx.


From Text to Political Positions addresses cross-disciplinary innovation in political text analysis for party positioning. Drawing on political science, computational methods and discourse analysis, it presents a diverse collection of analytical models including purely quantitative and qualitative approaches. By bringing together the prevailing text-analysis methods from each discipline, the volume aims to alert researchers to new and exciting possibilities of text analysis beyond their own disciplinary boundaries.
The volume builds on the fact that the disciplines share a common interest in extracting information from political texts. The focus on political texts thus facilitates interdisciplinary cross-overs. The volume also includes chapters combining methods as examples of cross-disciplinary endeavours. These chapters present an open discussion of the constraints and (dis)advantages of either quantitative or qualitative methods when evaluating the possibilities of combining analytic tools.


Introducing PML (Parliamentary Metadata Language)

This post is from the project’s metadata specialist, Richard Gartner:

A key feature of the Dilipad project is its use of the XML schema PML (Parliamentary Metadata Language) as its core metadata format. PML was devised as part of an earlier project, LIPARM (Linking Parliamentary Records through Metadata), which constructed it as an interoperable format for components of the Parliamentary record. In that project, it was designed primarily as a discovery tool, allowing the construction of union databases of Parliamentary data; in the Dilipad project, we are extending the use we make of it to allow detailed, machine-actionable analyses of the content of this data.

PML was put together to allow all important parts of the Parliamentary record to be recorded: the people in it, the roles they filled, their groupings (party or otherwise) within Parliament, and above all their contributions to proceedings (mainly, though not exclusively, speeches). It then allows these components to be joined together to record relationships between them: a speech, for instance, is linked to information on the speaker, to the sitting in which it takes place, and to the Acts or Bills which result from the proceedings in which it is made.

The ‘glue’ that enables these components to be linked together, and to other instances of the same component in other PML documents, is the use of URIs (Uniform Resource Identifiers) to specify what they are and what type of component they belong to. A record of a speech in a debate, for instance, may be identified as such in this way:-

<pml:contribution typeURI="">

The typeURI attribute contains a URI in a controlled vocabulary to identify precisely what type of ‘contribution’ is being recorded here.

Similarly, an MP is identified through a URI which refers to their entry in a vocabulary:-



<pml:label>Boateng, Paul</pml:label>


Using URIs in this way allows PML records to integrate semantically with resources outside the individual PML document, particularly with the Semantic Web. Within the PML file, interlinkages between components are recorded by internal XML IDs. A political party, for instance, may be identified by a <unit> element as follows:-

<pml:unit ID="uk.proc.d.1992-01-16-parties-663" type="party"



<pml:label xmlns="">Labour Party</pml:label>



An MP’s affiliation to the party is then marked by recording the ID of this party in the person’s categoryIDs attribute:-

<pml:person categoryIDs="uk.proc.d.1992-01-16-parties-663"


<pml:label>Boateng, Paul</pml:label>


In this way, a complex set of linkages may be built both within and beyond a PML file, and the resulting body of data rapidly forms a substantial corpus that lends itself to machine analysis.
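To illustrate how the internal ID linkage described above might be followed by a program, here is a minimal Python sketch. It rests on several assumptions that are not taken from the PML schema itself: the namespace URI and the wrapping pml:doc element are hypothetical placeholders, the elements are shown closed and without their URI attributes, and categoryIDs is assumed to hold a whitespace-separated list of IDs. Only the element names, the attribute names and the sample values come from the snippets in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the real PML namespace is not given in this post.
PML_NS = "http://example.org/pml"
NS = {"pml": PML_NS}

# Simplified fragment modelled on the snippets above. The <pml:doc> wrapper is
# an assumption made only so that the fragment parses as a single document.
SAMPLE = f"""
<pml:doc xmlns:pml="{PML_NS}">
  <pml:unit ID="uk.proc.d.1992-01-16-parties-663" type="party">
    <pml:label>Labour Party</pml:label>
  </pml:unit>
  <pml:person categoryIDs="uk.proc.d.1992-01-16-parties-663">
    <pml:label>Boateng, Paul</pml:label>
  </pml:person>
</pml:doc>
"""


def party_affiliations(xml_text):
    """Map each person's label to the labels of the party units they reference."""
    root = ET.fromstring(xml_text.strip())

    # Index every party unit by its internal XML ID.
    parties = {
        unit.get("ID"): unit.findtext("pml:label", default="", namespaces=NS)
        for unit in root.iterfind(".//pml:unit[@type='party']", NS)
    }

    affiliations = {}
    for person in root.iterfind(".//pml:person", NS):
        name = person.findtext("pml:label", default="", namespaces=NS)
        # Assumption: categoryIDs is a whitespace-separated list of unit IDs.
        ids = person.get("categoryIDs", "").split()
        # Keep only the IDs that resolve to a known party unit.
        affiliations[name] = [parties[i] for i in ids if i in parties]
    return affiliations


if __name__ == "__main__":
    print(party_affiliations(SAMPLE))
```

Run on the sample fragment, the function pairs "Boateng, Paul" with "Labour Party". The same ID lookup could in principle be applied to any other component linked through internal IDs.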

For a full description of the PML schema, and a sample file, please see this page on the LIPARM project website.

Digging into Parliamentary Data tutorial

Jaap Kamps and Maarten Marx (both University of Amsterdam) will give a tutorial at Digital Libraries 2014, London, called Digging into Parliamentary Data.
Date: Monday, 8th September.

The tutorial will show how to incrementally annotate textual corpora, starting from OCR’ed flat text to encoding structure and entities, and demonstrate the remarkable power of such light-weight semantic annotation: linear text becomes valuable research data that can be sliced and diced to present many views.

This tutorial is part of the Digging into Linked Parliamentary Data (DiLiPaD) and Exploratory Political Search (ExPoSe) projects.

Text of tutorial proposal.
Additional tutorial material (Slides, links, etc).

Introducing the History of Parliament

This is a post by Paul Seaward, introducing the History of Parliament – of which he is the director – and its involvement in the project.


The History of Parliament has long been an established feature in the landscape of British history. The product of an initiative in the 1920s, since 1951 it has been at work compiling a history of the English, British and UK Parliament in the form of biographies of each of its many thousand members, together with accounts of politics and elections of each constituency that returned them to Parliament, with surveys analysing this information and providing additional material about the institution of parliament and its work. We are a small research institute, funded by Parliament and outside the university sector, though with many and very important links to it, especially through the Institute of Historical Research. So far we’ve published around 21,000 biographies, and well over 3,000 constituency histories – now all online.

Studying Parliament’s history through its members is an extraordinarily rich way of relating the political life of a nation not only to the lives of the individuals who were deeply occupied in it, but also to the lives of the many communities and individuals whom they represented. It provides a key to the infinite number of interactions between the national legislature and the people whom it affected. We’ve long been keen on taking this further by using digital resources: to bring together in user-friendly fashion all of the sources that illuminate these complex relationships; to make increasingly accessible for investigation the nexus between individual ambitions and local concerns and the records of what was done and said in Parliament; and to make it possible to really represent the complexity of attitudes and activities in some physical form. We’ve long been working with many others on digitising the parliamentary record – most of all of course with our friends at the IHR, on British History Online (and the IHR also maintains our own website) – but also with the creators of the online version of ‘historic’ Hansard, and scholars at Harvard and the LSE who have taken our digital data on nineteenth-century divisions and enhanced it – something we hope to make available more broadly soon.

Dilipad is a great way of investigating how we can try to bring a set of very diverse sources together – diverse both across countries, and within the parliamentary proceedings of each individual country – and maximise their potential for asking increasingly complicated and sophisticated questions about political activity and political life. How closely have MPs reflected the attitudes and concerns of their own constituencies, and how has this changed over time? Is there a visible difference in the way they represent their constituents between MPs who come from the constituency and ‘carpet-bagging’ MPs who come from elsewhere? What do MPs most talk about, and how do they talk about it? What is the relationship between what they say on a particular subject and how they vote? The questions are clearly endless, and they won’t all be answered by this project. But by the end of the project we hope to have a set of tools and resources that will enable us to enhance the History of Parliament’s own ability to engage more closely and deeply with all of them.