2017 Edition > Workshops

Monday 26/06

Introduction to Digital Humanities - Elena Pierazzo
What are the Digital Humanities? Why is it important to know them? What are their purposes? This introductory course will give a theoretical framework for the week of courses.
Course taught in French.

XML - Elina Leblanc
The aim of this course is to learn the basics of the XML language, which is essential for working with TEI and HTML. The course will provide a set of exercises to practise XML syntax and to discover and understand the Oxygen XML Editor software.
Course taught in French.

TEI (base) - Elina Leblanc
An introduction to TEI and its basics through the encoding of several prose texts. This course directly follows the XML course and provides a concrete application of its rules.
Course taught in French.

Introduction to Image Processing - Peter Stokes
This course will provide an introduction to digital images and image processing for people working with books and manuscripts. When working with books, manuscripts and documents today, it is almost inevitable that digital images will be involved, whether for private analysis, to prepare a transcription, or for publication as part of a digital edition or for another purpose. To get the most out of these images, it is important to understand what they are and how they relate to the original object. In this course we will therefore discuss topics such as spatial and colour resolution, colour calibration, RGB colour, evaluating the quality of digital images, and some basic techniques for image enhancement and analysis.
Course taught in French.

Tuesday 27/06

TEI Modelling - Elena Pierazzo
Modelling is an activity that helps to set out the questions of a research project in a formal way that can be handled by software. The course will help participants select the most suitable TEI markup for their own research project using the Roma software.
Course taught in French.

HTML and CSS - Laura Antonietti
The objective of this course is to give students the theoretical and practical basics of the HTML and CSS languages. After a short theoretical introduction, a series of exercises will help participants learn how to create simple Web pages (structure and style).
Course taught in French.

Wednesday 28/06

TEI Transcription and edition - Elena Pierazzo (FULL)
The course aims to provide a solid introduction to the transcription of primary sources (manuscripts, printed books, other documents) in TEI. From the transcription level, we will move to the edition level, covering the markup for normalization and editorial regularization and the markup for creating a critical apparatus.
Course taught in French or in Italian (according to the participants' requests).

EpiDoc - TEI for Epigraphy and Inscriptions - Emmanuelle Morlock
This EpiDoc workshop is divided into two half-days. The first half-day will present the main TEI concepts and markup used to carry over into the digital world the traditional epigraphic approaches to the transcription, analysis, description and classification of inscriptions. The second half-day will introduce the other components of the "EpiDoc method", which are geared towards Web publication, interoperability and community exchange about practices.
Course taught in French.

Linked Open Data with Recogito - Valeria Vitale
This class will introduce Recogito, an online tool developed by Pelagios Commons to identify and annotate named entities in historical documents and, in particular, to enable geotagging and georesolution of place references via Linked Open Data. Participants will be walked step by step through the creation of semantic annotations: from the choice of sources to the use of automatic recognition, and from the disambiguation of annotations to the different data visualisation options.
The students will annotate text, images and tabular data through a simple interface, both individually and in simultaneous collaboration. The annotations will then be exported in various standard formats, including CSV, RDF/XML, TEI XML and GeoJSON, ready for possible further processing.
Course taught in Italian.

GIS - Andreas Nijenhuis-Bescher and Julien Caranton
From the data to the map - GIS for research
The course "from archives to map" aims to show the path from research to its representation. Cartography can translate data collected in archives, in databases or in the literature into a spatial representation. This course reproduces the different phases from scientific research to the production of a map using a Geographic Information System (GIS). Based on a concrete example, the course shows the phases of building a database and representing it.
Course taught in French.

Thursday 29/06

NLP - Hervé Blanchon, Laurent Besacier and Gilles Sérasset

Session 1: Introduction to natural language processing (Hervé Blanchon)
In the first part, I will present an overview of the applications of natural language processing (analysis, generation, translation, information retrieval, text mining, alignment…), the problems encountered (ambiguity, incompleteness) and the different approaches (expert, empirical).

Session 2: Machine Translation and Analysis (Hervé Blanchon)
In the second part, I will speak in more detail about machine translation and text analysis, presenting the methods and tools and their potential applications. During this course, I will try to point out some tools available to the scientific community.

Session 3: Lexical Resources (Didier Schwab)
In this course, we will look at several monolingual and multilingual lexical resources that we work with in our research. We will discuss their characteristics, the way they are built and how they are used for different natural language processing tasks.
We will especially study: WordNet, BabelNet, DBnary, distributed representations (Word2vec, GloVe, Baroni vectors…).

Session 4: Equipping under-resourced languages and documenting endangered languages: two different challenges for speech and language engineering (Laurent Besacier)
In this course, I will begin by defining two different concepts: under-resourced languages and endangered languages. Under-resourced languages are an important societal and economic challenge: the objective is to provide these languages with tools and resources for natural language processing. I will introduce some of the LIG's contributions on this theme in the development of voice technologies (especially speech recognition). Endangered languages raise a different problem: the objective is to document and describe languages that are condemned to disappear in the near future, or to contribute to their revival while there is still time. Here, technologies (speech recognition, mobile applications) can help field linguists in their documentation and description work.
Courses taught in French.

Lemmatization and Treebanking (Latin) - Eleonora Litta and Marco Passarotti
Linguistic resources and NLP tools for Latin.
The course aims to give participants the basics of linguistic resources and natural language processing tools for the Latin language. A short introduction will present the essential concepts and the specialised terminology, in particular the levels of metalinguistic annotation and the different types of linguistic resources. The annotation styles used for syntactically annotated corpora will be described. In this context, the course will present two types of resources for Latin: dependency treebanks and lexicons. A short training session on querying treebanks with two different query languages is envisaged.
As for natural language processing tools for Latin, the course will present how a morphological analyser (Lemlat) works and, in particular, its latest extension dedicated to morphological derivation. Next, the course will focus on the methods and tools (with evaluation) for morpho-syntactic and syntactic analysis, looking at some of the main open problems. Finally, some specific resources and tools will be described along with their practical applications.
Course taught in Italian.