CorTexT is a platform for methodology development, software engineering and support for the analysis of text corpora in the social sciences and humanities.

The CorTexT platform provides researchers, teachers, instructors and students with value-added tools and services to process bodies of text for research, studies, expertise and training. It offers an instrument for social sciences and humanities research, rooted in the digital humanities movement. The aim is to provide users with state-of-the-art tools to process, characterize, analyse and quantify textual data that has undergone little to no calibration. As far as possible, and within the limits of legal constraints, the platform promotes open science, in terms of both its uses and its developments in information processing.

The CorTexT platform is the result of pooling human resources with computerized storage and computation resources in order to develop the analysis of bodies of text of all sizes. The tools are developed in the form of value-added services and are based on developments in data processing and algorithms, some of which are original. The platform project was initially part of a founding initiative by the GIS IFRIS, under the LABEX SITES. Its implementation was funded first by the INRA Unit SenS 1326 and then by the joint research unit UMR LISIS. The CorTexT platform, accessible through the URL, is edited both by INRA and by a major application: CorTexT Manager. The core of the CorTexT platform is located at the Université Paris-Est Marne-la-Vallée campus.

The work carried out in connection with the CorTexT platform pertains to several research fields:

  • Socio-semantic analysis of heterogeneous bodies of text
  • Data mining and information extraction
  • Complexity science applied to social networks
  • Computer science for the social sciences and humanities