HPLT · High Performance Language Technologies
High Performance Language Technologies (HPLT) is a space combining petabytes of natural language data with large-scale model training. With trillions of words of text, the space will be the largest open text collection. Cleaning and privacy-protecting services improve the quality and ethical properties of the text. Going beyond static repositories that require the user to analyze each dataset individually, the project will rate datasets by how much they improve end-to-end language models and machine translation systems. Continuous integration of models and data will result in freely downloadable, high-quality models for all official European Union languages and beyond. The models will be reproducible, with information and evaluation metrics shown in a publicly available dashboard. By focusing on training at scale, the project complements the inference-focused European Language Grid, which in turn will be used for model deployment. Datasets, models, and information about them will be published in recognized FAIR data repositories, aggregation catalogues, and marketplaces for easy discovery, access, replication, and exploitation.
Consortium · 8 organisations
UNIVERZITA KARLOVA
CZ · €641,813
SIGMA2 AS
NO · €373,036
CESNET ZAJMOVE SDRUZENI PRAVNICKYCH OSOB
CZ · €415,000
HELSINGIN YLIOPISTO
FI · €594,625
PROMPSIT LANGUAGE ENGINEERING, SL
ES · €414,400
UNIVERSITETET I OSLO
NO · €752,814
TURUN YLIOPISTO
FI · €689,000
THE UNIVERSITY OF EDINBURGH
UK
Source: CORDIS, Publications Office of the European Union. Global Research Partnerships surfaces open EU research data to help you find collaborators; we are not affiliated with the European Union.