Arantza Díaz de Ilarraza Sánchez (San Sebastián, 18 April 1957) is a professor of informatics at the University of the Basque Country. In 1981, she began her work as a lecturer at the Faculty of Informatics of Donostia. As a specialist in language and computer technology, she has held positions of responsibility in Basque technology institutions.
Arantza Díaz de Ilarraza | |
---|---|
Born | Arantza Díaz de Ilarraza Sánchez 18 April 1957 |
Nationality | Basque, Spanish |
Alma mater | University of the Basque Country |
Known for | Main researcher of the Ixa research group (1988–2020), First president of the HiTZ Centre |
Scientific career | |
Fields | Informatics, Language Technology, Computational Linguistics, Artificial intelligence |
Thesis | Management of natural-language dialogues for an intelligent teaching system. (1990) |
Doctoral advisor | Felisa Verdejo |
Notable students | Mikel Artetxe Zurutuza |
Academic career
Díaz de Ilarraza graduated in 1979 and began lecturing in the same faculty two years later. In 1983, she completed her degree dissertation and in 1990 defended her PhD thesis entitled "Management of natural-language dialogues for an intelligent teaching system".[1]
Díaz de Ilarraza has worked in numerous fields as a researcher. Although most of them are connected with natural language processing and the Ixa Group, she also worked with the Galan Group in the field of Intelligent Tutoring Systems for 20 years. Her main lines of research are:
- Intelligent tutoring systems (from 1981): Díaz de Ilarraza's PhD thesis was in this field. It dealt with dialogue management for CAPRA, an intelligent tutoring system that taught computer programming. After completing her thesis, she directed this line of research with Isabel Fernández de Castro, and together they started the Galan group. In 1989, she secured her first European project, ITSIE (Intelligent Tutoring System for Industrial Environments), in which the following centres collaborated: UPV/EHU, Iberdrola, Labein, Heriot-Watt University, Marconi (Edinburgh) and CISE (Italy).[2]
- Lexical knowledge extraction and management (1993–2000)[3][4]
- Basic linguistic analysers (from 1994)
- Integration of linguistic tools in teaching environments (from 1994)[5][6]
- Integration of language tools and assistance in linguistic text tagging (from 1995): The EPEC syntactically tagged corpus (EPEC-DEP) emerged from this work.[7][8]
- Application of NLP technology to medical texts (from 2010): Considerable progress was made in semi-automatically translating health terminology into Basque. The starting point was the SNOMED CT database, which contained 300,000 English clinical terms to be translated into Basque. After completing the translation in 2018, the group is integrating machine translation in the Itzulbide project to create technical facilities for producing healthcare reports in Basque.[9]
- Machine translation (from 2000): Aingeru Mayor's thesis produced the Matxin translation system (2007), the first to be developed for Basque. In 2010, Gorka Labaka created a statistical machine translator. With the emergence of the neural paradigm, machine translation among the major languages has improved greatly since 2017, and the Basque research community was subsequently able to bring Basque neural translators to the same level. In 2015, the DeepL translator provided quality results in translations across ten languages, but Basque was not among them; the Ixa Group began working on that in the TADEEP project, and the first public demo was available in 2017. That year various organisations (Ixa Group, Elhuyar, Vicomtech, Ametzagaña, Mondragon Lingua, etc.) collaborated and launched the MODELA project. The first service offering neural translation into Basque over the Internet for the general public was published a year later, in 2018.[10] At least three other neural translators have been made available in this field since:
- The Basque Government's neural translator uses the Basque Government's translation libraries (over 10 million "sentences" gathered over a 20-year period).
- Batua.eus: Vicomtech incorporated improvements into the MODELA system (transferring from RNN technology to Transformer technology) and enlarged its libraries.
- Itzultzailea.eus: Elhuyar also made similar improvements and incorporated additional languages (English, French, Spanish, Galician and Catalan).[11]
HiTZ Center
In 1988, Díaz de Ilarraza created the Ixa Group along with four others. Both Ixa and HiTZ are multidisciplinary teams (73 members, consisting of computer scientists, linguists and engineers) that promote research, training, technological transfer and innovation in the area of language technology, mainly for the Basque language.[12][13] In 2018, she retired as President of the HiTZ Center (Basque Center for Language Technology).
SEPLN association
She was one of the founders of the Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN; in English, Spanish Society for Natural Language Processing), a scientific and professional association for people working on natural language processing. From 1990 to 2004, she was vice-president of SEPLN and also editor of the international journal Procesamiento del Lenguaje Natural, published by SEPLN.[14]
Textbooks in Basque
Díaz de Ilarraza was one of the first authors in the area of computer science to publish textbooks and teaching materials in Basque, with those books later being translated into Spanish. In 1993, she published the Basque-language book Programen egiaztapena eta eratorpena (English: Program verification and derivation) with Xabier Arregi and Paqui Lucio Carrasco (UEU).[15] In 1999, she co-authored Oinarrizko programazioa. Ariketa bilduma (English: Basic Programming) with Kepa Sarasola.[16]
References
- ^ Diaz de Ilarraza, Arantza. "GESTION DE DIALOGOS EN LENGUAJE NATURAL PARA UN SISTEMA DE ENSEÑANZA INTELIGENTE". Tesis doctorales: TESEO. Ministerio de Educación Cultura y Deporte. Madrid. Retrieved 10 February 2021.
- ^ Gutierrez, Julián; Elorriaga, Jon Ander; Fernandez Castro, Isabel; Vadillo, Jose Angel; Diaz-Ilarraza, Arantza (1998). "Intelligent tutoring systems for training of operators for thermal power plants". Artificial Intelligence in Engineering. 12 (3): 8. doi:10.1016/S0954-1810(97)00015-0. Retrieved 10 February 2021.
- ^ Arregi-Iparragirre, Xabier. "ANHITZ: SISTEMA DICCIONARIAL MULTILINGUE. Multilingual dictionary-system". Tesis doctorales: TESEO. Ministerio de educación. Madrid. Retrieved 10 February 2021.
- ^ Agirre, Eneko. "KONTZEPTUEN ARTEKO ERALAZIO-IZAERAREN FORMALIZAZIOA ONTOLOGIAK ERABILIAZ: DENTSITATE KONTZEPTUALA. Conceptual density". Tesis doctorales: TESEO. Ministerio de Educación. Madrid. Retrieved 10 February 2021.
- ^ Maritxalar-Anglada, Montserrat. "MUGARRI: ENTORNO MULTISISTÉMICO PARA EL ACCESO AL CONOCIMIENTO LINGÜISTICO DE ESTUDIANTES DE SEGUNDA LENGUA". Tesis doctorales: TESEO. Ministerio de Educación. Madrid. Retrieved 10 February 2021.
- ^ Oronoz-Antxordoki, Maite. "EUSKARAZKO ERRORE SINTAKTIKOAK DETEKTATZEKO ETA ZUZENTZEKO BALIABIDEEN GARAPENA: DATAK, POSTPOSIZIO-LOKUZIOAK ETA KOMUNZTADURA". Tesis doctorales: TESEO. Ministerio de Educación. Madrid. Retrieved 10 February 2021.
- ^ Aduriz, Itziar; Aranzabe, Maxux; Atutxa, Aitziber; Diaz de Ilarraza, Arantza; Ezeiza, Nerea; Gojenola, Koldo; Oronoz, Maite; Urizar, Ruben. "Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing" (PDF). CLAW2006. Ixa Group. Retrieved 10 February 2021.
- ^ "EPEC-DEP (BDT) / EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing". IXA Group. University of the Basque Country. Ixa Group. Retrieved 10 February 2021.
- ^ ""Itzulbide" project: a tool for normalizing the use of Basque in clinical histories". Language Technology. Ixa Group. Retrieved 11 February 2021.
- ^ "Machine Translation". HiTZ. Basque Center for Language Technology. UPV/EHU HiTZ Center. Retrieved 10 February 2021.
- ^ "Itzultzailea. Elhuyarren itzultzaile automatikoa. Machine translator of Elhuyar". Itzultzailea. Elhuyarren itzultzaile automatikoa. Elhuyar Foundation. Retrieved 10 February 2021.
- ^ SEPLN. "Recognition of the scientific career of SEPLN retired members at SEPLN 2019". Spanish Society for Natural Language Processing. SEPLN. Retrieved 10 February 2021.
- ^ "The advisory council of the SEPLN journal". Spanish Society for Natural Language Processing. Spanish Society for Natural Language Processing. Retrieved 10 February 2021.
- ^ "Recognition of the scientific career of SEPLN retired members at SEPLN 2019 | Spanish Society for Natural Language Processing". www.sepln.org. Retrieved 21 September 2023.
- ^ "Bilaketaren emaitzak :: Buruxkak Liburutegi digitala".
- ^ Diaz de Ilarraza Sanchez, Arantza; Sarasola, Kepa (1999). Oinarrizko programazioa: Ariketa bilduma. Bilbo: UEU, Basque Summer University. ISBN 978-84-8438-002-3. Retrieved 4 February 2021.