Ramón Fernandez Astudillo

[Google Scholar] [GitHub] [LinkedIn] [Twitter]

I am currently a Research Staff Member at IBM Research AI in the T. J. Watson Research Center in Yorktown Heights, New York. Before this I was a senior Research Scientist at Unbabel and an associate researcher at INESC-ID in Lisboa, and before that a PhD candidate at TU Berlin. I got here starting from signal processing, then speech recognition and then natural language processing. While at it, deep learning happened. I ended up spending very large chunks of my life in Spain, Germany and Portugal and recently moved to the United States. It is hard to find things that do not interest me, but artificial and crowd intelligence seem particularly motivating at this moment in history.

2019

2018

2016

2015

2012

2010

PhD

Obtained the Dr-Ing (PhD) title with distinction in 2010 in the fields of speech processing and robust automatic speech recognition with the thesis

Integration of Short-Time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition [pdf] [code]

My doctor-fathers were Reinhold Orglmeister and Rainer Martin, but I mostly have a doctor-mother, Dorothea Kolossa, who developed the initial idea and helped me kickstart my thesis.

In short, the context for my thesis is the following

This was a relatively active topic in robust ASR at the time, with multiple competing approaches (feature/model based). My thesis contributions were basically

Aside from my PhD, during my time in Berlin I also got to tutor the Neural Networks Seminar of EMSP, supervised Phillip Mandelartz’s thesis and helped other students in the department with their projects and theses.

After defending my thesis I still spent some months at EMSP finishing papers. While finishing, a coincidence led me to discover Isabel Trancoso’s department at INESC-ID in Lisbon. After reading João Graça’s and Diamantino Caseiro’s papers, I decided to apply for a post-doctoral grant to transition from speech to natural language processing. Luckily, I was awarded a 3+3 year FCT Post-Doctoral grant to join INESC-ID/L2F.

2006

I worked as an intern at Peiker Acustic for six months with the aid of a Leonardo grant and in collaboration with the TU-Berlin. The output was a spectral codebook-based speech reconstruction algorithm. Aside from realising how hard it is to write a tech report in German, I also got to know better the Minimum Mean-Square Error Short-Time Spectral Amplitude and Log-Spectral Amplitude estimators (also known as Ephraim-Malah filters).

In this same year I was awarded a La Caixa and German Academic Exchange Service (DAAD) scholarship for research towards the Ph.D. degree at the EMSP department of the Technische Universität Berlin. I started working with Dorothea Kolossa on the topic of uncertainty propagation.

2005

I got the Industrial Engineering degree with specialization in electronics and automatics at the Escuela Politecnica Superior de Ingenieria de Gijon (Spain). At the time, this was a 6 year multidisciplinary degree plus thesis. I got to learn a lot of math/physics and all things engineering, from industrial heat and cold to macro-economy. I also got to play with all levels of programming languages, from VHDL and x86 CISC to Visual C++ and MFCs (no Python sadly). In my free time, I started learning fuzzy logic and programmed my first neural network.

I did my last year at Technische Universität Berlin with the aid of an Erasmus grant (2004). My final thesis was also at TU-Berlin, at the Electronics and Medical Signal Processing (EMSP) department (2005). My thesis was directed by Dorothea Kolossa and the topic was pruning in tied-mixture Hidden Markov Models for Automatic Speech Recognition.