Publications

Highlights

(For a full list see below or go to HAL)

A corpus of audio-visual Lombard speech with frontal and profile views

A bi-view audio-visual Lombard speech corpus that can support joint computational-behavioral studies in speech perception.

Najwa Alghamdi, Steve Maddock, Ricard Marxer, Jon Barker, and Guy J. Brown

Journal of the Acoustical Society of America

Corpus available at the Audio-visual Lombard Grid Speech Corpus website

The impact of the Lombard effect on audio and visual speech recognition systems

Analysis of audio and visual Lombard speech using a new 54-speaker database. New data on the inter-speaker variability of the Lombard effect. Measurement of the impact of Lombard mismatch in a noise-robust speech recognition system. Detailed analysis of plain versus Lombard speech performance in a well-adapted recognition system. Evidence that visual Lombard speech supports higher recognition performance than visual plain speech.

Ricard Marxer, Jon Barker, Najwa Alghamdi, and Steve Maddock

Speech Communication


Full List

  1. Gogate, M., Adeel, A., Marxer, R., Barker, J., & Hussain, A. (2018). DNN Driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation. In Interspeech 2018 (pp. 2723–2727). Hyderabad, India: ISCA. https://doi.org/10.21437/Interspeech.2018-2516
  2. Alghamdi, N., Maddock, S., Marxer, R., Barker, J., & Brown, G. (2018). A corpus of audio-visual Lombard speech with frontal and profile views. Journal of the Acoustical Society of America, 143(6), EL523–EL529. https://doi.org/10.1121/1.5042758
  3. Marxer, R., Barker, J., Alghamdi, N., & Maddock, S. (2018). The impact of the Lombard effect on audio and visual speech recognition systems. Speech Communication, 100, 58–68. https://doi.org/10.1016/j.specom.2018.04.006
  4. Tao, L., Burghardt, T., Mirmehdi, M., Damen, D., Cooper, A., Camplani, M., … Craddock, I. (2018). Energy expenditure estimation using visual and inertial sensors. IET Computer Vision, 12(1). https://doi.org/10.1049/iet-cvi.2017.0112
  5. Whitehouse, S., Yordanova, K., Ludtke, S., Paiement, A., & Mirmehdi, M. (2018). Evaluation of cupboard door sensors for improving activity recognition in the kitchen. In IEEE International Conference on Pervasive Computing and Communications Workshops (Percom Workshop).
  6. Yordanova, K., Paiement, A., Schröder, M., Tonkin, M., Woznowski, P., Olsson, C., … Sztyler, T. (2018). Challenges in Annotation of useR Data for UbiquitOUs Systems: Results from the 1st ARDUOUS Workshop. Retrieved from https://arxiv.org/abs/1803.05843
  7. Stroe, P., Xie, X., & Paiement, A. (2018). Manifold modeling of the beating heart motion. In Medical Image Analysis and Understanding (MIUA).
  8. Morgan, J., Paiement, A., Seisenberger, M., Williams, J., & Wyner, A. (2018). A Chatbot Framework for the Children’s Legal Centre. In International conference on Legal Knowledge and Information Systems (Jurix).
  9. Camplani, M., Paiement, A., Mirmehdi, M., Damen, D., Hannuna, S., Burghardt, T., & Tao, L. (2017). Multiple human tracking in RGB-depth data: a survey. IET Computer Vision, 11(4), 265–285. https://doi.org/10.1049/iet-cvi.2016.0178
  10. Moore, R. K., Thill, S., & Marxer, R. (2017). Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR) (Dagstuhl Seminar 16442). Dagstuhl Reports, 6(10), 154–194. https://doi.org/10.4230/DagRep.6.10.154
  11. Sheng, K., Dong, W., Li, W., Razik, J., Huang, F., & Hu, B. (2017). Centroid-Aware Local Discriminative Metric Learning in Speaker Verification. Pattern Recognition, 72(c), 176–185.
  12. Marxer, R., & Barker, J. (2017). Binary Mask Estimation Strategies for Constrained Imputation-Based Speech Enhancement. In Proc. Interspeech 2017 (pp. 1988–1992). https://doi.org/10.21437/Interspeech.2017-1257
  13. Malavasi, M., Turri, E., Atria, J. J., Christensen, H., Marxer, R., Desideri, L., … Green, P. (2017). An Innovative Speech-Based User Interface for Smarthomes and IoT Solutions to Help People with Speech and Motor Disabilities. Studies in Health Technology and Informatics, 242(Harnessing the Power of Technology to Improve Lives), 306–313. https://doi.org/10.3233/978-1-61499-798-6-306
  14. Vincent, E., Watanabe, S., Nugraha, A. A., Barker, J., & Marxer, R. (2017). An analysis of environment, microphone and data simulation mismatches in robust speech recognition. Computer Speech & Language, 46, 535–557. https://doi.org/10.1016/j.csl.2016.11.005
  15. Barker, J., Marxer, R., Vincent, E., & Watanabe, S. (2017). The third ‘CHiME’ speech separation and recognition challenge: Analysis and outcomes. Computer Speech & Language, 46, 605–626. https://doi.org/10.1016/j.csl.2016.10.005
  16. Yordanova, K., Whitehouse, S., Paiement, A., Mirmehdi, M., Kirste, T., & Craddock, I. (2017). What’s cooking and why? Behaviour recognition during unscripted cooking tasks for health monitoring. In IEEE International Conference on Pervasive Computing and Communications - Work in Progress (PerCom Work in Progress). https://doi.org/10.1109/PERCOMW.2017.7917511
  17. Zhou, K., Paiement, A., & Mirmehdi, M. (2017). Detecting humans in RGB-D data with CNNs. In IAPR International Conference on Machine Vision Applications (MVA). https://doi.org/10.23919/MVA.2017.7986862
  18. Woznowski, P., Burrows, A., Diethe, T., Fafoutis, X., Hall, J., Hannuna, S., … Oikonomou, G. (2017). SPHERE: A Sensor Platform for Healthcare in a Residential Environment. In Designing, Developing, and Facilitating Smart Cities (pp. 315–333). Springer International Publishing.
  19. Abel, A., Marxer, R., Barker, J., Watt, R., Whitmer, B., Derleth, P., & Hussain, A. (2016). A Data Driven Approach to Audiovisual Speech Mapping. In C.-L. Liu, A. Hussain, B. Luo, K. C. Tan, Y. Zeng, & Z. Zhang (Eds.), Advances in Brain Inspired Cognitive Systems - 8th International Conference, BICS 2016, Beijing, China, November 28-30, 2016, Proceedings (Vol. 10023, pp. 331–342). https://doi.org/10.1007/978-3-319-49685-6_30
  20. Marxer, R., Barker, J., Cooke, M., & Garcia Lecumberri, M. L. (2016). A corpus of noise-induced word misperceptions for English. The Journal of the Acoustical Society of America, 140(5), EL458–EL463. https://doi.org/10.1121/1.4967185
  21. Moore, R. K., Marxer, R., & Thill, S. (2016). Vocal Interactivity in-and-between Humans, Animals, and Robots. Frontiers in Robotics and AI, 3. https://doi.org/10.3389/frobt.2016.00061
  22. Moore, R. K., & Marxer, R. (2016). Progress and prospects for spoken language technology: Results from four sexennial surveys. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 3012–3016). https://doi.org/10.21437/Interspeech.2016-948
  23. Lecumberri, M. L. G., Barker, J., Marxer, R., & Cooke, M. (2016). Language effects in noise-induced word misperceptions. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 640–644). https://doi.org/10.21437/Interspeech.2016-330
  24. Green, P., Marxer, R., Cunningham, S., Christensen, H., Rudzicz, F., Yancheva, M., … Tamburini, F. (2016). CloudCAST - Remote speech technology for speech professionals. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1608–1612). https://doi.org/10.21437/Interspeech.2016-148
  25. Bosch, J. J., Marxer, R., & Gómez, E. (2016). Evaluation and combination of pitch estimation methods for melody extraction in symphonic classical music. Journal of New Music Research, 45(2), 101–117. https://doi.org/10.1080/09298215.2016.1182191
  26. Marxer, R., & Purwins, H. (2016). Unsupervised Incremental Online Learning and Prediction of Musical Audio Signals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(5), 863–874. https://doi.org/10.1109/TASLP.2016.2530409
  27. Liu, C.-L., Hussain, A., Luo, B., Tan, K. C., Zeng, Y., & Zhang, Z. (Eds.). (2016). Advances in Brain Inspired Cognitive Systems - 8th International Conference, BICS 2016, Beijing, China, November 28-30, 2016, Proceedings (Vol. 10023). https://doi.org/10.1007/978-3-319-49685-6
  28. Barker, J., Marxer, R., Vincent, E., & Watanabe, S. (2016). The CHiME challenges: Robust speech recognition in everyday environments. In New era for robust speech recognition - Exploiting deep learning. Springer. Retrieved from https://hal.inria.fr/hal-01383263
  29. Tao, L., Burghardt, T., Mirmehdi, M., Damen, D., Cooper, A., Hannuna, S., … Craddock, I. (2016). Calorie counter: RGB-depth visual estimation of energy expenditure at home. In Asian Conference on Computer Vision Workshop (ACCV Workshop) (Vol. 10116 LNCS). https://doi.org/10.1007/978-3-319-54407-6_16
  30. Hall, J., Hannuna, S., Camplani, M., Mirmehdi, M., Damen, D., Burghardt, T., … Craddock, I. (2016). Designing a video monitoring system for AAL applications: The SPHERE case study. In IET International Conference on Technologies for Active and Assisted Living (TechAAL). https://doi.org/10.1049/ic.2016.0061
  31. Tao, L., Paiement, A., Damen, D., Mirmehdi, M., Hannuna, S., Camplani, M., … Craddock, I. (2016). A comparative study of pose representation and dynamics modelling for online motion quality assessment. Computer Vision and Image Understanding, 148, 136–152. https://doi.org/10.1016/j.cviu.2015.11.016
  32. Hannuna, S., Camplani, M., Hall, J., Mirmehdi, M., Damen, D., Burghardt, T., … Tao, L. (2016). DS-KCF: a real-time tracker for RGB-D data. Journal of Real-Time Image Processing. https://doi.org/10.1007/s11554-016-0654-3
  33. Tao, L., Burghardt, T., Mirmehdi, M., Damen, D., Cooper, A., Camplani, M., … Craddock, I. (2016). Real-time estimation of physical activity intensity for daily living. In IET International Conference on Technologies for Active and Assisted Living (TechAAL). https://doi.org/10.1049/ic.2016.0060
  34. Whitehouse, S., Yordanova, K., Paiement, A., & Mirmehdi, M. (2016). Recognition of unscripted kitchen activities and eating behaviour for health monitoring. In IET International Conference on Technologies for Active and Assisted Living (TechAAL). https://doi.org/10.1049/ic.2016.0050
  35. Paiement, A., Mirmehdi, M., Xie, X., & Hamilton, M. C. K. (2016). Registration and modeling from spaced and misaligned image volumes. IEEE Transactions on Image Processing, 25(9). https://doi.org/10.1109/TIP.2016.2586660
  36. Barker, J., Marxer, R., Vincent, E., & Watanabe, S. (2015). The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015 (pp. 504–511). https://doi.org/10.1109/ASRU.2015.7404837
  37. Ma, N., Marxer, R., Barker, J., & Brown, G. J. (2015). Exploiting synchrony spectra and deep neural networks for noise-robust automatic speech recognition. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015 (pp. 490–495). https://doi.org/10.1109/ASRU.2015.7404835
  38. Bonada, J., Janer, J., Marxer, R., Umeyama, Y., Kondo, K., & Garcia, F. (2015). Technique for estimating particular audio component (Version 1). Retrieved from https://www.google.com/patents/US9224406
  39. Casanueva, I., Hain, T., Christensen, H., Marxer, R., & Green, P. (2015). Knowledge transfer between speakers for personalised dialogue management. In the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp. 12–21). Prague, Czech Republic: Association for Computational Linguistics.
  40. Marxer, R., Cooke, M., & Barker, J. (2015). A framework for the evaluation of microscopic intelligibility models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2558–2562).
  41. Bonada, J., Janer, J., Marxer, R., Umeyama, Y., & Kondo, K. (2015). Technique for suppressing particular audio component (Version 1). Retrieved from https://www.google.com/patents/US9070370
  42. Umeyama, Y., Kondo, Y., Takahashi, K., Bonada, J., Janer, J., & Marxer, R. (2015). Technique for estimating particular audio component (Version 1). Retrieved from https://www.google.com/patents/US9002035
  43. Mercier, M., Razik, J., & Glotin, H. (2015). Synthèses d’interactions multimodales parcimonieuses pour l’écriture de l’œuvre Iquisme et l’analyse de ses percepts. In Journées d’Informatique Musicale (pp. 1–7). Montreal.
  44. Doh, Y., Razik, J., & Glotin, H. (2015). Sparse coding of Megaptera songs reveals their evolution. In Humpback Whale World Congress. Sainte Marie, Madagascar.
  45. Razik, J., Glotin, H., Hoeberechts, M., Doh, Y., & Paris, S. (2015). Sparse coding for efficient bioacoustic data mining: Preliminary application to analysis of whale songs. In EADM, ICDM Workshop.
  46. Paiement, A. (2015). Integrated registration, segmentation, and interpolation for 3D/4D sparse data. Electronic Letters on Computer Vision and Image Analysis, 14(3). https://doi.org/10.5565/rev/elcvia.712
  47. Woznowski, P., Fafoutis, X., Song, T., Hannuna, S., Camplani, M., Tao, L., … Craddock, I. (2015). A multi-modal sensor infrastructure for healthcare in a residential environment. In IEEE International Conference on Communication Workshop (ICCW). https://doi.org/10.1109/ICCW.2015.7247190
  48. Tao, L., Burghardt, T., Hannuna, S., Camplani, M., Paiement, A., Damen, D., … Craddock, I. (2015). A comparative home activity monitoring study using visual and inertial sensors. In International Conference on E-health Networking, Application & Services (HealthCom). https://doi.org/10.1109/HealthCom.2015.7454583
  49. Crabbe, B., Paiement, A., Hannuna, S., & Mirmehdi, M. (2015). Skeleton-Free Body Pose Estimation from Depth Images for Movement Analysis. In Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCVW). https://doi.org/10.1109/ICCVW.2015.49
  50. Camplani, M., Hannuna, S., Mirmehdi, M., Damen, D., Paiement, A., Tao, L., & Burghardt, T. (2015). Real-time RGB-D Tracking with Depth Scaling Kernelised Correlation Filters and Occlusion Handling. In British Machine Vision Conference (BMVC).
  51. Paiement, A., Mirmehdi, M., Xie, X., & Hamilton, M. C. K. (2014). Integrated segmentation and interpolation of sparse data. IEEE Transactions on Image Processing, 23(1). https://doi.org/10.1109/TIP.2013.2286903
  52. Paiement, A., Tao, L., Hannuna, S., Camplani, M., Damen, D., & Mirmehdi, M. (2014). Online quality assessment of human movement from skeleton data. In British Machine Vision Conference (BMVC). https://doi.org/10.5244/C.28.79
  53. Marxer, R. (2013). Audio Source Separation in Low-latency and High-latency Scenarios (PhD thesis). Universitat Pompeu Fabra, Barcelona, Spain. Retrieved from http://ricardmarxer.com/phd/thesis_revised.pdf
  54. Janer, J., & Marxer, R. (2013). Separation of unvoiced fricatives in singing voice mixtures with semi-supervised NMF. In Proceedings of the 16th International Conference on Digital Audio Effects Conference (DAFx-13), Maynooth, Ireland.
  55. Marxer, R., & Janer, J. (2013). Modelling and Separation of Singing Voice Breathiness in Polyphonic Mixtures. In Proceedings of the 16th International Conference on Digital Audio Effects Conference (DAFx-13), Maynooth, Ireland.
  56. Marxer, R., & Janer, J. (2013). Low-latency bass separation using harmonic-percussion decomposition. In Proceedings of the 16th International Conference on Digital Audio Effects Conference (DAFx-13), Maynooth, Ireland.
  57. Glotin, H., Sueur, J., Artières, T., Adam, O., & Razik, J. (2013). Sparse coding for scaled bioacoustics: From Humpback whale songs evolution to forest soundscape analyses. Journal of the Acoustical Society of America, 133(5), 3311. https://doi.org/10.1121/1.4805502
  58. Doh, Y., Glotin, H., Razik, J., & Paris, S. (2013). Mono-channel spectral attenuation modeled by hierarchical neural net estimates hydrophone-whale distance. In Neural Information Processing Scaled for Bioacoustics, from Neurons to Big Data - NIPS Int. Conf. workshop (pp. 88–96). ISSN 979-10-90821-04-0. Retrieved from http://sabiod.org/nips4b
  59. Bartcus, M., Chamroukhi, F., Razik, J., & Glotin, H. (2013). Unsupervised whale song decomposition with Bayesian non-parametric Gaussian Mixture. In Neural Information Processing Scaled for Bioacoustics, from Neurons to Big Data - NIPS Int. Conf. workshop (pp. 205–211). ISSN 979-10-90821-04-0. Retrieved from http://sabiod.org/nips4b
  60. Glotin, H., Giraudet, P., Razik, J., Paris, S., Halkias, X., Chamroukhi, F., … Mishchenko, A. (2013). Tracking multiple marine mammals by shortly or widely spaced hydrophones. In Dirac NGO, Detection Classification localization of Marine Mammals using passive acoustics (pp. 71–92). ISBN 978-2-7466-6118-9.
  61. Paris, S., Doh, Y., Glotin, H., Halkias, X., & Razik, J. (2013). Physeter catodon localization by sparse coding. In ICML4B Workshop at the ICML Int. Conf.
  62. Doh, Y., Razik, J., Paris, S., Adam, O., & Glotin, H. (2013). Décomposition parcimonieuse des chants de cétacés pour leur suivi. Traitement Du Signal, 30(3-4-5), 219–242.
  63. Doh, Y., Saloma, A., Gandilhon, N., Nolibé, G., Jung, J.-L., Razik, J., … Adam, O. (2013). Biodiversité marine : observatoires des baleines (Marine Biodiversity and whale observatories). In Tall Ship Race conference. Toulon - France.
  64. Bosch, J. J., Kondo, K., Marxer, R., & Janer, J. (2012). Score-informed and timbre independent lead instrument separation in real-world scenarios. In 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO) (pp. 2417–2421). IEEE.
  65. Marxer, R. (2012). El arte generativo y la belleza de los procesos. Novatica, (216), 51–56. Retrieved from http://www2.ati.es/novatica/2012/216/nv216sum.html
  66. Marxer, R., Janer, J., & Bonada, J. (2012). Low-latency instrument separation in polyphonic audio using timbre models. In F. J. Theis, A. Cichocki, A. Yeredor, & M. Zibulevsky (Eds.), Latent Variable Analysis and Signal Separation: Proceedings of 10th International Conference, LVA/ICA 2012, Tel Aviv, Israel (Vol. 7191, pp. 314–321). Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-28551-6_39
  67. Janer, J., Marxer, R., & Arimoto, K. (2012). Combining a harmonic-based NMF decomposition with transient analysis for instantaneous percussion separation. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on (pp. 281–284). IEEE. https://doi.org/10.1109/ICASSP.2012.6287872
  68. Marxer, R., & Janer, J. (2012). A Tikhonov regularization method for spectrum decomposition in low latency audio source separation. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on (pp. 277–280). IEEE. https://doi.org/10.1109/ICASSP.2012.6287871
  69. Razik, J., Paris, S., & Glotin, H. (2012). Broadcast News Phoneme Recognition by Sparse Coding. In ICPRAM.
  70. Ballas, N., Labbé, B., Shabou, A., Borgne, H. L., Gosselin, P., Redi, M., … Crucianu, M. (2012). IRIM at TRECVID 2012: Semantic Indexing and Instance Search. In TRECVID Workshop.
  71. Razik, J., Mella, O., Fohr, D., & Haton, J.-P. (2011). Frame-synchronous and Local Confidence Measures for ASR. IJPRAI, 25(2), 157–182.
  72. Delezoide, B., Gorisse, D., Precioso, F., Gosselin, P., Redi, M., Mérialdo, B., … Glotin, H. (2011). IRIM at TRECVID 2011: Semantic Indexing and Instance Search. In National Institute of Standards and Technology (NIST), TRECVID 2011 workshop participants notebook papers.
  73. Glotin, H., Razik, J., Paris, S., & Prévot, J. M. (2011). Real-time entropic unsupervised violent scenes detection in Hollywood movies - DYNI @ MediaEval CEURS. In MediaEval (Vol. 807).
  74. Razik, J., Glotin, H., Paris, S., & Adam, O. (2011). Humpback whale song sparse coding and information theory analysis. In International Workshop on Detection, Classification, Localization & Density Estimation of Marine Mammals using Passive Acoustics (p. 41).
  75. Glotin, H., Razik, J., Giraudet, P., Paris, S., & Bénard, F. (2011). Sparse coding for fast minke whale tracking with Hawaiian bottom mounted hydrophones. In International Workshop on Detection, Classification, Localization & Density Estimation of Marine Mammals using Passive Acoustics (p. 30).
  76. Bendris, M., Brun, C., Carrive, J., Chollet, G., Marraud, D., Razik, J., & Vanni, S. (2011). Sémantique et Multimodalité en analyse de l’information. Hermes.
  77. Razik, J. (2011). Sparse coding: from speech to whales. ERMITES.
  78. Caon, D., Amehraye, A., Razik, J., Chollet, G., Andreao, R., & Mokbel, C. (2010). Experiments on Acoustic Model supervised adaptation and evaluation by K-Fold Cross Validation technique. In International Symposium on I/V Communications and Mobile Network, ISIVC.
  79. Paiement, A., Mirmehdi, M., Xie, X., & Hamilton, M. (2010). Simultaneous Level Set interpolation and segmentation of short- and long-axis MRI. In Medical Image Understanding and Analysis (MIUA).
  80. Hazan, A., Marxer, R., Brossier, P., Purwins, H., Herrera, P., & Serra, X. (2009). What/when causal expectation modelling applied to audio signals. Connection Science, 21(2-3), 119–143. https://doi.org/10.1080/09540090902733764
  81. Perrot, P., Razik, J., & Chollet, G. (2009). Voice Disguise and Reversibility. In European Academy of Forensic Science EAFS 2009.
  82. Perrot, P., Morel, M., Razik, J., & Chollet, G. (2009). Vocal Forgery in Forensic Sciences. In International Conference on Forensic Applications and Techniques in Telecommunications, Information and Multimedia, e-Forensics (7 pp.).
  83. Perrot, P., Razik, J., Morel, M., Khemiri, H., & Chollet, G. (2009). Techniques de conversion de voix appliquées à l’imposture. In TAIMA 2009 (pp. 393–398).
  84. Zouari, L., Khemiri, H., Razik, J., Amehraye, A., & Chollet, G. (2009). Reconnaissance de la parole en temps réel pour le dialogue oral. In TAIMA 2009 (pp. 409–415).
  85. Purwins, H., Grachten, M., Herrera, P., Hazan, A., Marxer, R., & Serra, X. (2008). Computational models of music perception and cognition II: Domain-specific music processing. Physics of Life Reviews, 5(3), 169–182. https://doi.org/10.1016/j.plrev.2008.03.004
  86. Purwins, H., Herrera, P., Grachten, M., Hazan, A., Marxer, R., & Serra, X. (2008). Computational models of music perception and cognition I: The perceptual and cognitive processing chain. Physics of Life Reviews, 5(3), 151–168. https://doi.org/10.1016/j.plrev.2008.03.004
  87. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2008). Comprehension Improvement using Local Confidence Measure: Towards Automatic Transcription for Classroom. In Workshop on Child, Computer and Interaction - WOCCI, satellite event of ICMI (5 pp.).
  88. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2008). Frame-Synchronous and Local Confidence Measures for on-the-fly Automatic Speech Recognition. In InterSpeech (4 pp.).
  89. Han, Y., Razik, J., Chollet, G., & Liu, G. (2008). Speaker Retrieval for TV Show Videos by Associating Audio Speaker Recognition Result to Visual Faces. In K-Space PhD Jamboree Workshop.
  90. Han, Y., Liu, G., Chollet, G., & Razik, J. (2008). Person Identity Clustering in TV Show Videos. In Visual Information Engineering, VIE’2008 (4 pp.).
  91. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2008). Mesures de confiance locales et trame-synchrones. In Journées d’Etude sur la Parole, JEP’2008 (4 pp.).
  92. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2008). Transcription automatique pour malentendants : amélioration à l’aide de mesures de confiance locales. In Journées d’Etude sur la Parole, JEP’2008 (4 pp.).
  93. Hazan, A., Brossier, P., Marxer, R., & Purwins, H. (2008). What/when causal expectation modelling applied to percussive audio. The Journal of the Acoustical Society of America, 123(5), 3800. https://doi.org/10.1121/1.2935488
  94. Marxer, R., Holonowicz, P., Hazan, A., & Purwins, H. (2008). Dynamical hierarchical self-organization of harmonic and motivic musical categories. The Journal of the Acoustical Society of America, 123(5), 3800. https://doi.org/10.1121/1.2935489
  95. Boily, C. M., Padmanabhan, T., & Paiement, A. (2008). Regular black hole motion and stellar orbital resonances. Monthly Notices of the Royal Astronomical Society, 383(4). https://doi.org/10.1111/j.1365-2966.2007.12682.x
  96. Scholl, I., Habbal, S. R., & Paiement, A. (2008). On the Automated Detection of Coronal Holes in Space-Based Data. In American Geophysical Union.
  97. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2007). Frame-Synchronous and Local Confidence Measures For On-The-Fly Keyword Spotting. In International Symposium on Signal Processing and its Applications, ISSPA (4 pp.).
  98. Razik, J. (2007). Mesures de confiance trame-synchrones et locales en reconnaissance automatique de la parole (PhD thesis). Université de Nancy.
  99. Boily, C. M., Padmanabhan, T., & Paiement, A. (2007). Black hole motion as catalyst of orbital resonances. Proceedings of the International Astronomical Union (Vol. 3). https://doi.org/10.1017/S1743921308015834
  100. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2006). Mesures de Confiance Trame-Synchrone. In Journées d’Etude sur la Parole, JEP’2006 (pp. 135–138).
  101. Razik, J., Mella, O., Fohr, D., & Haton, J. P. (2005). Local Word Confidence Measure Using Word Graph and N-Best List. In EuroSpeech (pp. 3369–3372).
  102. Razik, J., Fohr, D., Mella, O., & Parlangeau-Vallès., N. (2004). Segmentation Parole/Musique pour la Transcription Automatique. In Journées d’Etude sur la Parole, JEP’2004 (pp. 417–420).
  103. Razik, J., Sénac, C., Fohr, D., Mella, O., & Parlangeau-Vallès, N. (2003). Comparison of Two Speech/Music Segmentation Systems For Audio Indexing on the Web. In Multiconference on Systemics, Cybernetics and Informatics, SCI (6 pp.).
  104. Razik, J. (2003). Segmentation Parole/Musique (Master's thesis). Université de Nancy.