Publications

Highlights

(For a full list see below or go to HAL)

Eiffel Tower: A deep-sea underwater dataset for long-term visual localization

This dataset gathers images from four visits to the same hydrothermal vent edifice over the course of five years. Camera poses and a common geometry of the scene were estimated using navigation data and Structure-from-Motion; these estimates serve as a reference when evaluating visual localization techniques. An analysis of the data provides insights into the major changes observed over the years.

Clémentin Boittiaux, Claire Dune, Maxime Ferrera, Aurélien Arnaubec, Ricard Marxer, Marjolaine Matabos, Loic Van Audenhaege, Vincent Hugel

The International Journal of Robotics Research

The data is made publicly available at seanoe.org/data/00810/92226/

Homography-Based Loss Function for Camera Pose Regression

A novel loss function based on multiplane homography integration. It requires no prior initialization and depends only on physically interpretable hyperparameters. Compared with existing loss functions, it best minimizes the mean square reprojection error during training.

Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Vincent Hugel

IEEE Robotics and Automation Letters

The code is made publicly available at https://github.com/clementinboittiaux/homography-loss-function

A corpus of audio-visual Lombard speech with frontal and profile views

A bi-view audiovisual Lombard speech corpus which can be used to support joint computational-behavioral studies in speech perception.

Najwa Alghamdi, Steve Maddock, Ricard Marxer, Jon Barker, Guy J. Brown

Journal of the Acoustical Society of America

Corpus available at The Audio-visual Lombard Grid Speech Corpus website

The impact of the Lombard effect on audio and visual speech recognition systems

Analysis of audio and visual Lombard speech using a new 54-speaker database. New data on the inter-speaker variability of the Lombard effect. Measurement of the impact of Lombard mismatch in a noise-robust speech recognition system. Detailed analysis of plain speech versus Lombard speech performance in a well-adapted recognition system. Evidence that visual Lombard speech supports higher recognition performance than visual plain speech.

Ricard Marxer, Jon Barker, Najwa Alghamdi, Steve Maddock

Speech Communication

 

Full List

  1. Poupard, M., Best, P., Morgan, J. P., Pavan, G., & Glotin, H. (2024). A first vocal repertoire characterization of long-finned pilot whales (Globicephala melas) in the Mediterranean Sea: a machine learning approach. Royal Society Open Science, 11(11). https://doi.org/10.1098/rsos.231973
  2. Cuervo, S., & Marxer, R. (2024). Scaling Properties of Speech Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (pp. 351–361). Miami, United States: Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.21
  3. Kershenbaum, A., Akçay, Ç., Babu-Saheer, L., Barnhill, A., Best, P., Cauzinille, J., … Dunn, J. (2024). Automatic detection for bioacoustic research: a practical guide from and for biologists and computer scientists. Biological Reviews. https://doi.org/10.1111/brv.13155
  4. Cauzinille, J., Favre, B., Marxer, R., Clink, D., Ahmad, A. H., & Rey, A. (2024). Investigating self-supervised speech models’ ability to classify animal vocalizations: The case of gibbon’s vocal signatures. In Interspeech 2024 (pp. 132–136). Kos, Greece: ISCA. https://doi.org/10.21437/Interspeech.2024-1096
  5. Best, P., Cuervo, S., & Marxer, R. (2024). Transfer Learning from Whisper for Microscopic Intelligibility Prediction. In Interspeech 2024 (pp. 3839–3843). Kos, Greece: ISCA. https://doi.org/10.21437/Interspeech.2024-2258
  6. Kalda, J., Alumae, T., Lebourdais, M., Bredin, H., Baroudi, S., & Marxer, R. (2024). TalTech-IRIT-LIS Speaker and Language Diarization Systems for DISPLACE 2024. In Interspeech 2024 (pp. 1635–1639). Kos, Greece: ISCA. https://doi.org/10.21437/interspeech.2024-2462
  7. Kalda, J., Alumae, T., Baroudi, S., Lebourdais, M., Bredin, H., & Marxer, R. (2024). ToTaTo System Descriptions for the NOTSOFAR1 Challenge. In 8th International Workshop on Speech Processing in Everyday Environments (CHiME 2024) (pp. 23–25). Kos, Greece: ISCA. https://doi.org/10.21437/CHiME.2024-5
  8. Joly, A., Picek, L., Kahl, S., Goëau, H., Espitalier, V., Botella, C., … Müller, H. (2024). Overview of LifeCLEF 2024: Challenges on Species Distribution Prediction and Identification. In Lecture notes in computer science (Vol. LNCS-14959, pp. 183–207). Grenoble, France: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-71908-0_9
  9. Cauzinille, J., Favre, B., Marxer, R., & Rey, A. (2024). Applying machine learning to primate bioacoustics: review and perspectives. American Journal of Primatology. https://doi.org/10.1002/ajp.23666
  10. Chavin, S., Couvat, J., Best, P., Bouveret, L., Chalifour, J., Duncan, T., … Glotin, H. (2024). Exploration temps-fréquence du répertoire et évolution des chants des baleines à bosse de l’Atlantique Nord. In SERENADE. Toulon, France. Retrieved from https://hal.science/hal-04918598
  11. Daish, P., Micallef, N., Lorenzo-Dus, N., Paiement, A., & Sahoo, D. (2024). Towards Co-Designing a Continuous-Learning Human-AI Interface: A Case Study in Online Grooming Detection. In International Conference on Advanced Visual Interfaces (AVI) workshop - Designing and Building Hybrid Human–AI Systems (SYNERGY 2024). Genoa, Italy. Retrieved from https://hal.science/hal-04556796
  12. Justine, G., Véronique, S., Agnese, M., Stéphane, C., Julie, G., & Hervé, G. (2024). Circadian rhythms of cetaceans from Arctic and Mediterranean seas with regulated anthropophony. In SERENADE. La Garde, France. Retrieved from https://univ-tln.hal.science/hal-04952267
  13. Kalda, J., Pagés, C., Marxer, R., Alumäe, T., & Bredin, H. (2024). PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings. In The Speaker and Language Recognition Workshop (Odyssey 2024) (pp. 115–122). Quebec City, Canada: ISCA. https://doi.org/10.21437/odyssey.2024-17
  14. Cauzinille, J., Favre, B., Marxer, R., & Rey, A. (2024). From speech to primate vocalizations: self-supervised deep learning as a comparative approach. In Proceedings of the 15th International Conference on the Evolution of Language (EVOLANG XV) (Vol. 15, p. 64). Madison, United States: Gary Lupyan. https://doi.org/10.17617/2.3587960
  15. Cuervo, S., & Marxer, R. (2024). Speech Foundation Models on Intelligibility Prediction for Hearing-Impaired Listeners. In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1421–1425). Seoul, South Korea: IEEE. https://doi.org/10.1109/ICASSP48485.2024.10447907
  16. Boittiaux, C., Marxer, R., Dune, C., Arnaubec, A., Ferrera, M., & Hugel, V. (2024). SUCRe: Leveraging Scene Structure for Underwater Color Restoration. In 2024 International Conference on 3D Vision (3DV) (pp. 1488–1497). Davos, Switzerland: IEEE. https://doi.org/10.1109/3DV62453.2024.00148
  17. Joly, A., Picek, L., Kahl, S., Goëau, H., Espitalier, V., Botella, C., … Müller, H. (2024). LifeCLEF 2024 Teaser: Challenges on Species Distribution Prediction and Identification. In N. Goharian, N. Tonellotto, Y. He, A. Lipani, G. McDonald, C. Macdonald, & I. Ounis (Eds.), Lecture notes in computer science (Vol. LNCS-14613, pp. 19–27). Glasgow, United Kingdom: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-56072-9_3
  18. Chaoui, A., Morgan, J., Paiement, A., & Aboudarham, J. (2024). Removing cloud shadows from ground-based solar imagery. Machine Vision and Applications. Retrieved from https://hal.science/hal-04674488
  19. Danneville, F., Hicham, L., Boulet, P., Devienne, P., Glotin, H., & Loyez, C. (2023). [Invited] Ultra low power Cochlea for biodiversity monitoring. In Colloque BIOCOMP 2023. Banyuls-sur-Mer, France. Retrieved from https://hal.science/hal-04463709
  20. Kahl, S., Denton, T., Klinck, H., Reers, H., Cherutich, F., Glotin, H., … Joly, A. (2023). Overview of BirdCLEF 2023: Automated Bird Species Identification in Eastern Africa. In CEUR Workshop Proceedings (Vol. 3497, pp. 1934–1942). Thessaloniki, Greece: Mohammad Aliannejadi, Guglielmo Faggioli, Nicola Ferro, and Michalis Vlachos. Retrieved from https://hal.inrae.fr/hal-04345437
  21. Cuervo, S., & Marxer, R. (2023). On the Benefits of Self-supervised Learned Speech Representations for Predicting Human Phonetic Misperceptions. In INTERSPEECH 2023 (pp. 1788–1792). Dublin, Ireland: ISCA. https://doi.org/10.21437/Interspeech.2023-1476
  22. Moore, R. K., & Marxer, R. (2023). Progress and Prospects for Spoken Language Technology: Results from Five Sexennial Surveys. In INTERSPEECH 2023 (pp. 401–405). Dublin, Ireland: ISCA. https://doi.org/10.21437/Interspeech.2023-235
  23. Chetouani, M., Briefer, E., Dassow, A., Marxer, R., Moore, R., Obin, N., & Stowell, D. (2023). Vocal interactivity in-and-between humans, animals and robots. Interaction Studies, 24(1), 1–4. https://doi.org/10.1075/is.00016.che
  24. Best, P., Paris, S., Glotin, H., & Marxer, R. (2023). Deep audio embeddings for vocalisation clustering. PLoS ONE, 18(7), e0283396. https://doi.org/10.1371/journal.pone.0283396
  25. Richards, F., Paiement, A., Xie, X., Sola, E., & Duc, P.-A. (2023). Panoptic Segmentation of Galactic Structures in LSB Images. In 18th International Conference on Machine Vision Applications, IEICE proceeding series. Hamamatsu, Shizuoka, Japan. Retrieved from https://hal.science/hal-04129549
  26. Sanz, P., Marín, R., López-Barajas, S., Solis, A., Marxer, R., & Hugel, V. (2023). 1st Year of running MIR at UJI. In OCEANS 2023 - Limerick (pp. 1–5). Limerick, Ireland: IEEE. https://doi.org/10.1109/OCEANSLimerick52467.2023.10244270
  27. Boittiaux, C., Dune, C., Ferrera, M., Arnaubec, A., Marxer, R., Matabos, M., … Hugel, V. (2023). Eiffel Tower: A Deep-Sea Underwater Dataset for Long-Term Visual Localization. The International Journal of Robotics Research. https://doi.org/10.1177/02783649231177322
  28. Boittiaux, C., Dune, C., Arnaubec, A., Marxer, R., Ferrera, M., & Hugel, V. (2023). Long-term visual localization in deep-sea underwater environment. In ORASIS. Carqueiranne, France: Thanh Phuong Nguyen. Retrieved from https://hal.science/hal-04108737
  29. Joly, A., Kahl, S., Picek, L., Botella, C., Marcos, D., Šulc, M., … Müller, H. (2023). LifeCLEF 2023 teaser: Species Identification and Prediction Challenges. In LNCS. Lecture Notes in Computer Science (Vol. LNCS-13982, pp. 568–576). Dublin, Ireland: Springer. https://doi.org/10.1007/978-3-031-28241-6_65
  30. Chetouani, M., Mandel-Briefer, E., Dassow, A., Marxer, R., Moore, R. K., Obin, N., & Stowell, D. (2023). Vocal Interactivity in-and-between Humans, Animals and Robots. John Benjamins Publishing Co. https://doi.org/10.1075/is.24.1
  31. Gibbs, L., Bingham, R. J., & Paiement, A. (2023). A novel filtering method for geodetically-determined ocean surface currents using deep learning. Environmental Data Science. Retrieved from https://hal.science/hal-04285643
  32. Patris, J., Malige, F., Hamame, M., Glotin, H., Barchasz, V., Gies, V., … Buchan, S. (2023). Medium-term acoustic monitoring of small cetaceans in Patagonia, Chile. PeerJ, 11, e15292. https://doi.org/10.7717/peerj.15292
  33. Sarano, F., Sarano, V., Tonietto, M.-L., Yernaux, A., Jung, J.-L., Arribart, M., … Adam, O. (2023). Nursing Behavior in Sperm Whales (Physeter macrocephalus). Animal Behavior and Cognition, 10(2), 105–131. https://doi.org/10.26451/abc.10.02.02.2023
  34. Cuervo, S., Łańcucki, A., Marxer, R., Rychlikowski, P., & Chorowski, J. (2022). Variable-rate hierarchical CPC leads to acoustic unit discovery in speech. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (Vol. 35, pp. 34995–35006). New Orleans, United States. Retrieved from https://hal.science/hal-04093636
  35. Dinar, F., Chayla, R., Paris, S., & Busvelle, E. (2022). A low-level set of stationary features dedicated to non-intrusive load monitoring. In International Conference on Systems and Control. Marseille, France. Retrieved from https://hal.science/hal-03855164
  36. Lehnhoff, L., Glotin, H., Bernard, S., Dabin, W., Le Gall, Y., Menut, E., … Mérigot, B. (2022). Behavioural Responses of Common Dolphins Delphinus delphis to a Bio-Inspired Acoustic Device for Limiting Fishery By-Catch. Sustainability, 14(20), 13186. https://doi.org/10.3390/su142013186
  37. Richards, F., Xie, X., Paiement, A., Sola, E., & Duc, P.-A. (2022). Multi-Scale Gridded Gabor Attention for Cirrus Segmentation. In IEEE International Conference on Image Processing (ICIP). Bordeaux, France. https://doi.org/10.1109/ICIP46576.2022.9898045
  38. Hafsati, M., Bentounes, K., & Marxer, R. (2022). Blind Speech Separation Through Direction of Arrival Estimation Using Deep Neural Networks with a Flexibility on the Number of Speakers. In 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP) (pp. 1–5). Shanghai, China: IEEE. https://doi.org/10.1109/MMSP55362.2022.9949050
  39. Best, P., Marxer, R., Paris, S., & Glotin, H. (2022). Temporal evolution of the Mediterranean fin whale song. Scientific Reports, 12(1), 13565. https://doi.org/10.1038/s41598-022-15379-0
  40. Boittiaux, C., Marxer, R., Dune, C., Arnaubec, A., & Hugel, V. (2022). Homography-Based Loss Function for Camera Pose Regression. IEEE Robotics and Automation Letters, 7(3), 6242–6249. https://doi.org/10.1109/LRA.2022.3168329
  41. Rojas-Cerda, C., Buchan, S. J., Branch, T. A., Malige, F., Patris, J., Hucke-Gaete, R., & Staniland, I. (2022). Presence of Southeast Pacific blue whales (Balaenoptera musculus) off South Georgia in the South Atlantic Ocean. Marine Mammal Science, 38, 1425–1441. https://doi.org/10.1111/mms.12946