•    Freeware
  •    Shareware
  •    Research
  •    Localization Tools 20
  •    Publications 707
  •    Validators 2
  •    Mobile Apps 22
  •    Fonts 31
  •    Guidelines/ Draft Standards 3
  •    Documents 13
  •    General Tools 38
  •    NLP Tools 105
  •    Linguistic Resources 255

Search Results | Total Results found :   1196

You refine search by : All Results
  Catalogue
This paper deals with the problem of detecting replay attacks on speaker verification systems. In literature, apart from the acoustic features, source features have also been successfully used for this task. In existing source features, only the information around glottal closure instants (GCIs) have been utilized. We hypothesize that the feature derived by capturing the temporal dynamics between two GCIs would be more discriminative for such task. Motivated by that, in this work we explore the use of discrete cosine transform compressed integrated linear prediction residual (ILPR) features for discriminating between genuine and replayed signals. A spoof detection system is built using the compressed ILPR feature and a Gaussian mixture model (GMM) classifier. A baseline system is also built using constant-Q cepstral coefficient feature with GMM backend. These systems are tested on the ASVSpoof 2017 Version 2.0 database.

Added on May 9, 2019

251

  More Details
  • Contributed by : Individual
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Sarfaraz Jelil, Sishir Kalita, S. R. Mahadeva ,Rohit Sinha

We propose a novel speech denoising framework by minimizing the probability of error (PE), which measures the deviation probability of the estimate from its true value. To develop the minimum PE (MPE) criterion, one requires the knowledge of the noise probability density function (p.d.f.), which may not be available in a parametric form in speech denoising applications. Therefore, we adopt two approaches for modeling the noise p.d.f.: (i) Gaussian modeling based on adaptive variance estimation; and (ii) a Gaussian mixture model (GMM) in view of its approximation capabilities. We consider discrete cosine transform (DCT) domain shrinkage, where the optimum shrinkage parameter is obtained by minimizing an estimate of the PE. A performance assessment for real-world noise types shows that for input signal-to-noise ratios (SNR) greater than 5 dB, the proposed MPE-based point-wise shrinkage estimators outperform three benchmark techniques in terms of segmental SNR and short-time objective intelligibility (STOI) scores.

Added on May 9, 2019

17

  More Details
  • Contributed by : Individual
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Jishnu Sadasivan, Subhadip Mukherjee, Chandra Sekhar Seelamantula

Code-switching refers to the phenomena of mixing of words or phrases from foreign languages while communicating in a native language by the multilingual speakers. Codeswitching is a global phenomenon and is widely accepted in multilingual communities. However, for training the language model (LM) for such tasks, a very limited code-switched textual resources are available as yet. In this work, we present an approach to reduce the perplexity (PPL) of Hindi-English code-switched data when tested over the LM trained on purely native Hindi data. For this purpose, we propose a novel textual feature which allows the LM to predict the code-switching instances. The proposed feature is referred to as code-switching factor (CS-factor). Also, we developed a tagger that facilitates the automatic tagging of the code-switching instances. This tagger is trained on a development data and assigns an equivalent class of foreign (English) words to each of the potential native (Hindi) words.

Added on May 9, 2019

19

  More Details
  • Contributed by : Individual
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Ganji Sreeram, Rohit Sinha

An approach to diarize taniavartanam segments of a Carnatic music concert is proposed in this paper. Information bottleneck(IB) based approach used for speaker diarization is applied for this task. IB system initializes the segments to be clustered uniformly with fixed duration. The issue with diarization of percussion instruments in taniavartanam is that the stroke rate varies highly across the segments. It can double or even quadru-ple within a short duration, thus leading to variable information rate in different segments. To address this issue, the IB sys-tem is modified to use the stroke rate information to divide the audio into segments of varying durations. These varying dura-tion segments are then clustered using the IB approach which is then followed by Kullback-Leibler hidden Markov model (KL-HMM) based realignment of the instrument boundaries. Perfor-mance of the conventional IB system and the proposed system is evaluated on standard Carnatic music dataset. The proposed technique shows a best case absolute improvement of 8.2% over the conventional IB based system in terms of diarization error rate.

Added on May 9, 2019

6

  More Details
  • Contributed by : Conssortium
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Nauman Dawalatabad, Jom Kuriakose, C. Chandra Sekhar, Hema A. Murthy

Code-switching or mixing is the use of multiple languages in a single utterance or conversation. Borrowing occurs when a word from a foreign language becomes part of the vocabulary of a language. In multilingual societies, switching/mixing and borrowing are not always clearly distinguishable. Due to this, transcription of code-switched and borrowed words is often not standardized, and leads to the presence of homophones in the training data. In this work, we automatically identify and disambiguate homophones in code-switched data to improve recognition of code-switched speech. We use a WX-based common pronunciation scheme for both languages being mixed and unify the homophones during training, which results in a lower word error rate for systems built using this data.

Added on May 9, 2019

18

  More Details
  • Contributed by : Individual
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Brij Mohan Lal Srivastava, Sunayana Sitaram