
Search Results | Total Results found :   1214

  Catalogue
Under the Indian Languages Corpora Initiative (ILCI) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a corpus with Hindi as the source language and translated it into Konkani as the target language. The corpus contains 70,000 sentences covering domains that include Health, Tourism, Agriculture, and Entertainment. Each sentence has a unique sentence ID, and the corpus is stored as UTF-8 encoded text files. The translated sentences have been POS-tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 27, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative (ILCI) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a corpus with Hindi as the source language and translated it into Kannada as the target language. The corpus contains 70,000 sentences covering domains that include Health, Tourism, Agriculture, and Entertainment. Each sentence has a unique sentence ID, and the corpus is stored as UTF-8 encoded text files. The translated sentences have been POS-tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 27, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative (ILCI) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a corpus with Hindi as the source language and translated it into Assamese as the target language. The corpus contains 70,000 sentences covering domains that include Health, Tourism, Agriculture, and Entertainment. Each sentence has a unique sentence ID, and the corpus is stored as UTF-8 encoded text files. The translated sentences have been POS-tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Last updated on December 26, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative (ILCI) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a corpus with Hindi as the source language and translated it into Urdu as the target language. The corpus contains 70,000 sentences covering domains that include Health, Tourism, Agriculture, and Entertainment. Each sentence has a unique sentence ID, and the corpus is stored as UTF-8 encoded text files. The translated sentences have been POS-tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 26, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Out-of-Vocabulary (OOV) detection and recovery is an important aspect of reducing Word Error Rate (WER) in Automatic Speech Recognition (ASR). In this paper, we evaluate the effect of OOV detection and recovery on WER for a low-resource language ASR system. We use a small seed corpus of continuous speech and improve the vocabulary by incorporating the detected OOV words. We use a syllable model to detect and learn OOV words and augment the word model with these words, leading to improved recognition. Our research investigates the effect on OOV detection and recovery of adding missing syllable sounds to the syllable model using a Text-to-Speech (TTS) system. Our experiments are conducted on a 5-hour Kannada continuous-speech corpus. We use an existing Festival TTS system for Hindi to generate Kannada speech. Our initial experiments show an improvement in OOV detection due to the addition of missing syllable sounds using a cross-lingual TTS system.

Added on December 17, 2019

  More Details
  • Contributed by : Consortium
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Savitha Murthy, Dinkar Sitaram, Sunayana Sitaram