
Hindi long speech transcription

Hindi Colloquial Video Annotation

Hindi Colloquial Video Annotation Guidelines Document
Two reminders regarding the rules:
1. Noise tags must be annotated. The most common tag is [N], because much of our data contains background music or audio noise.
2. Invalid sentences do not need to be segmented. All segmented data must be valid.
1. Data Overview
1.1 Data Type
The audio data is sourced from publicly available videos. Speakers must express themselves in a natural, colloquial style; carefully enunciated or overly formal reading styles, such as those found in language-teaching or film-related videos, should be avoided. Videos with prolonged background music or noisy recording environments should also be avoided. (If videos that do not meet the requirements are discovered during the annotation process, they must be reported immediately. Misannotated data will be considered invalid.)
1.2 Language
Hindi, with a tolerance for slight accents. However, data with accents so strong that they hinder normal annotation, or data containing dialects, will not be accepted. (If videos that do not meet the requirements are discovered during the annotation process, they must be reported immediately. Misannotated data will be considered invalid.)
1.3 Data Volume
This project mainly involves collecting online video data, with a total annotation requirement of 500 valid hours. Each audio/video segment must be at least five minutes long. (If videos that do not meet the requirements are discovered during the annotation process, they must be reported immediately. Misannotated data will be considered invalid.)
1.4 Data Quality
The following types of data must be excluded during annotation:
1. Videos with an obvious reading style or speakers with heavy accents.
2. Videos predominantly in languages other than the required one.
3. Poor audio quality, with background music or noisy environments present for most of the duration.
4. Cases where the speaker is not physically present (e.g., phone calls, recorded playback).
5. Synthetic robot voices generated through speech synthesis.

2. Data Annotation

2.1 Validity Assessment
For complete, qualified audio segments, annotate by segmenting the audio sentence by sentence. A sentence is invalid, and does not require segmentation, in any of the following situations:
1. Two people speak simultaneously and their voices overlap significantly at similar volume. If the overlap is minimal (only one or two words) and the primary speaker's content is clearly audible, transcribe normally.
2. The sentence contains inaudible parts whose content cannot be determined.
3. Strong noise (environmental noise, equipment noise) makes it difficult to hear the primary speaker.
4. The sentence contains frame drops.
5. The sentence is not spoken by a human (e.g., automated customer service, synthesized voice, or TV broadcast).
6. The sentence contains non-target-language portions. (English can be transcribed, but other languages require consultation with the product manager.)
7. The sentence contains sensitive information (e.g., politically sensitive, religiously sensitive, explicit, or violent content).
Based on annotation experience, the average length of each natural-language segment should be 5-6 seconds.
2.2 Valid Audio Segmentation
1. Annotators must ensure semantic coherence and segment by sentence. Long sentences can be split into clauses. Each sentence should last no longer than 8 seconds, but should not be too short.
2. The optimal position for a time boundary is at the lowest point of the waveform.
3. Audio from different speakers should not be segmented into the same sentence.
4. When annotating, leave 0.2 to 0.3 seconds of silence around the audio segment. If no such silence exists, it is not mandatory. Segments should be taken from parts without sudden noise. If necessary, the buffer time before and after the segment can be shortened to avoid sudden noise, but the audio must not be clipped.
5. Even single-word responses must be segmented. If neighboring sentences can be merged, do so whenever possible.
6. If a speaker pauses for more than 2 seconds, split the sentence into two parts without considering semantic continuity. Pauses shorter than 2 seconds should remain within one sentence, provided the total duration does not exceed 8 seconds.
7. If a speaker pauses for less than 2 seconds, and splitting at the pause would introduce noise or break the continuity and completeness of the meaning, do not split the segment.
8. Audio with clipping, truncation, frame drops, or abnormal energy levels is considered invalid.
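The timing rules above (a split at pauses longer than 2 seconds, and a ceiling of 8 seconds per sentence) can be sketched as a small check. This is only an illustration: the `Word` record, the function names, and the assumption that word-level timestamps are available are all hypothetical, and duration is measured here on the speech itself, excluding the 0.2 to 0.3 seconds of padding.

```python
from dataclasses import dataclass

MAX_SEGMENT_S = 8.0  # rule 1: no sentence longer than 8 seconds
MAX_PAUSE_S = 2.0    # rule 6: pauses longer than 2 seconds force a split


@dataclass
class Word:
    """Hypothetical word-level timestamp, in seconds."""
    text: str
    start: float
    end: float


def split_on_pauses(words):
    """Group words into segments, splitting where the gap exceeds MAX_PAUSE_S."""
    segments = []
    current = []
    for word in words:
        if current and word.start - current[-1].end > MAX_PAUSE_S:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


def too_long(segment):
    """Return True if the speech in a segment lasts longer than MAX_SEGMENT_S."""
    return segment[-1].end - segment[0].start > MAX_SEGMENT_S
```

Segments flagged by `too_long` would still need a human decision on where to split them into clauses, since semantic coherence cannot be judged from timestamps alone.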
2.3 Speaker Identification
Different speakers within the same segment should be assigned distinct identity IDs and labeled with their gender.
2.4 Transcription Guidelines
Annotators must transcribe the audio content exactly as heard, ensuring the transcription is free of missing, extra, or incorrect words. The general rules are as follows:
1. Capitalization: If a word is typically capitalized, transcribe it according to standard writing conventions. For example: China, Microsoft.
2. Numbers: When numbers appear in the audio, do not transcribe them as Arabic numerals. Instead, write them out in the language's standard written form.

2.5 Spelled-out Words
Letters should be transcribed in uppercase and separated by spaces. For example:
five thirty pm	five thirty P M
the fbi	the F B I
NFC	N F C
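As a minimal sketch of the spelled-out-letters rule, the hypothetical helper below (not part of the guidelines) formats a letter-by-letter word as uppercase letters separated by spaces. Deciding which words are actually spelled out in the audio remains the annotator's judgment.

```python
def spell_out(letters: str) -> str:
    """Format a letter-by-letter word as uppercase letters separated by spaces.

    Non-letter characters such as periods are dropped.
    """
    return " ".join(ch.upper() for ch in letters if ch.isalpha())
```

For example, `spell_out("fbi")` gives `F B I`, and `spell_out("NFC")` gives `N F C`.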

2.6 Abbreviations
When transcribing, do not use abbreviations. Always write out the full word as pronounced. For example:
This is Dr. Smith	this is doctor smith

2.7 Punctuation
Punctuation must be used according to grammatical rules. Any punctuation explicitly spoken by the speaker should be transcribed. For example:

• "@" should be transcribed as at
• ".com" should be transcribed as dot com
Only the following punctuation marks are allowed during transcription:
• Comma (,),
• Hyphen (-) (only within words),
• Period (.),
• Exclamation mark (!),
• Apostrophe (’),
• Question mark (?).
No other punctuation is permitted. Any inserted punctuation must adhere to grammatical rules.
All symbols should be in the standard English input format.
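The punctuation whitelist above, together with the ban on Arabic numerals stated elsewhere in these guidelines, lends itself to an automated pre-check. The sketch below is an assumption about how such a check might look, not a tool the project provides. It accepts both the straight and the curly apostrophe, does not verify that hyphens sit inside words, and expects annotation tags such as [N] to be stripped first, since their brackets would otherwise be reported.

```python
import unicodedata

# Comma, hyphen, period, exclamation mark, question mark, and both apostrophe forms.
ALLOWED_PUNCTUATION = set(",-.!?'\u2019")


def find_violations(text: str) -> list[str]:
    """Report Arabic numerals and any punctuation or symbol outside the whitelist."""
    violations = []
    for index, ch in enumerate(text):
        if ch in "0123456789":
            violations.append(f"Arabic numeral {ch!r} at index {index}")
        elif (unicodedata.category(ch).startswith(("P", "S"))
              and ch not in ALLOWED_PUNCTUATION):
            violations.append(f"disallowed symbol {ch!r} at index {index}")
    return violations
```

Devanagari letters and vowel signs fall in Unicode letter and mark categories, so they pass; the danda (।) is classed as punctuation and would be reported, consistent with the whitelist.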

2.8 Interjections
1. Interjections must be transcribed accurately based on pronunciation and meaning.
2. Arabic numerals are not allowed in the transcription of interjections or general text.

2.9 Other Guidelines
• Profanity: Transcribe offensive language exactly as spoken. Do not substitute or censor letters.
• Internet Slang and Popular Terms: Transcribe internet slang and commonly used online terms according to their standard usage.
• Repetition: If words or phrases are repeated in the audio, transcribe all occurrences.
• Ambiguous Meaning with Clear Pronunciation: If the audio is clear but the meaning is uncertain (e.g., common names), use homophones that match the pronunciation. Ensure that the transcription aligns with the pronunciation and maintains accuracy. If the context clearly indicates the intended meaning, transcribe accordingly to reflect both the correct pronunciation and meaning.
• Unfinished Words: If a word is cut off before completion, add a hyphen (-) and leave a space before the next word. For example: I want to go to s- school.
However, if the incomplete word appears at the end of a sentence, omit it. Transcribe only complete words at the sentence’s end.

2.10 Special Symbols
• If special cases arise during annotation, apply the appropriate tags.
• Ensure that all tags are valid and avoid errors such as:
◦ Missing paired tags,
◦ Inconsistent capitalization,
◦ Unmatched brackets.

Data Validity	Noise	Special Tags	Explanation	Role Identification	Text Transcription
Valid data	no noise	none	Transcribe the audio content according to the guidelines and standards provided.	O1 or O2…	Today I went to eat.
		[N]	If a sentence contains noise, mark it at the end with [N], without specifying the type of noise. Oral noises made by the speaker (such as lip smacking or inhaling) do not need to be annotated.	O1 or O2…	Today I went to eat [N]
		[HM]	If the speaker performs rap or sings part of the content, mark the end of the sentence with [HM] to indicate the melodic or rhythmic nature of the speech.	O1 or O2…	I drink alone, intoxicated [HM]
		[OVERLAP/] [/OVERLAP]	If there is overlapping speech and one speaker is significantly clearer, transcribe only the clear speaker's voice. Use role identification to label the clear speaker, and wrap the affected text in [OVERLAP/] and [/OVERLAP] to indicate interference from the other speaker.	O1 or O2…	Today I went to [OVERLAP/] eat [/OVERLAP] dinner.
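The three tag errors named in section 2.10 (missing paired tags, inconsistent capitalization, unmatched brackets) can be caught mechanically. The following is a minimal sketch, assuming the tag set is exactly the one shown in the table above; the function name and error messages are illustrative only.

```python
import re

END_TAGS = {"[N]", "[HM]"}
OPEN_TAG, CLOSE_TAG = "[OVERLAP/]", "[/OVERLAP]"
KNOWN_TAGS = END_TAGS | {OPEN_TAG, CLOSE_TAG}


def tag_errors(text):
    """Return a list of tag problems found in one transcribed sentence."""
    errors = []
    if text.count("[") != text.count("]"):
        errors.append("unmatched brackets")
    depth = 0  # number of overlap spans currently open
    for tag in re.findall(r"\[[^\[\]]*\]", text):
        if tag not in KNOWN_TAGS:
            if tag.upper() in KNOWN_TAGS:
                errors.append(f"inconsistent capitalization: {tag}")
            else:
                errors.append(f"unknown tag: {tag}")
        elif tag == OPEN_TAG:
            depth += 1
        elif tag == CLOSE_TAG:
            depth -= 1
            if depth < 0:
                errors.append("closing tag without opening tag")
                depth = 0
    if depth > 0:
        errors.append("missing closing overlap tag")
    return errors
```

An empty list means no tag problems were detected. The check does not verify that [N] or [HM] appears at the end of the sentence.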
