Hindi long speech transcription
Hindi Colloquial Video Annotation
Hindi Colloquial Video Annotation Guidelines Document
Two reminders regarding the rules:
1. Noise tags must be annotated. The most common tag is [N], because much of our data contains background music or audio noise.
2. Invalid sentences do not need to be segmented. All segmented data must be valid.
1.1 Data Type
The audio data is sourced from publicly available videos. Speakers must express themselves in a natural, colloquial style; carefully articulated or overly formal reading styles, such as those found in language-teaching or film-related videos, are not acceptable. Videos with prolonged background music or noisy recording environments should be avoided. (If videos that do not meet the requirements are discovered during the annotation process, they must be reported immediately. Misannotated data will be considered invalid.)
1.2 Language
Hindi, with a tolerance for slight accents. Data with accents strong enough to hinder normal annotation, or data containing dialects, will not be accepted. (If videos that do not meet the requirements are discovered during the annotation process, they must be reported immediately. Misannotated data will be considered invalid.)
1.3 Data Volume
This project mainly involves collecting online video data, with a total annotation requirement of 500 valid hours. Each audio/video segment must be at least five minutes long. (If videos that do not meet the requirements are discovered during the annotation process, they must be reported immediately. Misannotated data will be considered invalid.)
1.4 Data Quality
The following types of data must be excluded during annotation:
1. Videos with an obvious reading style or speakers with heavy accents.
2. Videos predominantly in languages other than the required one.
3. Poor audio quality, with background music or noisy environments present for most of the duration.
4. Cases where the speaker is not physically present (e.g., phone calls, recorded playback).
5. Synthetic robot voices generated through speech synthesis.
2. Data Annotation
2.1 Validity Assessment
For complete, qualified audio, annotation must be performed by segmenting the audio by sentence. The following situations indicate that a sentence is invalid and does not require segmentation:
1. If two people speak simultaneously in a single sentence, and their voices overlap significantly with similar volume, the sentence is deemed invalid. If the overlap is minimal (only one or two words) and the primary speaker's content is clearly audible, normal transcription is performed.
2. If a sentence contains parts that are inaudible and the content cannot be determined, the sentence is invalid.
3. If strong noise (environmental noise, equipment noise) makes it difficult to hear the primary speaker, the sentence is deemed invalid.
4. If there are frame drops in a sentence, it is considered invalid.
5. If the sentence is not spoken by a human (e.g., automated customer service, synthesized voice, or TV broadcast), it is invalid.
6. If the sentence contains non-target language portions, it is considered invalid. (English can be transcribed, but other languages require consultation with the product manager.)
7. If the sentence contains sensitive information (e.g., politically sensitive, religiously sensitive, explicit, or violent content), it is deemed invalid.
Based on annotation experience, the average length of each natural language segment should be 5-6 seconds.
2.2 Valid Audio Segmentation
1. Annotators must ensure semantic coherence and segment by sentence. Long sentences can be split into clauses, with each sentence lasting no longer than 8 seconds, but not too short.
2. The optimal position for time boundaries is at the lowest point of the waveform.
3. Audio from different speakers should not be segmented into the same sentence.
4. When annotating, leave 0.2 to 0.3 seconds of silence around the audio segment. If no such silence exists, it is not mandatory. Segment boundaries should fall in parts without sudden noise. If necessary, the buffer time before and after the segment can be shortened to avoid sudden noise, but the audio must not be clipped.
5. Even single-word responses must be segmented. If neighboring sentences can be merged, do so whenever possible.
6. If a speaker pauses for more than 2 seconds, the sentence should be split into two parts without considering semantic continuity. Pauses shorter than 2 seconds should remain as one sentence, provided the total duration does not exceed 8 seconds.
7. If a speaker pauses for less than 2 seconds, and the pause introduces noise or breaks the continuity and completeness of the meaning, the segment should not be split.
8. Audio with clipping, truncation, frame drops, or abnormal energy levels is considered invalid.
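As a rough illustration of rules 1, 3, and 6 above, the sketch below groups word-level timestamps into segments. The `Word` record, its field names, and both function names are hypothetical and not part of these guidelines; the 8-second and 2-second thresholds are taken from the rules. It does not handle the merging advice in rule 5 or the silence buffer in rule 4.

```python
from dataclasses import dataclass

MAX_SEGMENT_SECONDS = 8.0  # rule 1: no sentence longer than 8 seconds
MAX_PAUSE_SECONDS = 2.0    # rule 6: a pause longer than 2 seconds forces a split


@dataclass
class Word:
    """Hypothetical word-level timestamp record (an assumption for this sketch)."""
    text: str
    start: float  # seconds
    end: float    # seconds
    speaker: str  # role ID such as "O1"


def split_on_pauses(words):
    """Group words into segments, splitting on a speaker change or a pause > 2 s."""
    segments = []
    current = []
    for word in words:
        if current:
            previous = current[-1]
            pause = word.start - previous.end
            if pause > MAX_PAUSE_SECONDS or word.speaker != previous.speaker:
                segments.append(current)
                current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


def too_long(segment):
    """Return True when a segment exceeds the 8-second limit."""
    return segment[-1].end - segment[0].start > MAX_SEGMENT_SECONDS
```

Segments flagged by `too_long` would still need a manual decision about where to split them into clauses, since that choice depends on meaning.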
2.3 Speaker Identification
Different speakers within the same segment should be assigned distinct identity IDs and labeled with their gender.
2.4 Transcription Guidelines
Annotators must transcribe the audio content exactly as heard, ensuring the transcription is free of missing, extra, or incorrect words. The general rules are as follows:
1. Capitalization: If a word is typically capitalized, transcribe it according to standard writing conventions. For example: China, Microsoft.
2. Numbers: When numbers appear in the audio, do not transcribe them as Arabic numerals. Instead, write them out in the language's standard written form.
2.5 Spelled-out Words
Letters should be transcribed in uppercase and separated by spaces. For example:
five thirty pm | five thirty pm |
the fbi | the fbi |
NFC | NFC |
2.6 Abbreviations
When transcribing, do not use abbreviations. Always write out the full word as pronounced. For example:
This is Dr.Smith | this is doctor smith |
2.7 Punctuation
Punctuation must be used according to grammatical rules. Any punctuation or symbol that the speaker explicitly says aloud should be transcribed as words. For example:
• "@" should be transcribed as at
• ".com" should be transcribed as dot com
Only the following punctuation marks are allowed during transcription:
• Comma (,)
• Hyphen (-), only within words
• Period (.)
• Exclamation mark (!)
• Apostrophe (’)
• Question mark (?)
No other punctuation is permitted. Any inserted punctuation must adhere to grammatical rules. All symbols should be in the standard English input format.
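The allowed-punctuation list lends itself to an automatic check. The following is a minimal sketch, not an official tool: the function name and the tag-stripping pattern are assumptions, and both the straight and the curly apostrophe are accepted because the guidelines do not say which form is expected. Annotation tags such as [N] are removed first so that their brackets are not reported.

```python
import re
import unicodedata

# Marks permitted by section 2.7. Accepting both apostrophe forms is an assumption.
ALLOWED_PUNCTUATION = set(",-.!?'’")

# Assumed shape of annotation tags, e.g. [N], [HM], [OVERLAP/], [/OVERLAP].
TAG_PATTERN = re.compile(r"\[/?[A-Z]+/?\]")


def disallowed_punctuation(text: str) -> list[str]:
    """Return the punctuation or symbol characters that section 2.7 does not permit."""
    stripped = TAG_PATTERN.sub("", text)
    found = []
    for ch in stripped:
        category = unicodedata.category(ch)
        # Unicode categories starting with "P" are punctuation, "S" are symbols.
        if category[0] in ("P", "S") and ch not in ALLOWED_PUNCTUATION:
            found.append(ch)
    return found
```

Because the list is read literally, this check also reports marks such as the Devanagari danda (।), which is not among the permitted symbols. It does not verify that hyphens appear only within words.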
2.8 Interjections
1. Interjections must be transcribed accurately based on pronunciation and meaning.
2. Arabic numerals are not allowed in the transcription of interjections or general text.
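The ban on numerals can be checked mechanically. This is a small sketch with a hypothetical function name. It relies on Python's `\d`, which for text patterns matches any Unicode decimal digit, so Devanagari digits are reported alongside 0-9. Treating Devanagari digits as violations is an assumption, since the guidelines name only Arabic numerals.

```python
import re

# For str patterns, \d matches any Unicode decimal digit, including Devanagari digits.
DIGIT_RUN = re.compile(r"\d+")


def find_numerals(text: str) -> list[str]:
    """Return every run of digits in a transcription; an empty list means none were found."""
    return DIGIT_RUN.findall(text)
```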
2.9 Other Guidelines
• Profanity: Transcribe offensive language exactly as spoken. Do not substitute or censor letters.
• Internet Slang and Popular Terms: Transcribe internet slang and commonly used online terms according to their standard usage.
• Repetition: If words or phrases are repeated in the audio, transcribe all occurrences.
• Ambiguous Meaning with Clear Pronunciation: If the audio is clear but the meaning is uncertain (e.g., common names), use homophones that match the pronunciation. Ensure that the transcription aligns with the pronunciation and maintains accuracy. If the context clearly indicates the intended meaning, transcribe accordingly to reflect both the correct pronunciation and meaning.
• Unfinished Words: If a word is cut off before completion, add a hyphen (-) and leave a space before the next word. For example: I want to go to s- school. However, if the incomplete word appears at the end of a sentence, omit it. Transcribe only complete words at the sentence’s end.
2.10 Special Symbols
• If special cases arise during annotation, apply the appropriate tags.
• Ensure that all tags are valid and avoid errors such as:
◦ Missing paired tags
◦ Inconsistent capitalization
◦ Unmatched brackets
Data Validity | Noise | Special Tags | Explanation | Role Identification | Text Transcription |
Valid data | no noise | none | Transcribe the audio content according to the guidelines and standards provided. | O1 or O2… | Today I went to eat. |
 | [N] | | If a sentence contains noise, mark it at the end with [N], without specifying the type of noise. Oral noises made by the speaker (such as lip smacking or inhaling) do not need to be annotated. | O1 or O2… | Today I went to eat [N] |
 | | [HM] | If the speaker performs rap or sings part of the content, mark the end of the sentence with [HM] to indicate the melodic or rhythmic nature of the speech. | O1 or O2… | I drink alone, intoxicated [HM] |
 | | [OVERLAP/] [/OVERLAP] | If there is overlapping speech and one speaker is significantly clearer, transcribe only the clear speaker’s voice. Use role identification to label the clear speaker, and enclose the affected text in [OVERLAP/] and [/OVERLAP] to indicate interference from the other speaker. | O1 or O2… | Today I went to [OVERLAP/] eat [/OVERLAP] dinner. |
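The three tag errors named in section 2.10 (missing paired tags, inconsistent capitalization, unmatched brackets) can be caught with a simple check. The sketch below assumes that [N], [HM], [OVERLAP/], and [/OVERLAP] are the only valid tags, since they are the only ones listed here; the function name and error messages are hypothetical. It does not check that [N] and [HM] sit at the end of the sentence.

```python
import re

SINGLE_TAGS = {"[N]", "[HM]"}
OPEN_TAG, CLOSE_TAG = "[OVERLAP/]", "[/OVERLAP]"
KNOWN_TAGS = SINGLE_TAGS | {OPEN_TAG, CLOSE_TAG}

# Any bracketed token that contains no nested brackets.
BRACKETED = re.compile(r"\[[^\[\]]*\]")


def tag_errors(text: str) -> list[str]:
    """Return a list of tag problems found in one transcribed sentence."""
    errors = []
    if text.count("[") != text.count("]"):
        errors.append("unmatched bracket")
    depth = 0  # number of [OVERLAP/] tags still waiting for [/OVERLAP]
    for match in BRACKETED.finditer(text):
        tag = match.group()
        if tag == OPEN_TAG:
            depth += 1
        elif tag == CLOSE_TAG:
            if depth == 0:
                errors.append("closing tag without opening tag")
            else:
                depth -= 1
        elif tag in SINGLE_TAGS:
            continue
        elif tag.upper() in KNOWN_TAGS:
            errors.append(f"wrong capitalization: {tag}")
        else:
            errors.append(f"unknown tag: {tag}")
    if depth > 0:
        errors.append("opening tag without closing tag")
    return errors
```

An empty list means the sentence passed these checks; any other result lists the problems in the order they were detected.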