Sentence Intonation for Polish Language

July 20, 2017 | Author: Jan Trębacz | Category: Speech Synthesis, Sentence (Linguistics), Subject (Grammar), Predicate (Grammar), Syntax

PTFonR07:PTFonR07

2008-05-06

15:50

Strona 79

Sentence Intonation for Polish Language Prozodia wypowiedzi w języku polskim

Bożena Piorkówska, Janusz Rafałko, Wojciech Lesiński, Edward Szpilewski
Institute of Computer Sciences, University of Bialystok, Bialystok, Poland
[email protected]

ABSTRACT

The article presents the results of an examination of sentence intonation in Polish. The tests were performed for the project “Development of multi-voice and multi-language Text-to-Speech (TTS) and Speech-to-Text (STT) conversations system (language: Belarussian, Polish, Russian)”. It gives a short introduction to prosody, with particular emphasis on syntax (the structure of simple and compound sentences). The examined material consisted of recordings of a text read aloud by four different people (two women and two men). The article also presents the process of analysis and plans for future work.

STRESZCZENIE

Artykuł prezentuje wyniki badań intonacji zdaniowej dla języka polskiego przeprowadzonych na potrzeby projektu „Syntezer mowy polskiej na podstawie tekstu”. Krótkie wprowadzenie do problematyki prozodii, ze szczególnym uwzględnieniem składni, czyli budowy zdania pojedynczego i złożonego. Materiałem badawczym były nagrania tekstu czytanego przez cztery różne osoby (dwie kobiety i dwóch mężczyzn). W artykule przedstawiono również sposób przeprowadzania analizy oraz kierunki dalszej pracy.

1. Introduction

The research aims to fill a gap in introducing and promoting computerised speech technology for the Polish language. The decisive factor in achieving high-quality speech synthesis is the completeness of the resources and databases used. The research objective is therefore to develop the linguistic resources, vocabulary, grammar and acoustic databases. The synthesis of the phonemic characteristics of speech is based on the Allophones Natural Waves method. The basic principle of synthesising the prosodic features of speech is the division of an utterance into accent groups and, on their basis, the formation of the complete tonal, rhythmical and dynamic contours of a syntagm and of the utterance as a whole. Using a data-driven approach, the speech synthesiser will draw on prosodic feature databases for the synthesis of speech sounds and intonation. Together, the two modules (phonemic and prosodic synthesis) are expected to achieve a high quality of synthesised speech.

For synthesised speech to sound natural it needs rhythm and, more importantly, proper intonation. The way people speak differs depending on their ability to produce utterances in a particular way. The voice signal is described using numerous physical parameters which vary as the speech goes on. The acoustic parameter that is particularly important in prosodic analysis is the fundamental frequency (F0) and its curve (the intonation contour). That is why special emphasis is placed on determining the intonation contour and the maximum and minimum F0 values for particular types of utterances.

Speech and Language Technology. Volume 9/10

2. Types of utterances

Syntax (the structure and message of the sentence) is crucial to the intonation of an utterance. One way of dividing utterances is into sentences and phrases. Phrases are groups of words that have either no subject or no predicate, e.g. Przechodzić tylko na zielonym świetle (Cross only on a green light). A sentence is a group of grammatically interrelated words containing a subject and a predicate, e.g. Proszę przechodzić tylko na zielonym świetle (Please cross only on a green light). Sentences with modifiers are called long simple sentences, whereas those without are called short simple sentences. A simple sentence can be as short as one word. Longer utterances with more than one predicate and/or subject consist of several component sentences. Based on the relations between these components, compound and complex sentences are distinguished.

There are several types of compound sentences:

1. Conjunction: the sentences are joined. The co-ordinating conjunctions may be: i, oraz, a, jak również, ani, etc. (and, as well as, etc.), e.g. Marek był w górach a Ania nad morzem (Mark was in the mountains and Ann was by the seaside).
2. Negation: one sentence negates the other. The co-ordinating conjunctions may be: ale, lecz, a, jednak, zaś, natomiast, etc. (but, however, etc.), e.g. Kasia była spóźniona jednak się nie śpieszyła (Cathy was late but she was not in a hurry).
3. Disjunction: one sentence excludes the other. The co-ordinating conjunctions may be: albo, czy, lub, bądź, etc. (or, either, etc.), e.g. Przeczytam książkę albo pójdę do kina (I will read a book or go to the cinema).
4. Implication: the second sentence is a consequence of the first. The co-ordinating conjunctions may be: więc, toteż, dlatego, zatem, etc. (so, that is why, consequently, etc.), e.g. Marek jest zdolny więc ma wysoką średnią (Mark is intelligent so his average is high).

In complex sentences the components are not equal. Such sentences have a main clause (antecedent) and a subordinate clause (consequent). The subordinate clause substitutes for or complements one of the parts of the main clause.

Depending on their purpose and emotional undertone, sentences are divided into declarative, interrogative, imperative and exclamatory ones. In writing, emotions are usually signalled by punctuation marks such as the dash, question mark, exclamation mark or ellipsis.
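The classification of compound sentences by their co-ordinating conjunction can be sketched as a simple lookup. This is only an illustration under stated assumptions: the conjunction lists are the (non-exhaustive) ones given above, the names CONJUNCTIONS and candidate_types are hypothetical, and matching whole words in the text is a simplification that ignores context. Because "a" is listed under both conjunction and negation, the function returns a set of candidate types rather than a single answer.

```python
# Conjunction lists copied from the classification above (they end in "etc."
# in the text, so they are not exhaustive). "a" appears in two categories.
CONJUNCTIONS = {
    "conjunction": ["i", "oraz", "a", "jak również", "ani"],
    "negation": ["ale", "lecz", "a", "jednak", "zaś", "natomiast"],
    "disjunction": ["albo", "czy", "lub", "bądź"],
    "implication": ["więc", "toteż", "dlatego", "zatem"],
}


def candidate_types(sentence):
    """Return the compound-sentence types whose conjunctions occur in the sentence."""
    # Lower-case, drop commas and final punctuation, normalise whitespace,
    # and pad with spaces so that only whole words (or word groups) match.
    words = sentence.lower().replace(",", " ").rstrip(".?!").split()
    padded = " " + " ".join(words) + " "
    return {
        kind
        for kind, conjunctions in CONJUNCTIONS.items()
        if any(f" {conj} " in padded for conj in conjunctions)
    }


print(candidate_types("Przeczytam książkę albo pójdę do kina"))
print(sorted(candidate_types("Marek był w górach a Ania nad morzem")))
```

The second example yields both "conjunction" and "negation", which shows why the conjunction alone does not always determine the sentence type and why the message of the sentence also has to be considered.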

3. Research method

The very first step was to create a suitable text to be recorded. It had to contain the examined types of sentences, so it consisted of several sentences of each examined type. In addition, the sentences differed from each other in their conjunctions and message.


Picture 1. Spectrogram of sentence read by a man.

Picture 2. Spectrogram of sentence read by a woman.


Picture 3. Tone contour of sentence read by women (two above – female) and men (two below – male).

Next, the text was read aloud by four different people and recorded using SoundForge. Each person was recorded separately so as not to suggest intonation to the other participants of the research. For every sentence, a spectrogram (a spectro-temporal representation of the sound) and an intonation contour were generated using Praat, a computer program with which phoneticians can analyse, synthesise and manipulate speech. Computer analysis provided important data concerning F0 fluctuation and its mean value. Tone contour examination also brought interesting results. Spectrograms and tone contours of the sentence “Maciek był w górach a Ania nad morzem” (Mark was in the mountains and Ann was by the seaside), read by a man and a woman, are presented in Pictures 1 to 3. It is clearly seen that the man has a low-pitched voice: his maximum F0 value is about 140 Hz, whereas the woman's is 243 Hz. The woman's intonation line is similar to the man's. They differ only in pitch, which is expected, as women generally have higher voices than men.
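How an F0 value can be obtained from a recorded signal may be illustrated with a very small estimator. The sketch below is not Praat's algorithm and not the procedure used in the study; it assumes a plain autocorrelation peak search over a single frame. The function name estimate_f0, the 75 to 400 Hz search band and the synthetic 125 Hz tone standing in for a recording are all illustrative assumptions.

```python
import math


def estimate_f0(samples, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate the F0 (Hz) of one frame from its autocorrelation peak.

    The search band f_min..f_max is an assumed default. Returns 0.0 when
    no positive autocorrelation peak is found (for example, for silence).
    """
    lag_lo = int(sample_rate / f_max)  # shortest period considered, in samples
    lag_hi = int(sample_rate / f_min)  # longest period considered, in samples
    best_lag, best_score = 0, 0.0
    for lag in range(lag_lo, lag_hi + 1):
        n = len(samples) - lag
        if n <= 0:
            break
        # Mean product of the frame with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


# Synthetic stand-in for a recording: a 125 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 125 * i / SAMPLE_RATE) for i in range(1024)]
print(round(estimate_f0(tone, SAMPLE_RATE), 1))
```

Applying such an estimator frame by frame and taking the minimum, maximum and mean of the voiced frames would give values of the kind reported as Fmin, Fmax and Favr. Real speech needs more care (voicing decisions, octave errors), which is why a dedicated tool such as Praat was used.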

4. Test results

4.1. Acoustic parameter fluctuation

What greatly influences the fundamental frequency F0 is the pitch (high or low) of a speaker's voice. Generally, the higher the speaker's basic tone, the greater the range between Fmin and Fmax. This is shown in Picture 4, which presents the minimum and maximum frequencies of sample compound negation sentences. The same utterances were chosen for every speaker. The message of the utterance is of lesser importance.

Picture 4. Minimum and maximum frequencies of sample negation sentences.

Table 1 presents the minimum, maximum and mean frequencies in different utterances. The type of utterance needs special attention. Exclamatory and interrogative sentences have a broader frequency range. Generally, the highest values of Fmax occur in exclamatory and imperative sentences.

Table 1. Minimum, maximum and mean frequencies (Hz) in different utterances

Sentence type    Woman_1          Woman_2          Man_1            Man_2
                 Fmin Fmax Favr   Fmin Fmax Favr   Fmin Fmax Favr   Fmin Fmax Favr
conjunction       176  265  231    133  236  206    127  212  154     93  124  110
                  191  275  237    157  235  201    126  213  162     93  123  110
disjunction       180  292  245    160  244  206    119  204  156     89  135  112
                  170  288  243    161  253  213    136  200  170     93  126  111
negation          224  277  244    136  288  211    125  233  167    103  136  112
                  208  285  244    174  252  211    133  197  177    132  197  107
complex           179  273  236    139  256  207    132  182  161     92  123  110
                  166  287  242    156  262  214    142  197  171    100  136  113
implication       204  285  241    110  267  157    129  227  168     99  138  111
                  177  276  238    166  253  202    134  219  181    105  177  113
exclamatory       199  298  275    195  292  261    146  273  214    104  194  162
                  137  297  263    171  247  221    126  225  176    102  200  166
interrogative     175  282  175    110  266  223    103  209  103    106  173  106
                  233  277  233    182  256  223    146  200  146    130  146  130
declarative       126  266  222    175  271  234    127  223  171    102  144  122
                  183  269  223    158  274  211    113  225  160    108  155  124
imperative        190  286  247    167  284  238    136  219  181    111  161  142
                  194  282  235    182  290  251    125  206  159    103  174  150

Tone contour observations showed that the shape of the contour does not differ significantly between participants. This is clearly seen in Picture 5, which presents the intonation lines of the same compound sentence produced by each of the participants: the upper two are the women's, the lower two the men's.

Picture 5. Tone contour of compound sentence read by women (two above – female) and men (two below – male).

There were, however, differences in the accent intensity of particular words in a sentence. The message and the speaker's interpretation were crucial here. Picture 6 presents the tone contour for the sentence “Jutro pojadę na wycieczkę, albo zostanę w domu” (I will go on a trip tomorrow or stay at home). Three of the speakers decided that when they would go was more important, and one that the very fact of going on a trip was significant.

Picture 6. Tone contour of sentence “Jutro pojadę na wycieczkę, albo zostanę w domu” (I will go on a trip tomorrow or stay at home) read by women (two above – female) and men (two below – male).
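The relation between a speaker's pitch and frequency range discussed in section 4.1 can be checked directly on Table 1. The sketch below takes the first data row for each speaker and computes the range in hertz and, as an added assumption not used in the paper, in semitones via 12·log2(Fmax/Fmin), a logarithmic measure that makes the ranges of high and low voices easier to compare. The names FIRST_ROW, range_hz and range_semitones are illustrative.

```python
import math

# First data row of Table 1 for each speaker: (Fmin, Fmax, Favr) in Hz.
FIRST_ROW = {
    "Woman_1": (176, 265, 231),
    "Woman_2": (133, 236, 206),
    "Man_1": (127, 212, 154),
    "Man_2": (93, 124, 110),
}


def range_hz(f_min, f_max):
    """Frequency range on a linear scale, in Hz."""
    return f_max - f_min


def range_semitones(f_min, f_max):
    """Frequency range on a logarithmic scale: 12 * log2(Fmax / Fmin)."""
    return 12 * math.log2(f_max / f_min)


for speaker, (f_min, f_max, f_avr) in FIRST_ROW.items():
    print(
        f"{speaker}: mean {f_avr} Hz, "
        f"range {range_hz(f_min, f_max)} Hz, "
        f"{range_semitones(f_min, f_max):.2f} semitones"
    )
```

For this row the lowest voice (Man_2) has the narrowest range on both scales, while Woman_2 has a wider range than Woman_1 despite a lower mean, which fits the wording that the relation holds only "generally".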


4.2. Syntax analysis

A comparison of the intonation lines of various compound sentences did not show any significant differences. There were utterances whose tone contour departed from the general model but, as mentioned previously, this was due to the speaker's interpretation. The contour of a compound sentence consists of two rise-and-fall parts. Usually the accented words belong to the subject, and there the tone contour is high. When the conjunction appears, the contour rises again. Complex sentences have a different F0 graph: regardless of the sentence construction (main clause before the subordinate one or vice versa), the tone contour falls. Interrogative sentences have a rising intonation line, with the strongest rise occurring in the last words. In imperative and exclamatory sentences the intonation line first rises sharply, then falls. In declarative sentences the tone contour also rises and falls, but not as abruptly as in imperative and exclamatory sentences. The accompanying figure shows the intonation lines of interrogative, exclamatory, imperative and declarative sentences.
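The contour shapes described in this section can be written down as a small rule table. The sketch below is only an illustration: the breakpoint values in CONTOUR_SHAPES are invented to mimic the verbal description and are not measurements from the study, and the names CONTOUR_SHAPES and stylised_contour are hypothetical. Each shape is a list of (relative time, relative pitch) points that is linearly interpolated and scaled to a speaker's Fmin and Fmax.

```python
# Illustrative breakpoints (relative time 0..1, relative pitch 0..1).
# The numbers are assumptions chosen to match the description in the text.
CONTOUR_SHAPES = {
    # two rise-and-fall parts, the second starting at the conjunction
    "compound": [(0.0, 0.3), (0.2, 0.8), (0.5, 0.3), (0.6, 0.7), (1.0, 0.1)],
    # falling contour regardless of clause order
    "complex": [(0.0, 0.8), (1.0, 0.1)],
    # rising line, strongest rise in the last words
    "interrogative": [(0.0, 0.3), (0.7, 0.5), (1.0, 1.0)],
    # sharp rise, then fall
    "exclamatory": [(0.0, 0.3), (0.15, 1.0), (1.0, 0.1)],
    "imperative": [(0.0, 0.3), (0.15, 1.0), (1.0, 0.1)],
    # gentler rise and fall
    "declarative": [(0.0, 0.3), (0.4, 0.6), (1.0, 0.2)],
}


def stylised_contour(sentence_type, f_min, f_max, n_points=11):
    """Return n_points F0 values (Hz) along a stylised contour."""
    shape = CONTOUR_SHAPES[sentence_type]
    values = []
    for k in range(n_points):
        t = k / (n_points - 1)
        p = shape[-1][1]  # fallback: pitch of the last breakpoint
        # Find the segment containing t and interpolate linearly within it.
        for (t0, p0), (t1, p1) in zip(shape, shape[1:]):
            if t0 <= t <= t1:
                p = p0 + (p1 - p0) * (t - t0) / (t1 - t0)
                break
        values.append(f_min + p * (f_max - f_min))
    return values


# Hypothetical speaker with Fmin = 100 Hz and Fmax = 200 Hz.
print([round(v) for v in stylised_contour("interrogative", 100.0, 200.0)])
```

A data-driven synthesiser would replace these hand-written breakpoints with contours taken from a prosodic database, but the same scaling to the speaker's own frequency range would still apply.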

5. Conclusion

The correct and natural use of intonation is very difficult to accomplish. A person pronouncing a sentence knows exactly what they are trying to say and the meaning of the words they use. A lot of information is communicated through the accurate prosody of the spoken text. To get the best results, not only the sentence construction should be analysed, but also the meaning and arrangement of its words. This is why such tests are crucial for obtaining the best possible quality of synthesised speech. Knowledge of the intonation and rhythm of particular types of utterances will make natural speech generation possible. Such a synthesiser is likely to come into common use in the future.

The expected results can be applied in further research in applied linguistics, especially in the study of the phonetics and prosody of the Polish language, and in expanding the theoretical framework for multilingual speech communication systems. The project also has great relevance for the economic and social fields. The results obtained will facilitate the development of new areas of business activity and services in Poland connected with the creation of a speech synthesiser. The synthesiser can be used in audio servers that provide information to users of telephone banking and of cultural and tourist telephone information services; it makes possible round-the-clock telephone transmission of the required information by means of speech, as well as on-line telephone information services. Another possible application is in socially oriented systems, such as the computer-based transmission of textual information by means of voice to the sick, the socially disabled and the blind.

The extension of this work is a project to carry out the opposite process: recognising speech and writing it down as text, that is, Speech-to-Text (STT) conversion. Methods for the recognition and synthesis of audio-visual patterns will also be developed.


Acknowledgement

This paper was supported by the European Commission under grant INTAS Ref. number 04-77-7404. The authors wish to express their thanks for the support.
