How to Compress Subtitles
This guide discusses strategies for reducing or compressing text to tackle reading-speed issues. If you wish to find out about additional ways to deal with reading-speed issues, please watch this short tutorial.
Very often, a close (almost word-for-word) translation is possible in the target language, but the resulting subtitle would be too long for viewers to read in the time it is displayed on screen (a reading speed of over 21 characters/second), or too long to fit within the OTP's limits on the maximum number of characters in a line (42) or in the whole subtitle (84) (learn more about these technical style rules by watching this tutorial). In these cases, you need to "compress" (reduce) the text in the subtitle. Transcribing is a form of translation too, from the spoken to the written language, so compressing or reducing text will sometimes also be necessary while transcribing, to create a good subtitle.
To compress a subtitle means either to leave out any equivalent of a certain part of the original text (because that part is superfluous), or to express its meaning in a different, shorter way instead of using a direct equivalent (e.g. by referring to the context in the talk). Although the way a subtitle can be compressed largely depends on its context, there are several recurring patterns in subtitles that can be compressed across languages.
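The three limits mentioned above (21 characters/second, 42 characters per line, 84 characters per subtitle) can be checked mechanically. The following is a minimal sketch, not an official OTP tool: the function name `subtitle_issues` and the 1.5-second duration are hypothetical, and it assumes that every character, including spaces and punctuation but not the line break, counts toward the totals.

```python
MAX_CHARS_PER_SECOND = 21   # reading-speed limit quoted in this guide
MAX_LINE_LENGTH = 42        # maximum characters in one line
MAX_SUBTITLE_LENGTH = 84    # maximum characters in the whole subtitle


def subtitle_issues(text: str, duration_seconds: float) -> list[str]:
    """Return the limits that a subtitle exceeds (empty list if none).

    Assumption: spaces and punctuation count as characters,
    the line break between the two lines does not.
    """
    lines = text.split("\n")
    total_chars = sum(len(line) for line in lines)
    issues = []
    if any(len(line) > MAX_LINE_LENGTH for line in lines):
        issues.append("line too long")
    if total_chars > MAX_SUBTITLE_LENGTH:
        issues.append("subtitle too long")
    if total_chars / duration_seconds > MAX_CHARS_PER_SECOND:
        issues.append("reading speed too high")
    return issues


# Hypothetical timing: the subtitle stays on screen for 1.5 seconds.
original = "Wait, wait. I still haven't shown you slide 3."
compressed = "I haven't shown you slide 3!"
print(subtitle_issues(original, 1.5))
print(subtitle_issues(compressed, 1.5))
```

With this timing, the uncompressed line is flagged for both line length and reading speed, while the compressed version passes all three checks.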
Note: Even though the maximum reading speed is 21 characters/second, if you predict that the viewer would find the subtitle exceptionally difficult (e.g. because of proper names or poetic language), consider lowering the reading speed even more, by compressing the text and/or extending the duration of the subtitle.
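The trade-off between compressing the text and extending the duration is simple arithmetic. The sketch below shows it under stated assumptions: the helper names `max_chars` and `min_duration` are hypothetical, characters are counted including spaces, and the lower rate of 15 characters/second is only an illustrative value for a difficult subtitle, not a figure from the OTP guidelines.

```python
import math

MAX_CHARS_PER_SECOND = 21  # maximum reading speed quoted in this guide


def max_chars(duration_seconds: float, max_cps: float = MAX_CHARS_PER_SECOND) -> int:
    """Largest number of characters readable in the given display time."""
    return math.floor(duration_seconds * max_cps)


def min_duration(text: str, max_cps: float = MAX_CHARS_PER_SECOND) -> float:
    """Shortest display time (in seconds) that keeps the text within max_cps.

    Assumption: the line break does not count as a character.
    """
    return len(text.replace("\n", "")) / max_cps


# A 2-second subtitle has room for 42 characters at the standard limit,
# but only 30 if you decide to aim for a hypothetical 15 characters/second.
print(max_chars(2.0))
print(max_chars(2.0, max_cps=15))
```

If the text is longer than `max_chars` allows, you can either compress it down to that length or extend the subtitle to at least `min_duration(text)` seconds, where the timing of the talk permits.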
Compressing, grammar, and the immediate and cultural context
The extent to which a subtitle can be compressed can often be defined by certain features of the subtitle language or the availability of immediate or cultural context.
Compressing and the subtitle language
Below, there are many examples of how a subtitle line can be compressed. These English examples are meant to indicate general strategies that can help you find a way to compress the subtitle in your language. Each language offers different means of expressing the same meaning in fewer words. This depends on the grammar, but also on the cultural context: sometimes one word, phrase or idiom can call up an idea that is expressed in many words in the original subtitle, because that idea is more recognizable in the culture that uses the target language of the translation. However, the rules of a given target language may also make it impossible to compress a subtitle line in one of the ways presented below. For example, the grammar of the target language may require more details to be specified explicitly than in the original subtitle, such as the gender of some of the things the speaker is referring to.
Compressing and context
The possible extent of compression also depends on the context: either the immediate linguistic context (what was said before in the talk, or sometimes, what will be said shortly after the current subtitle), or a broader visual, auditory and cultural context (what the viewer sees and hears in the video, and what the target audience will already know due to their shared background). For example, in English captions, "the Department of Motor Vehicles" can be compressed to "DMV" and the meaning would stay the same, but this compression would not be possible when translating into a language whose users are not normally familiar with this US agency.
Similarly, "Just look at this green apple" can be compressed as "Just look at it" if it was preceded by the sentence "This is a beautiful green apple," since what the word "it" refers to would already be defined by the immediate context. In contrast, this type of compression would probably not be possible if by saying "Just look at this green apple," the speaker introduced the green apple into the talk for the first time (e.g. by switching to a slide with a photograph of a green apple and commenting on it). However, the immediate context that follows may also make it possible to compress a subtitle. If the speaker said "Just look at this green apple" while showing a slide with a picture of the fruit, and immediately afterwards added "It's the juicy green apple that I woke up dreaming of," the first subtitle ("Just look at this green apple") could safely be compressed into "Just look at this" or "Look at this," since the immediate visual context (the slide) and the next line would explain what the word "this" referred to.
Compressing without changing the meaning
While compressing, you must be careful not to change the speaker's intended meaning. Remember that if there are no line-length or reading speed issues, compressing is not necessary. The purpose of compressing is not to "summarize" parts of the talk that the subtitler deems unnecessary or unimportant, but quite the opposite - to allow the viewer to get as much of the meaning of the original as possible, by creating a line that will be short enough to be read in the time that the subtitle appears on the screen. If a subtitle is too long for the time it is displayed, most viewers will not be able to finish reading it before it disappears, and in effect, a lot of the meaning will be lost on them. In such cases, when the subtitle is made shorter by getting rid of some of its non-essential parts, while preserving the message, the text will be brief enough for the audience to read and no part of it will "evaporate" due to the subtitle's disappearing too quickly. However, even if compressing is not necessary, you can decide to remove some non-essential language, to make the subtitle easier to read (introductory words and phrases like "now then," slips of the tongue, unintentional repetitions).
Exceptions - when not to compress
A lot of the words and expressions described below can be omitted or made shorter, but from time to time, their meaning will still need to be expressed in some way in the translation. This is usually necessary when an item that can usually be omitted is important in some way in the context, e.g. used to contrast with another similar item.
For example, the word "almost" can often be omitted from the subtitle. Let's say that a speaker is talking about a dinner and just wants to let us know that they ate a lot (and then couldn't sleep because of how full they felt). The speaker says "I ate almost ten samosas." If necessary, the word "almost" can safely be removed from the subtitle ("I ate ten samosas"), because what is important is not that the speaker didn't eat the full ten, but that they ate a lot. However, in a different context, the number may be important. If the speaker is talking about an eating contest, and describing the reasons why they failed, the word "almost" cannot be omitted in the subtitle, because it is crucial to what the speaker intends to convey, i.e. they ate almost ten - but somebody else ate the full ten, and won.
Some parts of the original subtitle can simply be left out in the translation. Examples follow below.
Wait, wait. I still haven't shown you slide 3. --> Wait, I still haven't shown you slide 3. OR I still haven't shown you slide 3. OR I haven't shown you slide 3! OR Wait till you see slide 3.
It was a very, very long dinner. --> It was a very long dinner. OR It was a long dinner. OR We sat there for hours.
Exclamations and greetings
Hey, that's not it! --> That's not it!
Oh my God, are you guys OK? --> Are you guys OK? OR Are you OK?
Hi, I'm Jimmy Hundlepoint. --> I'm Jimmy Hundlepoint.
Addressing a person
People, this example won't be the last one. --> This example won't be the last one. OR This won't be the last example. OR There will be more examples. OR I've got more examples. OR I've got another one.
She told me, "Be nice, Jack." --> She told me, "Be nice." OR She told me to be nice.
Abbreviating speakers' names
Speaker changes need to be represented in the subtitles. Additional speakers may appear if the speaker who began the talk is joined by another speaker on stage (e.g. for a question-and-answer session), or if video or audio material featuring spoken utterances is included in the talk. Speakers should be indicated by their full names and a colon the first time they appear, and by their initials (no periods) when they appear again in the same conversation. Consider this example:
Oh, you've got a question for me? Okay. (Applause) Chris Anderson: Thank you so much for that. You know, you once wrote, I like this quote, "If by some magic, autism had been eradicated from the face of the Earth, then men would still be socializing in front of a wood fire at the entrance to a cave." Temple Grandin: Because who do you think made the first stone spears? The Asperger guy. (...) CA: So, I wanted to ask you a couple other questions. (...) But if there is someone here who has an autistic child, or knows an autistic child and feels kind of cut off from them, what advice would you give them? TG: Well, first of all, you've got to look at age. (...)
To learn more about identifying speakers and using other sound representation, see this guide.
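The labeling rule for speakers (full name and a colon on first appearance, initials without periods afterwards) can be sketched in a few lines. This is only an illustration: the function names `initials` and `label_turns` are hypothetical, the initials are assumed to be the first letter of each space-separated part of the name, and the sketch labels every turn it is given, so any turn that should stay unlabeled (such as the main speaker's opening line in the example above) has to be handled separately.

```python
def initials(full_name: str) -> str:
    """First letter of each space-separated name part, no periods."""
    return "".join(part[0].upper() for part in full_name.split())


def label_turns(turns: list[tuple[str, str]]) -> list[str]:
    """Prefix each (speaker, text) turn with a full name or initials.

    The full name is used the first time a speaker appears,
    the initials on every later appearance.
    """
    seen: set[str] = set()
    labeled = []
    for speaker, text in turns:
        if speaker in seen:
            prefix = initials(speaker)
        else:
            prefix = speaker
            seen.add(speaker)
        labeled.append(f"{prefix}: {text}")
    return labeled


turns = [
    ("Chris Anderson", "Thank you so much for that."),
    ("Temple Grandin", "Because who do you think made the first stone spears?"),
    ("Chris Anderson", "So, I wanted to ask you a couple other questions."),
    ("Temple Grandin", "Well, first of all, you've got to look at age."),
]
for line in label_turns(turns):
    print(line)
```

Names with hyphens, particles or a single part may need a manual decision about which initials to use.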
She loves making pizza. She totally does. --> She loves making pizza. OR She really loves making pizza. OR She actually loves making pizza.
This is a blue cucumber, right? --> Is this a blue cucumber? OR Is this cucumber blue? OR Is this one blue? OR Is it blue? OR A blue cucumber?
How on Earth... How on Earth am I going to make it in time? --> How on Earth am I going to make it in time? OR How am I going to make it in time? OR How am I going to make it? OR Will I make it? OR I don't think I'll make it.
Simplifying the semantics
Sometimes it's possible to omit some elements of style or semantic nuances that are not crucial to the message in the particular subtitle we want to shorten.
It was a huge, enormous building. --> It was a huge building. OR It was a big building. OR It was big.
Our organization is about perseverance, or stick-to-itiveness, if you will. --> Our organization is about perseverance. OR Our organization is about stick-to-itiveness.
These phrases often serve to keep the audience interested, to emphasize a point, or to lead the audience along a series of points. Very often, they are added "by default" by speakers when they do not serve much purpose other than to add a slight emphasis. They can frequently be removed and their meaning can be covered by the context.
Yeah, sure, but what about the price of tofu? --> But what about the price of tofu? OR But tofu is expensive/cheap/not free.
Well, it's not really about who does it, but how one does it. --> It's not really about who does it, but how one does it. OR It's about how one does it, not who does it. OR It's about how one does it. OR It's about how it is done/how you do it.
Look/listen/remember, that's also a good example. --> That's also a good example. OR Another good example. OR Another good one. OR Also a good one.
As you know/as you may know/you know, this is pretty easy. --> This is pretty easy.
Let's face it, it wasn't the best decision. --> It wasn't the best decision. OR It wasn't a good decision. OR It was a mistake.
So, this was my next idea. --> This was my next idea. OR My next idea.
Note: The word "so" can be used in several ways. Often, a speaker will begin a sentence with "so" simply to get the sentence started or to connect it loosely to the previous one, in which case it can be omitted (e.g. "So like I said before..."). However, "so" can also indicate a result, in the sense of "thus/accordingly/consequently/therefore" (e.g. "I ran out of water. So I couldn't bake anymore"). Then, it probably cannot be omitted. The word "so" can also be used inside a sentence to indicate a purpose (e.g. "Put on a sweater so you don't catch a cold"), sometimes in expressions like "so that" or "so as to." This is also a case where "so" cannot be omitted. The same rule applies to the word "now": at the beginning of a sentence, if it doesn't mean "currently," it's a connector that can usually be removed if necessary.
I suppose/guess the place was interesting. --> The place was interesting. OR It was interesting.
In my view,/I think/believe we shouldn't have done it. --> We shouldn't have done it. OR It was a bad idea. OR It was a mistake.
Note: usually, if the viewer is just presented with a statement, they will assume, from the context, that whatever belief is contained in the subtitle should be ascribed to the speaker, so it is possible to safely cut out "I think/believe." However, in some contexts, the speaker will be using "I think/believe" to distance their personal beliefs from somebody else's (in such cases, the word "I" is usually emphasized). Then, including an equivalent of "I think/believe" in the translation may be necessary, but you can often find other ways to convey the fact that the speaker is distancing themselves from other opinions (e.g. "To me, that's not bad.").
"Really" and other adverbials
This is really/pretty/totally/amazingly good coffee. --> This is good coffee. OR This coffee is so good!
My car wasn't actually/in fact/really green. --> My car wasn't green. OR But my car wasn't green.
Note: these adverbials are very often used simply conversationally, as a way of emphasizing the "actuality" of whatever one is talking about. However, in some cases, they are used to show a contrast between what someone might have believed and what is actually true. Then, some kind of equivalent may need to be used, and the example with "but" shows just one way in which the same meaning can be expressed using fewer words.
Some words and phrases that express number, quantity or extent are redundant if their meaning can be inferred from the context, or if the speaker used the quantifier not to be exact, but to give a general sense of magnitude.
They all want that device. --> They want that device. OR They want it.
I lived there for almost/over/more than a year. --> I lived there for a year.
It's been there for dozens of years. --> It's been there for ages. OR It's been there a long time. OR It's been there for so long. OR It's been there for years.
Idioms and metaphors
Although it's important to convey the speaker's style as much as possible, sometimes an expression that is short in the original may have a very long equivalent that would make the subtitle too long. In translation, in these cases, the omission can be compensated for by using an idiomatic expression in a different subtitle where it would sound right (not jar with the immediate context), thereby conveying the speaker's style as a whole (the fact that they use interesting imagery or turns of phrase), but not necessarily in the particular subtitle that it was necessary to compress.
I told Joanne it was like beating a dead horse. --> I told Joanne it was futile. OR I told her it was futile.
It hit me like a ton of bricks. --> I didn't expect that. OR Who would have thought. OR A bad surprise. OR Oh my!
Simplifying the syntax
The same meaning can often be expressed using a sentence with a structure that is easier for the viewer to read, and generally shorter. In translation, this can often be the case when the target language has a syntactic structure close to the original, but also a simpler, shorter way of expressing the same idea that is not that similar to what the speaker said. For example, English uses Passive Voice sentences quite often, and while some languages have an equivalent Passive Voice construction, they will also have other, more commonly used and simpler ways of expressing the same meaning.
Sometimes, the idea expressed by two sentences can be conveyed by one shorter sentence.
I thought we could do it over. Just one more time. --> We could do it over. OR We could do it one more time. OR Could we do it over?
Note: this line can be interpreted in two different ways - either as the speaker reflecting about the possibility of doing something one more time, or the speaker trying to convince somebody else that something could be done over. "Could we do it over?" could be used for the latter.
Changing quoted speech into reported speech
She said: "Why don't you come with me, then." --> She asked me to come along. OR She asked me to come.
And then Yolanda said: "I don't know anything about that." --> Yolanda didn't know anything. OR Yolanda didn't know. OR She didn't know.
Changing passive voice into active voice
It was created by the R&D Department at our company. --> Our R&D department created it.
References to the non-linguistic context of the talk
Sometimes the speaker talks about things that can be understood by the viewer by just looking at what's going on in the video or listening to its sound track. Explicit references to this immediate visual and auditory context can often be removed from the subtitles.
Things you can see anyway
Very often, an explicit description can be omitted, because the viewer can see what is being referred to (e.g. references to the layout of a slide or to what is happening on stage).
Let me just put it here on the table next to me. --> Let me put it here. OR I'll put it here. OR This goes here.
This girl in this picture here is John Smith. --> This is John Smith.
Note: this kind of compression is possible if the slide is being shown while the speaker is saying this or if it becomes visible shortly after. If the slide is not shown at all for some reason, or appears much later (e.g. a few sentences after), it may be advisable to leave out less text (e.g. "This is a picture of John Smith"), although more extensive compression may be possible anyway if it is still obvious from the context that the speaker is referring to a slide.
Things you can hear anyway
Sometimes, the speaker refers to a sound, and it is possible to remove the explicit reference to the sound, provided that the audio content has been represented in parentheses, e.g. (Music), (Whistling).
(Knocking on the door) So after I heard her knocking, I knew it was her, I let her in. --> (Knocking on the door) I knew it was her, so I let her in. OR (Knocking on the door) I let her in.
- Belczyk, Arkadiusz. Tłumaczenie filmów. Wydawnictwo "Dla szkoły," Wilkowice 2007. This article uses some of the classification of things to compress in subtitles developed in this book.