• Gaétan Prieur-Drevon

Asian websites: a guide to fixing 3 linguistic challenges


Today, let's talk about the semantic differences between Western and Asian languages.

Asian markets have been in the spotlight over the last few months, which has highlighted some of these differences. Marketing teams that work in Western languages may overlook some of the finer points without realizing how much they matter to their customers in Japan and China.


If you take the following details into account, you will enter these markets with better, clearer communication and build a stronger relationship with your new customers there.


This is not news for luxury and fashion brands, which have had their eyes on Japan and China through more than two decades of flourishing business. As pioneers in global marketing, these brands have already overcome the challenge of adapting their communication to local practices. Here is a quick guide to 3 challenges to keep in mind when localizing your Chinese, Japanese or Korean websites!


Based on our customers' experience localizing for Asia, here are the 3 main challenges they faced along the way, all of them purely about the language itself.

(Spoiler alert: Wezen can help with each of them)

  1. Line-breaking rules 🔰 - Responsive designs mean sentences can be split at different points depending on the device. This can affect the very meaning of the sentence.

  2. White spaces 💻 - Did you know not all languages require a space after a full stop?

  3. Encoding 🖹 - Logograms take more computer memory space!



Challenge #1: Line-breaking rules 🔰

Let's take a look at Japanese, Chinese, and Korean. In those languages, certain characters take on a different meaning depending on whether they fall at the end or at the start of a line. Also, certain groups of characters convey a special meaning and should stay together.


In English or Spanish, web browsers create line breaks only between words. For some Asian languages, by contrast, they may place line breaks between the characters themselves. This means that, by default, you have no control over which characters end up at the beginning or end of a line.
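A minimal Python sketch of the problem, assuming a naive renderer that simply cuts text every fixed number of characters. The `naive_wrap` helper and the width of 12 are illustrative assumptions, not how any particular browser works. Japanese text contains no spaces to break on, so a character-based break can strand a full stop at the start of a line:

```python
def naive_wrap(text: str, width: int) -> list[str]:
    """Cut text every `width` characters, ignoring any line-breaking rules."""
    return [text[i:i + width] for i in range(0, len(text), width)]


sentence = "ブライアンは台所にいます。ブライアンはどこ?"

# There is no whitespace, so word-based wrapping sees one single "word".
print(len(sentence.split()))

# Character-based wrapping leaves the full stop "。" at the start of line 2.
for line in naive_wrap(sentence, 12):
    print(line)
```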


These linguistic rules are known as kinsoku-shori for Japanese. Here are some examples of rules that Wezen Translate implements so your content keeps the meaning you intend once online. Implementing them greatly improves how visitors perceive the quality of your website.


Japanese

Characters forbidden at the start of a line

  • Closing brackets:

〟’”⦆»)]} 』】〙

  • Japanese characters that are not allowed at the start of a line:

ㇽㇾㇿゕゖㇰㇱㇲㇳㇴㇵㇶㇷㇸㇹㇺㇻㇼー

  • Mid-sentence punctuation:

・、:;,

  • Sentence-ending punctuation:

.

  • Hyphens:

‐゠–〜

  • Delimiters:

?!‼⁇⁈⁉

Characters forbidden at the end of a line

  • Opening brackets:

([{〔〈《「『【〘

Characters that should not be split

  • Numbers

  • Grouped characters:

一い 昨日ととお


Characters that can’t be separated from anything

 — …‥



Simplified Chinese


Characters forbidden at the start of a line

〗"~!}¢°·’””†‡›℃∶、。〃〆〕:;?

Characters forbidden at the end of a line

$(.〖〝﹙﹛「[{£¥

Traditional Chinese

Characters forbidden at the start of a line

— ’”•” 、。》﹚﹜?︶」︰︱﹑﹒﹘、!),.:;?

Characters forbidden at the end of a line

﹃〈『〔〝︿︴﹙﹛({︵《「


Korean

Characters forbidden at the start of a line

!%),.:;?]}¢°’”†‡℃〆〈《「『〕

Characters forbidden at the end of a line

[\{£¥‘“(々$(〇〉
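To make the idea concrete, here is a minimal Python sketch of a line wrapper that respects this kind of rule. It is only an illustration under stated assumptions: the `NO_START` and `NO_END` sets are a small subset of the Japanese characters involved, the `wrap_kinsoku` name is hypothetical, and the strategy (moving the break earlier until it is legal) is one simple option, not a description of how Wezen Translate or any browser implements it.

```python
# Illustrative subsets only; real rule sets are much longer.
NO_START = set("、。・:;,.?!‼⁇⁈⁉)]}」』】〕〉》ー")  # must not begin a line
NO_END = set("([{「『【〔〈《")                      # must not end a line


def wrap_kinsoku(text: str, width: int) -> list[str]:
    """Wrap text at `width` characters, moving each break earlier when needed."""
    lines = []
    start = 0
    while len(text) - start > width:
        end = start + width  # fallback: hard break if no legal position exists
        # Try break positions from the widest allowed down to one character.
        for candidate in range(start + width, start, -1):
            next_char = text[candidate]      # would start the next line
            prev_char = text[candidate - 1]  # would end the current line
            if next_char not in NO_START and prev_char not in NO_END:
                end = candidate
                break
        lines.append(text[start:end])
        start = end
    lines.append(text[start:])
    return lines


for line in wrap_kinsoku("ブライアンは台所にいます。ブライアンはどこ?", 12):
    print(line)
```

With a width of 12, a plain character cut would start the second line with "。"; this sketch instead carries "す" over to the next line so that the full stop stays attached to the text before it.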


Challenge #2: Japanese white spaces 💻

In English, Spanish, German or French, a sentence usually ends with a full stop, and a space then separates it from the next sentence. Translation systems break texts down into smaller units, usually sentences, and a translator then translates one sentence at a time.


👩‍🏫 To be efficient, translators use Translation Memories, based on algorithms that convert entire paragraphs to segments (aka translation units).

Brian is in the kitchen. Where is Brian?

becomes:

Segment#1: Brian is in the kitchen.
Segment#2: Where is Brian?

Once translated, into French for instance, the two segments are merged back into one piece of text. However, the 2 translated segments can't simply be concatenated, or we would get "kitchen.Where": the space between the 2 sentences must be put back.


You obviously don't want translators to have to leave a space after "kitchen." by hand: this would result in a lot of errors, such as missing white spaces. This is why computer-aided translation (CAT) tools add the space when the translated texts are compiled.
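The segmentation and merge steps can be sketched in a few lines of Python. This is a simplified illustration, not how a real CAT tool works: the function names are hypothetical, and the rule used here (split after ".", "?" or "!" when followed by whitespace) ignores abbreviations, quotes and other cases that production segmenters handle.

```python
import re


def split_into_segments(paragraph: str) -> list[str]:
    """Split after sentence-ending punctuation that is followed by whitespace."""
    return re.split(r"(?<=[.?!])\s+", paragraph.strip())


def merge_segments(segments: list[str]) -> str:
    """Reassemble segments, restoring the space the split removed."""
    return " ".join(segments)


source = "Brian is in the kitchen. Where is Brian?"
segments = split_into_segments(source)
print(segments)

# Plain concatenation loses the space; the merge step puts it back.
print("".join(segments))
print(merge_segments(segments))
```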



Now, let's talk about Japanese.

Here is what the translated sentences should look like:

ブライアンは台所にいます。ブライアンはどこ?

The Japanese full stop (。) is a full-width character whose glyph already includes blank space on its right-hand side. An extra space is not needed and would result in an odd-looking layout, which would leave a bad impression on your online visitors. Make sure your translation tools take this specificity into account so that the space between the source sentences is not automatically replicated in the translations!
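As a rough Python sketch, a merge step can pick its separator from the target language. The `NO_SPACE_LANGUAGES` set and the `join_translated_segments` function are hypothetical names used for illustration, and the list of languages is an assumption that would need checking against your own target locales.

```python
# Assumption: languages whose sentences are joined without a space.
NO_SPACE_LANGUAGES = {"ja", "zh"}


def join_translated_segments(segments: list[str], target_lang: str) -> str:
    """Join segments with a space, except for languages that do not use one."""
    base_lang = target_lang.split("-")[0].lower()  # "ja-JP" -> "ja"
    separator = "" if base_lang in NO_SPACE_LANGUAGES else " "
    return separator.join(segments)


french = ["Brian est dans la cuisine.", "Où est Brian ?"]
japanese = ["ブライアンは台所にいます。", "ブライアンはどこ?"]

print(join_translated_segments(french, "fr"))
print(join_translated_segments(japanese, "ja-JP"))
```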



Challenge #3: Encoding 🖹

This is a bit technical, yet crucial. New standards make this easier to solve; however, you should check that every system in your content management and content production workflow can support Asian characters.


Historically, a single character was stored in one byte, but that is not enough for Asian languages. Korean, Japanese and Chinese characters require two bytes each in legacy double-byte encodings (such as Shift_JIS, GBK or EUC-KR), and typically three bytes in UTF-8. This means all your infrastructure must be able to handle multi-byte encodings, and the different encodings in use must be reconciled, which can be a pain for IT teams. Before launching any localization initiative to Asia, make sure you test all platforms together with different sets of characters.
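A quick way to see the storage sizes involved, and to test whether a given encoding can carry your content, is a small Python check like the one below. The sample characters and the `survives_round_trip` helper are illustrative assumptions; the codec names are the ones shipped with Python's standard library.

```python
def byte_length(char: str, encoding: str) -> int:
    """Number of bytes needed to store `char` in the given encoding."""
    return len(char.encode(encoding))


def survives_round_trip(text: str, encoding: str) -> bool:
    """True if `text` can be encoded and decoded back without loss."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        return False


# One Latin letter, then Japanese, Chinese and Korean samples.
print(byte_length("A", "utf-8"))
for char, legacy in [("日", "shift_jis"), ("中", "gbk"), ("한", "euc_kr")]:
    print(char, legacy, byte_length(char, legacy), "utf-8", byte_length(char, "utf-8"))

# A single-byte encoding cannot carry Japanese text at all.
print(survives_round_trip("日本語", "latin-1"))
print(survives_round_trip("日本語", "utf-8"))
```

Running the same round-trip check against every system in your workflow (CMS, database, export formats) is one way to test all platforms together before launch.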


Wezen stores texts in UTF-8, which supports both types of character sets. To be more precise, and as Asian markets seem to be particularly fond of smileys, we use utf8mb4 (the MySQL name for full UTF-8, with up to four bytes per character), which supports all your crazy emojis 😉
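The reason the "mb4" matters: emojis need four bytes in UTF-8, while MySQL's older `utf8` character set stores at most three bytes per character. A small Python sketch, assuming you want to flag content that such a three-byte column could not hold (the function names are hypothetical):

```python
def max_utf8_bytes(text: str) -> int:
    """Largest number of UTF-8 bytes used by any single character in `text`."""
    return max((len(char.encode("utf-8")) for char in text), default=0)


def fits_in_three_byte_utf8(text: str) -> bool:
    """True if every character fits in a three-byte-per-character UTF-8 column."""
    return max_utf8_bytes(text) <= 3


print(max_utf8_bytes("日本語"), fits_in_three_byte_utf8("日本語"))
print(max_utf8_bytes("😉"), fits_in_three_byte_utf8("😉"))
```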



Technology like Wezen means you no longer have to think about these challenges, as the platform integrates the necessary rules at a technical level. These are just the tip of the iceberg... stay tuned for more quick wins in your Asian localization thanks to Wezen!

© 2020 by Wezen, part of Datawords Group. All rights reserved.
