XLIFF 1.2 Representation Guide for Gettext PO

Committee Draft 02

16 October 2006

Specification URIs:

This Version:

Latest Version:

Technical Committee:

Chair(s):

Editor(s):

Abstract:

This document defines a guide for mapping the GNU Gettext PO (Portable Object) file format to XLIFF (XML Localisation Interchange File Format).

Status:

This document was last revised or approved by the XLIFF TC on the above date. The level of approval is also listed above. Check the 'Latest Version' or 'Latest Approved Version' location noted above for possible later revisions of this document.

Technical Committee members should send comments on this specification to the Technical Committee’s email list. Others should send comments to the Technical Committee by using the “Send A Comment” button on the Technical Committee’s web page at http://www.oasis-open.org/committees/xliff/.

For information on whether any patents have been disclosed that may be essential to implementing this specification, and any offers of patent licensing terms, please refer to the Intellectual Property Rights section of the Technical Committee web page http://www.oasis-open.org/committees/xliff/ipr.php.

The non-normative errata page for this specification is located at http://www.oasis-open.org/committees/xliff/.

Notices

Copyright © OASIS® 2007. All Rights Reserved.

All capitalized terms in the following text have the meanings assigned to them in the OASIS Intellectual Property Rights Policy (the 'OASIS IPR Policy'). The full Policy may be found at the OASIS website.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to OASIS, except as needed for the purpose of developing any document or deliverable produced by an OASIS Technical Committee (in which case the rules applicable to copyrights, as set forth in the OASIS IPR Policy, must be followed) or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.

This document and the information contained herein is provided on an 'AS IS' basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

OASIS requests that any OASIS Party or any other party that believes it has patent claims that would necessarily be infringed by implementations of this OASIS Committee Specification or OASIS Standard, to notify OASIS TC Administrator and provide an indication of its willingness to grant patent licenses to such patent claims in a manner consistent with the IPR Mode of the OASIS Technical Committee that produced this specification.

OASIS invites any party to contact the OASIS TC Administrator if it is aware of a claim of ownership of any patent claims that would necessarily be infringed by implementations of this specification by a patent holder that is not willing to provide a license to such patent claims in a manner consistent with the IPR Mode of the OASIS Technical Committee that produced this specification. OASIS may include such claims on its website, but disclaims any obligation to do so.

OASIS takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on OASIS' procedures with respect to rights in any document or deliverable produced by an OASIS Technical Committee can be found on the OASIS website. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this OASIS Committee Specification or OASIS Standard, can be obtained from the OASIS TC Administrator. OASIS makes no representation that any information or list of intellectual property rights will at any time be complete, or that any claims in such list are, in fact, Essential Claims.

The names 'OASIS' and 'XLIFF' are trademarks of OASIS, the owner and developer of this specification, and should be used only to refer to the organization and its official outputs. OASIS welcomes reference to, and implementation and use of, specifications, while reserving the right to enforce its marks against misleading uses. Please see http://www.oasis-open.org/who/trademark.php for above guidance.

1. Introduction

As different tools may provide different filters to extract the content of Gettext Portable Object (PO) documents, it is important for interoperability that they represent the extracted data in an identical manner in the XLIFF document.

1.1. Purpose

The intent of this document is to provide a set of guidelines to represent PO data in XLIFF. It offers a collection of recommended mappings of all features of PO that developers of XLIFF filters can implement, and that users of XLIFF utilities can rely on to ensure better interoperability between tools.

1.2. Transitional and Strict

XLIFF is specified in two 'flavors'. Indicate which of these variants you are using by selecting the appropriate schema. The schema may be specified in the XLIFF document itself or in an OASIS catalog. The namespace is the same for both variants, so the schema location is what tells a validating tool which variant you are using. Each variant has its own schema that defines which elements and attributes are allowed in certain circumstances.

As newer versions of XLIFF are approved, sometimes changes are made that render some elements, attributes or constructs in older versions obsolete. Obsolete items are deprecated and should not be used even though they are allowed. The XLIFF specification details which items are deprecated and what new constructs to use.

  • Transitional - Applications that produce older versions of XLIFF may still use deprecated items. Use this variant to validate XLIFF documents that you read. Deprecated elements and attributes are allowed.

    xsi:schemaLocation='urn:oasis:names:tc:xliff:document:1.2 xliff-core-1.2-transitional.xsd'

  • Strict - Deprecated elements and attributes are not allowed. Obsolete items from previous versions of XLIFF are deprecated and should not be used when writing new XLIFF documents. Use this variant to validate XLIFF documents that you create.

    xsi:schemaLocation='urn:oasis:names:tc:xliff:document:1.2 xliff-core-1.2-strict.xsd'

2. Overview of the PO file format

Because the Gettext PO format is not a defined standard, nor is the format well documented, this section presents an overview of the features and design of the PO file format.

2.1. PO and POT

There are two types of PO files: PO Template files (POTs) and language-specific PO files (POs). POTs contain a skeleton header, followed by the extracted translation units. POTs are generated by the xgettext extraction tool and are not meant to be edited by humans. POTs are converted into language-specific POs by the msginit tool, and these files are then edited by translators.

When source code is updated, a new POT is generated for the project, and the changes from previous versions are incorporated into the existing translations by using the msgmerge tool. This tool inserts new translation units into the existing PO files, marks translation units no longer in use as obsolete, and updates any references and extracted comments.

Translated PO files are converted to binary resource files, known as MO (Machine Object) files, by the msgfmt tool. The Gettext library uses MO files at run time; hence PO files are only used in the development and localisation process.

2.2. General Structure

A PO file starts with a header, followed by a number of translation units.

2.3. Header

The PO header follows a similar structure to PO translation units, but is distinguished by its empty source element (msgid). The header variables are contained in the header's target (msgstr) element, with newline character representations ('\n') separating each variable.

The initial comment lines (comments are lines starting with '# ') usually contain a copyright notice as well as licensing information, followed by a list of all translators that have been involved in translating the specific PO file.

The header skeleton in a POT file is initially marked with the fuzzy flag (flags are comma-separated entries on lines starting with '#,'). This flag is removed when the header variables are filled in and the POT file is initialized to a language-specific PO file.

Table 1. Predefined PO Header variables

Variable Name               Description
Project-Id-Version          Application name and version
Report-Msgid-Bugs-To        Mailing list or contact person for reporting errors in translation units
POT-Creation-Date           Date the POT file was generated. Automatically filled in by Gettext
PO-Revision-Date            Time stamp when the PO file was last edited by a translator
Last-Translator             Contact information for the last translator editing the file
Language-Team               Name of the language team that translated this file
MIME-Version                MIME version used for specifying Content-Type
Content-Type                MIME content type and character set for this file
Content-Transfer-Encoding   MIME transfer encoding
Plural-Forms                Number of plural forms in the target language, and C-expression for evaluating which plural form to use for a parameter

In addition to these predefined variables, the PO header can contain custom user-defined variables of the same format.
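The header layout described above can be read with a few lines of code. The following is a minimal Python sketch, not part of the Gettext tools: the function name parse_po_header is hypothetical, and it assumes the header msgstr has already been unquoted and its '\n' escape sequences turned into real newline characters.

```python
def parse_po_header(msgstr):
    """Split a decoded PO header msgstr into a {name: value} mapping.

    Assumption: each variable occupies one line of the form 'Name: value'.
    """
    variables = {}
    for line in msgstr.split("\n"):
        if not line.strip():
            continue  # skip the empty piece after the final newline
        name, separator, value = line.partition(":")
        if separator:  # ignore lines that do not look like 'Name: value'
            variables[name.strip()] = value.strip()
    return variables


header = (
    "Project-Id-Version: example 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "X-Custom-Variable: anything\n"
)
print(parse_po_header(header)["Content-Type"])
```

Custom user-defined variables need no special handling here, since they share the format of the predefined ones.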

2.4. Translation Units

PO translation units use the source string (msgid) as primary id, and contain the translation in the msgstr field. In addition to this, PO translation units contain other meta-data, explained in further detail in the following sections.

2.4.1. Source and Target

The msgid and msgstr contain the source and target string of a translation unit.

The actual content of msgid and msgstr is a concatenation of the strings enclosed by quotes (U+0022 characters) on each line. For example:

is exactly the same as:
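The concatenation rule can also be sketched in Python. This is only an illustration under stated assumptions: the helper name join_po_string and the sample strings are hypothetical, and escape sequences are deliberately left undecoded.

```python
import re

# Matches one double-quoted chunk, allowing backslash-escaped characters inside.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def join_po_string(lines):
    """Concatenate the quoted chunks of a (possibly multi-line) msgid or msgstr.

    Only the quoting is removed; escape sequences stay as written.
    """
    chunks = []
    for line in lines:
        match = _QUOTED.search(line)
        if match:
            chunks.append(match.group(1))
    return "".join(chunks)


single_line = ['msgid "Hello world"']
multi_line = ['msgid ""', '"Hello "', '"world"']
print(join_po_string(single_line) == join_po_string(multi_line))
```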

2.4.2. Translator Comments

Translator comments are lines starting with '# ' (U+0023 + U+0020). These comments are added by translators, and are not present in POT files.

2.4.3. Extracted Comments

Extracted comments are lines starting with '#.' (U+0023 + U+002E). These comments are extracted from the source code. Source-code comments are normally extracted if they are on the same line as the source string, or on the line immediately preceding it, as in the following C example:

This would become:

When updating a PO file from a new POT file, existing extracted comments in the language-specific PO file are discarded, and the extracted comments present in the POT file are inserted in the existing PO file.

2.4.4. References

References are identified by lines starting with '#:' (U+0023 + U+003A). References are space-separated lists of locations (sourcefile:linenumber) specifying where the translation unit is found in a source file.

As each msgid has to be unique within a PO domain, a single translation unit can contain multiple references; one for each location where the string is found in the source code.

Similar to extracted comments, when updating a PO file from a new POT file, existing references in the language-specific PO file are discarded, and the references present in the POT file are inserted in the existing PO file.

2.4.5. Flags

Flags are identified by lines starting with '#,' (U+0023 + U+002C). Multiple flags are separated by commas.

Flags are used both as processing instructions by the Gettext tools, and by translators to indicate that a translation unit is unfinished or 'fuzzy'.
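Since every kind of PO meta-data line is identified by its first two characters, a filter can dispatch on that prefix. The sketch below is one possible way to do so in Python; the function names and the kind labels are hypothetical, and it assumes every reference has the form sourcefile:linenumber. A bare '#' line with no following space is reported as 'other' here.

```python
PREFIXES = {
    "# ": "translator-comment",
    "#.": "extracted-comment",
    "#:": "reference",
    "#,": "flag",
    "#~": "obsolete",
}


def classify(line):
    """Return (kind, payload) for one PO line, based on its two-character prefix."""
    kind = PREFIXES.get(line[:2])
    if kind is None:
        return ("other", line)
    return (kind, line[2:].strip())


def parse_flags(line):
    """Split a '#,' line into its comma-separated flag names."""
    kind, payload = classify(line)
    if kind != "flag":
        return []
    return [flag.strip() for flag in payload.split(",") if flag.strip()]


def parse_references(line):
    """Split a '#:' line into (sourcefile, linenumber) pairs."""
    kind, payload = classify(line)
    if kind != "reference":
        return []
    references = []
    for item in payload.split():
        path, _, number = item.rpartition(":")
        references.append((path, int(number)))
    return references


print(parse_flags("#, fuzzy, c-format"))
print(parse_references("#: src/main.c:12 src/util.c:7"))
```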

Table 2. Flag values and descriptions

Flag Name   Description

fuzzy

Indicates that a translation unit needs review by a translator.

This flag is inserted by the Gettext tools when a translation unit changes, or when the translation unit does not pass the format check.

The flag is also commonly used by translators to mark a translation unit as unfinished.

Note that entries marked as fuzzy are not included when PO files are compiled to binary MO files.

no-wrap

Indicates that the text in the msgid field is not to be wrapped at page width (usually 80 characters), as it normally would be. Note that this does not affect the wrapping of the actual source string, only the representation of it in the PO file.

This flag is set by developers in the source code, or by adding a command-line flag when invoking the Gettext tools.

X-format, where X is any of the following:
  • awk
  • c
  • csharp
  • elisp
  • gcc-internal
  • java
  • librep
  • lisp
  • objc
  • object-pascal
  • perl
  • perl-brace
  • php
  • python
  • qt
  • scheme
  • sh
  • smalltalk
  • tcl
  • ycp

Indicates that Gettext is to do a format check on the translation unit to validate that both msgid and msgstr contain valid parameter values according to the source format.

This flag is automatically inserted by the Gettext extraction tool.

no-X-format, where X is any of the items in the list above.

Indicates that Gettext is to skip the format check for this translation unit.

This flag has to be set by developers in the source code.

Flags (except fuzzy) are inserted and overridden by developers in source code, by adding them to a comment immediately preceding the call to gettext, as in the following example:

Since the Gettext call here is inside a printf function call, the Gettext tools will automatically assume this is a c-format string. But in this example the developer overrides that, and specifies it is not so, which would generate the following PO translation unit:

2.4.6. Plural Forms

Gettext, in addition to supporting normal translation units with a single msgid and msgstr, supports plural form translation units. These translation units contain the singular English form in the msgid field, and the plural form in the msgid_plural field. As the target, these translation units have an array of msgstr, representing the number of forms in the target language:

The target language may have one or more forms (Japanese has one form, while Polish has 3 forms), and the logic for selecting which form to use for a parameter is defined in a PO header field, where nplurals defines the number of forms and plural contains a C-expression for evaluating which item in the msgstr array to use at run time:

This is a typical example for a Germanic language, which has a special case when n is 1. A more complex example is Polish, which has special cases for when n is 1, and in addition some numbers ending in 2, 3 or 4:

C-expressions are defined as condition ? true_value : false_value, where condition is an expression evaluating to true or false. In the above example, the first condition is n==1, which if true gives the result 0, and if false gives the result of a second C-expression. For the second expression, the condition is n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20), which if true gives the result 1, and if false gives the result 2. At run time, Gettext will use the msgstr with the index returned from this expression.
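To make the evaluation concrete, the two rules discussed above can be transcribed into Python. This is a hand-written sketch, not how Gettext evaluates the header: the function names are hypothetical, the Germanic rule is assumed to be the usual n != 1, and the Polish rule follows the conditions spelled out in the paragraph above.

```python
def germanic_plural(n):
    """Special case for n == 1: index 0 for the singular, 1 otherwise."""
    return int(n != 1)


def polish_plural(n):
    """n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2"""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


def select_msgstr(msgstrs, n, rule):
    """Pick the msgstr whose index the plural rule returns for n."""
    return msgstrs[rule(n)]


for count in (1, 2, 5, 12, 22):
    print(count, polish_plural(count))
```

Note how 12 falls into the last form even though it ends in 2: the n%100 test excludes the numbers from 10 to 19.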

2.4.7. Obsolete Translation Units

Obsolete entries are translation units that are no longer present in the source files, and are therefore commented out when a PO file is updated. These entries are re-used by Gettext only if the translation unit re-appears in the project, and are also used for fuzzy matching by the 'msgmerge' tool. Obsolete entries are marked with '#~' (U+0023 + U+007E), as in the following example:

2.5. Domains

One single PO file normally represents one MO file, known as a Gettext domain, but the PO format also allows for representing multiple domains in a single PO file. This is done by adding the domain keyword followed by the domain name, as in the following example:

The above example would produce two MO files, domain_1.mo and domain_2.mo. If no domain is specified, translation units belong to the default domain messages.

A PO header is bound to a domain, so each domain has its own header.

Having multiple domains in a single PO file is very rare; in fact, the authors have never seen this in use.

3. General Considerations

This section discusses the general considerations to take into account when extracting data from PO files.

3.1. PO flavours

Because of good open source tool support, the PO file format has been used as a common file format for the extraction of localisable data from a number of different source formats, including XML-based document formats such as DocBook. This guide mainly covers representation of PO files generated from the GNU Gettext toolkit, targeting only localisation of software messages.

It is fully possible to apply this guide to PO files extracted from XML formats. However, it is highly recommended to use native XLIFF filters wherever possible, and not use PO as an intermediate format in these processes.

3.2. Source and Target Languages

The PO file format does not provide a way of identifying the source and target language within a file. By GNU standards, GNU software is written in American English (en-US), and this is reflected in Gettext by only having support for Germanic plural forms in the source language. It is therefore recommended to set the source-language attribute to en-US by default.

POSIX locale names typically use the form language[_territory][.codeset][@modifier], where language is an ISO 639 language code, territory is an ISO 3166 country code, and codeset is a character set or encoding identifier like ISO-8859-1 or UTF-8.

Locale names (through use of the source-language, target-language and xml:lang attributes) should, as specified in the XLIFF specification, use [RFC 3066], and not variants of the POSIX form.
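A filter that only knows the POSIX locale name of a PO file therefore needs a small conversion step. The Python sketch below is one simple way to do it; the function name is hypothetical, and it assumes the modifier and codeset can simply be dropped, which loses information for locales whose modifier carries a script or variant. Special names such as C or POSIX are not handled.

```python
def posix_to_rfc3066(locale_name):
    """Convert a POSIX locale name such as 'pt_BR.UTF-8' to 'pt-BR'.

    The optional '.codeset' and '@modifier' parts are discarded.
    """
    base = locale_name.split("@", 1)[0].split(".", 1)[0]
    language, _, territory = base.partition("_")
    if territory:
        return f"{language.lower()}-{territory.upper()}"
    return language.lower()


print(posix_to_rfc3066("en_US.ISO-8859-1"))
```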

3.3. Translation Unit Ids

The PO file format is different from most other software localisation resource formats in that it does not use ID-based translation units. Gettext uses the source string as the primary id, meaning that within a Gettext domain, a source string must be unique.

When representing a PO translation unit in XLIFF we cannot use the source string as the value for the id or resname attribute because of the limitations of XML attribute values. Many localisation tools rely on these attributes for leveraging, updates and alignment, hence not providing a solution for this may cause interoperability problems.

We suggest the following approach for providing unique resname attribute values for translation units:

  • For non-plural Translation Units, use a string hash of domain_name + '::' + msgid. If the Translation Unit is in the default domain, use 'messages' as the domain name.
  • For plural Translation Units, use a string hash of domain_name + '::' + msgid + '::plural[' + n + ']', where n is the plural index of msgstr.
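The two rules above can be sketched in Python as follows. The guide does not name a particular hash function, so the use of SHA-1 over the UTF-8 encoded key is purely an assumption for illustration, as is the function name make_resname.

```python
import hashlib


def make_resname(msgid, domain="messages", plural_index=None):
    """Build a resname value from the domain, the msgid and an optional plural index.

    SHA-1 is an arbitrary choice here; any stable string hash would do.
    """
    key = f"{domain}::{msgid}"
    if plural_index is not None:
        key += f"::plural[{plural_index}]"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


print(make_resname("Open file"))
print(make_resname("%d file", plural_index=0))
```

Whatever hash is chosen, tools that exchange files need to agree on it, otherwise the resname values cannot be used for alignment between them.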

It is however possible to use the PO format with logical ids, though this approach is not much used. To support this, filters may add an optional function (specified by a command-line flag or similar) to use msgid as the logical id, and then put the value of msgstr in the <source> element.

For example:

would be mapped to:

After translation, the translated entry would be inserted as msgstr. For example:

would be back-converted to PO as:

3.4. Handling of Escape Sequences in Software Messages

Software messages commonly use escape sequences for representing common control characters like newline ('\n'), horizontal tabs ('\t'), and others. When converting to XLIFF, these sequences can either be preserved, or filters may choose to replace escape sequences with the intended character representation.

For example, the following C source code fragment:

would be represented in PO as:

This fragment could be presented in XLIFF by preserving the escape sequences:

which could be further enhanced by encapsulating escape characters in XLIFF <ph> or <x/> elements:

Or, the filters could replace escape sequences with the intended characters:

The recommended approach, as also depicted in the table below, is as follows:

  • Escape sequences representing ASCII control characters, except '\n' (Linefeed LF - U+000A), '\r' (Carriage Return CR - U+000D) and '\t' (Horizontal Tabulator HT - U+0009), should remain as escaped sequences in XLIFF. The escape sequences should be abstracted in <ph> or <x/> elements, with the ctype attribute set to x-ch-NN where NN is the name of the ASCII control character.
  • The control character '\t' (Horizontal Tabulator HT - U+0009) should be converted to the intended Unicode representation (U+0009).
  • The control character '\n' (Linefeed LF - U+000A) should be converted to the intended Unicode representation (U+000A).
  • The Gettext tools discourage use of the '\r' (Carriage Return CR - U+000D) escape sequence. Filters may choose to implement support for Mac and DOS/Windows style line endings by replacing DOS/Windows ('\r\n') and older Mac ('\r') line endings with Unix ('\n') line endings. Filters could store information about the original line endings, and use this information to insert the correct line endings on back-conversion.
  • All other escaped characters should be converted to the intended Unicode representation.

In addition, characters in a PO file that are not supported by the XML specification (for example Vertical Tabulator VT - U+000B) should be abstracted in a similar way to control characters.

Table 3. Handling of Common Escape Sequences

Escape Sequence   Intended Character   PO representation   XLIFF representation
\?                ? (U+003F)           ?                   ?
\'                ' (U+0027)           '                   '
\"                " (U+0022)           \"                  "
\\                \ (U+005C)           \\                  \
\a                BEL (U+0007) [a]     \a                  <ph ctype='x-ch-bel'>\a</ph> or <x ctype='x-ch-bel'/>
\b                BS (U+0008) [a]      \b [b]              <ph ctype='x-ch-bs'>\b</ph> or <x ctype='x-ch-bs'/>
\f                FF (U+000C) [a]      \f [b]              <ph ctype='x-ch-ff'>\f</ph> or <x ctype='x-ch-ff'/>
\n                LF (U+000A)          \n                  LF [c]
\r                CR (U+000D)          \r [b]              LF [c]
\t                HT (U+0009)          \t                  HT
\v                VT (U+000B) [a]      VT [d]              <ph ctype='x-ch-vt'>\v</ph> or <x ctype='x-ch-vt'/>

[a] These characters cannot be used in XML. For more information, see Section 2.2 in the XML Specification [XML 1.0].

[b] Throws a Gettext warning when used: 'xgettext: internationalized messages should not contain the `\X' escape sequence', where X is 'b', 'f' or 'r'.

[c] See the bullet point above on handling Windows and Mac line endings.

[d] In later versions of Gettext, this is handled similarly to the '\b', '\f' and '\r' escape sequences.
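The recommended handling can be sketched as a small converter. The Python code below is an illustration only: the function name is hypothetical, it emits the <ph> variant as serialized text rather than building an XML tree, it maps both '\r\n' and a lone '\r' to a linefeed without recording the original line ending, and it leaves octal and hexadecimal escapes untouched.

```python
import re
from xml.sax.saxutils import escape

# Escapes kept as placeholders, because the characters are not allowed in XML 1.0.
PLACEHOLDERS = {"a": "bel", "b": "bs", "f": "ff", "v": "vt"}
# Escapes replaced by the character they stand for ('\r' becomes a linefeed).
LITERALS = {"n": "\n", "t": "\t", "r": "\n", "\\": "\\", '"': '"', "'": "'", "?": "?"}

# Either a DOS line ending written as '\r\n', or any single backslash escape.
_ESCAPE = re.compile(r"\\r\\n|\\(.)", re.DOTALL)


def po_to_xliff_text(raw):
    """Convert the raw (still escaped) content of a PO string to XLIFF inline text."""
    out = []
    position = 0
    for match in _ESCAPE.finditer(raw):
        out.append(escape(raw[position:match.start()]))
        char = match.group(1)
        if char is None:  # the '\r\n' alternative matched
            out.append("\n")
        elif char in PLACEHOLDERS:
            out.append(f'<ph ctype="x-ch-{PLACEHOLDERS[char]}">\\{char}</ph>')
        elif char in LITERALS:
            out.append(escape(LITERALS[char]))
        else:
            out.append(escape(match.group(0)))  # unknown escape: keep as written
        position = match.end()
    out.append(escape(raw[position:]))
    return "".join(out)


print(po_to_xliff_text("Ring the bell\\a and continue\\n"))
```

Back-conversion would have to reverse each branch, which is why a filter that normalizes line endings needs to remember the original style somewhere outside the translatable text.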

Although most of the XLIFF inline tags are represented in the TMX standard, the <x/> tag is not. TMX is a standard to exchange Translation Memory (TM) data created by Computer Aided Translation (CAT) and localization tools. If you plan to store or deliver XLIFF text content using TMX, you may wish to use the <ph> approach for encapsulating escape sequences or you will need to represent <x/> tags in some alternate way in TMX.

3.5. Character Set Conversion

The Content-Type PO header field specifies the character encoding used in the PO file. This field is used at run time by Gettext to provide character set conversion to the character set used by the application.

When extracting data from PO files, filters should use the Content-Type information to provide conversion to UTF-8 for storing data in XLIFF. On back-conversion, filters should also honour this field when re-creating the PO file.
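A possible shape for this step in Python is shown below. The function names are hypothetical, and the fallback to UTF-8 when no usable charset is declared is an assumption; the literal value CHARSET is treated as undeclared because POT skeleton headers carry it as a placeholder.

```python
import re


def charset_from_content_type(value, default="UTF-8"):
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r"charset=([^\s;]+)", value, re.IGNORECASE)
    if not match or match.group(1) == "CHARSET":
        return default  # missing, or still the unfilled POT placeholder
    return match.group(1)


def decode_po_bytes(data, content_type):
    """Decode raw PO bytes to text, ready to be written out as UTF-8 XLIFF."""
    return data.decode(charset_from_content_type(content_type))


def encode_for_po(text, content_type):
    """Encode text back into the character set declared by the PO header."""
    return text.encode(charset_from_content_type(content_type))


latin1 = "text/plain; charset=ISO-8859-1"
print(decode_po_bytes(b"caf\xe9", latin1))
```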

3.6. Extracting from POT files

POT files are automatically generated by the Gettext tools, and are nothing but a simple string table containing the extracted translation units. POTs are much simpler than POs, which are modified by humans and contain additional meta-data (translator comments, header information).

If PO is not used in the localisation process, it would in many situations be more practical to convert directly from POT to XLIFF, and not use language-specific PO files at all.

When converting from POT, the header can be ignored, as the header stored in POT is simply a skeleton header. When back-converting to PO, the filter can insert the necessary PO header elements (MIME elements and optionally plural forms definitions), providing all data needed to produce the language-specific MO files.

When plural translation units exist in the POT file, it is important to note that it is impossible to send off a language-neutral XLIFF file to translators. Filters need to insert the correct number of <trans-unit> elements for a plural group, and hence, filters need information on how many plural forms there are in a target language.

4. General Structure

Each PO file maps to one XLIFF <file> element. XLIFF representations of PO files should have the datatype attribute set to po, and the original attribute set to the name of the PO file.

The XLIFF may encapsulate the meta-data from the PO header in a <trans-unit> element, or store the header in a skeleton file.

The XLIFF <body> element contains translation units, which may be grouped by PO domains using hierarchical <group> elements.

5. Detailed Mapping

5.1. Header

There are two recommended approaches to handling the PO header in XLIFF: leaving the header out of the XLIFF file, or treating the header as a translation unit. Both approaches are described below.

5.1.1. Approach 1: Leave header out

The information contained in the PO header is not needed in the localisation process, and can be left out of the XLIFF file.

When converting POT files, it is possible to completely ignore the PO header, as described in Section 3.6, “Extracting from POT files”.

5.1.2. Approach 2: Use a <trans-unit> element

This approach involves storing the whole PO header as an XLIFF <trans-unit> element, with the restype attribute set to x-gettext-domain-header. In PO the header is identified by an empty source field (msgid), and the header is stored in the target field (msgstr). In converting to XLIFF, we copy the value of msgstr to both <source> and <target>, ensuring that translators can modify the header without losing track of the original content. Translator comments and the fuzzy flag are handled the same way as for other translation units.

For example:

would be mapped to:

The content of the PO header can hardly be seen as translatable data, hence this approach is not fully faithful to the XLIFF specification. However, this approach is recommended as a 'lesser-of-evils' approach in that it allows translators to modify PO header information, which is necessary in many Gettext-based localisation processes.

5.2. Translation Units

5.2.1. Non-Plurals

Each PO entry maps to an XLIFF <trans-unit> element, and contains the source string (msgid) in the <source> element, and the translation (msgstr) in the <target> element. White space and formatting should be preserved by setting the xml:space attribute to preserve.

For example:

would be mapped to:
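A filter could build such an element with Python's standard xml.etree.ElementTree module, roughly as sketched below. The function name and the sample id are hypothetical, and the XLIFF namespace declaration is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Namespace URI that ElementTree serializes with the reserved 'xml' prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def entry_to_trans_unit(unit_id, msgid, msgstr, resname=None):
    """Map one non-plural PO entry to a <trans-unit> element."""
    unit = ET.Element(
        "trans-unit",
        {"id": str(unit_id), f"{{{XML_NS}}}space": "preserve"},
    )
    if resname:
        unit.set("resname", resname)
    ET.SubElement(unit, "source").text = msgid
    ET.SubElement(unit, "target").text = msgstr
    return unit


unit = entry_to_trans_unit(1, "Hello\n", "Bonjour\n")
print(ET.tostring(unit, encoding="unicode"))
```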

5.2.2. Plurals

Each plural PO entry maps to an XLIFF <group> element with the restype attribute set to x-gettext-plurals, and contains one <trans-unit> element for each plural form in the target language.

For example:

would be mapped to:

When the target language has more than two plural forms, the plural source (msgid_plural) should be used in the <source> element for all translation units except the first.

For example:

would be mapped to:

When only one form exists for the target language (for example Japanese, Chinese, Korean), the plural group should include a second <trans-unit> element with the translate attribute set to no. This element should contain the original plural source (msgid_plural) in the <source> element, and is needed when back-converting to PO to create the msgid_plural field.

For example:

would be mapped to:

It is important to be aware of the implications of plural forms when extracting data from language-neutral POT files, as described in Section 3.6, “Extracting from POT files”.
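The plural rules of this section can be combined into one builder. The Python sketch below uses the standard xml.etree.ElementTree module; the function name and the id scheme (group id plus the plural index in brackets) are assumptions made for the example, not something this guide prescribes.

```python
import xml.etree.ElementTree as ET


def plural_group(group_id, msgid, msgid_plural, msgstrs):
    """Map one plural PO entry to a <group restype='x-gettext-plurals'> element."""
    group = ET.Element("group", {"id": str(group_id), "restype": "x-gettext-plurals"})
    for index, translation in enumerate(msgstrs):
        unit = ET.SubElement(group, "trans-unit", {"id": f"{group_id}[{index}]"})
        # The first unit carries the singular source, all others the plural source.
        ET.SubElement(unit, "source").text = msgid if index == 0 else msgid_plural
        ET.SubElement(unit, "target").text = translation
    if len(msgstrs) == 1:
        # Single-form languages: keep msgid_plural in a non-translatable unit
        # so that it can be restored on back-conversion.
        extra = ET.SubElement(
            group, "trans-unit", {"id": f"{group_id}[1]", "translate": "no"}
        )
        ET.SubElement(extra, "source").text = msgid_plural
    return group


three_forms = plural_group("u1", "%d file", "%d files", ["a", "b", "c"])
one_form = plural_group("u2", "%d file", "%d files", ["x"])
print(ET.tostring(one_form, encoding="unicode"))
```

The length of the msgstrs list has to come from the target language's nplurals value, which is why a language-neutral POT file alone is not enough to build the group.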

5.2.3. Obsolete Entries

Obsolete entries should not be included in the XLIFF file, and can be stored in a skeleton or ignored.

5.3. Translator Comments

Translator comments in PO have the same function as <note> elements in XLIFF: providing a way for people involved in the localisation process to include comments relating to a translation unit.

It is possible to map each translator comment to a <note> element, specifying that the comment is extracted from the PO file using the from attribute. Multi-line comments are concatenated, each line separated by a newline character.

For example:

could be mapped to:

Optionally, translator comments can be mapped to <context> elements with the context-type attribute set to x-po-transcomment. For example:

could be mapped to:

It is up to the individual filter implementer to decide which approach (if not both) to use.

5.4. Extracted Comments

Extracted comments in PO are comments extracted from source code, and provide a way for developers to add comments relating to a translation unit. They can be mapped to XLIFF in a similar fashion to Translator Comments.

It is possible to map each extracted comment to a <note> element, specifying through the from attribute that the comment is extracted from the PO file and represents a developer comment. Multi-line comments are concatenated, each line separated by a newline character.

For example:

could be mapped to:

Optionally, extracted comments can be mapped to <context> elements with the context-type attribute set to x-po-autocomment. The surrounding <context-group> element (same context group as the Translator Comment as described above) would have the name attribute set to a value that must be unique within the enclosing <file> element and the purpose attribute set to information.

For example:

could be mapped to:

As with Translator Comments, it is up to the individual filter implementer to decide which approach (if not both) to use.

5.5. References

Each reference is mapped to two <context> elements, one specifying the source file (context-type attribute set to sourcefile) and the other representing the location in the source file (context-type attribute set to linenumber).

Each reference is in addition grouped in a <context-group> element, with the name attribute set to a value that must be unique within the enclosing <file> element and the purpose attribute set to location.

For example:

would be mapped to:
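As an illustration only (this is not the guide's own example), the reference mapping can be sketched in Python. The sketch assumes each reference in a "#:" line has the form path:line and that references are separated by whitespace; the po-reference-N naming scheme is a hypothetical way of keeping names unique.

```python
import xml.etree.ElementTree as ET


def references_to_context_groups(reference_line, start_index=1):
    """Map a PO reference line ('#: file:line ...') to <context-group>s.

    Assumes every reference looks like 'path:line'. The group names
    'po-reference-N' are an illustrative choice for unique names.
    """
    groups = []
    references = reference_line[2:].split()  # drop the leading '#:'
    for offset, reference in enumerate(references):
        path, _, line_number = reference.rpartition(":")
        group = ET.Element(
            "context-group",
            {"name": f"po-reference-{start_index + offset}",
             "purpose": "location"},
        )
        ET.SubElement(
            group, "context", {"context-type": "sourcefile"}
        ).text = path
        ET.SubElement(
            group, "context", {"context-type": "linenumber"}
        ).text = line_number
        groups.append(group)
    return groups
```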

5.6. Flags

5.6.1. fuzzy

The fuzzy flag in PO maps to the approved attribute of a <trans-unit> element in XLIFF. The approved attribute is set to no if the fuzzy flag is present, and to yes if the flag is absent.

For example:

should be mapped to:

If the msgstr field is empty and the fuzzy flag is absent, the translation unit is still marked as not approved. When the msgstr field contains data and the fuzzy flag is set, the state attribute of the <target> element is set to needs-review-translation.

For example:

should be mapped to:

When back-converting to PO, the fuzzy flag is set unless the approved attribute of the translation unit is set to yes.
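The rules in this section reduce to a small decision table, sketched here in Python. The function names are hypothetical; the logic follows the three rules stated above.

```python
def po_to_xliff_approval(msgstr, fuzzy):
    """Return (approved, state) for a PO entry.

    'approved' is the value for the <trans-unit> approved attribute;
    'state' is the value for the <target> state attribute, or None
    when no state needs to be set.
    """
    # An empty msgstr is never approved, even without the fuzzy flag.
    approved = "no" if (fuzzy or not msgstr) else "yes"
    # A fuzzy entry with a translation needs review.
    state = "needs-review-translation" if (fuzzy and msgstr) else None
    return approved, state


def xliff_to_po_fuzzy(approved):
    """On back-conversion, set fuzzy unless approved is 'yes'."""
    return approved != "yes"
```

For instance, a translated entry carrying the fuzzy flag yields approved set to no together with the needs-review-translation state, while a translated entry without the flag yields approved set to yes and no state.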

5.6.2. no-wrap

The no-wrap flag only controls the visual layout of a translation unit in the PO file, and not the actual content. Hence, this flag has no meaning in an XLIFF file and can be ignored by filters.

Note that it is possible, when back-converting to PO, to honour the no-wrap flag. This can be done by implementing the same formatting rules as the Gettext tools:

  • Leave the first line (the same line as the msgid/msgstr keyword) blank.
  • Only split lines when encountering the newline character ('\n'); do not word-wrap long lines.

For example:

would when back-converted be formatted as:

in favour of word-wrapping similar to this:

How the no-wrap flag is stored (if it is honoured) in the localisation process is up to the individual filter implementers.
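The two no-wrap formatting rules listed above can be sketched in Python as follows. The function names are hypothetical, and the escaping covers only backslashes, double quotes and newlines, which is a simplifying assumption rather than the full behaviour of the Gettext tools.

```python
def escape_po(text):
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def format_no_wrap(keyword, text):
    """Format a msgid/msgstr value according to the no-wrap rules.

    The keyword line is left blank (an empty string literal), and the
    value is split only at newline characters, never word-wrapped.
    """
    lines = [f'{keyword} ""']
    parts = text.split("\n")
    for index, part in enumerate(parts):
        is_last = index == len(parts) - 1
        if is_last and part == "":
            break  # the text ended with a newline (or was empty)
        suffix = "" if is_last else "\\n"
        lines.append('"' + escape_po(part) + suffix + '"')
    return "\n".join(lines)
```

With this sketch, a long single-line message stays on one continuation line regardless of its length, because splitting happens only where the text itself contains a newline.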

5.6.3. X-format

The X-format flag (for example: c-format, java-format, php-format) specifies that the Gettext tools are to perform format checks before accepting the translation, ensuring that the parameters present in the source string (msgid) are also present in the translated entry (msgstr). This format check is done by the Gettext tools after translation, when generating MO files, or when merging a PO file with a newly extracted POT file.

This flag can be honoured by extracting parameters to <ph> or <x/> elements with the ctype attribute set to the value mapping to the format flag (see the table below). For example:

Here the parameters %s and %d can be extracted:

Table 4. Recommended ctype attribute values

Flag Name               ctype value
awk-format              x-awk-param
c-format                x-c-param
csharp-format           x-csharp-param
elisp-format            x-elisp-param
gcc-internal-format     x-gcc-internal-param
java-format             x-java-param
librep-format           x-librep-param
lisp-format             x-lisp-param
objc-format             x-objc-param
object-pascal-format    x-object-pascal-param
perl-format             x-perl-param
perl-brace-format       x-perl-brace-param
php-format              x-php-param
python-format           x-python-param
qt-format               x-qt-param
scheme-format           x-scheme-param
sh-format               x-sh-param
smalltalk-format        x-smalltalk-param
tcl-format              x-tcl-param
ycp-format              x-ycp-param
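As a rough illustration of parameter extraction for the c-format flag, the Python sketch below wraps each C format parameter in a <ph> element with ctype set to x-c-param. The function name is hypothetical, the regular expression is a simplified approximation of printf syntax (an assumption, not the Gettext tools' own parser), and "%%" is left as literal text.

```python
import re
from xml.sax.saxutils import escape

# Simplified printf-style pattern; '%%' is matched so it can be skipped.
C_PARAM = re.compile(
    r"%%|%(?:\d+\$)?[-+ #0]*\d*(?:\.\d+)?"
    r"(?:hh|h|ll|l|L|z|j|t)?[diouxXeEfgGcsp]"
)


def extract_c_params(msgid):
    """Return XLIFF inline markup with C parameters wrapped in <ph>."""
    pieces = []
    position = 0
    next_id = 1
    for match in C_PARAM.finditer(msgid):
        pieces.append(escape(msgid[position:match.start()]))
        token = match.group(0)
        if token == "%%":
            pieces.append(token)  # literal percent sign, not a parameter
        else:
            pieces.append(
                f'<ph id="{next_id}" ctype="x-c-param">{escape(token)}</ph>'
            )
            next_id += 1
        position = match.end()
    pieces.append(escape(msgid[position:]))
    return "".join(pieces)
```

The same approach would apply to the other flags in the table by swapping the pattern and the ctype value.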

For some source formats special consideration is needed when reordering parameters. For example:

If we here in the target language wanted to write:

we would have to specify the position of the parameters:

Most XLIFF editors do not provide a way for translators to edit the content of <ph> elements, and with <x/> elements the content is fully abstracted, meaning this logic would have to be implemented in the filters.

For example, in the following PO fragment:

the extraction filter could insert the necessary ordering tags when converting to XLIFF:

The translator could then safely re-order the parameters:

and the back converted PO file would then become:

Take note that the parameters in msgid are replaced with the original parameters on back-conversion.

It is recommended to implement support for extracting parameters only if support for parameter re-ordering is also implemented.
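One way a filter might handle re-ordering for c-format on back-conversion is sketched below in Python. It assumes the filter knows the original parameters in source order and the order in which the translator placed them, and it rewrites them using the positional n$ syntax of printf. The function name and the data shapes are hypothetical, and the sketch does not reproduce the ordering tags of the guide's own example.

```python
def to_positional(params, target_order):
    """Rewrite C format parameters for a re-ordered translation.

    'params' holds the source parameters in source order, for example
    ['%s', '%d']. 'target_order' lists 1-based source positions in the
    order they appear in the translation, for example [2, 1].
    """
    if target_order == list(range(1, len(params) + 1)):
        # Order unchanged: the original parameters can be kept as they are.
        return [params[position - 1] for position in target_order]
    # Order changed: insert 'n$' after the '%' of each parameter.
    return [
        f"%{position}$" + params[position - 1][1:]
        for position in target_order
    ]
```

The msgid keeps its original parameters; only the msgstr receives the positional forms.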

5.6.4. no-X-format

The no-X-format flags (for example: no-c-format, no-php-format) can be ignored as they have no functional use and are ignored by the Gettext tools. These flags are added by developers in source code to override the automatic insertion of X-format flags.

5.7. Domains

If multiple domains are present in a PO file, it is recommended to group each domain in a <group> element with the restype attribute set to x-gettext-domain and the resname attribute set to the name of the domain. For example:

should be mapped to:

In many cases a domain is not specified for the first translation units of a PO file (they are said to belong to the default domain 'messages'). It is recommended not to group these translation units, but rather to have them as children of the <body> element, only grouping domains when the domain keyword is found. For example:

should be mapped to:
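The grouping recommendation can be illustrated with the following Python sketch, which is not the guide's own example. Entries are given as hypothetical (domain, msgid) pairs, with None standing for units that appear before any domain keyword; the sequential id values are likewise an assumption.

```python
import xml.etree.ElementTree as ET


def build_body(entries):
    """Build a <body> from (domain, msgid) pairs.

    Units without a domain stay direct children of <body>; every named
    domain gets one <group restype="x-gettext-domain">.
    """
    body = ET.Element("body")
    groups = {}
    for index, (domain, msgid) in enumerate(entries, start=1):
        if domain is None:
            parent = body  # default domain: do not group
        else:
            parent = groups.get(domain)
            if parent is None:
                parent = ET.SubElement(
                    body,
                    "group",
                    {"restype": "x-gettext-domain", "resname": domain},
                )
                groups[domain] = parent
        unit = ET.SubElement(parent, "trans-unit", {"id": str(index)})
        ET.SubElement(unit, "source").text = msgid
    return body
```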

A. Contributions

The following people have contributed to this document:

  • Josep Condal
  • Fredrik Corneliusson
  • Doug Domeny
  • Karl Eichwalder
  • Asgeir Frimannsson
  • Tim Foster
  • David Fraser
  • Paul Gampe
  • Bruno Haible
  • James M. Hogan
  • Rodolfo M. Raya
  • Peter Reynolds
  • Yves Savourel
  • Bryan Schnabel
  • Tony Jewtushenko

B. Examples of converted PO files

We have provided the following two examples of PO files converted to XLIFF:

  • A simple PO Template file [example.pot] converted to XLIFF [example.xlf].
  • A partially translated PO file [example_nb_NO.po] converted to XLIFF [example_nb-NO.xlf].

References

[GNUGettext]

The GNU Gettext Manual http://www.gnu.org/software/gettext/manual

[OASIS]

Organization for the Advancement of Structured Information Standards Web site.

[XML1.0]

Extensible Markup Language (XML) 1.0 (Third Edition). W3C (World Wide Web Consortium), Feb 2004

[XLIFFTools]

The XLIFF Tools Project http://xliff-tools.freedesktop.org/