String encoding and decoding summary
The IFC exchange format, the "STEP physical file", only allows characters with decimal values 32 to 126 from the ISO 8859-1 code table. Any other character, such as a German umlaut, a Greek or Cyrillic letter, or an Asian character, has to be encoded before it can be exchanged as part of a string value.
The rules for encoding and decoding are defined in ISO 10303-21: "Industrial automation systems and integration — Product data representation and exchange — Part 21: Implementation methods: Clear text encoding of the exchange structure". A short summary and guideline is included in the IFC Implementation Guide.
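As a rough illustration of these rules, the following Java sketch encodes a string using the `\X\` (one 8-bit character), `\X2\ ... \X0\` (2-byte characters) and `\X4\ ... \X0\` (4-byte characters) control directives of ISO 10303-21, and doubles apostrophes and backslashes. The class name `Step21EncodeSketch` and its `encode` method are hypothetical names chosen for this example. They are not the API of the OPENIFCTOOLS StringConverter, which may choose different directives for the same input.

```java
// Hypothetical helper for illustration; not the OPENIFCTOOLS StringConverter API.
final class Step21EncodeSketch {

    /** Encodes text for use inside a STEP string, without the enclosing apostrophes. */
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        boolean inX2 = false; // true while a \X2\ ... \X0\ run is open
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean twoByte = cp > 0xFF && cp <= 0xFFFF;
            if (inX2 && !twoByte) { // close the 2-byte run before anything else
                out.append("\\X0\\");
                inX2 = false;
            }
            if (cp == '\'') {
                out.append("''"); // an apostrophe is doubled
            } else if (cp == '\\') {
                out.append("\\\\"); // a backslash is doubled
            } else if (cp >= 32 && cp <= 126) {
                out.append((char) cp); // allowed as-is
            } else if (cp <= 0xFF) {
                out.append(String.format("\\X\\%02X", cp)); // one 8-bit character
            } else if (twoByte) {
                if (!inX2) {
                    out.append("\\X2\\");
                    inX2 = true;
                }
                out.append(String.format("%04X", cp)); // four hex digits per character
            } else {
                out.append(String.format("\\X4\\%08X\\X0\\", cp)); // outside the 2-byte range
            }
        }
        if (inX2) {
            out.append("\\X0\\");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("M\u00FCller")); // M\X\FCller
        System.out.println(encode("O'Brien"));     // O''Brien
    }
}
```

Consecutive 2-byte characters share one `\X2\ ... \X0\` run here, which keeps the output shorter than opening a new run per character.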
To support the encoding and decoding of strings, the team from Bauhaus University Weimar and Hochtief, the developers of OPENIFCTOOLS, has extracted code from its IFC Open Java Toolbox (now also available for .NET) and made it available as a standalone tool with accompanying source code. Like the rest of OPENIFCTOOLS, it is licensed under the terms at http://creativecommons.org/licenses/by-nc-sa/3.0.
The package made available includes:
- OPENIFCTOOLS_StringConverter.jar - the binary; on Windows it can be started by double-clicking, provided that .jar files are associated with java.exe
- OPENIFCTOOLS_StringConverter_src.jar - contains the source code
- OPENIFCTOOLS_StringConverter_doc.zip - contains the Javadoc documentation
- download the package here >>>
By downloading, you agree to the license at http://creativecommons.org/licenses/by-nc-sa/3.0
- download the whole IFC Open Java Toolbox at http://www.openifctools.com
Note: buildingSMART provided the content on behalf of its developers on an as-is basis and is not responsible for its content.
The picture shows a screenshot of the demo application encoding an example from the Russian language. The encoding uses ISO 8859-5 (the code table that includes Cyrillic).
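A code-table based encoding like the Russian example can be sketched with the page directives of ISO 10303-21: `\PE\` selects ISO 8859-5, and `\S\` followed by a character stands for the entry whose code is that character's code plus 128. The class `Iso8859PageSketch` and its method are hypothetical names for this example and are not taken from the OPENIFCTOOLS source. The sketch assumes that the running JVM provides the `ISO-8859-5` charset, and it rejects the cases where the shifted character would be an apostrophe or a backslash rather than guess how they should be written.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not the OPENIFCTOOLS StringConverter API.
final class Iso8859PageSketch {

    /** Encodes text with the ISO 8859-5 page, without the enclosing apostrophes. */
    static String encodeWithCyrillicPage(String text) {
        Charset page = Charset.forName("ISO-8859-5"); // assumed to be available in this JVM
        if (!page.newEncoder().canEncode(text)) {
            throw new IllegalArgumentException("text is not representable in ISO 8859-5");
        }
        StringBuilder out = new StringBuilder("\\PE\\"); // "E" selects part 5 of ISO 8859
        for (byte b : text.getBytes(page)) {
            int code = b & 0xFF;
            if (code == '\'') {
                out.append("''"); // an apostrophe is doubled
            } else if (code == '\\') {
                out.append("\\\\"); // a backslash is doubled
            } else if (code >= 32 && code <= 126) {
                out.append((char) code); // allowed as-is
            } else if (code >= 160 && code <= 254) {
                char shifted = (char) (code - 128); // lands in the range 32..126
                if (shifted == '\'' || shifted == '\\') {
                    // Left open on purpose: this sketch does not settle these two cases.
                    throw new UnsupportedOperationException("shifted character needs special care");
                }
                out.append("\\S\\").append(shifted);
            } else {
                throw new IllegalArgumentException("character code " + code + " is not handled");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Cyrillic "Da": capital De (0xB4 in ISO 8859-5) and small a (0xD0)
        System.out.println(encodeWithCyrillicPage("\u0414\u0430")); // \PE\\S\4\S\P
    }
}
```

Compared with the 2-byte form, this form needs four characters per Cyrillic letter plus the one-time page directive, but it only covers characters that exist in the selected code table.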
The encoding uses an example from the Japanese language, with 2-byte characters based on ISO 10646. Disclaimer: the Japanese text was copied from a website without knowing its meaning, so hopefully it makes sense and is not offensive in any way.
The encoding uses an example from the Korean language, with 2-byte characters based on ISO 10646. Disclaimer: the Korean text was copied from a website without knowing its meaning, so hopefully it makes sense and is not offensive in any way.
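The reverse direction for the 2-byte form used in the Japanese and Korean examples might look like the following sketch. It reads `\X2\ ... \X0\` runs (four hex digits per character), the single-character `\X\` directive, and doubled apostrophes and backslashes. The class `Step21DecodeSketch` is a hypothetical name for this example, not the OPENIFCTOOLS API. Page directives (`\P..\`, `\S\`) and the 4-byte form `\X4\` are deliberately left out, and truncated input is not handled gracefully.

```java
// Hypothetical helper for illustration; not the OPENIFCTOOLS StringConverter API.
final class Step21DecodeSketch {

    /** Decodes the content of a STEP string, given without the enclosing apostrophes. */
    static String decode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\X2\\", i)) {
                int end = s.indexOf("\\X0\\", i + 4);
                if (end < 0 || (end - (i + 4)) % 4 != 0) {
                    throw new IllegalArgumentException("malformed 2-byte run at index " + i);
                }
                for (int p = i + 4; p < end; p += 4) {
                    // four hex digits give one 2-byte character
                    out.append((char) Integer.parseInt(s.substring(p, p + 4), 16));
                }
                i = end + 4; // skip the closing \X0\
            } else if (s.startsWith("\\X\\", i)) {
                // two hex digits give one 8-bit character
                out.append((char) Integer.parseInt(s.substring(i + 3, i + 5), 16));
                i += 5;
            } else if (s.startsWith("\\\\", i)) {
                out.append('\\'); // doubled backslash
                i += 2;
            } else if (s.startsWith("''", i)) {
                out.append('\''); // doubled apostrophe
                i += 2;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two Hangul syllables, written as Unicode escapes to keep this source file ASCII-only
        String decoded = decode("\\X2\\D55CAE00\\X0\\");
        System.out.println(decoded.equals("\uD55C\uAE00")); // true
    }
}
```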
Remarks: The functionality is encapsulated in the class stringconverter.StringConverter.java; the demo application is included in demo.StringEncodingDemo.java.



