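I need to convert a Microsoft locale ID (LCID), such as 1033 (US English), into either an ISO 639 language code or directly into a Java Locale instance. (Edit: or even simply into the "Language - Country/Region" description from Microsoft's table.)
Is this possible, and what is the easiest way? Preferably using only the JDK standard library, of course, but if that is not possible, using a third-party library.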
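As it started to look like there is no ready-made Java solution for this mapping, we took about 20 minutes to roll something of our own, at least for now.
We took the information from the horse's mouth, i.e. http://msdn.microsoft.com/en-us/goglobal/bb964664.aspx, and copy-pasted it (through Excel) into a .properties file like this: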
    1078 = Afrikaans - South Africa
    1052 = Albanian - Albania
    1118 = Amharic - Ethiopia
    1025 = Arabic - Saudi Arabia
    5121 = Arabic - Algeria
    ...
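(You can download the file here if you have a similar need.)
Then there is a very simple class that reads the information from the .properties file into a map and has a method for doing the conversion.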
    Map<String, String> lcidToDescription;

    public String getDescription(String lcid) { ... }
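Yes, this does not actually map to a language code or a Locale object (which is what I originally asked for), but to Microsoft's "Language - Country/Region" description. It turned out this was sufficient for our current need.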
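For illustration, the class described above might look roughly like the following. This is only a sketch: the class name `LcidDescriptions`, the `Reader`-based constructor, and the inline sample data are assumptions made here, not the original implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical lookup class: LCID (decimal string) -> "Language - Country/Region". */
final class LcidDescriptions {
    private final Map<String, String> lcidToDescription = new HashMap<>();

    /** Reads "LCID = description" pairs in .properties syntax from the given reader. */
    LcidDescriptions(Reader source) {
        Properties properties = new Properties();
        try {
            properties.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String lcid : properties.stringPropertyNames()) {
            lcidToDescription.put(lcid, properties.getProperty(lcid));
        }
    }

    /** Returns the description for the LCID, or null if the LCID is unknown. */
    String getDescription(String lcid) {
        return lcidToDescription.get(lcid);
    }

    public static void main(String[] args) {
        // Inline sample data; a real program would open the .properties file instead.
        String sample = "1078 = Afrikaans - South Africa\n1052 = Albanian - Albania\n";
        LcidDescriptions descriptions = new LcidDescriptions(new StringReader(sample));
        System.out.println(descriptions.getDescription("1078")); // prints "Afrikaans - South Africa"
    }
}
```

Taking a `Reader` rather than a file name keeps the sketch easy to try out; in practice you would pass a reader opened on the .properties file.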
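Disclaimer: this really is a simple, "dummy" way of doing it yourself in Java, and obviously keeping (and maintaining) a copy of the LCID mapping information in your own codebase is not very elegant. (On the other hand, I would not want to include a huge library jar or do anything overly complicated just for this simple mapping either.) So, despite this answer, feel free to post more elegant solutions or existing libraries if you know of anything like that.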
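If a Locale is needed after all, as in the original question, the same hand-maintained table could hold BCP 47 language tags instead of descriptions and go through `Locale.forLanguageTag`. The sketch below assumes that approach; the class name `LcidLocales` is hypothetical, and the tags were derived by hand here from the descriptions shown in this thread, not taken from an official table.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical hand-maintained table: LCID -> BCP 47 language tag -> Locale. */
final class LcidLocales {
    private static final Map<Integer, String> LCID_TO_TAG = new HashMap<>();

    static {
        // Tags below are assumptions derived from the "Language - Country/Region" text.
        LCID_TO_TAG.put(1033, "en-US"); // English - United States
        LCID_TO_TAG.put(1078, "af-ZA"); // Afrikaans - South Africa
        LCID_TO_TAG.put(1052, "sq-AL"); // Albanian - Albania
        LCID_TO_TAG.put(1118, "am-ET"); // Amharic - Ethiopia
        LCID_TO_TAG.put(1025, "ar-SA"); // Arabic - Saudi Arabia
        LCID_TO_TAG.put(5121, "ar-DZ"); // Arabic - Algeria
    }

    /** Returns the Locale for a known LCID, or an empty Optional otherwise. */
    static Optional<Locale> fromLcid(int lcid) {
        String tag = LCID_TO_TAG.get(lcid);
        return tag == null ? Optional.empty() : Optional.of(Locale.forLanguageTag(tag));
    }

    public static void main(String[] args) {
        Locale locale = fromLcid(1033).orElseThrow(IllegalArgumentException::new);
        System.out.println(locale.getLanguage() + " / " + locale.getCountry()); // prints "en / US"
    }
}
```

The maintenance burden is the same as with the description file; only the stored value changes.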
You can use GetLocaleInfo to do this (assuming you are running on Windows, Win2k or later).
This C++ code demonstrates how to use the function:
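    #include <windows.h>

    int main()
    {
        // Named "console" because stdout is a macro in the C standard library
        HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
        if (INVALID_HANDLE_VALUE == console) return 1;

        LCID Locale = 0x0c01; // Arabic - Egypt
        // First call: ask for the required buffer size (includes the terminating null)
        int nchars = GetLocaleInfoW(Locale, LOCALE_SISO639LANGNAME, NULL, 0);
        if (nchars == 0) return 1;
        wchar_t* LanguageCode = new wchar_t[nchars];
        GetLocaleInfoW(Locale, LOCALE_SISO639LANGNAME, LanguageCode, nchars);
        // Write the code without its terminating null
        WriteConsoleW(console, LanguageCode, nchars - 1, NULL, NULL);
        delete[] LanguageCode;
        return 0;
    }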
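It would not take much work to turn this into a JNA call. (Tip: emit the constants as ints to find their values.)
Sample JNA code:
Using JNI is a bit more involved, but is manageable for a relatively trivial task.
At the very least, I would look into using native calls to build your conversion database. I am not sure whether Windows has a way to enumerate the LCIDs, but there is bound to be something in .NET. As a build-level step, this is not a huge burden. I would want to avoid maintaining the list by hand.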
The following code programmatically creates a mapping between Microsoft LCID codes and Java Locales, which makes it easier to keep the mapping up to date:
    // Locales.java
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Locale;
    import java.util.Map;
    import java.util.Properties;

    /**
     * @author Gili Tzabari
     */
    public final class Locales {
        /**
         * Maps a Microsoft LCID to a Java Locale.
         */
        private final Map<Integer, Locale> lcidToLocale = new HashMap<>(LcidToLocaleMapping.NUM_LOCALES);

        public Locales() {
            // Try loading the mapping from cache
            File file = new File("lcid-to-locale.properties");
            Properties properties = new Properties();
            try (FileInputStream in = new FileInputStream(file)) {
                properties.load(in);
                for (Object key: properties.keySet()) {
                    String keyString = key.toString();
                    Integer lcid = Integer.parseInt(keyString);
                    String languageTag = properties.getProperty(keyString);
                    lcidToLocale.put(lcid, Locale.forLanguageTag(languageTag));
                }
                return;
            } catch (IOException unused) {
                // Cache does not exist or is invalid, regenerate...
                lcidToLocale.clear();
            }

            LcidToLocaleMapping mapping;
            try {
                mapping = new LcidToLocaleMapping();
            } catch (IOException e) {
                // Unrecoverable runtime failure
                throw new AssertionError(e);
            }
            for (Locale locale: Locale.getAvailableLocales()) {
                if (locale.equals(Locale.ROOT)) {
                    // Special case that doesn't map to a real locale
                    continue;
                }
                // Use English display names throughout so they can match the English LCID table
                String language = locale.getDisplayLanguage(Locale.ENGLISH);
                String country = locale.getDisplayCountry(Locale.ENGLISH);
                country = mapping.getCountryAlias(country);
                String script = locale.getDisplayScript(Locale.ENGLISH);
                for (Integer lcid: mapping.listLcidFor(language, country, script)) {
                    lcidToLocale.put(lcid, locale);
                    properties.put(lcid.toString(), locale.toLanguageTag());
                }
            }

            // Cache the mapping
            try (FileOutputStream out = new FileOutputStream(file)) {
                properties.store(out, "LCID to Locale mapping");
            } catch (IOException e) {
                // Unrecoverable runtime failure
                throw new AssertionError(e);
            }
        }

        /**
         * @param lcid a Microsoft LCID code
         * @return a Java locale, or null if the LCID is unknown
         * @see <a href="https://msdn.microsoft.com/en-us/library/cc223140.aspx">LCID table</a>
         */
        public Locale fromLcid(int lcid) {
            return lcidToLocale.get(lcid);
        }
    }

    // LcidToLocaleMapping.java
    import com.google.common.collect.HashMultimap;
    import com.google.common.collect.ImmutableList;
    import com.google.common.collect.ImmutableMap;
    import com.google.common.collect.SetMultimap;
    import com.google.common.collect.Sets;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import java.util.stream.Collectors;
    import org.bitbucket.cowwoc.preconditions.Preconditions;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    /**
     * Generates a mapping between Microsoft LCIDs and Java Locales.
     * <p>
     * @see <a href="http://stackoverflow.com/a/32324060/14731">http://stackoverflow.com/a/32324060/14731</a>
     * @author Gili Tzabari
     */
    final class LcidToLocaleMapping {
        private static final int NUM_COUNTRIES = 194;
        private static final int NUM_LANGUAGES = 13;
        private static final int NUM_SCRIPTS = 5;
        /**
         * The number of locales we are expecting. This value is only used for performance optimization.
         */
        public static final int NUM_LOCALES = 238;
        private static final List<String> EXPECTED_HEADERS = ImmutableList.of("lcid", "language", "location");
        // [language] - [comment] ([script])
        private static final Pattern languagePattern = Pattern.compile("^(.+?)(?: - (.*?))?(?: \\((.+)\\))?$");
        /**
         * Maps a country to a list of entries.
         */
        private static final SetMultimap<String, Mapping> COUNTRY_TO_ENTRIES =
            HashMultimap.create(NUM_COUNTRIES, NUM_LOCALES / NUM_COUNTRIES);
        /**
         * Maps a language to a list of entries.
         */
        private static final SetMultimap<String, Mapping> LANGUAGE_TO_ENTRIES =
            HashMultimap.create(NUM_LANGUAGES, NUM_LOCALES / NUM_LANGUAGES);
        /**
         * Maps a language script to a list of entries.
         */
        private static final SetMultimap<String, Mapping> SCRIPT_TO_ENTRIES =
            HashMultimap.create(NUM_SCRIPTS, NUM_LOCALES / NUM_SCRIPTS);
        /**
         * Maps a Locale country name to a LCID country name.
         */
        private static final Map<String, String> countryAlias = ImmutableMap.<String, String>builder().
            put("United Arab Emirates", "UAE").
            build();

        /**
         * A mapping between a country, language, script and LCID.
         */
        private static final class Mapping {
            public final String country;
            public final String language;
            public final String script;
            public final int lcid;

            Mapping(String country, String language, String script, int lcid) {
                Preconditions.requireThat(country, "country").isNotNull();
                Preconditions.requireThat(language, "language").isNotNull().isNotEmpty();
                Preconditions.requireThat(script, "script").isNotNull();
                this.country = country;
                this.language = language;
                this.script = script;
                this.lcid = lcid;
            }

            @Override
            public int hashCode() {
                return country.hashCode() + language.hashCode() + script.hashCode() + lcid;
            }

            @Override
            public boolean equals(Object obj) {
                // Compare against Mapping (the original checked "instanceof Locales", which never matched)
                if (!(obj instanceof Mapping))
                    return false;
                Mapping other = (Mapping) obj;
                return country.equals(other.country) && language.equals(other.language) &&
                    script.equals(other.script) && lcid == other.lcid;
            }
        }
        private final Logger log = LoggerFactory.getLogger(LcidToLocaleMapping.class);

        /**
         * Creates a new LCID to Locale mapping.
         * <p>
         * @throws IOException if an I/O error occurs while reading the LCID table
         */
        LcidToLocaleMapping() throws IOException {
            Document doc = Jsoup.connect("https://msdn.microsoft.com/en-us/library/cc223140.aspx").get();
            Element mainBody = doc.getElementById("mainBody");
            Elements elements = mainBody.select("table");
            assert (elements.size() == 1): elements;
            for (Element table: elements) {
                boolean firstRow = true;
                for (Element row: table.select("tr")) {
                    if (firstRow) {
                        // Make sure that columns are ordered as expected
                        List<String> headers = new ArrayList<>(3);
                        Elements columns = row.select("th");
                        for (Element column: columns)
                            headers.add(column.text().toLowerCase());
                        assert (headers.equals(EXPECTED_HEADERS)): headers;
                        firstRow = false;
                        continue;
                    }
                    Elements columns = row.select("td");
                    assert (columns.size() == 3): columns;
                    // The table lists LCIDs in hexadecimal
                    Integer lcid = Integer.parseInt(columns.get(0).text(), 16);
                    Matcher languageMatcher = languagePattern.matcher(columns.get(1).text());
                    if (!languageMatcher.find())
                        throw new AssertionError();
                    String language = languageMatcher.group(1);
                    // Group 2 is the comment; group 3 is the script (see the pattern comment above)
                    String script = languageMatcher.group(3);
                    if (script == null)
                        script = "";
                    String country = columns.get(2).text();
                    Mapping mapping = new Mapping(country, language, script, lcid);
                    COUNTRY_TO_ENTRIES.put(country, mapping);
                    LANGUAGE_TO_ENTRIES.put(language, mapping);
                    if (!script.isEmpty())
                        SCRIPT_TO_ENTRIES.put(script, mapping);
                }
            }
        }

        /**
         * Returns the LCID codes associated with a [country, language, script] combination.
         * <p>
         * @param language a language
         * @param country  a country (empty string if any country should match)
         * @param script   a language script (empty string if any script should match)
         * @return an empty list if no matches are found
         * @throws NullPointerException     if any of the arguments are null
         * @throws IllegalArgumentException if language is empty
         */
        public Collection<Integer> listLcidFor(String language, String country, String script)
            throws NullPointerException, IllegalArgumentException {
            Preconditions.requireThat(language, "language").isNotNull().isNotEmpty();
            Preconditions.requireThat(country, "country").isNotNull();
            Preconditions.requireThat(script, "script").isNotNull();
            Set<Mapping> result = LANGUAGE_TO_ENTRIES.get(language);
            // A multimap returns an empty set, never null, for a missing key
            if (result.isEmpty()) {
                log.warn("Language '" + language + "' had no corresponding LCID");
                return Collections.emptyList();
            }
            if (!country.isEmpty()) {
                Set<Mapping> entries = COUNTRY_TO_ENTRIES.get(country);
                result = Sets.intersection(result, entries);
            }
            if (!script.isEmpty()) {
                Set<Mapping> entries = SCRIPT_TO_ENTRIES.get(script);
                result = Sets.intersection(result, entries);
            }
            return result.stream().map(entry -> entry.lcid).collect(Collectors.toList());
        }

        /**
         * @param name the locale country name
         * @return the LCID country name
         */
        public String getCountryAlias(String name) {
            String result = countryAlias.get(name);
            if (result == null)
                return name;
            return result;
        }
    }
Maven dependencies:
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>18.0</version>
    </dependency>
    <dependency>
        <groupId>org.bitbucket.cowwoc</groupId>
        <artifactId>preconditions</artifactId>
        <version>1.25</version>
    </dependency>
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.8.3</version>
    </dependency>
Usage:

    System.out.println("Language: " + new Locales().fromLcid(1033).getDisplayLanguage());

will print "Language: English".
Meaning, LCID 1033 maps to the English language.
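The parsing of the table's language column hinges on the single regular expression in `LcidToLocaleMapping`, so it helps to see what each capture group yields. In the sketch below the pattern is copied from the code above, but the class name `LanguageColumnDemo` and the sample cell values are hypothetical and were not taken from the Microsoft table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical demo of the "[language] - [comment] ([script])" pattern. */
final class LanguageColumnDemo {
    // Same pattern as in LcidToLocaleMapping
    private static final Pattern LANGUAGE_PATTERN =
        Pattern.compile("^(.+?)(?: - (.*?))?(?: \\((.+)\\))?$");

    /** Returns {language, comment, script}; unmatched optional parts become "". */
    static String[] parse(String cell) {
        Matcher matcher = LANGUAGE_PATTERN.matcher(cell);
        if (!matcher.find())
            throw new IllegalArgumentException("Unexpected cell: " + cell);
        String[] parts = new String[3];
        for (int i = 0; i < parts.length; i++) {
            String group = matcher.group(i + 1);
            parts[i] = (group == null) ? "" : group;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Made-up cell values covering the three shapes the pattern allows
        String[] samples = {"English", "Serbian (Latin)", "Uzbek - Legacy (Cyrillic)"};
        for (String sample : samples) {
            String[] parts = parse(sample);
            System.out.println(sample + " -> language=" + parts[0] +
                ", comment=" + parts[1] + ", script=" + parts[2]);
        }
    }
}
```

With the pattern as written, group 1 is the language, group 2 the optional comment, and group 3 the optional script in parentheses.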
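Note: this will only generate a mapping for the locales that are available on your runtime JVM. Meaning, you will only get a subset of all possible Locales. That said, I don't think it is technically possible to instantiate Locales that your JVM doesn't support, so this is probably the best we can do...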
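One nuance to the note above: as far as I know, `Locale.forLanguageTag` only requires a well-formed BCP 47 tag, so a Locale object can be created even when the JVM does not list it in `Locale.getAvailableLocales()`. What such a JVM may lack is the localized data behind it (display names, formats). A minimal sketch, with a hypothetical class name and an arbitrarily chosen tag:

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical demo: creating a Locale the JVM may have no data for. */
final class UnsupportedLocaleDemo {
    /** True if the running JVM lists the locale among its installed locales. */
    static boolean isInstalled(Locale locale) {
        return Arrays.asList(Locale.getAvailableLocales()).contains(locale);
    }

    public static void main(String[] args) {
        // "tlh-AQ" is well-formed, but a typical JVM is unlikely to ship data for it
        Locale locale = Locale.forLanguageTag("tlh-AQ");
        System.out.println("language:  " + locale.getLanguage());
        System.out.println("country:   " + locale.getCountry());
        System.out.println("installed: " + isInstalled(locale));
    }
}
```

So a mapping generated from a fixed table of language tags would not be limited to the installed locales, at the cost of display names that may fall back to raw codes.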
The first hit on Google for "Java LCID" is this javadoc:
    gnu.java.awt.font.opentype.NameDecoder

    private static java.util.Locale getWindowsLocale(int lcid)

    Maps a Windows LCID into a Java Locale.
    Parameters: lcid - the Windows language ID whose Java locale is to be retrieved.
    Returns: a suitable Locale, or null if the mapping cannot be performed.
I don't know where to download this library, but it is GNU, so it shouldn't be too hard to find.