Language Detection (Google Service)

Detects the language used in any field of the given input

About

This component uses the Google Translate service to detect the language used in any field of the given input. It can be used in both real-time and batch processing.

Adding to your Dynamic Pipeline

This component can be added to your Dynamic Pipelines through the "Language Detection (Google Service)" component. It requires the following configuration fields:

  • Destination Path (Required): the metadata field to which the detected language is written, as a two-letter language code (ISO 639-1; ISO 3166 is the standard for country codes, not languages). This can be an existing field, or the component can create a new field for it.
  • Main (Required): the input field that the Language Detection uses as its source. By default, the content.body field is used, but any field can be chosen as the input.
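
As a rough illustration of how the two fields relate, the sketch below reads the Main field from a document, runs a detector on it, and writes the result to the Destination Path. It is not the component's actual implementation: the helper names, the "metadata.language" destination, and the stand-in detector (used here instead of the Google Translate call) are all assumptions made for the example.

```python
from typing import Any, Callable, Dict


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Read a dotted path such as 'content.body' from a nested dict."""
    node: Any = doc
    for key in path.split("."):
        node = node[key]
    return node


def set_path(doc: Dict[str, Any], path: str, value: Any) -> None:
    """Write a dotted path, creating intermediate dicts when they are missing."""
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def apply_language_detection(
    doc: Dict[str, Any],
    detect: Callable[[str], str],
    main: str = "content.body",                   # default source field, per the text above
    destination_path: str = "metadata.language",  # hypothetical destination field
) -> Dict[str, Any]:
    """Detect the language of the Main field and store it at the Destination Path."""
    text = get_path(doc, main)
    set_path(doc, destination_path, detect(text))
    return doc


def fake_detect(text: str) -> str:
    """Placeholder detector standing in for the Google Translate service."""
    return "FR" if "bonjour" in text.lower() else "EN"


document = {"content": {"body": "Bonjour le monde"}}
result = apply_language_detection(document, fake_detect)
print(result["metadata"]["language"])
```

Because the destination is created on demand, the same call works whether or not the target field already exists, which mirrors the behaviour described for Destination Path.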

Compatible Languages

Because this component uses the Google Translate API in the back end, its language coverage is continuously improved. According to https://cloud.google.com/translate/docs/languages, the language coverage is:

Language        Language ID (ISO 639 code)
Afrikaans       AF
Albanian        SQ
Amharic         AM
Arabic          AR
Armenian        HY
Assamese        AS
Aymara          AY
Azerbaijani     AZ
Bambara         BM
Basque          EU
Belarusian      BE
Bengali         BN
Bhojpuri        HO
Bosnian         BS
Bulgarian       BG
Catalan         CA
Cebuano         EB
Chinese         ZH
Corsican        CO
Croatian        HR
Czech           CS
Danish          DA
Dhivehi         DV
Dogri           OI
Dutch           NL
English         EN
Esperanto       EO
Estonian        ET
Ewe             EE
Filipino        IL
Finnish         FI
French          FR
Frisian         FY
Galician        GL
Georgian        KA
German          DE
Greek           EL
Guarani         GN
Gujarati        GU
Haitian         HT
Hausa           HA
Hawaiian        AW
Hebrew          IW
Hindi           HI
Hmong           MN
Hungarian       HU
Icelandic       IS
Igbo            IG
Ilocano         LO
Indonesian      ID
Irish           GA
Italian         IT
Japanese        JA
Javanese        JW
Kannada         KN
Kazakh          KK
Khmer           KM
Kinyarwanda     RW
Konkani         OM
Korean          KO
Krio            RI
Kurdish         KU
Kurdish         KB
Kyrgyz          KY
Lao             LO
Latin           LA
Latvian         LV
Lingala         LN
Lithuanian      LT
Luganda         LG
Luxembourgish   LB
Macedonian      MK
Maithili        AI
Malagasy        MG
Malay           MS
Malayalam       ML
Maltese         MT
Maori           MI
Marathi         MR
Meiteilon       EI
Mizo            US
Mongolian       MN
Myanmar         MY
Nepali          NE
Norwegian       NO
Nyanja          NY
Odia            OR
Oromo           OM
Pashto          PS
Persian         FA
Polish          PL
Portuguese      PT
Punjabi         PA
Quechua         QU
Romanian        RO
Russian         RU
Samoan          SM
Sanskrit        SA
Scots           GD
Sepedi          SO
Serbian         SR
Sesotho         ST
Shona           SN
Sindhi          SD
Sinhala         SI
Slovak          SK
Slovenian       SL
Somali          SO
Spanish         ES
Sundanese       SU
Swahili         SW
Swedish         SV
Tagalog         TL
Tajik           TG
Tamil           TA
Tatar           TT
Telugu          TE
Thai            TH
Tigrinya        TI
Tsonga          TS
Turkish         TR
Turkmen         TK
Twi             AK
Ukrainian       UK
Urdu            UR
Uyghur          UG
Uzbek           UZ
Vietnamese      VI
Welsh           CY
Xhosa           XH
Yiddish         YI
Yoruba          YO
Zulu            ZU

Usage in Search API

In the Search API, this operation allows a user to specify the destination field, the source fields, and the separator. For example:

{
    "query": {
        ...
    },
    "operations": [
        {
            "name": "detect_language",
            "destination_path": "operations.language",
            "parameters": {
                "main": "content.body"
            }
        }
    ]
}
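
When the request is assembled programmatically, the body above can be built from a small helper. The following is a minimal sketch only: the function name is hypothetical, the query is left to the caller because its contents are elided above, and how the body is sent to the Search API is not covered here.

```python
import json
from typing import Any, Dict


def build_search_request(
    query: Dict[str, Any],
    main: str = "content.body",
    destination_path: str = "operations.language",
) -> Dict[str, Any]:
    """Return a Search API request body with a detect_language operation attached."""
    return {
        "query": query,  # caller-supplied; the query structure is not shown in this document
        "operations": [
            {
                "name": "detect_language",
                "destination_path": destination_path,
                "parameters": {"main": main},
            }
        ],
    }


# An empty query is used purely as a placeholder for the example.
body = build_search_request({})
print(json.dumps(body, indent=4))
```

Keeping the operation in a helper makes it easy to point the detection at a different source field by changing only the main argument.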