User:TJones (WMF)/Notes/Language Detection Evaluation/Results by Language Count with Spaces

From mediawiki.org

It turns out that the Elasticsearch language detection plugin works a bit better when spaces are added to the beginning and end of each query. The results below were produced with queries padded this way.
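
As a rough illustration of the padding step, and of one possible reason it helps: assuming the detector works on character n-grams (an assumption here, not something stated in these notes), padding turns the first and last letters of a query into word-boundary n-grams. The helper names `pad_query` and `char_ngrams` are hypothetical and are not part of the plugin.

```python
def pad_query(query: str) -> str:
    """Add one space to the beginning and one to the end of the query."""
    return f" {query} "


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Return all overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Without padding, a short query yields only word-internal trigrams.
unpadded = char_ngrams("the")
# With padding, trigrams that mark the word start and end also appear.
padded = char_ngrams(pad_query("the"))
print(unpadded)
print(padded)
```

Whether the plugin actually benefits for this reason is a guess; the sketch only shows what the padded input looks like at the character level.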

Languages reported by count

Actual:
English (599)	Spanish (43)	Chinese (20)	Portuguese (19)	Arabic (10)	
French (10)	Tagalog (9)	German (8)	Malay (6)	Russian (5)	
Turkish (5)	Indonesian (4)	Persian (4)	Swahili (4)	Korean (3)	
Bengali (2)	Bulgarian (2)	Hindi (2)	Italian (2)	Norwegian (2)	
Croatian (1)	Dutch (1)	Estonian (1)	Finnish (1)	Greek (1)	
Hmong (1)	Japanese (1)	Kannada (1)	Latin (1)	Polish (1)	
Serbian (1)	Somali (1)	Swedish (1)	Tamil (1)	Thai (1)	
Uzbek (1)	

1:
English (280)	Romanian (67)	Spanish (57)	French (53)	Italian (44)	
Tagalog (43)	German (28)	Portuguese (25)	Indonesian (24)	Chinese (19)	
Swedish (13)	Albanian (13)	Danish (12)	Norwegian (11)	Estonian (11)	
Dutch (11)	Arabic (8)	Turkish (7)	Finnish (6)	Croatian (6)	
Polish (5)	Persian (5)	Lithuanian (4)	Russian (3)	Korean (3)	
Macedonian (2)	Bulgarian (2)	Bengali (2)	Hindi (2)	Czech (2)	
Tamil (1)	Japanese (1)	Thai (1)	Hungarian (1)	Greek (1)	
Latvian (1)	Ukrainian (1)	

2:
English (340)	Romanian (90)	French (83)	Italian (72)	Spanish (70)	
Tagalog (59)	German (42)	Portuguese (34)	Indonesian (28)	Danish (24)	
Albanian (23)	Chinese (22)	Dutch (19)	Swedish (18)	Estonian (18)	
Norwegian (16)	Croatian (10)	Arabic (9)	Turkish (8)	Lithuanian (8)	
Finnish (8)	Polish (7)	Czech (6)	Persian (5)	Hungarian (4)	
Korean (3)	Latvian (3)	Russian (3)	Macedonian (3)	Hindi (2)	
Bulgarian (2)	Bengali (2)	Ukrainian (1)	Greek (1)	Thai (1)	
Tamil (1)	Japanese (1)	

3:
English (344)	Romanian (94)	French (85)	Italian (80)	Spanish (75)	
Tagalog (62)	German (47)	Portuguese (37)	Danish (29)	Indonesian (29)	
Albanian (25)	Dutch (24)	Chinese (22)	Estonian (19)	Swedish (18)	
Norwegian (16)	Turkish (11)	Lithuanian (10)	Polish (10)	Croatian (10)	
Arabic (9)	Czech (8)	Finnish (8)	Persian (5)	Hungarian (4)	
Korean (3)	Latvian (3)	Russian (3)	Macedonian (3)	Hindi (2)	
Bengali (2)	Bulgarian (2)	Ukrainian (1)	Vietnamese (1)	Greek (1)	
Thai (1)	Tamil (1)	Japanese (1)	

4:
English (345)	Romanian (94)	French (85)	Italian (80)	Spanish (76)	
Tagalog (63)	German (48)	Portuguese (37)	Indonesian (30)	Danish (29)	
Dutch (25)	Albanian (25)	Chinese (22)	Estonian (19)	Swedish (18)	
Norwegian (16)	Croatian (11)	Turkish (11)	Polish (10)	Lithuanian (10)	
Arabic (9)	Finnish (8)	Czech (8)	Persian (5)	Hungarian (4)	
Macedonian (3)	Russian (3)	Korean (3)	Latvian (3)	Bengali (2)	
Bulgarian (2)	Hindi (2)	Tamil (1)	Japanese (1)	Thai (1)	
Vietnamese (1)	Greek (1)	Ukrainian (1)	

Recall and precision by number of languages considered

thresh	f0.5	f1	f2	recall	prec	total	hits	misses
TOTAL (775)
1	 51.2%	 51.2%	 51.2%	 51.2%	 51.2%	775	397	378
2	 46.6%	 50.7%	 55.7%	 59.6%	 44.2%	775	462	584
3	 45.0%	 49.8%	 55.6%	 60.4%	 42.4%	775	468	637
4	 45.0%	 49.8%	 55.8%	 60.6%	 42.3%	775	470	642

English (599)
1	 80.6%	 63.0%	 51.8%	 46.2%	 98.9%	599	277	3
2	 86.0%	 71.8%	 61.6%	 56.3%	 99.1%	599	337	3
3	 86.3%	 72.3%	 62.2%	 56.9%	 99.1%	599	341	3
4	 86.4%	 72.5%	 62.4%	 57.1%	 99.1%	599	342	3

Spanish (43)
1	 60.9%	 66.0%	 72.1%	 76.7%	 57.9%	43	33	24
2	 51.1%	 58.4%	 68.2%	 76.7%	 47.1%	43	33	37
3	 48.1%	 55.9%	 66.8%	 76.7%	 44.0%	43	33	42
4	 49.0%	 57.1%	 68.5%	 79.1%	 44.7%	43	34	42

Chinese (20)
1	 99.0%	 97.4%	 96.0%	 95.0%	100.0%	20	19	0
2	 92.6%	 95.2%	 98.0%	100.0%	 90.9%	20	20	2
3	 92.6%	 95.2%	 98.0%	100.0%	 90.9%	20	20	2
4	 92.6%	 95.2%	 98.0%	100.0%	 90.9%	20	20	2

Portuguese (19)
1	 50.4%	 54.5%	 59.4%	 63.2%	 48.0%	19	12	13
2	 41.9%	 49.1%	 59.1%	 68.4%	 38.2%	19	13	21
3	 41.9%	 50.0%	 61.9%	 73.7%	 37.8%	19	14	23
4	 41.9%	 50.0%	 61.9%	 73.7%	 37.8%	19	14	23

Arabic (10)
1	 95.2%	 88.9%	 83.3%	 80.0%	100.0%	10	8	0
2	 87.0%	 84.2%	 81.6%	 80.0%	 88.9%	10	8	1
3	 87.0%	 84.2%	 81.6%	 80.0%	 88.9%	10	8	1
4	 87.0%	 84.2%	 81.6%	 80.0%	 88.9%	10	8	1

French (10)
1	  6.8%	  9.5%	 16.1%	 30.0%	  5.7%	10	3	50
2	  7.3%	 10.8%	 20.3%	 50.0%	  6.0%	10	5	78
3	  7.1%	 10.5%	 20.0%	 50.0%	  5.9%	10	5	80
4	  7.1%	 10.5%	 20.0%	 50.0%	  5.9%	10	5	80

Tagalog (9)
1	 19.3%	 26.9%	 44.3%	 77.8%	 16.3%	9	7	36
2	 16.3%	 23.5%	 42.1%	 88.9%	 13.6%	9	8	51
3	 15.6%	 22.5%	 40.8%	 88.9%	 12.9%	9	8	54
4	 15.3%	 22.2%	 40.4%	 88.9%	 12.7%	9	8	55

German (8)
1	 25.0%	 33.3%	 50.0%	 75.0%	 21.4%	8	6	22
2	 17.0%	 24.0%	 40.5%	 75.0%	 14.3%	8	6	36
3	 17.9%	 25.5%	 44.3%	 87.5%	 14.9%	8	7	40
4	 17.5%	 25.0%	 43.8%	 87.5%	 14.6%	8	7	41

Malay (6)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	6	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	6	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	6	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	6	0	0

Russian (5)
1	 88.2%	 75.0%	 65.2%	 60.0%	100.0%	5	3	0
2	 88.2%	 75.0%	 65.2%	 60.0%	100.0%	5	3	0
3	 88.2%	 75.0%	 65.2%	 60.0%	100.0%	5	3	0
4	 88.2%	 75.0%	 65.2%	 60.0%	100.0%	5	3	0

Turkish (5)
1	 45.5%	 50.0%	 55.6%	 60.0%	 42.9%	5	3	4
2	 40.5%	 46.2%	 53.6%	 60.0%	 37.5%	5	3	5
3	 30.6%	 37.5%	 48.4%	 60.0%	 27.3%	5	3	8
4	 30.6%	 37.5%	 48.4%	 60.0%	 27.3%	5	3	8

Indonesian (4)
1	 20.0%	 28.6%	 50.0%	100.0%	 16.7%	4	4	20
2	 17.2%	 25.0%	 45.5%	100.0%	 14.3%	4	4	24
3	 16.7%	 24.2%	 44.4%	100.0%	 13.8%	4	4	25
4	 16.1%	 23.5%	 43.5%	100.0%	 13.3%	4	4	26

Persian (4)
1	 83.3%	 88.9%	 95.2%	100.0%	 80.0%	4	4	1
2	 83.3%	 88.9%	 95.2%	100.0%	 80.0%	4	4	1
3	 83.3%	 88.9%	 95.2%	100.0%	 80.0%	4	4	1
4	 83.3%	 88.9%	 95.2%	100.0%	 80.0%	4	4	1

Swahili (4)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	4	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	4	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	4	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	4	0	0

Korean (3)
1	100.0%	100.0%	100.0%	100.0%	100.0%	3	3	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	3	3	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	3	3	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	3	3	0

Bengali (2)
1	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0

Bulgarian (2)
1	 50.0%	 50.0%	 50.0%	 50.0%	 50.0%	2	1	1
2	 50.0%	 50.0%	 50.0%	 50.0%	 50.0%	2	1	1
3	 50.0%	 50.0%	 50.0%	 50.0%	 50.0%	2	1	1
4	 50.0%	 50.0%	 50.0%	 50.0%	 50.0%	2	1	1

Hindi (2)
1	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	2	2	0

Italian (2)
1	  5.6%	  8.7%	 19.2%	100.0%	  4.5%	2	2	42
2	  3.4%	  5.4%	 12.5%	100.0%	  2.8%	2	2	70
3	  3.1%	  4.9%	 11.4%	100.0%	  2.5%	2	2	78
4	  3.1%	  4.9%	 11.4%	100.0%	  2.5%	2	2	78

Norwegian (2)
1	 10.9%	 15.4%	 26.3%	 50.0%	  9.1%	2	1	10
2	  7.6%	 11.1%	 20.8%	 50.0%	  6.2%	2	1	15
3	  7.6%	 11.1%	 20.8%	 50.0%	  6.2%	2	1	15
4	  7.6%	 11.1%	 20.8%	 50.0%	  6.2%	2	1	15

Croatian (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	6
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	10
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	10
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	11

Dutch (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	11
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	19
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	24
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	25

Estonian (1)
1	 11.1%	 16.7%	 33.3%	100.0%	  9.1%	1	1	10
2	  6.8%	 10.5%	 22.7%	100.0%	  5.6%	1	1	17
3	  6.5%	 10.0%	 21.7%	100.0%	  5.3%	1	1	18
4	  6.5%	 10.0%	 21.7%	100.0%	  5.3%	1	1	18

Finnish (1)
1	 20.0%	 28.6%	 50.0%	100.0%	 16.7%	1	1	5
2	 15.2%	 22.2%	 41.7%	100.0%	 12.5%	1	1	7
3	 15.2%	 22.2%	 41.7%	100.0%	 12.5%	1	1	7
4	 15.2%	 22.2%	 41.7%	100.0%	 12.5%	1	1	7

Greek (1)
1	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0

Hmong (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0

Japanese (1)
1	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0

Kannada (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0

Latin (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0

Polish (1)
1	 23.8%	 33.3%	 55.6%	100.0%	 20.0%	1	1	4
2	 17.2%	 25.0%	 45.5%	100.0%	 14.3%	1	1	6
3	 12.2%	 18.2%	 35.7%	100.0%	 10.0%	1	1	9
4	 12.2%	 18.2%	 35.7%	100.0%	 10.0%	1	1	9

Serbian (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0

Somali (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0

Swedish (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	13
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	18
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	18
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	18

Tamil (1)
1	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0

Thai (1)
1	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
2	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
3	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0
4	100.0%	100.0%	100.0%	100.0%	100.0%	1	1	0

Uzbek (1)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	1	0	0

Albanian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	13
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	23
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	25
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	25

Czech (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	2
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	6
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	8
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	8

Danish (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	12
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	24
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	29
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	29

Hungarian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	4
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	4
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	4

Latvian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	3
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	3
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	3

Lithuanian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	4
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	8
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	10
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	10

Macedonian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	2
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	3
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	3
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	3

Romanian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	67
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	90
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	94
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	94

Ukrainian (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1

Vietnamese (0)
1	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	0
2	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	0
3	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
4	  0.0%	  0.0%	  0.0%	  0.0%	  0.0%	0	0	1
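
The percentage columns above appear to follow from the three count columns. A minimal sketch in Python, assuming recall = hits / total, precision = hits / (hits + misses) (so "misses" counts wrong identifications as that language), the standard F-beta formula, and that an undefined ratio is reported as 0.0%. The function names are illustrative and are not taken from the original evaluation scripts.

```python
def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: beta > 1 favors recall, beta < 1 favors precision."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    return (1 + b2) * precision * recall / denominator if denominator else 0.0


def scores(total: int, hits: int, misses: int) -> dict[str, float]:
    """Compute the table's percentage columns (as fractions) from its counts."""
    recall = ratio(hits, total)
    precision = ratio(hits, hits + misses)
    return {
        "f0.5": f_beta(precision, recall, 0.5),
        "f1": f_beta(precision, recall, 1.0),
        "f2": f_beta(precision, recall, 2.0),
        "recall": recall,
        "prec": precision,
    }


# English, one language considered: total=599, hits=277, misses=3.
english = scores(599, 277, 3)
print({name: f"{value:.1%}" for name, value in english.items()})
```

Under these assumptions the English row for one language considered is reproduced (80.6%, 63.0%, 51.8%, 46.2%, 98.9%), as are the all-zero rows for languages that were never reported or never present.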



Most frequent incorrect ID by language

English (599)
1	Romanian (61)	French (49)	Italian (38)	Tagalog (31)	Spanish (19)
	German (18)	Albanian (13)	Danish (12)	Swedish (12)	Dutch (11)
	Indonesian (11)	Portuguese (10)	Norwegian (9)	Estonian (7)	Croatian (5)
	Finnish (5)	Polish (4)	Lithuanian (3)	Czech (2)	Hungarian (1)
	Latvian (1)	Turkish (1)
2	Romanian (80)	French (73)	Italian (58)	Tagalog (43)	German (32)
	Spanish (29)	Danish (24)	Albanian (20)	Dutch (18)	Swedish (17)
	Portuguese (15)	Indonesian (14)	Estonian (13)	Norwegian (13)	Croatian (8)
	Finnish (7)	Czech (6)	Lithuanian (6)	Polish (6)	Hungarian (4)
	Latvian (3)	Turkish (2)	Chinese (1)
3	Romanian (84)	French (75)	Italian (65)	Tagalog (44)	German (36)
	Spanish (32)	Danish (29)	Albanian (22)	Dutch (21)	Swedish (17)
	Portuguese (16)	Indonesian (14)	Estonian (13)	Norwegian (13)	Polish (9)
	Croatian (8)	Czech (8)	Lithuanian (8)	Finnish (7)	Turkish (5)
	Hungarian (4)	Latvian (3)	Chinese (1)
4	Romanian (84)	French (75)	Italian (65)	Tagalog (45)	German (37)
	Spanish (32)	Danish (29)	Albanian (22)	Dutch (22)	Swedish (17)
	Portuguese (16)	Indonesian (15)	Estonian (13)	Norwegian (13)	Polish (9)
	Croatian (8)	Czech (8)	Lithuanian (8)	Finnish (7)	Turkish (5)
	Hungarian (4)	Latvian (3)	Chinese (1)

Spanish (43)
1	Romanian (3)	German (2)	English (1)	Italian (1)	Lithuanian (1)
	Portuguese (1)	Tagalog (1)
2	Romanian (5)	Italian (4)	French (3)	Portuguese (3)	German (2)
	Tagalog (2)	English (1)	Estonian (1)	Lithuanian (1)	Norwegian (1)
3	Italian (5)	Romanian (5)	Portuguese (4)	French (3)	Tagalog (3)
	Estonian (2)	German (2)	English (1)	Lithuanian (1)	Norwegian (1)
	Vietnamese (1)
4	Italian (5)	Romanian (5)	Portuguese (4)	French (3)	Tagalog (3)
	Estonian (2)	German (2)	Croatian (1)	English (1)	Lithuanian (1)
	Norwegian (1)	Vietnamese (1)

Chinese (20)
1	Tagalog (1)
2	Dutch (1)	Romanian (1)	Tagalog (1)
3	Dutch (1)	Romanian (1)	Tagalog (1)
4	Dutch (1)	Romanian (1)	Tagalog (1)

Portuguese (19)
1	Spanish (5)	Italian (1)	Romanian (1)
2	Spanish (7)	Italian (3)	Romanian (2)	French (1)
3	Spanish (8)	Italian (3)	Romanian (2)	French (1)
4	Spanish (8)	Italian (3)	Romanian (2)	French (1)

Arabic (10)
1	English (1)	Persian (1)
2	Albanian (1)	English (1)	Persian (1)
3	Albanian (1)	English (1)	Persian (1)
4	Albanian (1)	English (1)	Persian (1)

French (10)
1	German (2)	English (1)	Estonian (1)	Italian (1)	Romanian (1)
	Tagalog (1)
2	German (2)	Italian (2)	English (1)	Estonian (1)	Romanian (1)
	Tagalog (1)
3	German (2)	Italian (2)	English (1)	Estonian (1)	Romanian (1)
	Tagalog (1)
4	German (2)	Italian (2)	English (1)	Estonian (1)	Romanian (1)
	Tagalog (1)

Tagalog (9)
1	Indonesian (1)	Italian (1)
2	Indonesian (1)	Italian (1)	Spanish (1)
3	Dutch (1)	Indonesian (1)	Italian (1)	Spanish (1)
4	Dutch (1)	Indonesian (1)	Italian (1)	Spanish (1)

German (8)
1	Indonesian (1)	Swedish (1)
2	Indonesian (1)	Italian (1)	Swedish (1)	Tagalog (1)
3	Indonesian (1)	Italian (1)	Swedish (1)	Tagalog (1)
4	Indonesian (1)	Italian (1)	Swedish (1)	Tagalog (1)

Malay (6)
1	Indonesian (6)
2	Indonesian (6)	Italian (1)
3	Indonesian (6)	Italian (1)	Tagalog (1)
4	Indonesian (6)	Italian (1)	Tagalog (1)

Russian (5)
1	Macedonian (1)	Ukrainian (1)
2	Macedonian (1)	Ukrainian (1)
3	Macedonian (1)	Ukrainian (1)
4	Macedonian (1)	Ukrainian (1)

Turkish (5)
1	Croatian (1)	Estonian (1)
2	Croatian (1)	Estonian (1)
3	Croatian (1)	Estonian (1)
4	Croatian (1)	Estonian (1)

Indonesian (4)
1
2	Tagalog (1)
3	Tagalog (1)
4	Tagalog (1)

Persian (4)
1
2	Arabic (1)
3	Arabic (1)
4	Arabic (1)

Swahili (4)
1	Tagalog (2)	Indonesian (1)	Turkish (1)
2	Indonesian (2)	Tagalog (2)	Croatian (1)	Turkish (1)
3	Indonesian (2)	Tagalog (2)	Croatian (1)	Dutch (1)	Turkish (1)
4	Indonesian (2)	Tagalog (2)	Croatian (1)	Dutch (1)	Turkish (1)

Korean (3)
1
2	Chinese (1)
3	Chinese (1)
4	Chinese (1)

Bulgarian (2)
1	Macedonian (1)
2	Macedonian (1)
3	Macedonian (1)
4	Macedonian (1)

Norwegian (2)
1	Portuguese (1)
2	Portuguese (1)
3	Portuguese (1)
4	Portuguese (1)

Croatian (1)
1	Romanian (1)
2	Portuguese (1)	Romanian (1)
3	Portuguese (1)	Romanian (1)	Spanish (1)
4	Portuguese (1)	Romanian (1)	Spanish (1)

Dutch (1)
1	French (1)
2	French (1)
3	French (1)
4	French (1)

Estonian (1)
1
2	Lithuanian (1)
3	Lithuanian (1)
4	Lithuanian (1)

Hmong (1)
1	Estonian (1)
2	Albanian (1)	Estonian (1)
3	Albanian (1)	Estonian (1)	Indonesian (1)
4	Albanian (1)	Estonian (1)	Indonesian (1)

Latin (1)
1	Portuguese (1)
2	Portuguese (1)
3	Portuguese (1)
4	Portuguese (1)

Serbian (1)
1	Bulgarian (1)
2	Bulgarian (1)	Macedonian (1)
3	Bulgarian (1)	Macedonian (1)
4	Bulgarian (1)	Macedonian (1)

Somali (1)
1	Turkish (1)
2	Turkish (1)
3	Turkish (1)
4	Turkish (1)

Swedish (1)
1	Norwegian (1)
2	Norwegian (1)
3	Norwegian (1)
4	Norwegian (1)

Uzbek (1)
1	Turkish (1)
2	Albanian (1)	Turkish (1)
3	Albanian (1)	Turkish (1)
4	Albanian (1)	Turkish (1)
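
The per-language lists above are consistent with a tally in which every language returned within the top N guesses, other than the actual one, counts as an incorrect ID. A minimal sketch in Python with made-up sample data: `tally_incorrect` and the `(actual, guesses)` input shape are hypothetical, not the evaluation code behind this page.

```python
from collections import Counter, defaultdict


def tally_incorrect(results, n):
    """Count wrong guesses per actual language, using only the top n guesses.

    results: iterable of (actual_language, ranked_guesses) pairs.
    Returns a mapping {actual_language: Counter({wrong_guess: count})}.
    """
    confusions = defaultdict(Counter)
    for actual, guesses in results:
        for guess in guesses[:n]:
            if guess != actual:
                confusions[actual][guess] += 1
    return confusions


# Made-up sample for illustration only (not data from this page).
sample = [
    ("Malay", ["Indonesian", "Italian"]),
    ("Malay", ["Indonesian"]),
    ("English", ["English", "Romanian"]),
]

top1 = tally_incorrect(sample, 1)
top2 = tally_incorrect(sample, 2)
print(top1["Malay"].most_common())
print(top2["Malay"].most_common())
```

A query may return fewer than N guesses, which is why the counts in the lists above do not simply double from row 1 to row 2.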



Most frequent ID for non-languages

Name (361)
1	English (56)	German (38)	Tagalog (36)	Italian (32)	French (29)
	Romanian (24)	Indonesian (23)	Spanish (15)	Danish (12)	Norwegian (10)
	Turkish (10)	Albanian (9)	Croatian (9)	Swedish (9)	Dutch (8)
	Estonian (8)	Portuguese (8)	Finnish (6)	Hungarian (6)	Latvian (4)
	Lithuanian (4)	Polish (4)	Vietnamese (1)
2	English (76)	German (47)	Tagalog (47)	Italian (41)	French (38)
	Indonesian (38)	Romanian (30)	Spanish (22)	Danish (18)	Croatian (16)
	Turkish (15)	Dutch (14)	Estonian (14)	Norwegian (14)	Portuguese (14)
	Swedish (13)	Albanian (12)	Finnish (10)	Lithuanian (9)	Hungarian (8)
	Latvian (5)	Polish (4)	Vietnamese (2)	Chinese (1)
3	English (78)	Tagalog (52)	German (49)	Italian (42)	Indonesian (41)
	French (38)	Romanian (31)	Spanish (23)	Danish (18)	Croatian (17)
	Norwegian (17)	Turkish (16)	Estonian (15)	Portuguese (15)	Dutch (14)
	Swedish (14)	Albanian (13)	Finnish (11)	Lithuanian (11)	Hungarian (8)
	Latvian (5)	Polish (5)	Vietnamese (2)	Chinese (1)
4	English (78)	Tagalog (52)	German (49)	Indonesian (42)	Italian (42)
	French (38)	Romanian (31)	Spanish (25)	Danish (19)	Croatian (17)
	Norwegian (17)	Turkish (16)	Estonian (15)	Portuguese (15)	Dutch (14)
	Swedish (14)	Albanian (13)	Finnish (12)	Lithuanian (11)	Hungarian (8)
	Latvian (5)	Polish (5)	Vietnamese (2)	Chinese (1)

?? (69)
1	Indonesian (12)	English (9)	Tagalog (9)	Romanian (5)	Albanian (4)
	Dutch (4)	Italian (4)	Norwegian (3)	Portuguese (3)	Estonian (2)
	German (2)	Polish (2)	Spanish (2)	Swedish (2)	Bulgarian (1)
	Croatian (1)	Danish (1)	Finnish (1)	French (1)	Hungarian (1)
2	English (12)	Indonesian (12)	Tagalog (12)	Albanian (8)	Dutch (8)
	Romanian (5)	Croatian (4)	Italian (4)	Portuguese (4)	Estonian (3)
	German (3)	Norwegian (3)	Spanish (3)	Danish (2)	Hungarian (2)
	Polish (2)	Swedish (2)	Bulgarian (1)	Czech (1)	Finnish (1)
	French (1)	Vietnamese (1)
3	English (12)	Indonesian (12)	Tagalog (12)	Albanian (8)	Dutch (8)
	Romanian (5)	Croatian (4)	German (4)	Italian (4)	Portuguese (4)
	Estonian (3)	Norwegian (3)	Spanish (3)	Swedish (3)	Danish (2)
	Hungarian (2)	Polish (2)	Bulgarian (1)	Czech (1)	Finnish (1)
	French (1)	Turkish (1)	Vietnamese (1)
4	English (12)	Indonesian (12)	Tagalog (12)	Albanian (8)	Dutch (8)
	Romanian (5)	Croatian (4)	German (4)	Italian (4)	Portuguese (4)
	Estonian (3)	Norwegian (3)	Spanish (3)	Swedish (3)	Danish (2)
	Hungarian (2)	Polish (2)	Bulgarian (1)	Czech (1)	Finnish (1)
	French (1)	Turkish (1)	Vietnamese (1)

URL (67)
1	English (28)	Portuguese (9)	Croatian (4)	Italian (4)	Romanian (4)
	Tagalog (4)	French (3)	Chinese (2)	Dutch (2)	Hungarian (2)
	Polish (2)	Czech (1)	German (1)	Norwegian (1)
2	English (34)	Portuguese (10)	French (7)	Italian (6)	Tagalog (6)
	Romanian (5)	Croatian (4)	Polish (4)	Chinese (2)	Dutch (2)
	German (2)	Hungarian (2)	Spanish (2)	Czech (1)	Estonian (1)
	Norwegian (1)	Turkish (1)
3	English (35)	Portuguese (10)	Italian (8)	French (7)	Tagalog (6)
	Romanian (5)	Croatian (4)	Polish (4)	Chinese (2)	Dutch (2)
	German (2)	Hungarian (2)	Spanish (2)	Czech (1)	Estonian (1)
	Norwegian (1)	Turkish (1)
4	English (35)	Portuguese (10)	Italian (8)	French (7)	Tagalog (6)
	Romanian (5)	Croatian (4)	Polish (4)	Chinese (2)	Dutch (2)
	German (2)	Hungarian (2)	Spanish (2)	Czech (1)	Estonian (1)
	Norwegian (1)	Turkish (1)

Junk (46)
1	Hungarian (8)	Albanian (7)	English (6)	Indonesian (3)	Norwegian (3)
	Polish (3)	Portuguese (3)	Estonian (2)	French (2)	Swedish (2)
	Danish (1)	Dutch (1)	Finnish (1)	German (1)	Italian (1)
	Latvian (1)	Romanian (1)
2	Albanian (10)	Hungarian (8)	Dutch (6)	English (6)	Polish (5)
	Indonesian (4)	Portuguese (4)	French (3)	Italian (3)	Norwegian (3)
	Danish (2)	Estonian (2)	Swedish (2)	Tagalog (2)	Croatian (1)
	Finnish (1)	German (1)	Latvian (1)	Romanian (1)
3	Albanian (10)	English (9)	Hungarian (9)	Dutch (7)	Polish (5)
	Indonesian (4)	Portuguese (4)	French (3)	Italian (3)	Norwegian (3)
	Danish (2)	Estonian (2)	Swedish (2)	Tagalog (2)	Croatian (1)
	Czech (1)	Finnish (1)	German (1)	Latvian (1)	Romanian (1)
4	Albanian (10)	English (9)	Hungarian (9)	Dutch (7)	Polish (5)
	Indonesian (4)	Portuguese (4)	French (3)	Italian (3)	Norwegian (3)
	Danish (2)	Estonian (2)	Swedish (2)	Tagalog (2)	Croatian (1)
	Czech (1)	Finnish (1)	German (1)	Latvian (1)	Romanian (1)

DOI (33)
1	French (11)	English (5)	Croatian (3)	Romanian (3)	Albanian (2)
	Estonian (2)	German (2)	Polish (1)	Spanish (1)
2	French (13)	English (7)	Albanian (5)	Danish (4)	Croatian (3)
	German (3)	Romanian (3)	Estonian (2)	Spanish (2)	Finnish (1)
	Indonesian (1)	Polish (1)	Vietnamese (1)
3	French (13)	English (7)	Albanian (5)	Danish (4)	Croatian (3)
	German (3)	Romanian (3)	Estonian (2)	Spanish (2)	Finnish (1)
	Indonesian (1)	Polish (1)	Vietnamese (1)
4	French (13)	English (7)	Albanian (5)	Danish (4)	Croatian (3)
	German (3)	Romanian (3)	Estonian (2)	Spanish (2)	Finnish (1)
	Indonesian (1)	Polish (1)	Vietnamese (1)

User (16)
1	English (3)	Croatian (2)	Romanian (2)	Spanish (2)	French (1)
	German (1)	Indonesian (1)	Italian (1)	Latvian (1)	Polish (1)
	Tagalog (1)
2	English (4)	Tagalog (3)	Croatian (2)	Indonesian (2)	Latvian (2)
	Polish (2)	Romanian (2)	Spanish (2)	Albanian (1)	Czech (1)
	Estonian (1)	French (1)	German (1)	Italian (1)	Portuguese (1)
	Swedish (1)	Turkish (1)
3	English (4)	Tagalog (4)	Croatian (2)	Indonesian (2)	Latvian (2)
	Polish (2)	Romanian (2)	Spanish (2)	Albanian (1)	Czech (1)
	Estonian (1)	French (1)	German (1)	Italian (1)	Portuguese (1)
	Swedish (1)	Turkish (1)
4	English (4)	Tagalog (4)	Croatian (2)	Indonesian (2)	Latvian (2)
	Polish (2)	Romanian (2)	Spanish (2)	Albanian (1)	Czech (1)
	Estonian (1)	French (1)	German (1)	Italian (1)	Portuguese (1)
	Swedish (1)	Turkish (1)

Species (13)
1	Italian (3)	English (2)	French (2)	Romanian (2)	Estonian (1)
	Lithuanian (1)	Portuguese (1)	Spanish (1)
2	English (3)	French (3)	Italian (3)	Romanian (3)	Estonian (1)
	Lithuanian (1)	Portuguese (1)	Spanish (1)	Tagalog (1)
3	English (4)	French (3)	Italian (3)	Romanian (3)	Estonian (1)
	Lithuanian (1)	Portuguese (1)	Spanish (1)	Tagalog (1)
4	English (4)	French (3)	Italian (3)	Romanian (3)	Estonian (1)
	Lithuanian (1)	Portuguese (1)	Spanish (1)	Tagalog (1)

Number (12)
1	Chinese (2)	Portuguese (1)
2	Chinese (2)	Portuguese (1)
3	Chinese (2)	Portuguese (1)
4	Chinese (2)	Portuguese (1)

DevTrans (11)
1	Indonesian (3)	Tagalog (3)	Albanian (1)	Estonian (1)	Latvian (1)
	Portuguese (1)	Romanian (1)
2	Indonesian (4)	Tagalog (3)	Albanian (1)	Croatian (1)	Estonian (1)
	Latvian (1)	Portuguese (1)	Romanian (1)
3	Indonesian (4)	Tagalog (3)	Portuguese (2)	Albanian (1)	Croatian (1)
	Estonian (1)	Latvian (1)	Romanian (1)
4	Indonesian (4)	Tagalog (3)	Portuguese (2)	Albanian (1)	Croatian (1)
	Estonian (1)	Latvian (1)	Romanian (1)

None (10)
1	English (4)	Albanian (1)	Danish (1)	French (1)	Portuguese (1)
	Romanian (1)
2	English (4)	Albanian (1)	Danish (1)	French (1)	German (1)
	Hungarian (1)	Portuguese (1)	Romanian (1)
3	English (4)	Albanian (1)	Danish (1)	French (1)	German (1)
	Hungarian (1)	Portuguese (1)	Romanian (1)
4	English (4)	Albanian (1)	Danish (1)	French (1)	German (1)
	Hungarian (1)	Portuguese (1)	Romanian (1)

OCR (10)
1	English (2)	Albanian (1)	Danish (1)	Dutch (1)	Estonian (1)
	German (1)	Italian (1)	Romanian (1)	Turkish (1)
2	Dutch (3)	English (2)	Estonian (2)	Romanian (2)	Albanian (1)
	Danish (1)	French (1)	German (1)	Italian (1)	Spanish (1)
	Turkish (1)
3	Dutch (3)	English (2)	Estonian (2)	Romanian (2)	Albanian (1)
	Danish (1)	French (1)	German (1)	Indonesian (1)	Italian (1)
	Spanish (1)	Turkish (1)
4	Dutch (3)	English (2)	Estonian (2)	Romanian (2)	Albanian (1)
	Danish (1)	French (1)	German (1)	Indonesian (1)	Italian (1)
	Spanish (1)	Turkish (1)

Mixed (6)
1	Indonesian (3)	English (2)	German (1)
2	Indonesian (3)	English (2)	Albanian (1)	Chinese (1)	German (1)
3	Indonesian (3)	English (2)	Albanian (1)	Arabic (1)	Chinese (1)
	German (1)
4	Indonesian (3)	English (2)	Albanian (1)	Arabic (1)	Chinese (1)
	German (1)

Emoji (5)
1	Hebrew (2)
2	Hebrew (2)
3	Hebrew (2)
4	Hebrew (2)

Linked (4)
1	Chinese (1)	Croatian (1)	English (1)	Tagalog (1)
2	Chinese (2)	Tagalog (2)	Croatian (1)	English (1)	French (1)
3	Chinese (2)	Tagalog (2)	Croatian (1)	English (1)	French (1)
4	Chinese (2)	Tagalog (2)	Croatian (1)	English (1)	French (1)

TamilTrans (4)
1	Indonesian (1)	Lithuanian (1)	Romanian (1)	Tagalog (1)
2	Indonesian (2)	Tagalog (2)	Lithuanian (1)	Romanian (1)	Turkish (1)
3	Indonesian (2)	Tagalog (2)	Finnish (1)	Lithuanian (1)	Romanian (1)
	Turkish (1)
4	Indonesian (2)	Tagalog (2)	Finnish (1)	Lithuanian (1)	Romanian (1)
	Turkish (1)

Abbrev (2)
1	English (1)	Indonesian (1)
2	English (1)	Indonesian (1)	Italian (1)
3	English (1)	Indonesian (1)	Italian (1)
4	English (1)	Indonesian (1)	Italian (1)

Email (2)
1	Indonesian (1)	Portuguese (1)
2	Indonesian (1)	Portuguese (1)	Romanian (1)
3	Portuguese (2)	Indonesian (1)	Romanian (1)
4	Portuguese (2)	Indonesian (1)	Romanian (1)

GreekTrans (1)
1	Latvian (1)
2	Croatian (1)	Latvian (1)
3	Croatian (1)	Latvian (1)
4	Croatian (1)	Latvian (1)

JapanTrans (1)
1	Croatian (1)
2	Croatian (1)	Lithuanian (1)
3	Croatian (1)	Lithuanian (1)	Tagalog (1)
4	Croatian (1)	Lithuanian (1)	Tagalog (1)

KoreanTrans (1)
1	Estonian (1)
2	Estonian (1)
3	Estonian (1)
4	Estonian (1)

Symbol (1)
1	French (1)
2	French (1)
3	French (1)
4	French (1)

TamilTran (1)
1	English (1)
2	English (1)	Portuguese (1)
3	English (1)	Portuguese (1)	Romanian (1)
4	English (1)	Portuguese (1)	Romanian (1)