This website is dedicated to Natural Language Processing (NLP) and Computational Linguistics (CL) work on South and Southeast Asian languages. Here you will find online systems for these languages, computational resources, a comprehensive contact list of people working on these languages, and more.

South and Southeast Asian Region and its Languages

South Asia comprises Afghanistan, Bangladesh, Bhutan, India, Maldives, Nepal, Pakistan and Sri Lanka. Southeast Asia consists of Burma, Cambodia, Laos, Thailand, Vietnam, Malaysia, Brunei, East Timor, Indonesia, the Philippines and Singapore. The following table gives the population and the number of living languages for each country in the two regions.

Sr.   Country          Population   Living Languages
 1    India         1,134,403,000   438
 2    Indonesia       248,496,420   719
 3    Pakistan        158,081,000   72
 4    Bangladesh      153,281,000   42
 5    Viet Nam         85,029,000   106
 6    Philippines      84,566,000   171
 7    Thailand         63,003,000   74
 8    Burma            47,967,000   111
 9    Nepal            27,094,000   124
10    Malaysia         25,653,000   137
11    Sri Lanka        19,094,000   7
12    Cambodia         13,511,970   23
13    Afghanistan      12,164,970   52
14    Singapore         4,327,000   21
15    Laos              2,796,000   84
16    East Timor        1,067,000   19
17    Bhutan              637,000   25
18    Brunei              374,000   15
19    Maldives            359,000   1
      Total         2,081,904,360   2241

Source: Lewis (2009)

Table 1: Population and Number of Living Languages of South and Southeast Asia

The 2241 languages counted in Table 1 belong to several language families, including Indo-Aryan, Indo-Iranian, Dravidian, Sino-Tibetan, Austro-Asiatic, Kra-Dai and Hmong-Mien. Together, South Asia and Southeast Asia account for 34.94% of the world's population. Some languages of these regions have a very large number of native speakers, among them Hindi (5th largest in the world by number of native speakers), Bengali (6th), Punjabi (12th), Tamil (18th) and Urdu (20th).
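The totals in Table 1 can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic: the figures are copied from the table, the regional grouping follows the country lists given above, and the names TABLE_1, SOUTH_ASIA and totals are illustrative assumptions, not part of any system on this site.

```python
# Rows copied from Table 1: (country, population, living languages).
TABLE_1 = [
    ("India", 1_134_403_000, 438),
    ("Indonesia", 248_496_420, 719),
    ("Pakistan", 158_081_000, 72),
    ("Bangladesh", 153_281_000, 42),
    ("Viet Nam", 85_029_000, 106),
    ("Philippines", 84_566_000, 171),
    ("Thailand", 63_003_000, 74),
    ("Burma", 47_967_000, 111),
    ("Nepal", 27_094_000, 124),
    ("Malaysia", 25_653_000, 137),
    ("Sri Lanka", 19_094_000, 7),
    ("Cambodia", 13_511_970, 23),
    ("Afghanistan", 12_164_970, 52),
    ("Singapore", 4_327_000, 21),
    ("Laos", 2_796_000, 84),
    ("East Timor", 1_067_000, 19),
    ("Bhutan", 637_000, 25),
    ("Brunei", 374_000, 15),
    ("Maldives", 359_000, 1),
]

# South Asian countries as listed in the text; every other row is Southeast Asia.
SOUTH_ASIA = {
    "Afghanistan", "Bangladesh", "Bhutan", "India",
    "Maldives", "Nepal", "Pakistan", "Sri Lanka",
}


def totals(rows):
    """Return (total population, sum of per-country language counts)."""
    population = sum(pop for _, pop, _ in rows)
    languages = sum(langs for _, _, langs in rows)
    return population, languages


total_population, total_languages = totals(TABLE_1)
print(total_population)  # 2081904360
print(total_languages)   # 2241

# Split the same rows by region.
south = totals([row for row in TABLE_1 if row[0] in SOUTH_ASIA])
southeast = totals([row for row in TABLE_1 if row[0] not in SOUTH_ASIA])
print(south, southeast)
```

Both printed totals agree with the Total row of Table 1.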

WHAT'S NEW

5th WSSANLP 2014

The 5th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP) was successfully held as a co-located event at COLING 2014. Prof. Christian Boitet, University of Grenoble, France, chaired the workshop.