Difference between revisions of "NERSSEAL 2008"

From Openresearch
{{Event
| Acronym = NERSSEAL 2008
 
| Title = IJCNLP Workshop on NER for South and South East Asian Languages
 
| Type = Workshop
 
| Series =
 
| Field = Linguistics
 
| Homepage = ltrc.iiit.ac.in/ner-ssea-07
 
| Start date = Jan 12, 2008
 
| End date =  Jan 12, 2008
 
| City= Hyderabad
 
| State =
 
| Country =  India
 
| Abstract deadline =
 
| Submission deadline = Sep 21, 2007
 
| Notification =
 
| Camera ready =
 
}}
 
 
 
<pre>
 
Call for Papers
 
 
 
Papers are invited on substantial, original, and unpublished research on all aspects of Named Entity Recognition (NER) for South and South East Asian (SSEA) languages. At least one of the languages considered should be an SSEA language. We also invite researchers to be contestants in a shared task (the second track of the workshop) on NER for SSEA languages.
 
Background and Motivation
 
 
 
Most of the SSEA languages are scarce in resources as well as tools, and NER systems are no exception. It is very important that good systems for NER be available, because many problems in information extraction and machine translation (among others) are dependent on accurate NER. However, the issues involved are significantly different for these languages from those for European languages or even East Asian languages. For example, these languages do not have capitalization, which is a major feature for NER systems for European languages.
 
 
 
Another similarity among these languages is that most of them use scripts of Brahmi origin. For some languages, there are additional issues like word segmentation (e.g. for Thai). Large gazetteers are not available for most of these languages. There is also the problem of lack of standardization and spelling variation. The number of frequently used words which can also be used as names is very large for many languages, unlike European languages where a larger proportion of the first names are not used as common words. And most importantly, there is a serious lack of labeled data for machine learning.
 
Scope
 
 
 
This workshop will be the second stage of an annual event called NLPAI Machine Learning Contest, which focuses on the application of machine learning techniques to one major NLP problem every year. This year the problem was NER. However, unlike that event, this workshop will have one track for regular research papers on NER for SSEA languages, and the second track will be along the lines of a shared task.
 
Shared Task
 
 
 
In the shared task, the contestants having their own NER systems will be given some annotated test data. The participating systems will be ranked according to their performance on the test data. There may or may not be training data for a particular language. In either case, the contestants will have the freedom to use any technique for NER, e.g. a purely rule based technique or a purely statistical technique.
 
 
 
At present some data is available for Hindi, Bengali and Telugu for the shared task. Other languages can be included in the contest provided data for them becomes available. The data released for the shared task will be made accessible to all researchers, not just the participants.
 
 
 
If the language you are interested in has not been included in the shared task, you can also prepare the annotated test data and submit it to us. We will then include that language in the shared task.
 
 
 
The task in this contest will be different in one important way. The NER systems also have to identify nested named entities. For example, in the sentence 'The Lal Bahadur Shastri National Academy of Administration is located in Mussoorie', 'Lal Bahadur Shastri' is a Person, but 'Lal Bahadur Shastri National Academy of Administration' is an Organization. In this case, the NER systems will have to identify both 'Person' and 'Organization' in the given sentence.
 
Submission
 
 
 
Paper submission is through the centralized workshop submission page. Papers have to be written in English. Note that shared task contestants also have to submit a paper describing their method and the results etc. Long or short papers can be submitted to either of the tracks. Long papers can be up to 8 pages long, while the maximum length for short papers is 5 pages (including references, figures, tables etc.). All selected papers will be published in the workshop proceedings.
 
 
 
The papers should be formatted using the LaTeX styles or MS Word templates recommended for the main IJCNLP conference. These documents are available here. Reviewing will be blind. The draft papers should not contain any information that can identify the authors, as far as possible.
 
Important Dates
 
 
 
    * Release of Training and Development Data: Aug 2 to Aug 25, 2007 (for different languages)
 
    * Release of Test Data: Sept 13, 2007
 
    * Annotated Test Data Submission Deadline: Sept 15, 2007
 
    * Paper Submission Deadline: Sept 21, 2007
 
    * Notification of Paper Acceptance: Oct 26, 2007
 
    * Camera Ready Submission Deadline: Nov 16, 2007
 
 
 
Note: There is no separate registration for the shared task (the contest). You will be a contestant if you submit the annotated test data by the deadline mentioned above.
 
Program Committee
 
Rajeev Sangal, IIIT, Hyderabad, India
 
 
 
Dekai Wu, The Hong Kong University of Science & Technology, Hong Kong
 
 
 
Ted Pedersen, University of Minnesota, USA
 
 
 
Dipti Misra Sharma, IIIT, Hyderabad, India
 
 
 
Virach Sornlertlamvanich, TCL, NICT, Thailand
 
 
 
M. Sasikumar, CDAC, Mumbai, India
 
 
 
Sudeshna Sarkar, Indian Institute of Technology, Kharagpur, India
 
 
 
Thierry Poibeau, CNRS, France
 
 
 
Sobha L., AU-KBC, Chennai, India
 
 
 
Tzong-Han Tsai, National Taiwan University, Taiwan
 
 
 
Prasad Pingali, IIIT, India
 
 
 
Canasai Kreungkrai, NICT, Japan
 
 
 
Manabu Sassano, Yahoo Japan Corporation, Japan
 
 
 
Anil Kumar Singh, IIIT, Hyderabad, India
 
 
 
Doaa Samy, Universidad Autónoma de Madrid, Spain
 
 
 
Ratna Sanyal, Indian Inst. of Inf. Tech., Allahabad, India
 
 
 
V. Sriram, IIIT, Hyderabad, India
 
 
 
Anagha Kulkarni, Carnegie Mellon University, USA
 
 
 
Soma Paul, IIIT, Hyderabad, India
 
 
 
Contact Persons
 
Dipti Misra Sharma, Rajeev Sangal, Anil Kumar Singh
 
Language Technologies Research Centre
 
International Institute of Information Technology
 
Gachibowli, Hyderabad, India
 
 
 
Phone: 91-40-23001412, 91-40-23001967/9 Extension 144
 
Fax: 91-40-23001413
 
Email: dipti@iiit.ac.in, sangal@iiit.ac.in, anil@research.iiit.ac.in
 
</pre>This CfP was obtained from [http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=6&copyownerid=9 WikiCFP]
 
[[Category:Natural language processing]]
 

Latest revision as of 20:05, 13 October 2012
