IJCAI 2007 Workshop on Analytics for Noisy Unstructured Text Data (AND2007)

Venue: IIIT Hyderabad

Location: Hyderabad, India

Event Date/Time: Jan 08, 2007
End Date/Time: Jan 08, 2007
Paper Submission Date: Sep 25, 2006

Noisy unstructured text data is found in informal settings such as online chat, SMS, emails, message boards, newsgroups, blogs, wikis and web pages. Text produced by processing spontaneous speech, printed text, or handwritten text also contains processing noise. Text produced under such circumstances is typically highly noisy, containing spelling errors, abbreviations, non-standard words, false starts, repetitions, missing punctuation, missing case information, and pause-filling words such as “um” and “uh.” Such text is found in large amounts in contact centers, online chat rooms, OCRed text documents, SMS corpora, etc. The theme of the IJCAI 2007 Conference is "AI and its benefits to society." In keeping with this theme, this workshop proposes to look at text analytics for the highly noisy text produced in such everyday applications in society.

The goal of the workshop is to focus on the problems encountered in analyzing such noisy documents coming from various sources. The nature of the text warrants moving beyond traditional text analytics techniques. We hope that the workshop will allow researchers to present current research and development addressing this challenge. We also believe that the workshop will lead to the sharing of real-life noisy data sets, making them available to a wider research community.