This web page: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm
Semantic Analysis in Language Technology (Autumn 2013)
SAIS: Semantisk Analys I Språkteknologi
Last Updated: 10 Feb 2014 (the end)
Nov 12 (Tue) | Lec 01 | 10-12 | 9-2042 (Turing)
Introduction (short: pdf; long: slideshare)
* Jurafsky and Martin (2009: chs. 17-18)
* Clark et al. (2010: ch. 15)
* Indurkhya and Damerau (2010: ch. 5)

Nov 14 (Thu) | Lec 02 | 10-12 | 9-2042 (Turing)
Digression:
- Academia & Professional Profiling
- Get inspired: Job Opportunities (pdf, slideshare)
Ass 1: Essay Assignment (pdf, slideshare)
From Semantics to Semantic-Oriented Applications (pdf, slideshare)
* Aristotelian Logic (wikipedia)
* Predicate Logic (wikipedia)
* turkey sandwich vs. hard disk Q&A
* Big Data Video
* Baseline classifier for SA (video, pdf) (D. Jurafsky at Coursera)

Nov 19 (Tue) | Lec 03 | 10-12 | 9-2042 (Turing)
Ass 2: Sentiment Analysis Assignment (pdf)
Sentiment Analysis (SA) I
-- What is SA? (pdf)
-- Sentiment Analysis and Opinion Mining (Full Tutorial)
* Sentiment Analysis (wikipedia)
* Liu (2012): Ch 1 - Ch 3
* CyberEmotions -- EU Project
* CyberEmotions -- Video Summary

Nov 21 (Thu) | Lec 04 | 10-12 | 9-2042 (Turing)
Sentiment Analysis II (pdf, slideshare)
PT1-Practical Activity: MoodIndex App
* Text Categorization (wikipedia)
* Liu (2012) Ch 5, 6
* Liu (2011) Ch 11

Nov 26 (Tue) | Lec 05 | 10-12 | 9-2042 (Turing)
Ass 3: Word Sense Disambiguation Assignment (pdf)
Word Sense Disambiguation (WSD) (I)
-- From Sentiment Analysis to WSD (pdf)
-- Word Senses & Relations (D. Jurafsky) (pdf)
-- WordNet and Other Thesauri (D. Jurafsky) (pdf)
PT2-Practical Activity: Manual Disambiguation
* Jurafsky and Martin (2009: ch. 19)
* Mats Dahllöf (2012): pdf

Nov 28 (Thu) | Lec 06 | 10-12 | 9-2042 (Turing)
-- Lexical Resources: the war of names (pdf)
Computational WSD and Word Similarity (II)
-- Thesaurus Methods (D. Jurafsky) (pdf, video)
-- Distributional Methods (D. Jurafsky) (pdf, video)
* Jurafsky and Martin (2009: ch. 20)

Dec 03 (Tue) | Lec 07 | 10-12 | 9-2042 (Turing)
Ass 4: Semantic Role Labelling/Predicate-Argument Structure (SRL/PAS) (pdf)
PT3-Practical Activity: Senses in WordNet
PT4-Practical Activity: Selectional Restrictions
* Jurafsky and Martin (2009: 19.4.6)

Dec 05 (Thu) | Lec 08 | 10-12 | 9-2042 (Turing)
SRL/PAS (II) (pdf)
PT5-Practical Activity: FrameNet & PropBank
* Jurafsky and Martin (2009: 19.4)
* Jurafsky and Martin (2009: 20.9)
* Palmer et al. (2005)
* PropBank (wikipedia)
* FrameNet (wikipedia)

Dec 10 (Tue) | Lec 09 | 10-12 | 9-2042 (Turing)
Questions and Answers (pdf)
PT6: SAIS-SensEval
* Semeval (Senseval) (wikipedia)

Dec 12 (Thu) | Lec 10 | Oral Presentations

Dec 17 (Tue) | Lec 11 | 10-12 | 9-2042 (Turing)
Final Discussion and Wrap-Up (pdf)
Further Explanation of Ass2, SA (pdf)

Jan 20 (Mon) | 2014-01-20 | Deadline: Essay Assignment

Intended learning outcomes
In order to pass the course, a student must be able to:
describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results:
- disambiguate instances of polysemous lemmas [word sense disambiguation, WSD];
- extract semantic roles and predicate-argument structure [SRL/PAS];
- detect and extract attitudes and opinions from text [sentiment analysis, SA].
Assignments and Examination
- Essay assignment: This involves a more independent study of a system, an approach, or a field within semantics-oriented language technology. The study will be presented both as a written essay and as an oral presentation. The essay work also includes a feedback step in which the work of another group is reviewed. (This is intended as training for larger essay projects later on.)
- Assignment on Semantic Role Labelling/Predicate-Argument Structure (SRL/PAS).
- Assignment on Sentiment Analysis (SA).
- Assignment on Word Sense Disambiguation (WSD).
Grade G will be given to students who pass each assignment. Grade VG will be given to those who pass the essay assignment and at least one of the other assignments with distinction.
Compulsory Readings
Additional material will be used, in particular for the essay assignment.
* Bing Liu (2012) Sentiment Analysis and Opinion Mining, Morgan & Claypool. This title is available to students, staff, and faculty of Uppsala University as part of the Library's purchase of Synthesis Collection 5. Students can freely download the book from any campus-associated IP address using the links below: link 1 or link 2
* Bing Liu's Tutorial slides.
* Bing Liu (2011) Web Data Mining, Second Edition. Springer. Online Copy at the Library
* Daniel Jurafsky and James H. Martin (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second Edition, Pearson Education.
* M Palmer, D Gildea, P Kingsbury. 2005. The proposition bank: An annotated corpus of semantic roles, Computational Linguistics 31 (1), 71-106.
Additional Suggested Readings
Daniel Gildea and Daniel Jurafsky. 2002. Automatic Labeling of Semantic Roles, Computational Linguistics 28:3, 245-288.
Richard Johansson and Pierre Nugues. 2008. Dependency-based Syntactic–Semantic Analysis with PropBank and NomBank. CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning.
Clark A., Fox C. and Lappin S. (eds.) (2010). The Handbook of Computational Linguistics and Natural Language Processing. Blackwell Publishing. Online (web, this website)
Indurkhya N. and Damerau F. (eds) (2010). Handbook of Natural Language Processing, Chapman and Hall/CRC, Second Edition. Google eBook preview.
Moshfeghi, Yashar (2012) Role of emotion in information retrieval - PhD Thesis.
Alexander Osherenko, Opinion mining and lexical affect sensing. Computer-aided analysis of opinions and emotions in texts. PhD thesis published by SVH, 2010 (p. 255)
Demos, etc.
Semantic role labeling/predicate argument structure analysis
List of SRL/PAS systems:
- SENNA
SENNA (NEC machine learning department) is software distributed under a non-commercial license. It outputs a range of Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), named entity recognition (NER), semantic role labeling (SRL) and syntactic parsing (PSG). Focus on the semantic role labeling (SRL) component. Web page: http://ml.nec-labs.com/senna/
- The GATE Predicate-Argument EXtractor Component (PAX)
As the Predicate-Argument EXtractor (The Semantic Software Lab, Canada) comes in the form of a GATE component, you will need GATE itself. Most of the required pre-processing components are included in the GATE distribution. Web page: http://www.semanticsoftware.info/pax
- Enju
Enju (University of Tokyo) also gives a predicate-argument structure analysis as output. Enju is installed on the University's Linux network. There is also an online demo, but it is temporarily unavailable, so if you wish to test this system, use the version installed at the University. The argument labels are described here.
- Illinois Semantic Role Labeler (SRL) Demo.
The Illinois Semantic Role Labeler (University of Illinois at Urbana-Champaign) is a machine-learning-based tool that identifies shallow semantic information in a given sentence. The tool labels verb-argument structure following the notation defined by the PropBank project, identifying who did what to whom by assigning roles that indicate the agent, patient, and theme of each verb to the constituents of the sentence representing entities related by the verb. The system is trained on PropBank sections 02-21, the training data used in the CoNLL-2005 shared task. Web page: http://cogcomp.cs.illinois.edu/page/software_view/SRL
- SwiRL: The Semantic Role Labeler
SwiRL (by Mihai Surdeanu, University of Arizona) is a Semantic Role Labeling (SRL) system for English built on top of a full syntactic analysis of the text. The syntactic analysis is performed using Eugene Charniak's parser (included in this package). SwiRL trains one classifier for each argument label using a rich set of syntactic and semantic features; the classifiers are one-vs-all AdaBoost classifiers. Web page: http://www.surdeanu.info/mihai/swirl/
- Semantic role labeller (Richard Johansson, Lunds universitet). Online demo.
- LinGO English Resource Grammar. Online demo.
- FrameNet.dk.
- Proposition Bank.
- Olga Babko-Malaya 2005. PropBank Annotation Guidelines.
Unified Verb Index (find PropBank analyses of verbs).
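
To make the PropBank-style notation mentioned above more concrete (numbered arguments such as ARG0 and ARG1, plus modifiers such as ARGM-TMP), here is a minimal sketch in Python of how a labelled predicate-argument structure could be represented. The sentence and its annotation are hand-made for illustration, and the names `Argument`, `Frame`, and `render` are hypothetical; none of this is the output format or API of any system listed in this section.

```python
from dataclasses import dataclass, field


@dataclass
class Argument:
    label: str  # PropBank-style label, e.g. "ARG0", "ARG1", "ARGM-TMP"
    start: int  # index of the first token of the span (inclusive)
    end: int    # index just past the last token of the span (exclusive)


@dataclass
class Frame:
    predicate_index: int                      # position of the verb in the sentence
    arguments: list = field(default_factory=list)


def render(tokens, frame):
    """Map each role label to the text of its span: who did what to whom."""
    roles = {"V": tokens[frame.predicate_index]}
    for arg in frame.arguments:
        roles[arg.label] = " ".join(tokens[arg.start:arg.end])
    return roles


# Hand-annotated example sentence (an assumption, not system output).
tokens = "The committee approved the proposal yesterday".split()
frame = Frame(
    predicate_index=2,
    arguments=[
        Argument("ARG0", 0, 2),      # the approver
        Argument("ARG1", 3, 5),      # the thing approved
        Argument("ARGM-TMP", 5, 6),  # temporal modifier
    ],
)
roles = render(tokens, frame)
for label, text in roles.items():
    print(f"{label}: {text}")
```

When comparing the systems listed above, it can help to convert each system's output into one shared structure like this before evaluating, so that differences in labels and span boundaries are easy to see.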
Information extraction
ANNIE – Information extraction system, University of Sheffield.
Open Information Extraction, University of Washington.
START, a Web-based question answering system, MIT.
Sentiment analysis – document level
Sentiment Analysis Online demo. Python NLTK Text Classification.
Lexalytics, sentiment analysis.
TrustYou Labs, statistical sentiment analysis.
Sentiment analysis – Twitter data
- Sentiment140 – "Search by product or brand. Discover the Twitter sentiment".
- Tweetfeel – "Real-time Twitter search with feelings using insanely complex sentiment analysis".
- Tweettone – "Real-time twitter sentiment analysis".
- SentiStrength – "Multi-lingual twitter sentiment analysis".
- SentimentAnalyzer – "English, German, French".
- A list of Twitter Sentiment Analysis Tools.
Sentiment analysis – more
Sentiment Tutorial. LingPipe – tool kit for processing text using computational linguistics (written in Java).
How to build your own Twitter Sentiment Analysis Tool