Department of Linguistics and Philology
Uppsala University

This web page: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm

Semantic Analysis in Language Technology (Autumn 2013)

SAIS: Semantisk Analys I Språkteknologi 

Credits: 7.5 hp (higher education credits)
Syllabus: 5LN456
Teacher: Marina Santini

Previous course by Mats Dahllöf


Last Updated: 10 Feb 2014 (the end)

Nov, 12
Lec 01
Introduction (short: pdf; long: slideshare)
* Jurafsky and Martin (2009: chs. 17-18)
* Clark et al. (2010: ch. 15)
* Indurkhya and Damerau (2010: ch. 5)
Nov, 14
Lec 02
- Academia & Professional Profiling
- Get inspired: Job Opportunities
(pdf, slideshare)

Ass 1: Essay Assignment (pdf, slideshare)
  • First submission: 5 Dec
  • Final submission: 20 Jan

From Semantics to Semantic-Oriented Applications (pdf, slideshare)
* Aristotelian Logic (wikipedia)
* Predicate Logic (wikipedia)
* turkey sandwich vs. hard disk Q&A

* Big Data Video
* Baseline classifier for SA (video, pdf; D. Jurafsky at Coursera)
Nov, 19
Lec 03
Ass 2: Sentiment Analysis Assignment (pdf)
  • Submission: 22 Dec

Sentiment Analysis (SA) I
-- What is SA? (pdf)
-- Sentiment Analysis and Opinion Mining (Full Tutorial)
* Sentiment Analysis (wikipedia)
* Liu (2012): Ch 1 - Ch 3
* CyberEmotions -- EU Project
* CyberEmotions -- Video Summary
Nov, 21
Lec 04
Sentiment Analysis II (pdf, slideshare)

PT1-Practical Activity: MoodIndex App
* Text Categorization (wikipedia)
* Liu (2012) Ch 5, 6
* Liu (2011) Ch 11
Nov, 26
Lec 05
Ass 3: Word Sense Disambiguation Assignment (pdf)
  • Submission: 7 Jan

Word Sense Disambiguation (WSD) (I)
-- From Sentiment Analysis to WSD (pdf)
-- Word Senses & Relations (D. Jurafsky) (pdf)
-- WordNet and Other Thesauri (D. Jurafsky) (pdf)

PT2-Practical Activity: Manual Disambiguation
* Jurafsky and Martin (2009: ch. 19)
* Mats Dahllöf (2012): pdf
Nov, 28
Lec 06
-- Lexical Resources: the war of names (pdf)

Computational WSD and Word Similarity (II)
-- Thesaurus Methods (D. Jurafsky) (pdf, video)
-- Distributional Methods (D. Jurafsky) (pdf, video)
* Jurafsky and Martin (2009: ch. 20)
Dec, 03
Lec 07
Ass 4: Semantic Role Labelling/Predicate-Argument Structure (SRL/PAS) (pdf)
  • Submission: 13 Jan
-- SRL/PAS (I) (pdf)

PT3-Practical Activity: Senses in WordNet

PT4-Practical Activity: Selectional Restrictions
* Jurafsky and Martin (2009: 19.4.6)
Dec, 05
Lec 08
SRL/PAS (II) (pdf)

PT5-Practical Activity: FrameNet & PropBank
* Jurafsky and Martin (2009: 19.4)
* Jurafsky and Martin (2009: 20.9)
* Palmer et al. (2005)
* PropBank (wikipedia)
* FrameNet (wikipedia)
Dec, 10
Lec 09
Questions and Answers (pdf)
PT6: SAIS-SensEval
* SemEval (Senseval) (wikipedia)
Dec, 12
Lec 10
Oral Presentations
Dec, 17
Lec 11
Final Discussion and Wrap-Up (pdf)

Further Explanation of Ass2, SA (pdf)
Jan, 20
2014-01-20: Deadline: Essay Assignment

Intended learning outcomes

In order to pass the course, a student must be able to:

describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results:

Assignments and Examination

Grade G (pass) is given to students who pass every assignment. Grade VG (pass with distinction) is given to those who pass the essay assignment and at least one of the other assignments with distinction.

Compulsory Readings

Additional material will be used, in particular for the essay assignment.

* Bing Liu (2012) Sentiment Analysis and Opinion Mining, Morgan & Claypool.

This title is available to students, staff, and faculty of Uppsala University as part of the Library's purchase of Synthesis Collection 5. Students can freely download the book from any campus-associated IP address using the links below: link 1 or link 2

* Bing Liu's Tutorial slides.
* Bing Liu (2011) Web Data Mining, Second Edition. Springer. Online Copy at the Library

* Daniel Jurafsky and James H. Martin (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second Edition, Pearson Education.

* M Palmer, D Gildea, P Kingsbury. 2005. The proposition bank: An annotated corpus of semantic roles, Computational Linguistics 31 (1), 71-106.

Additional Suggested Readings

Daniel Gildea and Daniel Jurafsky. 2002. Automatic Labeling of Semantic Roles, Computational Linguistics 28:3, 245-288.

Richard Johansson and Pierre Nugues. 2008. Dependency-based Syntactic–Semantic Analysis with PropBank and NomBank. CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning.

Clark A., Fox C. and Lappin S. (eds.) (2010). The Handbook of Computational Linguistics and Natural Language Processing. Blackwell Publishing. Online (web, this website)

Indurkhya N. and Damerau F. (eds) (2010). Handbook of Natural Language Processing, Chapman and Hall/CRC, Second Edition. Google eBook preview.

Moshfeghi, Yashar (2012) Role of emotion in information retrieval - PhD Thesis.

Alexander Osherenko, Opinion mining and lexical affect sensing. Computer-aided analysis of opinions and emotions in texts. PhD thesis published by SVH, 2010 (p. 255)

Demos, etc.

Semantic role labeling/predicate argument structure analysis

List of SRL/PAS systems:
  1. SENNA

    SENNA (NEC machine learning department) is software distributed under a non-commercial license. It outputs a range of Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), named entity recognition (NER), semantic role labeling (SRL) and syntactic parsing (PSG). Focus on the semantic role labeling (SRL) component. Web page: http://ml.nec-labs.com/senna/

  2. The GATE Predicate-Argument EXtractor Component (PAX)

    The Predicate-Argument EXtractor (The Semantic Software Lab, Canada) comes in the form of a GATE component, so you will need GATE itself. Most of the required pre-processing components are included in the GATE distribution. Web page: http://www.semanticsoftware.info/pax

  3. Enju.

    Enju (University of Tokyo) also gives a predicate-argument structure analysis as output. Enju is installed on the University's Linux network. There is also an online demo, but it is temporarily unavailable, so if you wish to test this system, use the version installed at the University. The argument labels are described here.

  4. Illinois Semantic Role Labeler (SRL) Demo.

    Semantic Role Labeler (University of Illinois at Urbana-Champaign) is a machine-learning-based tool that identifies shallow semantic information in a given sentence. The tool labels verb-argument structure following the notation defined by the PropBank project, identifying who did what to whom by assigning roles that indicate the agent, patient, and theme of each verb to constituents of the sentence representing entities related by the verb. The system learns to analyze a sentence using PropBank sections 02-21 as the training data, as used in the CoNLL-2005 shared task. Web page: http://cogcomp.cs.illinois.edu/page/software_view/SRL

  5. SwiRL: The Semantic Role Labeler

    SwiRL (by Mihai Surdeanu, University of Arizona) is a Semantic Role Labeling (SRL) system for English constructed on top of full syntactic analysis of text. The syntactic analysis is performed using Eugene Charniak's parser (included in this package). SwiRL trains one classifier for each argument label using a rich set of syntactic and semantic features. The classifiers are trained one-vs-all with AdaBoost. Web page: http://www.surdeanu.info/mihai/swirl/

  6. Semantic role labeller (Richard Johansson, Lund University). Online demo.
  7. LinGO English Resource Grammar. Online demo.
  8. FrameNet.dk.
  9. Proposition Bank.
  10. Olga Babko-Malaya 2005. PropBank Annotation Guidelines.
  11. Unified Verb Index (find PropBank analyses of verbs).
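
To make the "who did what to whom" idea in the system descriptions above concrete, here is a minimal Python sketch of how a PropBank-style predicate-argument analysis of one sentence could be stored and printed. The class names (Argument, Proposition) and the render function are hypothetical names chosen for this illustration and do not come from any of the listed systems. The annotation is written by hand and is not the output of a labeler.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Argument:
    role: str              # PropBank-style label, e.g. "ARG0" or "ARGM-TMP"
    span: Tuple[int, int]  # token offsets, start inclusive, end exclusive


@dataclass
class Proposition:
    predicate_index: int   # position of the verb in the token list
    arguments: List[Argument]


def render(tokens: List[str], prop: Proposition) -> str:
    """Return a one-line 'predicate(ROLE: text, ...)' view of a proposition."""
    parts = []
    for arg in prop.arguments:
        start, end = arg.span
        parts.append(f"{arg.role}: {' '.join(tokens[start:end])}")
    return f"{tokens[prop.predicate_index]}({', '.join(parts)})"


tokens = "Mary gave John a book yesterday".split()

# Hand-written annotation for illustration only (assumed, not system output).
prop = Proposition(
    predicate_index=1,
    arguments=[
        Argument("ARG0", (0, 1)),      # the giver
        Argument("ARG2", (2, 3)),      # the recipient
        Argument("ARG1", (3, 5)),      # the thing given
        Argument("ARGM-TMP", (5, 6)),  # temporal modifier
    ],
)

print(render(tokens, prop))
```

When you try the systems in the list, it can help to convert their output into a common form like this one before comparing the analyses.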

Information extraction

ANNIE – Information extraction system, University of Sheffield.

Open Information Extraction, University of Washington.

START, a Web-based question answering system, MIT.

Sentiment analysis – document level

Sentiment Analysis Online demo. Python NLTK Text Classification.

Lexalytics, sentiment analysis.

TrustYou Labs, statistical sentiment analysis.
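
For comparison with the demos above, a document-level sentiment classifier can be sketched in a few lines of plain Python. The sketch below uses multinomial Naive Bayes with add-one smoothing, which is one common baseline; it is not claimed to be the method behind any of the listed demos. The training sentences are invented toy data, and the function names train and predict are chosen for this example only.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (text, label) pairs. Returns the counts used by predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> frequency
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(text, model):
    """Return the label with the highest log posterior under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training data (assumption for illustration only).
TOY = [
    ("a great and moving film", "pos"),
    ("great acting and a clever plot", "pos"),
    ("a dull and boring film", "neg"),
    ("boring plot and terrible acting", "neg"),
]

model = train(TOY)
print(predict("a clever and moving plot", model))
print(predict("terrible and dull", model))
```

With only four training sentences the model is of no practical use; the point is to show the steps (count, smooth, compare log scores) that a real baseline would run over a proper corpus.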

Sentiment analysis – Twitter data

  1. Sentiment140 – "Search by product or brand. Discover the Twitter sentiment".
  2. Tweetfeel – "Real-time Twitter search with feelings using insanely complex sentiment analysis".
  3. Tweettone – "Real-time twitter sentiment analysis".
  4. SentiStrength – "Multi-lingual twitter sentiment analysis".
  5. SentimentAnalyzer – "English, German, French".
  6. A list of Twitter Sentiment Analysis Tools.

Sentiment analysis – more

Sentiment Tutorial. LingPipe – tool kit for processing text using computational linguistics (written in Java).

How to build your own Twitter Sentiment Analysis Tool