A Java wrapper around several sentiment analysis tools, created for the MixedEmotions project.
This package requires Java 1.8 and Python 3 (for SA_VADER), Maven for installation, and wget for downloading the Sentiment140 dataset.
- Use the provided shell script to download the Sentiment140 dataset needed by LingPipe.
download_datasets.sh
- Use Maven to build the JAR file.
mvn install
The build produces two files in the target directory: mefw-0.0.1-jar-with-dependencies.jar and mefw-0.0.1.jar.
Prints help.
mefw (-h | --help)
Prints the version of the interface/wrapper.
mefw --version
Prints available processors.
mefw list processors
Starts an HTTP server daemon. The default port is 80.
mefw server [--port= --config= --jsonld]
Reports whether the text in the input file is positive, neutral, or negative.
mefw process [--config= --jsonld] Example: $ SA_lingpipe input.txt output.txt
The provided processors are wrappers around various sentiment analysis implementations.
All processors must be stored in the directory:
src/main/java/cz/vutbr/mefw/plugins Example: src/main/java/cz/vutbr/mefw/plugins/SA_lingpipe
If your processor uses additional classes, store them in:
src/main/java/cz/vutbr/mefw/plugins/[name_of_processor_group] Example: src/main/java/cz/vutbr/mefw/plugins/SA/
Other resources like datasets, libs, etc. belong in:
resources/[name_of_processor_group] Example: resources/but_sentiment
All processors must be subclasses of the ProcessorAdapter class, an abstract class with three methods. If these requirements are not met, your processor will not work.
The first method is the constructor, which takes a config argument. Config is used to specify the paths to the plugin and resources directories.
public ProcessorAdapter(Config config);
The second method is load(). It loads external resources, for example datasets.
public void load();
The third method is process(String data). It performs the actual processing of the data and returns the result of the analysis. For a sentiment analysis tool, it returns positive, neutral, or negative.
public String process(String data);
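The three methods above can be illustrated with a minimal sketch of a custom processor. The Config and ProcessorAdapter classes below are hypothetical stand-ins written only so the example compiles on its own; the real classes in cz.vutbr.mefw may differ in fields, visibility, and whether load() and process() are declared abstract. The processor name SA_keywords and its word lists are likewise invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Entry point showing the assumed lifecycle: construct, load(), then process().
class ProcessorSketch {
    public static void main(String[] args) {
        Config config = new Config("src/main/java/cz/vutbr/mefw/plugins", "resources");
        ProcessorAdapter processor = new SA_keywords(config);
        processor.load();
        System.out.println(processor.process("The service was great"));
    }
}

// Hypothetical stand-in for the project's Config class (field names are assumptions).
class Config {
    final String pluginDir;
    final String resourceDir;

    Config(String pluginDir, String resourceDir) {
        this.pluginDir = pluginDir;
        this.resourceDir = resourceDir;
    }
}

// Hypothetical stand-in mirroring the three methods described in this README.
abstract class ProcessorAdapter {
    protected final Config config;

    public ProcessorAdapter(Config config) {
        this.config = config;
    }

    public abstract void load();

    public abstract String process(String data);
}

// Toy processor (not part of the project): counts positive and negative keywords.
class SA_keywords extends ProcessorAdapter {
    private Set<String> positive = new HashSet<>();
    private Set<String> negative = new HashSet<>();

    public SA_keywords(Config config) {
        super(config);
    }

    @Override
    public void load() {
        // A real processor would read its datasets from the resources directory here.
        positive = new HashSet<>(Arrays.asList("good", "great", "excellent"));
        negative = new HashSet<>(Arrays.asList("bad", "poor", "terrible"));
    }

    @Override
    public String process(String data) {
        int score = 0;
        // Split on non-word characters and compare lower-cased tokens.
        for (String token : data.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (positive.contains(token)) score++;
            if (negative.contains(token)) score--;
        }
        if (score > 0) return "positive";
        if (score < 0) return "negative";
        return "neutral";
    }
}
```

In this sketch load() must be called before process(); with empty word lists every input would be classified as neutral.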
Lingpipe site
Alias-i. 2008. LingPipe 4.1.0. http://alias-i.com/lingpipe (accessed October 1, 2008)
StanfordCoreNLP site
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.
VADER site
Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
NLTK site
Sentiment140 site
Sanders site
Movie reviews site
rt_polarity data site
Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales, Proceedings of the ACL, 2005.
pros-cons site
sts_gold site
Czech sentiment site
Fiala, Ondřej, 2015, Aspect-Term Annotated Customer Reviews in Czech, LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague, http://hdl.handle.net/11234/1-1507.
This development has been partially funded by the European Union through the MixedEmotions Project (project number H2020 655632), as part of the RIA ICT 15 Big data and Open Data Innovation and take-up programme.
http://ec.europa.eu/research/participants/portal/desktop/en/opportunities/index.html