Suicide Notes Data Set and Shared Task Definition
Emotions are subjective, as is their interpretation, so people vary widely in how they interpret the emotions expressed in a text. We investigated whether machines could reach human-level competency in spotting emotions in text. We also wanted to extend mainstream sentiment analysis from simple binary positive-negative classification to multilevel classification with 6 positive, 7 negative and 2 neutral emotions. We learned that ensemble classifiers can indeed reach human competency, which is roughly 60% accuracy.
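As an illustration of the ensemble idea, the sketch below combines several weak classifiers by majority vote and scores the result by accuracy. It is a minimal sketch under stated assumptions, not the method used in the shared task: the keyword-based classifiers, the label names ("love", "guilt", "information") and the example sentences are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# A classifier maps one sentence to one emotion label (hypothetical interface).
Classifier = Callable[[str], str]


def keyword_classifier(keywords: Dict[str, str], default: str) -> Classifier:
    """Build a toy classifier that returns the label of the first keyword found."""
    def classify(sentence: str) -> str:
        text = sentence.lower()
        for word, label in keywords.items():
            if word in text:
                return label
        return default
    return classify


def majority_vote(classifiers: Sequence[Classifier], sentence: str) -> str:
    """Return the label predicted by the most classifiers.

    Ties go to the label that was voted for first, because Counter
    keeps insertion order among equal counts.
    """
    votes = [clf(sentence) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Three hypothetical weak classifiers with illustrative labels.
clf_a = keyword_classifier({"love": "love", "sorry": "guilt"}, "information")
clf_b = keyword_classifier({"love": "love", "forgive": "guilt"}, "information")
clf_c = keyword_classifier({"sorry": "guilt"}, "information")
ensemble = [clf_a, clf_b, clf_c]

sentences = ["I love you all", "I am sorry"]
predictions = [majority_vote(ensemble, s) for s in sentences]
```

In practice the individual classifiers would be trained models rather than keyword rules, and the same `accuracy` function could be applied to a human annotator's labels to compare machine and human performance on equal terms.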
Radiology Reports Data Set and Shared Task Definition
The installation of electronic medical records replaces free text with structured drop-down boxes. When Cincinnati Children’s started using EPIC software, paper forms were eliminated along with the free-text form fields. There is a great deal of knowledge in free text and, therefore, a need to figure out how to extract it. We are attempting to meet this need by using natural language processing (NLP). Specifically, we are focused on developing and implementing neuro-cognitive algorithms that enable computers to understand the concepts and semantic relationships within clinical text. We have developed a tool that anonymizes free text and have used this tool to create a radiology corpus to support NLP research. Our next steps include further annotating the existing corpus, developing a second corpus and using these corpora to train new, memory-based text processing algorithms.
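To make the anonymization step concrete, the sketch below masks a few kinds of identifiers in a report with placeholder tags. It is only an illustration under stated assumptions: the text above does not describe how the actual tool works, and the patterns, tag names and sample report here are hypothetical. A real de-identification tool would need to cover many more identifier types, such as names and addresses.

```python
import re

# Hypothetical patterns for illustration only; not the rules of the actual tool.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),     # e.g. 03/14/2007
    ("MRN", re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)),  # e.g. MRN: 123456
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),          # e.g. 513-555-0100
]


def anonymize(text: str) -> str:
    """Replace each matched identifier with a bracketed placeholder tag."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up sample report, not taken from any corpus.
report = "Chest x-ray on 03/14/2007 for MRN: 123456. Call 513-555-0100."
masked = anonymize(report)
```

Replacing identifiers with typed tags, rather than deleting them, keeps the sentence structure intact, which matters when the masked text is later used to train text processing algorithms.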