Title: Synonymy Expansion using Link Prediction Methods: A case study of Assamese WordNet, Authors: Bornali Phukon, Akash Anil, Sanasam Ranbir Singh, Priyankoo Sarmah
Title: Semi-Supervised Deep Learning for Multiplex Networks, Authors: Anasua Mitra, Vijayan P, Sanasam Ranbir Singh, Diganta Goswami, Parthasarathy S., and Ravindran B.
Title: SwitchNet : Learning to Switch for Word Level Language Identification in Code-Mixed Social Media Text, Authors: Neelakshi Sarma, Sanasam Ranbir Singh, Diganta Goswami
Title: Sentiment Analysis of Tweets using Heterogeneous Multi-layer Network Representation and Embedding, Authors: Loitongbam Gyanendro Singh, Anasua Mitra, Sanasam Ranbir Singh
Title: Empirical study of sentiment analysis tools and techniques on societal topics, Authors: Loitongbam Gyanendro Singh, Sanasam Ranbir Singh
Title: Emotion Dynamics of Public Opinions on Twitter, Authors: Debashis Naskar, Sanasam Ranbir Singh, Durgesh Kumar, Sukumar Nandi, Eva Onaindia De La Rivaherrera
Title: Effect of Class Imbalance in Heterogeneous Network Embedding: An Empirical Study, Authors: Akash Anil, Sanasam Ranbir Singh
Title: SHE: Sentiment Hashtag Embedding Through Multitask learning, Authors: Loitongbam Gyanendro Singh, Akash Anil, Sanasam Ranbir Singh
Title: On Applying Meta-path for Network Embedding in Mining Heterogeneous DBLP Network, Authors: Akash Anil, Uppinder Chugh, Sanasam Ranbir Singh. Title: Prioritized Named Entity Driven LDA for Document Clustering, Authors: Durgesh Kumar, Sanasam Ranbir Singh
Title: Effectiveness of using language independent transcribers for spoken language identification for different Indian languages, Authors: Rajlakshmi Saikia, Sanasam Ranbir Singh and Priyankoo Sarmah
Title: Network Sampling Using k-hop Random Walks for Heterogeneous Network Embedding, Authors: Akash Anil, Shubham Singhal, Piyush Jain, Sanasam Ranbir Singh, Ajay Ladhar, Sandeep Singh and Uppinder Chugh
Title: Transliteration of English Loanwords and Named-entities to Manipuri: Phoneme vs Grapheme representation, Authors: Lenin Laitonjam, Loitongbam Gyanendro Singh and Sanasam Ranbir Singh. Title: Word Level Language Identification in Assamese-Bengali-Hindi-English Code-Mixed Social Media Text, Authors: Neelakshi Sarma, Sanasam Ranbir Singh, Diganta Goswami
Title: Network Sampling Using K-hop Random Walks for Heterogeneous Network Embedding, Authors: Akash Anil, Ajay Ladhar, Sandeep Singh, Uppinder Chugh, and Sanasam Ranbir Singh. Title: On Applying Meta-path for Network Embedding in Mining Heterogeneous DBLP Network, Authors: Akash Anil, Uppinder Chugh, and Sanasam Ranbir Singh.
Title: Word Polarity Detection using Syllable Features for Manipuri Language, Authors: Loitongbam Gyanendro Singh and Sanasam Ranbir Singh. Title: Effect of Language Independent Transcribers on Spoken Language Identification for Different Indian Languages, Authors: Rajlakshmi Saikia, Sanasam Ranbir Singh and Priyankoo Sarmah.
Paper Title: Automatic Pause Marking for Speech Synthesis, Authors: Loitongbam Gyanendro Singh, Nagaraj Adiga, Bidisha Sharma, Sanasam Ranbir Singh, S R M Prasanna.
Congratulations to Mihir Nanawati and Pardeep Singh Nagra (M.Tech) for getting placed at Microsoft and Worksapp respectively.
Also known as PK, an alumnus of CMU and Associate Professor at IIITD, he shared his research problems and discussed the various research works being conducted at the OSINT lab.
Title : Automatic Syllabification for Manipuri language. Author(s) : Loitongbam Gyanendro Singh, Lenin Laitonjam and Sanasam Ranbir Singh
Paper1: Title : Generating Manipuri English Pronunciation Dictionary Using Sequence Labelling Problem. Author(s) : Rajlakshmi Saikia and Sanasam Ranbir Singh. Paper2: Title : Data-driven Approaches for Syllabifying Manipuri Words. Author(s) : Loitongbam Gyanendro Singh, Lenin Laitonjam and Sanasam Ranbir Singh
Microsoft ML meet (India) 2016
WordNet is an online lexical database. It groups words into sets of synonyms called synsets, organized by part-of-speech category. There are presently four categories: noun, adjective, verb, and adverb. Each category organizes its words differently. Nouns are organized in a tree structure, with each node connected to its parent node by the hypernymy relation. Adjectives are organized in a bipolar structure whose organizing relation is antonymy. WordNet is extensively used in applications such as query expansion, sentiment lexicon generation, and document classification.
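To illustrate the tree organization of nouns described above, here is a minimal sketch in Python. The hierarchy in `HYPERNYM` is hypothetical toy data, not taken from WordNet itself, and the helper names `hypernym_path` and `lowest_common_hypernym` are chosen only for this example.

```python
# Toy noun hierarchy (hypothetical data): each synset maps to its hypernym (parent).
HYPERNYM = {
    "dog": "canine",
    "cat": "feline",
    "canine": "carnivore",
    "feline": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
    "animal": None,  # root of the tree
}

def hypernym_path(synset):
    """Return the chain of nodes from a synset up to the root."""
    path = []
    while synset is not None:
        path.append(synset)
        synset = HYPERNYM[synset]
    return path

def lowest_common_hypernym(a, b):
    """Return the first node on a's path to the root that is also an ancestor of b."""
    ancestors_of_b = set(hypernym_path(b))
    for node in hypernym_path(a):
        if node in ancestors_of_b:
            return node
    return None

print(hypernym_path("dog"))
print(lowest_common_hypernym("dog", "cat"))
```

Walking up the hypernym links in this way is one simple basis for applications such as query expansion, where a query term may be broadened to its parent concept.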
Syntactic parsing is the process of analysing natural-language text to produce a well-defined syntactic structure (a parse tree) corresponding to the input text. Such a structure is used to deduce useful relationships between the components (words or phrases) of the text, and it has applications in NLP problems such as relationship extraction and anaphora resolution. Reference: http://www.cs.columbia.edu/~cs4705/
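As one possible illustration of producing a parse tree, the sketch below uses the CKY algorithm, a standard chart parser for grammars in Chomsky normal form. The grammar in `BINARY` and `LEXICON` is a hypothetical toy example, and the choice of CKY is an assumption made for this sketch; the text above does not prescribe a particular parsing method.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form.
BINARY = {  # (B, C) -> set of A such that A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {  # word -> set of preterminal categories
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}

def cky_parse(words):
    """Return one parse tree rooted at S as nested tuples, or None if none exists."""
    n = len(words)
    # chart[i][j] maps a category to one tree covering words[i:j]
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, ()):
            chart[i][i + 1][tag] = (tag, word)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for b, c in product(chart[i][k], chart[k][j]):
                    for a in BINARY.get((b, c), ()):
                        tree = (a, chart[i][k][b], chart[k][j][c])
                        chart[i][j].setdefault(a, tree)
    return chart[0][n].get("S")

print(cky_parse("the dog chased the cat".split()))
```

The nested tuples make the relationships between components explicit: for the sentence above, the subject noun phrase and the verb phrase appear as the two children of the S node.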
Sequence labeling involves the algorithmic assignment of a categorical label to each member of a sequence of observed values.
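A minimal sketch of sequence labeling, using part-of-speech tags as the categorical labels. It assigns each token the label it received most often in training and ignores neighbouring tokens, so it is only a baseline; models such as HMMs or CRFs also use context. The training data, the function names, and the `default` label are assumptions made for this example.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_sentences):
    """Map each token to the label it received most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for token, label in sentence:
            counts[token][label] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}

def label_sequence(model, tokens, default="NOUN"):
    """Assign one label per token; unseen tokens get the (assumed) default label."""
    return [model.get(token, default) for token in tokens]

# Hypothetical toy training data: sentences of (token, label) pairs.
train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
model = train_baseline(train)
print(label_sequence(model, ["the", "cat", "barks"]))
```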
Context plays an important role in determining the similarity of words: words that occur in similar contexts tend to have similar meanings. Here we mainly discuss distributional methods, in which the meaning of a word is computed from the distribution of words around it. Words are generally represented as vectors of numbers related in some way to counts, so these methods are often called vector semantics. There are two main vector types: 1. Very sparse, long (high-dimensional) vectors with many zeros. 2. Dense, short vectors. The intuition behind these approaches is to model a word by embedding it into a vector space; for this reason, the representation of a word as a vector is often called an embedding. Representations for sparse vectors: 1. Vectors and documents: (i) term-document matrix, (ii) term-term matrix. 2. Positive Pointwise Mutual Information (PPMI). 3. Measuring similarity: cosine similarity. Representations for dense vectors: 1. Singular value decomposition (SVD). 2. Skip-gram. 3. CBOW.
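The sparse-vector pipeline listed above (term-term matrix, PPMI weighting, cosine similarity) can be sketched as follows. The corpus, the window size, and the function names are assumptions made for this example, and the vectors are stored as dictionaries so that zero entries are simply absent.

```python
import math
from collections import Counter

def cooccurrence(sentences, window=2):
    """Term-term counts: how often a context word occurs within `window` of a target."""
    pair_counts = Counter()
    for sent in sentences:
        for i, target in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(target, sent[j])] += 1
    return pair_counts

def ppmi_vectors(pair_counts):
    """Reweight counts as PPMI = max(0, log2(P(w, c) / (P(w) P(c)))); zeros are omitted."""
    total = sum(pair_counts.values())
    word_totals, ctx_totals = Counter(), Counter()
    for (w, c), n in pair_counts.items():
        word_totals[w] += n
        ctx_totals[c] += n
    vectors = {}
    for (w, c), n in pair_counts.items():
        pmi = math.log2(n * total / (word_totals[w] * ctx_totals[c]))
        if pmi > 0:
            vectors.setdefault(w, {})[c] = pmi
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus.
corpus = [s.split() for s in ["the dog barks loudly", "the cat sleeps quietly", "a dog chases a cat"]]
vectors = ppmi_vectors(cooccurrence(corpus))
print(cosine(vectors.get("dog", {}), vectors.get("cat", {})))
```

Dense vectors would instead be obtained by, for example, applying SVD to the PPMI matrix or training a skip-gram or CBOW model; those steps are not shown here.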
The problem of constructing a language model from a set of example sentences in a language is discussed. A statistical language model is a probability distribution over sequences of words. Given such a sequence, say of length m, it assigns a probability P(W1, ..., Wm) to the whole sequence. Having a way to estimate the relative likelihood of different phrases is useful in many natural language processing applications. Language modeling is used in speech recognition, machine translation, part-of-speech tagging, parsing, handwriting recognition, information retrieval, and other applications.
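One common way to estimate P(W1, ..., Wm) from example sentences is a bigram model, which approximates the probability of the sequence by the product of P(Wi | Wi-1). The sketch below is a minimal version under stated assumptions: the bigram approximation and add-one (Laplace) smoothing are choices made for this example, as are the toy corpus, the `<s>` and `</s>` boundary markers, and the function names.

```python
from collections import Counter

def train_bigram(sentences):
    """Collect history counts, bigram counts, and vocabulary size from tokenized sentences."""
    histories, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        histories.update(tokens[:-1])            # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs
    vocab = {w for sent in sentences for w in sent} | {"</s>"}
    return histories, bigrams, len(vocab)

def sentence_probability(model, sent):
    """P(sentence) as a product of add-one smoothed bigram probabilities."""
    histories, bigrams, vocab_size = model
    tokens = ["<s>"] + sent + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= (bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size)
    return p

# Hypothetical toy corpus.
corpus = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
model = train_bigram(corpus)
print(sentence_probability(model, ["the", "dog", "barks"]))
print(sentence_probability(model, ["barks", "dog", "the"]))
```

The first sentence follows word orders seen in the corpus and so receives a higher probability than the scrambled one, which is the kind of relative likelihood estimate that speech recognition or machine translation systems rely on.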
Paper Title : Personalised PageRank as a method of exploiting Heterogeneous Network of Counter-terrorism and Homeland Security. Author(s) : Akash Anil, S. Ranbir Singh, Ranjan Sarmah
SVD (singular value decomposition) and EVD (eigenvalue decomposition) play a vital role in information representation and extraction, and are widely used in areas such as image retrieval, LSI (Latent Semantic Indexing), and sentiment analysis.
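For reference, the two decompositions are closely related. In the notation assumed here, A is a real m x n matrix, U and V have orthonormal columns, and Sigma is diagonal with the singular values sigma_1 >= sigma_2 >= ... >= 0. The SVD of A yields the eigenvalue decompositions of A^T A and A A^T, and keeping only the k largest singular values gives the rank-k approximation used in LSI:

```latex
A = U \Sigma V^{\top}, \qquad
A^{\top} A = V \,(\Sigma^{\top} \Sigma)\, V^{\top}, \qquad
A A^{\top} = U \,(\Sigma \Sigma^{\top})\, U^{\top}, \qquad
A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top}
```

The eigenvalues of A^T A and A A^T are therefore the squared singular values of A, and the eigenvectors are the columns of V and U respectively.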
A sentiment lexicon is a list of sentiment words and phrases. Sentiment words are words that inherently express either positive or negative sentiment. For instance, the words "good", "amazing", and "excellent" express positive sentiment orientation, and the words "bad", "poor", and "abysmal" express negative sentiment orientation. It is not feasible for a human reader to read a large number of opinion documents on an entity. Hence, there is a need for opinion summarization. Different levels of opinion summarization include aspect-based summarization, contrast view summarization, and traditional summarization.
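A minimal sketch of how a sentiment lexicon can be used to score text, built from the six example words above. Counting lexicon hits and comparing the totals is an assumption made for this illustration; it ignores negation, intensity, and phrases, which practical systems need to handle.

```python
# Lexicon built from the example words in the text above.
POSITIVE = {"good", "amazing", "excellent"}
NEGATIVE = {"bad", "poor", "abysmal"}

def polarity(text):
    """Classify text by counting lexicon hits; ties and texts with no hits are neutral."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The camera is excellent, but the battery is poor and the screen is bad."))
```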
Sentiment analysis is the computational study of subjectivity in text. It is studied at three levels: document level, sentence level, and aspect level.