Findings Accepted Papers
Fully Quantized Transformer for Machine Translation. Summarizing Chinese Medical Answer with Graph Convolution Networks and Question-focused Dual Attention. Stay Hungry, Stay Focused: Generating Informative and Specific Questions in Information-Seeking Conversations. GRACE: Gradient Harmonized and Cascaded Labeling for Aspect-based Sentiment Analysis. Reducing Sentiment Bias in Language Models via Counterfactual Evaluation. Improving Text Understanding via Deep Syntax-Semantics Communication. GRUEN for Evaluating Linguistic Quality of Generated Text. Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation. An Attentive Recurrent Model for Incremental Prediction of Sentence-final Verbs. Transformer-GCRF: Recovering Chinese Dropped Pronouns with General Conditional Random Fields. Converting the Point of View of Messages Spoken to Virtual Assistants. Few-shot Natural Language Generation for Task-Oriented Dialog. Mimic and Conquer: Heterogeneous Tree Structure Distillation for Syntactic NLP. A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining. Active Testing: An Unbiased Evaluation Method for Distantly Supervised Relation Extraction. Semantic Matching for Sequence-to-Sequence Learning. How Decoding Strategies Affect the Verifiability of Generated Text. Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction. Gradient-based Analysis of NLP Models is Manipulable. A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction. Understanding tables with intermediate pre-training. Multilingual Argument Mining: Datasets and Analysis. Improving Grammatical Error Correction with Machine Translation Pairs. Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives. The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification. 
Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation. Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation. Dual Low-Rank Multimodal Fusion. Contextual Modulation for Relation-Level Metaphor Identification. Dialogue Generation on Infrequent Sentence Functions via Structured Meta-Learning. A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification. Claim Check-Worthiness Detection as Positive Unlabelled Learning. ConceptBert: Concept-Aware Representation for Visual Question Answering. Bootstrapping a Crosslingual Semantic Parser. Revisiting Representation Degeneration Problem in Language Modeling. The workweek is the best time to start a family – A Study of GPT-2 Based Claim Generation. Dynamic Data Selection for Curriculum Learning via Ability Estimation. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation. ZEST: Zero-shot Learning from Text Descriptions using Textual Similarity and Visual Summarization. A structure-enhanced graph convolutional network for sentiment analysis. PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding. Interpretable Entity Representations through Large-Scale Typing. Empirical Studies of Institutional Federated Learning For Natural Language Processing. NeuReduce: Reducing Mixed Boolean-Arithmetic Expressions by Recurrent Neural Network. From Language to Language-ish: How Brain-Like is an LSTM's Representation of Nonsensical Language Stimuli?. Revisiting Pre-Trained Models for Chinese Natural Language Processing. Cascaded Semantic and Positional Self-Attention Network for Document Classification. Toward Recognizing More Entity Types in NER: An Efficient Implementation using Only Entity Lexicons. From Disjoint Sets to Parallel Data to Train Seq2Seq Models for Sentiment Transfer. Document Ranking with a Pretrained Sequence-to-Sequence Model. Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior. 
Rethinking Self-Attention: Towards Interpretability in Neural Parsing. A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions. Efficient Context and Schema Fusion Networks for Multi-Domain Dialogue State Tracking. Syntactic and Semantic-driven Learning for Open Information Extraction. Group-wise Contrastive Learning for Neural Dialogue Generation. E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT. A Multi-task Learning Framework for Opinion Triplet Extraction. Event Extraction as Multi-turn Question Answering. Improving QA Generalization by Concurrent Modeling of Multiple Biases. Actor-Double-Critic: Incorporating Model-Based Critic for Task-Oriented Dialogue Systems. AirConcierge: Generating Task-Oriented Dialogue via Efficient Large-Scale Knowledge Retrieval. DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding. A Study in Improving BLEU Reference Coverage with Diverse Automatic Paraphrasing. Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study. SeNsER: Learning Cross-Building Sensor Metadata Tagger. Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging. Scene Graph Modification Based on Natural Language Commands. LiMiT: The Literal Motion in Text Dataset. Generative Data Augmentation for Commonsense Reasoning. HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data. ESTeR: Combining Word Co-occurrences and Word Associations for Unsupervised Emotion Detection. GCDST: A Graph-based and Copy-augmented Multi-domain Dialogue State Tracking. Why do you think that? Exploring Faithful Sentence-Level Rationales Without Supervision. Multi²OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT. Using the Past Knowledge to Improve Sentiment Classification. High-order Semantic Role Labeling. Undersensitivity in Neural Reading Comprehension.
AutoETER: Automated Entity Type Representation for Knowledge Graph Embedding. Learning Robust and Multilingual Speech Representations. FQuAD: French Question Answering Dataset. Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection. Quantifying the Contextualization of Word Representations with Semantic Class Probing. FELIX: Flexible Text Editing Through Tagging and Insertion. Unsupervised Relation Extraction from Language Models using Constrained Cloze Completion. Language Generation via Combinatorial Constraint Satisfaction: A Tree Search Enhanced Monte-Carlo Approach. Evaluating Models’ Local Decision Boundaries via Contrast Sets. Optimizing Word Segmentation for Downstream Task. A Compare Aggregate Transformer for Understanding Document-grounded Dialogue. TextHide: Tackling Data Privacy in Language Understanding Tasks. Modeling Intra and Inter-modality Incongruity for Multi-Modal Sarcasm Detection. Improving Knowledge-Aware Dialogue Response Generation by Using Human-Written Prototype Dialogues. Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots. Privacy-Preserving News Recommendation Model Learning. Balancing via Generation for Multi-Class Text Classification Improvement. Conditional Neural Generation using Sub-Aspect Functions for Extractive News Summarization. Research Replication Prediction Using Weakly Supervised Learning. Semantically Driven Sentence Fusion: Modeling and Evaluation. Will it Unblend?. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. StyleDGPT: Stylized Response Generation with Pre-trained Language Models. Enhancing Automated Essay Scoring Performance via Fine-tuning Pre-trained Language Models with Combination of Regression and Ranking. Neural Dialogue State Tracking with Temporally Expressive Networks. Inferring about fraudulent collusion risk on Brazilian public works contracts in official texts using a Bi-LSTM approach. 
Data-to-Text Generation with Style Imitation. Teaching Machine Comprehension with Compositional Explanations. A Knowledge-Driven Approach to Classifying Object and Attribute Coreferences in Opinion Mining. SimAlign: High Quality Word Alignments Without Parallel Training Data Using Static and Contextualized Embeddings. Octa: Omissions and Conflicts in Target-Aspect Sentiment Analysis. On the Language Neutrality of Pre-trained Multilingual Representations. Improving Constituency Parsing with Span Attention. RecoBERT: A Catalog Language Model for Text-Based Recommendations. Multi-Agent Mutual Learning at Sentence-Level and Token-Level for Neural Machine Translation. Will This Idea Spread Beyond Academia? Understanding Knowledge Transfer of Scientific Concepts across Text Corpora. Recurrent Inference in Text Editing. An Empirical Exploration of Local Ordering Pre-training for Structured Prediction. Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers. Active Learning Approaches to Enhancing Neural Machine Translation. AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling. CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning. On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries. TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising. Improving End-to-End Bangla Speech Recognition with Semi-supervised Training. No Gestures Left Behind: Learning Relationships between Spoken Language and Freeform Gestures. UNIFIEDQA: Crossing Format Boundaries with a Single QA System. Robust and Interpretable Grounding of Spatial References with Relation Networks. SynET: Synonym Expansion using Transitivity. Scheduled DropHead: A Regularization Method for Transformer Models. Multi-Turn Dialogue Generation in E-Commerce Platform with the Context of Historical Dialogue. 
Ruler: Data Programming by Demonstration for Document Labeling. Dual Reconstruction: a Unifying Objective for Semi-Supervised Neural Machine Translation. Focus-Constrained Attention Mechanism for CVAE-based Response Generation. Chunk-based Chinese Spelling Check with Global Optimization. Multi-pretraining for Large-scale Text Classification. End-to-End Speech Recognition and Disfluency Removal. Characterizing the Value of Information in Medical Notes. KLearn: Background Knowledge Inference from Summarization Data. Extracting Chemical-Protein Interactions via Calibrated Deep Neural Network and Self-training. Logic2Text: High-Fidelity Natural Language Generation from Logical Forms. Diversify Question Generation with Continuous Content Selectors and Question Type Modeling. Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages. ConveRT: Efficient and Accurate Conversational Representations from Transformers. Computer Assisted Translation with Neural Quality Estimation and Automatic Post-Editing. Zero-Shot Rationalization by Multi-Task Transfer Learning from Question Answering. The Role of Reentrancies in Abstract Meaning Representation Parsing. Cross-Lingual Suicidal-Oriented Word Embedding toward Suicide Prevention. Reinforcement Learning with Imbalanced Dataset for Data-to-Text Medical Report Generation. Reducing Quantity Hallucinations in Abstractive Summarization. Rethinking Topic Modelling: From Document-Space to Term-Space. A Semi-supervised Approach to Generate the Code-Mixed Text using Pre-trained Encoder and Transfer Learning. BERT-MK: Integrating Graph Contextualized Knowledge into Pre-trained Language Models. Recursive Top-Down Production for Sentence Generation with Latent Trees. Guided Dialogue Policy Learning without Adversarial Learning in the Loop. MultiDM-GCN: Aspect-guided Response Generation in Multi-domain Multi-modal Dialogue System using Graph Convolutional Network. 
Edge-Enhanced Graph Convolution Networks for Event Detection with Syntactic Relation. Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization. Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness. Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems. ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network. Plug-and-Play Conversational Models. Event-Driven Learning of Systematic Behaviours in Stock Markets. COSMIC: COmmonSense knowledge for eMotion Identification in Conversations. Improving Compositional Generalization in Semantic Parsing. On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers. How Does Context Matter? On the Robustness of Event Detection with Context-Selective Mask Generalization. Adaptive Feature Selection for End-to-End Speech Translation. Abstractive Multi-Document Summarization via Joint Learning with Single-Document Summarization. Blockwise Self-Attention for Long Document Understanding. Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling. Grid Tagging Scheme for Aspect-oriented Fine-grained Opinion Extraction. Learning Numeral Embedding. Fast End-to-end Coreference Resolution for Korean. Toward Stance-based Personas for Opinionated Dialogues. Hierarchical Pre-training for Sequence Labelling in Spoken Dialog. Out-of-Sample Representation Learning for Knowledge Graphs. Fine-Grained Grounding for Multimodal Speech Recognition. Unsupervised Expressive Rules Provide Explainability and Assist Human Experts Grasping New Domains. Textual Supervision for Visually Grounded Spoken Language Understanding. Universal Dependencies According to BERT: Both More Specific and More General.
Visual Objects As Context: Exploiting Visual Objects for Lexical Entailment. Learning to Plan and Realize Separately for Open-Ended Dialogue Systems. Be Different to Be Better! A Benchmark to Leverage the Complementarity of Language and Vision. Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition. PharmMT: A Neural Machine Translation Approach to Simplify Prescription Directions. Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs. Corpora Evaluation and System Bias Detection in Multi-document Summarization. Graph-to-Tree Neural Networks for Learning Structured Input-Output Translation with Applications to Semantic Parsing and Math Word Problem. Target Conditioning for One-to-Many Generation. Inferring symmetry in natural language. A Concise Model for Multi-Criteria Chinese Word Segmentation with Transformer Encoder. Enhancing Content Planning for Table-to-Text Generation with Data Understanding and Verification. Contextual Text Style Transfer. DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling. Cross-Lingual Dependency Parsing by POS-Guided Word Reordering. Assessing Robustness of Text Classification through Maximal Safe Radius Computation. TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog. A little goes a long way: Improving toxic language classification despite data scarcity. General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference. Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles. Learning to Generalize for Sequential Decision Making. Effective Crowd-Annotation of Participants, Interventions, and Outcomes in the Text of Clinical Trial Reports. Adversarial Grammatical Error Correction. CLAR: A Cross-Lingual Argument Regularizer for Semantic Role Labeling. 
Neutralizing Gender Bias in Word Embeddings with Latent Disentanglement and Counterfactual Generation. Towards Domain-Independent Text Structuring Trainable on Large Discourse Treebanks. A Multilingual View of Unsupervised Machine Translation. An Evaluation Method for Diachronic Word Sense Induction. Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning. Towards Zero-Shot Conditional Summarization with Adaptive Multi-Task Fine-Tuning. Multilingual Knowledge Graph Completion via Ensemble Knowledge Transfer. Towards Controllable Biases in Language Generation. RobBERT: a Dutch RoBERTa-based Language Model. Regularization of Distinct Strategies for Unsupervised Question Generation. Graph-to-Graph Transformer for Transition-based Dependency Parsing. DeSMOG: Detecting Stance in Media On Global Warming. Improve Transformer Models with Better Relative Position Embeddings. A Sentiment-Controllable Topic-to-Essay Generator with Topic Knowledge Graph. What-if I ask you to explain: Explaining the effects of perturbations in procedural text. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. Composed Variational Natural Language Generation for Few-shot Intents. Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization. On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks. A Dual-Attention Network for Joint Named Entity Recognition and Sentence Classification of Adverse Drug Events. BERT-kNN: Adding a kNN Search Component to Pretrained Language Models for Better QA. Identifying Spurious Correlations for Robust Text Classification. HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification. Continual Learning for Natural Language Generation in Task-oriented Dialog Systems. UNQOVERing Stereotyping Biases via Underspecified Questions. 
A Semantics-based Approach to Disclosure Classification in User-Generated Online Content. Mining Knowledge for Natural Language Inference from Wikipedia Categories. OCNLI: Original Chinese Natural Language Inference. Margin-aware Unsupervised Domain Adaptation for Cross-lingual Text Labeling. Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems. Accurate polyglot semantic parsing with DAG grammars. Approximation of Response Knowledge Retrieval in Knowledge-grounded Dialogue Generation. Evaluating Factuality in Generation with Dependency-level Entailment. Cross-Lingual Text Classification with Minimal Resources by Transferring a Sparse Teacher. A Multi-Persona Chatbot for Hotline Counselor Training. Narrative Text Generation with a Latent Discrete Plan. Graph Transformer Networks with Syntactic and Semantic Structures for Event Argument Extraction. The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation. CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems. Attending to Long-Distance Document Context for Sequence Labeling. Global Bootstrapping Neural Network for Entity Set Expansion. Adversarial Augmentation Policy Search for Domain and Cross-Lingual Generalization in Reading Comprehension. Denoising Multi-Source Weak Supervision for Neural Text Classification. Dr. Summarize: Global Summarization of Medical Dialogue by Exploiting Local Structures. Generating Accurate Electronic Health Assessment from Medical Graph. Context Analysis for Pre-trained Masked Language Models. Controllable Text Generation with Focused Variation. Modeling Preconditions in Text with a Crowd-sourced Dataset. Reevaluating Adversarial Examples in Natural Language. Question Answering with Long Multiple-Span Answers. It's not a Non-Issue: Negation as a Source of Error in Machine Translation. Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework.
Joint Turn and Dialogue level User Satisfaction Estimation on Multi-Domain Conversations. Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training. Towards Context-Aware Code Comment Generation. Finding the Optimal Vocabulary Size for Neural Machine Translation. Making Information Seeking Easier: An Improved Pipeline for Conversational Search. Generalizable and Explainable Dialogue Generation via Explicit Action Learning. Determining Event Outcomes: The Case of #fail. WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization. Adversarial Training for Code Retrieval with Question-Description Relevance Regularization. Large Product Key Memory for Pretrained Language Models. STANDER: An Expert-Annotated Dataset for News Stance Detection and Evidence Retrieval. SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy. Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering. No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension. Reference Language based Unsupervised Neural Machine Translation. TinyBERT: Distilling BERT for Natural Language Understanding. Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder. Assessing Human-Parity in Machine Translation on the Segment Level. Multichannel Generative Language Model: Learning All Possible Factorizations Within and Across Channels. Factorized Transformer for Multi-Domain Neural Machine Translation. Improving Named Entity Recognition with Attentive Ensemble of Syntactic Information. Contract Discovery: Dataset and a Few-Shot Semantic Retrieval Challenge with Competitive Baselines. Vocabulary Adaptation for Domain Adaptation in Neural Machine Translation. Detecting Media Bias in News Articles using Gaussian Bias Distributions. Looking inside Noun Compounds: Unsupervised Prepositional and Free Paraphrasing. 
BERT for Monolingual and Cross-Lingual Reverse Dictionary. What's so special about BERT's layers? A closer look at the NLP pipeline in monolingual and multilingual models. Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?. A Pointer Network Architecture for Joint Morphological Segmentation and Tagging. Beyond Language: Learning Commonsense from Images for Reasoning. A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies. Consistent Response Generation with Controlled Specificity. Internal and external pressures on language emergence: least effort, object constancy and frequency. Parsing All: Syntax and Semantics, Dependencies and Spans. LIMIT-BERT: Linguistics Informed Multi-Task BERT. Improving Limited Labeled Dialogue State Tracking with Self-Supervision. PrivNet: Safeguarding Private Attributes in Transfer Learning for Recommendation. Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation. An Empirical Investigation of Beam-Aware Training in Supertagging. Decoding Language Spatial Relations to 2D Spatial Arrangements. Why and when should you pool? Analyzing Pooling in Recurrent Architectures. Long Document Ranking with Query-Directed Sparse Transformer. Visuo-Linguistic Question Answering (VLQA) Challenge. Exploring BERT's Sensitivity to Lexical Cues using Tests from Semantic Priming. Multi-hop Question Generation with Graph Convolutional Network. MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering. Thinking Like a Skeptic: Defeasible Inference in Natural Language. Guiding Attention for Self-Supervised Learning with Transformers. Language-Conditioned Feature Pyramids for Visual Selection Tasks. BERT-QE: Contextualized Query Expansion for Document Re-ranking. ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations.
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion. TLDR: Extreme Summarization of Scientific Documents. Tri-Train: Automatic Pre-Fine Tuning between Pre-Training and Fine-Tuning for SciNER. On the Sub-layer Functionalities of Transformer Decoder. Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation. Robust Backed-off Estimation of Out-of-Vocabulary Embeddings. Tensorized Embedding Layers. Speaker or Listener? The Role of a Dialog Agent. Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing. Paraphrasing vs Coreferring: Two Sides of the Same Coin. Active Sentence Learning by Adversarial Uncertainty Sampling in Discrete Space. Coming to Terms: Automatic Formation of Neologisms in Hebrew. IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages. Weakly-Supervised Modeling of Contextualized Event Embedding for Discourse Relations.
Adapting BERT for Word Sense Disambiguation with Gloss Selection Objective and Example Sentences. Adversarial Text Generation via Sequence Contrast Discrimination. A Greedy Bit-flip Training Algorithm for Binarized Knowledge Graph Embeddings. Neural Speed Reading Audited. Pretrain-KGE: Learning Knowledge Representation from Pretrained Language Models. Enhance Robustness of Sequence Labelling with Masked Adversarial Training. Context-aware Stand-alone Neural Spelling Correction. A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels. KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding. Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning. Few-Shot Multi-Hop Relation Reasoning over Knowledge Bases. PolicyQA: A Reading Comprehension Dataset for Privacy Policies. Controlled Hallucinations: Learning to Generate Faithfully from Noisy Data. Sequential Span Classification with Neural Semi-Markov CRFs for Biomedical Abstracts. Where to Submit? Helping Researchers to Choose the Right Venue. Pretrained Language Models for Dialogue Generation with Multiple Input Sources. Hybrid Emoji-Based Masked Language Models for Zero-Shot Abusive Language Detection. Transition-based Parsing with Stack-Transformers. PhoBERT: Pre-trained language models for Vietnamese. Make Templates Smarter: A Template Based Data2Text System Powered by Text Stitch Model. Incorporating Stylistic Lexical Preferences in Generative Language Models. LGPSolver - Solving Logic Grid Puzzles Automatically. HyperText: Endowing FastText with Hyperbolic Geometry. Learning to Generate Clinically Coherent Chest X-Ray Reports. What Can We Do to Improve Peer Review in NLP?. Biomedical Event Extraction with Hierarchical Knowledge Graphs. Examining the Ordering of Rhetorical Strategies in Persuasive Requests. Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank. 
OptSLA: an Optimization-Based Approach for Sequential Label Aggregation. Dynamically Updating Event Representations for Temporal Relation Classification with Multi-category Learning. Investigating Transferability in Pretrained Language Models. exBERT: Extending Pre-trained Models with Domain-specific Vocabulary Under Constrained Training Resources. Open Domain Question Answering based on Text Enhanced Knowledge Graph with Hyperedge Infusion. Inexpensive Domain Adaptation of Pretrained Language Models: Case Studies on Biomedical NER and Covid-19 QA. Pseudo-Bidirectional Decoding for Local Sequence Transduction. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification. TopicBERT for Energy Efficient Document Classification. DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis. Continual Learning Long Short Term Memory. Constrained Decoding for Computationally Efficient Named Entity Recognition Taggers. PTUM: Pre-training User Model from Unlabeled User Behaviors via Self-supervision. Adversarial Subword Regularization for Robust Neural Machine Translation. Learning Visual-Semantic Embeddings for Reporting Abnormal Findings on Chest X-rays. Automatically Identifying Gender Issues in Machine Translation using Perturbations. TSDG: Content-aware Neural Response Generation with Two-stage Decoding Process. Unsupervised Cross-Lingual Adaptation of Dependency Parsers Using CRF Autoencoders. Service-oriented Text-to-SQL Parsing. Helpful or Hierarchical? Predicting the Communicative Strategies of Chat Participants, and their Impact on Success. Adapting Open Domain Fact Extraction and Verification to COVID-FACT through In-Domain Language Modeling. Learning Improvised Chatbots from Adversarial Modifications of Natural Language Feedback. Adapting Coreference Resolution to Twitter Conversations. On Romanization for Model Transfer Between Scripts in Neural Machine Translation. Answer Span Correction in Machine Reading Comprehension.
An Investigation of Potential Function Designs for Neural CRF. Extending Multilingual BERT to Low-Resource Languages. Cross-Lingual Training of Neural Models for Document Ranking. Can Pre-training help VQA with Lexical Variations?. FENAS: Flexible and Expressive Neural Architecture Search. LEGAL-BERT: The Muppets straight out of Law School. An Instance Level Approach for Shallow Semantic Parsing in Scientific Procedural Text. On Long-Tailed Phenomena in Neural Machine Translation. Knowing What You Know: Calibrating Dialogue Belief State Distributions via Ensembles. Domain Adversarial Fine-Tuning as an Effective Regularizer. Data Annealing for Informal Language Understanding Tasks. Integrating Task Specific Information into Pretrained Language Models for Low Resource Fine Tuning. KoBE: Knowledge-Based Machine Translation Evaluation. Pushing the Limits of AMR Parsing with Self-Learning. WER we are and WER we think we are. A Novel Challenge Set for Hebrew Morphological Disambiguation and Diacritics Restoration. Improving Event Duration Prediction via Time-aware Pre-training. What do we expect from Multiple-choice QA Systems?. Resource-Enhanced Neural Model for Event Argument Extraction. Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation. Using Visual Feature Space as a Pivot Across Languages. Document Classification for COVID-19 Literature. Inserting Information Bottlenecks for Attribution in Transformers. MCMH: Learning Multi-Chain Multi-Hop Rules for Knowledge Graph Reasoning. Weakly- and Semi-supervised Evidence Extraction. More Embeddings, Better Sequence Labelers?. NLP Service APIs and Models for Efficient Registration of New Clients. Effects of Naturalistic Variation in Goal-Oriented Dialog. Temporal Reasoning in Natural Language Inference. A Pilot Study of Text-to-SQL Semantic Parsing for Vietnamese. An Empirical Methodology for Detecting and Prioritizing Needs during Crisis Events. 
Towards Low-Resource Semi-Supervised Dialogue Generation with Meta-Learning. #Turki$hTweets: A Benchmark Dataset for Turkish Text Correction. Query-Key Normalization for Transformers. How Can Self-Attention Networks Recognize Dyck-n Languages?. Training Flexible Depth Model by Multi-Task Learning for Neural Machine Translation. The birth of Romanian BERT. How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?. Visually-Grounded Planning without Vision: Language Models Infer Detailed Plans from High-level Instructions. On the Branching Bias of Syntax Extracted from Pre-trained Language Models. The Pragmatics behind Politics: Modelling Metaphor, Framing and Emotion in Political Discourse. SMRT Chatbots: Improving Non-Task-Oriented Dialog with Simulated Multiple Reference Training. Improving Aspect-based Sentiment Analysis with Gated Graph Convolutional Networks and Syntax-based Regulation. The Dots Have Their Values: Exploiting the Node-Edge Connections in Graph-based Neural Models for Document-level Relation Extraction. Byte Pair Encoding is Suboptimal for Language Model Pretraining. Learning to Classify Events from Human Needs Category Descriptions. Automatic Term Name Generation for Gene Ontology: Task and Dataset. Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings. Finding Friends and Flipping Frenemies: Automatic Paraphrase Dataset Augmentation Using Graph Theory. Hierarchical Region Learning for Nested Named Entity Recognition. Understanding User Resistance Strategies in Persuasive Conversations. Exploiting Unsupervised Data for Emotion Recognition in Conversations. Do Language Embeddings capture Scales?. Dual Inference for Improving Language Understanding and Generation. Enhancing Generalization in Natural Language Inference by Syntax.