Automated identification of media bias in news articles: an interdisciplinary literature review

F Hamborg, K Donnay, B Gipp - International Journal on Digital Libraries, 2019 - Springer
Media bias, i.e., slanted news coverage, can strongly impact the public perception of the
reported topics. In the social sciences, research over the past decades has developed …

One embedder, any task: Instruction-finetuned text embeddings

H Su, W Shi, J Kasai, Y Wang, Y Hu… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce INSTRUCTOR, a new method for computing text embeddings given task
instructions: every text input is embedded together with instructions explaining the use case …

Sheared LLaMA: Accelerating language model pre-training via structured pruning

M Xia, T Gao, Z Zeng, D Chen - arXiv preprint arXiv:2310.06694, 2023 - arxiv.org
The popularity of LLaMA (Touvron et al., 2023a; b) and other recently emerged moderate-
sized large language models (LLMs) highlights the potential of building smaller yet powerful …

Towards continual knowledge learning of language models

J Jang, S Ye, S Yang, J Shin, J Han, G Kim… - arXiv preprint arXiv …, 2021 - arxiv.org
Large Language Models (LMs) are known to encode world knowledge in their parameters
as they pretrain on a vast amount of web corpus, which is often utilized for performing …

Task-aware retrieval with instructions

A Asai, T Schick, P Lewis, X Chen, G Izacard… - arXiv preprint arXiv …, 2022 - arxiv.org
We study the problem of retrieval with instructions, where users of a retrieval system
explicitly describe their intent along with their queries. We aim to develop a general-purpose …

BGE M3-Embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation

J Chen, S Xiao, P Zhang, K Luo, D Lian… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present a new embedding model, called M3-Embedding, which is
distinguished for its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. It …

Trafilatura: A web scraping library and command-line tool for text discovery and extraction

A Barbaresi - Proceedings of the 59th Annual Meeting of the …, 2021 - aclanthology.org
An essential operation in web corpus construction consists in retaining the desired content
while discarding the rest. Another challenge is finding one's way through websites. This article …

Tackling fake news detection by continually improving social context representations using graph neural networks

N Mehta, ML Pacheco… - Proceedings of the 60th …, 2022 - aclanthology.org
Easy access, variety of content, and fast widespread interactions are some of the reasons
making social media increasingly popular. However, this rise has also enabled the …

CoVerifi: A COVID-19 news verification system

NL Kolluri, D Murthy - Online Social Networks and Media, 2021 - Elsevier
There is an abundance of misinformation, disinformation, and “fake news” related to
COVID-19, leading the director-general of the World Health Organization to term this an 'infodemic' …

Automated identification of bias inducing words in news articles using linguistic and context-oriented features

T Spinde, L Rudnitckaia, J Mitrović, F Hamborg… - Information Processing …, 2021 - Elsevier
Media has a substantial impact on public perception of events, and, accordingly, the way
media presents events can potentially alter the beliefs and views of the public. One of the …