The dataset compiles articles from seven prominent Indonesian news platforms: Tempo, CNN Indonesia, CNBC Indonesia, Okezone, Suara, Kumparan, and JawaPos. Each source contributes a diverse range of articles, which together form a broad repository of Indonesian news content. The dataset includes two special columns: 'embedding', which holds the text embeddings extracted with the OpenAI text-embedding-ada-002 model, and 'summary', which contains a concise article summary generated via the ChatGPT API.
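As a rough illustration of how the 'embedding' and 'summary' columns might be used together, the sketch below ranks articles by cosine similarity to a query vector. It assumes each row is available as a Python dict with 'embedding' (a list of floats) and 'summary' (a string); that row layout, the helper names `cosine_similarity` and `most_similar`, and the toy 3-dimensional vectors are illustrative assumptions, not part of the dataset itself. Real text-embedding-ada-002 vectors have 1536 dimensions, and the query vector would need to come from the same model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_embedding, rows, top_k=3):
    """Return the top_k (score, summary) pairs, highest similarity first."""
    scored = [
        (cosine_similarity(query_embedding, row["embedding"]), row["summary"])
        for row in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy rows standing in for dataset records (hypothetical values, not real data).
rows = [
    {"embedding": [1.0, 0.0, 0.0], "summary": "toy summary A"},
    {"embedding": [0.0, 1.0, 0.0], "summary": "toy summary B"},
    {"embedding": [0.9, 0.1, 0.0], "summary": "toy summary C"},
]

if __name__ == "__main__":
    for score, summary in most_similar([1.0, 0.0, 0.0], rows, top_k=2):
        print(f"{score:.3f}  {summary}")
```

A plain-Python loop keeps the sketch dependency-free; for the full dataset, a vectorized approach with an array library would likely be more practical.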
License
CC-BY-NC-4.0
NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?id_news_dataset